Supervised by Prof. Derek Nowrouzezahrai and Prof. Maxime Cohen, built RL environment for the Retail Innovation Lab
Supervised by Prof. Krzysztof Czarnecki, integrated rule-based autonomous vehicle controller into RL environment
Although rule-based controllers may require more manual effort to set up than learning-based algorithms (e.g., RL) and run slower during inference, they are human-interpretable, which allows them to be verified for safety. The broader goal of this project is to align rule-based and learning-based systems. The advantages are two-fold: this alignment allows the pros and cons of RL to be studied more clearly, and it can also help improve the rule-based system through rapid testing of edge cases in an RL environment.
Wise Move environment
Supervised by Prof. Mark Crowley, studied independent algorithms in the multi-agent setting
Algorithms include: [Independent] DQN, PPO, SAC, [Multi-Agent] MADDPG, MAPPO, QMIX, COMA, DRON
Paper accepted at NeurIPS 2021 Deep Reinforcement Learning Workshop
Independent reinforcement learning algorithms have no theoretical guarantees for finding the best policy in multi-agent settings. However, in practice, prior works have reported good performance with independent algorithms in some domains and bad performance in others. Moreover, a comprehensive study of the strengths and weaknesses of independent algorithms is lacking in the literature. In this paper, we carry out an empirical comparison of the performance of independent algorithms on four PettingZoo environments that span the three main categories of multi-agent environments, i.e., cooperative, competitive, and mixed. We show that in fully-observable environments, independent algorithms can perform on par with multi-agent algorithms in cooperative and competitive settings. For the mixed environments, we show that agents trained via independent algorithms learn to perform well individually, but fail to learn to cooperate with allies and compete with enemies. We also show that adding recurrence improves the learning of independent algorithms in cooperative partially observable environments.
Developed QR payment Interac app in Flutter for cross-platform compatibility, hosted on Firebase
Developed credit card application in Wayfair's Android app using Kotlin
Decoupled monolithic code by utilizing Dependency Injection (DI) with Dagger
Applied Clean Architecture with VIPER/MVP, increasing testability and reusability of code
Wayfair app (v5.52) from the Google Play Store on July 14, 2020
Built GUI library using Tkinter and Pillow for forest fire RL environment
Existing forest fire simulators are either too unrealistic or too slow. This simulator focuses on realistic modelling of forest fire behaviour, while ensuring that it is still sufficiently fast for RL research.
Combined active learning with domain heuristics to reduce labelling costs
Fine-tuned BERT using TensorFlow to perform transfer learning for contextual multiclass classification
Built Graph DB in NetworkX to increase querying speed and flexibility for text-based insight generation
Served model using Flask with Docker, deployed on EC2/ECS with Cognito for authentication
Problem
The lack of labelled data is one of the largest bottlenecks in the field of supervised machine learning.
Introduction
At DSpace, we wanted to train a text classifier that classifies sentences into more sections, in order to increase the precision of our text generation model. The challenge was that we had a lot of data, but none of it was labelled.
Goal
To minimize the cost of collecting labels for existing unlabelled datasets.
Solution
We reduced the cost of collecting labels by:
1) Reducing the amount of labelled data required:
By combining active learning and domain heuristics (in the form of labelling functions), we can ensure that the labels we are collecting are the most informative possible for the model, thus potentially reducing the number of labels required to have a functional classifier.
2) Generating insights from the labels collected:
We used a Graph Database to store all labels collected. Not only does it increase our querying speed and flexibility, but it also retains all semantics of the original texts, allowing us to generate insights.
3) Having an enticing web application:
Leveraging Docker and AWS services such as EC2/ECS, Cognito, CloudWatch, DynamoDB and S3, we were able to build a real-time label collection platform that also extracts domain knowledge from the labellers.
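The first point above, combining active learning with labelling functions, can be illustrated with a minimal sketch. This is not the DSpace implementation: the labelling functions, the label names, and the `predict_proba` callback are all hypothetical, and entropy-based uncertainty sampling is just one common active learning strategy assumed here for illustration. Sentences on which the heuristics agree are treated as cheaply labelled and skipped; the rest are ranked by model uncertainty so that human labellers see the most informative ones first.

```python
import math

ABSTAIN = None  # a labelling function returns this when it has no opinion


# Hypothetical labelling functions encoding domain heuristics.
def lf_mentions_price(sentence):
    text = sentence.lower()
    return "pricing" if "$" in text or "price" in text else ABSTAIN


def lf_mentions_team(sentence):
    text = sentence.lower()
    return "team" if "founder" in text or "team" in text else ABSTAIN


LABELLING_FUNCTIONS = [lf_mentions_price, lf_mentions_team]


def heuristic_votes(sentence):
    """Collect the non-abstaining votes of all labelling functions."""
    votes = (lf(sentence) for lf in LABELLING_FUNCTIONS)
    return [v for v in votes if v is not ABSTAIN]


def entropy(probs):
    """Shannon entropy of a class distribution; higher means less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labelling(pool, predict_proba, budget):
    """Pick up to `budget` sentences that are worth a human label.

    `predict_proba` is an assumed callback returning the current model's
    class probabilities for one sentence.
    """
    scored = []
    for sentence in pool:
        votes = heuristic_votes(sentence)
        # Heuristics agree on one label: no human label needed for now.
        if votes and len(set(votes)) == 1:
            continue
        scored.append((entropy(predict_proba(sentence)), sentence))
    # Most uncertain first; ties keep their original pool order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:budget]]
```

In a real loop, the selected sentences would be sent to labellers, the model retrained on the new labels, and the selection repeated until the labelling budget runs out.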
Prototyped tokenization and verification of PIDs via QR/NFC on Android
Hosted serverless backend architecture using AWS API Gateway with Lambda and Firebase Realtime DB
Utilized Design Thinking to develop customer-centric workflows and applications
Why?
Develop next-NEXT-generation technology to blow the boundaries of fin-tech out of the water.
1) Point of Sale (POS) terminals are often too expensive for merchants of micro-businesses. We want to leverage technology to allow Interac e-Transfer payments to be made easily through NFC/QR codes, eliminating the need for POS terminals.
Hosted serverless architecture using AWS API Gateway with Lambda, S3, DynamoDB and Firebase
Utilized design thinking methodology, ensuring that user experience is seamless and convenient
Developed Point of Sale terminal mobile apps, allowing eTransfer via QR and NFC using Android Beam
2) Conventional methods of verifying age or ID are slow and reveal unneeded Personally Identifiable Information (PII)
Developed Android app that uses NFC and QR code to securely verify digital cards. This ensures that only information that is required is revealed to the reader.
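The idea of revealing only what the reader needs can be sketched as follows. This is an illustrative assumption, not the app's actual protocol: the `DigitalCard` fields and the `age_check_response` helper are hypothetical, and a real deployment would also need the response to be cryptographically signed, which is omitted here. The reader receives a yes/no answer to "is the holder at least N years old?" rather than the date of birth, name, or address.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DigitalCard:
    # Hypothetical fields held on the card holder's device.
    name: str
    date_of_birth: date
    address: str


def age_in_years(date_of_birth, today):
    """Completed years between date_of_birth and today."""
    had_birthday = (today.month, today.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return today.year - date_of_birth.year - (0 if had_birthday else 1)


def age_check_response(card, min_age, today):
    """Build the payload shared with the reader.

    Only the outcome of the check leaves the device; the underlying
    personal fields are never included in the response.
    """
    return {
        "min_age": min_age,
        "over_min_age": age_in_years(card.date_of_birth, today) >= min_age,
    }
```

The same pattern extends to other checks (for example, "is this card still valid?") by returning the result of the predicate instead of the fields it was computed from.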
Developed production-ready Java libraries and maintained production test suites
Deployed Alexa lighting control skill with zone localization using AWS Lambda and S3