Experience

Student Researcher

Mila

Sep 2022 - Current

Montreal, Quebec

Reinforcement Learning Research Assistant @Retail Innovation Lab

McGill University

Sep 2021 - Dec 2021

Montreal, Quebec

• Supervised by Prof. Derek Nowrouzezahrai and Prof. Maxime Cohen, built RL environment for the Retail Innovation Lab

Reinforcement Learning Research Assistant @WISE Lab

University of Waterloo

Apr 2021 - Aug 2021

Waterloo, Ontario

• Supervised by Prof. Krzysztof Czarnecki, integrated rule-based autonomous vehicle controller into RL environment

Project motivation

Although rule-based controllers may require more manual effort to set up than learning-based algorithms (e.g., RL) and run slower at inference time, they are human-interpretable, which allows them to be verified for safety. The broader goal of this project is to align rule-based and learning-based systems. The advantages are twofold: the alignment allows the pros and cons of RL to be studied more clearly, and it can help improve the rule-based system through rapid testing of edge cases in an RL environment.

Wise Move environment

Reinforcement Learning Research Assistant @UWECEML Lab

University of Waterloo

Jan 2021 - Aug 2021

Waterloo, Ontario

• Supervised by Prof. Mark Crowley, studied independent algorithms in the multi-agent setting
• Algorithms include: [Independent] DQN, PPO, SAC, [Multi-Agent] MADDPG, MAPPO, QMIX, COMA, DRON
• Paper accepted at NeurIPS 2021 Deep Reinforcement Learning Workshop

Abstract

Independent reinforcement learning algorithms have no theoretical guarantees for finding the best policy in multi-agent settings. However, in practice, prior works have reported good performance with independent algorithms in some domains and bad performance in others. Moreover, a comprehensive study of the strengths and weaknesses of independent algorithms is lacking in the literature. In this paper, we carry out an empirical comparison of the performance of independent algorithms on four PettingZoo environments that span the three main categories of multi-agent environments, i.e., cooperative, competitive, and mixed. We show that in fully-observable environments, independent algorithms can perform on par with multi-agent algorithms in cooperative and competitive settings. For the mixed environments, we show that agents trained via independent algorithms learn to perform well individually, but fail to learn to cooperate with allies and compete with enemies. We also show that adding recurrence improves the learning of independent algorithms in cooperative partially observable environments.

arXiv

Innovation Lab Developer

Interac

Sep 2020 - Dec 2020

Waterloo, Ontario

• Developed QR payment Interac app in Flutter for cross-platform compatibility, hosted on Firebase

Software Engineer

Wayfair

Jan 2020 - Mar 2020

Boston, Massachusetts

• Developed credit card application in Wayfair's Android app using Kotlin
• Decoupled monolithic code by utilizing Dependency Injection (DI) with Dagger
• Applied Clean Architecture with VIPER/MVP, increasing testability and reusability of code

Demo

Wayfair app (v5.52) from the Google Play Store on July 14, 2020

Reinforcement Learning Research Assistant @UWECEML Lab

University of Waterloo

Sep 2019 - Dec 2019

Waterloo, Ontario

• Built GUI library using Tkinter and Pillow for forest fire RL environment

Project motivation

Existing forest fire simulators are either too unrealistic or too slow. This simulator focuses on realistic modelling of forest fire behaviour, while ensuring that it is still sufficiently fast for RL research.

Software Product Prototyper

Deloitte

May 2019 - Aug 2019

Waterloo, Ontario

• Combined active learning with domain heuristics to reduce labelling costs
• Fine-tuned BERT using TensorFlow to perform transfer learning for contextual multiclass classification
• Built Graph DB in NetworkX to increase querying speed and flexibility for text-based insight generation
• Served model using Flask with Docker, deployed on EC2/ECS with Cognito for authentication

Project description

Problem
The lack of labelled data is one of the largest bottlenecks in the field of supervised machine learning.
Introduction
At DSpace, we wanted to train a text classifier that classifies sentences into more sections, in order to increase the precision of our text generation model. The challenge was that we had a lot of data, but none of it was labelled.
Goal
To minimize the cost of collecting labels for existing unlabelled datasets.
Solution
We reduced the cost of collecting labels by:
1) Reducing the amount of labelled data required:
By combining active learning and domain heuristics (in the form of labelling functions), we can ensure that the labels we are collecting are the most informative possible for the model, thus potentially reducing the number of labels required to have a functional classifier.
2) Generating insights from the labels collected:
We used a Graph Database to store all labels collected. Not only does it increase our querying speed and flexibility, but it also retains all semantics of the original texts, allowing us to generate insights.
3) Having an enticing web application:
Leveraging Docker and AWS services such as EC2/ECS, Cognito, CloudWatch, DynamoDB and S3, we were able to build a real-time label collection platform that also extracts domain knowledge from the labellers.
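
As an illustration of point 1, here is a minimal sketch of how active learning and labelling functions might be combined. It is not the project's actual code: the function names, the keyword heuristic, the "financial" label and the toy probabilities are all hypothetical. Sentences that a heuristic can label are set aside, and the remaining ones are ranked by prediction entropy so that labellers see the most uncertain sentences first.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def keyword_lf(sentence):
    """Hypothetical labelling function: returns a label, or None to abstain."""
    if "revenue" in sentence.lower():
        return "financial"
    return None


def select_for_labelling(pool, predict_proba, labelling_functions, budget):
    """Split the pool into heuristic-labelled sentences and the `budget`
    most uncertain sentences that should be sent to human labellers."""
    auto_labelled, candidates = [], []
    for sentence in pool:
        votes = [lf(sentence) for lf in labelling_functions]
        votes = [v for v in votes if v is not None]
        if votes:
            # Assumption: the first non-abstaining heuristic wins.
            auto_labelled.append((sentence, votes[0]))
        else:
            candidates.append(sentence)
    # Uncertainty sampling: highest-entropy predictions first.
    candidates.sort(key=lambda s: entropy(predict_proba(s)), reverse=True)
    return auto_labelled, candidates[:budget]


# Toy stand-in for a trained classifier's predicted class probabilities.
toy_probs = {
    "The weather was mild.": [0.5, 0.5],
    "Staff turnover was flat.": [0.9, 0.1],
}
pool = ["Revenue grew this quarter."] + list(toy_probs)
auto, to_label = select_for_labelling(
    pool, lambda s: toy_probs[s], [keyword_lf], budget=1
)
```

In this toy run, the heuristic labels the revenue sentence, and the sentence with the 50/50 prediction is the one queued for a human labeller.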

Developer

Interac

Sep 2018 - Dec 2018

Waterloo, Ontario

• Prototyped tokenization and verification of PIDs via QR/NFC on Android
• Hosted serverless backend architecture using AWS API Gateway with Lambda and Firebase Realtime DB
• Utilized Design Thinking to develop customer-centric workflows and applications

Project description

Why?
Develop next-NEXT-generation technology to blow the boundaries of fin-tech out of the water.

1) Point of Sale (POS) terminals are often too expensive for micro-business merchants. We want to leverage technology to allow Interac e-Transfers to be made easily through NFC/QR codes, eliminating the need for POS terminals.
• Hosted serverless architecture using AWS API Gateway with Lambda, S3, DynamoDB and Firebase
• Utilized design thinking methodology, ensuring that the user experience is seamless and convenient
• Developed Point of Sale terminal mobile apps, allowing e-Transfer via QR and NFC using Android Beam

2) Conventional methods of verifying age or ID are slow and reveal unneeded Personally Identifiable Information
• Developed Android app that uses NFC and QR code to securely verify digital cards. This ensures that only information that is required is revealed to the reader.

Junior Automation Developer

ThoughtWire

Jan 2018 - Apr 2018

Toronto, Ontario

• Developed production-ready Java libraries and maintained production test suites
• Deployed Alexa lighting control skill with zone localization using AWS Lambda and S3