Industry Experience
Senior Machine Learning Engineer @ Twitter.com [July 2019 - Present]
One of the founding engineers of the Conversations Quality team, which works on tweet reply ranking to drive meaningful conversations.
- Shipped novel features & predictive models based on Twitter's user engagement graph to drive >30% gains in key engagement metrics.
- Built a Light Ranker in the reply ranking service for graceful degradation and scaled the service to systematically rank tens of millions of candidates per second. This led to a 20% gain in p9999 latency while improving health metrics by 5% overall.
- Drove modernization of the ML stack by architecting a framework using Kubeflow to train next-generation native TensorFlow models. Utilized other GCP technologies like Dataflow and BigQuery to speed up end-to-end model training by 10x.
- Developed an ML-powered product feature for explainable reply ranking, improving reply health and number of follows by 3-5%.
- Influenced roadmaps with data-backed analyses & shipped measurement frameworks to quantify metric gains from ranking improvements.
Technologies: Python | Scala | Java | TensorFlow | Hadoop | Airflow | Kubeflow | Scalding | BigQuery | GCP
Software Development Engineer @ Amazon.com [July 2018 - July 2019]
Worked on the Amazon Expansions and Exports - Tech team, which enables customers to buy eligible products internationally. I was involved in projects around:
- Improving infrastructure scalability using native AWS technologies to speed up the eligibility calculation process.
- Improving the eligibility prediction process using Machine Learning models.
Technologies: AWS services | Java | Python | Jupyter Notebook
Side hustles at work:
- I was a part of Amazon’s Machine Learning University program, which aims to educate Amazon developers about ML and AI. I was a Teaching Assistant for the “Introduction to Data Science” and “Text Mining” courses.
- I was a reviewer for Amazon’s Machine Learning Conference (AMLC) 2019, reviewing submissions in the recommender systems domain.
- I presented my research on Product Size Recommendation internally at Amazon and it was also selected for presentation at Amazon’s Machine Learning Conference.
Software Development Engineer Intern @ Amazon.com [June 2017 - September 2017]
I interned in the DataForge team, which provides a platform for running Big Data operational workloads consistently within service level agreements, obviating the need to learn, set up, and manage Big Data technologies in order to support operational business use cases. I worked towards designing and implementing:
- Support for primary key constraints and batch inserts/updates while ensuring consistent reads in Hive using append-only tables and multi-version concurrency control concepts
- Support for transactionality in Hive
- Support for compaction (carefully discarding old data) without blocking other operations
This was particularly challenging as it entailed handling highly concurrent and complex scenarios arising due to the distributed nature of Hive and the fact that Hive is not designed to handle transactional data and operations.
Technologies: Java | Hive | DynamoDB
Member Technical @ Arcesium India Pvt. Ltd. [July 2015 - July 2016]
Arcesium spun out of the D. E. Shaw Group. I worked there in the Arcesium/Tech division as a primary developer for the STP (Straight Through Processing) team. Some of my important responsibilities included:
- Migration of Blotters’ (end-of-day trade report files) scripts from legacy to Java-based infrastructure while ensuring reusability and scalability.
- Adding support for self-sanitization, self-recovery and fault tolerance in the new infrastructure.
- Adding a self-aware triggering mechanism for Blotters, greatly minimizing data completeness issues.
- Creating and exposing various RESTful services to increase visibility into the system’s state.
- Profiling and optimizing (~40%) code (using concurrency) and database (using indexes and partitions).
- Providing support to users in case of aberrant system behavior.
Technologies: Java | Spring | MyBatis | SQL Server | Git
Research Experience
Graduate Researcher @ UC San Diego [April 2017 - June 2018]
Under Prof. Julian McAuley’s guidance, I worked on several user behavior modeling and NLP problems and published the following articles:
- WSDM 2020: Addressing Marketing Bias in Product Recommendations
- ACL 2019: Fine-Grained Spoiler Detection from Large-Scale Review Corpora
- RecSys 2018: Decomposing Fit Semantics for Product Size Recommendation in Metric Spaces
Research Intern @ Indian Institute of Technology, Madras [December 2014 - May 2015]
I worked under the guidance of Prof. Balaraman Ravindran and contributed to two research problems, focusing on the development of scalable Bayesian algorithms for Recommender Systems.
- Contributed to the development of a scalable Bayesian Matrix Factorization algorithm, which reduces the cubic time complexity of the existing algorithm to linear. Published in an ECML/PKDD 2015 workshop: Scalable Bayesian Matrix Factorization
- Contributed to the development of a scalable variational Bayesian framework for Factorization Machines, which supplements the existing framework with a scalable alternative. Preprint: Scalable Variational Bayesian Factorization Machine
Summer Fellow @ Indian Institute of Technology, Madras [May 2014 - July 2014]
I was a part of the Summer Fellowship Programme of IIT Madras and worked there under the guidance of Prof. Balaraman Ravindran in the field of Statistical Machine Learning. I did a project on Collaborative Tweet Recommendation, where I used Collaborative Filtering to efficiently recommend relevant tweets to users.