Machine Learning Engineer II @ Twitter.com [July 2019 - Present]
Working in the Timelines Quality team. The team’s mission is to show users the content they care about by building relevance and machine learning models and systems. Every time users see new tweets, nearly half a billion daily tweets are evaluated to organize and deliver the best timeline experience.
- Working on building end-to-end Machine Learning pipelines.
- Working on engineering better features and models to improve offline and online metrics to increase user satisfaction.
- Performing Data Science analysis to identify potential problems and their impact on user satisfaction.
- Developing ML tooling using BigQuery and GCP to speed up the exploratory analysis process.
Technologies: Python | Scala | Scalding | Hadoop | Airflow | BigQuery | GCP | TensorFlow
Software Development Engineer @ Amazon.com [July 2018 - July 2019]
Worked on the Amazon Expansions and Exports Tech team, which enables customers to buy eligible products internationally. I was involved in projects around:
- Improving infrastructure scalability using native AWS technologies to speed up the eligibility calculation process.
- Improving the eligibility prediction process using Machine Learning models.
Technologies: AWS services | Java | Python | Jupyter Notebook
Side hustles at work:
- I was part of Amazon's Machine Learning University program, which aims to educate Amazon developers about ML and AI. I was a Teaching Assistant for the "Introduction to Data Science" and "Text Mining" courses.
- I was a reviewer for Amazon's Machine Learning Conference (AMLC) 2019, reviewing submissions in the recommender systems domain.
- I presented my research on Product Size Recommendation internally at Amazon and it was also selected for presentation at Amazon's Machine Learning Conference.
Software Development Engineer Intern @ Amazon.com [June 2017 - September 2017]
I interned in the DataForge team, which provides a platform for running Big Data operational workloads consistently within service-level agreements, obviating the need to learn, set up, and manage Big Data technologies in order to support operational business use cases. I worked towards designing and implementing:
- Support for primary key constraints and batch inserts/updates while ensuring consistent reads in Hive, using append-only tables and multi-version concurrency control concepts.
- Support for transactionality in Hive.
- Support for compaction (carefully discarding old data) without blocking other operations.
This was particularly challenging as it entailed handling highly concurrent and complex scenarios arising due to the distributed nature of Hive and the fact that Hive is not designed to handle transactional data and operations.
Technologies: Java | Hive | DynamoDB
Member Technical @ Arcesium India Pvt. Ltd. [July 2015 - July 2016]
Arcesium spun out of the D. E. Shaw Group. I worked there in the Arcesium/Tech division as a primary developer for the STP (Straight Through Processing) team. My main responsibilities included:
- Migrating Blotter (end-of-day trade report file) scripts from legacy to Java-based infrastructure while ensuring reusability and scalability.
- Adding support for self-sanitization, self-recovery and fault tolerance in the new infrastructure.
- Adding a self-aware triggering mechanism for Blotters, greatly minimizing data completeness issues.
- Creating and exposing various RESTful services to increase visibility into the system's state.
- Profiling and optimizing (~40%) the code (using concurrency) and the database (using indexes and partitions).
- Providing support to users in cases of aberrant system behavior.
Technologies: Java | Spring | MyBatis | SQL Server | Git
Student Researcher @ UC San Diego [April 2017 - June 2018]
Under Prof. Julian McAuley's guidance, I worked on several user behavior modeling and NLP problems and published the following articles:
Research Intern @ Indian Institute of Technology, Madras [December 2014 - May 2015]
I worked under the guidance of Prof. Balaraman Ravindran and contributed to two research problems, focusing on the development of scalable Bayesian algorithms for Recommender Systems.
- Contributed to the development of a scalable Bayesian Matrix Factorization algorithm, which reduces the cubic time complexity of the existing algorithm to linear. Published in an ECML/PKDD 2015 workshop: Scalable Bayesian Matrix Factorization
- Contributed to the development of a scalable variational Bayesian framework for Factorization Machines, which supplements the existing framework with a scalable alternative. Preprint: Scalable Variational Bayesian Factorization Machine
Summer Fellow @ Indian Institute of Technology, Madras [May 2014 - July 2014]
I was part of the Summer Fellowship Programme at IIT Madras and worked under the guidance of Prof. Balaraman Ravindran in the field of Statistical Machine Learning. I did a project on Collaborative Tweet Recommendation, where I used Collaborative Filtering to efficiently recommend relevant tweets to users.