Senior Machine Learning Engineer | 2022 - PRESENT | Repustate
Technologies: Artificial Intelligence (AI), AWS S3, Algorithms, Deep Learning, Open Neural Network Exchange (ONNX), Hugging Face, gRPC, Protobuf, Git, Go, Python, PyTorch, Keras, MLflow, Amazon EC2, Docker, NumPy, Scikit-learn, Pandas
- Developed novel and interpretable deep learning solutions for text in PyTorch by implementing and improving multiple research paper algorithms.
- Implemented quantization strategies using ONNX, reducing model sizes by up to 5x and inference time by up to 3x.
- Designed and developed a new-generation gRPC microservices API that lets the main application, written in Go, communicate with Python deep learning servers.
- Reduced server size by up to 8x compared to the previous generation API and increased inference speed by 2x-3x on average for prediction tasks.
- Designed and developed AWS S3 schemas for production models and tokenizers.
- Developed custom deep learning Docker images and RPM packages for installation on on-premises RHEL/Linux servers. Tech used: RPM, AWS EC2, Docker, Git.
- Orchestrated a continuous labeled-data generation pipeline that extracts labeled text daily using SQL and stores it in a purpose-built S3 data lake for use in future language models.
- Designed and developed a custom MLflow tracking server to record and monitor experimentation results and artifacts.
- Managed a multi-client portfolio (Banking, Government, Healthcare, Marketing, Retail), led technical discussions on sales calls, and helped land over five clients.
Data Scientist | 2020 - PRESENT | Decathlon
Technologies: Python, SQL, Jenkins, Keras, TensorFlow, PyTorch, Scikit-learn, Pandas, NumPy, Flask, Google Data Studio, Redshift, AWS S3, Google Cloud Platform (GCP), Jenkins Pipeline, GitHub
- Developed an in-house data-visualization pipeline that replaced a licensed tool, saving $60,000 per year. Used SQL, Git, Jenkins, AWS cloud, Google Sheets API, and Google Data Studio.
- Prototyped an NLU solution for customer-review classification, keyword extraction, and sentiment analysis that outperformed a licensed tool, saving the marketing team $15,000 per year.
- Created a visual search engine that was deployed as a product retrieval API. It's currently being used for product recommendations.
- Built an unsupervised topic modeling solution for customer reviews, with visualization, using sentence transformers. Improved the original solution using GPT-2 and prompt engineering.
- Developed a store turnover forecasting tool using additive models and custom-made regressors (Prophet API).
- Engineered an NLU product-article recommendation solution as part of Decathlon's personalization strategy.
- Worked on data extraction, transformation, and loading tasks for each solution.
- Made a sustainability reporting tool to monitor the performance of second-life and eco-designed products.
- Built a color detection solution using k-means clustering to aid internal object detection models.
- Interviewed new data science candidates, actively contributed to the hiring process, and mentored new interns on various data-related tasks.
Machine Learning Intern | 2019 - 2019 | Decathlon
Technologies: Python, Algorithms, Keras, SQL, TensorFlow, Jenkins Pipeline, GitHub
- Developed and deployed a deep recommendation model for user-click item prediction using an LSTM RNN architecture.
- Surpassed the benchmarked precision, recall, and coverage metrics by improving the solution using attention models.
- Developed and deployed an object detection model using TensorFlow's API for a hockey-brand detection application.
Machine Learning Developer Intern | 2018 - 2018 | Societe Generale
Technologies: Tableau, Python, Django, MicroStrategy
- Developed a BI reporting tool using MicroStrategy.
- Contributed to data visualization projects using Tableau.
- Helped develop a web application using the Django framework.