Data Science Engineer
2022 - PRESENT
BCG - Gamma
- Integrated and standardized multiple data sets from various vendors into the Snowflake Lighthouse data warehouse, enhancing the data quality and accessibility for machine learning models used by BCG case teams.
- Developed and maintained robust Airflow DAGs, ensuring daily data loads were efficient and accurate, resulting in improved database performance and reliability.
- Utilized a diverse tech stack including Python, Airflow, AWS S3, and Snowflake, effectively streamlining data engineering processes and delivering reliable, scalable solutions to support BCG's machine learning initiatives.
- Collaborated with cross-functional teams to identify data requirements and implement data engineering best practices, contributing to the successful completion of various projects and driving measurable value for BCG case teams.
Technologies: Python, Scikit-learn, Spark, Data Visualization, Snowflake, Amazon S3 (AWS S3), Apache Airflow
Data Scientist
2021 - 2022
Campus Coach
- Collaborated with Campus Coach, a training app for runners, to develop targeted marketing campaigns by identifying and segmenting their free user base.
- Employed scikit-learn and Python to create a propensity score model, which helped predict the likelihood of users converting to paid subscriptions.
- Utilized the model's insights to assist Campus Coach in tailoring marketing efforts, resulting in more effective campaigns and increased user conversion rates.
Technologies: Python 3, FastAPI, MongoDB, Docker
Machine Learning Engineer
2021 - 2022
GoCoupons
- Developed a system to read and process grocery invoices for GoCoupons.ca, utilizing Google Cloud AI's Vision and Natural Language APIs with the Python SDK for product and banner detection.
- Integrated GPT-3.5 Turbo via the OpenAI API to extract every product from the invoice images, improving the accuracy of the overall data extraction process.
- Collaborated with the couponing company to implement this solution, resulting in a more efficient and automated product recognition system.
Technologies: Python 3, OCR, Entity Extraction, Data Extraction, Natural Language Processing (NLP), Google Cloud AI, Google Cloud, OpenAI GPT-3 API, Artificial Intelligence (AI), Computer Vision, Text Recognition
Machine Learning Engineer
2019 - 2020
Equifax
- Designed and implemented a machine learning-based system to predict the real estate value of over 2 million properties across Canada.
- Utilized the XGBoost algorithm to build an accurate and efficient prediction model, significantly enhancing the property valuation process.
- Enabled data-driven decision-making for investors, property owners, and real estate professionals by providing reliable property value estimates.
Technologies: Python 3, XGBoost, Azure, Scikit-learn
Data Scientist
2019 - 2020
King & Partners
- Collaborated with a US-based hotel chain to identify and target high-value customers through the analysis of booking data and CRM records.
- Implemented CLV (Customer Lifetime Value) predictions and segmentation using Python and scikit-learn, enabling the hotel chain to focus on retaining their most valuable guests.
- Leveraged the insights gained from the analysis to inform marketing and customer service strategies, ultimately enhancing guest satisfaction and loyalty.
Technologies: Python, SQL, Machine Learning, Data Analytics, Data Science, Google Cloud Platform (GCP)
Data Scientist
2017 - 2019
JLR Solutions Foncières
- Developed, integrated, improved, and maintained a machine learning model that estimates the market value of houses in Canada, using LightGBM in Python and SQL for the ETL process.
- Built a housing price index based on the three-stage least-squares regression methodology of Case and Shiller (1987), using Python and Oracle SQL.
- Wrote reports on the state of the real estate market and produced econometric analyses based on real estate microdata. Communicated the findings and was interviewed on radio and in newspapers about these studies.
Technologies: Python, R, Machine Learning, Data Analytics, Data Science, Google Cloud Platform (GCP)