Chief Data Scientist
2020 - PRESENT
Flat.mx
Technologies: Python 3, PostgreSQL, Dash, Plotly, Scikit-learn, TensorFlow, Docker, AWS ECS, GitHub
- Developed machine learning predictive models for real estate properties in Mexico City, including long and short-term rent prices and selling prices.
- Built a machine learning DevOps framework to easily deploy and update models on ECS.
- Developed an automated offer system that reduced the visit-to-offer lead time from 13 days to minutes.
- Created data visualization dashboards to democratize data and insights inside the company. Led the KPI efforts to measure every business unit, with continuous QA and improvement activities.
- Developed complex Airflow pipelines to curate data from multiple sources and formats, feeding the acquisition team with high-converting data and helping the business scale.
- Modeled the public transportation network of Mexico City to understand the access and centrality of different areas.
Director of Data Architecture and Data Analysis
2019 - 2020
Mexico City Government
Technologies: Python 3, Pandas, SQL, NetworkX, Plotly, Spark, SVMs, Scikit-learn, Neo4j
- Developed the security dashboard for the city’s police department, which is used daily for reporting, crime tracking, and decision-making.
- Created optimization algorithms for police distribution on the subway system, including a visualization tool with metrics and a graphic scheduler.
- Diagnosed ambulance emergency response times, identifying the three main root causes of delays. The actions implemented reduced the average response time by ten minutes.
- Mentored a team of five data analysts and scientists in best practices and product development methodologies. Implemented hands-on training to build the team's ETL pipeline and database capabilities.
- Created a data analytics team to assist the mayor and other city departments with decision-making.
- Produced and presented multiple exploratory data analyses to inform decision-makers regarding security, emergency response, and mobility.
Data Scientist | Researcher | Digital Analytics and Insights
2018 - 2019
Discovery, Inc.
Technologies: Pandas, Python 3, SQL, Apache Hive, NetworkX, Plotly, Dash, Tableau, Cron, Natural Language Processing (NLP), Gensim, SpaCy, Support Vector Machines (SVM), Scikit-learn
- Developed a data-product dashboard (Tableau) based on app reviews, including an automated ETL pipeline that retrieves new reviews and ratings. At its core, implemented a review text classification model using an SVM, achieving 80% accuracy.
- Developed the alarm dashboard in Tableau that helped track performance across different platforms and the Discovery ecosystem. It allowed exploring the daily performance of eight variables with an anomaly detection system.
- Created a bipartite graph of streamers and shows for audience clustering. Performed network analysis to understand how audiences overlap in each channel. This data product is used by marketing and programming to guide their business strategies.
- Implemented a network analysis framework to analyze a market research survey, and created a visualization tool with demographic filters that lets the team explore the survey results and helps them design future products.
- Developed the marketing campaign ROI dashboard (Tableau) to measure campaign performance. The dashboard visualizes the campaign lifecycle and forecasts the expected ROI.
Data Scientist
2017 - 2018
ARGO Labs, California Data Collaborative
Technologies: Dash, Python 3, Natural Language Processing (NLP), Statistics, Benchmarking
- Created an ETL pipeline that combines data from water utilities, a web scraper, and public APIs to identify the business type of a water user (commercial or institutional).
- Designed a classification method that uses data from APIs and NLP to assign a business type to each customer.
- Created an automated process to aggregate census data from the block-group level to the water-district level. This allowed the water utilities to understand their customers and the research team to run analyses using demographics.
- Led a team of four to develop a water usage benchmark in California using publicly available data and water utility data.