Leopoldo Corona
Verified Expert in Engineering
Machine Learning Engineer and Developer
Guadalajara, Mexico
Toptal member since April 27, 2020
Leopoldo is an AWS Certified Machine Learning (ML) specialist who has worked across the full range of data-related roles. He started his career early as a data research analyst and then became a data scientist, developing risk and fraud assessment models. When Leopoldo began struggling to bring those models to production, he transitioned to an ML position focused more on data-centric engineering, thus becoming a data engineer. Currently, Leopoldo is the head of data engineering at Clara, a startup.
Preferred Environment
Amazon Web Services (AWS), Python, Databricks, Scala
The most amazing...
...thing I've developed is an identity verification model that extracted and matched faces from an ID card and a selfie with over 95% accuracy.
Work Experience
Head of Data and ML Engineering
Clara
- Built the data engineering team from scratch and grew it to a team of 9+ engineers.
- Developed Clara's global data lake and data lakehouse. Started with Redshift and AWS Glue and migrated all processes and ETLs to Databricks.
- Delivered global data for insights and reporting. Reported directly to the director of data.
Senior Data Scientist
PayClip
- Deployed a fraud assessment model to production, which reduced fraudulent transactions by over 50%.
- Utilized Databricks notebooks and Snowflake to conduct analysis and report on fraud and risk key performance indicators (KPIs).
- Oversaw stakeholder requirements and delivered presentations.
Machine Learning Engineer
Kavak
- Developed and improved feature engineering jobs for the ML models to consume.
- Provided analysis support for credit risk, the financial branch of the company.
- Provided analysis and development support for computer vision projects.
Lead Data Scientist | ML Engineer
Kueski
- Developed and deployed into production a fraud prevention model for a loan application streaming process that saved more than 10% a month in losses.
- Monitored and continuously improved the fraud prevention model to keep its performance from dropping by more than 1%.
- Coached junior team members by transferring fraud modeling knowledge and sharing general know-how.
- Proposed a standardized project template for data science model service repositories that made model deployment 80% more efficient and experiment-trackable.
- Led a high-performance ML engineering team and proposed a balanced team workflow based on a WIP-limited Kanban (Agile) methodology. This proposal increased the team's productivity by 100%.
- Developed a face-image-matching deep-learning model with over 95% accuracy when verifying our client's identity.
- Worked on a feature store project using Hopsworks and Databricks.
Data Scientist
Intelimetrica
- Co-developed the nearest-neighbors model used in the company's main platform product to show the houses most similar and geographically closest to the selected property. This model helped detect anomalies in house appraisals.
- Collaborated in the continuous improvement of the house-pricing prediction model for the two main clients of the firm.
- Created a model to predict optimal delivery routes as a proof of concept (PoC) for a client, with potential savings of over 30% in logistics expenses.
- Co-led the data science team while reporting directly to the CEO.
Research Assistant
UNAM Physics Institute
- Co-authored a conference paper on research where I implemented both the independent image reconstruction and the image registration optimization using affine transformations combined with a non-linear transformation.
- Helped with the preclinical studies by preparing and configuring the microCT unit, which produced over 2 GB of data in every study.
- Conducted research in medical imaging physics, processing more than 500 GB of data on the university supercomputer cluster.
Research Intern
National Institute of Neurology and Neuroscience
- Spearheaded the development of a CT and PET brain atlas on a healthy Mexican population to help improve automatic digital segmentation for radiotherapy and radiosurgery.
- Helped with dosimetry measurements in radiotherapy and radiosurgery sessions.
- Supported experimental setups and data analysis to calibrate the radiotherapy and radiosurgery equipment based on measurement data.
Experience
Optimization of Dual-energy Subtraction for Preclinical Studies Using a Commercial MicroCT Unit
Our investigation used an Albira ARS commercial unit, which was not designed explicitly for quantitative CT tasks. Dual-energy (DE) subtraction was divided into stages that were independently analyzed: acquisition, volume reconstruction, image registration, and image weighting. The DE radiological techniques (low- and high-energy) had been previously optimized to enhance the visualization of iodine-based contrast medium (CM).
An independent reconstruction was needed to guarantee linearity between iodine intensity and its concentration for the high-energy acquisition; it also reduced structured noise occasionally produced by the microCT reconstruction software over uniform regions and improved bone visualization. Image registration was optimized by combining an affine transformation with a non-linear transformation determined with the Free-Form Deformation algorithm.
Two subtraction weight factors were identified: one that maximized the contrast-to-noise ratio (CNR) of iodine mixed with soft-tissue-equivalent resin and another that minimized CNR between bone-like rods and soft-tissue-equivalent material.
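The weight-factor search described above can be sketched roughly as follows. This is only an illustration, not the study's implementation: it assumes the common weighted-subtraction form `high - w * low`, defines CNR as the absolute difference of region means divided by the background standard deviation, and uses made-up pixel values. All function and variable names are hypothetical.

```python
from statistics import mean, stdev


def de_subtract(high, low, w):
    """Weighted dual-energy subtraction, pixel by pixel (assumed form: high - w * low)."""
    return [h - w * l for h, l in zip(high, low)]


def cnr(signal, background):
    """Contrast-to-noise ratio: |difference of means| over background standard deviation."""
    return abs(mean(signal) - mean(background)) / stdev(background)


def best_weight(roi_high, roi_low, bg_high, bg_low, weights, maximize=True):
    """Scan candidate weights and return the one with the highest (or lowest) CNR."""
    def score(w):
        return cnr(de_subtract(roi_high, roi_low, w), de_subtract(bg_high, bg_low, w))
    return max(weights, key=score) if maximize else min(weights, key=score)


# Made-up pixel samples for a region of interest and a soft-tissue background.
roi_high, roi_low = [120, 122, 118, 121], [200, 203, 197, 201]
bg_high, bg_low = [100, 102, 98, 101], [100, 99, 101, 102]
weights = [0.0, 0.25, 0.5, 0.75, 1.0]

w_max = best_weight(roi_high, roi_low, bg_high, bg_low, weights, maximize=True)
w_min = best_weight(roi_high, roi_low, bg_high, bg_low, weights, maximize=False)
```

Calling `best_weight` with `maximize=True` corresponds to the iodine case and `maximize=False` to the bone-suppression case; a real study would scan weights over full image regions rather than a handful of samples.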
Intelimétrica Banca
This platform featured a house pricing model, with a KPI of less than 5% error, and a model that finds similar, geographically close houses to help prevent fraud in house appraisals. The similar-houses model provided a similarity score as a secondary indicator of the quality of the appraisal.
I co-developed both machine learning models and contributed to their operationalization, working closely with the engineering team.
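A similar-houses lookup of this general kind could be sketched as below. This is a toy illustration under stated assumptions, not the platform's model: the haversine distance filter, the `max_km` radius, the `area_m2` field, and the similarity score based only on floor area are all hypothetical choices made for the example.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def similar_nearby_houses(target, houses, k=3, max_km=2.0):
    """Return up to k (score, house) pairs within max_km of the target, most similar first.

    The score is a hypothetical value in (0, 1] based only on the relative
    difference in floor area; a real model would use many more features.
    """
    candidates = []
    for house in houses:
        dist = haversine_km(target["lat"], target["lon"], house["lat"], house["lon"])
        if dist <= max_km:
            rel_diff = abs(house["area_m2"] - target["area_m2"]) / target["area_m2"]
            candidates.append((1.0 / (1.0 + rel_diff), house))
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return candidates[:k]


# Made-up example data.
target = {"id": "T", "lat": 19.4326, "lon": -99.1332, "area_m2": 100}
houses = [
    {"id": "A", "lat": 19.4326, "lon": -99.1332, "area_m2": 110},
    {"id": "B", "lat": 19.4330, "lon": -99.1332, "area_m2": 100},
    {"id": "C", "lat": 20.0000, "lon": -99.1332, "area_m2": 100},  # too far away
]
result = similar_nearby_houses(target, houses, k=2)
```

An appraisal that differs sharply from the values of its highest-scoring neighbors could then be flagged for review, which is the role the similarity score plays as a secondary quality indicator.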
Face Similarity Identification Model
I proposed and developed an application that extracted faces from the applicant's images and fed them to a model that returned the probability of their belonging to the same person, verifying the loan applicant's identity automatically. The model was developed and trained from scratch using proprietary data and achieved state-of-the-art performance.
Education
Master's Degree in Informatics and Applied Mathematics
Higher School of Economics - Moscow
Bachelor of Science Degree in Engineering Physics
Monterrey Institute of Technology and Higher Education - Monterrey, Mexico
Certifications
AWS Certified Machine Learning - Specialty
Amazon Web Services
Skills
Libraries/APIs
Scikit-learn, XGBoost, Keras, Pandas, NumPy, Matplotlib, CatBoost, Dask, PySpark, PyTorch
Tools
Jupyter, Jira, AWS Glue, Seaborn, Git, MATLAB, ITK, Apache Airflow, Ansible, Amazon SageMaker, Spark SQL
Languages
Python, SQL, Bash, Python 3, Scala, Snowflake
Paradigms
ETL
Platforms
Amazon Web Services (AWS), AWS Lambda, Databricks, Apache Kafka
Storage
MySQL, PostgreSQL, Data Validation, Redshift
Frameworks
LightGBM, Spark
Other
Machine Learning, Data Science, Model Validation, Technical Consulting, Data Modeling, Data Analysis, Deep Learning, Data Engineering, Leadership, Algorithms, Engineering, Physics