Camila Andrea Gonzalez Williamson, Developer in Ecublens, Switzerland

Camila Andrea Gonzalez Williamson

Verified Expert in Engineering

Bio

Camila is a senior software engineer with nine years of experience in data and machine learning (ML). She has deep experience analyzing and visualizing data, implementing and maintaining scalable data pipelines, prototyping and productionizing ML models, and communicating effectively with stakeholders. Camila has worked in domains such as manufacturing, finance, and telecommunications, where she has collaborated on end-to-end solutions to transform raw data into actionable insights.

Portfolio

Atinary Technologies
GitHub Actions, TypeScript, Plotly.js, HTML, CSS, Angular, Pytest, OpenAPI...
Philip Morris International
Scikit-learn, Presto, Jenkins, Dask, Pandas, Apache Spark, Docker, NetworkX...
Pictet Asset Management
StatsModels, Scikit-learn, Pandas, TensorFlow, Matplotlib, Seaborn...

Experience

  • Python - 9 years
  • Data Analytics - 9 years
  • Data Visualization - 8 years
  • Apache Spark - 7 years
  • Data Engineering - 7 years
  • Machine Learning - 5 years
  • Scala - 3 years
  • Amazon Web Services (AWS) - 2 years

Availability

Part-time

Preferred Environment

IntelliJ IDEA, JupyterLab, Amazon Web Services (AWS), Apache Airflow, Python, Scala, Hadoop, Apache Spark, Plotly, P5.js

The most amazing...

...project was building a new anomaly detection service that detects anomalies across multiple types of time series and sends alerts in near real time.

Work Experience

Data Scientist | Full-stack Software Engineer

2020 - 2020
Atinary Technologies
  • Designed, implemented, and deployed front- and back-end web applications to transfer, analyze, and visualize data.
  • Added features to existing web applications implementing a machine learning (ML) platform for orchestrating self-driving labs.
  • Designed a web application to create and modify interactive data visualizations.
Technologies: GitHub Actions, TypeScript, Plotly.js, HTML, CSS, Angular, Pytest, OpenAPI, Flask, SQLAlchemy, Alembic, PostgreSQL, Data Engineering, Data Science, Data Analytics, Python, Data Visualization, Data

Enterprise Data Scientist

2017 - 2020
Philip Morris International
  • Developed statistical analyses, propensity models, and scoring models to predict consumers' conversion to reduced-risk products.
  • Implemented a data-processing pipeline to cluster adoption patterns for reduced-risk products using distributed computing. This pipeline was deployed in 13 markets and brought tangible improvements to key performance indicators.
  • Industrialized a data pipeline to analyze specific global trends—using techniques such as hierarchical clustering, regression, and statistical inference—with an estimated value of tens of millions of dollars.
  • Designed, optimized, and implemented a methodology to evaluate similarities in a series of text documents to detect clusters of duplicates. Developed an API to serve the algorithm.
  • Trained, supported, and mentored interns and new data scientists joining the team and advocated for data science best practices, such as reproducible research, code versioning, use of Docker containers, and test-driven development (TDD).
Technologies: Scikit-learn, Presto, Jenkins, Dask, Pandas, Apache Spark, Docker, NetworkX, Microsoft Power BI, Plotly, XGBoost, CatBoost, StatsModels, Tree-Based Pipeline Optimization Tool (TPOT), Flask, HDFS, Apache Hive, Data Engineering, Data Science, Data Analytics, Python, Data Visualization, PySpark, Data, Data Pipelines, Apache Airflow
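The duplicate-detection work above evaluated similarities across a series of text documents. A minimal, illustrative sketch of one common approach — character shingling with Jaccard similarity — is shown below; the function names, sample documents, and threshold are hypothetical and not taken from the original project.

```python
def shingles(text, k=3):
    """Character k-shingles of a whitespace-normalized document."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity between two shingle sets."""
    return len(a & b) / len(a | b) if a | b else 1.0

def duplicate_pairs(docs, threshold=0.8):
    """Return index pairs whose shingle similarity meets the threshold."""
    sets = [shingles(d) for d in docs]
    return [(i, j)
            for i in range(len(docs))
            for j in range(i + 1, len(docs))
            if jaccard(sets[i], sets[j]) >= threshold]

docs = ["the quick brown fox", "the quick brown fox!", "an entirely different text"]
pairs = duplicate_pairs(docs)  # the first two documents form a near-duplicate pair
```

Pairwise comparison is quadratic in the number of documents; at scale, techniques like MinHash or locality-sensitive hashing are typically used to avoid comparing every pair.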

Data Science Intern

2017 - 2017
Pictet Asset Management
  • Performed an exploratory data analysis of internal and external fund flows, macroeconomic variables, and market indices to detect leading and lagging variables.
  • Implemented multiple models to predict market indices' performance, covering diverse asset classes and geographical regions, using a range of machine learning techniques: random forests, naive Bayes, Markov chains, SVMs, and LSTMs.
  • Conducted a rigorous statistical inference analysis to evaluate the performance of the models implemented using the Benjamini-Hochberg procedure to control the false discovery rate.
Technologies: StatsModels, Scikit-learn, Pandas, TensorFlow, Matplotlib, Seaborn, Jupyter Notebook, Data Science, Data Analytics, Python, Data Visualization, Data
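The Benjamini-Hochberg procedure used above to control the false discovery rate can be sketched in a few lines of plain Python. This is an illustrative implementation, not the analysis code from the project; the example p-values are made up.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return sorted indices of hypotheses rejected under the
    Benjamini-Hochberg procedure, which controls the false
    discovery rate at level alpha."""
    m = len(p_values)
    # Sort p-values in ascending order, remembering original indices.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    return sorted(order[:k_max])

# Example: p-values from five model-performance tests.
rejected = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.2])
# Only the first two survive FDR control at alpha = 0.05.
```

Note that every p-value below the largest passing threshold is rejected, even if it individually fails its own rank's threshold — this step-up behavior is what distinguishes the procedure from a simple per-test cutoff.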

Temporary Support for Data Science

2016 - 2016
Swissgrid
  • Researched state-of-the-art methodologies for short-term electric load forecasting.
  • Analyzed yearly, weekly, and daily patterns for the Swiss electric load as well as non-linear dependencies with the temperature.
  • Implemented a short-term forecast for the Swiss electric load using a state-of-the-art modification of least-squares support-vector machines.
Technologies: Mathematics, PostgreSQL, Pandas, SQL, Tableau, Matplotlib, SciPy, NumPy, Data Science, Data Analytics, Python, Data Visualization, Data

Analyst — Future Actuaries Program

2014 - 2015
Seguros Bolívar
  • Priced insurance products based on mortality tables and clients' data distribution.
  • Implemented forecasts based on the Monte Carlo simulation for sales strategies.
  • Developed a prototype to automate the monthly data risk profiling of one of the main insurance products.
Technologies: Mathematics, Python, PostgreSQL, SQL, Data Analytics, Data
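Monte Carlo simulation, as used above for sales-strategy forecasts, can be sketched with the standard library alone. All parameter names and values below are illustrative assumptions, not figures from the actual engagement.

```python
import random
import statistics

def monte_carlo_sales_forecast(current, growth_mean, growth_sd,
                               months=12, n_sims=10_000, seed=42):
    """Simulate many monthly growth paths and summarize the
    distribution of end-of-horizon outcomes."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_sims):
        value = current
        for _ in range(months):
            # Draw one month's growth rate from a normal distribution.
            value *= 1 + rng.gauss(growth_mean, growth_sd)
        outcomes.append(value)
    outcomes.sort()
    return {
        "median": statistics.median(outcomes),
        "p5": outcomes[int(0.05 * n_sims)],   # pessimistic scenario
        "p95": outcomes[int(0.95 * n_sims)],  # optimistic scenario
    }

# Hypothetical inputs: current sales of 100 units, 2% mean monthly
# growth with 5% volatility.
summary = monte_carlo_sales_forecast(100.0, 0.02, 0.05)
```

Reporting percentiles rather than a single point forecast is the main payoff of the simulation: it gives decision makers a range of plausible outcomes instead of one number.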

Experience

Data Pipeline for Global Trends

Industrialization and automation of a data pipeline to analyze specific global trends.

This was a multidisciplinary team effort that involved collecting external data sources, extensive data wrangling and text manipulation, applying data science techniques such as hierarchical clustering, regression, and statistical inference, and exposing the results via a dashboard accessible as a web application.

The estimated business value for this data product was in the order of tens of millions of dollars.

Consumer Segmentation

A data-processing pipeline to cluster adoption patterns for a specific line of products using distributed computing.

This was a team effort that involved the analysis of behavioral patterns in multi-channel customer data to identify actionable opportunities for improvement in the consumer journey. During the development, we integrated data from different sources, verified the data integrity, processed the data with Python and Spark (outlier treatment, filtering, aggregation, feature generation), generated insights from clustering and conversion models, and exposed the final results in a dashboard.

This project was deployed in 13 markets and brought tangible improvements to key performance indicators (KPIs) with estimated business value in the order of millions of dollars.

Short-term Forecast of the Electric Load

A short-term forecast for the Swiss electric load for the day-ahead or intraday market.

I led the implementation and evaluation of a novel machine learning technique for short-term load prediction used by the Swiss electric grid operator. The resulting model successfully captured seasonal patterns at the yearly, weekly, and daily levels as well as non-linear dependencies on temperature.
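A standard way to feed yearly, weekly, and daily seasonality into a load-forecasting model is to encode each cycle as a sine/cosine pair. The sketch below is illustrative of that general technique only — the actual Swissgrid model and its features are not reproduced here.

```python
import math

def seasonal_features(hour_of_year):
    """Encode yearly, weekly, and daily cycles for a given hour
    as sine/cosine pairs, usable as regression features."""
    periods = {"yearly": 8760, "weekly": 168, "daily": 24}  # hours per cycle
    feats = {}
    for name, period in periods.items():
        angle = 2 * math.pi * (hour_of_year % period) / period
        feats[f"{name}_sin"] = math.sin(angle)
        feats[f"{name}_cos"] = math.cos(angle)
    return feats

# At hour 0, every cycle starts: sin = 0, cos = 1 for all three periods.
feats = seasonal_features(0)
```

Unlike raw hour-of-day or day-of-week integers, the sine/cosine encoding is continuous across cycle boundaries (hour 23 sits next to hour 0), which matters for smooth models such as support-vector machines.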

Education

2015 - 2017

Master's Degree in Financial Engineering

École Polytechnique Fédérale de Lausanne (EPFL) - Lausanne, Switzerland

2009 - 2013

Engineer's Degree in Electrical Engineering

University of the Andes - Bogota, Colombia

2008 - 2012

Engineer's Degree in Electronics Engineering

University of the Andes - Bogota, Colombia

Certifications

NOVEMBER 2019 - PRESENT

How to Win a Data Science Competition by NRU HSE

Coursera

JUNE 2019 - PRESENT

Functional Programming Principles in Scala by EPFL

Coursera

NOVEMBER 2018 - PRESENT

Big Data Analysis with Scala and Spark by EPFL

Coursera

APRIL 2018 - PRESENT

Algorithmic Toolbox by UC San Diego and NRU HSE

Coursera

FEBRUARY 2018 - PRESENT

Big Data Analysis: Hive, Spark SQL, DataFrames, and GraphFrames

Coursera

NOVEMBER 2017 - PRESENT

Professional Scrum Developer I

Scrum.org

Skills

Libraries/APIs

PySpark, Matplotlib, Pandas, SQLAlchemy, OpenAPI, Plotly.js, CatBoost, XGBoost, NetworkX, TensorFlow, NumPy, SciPy, Dask, Scikit-learn, D3.js, REST APIs, P5.js

Tools

Plotly, Git, Slack, PyCharm, Pytest, Tree-Based Pipeline Optimization Tool (TPOT), StatsModels, Microsoft Power BI, Seaborn, Tableau, Jenkins, MATLAB, Apache Airflow, IntelliJ IDEA

Languages

Python, CSS, HTML, SQL, TypeScript, Scala

Frameworks

Apache Spark, Alembic, Flask, Angular, Presto, Hadoop

Platforms

Jupyter Notebook, Docker, Unix, Amazon Web Services (AWS), Visual Studio Code (VS Code)

Paradigms

Scrum, Functional Programming, Continuous Integration (CI), Test-driven Development (TDD), RESTful Development, Maintainability

Storage

PostgreSQL, Apache Hive, HDFS, Data Pipelines

Other

Data Visualization, Data Analytics, Data Engineering, Data Science, Data, Machine Learning, Statistical Inference, Classification Algorithms, Econometrics, Time Series Analysis, Big Data, Algorithms, Feature Engineering, Ensemble Methods, GitHub Actions, Text Classification, Regression Modeling, Full-stack, Mathematics, Energy, Markets, JupyterLab, Digital Electronics, Hardware, Computer Architecture, Probability Theory, Power Electronics, Reliability
