Sebastián Castaño, Developer in Berlin, Germany

Sebastián Castaño

Verified Expert in Engineering

Data Scientist and ML Developer

Location
Berlin, Germany
Toptal Member Since
September 13, 2021

Sebastián has a PhD in machine learning and data science and a decade of experience in interdisciplinary projects in medicine, banking, marketing, and consumer products, among others. His expertise includes designing data collection systems, analyzing and modeling complex data, and developing and deploying ML pipelines. As a seasoned researcher and educator, Sebastián consistently delivers compelling data-driven insights and intuitive tools for technical and non-technical colleagues.

Portfolio

Stealth Startup
GPT, Generative Pre-trained Transformers (GPT)...
Global Food and Beverage Corporation
Python, Machine Learning, SQL, TensorFlow, Docker, Amazon Web Services (AWS)...
D-fine
Machine Learning, SQL, Python, R, Java, Neural Networks, Statistical Modeling...

Experience

Availability

Part-time

Preferred Environment

Windows, Linux, Spyder, PyCharm, Jupyter Notebook, Scikit-learn, Visual Studio Code (VS Code), Git, Docker

The most amazing...

...project I've developed is an ML-based, closed-loop system for optimizing brain stimulation therapy in Parkinson's disease and essential tremor patients.

Work Experience

Co-founder

2022 - PRESENT
Stealth Startup
  • Co-founded a company that develops a knowledge management framework for IT teams.
  • Developed and implemented a system for the analysis of multi-domain corpora using transformer-based NLP models.
  • Designed and deployed AWS infrastructure for serving a neural search engine.
Technologies: Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Amazon Web Services (AWS), SQL, Docker, Generative Systems, OpenAI, Text Processing

Machine Learning Engineer

2021 - 2022
Global Food and Beverage Corporation
  • Set up and rolled out an existing media mix model for a new geographical market and product family as a member of an MLOps team.
  • Designed, implemented, and deployed a PoC for a media mix model based on Bayesian statistical modeling as a member of an ML R&D team.
  • Led a team of five ML engineers and data scientists that developed, benchmarked, productized, and deployed a next-generation media mix model for a marketing team.
Technologies: Python, Machine Learning, SQL, TensorFlow, Docker, Amazon Web Services (AWS), Bayesian Statistics

Consultant

2021 - 2021
D-fine
  • Validated credit risk and accounting models for a large German bank.
  • Performed predictive and prescriptive statistical analysis of soccer players' data for injury prediction and talent development for a Bundesliga team.
  • Deployed a data management system for a large European bank.
  • Developed MLOps pipelines, including architecture optimization for neural networks, for an in-house project.
Technologies: Machine Learning, SQL, Python, R, Java, Neural Networks, Statistical Modeling, Predictive Analytics, Bayesian Statistics, Data Analysis, Data Analytics, Data Science, Machine Learning Operations (MLOps), Scikit-learn, Probability Theory, PyTorch, Statistics, Statistical Methods, Seaborn, Matplotlib, Database Management Systems (DBMS), Data Warehousing, CI/CD Pipelines, Jupyter, Statistical Analysis, Artificial Intelligence (AI), NumPy, Consulting

Doctoral Research Assistant

2014 - 2020
University of Freiburg
  • Developed the first machine learning-based, closed-loop, deep brain stimulation system implemented in freely moving patients. The projects related to this achievement were carried out in close collaboration with clinicians and industry partners.
  • Established data-driven adaptive deep brain stimulation as a novel research field at the university.
  • Published seven research articles in peer-reviewed scientific journals and 10+ contributions to scientific workshops and conferences in the fields of machine learning, data science, and neuroscience.
  • Supported the machine learning lecture of the Master's in Computer Science program for five years by designing exercises and exams and tutoring students. The lecture averaged around 100 students per semester.
  • Supervised a team of 2-5 (paid) master's research assistants in their supporting tasks at the lab.
  • Supervised 15+ students on their master's and bachelor's theses in the research lab.
Technologies: Machine Learning, Digital Signal Processing, Data Science, Research, Neuroscience, Statistics, Data Analysis, Data Analytics, Writing & Editing, Matplotlib, Pandas, Scikit-learn, Time Series Analysis, Control Theory, Probability Theory, Linear Algebra, Deep Learning, Reinforcement Learning, Python, MATLAB, Technical Writing, PyTorch, Statistical Methods, Data Engineering, Seaborn, Neural Networks, Predictive Analytics, Statistical Modeling, Jupyter, AutoML, Artificial Intelligence (AI), NumPy, Consulting, Classification Algorithms

Research Assistant

2012 - 2014
National University of Colombia
  • Taught a course on analog electronics as the sole lecturer. In addition to preparing lectures, exercise sheets, and exams, I supervised the execution of students' projects.
  • Supervised one (paid) undergraduate teaching assistant for the lecture on analog electronics.
  • Developed new methods for source localization of neural signals, resulting in a scientific publication in a peer-reviewed journal.
Technologies: Bayesian Statistics, Linear Algebra, Research, Neuroscience, MATLAB, Python, Probability Theory, Statistical Analysis, Artificial Intelligence (AI), NumPy

Research Discovery Engine Based on NLP Methods

Conception and development of a framework for knowledge management and discovery in machine learning teams. The product is framed as a discovery engine for machine learning research, tackling the problem of information overload, such as the more than 100 machine learning papers uploaded to arXiv daily.

Key Activities
• Implemented a PoC system consisting of a discovery engine for machine learning research using large language models.
• Designed and deployed cloud infrastructure serving the discovery engine.
• Created a business model and go-to-market strategy.
• Conducted user discovery and development interviews with more than 50 interviewees.

Media Mix Model for Consumer Products

Development, implementation, and deployment of a marketing mix model based on Bayesian statistical modeling for the company's global operations.

Key Activities
• Implemented a learning model based on state-of-the-art research papers.
• Customized the model based on specific properties of the data available.
• Deployed the model to the cloud to be used by the MLOps team.
• Improved the model iteratively using feedback from the business unit.
• Presented results to several non-technical stakeholders in the business unit.

ML-based Adaptive Deep Brain Stimulation System for Essential Tremor Patients

https://www.frontiersin.org/articles/10.3389/fnhum.2020.541625/full
The first machine learning-based adaptive deep brain stimulation system implemented in freely moving essential tremor patients. This project was executed in cooperation with colleagues at the University of Washington.

Key Activities
• Established the cooperation between our research lab at the University of Freiburg (brain state decoding lab) and the University of Washington (biorobotics lab).
• Conceived and developed the underlying machine learning, control, and digital signal processing methods.
• Deployed the algorithms on a host PC and the embedded system of the patients' neurostimulators.
• Executed the data collection experiments.
• Performed offline analysis of the collected data.
• Wrote and edited the final manuscript for a peer-reviewed publication.

Data Analysis of Injury Data in a Soccer Team

Descriptive and prescriptive analysis of injury data for a Bundesliga soccer team. After I analyzed the data and presented the results to all the stakeholders, including non-technical personnel, the project became part of the team's research department to investigate injury prevention and talent development.

Key Activities
• Prepared and cleaned the data from several databases.
• Conducted descriptive and prescriptive analyses of the data.
• Presented the results to all stakeholders, including non-technical personnel.

Deployment of a Data Management System

Handover of a data management system that a large European bank used for the warehousing and analysis of credit risk data.

Key Activities
• Configured and deployed UAT and production environments.
• Implemented a CI/CD pipeline for the back and front end.

Decoding Parkinson's Disease Symptoms from Brain Signals

https://www.sciencedirect.com/science/article/pii/S2213158220302138
A novel supervised ML approach for decoding the intensity of Parkinson's disease symptoms from brain signals. We collected data from seven patients undergoing deep brain stimulation therapy and showed that our novel ML approach improved the decoding performance of the symptoms.

Key Activities
• Designed and executed the data collection experiments.
• Preprocessed data and performed exploratory data analysis.
• Conceived and implemented the novel ML method.
• Validated the novel ML method on the collected data and benchmarked it against state-of-the-art models, including deep convolutional neural networks.
• Applied AutoML for hyperparameter optimization of all considered models.
• Wrote and edited the final manuscript published in a peer-reviewed journal.

Data Augmentation Framework for ML in Neuroscience

https://www.frontiersin.org/articles/10.3389/fninf.2019.00055/full
The novel framework developed in this project allows for an objective evaluation and benchmarking of novel ML algorithms for analyzing neurological data.

We tackled the following challenges:
• Scarcity of data available when using data-driven methods in the analysis of brain signals.
• Unreliability of the available labels.
• High level of noise in the raw signals.

Key Activities
• Conceived the idea.
• Executed the data analysis.
• Wrote the scientific manuscript published in a peer-reviewed journal.

Languages

Python, SQL, R, Java, C#

Libraries/APIs

Scikit-learn, Matplotlib, Pandas, NumPy, PyTorch, React, TensorFlow

Tools

Spyder, MATLAB, Git, Seaborn, Jupyter, AutoML, PyCharm, Excel 2010

Paradigms

Data Science, User Acceptance Testing (UAT)

Platforms

Windows, Linux, Jupyter Notebook, Docker, JBoss EAP, Amazon Web Services (AWS), Visual Studio Code (VS Code)

Other

Digital Signal Processing, Programming, Time Series Analysis, Linear Algebra, Machine Learning, Neuroscience, Deep Learning, Research, Technical Writing, Statistics, Statistical Methods, Data Analytics, Data Analysis, Data Engineering, Neural Networks, Statistical Modeling, Predictive Analytics, Writing & Editing, Statistical Analysis, Artificial Intelligence (AI), Consulting, Classification Algorithms, Generative Pre-trained Transformers (GPT), Probability Theory, Reinforcement Learning, Bayesian Statistics, Algorithms, Electronics, Natural Language Processing (NLP), OpenAI, Text Processing, GPT, Circuit Design, Control Theory, Calculus, Machine Learning Operations (MLOps), Data Warehousing, CI/CD Pipelines, Generative Systems, Large Language Models (LLMs), Business Planning, IT Project Management, Natural Language Understanding (NLU), Customer Research

Storage

Database Management Systems (DBMS)

2014 - 2020

PhD in Computer Science

University of Freiburg - Freiburg, Germany

2012 - 2014

Master's Degree in Engineering

National University of Colombia - Manizales, Colombia

2007 - 2012

Engineer's Degree in Electronics Engineering

National University of Colombia - Manizales, Colombia
