Simon Tietze
Verified Expert in Engineering
Data Scientist and Developer
Berlin, Germany
Toptal member since November 17, 2022
Simon is a data scientist with experience in deep learning, machine learning, statistics, big data, and method development. Over his career, he has worked in various fields, including adtech, molecular biology, telecommunication networks, and hardware reliability. Simon has built predictive machine learning systems, reporting dashboards, and in-depth analytical reports, ranging from small datasets to systems operating in real time with thousands of requests per second.
Experience
- Data Science - 20 years
- Neural Networks - 20 years
- Machine Learning - 20 years
- R - 20 years
- Deep Neural Networks (DNNs) - 10 years
- Deep Learning - 10 years
- TensorFlow - 5 years
- sparklyr - 5 years
Preferred Environment
Linux, RStudio, Python 3
The most amazing...
...project I've worked on is a mobile phone data-based population mobility analysis that provided information to several governments during the COVID-19 pandemic.
Work Experience
Principal Data Scientist | Co-founder
Exago Machine Learning
- Created an hourly population flow model for entire countries based on mobile phone data. The State of New York and the UK government used the model to track COVID-19 measures.
- Designed and implemented a deep learning-based user segmentation into around 50 groups, deployed at several thousand queries per second.
- Created and implemented a model that filters unprofitable traffic in an ad auction server early in the pipeline, reducing the client's cloud cost by roughly 20%.
Senior Data Scientist
BEN Energy
- Created customer churn models based on custom neural networks trained on censored time-to-event data. These models predicted the time until customer churn and could use partial information provided by active customers.
- Developed a SaaS predictive dashboard that provided customers with churn alerts and cross-selling recommendations.
- Presented complex modeling results to over 20 energy utility companies in interactive workshops.
Senior Data Scientist
Motorola Mobility
- Built a complex survival model integrating hardware properties with usage logs to investigate a newly released phone's high return rates, which turned out to be due to the high-end model's target audience, not the hardware.
- Implemented an R library that assembled a concise device history from manufacturing, QA, and sales data, connecting sources in Oracle, Apache Hadoop, and BigQuery. The library informed multiple reporting and modeling tasks.
- Supported product launches with data on early product returns by building R Markdown templates that provided reports within days of a product coming to market.
Head of Analytics
Aloqa (acquired by Motorola Mobility)
- Developed an end-to-end big data analytics solution from the mobile client through Hadoop to the web reporting front end.
- Created a randomized keep-alive algorithm to deliver instant push messages to mobile clients before Google and Apple created APIs that enable this.
- Developed an early microservice architecture to scale from thousands to millions of users within weeks.
Lead Developer
MoDeST
- Coordinated the development of a full-stack cheminformatics framework, including fingerprint, graph-based, ligand-ligand superpositioning, and protein/ligand docking methods.
- Implemented novel 3D visualizations for proteins based on OpenGL shaders, such as real-time ambient occlusion.
- Co-invented several novel techniques based on protein-ligand docking, e.g., inverting the normal process to look for molecular targets of known drugs.
Research Assistant
Ludwig Maximilian University of Munich
- Developed machine learning-based methods for automated diagnosis of vertigo-related diseases based on accelerometer recordings of upright stance.
- Worked on text mining, NLP, extensions of protein alignment to profile-profile alignment, and statistical approaches to validating lattice-based inference of text topics.
- Contributed to novel methods and applications in protein-ligand docking.
Experience
Population Mobility and Its Effect on the COVID-19 Pandemic in the US
We used a deep learning model to augment the mobility data with user age information. The model had previously been measured to be around 80% accurate across five age-group bins. The augmented data was then used in a Bayesian hierarchical model analysis to attribute infection spread to different age groups in each US state.
Education
Master's Degree in Computational Biology
Ludwig Maximilian University of Munich - Munich, Germany
Certifications
Certified SAFe 5 Agile Software Engineer
Scaled Agile, Inc.
Skills
Libraries/APIs
TensorFlow, ggplot2, PyTorch, REST APIs, Keras, Pandas, OpenGL
Tools
sparklyr, Ansible, BigQuery, MATLAB
Languages
R, Python, SQL, SQL-99, Bash, Python 3, C, Ruby, Java
Platforms
RStudio, Linux, Amazon Web Services (AWS), Databricks, Google Cloud Platform (GCP), Docker
Industry Expertise
Bioinformatics
Storage
Data Pipelines, Google Cloud, PostgreSQL, MySQL
Paradigms
ETL, Agile, Scrum, XP
Frameworks
Spark, RStudio Shiny, Hadoop
Other
Deep Learning, Data Science, Neural Networks, Machine Learning, Large Data Sets, Data Analytics, Data Visualization, Artificial Intelligence (AI), Predictive Modeling, Models, Communication, Modeling, Data Analysis, Product Analytics, Geospatial Data, Convolutional Neural Networks (CNNs), Deep Neural Networks (DNNs), Algorithms, Computational Biology, Data Manipulation, Data Extraction, Data Engineering, Data Reporting, Statistical Analysis, Statistical Modeling, Version Control Systems, A/B Testing, Product Development, Computer Vision, Bayesian Inference & Modeling, Google BigQuery, Recommendation Systems, Biology, Molecular Biology, Natural Language Processing (NLP), Signal Processing, Generative Pre-trained Transformers (GPT)