Olivier Manns
Verified Expert in Engineering
Data Engineer and Software Developer
Olivier is a big data engineer skilled in distributed processing, cloud architecture, data visualization, and machine learning. He has been processing terabytes of data as a team player to help hundreds of automotive engineers build your next smart vehicle. With experience in R&D centers, retail, and banking environments, Olivier knows how to make the most of your data.
Preferred Environment
Amazon Web Services (AWS), Data Pipelines, Jupyter Notebook, Visual Studio Code (VS Code), Unix, Azure, Python, Data Build Tool (dbt)
The most amazing...
...cloud data platform I've built has made it possible to track the state of health of thousands of vehicles and components across the globe, in real time.
Work Experience
Data Engineer
Clas Ohlson AB
- Implemented a production-ready data platform as infrastructure as code, relying on a big data stack for scaling, to enable company-wide data science, machine learning, and data services projects.
- Created a fully automated ELT system ingesting data from more than six sources. It ran daily with tests and alerts, using a mix of Azure services and dbt to keep costs low.
- Deployed and configured an "Airflow-like" orchestration tool (Prefect) to reduce manual work and ease the management of data and ML pipelines.
- Created and configured tools and data-related services for data scientists, data analysts, and business workers.
- Provided training and expert advice to the teams on data engineering concepts, cloud infrastructure, network security, and best practices.
Big Data Engineer
Continental
- Designed and deployed an entire scalable data platform for vehicle component real-time monitoring, from data collection to interactive visualization.
- Made it possible for 100+ automotive engineers to query terabytes of structured data daily within seconds.
- Implemented every piece of architecture as infrastructure as code (IaC) using Terraform for modularity and flexibility.
- Built streaming and batch data pipelines for real-time and specific data ingestion and analysis.
- Anticipated growth in data volume by taking advantage of serverless code, distributed processing with Spark (Scala), and queries with Athena (SQL).
- Plugged in an interchangeable data visualization tool to adapt to varied business needs.
- Implemented an automated data pipeline creation process for new projects and customers, requiring an engineer for custom needs only.
Machine Learning Engineer
Continental
- Developed many methods and machine learning models to improve the lifespan of Continental's components using vehicle data.
- Filed two patents for predictive diagnosis and failure prevention of vehicle components based on data acquisition and machine learning models.
- Created machine learning models and tuned feature engineering to improve physical models of engine behavior and pollutant emissions.
- Analyzed a large quantity of data for exploration and feasibility studies using Amazon EMR/EC2 with Spark ML/scikit-learn.
Data Engineer
Société Générale
- Wrote technical specifications and gave expert advice to a team of ETL developers.
- Developed ETL processes with complex transformations, loading terabytes of data into Teradata using both IBM DataStage and TPT scripts.
- Optimized many SQL queries for performance improvement (insert, update, and select).
- Managed the Teradata database to ensure availability, scalability, and performance.
- Resolved critical and complex issues and bugs in ETL pipelines, database management, and Unix systems.
Data Engineer
Thales
- Developed and improved ETL and BI processing on Oracle tools: ODI, OBI, and Oracle database (11g).
- Analyzed data integrity and custom calculation bugs.
- Optimized SQL queries for business stakeholders and general performance.
- Managed Oracle database to ensure availability, scalability, and performance.
- Resolved critical and complex issues and bugs in ETL pipelines, database management, and Unix systems.
Data Engineer
La Banque Postale
- Implemented a model for bank check fraud detection in the ETL step.
- Monitored metrics and traceability to ensure performance and data veracity.
- Optimized SQL queries for automatic insert, update, and select statements.
Data Miner
CEA
- Connected and supported scientific researchers and industrial companies to commercialize CEA's research.
- Analyzed competitors through data mining and analysis of patents and scientific publications.
- Analyzed and confirmed the patentability of CEA scientific inventions compared to state-of-the-art techniques.
Experience
Distributing Calculation of Rainflow-counting Algorithm with Spark
This algorithm is time-series oriented: each step depends on the samples that came before it, so it cannot be easily distributed over a cluster of workers. This results in long processing times and very limited scalability. The input data consists of vehicle IDs, component IDs, and multiple sensor signals sampled every 20 milliseconds, over thousands of hours of driving.
By taking advantage of the specificity of the data, understanding the automotive engineers' real needs, partitioning the data effectively, and re-implementing the algorithm, I parallelized this processing with Apache Spark. By accepting a 0.03% mean error on the results, I reduced the processing time from 28 hours to 5 minutes on the same cluster size.
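For readers unfamiliar with the algorithm, here is a minimal sketch in plain Python. It is an illustration only, not the production implementation: it uses the standard three-point rainflow counting rule and assumes that each (vehicle, component) series can be counted independently, which is only one possible partitioning. The function names and the record layout `(vehicle_id, component_id, timestamp, value)` are hypothetical, and the actual partitioning scheme and the source of the accepted error are not described here.

```python
from collections import Counter, defaultdict


def reversals(series):
    """Reduce a signal to its turning points (local extrema plus both ends)."""
    points = []
    for x in series:
        if points and x == points[-1]:
            continue  # skip plateaus
        if len(points) >= 2 and (points[-1] - points[-2]) * (x - points[-1]) > 0:
            points[-1] = x  # still moving in the same direction: extend the run
        else:
            points.append(x)
    return points


def count_cycles(series):
    """Three-point rainflow counting. Returns {range: cycle count}."""
    counts = Counter()
    stack = []
    for point in reversals(series):
        stack.append(point)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])  # most recent range
            y = abs(stack[-2] - stack[-3])  # previous range
            if x < y:
                break
            if len(stack) == 3:
                # y contains the starting point: count it as a half cycle
                counts[y] += 0.5
                stack.pop(0)
            else:
                # y is a closed full cycle: drop its two points, keep the last
                counts[y] += 1.0
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    # whatever is left on the stack counts as half cycles
    for a, b in zip(stack, stack[1:]):
        counts[abs(b - a)] += 0.5
    return counts


def count_by_key(records):
    """Group samples per (vehicle, component) and count each group on its own.

    Assumption: groups are independent, so each one could run on a different
    worker. `records` holds (vehicle_id, component_id, timestamp, value) tuples.
    """
    groups = defaultdict(list)
    for vehicle_id, component_id, timestamp, value in records:
        groups[(vehicle_id, component_id)].append((timestamp, value))
    return {
        key: count_cycles([value for _, value in sorted(samples)])
        for key, samples in groups.items()
    }
```

Because each key is processed without reference to any other, the per-key step is the kind of work that a cluster framework such as Spark can spread across workers by grouping on the key. Splitting a single long series into time chunks is harder, since cycles can span chunk boundaries.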
Skills
Languages
SQL, Python, Scala, Go, SAS
Tools
Amazon Athena, Terraform, IBM InfoSphere (DataStage), Shell, GitHub, Oracle Business Intelligence Applications (OBIA), Azure Machine Learning, Azure Logic Apps
Paradigms
ETL, Parallel Computing, Business Intelligence (BI), Data Science, Distributed Computing, Azure DevOps
Platforms
Amazon Web Services (AWS), Oracle Data Integrator (ODI), Oracle Database, Unix, Jupyter Notebook, AWS Lambda, Azure, Azure Synapse, Docker, Visual Studio Code (VS Code)
Storage
Data Pipelines, Amazon S3 (AWS S3), Databases, Teradata, Microsoft SQL Server, Datastage
Other
Data Engineering, Data Architecture, ELT, Data, Data Warehousing, Data Modeling, Data Warehouse Design, ETL Tools, Query Optimization, Data Visualization, Big Data, Data Analysis, Amazon Kinesis, Azure Data Factory, Data Build Tool (dbt), Machine Learning, Serverless, Metabase, Orbit Intellixir, Azure Databricks, Azure Virtual Machines
Frameworks
Apache Spark
Libraries/APIs
PySpark, Keras, Scikit-learn, Pandas, Spark ML, NumPy
Education
Master's Degree in Industrial Engineering
ENSIACET, part of Grandes Écoles of Engineering - Toulouse, France
Bachelor of Science Degree in Mathematics and Physics
Lycée Bellevue - Toulouse, France
Certifications
Certified Oracle BI 12 Administrator
Oracle