Valentin Lehuger
Verified Expert in Engineering
Software Developer
Paris, France
Toptal member since June 18, 2020
Valentin has seven years of experience in both startups and large French tech companies. He has mainly worked as a back-end data engineer with Scala and Python. He is also experienced with Hadoop and Spark, developing data pipelines, and architecting data warehouses to extract value from terabytes of data. Valentin was recently CTO of a YC tech startup, leading a team of 10 to build complex front-end software as well as a full data processing engine in the back end.
Preferred Environment
Git, MacOS, Visual Studio Code (VS Code), Scala, Python, TypeScript, JavaScript, Vue, Google Cloud Platform (GCP)
The most amazing...
...project I've worked on is the refactoring of Deezer's most critical ETL, which calculates the data used to compute recommendations, royalties, and analytics.
Work Experience
CTO
Actiondesk
- Developed a spreadsheet application connected to dozens of integrations to make data engineering accessible to non-technical business users and automate their dashboards and reporting.
- Conceived and led the implementation of the entire back end, serving as first committer on every service for a long time.
- Managed a tech team of up to 10 engineers, including front-end, back-end, DevOps, and QA engineers.
- Built dozens of connectors to databases and various APIs like Stripe, Hubspot, Google Analytics, and Quickbooks.
- Created a formula engine to compute Excel-like formulas. The project was written in Scala.js so that it runs in both the back end and the front end and produces exactly the same results.
- Maintained a Kubernetes cluster for two years until I hired a DevOps engineer, whom I then managed.
- Implemented a feature to share reports with charts across multiple channels, such as Slack and email.
Data Engineer
Deezer
- Developed and maintained the core ETLs in Scala Spark and streaming pipelines with Kafka and Spark.
- Built streaming pipelines processing 2.5TB per day to support 50+ engineers, analysts, scientists, and product managers.
- Managed data warehousing on HDFS in ORC, Parquet, and Avro formats.
- Developed our own scheduler in Python that runs 2,000 jobs per day.
Data Engineer
Artefact
- Developed ETLs in PySpark in collaboration with data scientists.
- Led, as main contributor, the development of the internal data collection software processing 500GB per day.
- Performed R&D for a stream processing project using Storm and Kafka.
Data Scientist
fifty-five
- Optimized item ordering of product listings for major clothing retailers' websites.
- Developed user segmentation and buying prediction algorithms.
- Optimized recommender systems by parallelizing algorithms (ALS-WR) with CUDA.
Back-end Engineer
Pricing Assistant
- Developed an eCommerce page parser.
- Developed product matchers in Python.
Experience
Full Spreadsheet Application
https://actiondesk.io
As the CTO of Actiondesk, I designed the architecture and was the lead developer, building the front-end application with Vue and canvas rendering, as well as a complex data engine back end integrated with dozens of different databases and external tools. I managed a team of up to 10 engineers, including front-end, DevOps, and everything in between.
Facial Recognition
Migrated Critical Data Pipelines from Pig to Spark
The migration saved more than 33% of computing time, making the data available before the analysts started their working day.
I worked on migrating the pipeline from Hive and Pig scripts to Spark with Scala, optimizing and simplifying the transformations.
Education
Master's Degree in Computer Engineering
42 University - Paris, France
Skills
Libraries/APIs
Vue, Pandas, Vuex
Tools
Git, BigQuery, Microsoft Excel, IntelliJ IDEA, PyCharm, Ansible, Canvas
Languages
SQL, Scala, Python, TypeScript, JavaScript, R, C, C++
Platforms
Visual Studio Code (VS Code), MacOS, Docker, Amazon Web Services (AWS), Google Cloud Platform (GCP), NVIDIA CUDA, Apache Pig, Apache Kafka, Kubernetes
Storage
PostgreSQL, Data Pipelines, Redis, Apache Hive, HDFS, MySQL, MongoDB
Frameworks
Hadoop, Spark, Storm, Akka, Flask, Django
Paradigms
Functional Programming, ETL, Agile Software Development, Actor Model
Other
APIs, Google BigQuery, Data Engineering, Distributed Systems, WebSockets, Data Build Tool (dbt)