Verified Expert in Engineering
Valentin has seven years of experience in both startups and large French tech companies. He has mainly worked as a back-end data engineer with Scala and Python. He is also experienced with Hadoop and Spark, developing data pipelines, and architecting data warehouses to extract value from terabytes of data. Valentin was recently the CTO of a YC-backed tech startup, leading a team of 10 to build complex front-end software as well as a full back-end data processing engine.
The most amazing...
...project I've worked on is the refactoring of Deezer's most critical ETL, which produces the data used for recommendations, royalties, and analytics.
- Developed a spreadsheet application connected to dozens of integrations to make data engineering accessible to non-technical business users and automate their dashboards and reporting.
- Conceived the entire back end and led its implementation, remaining the first committer on every service for a long time.
- Managed a tech team of up to 10 engineers, including front-end, back-end, DevOps, and QA engineers.
- Built dozens of connectors to databases and various APIs, including Stripe, HubSpot, Google Analytics, and QuickBooks.
- Created a formula engine to compute Excel-like formulas. The engine was written in Scala.js so it could run in both the back end and the front end and produce identical results.
- Maintained a Kubernetes cluster for two years until I hired a DevOps engineer, whom I then managed.
- Implemented a feature for sharing reports with charts across multiple channels, such as Slack and email.
- Developed and maintained the core ETLs in Scala Spark and streaming pipelines with Kafka and Spark.
- Processed 2.5 TB/day via streaming to support 50+ engineers, analysts, scientists, and product managers.
- Managed data warehousing on HDFS in ORC, Parquet, and Avro formats.
- Developed a custom scheduler in Python that runs 2,000 jobs per day.
- Developed ETLs in PySpark in collaboration with data scientists.
- Led, as main contributor, the development of internal data collection software processing 500 GB per day.
- Performed R&D for a stream processing project using Storm and Kafka.
- Optimized the ordering of product listings for major clothing retailers' websites.
- Developed user segmentation and buying prediction algorithms.
- Optimized recommender systems by parallelizing algorithms (ALS-WR) with CUDA.
- Developed an e-commerce page parser.
- Developed product matchers in Python.
Full Spreadsheet Application (https://actiondesk.io)
As the CTO of Actiondesk, I designed the architecture and was the lead developer, building the front-end application with Vue and canvas rendering, as well as a complex back-end data engine integrated with dozens of different databases and external tools. I managed a team of up to 10 engineers, including front-end, back-end, DevOps, and everything in between.
Migrated Critical Data Pipelines from Pig to Spark
The migration saved more than 33% of computing time, making the data available before the analysts started their working day.
I migrated the pipeline from Hive and Pig scripts to Spark with Scala, optimizing and simplifying the transformations.
Visual Studio Code (VS Code), macOS, Docker, Amazon Web Services (AWS), Google Cloud Platform (GCP), NVIDIA CUDA, Apache Pig, Apache Kafka, Kubernetes
PostgreSQL, Data Pipelines, Redis, Apache Hive, HDFS, MySQL, MongoDB
APIs, Google BigQuery, Data Engineering, Distributed Systems, WebSockets, Data Build Tool (dbt)
Hadoop, Spark, Storm, Akka, Flask, Django
Vue, Pandas, Vuex
Git, BigQuery, Microsoft Excel, IntelliJ IDEA, PyCharm, Ansible, Canvas
Functional Programming, ETL, Agile Software Development, Actor Model
Master's Degree in Computer Engineering
42 University - Paris, France