Corentin Marek, Developer in London, United Kingdom

Corentin Marek

Verified Expert in Engineering

Machine Learning Developer

Location
London, United Kingdom
Toptal Member Since
January 6, 2021

Corentin is a machine learning specialist with expertise in time series prediction, computer vision, and natural language applications. He built a deep reinforcement learning system for cryptocurrency trading: a TensorFlow-based model with supporting infrastructure (Python and Kubernetes) for data collection and real-time trading on Bitfinex. His background includes software and data engineering, and he has built several machine learning infrastructures from scratch.

Portfolio

Plural AI
Amazon Web Services (AWS), Python, MongoDB, Jupyter, Keras, Pub/Sub...
Arkera
Python, Neo4j, Jupyter, SpaCy, TensorFlow, Recommendation Systems...
Cytora
Python, Elasticsearch, Jupyter, Google Kubernetes Engine (GKE), Pub/Sub...

Availability

Part-time

Preferred Environment

Jupyter, Kubernetes, PyCharm

The most amazing...

...tool I've built is a fully functioning trading platform that uses deep reinforcement learning, aggregates data for training, and supports live order placement.

Work Experience

Machine Learning Contractor

2018 - 2020
Plural AI
  • Built models for information extraction from companies' websites using TensorFlow. This included tooling for data labeling and for managing the labeling process (a few remote labelers at any given time).
  • Explored two approaches to time series classification and prediction on our dataset of private companies' financials: an RNN with an attention mechanism and Bayesian output, and gradient boosting with feature engineering.
  • Developed an industry taxonomy classification model. This included building a dataset from the SIC code taxonomy and implementing a two-sided model that embeds the taxonomy on one side and the companies' features on the other.
  • Extracted information from private companies' financial reports using Google Cloud Vision and in-house reconstruction software that turns bounding boxes into structured financial information in our database.
  • Designed, developed, and productionized the internal ingestion pipelines on GKE with Pub/Sub. This included data modeling for MongoDB, GCS, and Neo4j, with fully automatic redeployment from a backup and CircleCI to manage releases.
  • Built microservices to serve our web applications, including semantic search that uses embeddings to match search terms to relevant companies.
  • Recruited, trained, and supervised several members of the data team (two full-time engineers, contractors, and several interns).
Technologies: Amazon Web Services (AWS), Python, MongoDB, Jupyter, Keras, Pub/Sub, Google Kubernetes Engine (GKE), CircleCI, Neo4j, Google Cloud Platform (GCP), OCR, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Time Series Analysis, TensorFlow

Senior Data Scientist

2017 - 2018
Arkera
  • Built and productionized an end-to-end NER and disambiguation model for finance-related sources. We used several deep learning techniques, including multitask learning, transfer learning (from pre-tagged POS and parsed dependencies), and curriculum learning.
  • Built a set of document classification models for finance. We also explored two-sided models (a text-based topic representation on one side and the document on the other) to allow for generalization across topics.
  • Explored different areas of graph-based deep learning for knowledge representation to serve as embedded features for other internal applications (TensorFlow - research project).
  • Prototyped several embedding-based approaches to allow bidirectional recommendations in our investor's mobile application, from a piece of news to a financial product and vice-versa (TensorFlow - research project).
  • Designed and built the majority of our internal training infrastructure, combining TensorFlow, a set of internal tools, and Elasticsearch to store and version models and their outputs so that we could track our improvement.
  • Designed and implemented internal processes and tools to allow for more thorough labeling in creating new datasets.
  • Joined as a TensorFlow expert and mentored and trained four team members, allowing for a smooth transition to deep learning approaches.
Technologies: Python, Neo4j, Jupyter, SpaCy, TensorFlow, Recommendation Systems, Elasticsearch, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Deep Learning

Machine Learning Engineer

2015 - 2017
Cytora
  • Built the internal codebase for text classification and information extraction on news from the ground up using Keras, totaling 20+ classes (e.g., floods, strikes, storms, and terrorist attacks).
  • Productionized our real-time processing pipelines on GKE using Pub/Sub. This allowed for easy scaling up and down of a pipeline processing several million news articles a day.
  • Built several elements of our processing pipeline (Python, Elasticsearch, etc.).
Technologies: Python, Elasticsearch, Jupyter, Google Kubernetes Engine (GKE), Pub/Sub, Kubernetes, SpaCy, Keras, TensorFlow, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Deep Learning

Deep Reinforcement Learning for Cryptocurrency Trading

The project included infrastructure (Python and Kubernetes) for data collection and real-time trading on Bitfinex. The TensorFlow-based model replayed past market situations and used a recurrent actor-critic to learn the allocation across a basket of cryptocurrencies, with several heads for multitask learning of the reward objectives (Sharpe ratio and returns adjusted for fees). I was the sole developer on this project and completed every aspect of it from start to finish.
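
As an illustration of the two reward objectives named above, here is a minimal Python sketch of how fee-adjusted returns and a Sharpe ratio might be computed for a sequence of basket allocations. It is not the project's code: the function names, the proportional `fee_rate` charged on turnover, and the all-cash starting position are assumptions made for the example, and weight drift between rebalances is ignored for simplicity.

```python
import math
from statistics import mean, pstdev


def fee_adjusted_returns(weights, asset_returns, fee_rate=0.002):
    """Per-step portfolio returns net of trading fees.

    weights[t] is the basket allocation held during step t;
    asset_returns[t] holds each asset's simple return over step t.
    fee_rate is a hypothetical proportional cost charged on turnover.
    """
    net = []
    previous = [0.0] * len(weights[0])  # assumption: start fully in cash
    for w, r in zip(weights, asset_returns):
        gross = sum(wi * ri for wi, ri in zip(w, r))
        turnover = sum(abs(wi - pi) for wi, pi in zip(w, previous))
        net.append(gross - fee_rate * turnover)
        previous = w
    return net


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=1):
    """Mean excess return divided by its (population) standard deviation."""
    excess = [r - risk_free for r in returns]
    spread = pstdev(excess)
    if spread == 0:
        return 0.0  # undefined ratio; return 0 rather than divide by zero
    return mean(excess) / spread * math.sqrt(periods_per_year)


# Two assets, two steps, constant 50/50 allocation, 1% fee on turnover.
weights = [[0.5, 0.5], [0.5, 0.5]]
asset_returns = [[0.02, 0.00], [0.00, 0.04]]
net = fee_adjusted_returns(weights, asset_returns, fee_rate=0.01)
ratio = sharpe_ratio(net)
```

In this toy run the first step's gain is cancelled by the fee paid to enter the position, which shows why a fee-aware reward can discourage frequent rebalancing.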

Languages

Python, SQL

Libraries/APIs

Keras, TensorFlow, Scikit-learn, SpaCy

Other

Machine Learning, Deep Learning, Natural Language Processing (NLP), Neural Networks, Generative Pre-trained Transformers (GPT), Deep Reinforcement Learning, Statistics, Ray Framework, Computer Vision, Time Series Analysis, OCR, Recommendation Systems, Pub/Sub

Platforms

Kubernetes, Google Cloud Platform (GCP), Amazon Web Services (AWS)

Tools

Google Kubernetes Engine (GKE), CircleCI, Jupyter

Storage

Data Pipelines, MongoDB, Elasticsearch, Neo4j

2012 - 2014

Master's Degree in Machine Learning & Data Management

National Engineering School of Computer Science and Applied Mathematics - France

2008 - 2012

Bachelor's Degree in Informatics and Applied Mathematics

National Engineering School of Computer Science and Applied Mathematics - France
