Charles Camp

Verified Expert in Engineering

Machine Learning Developer

Location

Dijon, France

Toptal Member Since

August 24, 2020

Charles is certified in artificial intelligence and data science. He is highly skilled at producing high-performing models and making them easy to use. He easily adapts to all kinds of environments and has already worked for banks, startups, IT firms, and laboratories. Charles' fields of expertise are natural language processing and time series analysis.

Portfolio

ITG AUTOMOTIVE LLC

Artificial Intelligence (AI), Natural Language Processing (NLP)...

Huxley

Artificial Intelligence (AI), Generative Pre-trained Transformers (GPT)...

Non-Fungible Films, Inc.

Python, Artificial Intelligence (AI), Natural Language Processing (NLP), GPT...

Experience

Python - 7 years
Machine Learning - 7 years
Pandas - 6 years
Data Science - 6 years
GPT - 5 years
Generative Pre-trained Transformers (GPT) - 5 years
PySpark - 5 years
Time Series Analysis - 5 years

Availability

Full-time

Preferred Environment

Python, Amazon Web Services (AWS), Natural Language Processing (NLP), Time Series Analysis, Transformers, Reinforcement Learning

The most amazing...

...model I've built can identify people with a connection to financial crimes.

Work Experience

AI Engineer

2024 - 2024

ITG AUTOMOTIVE LLC

Extracted key details from contracts using ChatGPT, LangChain, and Pydantic.
Benchmarked a hierarchical clustering algorithm to group extracted key terms based on their content and organize them.
Integrated the solution and deployed it on AWS using Lambdas.

Technologies: Artificial Intelligence (AI), Natural Language Processing (NLP), OpenAI GPT-4 API, GPT, Amazon Web Services (AWS), Machine Learning Operations (MLOps), Azure, Machine Learning, LangChain

AI LLM Expert

2023 - 2024

Huxley

Built a chatbot trained on de-addiction program literature to help people get sober from different types of addiction, such as alcohol and overeating.
Built the API of the chatbot and deployed it on GCP in a scalable manner using Docker and Kubernetes.
Transcribed audio testimonials to enrich the quantity of data available for the chatbot's recommendations.
Used llama-index to improve the indexing of the vector store (embeddings of the documents) and get better-quality answers from the chatbot.
Built a model to extract quotes and stories from documents to use them as testimonial summaries.
Trained a recommender system to suggest inspiring quotes to users based on their profiles, which included experience in the program, past likes and dislikes, inventories, and app activity.
Produced documentation and had a live session with the founder to explain the technicalities of the solutions.
Trained a reinforcement learning model to shorten the path to sobriety. The model chooses which actions the sponsor should take to help the user progress through the program as quickly and reliably as possible and remain sober.

Technologies: Artificial Intelligence (AI), Generative Pre-trained Transformers (GPT), OpenAI GPT-3 API, OpenAI GPT-4 API, LangChain, Llama 2, LlamaIndex, Large Language Models (LLMs), Reinforcement Learning, Recommendation Systems, Docker, Kubernetes, Google Cloud Platform (GCP), FastAPI

AI Developer

2023 - 2023

Non-Fungible Films, Inc.

Fine-tuned Stable Diffusion models to generate the company's fictional characters.
Deployed a Discord bot similar to Midjourney but using the custom Stable Diffusion model.
Integrated the model with a Stable Diffusion UI to enable inpainting, image-to-image, and other applications.

Technologies: Python, Artificial Intelligence (AI), Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Computer Vision, Machine Learning, Node.js

ML Engineer

2022 - 2023

Global CPG Company

Created a pipeline to automate look-alike audience computations leveraging internal consumer behavior data.
Compared models and tuned their hyperparameters to achieve the highest performance.
Created custom PySpark and scikit-learn estimators to integrate with PySpark and scikit-learn pipelines, respectively.

Technologies: Python, Machine Learning, PySpark, Scikit-learn, Pandas

ML and NLP Engineer

2021 - 2022

Phragmites, Inc.

Set up an EC2 server, analyzed Telegram messages stored on a Postgres DB, and classified them as relevant or not for a given crypto-related project.
Built a model to detect bot messages using near-duplicate clustering approaches.
Quantified the influence of Telegram users in crypto-centered conversations using graph theory.
Trained a NER model to detect crypto project names in Telegram messages.

Technologies: Generative Pre-trained Transformers (GPT), GPT, Natural Language Processing (NLP), Machine Learning, Trend Analysis, Sentiment Analysis, Cryptocurrency, Artificial Intelligence (AI), Google Cloud, Python, SQL, Pandas, Scikit-learn, Data Science, Natural Language Toolkit (NLTK), SpaCy, Hugging Face, Neural Networks, Communication, Elasticsearch

Senior Data Scientist

2021 - 2022

Trust & Safety Laboratory

Trained machine learning models to find controversial topics in tweets. Controversial topics were defined as potentially containing harmful misinformation.
Trained ML models to detect false claims and misinformation in tweets.
Built a pipeline to collect human-in-the-loop reviews (AWS), automated the labeling of potentially misleading tweets, and performed website scraping.
Developed a serverless framework to automate social media screening tasks.

Technologies: GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Docker, Kubernetes, Bazel, Machine Learning, Python, Scikit-learn, Pandas, Data Science, Linux, Amazon Web Services (AWS), Neural Networks, SpaCy, Artificial Intelligence (AI), Sentiment Analysis, Test Automation, Hugging Face

Python Developer | AI

2020 - 2021

Click Factura SA de CV

Transcribed and summarized Spanish audio meetings: fine-tuned speech-to-text models (DeepSpeech, NeMo, and Wav2Vec) and used text summarization and diarization models.
Trained an OCR model to extract the information on Mexican expense tickets.
Integrated the models into the existing Django application by creating APIs.
Deployed the models using Docker containers and Flask.

Technologies: Python, Machine Learning, TensorFlow, Test Automation, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), GPT, Artificial Intelligence (AI), Amazon Web Services (AWS), Kubernetes, Docker, Django, APIs, OCR, Speech to Text, SQL, Data Science, Linux, Neural Networks, Communication, Google Cloud, Hugging Face

Machine Learning Expert | Digital Advertisement

2020 - 2020

Primal Analytics

Deployed an AWS Lambda function to automatically detect anomalies in Google Ads statistics.
Compared various state-of-the-art ML models for time-series anomaly detection.
Set up the AWS account for data storage and Lambda execution.

Technologies: Machine Learning, Digital Advertising, Python, Time Series Analysis, Scikit-learn, Anomaly Detection, SQL, Pandas, Data Science, Amazon Web Services (AWS), Neural Networks, ARIMA Models, Artificial Intelligence (AI), Trend Analysis

Senior Data Scientist

2019 - 2020

Glovo

Designed, implemented, and deployed a customer lifetime value model. This was deployed on an EC2 instance using Luigi and scheduled with Jenkins.
Used linear programming to optimize pickers' time shifts.
Built an end-to-end pipeline to decide whether or not to show a product in the app based on its probability of being available in the store to improve the customer experience. The model was trained on SageMaker and then deployed on an EC2 instance.

Technologies: Linear Programming, TensorFlow, Data Science, Pandas, Machine Learning, Scikit-learn, Python, XGBoost, Redshift, Amazon SageMaker, Amazon Web Services (AWS), Luigi, SQL, Communication, Artificial Intelligence (AI), Anomaly Detection

Data Scientist

2016 - 2019

Credit Suisse

Designed and deployed machine learning models to detect money laundering using transaction data.
Led the Negative News screening project to automatically screen news data and find associations with financial crimes to enrich the risk scoring model.
Used NLP to measure the impact of news data on the sales of financial products.
Organized data sourcing and mapping of various transaction and KYC data sources on a big data platform. Also handled the design and implementation of a data model for the transaction and KYC data to facilitate transaction monitoring.

Technologies: Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), TensorFlow, PySpark, Data Science, Pandas, Machine Learning, Scikit-learn, Python, SpaCy, SQL, Artificial Intelligence (AI), Communication, Natural Language Toolkit (NLTK), R, Time Series Analysis, ARIMA Models, XGBoost, Sentiment Analysis, Anomaly Detection, Hugging Face, Elasticsearch, Test Automation

Research Scholar

2016 - 2016

Carnegie Mellon University

Designed and implemented a model to predict the survival of patients after a cardiac arrest using their brain activity data (multivariate time series).
Built an evaluation metric that gives a better score to models that predict survival earlier.
Clustered patients to identify common characteristics and deduce specific preventive actions to increase their survival rate.

Technologies: ARIMA Models, Time Series Analysis, R, Data Science, Pandas, Machine Learning, Scikit-learn, Linux, Artificial Intelligence (AI), Trend Analysis, Anomaly Detection

Data Scientist Intern

2015 - 2015

Capgemini

Set up a Spark cluster to read sensor data from HDFS and preprocess it.
Built a scalable supervised model to detect manufacturing breakdowns using multivariate time series data (sensor data).
Fine-tuned and validated the model. Identified main features leading to breakdowns.

Technologies: ARIMA Models, Time Series Analysis, PySpark, Data Science, Pandas, Machine Learning, Scikit-learn, Python, Linux, XGBoost, Artificial Intelligence (AI), Anomaly Detection

Experience

Recommender System

We are provided a matrix V of dimensions (number of users, number of movies). An entry is 0 if the user did not like the movie and 1 if the user liked it. The remaining entries, for movies the user has not rated, are NaNs.

In the first step, we use non-negative matrix factorization (NMF) to find two matrices W and H of respective sizes (number of users, K) and (K, number of movies) that minimize the difference between V and WH, where K is a small value (< 10). That means we look for W and H such that WH is close to V.

Afterward, we use W and H to cluster the users and can now recommend movies that will be liked by their assigned cluster.
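
As a rough illustration of the steps above, here is a minimal NumPy sketch; it is not the original project code. It assumes masked multiplicative updates for the NMF, so that NaN entries are ignored. It also assumes each user is assigned to the cluster of their largest factor in W, and that unrated movies are ranked by the cluster's mean reconstructed score. The function names masked_nmf and recommend, the toy matrix, and these clustering and ranking choices are all assumptions made for the example.

```python
import numpy as np


def masked_nmf(V, k=2, n_iter=500, seed=0, eps=1e-9):
    """Factor V ~ W @ H using only the observed (non-NaN) entries of V."""
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(V)               # True where a like/dislike is observed
    V0 = np.where(mask, V, 0.0)       # NaNs set to 0 so they drop out of the sums
    n_users, n_movies = V.shape
    W = rng.random((n_users, k)) + 0.1
    H = rng.random((k, n_movies)) + 0.1
    for _ in range(n_iter):
        # Multiplicative updates restricted to observed entries (assumed variant).
        WH = (W @ H) * mask
        W *= (V0 @ H.T) / (WH @ H.T + eps)
        WH = (W @ H) * mask
        H *= (W.T @ V0) / (W.T @ WH + eps)
    return W, H


def recommend(V, W, H, user, top_n=2):
    """Rank the user's unrated movies by the mean predicted score of their cluster."""
    clusters = W.argmax(axis=1)                 # assumed rule: dominant factor = cluster
    members = clusters == clusters[user]
    cluster_scores = (W[members] @ H).mean(axis=0)
    unseen = np.isnan(V[user])
    ranked = np.argsort(-cluster_scores)        # best-scoring movies first
    return [int(m) for m in ranked if unseen[m]][:top_n]


# Toy data (hypothetical): two groups of users with opposite tastes.
nan = np.nan
V = np.array([
    [1, 1, nan, 0, 0],
    [1, nan, 1, 0, nan],
    [nan, 1, 1, nan, 0],
    [0, 0, nan, 1, 1],
    [0, nan, 0, 1, nan],
    [nan, 0, 0, nan, 1],
])
W, H = masked_nmf(V, k=2)
print(recommend(V, W, H, user=0))
```

In this sketch, recommend only orders the movies a user has not rated. A real system would probably also apply a score threshold so that movies the cluster dislikes are never suggested.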

Face and Image Recognition

A facial recognizer that detects faces using a webcam and OpenCV. This project was first extended to simple objects and then to custom objects by retraining a pre-trained state-of-the-art TensorFlow model.

Skills

Languages

Python, SQL, R

Libraries/APIs

Scikit-learn, Pandas, PySpark, SpaCy, Natural Language Toolkit (NLTK), XGBoost, TensorFlow, Luigi, OpenCV, Node.js

Paradigms

Data Science, Anomaly Detection, Linear Programming, Test Automation

Other

Time Series Analysis, Natural Language Processing (NLP), Machine Learning, Artificial Intelligence (AI), Communication, GPT, Generative Pre-trained Transformers (GPT), Neural Networks, ARIMA Models, Sentiment Analysis, Cryptocurrency, Hugging Face, Analysis of Variance (ANOVA), APIs, Speech to Text, OCR, Decentralized Finance (DeFi), Trend Analysis, Digital Advertising, Computer Vision, Reinforcement Learning, PEFT, LoRA, Transformers, OpenAI GPT-3 API, OpenAI GPT-4 API, LangChain, Llama 2, Large Language Models (LLMs), Recommendation Systems, FastAPI, Machine Learning Operations (MLOps)

Platforms

Linux, Amazon Web Services (AWS), Docker, Kubernetes, Google Cloud Platform (GCP), Azure

Frameworks

Django, LlamaIndex

Tools

Amazon SageMaker, Bazel

Storage

Redshift, Google Cloud, Elasticsearch

Education

2014 - 2016

Master's Degree in Data Science

Grenoble Institute of Technology - Grenoble, France

2011 - 2014

Bachelor's Degree in Computer Science

Grenoble Institute of Technology - Grenoble, France

Certifications

JULY 2023 - PRESENT

Generative AI with Large Language Models

Coursera

APRIL 2022 - PRESENT

Decentralized Finance (DeFi)

Coursera

FEBRUARY 2022 - FEBRUARY 2025

AWS Solutions Architect Associate

Pearson VUE

DECEMBER 2020 - PRESENT

Django for Everybody

University of Michigan | via Coursera
