Jeremie Charlet, Data Scientist, Machine Learning Engineer, and Software Developer in Caluire-et-Cuire, France
Member since October 25, 2021
Jeremie is a full-stack data scientist focused on natural language processing and tabular data. He has a decade of experience in the IT industry, working for the UK government, large corporations, and startups while wearing multiple hats as a back-end developer, DevOps engineer, data scientist, and startup technical co-founder. He has led two communities on ML and mental health, and he is an avid reader and lifelong learner who has attended many retreats.
Preferred Environment

PyCharm, Amazon Web Services (AWS), Jupyter Notebook, Slack, Git, Trello, Python, Notion, Linux

The most amazing...

...project I've done is my most recent one: I built an NLP framework for few-shot learning, set up active learning, and provided tools to conduct bias analysis.


  • Freelance Full-stack Data Scientist

    2020 - PRESENT
    • Ran experiments and built a framework to classify text with tiny labeled datasets, applying few-shot learning based on natural language inference (NLI).
    • Provided a series of tools to better understand the company data and models, such as topic modeling, model explainability, bias analysis, and outlier detection.
    • Set up a complete active learning workflow using Label Studio and dedicated back ends.
    • Packaged products such as the NLI framework and the active learning workflow as Docker containers.
    • Audited the company's ML platform from an MLOps perspective and identified the next improvements to focus on.
    • Set up a development environment for Kubeflow with Kubernetes and migrated the NLI framework to the Kubeflow pipeline.
    • Conducted research on federated learning and suggested a series of bespoke and open-source solutions.
    Technologies: Jupyter Notebook, Python, Docker, PyTorch, Hugging Face, Scikit-learn, Amazon Web Services (AWS), Kubeflow, Kubernetes, Seaborn, Matplotlib, Plotly, Latent Dirichlet Allocation (LDA), Topic Modeling, Machine Learning, Deep Learning, Active Learning, Natural Language Processing (NLP), Machine Learning Operations (MLOps)
  • Full-stack Data Scientist

    2020 - 2021
    UK-based Startups
    • Evaluated and compared a series of MLOps platform solutions like AWS SageMaker, Databricks, Kubeflow, and Cnvrg.
    • Designed and proposed multiple service architectures to implement MLOps.
    • Set up MLOps using DVC, MLflow, and SageMaker to track experiments and to train, save, and deploy models.
    • Performed topic modeling and built a series of ensembles for text classification to identify stress and stressors in social media, including a multi-modal deep learning classifier combining text and metadata. This was done as part of a data science bootcamp.
    • Generated synthetic data to train and evaluate models.
    • Profiled a classifier and achieved an 8x improvement in its inference time.
    • Freelanced for Updraft and one stealth startup.
    Technologies: Python, PyCharm, Scikit-learn, Amazon SageMaker, Machine Learning Operations (MLOps), Natural Language Processing (NLP), Profiling
  • Data Scientist

    2020 - 2020
    Department for Digital, Culture, Media and Sport (DCMS)
    • Performed a literature review of the state of the art in job offer classification.
    • Built models using the spaCy similarity API, comparing job offer descriptions and titles to UK Standard Industrial Classification (UK SIC) descriptions.
    • Built scripts to run the model on the whole collection (more than one million job offers) and to run it daily on new job offers.
    Technologies: Python, Pandas, NumPy, Matplotlib, spaCy, Google BigQuery
  • AI Researcher and Senior Developer

    2019 - 2020
    The National Archives
    • Worked on a third-party technology evaluation project: interviewed and assessed five suppliers on their solutions and reports, which ranged from off-the-shelf record management products to bespoke solutions using fully customized models or AI APIs.
    • Wrote a 50+ page report on NLP techniques and tools for selecting records held by government departments for permanent preservation. The report was shared with multiple audiences, including government decision-makers, archivists, and data scientists.
    • Delivered a new release of DROID, an open-source project. Implemented or reviewed 60+ pull requests.
    • Managed the GitHub community for DROID, responding to user queries and reviewing and merging pull requests.
    • Increased project transparency and improved project prioritization.
    • Advocated for the improvement of remote work in my department and offered guidance and support during the COVID-19 lockdown.
    Technologies: Java, Python, Amazon SageMaker
  • CTO and Co-founder

    2015 - 2020
    • Delivered an IoT product to the market to allow horse owners to look after their horses when they have chronic or acute health issues, keep their horses healthy and happy, and find peace of mind.
    • Led the software side of our product, including mobile, web app, and back-end development, server management, and data science.
    • Ran the project management of our technical team, including two hardware and software engineers and part-time contractors.
    Technologies: PHP, JavaScript, React Native, Amazon Web Services (AWS), Docker, Java, MongoDB, Azure, MapReduce
  • Senior Java Developer

    2015 - 2016
    The National Archives
    • Collaborated closely with a researcher and a data expert, took over a prototype to link documents, and designed and implemented a set of applications to link collections, evaluate them, and publish them.
    • Designed, implemented, deployed to production, and maintained a set of back-end applications dedicated to categorizing 20+ million records of The National Archives for its end-user website, Discovery, using Lucene and later Solr.
    • Managed the servers where my applications were running.
    • Installed continuous integration platforms such as Jenkins, Nexus, and SonarQube.
    • Created an ML prototype to classify documents based on similarity with Lucene.
    • Gave a series of technical presentations to my department.
    Technologies: Java, Spring Boot, Groovy, MongoDB, Neo4j, JProfiler
  • Software Engineer

    2011 - 2014
    Worldline by Atos
    • Took part in the development, project management, and production support of a mediation platform for a high-visibility project for Orange.
    • Contributed to multiple projects of varying sizes on the back ends of the leading French telecom company Orange, covering project management and customer relations, functional and technical design, implementation, and production support.
    • Helped develop a banking application for BPCE dedicated to mandate management under the SEPA standard.
    • Supervised an offshore development team based in India, including support, validation, and monitoring.
    Technologies: Java, Maven, Spring, Apache, Apache Tomcat, NGINX, Linux, Hibernate, SQL, MySQL, Object-oriented Design (OOD), Software Architecture


  • IoT Product on Horse Care

    Trackener is a simple, easy-to-use system that enables owners to remotely monitor their horses 24/7.

    I was in charge of the software for our product: mobile, web app, and back-end development, server management, and data science.


  • Languages

    Python, Java, HTML, JavaScript, SQL, CSS, PHP, Groovy
  • Tools

    Slack, Git, PyCharm, Trello, YouTrack, G Suite, Amazon Elastic Container Service (Amazon ECS), Seaborn, Notion, Gensim, Amazon SageMaker, JProfiler, Maven, NGINX, CircleCI, MQTT, Plotly
  • Paradigms

    Service-oriented Architecture (SOA), DevOps, Object-oriented Design (OOD), MapReduce
  • Platforms

    Jupyter Notebook, Linux, Ubuntu, Amazon Web Services (AWS), Docker, Dataiku, Amazon EC2, Azure
  • Storage

    MongoDB, Neo4j, Amazon S3 (AWS S3), MySQL
  • Other

    Computer Science, Software Engineering, Software Architecture, Machine Learning, Natural Language Processing (NLP), Performance Tuning, Hugging Face, Deep Learning, Data Engineering, Interaction Design (IxD), Kubeflow, GloVe, MLflow, Data Versioning, Google BigQuery, Time Series Analysis, Topic Modeling, Active Learning, Machine Learning Operations (MLOps), Entrepreneurship, Profiling, Startup Accelerators
  • Frameworks

    React Native, Spring Boot, Spring, Flask
  • Libraries/APIs

    Scikit-learn, Pandas, NumPy, Matplotlib, PyTorch, spaCy, NLTK


  • Master of Science (MSc) Degree in Computer Science
    2009 - 2011
    Staffordshire University - Stafford, UK
  • Master of Engineering (MEng) Degree in General Engineering
    2005 - 2010
    Ecole Catholique d'Arts et Métiers de Lyon - Lyon, France


  • Startupbootcamp IoT Connected Devices
