Jan Krepl, Developer in Geneva, Switzerland
Jan Krepl

Verified Expert in Engineering

Machine Learning Engineer and Developer

Location
Geneva, Switzerland
Toptal Member Since
July 14, 2023

Jan is a machine learning engineer passionate about software engineering, machine learning, leadership, and online education. He has extensive professional experience applying computer vision, natural language processing, and time series analysis in academic and business settings. Jan also dedicates much of his free time to contributing to open-source software and educational content creation.

Portfolio

The EPFL
Python, CI/CD Pipelines, BERT, PyTorch, Elasticsearch...
The EPFL
CI/CD Pipelines, Python, OpenCV, Keras, PyTorch, Deep Learning...
Nectar Financial
Python, Machine Learning, Deep Learning, Gensim, Scikit-learn, Pandas, NumPy...

Availability

Part-time

Preferred Environment

Python, Machine Learning, Notion

The most amazing thing I've developed is a question-answering tool extracting knowledge from scientific papers.

Work Experience

Machine Learning Section Manager | Blue Brain Project

2022 - PRESENT
The EPFL
  • Designed a literature search system focused on semantic search, question answering, named entity recognition, and entity linking, built on top of recent large language models. The entire system was deployed at scale with Kubernetes and AWS.
  • Managed a team of four experienced machine learning engineers.
  • Acted as a lead developer enforcing best practices.
Technologies: Python, CI/CD Pipelines, BERT, PyTorch, Elasticsearch, Natural Language Processing (NLP), Amazon Web Services (AWS), Machine Learning Operations (MLOps), Deep Learning, Machine Learning, Docker, Unit Testing, REST APIs, FastAPI, GitLab, GitHub, NumPy, Kubernetes, Vim Text Editor, Shell Scripting, SQL, SpaCy, Agile Software Development, Data Science, Orchestration, Computer Vision, Data Versioning, Apache Airflow, LangChain, Sphinx, TensorBoard, Hugging Face Transformers, Git, Generative Pre-trained Transformers (GPT), ChatGPT, OpenAI GPT-4 API, Artificial Intelligence (AI), Hugging Face, APIs, JavaScript, Regular Expressions, Product Consultant, Natural Language Understanding (NLU), GPT, Algorithms, Back-end Development, Language Models, Amazon S3 (AWS S3), Amazon EC2, AWS Glue, Terraform, Pytest, GitLab CI/CD, Large Language Models (LLMs), Redis, Redis Cache, Technical Leadership, Leadership, Retrieval-augmented Generation (RAG), OpenAI, NoSQL, Back-end, Asyncio, Containerization, Python Asyncio, Test-driven Development (TDD), AWS Lambda, REST, Serverless, SDKs, Pinecone

Machine Learning Engineer | Blue Brain Project

2018 - 2022
The EPFL
  • Conceived and implemented a supervised algorithm for 2D brain slice image registration that became a part of internal workflows.
  • Developed a knowledge extraction pipeline for scientific articles with main functionalities such as parsing, neural search, and named entity recognition.
  • Engaged directly in various neuroscientific projects, including neuron-type classification with graph neural networks and morphology image synthesis with generative adversarial networks.
Technologies: CI/CD Pipelines, Python, OpenCV, Keras, PyTorch, Deep Learning, Machine Learning, GitLab CI/CD, Image Registration, SpaCy, Git, Machine Learning Operations (MLOps), REST APIs, FastAPI, GitLab, GitHub, NumPy, Kubernetes, Vim Text Editor, Shell Scripting, SQL, Agile Software Development, Data Science, Orchestration, Natural Language Processing (NLP), Elasticsearch, Docker, Computer Vision, Data Versioning, Apache Airflow, Unit Testing, Sphinx, TensorBoard, Hugging Face Transformers, MySQL, PostgreSQL, Artificial Intelligence (AI), Hugging Face, APIs, Regular Expressions, Natural Language Understanding (NLU), GPT, Algorithms, Back-end Development, Language Models, TensorFlow, Pytest, Large Language Models (LLMs), NoSQL, Back-end, Asyncio, Containerization, Python Asyncio, Test-driven Development (TDD)

Data Scientist

2018 - 2018
Nectar Financial
  • Enhanced internal portfolio optimization algorithms with return forecasting using supervised learning techniques. Added custom constraints and objective functions, making the tool more flexible.
  • Applied text embedding algorithms, such as Doc2Vec and TF-IDF, on hedge fund fact sheets and reports. In turn, these embeddings were used for clustering, which allowed for better diversification.
  • Developed a custom back-testing framework considering various hedge-fund-specific constraints like lock-ups.
Technologies: Python, Machine Learning, Deep Learning, Gensim, Scikit-learn, Pandas, NumPy, Jupyter Notebook, SciPy, StatsModels, SpaCy, REST APIs, GitHub, SQL, Data Science, Natural Language Processing (NLP), Keras, Docker, Time Series Analysis, Unit Testing, Git, Artificial Intelligence (AI), JavaScript, Regular Expressions, Algorithms, Back-end Development, Pytest, NoSQL, Test-driven Development (TDD)

Quantitative Risk Analyst

2016 - 2017
UBS
  • Maintained the Lombard lending section's stress-testing codebase that used Visual Basic, SQL, and SAS.
  • Generated regular risk reports used as inputs for other departments.
  • Supported senior analysts in creating custom risk models.
Technologies: SQL, SAS, Excel VBA, Shell Scripting, Probability Theory, Time Series Analysis, Algorithms, NoSQL

Projects

Mildlyoverfitted | Educational Videos

https://www.youtube.com/@mildlyoverfitted/
A YouTube channel that I developed to host educational content and resources. The channel features videos I've created on machine learning, deep learning, and Python. One of the main goals is to explain how things work under the hood and how we can implement solutions from scratch.

DeepDow | Portfolio Optimization with Deep Learning

https://github.com/jankrepl/deepdow/
A Python package for portfolio optimization with deep learning. It attempts to merge the following two very common steps in portfolio optimization:
• Forecasting the market's future evolution, for example with long short-term memory (LSTM) networks or generalized autoregressive conditional heteroskedasticity (GARCH) models.
• Designing and solving an optimization problem, for example with convex optimization.

It does so by constructing a pipeline of layers. The last layer performs the allocation, and all the previous ones serve as feature extractors. The overall network is fully differentiable, and one can optimize its parameters by gradient descent algorithms.
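
The idea of a differentiable allocation layer trained by gradient descent can be illustrated with a minimal, dependency-free sketch. It is not DeepDow's API: the function names, the toy return data, and the choice of softmax as the allocation layer are all assumptions made for illustration, and the feature-extraction layers are collapsed into a free vector of per-asset scores.

```python
import math


def softmax(logits):
    """Allocation layer: map unconstrained scores to long-only weights summing to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_portfolio_return(weights, returns):
    """Average per-period portfolio return for fixed weights."""
    return sum(
        sum(w * r for w, r in zip(weights, row)) for row in returns
    ) / len(returns)


def train_allocation(returns, lr=1.0, steps=300):
    """Minimize the negative mean portfolio return by plain gradient descent."""
    n_assets = len(returns[0])
    logits = [0.0] * n_assets  # start from equal weights
    for _ in range(steps):
        weights = softmax(logits)
        grads = [0.0] * n_assets
        for row in returns:
            port = sum(w * r for w, r in zip(weights, row))
            for i in range(n_assets):
                # d(port)/d(logit_i) = w_i * (r_i - port); the loss is -port.
                grads[i] -= weights[i] * (row[i] - port) / len(returns)
        logits = [z - lr * g for z, g in zip(logits, grads)]
    return softmax(logits)


# Hypothetical per-period returns: rows are periods, columns are assets.
returns = [
    [0.01, 0.03, -0.01],
    [0.00, 0.02, 0.01],
    [0.02, 0.04, -0.02],
]
weights = train_allocation(returns)
print(weights)
```

Because the loss here is just the negative mean return, training shifts weight toward the asset with the best historical returns. A more realistic setup would put learned feature extractors before the allocation layer and use a risk-aware loss.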

MLtype | Command Line Tool

https://github.com/jankrepl/mltype/
A programmer-friendly command line tool for improving typing speed and accuracy. The main goal is to help programmers practice programming languages. It uses neural networks to generate text. One can go for pre-trained networks or train new ones from scratch.

Atlas Alignment | Multimodal Registration and Alignment

https://github.com/BlueBrain/atlas-alignment/
A toolbox for multimodal image registration that includes both traditional and supervised deep learning models. This project originated from the Blue Brain Project's efforts to align mouse brain atlases obtained with in situ hybridization (ISH) gene expression and Nissl stains. The work was published by Frontiers Media and is available at frontiersin.org/articles/10.3389/fninf.2021.691918/full/

PyChubby | Automated Face-warping Tool

https://github.com/jankrepl/pychubby/
A Python package for automated face warping. It allows the user to programmatically change the facial expression and shape of any person in an image. It is based on geometric transformations using computer vision.

Languages

Python, SQL, SAS, Excel VBA, JavaScript

Libraries/APIs

PyTorch, Scikit-learn, NumPy, Keras, SciPy, Pandas, Matplotlib, REST APIs, TensorFlow, Asyncio, Python Asyncio, JAX, OpenCV, SpaCy, React

Tools

Vim Text Editor, Git, GitLab CI/CD, Pytest, TensorBoard, GitLab, GitHub, ChatGPT, Notion, Amazon SageMaker, Cloud Dataflow, Google Compute Engine (GCE), AWS Glue, Terraform, Inkscape, Apache Airflow, Adobe Premiere Pro, Seaborn, Gensim, StatsModels, Scikit-image, Google Kubernetes Engine (GKE)

Paradigms

Unit Testing, Data Science, Test-driven Development (TDD), REST, Scrum, Agile Software Development

Platforms

Kubernetes, Docker, Amazon Web Services (AWS), Jupyter Notebook, Vertex AI, Amazon EC2, Google Cloud Platform (GCP), AWS Lambda

Storage

Elasticsearch, Google Cloud Storage, Amazon S3 (AWS S3), Redis, Redis Cache, NoSQL, Neo4j, MySQL, PostgreSQL

Other

Probability Theory, Mathematical Analysis, Linear Algebra, Statistics, Machine Learning, Portfolio Optimization, Orchestration, Machine Learning Operations (MLOps), Shell Scripting, Generative Pre-trained Transformers (GPT), BERT, Hugging Face Transformers, Sphinx, Natural Language Processing (NLP), Finance, Computer Vision, OpenAI GPT-4 API, Artificial Intelligence (AI), Hugging Face, APIs, Regular Expressions, Natural Language Understanding (NLU), GPT, Algorithms, Back-end Development, Language Models, Pub/Sub, Large Language Models (LLMs), Technical Leadership, Leadership, Retrieval-augmented Generation (RAG), OpenAI, Back-end, Containerization, Serverless, SDKs, Pinecone, Optimization, Microeconomics, Macroeconomics, Mathematical Finance, Quantitative Risk Analysis, Numerical Methods, MLflow, FastAPI, LangChain, Time Series Analysis, Product Consultant, Measure Theory, Econometrics, Private Company Valuation, Deep Learning, Scrum Master, CI/CD Pipelines, Online Course Design, Recurrent Neural Networks (RNNs), Open Source, Image Registration, Data Versioning, Google BigQuery, Full-stack Development

Frameworks

Apache Spark

Education

2015 - 2018

Master's Degree in Quantitative Finance

ETH Zurich - Zurich, Switzerland

2011 - 2014

Bachelor's Degree in Economics

Charles University - Prague, Czechia

Certifications

FEBRUARY 2024 - FEBRUARY 2026

HashiCorp Certified: Terraform Associate (003)

HashiCorp

JANUARY 2024 - JANUARY 2027

AWS Certified Solutions Architect - Associate

Amazon Web Services

SEPTEMBER 2023 - SEPTEMBER 2025

Google Cloud Certified Professional Machine Learning Engineer

Google Cloud

AUGUST 2023 - AUGUST 2026

AWS Certified Machine Learning - Specialty

Amazon Web Services

JUNE 2023 - JUNE 2025

Databricks Certified Associate Developer for Apache Spark 3.0

Databricks Inc.

MARCH 2023 - MARCH 2026

AWS Certified Cloud Practitioner

Amazon Web Services

FEBRUARY 2023 - FEBRUARY 2026

CKAD: Certified Kubernetes Application Developer

The Linux Foundation

JANUARY 2023 - PRESENT

Professional Scrum Master (PSM I)

Scrum.org

JULY 2015 - PRESENT

CFA Level I (Passed)

CFA Institute
