Karim Foda, Developer in London, United Kingdom

Karim Foda

Verified Expert in Engineering

NLP Researcher and Developer

Location
London, United Kingdom
Toptal Member Since
July 6, 2020

Karim is an NLP researcher with in-depth, hands-on experience building machine learning (ML) models that replicate specific human functions and thereby accelerate business processes. Most recently, Karim has focused on training large language models (LLMs) for natural language understanding (NLU) and natural language generation (NLG) in conversational chatbots.

Portfolio

Kaizan
Artificial Intelligence (AI), OpenAI GPT-4 API, OpenAI GPT-3 API...
Shortform
Natural Language Processing (NLP), OpenAI GPT-4 API, Elasticsearch...
Grata
JSON, Roku, Machine Learning, Deep Neural Networks...

Availability

Full-time

Preferred Environment

Python

The most amazing...

...thing I believe I've built is a LongT5 model fine-tuned on generating automatic summaries of self-help books.

Work Experience

Lead NLP Engineer

2021 - PRESENT
Kaizan
  • Built a GPT-4-driven chatbot that combined factored cognition, LangChain, and Elasticsearch to augment an organization's employees with a perfect memory of all their teams' calls and emails.
  • Developed an internal annotation platform to increase manual annotations using weak labels and designed a data augmentation strategy that increased user data size fourfold.
  • Fine-tuned a Pegasus large model on video call summary data using the Hugging Face Transformers and Microsoft's DeepSpeed libraries to automatically generate meeting actions and summaries.
Technologies: Artificial Intelligence (AI), OpenAI GPT-4 API, OpenAI GPT-3 API, Language Models, Django, Hugging Face, Generative Pre-trained Transformers (GPT), Elasticsearch, PostgreSQL, Redis, Google Cloud, Docker, Causal Inference, Fine-tuning, Generative Artificial Intelligence (GenAI), Research, AI Content Creation

NLP Consultant

2021 - 2023
Shortform
  • Pre-trained a LongT5 XXL model on three times more data, outperforming LongT5 XL on the BookSum dataset, to write coherent reading guides for fiction books with personalized commentary.
  • Built agents powered by language models and vector DB search to help users generate points that expand on or contradict a specific book's main theses.
  • Deployed a pipeline for summarizing book chapters using GPT-4 and a summary of summaries approach.
Technologies: Natural Language Processing (NLP), OpenAI GPT-4 API, Elasticsearch, Google Cloud, Artificial Intelligence (AI), Docker, Hugging Face, Causal Inference, Fine-tuning, Generative Artificial Intelligence (GenAI)

NLP Engineer

2021 - 2022
Grata
  • Fine-tuned a t5-3b model to generate descriptions of companies in a predefined format using text scraped from their websites, achieving an 89% average BERTScore precision.
  • Deployed a fine-tuned t5-3b model on Amazon SageMaker to automatically generate descriptions of companies from their websites.
  • Custom-built a question-answering dataset to fine-tune a RoBERTa-based model that automatically extracts a company's specific information from its website, such as trading name, location, and products.
Technologies: JSON, Roku, Machine Learning, Deep Neural Networks, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Python 3, Sequence Models, BERT, PyTorch, Hugging Face, OpenAI, Artificial Intelligence (AI), Docker, Causal Inference, Fine-tuning, Generative Artificial Intelligence (GenAI)

NLP Engineer

2018 - 2021
Lloyds Banking Group
  • Developed Python scripts that extracted comments from internal social media sites, analyzed their change in sentiment over time, and visualized the findings in a Python Dash app.
  • Built a chatbot focused on improving colleagues' mental health, with emotion-logging capabilities and a GPT-2 transformer that enabled it to have basic conversations with users.
  • Classified 100,000 customer cases automatically using categories identified by an LDA topic analysis model run on verbatim text commentary describing each case.
  • Utilized regular expressions to detect and encode personal customer data within an RDS database.
Technologies: Natural Language Generation (NLG), Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), R, Tableau, Python, Sequence Models, Hugging Face, Causal Inference

NLP Engineer

2020 - 2020
FACETITLE
  • Trained a BERT-based NER model to detect when a character was mentioned in TV show subtitles with 95% accuracy and displayed their headshot in real time on a Roku application.
  • Created a RoBERTa-based multiclass classification model that categorizes the sentiment of episode reviews with 92% accuracy using the Hugging Face Transformers library.
  • Consulted with the founding team and helped them secure an NSF seed fund grant.
Technologies: Machine Learning, Natural Language Processing (NLP), Web Scraping, Generative Pre-trained Transformers (GPT), Python

Data Scientist

2016 - 2018
Lloyds Banking Group
  • Built a classification model for the direction of motion of the EUR/USD rate using an aggregation of the predictions of an entropy-based random forest model and bidirectional LSTMs.
  • Coordinated with finance business partners and business managers to develop a transparent deal pipeline income forecasting model accurate to within 5%.
  • Analyzed intraday correlations between European assets over the period preceding Brexit using VECM and VAR models to promote a strategy focused on German assets.
  • Automated the process for calculating annual income budgets for 21 industries using a linear regression model that analyzed a time series of yearly income data.
Technologies: Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), R, Machine Learning, Visual Basic for Applications (VBA), Python

Data Engineer

2014 - 2016
Lloyds Banking Group
  • Built data capturing and visualization tools for digital, commercial banking, and IT support teams.
  • Led a service improvement initiative that resolved 52% of financial market systems' problem records and set up a dashboard for tracking daily performance.
  • Researched the financial feasibility of two new mobile banking testing products, estimating and discounting predicted future cash flows to drive a £50 million investment decision.
Technologies: Python, Visual Basic for Applications (VBA), Tableau

Emotion Classification Using a WAME Optimizer

Implemented the recently developed WAME optimizer by Mosca et al. to improve the performance of an emotion classification convolutional neural network. I achieved accuracies higher than those of baseline optimizers such as Adam and RMSProp.

2018 - 2020

Master of Research Degree in Machine Learning

Birkbeck University of London - London, United Kingdom

2016 - 2018

Master's Degree in Finance

London Business School - London, United Kingdom

2010 - 2014

Master of Science Degree in Aeronautical Engineering

Durham University - Durham, United Kingdom

Libraries/APIs

TensorFlow Deep Learning Library (TFLearn), Keras, TensorFlow, Pandas, DeepSpeech, DeepSpeed, PyTorch

Tools

MATLAB, Named-entity Recognition (NER), Tableau

Languages

Python, R, SQL, Bash, C++, Visual Basic for Applications (VBA), Python 3

Platforms

Docker, Google Cloud Platform (GCP)

Paradigms

Data Science

Storage

PostgreSQL, JSON, Elasticsearch, Redis, Google Cloud

Frameworks

Django

Other

Dashboard Design, Transformers, Natural Language Processing (NLP), Dash, Topic Modeling, Emotion Recognition, Sentiment Analysis, Machine Learning, Statistics, Artificial Intelligence (AI), Natural Language Generation (NLG), Neural Networks, Custom BERT, OCR, Hugging Face, Generative Pre-trained Transformer 3 (GPT-3), Language Models, Generative Pre-trained Transformers (GPT), Causal Inference, Bittensor, Fine-tuning, Generative Artificial Intelligence (GenAI), Research, Chatbots, Image Recognition, Web Scraping, Econometrics, Time Series Analysis, Deep Neural Networks, Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNN), Decision Tree Classification, Finite Element Analysis (FEA), Deep Learning, Generative Adversarial Networks (GANs), Roku, Voice, Sequence Models, BERT, OpenAI, OpenAI GPT-4 API, OpenAI GPT-3 API, AI Content Creation
