Badr Jaidi, Developer in Vancouver, BC, Canada

Badr Jaidi

Verified Expert  in Engineering

Bio

Badr is a data scientist specializing in natural language processing. He speaks four languages fluently and has worked in the technology field for nearly a decade, on projects ranging from hardware to front-end development.

Portfolio

Private Search Engine
Python, Large Language Models (LLMs), Natural Language Processing (NLP)...
Plutoshift, Inc.
Data Science, Python, Machine Learning, Time Series, Signal Processing
Bhavik Muni
Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP)...

Experience

Availability

Part-time

Preferred Environment

Visual Studio Code (VS Code), Unix

The most amazing...

...product I've developed is a multilingual text pipeline that uses LLMs to provide tens of thousands of users answers to their questions in less than a second.

Work Experience

NLP Engineer

2022 - 2024
Private Search Engine
  • Developed and maintained a suite of AI products that reliably serve tens of thousands of customers.
  • Implemented LLM solutions to summarize and answer questions about search results in less than one second with state-of-the-art relevancy and accuracy metrics.
  • Built and maintained a reliable and fast web crawling utility that extracts high-quality text from a large number of supported media formats.
Technologies: Python, Large Language Models (LLMs), Natural Language Processing (NLP), PyTorch, OpenAI, Chatbots, OpenAI GPT-3 API, OpenAI GPT-4 API, ChatGPT, Azure, Google Cloud Platform (GCP), Prompt Engineering, LangChain, Web Development, Cloud Computing, Machine Learning Operations (MLOps), LlamaIndex, Retrieval-augmented Generation (RAG), AI Chatbots, Model Tuning

Data Scientist

2022 - 2022
Plutoshift, Inc.
  • Trained high-accuracy time series classification models for a Fortune 500 company.
  • Preprocessed large volumes of complex time series hardware signals into an interpretable and trainable format.
  • Applied a range of time series transformations and machine learning techniques to classify the signals.
Technologies: Data Science, Python, Machine Learning, Time Series, Signal Processing

NLP Expert

2022 - 2022
Bhavik Muni
  • Implemented an NLP solution that extracts actionable insights from YouTube-related text data.
  • Efficiently extracted large amounts of text data from YouTube's platform.
  • Designed an architecture that combines the NLP analytics with the extracted data to display insights live to the client.
Technologies: Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Data Science, GPU Computing, Machine Learning Operations (MLOps)

Data Scientist

2019 - 2021
Ai Outcome
  • Researched and developed a topic extraction model that led to the creation and sales of a new product.
  • Developed an optimized and efficient bilingual French and English text processing pipeline.
  • Used time series forecasting to help clients manage their resources more efficiently.
  • Set up and managed a database server infrastructure to host hundreds of gigabytes of raw data.
Technologies: Python, Topic Modeling, Text Classification, SpaCy, Gensim, Machine Learning, Data Science, Artificial Intelligence (AI), Deep Learning, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Full-stack, Azure, Web Development, Cloud Computing, Machine Learning Operations (MLOps), Model Tuning

Teaching Assistant

2018 - 2021
École de Technologie Supérieure
  • Assisted researchers by conducting experiments in a THz research lab.
  • Helped students learn to program in C by assisting them through the whole learning cycle, from the basics to making their first project.
  • Helped students learn to use Linux platforms, install, manage, and configure their networks, and automate those tasks with Bash scripts.
Technologies: Waveforms, C, Unix, Cisco, Bash Script, MATLAB

Software Developer

2018 - 2019
iBwave Solutions
  • Developed features for an application used by hundreds of clients worldwide.
  • Tracked and resolved bugs using Jira as a reporting tool.
  • Contributed to quality assurance by testing the application thoroughly.
Technologies: C#, Agile, QA Testing, Web Development

Associate Developer

2016 - 2017
Carrotsoftware.co.ltd
  • Developed parts of web and iOS applications and set up databases for a diverse range of customers.
  • Created a map application that smoothly displayed live data from millions of database rows.
  • Maintained the company's internal chatbot regularly.
Technologies: SQL, C#, Swift, JavaScript, Java, PHP, HTML, Bootstrap, Full-stack, Web Development

Fake News Classification

https://github.com/LiamNiisan/fake_news_detection
The project's goal was to train a model capable of detecting fake news.

Two models were trained and compared: a linear model with FastText and a neural model with TensorFlow. With the training data, the neural model gave the best results, but on manually annotated data from online news, FastText performed much better.

This showed that linear models are very good at generalizing, and neural models need to be trained on lots of good data to perform well.
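The comparison described above comes down to scoring each model twice, once on data resembling the training set and once on the manually annotated news, and looking at how much accuracy is lost. The snippet below is a minimal sketch of that idea, not code from the repository: the function names `accuracy` and `generalization_gap` are hypothetical, and the labels and predictions are toy placeholders rather than project results.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the fraction of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def generalization_gap(in_domain_acc: float, out_of_domain_acc: float) -> float:
    """Accuracy lost when moving from in-domain data to new, unseen sources."""
    return in_domain_acc - out_of_domain_acc


# Toy placeholder values (1 = fake, 0 = real); these are NOT project results.
test_labels = [1, 0, 1, 0]        # stand-in for a split of the training dataset
test_preds = [1, 0, 1, 1]
annotated_labels = [1, 0, 0, 1]   # stand-in for manually annotated online news
annotated_preds = [1, 1, 0, 0]

in_domain = accuracy(test_preds, test_labels)
out_of_domain = accuracy(annotated_preds, annotated_labels)
gap = generalization_gap(in_domain, out_of_domain)
print(f"in-domain={in_domain:.2f} out-of-domain={out_of_domain:.2f} gap={gap:.2f}")
```

Running the same two evaluations for each model and comparing the gaps is one way to make "generalizes better" concrete: a smaller gap suggests the model depends less on quirks of the training data.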

Legal Corpus NER

https://github.com/LiamNiisan/legal-corpus-NER
The project's goal was to build an annotated corpus from scratch with a browser interface for non-experts.

The data was scraped from BC's Court of Appeal and Supreme Court and was annotated using label-studio.

BERT for Hate Speech Detection

https://github.com/LiamNiisan/BERT-Fine-Tuning-Hate-Speech-Detection
This project experiments with many BERT variants to find the one that can best detect hate speech on social media. The variants tried were BERT base, DistilBERT, RoBERTa base, DistilRoBERTa, RoBERTa large, BERTweet, and BERTweet large.

BGC NASA Landslide Detection

https://github.com/LiamNiisan/BGC-NASA-landslide-detection
The goal of this project is to expand NASA's Cooperative Open Online Landslide Repository (COOLR) by automatically extracting landslide events from online sources.

The project consists of two parts:

1. News articles are extracted from online sources and then passed to a model that extracts the landslide's event properties.

2. The model extracts information from the articles: time, location, casualties, landslide category, and landslide trigger.
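The two parts listed above can be sketched as a small pipeline: articles go in, and structured event records come out. This is only an illustrative outline under stated assumptions. The `LandslideEvent` fields mirror the properties named in the list, while `run_pipeline` and `keyword_stub` are hypothetical names, and the stub stands in for the trained extraction model, which is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class LandslideEvent:
    """One extracted event; every field is optional because articles may omit it."""
    time: Optional[str] = None
    location: Optional[str] = None
    casualties: Optional[int] = None
    category: Optional[str] = None
    trigger: Optional[str] = None


# An extractor maps article text to an event, or None if no landslide is reported.
Extractor = Callable[[str], Optional[LandslideEvent]]


def run_pipeline(articles: Iterable[str], extract: Extractor) -> List[LandslideEvent]:
    """Pass each article to the extractor and keep only the detected events."""
    events: List[LandslideEvent] = []
    for text in articles:
        event = extract(text)
        if event is not None:
            events.append(event)
    return events


def keyword_stub(text: str) -> Optional[LandslideEvent]:
    """Placeholder extractor (assumption): flags the word 'landslide', fills no fields."""
    if "landslide" not in text.lower():
        return None
    return LandslideEvent()


sample_articles = [
    "A landslide closed the highway overnight.",
    "Local election results were announced today.",
]
events = run_pipeline(sample_articles, keyword_stub)
print(len(events))
```

Keeping the extractor behind a plain callable means the keyword stub could be replaced by a real model without touching the rest of the pipeline.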

2021 - 2022

Master's Degree in Data Science

University of British Columbia - Vancouver, BC, Canada

2017 - 2021

Bachelor's Degree in Engineering

École de Technologie Supérieure - Montreal, QC, Canada

Libraries/APIs

Natural Language Toolkit (NLTK), SpaCy, PyTorch, TensorFlow, jQuery

Tools

Gensim, ChatGPT, MATLAB, Geocoder

Languages

Python, C, SQL, C#, Swift, JavaScript, Java, PHP, HTML, Bash Script

Frameworks

LlamaIndex, Bootstrap

Platforms

Unix, Docker, Visual Studio Code (VS Code), Azure, Google Cloud Platform (GCP)

Paradigms

Agile

Storage

NoSQL

Other

Topic Modeling, Artificial Intelligence (AI), Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Large Language Models (LLMs), OpenAI, Chatbots, LangChain, Retrieval-augmented Generation (RAG), AI Chatbots, Computational Linguistics, BERT, Long Short-term Memory (LSTM), Machine Learning, Data Science, Deep Learning, Artificial Neural Networks (ANN), Full-stack, OpenAI GPT-3 API, OpenAI GPT-4 API, Prompt Engineering, Web Development, Cloud Computing, Machine Learning Operations (MLOps), Model Tuning, Electrical Engineering, Text Classification, QA Testing, fastText, Deep Neural Networks (DNNs), FastAPI, Scraping, Annotations, Custom BERT, Waveforms, Cisco, GPU Computing, Time Series, Signal Processing, Information Retrieval
