Khaled Abdelhamid
Verified Expert in Engineering
Machine Learning Engineer and Developer
6th of October City, Giza Governorate, Egypt
Toptal member since June 20, 2022
Khaled is a senior machine learning engineer with four years of experience building state-of-the-art solutions. He is passionate about natural language processing and computer vision. Khaled specializes in web scraping, data collection, and publishing competitive datasets in the Arabic language.
Preferred Environment
Linux, Windows, Visual Studio Code (VS Code), Slack, Notion, Python 3, Jira
The most amazing...
...thing I've developed is a smart engine that creates a knowledge graph of jobs in different industries for an HR development company.
Work Experience
Machine Learning/NLP Engineer
Agolo
- Led the implementation of multiple cutting-edge machine learning models for natural language processing (NLP) applications, achieving state-of-the-art accuracy and performance.
- Developed custom pre- and post-processing pipelines to enhance named-entity recognition (NER) robustness and accuracy while also improving multilingual support.
- Contributed to the development and optimization of a knowledge base system specifically tailored for information retrieval, streamlining access to critical information within the organization.
- Created tailored evaluation pipelines to meticulously measure the accuracy of various machine learning models, contributing to ongoing excellence.
- Constructed comprehensive quantitative evaluation dashboards, empowering development teams to conduct rigorous assessments and identify areas for improvement.
- Actively engaged in diverse research initiatives aimed at harnessing the capabilities of large language models, integrating them into the pipeline using tools like LangChain.
- Orchestrated end-to-end pipelines to facilitate seamless communication among numerous interconnected services, optimizing system performance.
- Played a pivotal role in seamlessly integrating machine learning models into CI/CD pipelines, ensuring scalability and enhanced performance.
- Devised and executed extensive performance stress testing protocols for deployed ML services, determining optimal resource requirements under varying loads.
- Managed a dynamic team of five professionals, successfully achieving the goal of implementing robust multilingual support within an impressive timeframe of under a month.
AI Expert
Safeguard My Car
- Developed a large language model (LLM)-based sales voice chatbot integrated with the client database, enabling real-time, high-throughput client interactions with an average response latency of one second.
- Implemented a retrieval-augmented generation (RAG) system to address user FAQs during live calls, utilizing LangChain and ChromaDB for seamless support.
- Directed project development using Jira and Agile methodologies, overseeing sprint coordination for efficient task management. Led a team of five engineers to optimize the architecture, ensuring market readiness.
Session Lead in Computer Science
Udacity
- Successfully managed weekly sessions with a cohort of 35 students, achieving an impressive graduation rate of 93%.
- Demonstrated exceptional leadership as a top-rated session lead, consistently earning a 5-star rating and receiving outstanding feedback from students.
- Maintained an active presence on Slack and mentored numerous students, fostering a supportive learning environment.
Machine Learning Engineer
Online Freelance Agency
- Contributed to building machine learning-based solutions for numerous customers.
- Designed and established data pipelines for NLP-based projects.
- Worked as a machine learning freelancer, achieving the maximum ratings and a 98% job success rate.
Research Assistant
Zewail City
- Developed a transformer-based model for the Arabic text diacritization task and outperformed the state-of-the-art method with a total accuracy of 98%.
- Processed Arabic speech data and built a deep learning-based model to perform speech-to-text on the processed data.
- Wrote comparative articles and literature reviews on natural language processing and deep learning and their application in Arabic for non-specialized Arabic readers.
Machine Learning Engineer
Proteinea (startup)
- Built training, testing, and logging pipelines that use deep learning to predict protein expression levels from DNA features.
- Established systems to monitor factory production status for logging and further analysis.
- Developed a computer vision-based system that counts moving, occluded objects on a conveyor belt with fast and robust predictions.
Experience
Event Information Extraction from Tweets for Tech Startup
I implemented the project by combining regex-based pattern recognition with spaCy's named-entity recognition features. For dates, I used a temporal model to parse the text, extract the dates, and normalize them so they could be queried and analyzed.
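A minimal sketch of the regex half of this approach, using only the standard library. The patterns, the `extract_event` name, and the "day month year" date format are assumptions for illustration, not the project's actual code; the spaCy NER step and the temporal model are not shown.

```python
import re
from datetime import date

# Hypothetical patterns; a real pipeline would cover many more formats.
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")
DATE_RE = re.compile(
    r"\b(\d{1,2}) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]* (\d{4})\b"
)
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def extract_event(tweet: str) -> dict:
    """Pull hashtags, mentions, and ISO-normalized dates out of a tweet."""
    dates = []
    for day, month, year in DATE_RE.findall(tweet):
        try:
            dates.append(date(int(year), MONTHS[month], int(day)).isoformat())
        except ValueError:
            # Skip impossible dates such as "31 Feb 2023".
            continue
    return {
        "hashtags": HASHTAG_RE.findall(tweet),
        "mentions": MENTION_RE.findall(tweet),
        "dates": dates,
    }
```

Normalizing dates to ISO 8601 strings keeps them sortable and easy to query later.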
Ranking and Associating Job Descriptions Along With Their Job Families for HR Agency
Text Mining and Analysis for Marketing Company
End-to-end Chatbot for Customer Services
https://www.youtube.com/watch?v=SJRmzDfWIec
• Connects to REST APIs to handle users' voice and text commands from their database.
• Performs the extraction of the answers from archived business contracts using deep learning and fuzzy search.
• Conducts named-entity recognition, intent classification, and form-filling to extract all the needed information to perform a successful information retrieval for users.
AI-based OCR for Extracting Data from Electric Meters
First, I applied gamma correction and other noise-canceling techniques to bring the images to a state the OCR engine could handle. Then, I developed the solution using EasyOCR and PaddleOCR. To improve recognition of seven-segment displays, I created a dataset using the TRDG library with custom seven-segment fonts to train the OCR engines. Because the text followed expected patterns, I used NLP methods to correct it based on the most likely predictions.
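The two ends of that pipeline can be sketched in plain Python: gamma correction on 8-bit grayscale pixels and a simple pattern-based cleanup of the OCR output. The function names, the gamma convention, and the character confusion table are assumptions for illustration; the OCR engines themselves and the actual correction method are not shown.

```python
def gamma_lut(gamma: float) -> list:
    """Lookup table mapping intensity v to 255 * (v / 255) ** (1 / gamma).

    With this convention (an assumption), gamma > 1 brightens dark regions.
    """
    return [round(255 * (v / 255) ** (1 / gamma)) for v in range(256)]


def apply_gamma(image, gamma: float):
    """Apply gamma correction to a grayscale image given as rows of 0-255 ints."""
    lut = gamma_lut(gamma)
    return [[lut[pixel] for pixel in row] for row in image]


# Hypothetical confusion table: letters an OCR engine might emit where a
# meter reading can only contain digits.
CONFUSIONS = str.maketrans({"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"})


def normalize_reading(text: str) -> str:
    """Replace commonly confused letters with the digits they resemble."""
    return text.translate(CONFUSIONS)
```

A lookup table is used because an 8-bit image has only 256 possible intensities, so the power function is evaluated once per intensity rather than once per pixel.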
YouTube Scraper
Traffic Analysis with CCTV Cameras
I used Bash scripting and FFmpeg for the video processing and the FairMOT model to classify and track the objects. The categories were pedestrians, cars, motorcycles, tricycles, and buses. The tracking data was filtered and used to create a traffic flow map, a visualization chart showing the densest areas in a specific location for a given time.
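One way to turn tracking output into a density map is to bin tracked positions into a grid and count hits per cell. The sketch below assumes the tracker yields `(x, y)` pixel positions and that a fixed `cell_size` is acceptable; both are illustrative assumptions, and the rendering step is omitted.

```python
from collections import Counter


def density_grid(points, cell_size: int) -> Counter:
    """Count tracked positions per grid cell.

    points: iterable of (x, y) pixel coordinates (assumed tracker output).
    Returns a Counter keyed by (column, row) cell index.
    """
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts


def busiest_cell(points, cell_size: int):
    """Return the (cell, count) pair with the most tracked positions."""
    return density_grid(points, cell_size).most_common(1)[0]
```

The resulting counts can be passed to any plotting library to draw the heat map for a chosen time window.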
Email Scraper
I used Scrapy to make asynchronous requests to speed up scraping. For each page, I extracted all the emails using regular expression patterns designed to minimize false negatives. The scraper extracted emails from 90% of the websites.
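The extraction step might look like the following sketch. The pattern is a deliberately permissive assumption (it favors recall over strict RFC compliance), and `extract_emails` is a hypothetical helper; the Scrapy spider that would supply the page HTML is not shown.

```python
import re

# Permissive pattern: accepts most real-world addresses at the cost of
# occasionally matching strings that are not valid emails.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(html: str) -> list:
    """Return unique, lowercased email addresses in order of first appearance."""
    found = (match.lower() for match in EMAIL_RE.findall(html))
    return list(dict.fromkeys(found))
```

Running the pattern over raw HTML rather than visible text also catches addresses that appear only inside `mailto:` links.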
Genuine Artistic Images from Custom Tags Using GANs
I built a Streamlit dashboard to interact smoothly with the user and get the name tags. Then I scraped websites containing high-resolution free images and used the DeepDream model in processing the scraped images to get newer artistic ones that were unique and related to the given name tags. The output was delivered, allowing the user to refresh and edit the parameters provided to the DeepDream model.
Named-Entity Extractor Pipeline
Text Data Augmentation
• random replacements of characters in words based on the most likely mistakes for such a word;
• use of synonyms and antonyms with negation;
• translation of phrases into a different language and then re-translation to the original language (this method could chain multiple languages sequentially); and
• paraphrasing sentences using the BERT-based model.
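The first technique in the list above can be sketched as follows. The `NEIGHBORS` table is a small hypothetical stand-in for a real "most likely mistakes" model (here, adjacent keys on a QWERTY keyboard), and `add_typos` is an illustrative name rather than the project's function.

```python
import random

# Hypothetical confusion table: each character maps to plausible mistyped
# replacements (adjacent QWERTY keys). A real system would learn this.
NEIGHBORS = {"a": "qs", "e": "wr", "i": "uo", "o": "ip", "s": "ad", "t": "ry", "n": "bm"}


def add_typos(text: str, rate: float, seed=None) -> str:
    """Replace each eligible character with a likely mistake with probability `rate`."""
    rng = random.Random(seed)  # a seeded generator keeps augmentation reproducible
    out = []
    for ch in text:
        options = NEIGHBORS.get(ch)
        if options and rng.random() < rate:
            out.append(rng.choice(options))
        else:
            out.append(ch)
    return "".join(out)
```

Because only characters are swapped, the augmented text keeps its original length and token boundaries, so existing labels stay aligned.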
Converting Driver License Images Into Tabulated Data
ECG Analysis and Cardiac Arrhythmia Detection
I built hardware to capture a person's ECG signal using a NodeMCU and a custom ECG kit. Then I read and uploaded the data into Firebase and trained and deployed a machine learning classifier in GCP. The classifier assigned each signal to one of six categories: normal or one of five cardiac arrhythmia types. The signal was processed and analyzed with LabVIEW.
Website Topic Categorization
I used Scrapy to process all the links asynchronously and get all the paragraph and header tags, which contained most of the informative text inside the URL. Then I used Google NLP API to extract the most likely category, saved the data continuously, and stored it in another CSV file.
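The tag-filtering step can be illustrated with the standard library's `html.parser`. This is a sketch under the assumption that paragraph and header tags are properly closed; the class and function names are hypothetical, and the crawling and the Google NLP API call are not shown.

```python
from html.parser import HTMLParser

TEXT_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collect text that appears inside paragraph and header tags."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many TEXT_TAGS elements are currently open
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in TEXT_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in TEXT_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def informative_text(html: str) -> str:
    """Return the text of all paragraph and header tags joined by spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Text in navigation menus, titles, and scripts is skipped, which keeps the input to the categorization step focused on the page's main content.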
JPEG Image Compression
https://github.com/Khaled-Abdelhamid/Jpeg
I achieved 50% compression efficiency with almost 0.02% data loss between the raw and compressed images.
Spectrum Analyzer Application
https://github.com/Khaled-Abdelhamid/Spectrum-analyzer-application
Education
Bachelor of Engineering Degree in Communications and Information Technology Engineering
University of Science and Technology, Zewail City - Giza, Egypt
Certifications
Natural Language Processing Specialization
Coursera
AWS Cloud Practitioner Essentials
Amazon Web Services
Advanced Data Analysis Nanodegree Program
Udacity
IELTS | 7.5 | C1
British Council
Deep Learning Specialization
Coursera
Skills
Libraries/APIs
Natural Language Toolkit (NLTK), SpaCy, NumPy, Pandas, Rasa NLU, PyTorch, SciPy, TensorFlow, Keras, Matplotlib, OpenCV, Beautiful Soup, Scikit-learn, PySpark, DeepSpeech, Python Asyncio, FFmpeg
Tools
Git, Jupyter, Notion, Named-entity Recognition (NER), Seaborn, LabVIEW, Slack, Trello, Rasa.ai, ChatGPT, Kibana, Azure Machine Learning, MATLAB, TensorBoard, Amazon Elastic Block Store (EBS), LaTeX, Plotly, ELK (Elastic Stack), Jira
Languages
Python 3, Regex, Markdown, Python, C++, C, SQL, Julia, C#, Bash, Bash Script, YAML
Frameworks
Streamlit, Hadoop, Spark, Scrapy, Selenium, Apache Spark, LlamaIndex
Platforms
Jupyter Notebook, Amazon Web Services (AWS), Docker, Visual Studio Code (VS Code), Amazon EC2, Linux, Windows, Google Cloud Platform (GCP), Firebase
Storage
JSON, Data Pipelines, PostgreSQL, Amazon S3 (AWS S3), Elasticsearch, HBase, SQL Server 2016, Apache Hive
Paradigms
REST, ETL
Other
Deep Learning, Machine Learning, Data Science, English, Natural Language Processing (NLP), Google Colaboratory (Colab), Chatbots, Image Processing, Data Cleaning, Data Processing, Data Analysis, Artificial Intelligence (AI), Text Analytics, Data Engineering, Neural Networks, Speech Recognition, APIs, AI Design, Chatbot Conversation Design, Project Consultancy, Advisory, Generative Pre-trained Transformers (GPT), Large Language Models (LLMs), Data Cleansing, LangChain, BERT, Data Visualization, Web Dashboards, Computer Vision, Web Scraping, Signal Processing, Digital Signal Processing, GPU Computing, Cloud, Machine Vision, Text Generation, Language Models, Google BigQuery, OpenAI GPT-3 API, Amazon Machine Learning, Google Cloud Machine Learning, OpenAI GPT-4 API, Speech to Text, Speech to Text AI, Automatic Speech Recognition (ASR), Text to Speech (TTS), Serverless, Model Tuning, Statistics, Visualization, A/B Testing, OCR, NodeMCU, FastAPI, Data Collection, Speech to Intent