Mohab Ayman

Data Scientist and AI Developer in Cairo, Cairo Governorate, Egypt

Member since September 30, 2020
Mohab is a data scientist and machine learning developer, specializing in natural language processing (NLP) and computer vision. He has five years of professional experience, and recent projects have focused on machine learning in the areas of natural language understanding (NLU), cheminformatics, and self-driving cars. Mohab stays current with cutting-edge advancements in deep learning.


  • Octimine
    Web Development, Data Modeling, JavaScript, Docker Hub, NumPy, Matplotlib...
  • Microsoft
    Web Development, Matplotlib, NumPy, .NET, Visual Studio Code, Visual Studio...
  • Self-employed
    C++, Visual Studio Code, GitLab, GitHub, Git, Jupyter Notebook, R...






Preferred Environment

Anaconda, PyTorch, Linux, Python

The most amazing project I've developed is a deep learning system that pairs work partners to cooperate on similar goals, based on the semantic similarity of their profiles.


Experience

  • Data Scientist

    2019 - 2021
    Octimine
    • Conducted research in biomedical named entity recognition (NER) and developed a Python system that extracts and normalizes chemical and disease entities from legal text.
    • Created a monitoring system in Node.js to collect information from staging and production servers. Visualized the results and created monitoring dashboards using Grafana.
    • Used Docker to containerize external dependencies and runtimes for various system components, reducing dependency overhead and speeding up development pipelines.
    Technologies: Web Development, Data Modeling, JavaScript, Docker Hub, NumPy, Matplotlib, Machine Learning, Visual Studio Code, GitLab, Git, Jupyter Notebook, Natural Language Processing (NLP), Word2Vec, Linux, Java, Python, Data Science, Software Engineering, Neural Networks, Deep Neural Networks, NLTK, Node.js, Docker, Grafana, Pandas, Cheminformatics, NER, Data Visualization, Deep Learning, Transformers, HDF5, Scikit-learn, Seaborn, PyTorch, Elasticsearch, Kibana, Data Engineering, Big Data, XPath, XQuery, Scraping, Data Scraping, Text Classification, Regex, Categorization, Data Pipelines, Data Analytics, Data Analysis, Analysis, Analytics, Scientific Data Analysis, JSON, Redis
  • Research Software Development Engineer

    2018 - 2019
    Microsoft
    • Developed an automated benchmarking pipeline in Python based on various NLU evaluation metrics. The pipeline runs periodically and produces up-to-date evaluation metrics for the system, along with comparisons against competitor systems.
    • Worked on back-end servers with C# and the .NET Framework. Created new API endpoints and optimized existing ones, resulting in a significant drop in response latency.
    • Refactored a large legacy system component into an extensible design following established design patterns, making future extension easier while maintaining backward compatibility.
    Technologies: Web Development, Matplotlib, NumPy, .NET, Visual Studio Code, Visual Studio, Git, Natural Language Processing (NLP), Word2Vec, Anaconda, Agile Software Development, Software Engineering, Data Science, Seaborn, Scikit-learn, NLTK, Pandas, Natural Language Understanding (NLU), NER, Data Visualization, ASP.NET, C#, Python, Regex, Text Classification, Classification, Text Categorization, SQL, Data Analysis, Data Analytics, Data Pipelines, Analysis, Analytics, Scientific Data Analysis
  • Data Scientist

    2016 - 2016
    Self-employed
    • Collaborated with a chemist colleague on chemical data analysis tasks, focusing on finding patterns and relations between chemical compound structures and their usage in drugs related to specific diseases.
    • Conducted experiments in natural language understanding and created a pipeline that performs intent classification and named-entity recognition to automate the processing of client receipts.
    • Used image recognition and computer vision algorithms to extend a license plate recognition system so that it identifies non-standard, handwritten, and multilingual characters.
    Technologies: C++, Visual Studio Code, GitLab, GitHub, Git, Jupyter Notebook, R, Natural Language Processing (NLP), Word2Vec, Anaconda, Data Science, Software Engineering, Data Visualization, NLTK, Computer Vision, OpenCV, NER, Natural Language Understanding (NLU), Cheminformatics, NumPy, Matplotlib, Seaborn, Scikit-learn, Pandas, Python, Exploratory Data Analysis, Text Categorization, Data Analysis, Data Analytics, Analysis, Analytics, Scientific Data Analysis
  • Research Assistant

    2015 - 2015
    Ulm University
    • Conducted research in neuroinformatics, focusing on analyzing biomedical data of patients and identifying patterns that reflect the level of pain a patient is undergoing during a medical operation.
    • Created machine learning models that predict a patient's pain intensity from visual data of their facial expressions and biopotential data from sensors recording signals in their nervous system.
    • Developed a neural network package in the R language that implements a parameterized multi-layer perceptron optimized with resilient and classic backpropagation algorithms.
    Technologies: Ggplot2, C#, Data Science, Data Visualization, Deep Learning, Neuroinformatics, Neural Networks, Machine Learning, Python, R, Data Analysis, Data Analytics, Analysis, Analytics, Scientific Data Analysis, Clustering, RStudio, RStudio Shiny, Dplyr, Tidyverse


Projects

  • Deep Learning Helper for Annotating Pixels for Semantic Segmentation

    A system that helps human annotators label image pixels for semantic segmentation. It uses active learning to suggest only a subset of the pixels for labeling, yet reaches accuracy comparable to that of full labeling. I built the system using PyTorch and supporting libraries. It significantly reduces the effort a human annotator needs to label an image.

  • Word Embeddings for Work Colleague Matching

    A system that matches colleagues with similar objectives and skills to facilitate collaboration. It uses word-embedding algorithms to measure the semantic similarity of employees' profiles, so employees can easily find colleagues who can help with their tasks.

  • Automated Data Processing and Visualization Pipeline

    An automated pipeline, built with Python and Bonobo, that runs chained data processing operations with multiple parameters and automatically produces analysis plots with Seaborn. The goal of the pipeline was to reverse-engineer the parameters needed to replicate some results for the client.

  • Generative Adversarial Networks for Improving Image Quality

    A system that uses generative adversarial networks (GANs) to recover high-resolution images from low-resolution images and semantic labels used by the client. The system is implemented in PyTorch and uses a pre-trained model of the SRGAN (Super-Resolution GAN) architecture.
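To illustrate the colleague-matching idea described under "Word Embeddings for Work Colleague Matching", here is a minimal sketch of profile matching by embedding similarity. It is not the original system's code. It assumes each profile is reduced to a list of keywords, averages the word vectors of those keywords, and ranks other profiles by cosine similarity. The toy vectors, profile names, and function names are all hypothetical; a real system would presumably load trained embeddings such as Word2Vec or GloVe.

```python
import math

# Hypothetical toy word vectors; a real system would load trained
# embeddings (for example Word2Vec or GloVe) instead of these values.
TOY_VECTORS = {
    "nlp": [0.9, 0.1, 0.0],
    "transformers": [0.8, 0.2, 0.1],
    "vision": [0.1, 0.9, 0.0],
    "opencv": [0.0, 0.8, 0.2],
    "docker": [0.1, 0.1, 0.9],
}


def embed_profile(words):
    """Average the vectors of known words; return None if none are known."""
    known = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(name, profiles):
    """Return the name of the other profile most similar to `name`, or None."""
    query = embed_profile(profiles[name])
    if query is None:
        return None
    scored = []
    for other, words in profiles.items():
        if other == name:
            continue
        vec = embed_profile(words)
        if vec is not None:
            scored.append((cosine(query, vec), other))
    return max(scored)[1] if scored else None


# Hypothetical profiles, each reduced to a few keywords.
PROFILES = {
    "alice": ["nlp", "transformers"],
    "bob": ["vision", "opencv"],
    "carol": ["nlp", "docker"],
}

if __name__ == "__main__":
    print(best_match("alice", PROFILES))
```

Averaging word vectors is only one simple way to embed a profile. It ignores word order and weights every keyword equally, so it should be read as a baseline rather than as the approach the project actually used.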


Skills

  • Languages

    Python, SQL, C#, R, Java, C++, JavaScript, SPARQL, RDF, XPath, XQuery, Regex
  • Libraries/APIs

    Pandas, SciPy, PyTorch, NLTK, HDF5, TensorFlow, Node.js, Scikit-learn, Matplotlib, NumPy, OpenCV, Ggplot2, Spark ML, SQLAlchemy, NetworkX, Tidyverse
  • Paradigms

    Data Science, Agile Software Development, MapReduce, ETL
  • Storage

    PostgreSQL, Cassandra, Elasticsearch, MySQL, Data Pipelines, JSON, Redis, Redshift
  • Other

    Neural Networks, Data Visualization, Machine Learning, Data Analysis, Data Analytics, Analysis, Analytics, Transformers, Computer Vision, Active Learning, Deep Learning, Natural Language Processing (NLP), Artificial Intelligence (AI), A/B Testing, Cohort Analysis, Metabase, Natural Language Understanding (NLU), Semantic Segmentation, Software Engineering, Cheminformatics, Word2Vec, GloVe, Convolutional Neural Networks, Recurrent Neural Networks, Neuroinformatics, Deep Neural Networks, Data Modeling, Web Development, Linear Regression, Linear Algebra, Time Series, Time Series Analysis, Social Network Analysis, Network Analysis, Data Engineering, Mathematics, Statistics, Data Processing, Bonobo, Reverse Engineering, Big Data, Scraping, Data Scraping, Text Classification, Classification, Exploratory Data Analysis, Text Categorization, Categorization, Scientific Data Analysis, Clustering, FAISS, Social Network Analytics, BERT, Image Processing, Data Building Tool (DBT), Funnel Analysis, Hypothesis Testing, Generative Adversarial Networks (GANs), Image Analysis, Shell Scripting, AWS, Web Scraping, Statistical Data Analysis, GPT-3, Hugging Face, Machine Learning Operations (MLOps)
  • Platforms

    Visual Studio Code, Linux, Anaconda, Docker, Jupyter Notebook, Amazon Web Services (AWS), RStudio
  • Frameworks

    ASP.NET, Flask, .NET, Spark, Apache Spark, RStudio Shiny
  • Tools

    NER, Seaborn, Git, Visual Studio, Grafana, GitLab, Docker Hub, GitHub, Spark SQL, Kibana, Apache Airflow, Amazon SageMaker, Elastic, Dplyr


Education

  • Bachelor's Degree (Hons) in Computer Science
    2011 - 2016
    The German University in Cairo - New Cairo, Egypt
