Mohab Ayman
Verified Expert in Engineering
Data Scientist and AI Developer
Mohab is a data scientist and machine learning developer specializing in natural language processing (NLP) and computer vision. He has five years of professional experience, and his recent projects have focused on machine learning for natural language understanding (NLU), cheminformatics, and self-driving cars. Mohab stays current with cutting-edge advancements in deep learning.
Preferred Environment
Anaconda, PyTorch, Linux, Python
The most amazing...
...project I've developed is a deep learning system that pairs work partners with similar goals based on the semantic similarity of their profiles.
Work Experience
AI Developer
Quantum Innovation Ventures LLC
- Developed an LLM-powered application to automate the process of investment memo creation.
- Created the architecture for the LLM with LangChain.
- Wrapped the LLM app in a Django application and deployed it to Azure Cloud.
Data Scientist
Octimine
- Conducted research in biomedical named entity recognition (NER) and developed a system in Python that extracts and normalizes chemical entities and diseases from legal text.
- Created a monitoring system in Node.js to collect information from staging and production servers. Visualized the results and made monitoring dashboards using Grafana.
- Used Docker to containerize external dependencies and runtimes for various system components, reducing dependency overhead and speeding up development pipelines.
Research Software Development Engineer
Microsoft
- Developed an automated benchmarking pipeline in Python based on various NLU evaluation metrics. The pipeline runs on a schedule and produces up-to-date evaluation metrics for the system, along with comparisons against competitor systems.
- Worked on back-end servers with C# and .NET Framework. Created new API endpoints and optimized existing ones, resulting in a significant drop in response latency.
- Refactored a large legacy system component into an extensible design following established design patterns, making future extension easier while maintaining backward compatibility.
Data Scientist
Self-employed
- Collaborated with expert chemists on chemical data analysis tasks, focusing on finding patterns and relations between chemical compound structures and their usage in drugs related to specific diseases.
- Conducted experiments in natural language understanding and created a pipeline that performs intent classification and named-entity recognition to automate the processing of client receipts.
- Used image recognition and computer vision algorithms to enhance the capabilities of a license plate recognition system to identify non-standard, hand-written, and multilingual characters.
Research Intern
Ulm University
- Conducted research in neuroinformatics, focusing on analyzing patients' biomedical data and identifying patterns that reflect the level of pain a patient experiences during a medical operation.
- Created machine learning models that predict the pain intensity of a specific patient based on visual data from their facial expressions and biopotential data from sensors recording signals in their nervous system.
- Developed a neural network package in the R language that implements a parameterized multi-layer perceptron optimized with resilient and classic backpropagation algorithms.
Experience
AI Assistant for Investment Memo Creation
AI Assistant for Lawyers
AI Judge for Automating Customer Service Chat Evaluation
Generating Informed Sitemaps Using Web Crawling and GPT
Word Embeddings for Work Colleague Matching
Deep Learning Helper for Annotating Pixels for Semantic Segmentation
Automated Data Processing and Visualization Pipeline
Generative Adversarial Networks for Improving Image Quality
Traffic Scene Generation Based on Graph CNNs and GANs
Skills
Languages
Python, SQL, C#, R, Java, C++, JavaScript, SPARQL, RDF, XPath, XQuery, Regex, Google Apps Script, TypeScript
Libraries/APIs
Pandas, Scikit-learn, NumPy, SciPy, PyTorch, Natural Language Toolkit (NLTK), HDF5, TensorFlow, Node.js, Matplotlib, OpenCV, Ggplot2, Spark ML, SQLAlchemy, NetworkX, Tidyverse, Google Sheets API, Google Speech API, Google Speech-to-Text API, React, Office API, LINQ, D3.js
Paradigms
Data Science, Agile Software Development, MapReduce, ETL, Search Engine Optimization (SEO)
Platforms
Jupyter Notebook, Visual Studio Code (VS Code), Amazon Web Services (AWS), Linux, Anaconda, Docker, RStudio, Google Cloud Platform (GCP), Azure
Storage
PostgreSQL, Cassandra, Elasticsearch, MySQL, Data Pipelines, JSON, Redis, Redshift
Other
Neural Networks, Data Visualization, Natural Language Processing (NLP), Machine Learning, Artificial Intelligence (AI), Data Analysis, Data Scraping, Data Analytics, Analysis, Analytics, ChatGPT, Large Language Models (LLMs), GPT, OpenAI GPT-4 API, LangChain, OpenAI GPT-3 API, OpenAI, Prompt Engineering, Transformers, Computer Vision, Active Learning, Deep Learning, Data Engineering, BERT, A/B Testing, Cohort Analysis, Metabase, Language Models, Natural Language Understanding (NLU), Semantic Segmentation, Software Engineering, Cheminformatics, Word2Vec, GloVe, Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNNs), Neuroinformatics, Deep Neural Networks, Data Modeling, Web Development, Linear Regression, Linear Algebra, Time Series, Time Series Analysis, Social Network Analysis, Network Analysis, Mathematics, Statistics, Data Processing, Bonobo, Reverse Engineering, Big Data, Scraping, Text Classification, Classification, Exploratory Data Analysis, Text Categorization, Categorization, Scientific Data Analysis, Clustering, FAISS, Social Network Analytics, Image Processing, Data Build Tool (dbt), Funnel Analysis, Hypothesis Testing, Generative Adversarial Networks (GANs), Image Analysis, Shell Scripting, Web Scraping, Statistical Data Analysis, Hugging Face, Machine Learning Operations (MLOps), Google Cloud Functions, Predictive Analytics, Data Mining, ETL Tools, ETL Testing, Text Mining, Self-driving Cars, Code Review, Technical Hiring, Interviewing, Recommendation Systems, Excel 365, Experimental Design, OfficeJS, Office Add-ins, Database Analytics, Artificial Neural Networks (ANN), Search, Generative Pre-trained Transformers (GPT), GPT Neo, HTML Parsing, Text Generation, OCR, Text Recognition, CSV, Data Transformation, Word Embedding, Back-end, Dashboards, Gunicorn, Chatbots, Full-stack, Software Architecture, APIs, Cloud, Text to Task, Data Synthesis
Tools
Celery, Named-entity Recognition (NER), Seaborn, Git, Visual Studio, Grafana, GitLab, Docker Hub, GitHub, Spark SQL, Kibana, Apache Airflow, Amazon SageMaker, Elastic, Dplyr, Google Sheets, Pytest, Babel, Yeoman, Doc2Vec, Jupyter
Frameworks
ASP.NET, Flask, .NET, Spark, Apache Spark, RStudio Shiny, Django, Jinja, Streamlit
Education
Master's Degree in Data Science
Technical University of Munich (TUM) - Germany
Bachelor's Degree (Hons) in Computer Science
The German University in Cairo - New Cairo, Egypt
Certifications
C#: Advanced Practice
React.js: Building an Interface
Amazon Redshift Essentials
Microsoft Office Add-ins for Developers
React.js Essential Training
React: Creating and Hosting a Full-stack Site (2019)
The Data Science of Experimental Design