Yahia Mahmoud, Developer in Cairo, Cairo Governorate, Egypt

Yahia Mahmoud

Verified Expert in Engineering

Data Analyst and Developer

Location
Cairo, Cairo Governorate, Egypt
Toptal Member Since
January 9, 2024

Yahia is a data analyst and web scraper with nearly three years of experience. He has focused on cleaning data, building scraping scripts, and analyzing data for companies in Germany, Canada, and Egypt. Yahia specializes in data analysis, Python, Selenium, Beautiful Soup, Microsoft Excel, SQL, and MongoDB.

Portfolio

E-motion Digital Creative Agency
Python, Python 3, SQL, Pandas, NumPy, Regex, Jupyter, Jupyter Notebook...
Analytic Company GmbH
SQL, Regex, Excel 365, Data, Data Cleaning, Data Cleansing, Algorithms...
Aview International
Selenium, Beautiful Soup, Python, Python 3, Visual Studio Code (VS Code)...

Availability

Part-time

Preferred Environment

Windows, Jupyter Notebook, Visual Studio Code (VS Code), Google Colaboratory (Colab), Python, Selenium, Beautiful Soup, Microsoft Excel, SQL, MongoDB

The most amazing...

...project I've been a part of transformed complex data cleaning tasks into streamlined processes, boosting productivity through scripts that saved time by 98%.

Work Experience

Data Operation Specialist

2023 - PRESENT
E-motion Digital Creative Agency
  • Led the "Gromart" project's foundation and served as the data team's central figure, overseeing the development and execution of web scraping initiatives across 15+ websites.
  • Implemented efficient data collection strategies using Python libraries such as Selenium and Beautiful Soup, enabling the extraction of diverse data types. Stored the results in Microsoft Excel workbooks and in relational and non-relational databases, including MongoDB.
  • Engineered scripts that reduced time-consuming manual tasks to processes completed in five minutes, significantly improving productivity and resource utilization.
  • Orchestrated the integration of multiple data storage solutions, including MongoDB, for unstructured and structured data, ensuring a versatile and scalable approach to data management within the "Gromart" project.
Technologies: Python, Python 3, SQL, Pandas, NumPy, Regex, Jupyter, Jupyter Notebook, Visual Studio Code (VS Code), Selenium, Beautiful Soup, MongoDB, Requests, Excel 365, Google Colaboratory (Colab), Microsoft PowerPoint, Data, Data Analysis, Data Cleaning, Data Cleansing, Algorithms, Data Entry, CSV, XLSX File Processing, Microsoft Excel, JSON, Website Data Scraping, Data Scraping, PDF Scraping, PDF to Excel, Excel VBA, Financial Data, HTML, Manual QA

Data Analyst

2022 - PRESENT
Analytic Company GmbH
  • Conducted comprehensive research to gather missing information and enhance data completeness, contributing to a broader understanding of the data.
  • Executed thorough data analysis, including filling in missing values, performing model-specific analyses, and conducting quality checks to ensure data accuracy and completeness.
  • Delivered over seven datasets monthly by inputting and processing relevant vehicle data into the existing data entry system and online editor.
Technologies: SQL, Regex, Excel 365, Data, Data Cleaning, Data Cleansing, Algorithms, Data Entry, CSV, XLSX File Processing, Data Analysis, Excel VBA, Financial Data, Manual QA

Data Analyst and Web Scraper

2022 - 2023
Aview International
  • Scraped over 40,000 rows of data from diverse websites, including YouTube and Udemy, using Python libraries such as Selenium and Beautiful Soup.
  • Led a team, ensuring smooth project execution by setting clear objectives, providing regular reports, and assisting colleagues in troubleshooting issues.
  • Leveraged Python libraries such as Pandas and NumPy to clean the data, ensuring data quality and reliability for subsequent analysis and decision-making.
  • Organized the extracted data into structured formats, including CSV files and Microsoft Excel sheets, to facilitate efficient analysis and reporting.
Technologies: Selenium, Beautiful Soup, Python, Python 3, Visual Studio Code (VS Code), Excel 365, Pandas, NumPy, Requests, Data, Data Analysis, Data Cleaning, Data Cleansing, Algorithms, CSV, XLSX File Processing, Web Scraping, Microsoft Excel, Website Data Scraping, Data Scraping, Excel VBA, HTML, Manual QA

Data Analyst

2021 - 2022
Freelance
  • Collaborated with a diverse range of clients, including over five companies and various individual customers, delivering specialized data services.
  • Leveraged my expertise as a data analyst, data entry specialist, data collector, web scraper, and data visualization specialist to meet client-specific requirements.
  • Successfully managed projects through online freelance agencies, platforms, and personal contacts, ensuring timely delivery of high-quality work.
  • Demonstrated versatility and adaptability, consistently meeting client expectations and exceeding project goals.
Technologies: Python, Google Colaboratory (Colab), Data Analysis, Data Cleansing, Data Cleaning, Data, Jupyter, Microsoft PowerPoint, Excel 365, SQL, Tableau, Selenium, Python 3, Pandas, NumPy, Requests, Algorithms, Data Entry, CSV, XLSX File Processing, Matplotlib, Seaborn, PDF Scraping, PDF to Excel, Excel VBA, Financial Data, HTML, Manual QA

Dream2000 Products Scraper

https://github.com/YahiaML/Dream2000-Scraper
The Dream2000 Products Scraper is a web scraping solution for extracting detailed product data from the Dream2000 eCommerce platform. My role in this project involved conceptualizing, developing, and optimizing the scraper to navigate the dynamic website, overcoming challenges such as asynchronous content loading and multilingual content.

Leveraging Python, Beautiful Soup, and Selenium, I ensured the scraper's adaptability to the site's structure, enabling it to systematically collect product information, including categories, subcategories, links, images, prices, and descriptions. Implementing a checkpoint system mitigated potential data loss and provided reliability during prolonged scraping sessions.

This project showcases my expertise in web scraping, data processing with Pandas, and problem-solving in handling diverse website structures. The Dream2000 Products Scraper is a testament to my commitment to delivering robust, versatile, and user-friendly solutions for extracting valuable insights from complex online platforms.
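
A checkpoint system of the kind described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code from the repository: the function names, the JSON file layout, and the `fetch_product` callback (which stands in for the Selenium and Beautiful Soup page parsing) are all assumptions made for the example.

```python
import json
import os
from pathlib import Path


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint file exists."""
    p = Path(path)
    if p.exists():
        return json.loads(p.read_text(encoding="utf-8"))
    return {"done": [], "rows": []}


def save_checkpoint(path, state):
    """Write to a temporary file first, then swap it in, so an interrupted
    write cannot leave a half-written checkpoint behind."""
    p = Path(path)
    tmp = p.with_suffix(p.suffix + ".tmp")
    tmp.write_text(json.dumps(state, ensure_ascii=False), encoding="utf-8")
    os.replace(tmp, p)


def scrape_all(urls, fetch_product, checkpoint_path, every=10):
    """Scrape each URL once, skipping URLs already recorded in the checkpoint.

    fetch_product is a hypothetical callback that takes a URL and returns
    a JSON-serializable dict for one product.
    """
    state = load_checkpoint(checkpoint_path)
    done = set(state["done"])
    try:
        for url in urls:
            if url in done:
                continue  # already scraped in an earlier session
            state["rows"].append(fetch_product(url))
            state["done"].append(url)
            done.add(url)
            if len(state["done"]) % every == 0:
                save_checkpoint(checkpoint_path, state)
    finally:
        # Persist progress whether the loop finished or an error interrupted it.
        save_checkpoint(checkpoint_path, state)
    return state["rows"]
```

Calling `scrape_all` again with the same checkpoint path after a crash resumes from the first URL that was not completed, which is the property that matters for long scraping sessions.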

Coldwell Banker Data Processing

https://github.com/YahiaML/Coldwell-Banker-Data-Processing
This project optimized the integration of real estate data into the Coldwell Banker system. Leveraging Python and Pandas, it focused on extracting valuable insights from diverse datasets, ensuring accuracy, and aligning data with Coldwell Banker's standards. The script streamlines the integration process through meticulous data cleaning, mapping, and the generation of purpose-specific Microsoft Excel sheets.

My involvement in the project encompassed the data processing script's design, development, and refinement. I played a pivotal role in crafting the logic for data cleaning, implementing mapping strategies, and orchestrating the creation of Microsoft Excel sheets. Additionally, I spearheaded the project's overarching goal of comparing new projects with existing ones and refining project names based on a predefined mapping schema. My commitment to data accuracy and consistency has been instrumental in the project's success, contributing to the seamless integration of real estate data into Coldwell Banker's system.
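
The two steps named above, refining project names against a predefined mapping and comparing new projects with existing ones, can be illustrated with a short sketch. It is written in plain Python to stay dependency-free, whereas the project itself used Pandas. The normalization rule, the function names, and the sample project names are hypothetical and are not taken from the repository.

```python
def normalize(name):
    """Case-fold and collapse whitespace so near-identical names compare equal."""
    return " ".join(name.casefold().split())


def refine_names(raw_names, name_map):
    """Replace each raw name with its canonical form from the mapping.

    name_map keys are normalized raw names; names without an entry are
    kept as-is apart from stripping surrounding whitespace.
    """
    return [name_map.get(normalize(n), n.strip()) for n in raw_names]


def split_new_and_existing(incoming, existing):
    """Separate incoming names into ones not yet in the system and ones
    that already exist there."""
    known = {normalize(n) for n in existing}
    new, matched = [], []
    for name in incoming:
        (matched if normalize(name) in known else new).append(name)
    return new, matched


# Hypothetical sample data, for illustration only.
NAME_MAP = {
    "garden hts.": "Garden Heights",
    "sunrise residence": "Sunrise Residence",
}
incoming = ["  sunrise  residence ", "Lakeview Towers", "Garden Hts."]
existing = ["Garden Heights", "sunrise residence"]

refined = refine_names(incoming, NAME_MAP)
new_projects, already_present = split_new_and_existing(refined, existing)
```

Refining names before the comparison matters: without it, "Garden Hts." would be treated as a new project even though "Garden Heights" already exists.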

Movie Data Analysis

https://github.com/YahiaML/TMDb-movies-data-investigation
This project dives deep into The Movie Database (TMDB) dataset using Python, unveiling intricate details of the film industry. Employing Pandas, Matplotlib, and Seaborn, the analysis unfolds multifaceted insights, from genre popularity evolution to complex correlations between revenue, popularity, and vote averages. Unveiling the influence of stars and directors on movie ratings adds a layer of sophistication. The project serves as a comprehensive guide for movie enthusiasts and data aficionados, illuminating the dynamic landscape of cinema. Despite inherent limitations like dropped rows and missing values, the project highlights Python's prowess in extracting meaningful narratives from diverse datasets.

Manga Downloader

https://drive.google.com/file/d/1RyB6YIzS-mRsMWItHXSHNDR50yyoyf7_/view?usp=sharing
I developed Manga Downloader to enable users to download and save their favorite manga and manhwa series for offline reading. It was implemented in Python, utilizing Beautiful Soup for web scraping and the Requests library for seamless connections with web pages. In addition, image processing libraries like PIL or OpenCV handle downloaded images for tasks such as resizing and cropping.

Key features include a vast selection of manga and manhwa titles, chapter-specific downloads for offline reading, and a user-friendly offline reading experience. Automation supports batch downloads of multiple chapters or entire series, saving time for avid readers. This project, driven by a passion for manga and manhwa, leverages Python and diverse libraries to provide a delightful offline reading experience for enthusiasts worldwide.

Education

2021 - 2023

Bachelor's Degree in Data Science

Arab Open University - Cairo, Egypt

Certifications

APRIL 2022 - PRESENT

Advanced Data Analysis

Udacity

JANUARY 2022 - PRESENT

Python for Data Science

Sololearn

DECEMBER 2021 - PRESENT

Python: Working with Predictive Analytics

LinkedIn

NOVEMBER 2021 - PRESENT

Data Analysis Professional

Udacity

SEPTEMBER 2021 - PRESENT

Data Analysis Challenger Certificate

Udacity

Languages

Regex, Python, Python 3, Excel VBA, SQL, R, HTML

Frameworks

Selenium

Libraries/APIs

Pandas, Requests, Matplotlib, Beautiful Soup, NumPy, OpenCV, PIL

Tools

Seaborn, Jupyter, Microsoft PowerPoint, Tableau, Microsoft Power BI, Microsoft Excel

Other

Excel 365, Data, Data Analysis, Data Cleaning, Data Cleansing, Algorithms, CSV, XLSX File Processing, Web Scraping, Website Data Scraping, Data Scraping, PDF Scraping, Financial Data, Manual QA, Google Colaboratory (Colab), Data Entry, PDF to Excel, Calculus, Algebra, Linear Algebra, Discrete Mathematics, Statistics, Pivot Tables, Data Visualization, Data Processing

Platforms

Jupyter Notebook, Visual Studio Code (VS Code), Windows

Storage

JSON, MongoDB, Databases

Paradigms

Data Science
