Keval Katrodiya, Developer in Surat, Gujarat, India

Keval Katrodiya

Verified Expert in Engineering

Bio

Keval is a Python developer and web scraping expert with a focus on cloud computing. Since 2015, he has contributed to product development from concept to maintenance across startup and corporate projects. Keval also excels at big data processing with PySpark and advanced analytics on AWS.

Portfolio

Firmagraphix LLC
AWS Lambda, AWS Glue, Amazon EC2, Data Extraction, Data Mining, Web Scraping...
Luminoso
Amazon EC2, AWS Lambda, Celery, Docker, Web Scraping, Data Analytics
Code-X
Data Mining, Data Analytics, Web Development, AWS Glue, AWS CLI, AWS SDK, Apps

Experience

  • Data Scraping - 7 years
  • Data Extraction - 7 years
  • Web Scraping - 7 years
  • Python - 7 years
  • Scrapy - 7 years
  • Data Mining - 7 years
  • MySQL - 7 years
  • Docker - 5 years

Availability

Full-time

Preferred Environment

AWS CLI, Python, Web Scraping, Data Extraction, MySQL, Scrapy, Web Development, Docker, Data Scraping, Data Mining

The most amazing...

...thing I've built is a real-time, distributed data processing system with Python, Celery, and RabbitMQ for scalable task management and efficient resource use.
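
For illustration only, here is a minimal sketch of that kind of Celery setup, assuming a local RabbitMQ broker; the module name, task body, and broker URL are placeholders rather than the actual production code.

```python
# tasks.py -- minimal Celery app wired to a RabbitMQ broker (illustrative names).
from celery import Celery

app = Celery("scraper", broker="amqp://guest:guest@localhost:5672//", backend="rpc://")

@app.task(bind=True, max_retries=3, default_retry_delay=30)
def process_record(self, record: dict) -> dict:
    """Clean one scraped record; Celery retries the task automatically on failure."""
    try:
        return {key: str(value).strip() for key, value in record.items()}
    except Exception as exc:
        raise self.retry(exc=exc)

# Producer side: fan work out to the workers.
#   for rec in scraped_records:
#       process_record.delay(rec)
# Run workers with:  celery -A tasks worker --concurrency=8
```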

Work Experience

Senior Web Scraping Developer

2022 - PRESENT
Firmagraphix LLC
  • Automated data collection using Python and libraries such as Requests, Scrapy, Beautiful Soup, lxml, Selenium, and pandas.
  • Scraped data from diverse websites, including real estate, finance, travel and leisure, and eCommerce platforms such as Amazon and Alibaba, as well as sales leads from Instagram, Facebook, and TikTok.
  • Delivered scraped data in a range of structured formats and channels, including TXT, PDF, image, JSON, CSV, XML, XLS, and SQL, as well as via FTP and API.
  • Enabled automatic data delivery to cloud-based storage such as Amazon S3, Dropbox, and Oracle.
  • Designed a master-slave architecture using AWS Lambda for high-volume data scraping, with proxy rotation to bypass Cloudflare (a proxy-rotation sketch follows the technology list below).
  • Developed scrapers that handle advanced security measures, including banking sites where real transaction data is fetched.
  • Utilized Google Cloud Pub/Sub for message queuing, facilitating ARMLS data syndication to the platform database.
  • Implemented data processing pipelines for feeding scraped data into dashboards and visualization tools.
  • Created and deployed real-time dashboards using Tableau, providing valuable insights to the business team.
Technologies: AWS Lambda, AWS Glue, Amazon EC2, Data Extraction, Data Mining, Web Scraping, Scrapy, Python 3, Pandas, lxml, Selenium, Puppeteer, Playwright, Node.js, AWS SDK, Docker, Celery, RabbitMQ, Apache Airflow, MySQL, MongoDB, Cron
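
As an illustration of the proxy-rotation approach referenced above, a minimal Scrapy spider that attaches a random proxy to each request might look like the sketch below; the proxy endpoints, start URL, and CSS selectors are placeholders, not production values.

```python
# listings_spider.py -- minimal Scrapy spider with per-request proxy rotation
# (proxy endpoints, start URL, and selectors are placeholders).
import random

import scrapy

PROXIES = [
    "http://proxy-1.example.com:8000",
    "http://proxy-2.example.com:8000",
]

class ListingsSpider(scrapy.Spider):
    name = "listings"
    start_urls = ["https://example.com/listings"]

    def start_requests(self):
        for url in self.start_urls:
            # Attach a random proxy so traffic is spread across exit IPs.
            yield scrapy.Request(url, meta={"proxy": random.choice(PROXIES)})

    def parse(self, response):
        for row in response.css("div.listing"):
            yield {
                "title": row.css("h2::text").get(),
                "price": row.css(".price::text").get(),
            }

# Run with:  scrapy runspider listings_spider.py -O listings.json
```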

Data Specialist

2021 - 2022
Luminoso
  • Developed advanced web scraping and web crawling scripts to handle large datasets.
  • Utilized a combination of Python, Scrapy, pandas, Perl, MySQL, MS SQL, Alteryx, RedPoint, and Pentaho for script development and data management.
  • Implemented advanced techniques for crawling, finding, fetching, parsing, and cleaning data, enhancing accuracy and efficiency.
  • Integrated data management solutions like Alteryx, RedPoint, and Pentaho to streamline data processing and analysis workflows.
  • Devised strategies to handle dynamic content and circumvent anti-scraping measures, ensuring reliable data extraction.
  • Created scalable web scraping solutions using Python, Scrapy, and Pentaho, handling over 200 million records from hundreds of web pages, enhancing data accuracy and speed.
  • Implemented parallel processing and distributed computing techniques to manage large datasets, utilizing advanced tools like Alteryx, RedPoint, and AWS (a parallel-processing sketch follows the technology list below).
Technologies: Amazon EC2, AWS Lambda, Celery, Docker, Web Scraping, Data Analytics
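
As an illustration of the parallel-processing approach referenced above, here is a minimal sketch using Python's multiprocessing module; clean_record stands in for the real parsing and cleaning logic.

```python
# parallel_clean.py -- sketch of cleaning scraped records in parallel
# (clean_record is a stand-in for the real parsing/cleaning logic).
from multiprocessing import Pool

def clean_record(record: dict) -> dict:
    """Normalize keys and strip whitespace from one scraped record."""
    return {key.strip().lower(): str(value).strip() for key, value in record.items()}

def clean_all(records: list, workers: int = 8) -> list:
    """Fan the records out across worker processes in large chunks."""
    with Pool(processes=workers) as pool:
        return pool.map(clean_record, records, chunksize=1000)

if __name__ == "__main__":
    sample = [{"Name ": " Acme Corp ", "City": " Boston "}] * 10_000
    print(clean_all(sample)[0])
```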

Data Engineer

2019 - 2021
Code-X
  • Designed and developed a data pipeline to extract, transform, and load sales data from multiple sources into a centralized database.
  • Built data models and analyzed data to generate insights and visualize sales trends using Python, SQL, and visualization libraries.
  • Implemented automated data updates and scheduled ETL processes using AWS Lambda and cron jobs.
  • Utilized Flask to create a web-based analytics dashboard for real-time monitoring and interactive visualizations (a minimal endpoint sketch follows the technology list below).
Technologies: Data Mining, Data Analytics, Web Development, AWS Glue, AWS CLI, AWS SDK, Apps
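
As an illustration of the dashboard endpoint referenced above, here is a minimal Flask route that serves aggregated sales data; the SQLite database and sales schema are assumptions, not the actual Code-X stack.

```python
# dashboard.py -- minimal Flask endpoint behind an analytics dashboard
# (the SQLite file and sales schema are illustrative assumptions).
import sqlite3

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/sales/monthly")
def monthly_sales():
    """Return monthly sales totals for the dashboard's trend chart."""
    conn = sqlite3.connect("sales.db")
    rows = conn.execute(
        "SELECT strftime('%Y-%m', sold_at) AS month, SUM(amount) AS total "
        "FROM sales GROUP BY month ORDER BY month"
    ).fetchall()
    conn.close()
    return jsonify([{"month": month, "total": total} for month, total in rows])

if __name__ == "__main__":
    app.run(debug=True)
```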

Web Developer

2018 - 2019
Ecotech IT Solutions Pvt Ltd
  • Improved user engagement and satisfaction by 30% through collaborating with clients to redesign and launch several websites.
  • Developed responsive website layouts and user interfaces using HTML, CSS, and JavaScript, ensuring seamless functionality across all devices.
  • Boosted team productivity by 20% by integrating back-end services and databases using Python, Django, and MySQL, contributing to internal tool development.
  • Conducted regular testing, debugging, and optimization, implementing SEO best practices that increased website visibility and organic traffic.
Technologies: Apps, HTML, CSS, Automation, Selenium, Node.js, Playwright, Flask, jQuery, Python 3

Experience

Automated Real Estate Data Scraper

Developed an automated system to scrape real estate data from multiple sources, including websites like Zillow, Realtor, and ARMLS. The system was designed to fetch property details such as price, location, and property features, overcoming CAPTCHA challenges and dynamic content using proxy rotation and headless browsers.
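
For illustration, here is a minimal sketch of loading a JavaScript-heavy listing page in a headless browser with Playwright; the URL is a placeholder, and the CAPTCHA and proxy handling are omitted.

```python
# fetch_listing.py -- sketch of rendering a dynamic listing page with headless
# Chromium via Playwright (URL is a placeholder; proxy/CAPTCHA handling omitted).
from playwright.sync_api import sync_playwright

def fetch_rendered_html(url: str) -> str:
    """Load a JavaScript-heavy page in headless Chromium and return its HTML."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page(user_agent="Mozilla/5.0")
        page.goto(url, wait_until="networkidle")
        html = page.content()
        browser.close()
        return html

if __name__ == "__main__":
    print(len(fetch_rendered_html("https://example.com")))
```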

eCommerce Web Scraper and Data Pipeline

Created a scalable web scraping tool to collect product details, prices, and reviews from eCommerce platforms like Amazon and Alibaba. Data was cleaned, processed, and stored in a centralized Amazon S3 bucket. The system also ran scheduled ETL processes to refresh the data regularly, pushing real-time updates to a Tableau dashboard.
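
For illustration, here is a minimal sketch of the clean-and-store step using pandas and boto3; the bucket name, object key, and product columns are placeholders.

```python
# load_to_s3.py -- sketch of cleaning scraped product rows with pandas and
# writing them to S3 (bucket, key, and columns are placeholders).
import io

import boto3
import pandas as pd

def load_products(records: list, bucket: str, key: str) -> None:
    """Clean scraped product records and upload them to S3 as a CSV object."""
    df = pd.DataFrame(records)
    df["price"] = pd.to_numeric(df["price"], errors="coerce")  # drop unparsable prices
    df = df.dropna(subset=["price"]).drop_duplicates(subset=["sku"])

    buffer = io.StringIO()
    df.to_csv(buffer, index=False)
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=buffer.getvalue())

if __name__ == "__main__":
    load_products(
        [{"sku": "A1", "title": "Widget", "price": "19.99"}],
        bucket="example-scraped-data",
        key="products/latest.csv",
    )
```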

Education

2014 - 2018

Bachelor's Degree in Computer Engineering

Parul University - Surat, Gujarat, India

Skills

Libraries/APIs

Playwright, Node.js, Puppeteer, jQuery, Pandas

Tools

Visual Studio, AWS CLI, AWS SDK, Git, AWS Glue, RabbitMQ, Celery, Jira, PyCharm, Tableau, Apache Airflow, Cron

Frameworks

Scrapy, Selenium, Django, Flask

Paradigms

Automation

Languages

Python, Snowflake, HTML, CSS, Python 3, SQL

Platforms

Docker, Amazon EC2, AWS Lambda, Jupyter Notebook

Storage

MySQL, MongoDB, Amazon S3 (AWS S3)

Other

Web Scraping, Data Extraction, Data Scraping, Data Mining, Web Development, Data Analytics, Apps, lxml, Proxies, Cloudflare
