Artur Marinho Gaspar, Developer in São Paulo - State of São Paulo, Brazil

Artur Marinho Gaspar

Verified Expert in Engineering

Software Developer

São Paulo - State of São Paulo, Brazil

Toptal member since April 20, 2021

Bio

Artur is an expert Python programmer with seven years of experience, specializing in web scraping and back-end web development. He has built multiple web scraping solutions, some of which used multiple browser engines to handle advanced JavaScript. One project averaged over five million requests per day across 30 regional versions of a website. Artur's domain expertise includes eCommerce and real estate.

Portfolio

Software Habit Inc.
Python, JavaScript, Flask, APIs, Full-stack, Redshift...
Sky.One Solutions
Python, Django, Django REST Framework, Amazon Web Services (AWS), Oracle Cloud...
Zyte (formerly ScrapingHub)
Scrapy, SQLite, JavaScript, Web Scraping, Asynchronous I/O, Python, Puppeteer...

Experience

  • Scrapy - 7 years
  • Python - 7 years
  • Django ORM - 3 years
  • Django - 3 years
  • C - 2 years
  • Flask - 2 years
  • Qt - 1 year
  • C++ - 1 year

Availability

Part-time

Preferred Environment

Linux, Git, PyCharm, Qt Creator

The most amazing...

...project I developed was a web scraping solution for a website with an average of two million daily requests.

Work Experience

Python Developer

2021 - 2024
Software Habit Inc.
  • Developed a Flask application to manage and view data from YouTube creator reports.
  • Implemented generation of reports as PDF and Excel spreadsheets.
  • Managed deployment of the application in Heroku.
  • Designed and implemented processes to summarize and consolidate data in an Amazon Redshift database with around 1 billion rows.
  • Designed and implemented a second version of the API to better fit data model changes and oversaw the corresponding front-end implementation.
  • Built integrations with external services for data processing.
Technologies: Python, JavaScript, Flask, APIs, Full-stack, Redshift, Amazon Web Services (AWS), Google APIs, YouTube API, Amazon RDS, SQL, PostgreSQL, Heroku, PDF, Microsoft Excel

Developer

2021 - 2022
Sky.One Solutions
  • Collaborated in developing the back end of the next version of the main product, Auto.Sky, a solution to facilitate running Windows applications in cloud services (AWS and Oracle Cloud).
  • Participated in tooling and software architecture decisions for the new version of the main product (Auto.Sky).
  • Maintained and fixed bugs in the legacy version of the main product (Auto.Sky).
Technologies: Python, Django, Django REST Framework, Amazon Web Services (AWS), Oracle Cloud, HTML

Developer

2015 - 2021
Zyte (formerly ScrapingHub)
  • Developed multiple web scraping solutions, including the use of multiple browser engines to handle advanced JavaScript. One project averaged over five million requests per day across 30 different regional versions of a website.
  • Contributed bug fixes and new features to the Scrapy open-source project and related libraries (https://github.com/scrapy/scrapy).
  • Helped design and develop automated checks for web scraping data quality, based on fixed rules and previously extracted data.
Technologies: Scrapy, SQLite, JavaScript, Web Scraping, Asynchronous I/O, Python, Puppeteer, Web Crawlers, CSS, HTML

Developer

2014 - 2015
Precifica
  • Developed web scraping solutions for multiple eCommerce websites.
  • Delivered a solution based on Celery and RabbitMQ for post-processing data scraped from websites.
  • Built HTTP APIs to be used across different internal projects with Flask.
Technologies: Python, Scrapy, Flask, Celery, MongoDB, RabbitMQ, Web Development, Web Scraping, Asynchronous I/O, Web Crawlers, Amazon Web Services (AWS), SQL, HTML

Developer

2013 - 2014
Freelance
  • Developed web scraping solutions for multiple websites related to real estate, including scraping information from downloaded PDFs.
  • Built a web application and the corresponding back end for scheduling and managing these scraping solutions.
  • Integrated the Scrapy framework with browser engines like PhantomJS and Qt WebKit for scraping websites that relied on complex JavaScript code.
Technologies: Python, Scrapy, Django, Web Scraping, Asynchronous I/O, Web Crawlers, PDF Scraping, SQL

Developer

2013 - 2013
BetterBill
  • Developed web scraping solutions for multiple telephony companies' websites, keeping a single data format for data extracted from all sites.
  • Developed and optimized the performance of a Node.js server application providing both HTTP and WebSocket APIs for a future mobile app.
  • Collaborated on the migration of data and client applications from MySQL to MongoDB.
  • Created a solution based on Celery for diverse long-running asynchronous tasks across the project.
Technologies: Python, Scrapy, Django ORM, Node.js, Celery, MongoDB, Asynchronous I/O, Web Scraping, Web Crawlers, WebSockets, SQL, HTML, Full-stack

Web Scraping Project for Large Marketplace Website

A web scraping solution for a marketplace website where users buy and sell items. I designed the architecture for periodic job scheduling and for data storage, covering both the data shared between different parts of the process and the final extracted data. I also developed the main part of the project and investigated problems related to request capacity and banning. It averaged five million requests per day across 30 regional versions of the website.

Integration of QtWebKit Browser Engine with Scrapy Framework

https://github.com/ArturGaspar/scrapy-qtwebkit
Scrapy-QtWebKit integrates the QtWebKit browser engine with the Scrapy web scraping framework. It was developed to facilitate scraping websites where a browser engine is needed to execute JavaScript or perform other functions. Its use case is similar to Selenium's, but it integrates better into Scrapy.

The released code is not production-quality as a standalone library: it was extracted, with permission, from a solution developed for a client and then refactored. It works, however, and I plan to continue developing it into a useful project.

Libraries/APIs

Django ORM, Node.js, Puppeteer, Google APIs, YouTube API

Tools

Celery, RabbitMQ, Git, Microsoft Excel

Languages

Python, JavaScript, C, SQL, CSS, HTML, C++

Frameworks

Scrapy, Django, Flask, Qt, Django REST Framework

Platforms

Linux, Amazon Web Services (AWS), Heroku

Storage

MongoDB, SQLite, JSON, Oracle Cloud, Redshift, PostgreSQL

Other

Web Scraping, Web Crawlers, Web Development, Asynchronous I/O, WebSockets, PDF Scraping, Full-stack, APIs, Amazon RDS, PDF
