Artur Marinho Gaspar
Verified Expert in Engineering
Software Developer
São Paulo - State of São Paulo, Brazil
Toptal member since April 20, 2021
Artur is an expert Python programmer with seven years of experience. He specializes in web scraping and back-end web development. He has built multiple web scraping solutions, some of which used multiple browser engines to handle advanced JavaScript; one project averaged over five million requests per day across 30 regional versions of a website. Artur's domain expertise includes eCommerce and real estate.
Experience
- Scrapy - 7 years
- Python - 7 years
- Django ORM - 3 years
- Django - 3 years
- C - 2 years
- Flask - 2 years
- Qt - 1 year
- C++ - 1 year
Preferred Environment
Linux, Git, PyCharm, Qt Creator
The most amazing project I developed was a web scraping solution for a website with an average of two million daily requests.
Work Experience
Python Developer
Software Habit Inc.
- Developed a Flask application to manage and view data from YouTube creator reports.
- Implemented generation of reports as PDF and Excel spreadsheets.
- Managed deployment of the application in Heroku.
- Designed and implemented processes to summarize and consolidate data in an Amazon Redshift database with around 1 billion rows.
- Designed and implemented a second version of the API to better fit data model changes and oversaw the corresponding front-end implementation.
- Built integrations with external services for data processing.
Developer
Sky.One Solutions
- Collaborated in developing the back end of the next version of the main product, Auto.Sky, a solution to facilitate running Windows applications in cloud services (AWS and Oracle Cloud).
- Participated in tooling and software architecture decisions for the new version of the main product (Auto.Sky).
- Maintained and fixed bugs in the legacy version of the main product (Auto.Sky).
Developer
Zyte (formerly ScrapingHub)
- Developed multiple web scraping solutions, including the use of multiple browser engines to handle advanced JavaScript. One project averaged over five million requests per day across 30 different regional versions of a website.
- Contributed bug fixes and new features to the Scrapy open-source project (https://github.com/scrapy/scrapy) and related libraries.
- Helped design and develop automated quality checks for scraped data, based on fixed rules and previously extracted data.
Developer
Precifica
- Developed web scraping solutions for multiple eCommerce websites.
- Delivered a solution based on Celery and RabbitMQ for post-processing data scraped from websites.
- Built HTTP APIs to be used across different internal projects with Flask.
Developer
Freelance
- Developed web scraping solutions for multiple websites related to real estate, including scraping information from downloaded PDFs.
- Built a web application and the corresponding back end for scheduling and managing these scraping solutions.
- Integrated the Scrapy framework with browser engines like PhantomJS and Qt WebKit for scraping websites that relied on complex JavaScript code.
Developer
BetterBill
- Developed web scraping solutions for multiple telephony companies' websites, keeping a single data format for data extracted from all sites.
- Developed and optimized the performance of a Node.js server application that exposed HTTP and WebSocket APIs for a planned mobile app.
- Collaborated on the migration of data and client applications from MySQL to MongoDB.
- Created a Celery-based solution for running various long-running asynchronous tasks across the project.
Experience
Web Scraping Project for Large Marketplace Website
Integration of QtWebKit Browser Engine with Scrapy Framework
https://github.com/ArturGaspar/scrapy-qtwebkit
The released code is not production-quality as a standalone library: it was extracted, with permission, from a solution developed for a client and then refactored. It works, however, and I plan to continue developing it into a useful project.
Skills
Libraries/APIs
Django ORM, Node.js, Puppeteer, Google APIs, YouTube API
Tools
Celery, RabbitMQ, Git, Microsoft Excel
Languages
Python, JavaScript, C, SQL, CSS, HTML, C++
Frameworks
Scrapy, Django, Flask, Qt, Django REST Framework
Platforms
Linux, Amazon Web Services (AWS), Heroku
Storage
MongoDB, SQLite, JSON, Oracle Cloud, Redshift, PostgreSQL
Other
Web Scraping, Web Crawlers, Web Development, Asynchronous I/O, WebSockets, PDF Scraping, Full-stack, APIs, Amazon RDS, PDF