Lukasz Jaworowski

Verified Expert in Engineering

Web Scraping Developer

Location

Warsaw, Poland

Toptal Member Since

April 10, 2020

Lukasz has been a Python developer since 2015 and is a web scraping expert who focuses on Python and cloud computing. He is a strong contributor to product development, from idea generation to product maintenance. He's focused on big data processing in PySpark and advanced data analytics in AWS. Lukasz has experience in dynamic startup environments and corporate projects for international clients.

Back-end, Back-end Development, Web Scraping, Web Crawlers, Python 3, Python, Git, Python 2, Celery, AWS Lambda, Django, SQL, MySQL, PostgreSQL, Linux, Scrapy, PCF

Experience

Python 3 - 8 years
Web Scraping - 6 years
Scrapy - 4 years
Celery - 4 years
REST - 4 years
ETL - 4 years
Amazon Web Services (AWS) - 3 years
Go - 2 years

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), PyCharm, Linux

The most amazing thing I've designed and developed was an efficient distributed data processing system in Python, Celery, and RabbitMQ.

Work Experience

Technical Lead

2021 - 2023

Qredo

Created an application for the exchange of compliance-related data among Virtual Asset Service Providers (VASPs) using Python, FastAPI, and Matrix (matrix.org).
Built compliance-related microservices in Go, including a tainted asset scanner, web3 data reporting, and on-ramp providers.
Engaged in improving code quality and testability, maintaining a test coverage rate exceeding 70% across repositories.
Conducted quarterly performance reviews within the team as a tech lead.

Technologies: Go, Python 3, Amazon, MongoDB, PostgreSQL, APIs, Back-end, Kubernetes

Python Developer

2020 - 2021

Global eCommerce and Retail Media Advertising Agency

Built a new version of an app using FastAPI for a top eCommerce company that managed over USD 250 million in ad spend. The application allows enterprise clients to manage online campaigns in marketplaces like Amazon.
Developed an authentication and user management module using AWS Cognito and later in development replaced it with PostgreSQL-based authentication. Added features such as MFA using Twilio and a CLI tool for basic operations built with the 'click' library.
Created Lambda functions to handle campaign automation and asynchronous report generation using Python, Chalice, and AWS SAM.
Worked directly with the CTO and Product Owner to design solutions that leverage available AWS services to support scaling the application, which was designed to use a multi-account AWS environment.
Developed a set of endpoints based on the Amazon Advertising API for various campaign management operations.
Created a set of endpoints that serve dashboards and graphs with data from various sources, such as external APIs, Elasticsearch, and PostgreSQL.
Wrote unit, integration, and E2E tests using pytest and Testcontainers.
Implemented application logging, captured important user events, and developed middleware for detailed logging.
Collaborated closely with the front-end team and shared new back-end versions daily using internal Docker repositories.
Worked as the main Python developer on the project for nine months and delivered a beta version to the first users. Cooperated with two other Python developers on data-related and deployment tasks and with three front-end developers.

Technologies: Python, FastAPI, SQL, Elasticsearch, Amazon CloudWatch, Docker, Pytest, AWS Lambda, Stripe API, REST, REST APIs, Amazon Cognito, Lambda Architecture, Twilio, MySQL, PostgreSQL, Asyncio, APIs, Back-end

AWS/Python Developer

2020 - 2021

Freelance (Three Month Project for Startup in the Education Sector)

Developed Lambdas in Python to extract data from PDF files using Apache Tika, PyMuPDF, PDF2text, and more.
Created an API in Django REST framework for data management with custom filters using Elasticsearch on AWS.
Built a front end in Vue with Amplify and integrated it with the Django REST framework back end.
Managed app deployment and hosting on AWS using Elastic Beanstalk, Chalice, Amplify, and Amazon Cognito.
Established an AWS environment comprising VPC networking, Elasticsearch service, Elastic Beanstalk, Route 53, S3, Lambda, and Amazon Cognito.

Technologies: Amazon Web Services (AWS), Pandas, Django REST Framework, Amazon Cognito, Django, Tika, AWS Lambda, Data Extraction, Python 3, Vue, MySQL, PostgreSQL, Scraping, Vue 2, Amazon Simple Queue Service (SQS), APIs, Back-end

Python Developer

2019 - 2020

GFT Technologies

Designed a GraphQL API for a leading European fintech company using Python, Django, and Graphene.
Discovered and fixed performance issues related to Celery and database operations. Built user notification modules.
Developed PySpark ETL jobs for one of the largest financial institutions.
Set up a project from scratch, from repository creation to Git hooks, CI/CD, and deployment.
Created a REST API for managing datasets, developed using Python and FastAPI.
Supported application deployment to Pivotal Cloud Foundry.
Created unit and integration tests for PySpark ETL jobs and the FastAPI application.
Conducted cross-team Python learning sessions for groups of 50 to 100 developers and analysts. Topics included core Python, clean code, and Git webhooks.

Technologies: Amazon Web Services (AWS), Pandas, Django REST Framework, Python 3, Spark, PCF, Linux, Jenkins, PySpark, Docker, GraphQL, Django, Python, Python 2, MySQL, PostgreSQL, ETL, Asyncio, Scraping, Amazon Simple Queue Service (SQS), APIs, Back-end, Apache Kafka, Data Analytics, Data Science, Data Analysis

Python Developer

2018 - 2019

Appriss

Developed a Django app integrated with the Jira platform to support the development of the web scraping software.
Developed ETL scripts to process complex incarceration data.
Migrated ETL scripts from AWS Lambda functions to an on-premises Rancher and Docker environment.
Created over 150 web scraping spiders in Scrapy (Python).
Developed unit and integration tests. Performed code reviews.
Created data dashboards in Dash (Python) to present web scraping results statistics.
Migrated data from external PDF, Excel, and DOCX files to on-premises systems.

Technologies: Amazon Web Services (AWS), Pandas, Django REST Framework, Python 3, Web Scraping, Web Crawlers, Docker, SQL, Scrapy, Django, Python, Python 2, MySQL, PostgreSQL, ETL, Asyncio, Scraping, Amazon Simple Queue Service (SQS), APIs, Back-end

Python Developer

2015 - 2018

Vacancysoft

Developed a core web scraping system to gather data about the latest job offers in Europe.
Replaced a third-party web scraping and data processing software with Python applications.
Automated various operations processes in Python, Celery, and Django. Moved existing Linux scripts to a more maintainable Python environment.
Introduced CI/CD to automatically deploy Celery applications.
Developed extract, transform, load (ETL) pipelines in Celery and RabbitMQ.
Provided in-house training to junior developers on topics like ETL, Python, and SQL.
Built a job title classifier in Python and the Natural Language Toolkit (NLTK) library.

Technologies: Amazon Web Services (AWS), Pandas, Django REST Framework, Python 3, Web Scraping, Web Crawlers, Celery, RabbitMQ, Selenium, Scrapy, Django, Python, Python 2, MySQL, Scraping, APIs, Back-end

Developer

2017 - 2017

DreamLab

Developed web applications for one of the biggest media groups in Poland.
Prepared data migration scripts for articles data stored on Solr.
Developed web app components in jQT, Less, and Node.js.

Technologies: CSS, HTML, Node.js, PostgreSQL, APIs, Back-end

Experience

Recruitment Platform - Applicant Tracking System - Bachelor's Thesis (2018)

I was responsible for the database design, project specifications, and back-end development.
The API was written in Python and Django REST framework and hosted on the AWS stack (EC2 + RDS).
Two other developers worked on the web and iOS clients for this API.

The application provided the following functionalities:

Recruiters were able to:
• Add company profiles
• Add jobs to a company's profile
• Invite other recruiters to the company
• Describe the recruitment steps for a job offer
• Define recruitment step templates for various job types
• Move a candidate to the next recruitment step or reject the candidate
• Contact candidates via chat

Candidates were able to:
• See a list of jobs
• Apply to a job
• Create a candidate profile
• See various dashboards
• See current progress in pending recruitment processes

Skills

Languages

Python 3, Python, Python 2, SQL, Go, HTML, CSS, GraphQL, Java

Frameworks

Django, Scrapy, Django REST Framework, Spark, Selenium

Tools

Celery, Git, Amazon Simple Queue Service (SQS), PyCharm, Amazon Cognito, RabbitMQ, Jenkins, Amazon CloudWatch, Pytest, GIS

Platforms

AWS Lambda, Docker, Linux, Amazon Web Services (AWS), Twilio, Amazon, Kubernetes, Apache Kafka, PCF

Other

APIs, Scraping, Data Extraction, Web Scraping, Web Crawlers, Back-end, Back-end Development, FastAPI, Data Analytics, Data Analysis, Geodatabases

Libraries/APIs

Pandas, Node.js, Vue, Tika, Vue 2, PySpark, Asyncio, Stripe API, REST APIs

Paradigms

ETL, REST, Lambda Architecture, Data Science

Storage

MySQL, PostgreSQL, Elasticsearch, Databases, NoSQL, MongoDB, PostGIS

Education

2014 - 2018

Bachelor of Science Degree in Computer Science

Polish-Japanese Academy of Information Technology - Warsaw, Poland

2012 - 2014

Bachelor's Degree in Geodesy, Surveying and Cartography

Military University of Technology - Warsaw, Poland

Certifications

December 2019 - December 2022

Amazon Web Services Solutions Architect Associate

Amazon Web Services (AWS)

July 2019 - July 2020

Amazon Web Services Cloud Practitioner

Amazon Web Services (AWS)
