Verified Expert in Engineering
Web Scraping Developer
Lukasz has been a Python developer since 2015 and is a web scraping expert focused on Python and cloud computing. He contributes across the whole product lifecycle, from idea generation to maintenance. He specializes in big data processing with PySpark and advanced data analytics on AWS. Lukasz has experience in dynamic startup environments and in corporate projects for international clients.
Amazon Web Services (AWS), PyCharm, Linux
The most amazing...
...thing I've designed and developed was an efficient distributed data processing system in Python, Celery, and RabbitMQ.
- Created an application for the exchange of compliance-related data among Virtual Asset Service Providers (VASPs) using Python, FastAPI, and Matrix (matrix.org).
- Built compliance-related microservices in Go, including a tainted asset scanner, web3 data reporting, and integrations with on-ramp providers.
- Improved code quality and testability, maintaining test coverage above 70% across repositories.
- Conducted quarterly performance reviews within the team as a tech lead.
Global eCommerce and Retail Media Advertising Agency
- Built a new version of an app using FastAPI for a top eCommerce company that managed over USD 250M in ad spend. The application allows enterprise clients to manage online campaigns in marketplaces like Amazon.
- Developed an authentication and user management module using AWS Cognito; later migrated it to PostgreSQL-based authentication. Provided features such as MFA via Twilio and a CLI tool for basic operations built with the Click library.
- Created Lambda functions to handle campaign automation and asynchronous report generation using Python, Chalice, and AWS SAM.
- Worked directly with the CTO and product owner to design solutions leveraging AWS services to scale the application across a multi-account AWS environment.
- Developed a set of endpoints based on Amazon Advertising API for various campaign management operations.
- Created a set of endpoints for dashboards and graphs drawing on sources such as external APIs, Elasticsearch, and PostgreSQL.
- Provided unit, integration, and E2E tests using pytest and Testcontainers.
- Implemented application logging, captured key user events, and developed middleware for detailed request logging.
- Collaborated closely with the front-end team, sharing new back-end versions daily via internal Docker repositories.
- Served as the main Python developer on the project, cooperating with two other Python developers on data and deployment tasks and with three front-end developers. Delivered a beta version to the first users after nine months.
Freelance (Three-Month Project for a Startup in the Education Sector)
- Developed Lambdas in Python to extract data from PDF files using Apache Tika, PyMuPDF, PDF2text, and more.
- Created API in Django REST framework for data management with custom filters using Elasticsearch on AWS.
- Built front end in Vue with Amplify and integrated it with the Django REST framework back end.
- Managed app deployment, hosting on AWS using Elastic Beanstalk, Chalice, Amplify, and Amazon Cognito.
- Established an AWS environment comprising VPC networking, Elasticsearch service, Elastic Beanstalk, Route53, S3, Lambda, and Amazon Cognito.
- Designed a GraphQL API for a leading European fintech company using Python, Django, and Graphene.
- Discovered and fixed performance issues related to Celery and database operations. Built user notification modules.
- Developed PySpark ETL jobs for one of the largest financial institutions.
- Set up a project from scratch, from repository creation to Git hooks, CI/CD, and deployment.
- Created a REST API for managing datasets, developed using Python and FastAPI.
- Supported application deployment to Pivotal Cloud Foundry.
- Created unit and integration tests for PySpark ETL jobs and the FastAPI application.
- Conducted cross-team Python learning sessions for a group of 50 to 100 developers and analysts. Topics included core Python, clean code, and Git webhooks.
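One common way to make PySpark ETL jobs unit-testable, as in the bullets above, is to keep transform logic in pure Python functions that can run without a Spark session; an illustrative sketch with a hypothetical record schema:

```python
def normalize_record(record: dict) -> dict:
    """Hypothetical transform step: trim strings, coerce amount to float,
    and default the currency when it is missing."""
    return {
        "account_id": record["account_id"].strip(),
        "amount": float(record["amount"]),
        "currency": record.get("currency", "EUR").upper(),
    }

def run_etl(records: list[dict]) -> list[dict]:
    # In the real job this mapping would run inside a PySpark DataFrame
    # or RDD transformation; here it is plain Python so the logic can be
    # unit-tested directly with pytest.
    return [normalize_record(r) for r in records]
```

The same `normalize_record` function can then be wrapped in a Spark UDF in production while the tests stay fast and dependency-free.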
- Developed a Django app integrated with the Jira platform to support the development of the web scraping software.
- Developed ETL scripts to process complex incarceration data.
- Migrated ETL scripts from AWS Lambda functions to on-premise Rancher + Docker environment.
- Created over 150 web scraping spiders in Scrapy (Python).
- Developed unit and integration tests. Performed code reviews.
- Created data dashboards in Dash (Python) to present web scraping results statistics.
- Migrated data from external PDF, Excel, and Docx files to on-premise systems.
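Scrapy spiders like those mentioned above follow an extract-and-yield pattern; here is a dependency-free sketch of the parsing half using only the standard library (Scrapy itself swapped out; the HTML structure and field names are hypothetical):

```python
from html.parser import HTMLParser

class JobOfferParser(HTMLParser):
    """Collects the text of every <a class="job-title"> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.titles: list[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "a" and ("class", "job-title") in attrs:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.titles.append(data.strip())

def extract_titles(html: str) -> list[str]:
    """Return all job titles found in a page of HTML."""
    parser = JobOfferParser()
    parser.feed(html)
    return parser.titles
```

In a real Scrapy spider the equivalent extraction would live in `parse()` and use CSS or XPath selectors, yielding one item per match.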
- Developed a core web scraping system to gather data about the latest job offers in Europe.
- Replaced a third-party web scraping and data processing software with Python applications.
- Automated various operations processes in Python, Celery, and Django. Moved existing Linux scripts to a more maintainable Python environment.
- Introduced CI/CD to automatically deploy Celery applications.
- Developed extract, transform, load (ETL) pipelines in Celery and RabbitMQ.
- Provided in-house training to junior developers on topics like ETL, Python, and SQL.
- Built a job title classifier in Python and the Natural Language Toolkit (NLTK) library.
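A job title classifier of the kind mentioned above can be reduced to scoring titles against category vocabularies; this sketch swaps NLTK for simple keyword scoring to stay dependency-free (categories and keywords are hypothetical):

```python
# Hypothetical categories and keyword sets; the production classifier
# used NLTK tokenization and a trained model rather than fixed keywords.
CATEGORY_KEYWORDS = {
    "engineering": {"developer", "engineer", "programmer"},
    "sales": {"sales", "account", "representative"},
    "management": {"manager", "director", "head"},
}

def classify_job_title(title: str) -> str:
    """Assign a job title to the category whose keywords it overlaps most."""
    tokens = set(title.lower().split())
    scores = {
        category: len(tokens & keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"
```

An NLTK version would replace `title.lower().split()` with proper tokenization and stemming so that "Engineers" and "engineer" score against the same keyword.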
- Developed web applications for one of the biggest media groups in Poland.
- Prepared data migration scripts for article data stored in Solr.
- Developed web app components in jQT, Less, and Node.js.
Recruitment Platform - Applicant Tracking System - Bachelor's Thesis (2018)
The API was written in Python with the Django REST framework and hosted on the AWS stack (EC2 + RDS).
Two other developers built the web and iOS clients for this API.
The application provided the following functionalities:
Recruiters were able to:
• Add company profiles
• Add jobs to company profiles
• Invite other recruiters to the company
• Describe recruitment steps for a job offer
• Define recruitment step templates for various job types
• Move candidates to the next recruitment step or reject them
• Contact candidates via chat
Candidates were able to:
• See a list of jobs
• Apply for a job
• Create a candidate profile
• See various dashboards
• See current progress in pending recruitment processes
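The step-by-step recruitment flow above can be sketched with plain data classes (stdlib only; all names are hypothetical, and the real system stored this state in PostgreSQL via Django models):

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    name: str
    step_index: int = 0
    rejected: bool = False

@dataclass
class JobOffer:
    title: str
    # Hypothetical default pipeline; the real app let recruiters
    # define steps and step templates per job type.
    steps: list[str] = field(
        default_factory=lambda: ["screening", "interview", "offer"]
    )

    def advance(self, candidate: Candidate) -> str:
        """Move a candidate to the next recruitment step."""
        if candidate.rejected:
            raise ValueError("cannot advance a rejected candidate")
        if candidate.step_index < len(self.steps) - 1:
            candidate.step_index += 1
        return self.steps[candidate.step_index]

    def reject(self, candidate: Candidate) -> None:
        candidate.rejected = True
```

Each recruiter action in the list above maps to one method call, which keeps the API endpoints thin wrappers over this domain logic.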
Python 2, Python 3, SQL, Go, PCF, HTML, CSS, GraphQL, Java
Django, Scrapy, Django REST Framework, Spark, Selenium
Celery, Git, Amazon Simple Queue Service (SQS), PyCharm, Amazon Cognito, RabbitMQ, Jenkins, Amazon CloudWatch, Pytest
AWS Lambda, Docker, Linux, Amazon Web Services (AWS), Twilio, Amazon, Kubernetes, Apache Kafka
APIs, Scraping, Data Extraction, Web Scraping, Web Crawlers, Back-end, Back-end Development, FastAPI, Data Analytics, Data Analysis
Pandas, Node.js, Vue, Tika, Vue 2, PySpark, Asyncio, Stripe API, REST APIs
ETL, REST, Lambda Architecture, Data Science
MySQL, PostgreSQL, Elasticsearch, Databases, NoSQL, MongoDB
Bachelor of Science Degree in Computer Science
Polish-Japanese Academy of Information Technology - Warsaw, Poland
Amazon Web Services Solutions Architect Associate
Amazon Web Services (AWS)
Amazon Web Services Cloud Practitioner
Amazon Web Services (AWS)