Attila Pápai, Developer in Budapest, Hungary

Attila Pápai

Verified Expert in Engineering

Bio

Attila is a seasoned software engineer with over 15 years of experience. He has worked across a variety of sectors, systems, technologies, and roles, enabling him to better understand the big picture and make decisions that best suit the business's needs. Attila is a passionate engineer who values results achieved by quality work. A team player but equally great at working independently on smaller projects, he is currently focusing on becoming a skilled AI engineer.

Portfolio

Labelbox
Node.js, NestJS, TypeScript, Python, Kotlin, Elasticsearch, GraphQL...
Labelbox
Node.js, TypeScript, GraphQL, Python, Elasticsearch, Debezium, Google Cloud...
Cruise
React, TypeScript, GraphQL, PostgreSQL, Docker, CircleCI, NestJS...

Experience

  • Python - 6 years
  • TypeScript - 5 years
  • Node.js - 5 years
  • GraphQL - 5 years
  • NestJS - 5 years
  • Google Cloud - 3 years
  • Elasticsearch - 3 years
  • React - 3 years

Availability

Part-time

Preferred Environment

TypeScript, Node.js, Python, Machine Learning, Artificial Intelligence (AI), Google Cloud, GraphQL, Microservices

The most amazing...

...project I've worked on is a complete rebuild of the core services handling data ingestion in which we achieved ten times faster uploads with zero downtime.

Work Experience

Senior Software Engineer II

2023 - 2024
Labelbox
  • Redesigned the old data export system, which was unreliable, generated oversized outputs, and lacked structure. Exports V2 introduced a robust, scalable streaming system with a public JSON schema and granular control over data export.
  • Rebuilt the data ingestion system for ten times faster uploads. Replaced MySQL with Spanner for virtually unlimited scalability, implemented new microservices for better parallelism, and switched over to the new system with zero downtime.
  • Implemented a solution to automatically generate a text layer for uploaded PDF documents using the Google Document AI service. This helped customers speed up their document labeling processes.
  • Created a new Python microservice to act as a connector between Labelbox and Census's APIs. With the Census integration, customers can import their data to Labelbox more easily and quickly, speeding up their AI and ML pipelines.
Technologies: Node.js, NestJS, TypeScript, Python, Kotlin, Elasticsearch, GraphQL, Google Cloud, Google Cloud Spanner, Pub/Sub, MySQL, PostgreSQL, Codefresh, Helm, Docker, Datadog, Architecture, Back-end, Software Design, SaaS

Senior Back-end Engineer

2022 - 2022
Labelbox
  • Redesigned and completely overhauled the previous slow Catalog/Search experience, resulting in a remarkable performance improvement. Regular searches are completed in milliseconds across hundreds of millions of data rows.
  • Self-initiated a project to create a tool for the ML support team to resolve common customer-facing data inconsistency issues in our databases more easily and quickly.
  • Found and fixed one of the most well-hidden bugs in the system, related to signing public URLs, which had been causing random disruptions for end users and negatively impacting the user experience.
Technologies: Node.js, TypeScript, GraphQL, Python, Elasticsearch, Debezium, Google Cloud, MySQL, NestJS, Datadog, Codefresh, Architecture, Back-end, Software Design, SaaS

Senior Full-stack Engineer

2019 - 2021
Cruise
  • Improved the performance and user experience of a web application's most heavily used page by eliminating unnecessary network requests, refactoring internal state management, and adding skeleton loaders while fetching some parts of the page.
  • Became the de facto quality engineer of the team by always finding hidden bugs in special edge cases while doing PR reviews for team members.
  • Gained recognition as one of the top-performing engineers in the team based on survey results.
Technologies: React, TypeScript, GraphQL, PostgreSQL, Docker, CircleCI, NestJS, Software Design, Full-stack

Back-end Developer

2019 - 2019
AirWorks (via Teracode/Toptal)
  • Implemented new APIs in Node.js for an upcoming web application.
  • Helped the team achieve their milestones ahead of the first public demo.
  • Added tests to existing and new APIs with over 95% code coverage.
  • Wrote design documentation, including new features and security enhancements.
Technologies: MongoDB, Postman, Jest, Node.js, JavaScript, Back-end, Software Design

Senior Software Engineer

2017 - 2019
Sophos
  • Served as the technical lead for various tasks and projects and became a go-to person for technical questions.
  • Used AWS Lambda and AWS Step Functions to create automated workflows.
  • Found and fixed many challenging bugs in a legacy system.
  • Reviewed team members' code. Wrote system documentation and user guides.
  • Employed LogicMonitor for application health checks and monitoring. Created and configured alerts for critical services.
  • Utilized AWS EC2 services to create a load-balanced, auto-scaled REST API.
Technologies: SQL, Jenkins, JavaScript, TypeScript, Angular, Python, Docker, Amazon Web Services (AWS), Back-end, Software Design

Systems Developer

2016 - 2017
Sophos
  • Maintained various high-availability systems with a variety of technologies (mostly Perl).
  • Created new internal systems to better support the security analysts.
  • Refactored existing legacy systems to use current technologies.
  • Served as the technical lead of a small group and represented this team at stakeholder meetings.
Technologies: Docker, Python, Angular, Node.js, MySQL, Perl, Linux, Gentoo, Back-end, Software Design

Senior Software Engineer | Scrum Master

2011 - 2016
evosoft Hungary Kft
  • Was responsible for several subsystems of TIA Portal, the integrated engineering framework by Siemens.
  • Participated in the development of many internal tools.
  • Maintained and ran servers.
  • Automated build processes.
  • Developed a metrics and KPI statistics website (Yii Framework) and a KPI platform (ASP.NET, AngularJS, SPA).
  • Wrote tech articles for the intranet blog site.
  • Led a small team of student workers, conducted interviews, and later became the scrum master of a team of full-time workers.
Technologies: JavaScript, MySQL, Angular, Jenkins, Microsoft SQL Server, Software Design

Support Analyst

2011 - 2011
ExxonMobil
  • Worked as part of the global IT team.
  • Supported an application that facilitated online customer sales orders.
Technologies: HTML

Software Researcher

2010 - 2011
Sense/Net, Inc.
  • Developed the core of the company's own Sense/Net ECMS.
  • Created the data provider layer for MySQL and SQLCE.
  • Posted articles to the company's blog.
Technologies: Microsoft SQL Server, JavaScript

Junior Developer

2008 - 2009
University of Pannonia
  • Developed informatic sensors for an integrated security system.
  • Built a GUI for creating and editing artificial intelligence-based rules.
Technologies: Microsoft SQL Server

Experience

Catalog and Search Experience Overhaul

https://docs.labelbox.com/docs/catalog-overview
Redesigned and completely rewrote the old and slow Catalog and Search experience. The old system wasn't scaling well and became unreliable on larger datasets. In the new system design, we chose Elasticsearch as the search engine. We used Debezium to capture changes in other databases and to asynchronously store and index the relevant searchable information in Elasticsearch. Regular searches now complete in milliseconds across hundreds of millions of data rows. Later, we added a similarity search for an even better user experience.
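
As a rough illustration of this change-data-capture flow (not the production code), the sketch below turns one Debezium-style change event into the action and document lines that the Elasticsearch bulk API expects. The index name, the field list, and the function name are hypothetical and chosen only for this example.

```python
from typing import Any

# Hypothetical index name and searchable fields, for illustration only.
INDEX_NAME = "data_rows"
SEARCHABLE_FIELDS = ("id", "external_id", "dataset_id", "media_type")


def change_event_to_bulk_lines(event: dict[str, Any]) -> list[dict[str, Any]]:
    """Translate one Debezium-style change event into Elasticsearch bulk lines."""
    op = event["op"]  # "c" = create, "u" = update, "d" = delete, "r" = snapshot read
    if op == "d":
        # Deletes carry the old row in "before"; only the document id is needed.
        row = event["before"]
        return [{"delete": {"_index": INDEX_NAME, "_id": str(row["id"])}}]
    # Creates, updates, and snapshot reads carry the new row state in "after".
    row = event["after"]
    doc = {name: row[name] for name in SEARCHABLE_FIELDS if name in row}
    return [{"index": {"_index": INDEX_NAME, "_id": str(row["id"])}}, doc]
```

Indexing only a whitelist of searchable fields keeps the search documents small, and using the source row's primary key as the document id makes replayed events idempotent.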

Census Integration into Labelbox

https://docs.labelbox.com/docs/census-integration
Census is a data activation and reverse ETL platform that makes it easy to connect customers' data warehouses to other services.

My role in this project was to design and create a back-end service called Connector to receive and transform data from Census into Labelbox's data warehouse. With this, customers were able to import their data stored in different warehouses into Labelbox without writing a single line of code.

After setting up the sync pipeline in Census UI, Census would call the connector service's APIs with batches of data to be imported or updated (upserted). This new feature enabled our customers to import data to Labelbox in an even easier and faster way to speed up their AI and ML pipelines.
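
The upsert step can be pictured with a minimal sketch like the one below. It is an assumption-laden illustration, not the Connector's real code or Census's actual API: the `external_id` key, the in-memory `store`, and the `UpsertResult` type are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class UpsertResult:
    created: int = 0
    updated: int = 0
    rejected: list = field(default_factory=list)


def upsert_batch(store: dict[str, dict[str, Any]],
                 records: list[dict[str, Any]],
                 key: str = "external_id") -> UpsertResult:
    """Insert new records and merge into existing ones, keyed by `key`."""
    result = UpsertResult()
    for record in records:
        record_key = record.get(key)
        if not record_key:
            # Without a stable key the record cannot be matched on later syncs.
            result.rejected.append(record)
            continue
        if record_key in store:
            store[record_key] = {**store[record_key], **record}
            result.updated += 1
        else:
            store[record_key] = dict(record)
            result.created += 1
    return result
```

Reporting rejected records per batch, rather than failing the whole batch, lets the calling side retry or surface only the problematic rows.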

PDF Text Layer Generation with Document AI

https://docs.labelbox.com/reference/import-document-data
When labeling PDF documents, a text layer is essential for accurate labeling. At Labelbox, customers usually upload the text layer along with the PDF assets, which requires them to generate this file themselves, costing time and resources.

I took the initiative to work on this mini-project. At first, I evaluated different off-the-shelf solutions: AWS Textract, Google Document AI, and some Python libraries (pymupdf, extract, etc.). In the end, I picked Document AI because of its ease of use and sync/async support, and I added image enhancements before running the OCR, which made processing scanned PDFs more reliable.

After an agreement with the tech lead engineer, I implemented this feature into our data ingestion pipeline, bulletproofing the solution with many automated tests. Ultimately, customers were not required to upload their text layer, speeding up their document labeling process.

Data Ingestion System Overhaul

https://labelbox.com/blog/10x-faster-uploads-with-labelboxs-new-ingestion-upgrades/
I was part of the engineering team that completely rebuilt the data ingestion system to achieve ten times faster uploads, enhancing performance and efficiency. The data rows, including assets like images, videos, texts, and more, are the core of Labelbox. They were stored in MySQL, but the table didn't scale well, slowing down ingestion daily.

I created two new, separate microservices to handle data ingestion and stored the data rows in Spanner, which scales automatically. The old and new ingestion systems ran in parallel for months to ensure consistency, and then we switched to the new system with the flip of a switch, with no maintenance window required.

This was a major breakthrough that enabled large enterprises to upload hundreds of millions of new data rows, also leading to an increase in revenues.
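
The parallel run and cutover described above can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the actual implementation: `ParallelRunRouter` and its handlers are hypothetical names, and a real system would persist mismatches and read the flag from a feature-flag service.

```python
from typing import Any, Callable

Row = dict[str, Any]
Handler = Callable[[Row], Any]


class ParallelRunRouter:
    """Runs a legacy and a new ingestion handler side by side (hypothetical helper)."""

    def __init__(self, legacy: Handler, new: Handler, use_new: bool = False) -> None:
        self.legacy = legacy
        self.new = new
        self.use_new = use_new  # the "switch": set to True to cut over
        self.mismatches: list[tuple[Row, Any, Any]] = []

    def ingest(self, row: Row) -> Any:
        if self.use_new:
            # After cutover only the new system handles traffic.
            return self.new(row)
        legacy_result = self.legacy(row)
        try:
            new_result = self.new(row)
        except Exception as exc:  # a shadow failure must never affect the caller
            new_result = exc
        if new_result != legacy_result:
            self.mismatches.append((row, legacy_result, new_result))
        return legacy_result
```

While `use_new` is off, callers always get the legacy result, so the new path can be validated on real traffic without risk; once the mismatch list stays empty, flipping the flag needs no downtime.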

Data Export System Overhaul

https://labelbox.com/guides/how-to-export-your-data-with-more-granular-control/
The old export system was often unreliable and unable to export data covering longer time intervals. It also exported all available (relevant) data, generating very large outputs. The output format was poorly structured, making it complicated to add new features.

In the Exports V2 project, we first defined a public JSON schema of the exported data to ensure consistency across version changes. Then, we implemented a robust, scalable streaming export system that allowed granular control over the data to be exported. The cleverly designed export APIs allowed the front-end team to enhance the export experience across the Labelbox UI.
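
To illustrate what streaming with granular control can look like, here is a minimal sketch that emits one JSON line per data row and keeps only the requested top-level fields. The field names and the `stream_export` function are hypothetical and do not reflect the actual Exports V2 schema or API.

```python
import json
from typing import Any, Iterable, Iterator


def stream_export(rows: Iterable[dict[str, Any]], include: set[str]) -> Iterator[str]:
    """Yield one newline-delimited JSON line per row, limited to `include` fields."""
    for row in rows:
        selected = {name: value for name, value in row.items() if name in include}
        # Rows are serialized one at a time, so memory use stays flat
        # regardless of how many rows are exported.
        yield json.dumps(selected, sort_keys=True) + "\n"
```

Because the function is a generator, the consumer can start writing output before the last row has been read, which is what keeps long exports from accumulating a single oversized payload.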

SophosLabs

https://community.sophos.com/sophos-labs
SophosLabs is the core team behind Sophos, which offers the best anti-malware protection, real-time web protection, mobile security, network security, and more for mid-sized companies. I worked on various microservices, refactored legacy systems, added systems monitoring with LogicMonitor metrics, served as a technical lead for various tasks and projects, and became a go-to person for any technical questions.

TIA Portal

http://www.industry.siemens.com/topics/global/en/tia-portal/Pages/default.aspx
TIA Portal from Siemens is an engineering framework for creating, testing, and deploying robust automation systems, as well as simulation, diagnostics, and remote management. It is one of the largest C# projects in the world.

Education

2006 - 2011

Bachelor's Degree in Computer Engineering

University of Pannonia - Veszprem, Hungary

Certifications

JUNE 2019 - PRESENT

Advanced Architecting on AWS

Amazon Web Services

MAY 2019 - PRESENT

Architecting on AWS

Amazon Web Services

MARCH 2018 - PRESENT

AWS Certified Developer — Associate 2018

Udemy

APRIL 2017 - PRESENT

Accelerated ES6 JavaScript Training

Udemy

FEBRUARY 2017 - PRESENT

Understanding TypeScript

Udemy

MARCH 2013 - PRESENT

Microsoft® Certified Professional Developer (MCPD): Web Developer 4 (70-519)

Microsoft

Skills

Libraries/APIs

Node.js, React

Tools

Jenkins, Postman, CircleCI, Helm, JSON Schema, RabbitMQ

Languages

TypeScript, JavaScript, Python, GraphQL, SQL, HTML, Perl, Kotlin, C#

Frameworks

Angular, NestJS, Jest, Express.js

Paradigms

Microservices, Scrum, Agile Software Development

Storage

Elasticsearch, MySQL, Google Cloud, Microsoft SQL Server, MongoDB, Amazon S3 (AWS S3), PostgreSQL, Datadog, Google Cloud Spanner, Redis

Platforms

Linux, Docker, Amazon Web Services (AWS), AWS Lambda, Debezium, Codefresh

Other

Architecture, Back-end, Software Design, Full-stack, Pub/Sub, SaaS, Gentoo, Scrum Master, Machine Learning, Artificial Intelligence (AI), SDKs, Census API, Google Document AI, Google Cloud/Suite
