Nazar Barabash
Verified Expert in Engineering
Software Developer
West Hartford, CT, United States
Toptal member since July 3, 2020
Nazar is a lead big data engineer with comprehensive experience in Hadoop and Spark. He has successfully led and delivered a number of Java- and Scala-based projects and frequently participates in building CI/CD pipelines using Jenkins, GitLab CI, Airflow, and Oozie.
Experience
- Java - 12 years
- Jenkins - 9 years
- Spark - 6 years
- Hadoop - 6 years
- Docker - 5 years
- Spring Boot - 4 years
- Apache Kafka - 4 years
- Kubernetes - 3 years
Preferred Environment
Python, Scala, Java, Spark, IntelliJ IDEA, iOS
The most amazing...
...issue I've fixed was in a client's Hadoop cluster configuration; the fix saved around $2.5 billion annually and reduced stagnation time.
Work Experience
Lead Big Data Engineer
EPAM Systems
- Implemented the first stable, repeatable, and reusable data ingestion ETL for the fifth-largest US bank.
- Found and resolved an issue in the Cloudera cluster resource manager configuration and proposed other changes to the default cluster configuration. Applying all of the suggestions reduced job latency and helped save $2.5 billion annually.
- Implemented a streaming ETL for collecting and unifying an insurance company's information about brokers and agencies. The new runtime source of truth saved the sales team most of the time and effort it had spent collecting data.
- Implemented a reusable DSL-based data ingestion framework that provided out-of-the-box flexibility for implementing new ETL jobs, collecting logs, and handling deployments.
- Implemented a crowdsourcing platform based on the Amazon Mechanical Turk service to collect a golden data set for later supervised training on an ML platform.
Senior Big Data Engineer
Lohika Systems (Altran Group)
- Redesigned and reimplemented a Hadoop v1 batch-oriented ETL as a Storm-based architecture.
- Migrated the custom computation engine of a fintech company to a Spark-based one. Migrated data from Neo4j and RDBMS to Parquet files.
- Built a content enrichment service for a system that predicts trends across social network content.
Senior Java Engineer
EPAM Systems
- Implemented one of the first Big Data ETL and analytics systems in the industry for the biggest advertising company in the US market, which was later acquired by PayPal based on the company's performance.
- Implemented key functionality in an MVP for the EU government that used triples, Semantic Web technologies, SPARQL, and AllegroGraph to find matching and related laws across EU countries.
- Implemented a crowdsourcing system for collecting a golden data set to speed up supervised ML platform learning and increase the quality of results. A media company uses the results for more precise tagging and labeling of the media content it sells.
Experience
Senior Big Data Engineer
I built a new Spark-based engine by moving data from Neo4j and RDBMS to Parquet files. The new engine was able to process more data and provide more granular reports.
Switching to the new execution engine allowed my client to keep ongoing contracts, onboard new clients, and later sell the business from a strong market position.
Lead Big Data Engineer
I led a team to design and implement a comprehensive DSL-based ingestion framework. The framework provided out-of-the-box features such as extensibility and code reusability, unified logging, integration testing (in a small Docker-based cluster), and single-command deployment.
Using the implemented framework, our team set up about 1,000 ingestion pipelines, data deduplication, and aggregation jobs in a few months. We provided a solid data set to the data science team.
Lead Big Data Engineer
On their own, the data collected in the data lake and the analytics built on top of it did not deliver timely insights to the business because the data was difficult to access. Proper reports and research requests took weeks.
With the help of a UI engineer, I implemented a single-page application where business users could easily query high-level data with human-readable requests like "top sales in 2020 by province." In addition to direct matches, the search suggested relevant results with similar dimensions.
The business was able to query data on demand without requests to BA teams.
Education
Master's Degree in Applied Math and Informatics
Ivan Franko National University of Lviv - Lviv, Ukraine
Bachelor's Degree in Applied Math and Informatics
Ivan Franko National University of Lviv - Lviv, Ukraine
Skills
Libraries/APIs
REST APIs
Tools
GitLab CI/CD, Jenkins, IntelliJ IDEA
Languages
Java, Scala, Python
Frameworks
Spark, Spring Boot, Hadoop, Spring, Hibernate
Paradigms
REST
Platforms
Apache Kafka, Docker, Kubernetes, iOS, OpenShift, Amazon Web Services (AWS)
Storage
MySQL