Nazar Barabash

Software Developer in West Hartford, CT, United States

Member since July 3, 2020
Nazar is a lead big data engineer with comprehensive experience in Hadoop and Spark. He has successfully led and delivered a number of Java- and Scala-based projects and frequently builds CI/CD pipelines using Jenkins, GitLab CI, Airflow, and Oozie.
Location

West Hartford, CT, United States

Availability

Part-time

Preferred Environment

Python, Scala, Java, Spark, IntelliJ IDEA, iOS

The most amazing...

...issue I've fixed was a misconfiguration in a client's Hadoop cluster; the fix saved around $2.5 billion annually and reduced job latency.

Employment

  • Lead Big Data Engineer

    2016 - PRESENT
    EPAM Systems
    • Implemented the first stable, repeatable, and reusable data ingestion ETL for the fifth-largest US bank.
    • Found and resolved an issue in the Cloudera cluster resource manager configuration and proposed further changes to the default cluster configuration. Applying all of the suggestions reduced job latency and helped save $2.5 billion annually.
    • Implemented a streaming ETL that collects and unifies an insurance company's information about brokers and agencies. The new runtime source of truth saved the sales team most of the time and effort previously spent collecting data.
    • Implemented a reusable, DSL-based data ingestion framework that provided out-of-the-box flexibility for implementing new ETL jobs, log collection, and deployments.
    • Implemented a crowdsourcing platform based on the Amazon Mechanical Turk service to collect a golden data set for subsequent supervised training of an ML platform.
    Technologies: REST APIs, Apache Kafka, OpenShift, Docker, Hadoop, Spark, Python, Scala, Java
  • Senior Big Data Engineer

    2014 - 2016
    Lohika Systems (Altran Group)
    • Redesigned and reimplemented a Hadoop v1 batch-oriented ETL as a Storm-based architecture.
    • Migrated a fintech company's custom computation engine to a Spark-based one, moving data from Neo4j and an RDBMS to Parquet files.
    • Built a content enrichment service for a system that predicts trends across social network content.
    Technologies: Spark, Hadoop, Java
  • Senior Java Engineer

    2011 - 2014
    EPAM Systems
    • Implemented one of the industry's first big data ETL and analytics systems for the largest advertising company in the US market, which PayPal later acquired based on its performance.
    • Implemented key functionality in one of the MVPs for an EU government, which required using RDF triples, Semantic Web technologies, SPARQL, and AllegroGraph to find matching and related laws across EU countries.
    • Implemented a crowdsourcing system for collecting a golden data set, speeding up supervised learning for an ML platform and improving result quality. A media company used the results for more precise tagging and labeling of sold media content.
    Technologies: Amazon Web Services (AWS), Hibernate, Spring, Jenkins, MySQL, Java

Experience

  • Senior Big Data Engineer

    A fintech company from the Bay Area built its own computation engine to serve performance management reports. Over time, however, the amount of data grew significantly, and users wanted to drill down to the lowest levels of the reports; the custom execution engine could not handle such requests.
    I built a new Spark-based engine, moving data from Neo4j and an RDBMS to Parquet files. The new engine processed more data and provided more granular reports.
    Switching to the new execution engine allowed my client to keep ongoing contracts, onboard new clients, and later sell the business from a strong market position.

  • Lead Big Data Engineer

    The largest Canadian retailer decided to extract new insights and business value from siloed data.
    I led a team to design and implement a comprehensive DSL-based ingestion framework. The framework provided out-of-the-box features such as extensibility and code reusability, unified logging, integration testing (in a small Docker-based cluster), and single-command deployment.
    Using the implemented framework, our team set up about 1,000 ingestion pipelines, data deduplication, and aggregation jobs in a few months. We provided a solid data set to the data science team.

  • Lead Big Data Engineer

    I was engaged on behalf of a Canadian retailer to design and develop a proof-of-concept search engine supporting data discovery over a subset of the client's existing enterprise data warehouse.
    The data was collected in a data lake, but the analytics built on top of it did not deliver timely insights to the business because the data was hard to access; proper reports and research requests took weeks.
    With the help of a UI engineer, I implemented a single-page application where business users can query high-level data with human-readable requests like "top sales in 2020 by province." Beyond direct matches, the search also proposed relevant results of similar dimensionality.
    The business was able to query data on demand without filing requests with BA teams.

Skills

  • Languages

    Java, Scala, Python
  • Frameworks

    Spark, Spring Boot, Hadoop, Spring, Hibernate
  • Libraries/APIs

    REST APIs
  • Tools

    GitLab CI/CD, Jenkins, IntelliJ IDEA
  • Paradigms

    REST
  • Platforms

    Apache Kafka, Docker, Kubernetes, iOS, OpenShift, Amazon Web Services (AWS)
  • Storage

    MySQL
  • Other

    AWS

Education

  • Master's Degree in Applied Math and Informatics
    2008 - 2009
    Ivan Franko National University of Lviv - Lviv, Ukraine
  • Bachelor's Degree in Applied Math and Informatics
    2004 - 2008
    Ivan Franko National University of Lviv - Lviv, Ukraine
