Branko Fulurija

Data Engineer and Developer in Višegrad, Republika Srpska, Bosnia and Herzegovina

Member since December 20, 2019
Branko is a data engineer who specializes in building big data platforms in the cloud. He has worked in diverse environments, including two Microsoft internships, a specialized cloud consulting agency, and one of the best mobile gaming startups in Europe. In addition to extensive experience in cloud architecture, data analytics, serverless solutions, and cost optimization, Branko has five AWS and two GCP certifications and is an award-winning competitive programmer and hackathon winner.

Portfolio

  • Nordeus
    Google Cloud Platform (GCP), Google Cloud Storage, Presto DB, Apache Airflow...
  • Kumulus Soft
    AWS Lambda, AWS, Redshift, AWS S3, Serverless, Graphs, GraphDB...
  • Microsoft
    C++, C#, React, TypeScript, JavaScript, Algorithms

Location

Višegrad, Republika Srpska, Bosnia and Herzegovina

Availability

Part-time

Preferred Environment

AWS, Google Cloud Platform (GCP), Apache Airflow, Docker, Git, Terminal, Serverless, Agile, Infrastructure as Code (IaC), JetBrains

The most amazing...

...thing I've developed is a petabyte-scale big data analytics platform capable of handling billions of events per day.

Employment

  • Data Engineer

    2020 - PRESENT
    Nordeus
    • Developed, maintained, and optimized a petabyte-scale big data platform on GCP, handling billions of events per day.
    • Created an internal tool for analyzing the cost of the data warehouse and data lake, which helped detect unneeded and overpriced data.
    • Maintained and improved an in-house ETL (workflow orchestrator) application written in Python.
    • Maintained, improved, and extended a data warehouse and data lake storing petabytes of data.
    • Integrated data from internal and external third-party systems into the data warehouse.
    • Designed and wrote 50+ data import (Python) and SQL (Presto or Hive) transformations while ensuring high data quality.
    • Migrated the event ingestion pipeline from the in-house ETL to Apache Airflow on Google Cloud Composer and optimized it, tightening the event latency SLO by 70% (a sketch of the DAG pattern follows this role).
    • Suggested and implemented several data lake storage optimization techniques that reduced storage size and cost by 40%.
    Technologies: Google Cloud Platform (GCP), Google Cloud Storage, Presto DB, Apache Airflow, BigQuery, Data Engineering, Data Warehousing, Big Data, Big Data Architecture, Hadoop, Business Intelligence (BI), Tableau, Google Cloud Composer, Apache Hive, Data Quality, Data Modeling, Data Lakes, ETL, Data Science
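
    A minimal sketch of the Composer ingestion pattern referenced above, assuming Airflow 2.x; the DAG ID, task, and load logic are hypothetical placeholders, not the production pipeline.

      # Hourly event-ingestion DAG skeleton (Airflow 2.x on Cloud Composer).
      # dag_id, task_id, and load_events are hypothetical placeholders.
      from datetime import datetime, timedelta

      from airflow import DAG
      from airflow.operators.python import PythonOperator

      def load_events(**context):
          # In a real pipeline this would pull one hour of events from GCS
          # and load them into the warehouse, parameterized by the
          # execution window (context["data_interval_start"]).
          ...

      with DAG(
          dag_id="event_ingestion",
          start_date=datetime(2021, 1, 1),
          schedule_interval="@hourly",
          catchup=False,
          default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
      ):
          PythonOperator(task_id="load_events", python_callable=load_events)
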
  • Cloud Data Engineer

    2018 - 2020
    Kumulus Soft
    • Built end-to-end data analytics solutions on AWS.
    • Collected, ingested, and integrated multiple data sources via an ETL process and stored the data in Amazon Neptune. The graph database was used to create a property graph that represents the complex hierarchical structure of the client's business.
    • Visualized the graph data using the D3.js library on a custom-built front-end page and enabled interactive querying via Apache TinkerPop and the Gremlin graph traversal language (a sketch of the traversal pattern follows this role).
    • Designed and implemented big data pipelines capable of handling terabytes of data.
    • Built front-end clients using vanilla JavaScript and visualization libraries.
    • Implemented RESTful APIs to support front-end applications.
    • Designed and implemented serverless microservices.
    Technologies: AWS Lambda, AWS, Redshift, AWS S3, Serverless, Graphs, GraphDB, Amazon Neptune, Elasticsearch, AWS CloudFormation, AWS DynamoDB, API Gateways, AWS Athena, AWS RDS, AWS EMR, AWS Serverless Application Model, Big Data, ETL, Data Science
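
    A minimal sketch of the interactive graph-querying pattern described above, using the gremlinpython client; the Neptune endpoint, vertex labels, and edge names are hypothetical.

      # Querying a Neptune property graph with gremlinpython.
      # Endpoint, labels, and edge names are hypothetical.
      from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
      from gremlin_python.process.anonymous_traversal import traversal

      conn = DriverRemoteConnection("wss://your-neptune-endpoint:8182/gremlin", "g")
      g = traversal().withRemote(conn)

      # Walk the business hierarchy: units owned by a given company.
      names = (
          g.V().hasLabel("company").has("name", "Acme")
           .out("owns").hasLabel("business_unit")
           .values("name")
           .toList()
      )
      print(names)
      conn.close()
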
  • Software Engineering Intern

    2018 - 2019
    Microsoft
    • Built a Microsoft Office add-in to help students practice math skills.
    • Combined and integrated services from multiple Microsoft Office products.
    • Assisted in shipping a feature to production that has thousands of users worldwide.
    • Implemented the front end and back end using C++, C#, TypeScript, and React.
    Technologies: C++, C#, React, TypeScript, JavaScript, Algorithms
  • Programming Tutor

    2016 - 2018
    Educational Center Belgrade
    • Prepared high school students for national programming competitions.
    • Taught advanced computer science topics, such as dynamic programming, graph theory, and data structures.
    • Assisted students in winning medals at international programming olympiads.
    Technologies: Algorithms, Data Structures, Competitive Programming, Computer Science
  • Software Engineering Intern

    2017 - 2017
    Microsoft
    • Created an internal big data tool that provides insights about system health and performance.
    • Enabled users of the tool to discover performance bottlenecks, recover quickly from failures, and understand system state through visualizations.
    • Used a tech stack that included C# and proprietary big data engines.
    Technologies: C#, C++, Graphs, Algorithms, Monitoring, Big Data, Big Data Architecture

Experience

  • Coding Interview Jumpstart Online Course Creator for Udemy
    https://www.udemy.com/course/coding-interview-jumpstart/

    Independently created and published a Udemy online course that has been taken by 23,000+ students. The course teaches foundational algorithms and computer science concepts and explains the principles behind the algorithm questions asked in interviews at top tech companies (a generic illustration follows).
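
    For flavor, a generic example of the kind of fundamental algorithm such a course covers (an illustration, not an excerpt from the course): binary search over a sorted list in O(log n).

      def binary_search(items, target):
          """Return the index of target in sorted list items, or -1."""
          lo, hi = 0, len(items) - 1
          while lo <= hi:
              mid = (lo + hi) // 2
              if items[mid] == target:
                  return mid        # found
              if items[mid] < target:
                  lo = mid + 1      # discard the left half
              else:
                  hi = mid - 1      # discard the right half
          return -1                 # not present

      assert binary_search([1, 3, 5, 7, 9], 7) == 3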

  • Recommendation Engine | Hackathon Winner

    My team built a service that recommends the users most likely to complete a given survey. We were given a relational dataset, which we transformed, enriched, and loaded into a PostgreSQL database. We then built an ETL job that denormalizes the relational model and extracts per-user aggregations into a DynamoDB table used by our recommendation engine (a sketch of this step follows). We also created a recommendation pipeline orchestrated with AWS Step Functions to extract the most relevant user pool.
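
    A hedged sketch of the denormalization step: aggregate per-user rows in PostgreSQL and upsert one flat item per user into DynamoDB. The connection string, tables, and columns are hypothetical.

      import boto3
      import psycopg2

      # Hypothetical connection details, table, and column names.
      pg = psycopg2.connect("dbname=surveys")
      table = boto3.resource("dynamodb").Table("user_aggregates")

      with pg, pg.cursor() as cur:
          cur.execute("""
              SELECT user_id,
                     COUNT(*) FILTER (WHERE completed) AS completed_surveys,
                     COUNT(*)                          AS invited_surveys
              FROM survey_responses
              GROUP BY user_id
          """)
          for user_id, completed, invited in cur:
              # One flat item per user replaces joins at read time.
              table.put_item(Item={
                  "user_id": str(user_id),
                  "completion_rate": str(round(completed / invited, 3)),
              })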

  • MTS Assistant | Hackathon Winner

    A distributed, cloud-based analysis system that consumed and processed telecom data to create interactive data visualizations. The tool provided personalized package recommendations based on clustering and detected network irregularities in real time. The telecom data was stored in Elasticsearch, with Kibana used for visualizations and for machine learning jobs that detect anomalies.

  • US Accidents | Udacity Nanodegree Project
    https://github.com/brfulu/us-accidents-data-engineering

    This was the capstone project in the Udacity Data Engineering Nanodegree program. The idea was to create an optimized data lake that would let users analyze US accident data and determine the root causes of accidents. The main goal was to build an end-to-end data pipeline capable of processing large volumes of data: we cleaned, transformed, and loaded the data into an optimized data lake on AWS S3, organized as logical tables partitioned by selected columns to reduce query latency (a minimal PySpark sketch follows).
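
    A minimal PySpark sketch of the partitioned-write idea; the bucket, paths, and partition columns are hypothetical.

      from pyspark.sql import SparkSession

      spark = SparkSession.builder.appName("us-accidents").getOrCreate()

      # Hypothetical bucket, path, and column names.
      accidents = spark.read.csv("s3a://bucket/raw/us_accidents.csv", header=True)

      # Partitioning by state and year lets query engines prune most files,
      # which is what keeps query latency low on the S3 data lake.
      (accidents
          .write
          .partitionBy("State", "Year")
          .mode("overwrite")
          .parquet("s3a://bucket/lake/accidents/"))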

  • Redshift Data Modeling | Udacity Nanodegree Project
    https://github.com/brfulu/redshift-data-modeling

    This project was part of the Udacity Data Engineering Nanodegree program. The task was to build an ETL pipeline that extracted data from AWS S3, staged it in Redshift, and transformed it into a set of dimensional tables ready for analytics consumption, enabling users to find insights into which songs certain users listened to. The output was a relational star schema in Redshift (a sketch of the load pattern follows).
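
    A sketch of the standard Redshift load pattern assumed here: COPY raw JSON from S3 into a staging table, then INSERT ... SELECT into a dimensional table. All names, columns, credentials, and the IAM role are hypothetical.

      import psycopg2

      # Hypothetical cluster, credentials, tables, columns, and IAM role.
      conn = psycopg2.connect(host="cluster.example.redshift.amazonaws.com",
                              port=5439, dbname="dev", user="awsuser",
                              password="...")

      with conn, conn.cursor() as cur:
          # Stage raw JSON events from S3 into Redshift.
          cur.execute("""
              COPY staging_events
              FROM 's3://bucket/log_data/'
              IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-s3'
              FORMAT AS JSON 'auto';
          """)
          # Transform the staged rows into a star-schema fact table.
          cur.execute("""
              INSERT INTO fact_songplays (user_id, song_id, played_at)
              SELECT userId, songId,
                     TIMESTAMP 'epoch' + ts / 1000 * INTERVAL '1 second'
              FROM staging_events
              WHERE page = 'NextSong';
          """)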

  • Data Lake ETL with Spark | Udacity Nanodegree
    https://github.com/brfulu/datalake-spark-etl

    This project was part of the Udacity Data Engineering Nanodegree. The task was to move data from a data warehouse to a data lake. The source data resided in AWS S3: a directory of JSON logs of user activity in the app and a directory of JSON metadata on the songs users listened to in the imaginary app.

    The requirements included building an ETL pipeline that extracted user data from AWS S3, processing it using Spark, and loading the data back into AWS S3 as a set of dimensional tables. This would allow the analytics team to continue finding insights into what songs their users were listening to.

  • Airflow Data Pipeline | Udacity Nanodegree Project
    https://github.com/brfulu/airflow-data-pipeline

    This project was part of the Udacity Data Engineering Nanodegree program. The task was to implement an automated and monitored ETL pipeline for loading data to a data warehouse. The pipeline was implemented using Apache Airflow. The source data resided in AWS S3 and needed to be processed in the target data warehouse in Amazon Redshift. The source datasets consisted of JSON logs of user activity in the application and JSON metadata about the songs the users listened to.

Skills

  • Languages

    Python, Java, SQL, C++, JavaScript, C#, TypeScript
  • Frameworks

    Hadoop, Presto DB, Spark, AWS EMR
  • Tools

    Apache Airflow, Google Cloud Dataproc, IntelliJ, PyCharm, Terraform, Ansible, Google Compute Engine (GCE), Google Kubernetes Engine (GKE), BigQuery, AWS Glue, AWS Athena, AWS Step Functions, Git, Terminal, JetBrains, TeamCity, Grafana, AWS CloudFormation, Google Cloud Composer, Apache Beam, Cloud Dataflow, Tableau, Looker, Jenkins, AWS CloudWatch, Kibana
  • Paradigms

    ETL, Object-oriented Programming (OOP), Testing, Microservices, Business Intelligence (BI), DevOps, Agile, Data Science
  • Platforms

    Google Cloud Platform (GCP), AWS Lambda, Apache Kafka, AWS EC2, Jupyter Notebook, Docker, Kubernetes, AWS Kinesis
  • Storage

    Data Pipelines, Databases, AWS S3, Data Lakes, Google Cloud Storage, Apache Hive, Redshift, PostgreSQL, Google Cloud SQL, AWS DynamoDB, Elasticsearch, Amazon Aurora, Google Bigtable, JSON
  • Other

    AWS, Data Warehousing, Software Engineering, Algorithms, Data Structures, Cloud Computing, Serverless, Data Engineering, Data Analytics, Data Processing, Data Modeling, Dataproc, Big Data, Competitive Programming, ELT, Graphs, Networking, Identity & Access Management (IAM), Google BigQuery, EMR, AWS RDS, Streaming, Internet of Things (IoT), GraphDB, Amazon Neptune, Google Data Studio, Infrastructure as Code (IaC), CI/CD Pipelines, AWS DevOps, Monitoring, Prometheus, AWS API Gateway, Cloud Architecture, AWS Cloud Architecture, Big Data Architecture, Data Quality, API Gateways, AWS Serverless Application Model, Computer Science, AWS QuickSight, Machine Learning, Recommendation Systems
  • Libraries/APIs

    PySpark, React

Education

  • Bachelor's Degree in Computer Science
    2016 - 2020
    Faculty of Computing - Belgrade, Serbia

Certifications

  • Google Cloud Certified Associate Cloud Engineer
    JUNE 2021 - JUNE 2023
    Google Cloud
  • Google Cloud Certified Professional Data Engineer
    OCTOBER 2020 - OCTOBER 2022
    Google Cloud
  • Data Engineering Nanodegree
    DECEMBER 2019 - PRESENT
    Udacity
  • AWS Certified Big Data - Specialty
    NOVEMBER 2019 - SEPTEMBER 2022
    AWS
  • AWS Certified SysOps Administrator - Associate
    MARCH 2019 - MARCH 2022
    AWS
  • AWS Certified Developer - Associate
    JANUARY 2019 - JANUARY 2022
    AWS
  • AWS Certified Solutions Architect - Associate
    NOVEMBER 2018 - NOVEMBER 2021
    AWS
  • AWS Certified Cloud Practitioner
    SEPTEMBER 2018 - MARCH 2022
    AWS
