Data Engineer
2020 - PRESENT
Nordeus
- Developed, maintained, and optimized a petabyte-scale big data platform on GCP, handling billions of events per day. This work is ongoing.
- Created an internal tool for analyzing the cost of the data warehouse and data lake, which helped detect unneeded and overpriced data.
- Maintained and improved an in-house ETL (workflow orchestrator) application written in Python. This work is ongoing.
- Maintained, improved, and extended a data warehouse and data lake that store petabytes of data. This work is ongoing.
- Integrated data from internal and external third-party systems into the data warehouse.
- Designed and wrote 50+ data imports (Python) and SQL transformations (Presto or Hive) while ensuring high data quality.
- Migrated the event ingestion pipeline from the in-house ETL to Apache Airflow on Google Cloud Composer and optimized the pipeline, reducing the event latency SLO by 70%.
- Suggested and implemented several data lake storage optimization techniques that reduced the storage size and cost by 40%.
Technologies: Google Cloud Platform (GCP), Google Cloud Storage, Presto DB, Apache Airflow, BigQuery, Data Engineering, Data Warehousing, Big Data, Big Data Architecture, Hadoop, Business Intelligence (BI), Tableau, Google Cloud Composer, Apache Hive, Data Quality, Data Modeling, Data Lakes, ETL, Data Science

Cloud Data Engineer
2018 - 2020
Kumulus Soft
- Built end-to-end data analytics solutions on top of AWS Cloud.
- Collected, ingested, and integrated multiple data sources via an ETL process and stored the data in Amazon Neptune. The graph database was used to create a property graph that represents the complex hierarchical structure of the client's business.
- Visualized the graph data using the D3.js library on a custom-built front-end webpage and enabled interactive data querying via Apache TinkerPop and the Gremlin graph traversal language.
- Designed and implemented big data pipelines capable of handling terabytes of data.
- Built front-end clients using vanilla JavaScript and visualization libraries.
- Implemented RESTful APIs to support front-end applications.
- Designed and implemented serverless microservices.
Technologies: AWS Lambda, AWS, Redshift, AWS S3, Serverless, Graphs, GraphDB, Amazon Neptune, Elasticsearch, AWS CloudFormation, AWS DynamoDB, API Gateways, AWS Athena, AWS RDS, AWS EMR, AWS Serverless Application Model, Big Data, ETL, Data Science

Software Engineering Intern
2018 - 2019
Microsoft
- Built a Microsoft Office add-in to help students practice math skills.
- Combined and integrated services from multiple Microsoft Office products.
- Assisted in shipping a feature to production that has thousands of users worldwide.
- Implemented the front end and back end using C++, C#, TypeScript, and React.
Technologies: C++, C#, React, TypeScript, JavaScript, Algorithms

Programming Tutor
2016 - 2018
Educational Center Belgrade
- Prepared high school students for national programming competitions.
- Taught advanced computer science topics, such as dynamic programming, graph theory, and data structures.
- Helped students win medals at international programming olympiads.
Technologies: Algorithms, Data Structures, Competitive Programming, Computer Science

Software Engineering Intern
2017 - 2017
Microsoft
- Created an internal big data tool that provides insights about system health and performance.
- Enabled users of the tool to discover performance bottlenecks, recover quickly from failures, and understand the system state through visualizations.
- Used a tech stack that included C# and proprietary big data engines.
Technologies: C#, C++, Graphs, Algorithms, Monitoring, Big Data, Big Data Architecture