Big Data Consultant | 2016 - PRESENT | Clients (via Toptal)
Technologies: Druid.io, Spark, Apache Kafka, Hadoop
- Consulted with startups and medium-scale organizations to build data lakes for analytics.
- Advised organizations on building real-time data pipelines using Kafka and Spark.
- Helped organizations to analyze and report on their datasets.
Data Modeling/Data Analyst (Hive and Kafka) | 2020 - 2020 | Uniphi Inc
Technologies: Amazon Web Services (AWS), Azure, Spark, Apache Hive, Apache Kafka
- Analyzed the problem statement thoroughly, evaluated the available data engineering options, and proposed an optimized, stable architecture.
- Developed a configuration-driven streaming engine that auto-detects changes in well-known distributed storage systems such as AWS S3, Azure Blob, and GCP File System and ingests them into a streaming queue (Kafka).
- Handed the project over successfully to the in-house development team with proper knowledge transfer and supported end-to-end functionality for a week.
Senior Data Engineer | 2016 - 2020 | Morgan Stanley
Technologies: Amazon Web Services (AWS), Apache Airflow, Apache Hive, Apache Kafka, AWS EMR, AWS Glue, Hadoop, HDFS, Spark, Spark Structured Streaming, Spark Streaming
- Received promotions in four consecutive years for exceptional performance.
- Received MD recognition for exceptional deliverables on the real-time data ingestion initiative.
- Worked on an analytics system that currently processes 10K records per minute on a 10-node Spark cluster.
- Managed and nurtured a team of six people to work on the next-generation real-time cyber analytics engine.