Graduate Teaching Assistant | 2018 - 2020 | Arizona State University
- Supported computer science and software engineering courses covering topics such as principles of distributed software systems, web application programming, design and process methods, and embedded computing, using Java, C, Gradle, and Docker.
Software Development Engineer Intern | 2019 | Amazon.com, Inc.
Technologies: Amazon DynamoDB, Amazon S3, Amazon SQS, Amazon Simple Notification Service (SNS), AWS Lambda, Scala
- Owned a proof-of-concept project using S3-based event triggering for big data jobs, contributing to Horizon, the finance tech department's big data platform.
- Designed and developed a serverless architecture that consumed SNS and S3 events, enriched them with auxiliary data from DynamoDB, and delivered the output via SQS to the existing code and infrastructure.
- Set up the proprietary framework for continuous integration and deployment to AWS infrastructure; wrote unit tests using ScalaMock and AWS SAM.
Software Development Engineer | 2016 - 2018 | Fractal Analytics, Inc.
Technologies: Jenkins Pipeline, Jenkins, Travis CI, Apache Airflow, Bash, Amazon Simple Notification Service (SNS), REST APIs, JSON, Amazon Aurora, AWS RDS, PostgreSQL, AWS CloudWatch, AWS EBS, Amazon SQS, AWS EMR, Spark ML, PySpark, Apache Spark, Redis, RabbitMQ, Celery, Django REST Framework, Django ORM, Django, Python 3, Python 2
- Worked on the back-end engineering team developing the cloud-based product Trial Run; owned the back end of the customer-level experimentation module.
- Built an ETL pipeline using PySpark, Airflow, and AWS to handle terabytes of data; using locality-sensitive hashing (LSH) for approximate nearest-neighbor distances cut processing time by 62% and infrastructure cost by 26% compared to the existing pipeline.
- Re-architected API-level code using Celery, RabbitMQ, and Redis caching to handle a 95x increase in experiment data size.
- Modified the data models, making the architecture flexible enough to accommodate varied data feeds from clients.
- Reduced the data-error rate by 73% by adding automated QC checks before and after the ETL process, eliminating the manual intervention previously required.
- Set up a fully automated, modular deployment pipeline consisting of unit testing, post-release sanity checks, code-quality assurance, artifact building and storage, and deployment to internal and cloud servers.
Software Development Engineer | 2015 - 2016 | S&P Global Market Intelligence
Technologies: SQL Server 2012, Oracle, Java
- Collected and delivered financial data feeds to clients as part of the Xpressfeed team; Xpressfeed is S&P's data feed management solution that delivers data directly into clients' workflows.
- Collaborated with the index ingestion team to implement long-running SQL Server jobs processing the daily market index data feed.
- Implemented SQL jobs replicating data to the shared data layer, making market index data available across the organization.