Founder | 2020 - PRESENT | Firmation
Technologies: Google Analytics, Google Ads, OneDrive, Google APIs, Microsoft Graph API, Pandas, AWS SES, AWS S3, AWS Lambda, OAuth 2, Python
- Founded a legaltech startup to help Indian lawyers automate their time-keeping and billing, saving them time and increasing their revenue. Led product development, sales, and marketing.
- Built an Azure app that integrates with law firms' timesheets in OneDrive and automatically generates invoices from them; used OAuth 2, Microsoft Graph API, and Pandas.
- Used OAuth 2 and Microsoft Graph API to build an Azure app that automatically generates summaries of lawyers' billable work by reading their Outlook emails and calendars.
- Extended this functionality to Google accounts using OAuth 2, Gmail API, and Google Calendar API.
- Created AWS Lambda functions with API Gateway endpoints that end users could call to generate invoices and summaries of their billable work on demand.
- Used AWS SES to deliver invoices and billable work summaries to lawyers whenever they were generated.
- Called and emailed potential clients, demoed our product to several potential users, and successfully onboarded customers onto the product.
- Designed the website (www.firmation.in) and handled marketing using Google Ads.
- Wrote a Python script to automatically contact potential leads on LinkedIn.
Data Science Consultant | 2019 - PRESENT | Brookings Institution India Center
Technologies: Selenium, Beautiful Soup, AWS Route 53, AWS CloudFront, Droplets, DigitalOcean, AWS RDS, AWS EC2, AWS S3, Scikit-learn, AWS Lambda, Redshift, AWS, Pandas, Python
- Created a data warehouse in Redshift with one-minute resolution demand data for a large Indian state using Python, Pandas, and EC2. The source data was about 6 TB of column-formatted .xls files compressed as .rar archives.
- Built a real-time carbon emissions tracker for India at carbontracker.in using Vue.js and Plotly, as well as AWS S3, Route 53, and CloudFront for hosting.
- The tracker was featured in the Wall Street Journal (https://www.wsj.com/articles/solar-power-is-beginning-to-eclipse-fossil-fuels-11581964338?mod=hp_lead_pos5).
- Scraped data for the carbon tracker using Python, Beautiful Soup, and a DigitalOcean Droplet, storing it in the RDS instance used by the Lambda API.
- Created a machine learning model, trained on data from a Redshift warehouse, to predict daily electricity demand for a large Indian state using Scikit-learn, Python, and Pandas.
- Created Python scripts to scrape housing data from various Indian state government websites using Selenium and Pandas.
- Created an API for the carbon emissions tracker using AWS Lambda, AWS API Gateway, Python, and an AWS RDS MySQL instance to serve real-time generation data, as well as various statistics.
Data Warehouse Developer | 2020 - 2020 | Confidential NDA (Toptal Client)
Technologies: Plotly, Heroku, Stitch Data, MongoDB, BigQuery
- Designed and developed a production data warehouse with denormalized tables in BigQuery using a MongoDB database on Heroku as the data source and Stitch Data as an ETL tool.
- Scheduled extractions from MongoDB every six hours using Stitch Data, ensuring that only recently updated data was included.
- Created a scheduled query to join and load data into a denormalized table in BigQuery after each extraction from MongoDB completes.
- Created graphs and geospatial plots from BigQuery data using Plotly and Jupyter Notebooks for the customer's demo to their client.
- Thoroughly documented instructions for setting up and querying BigQuery for future developers working on the project.
- Researched integrating Google Analytics into BigQuery to track the customer lifecycle from acquisition onwards.
- Created views on the denormalized BigQuery table to allow users to easily see the most recent state of the database.
- Worked closely with the QA lead to test the data warehouse end to end, from automated extractions to loads and views.
Scraping Engineer | 2019 - 2020 | Tether Energy
- Wrote Bash and SQL scripts that ran as a cron job to download data from the New York ISO website and upload it to Tether's data warehouse using Presto and Hive.
- Developed Python scripts using Tabula and Pandas to scrape data from various formats of PDF electricity bills and upload it to an internal service.
- Implemented a robust regression testing framework using Pytest to ensure that PDFs are correctly scraped.
- Augmented an internal API by adding new endpoints and models using Ruby on Rails.
- Improved an internal cron service by adding a JSON-defined schedule that methods could run on.
- Added documentation on how to set up and test various internal services locally.
Senior Software Engineer | 2014 - 2018 | AutoGrid Systems, Inc.
Technologies: Docker, Kubernetes, YARN, Apache Kafka, RabbitMQ, Celery, Resque, Redis, HBase, Apache Hive, Spark, Python, Ruby on Rails (RoR), Ruby
- Led an engineering team spanning onshore and offshore members and drove on-time development and deployment of product features using Agile.
- Implemented several features across AutoGrid's suite of applications using Ruby on Rails, MySQL, RSpec, Cucumber, Python, and Nose Tests.
- Created PySpark jobs to aggregate daily and monthly electricity usage reports for viewing through AutoGrid's customer portal using HBase, Redis, and RabbitMQ.
- Designed and developed a customer-facing data warehouse using Hive, HBase, and Oozie. It replaced all of AutoGrid's custom in-house visualizations.
- Built an API endpoint using Ruby on Rails and Twilio that lets end users opt out of demand response events via SMS.
- Optimized SQL queries to run in 40% less time, speeding up load times in the UI.
- Designed a messaging microservice to send and track emails, SMS, and phone calls via Twilio and SendGrid.