Tech Lead and Full-stack Developer
2021 - PRESENT
ThoughtLeaders (via Toptal)
- Worked as a full-stack developer and technical lead: implemented several new features in the platform using Django, AngularJS, Heroku, and AWS; ramped up data-gathering efforts; and mentored other team members.
- Added new formats to the platform, including Twitch, and designed processes to enhance automated scraping of YouTube and podcast data while reducing API usage.
- Designed several new tables in the Postgres database and worked extensively with Elasticsearch to query and store data.
- Developed a cost-effective solution to generate transcripts for podcasts using Zapier, Celery, and AWS.
- Reduced monthly AWS costs by around 70% by optimizing Lambda usage, S3 storage, and removing redundant processes.
- Optimized Heroku dyno usage to cut costs, preventing a potential 100% cost increase. Also upgraded several libraries and migrated infrastructure in response to the security incident affecting Heroku's GitHub integration.
- Designed and developed an authorization back-end and integrated the application with a payment gateway via BlueSnap.
- Created several predictive and automation features, reducing the burden on employees in other verticals of the company.
- Designed end-to-end data pipelines using AWS Kinesis Firehose, SNS, SQS, Lambda, and S3.
- Integrated AWS Athena with an existing JSON data lake in S3, allowing the querying of unstructured data.
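Querying raw JSON in S3 with Athena works best when each object is stored as newline-delimited JSON. A minimal sketch of how events could be serialized and pushed into a Kinesis Firehose delivery stream that buffers into the data lake (the stream name and event shapes are illustrative, not from the actual codebase):

```python
import json

def to_ndjson_records(events):
    """Serialize events as newline-delimited JSON -- the layout Athena's
    JSON SerDe expects when scanning objects in S3."""
    return [{"Data": (json.dumps(e, separators=(",", ":")) + "\n").encode()}
            for e in events]

def ship_to_firehose(client, stream_name, events):
    """Send a batch of events to a Kinesis Firehose delivery stream
    (client would be boto3.client('firehose'); name is hypothetical)."""
    return client.put_record_batch(
        DeliveryStreamName=stream_name,
        Records=to_ndjson_records(events),
    )
```

Firehose then buffers the records into S3, where an Athena external table over the bucket can query the unstructured fields directly.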
Technologies: Django, Celery, Amazon S3 (AWS S3), AWS Lambda, AWS Kinesis, Architecture, Elasticsearch, Heroku, Python 3, Node.js, PostgreSQL, PyCharm, AngularJS, APIs, YouTube API, Scraping, Cron, Cost Reduction & Optimization, BlueSnap, Amazon Simple Notification Service (AWS SNS), Amazon SQS

Back-end Developer
2020 - 2021
Hoomi (Toptal client)
- Designed and created infrastructure, databases, and APIs for a baked-goods delivery app using DynamoDB, Lambda, API Gateway, and Python, with AWS Cognito for authentication.
- Created an order management front-end system for bakeries with React and authentication using AWS Cognito, using APIs that I wrote connecting to the DynamoDB database.
- Used geo libraries in DynamoDB to allow indexing and sorting by location, allowing APIs to return bakeries by distance.
- Utilized local secondary indexes in DynamoDB to index and sort by various attributes, such as rating, price, distance, etc.
- Created various APIs for both the customer and bakery apps, correctly handling authentication, order histories, order statuses, etc.
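A DynamoDB geo query returns candidates inside a bounding area, which the API then ranks by actual distance. A small sketch of that ranking step, assuming each candidate record carries `lat`/`lon` attributes (field names are illustrative):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def nearest_bakeries(bakeries, lat, lon, limit=10):
    """Rank bakeries returned by a geo query by distance from the
    customer, closest first."""
    return sorted(
        bakeries,
        key=lambda b: haversine_km(lat, lon, b["lat"], b["lon"]),
    )[:limit]
```

Sorting by other attributes (rating, price) maps onto the local secondary indexes: a boto3 `Query` with `IndexName=` set to the LSI returns items pre-sorted by that attribute's sort key.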
Technologies: Amazon DynamoDB, AWS Lambda, AWS Amplify, Amazon Cognito, Amazon API Gateway, Geolocation, Databases, User Authentication, Python, Serverless, React

Data Warehouse Developer
2020 - 2020
Confidential NDA (Toptal Client)
- Designed and developed a production data warehouse with denormalized tables in BigQuery, using a MongoDB database on Heroku as the data source and Stitch Data as an ETL tool.
- Scheduled extractions from MongoDB every six hours using Stitch Data, ensuring that only recently-updated data was included.
- Created a scheduled query to join and load data into a denormalized table in BigQuery after each extraction from MongoDB completed.
- Created graphs and geospatial plots from BigQuery data using Plotly and Jupyter Notebooks for the customer's demo to their client.
- Thoroughly documented instructions for setting up and querying BigQuery for future developers working on the project.
- Researched integrating Google Analytics into BigQuery to track the customer lifecycle from acquisition onwards.
- Created views on the denormalized BigQuery table to allow users to easily see the most recent state of the database.
- Worked closely with the QA lead to test the data warehouse end to end, from automated extractions through loads and views.
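A "most recent state" view over an append-only denormalized table typically keeps one row per key, chosen by latest update timestamp (in BigQuery, `ROW_NUMBER() OVER (PARTITION BY key ORDER BY ts DESC)` filtered to 1). A sketch of the equivalent logic in plain Python, with hypothetical `_id`/`updated_at` field names:

```python
def latest_state(rows, key="_id", ts="updated_at"):
    """Keep only the newest row per key -- the same effect as a
    BigQuery view that ranks rows with ROW_NUMBER() per key and
    filters to rank 1."""
    latest = {}
    for row in rows:
        k = row[key]
        if k not in latest or row[ts] > latest[k][ts]:
            latest[k] = row
    return list(latest.values())
```

ISO-8601 timestamp strings compare correctly as plain strings, which keeps the sketch dependency-free.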
Technologies: Plotly, Heroku, Stitch Data, MongoDB, BigQuery

Founder
2020 - 2020
Firmation
- Founded a legal tech startup to help Indian lawyers automate their time-keeping and billing, saving them time and increasing their revenue. Led product development, sales, marketing, and engineering.
- Built an Azure app to integrate with law firms' timesheets in OneDrive and automatically generate invoices from them, using OAuth 2, the Microsoft Graph API, and Pandas.
- Used OAuth 2 and the Microsoft Graph API to build an Azure app that automatically generates summaries of lawyers' billable work by reading their Outlook emails and calendars.
- Extended this functionality to Google accounts through OAuth 2, the Gmail API, and the Google Calendar API.
- Created AWS Lambda functions with API Gateway endpoints that end-users could access to generate invoices and summaries of their billable work.
- Used AWS SES to deliver invoices and billable work summaries to lawyers whenever they were generated.
- Called and emailed potential clients, demoed our product to several potential users, and successfully set up customers with the product for usage.
- Designed a website and handled marketing using Google Ads.
- Wrote a Python script to automatically contact potential leads on LinkedIn.
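Summarizing billable work from calendar data boils down to totaling event durations per client matter. A simplified sketch of that aggregation step, assuming events have already been fetched (e.g. from Graph's `/me/calendarView`) and flattened to ISO-8601 `start`/`end` strings plus a hypothetical `matter` tag:

```python
from datetime import datetime

def billable_hours(events):
    """Total billable hours per matter from calendar events.
    Event shape is simplified; the real Graph API nests start/end
    under objects with dateTime and timeZone fields."""
    totals = {}
    for ev in events:
        start = datetime.fromisoformat(ev["start"])
        end = datetime.fromisoformat(ev["end"])
        hours = (end - start).total_seconds() / 3600
        totals[ev["matter"]] = totals.get(ev["matter"], 0.0) + hours
    return totals
```

The same aggregation applies unchanged to events pulled from the Google Calendar API, which is what made extending the feature to Google accounts straightforward.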
Technologies: Automation, REST APIs, Google Analytics, Google Ads, OneDrive, Google APIs, Microsoft Graph API, Pandas, AWS SES, Amazon S3 (AWS S3), AWS Lambda, OAuth 2, Python

Scraping Engineer
2019 - 2020
Tether Energy
- Wrote Bash and SQL scripts that ran on a cron job to download data from the New York ISO website and upload it to Tether's data warehouse using Presto and Hive.
- Created scripts to automatically fetch electricity bill data for Brazilian consumers and then upload them to an S3 bucket using JavaScript, Puppeteer, and AWS.
- Automated solving of ReCAPTCHAs using JavaScript and 2captcha.
- Developed Python scripts to scrape data from various formats of PDF electricity bills and then upload them to an internal service using Tabula and Pandas.
- Implemented a robust regression testing framework using Pytest to ensure that PDFs are correctly scraped.
- Augmented an internal API by adding new endpoints and models using Ruby on Rails.
- Improved an internal cron service by adding a JSON schedule that methods could run on.
- Added documentation on how to set up and test various internal services locally.
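A regression suite for PDF scraping usually compares scraper output against checked-in golden values, with normalization helpers for locale-specific fields. A sketch of two such helpers, assuming Brazilian-format currency strings; in the real suite these would be driven by `pytest.mark.parametrize` over a directory of fixture PDFs (names here are illustrative, not from the actual codebase):

```python
def normalize_amount(raw):
    """Convert a Brazilian-format currency string, e.g. 'R$ 1.234,56',
    to a float. Thousands use '.', decimals use ','."""
    digits = raw.replace("R$", "").strip().replace(".", "").replace(",", ".")
    return float(digits)

def assert_bill_matches(expected, scraped, tol=0.01):
    """Golden-file comparison: every expected field must be present in
    the scraped result, with numeric totals agreeing within tolerance."""
    for field, want in expected.items():
        got = scraped[field]
        if isinstance(want, float):
            assert abs(got - want) <= tol, f"{field}: {got!r} != {want!r}"
        else:
            assert got == want, f"{field}: {got!r} != {want!r}"
```

Keeping the comparison in one helper means a new bill format only needs a fixture PDF and a golden JSON file, not a new test.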
Technologies: Automation, Apache Hive, Presto DB, Pandas, Ruby on Rails (RoR), Puppeteer, Node.js, JavaScript, Tabula, SQL, Python

Data Science Consultant
2019 - 2020
Brookings Institution India Center
- Created a data warehouse in Redshift with one-minute-resolution demand data for a large Indian state using Python, Pandas, and EC2. The source data was about six TB of column-formatted .xls files compressed as .rar archives.
- Built a real-time carbon emissions tracker for India at carbontracker.in using Vue.js and Plotly, with AWS S3, Route 53, and CloudFront for hosting.
- Had the project featured in the Wall Street Journal.
- Scraped data for the carbon tracker using Python, BeautifulSoup, and a Digital Ocean Droplet, storing it in the RDS instance used by the Lambda API.
- Created a machine learning model using Scikit-learn, Python, and Pandas to predict daily electricity demand for a large Indian state trained on data from a Redshift warehouse.
- Developed Python scripts to scrape housing data from various Indian state government websites using Selenium and Pandas.
- Built an API for the carbon emissions tracker using AWS Lambda, AWS API Gateway, Python, and an AWS RDS MySQL instance to serve real-time generation data and various statistics.
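The tracker's API is a standard Lambda-behind-API-Gateway shape: parse query parameters, fetch stats, return a JSON response. A self-contained sketch of the handler; in production the stats were read from the RDS MySQL instance, and the `metric` parameter name and placeholder values here are purely illustrative:

```python
import json

# Stand-in for stats read from the AWS RDS MySQL instance in production;
# these numbers are placeholders, not real generation data.
LATEST_STATS = {"total_mw": 0.0, "co2_tonnes_per_hour": 0.0}

def lambda_handler(event, context):
    """AWS Lambda handler behind API Gateway serving generation stats.
    With no 'metric' query parameter, returns all statistics."""
    params = event.get("queryStringParameters") or {}
    metric = params.get("metric")
    body = LATEST_STATS if metric is None else {metric: LATEST_STATS.get(metric)}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }
```

API Gateway's Lambda proxy integration passes query parameters under `queryStringParameters` (or `None` when absent), hence the `or {}` guard.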
Technologies: Automation, REST APIs, Amazon Web Services (AWS), Selenium, Beautiful Soup, Amazon Route 53, Amazon CloudFront CDN, Droplets, DigitalOcean, Amazon EC2, Amazon S3 (AWS S3), Scikit-learn, AWS Lambda, Redshift, Pandas, Python

Senior Software Engineer
2014 - 2018
AutoGrid Systems, Inc.
- Led an on- and offshore engineering team, driving on-time development and deployment of product features using Agile.
- Implemented several features across AutoGrid's suite of applications using Ruby on Rails, MySQL, RSpec, Cucumber, Python, and Nose Tests.
- Created PySpark jobs to aggregate daily and monthly electricity usage reports for viewing through AutoGrid's customer portal using HBase, Redis, and RabbitMQ.
- Designed and developed a customer-facing data warehouse using Hive, HBase, and Oozie, replacing all of AutoGrid's custom in-house visualizations.
- Built an API endpoint to allow end-users to opt out of demand response events via SMS using Ruby on Rails and Twilio.
- Optimized SQL queries to take 40% less time, making loading times much quicker in the UI.
- Designed a messaging microservice to send and track emails, SMS, and phone calls via Twilio and SendGrid.
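The core of the usage-report jobs is rolling interval meter readings up to daily totals per meter. A dependency-free sketch of that aggregation, assuming readings arrive as `(meter_id, ISO timestamp, kWh)` tuples; the production version ran as a PySpark job over HBase at much larger scale:

```python
from collections import defaultdict

def daily_usage(readings):
    """Roll interval meter readings up to daily kWh totals per meter.
    Each reading is a (meter_id, 'YYYY-MM-DDTHH:MM', kwh) tuple; the
    tuple shape is illustrative, not AutoGrid's actual schema."""
    totals = defaultdict(float)
    for meter_id, ts, kwh in readings:
        day = ts[:10]  # date component of the ISO-8601 timestamp
        totals[(meter_id, day)] += kwh
    return dict(totals)
```

In PySpark the same shape becomes a `groupBy` on meter and date followed by a `sum`, with the results written back for the customer portal to serve.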
Technologies: REST APIs, Docker, Kubernetes, YARN, Apache Kafka, RabbitMQ, Celery, Resque, Redis, HBase, Apache Hive, Spark, Python, Ruby on Rails (RoR), Ruby