Self Employed | 2018 - Present | Independent Machine Learning Consultant
Technologies: Optimization, AWS S3, AWS EC2, Docker, Python, Pandas, Scikit-learn, XGBoost, Solution Architecture, System Architecture
- Built and deployed multiple personalization ML pipelines to lift offer/coupon conversion rates for customers of a major restaurant chain. The application was built on the Azure Databricks platform with business-configurable pipelines for training, tuning, testing, and prediction, using Pandas (Python), Spark (PySpark), sklearn, and Spark ML. Deployed to production using Azure Data Factory.
- Automated, hardened, and deployed multiple ML pipelines on AWS (Elastic MapReduce with Spark and Lambda) to predict next-best-action, forecast performance, and predict/prevent churn for sales representatives of a major corporation. Data processing used Python, PySpark, and Spark SQL. ML models were built using Microsoft ML for Spark.
- Advised a mid-stage startup on the requirements, features, and architecture needed to support ML pipelines in their high-speed stream processing framework and in-memory data grid. Worked directly with the CEO/CTO and senior technical team.
Senior Product Architect, Infosys Nia (Palo Alto) | 2017 - 2018 | Infosys Technologies
Technologies: AWS EC2, AWS EMR, AWS S3, HDFS, MapReduce, YARN, Hadoop, Scikit-learn, Pandas, NumPy, SciPy, Python, R, Linux, Bash, Java, OpenMP, MPI, C++, Machine Learning
- Developed and prioritized the roadmap for the integration of Skytree software into Infosys Nia.
- Trained 100+ Infosys sales leaders, solutions architects, and data scientists on Skytree capabilities, technology, architecture, system requirements, and demos. Also trained enterprise-wide data science teams on ML science and best practices.
- Evangelized the newly acquired ML capabilities to Fortune 500 prospects as well as existing clients.
Co-Founder | 2009 - 2017 | Skytree Inc.
Technologies: Linux, Bash, Java, R, Pandas, NumPy, SciPy, Scikit-learn, Python, AWS EC2, AWS EMR, AWS S3, HDFS, MapReduce, YARN, Hadoop, Apache Spark, OpenMP, MPI, C++
- Worked directly with our Fortune 500 customers to collaboratively build predictive machine learning models and pipelines for fraud detection (for American Express), product and media recommender systems (for Samsung), consumer and SMB credit risk scoring (for American Express and Equifax), lead scoring for premium consumer credit cards (for American Express), balance transfer offer optimization (for Discover), churn prevention (for eHarmony and ShoeDazzle), real estate price prediction (for Brookfield RPS), and many others.
- Led engineering and data science and ultimately moved to technical product management and ownership for Skytree’s flagship product. Led the research and development of Skytree’s high-performance, massively parallel C++ library for tera-scale ML. Implemented (from scratch) mathematically scalable and distributed algorithms for nearest neighbors, random forests, gradient boosted trees, support vector machines, clustering, collaborative filtering, etc. for classification, regression, anomaly detection, and recommender systems. This included many first-of-their-kind innovations in the practical application of ML algorithms to big data.
- Architected Skytree’s flagship Infinity AI platform, including APIs, GUI, and SDKs. The Java-based server coordinated with the underlying multi-tenant big data or cloud infrastructure, managing data, users, and resources, and scheduling jobs (a mix of Apache Spark for data processing and Skytree’s C++ engine for ML). Platform support included Apache Hadoop (YARN and HDFS) from MapR, Hortonworks, and Cloudera, as well as AWS Elastic MapReduce.
- Delivered multiple releases of the full stack of Skytree’s AI software as the product manager for all four technical teams (ML, systems, UI, and data science), including defining and prioritizing the roadmap and coordinating release and development efforts across teams.
- Built world-class engineering (C++/HPC/ML, Java/Systems, and UI) and data science teams. Defined requirements, developed and reviewed screening tests, and finalized candidates.
- Spearheaded the technical sales enablement efforts.
- Supported POCs, pre- and post-sales activities, and renewals through product demos, sales calls, requirements gathering, trade shows, webinars, seminars, and tutorials.
- Trained solutions architects and sales engineers, and owned the technical resources they needed (demos, documentation, guides, questionnaires, etc.).
- Co-authored five patent applications in the areas of ML user experience, recommender systems, and automatic feature engineering.
- Recruited candidates for various other positions, from sales directors to senior leadership (VP of sales, marketing, and engineering).
Graduate Research Assistant | 2007 - 2009 | Georgia Institute of Technology
Technologies: .NET, Microsoft, Microsoft SQL Server, Java, C#
- Integrated algorithmically optimized machine learning methods directly into SQL Server using the .NET platform and C# so that they ran natively inside the database under the purview of the database scheduler.
- Designed innovative disk-based algorithms to piggyback multi-dimensional space trees over database indexes (B-trees) to minimize disk hit rate and optimize cache hit ratio.
- Specialized in computational science and engineering, high-performance computing, and artificial intelligence.
Software Developer (Intern), Analysis Services, SQL Server Team | 2008 - 2008 | Microsoft
Technologies: C#, Microsoft SQL Server
- Integrated advanced ML algorithms, optimized for disk-based I/O, as first-class objects into SQL Server Analysis Services and exposed them through the query interface, thus enabling ML models to run in-database.
Technical Associate | 2005 - 2007 | Trilogy
Technologies: Microsoft SQL Server, Subversion (SVN), Microsoft, Java
- Designed and developed the software for Trilogy's email marketing service, used by clients such as Gateway and Orbitz. The software used segmentation and association rule mining to increase sales, margins, and engagement (email opens and clicks), and integrated demographic, email activity, clickstream, and promotional data, among other sources.
- Executed weekly campaigns that generated millions of targeted emails, measured lift through A/B testing, and reported results in the form of pivot tables and dashboards.