
Vishnu Chevli
Verified Expert in Engineering
Data Science Developer
Surat, Gujarat, India
Toptal member since February 4, 2022
Vishnu, a data scientist at Reddit with 14+ years of experience, has specialized in data science and machine learning for 11+ years. He excels in providing data-driven solutions to diverse stakeholders and has a rich professional background, having worked with renowned companies such as KPIT, General Mills, Cognizant, and Wipro. Vishnu seeks to contribute his expertise to projects encompassing data science, machine learning, analytics, data mining, operational optimization, and logistics design.
Portfolio
Experience
- Data Science - 11 years
- Python - 11 years
- R - 10 years
- Data Analytics - 9 years
- Predictive Modeling - 8 years
- Machine Learning - 8 years
- Data Visualization - 7 years
- Deep Learning - 3 years
Preferred Environment
Python, R, Tableau, RStudio, Tableau Server, SQL, Jupyter Notebook, Microsoft Excel, Microsoft Power BI
The most amazing...
...projects I've developed are machine learning models that automated skilled-labor tasks, saving 14 full-time resources and four days of processing time.
Work Experience
Data Scientist
Reddit, Inc.
- Developed dynamic Mode Analytics dashboards for engineering and product teams, delivering real-time insights, expediting decision-making, and enhancing operational efficiency.
- Analyzed user behavior patterns, deriving actionable insights for product enhancement. Identified key improvement areas and new development opportunities by thoroughly examining user usage patterns.
- Led data-centric A/B testing, ensuring statistical rigor and robust methodologies. Collaborated cross-functionally to prepare, support, and approve tests, influencing data-driven product decisions.
- Investigated and provided insights on incidents, ensuring prompt resolution. Applied advanced analytics to detect pattern shifts, contributing to proactive incident prevention strategies.
- Designed and implemented fact tables on BigQuery, optimizing storage for analytical use. Orchestrated Apache Airflow DAGs, automating workflows for enhanced efficiency and reliability in data processing.
- Led optimization for complex queries, substantially improving space and time performance. Implemented strategies for enhanced database efficiency and reduced latency.
- Supported the user economy and monetizable products from a data science perspective through growth analysis, monitoring, and experimentation using advanced statistical techniques.
- Developed difference-in-differences models to estimate the impact of various event-specific and community-specific awards on overall product awards.
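As a minimal illustration of the difference-in-differences setup above, the classic 2x2 estimator compares the pre/post change in a treated group against the same change in a control group. All numbers below are toy values, not actual Reddit data:

```python
import numpy as np

def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Classic 2x2 difference-in-differences estimator:
    (treated post - treated pre) - (control post - control pre)."""
    return (np.mean(treat_post) - np.mean(treat_pre)) - (
        np.mean(ctrl_post) - np.mean(ctrl_pre))

# Toy example: awards per community before/after an event-specific award launch
treat_pre  = [10, 12, 11]
treat_post = [18, 20, 19]
ctrl_pre   = [9, 11, 10]
ctrl_post  = [12, 13, 14]

effect = did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post)
print(round(effect, 2))  # → 5.0, the lift attributable to the treatment
```

The control group's change (+3) nets out the background trend, so only the excess change in the treated group (+8) is attributed to the award.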
Principal Data Scientist
Venus Jewel
- Constructed predictive machine learning models in Python using scikit-learn for automating rough and polished diamond grading.
- Leveraged data modeling and statistical analysis to identify data-driven solutions within the organization.
- Created reports, dashboards, and decision engines using RStudio Shiny and Tableau to cater to various stakeholders, including an automated MS suggestion system for grading.
- Developed and deployed analytical models utilizing statistical and machine-learning techniques in R and Python.
- Provided mathematical modeling expertise for diamond pricing, considering factors such as demand and supply.
- Designed a customized machine learning class architecture that combines algorithms like XGBoost, gradient boosting, random forests, and support vector machines, optimizing model predictability.
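A minimal sketch of such an averaging-ensemble class, using scikit-learn models as stand-ins (an XGBoost regressor would slot into the model list the same way); the dataset below is synthetic:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.svm import SVR

class BlendedRegressor:
    """Minimal averaging ensemble: fit several base models and
    average their predictions."""
    def __init__(self, models=None):
        self.models = models or [
            GradientBoostingRegressor(random_state=0),
            RandomForestRegressor(n_estimators=50, random_state=0),
            SVR(),
        ]

    def fit(self, X, y):
        for m in self.models:
            m.fit(X, y)
        return self

    def predict(self, X):
        # Average the base models' predictions element-wise
        return np.mean([m.predict(X) for m in self.models], axis=0)

# Toy usage on a synthetic grading-style dataset
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=200)
model = BlendedRegressor().fit(X, y)
preds = model.predict(X[:5])
print(preds.shape)  # (5,)
```

Averaging heterogeneous learners tends to cancel their individual errors, which is the usual motivation for combining boosted trees, forests, and kernel methods.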
Data Scientist
KPIT
- Developed a predictive model in Python using scikit-learn based on supervised and semi-supervised learning for anomaly detection and prediction in engine failures.
- Provided city traffic and community planning analysis based on cell phone data in Python using NLTK and clustering methods.
- Designed a business model for a hybrid and electric vehicle charging station based on telematics data using R and R Shiny.
- Optimized territory unit planning for a smart meter data collection based on geographical data using distance and density-based clustering and optimization techniques.
- Performed vehicle and crew scheduling for a state transportation corporation using resource optimization techniques in Python.
- Analyzed vehicle driving patterns and prepared a driver scorecard from telematics data using R and R Shiny.
- Delivered anomaly detection in utility (gas and electricity) meter data using a rule-based engine and semi-supervised modeling in Python.
- Handled data churning, scraping, and processing for a client in the engineering domain. Processed over 1TB of data to generate business insights using complex statistical analysis.
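The meter anomaly-detection approach above (a rule-based pre-filter followed by a model-based pass) can be sketched as follows; the readings and thresholds are hypothetical, and scikit-learn's IsolationForest stands in for the production model:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical hourly meter readings: mostly normal, a few spikes
rng = np.random.default_rng(1)
normal = rng.normal(loc=50, scale=5, size=(500, 1))
spikes = rng.normal(loc=200, scale=10, size=(5, 1))
readings = np.vstack([normal, spikes])

# Rule-based pre-filter: flag physically implausible values outright
rule_flags = (readings > 150).ravel()

# Model-based pass: IsolationForest scores every reading
clf = IsolationForest(contamination=0.01, random_state=0).fit(readings)
model_flags = clf.predict(readings) == -1  # -1 marks anomalies

flags = rule_flags | model_flags
print(int(flags.sum()))
```

Combining deterministic rules with a learned detector keeps the obvious violations explainable while still catching subtler pattern shifts.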
Programmer Analyst
Cognizant
- Worked as the module lead of a two-member team developing middleware technologies. Acted as a subject matter expert (SME) for end-to-end middleware applications and automated a security and traffic-control system.
- Contributed to the traffic control software for Ikusi (Spain). Designed the integration driver module for the Remote Control Unit—a device to control end devices (traffic activity tracker, variable message panels, weather stations, cameras, etc.).
- Developed the driver integration module for the security system in VC++. Designed the integration driver module to control the CCTV server.
Software Engineer
Wipro
- Worked as an application designer collaborating with client business heads to gauge customer needs. Also served as the single point of contact for handling client escalations and support queries.
- Handled the enhancement of a simulator for an electronic chip-making machine in the embedded domain using MFC (Microsoft Foundation Classes) as technology.
- Contributed to the development of a simulator on C#.NET to integrate the back end, which was developed for a chip-making machine. Implemented communication functionalities between the front and back end based on XML protocols.
Experience
Electric Vehicle Charging Needs
• EV owners need timely charging reminders.
• Fleet managers seek optimized charging operations.
• Charging stations require demand forecasting for better planning.
SOLUTION
• Built a forecasting model to predict EV owners' charging needs based on trip schedules or daily driving patterns.
• Developed a forecasting and optimization model for fleet managers to identify business opportunities and recommend optimal charging strategies.
• Designed a demand forecasting model for charging stations to track utilization, predict future needs, and suggest infrastructure upgrades.
TECHNOLOGY AND METHODS
• Python | XGBoost | Deep Learning | Hybrid Forecasting
• Metrics: RMSE, MAPE for accuracy evaluation
IMPACT
• Reduced charge anxiety for EV owners.
• Optimized fleet charging, cutting costs and improving efficiency.
• Enabled better resource planning for charging stations.
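The accuracy metrics named above, RMSE and MAPE, can be computed directly; the forecast values below are toy numbers:

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean squared error: penalizes large misses quadratically."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((a - f) ** 2)))

def mape(actual, forecast):
    """Mean absolute percentage error; assumes no zero actuals."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs((a - f) / a)) * 100)

# Toy daily charging-demand forecast (kWh)
actual   = [100, 120, 80, 90]
forecast = [110, 115, 85, 95]
print(round(rmse(actual, forecast), 2), round(mape(actual, forecast), 2))
# → 6.61 6.49
```

RMSE is in the units of the target (kWh here), while MAPE is scale-free, which is why the two are often reported together.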
Enhanced Product Search with LLM-powered Matching and Scalable Deployment
• Language model and sentence encoder: Prepared a language model and sentence encoder-based solution to enhance the precision of product matching. Combined cosine similarity with business logic to improve the base solution. Leveraged pgvector to store product embeddings and used required indices to optimize the performance of the DB, considering over 50 million product embeddings.
• Internal UI for model validation: Built an internal Streamlit-based web application to validate matched product results. Provided a solution for a feedback mechanism for continuous model improvement.
• AWS EC2 deployment with REST APIs: Developed and deployed REST APIs using Flask on AWS EC2 to manage logins, process queries, register new products, and gather feedback.
Key results from implementing this solution:
• Improved the search experience: empty search results were reduced by 50%.
• Increased customer engagement: the click rate rose by 10%.
• Delivered measurable business impact: the order rate improved by 3-4%.
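The embedding-matching step described above can be sketched in memory with plain NumPy; in the actual system the nearest-neighbor lookup ran in Postgres via pgvector indexes, and the embeddings below are random stand-ins for sentence-encoder output:

```python
import numpy as np

def top_matches(query_vec, product_vecs, k=3):
    """Rank products by cosine similarity to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    P = product_vecs / np.linalg.norm(product_vecs, axis=1, keepdims=True)
    sims = P @ q                      # cosine similarity per product
    order = np.argsort(-sims)[:k]     # indices of the k best matches
    return list(zip(order.tolist(), sims[order].tolist()))

rng = np.random.default_rng(2)
products = rng.normal(size=(1000, 64))                   # stand-in embeddings
query = products[42] + rng.normal(scale=0.05, size=64)   # near-duplicate of item 42
print(top_matches(query, products, k=1)[0][0])  # → 42
```

At the 50-million-embedding scale described above, the normalization and dot products move into the database, with an approximate index (e.g., pgvector's IVFFlat or HNSW) replacing the exhaustive scan.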
Data-driven Insights: Tableau Visualization and User-friendly Business Survey Results
In addition to creating the visualization template, I managed a detailed document outlining the process of regenerating the dashboard. This document captures all the steps involved in creating and updating the dashboard, including data sources, data transformations, and any calculations or visual elements used. The documentation serves as a comprehensive guide for future reference, ensuring that the dashboard can be easily replicated or modified as needed.
By combining my expertise in Tableau with the business survey results, I delivered a robust and user-friendly solution for visualizing and understanding the data. The detailed documentation I provided ensures that the dashboard can be maintained and improved over time, enabling stakeholders to make informed decisions based on the survey findings.
Transforming Surveys: Leveraging Telematics for Enhanced Planning and Visualization
PROJECT DELIVERABLES
• Identifying points of interest like residential, commercial, and industrial spots.
• Planning of new roads and traffic signals for the given territory.
• Planning of means of transportation and stations to serve population needs.
• Planning of commercial points like shopping complexes, billboards, and service centers.
• Optimizing current transportation services crew and vehicles.
• Visualizing and creating dashboards on Shiny and Tableau.
Rain Gauge Total Prediction: Python Model Soars to 3rd Place on Kaggle
The challenge was to generate a probabilistic distribution of the hourly rain gauge total using the provided polarimetric data for various variables over a span of 15 days.
SOLUTION
To address this, a Python-based predictive model was developed. The model placed 3rd on the Kaggle leaderboard, demonstrating its accuracy on the problem.
Microsoft Malware Classification Challenge | Malware Classification: Accurate Family Detection
https://github.com/vrajs5/Microsoft-Malware-Classification-Challenge
The task was to classify a collection of known malware files spanning nine distinct families. The volume of uncompressed data amounted to a substantial 0.5TB (500GB).
SOLUTION
To address this challenge, byte-wise frequency counts were calculated from the malware files, capturing how often each byte value occurred in each file. This involved analyzing the binary representation of the files and deriving statistical features from byte-level patterns.
A classification model was then trained on these byte-wise frequency counts, using this statistical representation to assign malware files to their respective families. The approach showed that large-scale malware datasets can be analyzed and classified effectively.
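A minimal sketch of the byte-wise frequency features; the two "files" below are fabricated toy blobs, not real malware:

```python
import numpy as np

def byte_histogram(blob: bytes) -> np.ndarray:
    """256-bin frequency count of byte values, normalized to proportions."""
    counts = np.bincount(np.frombuffer(blob, dtype=np.uint8), minlength=256)
    return counts / max(len(blob), 1)

# Toy 'files' from two hypothetical families
fam_a = bytes([0x90] * 80 + [0x00] * 20)   # NOP-heavy
fam_b = bytes([0x00] * 80 + [0xFF] * 20)   # zero-padded
ha, hb = byte_histogram(fam_a), byte_histogram(fam_b)
print(round(float(ha[0x90]), 2), round(float(hb[0x00]), 2))  # → 0.8 0.8
```

Each file collapses to a fixed-length 256-dimensional vector regardless of its size, which is what makes this representation tractable on a 0.5TB corpus.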
Telematic Fingerprinting: Accurately Identifying Drivers Through Predictive Modeling
The challenge was to develop a telematic fingerprint that could accurately identify instances when a specific driver conducted a trip.
SOLUTION
To address this problem, a predictive model was created, leveraging advanced techniques in machine learning and data analysis. The model was designed to establish a unique signature for each driver based on their individual driving behavior. By analyzing various telematic data such as speed, acceleration, braking patterns, and other driving characteristics, the model could effectively differentiate and identify the driving style of a particular driver. This solution enabled the precise identification of driver-specific trips, contributing to enhanced monitoring and analysis in the field of telematics.
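A toy sketch of the fingerprinting idea: summarize each trip into behavioral features and train a classifier to tell drivers apart. The two synthetic drivers and their speed profiles below are invented for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def trip_features(speed):
    """Per-trip summary: mean/std speed, mean acceleration, hard-brake rate."""
    speed = np.asarray(speed, float)
    accel = np.diff(speed)
    return [speed.mean(), speed.std(), accel.mean(), (accel < -3).mean()]

rng = np.random.default_rng(3)
# Two synthetic drivers with distinct styles: calm vs. aggressive
X, y = [], []
for label, (mu, sigma) in enumerate([(60, 2), (80, 8)]):
    for _ in range(50):
        speed = rng.normal(mu, sigma, size=120)
        X.append(trip_features(speed))
        y.append(label)

clf = RandomForestClassifier(random_state=0).fit(X, y)
test_trip = trip_features(rng.normal(60, 2, size=120))
print(clf.predict([test_trip])[0])  # → 0, the calm driver
```

Real telematic fingerprints would use far richer features (braking patterns, cornering, trip timing), but the pipeline shape of raw signal, feature summary, classifier is the same.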
Efficient Milk Run Routing: Excel-based Tool for Cost Reduction and Delivery Optimization
The challenge involved designing a generic Excel-based tool capable of replicating the milk run routing, a transportation model that utilizes mixed-integer linear programming.
SOLUTION
To address this problem, a comprehensive solution was developed. The tool incorporated advanced algorithms and optimization techniques to optimize the routing process and achieve the following deliverables:
1. Reduction in transportation costs by determining the most efficient routes for delivering goods.
2. Improvement in promise delivery by enhancing the predictability and reliability of goods reaching customers within specified timeframes.
3. Decrease in truckload transportation by optimizing the allocation and utilization of available resources, resulting in more efficient transportation operations.
By leveraging this Excel-based tool, businesses could benefit from cost savings, improved delivery performance, and optimized resource allocation in their milk run routing processes.
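The mixed-integer formulation can be illustrated with a tiny set-covering variant: choose the cheapest combination of candidate routes that serves every stop. The routes and costs below are hypothetical, and SciPy's `milp` stands in for the Excel-based solver described above:

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp

# Hypothetical candidate milk-run routes, each covering a subset of 4 stops
# rows = routes, cols = stops; 1 if the route visits the stop
cover = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
])
cost = np.array([4.0, 4.0, 5.0, 5.0, 9.5])  # per-route transport cost

# Each stop must be served by at least one chosen route; decisions are binary
constraint = LinearConstraint(cover.T, lb=1, ub=np.inf)
res = milp(c=cost, constraints=constraint,
           integrality=np.ones(len(cost)), bounds=Bounds(0, 1))
chosen = np.flatnonzero(res.x > 0.5)
print(chosen.tolist(), res.fun)  # → [0, 1] 8.0
```

The solver picks routes 0 and 1 (total cost 8.0) over the single all-stops route (9.5), which is exactly the cost-versus-coverage trade-off the tool automated.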
Driving Efficiency: Optimizing Routes, Vehicles, and Crew for Fleet Management Success
I led the development and implementation of diverse optimization techniques for a fleet management organization. Solutions delivered included:
1. Route Optimization: Employed advanced techniques to optimize routes for scheduled trips, ensuring efficient and cost-effective transportation services for corporate clients.
2. Vehicle Optimization: Utilized sophisticated models to optimize vehicle allocation for inter-state and inter-city transportation, maximizing resource utilization and minimizing costs.
3. Intracity Shuttle Optimization: Developed strategies to optimize the deployment of vehicles for intracity shuttle services, improving service reliability and reducing operational expenses.
4. Crew Optimization: Designed an advanced solution for crew allocation, considering labor laws and operational demands, resulting in improved productivity and compliance.
These solutions revolutionized the organization's operations, enhancing route planning, vehicle allocation, and crew management. The outcomes included improved efficiency, cost reduction, and better compliance with labor regulations.
Virtual Stock Trader: A Dynamic Stock Trading Game with Interactive Panels and Visualizations
• Admin panel
• Broker panel
• Player dashboards
• Technological implementations
To bring this game to life, I employed HTML, PHP, and CSS to craft the user interface. I designed the database using MySQL and employed PHP-Flash plugins to visualize various values.
Insightful Data Pipeline: Python-based Scraping and Summarization with XML Parsing and CSV Output
There was a need to prepare a data pipeline for scraping business-relevant insights from various sources.
SOLUTION
I developed a Python-based module that efficiently parses chunks of XML files and extracts the necessary information, which is then summarized and written to CSV files.
TECHNOLOGIES
The solution used Python for programming, XML parsing techniques to extract data, CSV writing to structure the output, and libraries such as Pandas and NumPy for statistical analysis and data scraping.
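A stdlib-only sketch of one pipeline step, assuming the summarized output is CSV; the XML schema and field names are hypothetical:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML chunk of the kind the pipeline ingested
xml_chunk = """
<records>
  <record><name>Widget A</name><units>120</units></record>
  <record><name>Widget B</name><units>80</units></record>
</records>
"""

def xml_to_csv(xml_text: str) -> str:
    """Parse an XML chunk and emit its records as a CSV string."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["name", "units"])
    for rec in root.iter("record"):
        writer.writerow([rec.findtext("name"), rec.findtext("units")])
    return out.getvalue()

print(xml_to_csv(xml_chunk))
```

For very large files, `ET.iterparse` would replace `fromstring` so each chunk streams through without loading the whole document into memory.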
Dynamic Financial and Operational Analysis: Hospital Dashboards for Informed Decision-making
• Developed dynamic reports at the cost-profit level for various stakeholders.
• Analyzed segments and departments, including OPD, operatives, pharmacy, and pathology.
• Conducted specialty analysis, including cardiology, orthopedics, gynecology, and more.
• Examined business segments, such as cashless, cash paid, and corporate.
I focused on the hospital's operational dashboards:
• Monitored segment KPIs like occupancy rate, expenses, and length of stay.
• Conducted material consumption analysis for IPD and pathology.
• Created staff schedules and presence dashboards.
• Performed customer feedback and drill-down analysis.
DATA MANAGEMENT
• Utilized secured data connections over SQL and NoSQL databases to access the required data for the analysis.
VISUALIZATION TOOLS USAGE
• Leveraged Tableau and QlikView's powerful features to create interactive and visually appealing dashboards for financial analysis, operational metrics, and stakeholder reports.
• These visualization tools added an extra layer of insight and interactivity to the project, enabling stakeholders to make data-driven decisions effectively.
Revolutionizing Healthcare Staffing Predictions: Python, Snowflake-Snowpark, and Streamlit
Data Analyst Work
1. Normalized data based on business requirements.
2. Conducted data sanity checks.
3. Prepared a financial decline curve.
4. Documented intermediate steps for client debugging.
Education
Postgraduate Diploma in Industrial Engineering in Operations and Supply Chain Management
Indian Institute of Management Mumbai (formerly known as NITIE Mumbai) - Mumbai, Maharashtra, India
Bachelor of Technology Degree in Computer Engineering
Sardar Vallabhbhai National Institute of Technology - Surat, Gujarat, India
Certifications
Python for Time Series Data Analysis
Udemy
The Complete Google BigQuery Masterclass: Beginner to Expert
Udemy
Power BI Masterclass from Scratch
Udemy
Digital Twins: Enhancing Model-based Design with AR, VR and MR
University of Oxford
Tableau 20 Advanced Training | Master Tableau in Data Science
Udemy
Complete Course on Product A/B Testing
Udemy
Tableau 2020 A-Z | Hands-on Tableau Training for Data Science
Udemy
AWS Redshift | A Comprehensive Guide
Udemy
Optimization with Python | Solve Operations Research Problems
Udemy
Natural Language Processing (NLP) with Python
Udemy
Pytorch for Deep Learning and Computer Vision
Udemy
Data Analysis and Statistical Inference
Duke University via Coursera.org
Data Science Specialization
Johns Hopkins University via Coursera.org
Six Sigma Green Belt
RABQSA
Skills
Libraries/APIs
Scikit-learn, NumPy, XGBoost, Pandas, PyTorch, Natural Language Toolkit (NLTK), SpaCy, Matplotlib, TensorFlow, Keras, OpenCV, CatBoost, REST APIs, LSTM
Tools
Tableau, Microsoft Excel, Microsoft Power BI, Tableau Desktop Pro, Scikit-image, Excel 2013, Jupyter, Qlik Sense, Power Pivot, Spreadsheets, GitHub, BigQuery, Plotly, Amazon SageMaker, Git, GitLab, Power Query, Apache Airflow, ChatGPT, Stitch Data
Languages
Python, R, SQL, Python 3, XML, SQL DDL, SQL DML, Data Manipulation Language (DML), CSS, Markdown, Snowflake
Paradigms
ETL, Linear Programming, Data-informed Visual Design, B2B, Constraint Programming, Dynamic Programming
Storage
PostgreSQL, MySQL, Data Pipelines, Microsoft SQL Server, Data Definition Languages (DDL), NoSQL, MongoDB, Redshift
Frameworks
RStudio Shiny, Streamlit, Flask
Platforms
RStudio, Jupyter Notebook, QlikView, Hex, Oracle Database, Amazon Web Services (AWS), Amazon EC2
Other
Optimization, Statistics, Machine Learning, Data Science, Predictive Modeling, Data Analytics, Feature Analysis, Data Analysis, Mixed-integer Linear Programming, Linear Optimization, Computer Vision, Analytics, Reporting, Dashboards, Data Visualization, Supervised Learning, Supervised Machine Learning, Natural Language Processing (NLP), Data Modeling, Operations Research, Time Series, Statistical Forecasting, Statistical Modeling, Time Series Analysis, Random Forest Regression, K-means Clustering, Clustering Algorithms, Random Forests, Multivariate Statistical Modeling, CSV, A/B Testing, Generative Pre-trained Transformers (GPT), Statistical Analysis, Data Cleaning, Regression, Classification Algorithms, Data Scientist, Classification, Predictive Analytics, Data Transformation, Decision Tree Classification, Data-driven Dashboards, Exploratory Data Analysis, Causal Inference, Unstructured Data Analysis, CSV File Processing, Unsupervised Learning, Microsoft Excel, SQL, Logistic Regression, Data Cleansing, Regression Modeling, Machine Learning Algorithms, Fine-tuning, Data Processing, RMSE, MAPE, ETL Tools, Data Engineering, Complex Data Analysis, Generalized Linear Model (GLM), Gusek, Leadership, Team Leadership, Clustering, Statistical Data Analysis, Deep Learning, Neural Networks, APIs, Forecasting, Trend Forecasting, Support Vector Machines (SVM), Gradient Boosting, Dimensionality Reduction, Support Vector Regression, Gradient Boosted Trees, K-nearest Neighbors (KNN), Computer Vision Algorithms, PuLP, Integer Programming, Pivot Tables, Artificial Intelligence (AI), Data Manipulation, Data Reporting, Real-time Data, Data Management, Decision Trees, Text Classification, Mode Analytics, Google BigQuery, Scheduling, Decision Modeling, Excel 365, Programming, Integration, Large Data Sets, Data Gathering, Office 365, Logistics, Inventory Management, Big Data, Algorithms, Large Language Models (LLMs), Technical Leadership, Spreadsheets, Labeling, Text Recognition, Supply Chain Management (SCM), Market Research & Analysis, Consumer Behavior, Churn Analysis, Funnel Analysis, User Journeys, Optimization Algorithms, Path Optimization, Route Optimization, Product Analytics, Trend Analysis, Model Tuning, Tuning, Data Structures, Vehicle Routing Problem (VRP), Vector Databases, Tableau Server, Geospatial Data, Geospatial Analytics, Business Analysis, Convolutional Neural Networks (CNNs), Text Analytics, Semantic Analysis, Topic Modeling, Pyomo, Deep Neural Networks (DNNs), Image Recognition, P&L Forecasting, Web Scraping, Data Scraping, Amazon Redshift, Recommendation Systems, Data-informed Recommendations, Digital Twin, User-defined Functions (UDF), DAX, Supply Chain Optimization, Demand Planning, Data Matching, Financial Modeling, Embeddings from Language Models (ELMo), Vector Data, Pgvector, Cohort Analysis, FastAPI, Data Curation, Web Development, Scraping, Recurrent Neural Networks (RNNs)