Shriyansh Agrawal
Verified Expert in Engineering
Artificial Intelligence (AI) Developer
New Delhi, Delhi, India
Toptal member since March 31, 2022
Shriyansh is a developer who loves designing solutions to real-world challenges using computer science for the greater good. He has worked remotely with many open source communities and companies and has presented his work at international conferences and meetups. Several of Shriyansh's projects have received prestigious awards.
Portfolio
Experience
- Python - 8 years
- Amazon Web Services (AWS) - 6 years
- Big Data - 6 years
- Artificial Intelligence (AI) - 6 years
- Data Science - 6 years
- Bash Script - 5 years
- Data Engineering - 5 years
Preferred Environment
PyCharm, Python, Big Data, Amazon Web Services (AWS), ETL, Artificial Intelligence (AI), Business Intelligence (BI), Data Engineering, Machine Learning, Data Science, REST APIs, Machine Learning Operations (MLOps), Data Pipelines, Full-stack, Software Development
The most amazing...
...work I've done is an AI prediction model for the US housing market that achieved competitive accuracy and generated a significant market impact.
Work Experience
Data Scientist
Moss & Associates - Main
- Created indexes and large language models (LLMs) specialized in distinct areas of Moss projects, ensuring they can provide expert-level knowledge and bot assistance.
- Incorporated AI assistants acting as subject matter experts (SMEs) for various functions and domains within MossOps to offer precise and contextually relevant support.
- Embedded the Knowledge Base and AI assistants within the system’s chat interface, ensuring easy access and interaction for users seeking information or assistance.
- Designed a Databricks Vector Search database based on OpenAI embeddings over the company's unstructured and structured data.
- Developed multiple ETLs with automated orchestration workflows to feed in data from various internal and third-party data sources.
- Ensured the system's architecture is scalable and maintainable, allowing for easy updates, expansions, and integration of new knowledge areas and functionalities.
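The retrieval step behind such embedding-backed AI assistants can be sketched as a cosine-similarity search over document vectors. This is a minimal illustration with made-up 3-dimensional vectors standing in for real OpenAI embeddings; the production system used Databricks Vector Search:

```python
import numpy as np

def top_k_similar(query_vec, doc_vecs, k=2):
    """Return indices of the k documents whose embeddings are most
    cosine-similar to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q  # cosine similarity per document
    return np.argsort(scores)[::-1][:k].tolist()

# Toy "embeddings" standing in for real OpenAI vectors.
docs = np.array([
    [1.0, 0.0, 0.0],   # doc 0
    [0.9, 0.1, 0.0],   # doc 1, close to doc 0
    [0.0, 0.0, 1.0],   # doc 2, unrelated
])
query = np.array([1.0, 0.05, 0.0])
print(top_k_similar(query, docs))  # → [0, 1]
```

A vector database such as Databricks Vector Search performs this same ranking at scale with approximate-nearest-neighbor indexes rather than a brute-force matrix product.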
Plone Expert (via Toptal)
Office Information Systems C-Corp
- Integrated an MFA app (Google Authenticator) into the client's financial website, adding a secure layer over the client's financial information in the app.
- Designed bulk data import and export functionality to facilitate bulk changes for the non-technical administrator of the client's app.
- Oversaw the production and staging layer of the app and servers to improve the CI/CD layer for other developers.
Data Scientist | Data Engineer (via Toptal)
Cal.net, Inc.
- Designed a data lake to store data points of millions of houses in California for business development.
- Incorporated AI assistants acting as subject matter experts to offer precise and contextually relevant GIS support.
- Implemented processes for continuous updates and enhancements in a data lake using a Directed Acyclic Graph (DAG)-based system.
- Developed big data ETL pipelines to ingest versioned geographical data of California from multiple providers into a data lake. Currently, this lake consists of 28 million location rows, each with 100 feature columns.
- Designed an algorithmic approach to deduplicate locations ingested from various location providers, assigning a unique serializable hash ID so analysts can view a single set of unique locations across California.
- Contributed to various regulatory ISP compliance submissions, saving the company millions of dollars within a constrained timeframe.
- Helped the company analyze potential geographies for business expansion and win broadband connectivity grants from regulators.
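The deduplication idea above can be sketched as normalizing each provider's address fields and hashing the canonical form, so the same physical location always maps to one stable ID. Field names and normalization rules here are hypothetical placeholders, not the production logic:

```python
import hashlib

def location_id(address: str, city: str, state: str, zip_code: str) -> str:
    """Build a stable, serializable hash ID from normalized address parts.
    Records from different providers that normalize to the same canonical
    string collapse to the same ID."""
    canonical = "|".join(
        part.strip().lower() for part in (address, city, state, zip_code)
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Two providers with slightly different formatting, same physical location:
a = location_id("123 Main St", "Sacramento", "CA", "95814")
b = location_id("  123 MAIN ST ", "sacramento", "ca", "95814")
print(a == b)  # → True
```

Because the ID is a pure function of the normalized fields, it stays stable across re-ingestions and can be joined on directly by analysts.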
Senior Software Engineer
Graviton Research Capital
- Developed an in-house solution for controlling distributed jobs over 1,000 servers based on a parent-child dependency graph of 1 million nodes, linked through 1 billion job input/output files.
- Built prompt-based bot services that provide expert-level knowledge and assistance on individual job nodes.
- Integrated skilled AI assistants to offer precise and contextually relevant support to users.
- Worked on the CI/CD pipelines with abstracted Jinja templates for complete prod-orchestration based on the config file.
- Scheduled remote jobs based on their parent completion checks.
- Contributed to a real-time logging, alerting, and monitoring dashboard for deployed jobs.
- Worked on the real-time Sentinel stream over SSH, serving restricted access control of private servers.
- Communicated with multiple trader teams to bring underlying data into a common format for daily reconciliation. This includes constructing ETL pipelines with event triggers to calculate various post-trade metrics.
- Revolutionized traders' workflows by allowing them to focus solely on strategy implementation while my services handled everything else. I conducted rigorous monitoring and manual checks to ensure flawless, fault-free performance.
- Deployed a live P&L dashboard refreshed on daily data influx, with access control. Deployed an ETL pipeline, API gateway, and PyPI shell command to make this system user-interactive. This included automation on the cloud, log monitoring, and alerts.
Business Intelligence Expert
Reward Gateway
- Developed a BI dashboard on Sisense using data from a Big Data cluster served via MariaDB.
- Built a dashboard that serves real-time and historical analytics with visual graphs and interactive filters.
- Profiled older SQL queries and improved their performance 12-fold using industry best practices.
- Tracked issues via Jira, handling 221 of them: 112 older tickets and 109 new ones assigned to me.
- Served two sub-clients, Deutsche Bank and Ericsson, and delivered all requested use cases and functionalities.
Senior Data Scientist / ML Engineer
Fello
- Developed an automated valuation model (AVM) to predict the sales price of homes in the US. Led the project since its inception, resulting in the team's expansion from one to five members over time.
- Generated point-in-time sales price predictions for seven million houses in Ohio with accuracy competitive with market leaders like Zillow, who spent millions of dollars to achieve similar benchmarks over the past few years.
- Connected with different real estate data providers to assess their datasets for quality and correctness and to check our models' applicability on AWS infrastructure centered around SageMaker.
- Worked on the explainability of this AI model by generating nearby similar houses with similar ranks to explain and convince end-users why their home is marked at a specific price band.
- Designed, built, and implemented deep learning-based QA Chatbots and search engines for sales and marketing teams. These bots were generated using in-house data and trained LLM models from OpenAI with GPT-3.
- These chatbots provide prompt responses to varied stakeholders, including procurement, refurbishing, and valuation teams. They also interpret satellite images to use in our AI model for conditional price adjustments.
- Generated historical housing trends based on individual geography to help users understand the market. Continued an ongoing project for market trend forecasting to help online buyers with unforeseen investment ROI.
- Built market trend insights on Amazon QuickSight and Mode.com. These trends include geographical segregation with restrictive access control and are built so that project managers or non-technical clients can iterate on their own.
- Optimized the runtime complexity of AWS QuickSight severalfold and laid the foundation for real-time streamed insights from data. These BI charts were highly interactive and user-friendly.
Machine Learning Engineer
Fourkites
- Developed a Kafka-streamed, Spark ETL pipeline for big data processing in Hadoop clusters to produce AI prediction metrics over various geographies on Grafana.
- Designed a feature layer over an AWS data lake to ease infra consumption by individual ML models.
- Orchestrated automation and alerts using Airflow.
- Eased the coupling between training and production infra using TensorFlow Serving.
- Tested code in UAT and staging environment before pushing it for production.
- Performed exploratory data analysis (EDA) for business development and insights using pandas.
- Developed microservices with API integrations for seamless SaaS operations on the ROR platform.
- Built insights dashboards on Sisense and Grafana, designed so that project managers, non-technical clients, and clients' end users would find them easy to use. My major contribution was reducing SQL query time 10-fold.
Open Source Contributor
Plone
- Developed a Plone add-on named collective.ifttt, which acts as a webhook integrated with IFTTT services to allow automatic exchange of information between platforms. For example, when news was published on the site, the add-on would automatically tweet about it.
- Awarded #1 Plone add-on of 2018 in the annual conference of Plone held in Tokyo for the collective.ifttt add-on.
- Developed another Plone add-on named plone.importexport, which deserializes Zope data into a human-readable file system format to assist non-technical users with CRUD operations on data, such as import and export.
- Oversaw upgrades of the plone.importexport add-on in the following years so it could serve as a core component of Plone, a Python-based CMS platform.
Software Developer
FairShuffle
- Designed a client-side graphics rendering framework for online card games with real-time updates.
- Performed asynchronous rendering of multiple layers without any observable delay. The primary challenge was adhering to multiple themes based on configurations provided via Adobe Photoshop.
- Highlighted multiple "wow" effects in the game to attract user attention.
Experience
SalePrice Predictions and Chatbots for the US Housing Market
https://hifello.com/
The chief elements of this project were:
• Connecting with different real estate data providers to assess their datasets for quality, correctness, and applicability to our models.
• Ensuring a high return on the funding provided, as a startup funded the project.
• Achieving prediction accuracy competitive with market leaders like Zillow, who have spent millions on the same problems.
• Producing LLM-based chatbots for prompt responses to non-technical stakeholders.
• Explaining predicted prices to end users by finding similar recent sales in the nearby region from big data.
• Generating market trend data alongside price forecasting to help investors project their ROI.
• Producing the historical prices of all houses in certain states to demonstrate market trends.
• Generating all these predictions in real time, dealing with over ten million houses in a single state.
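The explainability approach above, justifying a predicted price via similar recent sales nearby, can be sketched as a nearest-neighbour lookup over location and living area. The toy data, feature choice, and distance weighting here are invented for illustration, not the production model:

```python
import math

def comparable_sales(target, sales, k=2):
    """Rank recent sales by similarity to the target house, combining
    geographic distance with a penalty for differing living area."""
    def score(s):
        geo = math.hypot(s["lat"] - target["lat"], s["lon"] - target["lon"])
        area = abs(s["sqft"] - target["sqft"]) / 1000.0  # hypothetical weight
        return geo + area
    return sorted(sales, key=score)[:k]

target = {"lat": 40.0, "lon": -83.0, "sqft": 1500}
sales = [
    {"id": "A", "lat": 40.01, "lon": -83.01, "sqft": 1450, "price": 310_000},
    {"id": "B", "lat": 40.50, "lon": -83.50, "sqft": 1500, "price": 295_000},
    {"id": "C", "lat": 40.00, "lon": -83.00, "sqft": 3200, "price": 550_000},
]
comps = comparable_sales(target, sales)
print([c["id"] for c in comps])  # → ['A', 'B']
```

Showing end users the prices of the top-ranked comparables gives them a concrete anchor for why their home landed in a particular price band.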
P&L Dashboard for an Algorithmic Trading Firm
Real-time stats were served using Kafka integrations, and historical charts were generated using the ETL pipeline, where data was extracted from AWS cloud storage.
These dashboards are integrated with trading systems, where strategies and users can interact via CLI and API.
Feature Layer over Data Lake
https://www.fourkites.com/
I designed a feature layer over the firm's data lake, where the requirements of all ML models were examined to serve a common feature layer, reducing the firm's infrastructure consumption 14-fold.
The challenge was to harmonize features without disrupting any ML service running in production.
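The feature-layer idea above can be sketched as a shared registry that computes each feature once per entity and serves every model from the same cache, instead of each model recomputing its own copy. Class and feature names here are hypothetical:

```python
class FeatureLayer:
    """Shared feature store: each feature is computed once per entity
    and reused by every ML model that requests it."""
    def __init__(self):
        self._definitions = {}   # feature name -> compute function
        self._cache = {}         # (feature, entity) -> cached value
        self.compute_count = 0   # how many real computations happened

    def register(self, name, fn):
        self._definitions[name] = fn

    def get(self, name, entity):
        key = (name, entity)
        if key not in self._cache:
            self.compute_count += 1
            self._cache[key] = self._definitions[name](entity)
        return self._cache[key]

layer = FeatureLayer()
layer.register("shipment_count", lambda entity: len(entity) * 3)  # dummy feature

# Two models request the same feature of the same entity; it is computed once.
model_a = layer.get("shipment_count", "carrier-42")
model_b = layer.get("shipment_count", "carrier-42")
print(model_a == model_b, layer.compute_count)  # → True 1
```

The infrastructure savings come from exactly this sharing: N models needing the same feature trigger one computation instead of N.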
Process Controller for Distributed Jobs over In-house Data Center
https://ripple.gravitontrading.com/
Every one of the following services was designed and developed from scratch using state-of-the-art technologies such as Python, Bash, psql, and C++. Additional functionalities of this project included:
• CI/CD pipelines with abstracted Jinja templates for complete prod-orchestration based on the config file.
• A real-time logging, alerting, and monitoring dashboard for deployed jobs.
• Scheduling remote jobs based on their parent completion checks.
• A real-time Sentinel stream over SSH serving restricted access control of private servers.
Skills
Libraries/APIs
Pandas, REST APIs, Interactive Brokers API, Plotly.js, GitHub API, API Development, Scikit-learn, React, Node.js, PyTorch, Social Media APIs, Amazon API, Twilio API, Google Maps API, D3.js, Vue, Gmail API, PySpark, TensorFlow Deep Learning Library (TFLearn), NumPy, Asyncio, Fabric, TensorFlow, Luigi
Tools
PyCharm, Amazon SageMaker, GitLab, Plotly, GitHub, Git, LaTeX, Microsoft Power BI, Microsoft Excel, Slack, Grafana, Amazon QuickSight, AWS Glue, Tableau, Atlassian, BigQuery, DataGrip, Spotfire, GIS, ChatGPT, Ansible, Terraform, RabbitMQ, Solr, AWS SDK, Google Analytics, Apache Airflow, Sisense, AWS CLI, GitLab CI/CD, Jenkins, PyPI, SAS Business Intelligence (BI), Kibana, Kafka Streams
Languages
Python, Bash Script, Python 3, SQL, JavaScript, R, HTML, CSS, HTML5, Ruby, Scala, Pine Script, Rust, Go, Unicorn, GraphQL, C++, C Shell, Java
Frameworks
Flask, Spark, Django, Apache Spark, Hadoop, Ruby on Rails 4, Sphinx Documentation Generator, Plone, Jinja
Paradigms
ETL, Automation, REST, Good Clinical Practice (GCP), ETL Implementation & Design, B2B, Business Intelligence (BI), Object-relational Mapping (ORM), DevOps, Acceptance Testing
Platforms
MacOS, Linux, Apache Kafka, Amazon Web Services (AWS), Jupyter Notebook, Docker, Unix, Kubernetes, HubSpot, Azure, AWS Lambda, Amazon, Google Cloud Platform (GCP), Databricks, IFTTT, Blockchain, Jet Admin, Twilio
Storage
PostgreSQL, MongoDB, PostgreSQL 10, Amazon S3 (AWS S3), Data Pipelines, MySQL, Elasticsearch, Redshift, Data Lakes, Data Lake Design, AWS Data Pipeline Service, NoSQL, Amazon DynamoDB, Graph Databases, Apache Hive, Google Cloud, Database Administration (DBA), MariaDB, Neo4j
Industry Expertise
High-frequency Trading (HFT), Trading Systems
Other
Telegram Bots, Big Data, Data Engineering, Data Science, APIs, CI/CD Pipelines, Web Scraping, Data Reporting, Data Analytics, ETL Development, Data Architecture, Data Analysis, Datasets, Data Collection, Machine Learning Operations (MLOps), Real-World Evidence, API Integration, Automation Scripting, Large-scale Projects, Analytical Dashboards, Software Development, Data Processing, Multithreading, Software Architecture, Data Entry, Automated Data Flows, Business Analysis, Analytics, Data Manipulation, Reports, FastAPI, Finance, API Design, Debugging, Troubleshooting, Mathematics, Trading, Artificial Intelligence (AI), Business Solutions, Machine Learning, Data Modeling, Data Visualization, Natural Language Processing (NLP), Serverless, Data Warehousing, Data Warehouse Design, Data Quality, Data Cleaning, Data Matching, Cloud Architecture, Time Series Analysis, Forecasting, Amazon Machine Learning, Statistical Analysis, Model Development, Classification Algorithms, Mobile Games, Predictive Modeling, Statistics, Dashboards, Computer Vision, Financial Forecasting, Marketing Attribution, Business to Business (B2B), Full-stack, Data Scraping, Geographic Information Systems, GeoPandas, Geospatial Data, Real-time Data, Data Management, Data Governance, Architecture, Algorithms, Pricing Models, Data-driven Marketing, Back-end, Expert Systems, Decision Trees, Decision Modeling, Data-driven Decision-making, Chatbots, User Interface (UI), eCommerce APIs, Extensions, eCommerce, GPU Computing, Web Development, Scraping, Amazon Marketplace, Endpoint Security, Minimum Viable Product (MVP), Time Series, Reinforcement Learning, Integration, Algorithmic Trading, Frameworks, Quotations, Large Language Models (LLMs), Generative Artificial Intelligence (GenAI), Leadership, System Design, Single Sign-on (SSO), OpenAI, GitHub Actions, Cryptography, Full-stack Development, Environmental, Social, and Governance (ESG), Semantic Search, Embeddings from Language Models (ELMo), Discord, Technical Leadership, User Stories, Marketplaces, HubSpot CRM, Cryptocurrency, Backtesting Trading Strategies, Monitoring, OpenTelemetry, Azure Databricks, Azure Data Factory, Prompt Engineering, Digital Solutions, Marketing Technology (MarTech), B2C Marketing, Client Success, EDA, Servers, Data Aggregation, Data Recovery, Valuation, Webhooks, PEP 8, Cloud Data Fusion, Real-time Business Intelligence, Grafana 2, Google BigQuery, Deep Learning, Language Models, Reporting, BI Reports, BI Reporting, Marketing Mix Modeling, Generative Pre-trained Transformers (GPT), OpenAI GPT-3 API, OpenAI GPT-4 API, QGIS, Google Earth, Graphs