
Srinivas Pendyala
Verified Expert in Engineering
Cloud and Data Solutions Architect and Developer
Frisco, TX, United States
Toptal member since June 7, 2024
Srinivas is a multifaceted cloud and data solutions architect with 27 years of leadership experience across engineering, pre-sales, and professional services. He drives digital transformation and modernization through solution design, architecture, and deployment of highly scalable distributed computing systems. Srinivas uses GenAI/ML and big data for engineering, analytics, data lakes, and data warehousing, orchestrating microservices with Kubernetes in public, private, and hybrid clouds.
Experience
- Hadoop - 10 years
- Google Cloud Platform (GCP) - 10 years
- Microservices - 10 years
- NoSQL - 10 years
- Kubernetes - 10 years
- Data Lakes - 10 years
- AWS Database Migration Service (DMS) - 4 years
- Generative Artificial Intelligence (GenAI) - 2 years
Preferred Environment
Generative Artificial Intelligence (GenAI), Hadoop, AWS IoT, Google Cloud Platform (GCP), Terraform, NoSQL, Microservices, Kubernetes, Data Lakes, Data Warehousing
The most amazing...
...thing I've done as a technical engineer is integrate the Amazon Bedrock platform with Redis Enterprise as a vector database.
Work Experience
Senior Cloud Solutions Architect, AWS & GCP
Redis
- Landed $5+ million deals by leading 10+ customers through workload migration and app modernization. Published an AWS prescriptive guide, driving customer adoption and revenue growth. Guided customers in breaking monoliths into microservices on Amazon EKS.
- Helped customers by rearchitecting transactional systems to leverage the Redis NoSQL store as a primary database. Developed data visualization and analytics using Grafana dashboards for multiple customers.
- Collaborated with AWS engineering teams to integrate Redis as a knowledge base for retrieval-augmented generation (RAG). Shared Redis best practices, provided API support, and established CI/CD infrastructure.
- Demonstrated 1,600 to 4,000x lower latency and saved LLM computation cycles by implementing an LLM semantic caching benchmark for an aviation industry client. Built a RAG-based chatbot app using LangChain and sentence transformers.
- Implemented RAG with LangChain templates and vector similarity search (semantic, lexical, hybrid, and range-based queries) for telecom, retail, and financial services customers; see the RAG sketch after this list.
- Handled the solution architecture to enable real-time, AI/ML-based fraud detection and published best practices as an AWS Partner Network (APN) blog post. Trained models using the XGBoost and Random Cut Forest (RCF) ML algorithms.
- Developed MLOps workflows using Feast with Redis as a feature store and demonstrated single-digit latencies with Amazon SageMaker for fraud detection use cases in financial services.
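Below is a minimal sketch of the RAG flow described in the bullets above, using boto3 for Amazon Bedrock (Titan embeddings plus Claude 3) and redis-py for vector search. The "docs" index name, field names, and connection details are illustrative, and it assumes a Redis deployment with search and vector capabilities (Redis Stack or Redis Enterprise) plus AWS credentials with Bedrock access:

```python
import json

import boto3
import numpy as np
import redis
from redis.commands.search.query import Query

bedrock = boto3.client("bedrock-runtime")  # assumes AWS credentials with Bedrock access
r = redis.Redis(host="localhost", port=6379)  # connection details are illustrative

def embed(text: str) -> bytes:
    """Embed text with Amazon Titan and pack it as float32 bytes for Redis."""
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    vec = json.loads(resp["body"].read())["embedding"]
    return np.array(vec, dtype=np.float32).tobytes()

def answer(question: str) -> str:
    """Retrieve grounding context from Redis, then ask Claude 3 on Bedrock."""
    # KNN search over a hypothetical "docs" vector index.
    q = (
        Query("*=>[KNN 3 @embedding $vec AS score]")
        .sort_by("score")
        .return_fields("content", "score")
        .dialect(2)
    )
    hits = r.ft("docs").search(q, query_params={"vec": embed(question)})
    context = "\n\n".join(doc.content for doc in hits.docs)

    resp = bedrock.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 512,
            "messages": [{
                "role": "user",
                "content": f"Answer using only this context:\n{context}\n\nQuestion: {question}",
            }],
        }),
    )
    return json.loads(resp["body"].read())["content"][0]["text"]
```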
Senior Solutions Architect, Big Data Solutions | Senior Staff Engineer
Cloudera
- Decreased timelines from 24 hours to only 30 minutes per integration component, including Spark and Impala, by developing a one-click certification evaluation tool to vet ISV partner integrations on the data lake for Cloudera certifications.
- Created a technical framework for on-premises to public, private, or hybrid cloud migrations for Cloudera Data Platform (CDP) on AWS and GCP.
- Ran a customer success program for Citibank, from a partner engineering standpoint, to deliver certified partner integrations and drive value for the customer.
- Developed an Altus SDK for ISV partners and customers to leverage Cloudera Altus in the AWS and Azure public cloud.
- Defined a joint ISV and Cloudera product roadmap to drive business value through continuous technical integrations and certification programs for strategic ISV partners in data engineering, analytics, and machine learning.
- Handled data engineering integrations, including H2O.ai, SAP HANA, MicroStrategy, Talend, Trifacta, Pentaho, Platfora, and Datameer.
- Certified strategic hardware platform partners like IBM (PowerPC and Spectrum Scale) and Dell EMC (Isilon and ECS) using automation tools.
Senior Solutions Architect, Big Data and API Management (via Tata Consultancy Services)
Walmart
- Defined the enterprise architecture for Walmart and Sam's Club, enabling data discovery across the enterprise via RESTful APIs using big data Hadoop datasets, Camel for polling, and Oozie to invoke Hive ETL for data preparation and enrichment.
- Architected an enterprise data hub (data lake) on Hadoop for early analytics using EL methodologies. Designed and developed data pipelines to offload ETL workloads from the traditional data warehouse into Hadoop.
- Designed the solution architecture and delivered a targeted personalization engine for Sam's Club subscribers. Leveraged enterprise data sets for data mining purposes.
- Built and deployed a campaign management system using an enterprise data layer accessible over APIs. Leveraged NoSQL Cassandra and HBase for operational data and the Greenplum Database for metadata needs.
Experience
Cloud Migrations and App Modernization
https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/migrate-redis-workloads-to-redis-enterprise-cloud-on-aws.html
Some of the top customers I've worked with include Henry Schein, IBM Multicloud Management Platform (MCMP), JPMorgan, Citigroup, HDFC Bank, Visa, Mastercard, FedEx, Palo Alto Networks, Telesign, Scentsy, Cordial, T-Mobile, Coinbase, Ekata, Nexmo (now Vonage), Blue Cross Blue Shield Association, and Onclusive.
I drove strategic discussions on solution architecture based on economies, including total cost of ownership (TCO) discussions for optimal ROI on cloud investments. I also tested and certified the Amazon EKS integration with Redis Enterprise's custom Kubernetes operator and automated the AWS infrastructure setup using CloudFormation templates and Terraform scripts.
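A minimal boto3 sketch of the CloudFormation leg of that automation; the template file name and stack name are illustrative, not the actual artifacts:

```python
import boto3

cf = boto3.client("cloudformation", region_name="us-east-1")

# Hypothetical template provisioning the AWS networking pieces for a
# Redis Enterprise Cloud deployment; the file name is illustrative.
with open("redis-enterprise-vpc.yaml") as f:
    template = f.read()

cf.create_stack(
    StackName="redis-enterprise-infra",
    TemplateBody=template,
    Capabilities=["CAPABILITY_NAMED_IAM"],
)
# Block until provisioning finishes (raises if the stack rolls back).
cf.get_waiter("stack_create_complete").wait(StackName="redis-enterprise-infra")
```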
Amazon Bedrock and Redis Integration
https://redis.io/blog/amazon-bedrock-integration-with-redis-enterprise/
• Guided AWS teams on the Jedis API to invoke vector similarity searches and on KNN vs. ANN search algorithms.
• Provided technical guidance and direction on vector indexing methods like flat and Hierarchical Navigable Small World (HNSW).
• Helped make design choices for vector distance metrics, such as cosine vs. L2 vs. inner product (IP), to measure vector similarity.
• Utilized Amazon Titan as the embedding model and Anthropic Claude 3 as the LLM.
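As a companion to the guidance above, here is a minimal redis-py sketch of creating such a vector index; the index name, key prefix, and 1,536-dimension embedding size (the Titan text embedding width) are assumptions, and the comments note where the flat-vs.-HNSW and distance-metric choices plug in:

```python
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

r = redis.Redis(host="localhost", port=6379)  # connection details are illustrative

# HNSW index over 1,536-dim float32 vectors with cosine distance.
# Swapping "HNSW" for "FLAT" trades query speed for exact brute-force
# search; swapping "COSINE" for "L2" or "IP" changes the similarity metric.
r.ft("docs").create_index(
    fields=[
        TextField("content"),
        VectorField(
            "embedding",
            "HNSW",
            {"TYPE": "FLOAT32", "DIM": 1536, "DISTANCE_METRIC": "COSINE"},
        ),
    ],
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH),
)
```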
RAG Implementation in the Financial Services Industry
https://www.youtube.com/watch?v=epdQNfNdl7I
I implemented an object storage-based data lake. The knowledge base consisted of financial data such as SEC filings, trade rules documents, and trading compliance policies in PDF and Word formats. Next, I built an MLOps pipeline that monitors the data lake and periodically syncs data to the vector database (e.g., every few hours or nightly). Finally, I implemented LLM embeddings to vectorize the data into the Redis Enterprise Cloud vector database and added LLM semantic caching to save on LLM computation cycles.
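A minimal sketch of the periodic sync step, assuming an illustrative S3 bucket and prefix and reusing a Titan embed() helper like the one in the earlier RAG sketch; a production pipeline would parse PDF/Word content properly and track already-processed objects instead of rescanning:

```python
import boto3
import redis

s3 = boto3.client("s3")
r = redis.Redis(host="localhost", port=6379)

def sync_bucket(bucket: str = "finserv-data-lake", prefix: str = "filings/"):
    """Rescan a data lake prefix and upsert one vector per object."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
            # Real code would extract text from PDF/Word; this assumes plain text.
            text = body.decode("utf-8", errors="ignore")
            r.hset(
                f"doc:{obj['Key']}",
                # embed() is the Titan helper from the earlier RAG sketch.
                mapping={"content": text, "embedding": embed(text)},
            )
```

Scheduled from cron or an orchestrator, this implements the "every few hours or nightly" cadence described above.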
LLM Semantic Cache for Healthcare and Medical Services
https://github.com/RedisVentures/aws-redis-bedrock-stack/blob/main/examples/healthcare-redis-bedrock-gen-ai.ipynb
With a knowledge base of medical literature, patient records, medical research papers, and medical treatment and compliance documents, I implemented an LLM semantic caching benchmark that demonstrated 1,600 to 4,000x lower latency, saving on LLM computation cycles.
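A hand-rolled sketch of the semantic-cache pattern behind that benchmark, reusing the embed() and answer() helpers from the earlier RAG sketch; the "cache" index (built like the "docs" index, with a COSINE metric and an "answer" text field) and the distance threshold are illustrative. A cache hit returns the stored answer without an LLM call, which is where the latency and cost savings come from:

```python
import hashlib

import redis
from redis.commands.search.query import Query

r = redis.Redis(host="localhost", port=6379)
THRESHOLD = 0.1  # max cosine distance counted as a semantic hit (illustrative)

def cached_answer(question: str) -> str:
    # embed() and answer() are the Titan-embedding and RAG helpers
    # sketched in the earlier examples.
    vec = embed(question)
    q = (
        Query("*=>[KNN 1 @embedding $vec AS score]")
        .sort_by("score")
        .return_fields("answer", "score")
        .dialect(2)
    )
    hits = r.ft("cache").search(q, query_params={"vec": vec})
    if hits.docs and float(hits.docs[0].score) <= THRESHOLD:
        return hits.docs[0].answer  # cache hit: skip the LLM entirely

    result = answer(question)  # cache miss: run the full RAG flow once
    key = f"cache:{hashlib.sha1(question.encode()).hexdigest()}"
    r.hset(key, mapping={"embedding": vec, "answer": result})
    return result
```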
Real-time Fraud Detection for Financial Services
https://aws.amazon.com/blogs/apn/fraud-detection-for-the-finserv-industry-with-redis-enterprise-cloud-on-aws/
I used Feast with Redis as a feature store to build MLOps workflows and demonstrated single-digit latencies with Amazon SageMaker for fraud detection use cases in financial services.
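A minimal sketch of the online feature lookup this enables, assuming a Feast repo whose feature_store.yaml points the online store at Redis; the feature view, feature names, and entity key below are hypothetical:

```python
from feast import FeatureStore

# Assumes feature_store.yaml in this repo configures Redis as the online store.
store = FeatureStore(repo_path=".")

# Low-latency feature fetch for one transaction at scoring time; the
# feature view ("txn_stats") and entity key are illustrative.
features = store.get_online_features(
    features=[
        "txn_stats:amount_avg_7d",
        "txn_stats:txn_count_1h",
    ],
    entity_rows=[{"account_id": "acct-123"}],
).to_dict()

# `features` would then feed the SageMaker-hosted XGBoost/RCF model.
```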
Data Lake and Warehousing for Advanced and Exploratory Analytics
• Enabled raw data ELT from streaming, transactional, and batch data sources into the Hadoop data lake using Sqoop and Kafka (see the streaming sketch after this list).
• Modeled data for operational databases using HBase on Hadoop.
• Built ETL pipelines using Trifacta's data wrangling and preparation features to reduce redundant analytical cycles.
• Designed the solution and architected end-to-end near-real-time streaming and batch pipelines to vet Cloudera Hadoop against the Informatica, Talend, and Pentaho data engineering stacks.
• Enabled data analytics using Tableau and MicroStrategy by running ad hoc analytical queries using Apache Impala.
• Developed time series data modeling for IoT use cases on GE Historian.
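A minimal PySpark sketch of the Kafka leg of that ingestion (referenced in the first bullet above), assuming the spark-sql-kafka connector is on the classpath; the broker, topic, and HDFS paths are illustrative:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-to-data-lake").getOrCreate()

# Land raw Kafka events in the Hadoop data lake as-is (EL, with
# transformation deferred to later ETL); names here are illustrative.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "transactions")
    .load()
    .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp")
)

query = (
    raw.writeStream.format("parquet")
    .option("path", "hdfs:///data/raw/transactions")
    .option("checkpointLocation", "hdfs:///checkpoints/transactions")
    .start()
)
query.awaitTermination()
```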
Hadoop Security Automation Tooling
Cloudera Altus SDK Development
https://github.com/cloudera/altus-sdk-java
Walmart's Big Data Program
I used Apache Camel for polling and Oozie workflow to invoke Hive ETL actions to prepare and enrich data. I also used EL methodologies and architected the enterprise data hub on Hadoop for early analytics. Lastly, I designed and built data pipelines to offload ETL workloads from the traditional data warehouse into Hadoop.
Sam's Club Personalization Engine
I used enterprise data sets for data mining and built and deployed a campaign management system using an enterprise data layer accessible over APIs. I used Cassandra and HBase for operational data and Greenplum for metadata needs.
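A minimal sketch of the operational-data access pattern behind those APIs, using the DataStax cassandra-driver; the keyspace, table, and contact points are hypothetical:

```python
from cassandra.cluster import Cluster  # pip install cassandra-driver

cluster = Cluster(["127.0.0.1"])  # contact points are illustrative
session = cluster.connect("campaigns")  # assumes this keyspace already exists

# Offers served per member, newest first: a typical wide-row layout
# for low-latency operational reads behind campaign APIs.
session.execute(
    """
    CREATE TABLE IF NOT EXISTS member_offers (
        member_id text,
        created_at timestamp,
        offer_id text,
        PRIMARY KEY (member_id, created_at)
    ) WITH CLUSTERING ORDER BY (created_at DESC)
    """
)
session.execute(
    "INSERT INTO member_offers (member_id, created_at, offer_id) "
    "VALUES (%s, toTimestamp(now()), %s)",
    ("m-123", "offer-42"),
)
rows = session.execute(
    "SELECT offer_id FROM member_offers WHERE member_id = %s LIMIT 10",
    ("m-123",),
)
```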
Education
Bachelor's Degree in Electronics Engineering
Nagpur University - Nagpur, Maharashtra, India
Certifications
Cloudera Certified Developer for Apache Hadoop (CCDH)
Cloudera
Cloudera Certified Administrator for Apache Hadoop (CCAH)
Cloudera
Skills
Libraries/APIs
Spark ML, Cloud Key Management Service (KMS), Apigee
Tools
Git, Terraform, Amazon EKS, Amazon SageMaker, Cloudera, Apache Sqoop, Tableau, Apache Impala, Spark SQL, Ansible, Oozie
Frameworks
Hadoop, Apache Spark, YARN
Paradigms
Microservices, Database Design, ETL, MapReduce
Storage
NoSQL, Data Lakes, Redis, Databases, Amazon S3 (AWS S3), HBase, Apache Hive, Cassandra, Greenplum, HDFS
Languages
Python, Java, Bash Script
Platforms
Kubernetes, Apache Kafka, AWS IoT, Google Cloud Platform (GCP), Jupyter Notebook, AWS Lambda, Talend, Pentaho, H2O, Cloudera Data Platform, Spark Core
Other
Big Data, Architecture, Enterprise Architecture, Data Architecture, Distributed Systems, Data Warehousing, Generative Artificial Intelligence (GenAI), Engineering, Retrieval-augmented Generation (RAG), Scalable Vector Databases, Data Migration, Workload Migration, Cloud Migration, Application Modernization, Amazon Bedrock, Large Language Models (LLMs), AWS Database Migration Service (DMS), Vector Databases, Data, Amazon Kinesis, Amazon API Gateway, Machine Learning, Big Data Architecture, Data Engineering, Data Analytics, MicroStrategy, ELT, Trifacta, Informatica, Kerberos, Transport Layer Security (TLS), Data Encryption, Apache Cassandra, RESTful Services, APIs, Cloudera Manager