Srinivas Pendyala, Developer in Frisco, TX, United States

Srinivas Pendyala

Verified Expert in Engineering

Bio

Srinivas is a multifaceted cloud and data solutions architect with 27 years of leadership experience across engineering, pre-sales, and professional services. He drives digital transformation and modernization through the solution design, architecture, and deployment of highly scalable distributed computing systems. Srinivas uses GenAI/ML and big data for engineering, analytics, data lakes, and data warehousing, orchestrating microservices with Kubernetes in public, private, and hybrid clouds.

Portfolio

Redis
Data Lakes, Data Warehousing, NoSQL, Redis...
Cloudera
Databases, NoSQL, Data Lakes, Data Warehousing, Cloud Migration, Big Data...
Walmart
Apigee, Hadoop, Cloudera, Apache Hive, Apache Sqoop, ETL, Apache Cassandra...

Experience

  • Hadoop - 10 years
  • Google Cloud Platform (GCP) - 10 years
  • Microservices - 10 years
  • NoSQL - 10 years
  • Kubernetes - 10 years
  • Data Lakes - 10 years
  • AWS Database Migration Service (DMS) - 4 years
  • Generative Artificial Intelligence (GenAI) - 2 years

Availability

Part-time

Preferred Environment

Generative Artificial Intelligence (GenAI), Hadoop, AWS IoT, Google Cloud Platform (GCP), Terraform, NoSQL, Microservices, Kubernetes, Data Lakes, Data Warehousing

The most amazing...

...thing I've done as a technical engineer is integrate the Amazon Bedrock platform with Redis Enterprise as a vector database.

Work Experience

Senior Cloud Solutions Architect, AWS & GCP

2021 - 2024
Redis
  • Landed $5+ million in deals by leading 10+ customers through workload migration and application modernization. Published an AWS prescriptive guide, driving customer adoption and revenue growth. Guided customers in breaking monoliths into microservices on Amazon EKS.
  • Helped customers by rearchitecting transactional systems to leverage the Redis NoSQL store as a primary database. Developed data visualization and analytics using Grafana dashboards for multiple customers.
  • Collaborated with AWS engineering teams to integrate Redis as a knowledge base for retrieval-augmented generation (RAG). Shared Redis best practices, provided API support, and established CI/CD infrastructure.
  • Demonstrated 1,600 to 4,000 times lower latency and saved LLM computation cycles by running an LLM semantic caching benchmark for an aviation industry client. Built a RAG pipeline and chatbot app using LangChain and sentence transformers.
  • Implemented RAG with LangChain templates and vector similarity search, handling semantic, lexical, hybrid, and range-based queries, for telecom, retail, and financial services customers.
  • Handled the solution architecture to enable real-time, AI/ML-based fraud detection and published best practices as an AWS Partner Network (APN) blog. Trained models using the XGBoost and Random Cut Forest (RCF) ML algorithms.
  • Developed ML operations using Feast with Redis as a feature store. Demonstrated single-digit-millisecond latencies with Amazon SageMaker, leveraging Redis as a feature store for fraud detection use cases in financial services.
Technologies: Data Lakes, Data Warehousing, NoSQL, Redis, Generative Artificial Intelligence (GenAI), Retrieval-augmented Generation (RAG), Scalable Vector Databases, Data Migration, Workload Migration, Cloud Migration, Application Modernization, AWS IoT, Google Cloud Platform (GCP), Kubernetes, Jupyter Notebook, Architecture, Enterprise Architecture, Data Architecture, Database Design, Distributed Systems, Git

Senior Solutions Architect, Big Data Solutions | Senior Staff Engineer

2015 - 2021
Cloudera
  • Decreased timelines from 24 hours to only 30 minutes per integration component, including Spark and Impala, by developing a one-click certification evaluation tool to vet ISV partner integrations on the data lake for Cloudera certifications.
  • Created a technical framework for on-premises to public, private, or hybrid cloud migrations for Cloudera Data Platform (CDP) on AWS and GCP.
  • Ran a customer success program for Citibank, from a partner engineering standpoint, to deliver certified partner integrations and drive value for the customer.
  • Developed an Altus SDK for ISV partners and customers to leverage Cloudera Altus in the AWS and Azure public cloud.
  • Defined a joint ISV and Cloudera product roadmap to drive business value through continuous technical integrations and certification programs for strategic ISV partners in data engineering, analytics, and machine learning.
  • Handled data engineering integrations, including H2O.ai, SAP HANA, MicroStrategy, Talend, Trifacta, Pentaho, Platfora, and Datameer.
  • Certified strategic hardware platform partners like IBM (PowerPC and Spectrum Scale) and Dell EMC (Isilon and ECS) using automation tools.
Technologies: Databases, NoSQL, Data Lakes, Data Warehousing, Cloud Migration, Big Data, Big Data Architecture, Data Engineering, ETL, Talend, Data Analytics, Pentaho, MicroStrategy, H20, Kubernetes, Data Migration, Workload Migration, Application Modernization, AWS IoT, Google Cloud Platform (GCP), Jupyter Notebook, Architecture, Enterprise Architecture, Data Architecture, Database Design, Distributed Systems, Git

Senior Solutions Architect, Big Data and API Management (via Tata Consultancy Services)

2014 - 2015
Walmart
  • Defined the enterprise architecture for Walmart and Sam's Club, enabling data discovery across the enterprise via RESTful APIs using big data Hadoop datasets, Camel for polling, and Oozie to invoke Hive ETL for data preparation and enrichment.
  • Architected an enterprise data hub (data lake) on Hadoop for early analytics using EL methodologies. Designed and developed data pipelines to offload ETL workloads from the traditional data warehouse into Hadoop.
  • Designed the solution architecture and delivered a targeted personalization engine for Sam's Club subscribers. Leveraged enterprise data sets for data mining purposes.
  • Built and deployed a campaign management system using an enterprise data layer accessible over APIs. Leveraged NoSQL Cassandra and HBase for operational data and the Greenplum Database for metadata needs.
Technologies: Apigee, Hadoop, Cloudera, Apache Hive, Apache Sqoop, ETL, Apache Cassandra, Apache Kafka, Spark ML, NoSQL, HBase, Greenplum, Architecture, Enterprise Architecture, Data Architecture, Database Design, Distributed Systems, Git, Big Data

Experience

Cloud Migrations and App Modernization

https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/migrate-redis-workloads-to-redis-enterprise-cloud-on-aws.html
Led 10+ customers through workload migration and application modernization, resulting in $5+ million in deals, and published an APN prescriptive guide, driving customer adoption and revenue growth.

Some of the top customers I've worked with include Henry Schein, IBM Multicloud Management Platform (MCMP), JPMorgan, Citigroup, HDFC Bank, Visa, Mastercard, FedEx, Palo Alto Networks, Telesign, Scentsy, Cordial, T-Mobile, Coinbase, Ekata, Nexmo (now Vonage), Blue Cross Blue Shield Association, and Onclusive.

I drove strategic discussions on solution architecture and cloud economics, including total cost of ownership (TCO) analyses for optimal ROI on cloud investments. I also tested and certified the Amazon EKS integration with Redis Enterprise's custom Kubernetes operator and automated the AWS infrastructure setup using CloudFormation templates and Terraform scripts.

Amazon Bedrock and Redis Integration

https://redis.io/blog/amazon-bedrock-integration-with-redis-enterprise/
Integrated Amazon Bedrock with Redis as a vector database. I collaborated with AWS engineering teams to integrate Redis as a knowledge base for RAG, providing Redis best practices and API support and establishing CI/CD infrastructure.

• Guided AWS teams on the Jedis API for invoking vector similarity searches and on KNN vs. ANN search algorithms.
• Provided technical guidance and direction on vector indexing methods like flat and Hierarchical Navigable Small World (HNSW).
• Helped make design choices for vector distance metrics, such as cosine vs. L2 vs. inner product (IP), to measure vector similarity.
• Utilized Amazon Titan as the embedding model and Anthropic Claude 3 as the LLM.
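
Below is a minimal sketch of that index-and-query pattern using redis-py rather than the Jedis calls referenced above; the index name, field names, 1,536-dimension embedding size, and the random query vector are illustrative assumptions, not the production setup.

```python
# Hypothetical sketch: HNSW vector index with cosine distance plus a KNN query.
# Names, dimensions, and data are placeholders; embeddings would normally come
# from Amazon Titan rather than numpy random values.
import numpy as np
from redis import Redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

r = Redis(host="localhost", port=6379)

# HNSW index with cosine distance (the FLAT / L2 / IP choices above swap in here).
r.ft("docs_idx").create_index(
    fields=[
        TextField("content"),
        VectorField(
            "embedding",
            "HNSW",
            {"TYPE": "FLOAT32", "DIM": 1536, "DISTANCE_METRIC": "COSINE"},
        ),
    ],
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH),
)

# Store one document with its embedding.
vec = np.random.rand(1536).astype(np.float32)
r.hset("doc:1", mapping={"content": "sample passage", "embedding": vec.tobytes()})

# Approximate nearest neighbor (KNN over the HNSW graph) query.
query = (
    Query("*=>[KNN 3 @embedding $vec AS score]")
    .sort_by("score")
    .return_fields("content", "score")
    .dialect(2)
)
results = r.ft("docs_idx").search(query, query_params={"vec": vec.tobytes()})
print([(doc.content, doc.score) for doc in results.docs])
```

Swapping HNSW for FLAT, or COSINE for L2 or IP, only changes the index parameters, which is exactly the trade-off the design guidance above was weighing.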

RAG Implementation in the Financial Services Industry

https://www.youtube.com/watch?v=epdQNfNdl7I
Implemented a RAG pipeline and built a chatbot application using LangChain and sentence transformers for financial analyst use cases.

I implemented an object storage-based data lake. The knowledge base consisted of financial data such as SEC filings, trade rules documents, trading compliance policy PDFs, and Word documents. Next, I built an MLOps pipeline that monitors the data lake and periodically syncs data to the vector database, such as every few hours or nightly. Finally, I implemented LLM embeddings to vectorize the data into the Redis Enterprise Cloud vector database and LLM semantic caching to save on LLM computation cycles.
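
A minimal sketch of that periodic sync step is shown below; it assumes plain-text objects in a hypothetical S3 bucket, uses sentence-transformers with redis-py in place of the production embedding model, and omits PDF/Word text extraction.

```python
# Illustrative sync job: embed new documents from the object-storage data lake
# and upsert them into the Redis vector store. Bucket, prefix, and key layout
# are hypothetical; scheduling (every few hours or nightly) is left to cron or
# an orchestrator.
import boto3
import numpy as np
from redis import Redis
from sentence_transformers import SentenceTransformer

s3 = boto3.client("s3")
r = Redis(host="localhost", port=6379)
model = SentenceTransformer("all-MiniLM-L6-v2")

def sync_bucket(bucket: str, prefix: str = "filings/") -> None:
    """Embed each document under the prefix and upsert it into Redis."""
    for obj in s3.list_objects_v2(Bucket=bucket, Prefix=prefix).get("Contents", []):
        key = obj["Key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        text = body.decode("utf-8", errors="ignore")
        vec = model.encode(text).astype(np.float32)
        r.hset(f"doc:{key}", mapping={"content": text, "embedding": vec.tobytes()})

if __name__ == "__main__":
    sync_bucket("financial-knowledge-base")
```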

LLM Semantic Cache for Healthcare and Medical Services

https://github.com/RedisVentures/aws-redis-bedrock-stack/blob/main/examples/healthcare-redis-bedrock-gen-ai.ipynb
Built a RAG application to assist healthcare professionals in their diagnosis and improve patient outcomes.

With a knowledge base of medical literature, patient records, medical research papers, and medical treatment and compliance documents, I implemented an LLM semantic caching benchmark that demonstrated 1,600 to 4,000 times lower latency, saving on LLM computation cycles.
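
A minimal sketch of the semantic caching layer, assuming LangChain's RedisSemanticCache (module paths vary across LangChain versions) and example Bedrock model IDs; the similarity threshold and prompts are illustrative.

```python
# Hypothetical semantic cache setup: near-duplicate prompts are answered from
# Redis instead of re-invoking the LLM, which is where the latency and compute
# savings in the benchmark come from.
from langchain.globals import set_llm_cache
from langchain_community.cache import RedisSemanticCache
from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.llms import Bedrock

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")

set_llm_cache(
    RedisSemanticCache(
        redis_url="redis://localhost:6379",
        embedding=embeddings,
        score_threshold=0.2,  # similarity cutoff for treating a prompt as a cache hit
    )
)

llm = Bedrock(model_id="anthropic.claude-v2")
print(llm.invoke("Summarize current first-line treatment guidelines for type 2 diabetes."))
# A semantically similar question is now served from the cache, skipping the LLM call:
print(llm.invoke("Give me a summary of today's first-line type 2 diabetes treatments."))
```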

Real-time Fraud Detection for Financial Services

https://aws.amazon.com/blogs/apn/fraud-detection-for-the-finserv-industry-with-redis-enterprise-cloud-on-aws/
Designed the solution architecture of an AI/ML-based fraud detection solution on AWS for customers and published best practices as an APN blog and reference architecture. I also trained models using the XGBoost and RCF ML algorithms.

Additionally, I used Feast with Redis as a feature store to develop ML operations. I also demonstrated single-digit-millisecond latencies with Amazon SageMaker, leveraging Redis as a feature store for fraud detection use cases in financial services.
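
A rough sketch of the online scoring path is below; the feature view, feature names, entity key, and model file are hypothetical, and it assumes a Feast repository whose feature_store.yaml points the online store at Redis.

```python
# Illustrative online scoring: fetch features from the Feast online store (Redis)
# at request time and score them with a pre-trained XGBoost fraud model.
import numpy as np
import xgboost as xgb
from feast import FeatureStore

store = FeatureStore(repo_path=".")   # feature_store.yaml configures Redis as the online store
model = xgb.XGBClassifier()
model.load_model("fraud_model.json")  # hypothetical pre-trained model artifact

FEATURES = [
    "transaction_stats:amount_avg_7d",
    "transaction_stats:txn_count_1h",
    "transaction_stats:merchant_risk_score",
]

def score_transaction(card_id: str) -> float:
    """Return the fraud probability for the card's latest online features."""
    rows = store.get_online_features(
        features=FEATURES,
        entity_rows=[{"card_id": card_id}],
    ).to_dict()
    x = np.array([[rows[f.split(":")[1]][0] for f in FEATURES]], dtype=np.float32)
    return float(model.predict_proba(x)[0, 1])

# Example: print(score_transaction("card-123"))
```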

Data Lake and Warehousing for Advanced and Exploratory Analytics

Built a data lake based on Cloudera's distribution of Hadoop (CDP) for multiple customers.

• Enabled raw data ELT from streaming, transactional, and batch data sources into the Hadoop data lake using Sqoop and Kafka (see the streaming ingestion sketch after this list).
• Modeled data for operational databases using HBase on Hadoop.
• Built ETL pipelines using Trifacta's data wrangling and preparation features to reduce redundant analytical cycles.
• Designed the solution and architected end-to-end near-real-time streaming and batch pipelines to vet Cloudera Hadoop with the Informatica, Talend, and Pentaho data engineering stacks.
• Enabled data analytics using Tableau and MicroStrategy by running ad hoc analytical queries using Apache Impala.
• Developed time series data modeling for IoT use cases on GE Historian.
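
A minimal PySpark sketch of the Kafka-to-data-lake streaming leg of that ELT ingestion; the topic name, event schema, and HDFS paths are hypothetical, and the Spark-Kafka connector package is assumed to be available.

```python
# Illustrative streaming ELT: land raw Kafka events in the Hadoop data lake as
# Parquet; transformation happens later, downstream of the raw zone.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructField, StructType

spark = SparkSession.builder.appName("kafka-to-datalake").getOrCreate()

schema = StructType([
    StructField("order_id", StringType()),
    StructField("store_id", StringType()),
    StructField("amount", DoubleType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "pos-transactions")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream.format("parquet")
    .option("path", "hdfs:///data/raw/pos_transactions")
    .option("checkpointLocation", "hdfs:///checkpoints/pos_transactions")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```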

Hadoop Security Automation Tooling

Developed and deployed Hadoop security automation tooling in-house using Ansible, Bash, Apache Sentry, Hive, Kerberos, TLS, and encryption at rest with AWS Key Management Service (KMS). I also built Jenkins CI/CD pipelines using Docker containers for an end-to-end data warehousing workflow.

Cloudera Altus SDK Development

https://github.com/cloudera/altus-sdk-java
Developed the Altus SDK for customers to leverage Cloudera Altus in the AWS and Azure public clouds. I worked on the API design for the SDK and modeled it on Amazon Elastic MapReduce (EMR)'s API design for portability.

Walmart's Big Data Program

Worked on an API monetization program, defining the enterprise architecture to enable data discovery across the enterprise for Walmart and Sam's Club via RESTful APIs with big data Hadoop datasets.

I used Apache Camel for polling and Oozie workflows to invoke Hive ETL actions that prepared and enriched data. I also used EL methodologies and architected the enterprise data hub on Hadoop for early analytics. Lastly, I designed and built data pipelines to offload ETL workloads from the traditional data warehouse into Hadoop.

Sam's Club Personalization Engine

Designed the solution architecture and developed a targeted personalization engine for Sam's Club subscribers.

I used enterprise data sets for data mining and built and deployed a campaign management system using an enterprise data layer accessible over APIs. I used Cassandra and HBase for operational data and Greenplum for metadata needs.

Education

1993 - 1997

Bachelor's Degree in Electronics Engineering

Nagpur University - Nagpur, Maharashtra, India

Certifications

MAY 2014 - PRESENT

Cloudera Certified Developer for Apache Hadoop (CCDH)

Cloudera

MAY 2014 - PRESENT

Cloudera Certified Administrator for Apache Hadoop (CCAH)

Cloudera

Skills

Libraries/APIs

Spark ML, Cloud Key Management Service (KMS), Apigee

Tools

Git, Terraform, Amazon EKS, Amazon SageMaker, Cloudera, Apache Sqoop, Tableau, Apache Impala, Spark SQL, Ansible, Oozie, Impala

Frameworks

Hadoop, Apache Spark, Yarn

Paradigms

Microservices, Database Design, ETL, MapReduce

Storage

NoSQL, Data Lakes, Redis, Databases, Amazon S3 (AWS S3), HBase, Apache Hive, Cassandra, Greenplum, HDFS

Languages

Python 3, Python, Java, Bash Script

Platforms

Kubernetes, Apache Kafka, AWS IoT, Google Cloud Platform (GCP), Jupyter Notebook, AWS Lambda, Talend, Pentaho, H20, Cloudera Data Platform, Spark Core

Other

Big Data, Architecture, Enterprise Architecture, Data Architecture, Distributed Systems, Data Warehousing, Generative Artificial Intelligence (GenAI), Engineering, Retrieval-augmented Generation (RAG), Scalable Vector Databases, Data Migration, Workload Migration, Cloud Migration, Application Modernization, Amazon Bedrock, Large Language Models (LLMs), AWS Database Migration Service (DMS), Vector Databases, Data, Amazon Kinesis, Amazon API Gateway, Machine Learning, Big Data Architecture, Data Engineering, Data Analytics, MicroStrategy, ELT, Trifacta, Informatica, Kerberos, Transport Layer Security (TLS), Data Encryption, Apache Cassandra, RESTful Services, APIs, Cloudera Manager
