Ja-Yuan Pendley, Developer in New York, NY, United States

Ja-Yuan Pendley

Data Engineer and Developer

New York, NY, United States

Toptal member since April 17, 2026

Bio

Ja-Yuan is an accomplished data and cloud engineer with 11+ years of experience designing, modernizing, and scaling enterprise-grade data platforms across AWS, Azure, and GCP. He excels at developing high-performance pipelines using Kafka, Hive, Scala, PySpark, Spark, Python, Databricks, and Airflow. Adept at aligning architecture with business strategy, Ja-Yuan elevates data reliability and governance and leads cross-functional teams to deliver secure, compliant, and scalable data ecosystems.


Experience

  • Amazon Athena - 8 years
  • Delta Lake - 8 years
  • Hadoop - 8 years
  • Google Cloud Platform (GCP) - 8 years
  • Azure Databricks - 8 years
  • ETL - 7 years
  • Python - 7 years
  • Snowflake - 6 years

Preferred Environment

AWS IoT

The most amazing solution I've delivered is an end-to-end clinical-trial data platform that enables advanced analytics, regulatory reporting, and real-time data availability.

Work Experience

Lead Azure Big Data Engineer

2023 - PRESENT
Pfizer
  • Developed Azure Databricks pipelines for clinical-trial data integration and advanced analytics.
  • Implemented a Delta Lake architecture with Bronze, Silver, and Gold layers to ensure regulatory traceability and auditability.
  • Deployed Azure DevOps YAML pipelines for automated CI/CD, notebook versioning, and environment promotion.
Technologies: Python, Confluence, Azure Functions, YAML, Azure DevOps, Microsoft Purview, Delta Lake, Apache Kafka, Azure Synapse, Azure Data Factory (ADF), Azure Key Vault, Microsoft Power BI, PySpark, Apache Hive, Azure Databricks

Senior AWS Data Engineer

2021 - 2023
Credit Suisse Group
  • Built AWS Glue ETL frameworks for market-risk and compliance data processing.
  • Improved pipeline performance, observability, and compliance alignment through optimized ETL orchestration and governed access.
  • Delivered an end-to-end market-risk and compliance data platform on AWS using Glue, S3, and Redshift to support regulatory reporting, cross-domain analytics, and automated data-quality controls.
Technologies: PySpark, Apache Hive, AWS Glue, Amazon Athena, Amazon Redshift Spectrum, AWS Lambda, Terraform, Amazon CloudWatch, AWS Step Functions, EMR, AWS Lake Formation, Amazon Redshift, Amazon S3 (AWS S3), Python, Apache Kafka

Experience

Clinical-trial Data Platform

I delivered an end-to-end clinical-trial data platform using Azure Databricks, Azure Data Factory (ADF), and Synapse to enable advanced analytics, regulatory reporting, and real-time data availability. I also implemented Delta Lake, Kafka streaming, Hive governance, and automated CI/CD to ensure traceability, compliance, and high-quality data pipelines. I enhanced data reliability through PySpark validation, operational observability, and optimized compute workflows.
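
As a rough illustration of the Bronze, Silver, and Gold layering and the validation step described above, the following is a minimal sketch in plain Python rather than PySpark or Delta Lake, so it runs without a cluster. Every name in it (the record fields, `REQUIRED_FIELDS`, and the functions `to_bronze`, `is_valid`, `to_silver`, `to_gold`) is hypothetical and chosen for illustration; it is not the platform's actual code or schema.

```python
from collections import defaultdict
from typing import Any

# Hypothetical record shape; the field names are assumptions for illustration.
Record = dict[str, Any]

REQUIRED_FIELDS = ("subject_id", "site", "measurement")


def to_bronze(raw: list[Record]) -> list[Record]:
    """Bronze: keep every record as received, tagged with its arrival order."""
    return [{**rec, "_ingest_seq": i} for i, rec in enumerate(raw)]


def is_valid(rec: Record) -> bool:
    """Assumed rule: required fields are present and measurement is numeric."""
    if any(rec.get(field) is None for field in REQUIRED_FIELDS):
        return False
    value = rec["measurement"]
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def to_silver(bronze: list[Record]) -> tuple[list[Record], list[Record]]:
    """Silver: split into validated rows and rejected rows retained for audit."""
    valid = [rec for rec in bronze if is_valid(rec)]
    rejected = [rec for rec in bronze if not is_valid(rec)]
    return valid, rejected


def to_gold(silver: list[Record]) -> dict[str, float]:
    """Gold: aggregate the mean measurement per site for reporting."""
    by_site: dict[str, list[float]] = defaultdict(list)
    for rec in silver:
        by_site[rec["site"]].append(rec["measurement"])
    return {site: sum(vals) / len(vals) for site, vals in by_site.items()}
```

Calling `to_bronze` on a list of raw records, then `to_silver`, then `to_gold` on the validated rows yields a per-site summary, while the rejected rows stay available alongside their `_ingest_seq` tag. Keeping rejects instead of dropping them is one simple way to preserve the traceability that a layered design is meant to provide.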

Skills

Libraries/APIs

PySpark

Tools

Terraform, Amazon CloudWatch, AWS Step Functions, Amazon Athena, Amazon Redshift Spectrum, AWS Glue, Confluence, Azure Key Vault, Microsoft Power BI

Frameworks

Hadoop, Apache Spark

Languages

Python, Snowflake, YAML

Paradigms

ETL, Azure DevOps

Platforms

Google Cloud Platform (GCP), AWS IoT, AWS Lambda, Apache Kafka, Azure Functions, Azure Synapse

Storage

PostgreSQL, MongoDB, Apache Hive, Amazon S3 (AWS S3)

Other

Azure Databricks, EMR, AWS Lake Formation, Amazon Redshift, Microsoft Purview, Delta Lake, Azure Data Factory (ADF)
