
Mehmet Dogan
Verified Expert in Engineering
Data Engineer and Full-stack Web Developer
Katy, TX, United States
Toptal member since December 21, 2021
Mehmet is a lead data engineer and full-stack developer with over 15 years of experience delivering scalable data solutions and back-end systems. He's experienced in designing resilient ETL pipelines with Airflow, BigQuery, and Snowflake for enterprise leaders like Capital One. He specializes in bridging the gap between data engineering and software architecture, having successfully built large-scale analytics platforms for the healthcare sector and founded a revenue-generating edtech startup.
Experience
- PostgreSQL - 10 years
- SQL - 10 years
- Python - 7 years
- Amazon Web Services (AWS) - 6 years
- Django - 6 years
- Snowflake - 2 years
- Amazon Elastic Container Service (ECS) - 1 year
- Terraform - 1 year
Preferred Environment
Amazon Web Services (AWS), Snowflake, PostgreSQL, Python, Apache Airflow, Google Cloud Platform (GCP), BigQuery, Data Build Tool (dbt), Databricks, Django
The most amazing thing I did was deliver data solutions for 100+ million Americans across the healthcare, finance, and education sectors.
Work Experience
Senior Data Engineer
Aetna
- Designed, developed, and maintained scalable ETL pipelines in Apache Airflow for enterprise analytics workloads.
- Built and optimized data models in Google BigQuery to support high-performance querying across large healthcare datasets.
- Integrated Looker and Looker Studio dashboards to deliver actionable insights to business stakeholders.
- Leveraged Google Cloud Platform (GCP) services for secure, reliable, and cost-efficient data storage and processing.
Lead Data Engineer
Capital One Financial
- Led critical data operations within the auto finance division.
- Led a high-performing team, using agile methodologies, in designing and implementing numerous data pipelines.
- Helped build a big data platform (data lake, data warehouse), providing analytics and data-driven solutions.
- Worked with cross-functional teams throughout the product lifecycle to collect requirements, build data models, and improve data architecture.
- Implemented data quality monitoring systems and championed automated testing practices.
- Provided technical leadership and mentored junior engineers.
Full-stack Data Engineer
Flexion
- Played a key role in building a Beneficiary Data Analytics Platform (BEDAP) for HHS, serving 100+ million Americans.
- Developed data pipelines using AWS Glue and PySpark to ingest 15 terabytes of historical healthcare data from Teradata to Redshift.
- Improved data processing efficiency by 20% using ECS Fargate, Lambda, and SQS-EventBridge.
- Reduced data extraction times by 30% through a parallel extraction strategy.
- Gave data analysts accessible data through tools like Jupyter Notebooks, which improved outreach campaigns.
Senior Back-end Engineer
ThetaCore
- Acted as primary developer for a back-end web application using Django and DRF running on Google App Engine.
- Refactored Django applications to remove circular dependencies and update authorization systems.
- Developed REST API endpoints for user management and survey functionality.
- Utilized microservices on Google Cloud Functions for survey result analytics.
Founder | Full-stack Developer
Edgle LLC
- Founded and led an edtech startup hosting public data for Texas schools.
- Achieved high-performance data retrieval of around 3,000 data points in under 0.5 seconds.
- Represented information using interactive D3.js visualizations.
Experience
Capital One: Auto Finance Title Perfection Pipeline
PROJECT CONTRIBUTIONS AND TECHNICAL IMPACT
• Scalable Data Architecture: Led a high-performing team in designing and implementing this end-to-end data pipeline to support a comprehensive big data platform, including a data lake and Snowflake data warehouse.
• Engineering Excellence: Built a self-validating system by implementing robust data quality monitoring and automated testing practices directly into the pipeline, which substantially increased test coverage and significantly decreased production bugs.
• System Resilience: Regularly rehydrated infrastructure to maintain optimal performance and applied security updates to mitigate data risks.
• Cross-functional Integration: Collaborated across teams to collect requirements and build data models, ensuring the pipeline adhered to strict data architecture and regulatory compliance requirements.
Data Pipelines with AWS Glue
I played a key role in developing the Beneficiary Data Analytics Platform (BEDAP) for the US Department of Health and Human Services (HHS), a massive-scale project serving over 100 million Americans. My responsibilities spanned the entire product lifecycle and big data platform, covering both the data lake and data warehouse architectures.
• I developed and managed robust data pipelines using AWS Glue and PySpark to ingest 15 terabytes of historical healthcare data from Teradata to Redshift, enabling significantly faster downstream analysis.
• By contributing to platform architecture and utilizing ECS Fargate, Lambda, and SQS-EventBridge, I improved data processing efficiency by 20%.
• I implemented a parallel extraction strategy that reduced data extraction times by 30%.
• To maintain high data quality, I built automated tests and health checks, integrated with VictorOps and Slack, to mitigate pipeline failures and protect data integrity.
• I collaborated cross-functionally to ensure the platform met organizational needs and empowered data analysts by providing accessible data through Jupyter Notebooks, which improved outreach campaigns.
Edgle: Texas School Performance Analytics Dashboard
KEY TECHNICAL ACHIEVEMENTS
• High-performance Retrieval: Engineered a system that pulls approximately 3,000 data points—including historical, reference, and statistical data—in under 0.5 seconds per request.
• Advanced Visualization: Leveraged a graph-based data structure and custom D3.js visualizations to provide an interactive, intuitive exploration of hierarchical data by subject, grade, and demographic.
• Actionable Insights: Enabled users to explore multiple data sets concurrently while rating, filtering, and setting targets to support data-driven decision-making.
• Robust Infrastructure: Developed a reusable role-based authorization package for Django, integrated single sign-on (SSO), and implemented custom Bash deployment scripts to ensure seamless, secure operations.
• Comprehensive Features: Built internal tools for XML parsing, multi-stakeholder survey collection, and reusable Django CRUD views, all supported by rigorous Python unit testing.
Education
Master's Degree in Electrical and Computer Engineering
Texas Tech University - Lubbock, Texas, US
Bachelor's Degree in Electrical and Computer Engineering
Bosphorus University - Istanbul, Turkey
Certifications
AWS Certified Solutions Architect – Associate
Amazon Web Services
Skills
Libraries/APIs
D3.js, PySpark
Tools
Terraform, Jenkins, Amazon Elastic Container Service (ECS), Git, Amazon Athena, AWS Glue, BigQuery, Apache Airflow, Tableau, Looker, Grafana
Languages
Python, JavaScript, Bash, C, Snowflake, SQL
Storage
PostgreSQL, Google Cloud, MySQL, Redshift, Vertica, Data Lakes
Frameworks
Django, Django REST Framework, CakePHP
Platforms
Amazon Web Services (AWS), Docker, Amazon EC2, Linux, Apache Kafka, Oracle, AWS Lambda, Google Cloud Platform (GCP), Google App Engine, Databricks, Vertex AI
Other
Computer Vision, Data Build Tool (dbt), Team Leadership