
Syed Ashar Abbas
Verified Expert in Engineering
Data Engineer and Software Developer
Munich, Bavaria, Germany
Toptal member since June 18, 2020
Syed is a database consultant with 20 years of experience in database techniques, including scalability, load balancing and clustering, high availability and disaster recovery, security, monitoring, backup and recovery, partitioning, archiving, and database performance tuning. He has hands-on data engineering experience, designing and building data pipelines with Python, Airflow, dbt, and AWS Glue, and architecting data warehouses on industry-standard databases.
Experience
- Database Administration (DBA) - 18 years
- SQL - 15 years
- Data Warehouse Design - 12 years
- Database Architecture - 10 years
- Data Engineering - 10 years
- PostgreSQL - 8 years
- Python - 5 years
- Redshift - 3 years
Preferred Environment
Amazon RDS, Google Cloud Platform (GCP), Oracle, SQL Server DBA, PostgreSQL, MySQL, Apache Airflow, Python, MongoDB, Redshift
The most amazing...
...project I've built is a database auto-heal system that automatically fixes and tunes issues, along with a migration utility for moving databases from on-premises to the cloud.
Work Experience
Snowflake Lead Data Engineer
Fourthwall
- Led the design and implementation of a 100TB OLAP data warehouse using a star schema model with fact and dimension tables for an addressable impression marketing project, enabling analytics based on viewers' demographics, interests, and viewing behavior.
- Implemented slowly changing dimensions (SCD) type 2 and type 3 to maintain data versioning and track the history of changes over time.
- Utilized PostgreSQL and AWS S3 as data sources and Snowflake as the target data warehouse. Leveraged Snowpark for Python and SQL to perform complex data transformations and processing within Snowflake.
- Designed and implemented a dedicated schema in Snowflake to manage configuration rules, data processing rules, setup tables, logging, auditing, and notifications, improving monitoring, governance, and real-time insights into ETL processes.
- Created automated, multi-step data pipelines using Snowflake Tasks and Streams. Defined dependencies between ETL steps using the task tree feature, dynamically scheduling downstream tasks to ensure efficient workflow execution (see the sketch after this list).
- Implemented dynamic data validation and data quality processes using SQL and stored procedures to verify data integrity at each stage of the ETL process.
- Implemented parallel processing for large datasets by partitioning and distributing work across multiple Snowflake compute clusters to process data in chunks, reducing ETL execution time by 60%.
- Enhanced pipeline reliability and operational efficiency by reducing manual intervention through automation, improving job scheduling, and ensuring real-time availability of analytics-ready data.
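
A minimal sketch of the Streams-and-Tasks pattern referenced above, assuming hypothetical table, stream, and task names (raw_impressions, stg_impressions, fact_impressions) and credentials supplied via environment variables; the production pipeline was considerably more involved.

```python
# Sketch: chain Snowflake Streams and Tasks so downstream steps run only
# after upstream steps finish. All object names are hypothetical.
import os
import snowflake.connector  # pip install snowflake-connector-python

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="STAGING",
)

statements = [
    # Capture new/changed rows landing in the raw impressions table.
    "CREATE OR REPLACE STREAM raw_impressions_stream ON TABLE raw_impressions",
    # Root task: runs on a schedule, but only when the stream has data.
    """CREATE OR REPLACE TASK load_stg_impressions
         WAREHOUSE = ETL_WH
         SCHEDULE = '15 MINUTE'
         WHEN SYSTEM$STREAM_HAS_DATA('raw_impressions_stream')
       AS
         INSERT INTO stg_impressions
         SELECT * FROM raw_impressions_stream""",
    # Child task: AFTER makes it part of the task tree, firing only once the
    # staging load completes.
    """CREATE OR REPLACE TASK merge_fact_impressions
         WAREHOUSE = ETL_WH
         AFTER load_stg_impressions
       AS
         MERGE INTO fact_impressions f
         USING stg_impressions s ON f.impression_id = s.impression_id
         WHEN MATCHED THEN UPDATE SET f.view_count = s.view_count
         WHEN NOT MATCHED THEN INSERT (impression_id, view_count)
                               VALUES (s.impression_id, s.view_count)""",
    # Tasks are created suspended; resume children before the root.
    "ALTER TASK merge_fact_impressions RESUME",
    "ALTER TASK load_stg_impressions RESUME",
]

cur = conn.cursor()
for stmt in statements:
    cur.execute(stmt)
cur.close()
conn.close()
```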
AWS Architect
Flex Analytics, Inc.
- Built a complete ETL solution that extracts data from SQL Server and loads it into Redshift Serverless using AWS Glue and AWS DMS across different AWS accounts.
- Architected a complete landing zone for all resources, such as SQL Server, Redshift, and Glue, including connectivity between different accounts and VPCs using VPC peering.
- Worked on AWS DMS as an alternative solution for ETL and ongoing replication between SQL Server and Redshift Serverless.
- Automated DMS tasks using EventBridge to load data into Redshift from SQL Server at specified times (see the sketch after this list).
- Worked closely with the client as a strategic partner for choosing the right technology and stack for better business outcomes and long-term strategy.
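
A minimal sketch of the EventBridge-driven DMS automation, assuming a hypothetical Lambda deployment and a replication task ARN passed through an environment variable; all names are placeholders rather than the client's actual resources.

```python
# Sketch: an AWS Lambda handler, invoked by an EventBridge schedule rule,
# that resumes a DMS replication task loading SQL Server data into Redshift
# Serverless. DMS_TASK_ARN is a hypothetical placeholder.
import os
import boto3

dms = boto3.client("dms")

def lambda_handler(event, context):
    task_arn = os.environ["DMS_TASK_ARN"]
    # 'resume-processing' continues an existing full-load-and-CDC task;
    # use 'start-replication' for the very first run.
    response = dms.start_replication_task(
        ReplicationTaskArn=task_arn,
        StartReplicationTaskType="resume-processing",
    )
    status = response["ReplicationTask"]["Status"]
    print(f"DMS task {task_arn} transitioned to status: {status}")
    return {"status": status}
```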
MongoDB Expert
The Healthy Mummy Pty Ltd
- Conducted comprehensive MongoDB health checks and performance audits, identified key areas for optimization, and provided actionable recommendations to improve system performance and scalability.
- Improved query performance: identified and optimized slow-running queries, leading to significant performance gains. This allowed Healthy Mummy to downsize instances from M50 to M40, resulting in a 20% cost saving.
- Implemented indexing best practices, such as adding missing indexes and dropping redundant or unused ones, enabling the system to handle more data and user requests and improving the user experience.
- Improved resource utilization: routed read-only cron jobs to read replicas, which reduced the load on the primary instance and improved response times (see the sketch after this list).
- Rolled out Atlas Search indexes for recipes on the UK and AU instances, replacing the legacy $text index and making searches more optimized and efficient.
- Implemented purge and archiving strategies: created utilities to automatically archive and purge the garman_callback and events collections, reducing storage size by 10% after moving old archived data off the primary instances.
- Prepared a proof of concept for Atlas Charts, then deployed and developed the complete Atlas Charts architecture to generate business analysis charts and reports for data analytics.
- Performed a smooth version upgrade of the UK and AU instances from version 4 to 6.0.8 without downtime, improving application performance by 10%.
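
A minimal sketch of routing read-only workloads to secondaries with PyMongo, assuming hypothetical database, collection, and field names.

```python
# Sketch: point read-only cron-job queries at replica-set secondaries so the
# primary is not loaded. Connection string and names are hypothetical.
from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb+srv://user:pass@cluster.example.mongodb.net")
db = client["healthymummy"]

# Reads through this handle prefer a secondary member of the replica set and
# fall back to the primary only if no secondary is available.
recipes_ro = db["recipes"].with_options(
    read_preference=ReadPreference.SECONDARY_PREFERRED
)

# Example read-only job: count recipes per category without touching the primary.
pipeline = [{"$group": {"_id": "$category", "total": {"$sum": 1}}}]
for row in recipes_ro.aggregate(pipeline):
    print(row["_id"], row["total"])
```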
Senior Snowflake Data Engineer
Veris Insights
- Orchestrated the creation of a logical data model for the data warehouse, employing the Online Analytical Processing (OLAP) model within Snowflake. This framework was instrumental in housing Qualtrics survey data, providing survey response analysis.
- Built data pipelines executing data extraction, transformation, movement, and cleansing from a PostgreSQL source database. Employed a combination of Python, SQL, and Fivetran, seamlessly loading data into Snowflake's designated target table.
- Devised a utility that harnessed SQL and Python with the Qualtrics APIs, enabling the direct loading of survey data into Snowflake, a departure from the client's previous reliance on third-party solutions (see the sketch after this list).
- Implemented essential Slowly Changing Dimensions (SCD) strategies to systematically track and manage current and historical changes in our online transactional processing (OLTP) data within the data warehouse.
- Played a pivotal role in addressing and optimizing reporting-related performance issues, offering invaluable insights to enhance the efficiency of SQL queries and Python scripts.
- Engineered and successfully implemented an automatic replication mechanism, ensuring seamless synchronization between the Postgres OLTP system and the Snowflake OLAP environment, thereby enhancing data consistency and reliability.
- Mentored and trained junior team members, fostering their growth in essential areas such as SQL, Python, Snowflake, Fivetran, data modeling, and reporting.
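
A minimal sketch of the Qualtrics-to-Snowflake loader, assuming Qualtrics's v3 response-export endpoints and hypothetical survey, stage, and table names; credentials come from environment variables.

```python
# Sketch: export survey responses from Qualtrics, then stage and COPY them
# into Snowflake. Data center, survey ID, and table names are hypothetical.
import io
import os
import time
import zipfile
import requests
import snowflake.connector

DC = "yourdatacenter"               # Qualtrics data-center subdomain
SURVEY_ID = "SV_hypothetical"
BASE = f"https://{DC}.qualtrics.com/API/v3/surveys/{SURVEY_ID}/export-responses"
HEADERS = {"X-API-TOKEN": os.environ["QUALTRICS_TOKEN"]}

# 1) Kick off a CSV export and poll until Qualtrics reports it is complete.
progress_id = requests.post(BASE, headers=HEADERS,
                            json={"format": "csv"}).json()["result"]["progressId"]
while True:
    result = requests.get(f"{BASE}/{progress_id}", headers=HEADERS).json()["result"]
    if result["status"] == "complete":
        file_id = result["fileId"]
        break
    time.sleep(5)

# 2) Download the zipped CSV and unpack it locally.
payload = requests.get(f"{BASE}/{file_id}/file", headers=HEADERS)
with zipfile.ZipFile(io.BytesIO(payload.content)) as zf:
    csv_name = zf.namelist()[0]
    zf.extract(csv_name, "/tmp")

# 3) Stage the file and load it into Snowflake with PUT + COPY INTO.
conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    database="SURVEYS",
    schema="RAW",
)
cur = conn.cursor()
cur.execute(f"PUT file:///tmp/{csv_name} @%survey_responses")
cur.execute(
    "COPY INTO survey_responses FROM @%survey_responses "
    "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 3)"  # Qualtrics CSVs have 3 header rows
)
cur.close()
conn.close()
```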
Redshift Data Warehouse Architect
SiliconExpert (acquired by Arrow Electronics)
- Designed and built an Online Analytical Processing (OLAP) data warehouse solution using the Star schema model tailored for intricate business analytics and reporting, leveraging Redshift as the data warehouse technology.
- Engineered robust data pipelines that extract, transform, validate, cleanse, and look up data from Oracle data sources using a combination of AWS Glue and SQL, loading the results into Redshift (see the sketch after this list).
- Designed and implemented sophisticated data mapping rules, aggregations, and complex business rules within a dedicated staging area to facilitate the seamless data transition from an OLTP system into an OLAP model for decision-making processes.
- Crafted and fine-tuned complex stored procedures and views, pivotal in generating sophisticated business intelligence reports to cater to critical business requirements, providing invaluable insights for strategic decision-making.
- Assembled a comprehensive, consolidated data dictionary, serving as a foundational information source for the data warehousing solution and encapsulating mapping rules, schema designs, scripts, and terminology.
- Deployed Redshift design and performance best practices to meticulously optimize complex and mission-critical reports, enhancing the overall efficiency and responsiveness of the system.
- Demonstrated leadership by training, coaching, and mentoring junior team members, nurturing their proficiency in Redshift, ETL processes, and SQL, and preparing them for a seamless transition and successful completion.
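
A minimal sketch of one AWS Glue job in such a pipeline, assuming hypothetical Glue Data Catalog entries, connection names, and column mappings; the production jobs included far more validation and lookup logic.

```python
# Sketch: a Glue job that reads an Oracle source table from the Data Catalog,
# applies a column mapping, and writes to Redshift via a JDBC connection.
import sys
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Source: an Oracle table registered in the Glue Data Catalog by a crawler.
parts = glue_context.create_dynamic_frame.from_catalog(
    database="oracle_src", table_name="parts"
)

# Transform: rename/cast columns into the OLAP staging layout.
mapped = ApplyMapping.apply(
    frame=parts,
    mappings=[
        ("PART_ID", "decimal", "part_id", "bigint"),
        ("PART_NAME", "string", "part_name", "string"),
        ("UPDATED_AT", "timestamp", "updated_at", "timestamp"),
    ],
)

# Target: Redshift through a Glue JDBC connection, staged via S3.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=mapped,
    catalog_connection="redshift_dw",
    connection_options={"dbtable": "staging.parts", "database": "analytics"},
    redshift_tmp_dir="s3://example-bucket/glue-temp/",
)
job.commit()
```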
Senior BigQuery Data Engineer
Braven
- Designed a comprehensive data warehouse architecture. This involved the transformation of the Online Transactional Processing (OLTP) schema into an Online Analytical Processing (OLAP) framework, incorporating dimensions and fact tables.
- Built data pipelines using Python and SQL, seamlessly extracting data from the Salesforce application and loading it into BigQuery, enhancing data accessibility and analytics capabilities.
- Leveraged Airflow to orchestrate ETL (extract, transform, load) workflows, incorporating data quality checks, validation processes, and cleansing routines, and implemented notification alerts to address data quality issues proactively (see the sketch after this list).
- Crafted SQL scripts tailored to the data warehouse, addressing complex business information and data analytics needs and serving as a crucial resource for insightful decision-making.
- Demonstrated expertise in optimizing resource-intensive SQL queries within the data warehouse environment, significantly improving performance and efficiency and ensuring the smooth execution of critical analytics tasks.
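
A minimal sketch of an Airflow DAG along these lines, with hypothetical task callables standing in for the real Salesforce extraction, quality checks, and BigQuery load.

```python
# Sketch: extract from Salesforce, validate, then load to BigQuery, with a
# failure callback standing in for the alerting hook. Callables are stubs.
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_salesforce(**context):
    # Pull new/changed records from Salesforce (e.g., via simple_salesforce).
    ...

def validate_rows(**context):
    # Fail the task (and trigger the alert below) if quality checks fail.
    row_count = 0  # replace with the real check
    if row_count == 0:
        raise ValueError("No rows extracted from Salesforce")

def load_bigquery(**context):
    # Load the validated batch into BigQuery (e.g., via google-cloud-bigquery).
    ...

def notify_failure(context):
    # Called by Airflow when any task fails; send the notification here.
    print(f"Task {context['task_instance'].task_id} failed")

default_args = {
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
    "on_failure_callback": notify_failure,
}

with DAG(
    dag_id="salesforce_to_bigquery",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    default_args=default_args,
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_salesforce)
    validate = PythonOperator(task_id="validate", python_callable=validate_rows)
    load = PythonOperator(task_id="load", python_callable=load_bigquery)
    extract >> validate >> load
```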
Senior MySQL Database Administrator
Teespring
- Fine-tuned MySQL databases for optimal performance, identifying and resolving bottlenecks. Optimized SQL queries by applying SQL best practices, temporary tables, and stored procedures, reducing execution times by an average of 50%.
- Improved database indexing strategies by identifying missing indexes, leading to a 40% reduction in query execution times and a 20% decrease in overall server load. Dropped unused indexes, reclaiming up to 30% of space and improving DML performance.
- Fine-tuned memory management and optimized the InnoDB buffer pool, increasing cache hit ratios from 70% to 95%, significantly reducing disk I/O, and delivering faster query responses.
- Optimized storage by implementing archiving and cleanup policies to remove obsolete or infrequently accessed data from the database. Optimized table storage engines (InnoDB, MyISAM, etc.) to ensure the database runs efficiently, even with high workloads.
- Designed and implemented MySQL scalable database architectures using partitioning, sharding, read replicas, and load balancing to ensure seamless expansion.
- Developed and tested comprehensive disaster recovery plans, using master-slave replication with automatic failover using VIP and NDB clustering to ensure the seamless continuation of business operations. Used Amazon Aurora MySQL multi-region replication.
- Built a sophisticated backup policy using Percona XtraBackup and the native mysqldump tool to store backups in secure, offsite locations and automate the backup process. Enabled binary logging to support point-in-time recovery (see the sketch after this list).
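
A minimal sketch of the backup automation, assuming xtrabackup, mysqldump, and the AWS CLI are installed locally; paths, credentials, and the offsite S3 bucket are placeholders.

```python
# Sketch: hot physical backup with XtraBackup, a consistent logical dump with
# mysqldump, then offsite sync to S3. All paths and credentials are placeholders.
import datetime
import subprocess

STAMP = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
TARGET_DIR = f"/backups/xtrabackup/{STAMP}"

# Hot physical backup of the whole instance with Percona XtraBackup.
subprocess.run(
    ["xtrabackup", "--backup", f"--target-dir={TARGET_DIR}",
     "--user=backup", "--password=secret"],
    check=True,
)

# Logical backup as a second layer; --single-transaction keeps it consistent
# for InnoDB, and binary logging on the server enables point-in-time recovery.
with open(f"/backups/dumps/all_{STAMP}.sql", "wb") as dump_file:
    subprocess.run(
        ["mysqldump", "--all-databases", "--single-transaction",
         "--routines", "--triggers", "--user=backup", "--password=secret"],
        stdout=dump_file,
        check=True,
    )

# Ship both artifacts to an offsite location (here, an S3 bucket).
subprocess.run(
    ["aws", "s3", "sync", "/backups/", "s3://example-offsite-backups/mysql/"],
    check=True,
)
```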
Data Warehouse Architect
Tatango, Inc.
- Designed a scalable and highly available Redshift data warehouse architecture capable of handling terabytes of data, allowing the organization to accommodate rapid data growth.
- Improved query performance by 70% through query tuning, optimizing table design, and implementing appropriate sort and distribution keys (see the sketch after this list).
- Implemented automated performance monitoring and tuning processes using Amazon CloudWatch and AWS Lambda, proactively identifying and addressing performance bottlenecks for uninterrupted data access.
- Streamlined ETL (extract, transform, load) processes, reducing data loading times by 50% using incremental and differential load. Implemented SCD type 1, type 2, and type 3 and ensured timely access to updated data.
- Developed efficient OLAP data models by implementing star schema designs that facilitated complex data analysis and reporting, ensuring data consistency and integrity.
- Implemented cost-saving measures, such as data compression and resource optimization, resulting in a 20% reduction in Redshift operational costs.
- Implemented data warehousing best practices to optimize data storage and retrieval, covering data partitioning, distribution keys, sort keys, and data retention policies.
- Collaborated with cross-functional teams, including data engineers, analysts, and business stakeholders, to understand data requirements and deliver tailored solutions.
- Conducted capacity planning to ensure the data warehouse could handle future data growth and evolving business needs.
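
A minimal sketch of the table-design side of this tuning, assuming a hypothetical fact table and join/filter columns; the keys and encodings shown illustrate the approach rather than the client's actual schema.

```python
# Sketch: a Redshift fact table with a DISTKEY on the common join column,
# a compound SORTKEY on the usual filter columns, and column encodings.
import psycopg2  # Redshift speaks the PostgreSQL wire protocol

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="analytics", user="admin", password="secret",
)
conn.autocommit = True

ddl = """
CREATE TABLE IF NOT EXISTS fact_messages (
    message_id   BIGINT       ENCODE az64,
    campaign_id  BIGINT       ENCODE az64,  -- join column to dim_campaign
    sent_at      TIMESTAMP    ENCODE az64,  -- common range-filter column
    status       VARCHAR(16)  ENCODE zstd
)
DISTSTYLE KEY
DISTKEY (campaign_id)
COMPOUND SORTKEY (sent_at, campaign_id)
"""

with conn.cursor() as cur:
    cur.execute(ddl)
conn.close()
```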
Senior Data Analyst
North Labs
- Designed a data pipeline solution in Python to load data into Oracle EBS from CSV files.
- Implemented an automated solution to clean CSV files using Python (see the sketch after this list).
- Designed ETL schema using PostgreSQL and QuickSight, providing comprehensive reports on ETL jobs and status.
- Worked on stored procedures and triggers in Oracle EBS to apply validation rules against databases.
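
A minimal sketch of the CSV-cleaning step, assuming hypothetical file paths and column names; the cleaned output is what the Oracle EBS load pipeline would then pick up.

```python
# Sketch: normalize headers, trim whitespace, drop empty rows, and validate
# key columns before handing the file to the load pipeline.
import pandas as pd

def clean_csv(src_path: str, dst_path: str) -> None:
    df = pd.read_csv(src_path, dtype=str)

    # Normalize headers, trim whitespace, and drop fully empty rows.
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    df = df.apply(lambda col: col.str.strip())
    df = df.dropna(how="all")

    # Basic validation examples: required key present, dates parseable.
    df = df[df["item_number"].notna()]
    df["received_date"] = pd.to_datetime(df["received_date"], errors="coerce")

    df.to_csv(dst_path, index=False)

clean_csv("incoming/items_raw.csv", "clean/items_clean.csv")
```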
SQL | Data Expert
Chime Bank
- Created complex SQL queries as needed by finance, compliance, audit, and other departments.
- Structured large datasets more effectively by using SQL best practices and appropriate tools and processes.
- Worked on a cross-border money transfer application that allows end users to link their bank accounts and start transferring money in seconds.
- Structured data flows and built a data warehouse, improving downstream processes and tools.
- Fixed BI problems across a couple of verticals within the company.
MySQL Developer
Captira Analytics
- Checked slow SQL queries and fine-tuned queries for optimal performance.
- Reviewed indexes and MySQL parameters for high workloads and concurrent user connections.
- Wrote stored procedures and functions for reusing and achieving high performance.
- Upgraded MySQL from version 5.7 to version 8 without downtime.
- Migrated almost 100 databases from on-premises MySQL to Rackspace Cloud.
MongoDB Database Administrator
AppViewX
- Implemented sharding in MongoDB for load balancing and high availability.
- Designed and implemented the replication for high availability.
- Tuned MongoDB workloads and optimized the MongoDB configuration for optimal load.
- Implemented backup and recovery methodologies and best practices.
- Migrated MongoDB from on-premises to the cloud with optimal configuration.
Database Architect and Schema Expert
Bang the Table
- Designed data warehousing schema and logical model for survey and data analytic application.
- Designed an OLAP data model using star schema dimensions, facts, and cubes.
- Created an ETL process from a MySQL source database to a target Redshift database.
- Implemented schema design best practices, versioning, and ERD documentation.
- Reviewed the OLTP data model and made sure the best practices were followed.
Database Administrator and Developer
Camp Network
- Migrated on-premises MySQL databases to AWS RDS and solved database performance issues after migration.
- Implemented MySQL replications between EC2 instances and enabled disaster recovery solution.
- Upgraded MySQL 5.6 to 5.7 on AWS RDS without downtime by creating a replica.
Database Developer
Wickfire
- Upgraded MySQL on AWS RDS from 5.7 to 8.0. Fixed performance issues after migration and upgrade.
- Identified performance issues after the MySQL upgrade on AWS RDS by troubleshooting SQL queries and evaluating database parameters.
- Developed an archiving utility using MySQL, which archives data based on dates and custom criteria (see the sketch after this list).
- Developed a schema sync utility that compares the source and target MySQL schemas and generates a sync script to synchronize them.
- Created a MySQL migration assessment tool that generates a comprehensive report highlighting potential schema, data type, and data issues before migration.
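
A minimal sketch of the archiving utility's core loop, assuming hypothetical table names, an id primary key, and a created_at column; batching keeps transactions small on a busy server.

```python
# Sketch: copy rows older than a cutoff to an archive table, then delete them
# from the live table, in small id-based batches. Names are hypothetical.
import mysql.connector  # pip install mysql-connector-python

conn = mysql.connector.connect(
    host="db.example.com", user="archiver", password="secret", database="app"
)
cursor = conn.cursor()

CUTOFF = "2022-01-01"
BATCH = 10_000

while True:
    # Pick the next batch of old rows by primary key.
    cursor.execute(
        "SELECT id FROM orders WHERE created_at < %s ORDER BY id LIMIT %s",
        (CUTOFF, BATCH),
    )
    ids = [row[0] for row in cursor.fetchall()]
    if not ids:
        break

    placeholders = ",".join(["%s"] * len(ids))
    # Copy to the archive table, then remove from the live table.
    cursor.execute(
        f"INSERT INTO orders_archive SELECT * FROM orders WHERE id IN ({placeholders})",
        ids,
    )
    cursor.execute(f"DELETE FROM orders WHERE id IN ({placeholders})", ids)
    conn.commit()

cursor.close()
conn.close()
```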
Senior PostgreSQL Administrator
Salsa Labs
- Designed a robust PostgreSQL high-availability architecture using built-in WAL streaming replication and automated failover via a Linux shell script, achieving read replicas, 99.9% uptime, and minimal data loss during server failures.
- Improved SQL query performance through appropriate indexing, including multi-column indexes, partial indexes, and covering indexes. Used advanced indexing techniques like expression and functional indexes, Bloom filters, and GiST/GIN.
- Implemented connection pooling and efficient caching using PgBouncer and Pgpool to manage and reuse database connections, reducing memory overhead. Optimized memory management through shared_buffers and pg_prewarm.
- Tuned and configured optimal configuration settings like autovacuum, workload management, parallel query execution, and other configuration parameters like effective_cache_size, checkpoint_completion_target, and wal_level.
- Optimized database performance by implementing table partitioning and sharding to efficiently handle large table queries and data distribution through pg_partman and pg_shard, including range, list, and hash partitioning (see the sketch after this list).
- Implemented load balancing for PostgreSQL to improve performance, availability, and scalability using Pgpool. This implementation effectively managed high traffic, read-heavy workloads, and ensured fault tolerance.
- Implemented point-in-time recovery (PITR) in PostgreSQL using pg_basebackup and Barman, allowing the database to be restored to a point in time before a failure and achieving 99.9% availability and uptime.
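
A minimal sketch of the range-partitioning approach using PostgreSQL's declarative partitioning (pg_partman automates the same monthly pattern); table and column names are hypothetical.

```python
# Sketch: a parent table partitioned by month on created_at, with two child
# partitions and a per-partition index. Names are hypothetical.
import psycopg2

conn = psycopg2.connect(
    host="pg.example.com", dbname="app", user="admin", password="secret"
)
conn.autocommit = True

ddl = [
    # Parent table partitioned by a date range on created_at; the partition
    # key must be part of the primary key.
    """CREATE TABLE IF NOT EXISTS events (
           id          BIGSERIAL,
           created_at  TIMESTAMPTZ NOT NULL,
           payload     JSONB,
           PRIMARY KEY (id, created_at)
       ) PARTITION BY RANGE (created_at)""",
    # One child partition per month; old months can be detached and archived.
    """CREATE TABLE IF NOT EXISTS events_2024_01
           PARTITION OF events
           FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')""",
    """CREATE TABLE IF NOT EXISTS events_2024_02
           PARTITION OF events
           FOR VALUES FROM ('2024-02-01') TO ('2024-03-01')""",
    # Index each partition on the columns queries actually filter by.
    "CREATE INDEX IF NOT EXISTS idx_events_2024_01_created ON events_2024_01 (created_at)",
]

with conn.cursor() as cur:
    for stmt in ddl:
        cur.execute(stmt)
conn.close()
```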
Database Optimization Lead
NetSol Technologies
- Performed root cause analysis of performance issues in client production databases.
- Audited SQL code against optimization best practices and for scalability in terms of storage and capacity planning.
- Analyzed blocking and deadlocks and discovered solutions using AWR, ADDM, and ASH report analysis.
- Evaluated database initialization parameters (SGA and PGA) and resources (processor, memory, disks, network) for best performance.
- Provided technical upgrade, load balancing, and high-availability solutions to clients according to their business needs.
- Performed regular database health checks as a preemptive approach to proactively identify areas that needed strengthening.
Lead Database Administrator
Agha Khan University Hospital
- Managed database performance, monitoring, troubleshooting, backup and recovery, and security policies.
- Performed upgrade/patch projects, configuration, capacity planning, and regular database maintenance activities.
- Audited and designed the SQL code for a database to enhance the overall application and database design.
- Ensured 24/7 availability of databases and led end-to-end problem resolution.
- Implemented a Microsoft SQL Server and Oracle cluster environment to achieve load balancing and high availability.
- Created a data pipeline from the EHR system to the legacy Oracle software by defining mapping, validation, and cleansing rules in the staging area and loading the data into production.
Lead Database Administrator, Business Intelligence Developer
Folio3
- Optimized database flows, architecture, SQL queries, stored procedures, and other database code objects for performance tuning and to avoid bottlenecks.
- Collaborated with business and development teams to build database design and models by creating the entity relationship diagram (ERD). Identified anomalies in database design, such as integrity and consistency.
- Participated in developing and implementing data warehousing from scratch and created an ETL process.
- Created scorecards, dashboards, charts, and ad hoc analytics queries using business intelligence and reporting tools.
- Built data migration and upgrade scripts to handle automatic upgrades of database versions.
- Administered, configured, customized, and implemented security in Oracle E-Business Suite R12.
- Worked on integrating Oracle EBS R12 modules with mobile apps using Java, web services, and Oracle JDeveloper.
IBM Cognos Administrator
Royal Cyber
- Led the successful implementation and configuration of the IBM Cognos 9 platform on Windows and Oracle 10g, ensuring the system met business needs and improving the response time of decision-making reports.
- Enhanced the performance of the Cognos BI platform by fine-tuning cache settings and connection pooling configurations. Optimized SQL queries running on Oracle 10g, resulting in reduced report execution times and enhanced user satisfaction.
- Executed security enhancements by implementing robust security measures and data governance policies, ensuring that sensitive data was protected and met compliance standards while providing authorized users seamless access to critical information.
- Designed and automated report scheduling and distribution processes, reducing manual workload for end-users and increasing operational efficiency.
- Led successful version upgrades from Cognos 8 to 9 and system migrations, ensuring a seamless transition to newer versions of Cognos and minimizing downtime or disruption to business operations.
Experience
Survey Analysis
Health Analytics
Business: Health and disease analytics
A data warehousing health analytics project using SSIS for ETL, with XML files as the source and Microsoft SQL Server as the destination. It parses and transforms XML-based data from files and loads it into the destination, using SSIS controls for transformation, lookup, merge, filter, and parsing. A SQL Server Agent job runs the ETL automatically and refreshes the data warehouse. I also created an SSAS cube and SSRS reports (an illustrative sketch follows the responsibilities list).
Responsibilities:
• Designed the data warehouse logical and physical schema in Microsoft SQL Server.
• Wrote stored procedures using T-SQL for ETL to load data.
• Used CLR to compress XML after transformation and before load to reduce network traffic.
• Troubleshot performance issues identified in long-running queries.
• Created an OLAP cube using SSAS with a star schema, dimensions, and fact tables in Microsoft SQL Server.
• Created SSRS reports and dashboard.
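
An illustrative sketch only: the project itself used SSIS, CLR, and T-SQL, but the equivalent parse-and-load flow in Python with pyodbc looks like this; XML tags, table names, and the connection string are hypothetical.

```python
# Sketch: parse XML records and bulk-insert them into a SQL Server staging
# table. All names and the connection string are hypothetical.
import xml.etree.ElementTree as ET
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=sqlserver.example.com;DATABASE=HealthDW;UID=etl;PWD=secret;"
)
cursor = conn.cursor()

tree = ET.parse("incoming/patients.xml")
rows = []
for rec in tree.getroot().iter("PatientRecord"):
    rows.append((
        rec.findtext("PatientId"),
        rec.findtext("DiseaseCode"),
        rec.findtext("VisitDate"),
    ))

# Bulk insert the parsed records into the staging table used by the ETL.
cursor.fast_executemany = True
cursor.executemany(
    "INSERT INTO staging.PatientVisits (PatientId, DiseaseCode, VisitDate) "
    "VALUES (?, ?, ?)",
    rows,
)
conn.commit()
conn.close()
```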
Customer Value Management
Business: Customer Value Management (CVM) platform
A data warehousing project using Pentaho for ETL, with Oracle as the source and PostgreSQL as the destination. I denormalized the database and loaded text-based CSV data from the source to the destination, using transformations with lookup files, filters, and handling of missing ZIP code information. I also used AWS integration to load data from another legacy environment, applied data masking and cleansing before loading into the target, and created a PDI job to run the ETL automatically and refresh the data warehouse. I also built the data mart and BI reporting, pulling data from the data warehouse.
Responsibilities:
• Designed the data warehouse logical and physical schema in PostgreSQL.
• Designed the staging area in Oracle to hold TEMP data.
• Wrote scripts for ETL in PL/SQL, PSQL, and control files to load data.
• Troubleshot performance issues during data loading.
• Scheduled an automatic cron job in Linux for ETL.
• Designed the OLAP star schema, dimensions, and fact tables in Oracle.
• Created reports and a dashboard using Pentaho reports.
• Created auditing schema to keep track of ETL job status.
MongoDB Performance Expert
Education
Master of Science Degree in Computer Science
Islamia University Bahawal Pur - Bahawal Pur
Certifications
AWS Certified Solutions Architect Associate
AWS
AWS Certified Database Specialty
AWS
MCT
Microsoft
MCSA SQL Server
Microsoft
MCP
Microsoft
Skills
Tools
MySQL Performance Tuning, Microsoft Power BI, AWS Glue, JDeveloper, Oracle E-Business Suite (EBS) R12, Toad, Oracle Forms, Periscope, IBM Cognos, Matillion ETL for Redshift, Amazon Elastic MapReduce (EMR), BigQuery, Apache Airflow, Amazon QuickSight, Qualtrics, MongoDB Atlas, SQL Server BI, Amazon Virtual Private Cloud (VPC)
Languages
SQL, SAS, Python, Perl, Snowflake
Frameworks
Hadoop
Paradigms
OLAP, ETL, DevOps
Platforms
Amazon Web Services (AWS), Oracle, Amazon EC2, Unix, Windows, Kubernetes, AWS Lambda, Pentaho, Linux, AIX, Meteor, Google Cloud Platform (GCP)
Storage
Microsoft SQL Server, Database Architecture, Database Administration (DBA), Database Performance, MySQL, SQL Server 2016, PostgreSQL, IBM Db2, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), SQL Server Reporting Services (SSRS), Redshift, MongoDB, SQL Architecture, Amazon S3 (AWS S3), Oracle 11g, Oracle 12c, SQL Server 2014, SQL Server 2008, PL/SQL Developer, PL/SQL, Databases, Amazon DynamoDB, Amazon Aurora, Redis Cache, NoSQL, SQL Server DBA
Other
Logical Database Design, Database Schema Design, Data Warehouse Design, Database Optimization, Data Engineering, Data Warehousing, AWS Certified Solution Architect, ETL Tools, Cloud Migration, Data Strategy, Data Analysis, Big Data, Data Build Tool (dbt), DocumentDB, Google BigQuery, PeopleSoft, WebLogic, Oracle Application R12 EBS, MySQL DBA, Amazon RDS, Computer Science, Oracle EBS, Fivetran, APIs, AS400