Nouman Khalid
Verified Expert in Engineering
Data Engineer and Developer
Nouman is a senior data engineer with over seven years of experience building data-intensive applications, tackling challenging architectural and scalability problems, and collecting and sorting data in data-centric companies. He is helping a news publishing company become the first to fully understand user behavior and make infrastructure more robust, reusable, and scalable. With his solid background, Nouman is eager to take on new challenges and deliver outstanding results.
Preferred Environment
Python 3, Amazon Web Services (AWS), SQL, Data Build Tool (dbt), Snowflake
The most amazing design I've built is a reusable data ingestion framework using AWS.
Work Experience
Senior Data Engineer
Data Kitchens
- Engineered a robust data orchestration pipeline framework from scratch, utilizing Airflow as the orchestrator, ensuring seamless data flow, monitoring, and error handling.
- Leveraged dbt to create optimized transformation processes, enhancing data quality and reliability while reducing processing time by 55% on average.
- Set up and managed multiple data ingestion pipelines for disparate sources, including RDS, NetSuite, HubSpot, and Salesforce, resulting in a 45% reduction in data acquisition time.
- Successfully integrated Snowflake as the central data warehouse, enabling high-performance storage, querying, and scalability, resulting in 30% faster analytics.
- Developed a comprehensive mart layer for BI using Lightdash, creating a user-friendly interface for business analysts to access and analyze data insights promptly.
- Implemented data governance practices to ensure data accuracy, consistency, and compliance, leading to a significant reduction in data-related errors.
- Worked closely with cross-functional teams to define data requirements, troubleshoot issues, and optimize data delivery, resulting in faster project delivery.
Senior Data Engineer
Axel Springer
- Designed and implemented real-time streaming solutions for user engagement data and connected them to the reporting dashboard.
- Created structured dbt models to encapsulate complex data transformations, simplifying code maintenance and contributing to a significant decrease in error rates.
- Orchestrated a seamless migration of custom Python data transformation processes from Apache Airflow to dbt, ensuring consistent and accurate data processing.
- Maintained data quality and integrity, ensuring the data was complete, accurate, consistent, and valuable.
- Managed the real-time dashboard's complete extract, transform, and load (ETL) infrastructure.
- Planned, designed, and supervised projects end to end.
- Integrated third-party application programming interfaces (APIs) to collect advertisement reports.
Principal Data Engineer
NorthBay Solutions
- Participated in developing a product for ingestion, transformation, data lake formation, and dataset visualization.
- Worked on ingestion connectors for Amazon S3, Amazon Redshift, file transfer protocol (FTP) sources, and flat files.
- Transformed and modeled the extracted data using dbt to create structured and optimized datasets for downstream analytics and reporting, enhancing data quality and enabling faster insights.
- Created a scalable architecture for a transcript generator to handle unpredictable loads.
- Migrated a large Oracle server to an Amazon Aurora PostgreSQL database, saving licensing costs.
Senior Data Engineer
NorthBay Solutions
- Created API services on a serverless framework for a cloud-based web app using Node.js 10.x.
- Worked on the database migration from an on-premise Oracle server to an Amazon RDS Aurora PostgreSQL database using AWS Database Migration Service (DMS).
- Executed the ingestion processes on flat files using Amazon Athena, AWS Glue catalog, and AWS Glue crawlers.
- Developed a custom Tableau dashboard as per management requirements.
- Extracted data from tables and flat files from mixed systems, such as Oracle E-Business Suite (EBS) and Amazon S3, using Amazon EMR and Amazon Data Pipeline, and loaded it into an Amazon S3 bucket.
- Created ETL jobs with Talend and migrated data from the Microsoft SQL Server and MySQL server to Amazon Redshift.
Senior Data Engineer
Starzplay
- Designed and created the specifications for a linear over-the-top (OTT) streaming network using AWS media services.
- Performed the ingestion and transformation of on-premise data into Amazon S3 using AWS Glue PySpark and AWS Glue Python shell jobs.
- Defined a data warehouse (DWH) architecture, including dimensional modeling.
- Developed a custom Tableau dashboard as per management requirements.
- Extracted data from tables and flat files from mixed systems, such as Oracle EBS and Amazon S3, using Amazon EMR and Amazon Data Pipeline, and loaded it into an Amazon S3 bucket.
- Created ETL jobs with Talend and migrated data from the Microsoft SQL Server and MySQL server to Amazon Redshift.
Software Engineer
NorthBay Solutions
- Sourced tables and flat files from heterogeneous systems, such as Oracle EBS and Amazon S3, using Amazon EMR and Amazon Data Pipeline and loaded them in the staging area with Amazon Redshift.
- Performed transformations on source tables and built dimensions and facts.
- Created and maintained a serverless architecture using AWS services, including Amazon API Gateway, AWS Lambda, and Amazon RDS.
- Used Amazon Kinesis streams for every event in the system and Amazon Athena to fetch data for the reporting layer.
- Built a 4-tier QlikView Data (QVD) architecture in Qlik Sense to optimize the query performance and minimize the database workload.
Experience
Centralized Educational Platform
Data Warehouse for a Video-on-demand Company
Quantflare
Education
Master's Degree in Computer Science
LUMS - Lahore University of Management Sciences - Lahore, Pakistan
Bachelor's Degree in Computer Science
University of Engineering and Technology, Lahore - Lahore, Punjab, Pakistan
Certifications
AWS Certified Solutions Architect
Amazon Web Services
AWS Certified Solutions Architect Associate
AWS
Skills
Languages
SQL, Python, Snowflake, Bash, Scala, JavaScript, Python 3
Frameworks
Apache Spark, Spark, Hadoop, Serverless Framework, Django
Libraries/APIs
Pandas, REST APIs, Node.js, Amazon EC2 API
Tools
Amazon Simple Queue Service (SQS), AWS Glue, Amazon Athena, Apache Airflow, Amazon Redshift Spectrum, Retool, DataGrip, Jupyter, AWS Step Functions, Amazon Cognito, Amazon CloudWatch, Amazon Simple Notification Service (Amazon SNS), Jenkins, AWS CloudFormation, Tableau, Microsoft Power BI, Amazon Elastic MapReduce (EMR), Stitch Data, Looker, Amazon QuickSight
Paradigms
Serverless Architecture, ETL, Microservices, Automation, Data Science
Platforms
AWS Lambda, Amazon EC2, Amazon Web Services (AWS), Azure, Docker, Databricks, Visual Studio Code (VS Code), Talend, Apache Kafka
Storage
Redshift, Data Pipelines, RDBMS, Database Architecture, MySQL, Databases, PostgreSQL, NoSQL, Relational Databases, SQL Server DBA, MongoDB, Amazon S3 (AWS S3), Amazon DynamoDB, Amazon Aurora, Data Lakes, Google Cloud
Industry Expertise
Healthcare
Other
Data Engineering, Data Warehousing, Amazon RDS, Data Build Tool (dbt), Data Architecture, Lambda Functions, APIs, Big Data, Data Analytics, Data Analysis, Big Data Architecture, Message Queues, Data Transformation, Data Modeling, ELT, English, Query Optimization, AWS SAM, Azure Databricks, Data Governance, BI Reporting, Machine Learning, Analytics, Fivetran, API Gateways, CI/CD Pipelines, Amazon API Gateway, Dagster, Amazon Marketing Services (AMS)