Muhammad Naeem Ahmed
Verified Expert in Engineering
Data Engineer and Software Developer
San Jose, CA, United States
Toptal member since June 18, 2020
Muhammad brings nearly 15 years of IT experience implementing data warehousing solutions. He delivers reliable, maintainable, and efficient code in SQL, Python, Perl, Unix shell, C/C++, and Java. His work helped eBay increase its revenue and Walmart streamline its processes. Muhammad focuses on big data technologies, automating repetitive tasks to improve workflows, and delivering efficient, profitable solutions for clients.
Preferred Environment
Snowflake, Teradata SQL Assistant, DBeaver, Presto, PyCharm, Google Cloud Platform (GCP), MongoDB, Terraform, Google Kubernetes Engine (GKE), Data Build Tool (dbt)
The most amazing...
...project I developed converted buyers into sellers at eBay as part of a hackathon; it boosted overall revenue by 0.1%.
Work Experience
Senior Data Engineer
Stout Technologies
- Created dbt models and workflows that formed a DAG over a luxury reselling company's consignments and listings data.
- Optimized production SQL for throughput.
- Developed queries and built dashboards for business-critical video attributes.
- Ran and monitored dbt commands such as run and test, performing complex joins, aggregations, and calculations (see the sketch below).
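A minimal sketch of scripting and monitoring such dbt runs from Python, assuming the dbt CLI is installed; the model selector names are hypothetical placeholders:

```python
import subprocess

# Minimal sketch: run a dbt command and surface its output.
# Assumes the dbt CLI is on the PATH; model names are hypothetical.
def run_dbt(command, select=None):
    cmd = ["dbt", command]
    if select:
        cmd += ["--select", select]
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)
    if result.returncode != 0:
        raise RuntimeError(f"dbt {command} failed:\n{result.stderr}")

run_dbt("run", select="consignments listings")   # build the models in the DAG
run_dbt("test", select="consignments listings")  # validate the built models
```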
Data Scientist
KuKreationz LLC
- Developed Python integrations with the Make and Airtable APIs to pull marketing-channel and SEO data for proof-of-concept work (see the sketch after this list).
- Wrote Azure Synapse ETL jobs to extract data from various marketing sources, such as Klaviyo, Commercetools, Google Analytics, and GA-4 Cross Domain, into Azure for analysis and decision-making.
- Produced recommendations using statistical techniques, such as the Apriori algorithm, to find opportunities for revenue growth.
- Mentored team members on visualization tools such as Power BI and on analysis in the Azure cloud.
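As an illustration of such an API pull, here is a minimal sketch against Airtable's REST API; the token, base ID, and table name are hypothetical placeholders:

```python
import requests

# Minimal sketch: page through all records in an Airtable table.
# Token, base ID, and table name below are hypothetical placeholders.
AIRTABLE_TOKEN = "patXXXXXXXXXXXXXX"
BASE_ID = "appXXXXXXXXXXXXXX"
TABLE_NAME = "SEO Keywords"

def fetch_all_records():
    url = f"https://api.airtable.com/v0/{BASE_ID}/{TABLE_NAME}"
    headers = {"Authorization": f"Bearer {AIRTABLE_TOKEN}"}
    records, offset = [], None
    while True:
        params = {"offset": offset} if offset else {}
        resp = requests.get(url, headers=headers, params=params)
        resp.raise_for_status()
        payload = resp.json()
        records.extend(payload["records"])
        offset = payload.get("offset")  # Airtable returns an offset token while pages remain
        if not offset:
            return records

rows = fetch_all_records()
print(f"Pulled {len(rows)} records")
```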
Data Analyst/Engineer
Overjet Inc.
- Implemented a MongoDB-to-BigQuery migration using Pub/Sub, Dataflow jobs, and multithreaded Python code for deep learning, enabling insightful data analysis and complex transformations.
- Developed efficient Python code for MongoDB aggregation pipelines (sketched below) and created business reports during the migration to BigQuery.
- Developed Looker Explores and dashboards for business analysis, slicing and dicing by important dimensions.
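A minimal sketch of the aggregation-pipeline pattern, using PyMongo; the connection string, collection, and field names are hypothetical:

```python
from pymongo import MongoClient

# Minimal sketch: a MongoDB aggregation pipeline driven from Python.
# Connection string, collection, and field names are hypothetical.
client = MongoClient("mongodb://localhost:27017")
db = client["analytics"]

pipeline = [
    {"$match": {"status": "completed"}},              # filter to finished documents
    {"$group": {                                      # aggregate by a business dimension
        "_id": "$clinic_id",
        "scan_count": {"$sum": 1},
        "avg_duration_ms": {"$avg": "$duration_ms"},
    }},
    {"$sort": {"scan_count": -1}},                    # rank for reporting
]

for row in db["scans"].aggregate(pipeline):
    print(row)
```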
Senior Data Engineer
Walmart Labs
- Architected, developed, and supported new features in the project’s data flow that calculated cumulative/daily metrics such as converted visitors and first-time buyers on the home and search pages.
- Performed ad hoc analysis of user behavior on sensor and beacon data parsed in Hive.
- Automated the ETL pipeline with Python code that builds SQL on the fly against Hive map columns, cutting the 2-to-3-week development cycle for each new feature (see the sketch after this list).
- Wrote a Hive UDF to replace R for p-value calculation in the Hive pipeline. Supported existing processes and tools, mentored fellow engineers, and triaged data issues for timely resolution.
- Participated in the effort to migrate on-premises jobs to GCP.
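A minimal sketch of the on-the-fly SQL generation, assuming a hypothetical metric config expanded into lookups on a Hive map<string,string> column:

```python
# Minimal sketch: generate Hive SQL from a metric config, reading flags
# out of a Hive map<string,string> column. Table, column, and metric
# names are hypothetical placeholders.
METRICS = {
    "converted_visitors": "event_attrs['converted'] = 'true'",
    "first_time_buyers": "event_attrs['first_purchase'] = 'true'",
}

def build_daily_sql(table="page_events"):
    selects = ",\n  ".join(
        f"SUM(CASE WHEN {cond} THEN 1 ELSE 0 END) AS {name}"
        for name, cond in METRICS.items()
    )
    return f"SELECT event_date,\n  {selects}\nFROM {table}\nGROUP BY event_date"

print(build_daily_sql())
```

With this pattern, adding a new metric becomes a one-line config change rather than a multi-week development cycle.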
Senior Software Engineer
eBay
- Converted Teradata SQL to Spark SQL for a migration project and developed regex-based string-processing UDFs for Spark.
- Wrote Pig, Hive, and MapReduce jobs on user-behavior clickstream data. Automated Unix scripts through crontabs to run analyses, such as first-time buyer counts and conversion metrics, on listings data.
- Prepared data for predictive and prescriptive modeling.
- Built tools and custom wrapper scripts in Python to automate DistCp Hadoop commands and log processing (see the sketch after this list).
- Developed and supported production ETL jobs comprising both Teradata and Hadoop scripts.
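A minimal sketch of such a wrapper, assuming the hadoop CLI is on the PATH; the cluster paths are hypothetical placeholders:

```python
import logging
import subprocess

logging.basicConfig(level=logging.INFO)

# Minimal sketch: wrap the hadoop distcp command and surface its logs.
# Source and target paths are hypothetical placeholders.
def distcp(src, dst, overwrite=False):
    cmd = ["hadoop", "distcp"]
    if overwrite:
        cmd.append("-overwrite")  # standard DistCp flag
    cmd += [src, dst]
    logging.info("Running: %s", " ".join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True)
    for line in result.stderr.splitlines():  # DistCp reports progress on stderr
        logging.info(line)
    if result.returncode != 0:
        raise RuntimeError(f"DistCp failed with exit code {result.returncode}")

distcp("hdfs://clusterA/data/listings", "hdfs://clusterB/data/listings", overwrite=True)
```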
Database Analyst
PeakPoint Technologies
- Performed data modeling and mapping; developed and deployed ETL code and wrote advanced Teradata SQL.
- Developed extended stored procedures, database links, packages, and parameterized dynamic PL/SQL to migrate schema objects per business requirements.
- Designed a logical data model and translated it into a physical data model.
- Developed and deployed to production automated ETL jobs scheduled in the UC4 tool.
Experience
Teradata SQL to Spark SQL Migration Project
Experimentation ETL Code Refactor
Converting Buyers Into Sellers Through Purchase History
Python Wrapper for Hadoop Administrative Commands
Senior Data Engineer
Facebook Watch Data Pipeline Engineer
Senior Developer
• Researching crypto bots available in the market, their technical features, and their trading strategies.
• Developing a Python codebase implementing crypto bot strategies that account for fear-and-greed signals and on-chain analysis of whale activity; writing auto-buy, auto-sell, and portfolio-rebalancing code (see the sketch after this list).
• Deep-diving into crawled web data on crypto news.
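A minimal sketch of the portfolio-rebalancing logic; the holdings, prices, and target weights are hypothetical inputs:

```python
# Minimal sketch: compute the trades that move a portfolio toward target weights.
# Holdings, prices, and weights below are hypothetical inputs.
def rebalance(holdings, prices, target_weights):
    total = sum(qty * prices[asset] for asset, qty in holdings.items())
    trades = {}
    for asset, weight in target_weights.items():
        target_value = total * weight
        current_value = holdings.get(asset, 0) * prices[asset]
        trades[asset] = (target_value - current_value) / prices[asset]  # +buy, -sell
    return trades

print(rebalance(
    holdings={"BTC": 0.5, "ETH": 4.0},
    prices={"BTC": 60000, "ETH": 3000},
    target_weights={"BTC": 0.6, "ETH": 0.4},
))
# {'BTC': -0.08, 'ETH': 1.6}: sell 0.08 BTC, buy 1.6 ETH
```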
Education
Bachelor's Degree in Computer Science
FAST National University - Islamabad, Pakistan
Certifications
Teradata Certified Master V2R5
Teradata
Skills
Libraries/APIs
PySpark, REST APIs, Dask, Azure Blob Storage API, React, Sigma.js, Pandas
Tools
Microsoft Power BI, PyCharm, Teradata SQL Assistant, Erwin, Apache Sqoop, Flume, Oozie, Tableau, BigQuery, GitHub, Microsoft Access, Terraform, Microsoft Excel, Amazon Elastic MapReduce (EMR), Oracle Application Express (APEX), pgAdmin, Amazon EKS, Elastic, Make, Kibana, Prefect, Amazon CloudWatch, Google Analytics, Apache Storm, GIS, SurveyMonkey, Power BI Desktop, Amazon Textract, Cloud Dataflow, Apache Beam, Talend ETL, Informatica ETL, Informatica PowerCenter, Azure HDInsight, Git, Stitch Data, Amazon Elastic Container Registry (ECR), Amazon Elastic Container Service (ECS), AWS Glue, SSAS, Apache Airflow, Google Cloud Composer, Looker, Google Kubernetes Engine (GKE)
Languages
Python, T-SQL (Transact-SQL), Snowflake, JavaScript, Python 3, SQL, Java, R, Bash Script, SQL DML, Scala, MIPS, XML, C#, Perl, Excel VBA, GraphQL, C++
Frameworks
Apache Spark, Presto, Hadoop, Windows PowerShell, Ruby on Rails (RoR), .NET, Streamlit
Paradigms
ETL, Database Design, ETL Implementation & Design, Business Intelligence (BI), MapReduce, DevOps, Azure DevOps, Automation, Fast Healthcare Interoperability Resources (FHIR), HIPAA Compliance, Microservices, Microservices Architecture, Dimensional Modeling
Platforms
Azure, Unix, Hortonworks Data Platform (HDP), Apache Pig, Apache Kafka, Amazon Web Services (AWS), Docker, Kubernetes, Google Cloud Platform (GCP), Databricks, Linux, AWS Lambda, Amazon EC2, New Relic, Microsoft, Kubeflow, Alteryx, HubSpot, Azure Synapse, Oracle, Pentaho, Windows, Salesforce, MapR, Microsoft Fabric, Blockchain
Storage
MySQL, Databases, NoSQL, DBeaver, PL/SQL, Data Pipelines, Amazon DynamoDB, Database Architecture, Database Modeling, Apache Hive, Elasticsearch, Teradata, SQL Server 2014, PostgreSQL, Oracle PL/SQL, Azure SQL, Microsoft SQL Server, SQL Performance, MongoDB, Google Cloud, Database Migration, Cloud Firestore, Data Lakes, Database Administration (DBA), Google Bigtable, SQL Server Integration Services (SSIS), Amazon Aurora, Redshift, JSON, Database Testing, Data Validation, Azure Cosmos DB, Master Data Management (MDM), Azure Active Directory, SQL Server DBA, Data Integration, Database Caching, Data Lake Design, Google Cloud Storage, Service Broker, Azure Cloud Services, Amazon S3 (AWS S3), Oracle 11g, Relational Databases, Teradata Databases
Industry Expertise
Insurance
Other
Data Modeling, Data Warehousing, Data Analysis, Data Architecture, ETL Tools, Data Engineering, APIs, Machine Learning, Big Data, ETL Development, Data Warehouse Design, Unix Shell Scripting, Customer Data, Data, Web Scraping, Azure Databricks, Data Cleansing, Azure Data Factory, Unstructured Data Analysis, Data Visualization, Data Analytics, Image Processing, Data Science, Data Queries, Performance Tuning, Analytics, Reports, Scripting, Inventory Management, Google Cloud Functions, Data Loss Prevention (DLP), ETL Testing, Google Data Studio, Big Data Architecture, Message Queues, Google BigQuery, Azure Data Lake, Database Optimization, SAP, OCR, Artificial Intelligence (AI), Airtable, Dashboards, Query Optimization, Distributed Systems, Systems Monitoring, Cloud Platforms, User Permissions, Data Build Tool (dbt), Data Management, Warehouses, ELT, Timescale, Financial Planning & Analysis (FP&A), Azure Data Lake Analytics, Financial Modeling, Modeling, BI Reporting, Orchestration, Web Analytics, ClickStream, Social Media Web Traffic, Advertising Technology (Adtech), Digital Marketing, Leadership, Risk Management, Critical Thinking, Dashboard Development, Reporting, Transportation & Logistics, Transportation & Shipping, Data Integrity Testing, QGIS, Surveys, Dashboard Design, Survey Development & Analysis, Data Processing, CSV Export, Natural Language Processing (NLP), PDF Scraping, Metabase, Pipelines, Cloud Migration, Data Migration, Data Strategy, Advisory, Consulting, Financial Services, Technical Leadership, Infrastructure, Data-level Security, Full-stack, CI/CD Pipelines, Microsoft Data Transformation Services (now SSIS), Architecture, Data Flows, Scalability, Startups, Git Repo, Back-end Development, Infrastructure as Code (IaC), Machine Learning Operations (MLOps), Amazon Kinesis, Cloud Networking, CSV, Business Analysis, Data Extraction, Data Transformation, Looker Studio, Funnel Analysis, Data Encoding, Service Broker Patterns, SSIS Custom Components, User Experience (UX), User Interface (UI), Healthcare Management Systems, Business Requirements, Stakeholder Management, Streaming Data, Microsoft Azure, DAX, Unidash, Teradata DBA