Muhammad Naeem Ahmed, Unix Shell Scripting Developer in San Jose, CA, United States

Member since July 1, 2018
Muhammad brings nearly 15 years of IT experience implementing data warehousing solutions. He delivers reliable, maintainable, and efficient code in SQL, Python, Perl, Unix shell, C/C++, and Java. His work has helped eBay increase revenue and Walmart improve its processes. Muhammad focuses on big data technologies and on automating redundant tasks to improve workflows, and he understands how to deliver efficient, profitable client solutions.


  • Walmart Labs
    Unix, Spark, Apache Hive, MapReduce, Hadoop, SQL, Python...
  • eBay
    Teradata, Presto DB, Apache Hive, Spark, Hadoop, Python, Databases...
  • PeakPoint Technologies
    Python, Teradata, SQL, T-SQL, PL/SQL, Databases, Data Warehousing...






Preferred Environment

Snowflake, Teradata SQL Assistant, DBeaver, Presto DB, PyCharm

The most amazing...

...project I've developed was a hackathon prototype at eBay that converted buyers into sellers. This effort turned out to be an overall 0.1% revenue booster.


  • Senior Data Engineer

    2018 - PRESENT
    Walmart Labs
    • Architected, developed, and supported new features in the project’s data flow that calculated cumulative/daily metrics such as converted visitors and first-time buyers on the home and search pages.
    • Analyzed parsed sensor and beacon data in Hive for ad hoc analysis of user behavior.
    • Automated the ETL pipeline by using Python to build SQL on the fly against Hive map columns, cutting the 2-3 week development cycle for each new feature.
    • Wrote a Hive UDF to replace the use of R for p-value calculation in the Hive pipeline. Supported existing processes and tools, mentored fellow engineers, and triaged data issues to timely resolution.
    • Participated in the effort to migrate on-premises jobs to GCP.
    Technologies: Unix, Spark, Apache Hive, MapReduce, Hadoop, SQL, Python, Data Warehouse Design, Data Warehousing, Databases, Kubernetes, Customer Data, Data, Data Engineering, Apache Airflow, Data Modeling, Data Pipelines, Web Scraping
  • Senior Software Engineer

    2012 - 2020
    eBay
    • Converted Teradata SQL to Spark SQL for a migration project and developed regex-related string-processing UDFs for Spark.
    • Wrote Pig, Hive, and MapReduce jobs on user-behavior clickstream data. Automated Unix scripts through crontabs to run analyses such as first-time buyer counts and conversion metrics on listings data.
    • Prepared data for predictive and prescriptive modeling.
    • Built tools and custom wrapper scripts in Python to automate DistCp Hadoop commands and log processing.
    • Developed ETL jobs and supported them in production; the jobs entailed both Teradata and Hadoop scripts.
    Technologies: Teradata, Presto DB, Apache Hive, Spark, Hadoop, Python, Databases, Data Warehouse Design, Data Warehousing, AWS, Docker, Customer Data, Data, Data Engineering, Apache Airflow, Data Modeling, Data Pipelines, Web Scraping
  • Database Analyst

    2008 - 2012
    PeakPoint Technologies
    • Performed data modeling and mapping, developed and deployed ETL code, and wrote advanced Teradata SQL.
    • Developed extended stored procedures, database links, packages, and parameterized dynamic PL/SQL to migrate schema objects per business requirements.
    • Designed a logical data model and implemented it as a physical data model.
    • Developed automated ETL jobs scheduled in the UC4 tool and placed them into production.
    Technologies: Python, Teradata, SQL, T-SQL, PL/SQL, Databases, Data Warehousing, Data Warehouse Design, Data, Data Engineering, Data Modeling, Data Pipelines
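One of the entries above mentions writing a Hive UDF to replace R for p-value calculation in an experimentation pipeline. A minimal sketch of what such a UDF typically computes is a two-proportion z-test on conversion counts (shown in Python for illustration; the original would have been a Java Hive UDF, and the choice of test here is an assumption):

```python
import math

# Illustrative two-proportion z-test for an A/B experiment: given
# conversion counts and sample sizes for two variants, return the
# two-sided p-value for the difference in conversion rates.
def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability under the standard normal
    return math.erfc(abs(z) / math.sqrt(2))
```

Packaging this logic as a UDF lets the p-value be computed row by row inside the Hive query itself, removing the round trip through R.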


  • Teradata SQL to Spark SQL Migration Project

    This project involved a detailed analysis of SQL and jobs written in Teradata SQL. The requirement was to convert the entire ETL pipeline for the communications logical data model (CLDM) to Spark; the pipeline had around 200 final tables and around 150 jobs. Because Spark lacked several string functions that Teradata handles natively, I wrote UDFs to cover the regex-related calculations.
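As an example of the kind of gap such a migration hits: Teradata's built-in OTRANSLATE (positional character translation) has no direct Spark SQL equivalent, so it can be replicated as a Python UDF. This is an illustrative sketch, not the original project code:

```python
import re  # re is the usual companion for the other regex UDFs mentioned

# Replicate Teradata's OTRANSLATE: each character of from_chars found in
# source is replaced by the character at the same position in to_chars,
# or deleted when to_chars is shorter.
def otranslate(source, from_chars, to_chars):
    if source is None:
        return None  # UDFs must pass NULLs through
    table = {ord(f): (to_chars[i] if i < len(to_chars) else None)
             for i, f in enumerate(from_chars)}
    return source.translate(table)

# Registration for Spark SQL (requires a running SparkSession):
# from pyspark.sql.functions import udf
# from pyspark.sql.types import StringType
# spark.udf.register("otranslate", udf(otranslate, StringType()))
```

Once registered, the converted Teradata queries can keep calling `otranslate(...)` unchanged.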

  • Experimentation ETL Code Refactor

    Refactored the experimentation Hive ETL so that SQL is generated dynamically by Python from a YAML configuration file in which aggregations can be defined. This was a huge win because it removed the need to make SQL changes and re-test before every release.
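A minimal sketch of config-driven SQL generation along these lines (the config layout, table, and column names are assumptions; in practice the dict would come from `yaml.safe_load` on the YAML file):

```python
# Hypothetical parsed YAML config: one table, its grouping keys, and a
# list of named aggregation expressions.
config = {
    "table": "experiment_metrics",
    "group_by": ["experiment_id", "variant"],
    "aggregations": [
        {"name": "converted_visitors", "expr": "count(distinct visitor_id)"},
        {"name": "first_time_buyers", "expr": "sum(is_first_purchase)"},
    ],
}

def build_sql(cfg):
    """Render a Hive GROUP BY query from the aggregation config."""
    keys = ", ".join(cfg["group_by"])
    aggs = ",\n       ".join(f'{a["expr"]} AS {a["name"]}'
                             for a in cfg["aggregations"])
    return (f"SELECT {keys},\n       {aggs}\n"
            f"FROM {cfg['table']}\nGROUP BY {keys}")

print(build_sql(config))
```

Adding a metric then means appending one line of YAML rather than editing and re-testing hand-written SQL.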

  • Converting Buyers Into Sellers Through Purchase History

    As part of a hackathon, I developed a prototype at eBay for converting buyers into sellers. Based on purchase history and the shelf life of items bought, buyers would be sent recommendations to sell their purchased items at a depreciated cost. The project was built in Teradata, using user sessions and event/transaction-level information to derive recommendations. This effort increased overall revenue by 0.1% and was considered a huge success.

  • Python Wrapper for Hadoop Administrative Commands

    Wrote a detailed Python wrapper on top of Hadoop commands to secure data against unintentional slip-ups. The project was a huge success and became a robust repository of code that mitigated many pain points as more functionality was added.
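The guard-rail idea behind such a wrapper can be sketched as follows (the protected prefixes and function names are assumptions for illustration, not the original code):

```python
import subprocess

# Hypothetical paths that the wrapper refuses to delete.
PROTECTED_PREFIXES = ("/data/prod", "/user/etl")

def build_rm_command(path, skip_trash=False):
    """Assemble an `hdfs dfs -rm -r` command, refusing protected paths."""
    if any(path.startswith(p) for p in PROTECTED_PREFIXES):
        raise PermissionError(f"refusing to delete protected path: {path}")
    cmd = ["hdfs", "dfs", "-rm", "-r"]
    if skip_trash:
        cmd.append("-skipTrash")  # keep the HDFS trash unless told otherwise
    cmd.append(path)
    return cmd

def safe_hdfs_rm(path, skip_trash=False):
    """Run the guarded delete, capturing output for the wrapper's logs."""
    return subprocess.run(build_rm_command(path, skip_trash),
                          capture_output=True, text=True)
```

Centralizing destructive commands behind one checked entry point is what turns ad hoc shell calls into a safe, reusable repository of tooling.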

  • Senior Data Engineer

    Developed ETL for Facebook video pipelines. Created diffs in Phabricator to deploy the addition/removal of video attributes in production. Created deltoid metrics in MDF. Analyzed data and published reports in Unidash.

  • Facebook Watch Data Pipeline Engineer

    Built key video metrics such as genres, trending videos, songs, and movies. Coded in Python and SQL (with GitHub for version control), powered the ETL pipeline, added new features, and tuned the performance of previously written SQL.


  • Languages

    Python, T-SQL, Snowflake, SQL, JavaScript, GraphQL, C++, R, Java
  • Frameworks

    Presto DB, Apache Spark, Hadoop
  • Tools

    PyCharm, Teradata SQL Assistant, Erwin, Sqoop, Flume, Apache Airflow, Oozie, Tableau, Microsoft Power BI
  • Paradigms

    ETL, Database Design, ETL Implementation & Design, MapReduce, Business Intelligence (BI)
  • Platforms

    Azure, Unix, Hortonworks Data Platform (HDP), Apache Pig, Apache Kafka, Docker, Kubernetes, MapR
  • Storage

    MySQL, Databases, NoSQL, DBeaver, PL/SQL, Data Pipelines, AWS DynamoDB, Database Architecture, Database Modeling, Apache Hive, Elasticsearch, Teradata, SQL Server 2014, Oracle PL/SQL, AWS S3, PostgreSQL, Oracle 11g
  • Other

    Data Modeling, Data Warehousing, Data Analysis, Data Architecture, ETL Tools, Data Engineering, APIs, Machine Learning, Big Data, ETL Development, Data Warehouse Design, Unix Shell Scripting, Bash Scripting, AWS, Customer Data, Data, Web Scraping, Microsoft Azure, Unidash


  • Bachelor's degree in Computer Science
    2001 - 2005
    FAST - Islamabad, Pakistan


  • Teradata Certified Master V2R5
