Muhammad Naeem Ahmed, Unix Shell Scripting Developer in San Jose, CA, United States
Member since June 18, 2020
Muhammad brings nearly 15 years of IT experience in data warehousing solution implementation. He delivers reliable, maintainable, efficient code using SQL, Python, Perl, Unix, C/C++, and Java. His work helped eBay increase its revenue, and Walmart improve processes. Muhammad has a strong focus on big data-related technologies, automating redundant tasks to improve workflow, and understanding how to achieve exciting, efficient, and profitable client solutions.

Portfolio

  • Stout Technologies
    Apache Hive, Python 3, Unidash, GitHub, Spark
  • Walmart Labs
    Unix, Spark, Apache Hive, MapReduce, Hadoop, SQL, Python...
  • eBay
    Teradata, Presto DB, Apache Hive, Spark, Hadoop, Python, Databases...

Location

San Jose, CA, United States

Availability

Part-time

Preferred Environment

Snowflake, Teradata SQL Assistant, DBeaver, Presto DB, PyCharm

The most amazing...

...project I've developed was converting buyers into sellers at eBay as part of a Hackathon project. This effort turned out to be an overall 0.1% revenue booster.

Employment

  • Senior Data Engineer

    2020 - PRESENT
    Stout Technologies
    • Managed the Facebook videos pipeline, which carries attribute data such as genre, PG rating, and trending status, using Python and SQL.
    • Optimized production SQL for throughput and quality.
    • Developed queries and built dashboards for business-critical video attributes.
    Technologies: Apache Hive, Python 3, Unidash, GitHub, Spark
  • Senior Data Engineer

    2018 - 2021
    Walmart Labs
    • Architected, developed, and supported new features in the project’s data flow that calculated cumulative/daily metrics such as converted visitors and first-time buyers on the home and search pages.
    • Analyzed Hive sensor- and beacon-parsed data for ad-hoc analysis of user behavior.
    • Automated the existing ETL pipeline with Python to build SQL on the fly into Hive map columns. This shortened the 2-3 week development cycle previously required for each new feature.
    • Wrote a Hive UDF to replace the use of R for calculating p-values in the Hive pipeline. Supported existing processes and tools, mentored fellow engineers, and triaged data issues for timely resolution.
    • Participated in the effort to migrate on-premise jobs to the GCP cloud.
    Technologies: Unix, Spark, Apache Hive, MapReduce, Hadoop, SQL, Python, Data Warehouse Design, Data Warehousing, Databases, Kubernetes, Customer Data, Data, Data Engineering, Apache Airflow, Data Modeling, Data Pipelines, Web Scraping, Relational Databases, Dimensional Modeling, PostgreSQL, DevOps, Google Cloud Platform (GCP), Elasticsearch, ETL, Apache Spark, BigQuery, Google Cloud Composer, Looker
  • Senior Software Engineer

    2012 - 2018
    eBay
    • Converted Teradata SQL to Spark SQL for a migration project. Developed regex-related string processing UDFs for Spark.
    • Wrote Pig, Hive, and MapReduce jobs on user behavior clickstream data. Automated Unix scripts through crontabs to run analyses, such as first-time buyer counts and conversion metrics on listings data.
    • Prepared data for predictive and prescriptive modeling.
    • Built tools and custom wrapper scripts, using Python to automate DistCp Hadoop commands and logs processing.
    • Developed and supported ETL jobs into production. The jobs entailed both Teradata and Hadoop scripts.
    Technologies: Teradata, Presto DB, Apache Hive, Spark, Hadoop, Python, Databases, Data Warehousing, Data Warehouse Design, AWS, Docker, Customer Data, Data, Data Engineering, Apache Airflow, Data Modeling, Data Pipelines, Web Scraping, Relational Databases, Dimensional Modeling, PostgreSQL, DevOps, Google Cloud Platform (GCP), Elasticsearch, ETL, Apache Spark, BigQuery, Unix Shell Scripting
  • Database Analyst

    2008 - 2012
    PeakPoint Technologies
    • Modeled and mapped data, then developed and deployed ETL code. Wrote advanced Teradata SQL.
    • Developed extended stored procedures, DB-link, packages, and parameterized dynamic PL/SQL to migrate the schema objects per business requirements.
    • Designed a logical data model and implemented it as a physical data model.
    • Developed and placed into production automated ETL jobs scheduled in the UC4 tool.
    Technologies: Python, Teradata, SQL, T-SQL, PL/SQL, Databases, Data Warehousing, Data Warehouse Design, Data, Data Engineering, Data Modeling, Data Pipelines, Relational Databases, Dimensional Modeling, DevOps, ETL, Apache Spark

Experience

  • Teradata SQL to Spark SQL Migration Project

    Performed a detailed analysis of the SQL and jobs written in Teradata SQL. The requirement was to convert the entire communications logical data model (CLDM) ETL pipeline to Spark. The pipeline had around 200 final tables and around 150 jobs. Teradata's built-in functions handled many regex-related calculations seamlessly; since Spark lacked equivalents, I wrote UDFs to cover those cases.
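A regex helper of the kind described above might look like the following minimal sketch, written in plain Python. The function name `regexp_substr_nth` and its parameters are hypothetical and are not taken from the original project; the example simply returns the n-th match of a pattern, one behavior a migration could need to reproduce. In a Spark job, a function like this could be wrapped with `pyspark.sql.functions.udf` before being used in a query.

```python
import re
from typing import Optional


def regexp_substr_nth(text: Optional[str], pattern: str, occurrence: int = 1) -> Optional[str]:
    """Return the n-th (1-based) non-overlapping match of `pattern` in `text`.

    Hypothetical helper for illustration only. Returns None when the input
    is None, the occurrence is less than 1, or there are too few matches.
    """
    if text is None or occurrence < 1:
        return None
    # Walk the matches in order and stop at the requested occurrence.
    for index, match in enumerate(re.finditer(pattern, text), start=1):
        if index == occurrence:
            return match.group(0)
    return None
```

Returning None for missing input or too few matches mirrors the usual SQL convention that a function applied to NULL yields NULL, which keeps the helper safe to apply to nullable columns.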

  • Experimentation ETL Code Refactor

    Refactored the Hive ETL so that its SQL is generated dynamically by Python from a YAML configuration file in which the aggregations are defined. This was a huge win because it removed the need to change SQL by hand and re-test it before every release.
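The idea of driving SQL generation from configuration can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the `CONFIG` dictionary stands in for the result of parsing a YAML file, and every key, table name, and column name in it is hypothetical.

```python
from typing import Any, Dict

# Stand-in for a parsed YAML configuration file.
# All keys, table names, and column names are hypothetical.
CONFIG: Dict[str, Any] = {
    "source_table": "experiment_events",
    "group_by": ["experiment_id", "variant"],
    "aggregations": [
        {"name": "events", "func": "count", "column": "event_id"},
        {"name": "revenue", "func": "sum", "column": "order_total"},
    ],
}

# Only these aggregate functions may appear in generated SQL.
ALLOWED_FUNCS = {"COUNT", "SUM", "AVG", "MIN", "MAX"}


def build_aggregation_sql(config: Dict[str, Any]) -> str:
    """Build a GROUP BY query from a configuration dictionary."""
    select_parts = list(config["group_by"])
    for agg in config["aggregations"]:
        func = agg["func"].upper()
        if func not in ALLOWED_FUNCS:
            raise ValueError(f"unsupported aggregate function: {agg['func']}")
        select_parts.append(f"{func}({agg['column']}) AS {agg['name']}")
    select_clause = ",\n  ".join(select_parts)
    group_clause = ", ".join(config["group_by"])
    return (
        f"SELECT\n  {select_clause}\n"
        f"FROM {config['source_table']}\n"
        f"GROUP BY {group_clause}"
    )
```

With this shape, adding a metric means adding one entry under `aggregations` rather than editing a query, which is the property the refactor was after.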

  • Converting Buyers Into Sellers Through Purchase History

    As part of a Hackathon, I developed a code prototype at eBay for converting buyers into sellers. Based on purchase history and shelf life of items bought, buyers would be sent recommendations to sell the purchased items at the depreciated cost. The project was built in Teradata using user sessions and event/transaction-level information to derive recommendations. This effort increased the overall revenue by 0.1% and was considered a huge success.

  • Python Wrapper for Hadoop Administrative Commands

    Wrote detailed Python wrapper code on top of Hadoop commands to protect data against unintentional slip-ups. The project was a huge success and grew into a robust repository of code that eased many pain points as more functionality was added.
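One way such a guard could work is sketched below, assuming a wrapper around `hadoop fs -rm -r` that refuses protected paths and defaults to a dry run. The function names and the `PROTECTED_PATHS` list are hypothetical and are not drawn from the original project.

```python
import posixpath
import subprocess
from typing import List, Sequence

# Hypothetical set of paths that must never be deleted through the wrapper.
PROTECTED_PATHS = ("/", "/user", "/data/warehouse")


def build_safe_rm(path: str, protected: Sequence[str] = PROTECTED_PATHS) -> List[str]:
    """Build a recursive delete command, rejecting risky targets."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    # Collapse '..', trailing slashes, and repeated leading slashes.
    normalized = "/" + posixpath.normpath(path).lstrip("/")
    if any(ch in normalized for ch in "*?["):
        raise ValueError("glob patterns are not allowed")
    if normalized in protected:
        raise PermissionError(f"refusing to delete protected path: {normalized}")
    # -skipTrash is deliberately omitted so that, where trash is enabled,
    # deleted data can still be recovered.
    return ["hadoop", "fs", "-rm", "-r", normalized]


def run(command: List[str], dry_run: bool = True) -> str:
    """Return the command line; execute it only when dry_run is False."""
    if not dry_run:
        subprocess.run(command, check=True)
    return " ".join(command)
```

Defaulting `dry_run` to True means a caller has to opt in explicitly before anything is deleted, so a forgotten argument fails safe.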

  • Senior Data Engineer

    Developed Facebook videos pipelines ETL. Created Diffs in Phabricator to be deployed for adding/removing video attributes in production. Created deltoid metrics in MDF. Analyzed data and published reports in Unidash.

  • Facebook Watch Data Pipeline Engineer

    Built important metrics for videos, covering genres, trending videos, songs, and movies. Coded in Python and SQL, with GitHub for version control, to power the ETL pipeline, add new features, and tune the performance of existing SQL.

  • Senior Developer

    Key Responsibilities:
    • Researching the crypto bots available in the market, including their technical features and trading strategies.
    • Developing a Python codebase to implement effective crypto bot strategies, taking into account fear and greed, on-chain analysis of whale activity, etc. Writing auto-buy, sell, and portfolio balancing code.
    • Deep diving into crawled web data on crypto news.

Skills

  • Languages

    Python, T-SQL, Snowflake, Python 3, SQL, Bash Script, JavaScript, GraphQL, C++, R, Java
  • Frameworks

    Apache Spark, Presto DB, Spark, Hadoop
  • Libraries/APIs

    PySpark
  • Tools

    PyCharm, Teradata SQL Assistant, Erwin, Sqoop, Flume, BigQuery, Apache Airflow, Oozie, Tableau, Google Cloud Composer, Looker, Microsoft Power BI, GitHub
  • Paradigms

    ETL, Database Design, ETL Implementation & Design, MapReduce, DevOps, Dimensional Modeling, Business Intelligence (BI)
  • Platforms

    Azure, Unix, Hortonworks Data Platform (HDP), Apache Pig, Apache Kafka, Docker, Kubernetes, MapR, Google Cloud Platform (GCP)
  • Storage

    MySQL, Databases, NoSQL, DBeaver, PL/SQL, Data Pipelines, Amazon DynamoDB, Database Architecture, Database Modeling, Apache Hive, Elasticsearch, Teradata, SQL Server 2014, Oracle PL/SQL, Amazon S3 (AWS S3), PostgreSQL, Oracle 11g, Relational Databases
  • Other

    Data Modeling, Data Warehousing, Data Analysis, Data Architecture, ETL Tools, Data Engineering, APIs, Machine Learning, Big Data, ETL Development, Data Warehouse Design, Unix Shell Scripting, AWS, Customer Data, Data, Web Scraping, Microsoft Azure, Unidash

Education

  • Bachelor's Degree in Computer Science
    2001 - 2005
    FAST National University - Islamabad, Pakistan

Certifications

  • Teradata Certified Master V2R5
    NOVEMBER 2005 - PRESENT
    Teradata
