Muhammad Naeem Ahmed

Unix Shell Scripting Developer in San Jose, CA, United States

Member since July 1, 2018
Muhammad brings nearly 15 years of IT experience in implementing data warehousing solutions. He delivers reliable, maintainable, and efficient code using SQL, Python, Perl, Unix, C/C++, and Java. His work has helped eBay increase its revenue and Walmart improve its processes. Muhammad focuses on big data technologies and on automating repetitive tasks to improve workflows, and he understands how to deliver efficient, profitable client solutions.

Location

San Jose, CA, United States

Availability

Part-time

Preferred Environment

Snowflake, Teradata SQL Assistant, DBeaver, Presto DB, PyCharm

The most amazing...

...project I've developed converted buyers into sellers at eBay as part of a hackathon. It turned out to be an overall 0.1% revenue booster.

Employment

  • Senior Data Engineer

    2018 - PRESENT
    Walmart Labs
    • Architected, developed, and supported new features in the project’s data flow that calculated cumulative/daily metrics such as converted visitors and first-time buyers on the home and search pages.
    • Analyzed parsed sensor and beacon data in Hive for ad hoc analysis of user behavior.
    • Automated the ETL pipeline by using Python to build SQL on the fly against Hive map columns, shortening the 2-3 week development cycle for each new feature.
    • Wrote a Hive UDF to replace R for calculating p-values in the Hive pipeline (an illustrative sketch follows this list). Supported existing processes and tools, mentored fellow engineers, and triaged data issues for timely resolution.
    • Participated in the effort to migrate on-premises jobs to the Google Cloud Platform (GCP).
    Technologies: Unix, Spark, Apache Hive, MapReduce, Hadoop, SQL, Python
  • Senior Software Engineer

    2012 - 2020
    eBay
    • Converted Teradata SQL to Spark SQL for a migration project. Developed Regex-related string processing UDFs for Spark.
    • Wrote Pig, Hive, and MapReduce jobs on user behavior clickstream data. Automated Unix scripts via crontab to run analyses such as first-time buyer counts and conversion metrics on listings data.
    • Prepared data for predictive and prescriptive modeling.
    • Built tools and custom wrapper scripts in Python to automate Hadoop DistCp commands and log processing.
    • Developed and supported production ETL jobs comprising both Teradata and Hadoop scripts.
    Technologies: Teradata, Presto DB, Apache Hive, Spark, Hadoop, Python
  • Database Analyst

    2008 - 2012
    PeakPoint Technologies
    • Performed data modeling and mapping; developed and deployed ETL code. Wrote advanced Teradata SQL.
    • Developed extended stored procedures, database links, packages, and parameterized dynamic PL/SQL to migrate schema objects per business requirements.
    • Designed a logical data model and implemented it as a physical data model.
    • Developed automated ETL jobs scheduled in the UC4 tool and placed them into production.
    Technologies: Python, Teradata, SQL, T-SQL, PL/SQL
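
The p-value UDF mentioned under the Walmart Labs role above was written against Hive's Java UDF interface; the snippet below is only a rough Python illustration of the underlying calculation, a two-sided two-proportion z-test, with hypothetical names. Registered as a Spark UDF (commented out at the end), the same function could be called from SQL.

import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    if min(n_a, n_b) == 0:
        return None
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return None
    z = (rate_a - rate_b) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability

# Hypothetical registration for use from Spark SQL:
# from pyspark.sql.types import DoubleType
# spark.udf.register("p_value", two_proportion_p_value, DoubleType())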

Experience

  • Teradata SQL to Spark SQL Migration Project (Development)

    Involved a detailed analysis of SQL and jobs written in Teradata SQL. The requirement was to convert the entire communications logical data model (CLDM) ETL pipeline to Spark; the pipeline had around 200 final tables and around 150 jobs. Teradata handles many regex-heavy calculations with built-in functions that Spark lacked at the time, so I wrote UDFs to cover those cases (a sketch follows this list).

  • Experimentation ETL Code Refactor (Development)

    Refactored the experimentation ETL so that its Hive SQL is generated dynamically by Python from a YAML configuration file in which the aggregations are defined (see the sketch following this list). This was a major win because it removed the need to hand-edit SQL and re-test before every release.

  • Python Wrapper for Hadoop Administrative Commands (Development)

    Wrote a detailed Python wrapper around Hadoop commands to protect data against accidental slip-ups (a simplified sketch follows this list). The project was a success and grew into a robust repository of code that removed many pain points as more and more functionality was added.

  • Converting Buyers Into Sellers Through Purchase History (Development)

    As part of a hackathon at eBay, I developed a prototype for converting buyers into sellers. Based on purchase history and the shelf life of items bought, buyers would be sent recommendations to sell the purchased items at a depreciated price. The project was built in Teradata, using user sessions and event/transaction-level information to derive the recommendations. This effort increased overall revenue by 0.1% and was considered a big success.
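
A minimal sketch of the regex UDF work from the Teradata-to-Spark migration above, assuming a Spark version whose built-ins could not return the nth regex match the way Teradata's REGEXP_SUBSTR does; the function, table, and column names are illustrative, not the production code.

import re
from pyspark.sql import SparkSession
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("regex-udf-sketch").getOrCreate()

def regexp_substr_nth(source, pattern, occurrence):
    """Return the nth substring of `source` matching `pattern`, or None."""
    if source is None or pattern is None:
        return None
    matches = [m.group(0) for m in re.finditer(pattern, source)]
    return matches[occurrence - 1] if len(matches) >= occurrence else None

# Register for use from Spark SQL, mirroring the Teradata call sites:
spark.udf.register("regexp_substr_nth", regexp_substr_nth, StringType())
# Example query (hypothetical table and column):
# SELECT regexp_substr_nth(page_url, '[0-9]+', 2) FROM clickstream_events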
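
A minimal sketch of the YAML-driven SQL generation behind the experimentation ETL refactor; the configuration keys, table names, and metric expressions are invented examples, and the real generator also handled Hive map columns and more aggregation types.

import yaml  # PyYAML, assumed available

CONFIG = """
target_table: experiment_daily_metrics
source_table: experiment_sessions
group_by: [experiment_id, variant, visit_date]
metrics:
  converted_visitors: "COUNT(DISTINCT IF(converted = 1, visitor_id, NULL))"
  first_time_buyers: "COUNT(DISTINCT IF(first_purchase = 1, visitor_id, NULL))"
"""

def build_insert_sql(cfg):
    """Render a Hive INSERT ... SELECT statement from a YAML-defined metric list."""
    keys = ", ".join(cfg["group_by"])
    metrics = ",\n  ".join(f"{expr} AS {name}" for name, expr in cfg["metrics"].items())
    return (
        f"INSERT OVERWRITE TABLE {cfg['target_table']}\n"
        f"SELECT {keys},\n  {metrics}\n"
        f"FROM {cfg['source_table']}\n"
        f"GROUP BY {keys}"
    )

if __name__ == "__main__":
    print(build_insert_sql(yaml.safe_load(CONFIG)))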
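
A simplified sketch of the Hadoop command wrapper idea: guarding destructive HDFS operations behind checks. It assumes the hadoop CLI is on the PATH; the protected path prefixes and function names are illustrative only, not the original tool.

import subprocess
import sys

PROTECTED_PREFIXES = ("/apps/prod/", "/warehouse/")  # illustrative, not the real list

def hdfs_rm(path, skip_trash=False):
    """Delete an HDFS path, refusing protected locations and keeping trash by default."""
    if any(path.startswith(prefix) for prefix in PROTECTED_PREFIXES):
        raise ValueError(f"refusing to delete protected path: {path}")
    cmd = ["hadoop", "fs", "-rm", "-r"]
    if skip_trash:
        cmd.append("-skipTrash")
    cmd.append(path)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"hadoop rm failed: {result.stderr.strip()}")
    return result.stdout

if __name__ == "__main__":
    print(hdfs_rm(sys.argv[1]))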

Skills

  • Languages

    Python, T-SQL, Snowflake, Python 3, SQL, JavaScript, C++, R, Java
  • Frameworks

    Presto DB, Spark, Hadoop, Apache Spark
  • Libraries/APIs

    PySpark
  • Tools

    PyCharm, Teradata SQL Assistant, Erwin, Sqoop, Flume, Oozie, Tableau, Microsoft Power BI
  • Paradigms

    ETL, Database Design, ETL Implementation & Design, MapReduce, Business Intelligence (BI)
  • Platforms

    Azure, Unix, Hortonworks Data Platform (HDP), Apache Pig, Apache Kafka, MapR
  • Storage

    MySQL, Databases, NoSQL, DBeaver, PL/SQL, Data Pipelines, AWS DynamoDB, Database Architecture, Database Modeling, Apache Hive, Elasticsearch, Teradata, SQL Server 2014, Oracle PL/SQL, AWS S3, PostgreSQL, Oracle 11g
  • Other

    Data Modeling, Data Warehousing, Data Analysis, Data Architecture, ETL Tools, Data Engineering, APIs, Machine Learning, Big Data, Unix Shell Scripting, Bash Scripting, Microsoft Azure

Education

  • Bachelor's degree in Computer Science
    2001 - 2005
    FAST - Islamabad, Pakistan

Certifications

  • Teradata Certified Master V2R5
    NOVEMBER 2005 - PRESENT
    Teradata
