Hafiz Hamid

San Francisco, CA, United States
Member since March 10, 2018
Hafiz is a seasoned software architect who has led complex software projects for the last 12 years in full-time roles at organizations like Bing (Microsoft), Lyft, and Salesforce.com. He is now pursuing a freelancing career. His areas of expertise are back-end/server development, databases, big data, cloud computing, DevOps, web crawling, and search engines.
Portfolio
Experience
  • SQL, 10 years
  • Web Scraping, 6 years
  • Python, 5 years
  • Big Data Architecture, 5 years
  • Large Scale Distributed Systems, 5 years
  • DevOps, 4 years
  • Pub/Sub, 3 years
  • Amazon Web Services (AWS), 3 years
Availability
Part-time
Preferred Environment
Mac/Linux, Git
The most amazing...
...thing I've built was a real-time streaming data pipeline at Lyft. I built a web crawler to scrape 1 billion pages every day at Bing.com.
Employment
  • Staff Software Engineer (Full-time)
    2015 - 2018
    Lyft, Inc.
    • Worked as the tech lead and architect on the streaming platform team; also drove the vision and strategy.
    • Built the real-time events ingestion and pub/sub infrastructure for Lyft that ingests/moves more than 200 billion events every day.
    • Developed the highly scalable and reliable message bus at Lyft, which is used by hundreds of internal microservices to communicate with each other asynchronously.
    • Maintained multiple tier-0 services with five-nines reliability guarantees/SLAs.
    • Trained and mentored dozens of other engineers.
    Technologies: Python, AWS Cloud (EC2, Lambda, Kinesis, DynamoDB, SQS, S3, Redshift, CloudWatch), Apache Kafka, Apache Flink, Hadoop/Hive
  • Principal Member of Technical Staff (Full-time)
    2014 - 2015
    Salesforce.com
    • Developed several relevancy features which involved customizing Apache Lucene’s scoring framework for Salesforce’s needs.
    • Implemented infrastructure work to enable runtime feature extraction for the training of an ML-based ranker and its integration into Apache Solr’s query processing pipeline.
    • Designed the search infrastructure to scale out Salesforce search’s static rank feature to 100% of documents (currently only partially enabled due to infrastructure limitations).
    Technologies: Java, Apache Solr/Lucene
  • Senior Software Engineer (Full-time)
    2005 - 2014
    Microsoft (Bing Search)
    • Led a team of engineers to develop scalable infrastructure for a distributed web crawler and content extraction platform, enabling it to crawl hundreds of millions of web documents every day from hundreds of websites (like Amazon.com, Imdb.com, Walmart.com) and parse them to extract structured content for enriching Bing’s search index.
    • Received a Microsoft Gold Star Award for the above project.
    • Developed a log mining platform to enrich a local search index; enabled it to algorithmically discover/mine URLs and search keywords associated with local businesses (restaurants, hotels, banks, etc.) by mining search result click logs (petabytes of data). The platform is being used in more than 20 Bing markets to enrich the local search index and cut down the URL coverage gap with Google.
    • Worked both as the technical lead and as an individual contributor to enhance and evolve a machine learning-based text classification framework (originally conceived by Microsoft Research) into a classification platform and integrate it with the local data pipeline.
    • For the above project, developed a process to train, evaluate, and consume statistical models which classify hundreds of millions of local businesses around the world into a taxonomy of more than 1,000 categories.
    • Managed (from a tech-lead standpoint) the day-to-day maintenance and operations of a local data ingestion/processing pipeline that feeds into the index of the Bing local search engine.
    • Worked on the back-end data acquisition/processing pipeline for Bing Entertainment search (music, movies, TV shows, and more).
    Technologies: C#/.NET, Microsoft SQL Server, Hadoop/Hive, Machine Learning
  • Professional Services Consultant (Full-time)
    2005 - 2006
    Teradata Corporation
    • Developed an automated ETL framework for DHL (a Teradata customer) to ingest data from multiple heterogeneous sources and integrate it into an enterprise data warehouse.
    • Led a team of four developers on the Eircom Metadata-driven ETL Tool project, which aimed to develop generic parsing and transformation engines for data extraction from more than 50 different semi-structured CDR formats. (Eircom is Ireland’s leading telecommunication operator.)
    • Conducted Teradata training sessions and data warehouse workshops for new hires.
    Technologies: SQL, Java, Teradata
Experience
  • Lyft, Inc. (Development)

    Technologies: Python, AWS Cloud (EC2, Lambda, Kinesis, DynamoDB, SQS, S3, Redshift, CloudWatch), Apache Kafka, Apache Flink, Hadoop/Hive

  • Salesforce.com (Development)

    Technologies: Java, Apache Solr/Lucene, Search Relevancy

  • Microsoft (Bing Search) (Development)

    I worked on this web crawling and extraction framework.

    Technologies: C#/.NET, Microsoft SQL Server, Hadoop/Hive, Machine Learning

  • Teradata Corporation (Development)

    Technologies: SQL, Java, Teradata

Skills
  • Languages
    Python, SQL, C#.NET, Java, HTML, XML, XPath, XQuery, JavaScript
  • Frameworks
    Hadoop, Scrapy, Machine Learning, Flask, Django
  • Tools
    Amazon SQS, AWS CloudWatch, Zapier, Apache Solr, Flink
  • Paradigms
    DevOps, ETL, Agile Software Development
  • Platforms
    AWS Kinesis, AWS Lambda, AWS EC2, Amazon Web Services (AWS), Apache Kafka, Apache Flink
  • Storage
    AWS DynamoDB, Amazon Kinesis Data Firehose, PostgreSQL, Redshift, AWS S3, Databases, Teradata, SQL Server 2010, Apache Hive, Elasticsearch
  • Other
    Stream Processing, Large Scale Distributed Systems, Pub/Sub, Web Scraping, Data Warehousing, Big Data, Big Data Architecture, Search Engine Development, Information Retrieval, Data Modeling, Text Classification
  • Libraries/APIs
    Apache Lucene
Education
  • Master's degree in Computer Science and Engineering
    2009 - 2011
    University of Washington - Seattle, WA, USA
  • Bachelor's degree in Computer Science
    2001 - 2005
    FAST | National University of Computer and Emerging Sciences - Islamabad, Pakistan
Certifications
  • Teradata Certified Master
    JANUARY 2006 - PRESENT
    Teradata Corporation