Wanlin Pang, Data Processing Developer in Mountain View, CA, United States

Member since May 3, 2015
Wanlin is a research scientist who has worked at GMD in Germany, the NRC of Canada, and NASA Ames, with over 30 published research papers. As a software engineer, he has 10+ years’ experience developing complex artificial intelligence (AI), machine learning (ML), and data science (DS) applications. He collaborates with clients to solve real-world problems for the benefit of real people.


  • Adobe
    Amazon Web Services (AWS), Spark ML, Python, Scala, Azure, AWS...
  • Big Data Analytics, Verizon
    Geohash, Data Science, Machine Learning, Scala, Spark, Text Analytics, Python...
  • Yahoo Labs
    Machine Learning, Hadoop, Text Analytics, Data Science, C++, Data Processing




Preferred Environment

Machine Learning, Artificial Intelligence (AI), Big Data, Hadoop, Google Cloud Platform (GCP)

The most amazing...

...thing I've built is an AI planner-based framework for automatic data processing. It generates data processing pipelines from user specifications and executes them.


  • Senior Machine Learning Engineer

    2017 - 2020
    Adobe
    • Designed the real-time identity service framework and developed the online clustering algorithm for identity stitching (cookies to user-ids).
    • Designed and developed a data insights service on top of a platform.
    • Developed an end-to-end product recommender for e-retailers.
    Technologies: Amazon Web Services (AWS), Spark ML, Python, Scala, Azure, Machine Learning, Spark, Pandas, Natural Language Processing (NLP), Scikit-learn, Data Science, Data Processing
  • Principal Member of Technical Staff

    2015 - 2017
    Big Data Analytics, Verizon
    • Developed demographic classifiers from mobile tracking data, which were deployed to an ad server.
    • Invented a ternary-based geocoding scheme that improves on Geohash in several respects.
    • Developed a machine learning modeling workflow (with Spark ML) as part of a marketing platform.
    Technologies: Geohash, Data Science, Machine Learning, Scala, Spark, Text Analytics, Python, Data Processing
  • Senior Research Engineer

    2008 - 2015
    Yahoo Labs
    • Developed machine learning algorithms and models for Yahoo Front-page content ranking and personalization.
    • Developed machine learning models for categorization of billions of web pages, search queries, creatives, and questions/answers.
    • Developed a variety of data processing and machine learning modeling frameworks and pipelines.
    • Developed a constraint-based data validation framework.
    Technologies: Machine Learning, Hadoop, Text Analytics, Data Science, C++, Data Processing
  • Staff Software Engineer

    2006 - 2008
    Alcatel-Lucent Genesys Lab
    • Designed a hybrid scheduling algorithm combining operations research (OR) and AI approaches.
    • Developed a constraint network library (C++), the core of the scheduling engine.
    • Authored internal technical reports on contact center optimization problems, covering both theoretical analysis and practical solving algorithms.
    Technologies: C++, Constraint Programming, Python
  • Senior Research Scientist

    2001 - 2006
    NASA Ames Research Center
    • Co-developed an AI planner-based software framework for automatic data processing.
    • Developed a constraint network library as a core component of the framework.
    • Conducted research and developed new planning techniques and constraint algorithms for data processing domains.
    Technologies: Java, Artificial Intelligence (AI), Natural Language Processing (NLP), Data Processing


  • Automated Data Processing (Development)

    An AI planner-based software system for automatic data processing. A data processing problem, specified with a high-level domain-specific language, is translated into a planning problem and solved with a constraint-programming (CP) solver. The system has been implemented in Java as a standalone application, as well as a web application via JSP/Servlets and WebService.

    My contributions to the system include i) implementation of the constraint programming solver, including a few new and efficient constraint search and propagation algorithms; ii) overall system design; iii) implementation of a simple user interface.
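The idea of turning a data processing specification into a constraint problem can be illustrated with a minimal sketch. This is not the system described above (which was written in Java with its own domain-specific language and search and propagation algorithms); the `SPEC` format, the step names, and the functions `precedence`, `solve`, and `plan` are all hypothetical, and the solver is plain chronological backtracking. Each step is a variable, its domain is a position in the pipeline, and the constraints say that a step's inputs must be produced earlier.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical specification: each step names the data it needs and produces.
SPEC = {
    "load":      {"needs": [],                           "makes": ["raw"]},
    "calibrate": {"needs": ["raw"],                      "makes": ["calibrated"]},
    "mask":      {"needs": ["raw"],                      "makes": ["cloud_mask"]},
    "composite": {"needs": ["calibrated", "cloud_mask"], "makes": ["image"]},
}


def precedence(spec: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Translate the spec into (before, after) ordering constraints."""
    producer = {d: step for step, io in spec.items() for d in io["makes"]}
    return [(producer[d], step) for step, io in spec.items() for d in io["needs"]]


def consistent(step: str, pos: int, assignment: Dict[str, int],
               constraints: List[Tuple[str, str]]) -> bool:
    """Check one candidate position against the steps already placed."""
    if pos in assignment.values():  # each position holds exactly one step
        return False
    for before, after in constraints:
        if step == before and after in assignment and not pos < assignment[after]:
            return False
        if step == after and before in assignment and not assignment[before] < pos:
            return False
    return True


def solve(steps: List[str], constraints: List[Tuple[str, str]],
          assignment: Optional[Dict[str, int]] = None) -> Optional[Dict[str, int]]:
    """Chronological backtracking over step positions."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(steps):
        return assignment
    step = next(s for s in steps if s not in assignment)
    for pos in range(len(steps)):
        if consistent(step, pos, assignment, constraints):
            result = solve(steps, constraints, {**assignment, step: pos})
            if result is not None:
                return result
    return None  # no position works: backtrack


def plan(spec: Dict[str, dict]) -> Optional[List[str]]:
    """Return the steps in an executable order, or None if no order exists."""
    assignment = solve(list(spec), precedence(spec))
    if assignment is None:
        return None
    return sorted(assignment, key=assignment.get)


if __name__ == "__main__":
    print(plan(SPEC))
```

A specification whose steps depend on each other in a cycle has no valid ordering, so `plan` returns `None` for it. A real planner would also choose which steps to include and would prune domains by propagation rather than checking constraints only at assignment time.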


  • Languages

    Scala, Python, Java, C++
  • Frameworks

    Spark, Hadoop
  • Paradigms

    Constraint Programming, Data Science
  • Other

    Artificial Intelligence (AI), Machine Learning, Geohash, Text Analytics, Data Processing, Natural Language Processing (NLP), AWS
  • Libraries/APIs

    Scikit-learn, Pandas
  • Platforms

    Google Cloud Platform (GCP), Azure, Amazon Web Services (AWS)
