Halim Abbas, Big Data Developer in San Jose, CA, United States

Member since September 30, 2019
Halim is a high-tech innovator who has spearheaded world-class data science projects at tech companies such as eBay and Teradata. Formally educated in machine learning, he has professional expertise spanning information retrieval, natural language processing, and big data. Halim has a proven track record of applying state-of-the-art data science techniques across industry verticals such as eCommerce, web/mobile services, airlines, and biopharma.


  • Cognoa
    Machine Learning, Deep Learning, Data Science, Analytics
  • Teradata
    Python, R, Sklearn
  • eBay
    Hadoop, Java






Preferred Environment

Python, Jupyter, Git

The most amazing project I've worked on is an AI-driven pediatric behavioral health screener.


  • Head of Data Science

    2016 - 2019
    • Recruited, hired, onboarded, and oversaw a data science team.
    • Applied machine learning (ML) and deep learning (DL) to build diagnostic classifiers for pediatric behavioral health conditions.
    • Developed proof points for the efficacy of the product by running properly blinded, sufficiently powered clinical validation studies.
    • Provided timely insights by building and maintaining user analytics pipelines and visualizations.
    Technologies: Machine Learning, Deep Learning, Data Science, Analytics
  • Principal Data Scientist

    2014 - 2016
    • Managed Think Big's data science consultation practice in the West Coast region.
    • Worked on big data science problems across multiple industries, including eCommerce, fintech, biopharma, and medical imaging.
    • Applied ML techniques to use cases such as recommendation engines, customer profiling, churn modeling, predictive analytics, user segmentation, process optimization, next best action detection, and search relevance ranking.
    • Helped close multiple sales and build repeatable consulting relationships with large enterprise customers.
    Technologies: Python, R, Sklearn
  • Senior Research Scientist

    2009 - 2012
    • Led an applied research team.
    • Built eBay’s first machine-learned search relevance ranking engine from the ground up.
    • Worked on machine learning, data mining, auction modeling, user modeling and classification, and click log analysis.
    • Managed multiple research tracks.
    • Grew a team of top-talent researchers.
    • Oversaw intellectual property (IP) processes.
    Technologies: Hadoop, Java
  • Machine Learning Research Scientist

    2009 - 2009
    • Developed an adaptive multimedia search relevance ranking system using machine learning (ML).
    • Experimented with ML ensemble decision trees using TreeNet.
    • Mentored new hires and ramped them up on the experimental framework.
    • Ran A/B testing experiments to produce evidence in support of improvement hypotheses.
    Technologies: Java, TreeNet
  • Research Lead

    2006 - 2008
    Code Green Networks
    • Developed an NLP system to classify documents reliably on live network feeds.
    • Contributed to the production R&D cycle by writing production code and fixing bugs in Java and C.
    • Supervised offline experimentation to develop more efficient algorithms underlying the product features.
    Technologies: Java, TreeNet
  • Research Staff

    2005 - 2006
    Columbia University — CCLS Lab
    • Developed a statistical-rule-based hybrid ML system for the automatic translation of natural language news headlines.
    • Worked on Arabic/English automated translation systems.
    • Applied validation tests and reported incremental improvements using the BLEU score.
    Technologies: Java, NLP


  • Machine Learning Approach for the Early Detection of Autism by Combining Questionnaires and Home Video Screening

    Existing screening tools for early detection of autism are expensive, cumbersome, time-intensive, and sometimes fall short in predictive value. In this work, we sought to apply machine learning (ML) to gold-standard clinical data obtained from thousands of children at risk for autism spectrum disorder to create a low-cost, quick, and easy-to-apply autism screening tool.


  • Paradigms

    Data Science, MapReduce, Functional Programming, Agile Software Development, Microsoft Query
  • Other

    Machine Learning, Artificial Intelligence (AI), Deep Learning, Natural Language Processing (NLP), Big Data, Computer Science, Supervised Learning, Predictive Modeling, Predictive Analytics, Data Analysis, Data Visualization, OOP Designs, Architecture, SVMs, Convolutional Neural Networks, Recurrent Neural Networks, Natural Language Understanding, Natural Language Queries, Unsupervised Learning, Active Learning, Learning Transfer, Object Identification, Computer Vision, LSTM Networks, Clustering, Cluster Analysis, Artificial Neural Networks (ANN), Neural Networks, Statistical Methods, Statistical Analysis, Bayesian Statistics, Statistical Modeling, Sales Forecasting
  • Languages

    Python, Java, SQL, PHP, JavaScript, Objective-C, HTML, R
  • Libraries/APIs

    Sklearn, Matplotlib, TensorFlow, Keras, LSTM, NLTK
  • Tools

    Tableau, Amazon Elastic MapReduce (EMR)
  • Platforms

    iOS, Linux, Mac OS, Databricks, AWS EC2, Amazon Web Services (AWS)
  • Storage

    MySQL, NoSQL, MongoDB, AWS S3


  • Master's degree in Machine Learning
    2004 - 2006
    Columbia University - New York City, NY, USA
  • Bachelor's degree in Computer Engineering
    1998 - 2001
    Carleton University - Ottawa, Canada
