Reema Kumari, Developer in Samastipur, Bihar, India

Reema Kumari

Verified Expert in Engineering

Data Scientist and Developer

Samastipur, Bihar, India
Toptal Member Since
June 9, 2021

Reema has 4+ years of experience in machine learning and data analytics across various projects, such as movie scheduling, cargo space optimization, loan recovery, and digital products. She has focused heavily on natural language processing in the digital product space, and she is proficient in Python and R. Reema is also an incoming student in the UCLA Master of Engineering program, focused on artificial intelligence.



Preferred Environment

Anaconda, Apple Keynote, Excel 365, Python

The most amazing things I've developed are a query completion feature and a question-answer dialog box for a search engine.

Work Experience

Senior Associate, Analytics

2020 - 2020
Zenon AI
  • Optimized loan recovery across various channels for a leading US bank.
  • Used gradient boosting to predict incremental loan recovery amounts from different channels.
  • Consolidated various signal sources in SAS and performed feature selection and modeling in H2O.
Technologies: H2O, XGBoost, SAS, Python 3, Data Wrangling, Client Presentations, Predictive Modeling, Data Analysis

Associate Data Scientist

2018 - 2020
  • Developed a plagiarism detector for a peer review platform, saving $2.5 million in gift card reimbursements and reducing manual moderation time by 50%.
  • Enhanced the user search experience by suggesting typeahead expansions for search queries, leading to around $24 million in retention impact.
  • Identified the most important questions in a given user query and the documents relevant to each question.
  • Used Embeddings from Language Models (ELMo) to derive features for document recommendation.
Technologies: Amazon Web Services (AWS), Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Python 3, TF-IDF, Gensim, Natural Language Toolkit (NLTK), TensorFlow, Stanford NLP, Stanford NER, NetworkX, Text Analytics, Long Short-term Memory (LSTM), Presentations, Spark, Anaconda, Word2Vec, Deep Learning, Data Analysis, Data Wrangling, Data Science, Machine Learning

Analytics Specialist

2016 - 2018
Opera Solutions India Pvt Ltd
  • Developed movie clusters to predict optimal movie schedules for a large UK cinema chain, increasing revenue by 23%.
  • Built an ensemble model using logistic regression and random forest to predict cargo show-up class with an error rate at or below 5%.
  • Developed learning repositories for techniques such as non-negative matrix factorization.
  • Conducted exploratory data analysis to optimize staffing across various projects.
Technologies: Algorithms, R, Python 3, Excel 365, Client Presentations, Clustering, Recommendation Systems, Non-negative Matrix Factorization (NMF), XGBoost, Random Forests, Logistic Regression, K-means Clustering, Exploratory Data Analysis, Predictive Modeling, Data Analysis, Data Science

Courts and Startups

A research project analyzing how the presence of formal courts affects business startups founded by various minority segments. The project identified the significance of courts in facilitating formal contracting and resolving disputes, which ultimately boosts the economy by making it easier to start new businesses.

This work is published as a chapter in the book titled Formal Contract Enforcement and Entrepreneurial Success of the Marginalized. The chapter is titled Opportunities and Challenges in Developments (pp. 171-197).

Adoption of Balanced Use of Fertilizers
As a research assistant, I performed randomized controlled trials comparing how well farmers learned the proper use of fertilizers from peers versus external sources. I studied surveys and consulted directly with farmers to form a hypothesis.

2011 - 2016

Bachelor's Degree and Master's Degree in Economics

Indian Institute of Technology (IIT), Kanpur - Kanpur, India


Pandas, Natural Language Toolkit (NLTK), Matplotlib, Scikit-learn, TensorFlow, LSTM, NetworkX, Beautiful Soup, XGBoost, Stanford NLP


Seaborn, StatsModels, Gensim, Apple Keynote, STATA, Named-entity Recognition (NER), Stanford NER


Python 3, R, Python, C, SAS


Data Science




Anaconda, Amazon Web Services (AWS), H2O


Data Analysis, Economics, Econometrics, Algorithms, Natural Language Processing (NLP), Predictive Modeling, Logistic Regression, Linear Regression, Clustering, Client Presentations, Data Wrangling, Text Analytics, Word2Vec, GloVe, TF-IDF, Recommendation Systems, Gradient Boosting, Machine Learning, Data Modeling, Generative Pre-trained Transformers (GPT), Excel 365, Calculus, Probability Theory, Statistics, Linear Algebra, Data Structures, Data Analytics, Deep Learning, BERT, Non-negative Matrix Factorization (NMF), Random Forests, K-means Clustering, Exploratory Data Analysis, Long Short-term Memory (LSTM), Presentations, Surveys, Data Visualization
