Banu Atav, Developer in Rotterdam, Netherlands

Banu Atav

Verified Expert in Engineering

AI Specialist and Developer

Rotterdam, Netherlands

Toptal member since August 24, 2020

Bio

Backed by a master’s degree in econometrics, Banu is an AI specialist with three years of industry experience building data-driven automation solutions using machine learning and natural language processing. Banu has delivered over ten projects, covering every stage from ideation and proof of concept to solution deployment. Skilled in techniques such as entity extraction, text classification, sentiment analysis, and text summarization, Banu can deliver on your AI requirements.

Portfolio

Bayesia
Python, Pandas, NumPy, Gensim, Amazon S3 (AWS S3)...
Bayesia
Git, Python, PyTorch, TensorFlow, Natural Language Processing (NLP)...
Ciphix
Python, Azure Machine Learning, Google Cloud Platform (GCP), NumPy, Pandas...

Experience

  • Generative Pre-trained Transformers (GPT) - 3 years
  • Machine Learning - 3 years
  • Python - 3 years
  • Natural Language Processing (NLP) - 3 years
  • Git - 3 years
  • Statistics - 2 years
  • Google Cloud Platform (GCP) - 2 years
  • PyTorch - 1 year

Availability

Part-time

Preferred Environment

Visual Studio Code (VS Code), MacOS, Git

The most amazing...

...project I've done is creating a process that integrates machine learning with human employees to improve both the speed and the quality of completed tasks.

Work Experience

Freelance Data Scientist (NLP, ML)

2020 - PRESENT
Bayesia
  • Designed the flow for detecting sections in contracts and matching them against templates using information-retrieval and document-ranking techniques (TF-IDF and shallow neural networks: GloVe).
  • Deployed a Python application as a Flask app on AWS (using Elastic Beanstalk).
  • Augmented a Microsoft Word-to-HTML5 parser (Pandoc) with Python filters to improve block-line sectioning and list structures.
Technologies: Python, Pandas, NumPy, Gensim, Amazon S3 (AWS S3), Amazon Elastic Block Store (EBS), AWS Lambda, Boto 3, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Information Retrieval, Tf-idf, GloVe, HTML5, Pandoc

Freelance Data Scientist (NLP, ML)

2020 - PRESENT
Bayesia
  • Compiled a relevant dataset for a project on media bias detection in news articles, then designed and set up a labeling process for generating the labeled dataset that ensures labeling consistency and quality (using statistics such as kappa).
  • Created a BERT model for the same media bias project that detects six bias types using text classification (Hugging Face, PyTorch).
  • Designed the MVP model architecture for a recommender system in online recruitment, giving the business the right information to guide its future data collection process.
  • Built a website summarization tool using T5 and BERT that scrapes website content using Beautiful Soup; also explored both extractive and abstractive summarization techniques.
Technologies: Git, Python, PyTorch, TensorFlow, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Machine Learning, Amazon Web Services (AWS)
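
As an illustration of the labeling-consistency check mentioned in the bullets above, here is a minimal sketch of an agreement statistic. The profile only says "statistics like Kappa", so the choice of Cohen's kappa for two annotators is an assumption, and the function name and the example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, derived from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations for six news articles (not data from the project).
annotator_1 = ["biased", "neutral", "biased", "neutral", "biased", "neutral"]
annotator_2 = ["biased", "neutral", "neutral", "neutral", "biased", "biased"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

A value near 1 indicates strong agreement beyond chance, while a value near 0 suggests the labeling guidelines may need tightening.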

AI Lead | Machine Learning Specialist

2019 - 2020
Ciphix
  • Developed a machine learning solution for the automatic processing of incoming IT service tickets.
  • Created an entity extraction application that extracts information from text files for further automatic processing.
  • Created an Azure machine learning pipeline for automatic retraining/updating for deployed machine learning models.
Technologies: Python, Azure Machine Learning, Google Cloud Platform (GCP), NumPy, Pandas, Scikit-learn, Hugging Face

Data Quality Consultant

2019 - 2019
Dun & Bradstreet, Rotterdam
  • Assembled a structured dataset from unstructured contractual information (it was text-based and involved multiple languages).
  • Took the initiative to adapt this process into a scalable and automated solution by implementing OCR and PDF scraping algorithms.
  • Transcribed hundreds of contractual data records into Salesforce.
Technologies: R, UiPath

Experience

Web Page Summarizer

Project Goal:
Scrape a web page from a provided URL then select the main text and deliver a summary.

Method:
The source code of the web page was parsed using Beautiful Soup, and we explored several summarizers. We compared four extractive summarizers built with the following packages: Gensim, Summa, NLTK (Natural Language Toolkit), and spaCy. We then implemented two pre-trained abstractive summarizers using Hugging Face's transformers package (which supports both PyTorch and TensorFlow).
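
The extract-then-summarize flow described above can be sketched roughly as follows. This is not the project's code: to stay dependency-free, it swaps in the standard library's html.parser for Beautiful Soup and a simple word-frequency scorer for the Gensim, Summa, NLTK, and spaCy summarizers. All names and the sample page are hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects text inside <p> tags as a rough proxy for the main text."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.depth += 1
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.paragraphs[-1] += data


def main_text(html):
    parser = ParagraphExtractor()
    parser.feed(html)
    return " ".join(p.strip() for p in parser.paragraphs if p.strip())


def summarize(text, max_sentences=2):
    """Extractive summary: keep the sentences with the highest average word frequency."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if len(w) > 3)  # crude stop-word filter

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    top = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    # Restore document order for the selected sentences.
    return " ".join(sentences[i] for i in sorted(top[:max_sentences]))


# Hypothetical page content standing in for a fetched URL.
page = """
<html><body>
<nav>Home | About</nav>
<p>Summarization shortens a document. Extractive summarization selects existing sentences from the document.</p>
<p>The weather was nice.</p>
</body></html>
"""
text = main_text(page)
summary = summarize(text, max_sentences=1)
print(summary)
```

An abstractive summarizer would instead generate new sentences with a pre-trained model, which this sketch does not attempt.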

Legal Contract Matching

I designed the flow for detecting sections in contracts and matching them against templates in the database using information-retrieval and document-ranking techniques (TF-IDF and shallow neural networks). I also served the model and deployed it as a Flask app (with a REST API) on AWS.
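
A minimal sketch of the TF-IDF part of such a matching step is shown below, assuming cosine similarity as the ranking score. The GloVe component and the Flask deployment are not covered, and the function names, the smoothed IDF formula, and the template texts are illustrative assumptions rather than the project's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term weights) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(t for tokens in tokenized for t in set(tokens))
    # Smoothed IDF so that terms found in every document keep a small weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in tokenized]


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def rank_templates(section, templates):
    """Rank template names by cosine similarity to a contract section."""
    names = list(templates)
    # Simplification: IDF is computed over the query and the templates together.
    vectors = tfidf_vectors([section] + [templates[name] for name in names])
    query, candidates = vectors[0], vectors[1:]
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical templates and contract section (not data from the project).
templates = {
    "confidentiality": "The receiving party shall keep all confidential information secret.",
    "termination": "Either party may terminate this agreement with thirty days written notice.",
    "payment": "The client shall pay all invoices within thirty days of receipt.",
}
section = "Each party may terminate the agreement by giving written notice."
ranking = rank_templates(section, templates)
print(ranking[0][0])
```

In a setup like the one described, the top-ranked template would be treated as the match for the detected section, possibly subject to a minimum similarity threshold.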

Media Bias Detection in News Articles

I compiled a relevant dataset and designed and set up a labeling process for generating the labeled dataset. I also created a BERT text classification model with the Hugging Face library to detect six types of bias.

Education

2015 - 2018

Master of Science Degree in Econometrics

Erasmus University Rotterdam - Rotterdam, Netherlands

Certifications

MAY 2021 - PRESENT

Deep Learning Specialization by Deeplearning.AI

Coursera

APRIL 2021 - PRESENT

Full-stack Web Development with Flask

Pirple

APRIL 2019 - PRESENT

Machine Learning with TensorFlow on Google Cloud Platform

Coursera

Skills

Libraries/APIs

TensorFlow, PyTorch, NumPy, Pandas, Scikit-learn, REST APIs

Tools

Git, MATLAB, Azure Machine Learning, Gensim, Amazon Elastic Block Store (EBS), Boto 3, Pandoc

Languages

Python, R, HTML5, CSS

Frameworks

Flask

Platforms

Google Cloud Platform (GCP), MacOS, Visual Studio Code (VS Code), Amazon Web Services (AWS), AWS Lambda

Storage

Amazon S3 (AWS S3)

Other

Machine Learning, Bayesian Statistics, Statistics, Time Series, R Programming, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), UiPath, Information Retrieval, GloVe, Deep Learning, Transformers, Word2Vec, Computer Vision Algorithms, Neural Networks, Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), BERT, Hugging Face, Tf-idf
