Neil du Toit
Verified Expert in Engineering
Software Developer
Cape Town, Western Cape, South Africa
Toptal member since December 14, 2021
Neil is a data scientist specializing in natural language processing, including text classification, summarization, regular expressions, OCR, and data visualization. Most recently, he analyzed court record repositories from 15 different countries. Neil has also developed Django back ends and has experience with many SQL and NoSQL databases. He has worked in data strategy consulting in the insurance and retail sectors.
Portfolio
Experience
- Python - 5 years
- Generative Pre-trained Transformers (GPT) - 5 years
- Natural Language Processing (NLP) - 5 years
- DataViz - 4 years
- Regular Expressions - 3 years
- Elasticsearch - 3 years
- Text Classification - 3 years
- Automated Summarization - 3 years
Preferred Environment
Linux, Vim Text Editor, Slack, Bash, Git, Docker, Virtualenv, SSH, Tmux
The most amazing...
...text classifier I've built classified court judgments from 15 countries, giving thousands of users access to a more powerful legal research tool.
Work Experience
Python Developer
Artbrain Ltd.
- Identified and fixed bugs in the system that were causing it to produce incorrect results.
- Optimized the system to run on downgraded infrastructure and take less time for inference.
- Ran end-to-end testing and profiling of the system.
Senior Python Developer
Henry Stewart
- Developed an MVP from scratch that can extract the required information from medical documents.
- Created a production environment and deployed a production-ready build of the service.
- Worked with the UI developer to integrate the service into the application.
Data Scientist
University of Cape Town
- Extracted references automatically from thousands of court records.
- Created a network graph based on the extracted references and visualized it interactively in the browser.
- Developed a taxonomy and classified all court records according to the taxonomy using machine learning.
- Created an automated summary generator able to summarize every court record in the collection.
- Configured an Elasticsearch search engine to allow for the searching of court records.
Head of Data Division
Q Division
- Developed architecture for churn prediction and customer lifetime value measurement for an insurance provider, requiring the integration of separate systems managing customer acquisition data and on-book customer data.
- Audited third-party service providers for an insurance provider, specifically marketing agencies, and developed pipelines to integrate third-party data into the insurance provider's analytics.
- Developed a Sankey diagram data visualization that comprehensively displayed the customer acquisition channels, the costs associated with each channel, and the value obtained in return.
Data Strategist
Q Division
- Developed a dashboard for a large retailer, including sales trends, KPI tracking, relevant open data, and predictive analytics.
- Created the game-play "quests" for an educational adaptive-learning tablet game that taught financial literacy to children.
- Developed the calculations for a debt repayment calculator application, which provided a breakdown of the snowball and avalanche debt repayment methods.
Experience
Court Precedent Citation Network Graph
https://ojs.law.cornell.edu/index.php/joal/article/view/89
An Evaluation of Four-team-per-contest Swiss Power Paired Tournaments
Artbrain AI
https://www.artbrain.ai/
Education
Bachelor’s Degree in Law and Justice Administration
University of Stellenbosch - Stellenbosch, South Africa
Bachelor’s Degree in Mathematics
University of Cape Town - Cape Town, South Africa
Skills
Libraries/APIs
REST APIs, NumPy, D3.js, Matplotlib, Natural Language Toolkit (NLTK), Pandas, SciPy, PyTorch, TensorFlow, SpaCy
Tools
DataViz, Vim Text Editor, Slack, Git, Virtualenv, Tmux, Seaborn, Microsoft Power BI
Languages
Python, Python 3, SQL, Bash, Falcon, C#, R, Octave, Java, HTML, CSS, JavaScript
Frameworks
Django, Selenium, Flask, Unity
Platforms
Docker, Linux, RapidMiner, Amazon Web Services (AWS)
Storage
MySQL, Elasticsearch, Relational Databases, MongoDB, ArangoDB, Amazon DynamoDB, PostgreSQL
Other
Natural Language Processing (NLP), Regular Expressions, Optical Character Recognition (OCR), Automated Summarization, Law, Tesseract, APIs, Robotic Process Automation (RPA), Document Parsing, Generative Pre-trained Transformers (GPT), Legal Documentation, Data Science, Text Classification, Topic Modeling, SSH, Cython, Mathematics, Economics, Civil Law, Business Law, Machine Learning, Servers, Artificial Intelligence (AI), Data Engineering, Pytesseract, Image Recognition, Web Scraping, Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-3 API, Pipelines, Text Mining, Deployment, Sentiment Analysis, Entity Extraction, Data Mining, Full-stack