
Neil du Toit
Verified Expert in Engineering
Software Developer
Cape Town, Western Cape, South Africa
Toptal member since December 14, 2021
Neil is a data scientist specializing in natural language processing, including classification, OCR, and entity extraction, as well as retrieval augmented generation, including semantic search, image search, image models, reranking, and tabular data. He's developed systems to summarize thousands of commercial lease agreements at a major property company and worked on its internal AI chatbot. Neil has particular expertise in the legal industry and experience in healthcare.
Portfolio
Experience
- Python - 5 years
- Generative Pre-trained Transformers (GPT) - 5 years
- Natural Language Processing (NLP) - 5 years
- DataViz - 4 years
- Regular Expressions - 3 years
- Elasticsearch - 3 years
- Text Classification - 3 years
- Automated Summarization - 3 years
Preferred Environment
Linux, Vim Text Editor, Slack, Bash, Git, Docker, Virtualenv, SSH, Tmux
The most amazing...
...text classifier I've built classified court judgments from 15 countries, enabling thousands of users to access a more powerful legal research tool.
Work Experience
Data Scientist
Jones Lang LaSalle
- Developed a RAG solution to summarize commercial lease agreements, extracting key terms and calculating time periods, with the reference text highlighted in the original lease.
- Developed a solution to digitize tens of thousands of order documents to help the client identify outstanding debtors, leading to significant direct cash ROI.
- Integrated lease summarization into a property portfolio management solution.
- Improved the performance of retrieval in the RAG pipeline for the company's internal central AI assistant.
Python Developer
Artbrain Ltd.
- Identified and fixed bugs in the system that were causing it to produce incorrect results.
- Optimized the system to run on downgraded infrastructure and take less time for inference.
- Ran end-to-end testing and profiling of the system.
Senior Python Developer
Henry Stewart
- Developed an MVP from scratch that can extract the required information from medical documents.
- Created a production environment and deployed a production-ready build of the service.
- Worked with the UI developer to integrate the service into the application.
Data Scientist
University of Cape Town
- Extracted references automatically from thousands of court records.
- Created a network graph based on the extracted references and visualized it interactively in the browser.
- Developed a taxonomy and classified all court records according to the taxonomy using machine learning.
- Created an automated summary generator able to summarize every court record in the collection.
- Configured an Elasticsearch search engine to allow for the searching of court records.
Head of Data Division
Q Division
- Developed architecture for churn prediction and customer lifetime value measurement for an insurance provider, requiring the integration of separate systems managing customer acquisition data and on-book customer data.
- Audited third-party service providers for an insurance provider, specifically marketing agencies, and developed pipelines to integrate third-party data into the insurance provider's analytics.
- Developed a Sankey diagram data visualization that comprehensively displayed the customer acquisition channels, the costs associated with each channel, and the value obtained in return.
Data Strategist
Q Division
- Developed a dashboard for a large retailer, including sales trends, KPI tracking, relevant open data, and predictive analytics.
- Created the game-play "quests" for an educational adaptive-learning tablet game that taught financial literacy to children.
- Developed the calculations for a debt repayment calculator application, which provided a breakdown of the snowball and avalanche debt repayment methods.
Experience
Court Precedent Citation Network Graph
https://ojs.law.cornell.edu/index.php/joal/article/view/89
An Evaluation of Four-team-per-contest Swiss Power Paired Tournaments
Artbrain AI
https://www.artbrain.ai/
Education
Bachelor’s Degree in Law and Justice Administration
University of Stellenbosch - Stellenbosch, South Africa
Bachelor’s Degree in Mathematics
University of Cape Town - Cape Town, South Africa
Skills
Libraries/APIs
REST APIs, NumPy, D3.js, Matplotlib, Natural Language Toolkit (NLTK), Pandas, SciPy, PyTorch, TensorFlow, SpaCy
Tools
DataViz, Vim Text Editor, Slack, Git, Virtualenv, Tmux, Seaborn, Microsoft Power BI
Languages
Python, Python 3, SQL, Bash, Falcon, C#, R, Octave, Java, HTML, CSS, JavaScript
Frameworks
Django, Selenium, Flask, Unity
Platforms
Docker, Linux, RapidMiner, Amazon Web Services (AWS), Azure
Storage
MySQL, Elasticsearch, Relational Databases, MongoDB, ArangoDB, Amazon DynamoDB, PostgreSQL
Other
Natural Language Processing (NLP), Regular Expressions, Optical Character Recognition (OCR), Automated Summarization, Law, Tesseract, APIs, Robotic Process Automation (RPA), Document Parsing, Generative Pre-trained Transformers (GPT), Legal Documentation, Data Science, AI Integration, Large Language Models (LLMs), Document Processing, Text Classification, Topic Modeling, SSH, Cython, Mathematics, Economics, Civil Law, Business Law, Machine Learning, Servers, Artificial Intelligence (AI), Data Engineering, Pytesseract, Image Recognition, Web Scraping, Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-3 API, Pipelines, Text Mining, Deployment, Sentiment Analysis, Entity Extraction, Data Mining, Full-stack, Azure Function App, Azure AI Document Intelligence, RAG Pipelines, RAG Architecture, Data Anonymization