Joao Diogo de Oliveira, Developer in Fortaleza - State of Ceará, Brazil

Joao Diogo de Oliveira

Verified Expert in Engineering

Machine Learning Engineer and Developer

Fortaleza - State of Ceará, Brazil
Toptal Member Since
October 20, 2022

Joao is an AI/ML engineer with more than 14 years of experience at Fortune 100 companies like Procter & Gamble and Hearst and startups in the healthcare, energy, and finance industries. Joao holds a master's degree in computer engineering from the University of Porto and has multiple certifications in ML and deep learning.


Preferred Environment

Python 3, PyTorch, TensorFlow, R, Machine Learning, Google Cloud Platform (GCP), Amazon Web Services (AWS)

The most amazing...

...project I've led is forecasting power generation for over 300 wind and solar farms in a record time of 1.5 months.

Work Experience

MVP Developer

2023 - PRESENT
Hearst - Technology
  • Developed a successful MVP that demonstrated how easily a legacy system could be replaced within 3-4 weeks.
  • Used generative AI (GPT-3.5, GPT-4) and other frameworks and libraries (LangChain and LlamaIndex) to extract structured data from unstructured data, achieving a success rate of up to 98%.
  • Researched the newest trends in generative AI and drove their adoption across a broad audience. These included, but were not limited to, the newest models, like GPT-4, Turbo, Gemini, Claude, and multimodal models, and the newest frameworks, like LlamaIndex, LangChain, and AutoGPT.
  • Planned and built working pipelines for training and inference so they could be used seamlessly.
Technologies: Python, Artificial Intelligence (AI), Generative Pre-trained Transformers (GPT), AgentGPT, Generative Artificial Intelligence (GenAI), Google Cloud Platform (GCP), Azure, Gemini, AI Agents, Information Extraction, Generative AI, Large Language Models (LLMs), Data Science, Natural Language Processing (NLP), Amazon Web Services (AWS), OpenAI, Multimodal Models, Multimodal GenAI, AI Prompts

AI Developer

2022 - PRESENT
Peyton & Greyson Solutions Inc
  • Developed an AI application for writing proposals automatically, saving at least 20% of a specialized employee's time.
  • Designed and architected the entire IT solution: a) database selection and design; b) AWS serverless services; c) selection and setup of the web app back-end implementation; d) API configuration; e) complete AI model development and deployment with AgentGPT.
  • Tracked team members' development and ensured that milestones were met, from demos to critical development deliverables.
  • Successfully tailored the GPT-3 model to a specific business case.
Technologies: Artificial Intelligence (AI), AI Design, Generative Adversarial Networks (GANs), Language Models, OpenAI, APIs, Backendless, Amazon Web Services (AWS), AWS Lambda, Amazon RDS, Python, DaVinci, Large Language Models (LLMs), Models, AI Programming, Natural Language Understanding (NLU), Matplotlib, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Information Extraction, GitHub, Cloud Platforms, Data Pipelines, Early-stage Startups, Data Processing, Data Transformation, Back-end, ChatGPT, OpenAI GPT-3 API, Generative Pre-trained Transformer 3 (GPT-3), DevOps, Amazon SageMaker, Jupyter Notebook, OpenAI GPT-4 API, Kubernetes, Scraping, Analytics, Keras, Sentiment Analysis, Generative AI, Data Structures

IT Engineer | Artificial Intelligence Engineer

2019 - PRESENT
Freelance Clients
  • Developed an artificial intelligence (AI) project for energy prediction of solar and wind farms totaling 2.6 GW of installed power.
  • Built a computer vision model for face recognition.
  • Created a computer vision model to aid pneumonia detection from X-ray images.
  • Provided consulting services to deliver wind certification for two offshore projects with a combined predicted installed power of 2 GW.
  • Maintained over 20 distributed Linux servers, handling updates and security and creating key performance indicators (KPIs).
Technologies: Python 2, Python 3, Deep Learning, Statistics, Data Analytics, Python, Data Science, Deep Neural Networks, Big Data Architecture, Linux, Datasets, Pandas, Machine Learning Operations (MLOps), Image Processing, Hardware, Large Language Models (LLMs), Models, AI Programming, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Data Processing Automation, Artificial Intelligence (AI), Image Generation, ARIMA, ARIMA Models, LSTM, SARIMA, R, Matplotlib, Information Extraction, GitHub, Cloud Platforms, Data Pipelines, Energy, Neural Networks, Regression Modeling, Data Processing, Data Transformation, CSV, Data Analysis, Back-end, DevOps, Amazon SageMaker, Jupyter Notebook, Speech Recognition, Scraping, Analytics, FFmpeg, Keras, Sentiment Analysis, Image Recognition, TensorFlow, PyTorch, Computer Vision, Generative AI, OpenAI, Speech to Text, Speech to Intent

Product Owner | Country Manager

2017 - PRESENT
  • Developed AI models, including deep learning, weather forecasting, and energy prediction models, for multiple markets.
  • Performed business and data analytics for customers.
  • Led the successful establishment of a European institute in Brazil.
  • Managed a portfolio of clients with a combined production of over 3 GW of energy.
Technologies: Deep Learning, Artificial Intelligence (AI), Machine Learning, Data Analytics, Data Science, Data Visualization, Linux, Datasets, Pandas, Amazon Web Services (AWS), Python, Hardware, Models, Matplotlib, Information Extraction, GitHub, Early-stage Startups, Energy, Neural Networks, Data Transformation, CSV, Data Analysis, Back-end, DevOps, Workshop Facilitation, Analytics, Sentiment Analysis, Image Recognition

Managing Director

2013 - PRESENT
Niway Group
  • Managed daily operations of the group's investments, including a shopping mall, business towers, and representation before official government bodies.
  • Turned a seven-year run of losses into profit by applying substantial and lasting changes.
  • Supervised the financial control of the construction of three towers, 12 floors each, with a total cost of R$ 43 million.
Technologies: Team Leadership, Finance, Data Science, Data Visualization, Python, Real Estate, CSV, Data Analysis, CTO, Workshop Facilitation, Analytics

Machine Learning Developer

2023 - 2023
EIS - Main
  • Conducted a feasibility study and implemented a proof of concept (POC) for capturing, counting, and geolocating valves in a point cloud scan of an oil and gas plant.
  • Developed an AI model to identify valves in batches of images from a plant scan.
  • Implemented a method to process and slice point cloud data automatically, extracting images and transforming them into 2D.
Technologies: Machine Learning, Computer Vision, Deep Learning, Convolutional Neural Networks (CNN), Artificial Intelligence (AI), Point Clouds, Point Cloud Data, Image Processing, Natural Language Processing (NLP), Python, TensorFlow, PyTorch, Audacity

Team Leader

2023 - 2023
Stop the Traffik
  • Analyzed the most fundamental tech issues in a volunteer organization and proposed a plan to tackle them with a team of 11 volunteers spread across nine countries.
  • Led a team of ML/AI specialists to develop a sentiment analysis model that automates the analysis and classification of trafficking articles, removing the manual labor currently applied.
  • Guided and led a team of ML/AI specialists to improve the legacy model that classified articles as relevant or non-relevant for the organization.
  • Steered the project through regular meetings to sustain engagement and deliver the proposed outcomes to the organization. Participated in all parts of development (AI, DevOps, Python) to make sure that commitments were met and delivered.
Technologies: IBM Cloud, Amazon SageMaker, Kubernetes, Data Science, Python, Artificial Intelligence (AI), IBM Cloud Platform

NLP Engineer

2023 - 2023
Mercatus Center at George Mason University - Main
  • Developed a long-text classifier for documents across 96 labels. The purpose was to use different NLP techniques to estimate probabilities for three-digit NAICS codes.
  • Explored the literature on the most advanced techniques for text and long-text classification and applied them. Combined the different techniques to achieve a better result, improving the F1 score by 15%.
  • Used AWS SageMaker to provide an effective and insightful training and inference pipeline.
  • Achieved F1 scores of up to 0.95-0.98 on some categories; on others, using different techniques raised the F1 score from 0.4 to 0.7.
Technologies: Natural Language Processing (NLP), Python, Generative Pre-trained Transformers (GPT), NLPP, Deep Neural Networks, Amazon SageMaker, Transformers, Data Science, Artificial Intelligence (AI), TensorFlow

Engineering Manager

2012 - 2013
Procter & Gamble
  • Implemented multiple line update projects across plants in France, Italy, and Spain.
  • Developed cost-saving solutions and deployed them across multiple factories.
  • Led technical discussions with suppliers to make sure they would meet the requirements.
Technologies: Agile, Project Design & Management, Process Management, APIs, Linux, Hardware, Supply Chain Management (SCM), Supply Chain Optimization, SARIMA, Data Processing, Data Analysis, Workshop Facilitation

Supply Chain Leader

2009 - 2012
Procter & Gamble
  • Led the design and implementation of a global pilot project to remodel the company's logistics sector.
  • Solved complex inventory cost problems, reducing costs from $12 million to $7 million.
  • Participated in creating an internal cross-docking supply chain prototype, resulting in yearly savings of $2 million.
  • Coached, guided, and coordinated the work of multiple team members.
Technologies: Project Design & Management, Logistics, Agile, Forecasting, Data Science, Datasets, Supply Chain Management (SCM), Supply Chain Optimization, Data Processing, Data Analysis, Workshop Facilitation

NLP in Healthcare | Score Clinical Patient Notes
A project to classify each patient's probable disease from actual notes taken by doctors during clinical trials. My task was to build a natural language processing (NLP) model on top of the RoBERTa foundation model.

CV: X-ray Pneumonia Detection
A computer vision model that receives an X-ray image, detects the presence of foreign tissue, and predicts whether the image belongs to a patient with pneumonia. The model performed similarly to a trained physician, with a precision of 86% (no pneumonia) and 19% (pneumonia).

Power Generation Forecast for Wind and Solar Farms

A power generation forecast for over 300 wind and solar farms spread across Portugal. I analyzed each plant's geolocation and wind and solar profiles, structured all the data, built an ensemble of around five models per farm, and trained and deployed the models.
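
As a rough illustration of the per-farm ensemble idea, the sketch below averages the forecasts of several models for a single farm. The simple averaging rule, the function name `ensemble_forecast`, and the toy values are assumptions made for illustration only; the models and the combination method actually used in the project are not described here.

```python
from statistics import fmean


def ensemble_forecast(model_forecasts):
    """Average several models' forecasts step by step (hypothetical helper).

    model_forecasts: list of equally long lists, one per model, each holding
    the predicted power (e.g. in MW) for consecutive time steps.
    """
    if not model_forecasts:
        raise ValueError("at least one model forecast is required")
    horizon = len(model_forecasts[0])
    if any(len(forecast) != horizon for forecast in model_forecasts):
        raise ValueError("all forecasts must cover the same horizon")
    # zip(*...) groups the predictions of all models for the same time step.
    return [fmean(step) for step in zip(*model_forecasts)]


# Toy forecasts from three hypothetical models for one farm (values in MW).
forecasts = [
    [10.0, 12.0, 9.0],
    [11.0, 13.0, 8.0],
    [9.0, 11.0, 10.0],
]
combined = ensemble_forecast(forecasts)
```

A plain average is only one option; weighting each model by its validation error is a common alternative.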

Computer Vision - Face Detection

A computer vision model, built with ML techniques, that performs video-based facial recognition. I was instrumental in building the model and the necessary pipeline from the start. Additionally, I achieved a false acceptance rate (FAR) of around 10^-5, meeting clients' needs.

Developing AI Automated Proposal Generation

The application automates proposal writing. The idea was to develop a model, and a web application to support it, that would save specialized employees at least 20% of their time. I developed a working AI model based on GPT-3. I also designed and developed the structure and architecture of the web application, building most of the back-end functions and the entire database architecture.

CV: Image Captioning - Identifying Objects and Writing Caption

Developed a machine learning model that, through deep learning networks, analyzes images, identifies objects, and captions the images accordingly. The project achieved a BLEU-1 score of 0.679 for image captioning; a score of 0.6-0.7 is considered best in class.
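
For readers unfamiliar with the metric, BLEU-1 (the unigram variant of the BLEU metric) is clipped unigram precision multiplied by a brevity penalty. The sketch below computes it for a single caption against a single reference. It is a minimal illustration, not the project's evaluation code: the function name `bleu1`, the whitespace tokenization, and the example captions are assumptions.

```python
import math
from collections import Counter


def bleu1(candidate, reference):
    """Sentence-level BLEU-1 against one reference (illustrative helper)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    ref_counts = Counter(ref)
    cand_counts = Counter(cand)
    # Clip each candidate word count by its count in the reference.
    clipped = sum(min(n, ref_counts[word]) for word, n in cand_counts.items())
    precision = clipped / len(cand)
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * precision


# Hypothetical generated caption and reference caption.
score = bleu1("a dog runs on the grass", "a dog is running on the grass")
```

Reported BLEU scores are usually computed over a whole test set with several references per image, so a single-sentence value like this one is only indicative.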

Email NLP/NLU/NER Analysis

A project that uses advanced NLP techniques to extract insights from emails: classifying them within a set of predefined categories (achieving an overall accuracy above 83%), extracting important information from the text, and performing data analysis, summarization, and other relevant tasks.

Surgery Assistance Software

A piece of AI software that could perform voice recognition, interpret commands, and recognize the tools needed at a specific moment in surgery. On top of that, the AI predicted, based on historical information, the order in which the tools would be needed during the surgery.

I designed and implemented the architecture of the software, achieving an MVP.

Education

2003 - 2009

Master's Degree in Computer Science

University of Porto - Porto, Portugal

2007 - 2008

Exchange Program Coursework Toward Master's Degree in Computer Science

Delft University of Technology - Delft, Netherlands


Quantum Excellence Certificate

IBM | Qiskit Global Summer School 2022


AI for Healthcare



Machine Learning

Stanford University


Deep Reinforcement Learning



Advanced Computer Vision - Machine Learning



PyTorch, TensorFlow, Scikit-learn, Pandas, LSTM, Matplotlib, Keras, OpenCV, PyTorch Lightning, FFmpeg


GitHub, Amazon SageMaker, ChatGPT, You Only Look Once (YOLO), NLPP, Audacity, AI Prompts


Python 3, SQL, Python, R, Python 2, C++


Data Science, Agile, DevOps, Anomaly Detection


Linux, Amazon Web Services (AWS), Jupyter Notebook, Google Cloud Platform (GCP), Kubernetes, Docker, Azure, Backendless, AWS Lambda, IBM Cloud Platform


Data Pipelines, PostgreSQL, MySQL


Machine Learning, Deep Learning, Data Structures, Artificial Intelligence (AI), Algorithms, Team Leadership, Project Design & Management, Computer Vision, BERT, Natural Language Processing (NLP), Deep Neural Networks, Datasets, Language Models, OpenAI, Image Processing, Hardware, Large Language Models (LLMs), Models, AI Programming, Data Processing Automation, Real Estate, ARIMA, ARIMA Models, Supply Chain Management (SCM), Supply Chain Optimization, Forecasting, Information Extraction, Energy, Neural Networks, Regression Modeling, Data Processing, Data Transformation, CSV, Data Analysis, Generative Pre-trained Transformers (GPT), Back-end, Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-4 API, Workshop Facilitation, Analytics, Convolutional Neural Networks (CNN), Sentiment Analysis, Generative AI, Data Analytics, Process Management, Logistics, Statistics, Computer Vision Algorithms, Data Visualization, Big Data Architecture, Machine Learning Operations (MLOps), Generative Adversarial Networks (GANs), DaVinci, SARIMA, Natural Language Understanding (NLU), Hugging Face, Cloud Platforms, Early-stage Startups, Generative Artificial Intelligence (GenAI), Web Development, Word Embedding, OpenAI GPT-3 API, API Integration, Speech Recognition, Scraping, Facial Recognition, Image Recognition, Speech to Text, Finance, Quantum Computing, Healthcare IT, Deep Reinforcement Learning, APIs, Object Detection, Generative Models, AI Design, Amazon RDS, Image Generation, CTO, Transformers, IBM Cloud, Prompt Engineering, Qiskit, AgentGPT, Point Clouds, Point Cloud Data, Gemini, AI Agents, Speech to Intent, Multimodal Models, Multimodal GenAI, AI Translation
