Joao Gabriel Oliveira, Developer in Lisbon, Portugal

Joao Gabriel Oliveira

Verified Expert in Engineering

Data Engineer and Developer

Location
Lisbon, Portugal
Toptal Member Since
December 23, 2022

Joao is an experienced, challenge-driven, and detail-oriented software architect specializing in data engineering and machine learning. He is highly skilled in designing and implementing end-to-end data solutions that combine efficient processing, complex querying capabilities, and insight extraction. With solid foundations in computer science and math and excellent communication and teamwork skills, Joao can deliver value to the most challenging data-driven projects.

Portfolio

Izea
Apache Spark, Elasticsearch, Python, Amazon SageMaker...
Izea
Apache Spark, Elasticsearch, Python, Amazon SageMaker, Data Modeling...
TapInfluence
Scala, Spark, Java, Spring, AWS Cloud Architecture, SQL, Terraform, PostgreSQL...

Experience

Availability

Part-time

Preferred Environment

Python, Amazon Web Services (AWS), Visual Studio Code (VS Code), Slack, Jira, GitHub

The most amazing...

...contribution I've made was modeling and coordinating the development of a social media data platform with over 10 million profiles and 1.5 billion posts.

Work Experience

Data Manager

2022 - PRESENT
Izea
  • Increased the average team velocity by up to 50% through process improvements, better planning, and knowledge-sharing initiatives.
  • Supported multiple teams by setting the technological direction of the company's social data platform.
  • Led and managed a team of five members, including back-end engineers, data engineers, and data scientists.
Technologies: Apache Spark, Elasticsearch, Python, Amazon SageMaker, Amazon Elastic MapReduce (EMR), Jira, Data Modeling, Data Lakes, Spark Streaming, Amazon Kinesis, AWS Cloud Architecture, SQL, Data Engineering, PySpark, ETL, PostgreSQL, Data Architecture, Big Data Architecture, Data Warehousing, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), JavaScript, NoSQL, Amazon Web Services (AWS), Data, Docker, CI/CD Pipelines, DevOps, Data Visualization, Machine Learning, Spark, Programming, Semantics, Data Pipelines, Amazon Elastic Container Service (Amazon ECS), ETL Implementation & Design, Amazon S3 (AWS S3), Business Intelligence (BI), APIs, Reporting, Databases, Data Transformation, Database Architecture, Database Design, Data Science, Architecture

Senior Data and Machine Learning Engineer

2018 - 2022
Izea
  • Modeled and coordinated the deployment of a social media data platform with over 10 million profiles and 1.5 billion posts, integrating batch and stream processing technologies with an intricate index schema design.
  • Spearheaded the execution of machine learning models as a core platform element, employing a combination of textual feature extraction and various regression and classification algorithms to forecast audience demographics.
  • Served as the main point of contact between the product and data teams in specifying data product requirements.
  • Supported the launch of three new products by leading the implementation process.
Technologies: Apache Spark, Elasticsearch, Python, Amazon SageMaker, Data Modeling, Data Lakes, Spark Streaming, Amazon Kinesis, AWS Cloud Architecture, SQL, Data Engineering, PySpark, ETL, PostgreSQL, Data Architecture, Big Data Architecture, Data Warehousing, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), JavaScript, NoSQL, Amazon Web Services (AWS), Data, Docker, CI/CD Pipelines, DevOps, Data Visualization, Machine Learning, Spark, Programming, Semantics, Data Pipelines, Amazon Elastic Container Service (Amazon ECS), ETL Implementation & Design, Amazon S3 (AWS S3), Business Intelligence (BI), APIs, Reporting, Databases, Data Transformation, Database Architecture, Database Design, Data Science, Architecture

Senior Software Architect

2017 - 2018
TapInfluence
  • Coordinated the system's shift to a microservices architecture with a keen focus on analytics and search components.
  • Led the effort to improve relevance scoring in the product search functionality.
  • Assisted the VP of engineering with critical technological and architectural decisions.
Technologies: Scala, Spark, Java, Spring, AWS Cloud Architecture, SQL, Terraform, PostgreSQL, NoSQL, Amazon Web Services (AWS), Data, Docker, CI/CD Pipelines, DevOps, Data Visualization, Programming, Data Pipelines, Amazon Elastic Container Service (Amazon ECS), ETL Implementation & Design, Amazon S3 (AWS S3), APIs, Databases, Database Architecture, Database Design, Architecture

Senior Software Architect | Partner

2010 - 2017
Amtera Semantic Technologies
  • Implemented diverse product concepts and components using state-of-the-art research on semantic search and natural language processing.
  • Established robust and effective in-house solutions for various important clients in the telecom, oil and gas, and IT security industries.
  • Collaborated with other partners to implement the overall company strategy.
Technologies: Python, MongoDB, Elasticsearch, Linked Data, Semantics, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Data Modeling, SQL, Scala, Data Engineering, ETL, PostgreSQL, Data Architecture, Big Data Architecture, NoSQL, Data, Data Visualization, Machine Learning, Spark, Programming, ETL Implementation & Design, APIs, Databases, Database Architecture, Database Design, Data Science, Architecture

Core Social Media Data Platform for Izea

A cloud-based data platform using a data lake architecture to ingest, process, index, and query social media data, covering full-text search and analytics use cases.

I was one of the key architects and also played a central role in implementing the platform.

NLP-based Machine Learning Models for Izea

A suite of machine learning models that used pipelines of word embeddings and other supervised and unsupervised approaches to predict the audience demographics of social media profiles.

I was the principal architect and developer of the project.

Education

2006 - 2012

Bachelor's Degree in Computer Science

Federal University of Rio de Janeiro | UFRJ - Rio de Janeiro, Brazil

Certifications

AUGUST 2022 - PRESENT

Machine Learning with Python: From Linear Models to Deep Learning

MITx Online

MAY 2013 - PRESENT

MongoDB for Developers

MongoDB University

Libraries/APIs

Spark Streaming, PySpark, Node.js, PyTorch

Tools

GitHub, Amazon Elastic MapReduce (EMR), Amazon SageMaker, Slack, Jira, Amazon Elastic Container Service (Amazon ECS), Terraform

Frameworks

Apache Spark, Spark, Spring

Languages

Python, SQL, Java, Scala, JavaScript

Paradigms

ETL, ETL Implementation & Design, Database Design, Data Science, DevOps, Business Intelligence (BI)

Platforms

Amazon Web Services (AWS), Docker, Visual Studio Code (VS Code)

Storage

Elasticsearch, Databases, NoSQL, Data Pipelines, Amazon S3 (AWS S3), Database Architecture, Data Lakes, MongoDB, PostgreSQL

Other

Data Modeling, Algorithms, Programming, Time Complexity Analysis, Space Complexity Analysis, Linked Data, Semantics, Natural Language Processing (NLP), Machine Learning, Data Engineering, Data Architecture, Big Data Architecture, Data Warehousing, Data, APIs, Data Transformation, Architecture, Generative Pre-trained Transformers (GPT), Amazon Kinesis, AWS Cloud Architecture, Compilers, Number Theory, Deep Learning, CI/CD Pipelines, Data Visualization, Reporting
