Felipe Batista, Developer in Belo Horizonte - State of Minas Gerais, Brazil

Felipe Batista

Verified Expert in Engineering

Machine Learning Engineer and Full-stack Developer

Location
Belo Horizonte - State of Minas Gerais, Brazil
Toptal Member Since
October 17, 2018

Felipe has 8+ years of experience in machine learning and full-stack software development. He's currently focused on cutting-edge technologies such as TensorFlow, Keras, PyTorch, OpenCV, and most of the Python data science stack. He is an AWS-certified solutions architect skilled in implementing deep learning models from research papers with a focus on computer vision and reinforcement learning.

Availability

Full-time

Preferred Environment

GitHub, PyCharm, Linux, macOS

The most amazing...

...project I've implemented was an AI SaaS app that was able to serve over 100,000 users, leveraging complex user behavior analytics to improve the core offering.

Work Experience

Founder and Software Engineer

2012 - PRESENT
Totem AI
  • Developed and scaled an image generation service to 1+ million users and built an image evaluation pipeline to improve renders, reduce generation time, and maintain quality control.
  • Built generative AI models for text and image generation, utilizing technologies like Stable Diffusion, GPT (and similar models), retrieval augmented generation (RAG), and diffusion models.
  • Gained proficiency in analyzing customer behavior data within the SaaS domain to identify key churn drivers, utilizing advanced analytics and user segmentation techniques. Developed machine learning models to predict churn and identify risky behavior.
  • Developed comprehensive business intelligence solutions, focusing on data-driven dashboards and financial modeling for a mid-size agriculture conglomerate with over 60 million in revenue.
  • Developed image classification pipelines using convolutional neural networks (CNNs), data augmentation, and transfer learning for real-time object detection, face classification/recognition, and semantic segmentation.
  • Architected digital signal processing pipelines for healthcare (freezing of gait detection for Parkinson's disease patients using sensor data).
  • Developed real-time machine learning models for fantasy football, including analysis of optimal lineup selections.
  • Implemented scientific papers containing state-of-the-art research related to computer vision, digital signal processing, time-series modeling for financial markets, and NLP for data enrichment.
Technologies: Amazon Web Services (AWS), Google Cloud, Scikit-learn, PyTorch, Keras, TensorFlow, Python, Amazon SageMaker, Microsoft Power BI, Stable Diffusion, Generative Pre-trained Transformers (GPT), Team Leadership, Deep Learning, NumPy, Artificial Intelligence (AI), Machine Learning, Data Science, Generative Artificial Intelligence (GenAI), OpenAI GPT-4 API, OpenAI GPT-3 API, LangChain, LlamaIndex

Data Scientist and Full-stack Software Engineer

2015 - 2018
ShopYak
  • Developed a neural network trained with Q-learning (reinforcement learning) to run automated A/B tests on different website layouts for eCommerce stores. The objective was to optimize conversions by adjusting the layout, fonts, and colors for each store.
  • Designed and developed portions of the front end using AngularJS and SCSS. Set up the build process using Grunt.
  • Implemented several Power BI dashboards to track KPIs and general user behavior (using both database and Google Analytics integrations).
  • Developed a significant portion of the back end using Node.js, Express, and PostgreSQL.
  • Integrated Stripe (standard payments and Stripe Connect) in both the front end and the back end.
  • Deployed on AWS with HTTPS, Amazon CloudFront, Amazon S3, and Elastic Load Balancing (ELB).
Technologies: Amazon Web Services (AWS), Express.js, Grunt, SCSS, PostgreSQL, Node.js, Angular, Google Cloud, Scikit-learn, PyTorch, Keras, TensorFlow, Python, Microsoft Power BI, Analytics, Team Leadership, Deep Learning, NumPy, Artificial Intelligence (AI), Machine Learning, Data Science
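The Q-learning approach to layout testing mentioned above can be illustrated with a minimal tabular sketch. It is a simplification under stated assumptions, not the ShopYak implementation: the original used a neural network, whereas this version keeps a plain lookup table, treats each visit as a one-step episode, and uses hypothetical names (`LAYOUTS`, the visitor segment as state, simulated conversion rates).

```python
import random
from collections import defaultdict

# Hypothetical layout variants; the real system also varied fonts and colors.
LAYOUTS = ["layout_a", "layout_b", "layout_c"]


def choose_layout(q, state, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(LAYOUTS)
    return max(LAYOUTS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.0):
    """One tabular Q-learning step: Q <- Q + alpha * (target - Q)."""
    # A visit that ends the episode has no next state, so no bootstrap term.
    best_next = 0.0 if next_state is None else max(q[(next_state, a)] for a in LAYOUTS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


def simulate(conversion_rates, visits=20000, epsilon=0.1, alpha=0.02, seed=0):
    """Simulate visits; reward is 1.0 on a (simulated) conversion, else 0.0."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(visits):
        state = rng.choice(sorted(conversion_rates))  # assumed visitor segment
        action = choose_layout(q, state, epsilon, rng)
        converted = rng.random() < conversion_rates[state][action]
        q_update(q, state, action, 1.0 if converted else 0.0, None, alpha=alpha)
    return q
```

Calling `simulate({"mobile": {"layout_a": 0.02, "layout_b": 0.5, "layout_c": 0.05}})` returns a table in which the highest-valued action for `"mobile"` should be the layout with the highest simulated conversion rate. In production the rewards would come from real conversion events rather than a simulator.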

Large-scale Deep Learning-based Video Analytics Pipeline for a United Nations Project

Architected and implemented a video processing pipeline to deduplicate videos at scale using deep learning (TensorFlow) to generate video signatures. I also developed a video augmentation pipeline to validate and test deduplication models using OpenCV, MoviePy, and FFmpeg. I implemented several evaluation routines—both visual and quantitative—to evaluate model results.
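As a rough illustration of signature-based deduplication, the sketch below compares fixed-length signature vectors with cosine similarity and reports pairs above a threshold. It assumes the signatures already exist (in the project they came from a TensorFlow model); the function names, the 0.95 threshold, and the brute-force pairwise comparison are illustrative assumptions, and a pipeline at scale would likely use an approximate nearest-neighbor index instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicates(signatures, threshold=0.95):
    """Return (id_a, id_b) pairs whose signature similarity is >= threshold.

    `signatures` maps a video id to its signature vector. Comparison is
    O(n^2), which is only reasonable for small collections.
    """
    ids = sorted(signatures)
    pairs = []
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            if cosine_similarity(signatures[first], signatures[second]) >= threshold:
                pairs.append((first, second))
    return pairs
```

Augmented copies of a video (re-encoded, cropped, or watermarked) should land above the threshold when paired with their source, which is one way an augmentation pipeline can be used to validate the signature model.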

DSP Using Multiple Deep Learning Architectures (CNNs, LSTM, GRU)

The project involved the creation of a digital signal processing pipeline to perform non-intrusive load monitoring (a process for analyzing changes in the voltage and current going into a house and deducing what appliances are used in the house as well as their individual energy consumption).

Over the course of the project, I:

1. Reviewed several papers containing the state-of-the-art methods for NILM.

2. Optimized available models to perform an initial POC.

3. Explored different model architectures on a specific setting defined by the client, including:
a. Parallel CNNs
b. Parallel CNNs with LSTMs
c. CNNs with LSTM (bidirectional)
d. CNN with GRU

Models were optimized, and ultimately, the best model was chosen based on the results of a cross-validation routine.

Deliverables were both Jupyter Notebooks and Python Scripts.

Image Classification Pipeline with TensorFlow/Keras

Implemented a state-of-the-art image classification pipeline using TensorFlow/Keras.

Tested several different model architectures (including multi-input models with both images and bounding boxes).

Settled on a fine-tuned VGG16 network.

The pipeline included data augmentation, cross-validation, and visualization of accuracy and loss across epochs, along with sample results.

DSP/DL for Freezing of Gait Detection (with POC Mobile App)

Implemented a digital signal processing pipeline for freezing of gait detection for patients with Parkinson's.

Replicated state-of-the-art medical research papers using Python, scikit-learn, TensorFlow, Keras, and Jupyter to obtain well-defined baselines.

After in-depth research into DSP techniques for anomaly detection, I implemented several deep learning model architectures that had not previously been applied to this domain.

Performed cross-validation to assess model performance, focusing on the model's ability to generalize to unseen patients. The model was trained on data from 80% of the patients and tested on data from the remaining 20%.
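A patient-wise split like the one described above can be sketched as follows. This is a minimal illustration, not the project's code: the function name, the fixed seed, and the representation of the data as one patient id per sample are assumptions.

```python
import random


def split_by_patient(patient_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that no patient appears in both sets.

    `patient_ids` holds one patient id per sample (for example, per window
    of sensor data). Whole patients are assigned to either train or test.
    """
    unique = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique)
    n_test = max(1, round(len(unique) * test_fraction))
    test_patients = set(unique[:n_test])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train_idx, test_idx
```

Splitting by patient rather than by sample avoids leaking one person's gait pattern into both sets, which would overstate how well the model generalizes to new patients.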

Examples of Models implemented on this project:

CNNs
Deep Conv LSTM
Parallel CNNs with LSTM

To test the model, I implemented a simple Android application that used the trained model to make live freezing-of-gait inferences from the phone's accelerometer data. The model was also converted to Core ML for future iOS app development.

Other relevant work includes handling class imbalance by adjusting the loss function of the deep learning models, hyperparameter optimization using grid search, and simulating the effect of different window sizes on model performance.
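Two of these ideas, windowing the sensor signal and re-weighting the loss for the rare class, can be sketched in plain Python. This is an illustrative sketch only: the function names, the choice of binary cross-entropy, and the `pos_weight` parameter are assumptions, since the profile does not state which loss adjustment was used.

```python
import math


def sliding_windows(signal, window_size, step):
    """Split a 1-D signal into fixed-size windows, dropping any incomplete tail."""
    return [signal[start:start + window_size]
            for start in range(0, len(signal) - window_size + 1, step)]


def weighted_bce(y_true, y_prob, pos_weight, eps=1e-7):
    """Mean binary cross-entropy with the positive-class term scaled by pos_weight.

    A pos_weight above 1 makes missed positives (e.g. freezing episodes)
    cost more than false alarms, countering class imbalance.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)
```

Re-running training and evaluation with several `window_size` values is one straightforward way to study how window length affects model performance.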

Languages

Python, JavaScript, SCSS, R, C++

Frameworks

LlamaIndex, Express.js, Flask, Angular

Libraries/APIs

NumPy, Pandas, Keras, TensorFlow, Scikit-learn, OpenCV, PyTorch, Node.js, Amazon Rekognition, SpaCy

Tools

Microsoft Excel, Amazon SageMaker, Git, Microsoft Power BI, PyCharm, GitHub, Grunt

Paradigms

Data Science, Agile Software Development

Platforms

Jupyter Notebook, Amazon Web Services (AWS), Google Cloud Platform (GCP), macOS, Linux

Other

Machine Learning, Computer Vision, Deep Learning, Artificial Intelligence (AI), Diffusers, Stable Diffusion, ControlNet, Retrieval-augmented Generation (RAG), Image Generation, Replicate, ChatGPT, Team Leadership, Generative Artificial Intelligence (GenAI), Chatbots, OpenAI GPT-4 API, OpenAI GPT-3 API, LangChain, Analytics, Natural Language Processing (NLP), Econometrics, Computational Finance, Statistics, Generative Pre-trained Transformers (GPT)

Storage

Google Cloud, MongoDB, PostgreSQL

2008 - 2012

Bachelor of Science Degree in Economics with a Focus on Econometrics and Computational Methods

UFMG - Belo Horizonte, Brazil

AUGUST 2018 - AUGUST 2021

AWS Certified Solutions Architect Associate

AWS

MAY 2018 - PRESENT

Deep Learning Specialization

Coursera

APRIL 2018 - PRESENT

Image and Video Processing

Coursera
