Bruno Barbosa Miranda, Developer in Belo Horizonte - State of Minas Gerais, Brazil

Bruno Barbosa Miranda

Verified Expert in Engineering

Data Scientist and Python Developer

Location
Belo Horizonte - State of Minas Gerais, Brazil
Toptal Member Since
September 30, 2021

Bruno was originally a medical doctor but became the first person to earn a master's degree in computer science (CS) at one of the top universities in his country without prior formal CS education. He is currently pursuing a Ph.D. in CS at the same university (UFMG). Bruno considers himself a highly motivated and passionate professional who loves his work and constantly learns new things.

Portfolio

Shop For A Better World
Azure Cloud Services, Cognitive Search, Classification, VM, LightGBM...
Pixelcut Inc.
Machine Learning, Python, TypeScript, PyTorch, SQL...
Faculdade Unimed
Snowflake, Docker, Python, SQL, NoSQL, Spark, Spark ML...

Experience

Availability

Part-time

Preferred Environment

Spyder, Windows 10, MacOS, Anaconda, TensorFlow, Scikit-learn, NumPy, Python 3, Amazon Web Services (AWS), LightGBM

The most amazing...

...result for me was when I implemented my own custom deep reinforcement learning algorithm, which learned to play video games independently.

Work Experience

Senior ML Engineer

2023 - 2023
Shop For A Better World
  • Successfully classified a database of over 180,000 businesses in multiple categories to generate metadata.
  • Configured Azure Cognitive Search for the client's search engine.
  • Created a web scraping system that automatically updates database records over time while deduplicating instances obtained from different sources and consolidating the results.
Technologies: Azure Cloud Services, Cognitive Search, Classification, VM, LightGBM, Web Scraping, Deduplication, Database Management, Communication

AI/ML Engineer

2023 - 2023
Pixelcut Inc.
  • Deployed a computer vision neural network to estimate soft shadows based on cutout masks, with improvements on top of the reference paper.
  • Implemented multi-GPU training with PyTorch-compiled code and multi-node disk access, training with over one billion synthetic image samples that I generated myself.
  • Produced code for the deployment of the model into the client's app using FastAPI.
  • Aided the development of a cutout masking model using synthetic data generation.
Technologies: Machine Learning, Python, TypeScript, PyTorch, SQL, Graphics Processing Unit (GPU), Google Cloud, Google Cloud ML, Google Cloud SQL, FastAPI, Deep Neural Networks, Algorithms, 3D Images, Image Generation, NVIDIA CUDA, Models, Artificial General Intelligence (AGI), Generative Adversarial Networks (GANs), Communication

Data Science Specialist Consultant

2023 - 2023
Faculdade Unimed
  • Reduced the computation time of a critical software component from three days to less than five minutes.
  • Helped build the entire Snowflake architecture from the ground up to host a software service for hospital stay analysis, similar to HRG and DRG.
  • Built core functionalities to process the entire hospital stay analysis algorithm using Python and Snowpark.
Technologies: Snowflake, Docker, Python, SQL, NoSQL, Spark, Spark ML, Machine Learning Operations (MLOps), Sentiment Analysis, Regression, MySQL, Leadership, Communication

Machine Learning Developer

2022 - 2023
Arthur Haliski De Andrade
  • Deployed a genetic algorithm based on genetic programming that learns to generate custom technical variables for trading. The whole algorithm runs on GPU using RAPIDS.
  • Delivered a mixed reinforcement learning algorithm, built on neural networks written in PyTorch, that learns to trade financial markets.
  • Managed a team of three programmers while producing complex trading algorithms myself.
Technologies: Machine Learning, Python, Algorithms, Deep Neural Networks, Genetic Algorithms, Trading Systems, Mathematics, Algorithmic Trading, Quantitative Research, Machine Learning Operations (MLOps), NVIDIA CUDA, Financial Modeling, Regression, Backtesting Trading Strategies, Models, Leadership, Communication

Senior Data Scientist

2022 - 2023
Microsoft
  • Developed an email signature expansion method based on optical character recognition (OCR).
  • Worked on clustering and classifying malicious emails.
  • Resolved many new bugs on Microsoft services, tracking them with data science and other analysis tools.
Technologies: Python, C#, Azure, Data Science, OCR, Algorithms, SQL, NVIDIA CUDA, Regression, MySQL, Models, Artificial General Intelligence (AGI), Communication

NLP Expert

2022 - 2022
Prepaire Labs Limited
  • Developed a drug recommender system based on a knowledge graph of interactions between drugs, genes, proteins, and diseases.
  • Built a drug embedding system that can cluster drug classes without labeled data.
  • Built a large-scale interaction network to predict molecule interactions between different entities.
Technologies: Python, Bioinformatics, Machine Learning, GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Data Science, Natural Language Toolkit (NLTK), Biotechnology, Machine Learning Operations (MLOps), NVIDIA CUDA, Models, Leadership, Communication

Data Analyst | Statistician

2022 - 2022
Product Tranquility LLC
  • Helped the client clean inconsistent data points in their survey data.
  • Combined machine learning and data science to generate survey insights.
  • Reproduced classical survey analysis techniques to analyze pricing.
Technologies: Data Science, Linear Regression, Clustering, Surveys, Survey Development & Analysis, Interoperability, Financial Modeling, Models, Communication

Data Scientist

2022 - 2022
Alaris Acquisitions, LLC
  • Delivered an app to estimate the similarity between buying and selling institutions with a front-end presentation using Streamlit.
  • Deployed online file synchronization for all remote users of our app using Amazon S3.
  • Created a customizable interface to input and change system variables to make the app as flexible as possible while maintaining consistency and scalability.
Technologies: Algorithms, Streamlit, Python, Machine Learning, Finance, Data Mining, Data Modeling, Data Analytics, Generative Pre-trained Transformers (GPT), GPT, Natural Language Processing (NLP), Pandas, JSON, Text Mining, Algorithmic Trading, Quantitative Research, Financial Modeling, Communication

Senior Data Scientist and ML Engineer

2021 - 2022
Toptal Client
  • Developed and successfully deployed a custom recommender system for one of the clients' ventures.
  • Worked on automatic metadata generation from users' images using transfer learning.
  • Developed a churn detection model integrated with customized emails for the CS team.
  • Worked on a semantic search algorithm based on item and user embeddings.
  • Developed Power BI dashboards to show relevant data to clients.
  • Integrated our solutions with existing architectures using MLflow and email APIs.
  • Developed a lead scoring algorithm based on the positive-unlabeled problem framework.
Technologies: Microsoft Power BI, Microsoft Azure, Databricks, Recommendation Systems, B2C Marketing, TensorFlow, Neural Networks, Analysis, Data Analysis, Apache Spark, Data Visualization, Statistics, Business Intelligence (BI), MLflow, Statistical Data Analysis, Analytics, Data Analytics, ETL, Atlassian, Jira, Database Analytics, Classification Algorithms, NumPy, Innovation, PySpark, Data Science, Jupyter Notebook, Data Engineering, Excel 365, Microsoft Excel, Predictive Modeling, Predictive Analytics, Data Mining, Data Modeling, Data Reporting, SQL, Python, Cloud, Deep Learning, GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Pandas, ETL Tools, JSON, Text Mining, Technical Hiring, Code Review, Task Analysis, Interviewing, Statistical Modeling, Azure, Bash, OCR, Sentiment Analysis, MySQL

Senior Data Scientist

2021 - 2021
Tecnium
  • Developed the best unique product identification routine available at the time.
  • Taught teammates about neural network approaches, such as autoencoders, RankNet, transformers, and embedding spaces.
  • Created a product name embedding, using a transformer neural network, usable in multiple problems.
Technologies: Python 3, Microsoft Azure, Databricks, Machine Learning, Few-shot Learning, Rankings, Classification Algorithms, Azure Machine Learning, NumPy, Innovation, PySpark, Data Science, Jupyter Notebook, Data Engineering, Excel 365, Microsoft Excel, Predictive Analytics, Data Mining, Data Modeling, Data Reporting, SQL, Python, Cloud, Deep Learning, Analytics, Generative Pre-trained Transformers (GPT), GPT, Natural Language Processing (NLP), Pandas, ETL Tools, JSON, Text Mining, Statistical Modeling, Azure, Bash, OCR, Google Cloud ML, Regression, MySQL

Senior Data Scientist

2019 - 2021
Unimed
  • Developed data-oriented models using NLP, machine learning, and transformers.
  • Helped transform my sector into a data science-oriented enterprise.
  • Built useful metrics and scoring systems using embeddings and neural networks.
  • Created and deployed a web crawler to generate healthcare-related disease code datasets.
  • Proposed, implemented, and deployed an algorithm for future hospitalization prediction and a semantic representation for patient disease data.
  • Reached the final three of the company's innovation prize awards in every year I participated and won the 2021 prize.
  • Used time-series algorithms to predict future healthcare requirements and applied multiple neural network architectures, such as variational autoencoders, residual neural networks, and vision transformers, to extract information from medical images.
Technologies: Python 3, TensorFlow, Machine Learning, Reinforcement Learning, GPT, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Amazon Web Services (AWS), ARIMA, LightGBM, SVMs, Support Vector Machines (SVM), LSTM, LSTM Networks, Long Short-term Memory (LSTM), Artificial Intelligence (AI), Artificial Neural Networks (ANN), MySQL, MySQLi, SQL, Presto, Big Data, Residual Neural Network (ResNet), Vision Transformer (ViT), Classification Algorithms, Images, Health, Azure Machine Learning, NumPy, Innovation, Transformers, Data Science, Data Engineering, Excel 365, Microsoft Excel, Predictive Modeling, Predictive Analytics, Data Mining, Data Modeling, Data Reporting, Data Analytics, Amazon SageMaker, Image Processing, Computer Vision, Python, Cloud, Deep Learning, Analytics, Pandas, ETL Tools, Text Mining, Code Review, Task Analysis, Statistical Modeling, Bash, OCR, Oncology & Cancer Treatment, Sentiment Analysis

Teacher

2018 - 2021
University of Medical Science
  • Developed a digital teaching system using online technologies.
  • Taught students about medical research using machine learning and data science methods.
  • Oversaw research projects and courses focused on recent machine learning and general technology trends.
Technologies: Python 3, Data Science, Research, Classification Algorithms, NumPy, Innovation, Jupyter Notebook, Code Review, Oncology & Cancer Treatment

Project Cindy

Cindy is a reinforcement learning agent that learns how to trade stocks based on price, volume, and other relevant data. I built a simulator that trains the agent on historical data and then applied the agent to real-time market operations. The initial version bridged MetaTrader 5 and my own Python API.

In this project, I tested many different reinforcement learning approaches and network architectures, including:
• Deep Q-learning agent (DQN)
• Asynchronous advantage actor-critic agent (A3C)
• Synchronous advantage actor-critic agent (A2C)
• Proximal policy optimization (PPO)
• Convolutional neural networks (CNN)
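
A simulator of the kind described above can be sketched as a small environment that replays historical prices and rewards the agent with the change in portfolio value. This is only an illustration: the class name `PriceReplayEnv`, the three-action encoding, the one-share trade size, and the reward definition are all assumptions, not the project's actual design.

```python
from typing import Callable, List, Tuple

HOLD, BUY, SELL = 0, 1, 2  # hypothetical action encoding


class PriceReplayEnv:
    """Replays a historical price series one step at a time (illustrative only)."""

    def __init__(self, prices: List[float], initial_cash: float = 1000.0):
        if len(prices) < 2:
            raise ValueError("need at least two prices to take a step")
        self.prices = list(prices)
        self.initial_cash = initial_cash
        self.reset()

    def reset(self) -> float:
        """Restart the episode and return the first observed price."""
        self.t = 0
        self.cash = self.initial_cash
        self.shares = 0
        return self.prices[0]

    def _value(self) -> float:
        # Portfolio value marked at the current time step's price.
        return self.cash + self.shares * self.prices[self.t]

    def step(self, action: int) -> Tuple[float, float, bool]:
        """Trade one share at the current price, advance one step,
        and return (next_price, reward, done)."""
        price = self.prices[self.t]
        before = self._value()
        if action == BUY and self.cash >= price:
            self.cash -= price
            self.shares += 1
        elif action == SELL and self.shares > 0:
            self.cash += price
            self.shares -= 1
        self.t += 1
        reward = self._value() - before  # change in portfolio value
        done = self.t == len(self.prices) - 1
        return self.prices[self.t], reward, done


def run_episode(env: PriceReplayEnv, policy: Callable[[float], int]) -> float:
    """Run one pass over the price history; `policy` maps a price to an action."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total
```

Any of the agents listed above would take the place of `policy` here; a real setup would also feed volume and other features into the observation rather than the price alone.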

Instagram Bot

I created a fully automated Instagram bot as a side project. The bot automatically visits profiles (web crawling), evaluates each profile against my own, and estimates the probability that the user will follow back if the bot interacts with and follows them. If the probability exceeds a certain threshold, the bot likes and comments on some posts and then follows the target user. Using this strategy, my mock account grew to 2,000 followers in about a month, after which I stopped using the bot.
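
The decision rule described above amounts to thresholding a predicted follow-back probability. A minimal sketch, assuming a logistic model over hand-picked profile features; the feature names, weights, bias, and the 0.6 threshold are hypothetical and not taken from the project:

```python
import math
from typing import Dict

# Hypothetical parameters; a real model would be fit on observed follow-back outcomes.
WEIGHTS = {"shared_interests": 1.5, "follower_ratio": -0.8}
BIAS = -0.5
THRESHOLD = 0.6


def follow_back_probability(features: Dict[str, float]) -> float:
    """Logistic score in (0, 1) computed from a weighted sum of profile features."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def should_engage(features: Dict[str, float], threshold: float = THRESHOLD) -> bool:
    """Engage only when the estimated probability exceeds the threshold."""
    return follow_back_probability(features) > threshold
```

Raising the threshold trades reach for precision: fewer profiles are engaged, but a larger share of them is expected to follow back.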

Brain CT Image Alteration Detection

In this side project, I trained an autoencoder to reconstruct normal brain CT images from a public Kaggle dataset and then used the resulting embedding to obtain state-of-the-art classification results.

Automatic Labeling with Reinforcement Learning

I built this project for my master's degree: a reinforcement learning agent that meta-learns from a small labeled dataset. The agent selects chunks of unlabeled data, labels them with one of its trained classifiers, adds them to the labeled set, and retrains the other classifiers.
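
The label-then-retrain loop can be illustrated with a much simpler stand-in. The sketch below is plain self-training on one-dimensional data: it swaps the reinforcement learning agent for a fixed confidence heuristic and uses a single nearest-centroid classifier instead of several. All function names and parameters are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def fit_centroids(xs: List[float], ys: List[int]) -> Dict[int, float]:
    """Nearest-centroid 'classifier': one mean per class."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids: Dict[int, float], x: float) -> Tuple[int, float]:
    """Return (label, margin); margin is the distance gap to the runner-up class."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, float("inf")
    margin = abs(x - centroids[ranked[1]]) - abs(x - centroids[best])
    return best, margin


def self_train(labeled_x, labeled_y, unlabeled, chunk_size=2, rounds=10):
    """Repeatedly label the most confident chunk and refit on the grown set."""
    xs, ys, pool = list(labeled_x), list(labeled_y), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        centroids = fit_centroids(xs, ys)  # retrain on the current labeled set
        # Stand-in for the agent's choice: pick the chunk with the largest margins.
        pool.sort(key=lambda x: predict(centroids, x)[1], reverse=True)
        chunk, pool = pool[:chunk_size], pool[chunk_size:]
        for x in chunk:
            xs.append(x)
            ys.append(predict(centroids, x)[0])  # pseudo-label
    return xs, ys
```

Starting from two labeled points, `self_train([0.0, 10.0], [0, 1], [1.0, 9.0, 2.0, 8.0, 4.0, 6.5])` labels the easy points near each class first and the ambiguous middle points last, once the centroids have been refined.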

Electrocardiogram (ECG) Automatic Classification

https://www.kaggle.com/competitions/dcc-week-challenge-2023/overview
I won second place in this Kaggle competition for automatic electrocardiogram classification while learning from a relatively small sample of 12-lead examinations. For this project, I used a custom neural network solution.

Languages

Python 3, Python, SQL, C++, Bash, MQL5, C#, Snowflake, TypeScript

Libraries/APIs

TensorFlow, Scikit-learn, NumPy, Pandas, PyTorch, PySpark, LSTM, Natural Language Toolkit (NLTK), Spark ML, Amazon API

Paradigms

Data Science, ETL, Business Intelligence (BI), Interoperability, Quantitative Research

Platforms

Anaconda, Jupyter Notebook, Amazon Web Services (AWS), Databricks, Azure, NVIDIA CUDA, MacOS, Docker, Amazon EC2, AWS Lambda

Other

Business Administration, Innovation, Machine Learning, Neural Networks, Deep Neural Networks, Deep Reinforcement Learning, Transformers, Natural Language Processing (NLP), Reinforcement Learning, English, Autoencoders, Medical Imaging, Algorithms, Artificial Intelligence (AI), Artificial Neural Networks (ANN), Data Analysis, Analytics, Database Analytics, Data Engineering, Predictive Modeling, Predictive Analytics, Data Mining, Data Modeling, Data Reporting, ETL Tools, Text Mining, Task Analysis, Statistical Modeling, OCR, Graphics Processing Unit (GPU), Oncology & Cancer Treatment, GPT, Generative Pre-trained Transformers (GPT), Sentiment Analysis, Financial Modeling, Regression, Models, Communication, Research, Microsoft Azure, Few-shot Learning, Web Crawlers, Deep Learning, Classification Algorithms, Software Engineering, Computer Vision, Vision Transformer (ViT), Recurrent Neural Networks (RNNs), Long Short-term Memory (LSTM), Web Scraping, Residual Neural Network (ResNet), Machine Learning Operations (MLOps), Data Visualization, Statistical Data Analysis, Data Analytics, Excel 365, Image Processing, Cloud, Source Code Review, Code Review, Backtesting Trading Strategies, Artificial General Intelligence (AGI), Generative Adversarial Networks (GANs), Leadership, Windows 10, BERT, Learning, Rankings, Dedupe.io, Convolutional Neural Networks (CNN), Meta-learning, 3D Images, Images, Health, Medical Software, ARIMA, SVMs, Support Vector Machines (SVM), Variational Autoencoders, Data Transformation, LSTM Networks, Big Data, Recommendation Systems, B2C Marketing, Analysis, Statistics, MLflow, Finance, Technical Hiring, Interviewing, Linear Regression, Clustering, Surveys, Survey Development & Analysis, Biotechnology, Genetic Algorithms, Mathematics, Algorithmic Trading, GPU Computing, Google Cloud ML, FastAPI, Image Generation, Cognitive Search, Classification, VM, Deduplication

Frameworks

LightGBM, Selenium, Presto, Apache Spark, Streamlit, Spark

Tools

Spyder, Atlassian, Jira, Microsoft Excel, Amazon SageMaker, Named-entity Recognition (NER), Azure Machine Learning, Microsoft Power BI, AWS CLI, AWS IAM

Storage

MySQL, MySQLi, JSON, NoSQL, Google Cloud, Google Cloud SQL, Azure Cloud Services, Database Management

Industry Expertise

Bioinformatics, Trading Systems

Education

2020 - 2021

Ph.D. Degree in Computer Science

Federal University of Minas Gerais - Belo Horizonte

2018 - 2020

Master's Degree in Computer Science

Federal University of Minas Gerais - Belo Horizonte

2017 - 2018

Master of Business Administration (MBA)

Dom Cabral Foundation - Belo Horizonte

2009 - 2015

Bachelor's Degree in Medicine

Federal University of Minas Gerais - Belo Horizonte

Certifications

NOVEMBER 2022 - PRESENT

AWS Academy Graduate - AWS Academy Cloud Foundations

Amazon Web Services Training and Certification

JANUARY 2014 - PRESENT

Cambridge Advanced English Certificate

Cambridge Assessment English
