Felipe Almeida, Developer in São Paulo - State of São Paulo, Brazil

Felipe Almeida

Verified Expert in Engineering

Software Engineering Developer

Location
São Paulo - State of São Paulo, Brazil
Toptal Member Since
June 12, 2019

Felipe has over nine years of experience as a software engineer and data scientist, in both academia and industry, using tools like Python and Scala. He is interested in working at the intersection of data science and software engineering, helping companies build solid ML systems and pipelines. He has experience in retail, infosec, banking, and fintech and is open to working in other areas.

Portfolio

Nubank
Kubernetes, Spark, Scala, Scikit-learn, Pandas, Python, Predictive Modeling...
Cognitivo.ai
Pandas, Matplotlib, Scikit-learn, Flask, Python, Data Science, TensorFlow...
Itaú-Unibanco
Camel, Java, Pandas, Scikit-learn, Flask, Python, Software Engineering...

Availability

Part-time

Preferred Environment

Ubuntu, Linux, Data Science, Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Machine Learning

The most amazing thing I've coded is an information system the whole company runs on.

Work Experience

Lead Machine Learning Engineer

2019 - PRESENT
Nubank
  • Helped ensure machine learning models (including credit risk, credit lines, and fraud) performed as expected against technical and business metrics via model monitoring, alerting, and more.
  • Interacted with managers and domain experts to help design and think about ML-based (or sometimes simpler) solutions and products for business problems.
  • Helped direct cross-team efforts to enhance our ML infrastructure, processes, and governance.
  • Wrote tools (Python, Scala, Bash, and ad-hoc dashboards) to help increase productivity and correctness in data science and machine learning tasks.
  • Oversaw the lifecycle of ML systems, from deployment through operation, monitoring, and eventual termination, for multiple models, including batch and online.
  • Mentored and led junior team members, validating and verifying their work and helping them improve technically.
Technologies: Kubernetes, Spark, Scala, Scikit-learn, Pandas, Python, Predictive Modeling, Data Science, IT Project Management, Software Engineering, Machine Learning, Amazon Web Services (AWS), Credit Modeling, Fintech, Interviewing, Fraud Prevention, Banking & Finance, APIs, NumPy, SQL, Technical Hiring, Docker, Flask, Shell Scripting, REST APIs, Python 3, Artificial Intelligence (AI), Machine Learning Operations (MLOps)

Data Science Consultant (Part-time)

2018 - PRESENT
Cognitivo.ai
  • Created an end-to-end text classification solution for legal documents, covering data analysis, presentation of results, modeling, and serving models via an API on AWS.
  • Designed, modeled, and implemented a system to generate keywords based on article titles. Used Seq2seq models on TensorFlow served via a Flask API on AWS, fully packaged on Docker.
  • Analyzed legal documents for auction systems and assessed for viability.
  • Assessed the viability of a social media analytics project and contributed to its peer reviews.
Technologies: Pandas, Matplotlib, Scikit-learn, Flask, Python, Data Science, TensorFlow, Machine Learning, Generative Pre-trained Transformers (GPT), GPT, Natural Language Processing (NLP), Interviewing, Fraud Prevention, Banking & Finance, APIs, Spark, NumPy, SQL, Technical Hiring, Docker, Shell Scripting, REST APIs, Predictive Modeling, Python 3, Artificial Intelligence (AI), Machine Learning Operations (MLOps)

Senior Machine Learning Engineer

2019 - 2019
Itaú-Unibanco
  • Helped design and develop the infrastructure for services supporting the main ML model API, such as proxies, gateways, databases, and caches.
  • Helped design and develop the main Flask-based API to serve a conversational ML model.
  • Prioritized efforts and helped guide junior members of the team.
Technologies: Camel, Java, Pandas, Scikit-learn, Flask, Python, Software Engineering, Machine Learning, Credit Modeling, Interviewing, Banking & Finance, APIs, Data Science, NumPy, SQL, Technical Hiring, Puppet, Python 3, Artificial Intelligence (AI), Machine Learning Operations (MLOps)

Senior Data Scientist

2017 - 2019
Itaú-Unibanco
  • Collaborated with core teams in the bank to develop data-driven solutions for business problems.
  • Interacted with clients and stakeholders to design solutions that impact the bottom line and deliver actual value.
  • Designed and trained statistical models to address problems such as credit risk modeling, debt recovery optimization, churn analysis, user segmentation, and more.
  • Helped deploy and monitor the usage of trained models.
  • Built command-line and web-based tools for the company's end-to-end ML platform.
Technologies: Scikit-learn, Matplotlib, NumPy, Jupyter Notebook, SAS, Python, Machine Learning, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), GPT, Credit Modeling, Interviewing, Banking & Finance, APIs, Data Science, SQL, Technical Hiring, Predictive Modeling, Artificial Intelligence (AI)

Senior Software Engineer | Data Engineer

2015 - 2017
VTEX
  • Designed relevant and actionable metrics and KPIs to measure business objectives, both short- and long-term.
  • Designed and implemented AWS-based solutions using services such as EMR, Kinesis, Elastic Beanstalk, DynamoDB, and S3, among others.
  • Created and implemented batch- and stream-oriented data pipelines using Apache Spark.
  • Leveraged Elasticsearch for data storage and analytics at scale.
  • Designed and implemented Docker-based, reactive systems on top of Akka, mostly using akka-http.
Technologies: Amazon Web Services (AWS), Akka, Amazon Elastic MapReduce (EMR), Elasticsearch, Spark, Java, Scala, Software Engineering, APIs, SQL, Shell Scripting, eCommerce, REST APIs

Full-stack Software Engineer

2011 - 2015
3Elos Informática
  • Interviewed users and stakeholders to identify their exact needs.
  • Designed and implemented full web applications—back end, front end, and database—using modern PHP with Yii MVC Framework.
  • Designed and wrote large RESTful APIs using Scala with Play 2 Framework.
  • Designed and wrote unit, functional, acceptance, and Selenium-based tests to keep required functionality stable, enabling changes and refactoring.
  • Managed Linux Servers—web servers, shell scripting, task automation, and database administration.
  • Interacted with and managed Elasticsearch clusters.
Technologies: RESTful Development, REST APIs, Play Framework, Yii, Elasticsearch, Scala, PHP, Software Engineering, APIs, SQL, Shell Scripting, PHP 5

Full-stack Software Engineer (Internship)

2010 - 2011
Marinha do Brasil
  • Interviewed a variety of users regarding specific features to be implemented.
  • Designed and implemented back-end functionality using PHP.
  • Designed and implemented database functionality using SQL and PL/SQL.
  • Designed and implemented front-end functionality using Adobe Flex/ActionScript.
Technologies: Oracle, PHP, Adobe Flex, Software Engineering, Yii, SQL, Shell Scripting, PHP 5

Social Tag Prediction: Resource-centered Approaches for Broad Folksonomies

https://www.cos.ufrj.br/index.php/pt-BR/publicacoes-pesquisa/details/15/2865
This work addresses the problem of how to predict tags that will be assigned by users in social tagging systems. It is widely known that a tag prediction functionality helps promote system usability and increase the quality of the tag vocabulary in use. With that in mind, we verify the difference in the performance of several label ranking techniques on two datasets, which differ from each other in several key metrics, such as the average number of tags per resource, tag vocabulary length, the total number of resources, etc. We also analyze a specific label ranking technique, namely MIMLSVM. We verify whether it generalizes to dense text representations in addition to traditional sparse ones. Experiments are conducted on the two datasets, and the results are analyzed.

Word Embeddings: A Survey

https://arxiv.org/pdf/1901.09069.pdf
This work lists and describes the main recent strategies for building fixed-length, dense, and distributed representations for words based on the distributional hypothesis. These representations are now commonly called word embeddings. In addition to encoding surprisingly good syntactic and semantic information, they have proven useful as extra features in many downstream NLP tasks.

Languages

Python, Scala, SQL, Bash Script, Python 3, Java, PHP, HTML, CSS, SAS, Ruby, JavaScript, PHP 5

Libraries/APIs

Scikit-learn, Pandas, Matplotlib, NumPy, REST APIs, Keras, SciPy, TensorFlow

Other

Machine Learning, Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Fintech, eCommerce, Credit Modeling, Fraud Prevention, Shell Scripting, APIs, Software Engineering, Interviewing, Technical Hiring, Machine Learning Operations (MLOps), Predictive Modeling, IT Project Management, Technical Writing, Artificial Intelligence (AI)

Frameworks

Yii, Spark, Akka, Play Framework, Adobe Flex, Camel, Flask

Paradigms

Data Science, RESTful Development

Industry Expertise

Banking & Finance

Tools

Puppet, Amazon Elastic MapReduce (EMR)

Platforms

Ubuntu, Jupyter Notebook, Oracle, Kubernetes, Linux, Amazon Web Services (AWS), Docker

Storage

Elasticsearch

Education

2015 - 2018

Master of Science Degree in Computer Science

Universidade Federal do Rio de Janeiro - Rio de Janeiro, Brazil

2009 - 2014

Bachelor of Science in Computer Science

Universidade Federal do Rio de Janeiro - Rio de Janeiro, Brazil

Certifications

JANUARY 2013 - JANUARY 2016

Systems Security Certified Practitioner

(ISC)²

JANUARY 2013 - JANUARY 2018

Linux Professional Institute Certification - Level 1

Linux Professional Institute