Pedro Lima, Developer in Porto, Portugal

Verified Expert in Engineering

Software Developer

Location
Porto, Portugal
Toptal Member Since
September 27, 2017

Pedro is a software developer and architect specializing in data science, machine learning, and AI. He has extensive experience in the end-to-end process of conceiving, designing, developing, and deploying data applications for large companies and startups.

Portfolio

Mirzacles
Artificial Intelligence (AI), Machine Learning, Python...
Hum Nutrition Inc
Machine Learning, Python, Llama 2, Generative Pre-trained Transformers (GPT)...
Talkmap (formerly Discourse.ai)
Continuous Deployment, Continuous Integration (CI), Rasa.ai, GitLab CI/CD...

Experience

Availability

Part-time

Preferred Environment

Jupyter, Python, MacOS, Linux, Git, Docker

The most amazing...

...thing I've developed was an AI agent that learned to play text-based games and won the Microsoft TextWorld competition.

Work Experience

AI/ML Engineer via Toptal

2023 - 2024
Mirzacles
  • Fine-tuned several models with client data using QLoRA (Llama2 70B, Mistral 7B, Mixtral 8x7B, and Yi 34B).
  • Developed SFT instructions and trained a judge model for quality evaluation.
  • Developed a Gradio prototype app for model testing and evaluation.
  • Deployed models with vLLM and FastAPI. Optimized model deployment with AWQ quantization.
Technologies: Artificial Intelligence (AI), Machine Learning, Python, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Deep Learning, Supervised Learning, Unsupervised Learning, PyTorch, Data Science, Gradio, Large Language Models (LLMs), Llama 2, Machine Learning Operations (MLOps), Google Cloud Platform (GCP), FastAPI, Product Management, LangChain, Generative Pre-trained Transformer 3 (GPT-3), OpenAI, Text Generation, Serverless, Docker, REST

Machine Learning Engineer

2023 - 2023
Hum Nutrition Inc
  • Developed a conversational AI agent for data analysis from text descriptions. The application has access to information on the internal databases and data structures for retrieval augmented generation and is able to generate and execute SQL queries.
  • Developed an AI agent for customer support, using a generative LLM, retrieval augmented generation, and a sequential chain of prompts that implements the theory-of-mind algorithm.
  • Deployed the AI agents in Google Cloud and Vercel platforms.
Technologies: Machine Learning, Python, Llama 2, Generative Pre-trained Transformers (GPT), GPT, OpenAI GPT-4 API, LangChain, Chatbots, AIOps, Google Cloud Platform (GCP), Supabase, Vercel, Retrieval-augmented Generation (RAG), A/B Testing, Data Scientist, Statistics, Algorithms, Analytics, OpenAI, Text Generation, Serverless, Docker, Predictive Analytics, Statistical Modeling, Data Modeling, SciPy, Data Analysis, REST, Google BigQuery

NLP and Machine Learning Developer (Freelance)

2018 - 2023
Talkmap (formerly Discourse.ai)
  • Developed machine learning models for natural language understanding, concept discovery, incremental discovery, and conversational analytics.
  • Implemented models for dialog flow identification and classification.
  • Built unsupervised models for intent discovery and natural language generation.
  • Designed and built the AI platform for incremental discovery of new customer intents in call center conversation data.
Technologies: Continuous Deployment, Continuous Integration (CI), Rasa.ai, GitLab CI/CD, Apache Kafka, Apache Airflow, Data Pipelines, Docker, Big Data, PostgreSQL, Data Science, Machine Learning, PyTorch, Artificial Intelligence (AI), GPT, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Natural Language Understanding (NLU), Computational Linguistics, Pytest, SpaCy, Data Engineering, Kubernetes, Amazon Web Services (AWS), Scrum, Scikit-learn, Google Cloud Platform (GCP), Jupyter, Deep Learning, Chatbots, Statistical Data Analysis, NumPy, Python, Pandas, IBM Watson, React, TypeScript, Generative Pre-trained Transformer 3 (GPT-3), OpenAI, HTMX, ChatGPT, Technical Leadership, Software Architecture, FastAPI, Machine Learning Operations (MLOps), Language Models, Large Language Models (LLMs), Haystack, Supervised Learning, Unsupervised Learning, Retrieval-augmented Generation (RAG), A/B Testing, Data Scientist, Statistics, Algorithms, Analytics, Software as a Service (SaaS), Text Generation, Predictive Analytics, SciPy, Data Analysis, REST

Freelance Chatbot Developer

2018 - 2019
Bigger Brains
  • Developed a natural language understanding model for an educational chatbot.
  • Integrated the chatbot with Slack, Facebook Messenger, and Microsoft Teams platforms.
  • Implemented the chatbot dialog engine using a Rasa machine learning system.
Technologies: Machine Learning, Artificial Intelligence (AI), Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Natural Language Understanding (NLU), Computational Linguistics, SpaCy, Scikit-learn, Deep Learning, Chatbots, NumPy, Python, Pandas, Django, Rasa.ai, TensorFlow, Software Architecture, Full-stack, Azure, Software as a Service (SaaS), Web Development, Text Generation, Data Analysis

NLP/ML Developer

2017 - 2018
Cargo Chief
  • Developed a system for data extraction from natural language sources.
  • Implemented both rule-based extractions and machine learning for natural language processing.
  • Developed tools for data curation and data preprocessing.
Technologies: Amazon Web Services (AWS), OCR, Data Pipelines, Image Processing, MySQL, Data Science, Machine Learning, PyTorch, Artificial Intelligence (AI), Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), GPT, Natural Language Understanding (NLU), Computational Linguistics, SpaCy, Data Engineering, LightGBM, Scrum, Flask, Scikit-learn, GitHub, Jupyter, SQL, Deep Learning, Statistical Data Analysis, NumPy, Python, Pandas, Software Architecture, Supervised Machine Learning, Data Scientist, Software as a Service (SaaS), Data Modeling, Data Analysis

Freelance Data Scientist

2017 - 2017
HumNutrition
  • Implemented a product recommendation system that was also integrated with the webshop.
  • Built a predictive model for the churning of subscription customers.
  • Performed a clustering analysis of the customer space to extract insights for marketing.
  • Developed REST web services to integrate the models into the main business application.
Technologies: ETL, Data Pipelines, Docker, MySQL, Data Science, Machine Learning, PyTorch, Forecasting, SpaCy, Data Engineering, Recommendation Systems, XGBoost, LightGBM, Scikit-learn, GitHub, Jupyter, SQL, Deep Learning, Web Scraping, Statistical Data Analysis, NumPy, Python, Pandas, TensorFlow, Software Architecture, Full-stack, Data Scientist, Statistics, Algorithms, Analytics, Software as a Service (SaaS), Fraud Prevention, Data Analytics, Statistical Modeling, Data Modeling, Tableau, SciPy, Data Analysis, REST, Google BigQuery, Matplotlib

Freelance Data Analyst

2014 - 2017
Nespresso
  • Worked on the blueprint design and implementation of a new retail information system, with responsibility for the customer service and supply chain flows.
  • Developed web services for eCommerce integration with a supply chain covering stock, availability, and sales documents.
  • Implemented an outlier detection system for alerts on customer service data entry.
  • Developed an extensive set of automated unit tests for end-to-end supply chain flows.
  • Analyzed data logs for bottleneck identification and performance improvement.
Technologies: ETL, Data Pipelines, Data Science, Machine Learning, Forecasting, Data Engineering, XGBoost, LightGBM, Scikit-learn, Jupyter, SQL, Statistical Data Analysis, NumPy, Python, Pandas, R, SAP, Data Mining, Big Data, Data Scientist, Statistics, Algorithms, Analytics, Data Analytics, Predictive Analytics, Statistical Modeling, Data Modeling, Test-driven Development (TDD), SciPy, Data Analysis, Matplotlib

Freelance Data Analyst and Developer

2013 - 2013
Syncronic
  • Worked as a core developer of a new SaaS platform for sales forecasting.
  • Created the general architecture design, forecasting model, back-end development, REST API, and integration with the ERP system.
  • Developed a forecasting model by combining ARIMA and ETS with a machine-learning gradient-boosting model.
Technologies: ETL, Data Science, Machine Learning, Forecasting, Scikit-learn, Statistical Data Analysis, NumPy, Python, Pandas, Time Series, Technical Leadership, Software Architecture, Full-stack, Statistics, Algorithms, Analytics, Software as a Service (SaaS), Data Analytics, Statistical Modeling, Data Modeling, SciPy, Data Analysis

Data Analyst (Contract)

2011 - 2013
Novozymes
  • Worked on the business blueprint and implementation of a new SAP supply chain solution.
  • Designed and developed new production planning heuristics.
  • Developed automated end-to-end tests for the business processes.
Technologies: ETL, Data Science, Machine Learning, Scikit-learn, Jupyter, SQL, Statistical Data Analysis, NumPy, Python, Pandas, SAP, Data Pipelines, Data Scientist, Algorithms, Analytics, Data Analytics, Predictive Analytics, Data Modeling, Data Analysis

Web Developer (Contract)

2011 - 2011
Unilever
  • Enhanced the Python back-end of Digpedia.net—a multimedia marketing internal web application.
  • Developed, in Django, a system for multimedia management with fine-grained access control.
  • Implemented in the Django application a system for temporary access to resources.
Technologies: Python, PostgreSQL, Django, HTML, CSS, JavaScript, Technical Leadership, Full-stack, Web Development, jQuery

Data Analyst | Developer (Contract)

2008 - 2011
Tetra Pak
  • Designed and developed a block-planning extension to the production planning software.
  • Collaborated on the design of the new available-to-promise solution and product-allocation solution.
  • Designed and implemented enhancements to the SAP CIF interface to support a special VMI scenario.
  • Supported and worked on solution enhancements for rollouts in multiple plants.
Technologies: Data Science, SQL, Statistical Data Analysis, Python, SAP

Data Analyst (Contract)

2009 - 2010
Sony Mobile
  • Implemented a prototype for the available-to-promise system using rules-based functionality for the project proof of concept.
  • Designed and implemented an SAP global available-to-promise system.
  • Designed and built the technical specifications for enhancements in the APO interface, backorder processing, and special reports.
Technologies: Data Science, Python, SAP, Data Analytics

Data Analyst (Contract)

2008 - 2010
Nestlé
  • Worked on the data analysis for performance improvement and issue resolution.
  • Designed and implemented an algorithm to support products with shelf-life constraints in sales stock allocation.
  • Enhanced the available-to-promise process in the supply chain management system.
  • Designed and developed a simplified user interface for quotas and allocation maintenance. Used dynamic source code generation.
Technologies: Data Science, ABAP, SQL, Python, SAP

SCM Consultant

2007 - 2008
SAP
  • Served on the MaxAttention team, including an on-site assignment at the Sappi paper company in South Africa to identify and propose solutions for critical issues in the SAP system.
  • Designed the detailed plan for implementation of Available to Promise in Philips Consumer Electronics.
  • Analyzed and proposed a solution for a time zone issue in ATP scheduling in Clariant's SAP system.
Technologies: SAP

Data Analyst (Contract)

2007 - 2007
Philips Consumer Electronics
  • Collaborated on the business blueprint design for supply chain quota allocation as a technical expert.
  • Implemented a prototype with the algorithm identified in the blueprint.
  • Worked on the project SWOT analysis.
Technologies: Data Science, Python, SAP, Data Analytics

Data Analyst (Contract)

2007 - 2007
Johnson & Johnson Pharmaceutical
  • Conducted data analysis to troubleshoot system issues and implement improvements.
  • Designed and developed enhancements to the sales forecasting and demand-planning system.
  • Implemented enhancements to the production-planning detailed scheduling system.
Technologies: Data Science, SQL, Python, SAP, Data Analytics

Data Analyst | Developer

2001 - 2006
Sonae Industria
  • Worked on a new greenfield SAP implementation with a focus on the supply chain system.
  • Analyzed data to improve performance and stability.
  • Developed a real-time asynchronous interface for an SAP integration with shop-floor control.
  • Implemented a web-based performance-monitoring dashboard.
Technologies: Java, Data Warehouse Design, Data Science, ABAP, SQL, Statistical Data Analysis, Python, R, Oracle, SAP, Web Development, Data Analytics

First Prize Winner of Microsoft Research TextWorld AI Competition

https://www.microsoft.com/en-us/research/blog/first-textworld-problems-the-competition-using-text-based-games-to-advance-capabilities-of-ai-agents/
I am the winner of a global competition on reinforcement learning and natural language understanding for building agents capable of solving text-based games.

Kaggle Master

https://www.kaggle.com/pvlima
I achieved the rank of Master on the Kaggle machine learning competition platform.

Sales Forecasting Platform

I worked on the design and development of a SaaS platform for sales forecasting. I was also involved in the implementation of a hybrid time series forecasting algorithm using ARIMA, ETS, and gradient boosting.

Production Planning Heuristic

I designed and developed a heuristic algorithm for the special cyclic planning of fast-moving products. The planning heuristic is based on standard lots planning with additional leveling of production quantities and special scheduling based on product setup values.
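
The three ingredients named above (standard lots, leveling of quantities, and setup-based scheduling) can be illustrated with a minimal sketch. This is not the actual heuristic: the function names, the input fields (`name`, `setup_value`, `demand`, `lot_size`), the even-spread leveling rule, and the ascending sort by setup value are all hypothetical simplifications chosen for illustration.

```python
def round_to_standard_lots(quantity, lot_size):
    """Round a required quantity up to a whole number of standard lots."""
    if quantity <= 0:
        return 0
    # Integer ceiling division avoids floating-point rounding.
    return -(-quantity // lot_size) * lot_size


def level_lots(demand_by_period, lot_size):
    """Spread the total number of standard lots as evenly as possible.

    Assumption: leveling ignores the timing of demand inside the cycle
    and only balances production quantities across periods.
    """
    total = round_to_standard_lots(sum(demand_by_period), lot_size)
    lots = total // lot_size
    base, extra = divmod(lots, len(demand_by_period))
    # Earlier periods receive the leftover lots, one each.
    return [(base + (1 if i < extra else 0)) * lot_size
            for i in range(len(demand_by_period))]


def plan_cycle(products):
    """Sequence products by setup value (hypothetical rule: lowest first)
    and return a leveled lot plan for each one."""
    ordered = sorted(products, key=lambda p: p["setup_value"])
    return [(p["name"], level_lots(p["demand"], p["lot_size"]))
            for p in ordered]


if __name__ == "__main__":
    products = [
        {"name": "A", "setup_value": 2, "demand": [120, 80, 100], "lot_size": 50},
        {"name": "B", "setup_value": 1, "demand": [30, 30, 30, 30], "lot_size": 40},
    ]
    for name, plan in plan_cycle(products):
        print(name, plan)
```

In a real planning run, the leveling step would also have to respect capacity and the timing of demand, which this sketch deliberately leaves out.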

Outlier Detection System

I implemented an outlier detection system based on a random forest algorithm to provide alerts for suspicious/wrong data in the customer service flow.

Game with Generative AI

https://museumof.ai/collection/#bot-poets-14675
Developed a game with content created by generative AI (poems and game images). The game is featured in the Museum of AI ("Love is like a foot on the beach" by Bot Poets Society 2021; AI-generated poem).

Languages

Python, SQL, ABAP, Java, R, JavaScript, TypeScript, HTML, CSS

Frameworks

LightGBM, Django, Scrapy, Flask

Libraries/APIs

PyTorch, XGBoost, Scikit-learn, Pandas, Matplotlib, Beautiful Soup, TensorFlow, Keras, SciPy, NumPy, SpaCy, jQuery, React, HTMX

Paradigms

Data Science, ETL, Continuous Integration (CI), Continuous Deployment, Test-driven Development (TDD), REST, Scrum

Other

Data Analytics, Chatbots, Artificial Intelligence (AI), Data Analysis, Statistical Data Analysis, Natural Language Processing (NLP), Data Engineering, Machine Learning, SAP, Natural Language Understanding (NLU), Generative Pre-trained Transformer 3 (GPT-3), GPT, Generative Pre-trained Transformers (GPT), Text Generation, FastAPI, Language Models, Machine Learning Operations (MLOps), Large Language Models (LLMs), Analytics, Data Scientist, Predictive Analytics, Data Modeling, Image Processing, Serverless, Recommendation Systems, lxml, Web Scraping, Deep Learning, Scientific Computing, Computational Linguistics, OpenAI, ChatGPT, LangChain, Technical Leadership, Software Architecture, Full-stack, Llama 2, OpenAI GPT-4 API, Supervised Learning, Unsupervised Learning, Retrieval-augmented Generation (RAG), Supervised Machine Learning, Data Mining, Gradio, Web Development, Fraud Prevention, Software as a Service (SaaS), Algorithms, Statistics, A/B Testing, Statistical Modeling, Outlier Detection, Forecasting, Time Series, Big Data, Maps, OCR, Data Warehouse Design, Google BigQuery, Computer Vision, Games, Optimization, AIOps, Supabase, Product Management

Tools

Rasa.ai, Apache Airflow, Jupyter, GitHub, Git, Haystack, Sublime Text, GitLab CI/CD, Tableau, PyCharm, IBM Watson, Pytest

Platforms

Docker, Google App Engine, MacOS, Linux, Google Cloud Platform (GCP), Kubernetes, Oracle, Apache Kafka, AWS Lambda, Amazon Web Services (AWS), Vercel, Azure

Storage

Data Pipelines, PostgreSQL, MySQL, Elasticsearch, PostGIS, MongoDB

1999 - 2007

PhD Degree in Machine Learning Applied to Process Engineering

Coimbra University - Coimbra, Portugal

1992 - 1997

Master's Degree in Chemical Engineering

Coimbra University - Coimbra, Portugal
