
Caio Taniguchi
Verified Expert in Engineering
Machine Learning Developer
Zaragoza, Spain
Toptal member since November 14, 2018
Caio is a data scientist and back-end developer with experience in the whole data science pipeline, from data collection to model deployment. His main goal is to use data science and machine learning techniques to help businesses extract the most value out of their data.
Portfolio
Experience
- Machine Learning - 4 years
- Python - 4 years
- Data Science - 4 years
- Docker - 3 years
- Agile - 3 years
- MongoDB - 3 years
- AWS Cloud Computing Services - 3 years
- Node.js - 3 years
Preferred Environment
Jupyter, Atom, Git, MacOS
The most amazing...
...learning model I've trained was a classifier that determined the political inclination of users, using engineered features and a model ensemble.
Work Experience
Senior Data Engineer
Stone
- Designed and developed a general-purpose feature processing system for use in anti-fraud processes company-wide, using Kotlin, Flink, Kafka, S3, and MongoDB.
- Developed an event-driven anti-fraud system for card transactions, processing operations through batch and streaming pipelines with Apache Flink.
- Experimentally developed real-time ELT pipelines for PostgreSQL based on change data capture (CDC) tools using Kafka Connect and Debezium.
Machine Learning Engineer
Stone
- Developed a data-driven, real-time fraud detection process for banking transactions that used client behavior and known fraud patterns to reach a decision. Applied both heuristics-based and machine learning approaches.
- Designed and developed a real-time system based on facial recognition, used for onboarding clients and as an additional security measure for banking transactions.
- Implemented the data processing pipelines and initial analysis for the batch AML system. The solution used heuristics and graph analysis.
Software Engineer
Accenture
- Created a recommender system framework and trained a model for deployment using Docker, PySpark, and AWS.
- Coordinated a team and architected a serverless web app for fraud detection and credit approval with Java and AWS.
- Architected and developed an image fraud detection system in Python and AWS.
Junior DevOps Engineer
Concrete Solutions
- Developed plugins and maintained Jenkins continuous delivery (CD) pipelines.
- Created an on-premises testing framework for mobile apps using physical devices with Node.js, React, MongoDB, and Redis.
- Supported a video streaming platform with thousands of views per day built with Python and Django.
Experience
HackerRank's Machine Learning CodeSprint 2016
https://github.com/caiotaniguchi/hackerrank-ml-sprint
- A classification problem to predict whether or not an email would be opened by a HackerRank user, which involved data cleaning, data exploration, feature engineering and model training and validation with Python, Pandas, Matplotlib, and XGBoost.
- A ranking problem to select a number of competitions to recommend for HackerRank users. Solved by coding an item-based recommender system from scratch in Python.
Besides the model predictions, the competition also required the source code used and documentation about the methods applied. Earned a silver medal by finishing in the top 7% of the leaderboard.
Competition: https://www.hackerrank.com/machine-learning-codesprint
Post about the classifier: https://medium.com/@caiotaniguchi/one-week-of-machine-learning-madness-with-hackerrank-part-1-bde90dd30d2f
Post about the recommender: https://medium.com/@caiotaniguchi/one-week-of-machine-learning-madness-with-hackerrank-part-2-783328191f7e
HomeBroker Automator
https://github.com/caiotaniguchi/hb-automator
Education
Bachelor's Degree in Electronics and Computer Engineering
Universidade Federal do Rio de Janeiro (UFRJ) - Rio de Janeiro, Brazil
Certifications
AWS Certified Cloud Practitioner
Amazon Web Services
Deep Learning
Coursera
AWS Certified Solutions Architect Associate
AWS
AWS Certified Developer – Associate
Amazon Web Services
Skills
Libraries/APIs
Scikit-learn, Pandas, Node.js, React, PySpark, Keras, XGBoost
Tools
Plotly, Git, Atom, Jupyter, Jenkins, Apache Airflow, BigQuery
Platforms
AWS Cloud Computing Services, Docker, MacOS, Linux, Azure, Amazon Web Services (AWS), Apache Kafka, Apache Flink, Debezium
Languages
JavaScript, Python, Java, SQL, C++, C, Kotlin
Frameworks
Express.js, Spring, Bootstrap, AngularJS, Scrapy, Selenium, Spark
Paradigms
Scrum, Object-oriented Programming (OOP), Agile, Test-driven Development (TDD), DevOps, ETL
Storage
MongoDB, Redis, NoSQL, MySQL, Databases, PostgreSQL, Graph Databases
Other
AWS Cloud Architecture, Data Science, Machine Learning, Data Analytics, Data Analysis, Statistics, Data Scraping, Web Scraping, Data Visualization, Cloud, Deep Learning, Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Sequence Models, Electronics, Programming, Software Engineering, Recommendation Systems, Back-end, Software Design, Software Development, Big Data, Streaming Data