A. Rosa Castillo, PhD, Developer in Málaga, Spain

Verified Expert in Engineering

Data Scientist and Developer

Location
Málaga, Spain
Toptal Member Since
September 29, 2022

Rosa is a full-stack developer and data scientist with a PhD, solid research skills, and extensive software engineering experience. Combining academic and industry approaches to data science, she can contribute to the whole data pipeline, from exploratory data analysis to prototyping and production. Rosa has also worked efficiently on projects across different countries, drawing on her professional proficiency in English, Italian, German, and Spanish.

Portfolio

Fortris
Python, Docker, SQL, Data Science, Finance, Tax Accounting, Algorithms...
National University of Singapore - Main
Python, R, nbdev, Version Control, Git, Data Science, API Design...
The Workshop
R, Python, Spark, Docker, Apache Kafka, Data Science, Algorithms...

Availability

Part-time

Preferred Environment

PyCharm, Slack, Jupyter Notebook, Docker, Python, Machine Learning, Data Science, Spark, Apache Kafka

The most amazing...

...solution I've developed is a recommender system that suggests games to players, which increased the number of players trying new games by 7%.

Work Experience

Data Scientist

2022 - PRESENT
Fortris
  • Collected blockchain data and applied accounting algorithms to it, expanding my knowledge of blockchain technology.
  • Explored finance terminology and how to build meaningful reports for stakeholders.
  • Started using Neo4j to implement graph analysis and represent data more efficiently than just applying plain relational database models.
Technologies: Python, Docker, SQL, Data Science, Finance, Tax Accounting, Algorithms, Reporting, Data Analysis, Neo4j, Blockchain, Data Engineering, Data Reporting, Data Analytics, Data Modeling, Mathematics, Mathematical Analysis, Product Analytics, Data Pipelines, ETL, Predictive Modeling, Risk Analysis, Flask, Artificial Intelligence (AI), Kubernetes, Stream Processing, Git, Data Visualization, PyCharm, Pandas, APIs, Business Analysis

Python and R Library/Package Developer

2023 - 2024
National University of Singapore - Main
  • Helped the research team, which was committed to building a high-quality analysis package, adopt best practices in software architecture, coding guidelines, and testing.
  • Contributed improvements to the Python package and its documentation, making the code more robust, reliable, scalable, and easier to maintain.
  • Instructed the small team on the next steps and how to follow the best coding guidelines as part of the engagement.
Technologies: Python, R, nbdev, Version Control, Git, Data Science, API Design, API Architecture, Jupyter Notebook

Data Scientist

2019 - 2021
The Workshop
  • Built more efficient SQL queries to deal with a huge data volume and created my first reports in Tableau.
  • Created data pipelines using Kafka streaming and learned to use Docker for effective testing.
  • Worked on my very first project in Spark to compute fraud detection metrics.
Technologies: R, Python, Spark, Docker, Apache Kafka, Data Science, Algorithms, Machine Learning, Recommendation Systems, Forecasting, Fraud Prevention, Tableau, Jupyter Notebook, SQL, Elasticsearch, Statistical Modeling, Data Engineering, Data Reporting, Data Analytics, Data Modeling, Mathematics, Mathematical Analysis, Product Analytics, Data Pipelines, ETL, Predictive Modeling, Risk Analysis, Flask, Artificial Intelligence (AI), Kubernetes, Stream Processing, MySQL, Git, Data Visualization, PyCharm, Pandas, APIs, Business Analysis, Time Series

AI Engineer

2018 - 2019
Accenture
  • Performed a deep customer service analysis based on ticketing data for various companies.
  • Developed different NLP techniques as part of the data product offered to clients.
  • Improved reporting practices by sharing visualization graphs with the clients.
Technologies: R, Python, SQL, Business Requirements, Data Science, Reporting, Excel 2010, Data Reporting, Data Analytics, Data Modeling, Mathematics, Mathematical Analysis, Product Analytics, ETL, Flask, Artificial Intelligence (AI), MySQL, Amazon Web Services (AWS), Data Visualization, Pandas, APIs, Business Analysis

R&D Engineer

2014 - 2018
Haag Streit AG
  • Managed a team of two for three months while the company looked for a new project manager. I also wrote a risk analysis for a project, conducted computer estimations, and handled employee performance reviews.
  • Handled requirements engineering carefully, considering regulations and tracking needs in the healthcare sector. Became certified in software requirements engineering by the International Requirements Engineering Board.
  • Worked on a refactoring project that involved a challenging new software design and architecture for the system. I learned a lot about software patterns and algorithm optimizations.
Technologies: Java, Programming, UML, Analysis, QA Testing, C++, Software, SQL, Mathematics, Mathematical Analysis, Git

Software Engineer

2011 - 2014
Annax Information Systems
  • Worked on the system architecture, analysis, and design of software components using UML, Enterprise Architect, and other software engineering tools.
  • Embedded code in a Linux-based device used in trains to manage train announcements.
  • Coded in C++ and handled testing both in the lab and on running trains.
Technologies: C++, Software, UML, Programming, Realtime, Linux, Embedded C++

Account Takeover Detection Model

The client wanted to improve the current rules-based detection system with a more innovative and dynamic service driven by a machine learning model.

I designed and developed a POC using the available web data to build a detection model for classifying web activity into different risk scores.

Player Forecasting Model

The client wanted a reasonable advance estimate of the player volume on the platform. Thanks to the predictions of this model, the system and the team could allocate resources and support ahead of high demand and prevent potential issues.

I designed and developed a SARIMA model that achieved high accuracy on the volume predictions and worked both during and after the COVID-19 period.

Game Recommender System

The old recommendation system was static and not customized, so I proposed and implemented a dynamic recommender system based on the collaborative filtering algorithm. After the POC was approved, the model was moved into production and increased the number of games per player.
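
For illustration only, the collaborative filtering idea can be sketched as follows. This is not the production system: it assumes a user-based neighborhood variant with cosine similarity, and the player names, game names, and play counts are hypothetical.

```python
from math import sqrt

# Hypothetical play counts: player -> {game: number of sessions}.
plays = {
    "ana": {"slots": 5, "poker": 3},
    "ben": {"slots": 4, "poker": 2, "bingo": 4},
    "cat": {"poker": 5, "bingo": 1},
}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(player, plays, top_n=1):
    """Rank games the player has not tried by similarity-weighted plays of other players."""
    seen = plays[player]
    scores = {}
    for other, games in plays.items():
        if other == player:
            continue
        sim = cosine(seen, games)
        for game, count in games.items():
            if game not in seen:
                scores[game] = scores.get(game, 0.0) + sim * count
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("ana", plays))
```

Because the scores depend on what similar players actually play, the suggestions change as new play data arrives, which is what distinguishes this approach from a static list.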

Ticket Similarity Application

To improve customer service efficiency and performance, we created an application that allowed clients to enter a new ticket text and find similar past tickets and the corresponding solutions. The application was based on NLP techniques and similarity metrics to process past data and compute the similarity. The client significantly reduced the response time for many recurrent tickets.
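
A minimal sketch of the lookup described above, assuming a plain bag-of-words representation and cosine similarity. The actual NLP techniques and metrics used in the application are not specified here, and the tickets and solutions below are hypothetical.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical past tickets paired with their solutions.
past_tickets = [
    ("Cannot log in after password reset", "Clear the session cache and retry."),
    ("Invoice shows the wrong billing address", "Update the address in account settings."),
    ("App crashes when uploading a photo", "Install the latest app version."),
]


def vectorize(text):
    """Lowercase the text and count its word occurrences."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(new_ticket, tickets):
    """Return the (text, solution) pair whose text is closest to the new ticket."""
    query = vectorize(new_ticket)
    return max(tickets, key=lambda t: cosine(query, vectorize(t[0])))


text, solution = most_similar("I reset my password and now I cannot log in", past_tickets)
print(solution)
```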

VIP Customer Classifier

https://github.com/cyberosa/classifier_with_unbalanced_ds
The goal was to build a prediction model to find VIP customers. I tried a simple random forest classifier versus an ensemble approach combining a clustering phase with several classifiers. I then compared the performance of both models. The dataset was heavily unbalanced, so I used different state-of-the-art techniques to compensate for this effect. I finally built a Flask API to wrap the winning model.
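
As a simple illustration of compensating for an unbalanced dataset, the sketch below uses random oversampling of the minority class. This is an assumed stand-in, not necessarily one of the techniques used in the project, and the function name and toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # Rows of this class in the original data, used as the sampling pool.
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: four regular customers (0) and one VIP (1).
rows = [[1], [2], [3], [4], [9]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.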

Languages

Python, Java, UML, R, SQL, C++, Embedded C++

Libraries/APIs

Pandas, TensorFlow, OpenMP

Tools

PyCharm, Slack, Git, Jira, Kafka Streams, Tableau, Confluence, Excel 2010

Paradigms

Data Science, Parallel Programming, Requirements Analysis, ETL, API Architecture

Storage

Data Pipelines, MySQL, Elasticsearch, Neo4j

Other

Machine Learning, Software Development, Algorithms, Software, Technical Requirements, Research, Data Analysis, Programming, Analysis, Recommendation Systems, Forecasting, Fraud Prevention, Data Engineering, Data Reporting, Data Analytics, Data Modeling, Mathematics, Product Analytics, Artificial Intelligence (AI), Data Visualization, APIs, Deep Learning, User Requirements, System Requirements, Functional Requirements, Statistics, Data Products, Regression, Hyperparameters, Natural Language Processing (NLP), QA Testing, Scripting, Clustering, Statistical Analysis, Statistical Modeling, Generative Adversarial Networks (GANs), Mathematical Analysis, Predictive Modeling, Risk Analysis, GPT, Generative Pre-trained Transformers (GPT), Stream Processing, Business Analysis, Time Series, HPCC Systems, Compilers, Business Requirements, Convolutional Neural Networks (CNN), Deep Neural Networks, Image Processing, Reporting, Logistic Regression, Bayesian Statistics, Computer Vision, Finance, Tax Accounting, Directed Acyclic Graphs (DAG), Dagster, Cryptocurrency, nbdev, Version Control, API Design

Frameworks

Spark, Realtime, Flask

Platforms

Jupyter Notebook, Docker, Apache Kafka, Linux, Blockchain, Kubernetes, Amazon Web Services (AWS)

2005 - 2010

PhD in Computer Science

University of Malaga - Malaga, Spain

JULY 2022 - PRESENT

Machine Learning Specialization

Coursera

MAY 2020 - PRESENT

Deep Learning Specialization

Coursera

DECEMBER 2017 - PRESENT

Data Science Specialization

Coursera

JUNE 2016 - PRESENT

Certified Professional for Requirements Engineering

International Requirements Engineering Board
