Luis Nicolas-Alonso

Verified Expert in Engineering

Machine Learning Developer

Location

Barcelona, Spain

Toptal Member Since

September 21, 2018

Luis is a seasoned data scientist with a strong background in mathematics, software engineering, and machine learning. He has a proven track record of success developing scalable data analytics applications in the cloud and collaborating with technical and non-technical stakeholders. Luis is passionate about running data science projects with agile management, test-driven development, and continuous delivery.

Statistics, Machine Learning, Big Data, A/B Testing, Signal Processing, Deep Neural Networks, Deep Learning, Pandas, Python 2, SQL, NumPy, Python, Scikit-learn, Spark, Amazon Athena

Portfolio

BAZZE & COMPANY

Amazon Web Services (AWS), Spark, Python 3, Python API, Swagger, Amazon Athena...

Greenchef (Toptal client)

Git, CircleCI, AWS Lambda, SQL, Python, PostgreSQL...

Xapo

Microsoft Excel, Google Data Studio, Tableau, Redshift, NiFi, SQL, Python

Experience

Machine Learning - 7 years, SQL - 5 years, Python - 5 years, Spark - 4 years, Generative Pre-trained Transformers (GPT) - 2 years, Computer Vision - 2 years, Natural Language Processing (NLP) - 2 years

Availability

Part-time

Preferred Environment

Jira, Git, OS X

The most amazing...

...project I've developed was an intelligent car that offered a fully personalized driving experience using deep learning and natural language processing.

Work Experience

Data Engineer

2020 - 2021

BAZZE & COMPANY

Developed data pipelines to ingest more than 20 GB of location data daily. Was responsible for developing and deploying the processes to clean data and monitor data quality.
Contributed to the design and developed the back end of the data marketplace, https://bazze.io/. Was responsible for optimizing SQL queries in AWS Athena to reduce response time and cost.
Acted as a subject matter expert for logging (AWS CloudWatch and Sentry), testing (PyTest), CI/CD (GitLab), and monitoring (AWS CloudWatch).
Developed data analysis models to extract location patterns (e.g., daily commuting).
Developed a PoC with Kafka and Python for streaming data analysis.

Technologies: Amazon Web Services (AWS), Spark, Python 3, Python API, Swagger, Amazon Athena, AWS Lambda, Amazon API Gateway, Amazon S3 (AWS S3), Amazon Simple Queue Service (SQS)

Data Engineer

2019 - 2019

Greenchef (Toptal client)

Built a data warehouse on AWS Redshift.
Used AWS DMS to synchronize the production database (MongoDB) with AWS Redshift within seconds.
Developed AWS Lambda functions to validate data quality daily and raise alarms if necessary.
Built a CI/CD framework to develop and automatically run data analysis queries on AWS Redshift.
Developed a set of microservices with AWS Lambda to automatically restart data pipelines in case of a failure.

Technologies: Git, CircleCI, AWS Lambda, SQL, Python, PostgreSQL, AWS Database Migration Service (DMS)

Data Engineer

2019 - 2019

Xapo

Built a data warehouse on AWS (Airflow, Glue, Lambda, Redshift) to generate operational dashboards at every level in the business (customer support, compliance, debit card, and more).
Created ETL data pipelines with Kinesis and Spark to sync data with databases in production.
Created datamarts in BigQuery easily accessible using Excel, Tableau, or Google Data Studio.
Collaborated with all areas of the organization to ensure data quality and integrity.
Ensured compliance with the organization’s data governance policies.
Created a model to predict the number of open tickets by customer. Data points such as the number of transactions, the number of new customers, and the current Bitcoin price were used. The output is exposed as a service and shown on a web app built with Shiny (R).
Designed and analyzed debit card campaign KPIs such as card penetration, customer activity (e.g., time to first buy), retention (churn), transactions (amount, rejected), and more. Results were reported with Google Data Studio (cohorts, line charts).
Created a dynamic Excel sheet to track cash reserves, balances, safeguarding, money in, money out, open transactions, exchange balance, and more. The sheet refreshes its data daily from BigQuery.

Technologies: Microsoft Excel, Google Data Studio, Tableau, Redshift, NiFi, SQL, Python

Data Scientist (Remote)

2017 - 2019

Vodafone

Designed and developed large-scale machine learning algorithms with Impala, Spark, R (Shiny), and Python (Pandas/NumPy/Plotly/TF/Keras) to improve customer retention and product recommendation, analyze customer social networks, and optimize marketing campaigns. The model was deployed in production using AWS SageMaker. A/B testing was used to validate offline results.
Analyzed WhatsApp usage patterns with Spark to understand customer social networks. This information would be used for marketing.
Analyzed network performance and net promoter score to improve the mobile network based on customer satisfaction.
Designed a pricing model with machine learning to offer dynamic pricing on Internet data tariffs. This project focused on customers who occasionally used mobile Internet data. Features such as current customers' data usage, customer segment, customer location, and current price elasticity were used to improve price estimation.
Designed a pricing model with machine learning that optimized the counteroffer price to increase revenue and reduce churn rate. Customer segmentation was used to optimize the price. The model was deployed in production.

Technologies: Plotly, NumPy, Pandas, Keras, TensorFlow, Git, Scala, Python, PySpark, Cloudera, Impala, HDFS, Hadoop

Data Scientist

2015 - 2017

Jaguar Land Rover

Managed stakeholders, planned projects, and designed a strategic roadmap for the research data lab team.
Was directly involved in deploying a scalable automotive data logging system on a fleet of 150 engineering vehicles and in developing large-scale data pipelines on AWS. Technologies used included Spark, Kafka, Parquet, S3, Akka, and Python.
Analyzed driving patterns to enhance advanced driver-assistance systems, applied anomaly detection to improve vehicle reliability and enable failure prediction, and analyzed vehicle component usage to optimize reliability and cost.
Created a data quality testing framework to ensure data integrity.
Designed and developed a library that made it easy to run queries on vehicle data.

Technologies: Docker, Logstash, Elasticsearch, BigQuery, Apache Kafka, Cassandra, HBase, Tableau, RStudio Shiny, R, Python, Scala, Spark, Hadoop

Data Scientist

2015 - 2015

Jaguar Land Rover

Contributed to the design and development of an intelligent car and a cloud-native application on AWS to offer a fully personalized driving experience.
Designed performance metrics to measure the quality of service for each component of the application.
Developed streaming machine learning services to predict user driving routines with Python (scikit-learn and Pandas) and Kafka. Predictions were used for car preconditioning, fuel consumption estimation, destination prediction, and arrival-time estimation.
Created a model to predict user destination based on calendar and email using natural language processing.

Technologies: Amazon Web Services (AWS), Docker, Apache Kafka, Scala, Java, Python, HBase, Cassandra

Machine Learning Engineer

2012 - 2015

Biomedical Engineering Group

Improved state-of-the-art motor imagery brain-computer interface performance by 10% using an online adaptive machine learning model. Spectral, temporal, and spatial EEG characteristics were analyzed to decode motor tasks from brain activity.
Developed a machine learning algorithm for automated diagnosis of obstructive sleep apnea–hypopnea syndrome (SAHS). Desaturations in blood oxygen saturation (SaO2) recordings, respiratory rate variability (RRV), or ECG were measured to extract a set of statistical, spectral, and nonlinear features that helped diagnosis.
Assessed the effectiveness of a motor imagery brain-computer interface application to rehabilitate cognitive functions by neurofeedback training (NFT). Electroencephalogram (EEG) changes measured by relative power (RP) showed evidence that visuospatial, oral language, memory, intellectual, and attention functions improved after performing NFT sessions.

Technologies: Apache Hive, MATLAB

Research Scientist

2014 - 2014

Brain Computer Interface Group, University of Essex, UK

Worked on advanced brain signal processing with multitask learning, transfer learning, domain adaptation, deep learning, auto-encoders, and deep belief neural networks.

Technologies: Python, MATLAB

Software Engineer

2010 - 2012

Agroguia

Developed a machine learning application that allows steering a tractor by means of an EMG-based human-machine interface.

Technologies: GPS, Digital Signal Processing, Java, C++

Experience

Go I-PACE App

https://media.jaguar.com/news/2018/07/go-i-pace-app-puts-electric-jaguar-your-pocket

One of my last projects at Jaguar Land Rover.

Go I-PACE helps would-be buyers understand the potential cost savings of going electric compared to their existing vehicle. The app estimates how I-PACE would fit into your life based on personal journey data.

The Go I-PACE app captures journey data to calculate potential cost savings, show how much battery would be used per trip and tell users how many charges they would need in a week if they were driving the I-PACE.

It calculates the range expected from a full charge based on your vehicle use, the number of charges required in a typical week, and how frequently you would need to top up mid-journey.
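The arithmetic behind these estimates can be sketched roughly as follows. This is only an illustration under stated assumptions, not the app's actual method: the function name, the parameters, and the battery and consumption figures are all hypothetical placeholders.

```python
import math


def weekly_charge_estimate(trip_km, battery_kwh, kwh_per_100km):
    """Estimate full-charge range, battery share per trip, and weekly charges.

    All inputs are hypothetical: `trip_km` is a list of one week's trip
    distances, `battery_kwh` the usable battery capacity, and
    `kwh_per_100km` an assumed average consumption.
    """
    # Distance covered by one full charge at the assumed consumption.
    range_km = battery_kwh / kwh_per_100km * 100
    # Fraction of a full battery used by each trip.
    trip_share = [km / range_km for km in trip_km]
    # Total weekly energy, rounded up to whole full-charge equivalents.
    energy_kwh = sum(trip_km) * kwh_per_100km / 100
    charges = math.ceil(energy_kwh / battery_kwh)
    return range_km, trip_share, charges


# Illustrative week: five 40 km commutes and one 200 km trip.
range_km, trip_share, charges = weekly_charge_estimate(
    [40, 40, 40, 40, 40, 200], battery_kwh=80, kwh_per_100km=20
)
```

A trip whose share exceeds 1.0 would need a mid-journey top-up under these assumptions.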

It can also distinguish between different modes of transport to make sure it collects accurate data, even prompting users to confirm that individual trips were made by car for unusual routes – for instance on journeys made by cycling rather than behind the wheel.

Self-learning Car

https://www.youtube.com/watch?v=F923EuB06CI

Responsible for the delivery of the data analytics components of a car and mobile application to offer a fully personalized driving experience to Jaguar Land Rover customers and help prevent accidents by reducing driver distraction.

Main responsibilities and goals involved:

● Define and implement a scalable real-time workflow for data loading, quality management, and distribution across various systems using Big Data technologies on Amazon Web Services.
● Contribute to the software development lifecycle, including analysis, architecture, design, implementation, and QA.
● Implement complex machine learning solutions hands-on using natural language processing, recommender systems, neural networks, and/or deep learning.
● Write technical documentation and present results to technical and non-technical stakeholders.

Sensors

https://github.com/lnicalo/Sensors

Scala package to process time series from different sensors with Spark.

Processing time series collected from different sensors poses several challenges because the data may not be aligned or share the same sampling rate. Writing data queries can be quite hard for data scientists because the data cannot be expressed in a tabular form.

This library makes it easy to write queries against this kind of dataset.
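To illustrate the alignment problem described above, here is a minimal sketch in Python with pandas. It is not the Sensors library's API (that package is written in Scala for Spark); the function name, column names, and sample readings are hypothetical.

```python
import pandas as pd


def align_sensors(left, right, tolerance="500ms"):
    """Attach to each `left` sample the latest earlier `right` sample.

    Samples further back in time than `tolerance` are left as NaN.
    Both frames need a `timestamp` column (a hypothetical schema).
    """
    left = left.sort_values("timestamp")
    right = right.sort_values("timestamp")
    return pd.merge_asof(
        left,
        right,
        on="timestamp",
        direction="backward",
        tolerance=pd.Timedelta(tolerance),
    )


# Hypothetical readings: speed every second, temperature at irregular times.
speed = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2020-01-01 00:00:00",
        "2020-01-01 00:00:01",
        "2020-01-01 00:00:02",
    ]),
    "speed_kmh": [50.0, 52.0, 55.0],
})
temperature = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2020-01-01 00:00:00.800",
        "2020-01-01 00:00:01.900",
    ]),
    "temp_c": [20.5, 20.7],
})

aligned = align_sensors(speed, temperature)
```

Once the readings share one row per timestamp, they can be queried in tabular form; the first speed sample has no earlier temperature reading, so its `temp_c` stays empty.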

Driver Profile Analysis

Analysis of daily driving patterns of Jaguar Land Rover customers:

- Fuel consumption
- Daily in-car time
- Commute schedule
- Regular routes
- Total distance
- Journey duration
- Driving style
- Refuelling events
- Phone call patterns
- Heated and cooled seat usage
- Radio stations

Data Science Competition - CONNECTOMICS - (16th / 143)

https://www.kaggle.com/c/connectomics

This challenge will stimulate research on network-structure learning from neurophysiological data, including causal discovery methods.

The goal of the data science competition was to predict the directed connections among 1,000 neurons based on the time series of their activity.

My solution involved a mixture of several features such as correlation, mutual information, partial correlation, spectrogram, and frequency analysis.
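As an illustration of one such feature, the sketch below computes a lagged correlation score for every ordered pair of neurons with NumPy. It is an assumption about how a correlation feature could be made directional, not the submitted solution; the function name and the synthetic data are hypothetical.

```python
import numpy as np


def lagged_correlation(activity, lag=1):
    """Return scores[i, j]: correlation of neuron i at t with neuron j at t + lag.

    `activity` has shape (n_neurons, n_samples); `lag` must be at least 1.
    """
    past = activity[:, :-lag]
    future = activity[:, lag:]
    # Standardize each row so the scaled dot product is a Pearson correlation.
    past = (past - past.mean(axis=1, keepdims=True)) / past.std(axis=1, keepdims=True)
    future = (future - future.mean(axis=1, keepdims=True)) / future.std(axis=1, keepdims=True)
    scores = past @ future.T / past.shape[1]
    # Self-connections are not of interest.
    np.fill_diagonal(scores, 0.0)
    return scores


# Synthetic check: neuron 1 repeats neuron 0 one step later; neuron 2 is noise.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
activity = np.stack([x, np.roll(x, 1), rng.normal(size=500)])
scores = lagged_correlation(activity)
```

Because the score is asymmetric, `scores[0, 1]` is high while `scores[1, 0]` stays near zero, which is what lets it hint at the direction of a connection.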

Data Science Competition - Grasp-and-Lift EEG Detection - (15th / 379)

https://www.kaggle.com/c/grasp-and-lift-eeg-detection

This competition challenges you to identify when a hand is grasping, lifting, and replacing an object using EEG data that was taken from healthy subjects as they performed these activities. A better understanding of the relationship between EEG signals and hand movements is critical to developing a BCI device that would give patients with neurological disabilities the ability to move through the world with greater autonomy.

My solution involved a deep neural net developed with Python using Theano and Lasagne.

Skills

Languages

Python 2, Python 3, SQL, Python, R, Scala, Java, C++

Frameworks

Spark, Hadoop, RStudio Shiny, Swagger

Libraries/APIs

Pandas, Spark ML, Keras, NumPy, Scikit-learn, OpenCV, Python API, TensorFlow, Spark Streaming, PySpark, Google Cloud API

Tools

Spark SQL, Amazon Athena, Amazon Elastic MapReduce (EMR), CircleCI, Tableau, Impala, Cloudera, BigQuery, PyCharm, Git, Amazon SageMaker, MATLAB, Apache Airflow, AWS Glue, Jira, Plotly, Microsoft Excel, Superset, Logstash, Amazon CloudWatch, Amazon ElastiCache, Amazon Simple Queue Service (SQS)

Paradigms

Lambda Architecture, Siamese Neural Networks, Microservices Architecture, Agile, Scrum

Platforms

Spark Core, Amazon Web Services (AWS), Docker, Apache Kafka, Azure, OS X, AWS Lambda, Kubernetes

Storage

Apache Hive, HBase, HDFS, Cassandra, Amazon S3 (AWS S3), Redshift, MySQL, Elasticsearch, Google Cloud, PostgreSQL, NoSQL

Other

Convolutional Neural Networks (CNN), Deep Neural Networks, Neural Networks, Deep Learning, Predictive Modeling, Big Data, Machine Learning, Statistics, Recurrent Neural Networks (RNNs), Artificial Neural Networks (ANN), Statistical Analysis, A/B Testing, Customer Analysis, Cohort Analysis, Digital Signal Processing, Signal Processing, EEG, Google BigQuery, Data Visualization, Lambda Functions, Computer Vision, Natural Language Processing (NLP), Agile Data Science, Internet of Things (IoT), Google Cloud ML, OCR, Google Data Studio, Bayesian Statistics, Churn Analysis, Biomedical Skills, ECG, CI/CD Pipelines, GPT, Generative Pre-trained Transformers (GPT), GPS, NiFi, AWS Database Migration Service (DMS), Kappa Architecture, Amplitude, Amazon API Gateway

Education

2013 - 2019

Ph.D. in Biomedical Engineering

University of Valladolid - Valladolid, Spain

2011 - 2012

Master's Degree in Information Technology (Data analysis)

University of Valladolid - Valladolid, Spain

2005 - 2011

Master of Science Degree in Electronic and Telecommunication Engineering

University of Valladolid - Valladolid, Spain
