Luis Nicolas-Alonso, Machine Learning Developer in Barcelona, Spain

Member since September 21, 2018
Luis is a seasoned data scientist with a strong background in mathematics, software engineering, and machine learning. He has a proven track record of success developing scalable data analytics applications in the cloud and collaborating with technical and non-technical stakeholders. Luis is passionate about running data science projects with agile management, test-driven development, and continuous delivery.


    Amazon Web Services (AWS), Spark, Python 3, Python API, Swagger, AWS Athena...
  • Greenchef (Toptal client)
    Git, CircleCI, AWS Lambda, SQL, Python, PostgreSQL...
  • Xapo
    Microsoft Excel, Google Data Studio, Tableau, Redshift, NiFi, SQL, Python



Barcelona, Spain



Preferred Environment

Jira, Git, OS X

The most amazing...

...project I've developed was an intelligent car that offered a fully personalized driving experience using deep learning and natural language processing.


  • Data Engineer

    2020 - 2021
    • Developed data pipelines to ingest more than 20 GB of location data daily. Was responsible for developing and deploying the processes to clean data and monitor data quality.
    • Contributed to the design and developed the back end of the data marketplace. Was responsible for optimizing SQL queries in AWS Athena to reduce response time and cost.
    • Acted as a subject matter expert for logging (AWS CloudWatch and Sentry), testing (pytest), CI/CD (GitLab), and monitoring (AWS CloudWatch).
    • Developed data analysis models to extract location patterns (e.g., daily commuting).
    • Developed a PoC with Kafka and Python for streaming data analysis.
    Technologies: Amazon Web Services (AWS), Spark, Python 3, Python API, Swagger, AWS Athena, AWS Lambda, AWS API Gateway, AWS S3, Amazon SQS
  • Data Engineer

    2019 - 2019
    Greenchef (Toptal client)
    • Built a data warehouse on AWS Redshift.
    • Used AWS DMS to synchronize the production database (MongoDB) with AWS Redshift within seconds.
    • Developed AWS Lambda functions to validate data quality daily and raise alarms if necessary.
    • Built a CI/CD framework to develop and automatically run data analysis queries on AWS Redshift.
    • Developed a set of microservices with AWS Lambda to automatically restart data pipelines in case of a failure.
    Technologies: Git, CircleCI, AWS Lambda, SQL, Python, PostgreSQL, AWS Database Migration Service
  • Data Engineer

    2019 - 2019
    Xapo
    • Built a data warehouse on AWS (Airflow, Glue, Lambda, Redshift) to generate operational dashboards at every level in the business (customer support, compliance, debit card, and more).
    • Created ETL data pipelines with Kinesis and Spark to sync data with databases in production.
    • Created datamarts in BigQuery easily accessible using Excel, Tableau, or Google Data Studio.
    • Collaborated with all areas of the organization to ensure data quality and integrity.
    • Ensured compliance with the organization’s data governance policies.
    • Created a model to predict the number of open tickets by customer. Data points such as the number of transactions, the number of new customers, and the current Bitcoin price were used. The output was exposed as a service and shown on a web app built with Shiny (R).
    • Designed and analyzed debit card campaign KPIs such as card penetration, customer activity (e.g., time to first buy), retention (churn), transactions (amount, rejected), and more. Results were reported with Google Data Studio (cohorts, line charts).
    • Created a dynamic Excel sheet to track cash reserves, balances, safeguarding, money in, money out, open transactions, exchange balance, and more. The sheet refreshes its data daily from BigQuery.
    Technologies: Microsoft Excel, Google Data Studio, Tableau, Redshift, NiFi, SQL, Python
  • Data Scientist (Remote)

    2017 - 2019
    • Designed and developed large-scale machine learning algorithms with Impala, Spark, R (Shiny), and Python (Pandas/NumPy/Plotly/TF/Keras) to improve customer retention and product recommendation, analyze customer social networks, and optimize marketing campaigns. The model was deployed in production using AWS SageMaker. A/B testing was used to validate offline results.
    • Analyzed WhatsApp usage patterns with Spark to understand customer social networks. This information would be used for marketing.
    • Analyzed network performance and net promoter score to improve the mobile network based on customer satisfaction.
    • Designed a pricing model with machine learning to offer dynamic pricing on Internet data tariffs. This project focused on customers who occasionally used mobile Internet data. Inputs such as customers' current data usage, customer segment, customer location, and current price elasticity were used to improve price estimation.
    • Designed a pricing model with machine learning that optimized the counteroffer price to increase revenue and reduce the churn rate. Customer segmentation was used to optimize the price. The model was deployed in production.
    Technologies: Plotly, NumPy, Pandas, Keras, TensorFlow, Git, Scala, Python, PySpark, Cloudera, Impala, HDFS, Hadoop
  • Data Scientist

    2015 - 2017
    Jaguar Land Rover
    • Managed stakeholders, planned projects, and designed a strategic roadmap for the research data lab team.
    • Directly involved in deploying a scalable automotive data logging system on a fleet of 150 engineering vehicles, and developing large-scale data pipelines on AWS. Technologies used included Spark, Kafka, Parquet, S3, Akka, and Python.
    • Analyzed driving patterns to enhance advanced driver-assistance systems, applied anomaly detection to improve vehicle reliability and enable failure prediction, and analyzed vehicle component usage to optimize reliability and cost.
    • Created a data quality testing framework to ensure data integrity.
    • Designed and developed a library that made it easy to run queries on vehicle data.
    Technologies: Docker, Logstash, Elasticsearch, BigQuery, Apache Kafka, Cassandra, HBase, Tableau, RStudio Shiny, R, Python, Scala, Spark, Hadoop
  • Data Scientist

    2015 - 2015
    Jaguar Land Rover
    • Contributed to the design and development of an intelligent car and a cloud-native application on AWS to offer a fully personalized driving experience.
    • Designed performance metrics to measure the quality of service for each component of the application.
    • Developed streaming machine learning services to predict user driving routines with Python (Sklearn and Pandas) and Kafka. Predictions were used for car preconditioning, fuel consumption estimation, destination prediction, or estimating the time of arrival.
    • Created a model to predict user destination based on calendar and email using natural language processing.
    Technologies: Amazon Web Services (AWS), Docker, Apache Kafka, Scala, Java, Python, HBase, Cassandra
  • Machine Learning Engineer

    2012 - 2015
    Biomedical Engineering Group
    • Improved state-of-the-art motor imagery brain-computer interface performance by 10% using an online adaptive machine learning model. Spectral, temporal, and spatial EEG characteristics were analyzed to decode motor tasks from brain activity.
    • Developed a machine learning algorithm for automated diagnosis of obstructive sleep apnea–hypopnea syndrome (SAHS). Desaturations in blood oxygen saturation (SaO2) recordings, respiratory rate variability (RRV), or ECG were measured to extract a set of statistical, spectral, and nonlinear features that aided diagnosis.
    • Assessed the effectiveness of a motor imagery brain-computer interface application to rehabilitate cognitive functions by neurofeedback training (NFT). Electroencephalogram (EEG) changes measured by relative power (RP) showed evidence that visuospatial, oral language, memory, intellectual, and attention functions improved after performing NFT sessions.
    Technologies: Apache Hive, MATLAB
  • Research Scientist

    2014 - 2014
    Brain Computer Interface Group, University of Essex, UK
    • Worked on advanced brain signal processing with multitask learning, transfer learning, domain adaptation, deep learning, auto-encoders, and deep belief neural networks.
    Technologies: Python, MATLAB
  • Software Engineer

    2010 - 2012
    • Developed a machine learning application that allows steering a tractor by means of an EMG-based human-machine interface.
    Technologies: GPS, Digital Signal Processing, Java, C++


  • Go I-PACE App

    One of my last projects at Jaguar Land Rover.

    Go I-PACE helps would-be buyers understand the potential cost savings of going electric compared to their existing vehicle. The app estimates how the I-PACE would fit into your life based on personal journey data.

    The Go I-PACE app captures journey data to calculate potential cost savings, show how much battery would be used per trip and tell users how many charges they would need in a week if they were driving the I-PACE.

    It calculates the range expected from a full charge based on your vehicle use, the number of charges required in a typical week, and how frequently you would need to top up mid-journey.

    It can also distinguish between different modes of transport to make sure it collects accurate data, even prompting users to confirm that individual trips were made by car for unusual routes – for instance on journeys made by cycling rather than behind the wheel.

  • Self-learning Car

    Responsible for the delivery of the data analytics components of a car and mobile application to offer a fully personalized driving experience to Jaguar Land Rover customers and help prevent accidents by reducing driver distraction.

    Main responsibilities and goals involved:

    ● Define and implement scalable real-time workflows for data loading, quality management, and distribution across various systems using big data technologies on Amazon Web Services.
    ● Contribute to the software development lifecycle including the analysis, architecture, design, implementation, and QA.
    ● Hands-on work directly implementing complex machine learning solutions using Natural Language Processing, recommender systems, neural networks and/or deep learning.
    ● Write technical documentation and present results to technical and non-technical stakeholders.

  • Sensors

    Scala package to process time series from different sensors with Spark.

    Processing time series collected from different sensors poses several challenges because the data may not be aligned or share the same sampling rate. Writing data queries can be quite hard for data scientists because the data cannot be expressed in tabular form.

    This library makes it easy to write queries against this kind of dataset.
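
    The alignment problem described above can be illustrated with a minimal sketch. It is written in plain Python, not the package's Scala/Spark API (which is not shown here), and the `align_asof` helper, the sensor names, and the sample values are all hypothetical. It attaches to each sample of one sensor the most recent reading of another, which turns two unaligned series into one table.

```python
from bisect import bisect_right


def align_asof(base, other):
    """For each (timestamp, value) in `base`, attach the most recent value
    from `other` recorded at or before that timestamp (None if none exists).
    Both inputs must be lists of (timestamp, value) sorted by timestamp."""
    other_times = [t for t, _ in other]
    rows = []
    for t, value in base:
        # Count of samples in `other` whose timestamp is <= t.
        i = bisect_right(other_times, t)
        rows.append((t, value, other[i - 1][1] if i > 0 else None))
    return rows


# Hypothetical readings: speed sampled every second, temperature irregularly.
speed = [(0.0, 10.0), (1.0, 12.0), (2.0, 15.0), (3.0, 14.0)]
temperature = [(0.5, 20.1), (2.2, 20.4)]

# Each row is (timestamp, speed, last known temperature).
table = align_asof(speed, temperature)
```

    Once the series share one set of timestamps, ordinary tabular queries (filters, aggregations, joins) apply directly.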

  • Driver Profile Analysis

    Analysis of daily driving patterns of Jaguar Land Rover customers:

    - Fuel consumption
    - Daily in-car time
    - Commute schedule
    - Regular routes
    - Total distance
    - Journey duration
    - Driving style
    - Refuelling events
    - Phone call patterns
    - Heated and cooled seat usage
    - Radio stations

  • Data Science Competition - CONNECTOMICS - (16th / 143)

    This challenge will stimulate research on network-structure learning from neurophysiological data, including causal discovery methods.

    The goal of the data science competition was to predict the directed connections among 1,000 neurons based on time series of their activity.

    My solution involved a mixture of several features such as correlation, mutual information, partial correlation, spectrogram, and frequency analysis.
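
    As an illustration of one feature of this kind, the sketch below computes the Pearson correlation between two activity traces, with an optional time lag as a crude hint of direction. It is not the competition code; the function names, the `lag` parameter, and the toy traces are assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def lagged_correlation(source, target, lag=1):
    """Correlation between source[t] and target[t + lag]."""
    if lag == 0:
        return pearson(source, target)
    return pearson(source[:-lag], target[lag:])


# Toy traces: neuron_b repeats neuron_a's activity one time step later.
neuron_a = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]
neuron_b = [0, 0, 1, 0, 0, 1, 1, 0, 1, 0]

forward = lagged_correlation(neuron_a, neuron_b, lag=1)   # a leading b
backward = lagged_correlation(neuron_b, neuron_a, lag=1)  # b leading a
```

    In this toy case the forward score is higher than the backward one, which is the asymmetry a directed-connection feature tries to capture.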

  • Data Science Competition - Grasp-and-Lift EEG Detection - (15th / 379)

    This competition challenges you to identify when a hand is grasping, lifting, and replacing an object using EEG data that was taken from healthy subjects as they performed these activities. A better understanding of the relationship between EEG signals and hand movements is critical to developing a BCI device that would give patients with neurological disabilities the ability to move through the world with greater autonomy.

    My solution involved a deep neural net developed with Python using Theano and Lasagne.


  • Languages

    Python 2, Python 3, SQL, Python, R, Scala, Java, C++
  • Frameworks

    Spark, AWS EMR, Hadoop, RStudio Shiny, Swagger
  • Libraries/APIs

    Pandas, Spark ML, Keras, NumPy, Scikit-learn, OpenCV, Python API, TensorFlow, Spark Streaming, PySpark, Google Cloud API
  • Tools

    Spark SQL, AWS Athena, CircleCI, Tableau, Impala, Cloudera, BigQuery, PyCharm, Git, Amazon SageMaker, MATLAB, Apache Airflow, AWS Glue, Jira, Plotly, Microsoft Excel, Superset, Logstash, AWS CloudWatch, AWS ElastiCache, Amazon SQS
  • Paradigms

    Lambda Architecture, Siamese Neural Networks, Microservices Architecture, Agile, Scrum
  • Platforms

    Spark Core, Amazon Web Services (AWS), Docker, Apache Kafka, Azure, OS X, AWS Lambda, Kubernetes
  • Storage

    Apache Hive, HBase, HDFS, Cassandra, AWS S3, Redshift, MySQL, Elasticsearch, Google Cloud, PostgreSQL, NoSQL
  • Other

    Convolutional Neural Networks, Deep Neural Networks, Neural Networks, Deep Learning, Predictive Modeling, Big Data, Machine Learning, Statistics, Recurrent Neural Networks, Artificial Neural Networks (ANN), Statistical Analysis, A/B Testing, Customer Analysis, Cohort Analysis, Digital Signal Processing, Signal Processing, EEG, Google BigQuery, Data Visualization, Lambda Functions, Computer Vision, Natural Language Processing (NLP), Agile Data Science, Internet of Things (IoT), Google Cloud ML, AWS, OCR, Google Data Studio, Bayesian Statistics, Churn Analysis, Biomedical Skills, ECG, CI/CD Pipelines, GPS, NiFi, AWS Database Migration Service, Kappa Architecture, Amplitude, AWS API Gateway


  • Ph.D. in Biomedical Engineering
    2013 - 2019
    University of Valladolid - Valladolid, Spain
  • Master's Degree in Information Technology (Data analysis)
    2011 - 2012
    University of Valladolid - Valladolid, Spain
  • Master of Science Degree in Electronic and Telecommunication Engineering
    2005 - 2011
    University of Valladolid - Valladolid, Spain
