Tanguy Coatalem, Big Data Developer in London, United Kingdom

Member since July 15, 2015
Tanguy is a young, active engineer from the EPF, a French graduate school of engineering, with a specialization in computer science. He is an experienced developer who has worked with a wide range of technologies, including C programming for embedded systems, web development with the Ruby on Rails framework, and Big Data projects with Apache Spark. In addition, his experience as a consultant has given him strong communication skills.



  • Python, 5 years
  • Big Data, 3 years
  • Recommendation Systems, 2 years
  • Deep Learning, 2 years
  • Natural Language Processing (NLP), 2 years
  • Image Recognition, 2 years
  • Apache Spark, 2 years
  • Machine Learning, 1 year



Preferred Environment

Spyder, Eclipse, Sublime Text, Windows, Linux

The most amazing... I've developed is a system for CERN that allowed several power converters to regulate the magnetic field collectively.


  • Data Scientist

    2016 - PRESENT
    • Set up big data infrastructure.
    • Used natural language processing to explore video content.
    • Created a recommender system based on collaborative filtering.
    • Created a hybrid recommender system using mixtures of natural language processing, collaborative filtering, and context information.
    • Created a streaming pipeline from Django to Cassandra, Azure Storage, and HDFS.
    Technologies: Spark, Scala, Python, Django, Hadoop, HDFS, Parquet, Elasticsearch, Kafka
  • Research Student

    2016 - 2016
    Imperial College London, Hamlyn Research Center
    • Advanced the state of the art in cancer detection using pCLE cell images.
    • Investigated the merits of integrating computer vision features with deep learning architectures.
    • Created a test framework to improve the deep learning model for cancer classification.
    • Created a preprocessing pipeline for images to make models resilient to noise.
    • Implemented a web scraper to download images in large quantities.
    • Created a deduplication system to filter out duplicated images.
    Technologies: Python, Scikit-learn, TensorFlow, Keras
  • Software Developer

    2013 - 2013
    • Refactored code following "clean code" conventions.
    • Reused a specific communication protocol.
    • Developed a profiling function for Ethernet communications, as part of a feasibility study.
    • Implemented a communication feature in the field regulation.
    • Developed another function based on established communication.
    Technologies: C, µC/OS-II


  • Caterpillar Tube Cost Prediction (Development)

    A model, built with Python and its scientific libraries for machine learning, that estimates the cost of a Caterpillar tube from a set of features of the tube and its composition.

  • Click-through Rate Prediction (Development)

    An Apache Spark program that predicts the click-through rate from categorical features. The features are first converted to numerical form with one-hot encoding and then fed to a model built with the MLlib library.
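
    The encoding step can be illustrated with a minimal sketch. This is plain Python rather than the Spark/MLlib code used in the project, and the feature names and sample rows are hypothetical, chosen only to show how categorical values become 0/1 vectors.

```python
# Illustrative sketch only: plain Python, not the project's Spark/MLlib code.
# The feature names and sample rows below are hypothetical.

def build_index(rows):
    """Map each distinct (column, value) pair to a position in the encoded vector."""
    pairs = sorted({(col, val) for row in rows for col, val in row.items()})
    return {pair: i for i, pair in enumerate(pairs)}


def one_hot(row, index):
    """Return a 0/1 vector with a 1 at the position of each (column, value) in the row."""
    vec = [0] * len(index)
    for col, val in row.items():
        pos = index.get((col, val))
        if pos is not None:  # categories not seen when building the index are ignored
            vec[pos] = 1
    return vec


rows = [
    {"device": "mobile", "site": "news"},
    {"device": "desktop", "site": "news"},
    {"device": "mobile", "site": "sports"},
]
index = build_index(rows)
encoded = [one_hot(r, index) for r in rows]
```

    Each row ends up with exactly one 1 per categorical column, which is the form a linear model such as logistic regression can consume.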

  • Song Classification Model (Development)

    An Apache Spark program that classifies songs by year of release, based on a set of representative features of the songs.

  • Movie Recommender System (Development)

    A simple recommender system based on user history input, built with Apache Spark and MLlib, Spark's machine learning library.
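
    As a rough illustration of recommending from user history, the sketch below uses user-based collaborative filtering with cosine similarity in plain Python. This is an assumed stand-in technique, not the MLlib approach used in the project, and the users, movies, and ratings are hypothetical.

```python
from math import sqrt

# Hypothetical rating history: user -> {movie: rating}.
ratings = {
    "alice": {"Alien": 5, "Heat": 4},
    "bob": {"Alien": 5, "Heat": 4, "Up": 2},
    "carol": {"Up": 5, "Heat": 1},
}


def cosine(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=3):
    """Rank movies the user has not seen by similarity-weighted ratings of other users."""
    seen = ratings[user]
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, theirs)
        for movie, rating in theirs.items():
            if movie not in seen:
                scores[movie] = scores.get(movie, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

    Calling recommend("alice", ratings) returns only movies that Alice has not rated yet, ordered by how strongly similar users rated them.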


  • Languages

    Python, Scala, SQL, Java, R, C
  • Frameworks

    Apache Spark, AngularJS
  • Libraries/APIs

    Scikit-learn, SQLAlchemy, SciPy, NumPy
  • Other

    Image Recognition, Classification Algorithms, Parquet, Recommendation Systems, Deep Learning, Machine Learning, Big Data, Natural Language Processing (NLP), Apache Cassandra, Web Development, Web Services, Data Mining, Web Scraping
  • Paradigms

    Data Science, MapReduce, Object-oriented Programming (OOP), Functional Programming
  • Platforms

    Apache Kafka, Linux, Android
  • Storage

    HDFS, NoSQL, MySQL, PostgreSQL


  • Master of Science degree in Machine Learning
    2015 - 2016
    Imperial College London - London
  • Diplôme d'Ingénieur (Engineering) degree in Computer Science and Engineering
    2010 - 2015
    EPF, Graduate School of Engineering - Paris, France