Data Scientist | 2016 - PRESENT | Pulpix
Technologies: Spark, Scala, Python, Django, Hadoop, HDFS, Parquet, Elasticsearch, Kafka
- Set up big data infrastructure.
- Used natural language processing to explore video content.
- Created a recommender system based on collaborative filtering.
- Created a hybrid recommender system combining natural language processing, collaborative filtering, and context information.
- Created a streaming pipeline from Django to Cassandra, Azure Storage, and HDFS.
Research Student | 2016 - 2016 | Imperial College London, Hamlyn Research Center
Technologies: Python, scikit-learn, TensorFlow, Keras
- Advanced the state of the art in cancer detection using pCLE cell images.
- Investigated the merits of integrating computer vision features with deep learning architectures.
- Created a test framework to improve the deep learning model for cancer classification.
- Created a preprocessing pipeline for images to make models resilient to noise.
- Implemented a web scraper to download images in large quantities.
- Created a deduplication system to filter out duplicated images.
Software Developer | 2013 - 2013 | CERN
Technologies: C, µC/OS-II
- Refactored code following "clean code" conventions.
- Reused a specific communication protocol.
- Developed a profiling function for Ethernet communications, as part of a feasibility study.
- Implemented a communication feature in the field regulation.
- Developed another function based on established communication.