
Toni Cebrián
Verified Expert in Engineering
Machine Learning Developer
A rare mixture of data scientist and data engineer, Toni is able to lead projects from conception and prototyping to deploying at scale in the cloud.
Preferred Environment
Linux
The most amazing...
...experience has been giving a talk on typeclasses in Scala at a local Scala meetup group.
Work Experience
Founder
D5.ai
- Ingested the Bitcoin graph into a Neo4j database using Airflow to periodically crawl BigQuery tables with Bitcoin transactions.
- Created asyncio web crawlers in Python to scrape websites with newsworthy content.
- Maintained and evolved an SDK in Scala and Haskell for accessing web APIs from customers using those languages.
Lead Data Engineer
Coinfi
- Created the ETL orchestration systems using Airflow with Composer in Google Cloud.
- Created scraping services for getting crypto data (prices, events, news) to ingest into the platform.
Head of Data Science
Stuart
- Designed the company's data warehouse using Redshift.
- Created a forecasting model for predicting driver logins to the platform and the deliveries to be served.
- Architected an event sourcing system for complex event processing.
- Deployed a route optimization algorithm for picking drivers based on route and package size.
- Created the data science team from scratch.
Chief Data Officer
Enerbyte
- Architected the infrastructure for ingesting data from IoT devices.
- Researched algorithms for energy disaggregation from a single point of measure.
- Created the data science team from scratch.
Head of Data Science
Softonic
- Created a recommender system based on textual content from app reviews.
- Developed an improved search engine using machine learning and Solr.
- Created the data science team from scratch. Hired all relevant profiles and set up the OKRs and managerial tasks.
Experience
Typeclasses Talk
https://github.com/tonicebrian/typeclasses-talk
SGF Parser in Haskell
https://github.com/tonicebrian/sgf
Skills
Languages
Python, Python 3, Scala, SQL, RDF, Haskell, C++, Java
Frameworks
Spark, Akka, Hadoop
Libraries/APIs
Spark Streaming, Pandas, Scikit-learn, NumPy, PubSubJS, Python Asyncio, TensorFlow, XGBoost
Tools
Apache Airflow, Cloud Dataflow, Apache Beam, Solr, Apache Avro
Paradigms
Functional Programming, Data Science, Reactive Programming
Other
Machine Learning, Akka HTTP, Data Mining, Data Engineering, Artificial Intelligence (AI), Crypto, NEO, Data Flows, Recommendation Systems, Word2Vec, Semantic Web, Web Scraping, Natural Language Processing (NLP), Deep Learning, Financial Modeling, Monte Carlo Simulations
Platforms
Apache Kafka, Linux
Storage
Redshift, Cassandra, Google Cloud, Redis
Education
Master's Degree in Artificial Intelligence
Universitat Politecnica de Catalunya - Barcelona, Spain
Postgraduate Degree in Quantitative Techniques for Financial Products
Universitat Politecnica de Catalunya - Barcelona, Spain
Certifications
Cloudera Certified Hadoop Professional
Cloudera