Miguel Vazquez
Verified Expert in Engineering
Data Science Developer
Barcelona, Spain
Toptal member since January 17, 2022
Miguel has a Ph.D. in computer science and bioinformatics. He learned about data mining 20 years ago and discovered a great field of application in bioinformatics. Since then, Miguel has worked in many areas focusing on cancer genomics (text mining, systems biology, web development, and workflows). He has written hundreds of thousands of lines of open-source code and developed the Rbbt framework, one of the most effective tools for data analysis, downloaded over a million times.
Experience
- Machine Learning - 20 years
- Text Mining - 15 years
- Programming - 15 years
- Data Analytics - 15 years
- Unix - 15 years
- Ruby - 15 years
- Pipelines - 10 years
- Statistics - 10 years
Preferred Environment
Linux, Vi, SSH
The most amazing...
...tool I've developed is the Rbbt framework, which has made me one of the most effective programmers in the field of bioinformatics.
Work Experience
Head of Unit
Barcelona Supercomputing Center
- Developed a suite of bioinformatics pipelines covering a wide range of functionalities: DNA and RNA-Seq alignment, variant calling, quantification, clonality, cohort statistics (cancer drivers, survival, etc.), drug response, synergies, modelling...
- Extended my own bioinformatics framework (Rbbt) with features that improve integration with HPC: flexible and elastic resource allocation, and automatic deployment of heterogeneous workloads across sites and containers.
- Developed interactive web portals to visualize terabytes of genomics data, with security policies that respect data privacy. Integrated and developed multiple data visualization frameworks to explore the different data types and support on-demand analyses.
Postdoctoral Researcher
Norwegian University of Science and Technology
- Developed a complex pipeline for the prioritization of targeted drug combination treatments for cancer based on integrative analysis of multiple genomics data sources, Bayesian statistics modeling, and Boolean cell signaling simulations.
- Created a comprehensive resource for gene transcription regulation information based on our own text mining and integration with multiple curated database resources. Resolved cross-species integration, normalization issues, and quality assessment.
- Built a tool to assess drug synergies across arrow drug sensitivity assays, implementing the most widely used statistics: CI, Bliss, HSA, etc. Produced interactive plots. Supported massive execution batches using HPC and the Rbbt workflow enactment.
Postdoctoral Researcher
Spanish National Cancer Center
- Released my own bioinformatics framework, Rbbt (Ruby Bioinformatics Toolkit). Arguably, it's the most comprehensive framework for developing bioinformatics applications. The core package (rbbt-util) has been downloaded more than 1.2 million times.
- Made essential contributions to the Pan-Cancer Analysis of Whole Genomes (PCAWG), an international project: web data visualization portals, functional annotation of variants, driver predictions, and pathway enrichment analyses (statistics).
- Developed a workflow enactment engine for Rbbt with many advanced features not present in competing solutions: cmdline + HTML + web services, multi-step forking streaming API, HTTP hijacking, and elastic concurrency.
Teaching Assistant
Universidad Complutense de Madrid
- Developed Rbbt (Ruby Bioinformatics Toolkit) incrementally through the different projects I was involved with. It supports data processing, text mining, and web development, among other things.
- Developed several bioinformatics web applications: text mining for the functional description of gene lists through NMF, functional enrichment analyses of genes and proteins across multiple databases, and named-entity recognition and normalization.
- Produced the first SOAP and REST web services for the different projects in my group.
Freelance Junior Programmer
Several Businesses
- Contributed to the development of the system used by Jazztel to manage internal orders and provisions using Java Spring.
- Built a tool to extract all literal strings of text in code and replace them with dictionary entries to support the localization of a large web application for car rental.
- Developed a clustering model to process survey responses in a sociological study.
Experience
Ruby Bioinformatics Toolkit
http://mikisvaz.github.io/rbbt/
The project was developed with bioinformatics in mind, but its functionalities are applicable to any field of data analysis. It has been used in nearly 50 different projects of all sizes, from small utilities to support larger investigations to crucial components of massive international projects.
PCAWG Scout
Text-mining for Transcription Regulation Information
https://extri.org/
Education
Ph.D. in Computer Science & Bioinformatics
Universidad Complutense de Madrid - Madrid, Spain
Bachelor's Degree in Computer Science
Universidad Complutense de Madrid - Madrid, Spain
Certifications
Management Fundamentals for Scientists and Researchers in Business Administration
IE Business School
Skills
Libraries/APIs
Microsoft HPC
Languages
Ruby, JavaScript, Python, R, HTML, C, Perl, Java
Platforms
Unix, Linux
Storage
Databases, NoSQL
Paradigms
Management
Industry Expertise
Bioinformatics
Other
Programming, Algorithms, Text Mining, Web Development, Web Services, Frameworks, Pipelines, Machine Learning, Data Mining, Genomics, Data Analytics, Data Analysis, Vi, SSH, Data Science, Oncology & Cancer Treatment, Data Manipulation, Data Extraction, Data Collection, Statistics, Clustering, Data Visualization, Web Scraping, Hypothesis Testing, Regression, Biology, Molecular Biology, Computational Biology, Large Data Sets, Accounts, Business, Marketing Mix, Boolean Modeling, Drug Development, Natural Language Processing (NLP), Artificial Intelligence (AI), Deep Learning, Generative Pre-trained Transformers (GPT)