Córdoba, Argentina
Member since May 29, 2014
A disciplined developer with a scientific background, Rafael is a true specialist in machine learning and natural language processing. He has excellent communication skills and loves to work in teams. Rafael is always proactively thinking of ways to add more value to his clients.
  • Python, 6 years
  • Natural Language Processing, 5 years
  • Machine Learning, 4 years
  • Back-end Development, 2 years
Preferred Environment
Python, Git, Ubuntu, Debian
The most amazing...
...project I've led is the development of Quepy, a tool for translating natural language questions to queries for knowledge graph databases.
  • Lead Developer
    2012 - PRESENT
    • Led development of a click-through rate predictor used for image content selection.
    • Gave training on machine learning and natural language processing to groups of developers.
    • Developed a myriad of open source projects for Machinalis, including REfO, Quepy, Lepy, Featureforge, SimpleAI, and Yalign.
    • Developed and led the modernization and upgrading of a machine translation system based on Moses.
    • Led development of Quepy, an open source framework for translating natural language questions to queries for knowledge graph databases.
    Technologies: scikit-learn, NLTK, MongoDB, scikit-image, NumPy, SciPy
  • PhD Student
    FaMAF, Universidad Nacional de Córdoba
    2011 - 2012
    • Took a postgraduate course on natural language generation.
    • Took several courses at the European Summer School in Logic, Language and Information (ESSLLI).
    • Received a 3-year scholarship given by CONICET to pursue a PhD.
    • Engaged in a scientific exchange program during which I spent a month working at Inria Rennes, Bretagne, France.
    • Published two scientific papers, one at MDPI's Journal of Algorithms and the other at the Journal of Discrete Algorithms.
    Technologies: Python, Natural Language Processing, Machine Learning
  • Developer for Scientific Experiments
    Natural Language Processing Group, FaMAF, Universidad Nacional de Córdoba
    2010 - 2011
    • Developed a text normalization pipeline for short pieces of text (such as ads and SMS messages).
    • Developed experiments on grammatical inference for data compression.
    Technologies: Python, Natural Language Processing, Machine Learning
  • Quepy: Transform natural language to database queries. (Development)

    I was the lead developer for Quepy, a Python framework for transforming natural language questions into queries in a database query language. It can be easily customized for different kinds of natural language questions and database queries. With a little coding, users can build their own system for natural language access to their databases.

  • Sentiment analysis on movie reviews (Development)

    I developed an entry to Kaggle's Sentiment Analysis on Movie Reviews competition in Python. The code uses machine learning and natural language processing techniques and showcases the development methodology I prefer.

  • Information extraction framework (Development)

    I was the lead developer for IEPY, an open source Python framework for information extraction on unstructured documents. It uses natural language processing and partially supervised machine learning techniques.

  • Comparable corpora sentence alignment (Development)

    I was the lead developer for Yalign, a tool for extracting parallel sentences from comparable corpora (e.g., Wikipedia). Statistical machine translation relies on parallel corpora (e.g., Europarl) for training translation models. However, these corpora are limited and take time to create. Yalign is designed to automate this process by finding sentences that are close translation matches from comparable corpora.

  • Regular expressions for objects (Development)

    I developed REfO, a small open source library to support regular expressions for sequences of arbitrary Python objects (and not just sequences of characters).

  • Featureforge (Development)

    I was a developer on Featureforge, a set of tools for creating and testing machine learning features, with a scikit-learn compatible API.

  • SimpleAI (Development)

    I developed some of the algorithms in SimpleAI, a Python library that implements many of the artificial intelligence algorithms described in the book "Artificial Intelligence: A Modern Approach", by Stuart Russell and Peter Norvig.

  • Member of Natural Language Processing group (Other amazing things)

    Between late 2007 and 2012, I was a member of the NLP group at FaMAF, Universidad Nacional de Córdoba, and took part in several of its activities. These included graduate and postgraduate courses, reading groups, weekly meetings, and more.

  • Languages
    Python, C
  • Libraries/APIs
    Scikit-learn, NumPy, NLTK, Facebook API, Twitter API, SciPy
  • Misc
    Machine Learning, Natural Language Processing, Back-end Development, Text Processing, Data Analysis
  • Tools
    Git, IPython Notebook, scikit-image
  • Paradigms
    Minimum Viable Product, Agile Software Development, Unit Testing
  • Platforms
    Ubuntu, Linux, Amazon Web Services (AWS), AWS EC2
  • Storage
  • Master's degree in Computer Science
    Universidad Nacional de Córdoba - Córdoba, Argentina
    2003 - 2010
