Carlos Guerreiro

Espoo, Finland
Member since March 9, 2013
Carlos is an exceptional data generalist who brings a vast amount of experience in the design, implementation, and validation of data-intensive systems to all of his projects, along with deep expertise in machine learning and real-time stream processing.
Portfolio
  • Perceptive Constructs
    Machine learning, Python, R, JavaScript, Node.js, C++, Redis
  • MarkaVIP
    Python, R, Java, C++, Kinesis, Redshift, Spark, MySQL, Oracle
  • Codento
    Python, JavaScript, Node.js, CoffeeScript, Java, Ruby on Rails
Experience
  • C++, 20 years
  • Python, 13 years
  • Natural Language Processing, 5 years
  • Machine Learning, 5 years
  • Scientific Computing, 5 years
  • Redis, 3 years
  • D3.js, 2 years
  • Node.js, 2 years
Availability
Part-time
Preferred Environment
Mac OS, Linux, Git, Emacs, Command line, IPython
The most amazing...
...thing I've built is an activity stream relevance filter - a low-latency, supervised learning loop over a deep neural net trained from 1/2 TB of unlabeled data.
Employment
  • Founder
    Perceptive Constructs
    2010 - PRESENT
    • Built a custom activity stream data processing pipeline with Node.js and Redis.
    • Built an unsupervised training pipeline for deep neural network architectures aimed at feature extraction from free text, using Python and Node.js.
    • Built low-latency activity stream relevance filters using Node.js and C++.
    • Built optimized random forest and Naive Bayes classifiers in C++ with bindings to Node.js and Python.
    • Built a real-time web UI for activity stream relevance filtering, using Node.js, Socket.IO, and a custom data/DOM binding framework.
    • Built a low-latency framework for training classifiers in an active learning setting using Node.js, Redis, Socket.IO, and jQuery.
    • Built a hybrid native/HTML custom activity stream client for Android, integrated with filtering.
    • Built a real-time custom recommender system for eCommerce. Hybrid collaborative filtering + content (text and metadata). Python, C++. Distributed and multicore.
    • Built a bespoke transaction risk analysis system for eCommerce. Python + R.
    • Built a custom marketing message timing optimizer for eCommerce. Python + R.
    Technologies: Machine learning, Python, R, JavaScript, Node.js, C++, Redis
  • Director of Data Science
    MarkaVIP
    2015 - 2016
    • Built and deployed a foundational analytical backbone for the company in AWS, around Kinesis, Redshift, and Spark. The design balances key goals of scalability, accessibility to analysts, low admin overhead, and support for both batch and streaming analysis.
    • Integrated continuous data ingestion from key systems into the analytical backbone, whenever practical, through low latency interfaces such as database replication.
    • Migrated some interaction tracking systems to sink directly to the backbone.
    • Migrated key analytical systems to the backbone, including the recommender.
    • Made various improvements to the recommender, including use of fine-grained recorded impressions as a negative signal, and more flexibility in handling of catalog metadata.
    • Built analysis and real-time operation of customer experience interventions to reduce the impact of returns and cancellations. Optimization is by policy search through retrospective simulation on historical data. Operationalized as an HTTP microservice (Python, Kinesis, Redshift).
    • Expanded the above system to improve order profitability by optimizing basket constraints and incentives.
    • Conducted retrospective sourcing performance and pricing analysis. Our systems don't maintain the full history of changes to all the relevant data, so this analysis was done by replaying row mutations continuously captured from database replication logs and stored in Redshift (Python/C++).
    Technologies: Python, R, Java, C++, Kinesis, Redshift, Spark, MySQL, Oracle
  • Software/Data Architect
    Codento
    2011 - 2015
    • Built an image upload/pre-processing pipeline for a media startup, using Node.js and MongoDB on AWS. Included single sign-on with a Ruby on Rails app in the back-end.
    • Implemented real-time path updates on a bespoke structured messaging app, using Node.js (integrated with a Python back-end) and Batman.js.
    • Built custom, interactive data displays for a bespoke structured messaging application using d3.js.
    • Implemented a complex data entry UI for a structured messaging application using Batman.js in CoffeeScript.
    • Built a custom C# distributed data analysis pipeline to perform Matlab jobs on AWS.
    • Contributed to embedded security appliances in C.
    • Contributed to the back-end for structured messaging applications in Python, with Django.
    • Designed and implemented custom interactive data analysis and visualization for economic data. Python back-end + d3.js visualization.
    • Built a custom nurse schedule and route optimization system for a healthcare startup. Pre-processing and mixed integer model formulation for Gurobi in Python and d3.js visualization of solutions.
    • Modernized the system design of a pre-existing real-time transport logistics system for scalability and higher performance. Enterprise Java.
    • Designed and implemented a reference application for a high-security network architecture for a banking customer. Scala/Play, Slick, two-factor authentication.
    • Contributed to large scale online storage system implementation. Python + PostgreSQL.
    • Built a custom Matlab system to tune a legacy application from data using black-box (derivative-free) optimization.
    Technologies: Python, JavaScript, Node.js, CoffeeScript, Java, Ruby on Rails
  • Chief Software Architect
    Nokia | Gear
    2009 - 2010
    • Prototyped a voice- and gesture-based user interface for in-car mobile phone usage at various levels of fidelity ranging from Wizard of Oz to software proof-of-concept (Python, Java, Sphinx).
    • Defined software architecture for a family of in-car products, with input to hardware platform selection.
    • Planned costs, schedule, and execution of multiple new product development scenarios.
    • Planned and moderated usability studies for prototype validation and iteration.
    • Conducted rigorous feasibility studies and software architecture reviews at Gear.
    Technologies: Python, Java, Sphinx
  • Team Lead, Senior R&D Manager
    Nokia | Maemo
    2003 - 2009
    • Recruited and ramped up the Maemo Application Framework team from scratch.
    • Defined application framework architecture and development strategy.
    • Led the implementation of three major software generations along with updates.
    • Impacted Nokia's entry into open-source development.
    • Developed a considerable subcontracting and partnering network for Linux development.
    • Contributed to initial product concept definition.
    Technologies: Maemo
  • Senior Software Engineer
    Nokia | Research Center
    2001 - 2003
    • Prototyped a small-footprint relational database for small Linux devices in C++.
    • Prototyped a personal information manager for handheld devices based on semantic web technology in Python.
    • Studied and evaluated architectural options for an application framework aimed at Linux-based handheld devices, adopted by the nascent Maemo project.
    Technologies: C++, C, Python
  • GIS/Computer Graphics Freelancer
    CGEO.net
    1998 - 2001
    • Built a GIS to edit land cadaster for the Portuguese Ministry of Agriculture using C++, Windows, and Oracle technologies.
    • Built a custom C++ framework to offer real-time manipulation of topologically integrated geographic vector data.
    • Built a geographical decision support system for semi-automated execution (optimization) of land-consolidation projects for specialized consultancy, using C++, in Windows.
    • Developed, licensed, and finally sold a ray-tracing rendering module for use with interior design software, written in C++.
    • Built a GIS to edit an olive tree cadaster for the Portuguese Ministry of Agriculture, with integrated olive tree recognition from aerial photography, built with C++ in Windows.
    Technologies: C++, Python, Oracle
Experience
  • rawhash (Development)
    https://github.com/pconstr/rawhash

    An experimental, binary-friendly alternative to using a hash as a key:value cache, for Node.js.

    Keys are binary buffer objects rather than strings. Values are arbitrary objects.

    rawhash is built on google-sparsehash and murmurhash3 (included).

  • rdb-parser (Development)
    https://github.com/pconstr/rdb-parser

    An asynchronous streaming parser for Redis RDB database dumps, written in 100% JavaScript, intended for use in Node.js.

  • Incremental Random Forest (Development)
    https://github.com/pconstr/irf

    An implementation in C++ (with Node.js and Python bindings) of a variant of Leo Breiman's Random Forests.

    The forest is maintained incrementally as samples are added or removed - rather than fully rebuilt from scratch every time - to save resources.

    It is not a streaming implementation, as all the samples are stored and will be re-seen when required to recursively rebuild invalidated subtrees. The effort to update each individual tree can vary substantially but the overall effort to update the forest is averaged across the trees and tends not to vary significantly.

  • catsagram (Development)
    http://catsagram.perceptiveconstructs.com/

    Rolling Instagram photos of cats, built to experiment with custom data/DOM bindings (data-graft.js), responsive layout (try resizing the window), and Socket.IO.

  • data-graft.js (Development)
    https://github.com/pconstr/data-graft.js

    An animation-friendly, differential DOM template engine, self-contained and framework-agnostic. Built to experiment with dynamic data/DOM binding, with a particular focus on flexibility for animating data-change transitions.

Skills
  • Languages
    C++, C, Python, SQL, MATLAB, JavaScript, R, Java, Scala
  • Libraries/APIs
    Node.js, D3.js, Spark Streaming, Twitter API, SciPy, Scikit-learn, Eigen, NumPy, jQuery, matplotlib, Theano, Facebook API, pandas
  • Platforms
    Linux, Amazon Web Services (AWS), Amazon Kinesis, Google App Engine, Android
  • Storage
    Redis, MongoDB, Amazon Redshift, LevelDB, RocksDB, Cassandra
  • Frameworks
    Django, Bottle, Hadoop
  • Tools
    Apache Spark
  • Paradigms
    Parallel programming, Distributed programming, Functional programming
  • Misc
    Scientific Computing, Machine Learning, Tornado, Natural Language Processing
Education
  • Master's degree in Computer Science
    Universidade Nova de Lisboa - Lisbon, Portugal
    1991 - 1996