C++ Developer in Espoo, Finland
Founder (2010 - PRESENT), Perceptive Constructs
- Built a custom activity stream data processing pipeline with Node.js and Redis.
- Built an unsupervised training pipeline for deep neural network architectures aimed at feature extraction from free text, using Python and Node.js.
- Built low-latency activity stream relevance filters using Node.js and C++.
- Built optimized random forest and Naive Bayes classifiers in C++ with bindings to Node.js and Python.
- Built a real-time web UI for activity stream relevance filtering, using Node.js, Socket.IO, and a custom data/DOM binding framework.
- Built a low-latency framework for training classifiers in an active learning setting using Node.js, Redis, Socket.IO, and jQuery.
- Built a hybrid native/HTML custom activity stream client for Android, integrated with filtering.
- Built a real-time custom recommender system for eCommerce. Hybrid collaborative filtering + content (text and metadata). Python, C++. Distributed and multicore.
- Built a bespoke transaction risk analysis system for eCommerce. Python + R.
- Built a custom marketing message timing optimizer for eCommerce. Python + R.
Director of Data Science (2015 - 2016), MarkaVIP
Technologies: Python, R, Java, C++, Kinesis, Redshift, Spark, MySQL, Oracle
- Built and deployed a foundational analytical backbone for the company in AWS, around Kinesis, Redshift, and Spark. The design balances key goals of scalability, accessibility to analysts, low admin overhead, and support for both batch and streaming analysis.
- Integrated continuous data ingestion from key systems into the analytical backbone through low-latency interfaces such as database replication, wherever practical.
- Migrated some interaction tracking systems to sink directly to the backbone.
- Migrated key analytical systems to the backbone, including the recommender.
- Made various improvements to the recommender, including use of fine-grained recorded impressions as a negative signal, and more flexibility in handling of catalog metadata.
- Built analysis and real-time operation of customer-experience interventions to reduce the impact of returns and cancellations. Optimization is by policy search through retrospective simulation on historical data. Operationalized as an HTTP microservice (Python, Kinesis, Redshift).
- Expanded the above system to improve order profitability by optimizing basket constraints and incentives.
- Conducted retrospective sourcing performance and pricing analysis. Our systems don't maintain the full history of changes to all the relevant data, so this analysis was done by replaying row mutations continuously captured from database replication logs and stored in Redshift (Python/C++).
Software/Data Architect (2011 - 2015), Codento
- Built an image upload/pre-processing pipeline for a media startup, using Node.js and MongoDB on AWS. Included single sign-on with a Ruby on Rails app in the back-end.
- Implemented real-time path updates on a bespoke structured messaging app, using Node.js (integrated with a Python back-end) and Batman.js.
- Built custom, interactive data displays for a bespoke structured messaging application using d3.js.
- Implemented a complex data entry UI for a structured messaging application using Batman.js in CoffeeScript.
- Built a custom C# distributed data analysis pipeline to perform Matlab jobs on AWS.
- Contributed to embedded security appliances in C.
- Contributed to the back-end for structured messaging applications in Python, with Django.
- Designed and implemented custom interactive data analysis and visualization for economic data. Python back-end + d3.js visualization.
- Built a custom nurse schedule and route optimization system for a healthcare startup. Pre-processing and mixed integer model formulation for Gurobi in Python and d3.js visualization of solutions.
- Modernized the system design of a pre-existing real-time transport logistics system for scalability and higher performance. Enterprise Java.
- Designed and implemented a reference application for a high-security network architecture for a banking customer. Scala/Play, Slick, two-factor authentication.
- Contributed to the implementation of a large-scale online storage system. Python + PostgreSQL.
- Built a custom Matlab system to tune a legacy application from data using black-box (derivative-free) optimization.
Chief Software Architect (2009 - 2010), Nokia | Gear
Technologies: Python, Java, Sphinx
- Prototyped a voice- and gesture-based user interface for in-car mobile phone usage at various levels of fidelity ranging from Wizard of Oz to software proof-of-concept (Python, Java, Sphinx).
- Defined software architecture for a family of in-car products, with input to hardware platform selection.
- Planned costs, schedule, and execution of multiple new product development scenarios.
- Planned and moderated usability studies for prototype validation and iteration.
- Conducted rigorous feasibility studies and software architecture reviews at Gear.
Team Lead, Senior R&D Manager (2003 - 2009), Nokia | Maemo
- Recruited and ramped up the Maemo Application Framework team from scratch.
- Defined application framework architecture and development strategy.
- Led the implementation of three major software generations along with updates.
- Impacted Nokia's entry into open-source development.
- Developed a considerable subcontracting and partnering network for Linux development.
- Contributed to initial product concept definition.
Senior Software Engineer (2001 - 2003), Nokia | Research Center
Technologies: C++, C, Python
- Prototyped a small-footprint relational database for small Linux devices in C++.
- Prototyped a personal information manager for handheld devices based on semantic web technology in Python.
- Studied and evaluated architectural options for an application framework aimed at Linux-based handheld devices, adopted by the nascent Maemo project.
GIS/Computer Graphics Freelancer (1998 - 2001), CGEO.net
Technologies: C++, Python, Oracle
- Built a GIS to edit land cadaster for the Portuguese Ministry of Agriculture using C++, Windows, and Oracle technologies.
- Built a custom C++ framework to offer real-time manipulation of topologically integrated geographic vector data.
- Built a geographical decision support system for semi-automated execution (optimization) of land-consolidation projects for specialized consultancy, using C++, in Windows.
- Developed, licensed, and finally sold a ray-tracing rendering module for use with interior design software, written in C++.
- Built a GIS to edit an olive tree cadaster for the Portuguese Ministry of Agriculture, with integrated olive tree recognition from aerial photography, built with C++ in Windows.
- rawhash (Development) https://github.com/pconstr/rawhash
An experimental, binary-friendly alternative to using a hash as a key:value cache, for Node.js.
Keys are binary buffer objects rather than strings. Values are arbitrary objects.
rawhash is built on google-sparsehash and murmurhash3 (included).
- rdb-parser (Development) https://github.com/pconstr/rdb-parser
- Incremental Random Forest (Development) https://github.com/pconstr/irf
An implementation in C++ (with Node.js and Python bindings) of a variant of Leo Breiman's Random Forests.
The forest is maintained incrementally as samples are added or removed - rather than fully rebuilt from scratch every time - to save resources.
It is not a streaming implementation, as all the samples are stored and will be re-seen when required to recursively rebuild invalidated subtrees. The effort to update each individual tree can vary substantially but the overall effort to update the forest is averaged across the trees and tends not to vary significantly.
- catsagram (Development)
Rolling Instagram photos of cats, built to experiment with custom data/DOM bindings (data-graft.js), responsive layout (try resizing the window), and Socket.IO.
- data-graft.js (Development) https://github.com/pconstr/data-graft.js
An animation-friendly, differential DOM template engine, self-contained and framework-agnostic. Built to experiment with dynamic data/DOM binding, with a particular focus on flexibility for animating data-change transitions.
Libraries/APIs: Node.js, Spark Streaming, Eigen, Scikit-learn, SciPy, Twitter API, NumPy, D3.js, jQuery, Matplotlib, Pandas, Theano, Facebook API
Platforms: Linux, AWS Kinesis, Amazon Web Services (AWS), Google App Engine, Android
Storage: Redis, Redshift, LevelDB, MongoDB, RocksDB, Cassandra
Frameworks: Apache Spark, Django, Hadoop, Bottle
Paradigms: Parallel & Distributed Computing, Distributed Programming, Functional Programming
Other: Machine Learning, Scientific Computing, Natural Language Processing (NLP), Tornado
- Master's degree in Computer Science (1991 - 1996), Universidade Nova de Lisboa - Lisbon, Portugal