Daniel O'Huiginn, Data Engineering Developer in Berlin, Germany

Member since February 27, 2017
Daniel likes code, words, and data. Starting as a Python developer, he's moved gradually from web back-end work to more data-driven projects. After spending time working in classic big data and data science, he found a niche in investigative data journalism—learning skills he'd now like to use more commercially.

Location

Berlin, Germany

Availability

Part-time

Preferred Environment

Git, Emacs, Debian/Ubuntu

The most amazing...

...tool I built has been used to expose corruption in Azerbaijan and Uzbekistan, to find $300 million of undeclared offshore assets, and to enable prosecutions.

Employment

  • Lead Developer

    2015 - 2017
    OpenOil
    • Built a database of corporate filings from the energy and mining industries, with full-stack responsibility: web front end and back end, data engineering and ETL, database administration, and DevOps.
    • Supported the financial modeling through data provision.
    • Created data visualizations.
    • Implemented data analysis using Linux shell tools.
    Technologies: Elasticsearch, Docker, Python, PostgreSQL, Celery, OCR, AngularJS, JavaScript, Flask, AWS S3/EC2/RDS, Excel, Statistics, jQuery, GitHub Pages, Bash, AWK, Sed
  • Developer | Data Engineer

    2006 - 2017
    Freelance Work (Independent Contract Work)
    • Used natural language processing to extract treatment histories from medical correspondence.
    • Implemented automatic clustering of Russian-language news articles for an academic research project.
    • Led the development of an online film distribution platform and scaled it to handle 500+ requests per second.
    • Administered up to 40 servers for web and data-analysis workflows, including Docker.
    • Rewrote Python code in PHP, and maintained PHP code for web services and data scraping.
    • Handled DevOps and system administration of Linux servers.
    • Carried out full-stack development of a social media aggregation website.
    • Performed image processing for a financial-industry client.
    • Completed smaller web development projects using Django, WordPress, JavaScript, jQuery, Drupal, Pylons, and TurboGears.
    • Wrote content including technical documentation, website copy, articles on cultural issues, and French-English translations.
    Technologies: Python, Linux, HTML/CSS, JavaScript, PostgreSQL, NoSQL, Flask, Django, PHP, Drupal, Apache, Nginx, Memcached, MySQL, NLTK, NumPy, Pandas
  • Developer

    2013 - 2014
    Organized Crime and Corruption Reporting Project
    • Helped a world-class team of investigative journalists use technology in their work (data analysis, data journalism, security, training).
    • Researched several stories with substantial international impact.
    • Acted as the project manager and lead developer for a research service for investigative journalists.
    • Rapidly built a Django website for a large leaked database.
    Technologies: Project Management, Investigative Journalism, Python, HTML/CSS, Django, Elasticsearch, PostgreSQL, Google App Engine
  • Senior Developer

    2011 - 2012
    Zugo Services
    • Performed data engineering, using a MapReduce system to collect and process terabytes of data.
    • Scaled a data ingestion pipeline (MongoDB, Nginx) to handle write loads of 1,000+ requests per second.
    • Used statistics and machine learning to generate insight from big data and to forecast customer behavior.
    • Was responsible for the reliability of a system with over 1 million users.
    • Worked on a browser extension.
    • Worked in an Agile/Scrum team, using test-driven development and code review.
    Technologies: Machine Learning, Big Data, MapReduce, MongoDB, R, Scikit-learn, NumPy, JavaScript, Erlang

Experience

  • Investigative Dashboard (Development)
    https://investigativedashboard.org/

    Investigative Dashboard is a tool that helps investigative journalists use public records to research their stories. It combines a document database and search system with a research help desk. I led the development from 2013 to 2014.

  • Open Data Tour of Tanzania (Development)
    http://tanzania.openoil.net

    A showcase of data-driven work on the energy industry: geodata, financial modelling, and mapping of corporate structures.

Skills

  • Languages

    JavaScript, Python 3, Python 2, Python, SQL, Curl Language, ECMAScript (ES6), Bash Script, HTML/CSS, HTML, Bourne Shell, Bash, HTML5, CSS, PHP 7, PHP 5, CSS3, Sass, PHP, Ruby, Java, Go, R
  • Frameworks

    Flask, Django, Nose, Pylons, TurboGears, AngularJS, Bootstrap
  • Libraries/APIs

    Beautiful Soup, Pandas, NumPy, Node.js, JSONP, REST APIs, SQLAlchemy, Scikit-learn, NLTK, Python Asyncio, Stanford NLP, SpaCy, PyTorch, TensorFlow, Keras, AMQP, FFmpeg, Google Maps API, Fabric, jQuery, Django ORM, SciPy
  • Tools

    Git, Shell, *nux Shells, Docker Swarm, Docker Compose, cURL Command Line Tool, Nginx, Pytest, Jupyter, Celery, Logging, uWSGI, GIS, Google Sheets, Amazon SQS, SPSS, RabbitMQ, GitHub, Microsoft Excel, Apache, NPM, Jira
  • Paradigms

    DevOps, Data Science, Agile, Test-driven Development (TDD), REST, Microservices, RESTful Development, Continuous Integration (CI)
  • Platforms

    Amazon Web Services (AWS), Debian, Linux, Ubuntu, AWS EC2, Google App Engine, Docker, Jupyter Notebook, Apache2, WordPress, CentOS, MapBox, Drupal, Red Hat Linux
  • Storage

    JSON, RDBMS, NoSQL, PostgreSQL, MariaDB, MySQL, AWS S3, Elasticsearch, Memcached, MongoDB, AWS RDS, Neo4j, Redis, Google Cloud
  • Other

    Machine Learning, Research & Investigation, APIs, Natural Language Processing (NLP), Shell Commands, Scraping, Data Scraping, Web Scraping, lxml, Ubuntu Server, BitTorrent, Text Mining, Back-end, Web Back-end, Screen Scraping, Data Engineering, Writing & Editing, Data Analytics, Journalism, Regression Models, Linear Regression, Algorithms, Full-stack, Data Mining, Bash Scripting, Unix Shell Scripting, Code Reviews, Visualization, Information Visualization, Containers, Container Orchestration, Data Architecture, Architecture, Technical Project Management, Software Project Management, Source Code Reviews, Data Structures, Statistical Modeling, Statistics, Data Visualization, Matrix Algebra, Big Data, Documentation, RESTful Web Services, RESTful Microservices, RESTful Services, Deep Learning, Cython, Chatbots, Mathematics, Data Wrangling, Math, Algebra, Big Data Architecture, Linear Algebra, Bayesian Statistics, SVMs, Forecasting, Scalability, Single-page Applications (SPA), Search, Tornado, SSL Certificates, SSL Configurations, SSL, HTTP, mod_wsgi, Computational Economics, Gunicorn, QGIS, Encryption, OCR, Load Balancers, Geodatabases, Financial Data, HTTPS, TCP/IP, RESTful APIs, AWS Route 53, Open Data, Networks, System Administration
  • Industry Expertise

    Project Management, Security, Financial Modeling

Education

  • Bachelor of Arts degree in Sanskrit and South Asian Studies
    2001 - 2005
    University of Cambridge - Cambridge, UK
