Daniel O'Huiginn, Developer in Berlin, Germany

Daniel O'Huiginn

Verified Expert in Engineering

Software Developer

Location
Berlin, Germany
Toptal Member Since
March 20, 2017

Daniel likes code, words, and data. Starting as a Python developer, he's moved gradually from web back-end work to more data-driven projects. After spending time working in classic big data and data science, he found a niche in investigative data journalism—learning skills he now likes to use more commercially.

Portfolio

MBC Consultants Inc.
ChatGPT, Generative Pre-trained Transformers (GPT), OpenAI GPT-4 API...
Grata
Python, Containers, Docker, Elasticsearch, Pelias, Pytest, Kubernetes...
GeoThinkTank LLC
Django, Python, Amazon Web Services (AWS), PostgreSQL, Unit Testing, PostGIS...

Experience

Availability

Part-time

Preferred Environment

Git, Linux, IntelliJ IDEA

The most amazing...

...tool I've built has been used to expose corruption in Azerbaijan and Uzbekistan—to find $300 million of undeclared offshore assets and to enable prosecutions.

Work Experience

Prompt Engineer and GPT Developer

2023 - PRESENT
MBC Consultants Inc.
  • Used GPT-4 to prepare a book outline based on published blog posts.
  • Combined text embeddings with GPT to form a custom text classification pipeline, using GPT to identify and highlight the most classification-relevant aspects of the input text.
  • Processed data using pandas, LangChain, NumPy, and BeautifulSoup.
Technologies: ChatGPT, Generative Pre-trained Transformers (GPT), OpenAI GPT-4 API, Language Models, Text Classification, Text Mining, Artificial Intelligence (AI), Machine Learning, LangChain, Pandas, Natural Language Processing (NLP)

Senior Engineer

2021 - 2023
Grata
  • Built an internal API to extract and geolocate addresses from web pages.
  • Worked remotely within a large, established team of 20+ developers, collaborating via Jira, GitHub, and Slack.
  • Mentored and onboarded a junior developer via code review, pair programming, and general advice.
Technologies: Python, Containers, Docker, Elasticsearch, Pelias, Pytest, Kubernetes, CI/CD Pipelines, GIS, SQL, Git, Microservices, Jira, Django ORM, APIs, Data Engineering, JSON, Scrum, API Integration, Database Management, AWS Lambda

GIS Engineer

2022 - 2022
GeoThinkTank LLC
  • Maintained and enhanced the back end of a weather forecasting mobile app.
  • Managed geospatial data pipelines for processing and serving satellite imagery.
  • Added monitoring and logging functionality to an existing application.
Technologies: Django, Python, Amazon Web Services (AWS), PostgreSQL, Unit Testing, PostGIS, GeoDjango, GDAL, Celery, RabbitMQ, Geospatial Data, APIs, Data Engineering, REST, Site Reliability Engineering (SRE), Database Management

Geospatial Developer

2018 - 2019
Spatial Datalyst
  • Built a tool to plan antenna locations for telecoms.
  • Combined aerial imaging with government and open data to generate special-purpose maps.
  • Enabled a web application backed by terabytes of source data by optimizing the entire data pipeline: Linux server admin, PostGIS database, Python data-processing, web back end, JavaScript front end, and data visualization.
Technologies: PostGIS, Django, PostgreSQL, Geospatial Data, GIS, Angular, LiDAR, Open Data, Databases, Pytest, Git, Data Engineering, Geodatabases, Database Management

Machine Learning Developer

2017 - 2017
Travel Industry Client
  • Built a natural language processing (NLP) system to match free-form text queries to appropriate product offers.
  • Created a search tool using Elasticsearch, integrated with NLP tools.
  • Developed an API to enable integration with other systems.
Technologies: Natural Language Processing (NLP), Machine Learning, Text Generation, SpaCy, Elasticsearch, Scikit-learn, NumPy, APIs, Pandas, Back-end, *nux Shells, Git, Data Science, API Integration

Lead Developer

2015 - 2017
OpenOil
  • Built a database of corporate filings from the energy and mining industries. Full-stack responsibility included web front and back end, data engineering and ETL, DB administration, and DevOps.
  • Supported financial modeling through data provision.
  • Created data visualizations combining financial, geographical, and qualitative data.
  • Implemented data analysis using Linux shell tools.
Technologies: Amazon Web Services (AWS), Sed, AWK, Bash, GitHub Pages, jQuery, Statistics, Microsoft Excel, Amazon EC2, Amazon S3 (AWS S3), Flask, JavaScript, AngularJS, OCR, Celery, PostgreSQL, Python, Docker, Elasticsearch, Databases, SQL, Git, Full-stack, Project Management, Text Mining, Technical Project Management, Software Project Management, Database Management

Developer | Data Engineer

2006 - 2017
Freelance Work (Independent Contract Work)
  • Used natural language processing to extract treatment histories from medical correspondence.
  • Implemented automatic clustering of Russian-language news articles for an academic research project.
  • Led the development of an online film distribution platform and scaled it to handle 500+ requests per second.
  • Administered up to 40 servers for web and data-analysis workflows, including Docker.
  • Rewrote Python code as PHP, and maintained PHP code for web services and data scraping.
  • Handled DevOps and system administration of Linux servers.
  • Full-stack development of a social media aggregation website.
  • Image processing for a financial-industry client.
  • Smaller web development projects using Django, WordPress, JavaScript, jQuery, Drupal, Pylons, and TurboGears.
  • Wrote content including technical documentation, website copy, articles on cultural issues, and French-English translations.
Technologies: CSS, Pandas, NumPy, Natural Language Toolkit (NLTK), MySQL, Memcached, NGINX, Apache, Drupal, PHP, Django, Flask, NoSQL, PostgreSQL, JavaScript, HTML, Linux, Python, Databases, Shell Commands

Developer

2013 - 2014
Organized Crime and Corruption Reporting Project
  • Helped a world-class team of investigative journalists to use technology in their work, such as data analysis, data journalism, security, and training.
  • Researched several stories with substantial international impact.
  • Acted as the project manager and lead developer for a research service for investigative journalists.
  • Rapidly built a Django website for an extensive leaked database.
Technologies: CSS, Google App Engine, PostgreSQL, Elasticsearch, Django, HTML, Python, Project Management, *nux Shells, Google Cloud, Shell Commands, Writing & Editing, Journalism, Text Mining, Technical Project Management, Software Project Management, Data Visualization, Database Management

Senior Developer

2011 - 2012
Zugo Services
  • Data engineering, using a MapReduce system to collect and process terabytes of data.
  • Scaled a data ingestion pipeline (MongoDB, Nginx) to handle write loads of 1,000+ requests per second.
  • Used statistics and machine learning to generate insight from big data and to forecast customer behavior.
  • Responsible for reliability of a system with over 1 million users.
  • Worked on a browser extension.
  • Worked in an Agile/Scrum team, using test-driven development and code review.
Technologies: JavaScript, NumPy, Scikit-learn, R, MongoDB, MapReduce, Big Data, Machine Learning, Shell Commands, Data Engineering

URL Search with Regular Expressions

An app that allows regular expression searches across URLs. I built it to demonstrate what is possible on a single server by making regular expression search work against an index.

This application is sophisticated, performant, and unique, with relatively little code—around 300 lines. Python is my core language, but I can work ad hoc in other languages, such as Go here, and use C/C++ libraries for performance-critical elements. This shows my familiarity with classic data structures/algorithms and recent improvements in them.

The architecture and installation are documented in README.md.
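
To illustrate the general idea of indexed regular expression search, here is a minimal sketch in Python. It is not the app's actual code (which, per the description above, uses Go and C/C++ libraries), and the trigram index is an assumption about one common way to do this. The function names, the sample URLs, and the `required_literal` argument are all hypothetical: in this simplified version the caller supplies a plain string that every match must contain, rather than deriving it from the pattern.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(urls):
    """Map each trigram to the positions of the URLs that contain it."""
    index = defaultdict(set)
    for pos, url in enumerate(urls):
        for gram in trigrams(url):
            index[gram].add(pos)
    return index


def regex_search(urls, index, pattern, required_literal):
    """Find URLs matching pattern, using the index to narrow candidates.

    required_literal is a plain string that every match must contain.
    It is supplied by the caller in this sketch (an assumption), not
    derived automatically from the pattern.
    """
    grams = trigrams(required_literal)
    if grams:
        # Only URLs containing every trigram of the literal can match.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        # Literal shorter than 3 characters: fall back to a full scan.
        candidates = set(range(len(urls)))
    compiled = re.compile(pattern)
    # Verify each candidate with the real regular expression.
    return [urls[pos] for pos in sorted(candidates) if compiled.search(urls[pos])]


# Hypothetical sample data.
urls = [
    "https://example.com/blog/2017/03/hello",
    "https://example.org/docs/index.html",
    "https://example.com/blog/2016/11/world",
]
index = build_index(urls)
matches = regex_search(urls, index, r"/blog/\d{4}/\d{2}/", "/blog/")
```

The index only prunes the candidate set; the regular expression itself still runs on every surviving URL, so the result is the same as a full scan would give as long as the supplied literal really is required by the pattern.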

Investigative Dashboard

https://investigativedashboard.org/
Investigative Dashboard is a tool that helps investigative journalists use public records to research their stories. It combines a document database and search system with a research help desk. I led the development from 2013 to 2014.

Open Data Tour of Tanzania

http://tanzania.openoil.net
A showcase of data-driven work on the energy industry: geodata, financial modelling, and mapping of corporate structures.

Education

2001 - 2005

Bachelor of Arts Degree in Sanskrit and South Asian Studies

University of Cambridge - Cambridge, UK

Certifications

MARCH 2023 - PRESENT

Certified Scrum Master

Coursera

Libraries/APIs

Beautiful Soup, Pandas, NumPy, Node.js, JSONP, REST APIs, SQLAlchemy, Scikit-learn, Natural Language Toolkit (NLTK), Python Asyncio, Stanford NLP, SpaCy, PyTorch, TensorFlow, AMQP, FFmpeg, Google Maps API, Fabric, jQuery, Django ORM, SciPy, GDAL

Tools

Git, GIS, Shell, *nux Shells, Docker Swarm, Docker Compose, cURL Command Line Tool, NGINX, Emacs, GitHub Pages, Pytest, Jupyter, Celery, Logging, uWSGI, Google Sheets, Amazon Simple Queue Service (SQS), RabbitMQ, GitHub, Microsoft Excel, Apache, NPM, Jira, IntelliJ IDEA, ChatGPT

Frameworks

Flask, Django, Nose, AngularJS, Bootstrap, Angular, GeoDjango

Languages

JavaScript, Python 3, Python 2, Python, SQL, Curl Language, ECMAScript (ES6), Bash Script, HTML, Bourne Shell, Bash, HTML5, AWK, Sed, CSS, PHP 7, PHP 5, CSS3, Sass, PHP, Ruby, Java, Go, R, C, C++

Paradigms

DevOps, Data Science, Agile, Test-driven Development (TDD), REST, Microservices, MapReduce, RESTful Development, Continuous Integration (CI), Unit Testing, Scrum

Platforms

Amazon Web Services (AWS), Debian, Linux, Ubuntu, Amazon EC2, Google App Engine, Docker, Jupyter Notebook, Apache2, WordPress, CentOS, Mapbox, Drupal, Red Hat Linux, Kubernetes, AWS Lambda

Storage

JSON, RDBMS, NoSQL, PostgreSQL, Databases, MariaDB, MySQL, Amazon S3 (AWS S3), Elasticsearch, Memcached, MongoDB, Neo4j, Redis, Google Cloud, PostGIS, Database Management

Industry Expertise

Project Management

Other

Back-end Development, Machine Learning, Research & Investigation, APIs, Natural Language Processing (NLP), Shell Commands, Scraping, Data Scraping, Web Scraping, lxml, Ubuntu Server, BitTorrent, Text Mining, Back-end, Web Development, Screen Scraping, Data Engineering, Writing & Editing, Data Analytics, Journalism, API Integration, Regression Modeling, Linear Regression, Algorithms, Full-stack, Data Mining, Unix Shell Scripting, Code Review, Visualization, Information Visualization, Containers, Container Orchestration, Data Architecture, Architecture, Technical Project Management, Software Project Management, Source Code Review, Data Structures, Statistical Modeling, Statistics, Data Visualization, Matrix Algebra, Big Data, Documentation, RESTful Web Services, RESTful Microservices, RESTful Services, Deep Learning, Cython, Chatbots, Mathematics, Data Wrangling, Algebra, Big Data Architecture, Linear Algebra, Bayesian Statistics, SVMs, Forecasting, Scalability, Single-page Applications (SPA), Search, Tornado, SSL Certificates, SSL Configurations, SSL, HTTP, mod_wsgi, Computational Economics, Gunicorn, QGIS, Encryption, Security, OCR, Load Balancers, Geodatabases, Financial Data, HTTPS, TCP/IP, Amazon Route 53, Open Data, Networks, Financial Modeling, System Administration, Geospatial Data, LiDAR, Text Generation, Pelias, CI/CD Pipelines, Site Reliability Engineering (SRE), Generative Pre-trained Transformers (GPT), OpenAI GPT-4 API, Language Models, Text Classification, Artificial Intelligence (AI), LangChain
