Daniel O'Huiginn
Verified Expert in Engineering
Software Developer
Berlin, Germany
Toptal member since March 20, 2017
Daniel likes code, words, and data. Starting as a Python developer, he's moved gradually from web back-end work to more data-driven projects. After spending time working in classic big data and data science, he found a niche in investigative data journalism—learning skills he now likes to use more commercially.
Preferred Environment
Git, Linux, IntelliJ IDEA
The most amazing...
...tool I've built has been used to expose corruption in Azerbaijan and Uzbekistan—to find $300 million of undeclared offshore assets and to enable prosecutions.
Work Experience
Prompt Engineer and GPT Developer
MBC Consultants Inc.
- Used GPT-4 to prepare a book outline based on published blog posts.
- Combined text embeddings with GPT to form a custom text classification pipeline, using GPT to identify and highlight the most classification-relevant aspects of the input text.
- Processed data using pandas, LangChain, NumPy, and BeautifulSoup.
Senior Engineer
Grata
- Built an internal API to extract and geolocate addresses from web pages.
- Worked remotely within a large, established team of 20+ developers, collaborating via Jira, GitHub, and Slack.
- Mentored and onboarded a junior developer via code review, pair programming, and general advice.
GIS Engineer
GeoThinkTank LLC
- Maintained and enhanced the back end of a weather forecasting mobile app.
- Managed geospatial data pipelines for processing and serving satellite imagery.
- Added monitoring and logging functionality to an existing application.
Geospatial Developer
Spatial Datalyst
- Built a tool to plan antenna locations for telecoms.
- Combined aerial imaging with government and open data to generate special-purpose maps.
- Enabled a web application backed by terabytes of source data by optimizing the entire data pipeline: Linux server admin, PostGIS database, Python data-processing, web back end, JavaScript front end, and data visualization.
Machine Learning Developer
Travel Industry Client
- Built a natural language processing (NLP) system to match free-form text queries to appropriate product offers.
- Created a search tool using Elasticsearch, integrated with NLP tools.
- Developed an API to enable integration with other systems.
Lead Developer
OpenOil
- Built a database of corporate filings from the energy and mining industries. Full-stack responsibility included web front and back end, data engineering and ETL, DB administration, and DevOps.
- Supported financial modeling through data provision.
- Created data visualizations combining financial, geographical, and qualitative data.
- Implemented data analysis using Linux shell tools.
Developer | Data Engineer
Freelance Work (Independent Contract Work)
- Used natural language processing to extract treatment histories from medical correspondence.
- Implemented automatic clustering of Russian-language news articles for an academic research project.
- Led the development of an online film distribution platform and scaled it to handle 500+ requests per second.
- Administered up to 40 servers for web and data-analysis workflows, including Docker.
- Rewrote Python code in PHP and maintained PHP code for web services and data scraping.
- Handled DevOps and system administration of Linux servers.
- Developed a social media aggregation website across the full stack.
- Performed image processing for a financial-industry client.
- Completed smaller web development projects using Django, WordPress, JavaScript, jQuery, Drupal, Pylons, and TurboGears.
- Wrote content including technical documentation, website copy, articles on cultural issues, and French-English translations.
Developer
Organized Crime and Corruption Reporting Project
- Helped a world-class team of investigative journalists to use technology in their work, such as data analysis, data journalism, security, and training.
- Researched several stories with substantial international impact.
- Acted as the project manager and lead developer for a research service for investigative journalists.
- Rapidly built a Django website for an extensive leaked database.
Senior Developer
Zugo Services
- Performed data engineering using a MapReduce system to collect and process terabytes of data.
- Scaled a data ingestion pipeline (MongoDB, Nginx) to handle write loads of 1,000+ requests per second.
- Used statistics and machine learning to generate insight from big data and to forecast customer behavior.
- Took responsibility for the reliability of a system with over 1 million users.
- Worked on a browser extension.
- Worked in an Agile/Scrum team using test-driven development and code review.
Experience
URL Search with Regular Expressions
This application is sophisticated, performant, and unique, with relatively little code—around 300 lines. Python is my core language, but I can work ad hoc in other languages, such as Go here, and use C/C++ libraries for performance-critical elements. This shows my familiarity with classic data structures/algorithms and recent improvements in them.
The architecture and installation are documented in README.md.
Investigative Dashboard
https://investigativedashboard.org/
Open Data Tour of Tanzania
http://tanzania.openoil.net
Education
Bachelor of Arts Degree in Sanskrit and South Asian Studies
University of Cambridge - Cambridge, UK
Certifications
Certified Scrum Master
Coursera
Skills
Libraries/APIs
Beautiful Soup, Pandas, NumPy, Node.js, JSONP, REST APIs, SQLAlchemy, Scikit-learn, Natural Language Toolkit (NLTK), Python Asyncio, Stanford NLP, SpaCy, PyTorch, TensorFlow, AMQP, FFmpeg, Google Maps API, Fabric, jQuery, Django ORM, SciPy, GDAL
Tools
Git, GIS, Shell Development, *nux Shells, Docker Swarm, Docker Compose, cURL Command Line Tool, NGINX, Emacs, GitHub Pages, Pytest, Jupyter, Celery, Logging, uWSGI, Google Sheets, Amazon Simple Queue Service (SQS), RabbitMQ, GitHub, Microsoft Excel, Apache, NPM, Jira, IntelliJ IDEA, ChatGPT
Languages
JavaScript, Python 3, Python 2, Python, SQL, Curl Language, ECMAScript (ES6), Bash Script, HTML, Bourne Shell, Bash, HTML5, AWK, Sed, CSS, PHP 7, PHP 5, CSS3, Sass, PHP, Ruby, Java, Go, R, C, C++
Frameworks
Flask, Django, Nose, AngularJS, Bootstrap, Angular, GeoDjango
Paradigms
DevOps, Agile, Test-driven Development (TDD), REST, Microservices, MapReduce, RESTful Development, Continuous Integration (CI), Unit Testing, Scrum
Platforms
Amazon Web Services (AWS), Debian, Linux, Ubuntu, Amazon EC2, Google App Engine, Docker, Jupyter Notebook, Apache2, WordPress, CentOS, Mapbox, Drupal, Red Hat Linux, Kubernetes, AWS Lambda
Storage
JSON, RDBMS, NoSQL, PostgreSQL, Databases, MariaDB, MySQL, Amazon S3 (AWS S3), Elasticsearch, Memcached, MongoDB, Neo4j, Redis, Google Cloud, PostGIS, Database Management
Industry Expertise
Project Management
Other
Back-end Development, Machine Learning, Research & Investigation, APIs, Natural Language Processing (NLP), Shell Commands, Scraping, Data Scraping, Web Scraping, lxml, Ubuntu Server, BitTorrent, Text Mining, Back-end, Web Development, Screen Scraping, Data Science, Data Engineering, Writing & Editing, Data Analytics, Journalism, API Integration, Regression Modeling, Linear Regression, Algorithms, Full-stack, Data Mining, Unix Shell Scripting, Code Review, Visualization, Information Visualization, Containers, Container Orchestration, Data Architecture, Architecture, Technical Project Management, Software Project Management, Source Code Review, Data Structures, Statistical Modeling, Statistics, Data Visualization, Matrix Algebra, Big Data, Documentation, RESTful Web Services, RESTful Microservices, RESTful Services, Deep Learning, Cython, Chatbots, Mathematics, Data Wrangling, Algebra, Big Data Architecture, Linear Algebra, Bayesian Statistics, SVMs, Forecasting, Scalability, Single-page Applications (SPA), Search, Tornado, SSL Certificates, SSL Configurations, SSL, HTTP, mod_wsgi, Computational Economics, Gunicorn, QGIS, Encryption, Security, OCR, Load Balancers, Geodatabases, Financial Data, HTTPS, TCP/IP, Amazon Route 53, Open Data, Networks, Financial Modeling, System Administration, Geospatial Data, LiDAR, Text Generation, Pelias, CI/CD Pipelines, Site Reliability Engineering (SRE), Generative Pre-trained Transformers (GPT), OpenAI GPT-4 API, Language Models, Text Classification, Artificial Intelligence (AI), LangChain