Verified Expert in Engineering
Daniel likes code, words, and data. Starting as a Python developer, he's moved gradually from web back-end work to more data-driven projects. After spending time working in classic big data and data science, he found a niche in investigative data journalism—learning skills he now likes to use more commercially.
Git, Linux, IntelliJ
The most amazing...
...tool I've built has been used to expose corruption in Azerbaijan and Uzbekistan—to find $300 million of undeclared offshore assets and to enable prosecutions.
Prompt Engineer and GPT Developer
MBC Consultants Inc.
- Used GPT-4 to prepare a book outline based on published blog posts.
- Combined text embeddings with GPT to form a custom text classification pipeline, using GPT to identify and highlight the most classification-relevant aspects of the input text.
- Processed data using pandas, LangChain, NumPy, and BeautifulSoup.
- Built an internal API to extract and geolocate addresses from web pages.
- Worked remotely within a large, established team of 20+ developers, collaborating via Jira, GitHub, and Slack.
- Mentored and onboarded a junior developer via code review, pair programming, and general advice.
- Maintained and enhanced the back end of a weather forecasting mobile app.
- Managed geospatial data pipelines for processing and serving satellite imagery.
- Added monitoring and logging functionality to an existing application.
- Built a tool to plan antenna locations for telecoms.
- Combined aerial imaging with government and open data to generate special-purpose maps.
Machine Learning Developer
Travel Industry Client
- Built a natural language processing (NLP) system to match free-form text queries to appropriate product offers.
- Created a search tool using Elasticsearch, integrated with NLP tools.
- Developed an API to enable integration with other systems.
- Built a database of corporate filings from the energy and mining industries. Full-stack responsibility included web front and back end, data engineering and ETL, DB administration, and DevOps.
- Supported financial modeling through data provision.
- Created data visualizations combining financial, geographical, and qualitative data.
- Implemented data analysis using Linux shell tools.
Developer | Data Engineer
Freelance Work (Independent Contract Work)
- Used natural language processing to extract treatment histories from medical correspondence.
- Implemented automatic clustering of Russian-language news articles for an academic research project.
- Led the development of an online film distribution platform and scaled it to handle 500+ requests per second.
- Administered up to 40 servers for web and data-analysis workflows, including Docker deployments.
- Rewrote Python code in PHP and maintained PHP code for web services and data scraping.
- Handled DevOps and system administration of Linux servers.
- Developed a social media aggregation website across the full stack.
- Performed image processing for a financial-industry client.
- Wrote content including technical documentation, website copy, articles on cultural issues, and French-English translations.
Organized Crime and Corruption Reporting Project
- Helped a world-class team of investigative journalists to use technology in their work, such as data analysis, data journalism, security, and training.
- Researched several stories with substantial international impact.
- Acted as the project manager and lead developer for a research service for investigative journalists.
- Built a Django website rapidly for an extensive leaked database.
- Performed data engineering, using a MapReduce system to collect and process terabytes of data.
- Scaled a data ingestion pipeline (MongoDB, Nginx) to handle write loads of 1,000+ requests per second.
- Used statistics and machine learning to generate insight from big data and to forecast customer behavior.
- Took responsibility for the reliability of a system with over 1 million users.
- Worked on a browser extension.
- Worked in an Agile/Scrum team, using test-driven development and code review.
URL Search with Regular Expressions
This application is sophisticated, performant, and unique, with relatively little code: around 300 lines. Python is my core language, but I can work ad hoc in other languages (Go, in this case) and use C/C++ libraries for performance-critical elements. The project shows my familiarity with classic data structures and algorithms and with recent improvements to them.
The architecture and installation are documented in README.md.
Open Data Tour of Tanzania
http://tanzania.openoil.net
Flask, Django, Nose, AngularJS, Bootstrap, Angular, GeoDjango
Beautiful Soup, Pandas, NumPy, Node.js, JSONP, REST APIs, SQLAlchemy, Scikit-learn, Natural Language Toolkit (NLTK), Python Asyncio, Stanford NLP, SpaCy, PyTorch, TensorFlow, AMQP, FFmpeg, Google Maps API, Fabric, jQuery, Django ORM, SciPy, GDAL
Git, GIS, Shell, *nux Shells, Docker Swarm, Docker Compose, cURL Command Line Tool, NGINX, Emacs, GitHub Pages, Pytest, Jupyter, Celery, Logging, uWSGI, Google Sheets, Amazon Simple Queue Service (SQS), RabbitMQ, GitHub, Microsoft Excel, Apache, NPM, Jira, IntelliJ
DevOps, Data Science, Agile, Test-driven Development (TDD), REST, Microservices, MapReduce, RESTful Development, Continuous Integration (CI), Unit Testing, Scrum
Amazon Web Services (AWS), Debian, Linux, Ubuntu, Amazon EC2, Google App Engine, Docker, Jupyter Notebook, Apache2, WordPress, CentOS, Mapbox, Drupal, Red Hat Linux, Kubernetes, AWS Lambda
JSON, RDBMS, NoSQL, PostgreSQL, Databases, MariaDB, MySQL, Amazon S3 (AWS S3), Elasticsearch, Memcached, MongoDB, Neo4j, Redis, Google Cloud, PostGIS, Database Management
Back-end Development, Machine Learning, Research & Investigation, APIs, Natural Language Processing (NLP), Shell Commands, Scraping, Data Scraping, Web Scraping, lxml, Ubuntu Server, BitTorrent, Text Mining, Back-end, Web Development, Screen Scraping, Data Engineering, Writing & Editing, Data Analytics, Journalism, GPT, API Integration, Regression Modeling, Linear Regression, Algorithms, Full-stack, Data Mining, Unix Shell Scripting, Code Review, Visualization, Information Visualization, Containers, Container Orchestration, Data Architecture, Architecture, Technical Project Management, Software Project Management, Source Code Review, Data Structures, Statistical Modeling, Statistics, Data Visualization, Matrix Algebra, Big Data, Documentation, RESTful Web Services, RESTful Microservices, RESTful Services, Deep Learning, Cython, Chatbots, Mathematics, Data Wrangling, Algebra, Big Data Architecture, Linear Algebra, Bayesian Statistics, SVMs, Forecasting, Scalability, Single-page Applications (SPA), Search, Tornado, SSL Certificates, SSL Configurations, SSL, HTTP, mod_wsgi, Computational Economics, Gunicorn, QGIS, Encryption, OCR, Load Balancers, Geodatabases, Financial Data, HTTPS, TCP/IP, Amazon Route 53, Open Data, Networks, Financial Modeling, System Administration, Geospatial Data, LiDAR, Text Generation, Pelias, CI/CD Pipelines, Site Reliability Engineering (SRE), ChatGPT, Generative Pre-trained Transformers (GPT), OpenAI GPT-4 API, Language Models, Text Classification, Artificial Intelligence (AI), LangChain
Project Management, Security
Bachelor of Arts Degree in Sanskrit and South Asian Studies
University of Cambridge - Cambridge, UK
Certified Scrum Master