Verified Expert in Engineering
Data Engineer and Software Developer
Daniel is a data professional with a decade of experience building analytical systems. He is happiest working on the full stack, from engineering storage systems and processing pipelines to data analysis, presentation, and building front-end dashboards. For the past several years, Daniel has been heavily into machine learning (ML) and has applied it to create value from real-world data. He has a PhD in experimental physics, so ML is not the only tool in his toolbox.
Linux, Emacs, Visual Studio Code (VS Code), Jupyter
The most amazing...
...experience I've had was building the DAQ for a dark matter detector, leading an analysis team, and guiding our data from analog cables to final plots.
Senior Data Scientist
- Created an ML modeling platform from scratch in PySpark and Kubernetes to automatically perform a complex series of ETLs, build model training sets, normalize inputs, train models, and serve results. It is easily monitored by a web front end.
- Built several ML models for customer behavior predictions for advertising. These models outperform human-based methods and bring enormous cost savings. Key technologies used: Spark, Kubernetes, XGBoost, and scikit-learn.
- Built an ML system for anomaly detection in transaction data using a sparse training set. It outperforms software from multiple industry-leading vendors. Key technologies used: Apache Airflow, Kubernetes, Spark, TensorFlow, and XGBoost.
- Built an ML-on-demand live web API serving preprocessed results as well as live time-series predictions. Key technologies used: Kubernetes, Flask, and Prophet.
University of Freiburg
- Led an analysis team for an international collaboration to publish the world's most sensitive dark matter search results. The team had more than 60 scientists from 20 institutions around the world.
- Headed a working group to design and install the data acquisition system for a more extensive detector upgrade. I managed a ten-person team and built and installed a €1 million system that is still in operation under an Italian mountain.
- Performed data analyses across all sub-teams as needed to fill gaps in the workforce. I reviewed and challenged the results. Also, I drafted and wrote a seminal paper about the experiment, collaborating with 150 other people working on this project.
- Taught at the university level. Advised graduate students both locally and in other teams. Presented at schools, conferences, and outreach events around the world.
University of Bern
- Designed and installed a data acquisition system for a large dark matter detector. The system used a fast triggerless readout, built structured events with NoSQL database queries, and ran 24/7/365 over the three-year lifetime of the experiment.
- Led a detector physics analysis subgroup of about ten scientists. The main task was to study incoming data and ensure that any and all anomalies discovered could be reasonably explained and wouldn't interfere with the result.
- Worked on the data formats, governance, processing systems, and analytical routines for what would eventually become 2 PB of experimental data.
- Taught, advised, and supervised graduate students. Also, I participated in outreach events and presentations at conferences worldwide.
Machine Learning Marketing Platform
Data Acquisition System for Dark Matter Detector
https://arxiv.org/abs/1906.00819
• Analog electronics, hardware signaling with nuclear instrumentation module (NIM) logic, and field-programmable gate array (FPGA) routines
• C++ control and readout software parallelized over several physical servers and communicating over a unified network interface
• A MongoDB database layer with a distributed deployment over multiple servers
• Python top-level program logic
• A bare-metal server cluster, local network, dozens of electronics components, and thousands of cables
I built this system with a small group and was the project's leader. While I didn't implement every part myself, I did work on every part and was the main coordinator of the effort. Also, I was the official leader of the data acquisition team for a subsequent upgrade to this system.
Machine Learning Anti-money Laundering Platform
Models are trained periodically and the system provides full traceability as well as model explainability reports, fulfilling compliance requirements.
This system provides seven-figure cost savings with no loss in sensitivity.
PySpark, Node.js, TensorFlow, XGBoost
Jupyter, Emacs, GitLab CI/CD, Helm, Apache Airflow
Linux, Kubernetes, Visual Studio Code (VS Code), Amazon Web Services (AWS)
MongoDB, Amazon S3 (AWS S3)
Data Analysis, Machine Learning, Physics, Mathematics, Cabling, University Teaching, Radiation Detection, Data Analytics, Software Development, Big Data, Electronics, Monte Carlo Simulations, Linux Server Administration, Machine Learning Automation, Neural Networks, APIs, Data Processing Automation, Statistics, Team Leadership, Remote Team Leadership, Analog-to-Digital Converters (ADC), CI/CD Pipelines
Django, AngularJS, Spark, Flask
PhD in Physics
Ruhr University Bochum - Bochum, Germany
Master's Degree in Physics
State University of New York at Albany - Albany, NY, United States