Data Scientist
2019 - PRESENT
Freelance
- Created novel machine learning algorithms for transient detection, localization, anomaly detection, and demand forecasting within hydraulic networks.
- Contributed to a patented event-detection solution for hydraulic time series.
- Designed and developed analytical solutions to tackle specific problems faced by water utility companies.
- Created data validation algorithms to validate imported data and to detect and correct corrupted time-series data.
- Built an automatic report generation tool to provide an overview of the quality of the data, pinpoint specific problems in the data, and detect problems with the devices captured by the data.
- Designed a scalable SQL database structure, based on the TimescaleDB extension, to run research experiments at scale.
- Found bugs in the client's infrastructure and recommended simplifying and decoupling it to make it more robust and to streamline collaboration between the development and data science teams.
- Created novel algorithms to detect mechanical misconfiguration of sensor devices and correct the data from the misconfigured devices.
- Conducted an extensive statistical study on the uncertainty and confidence intervals of the data received from the monitoring devices.
- Developed a strategic roadmap and new architecture for advanced analytics.
Technologies: Research, GIS, Time Series Analysis, Anomaly Detection, Forecasting, Convex Optimization, Linear Programming, Statistics, Mathematics, Optimization, Algorithms, Matplotlib, Plotly, Keras, TensorFlow, Scikit-learn, Pandas, NumPy, Julia, Python, Data Science, Amazon Web Services (AWS), Predictive Analytics, Git, GitLab, GitHub, JSON, JSON API, Mathematical Analysis, Mathematical Modeling, Data Analytics, Data Analysis, Rust
Software Engineer
2017 - 2018
TomTom
- Reverse-engineered the specification of complex legacy GIS data for which no specification was known and migrated the data successfully to a new format.
- Converted, processed, and generated the entire world map data used in navigation platforms around the globe.
- Developed probabilistic tools for detecting errors in imperfect and erroneous map data.
Technologies: GIS, Python, Bash, Git, Amazon Web Services (AWS), Java, Agile
Data Scientist
2014 - 2014
Pact Coffee (Intern)
- Developed an algorithm in Go for recommending new coffees, with no user feedback, based on their intrinsic properties.
- Achieved algorithmic performance superior to that of a professional coffee connoisseur.
- Delivered fast execution speeds on ordinary hardware.
Technologies: Git, Recommendation Systems, Algorithms, Machine Learning, Go, Data Science, Data Analytics, Data Analysis
Forward Deployed Engineer
2013 - 2013
Palantir Technologies (Intern)
- Developed a document similarity search plugin for the Palantir Government platform with an integrated security layer for the Elasticsearch server, covering HTTPS and the Palantir internal authentication endpoint and access control list.
- Extended a codebase that comprised over one million lines of code, had limited documentation, and depended on legacy software.
- Updated the previously developed plugin, HTML-Exporter for Maps, thus satisfying existing clients.
Technologies: Java
Forward Deployed Engineer
2012 - 2012
Palantir Technologies (Intern)
- Developed a Java plugin, HTML-Exporter for Maps, exporting tiles, Palantir objects, KML objects, layers, and other map data from the Palantir Government platform to an HTML file.
- Transformed tiles between incompatible geographic information systems.
- Communicated directly with clients, building a track record of successful deployments and uses of the plugin I developed.
Technologies: Algorithms, GIS, Java
Instep Research Intern
2011 - 2011
Infosys Labs
- Researched the optimization of parallel algorithms and analyzed asymmetric workload distribution, resulting in a publication.
- Implemented a home-grown sequence data mining algorithm on NVIDIA (CUDA) graphics cards.
- Developed auxiliary tools, including a statistical sequence generator.
Technologies: Git, Research, Optimization, Statistics, CUDA, C++, C, Data Science