- Freelance NLP Expert, 2018 - 2018, Zugata (via Toptal)
Technologies: Python, NLTK, SpaCy, Scikit-learn, Flask, MySQL, Pandas, NumPy
- Improved and further developed a system for key-phrase extraction from texts, using a trained ML classifier and a variety of extraction techniques (including statistical analysis of word collocations).
- Created a library that applies a dependency parser (such as the SpaCy or Stanford parsers) to texts and then extracts phrases according to common grammatical rules automatically inferred from training texts (with the help of graph theory and NetworkX).
- Implemented frameworks to support research, including caches for extracted phrases and objects that persist models together with metadata, so that consumers know how each model was formed.
- Created a Flask API that outputs results and lets users specify which methodologies to apply.
- Enhanced an in-house evaluator of extractor performance and integrated traditional evaluation metrics (BLEU/ROUGE); also set up cross-validation tests for classifier performance.
- Verified and advised on statistical/confidence tests for the company's studies, which contributed to a paper that won an award at KDD 2018.
- Freelance Data Scientist | Freelance Machine Learning Specialist, 2017 - 2018, A US-based Investment Management Firm (via Toptal)
Technologies: Python, Scikit-learn, TensorFlow, Pandas, MongoDB, Jupyter Notebook, AWS, React, NLTK
- Researched and tested prediction models with a Python stack using machine learning regressors and natural language processing techniques.
- Derived features from various sources, including vector representations of words/documents built with a Bag of Words model (with NLTK) and neural networks (with TensorFlow).
- Developed a configurable model backtesting (and backfilling) system that makes extensive use of Pandas functionality.
- Improved the reliability of a Selenium-based framework for scraping websites to source data for model training, including improved logging and reports of nightly performance.
- Created a framework for mining and structuring data from particular sections of PDF files.
- Enhanced and bug-fixed a React/Redux web app used for showing predictions.
- Freelance Machine Learning Engineer, 2016 - 2017, Wedifique (via Toptal)
Technologies: Python, Scikit-learn, Pandas, NumPy, MongoDB, Node.js, AngularJS
- Implemented a collaborative filtering learning algorithm using Python libraries for use in a product recommendation system.
- Enabled the learning algorithm to take administrator suggestions into account when deciding feature weightings.
- Updated aspects of the main web app where necessary, on both the Node.js back end and the AngularJS front end.
- Queried (using MongoDB) and derived data for use in user/trend analysis and to populate reports/graphs.
- Set up a multi-server web/worker infrastructure using AMQP on Heroku.
- Freelance Full-stack Developer, 2016 - 2016, Swtch (via Toptal)
Technologies: AngularJS, Node.js, Bootstrap, Express.js, PostgreSQL, Gulp
- Created a proof of concept for a geolocation web app, primarily using AngularJS.
- Developed a complex user registration and booking system, persisted to a PostgreSQL database using the pg-promise library in Node.js.
- Styled an app using Bootstrap so that it is responsive and can be used on a variety of devices.
- Implemented a RESTful API using Express.js and Node.js.
- Associate Developer, 2013 - 2016, Goldman Sachs
- Collaborated with a global market risk business to design and maintain a platform that produced a bank’s risk metrics.
- Used the firm's Python-like proprietary language to build and test a framework to collate big data sets and to automate the creation of stress test reports for regulators.
- Led the development team that produced an AngularJS web app and RESTful API to allow users to adjust risk measures and audit these changes.
- Conducted interviews of lateral hires and of interns/analysts for the tech division.
- Assisted in integrating a platform into a new distributed computing framework, including occasional examination of the platform's core C++ code.
- Co-created a Java-based versioning system for report configurations that could be controlled via an AngularJS web app.
- Investigated machine learning methods for possible use in the department.
- Analytics Developer, 2010 - 2013, RBS Markets & International Banking
Technologies: C#, SQL Server, ASP.NET, Oracle, VBA
- Developed and maintained a .NET web-based application for analyzing and visualizing time series data for a range of financial products.
- Performed extensive regression testing and other analysis as part of regular upgrades to the pricing libraries the tool depended on.
- Applied Agile methodologies in delivering a number of C# coding assignments to add new analytics.
- Trained the support and development teams around the globe, including a trip to Singapore for this purpose.
- Created a VBA tool that logged emails sent to the support inbox and detected whether they had been responded to; the results were summarized in a management report.
- Inside Licensing Specialist, 2010 - 2010, Microsoft
Technologies: VBA, Excel
- Automated the building of a spreadsheet which was compiled from various sources and also kept track of deal progress for a Munich-based licensing team.
- Created a tool using VBA to identify discrepancies between two customer pricing sheets (taking into account that entries may be present in both, but in different row locations).
- Presented these tools at team calls and wrote up documentation for them in English and German.
- Added new statistics to the account planning sheet, including data mining of past discounts given to customers.
- Assisted the licensing sales specialists with price and product migration queries.