Freelance Tech Lead | Machine Learning Engineer, 2019 - 2020, European-based Investment Company
Technologies: Flask, SQL, Google Cloud Platform (GCP), Python, Neural Networks, Data Science, Machine Learning, Flask-RESTful, PostgreSQL, SQLAlchemy, Marshmallow, Angular, Keras, TensorFlow 2, Pandas
- Worked as the lead developer and ML engineer in a Toptal project to create an MVP of a platform for building stock portfolios with continued rebalancing based on live financial data.
- Researched and productionized portfolio-building models, including those using neural network approximation.
- Created a highly configurable system for training and backtesting models based on point-in-time financial data.
- Oversaw the build of the ingestion/ETL framework capable of finding potential stocks from third-party data suppliers (based on region/industry/market cap) and extracting their historical market and vital data into a centralized database format.
- Ensured rich reporting of the model performance via user-defined metrics and portfolio constituent characteristics, directly consumable via the API.
- Reviewed code for the Angular web app through which an administrator can test models and view tabular reports.
- Designed the database model and RESTful API for model configuration and performance views.
- Set up the Google Cloud Platform-based infrastructure, including a Docker-based CI/CD pipeline for both the UI and back-end service, Kubernetes Engine for parallelizing long-running worker processes, worker queues (Pub/Sub), and scheduled tasks.
Freelance Speech Recognition Engineer, 2019 - 2019, A US-based Tech Startup (via Toptal)
Technologies: Audio Processing, Data Science, Neural Networks, Machine Learning, MongoDB, Flask, Celery, Keras, Python
- Created speech recognition models using custom neural networks with Keras, along with Baidu's DeepSpeech architecture via TensorFlow.
- Handled preprocessing of audio file input types (PCM/WAV), including conversions to and manipulations of MFCCs/spectrograms.
- Wrapped the processes for data uploading, model training, persistence, and inference in RESTful APIs, with capabilities for complex versioning and live updates of training.
- Deployed and configured APIs to leverage GPUs for faster training/inference and a Celery distributed-task queue for training in parallel.
Freelance Data Scientist, 2019 - 2019, Kalepa Corporation (via Toptal)
Technologies: Natural Language Processing (NLP), Machine Learning, Data Science, SpaCy, MongoDB, Scikit-learn, XGBoost, Gensim, Pandas, Python
- Trained and evaluated various models for classifying text data using GloVe word representations, a bag-of-words model, and XGBoost, among other ML/NLP techniques.
- Created and managed Amazon Mechanical Turk tasks for deriving labels to train classifiers, including detection of consistently inaccurate workers.
- Productionized the inference process into a pip-installable Python package.
Chief Information Officer, 2019 - 2019, Lawli Ltd
Technologies: Natural Language Processing (NLP), WordPress, RabbitMQ, Gensim, Python, Node.js, Angular
- Led the technical direction for a UK government-funded legal AI app to help deliver the initial version of the product.
- Implemented initial NLP solutions in Python for providing document services at the heart of the app.
- Created the first version of an Angular 7 web app using Node.js and MongoDB.
- Conducted interviews to find people for the company's first development team to carry on my work.
Freelance NLP Expert, 2018 - 2018, Zugata (via Toptal)
Technologies: SQL, Graph Theory, Natural Language Processing (NLP), Data Science, Machine Learning, NumPy, Pandas, MySQL, Flask, Scikit-learn, SpaCy, NLTK, Python
- Improved and extended a system for key-phrase extraction from texts, using a trained ML classifier and a variety of extraction techniques (including those involving the statistical analysis of word collocations).
- Developed a library that applies a dependency parser (such as the SpaCy or Stanford parsers) to texts and then extracts phrases according to grammatical rules that have been automatically inferred from training texts (using graph theory and NetworkX).
- Implemented frameworks to support research, including caches for extracted phrases and objects for persisting models with metadata so that consumers know how each model was formed.
- Created a Flask API for outputting results, with the ability for users to specify the different methodologies.
- Enhanced an in-house evaluator of extractor performance and integrated traditional evaluators (BLEU/ROUGE); also set up cross-validation tests for classifier performance.
- Verified and advised on statistical/confidence tests for company studies that contributed to a paper that won an award at KDD 2018.
Freelance Data Scientist | Freelance Machine Learning Specialist, 2017 - 2018, A US-based Investment Management Firm (via Toptal)
Technologies: Amazon Web Services (AWS), Neural Networks, Machine Learning, Data Science, NLTK, React, Jupyter Notebook, MongoDB, Pandas, TensorFlow, Scikit-learn, Python
- Researched and tested prediction models with a Python stack using machine learning regressors and natural language processing techniques.
- Derived features from various sources, including forming vector representations of words/documents using a bag-of-words model (with NLTK) and neural networks (with TensorFlow).
- Developed a configurable model backtesting (and backfilling) system that makes extensive use of Pandas.
- Improved the reliability of a Selenium-based framework for scraping websites to source data for model training, including improved logging and reports of nightly performance.
- Created a framework for mining and structuring data from particular sections of PDF files.
- Enhanced and bug-fixed a React/Redux web app used for showing predictions.
Freelance Machine Learning Engineer, 2016 - 2017, Wedifique (via Toptal)
Technologies: Machine Learning, AngularJS, Node.js, MongoDB, NumPy, Pandas, Scikit-learn, Python
- Implemented a collaborative filtering learning algorithm using Python libraries for use in a product recommendation system.
- Enabled the learning algorithm to be influenced by administrator suggestions when deciding feature weightings.
- Updated aspects of the main web app where necessary, on both the Node.js back end and the AngularJS front end.
- Queried (using MongoDB) and derived data for use in user/trend analysis and to populate reports/graphs.
- Set up a multi-server web/worker infrastructure using AMQP on Heroku.
Freelance Full-stack Developer, 2016 - 2016, Swtch (via Toptal)
Technologies: Gulp.js, PostgreSQL, Express.js, Bootstrap, Node.js, AngularJS
- Created a proof of concept for a geolocation web app, primarily using AngularJS.
- Developed a complex user registration and booking system that was persisted to a PostgreSQL database using the pg-promise library in Node.js.
- Styled the app using Bootstrap so that it is responsive and can be used on a variety of devices.
- Implemented a RESTful API using Express.js and Node.js.
Associate Developer, 2013 - 2016, Goldman Sachs
- Collaborated with the global market risk business to design and maintain a platform that produced the bank’s risk metrics.
- Used the firm's Python-like proprietary language to build and test a framework to collate big data sets and to automate the creation of stress test reports for regulators.
- Led the development team that produced an AngularJS web app and RESTful API to allow users to adjust risk measures and audit these changes.
- Conducted interviews of lateral hires and of interns/analysts for the tech division.
- Assisted in integrating the platform into a new distributed computing framework, including occasional examination of the platform's core C++ code.
- Co-created a Java-based versioning system for report configurations that could be controlled via an AngularJS web app.
- Investigated machine learning methods for possible use in the department.
Analytics Developer, 2010 - 2013, RBS Markets & International Banking
Technologies: SQL, Visual Basic for Applications (VBA), Oracle, ASP.NET, Microsoft SQL Server, C#
- Developed and maintained a .NET web-based application for analyzing and visualizing time series data for a range of financial products.
- Performed extensive regression testing and other analysis as part of the regular upgrades to the pricing libraries that the tool depended on.
- Applied Agile methodologies in delivering several C# coding assignments to add new analytics.
- Trained support and development teams around the globe, including on a trip to Singapore.
- Created a VBA tool that logged emails sent to the support inbox and detected whether they had received a response, summarizing the results in a management report.
Inside Licensing Specialist, 2010 - 2010, Microsoft
Technologies: Microsoft Excel, Visual Basic for Applications (VBA)
- Automated the building of a spreadsheet that was compiled from various sources and kept track of deal progress for a Munich-based licensing team.
- Created a tool using VBA to identify discrepancies between two customer pricing sheets (taking into account that entries may be present in both, but in different row locations).
- Presented these tools at team calls and wrote up documentation for them in English and German.
- Added new statistics to the account planning sheet, including data mining of past discounts given to customers.
- Assisted the licensing sales specialists with price and product migration queries.