Oliver is a versatile data scientist and software engineer combining over a decade of experience and a postgraduate mathematics degree from Oxford. Career assignments have ranged from building machine learning solutions for startups to leading project teams and handling vast amounts of data at Goldman Sachs. With this background, he is adept at picking up new skills quickly to deliver robust solutions to the most demanding of businesses.
Experience
Python - 7 years
Pandas - 4 years
MongoDB - 4 years
Angular - 4 years
Machine Learning - 4 years
Generative Pre-trained Transformers (GPT) - 3 years
Natural Language Processing (NLP) - 3 years
Git, PyCharm, Jupyter, Windows, Linux
The most amazing application I've worked on allowed traders to visualize historical financial data, perform technical analysis, and determine the relative value of securities.
Freelance Tech Lead | Machine Learning Engineer
European-based Investment Company (via Toptal)
- Led a team of Toptal engineers in a project to create an MVP of a platform for configuring and testing models for building stock portfolios.
- Researched architectures for portfolio-build models, including those using neural network approximation.
- Created a highly configurable system for training and backtesting models based on point-in-time financial data. Furthermore, automated the configurability itself by setting up hyperparameter optimization using Ray Tune.
- Oversaw the build of the ingestion/ETL framework capable of discovering potential stocks from third-party data suppliers (based on region/industry/market cap) and extracting historical market and fundamental data from them into a central DB.
- Ensured rich reporting of model performance via user-defined metrics and information around portfolio constituents. This was directly consumable via an API, including a Dash/Plotly dashboard with interactive graphics.
- Reviewed and maintained code for the Angular web app through which an administrator can test models and view tabular reports.
- Designed the database model used by the ORM (SQLAlchemy), which also underpinned the RESTful API that was robustly marshalled using Marshmallow.
- Set up the Google Cloud Platform-based infrastructure, including a Docker-based CI/CD pipeline for both the UI and back-end service, a Kubernetes engine for parallelizing long-running worker processes, worker queues (Pub/Sub), and scheduled tasks.
A US-based Behavioral Care Startup
- Consulted on directions that a potential MVP could take for a diagnostic system in the mental health space.
- Researched and assessed transferable speech and text models using Hugging Face that could be fine-tuned and used in diagnosis.
- Demonstrated PoCs using IBM Watson and OpenAI's ChatGPT API as a basis of a chatbot-based solution.
Computer Vision Engineer
UK-based eCommerce Company
- Devised and implemented a system for receiving an image via API and finding the closest matching products in a catalog.
- Isolated the encoder of the MobileNet neural network architecture to translate images to compressed vectors which could then be queried using an efficient similarity search index.
- Used MLflow to manage model storage and experiment tracking and to serve the model through an API.
- Derived an autoencoder so that the encoder model could be fine-tuned with any available training data.
- Leveraged TensorFlow datasets to move image data through the pipeline without overwhelming memory.
- Created a Python package for consuming RabbitMQ messages to create and wait on multiple subprocesses, all performed in an asynchronous manner using asyncio.
Freelance Speech Recognition Engineer
US-based Tech Startup (via Toptal)
- Created speech recognition models using custom neural networks with Keras, along with Baidu's DeepSpeech architecture via TensorFlow.
- Handled preprocessing of audio file input types (PCM/WAV), including conversions to and manipulations of MFCCs/spectrograms.
- Wrapped processes for data uploading, model training, persistence, and inference in RESTful APIs, with capabilities for complex versioning and live updates of training.
- Deployed and configured APIs to leverage GPUs for faster training/inference and a Celery distributed-task queue for training in parallel.
Freelance Data Scientist
Kalepa Corporation (via Toptal)
- Trained and evaluated various models for classifying text data using GloVe word representations, bag-of-words model, and XGBoost, among other ML/NLP techniques.
- Created and managed Amazon Mechanical Turk tasks for deriving labels to train classifiers, including detecting consistently inaccurate workers.
- Productionized the inference process into a pip-installable Python package.
Interim Chief Information Officer
- Led the technical direction for a UK government-funded legal AI app to help deliver the initial version of the product.
- Implemented initial NLP solutions in Python for providing document services at the heart of the app.
- Created the first version of an Angular 7 web app using Node.js and MongoDB.
- Conducted interviews to find people for the company's first development team to continue my work.
Freelance NLP Expert
Zugata (via Toptal)
- Improved and developed a system for key-phrase extraction from texts using a trained ML classifier and various extraction techniques (including those involving the statistical analysis of word collocations).
- Developed a library that applies a dependency parser (such as spaCy or the Stanford Parser) to texts and then extracts phrases according to grammatical rules automatically inferred from training texts (using graph theory and NetworkX).
- Implemented frameworks that helped with research, including caches for extracted phrases and objects for persisting models with metadata so that consumers know how each model was formed.
- Created a Flask API that returns results and lets users specify which methodology to apply.
- Enhanced an in-house evaluator of extractor performance and integrated traditional evaluators (BLEU/ROUGE); also set up cross-validation tests for classifier performance.
- Verified and advised on statistical/confidence tests for the company's studies, which contributed to a paper that won an award at KDD 2018.
Freelance Data Scientist | Freelance Machine Learning Specialist
US-based Investment Management Firm (via Toptal)
- Researched and tested prediction models with a Python stack using machine learning regressors and natural language processing techniques.
- Derived features from various sources, including forming vector representations of words/documents using a bag-of-words model (with NLTK) and neural networks (with TensorFlow).
- Developed a configurable model backtesting (and backfilling) system that made extensive use of Pandas.
- Improved the reliability of a Selenium-based framework for scraping websites to source data for model training, including improved logging and reports of nightly performance.
- Created a framework for mining and structuring data from particular sections of PDF files.
- Enhanced and bug-fixed a React/Redux web app used for showing predictions.
Freelance Machine Learning Engineer
Wedifique (via Toptal)
- Implemented a collaborative filtering learning algorithm using Python libraries for use in a product recommendation system.
- Enabled the learning algorithm to be influenced by administrator suggestions when deciding feature weightings.
- Updated aspects of the main web app, where necessary, on both the Node.js back-end and AngularJS front-end.
- Queried (using MongoDB) and derived data for use in user/trend analysis and to populate reports/graphs.
- Set up a multi-server web/worker infrastructure on Heroku using AMQP.
Freelance Full-stack Developer
Swtch (via Toptal)
- Created a proof of concept for a geolocation web app, primarily using AngularJS.
- Developed a complex user registration and booking system that was persisted to a PostgreSQL database using the pg-promise library in Node.js.
- Styled an app using Bootstrap so that it is responsive and can be used on a variety of devices.
- Implemented a RESTful API using Express.js and Node.js.
- Collaborated with a global market risk business to design and maintain a platform that produced a bank’s risk metrics.
- Used the firm's Python-like proprietary language to build and test a framework to collate big data sets and to automate the creation of stress test reports for regulators.
- Led the development team that produced an AngularJS web app and RESTful API to allow users to adjust risk measures and audit these changes.
- Conducted interviews of lateral hires and of interns/analysts for the tech division.
- Assisted in integrating a platform into a new distributed computing framework, including occasional examination of the platform's core C++ code.
- Co-created a Java-based version system for report configurations which could be controlled via an AngularJS web app.
- Investigated machine learning methods for possible use in the department.
RBS Markets & International Banking
- Developed and maintained a .NET web-based application for analyzing and visualizing time series data for a range of financial products.
- Performed extensive regression testing and other analysis as part of the regular upgrades in pricing libraries the tool depended on.
- Implemented Agile methodologies in delivering several C# coding assignments to add new analytics.
- Trained the support and development teams around the globe, including a trip to Singapore to facilitate this.
- Created a VBA tool for logging emails sent to the support inbox, which then detected whether they had been responded to. This was then summarized in a management report.
Inside Licensing Specialist
- Automated the building of a spreadsheet which was compiled from various sources and also kept track of deal progress for a Munich-based licensing team.
- Created a tool using VBA to identify discrepancies between two customer pricing sheets (taking into account that entries may be present in both, but in different row locations).
- Presented these tools at team calls and wrote up documentation for them in English and German.
- Added new statistics for the account planning sheet including the data mining of past discounts given to customers.
- Assisted the licensing sales specialists with price and product migration queries.
In my time on the committee, I analyzed the data from Eventbrite and Mailchimp to build a picture of our attendees and used it to inform our event strategy. I also maintained and promoted the committee's website and social media profiles. Attendance grew stronger as a result.
Adapting AI-generated Texts
These included NLP techniques achievable in the short term, such as examining parts of speech, dependency parsing, and manipulating grammar.
From Solving Equations to Deep Learning: A TensorFlow Python Tutorial
Angular, Alembic, .NET, Flask, AngularJS, Angular Material, Bootstrap, Express.js, Jasmine, Django, ASP.NET, NUnit, Selenium
Pandas, spaCy, Natural Language Toolkit (NLTK), Google Chart API, TensorFlow, RxJS, Scikit-learn, React, D3.js, REST APIs, Google Maps, Node.js, jQuery, NumPy, SQLAlchemy, Flask-RESTful, Matplotlib, Keras, XGBoost, OpenCV, AMQP, Asyncio, PyTorch
Data Science, Object-oriented Programming (OOP), DevOps, ETL, RESTful Development, Asynchronous Programming, Agile Software Development, Test-driven Development (TDD), Search Engine Optimization (SEO)
Google Cloud, MongoDB, PostgreSQL, Microsoft SQL Server, Databases, MySQL, Sybase
Data Modeling Expert, Software Development, Marshmallow, APIs, Deep Learning, Artificial Intelligence (AI), Natural Language Processing (NLP), Machine Learning, Mathematics, Data Analysis, Data Engineering, Freelancing, Data Modeling, Full-stack, GPT, Generative Pre-trained Transformers (GPT), Sentiment Analysis, Queue Management, Cryptography, Angular Bootstrap, Data Visualization, Statistics, Data, PDF, Computer Vision, Architecture, Technical Leadership, Software Architecture, Finance, Hedge Funds, Forecasting, Machine Learning Operations (MLOps), Hugging Face, Deep Neural Networks, Language Models, Neural Networks, Audio Processing, Graph Theory, Bokeh, Google Material Design, Librosa, Multiprocessing, MLflow, Image Processing, Dash, Leadership, Algorithms, Algebra, Number Theory, Elliptic Curve Cryptography, Quantum Computing, Calculus, Complexity Theory, Speech Recognition, Voice Analysis, Early-stage Startups, OpenAI GPT-3 API, Chatbots, Convolutional Neural Networks, GloVe, Solution Architecture, AI Programming, Web Development, Eventbrite, Scraping, PDF Scraping, Web Scraping
Gulp, AngularFire, PyCharm, Git, Mongoose, NPM, Sublime Text, Jira, Microsoft Visual Studio, TortoiseSVN, Amazon SageMaker, Jupyter, Microsoft Excel, RabbitMQ, Celery, DataTables, Browserify, IntelliJ, CVS, MATLAB, Karma, Gensim, Plotly, IBM Watson, Mailchimp
Google Cloud Platform (GCP), Docker, Kubernetes, Jupyter Notebook, Amazon Web Services (AWS), Heroku, Firebase, Windows, Linux, Oracle, Visual Studio Code (VS Code), Twilio, WordPress, NVIDIA CUDA
Master of Science Degree in Mathematics and the Foundations of Computer Science
University of Oxford - Oxford, England
Bachelor of Science Degree in Mathematics with a Study in Continental Europe
University of Bristol - Bristol, England
Stanford University via Coursera
Financial Engineering in C++
City, University of London (London, UK)