Nicolas Mallison
Verified Expert in Engineering
Team Lead and Data Science Developer
London, United Kingdom
Toptal member since May 4, 2022
Nicolas is an expert data scientist with over 24 years of experience using programming languages, including R and Python, to design and develop AI/ML data products, combined with strong practice leadership and people management skills. Nicolas is a published author and thought leader with a vast track record of success in implementing new and innovative ways of achieving the most scalable, data-centric outcomes to drive new business while promoting a consultative and collaborative environment.
Experience
- SQL - 20 years
- Data Science - 20 years
- Artificial Intelligence (AI) - 20 years
- Python - 20 years
- Excel VBA - 20 years
- Tableau - 13 years
- R - 12 years
- Keras - 3 years
Preferred Environment
R, SQL, Git, Tableau, Python, Visual Studio Code (VS Code), PyTorch, Amazon SageMaker, Azure Synapse, Jupyter Notebook
The most amazing...
...technique I've pioneered is a revolutionary data science method using fuzzy matching, machine learning, behavioral rules, and graph analytics to detect fraud.
Work Experience
Expert Data Scientist
PIMS Associates Ltd.
- Advised on the optimal AI and ML strategy to leverage a unique dataset of historical financial, operational, and market data, encompassing over 4000 businesses and 500 variables.
- Surpassed the performance of their current production multivariate linear regression model whilst adding much more flexibility/dynamism in the type and nature of model inputs.
- Designed and implemented a suite of masked self-supervised autoencoder models using PyTorch on Azure to effectively predict the long-term strategic performance of a business.
- Achieved superior performance regardless of the type of historical financial, operational, and market data provided as input, while also determining the most relevant data points to sequentially improve accuracy.
- Leveraged autoencoder encoding and model factor importance to perform dynamic searches on any input, obtaining the best set of look-alikes.
- Used model insights to recommend interventions most likely to enhance long-term strategic performance based on the model-identified operational and financial indicators that differentiate top performers from underperformers in the retrieved sample.
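The look-alike search described above can be sketched roughly as follows. This is a minimal illustration only, assuming each business has already been encoded as a fixed-length vector and that factor importance is available as one non-negative weight per encoding dimension. The function names, the weighting scheme, and the choice of Euclidean distance are assumptions for the sketch, not the production method.

```python
import math
from typing import Mapping, Sequence


def weighted_distance(a: Sequence[float], b: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Euclidean distance where each dimension is scaled by its importance weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def look_alikes(query: Sequence[float],
                encodings: Mapping[str, Sequence[float]],
                weights: Sequence[float],
                k: int = 3) -> list[str]:
    """Return the ids of the k encodings closest to the query (hypothetical helper)."""
    ranked = sorted(
        encodings,
        key=lambda business_id: weighted_distance(query, encodings[business_id], weights),
    )
    return ranked[:k]
```

Down-weighting a dimension makes differences along it matter less, so the retrieved sample is driven by the factors treated as most important.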
Python/Pandas Consultant
Game Play Network, Inc.
- Performed overall consultation on specific data science projects, data, and analysis tooling.
- Executed projects leveraging the hex.tech platform and then taught the rest of the team how to do it themselves.
- Advised on how to cluster customers based on their gameplay activity and analyze the correlation/relationship between gameplay, marketing promotions, and customer churn to build a predictive model.
Expert Data Scientist
Ruth's Hospitality Group
- Put safely into production a demand forecasting and labor scheduling model that predicts 80+ stores' daily entrees weekly and calculates hourly staffing by roles, rewriting the data layer for the reliability of model inputs and fit convergence.
- Added a robust model-fit and output quality assurance test module to the weekly production model pipeline that automatically alerts on missing or incorrect forecast outputs and blocks their upload into labor schedules.
- Designed, built, and put into production a prediction performance monitoring Shiny app, allowing interactive visual comparison of predicted-versus-actuals store by store and used the mean absolute scaled error to inform when the model needs retuning.
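For reference, the mean absolute scaled error mentioned above compares forecast error with the in-sample error of a seasonal-naive forecast. The sketch below is a minimal pure-Python version. The weekly season length of 7 and the retuning threshold of 1.0 are illustrative assumptions, not values taken from the production app.

```python
from typing import Sequence


def mase(actual: Sequence[float], predicted: Sequence[float],
         train: Sequence[float], season: int = 7) -> float:
    """Mean absolute scaled error against a seasonal-naive baseline."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    if len(train) <= season:
        raise ValueError("training history must be longer than one season")
    # Mean absolute error of the forecast on the evaluation window.
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    # In-sample error of the seasonal-naive forecast (value from `season` steps back).
    naive = sum(abs(train[i] - train[i - season]) for i in range(season, len(train)))
    scale = naive / (len(train) - season)
    if scale == 0:
        raise ValueError("naive baseline error is zero; MASE is undefined")
    return mae / scale


def needs_retuning(score: float, threshold: float = 1.0) -> bool:
    # Hypothetical rule: flag a store when the model no longer beats the naive baseline.
    return score > threshold
```

A score below 1 means the model outperforms the naive baseline on average; a score drifting above 1 is a natural signal to retune.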
Expert Data Scientist
H+M Industrial EPC
- Loaded, cleaned, transformed, and analyzed a large set of semi-structured engineering and construction project data to determine the correlation between the information available at the bidding stage and the projects' financial performance.
- Analyzed customer details, project description, contract price, cost estimates, change order cost and revenue, actual costs, and invoice value. Then, I built predictive models to support risk-based decision-making at the bidding stage.
- Delivered an interactive visualization Shiny app that demonstrated how model predictions were derived from project data, compared a project's risk against others, and provided online scoring of new projects to allow testing of model predictions.
Data Science Senior Director for Group CIO Technology Risk
Deutsche Bank
- Combined machine learning and time series encoding to develop a bank-wide innovative, predictive, and evidence-based model that predicts future IT system stability, enabling smarter decisions.
- Designed, built, and deployed into production a suite of ML products that contributed to a risk reduction in excess of 30% across the bank's IT applications after being incorporated into the bank's strategic balanced scorecard.
- Extended the initial IT stability-affected risk model to application and non-application IT assets, such as infra or platform components, and from a risk-causing perspective, making the risk impact of shared infrastructure visible to management.
- Built NLP machine learning models to characterize the nature of the IT work performed by an agent, allowing for much more accurate measurement of task productivity and targeted routing of tickets to the ablest available agent for speedier resolution.
- Created unsupervised machine learning models to identify opportunities to eliminate IT service management tasks, such as repeat tickets, and automate broken processes through NLP and process mining.
- Devised an IT service management task demand model to match service capacity better and reduce costs by dynamically matching capacity to forecasted demand and taking advantage of volume-based discounts.
- Led the predictive analytics and data science function in the technology risk department to develop models, which kept gaining appreciation from a wide range of senior technology stakeholders and leaders across the bank.
Partner, Analytics Digital Transformation Consulting UK | Global Lead
Atos
- Developed models accurately predicting the remaining lifetime of electrical railway assets, such as railroad switches, based on asset age, point-in-time cable conductance, and external weather data, which optimized asset replacement plans.
- Created models to quantify airport parking demand based on price, parking occupancy, month/day of the year, duration, customer geo-demographics, and flight details—to dynamically adjust pricing based on predictions and maximize profit.
- Devised models to quantify the mean time between failure of expensive oil drill equipment—leveraging IoT instrumentation and other environmental usage data—to optimize maintenance intervention and maximize equipment use time (predictive maintenance).
- Built healthcare demand models for maternity services elective interventions such as cesarean sections to optimize resource planning and quantify the presence of any geo/socio-demographic drivers impacting the procedure decision process.
- Produced predictive and prescriptive models of the air force helicopter maintenance schedule and process flow—based on site locations, resource constraints, and equipment capacity—predicting the impact of budget reduction on future availability.
- Established and led a newly formed solutions team within the UK and globally to harvest data assets and explore insights that could inform strategy and drive business outcomes.
- Doubled the practice size from 23 to 45 consultants in three years by blending financial, simulation, statistical, and operational research optimization modeling expertise and deep technical experience.
- Incorporated data science in consulting engagement to provide a data-driven evidential basis for improving and recommending performance across all consulting go-to-market offerings.
- Led effective interactions with clients on an issue-based consulting basis where specific strategic and operational challenges were resolved using data-driven approaches and analytics.
- Provided professional advice to clients on best practices and performance through information management advisory, digital transformation analytics, strategic analytics-led consulting, operational decision support, and data science consulting.
Director, Head of Forensic Data Analytics in Fraud Investigations and Disputes Services
EY
- Led the data analytics to trace a $1.6 billion customer-segregated funds shortfall after the bankruptcy of a global financial derivatives broker. See page 11 of blogs.harvard.edu/bankruptcyroundtable/files/2017/09/JPMCC-Till-MF-Global.pdf.
- Built a legally defensible and statistically sound extrapolation method to quantify the probability that the total amount of claim leakage would likely exceed a threshold in a contractual dispute between the underwriter and claims handling provider.
- Devised statistically sound and legally defensible approaches for the use of predictive document coding to accelerate the legal discovery of digital evidence by learning models on labeled samples allowing quicker identification of relevant documents.
- Developed a statistical method to score digital evidence based on the intensity of their use of language related to the three dimensions of Cressey's Fraud Triangle—pressure or incentive, opportunity, and rationalization—to identify fraud hotspots.
- Founded, led, and developed its innovative forensic data analytics team, which grew within 60 months from 1 to 25 people, with revenue rising from £0.5 million in FY08 to over £5.5 million in FY13.
- Spearheaded the forensic data analytics proposition development, thought leadership, recruitment, sales, and business development, which led to the growth of the wider FTDS practice from a £1.5 million to a £15.5 million business in five years.
- Developed a high-quality, culturally diverse team with one of the highest retention rates in the industry, comprising much sought-after talent with deep technical expertise.
- Contributed significantly to promoting and coordinating analytics services cross-network as the FTDS EMEIA analytics lead by developing strategic account plans executed in collaboration with the global client service partners.
- Ensured that applicable analytics propositions and subject matter experts were known and ready to operate proactively and reactively.
- Created the global forensic data analytics methodology for the successful execution of engagements.
Senior Manager, Head of Forensic Data Analytics
KPMG UK
- Developed a graph mining method to identify the layering and concealment of a sophisticated financial statement fraud involving the use of around 21,000 manual general ledger journal entries that were hidden in a dataset of 35 million records.
- Used data analytics to identify the multiple steps in which fraud was committed by automating the tracing and display of money flows from one side of the balance sheet.
- Established a new forensic technology service line, built a 12-member team, and increased company revenue from £0.308 million to over £2 million within 16 months.
- Created analytical techniques that transform data by extracting useful information, discovering hidden patterns, and facilitating conclusions so businesses can proactively and cost-effectively prevent and detect fraud, waste, and abuse.
- Packaged data collection and preparation, automated investigative linking, rules-based, and model-based analysis methods and techniques into generic and sector-specific fraud detection and investigation solutions.
- Delivered through a purpose-built scalable Data Lab technology facility with dedicated resources, proven quality assurance processes, pre-built systems software, hardware infrastructure, and best-of-breed data analytics applications.
- Used Data Lab to enable fast, predictable, and consistent delivery of analytics services incorporating investigative experience across citizen, employee, accounting, supplier, customer, product, and unstructured and transactional data.
Senior Manager, Head of Datalab - Data Analytics Services, NetReveal Founder
BAE Systems Applied Intelligence (Detica)
- Played an instrumental role in landing two major fraud managed service data analytics deals worth several million pounds in revenue to Detica (this later became known as Detica NetReveal).
- Designed and developed a groundbreaking cross-industry motor, home, and personal injury insurance claim fraud detection system for the UK Insurance Anti-fraud Bureau.
- Designed and developed a groundbreaking tax non-compliance detection system for HMRC/Inland Revenue, the UK tax authorities (identification of ghosts, moonlighters, and under-declarers).
- Developed a strong reputation rapidly (both internally and externally) for technical excellence through repeatable delivery and innovative thinking.
- Ran Detica’s Datalab that operates as a center of excellence for data analytics and incubator for delivery of data analytics managed services: This encompassed recruiting, staff training and development, tools, and methods.
- Supported the facility's selling through a proactive working relationship with the business units and other business development functions to identify potential proposition areas.
- Supported business unit sales activity through demonstrable expertise and other materials, helping with Datalab proposal writing and project estimations, and identifying and recruiting staff to meet demand.
- Communicated the proposition and benefits of the facility more broadly within Detica and worked with the marketing team to position these externally.
- Put in place the appropriate business processes to ensure reliable and repeatable delivery and approaches to capture and build re-usable know-how, tools and components, coaching, counseling, and supporting the career development of staff.
- Provided expert analytics input on client engagements, meeting with clients and project management for clients where services are wholly provided by Datalab.
Manager, CRM Data Analytics - New Customer Acquisition - Contractor
Barclaycard
- Oversaw development, improvement, and automation of the customer recruitment tracking capabilities and program planning tools. Drove a number of complex data analysis and strategic analysis projects. Performed analysis, reporting, and modeling using the SAS System.
- Performed data manipulation and transformation on huge transactional databases.
- Managed the improvement and full automation of the planning decision-making tool, enabling the program manager to optimize timings and allocation of budget across the different acquisition media to maximize expected net income.
- Developed generic SAS and VBA code to fully automate tracking of the results of every campaign on every company’s key metrics (Net income, activation rate, response rate, ECT lending, risk profile).
- Modeled impact of household structure and member targeting strategy on campaign response rate.
- Analyzed the impact of increasing credit card applications backlog (time to decision) on activation rate and customer lifetime value.
- Conducted over-solicitation analysis within and across acquisition channels, addressing cannibalization issues between media with regard to timings in contact strategies and impact on response rates.
- Oversaw management, coaching, and skills development of junior analysts in the team.
- Wrote and presented to senior management PowerPoint presentations summarizing main findings on every modeling and analytics project.
Data Analytics Manager
Zalpha - WWAV Rapp Collins
- Produced sophisticated statistical analyses and models for a big online financial retailer (complex financial product behavioral segmentation on transactional data, path analysis to understand behavioral impact of DM solicitations).
- Developed new business methodologies and advanced applied statistical techniques for new business offerings (in particular geo-marketing, response measurement of DM activity for FMCG companies, etc.).
- Led the statistics and analytics new business team.
- Took charge of day-to-day management of part of the Stats and Analysis Team. Wrote the analytics components of business proposals for agency and/or direct clients. Supervised basic analyses by junior analysts.
- Improved the scheduling and project management procedures and system by using MS Project to leverage time and better staff each analyst (team of four analysts).
- Developed quality control and quality assurance of analysis methods and processes.
- Collaborated in the development (Zalpha CRM Partner) of a new business offering on how to obtain an ROI from existing investments in CRM software, focusing on the analytics and modeling components of the CRM architecture.
Geo-statistical Data Analytics Manager
Asterop
- Pioneered innovative models to dynamically determine the catchment trade area of a point of sale based on travel time, outlet size, competitor locations and size, and census zone geo-demographics. Estimated model parameters via regression best fit to annual sales.
- Coordinated conception and implementation of geo-statistical analysis methodologies to realize surveys and/or implement sales and marketing information systems for large key client accounts.
- Managed the surveys and solutions department.
- Managed marketing information system conception projects.
- Defined and developed vertical and horizontal concepts solutions.
- Oversaw conception and production of both general and industry specific geo-statistical indicators (clusters etc.).
- Contributed to technology and software enhancement (conceptual design for automated analysis) and technological watch.
Decision Science Senior Consultant
Cognitive Relation (now Yseop)
- Acted as a founding partner of Cognitive Relation (now Yseop), a consulting firm specializing in customer relationship management personalization solutions.
- Managed the development of decisional marketing capability. Developed personalization solutions using state-of-the-art technology, combining a powerful rules engine (AI) with a natural language text generator.
- Developed a web-based personalized sales dialogue builder and a personalized mailing generator.
- Created three prototypes showing the functionalities and benefits of the former solutions.
- Completed the business development of those offers, focusing on retail groups and banks.
- Wrote the business and product development plans, focusing on the decisional feedback module.
- Conducted extensive meetings with venture capital investors to raise product and commercial development funds.
Analytical CRM Consultant
Accenture
- Conducted strategic customer insight and analytical CRM project for a large French retail group.
- Defined the analysis framework for generating customer information to build a predictive model of purchasing behavior.
- Implemented this conceptual analysis in the form of a qualification questionnaire.
- Designed and implemented the methodological process for efficient data warehousing and exploitation of the information using SAS.
- Programmed a parametric predictive model in SAS and estimated its parameters (precision obtained: 80%, robustness: 97%).
- Estimated additional turnover potential associated with a relationship marketing loyalty program.
Experience
IT Application Stability Risk Score Predictive Model
Working collaboratively with software engineers, CIO teams/application owners, and ITSM process owners (incident, problem, change, security, disaster recovery, technology roadmap compliance), over 150 risk driver metrics were defined and calculated on the 1st and 15th of every month for every production system to be scored.
Each of those metrics was derived from IT data stored in a system of record, and their calculation logic incorporated risk knowledge from SMEs. They demonstrated pairwise correlation with future risk and were actionable by production teams. The metrics were input to a stacked ensemble of negative binomial hurdle count regression models predicting the probability of future incidents in the next four weeks for each production system, aggregating sub-models quantifying risk from resiliency, change, security, and data perspectives.
See P Brunner LI feedback link.
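To illustrate the hurdle idea behind the incident model described above: a hurdle count model treats "any incident at all" and "how many incidents, given at least one" as two separate parts. The sketch below only shows the resulting probability mass function for a single system, using a zero-truncated negative binomial for the positive counts. It does not cover the regression on risk driver metrics or the stacking of sub-models, and all parameter names and values are hypothetical.

```python
import math


def nb_pmf(k: int, mu: float, r: float) -> float:
    """Negative binomial pmf with mean mu and dispersion r (variance mu + mu**2 / r)."""
    p = r / (r + mu)
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)


def hurdle_nb_pmf(k: int, p_any: float, mu: float, r: float) -> float:
    """P(Y = k) when zeros and positive counts are modeled separately."""
    if k == 0:
        return 1.0 - p_any
    # Positive counts follow a negative binomial truncated at zero.
    return p_any * nb_pmf(k, mu, r) / (1.0 - nb_pmf(0, mu, r))


def expected_incidents(p_any: float, mu: float, r: float) -> float:
    # Mean of the zero-truncated part, weighted by the chance of crossing the hurdle.
    return p_any * mu / (1.0 - nb_pmf(0, mu, r))
```

Here `p_any` plays the role of the predicted probability of at least one incident in the forecast window, while `mu` and `r` shape the count distribution once an incident occurs.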
Analytics-driven Organizations Thought Leadership Paper
https://atos.net/wp-content/uploads/2016/07/atos-consulting-whitepaper-analytics-hr-interactive.pdf
This paper explores why organizations need to become analytics-driven to succeed in the new digital data economy; how this is as much a management revolution as a technology revolution, if not more so; what it takes to truly become an analytics-driven competitor; and how businesses can start this transformative journey to put data and analytics at the heart of their strategy and use them to differentiate their distinctive capabilities and value chains.
RavenPack/Dow Jones Symposium: Creating and Combining Alpha Streams
https://www.youtube.com/watch?v=ufZsgMa4q-0
The presentation is entitled: “Is Big Data a Management Revolution for Quantitative Investing: What is Possible Beyond Automated Sentiment Analytics on News Streams?”
In this presentation, I explain how big data through digital transformation has the promise to transform every industry and that quantitative investing is no exception.
Automated sentiment analytics on large volume and high-velocity news streams is here, but what else is available to be exploited for investment research and trading opportunity detection?
In this short presentation, I explore new potential big data use cases, how big data can be made available and exploited, and which data and analytics technology patterns can practically overcome volume, variety, veracity, and velocity issues.
Big Data Live–AI and ML for Risk Management and Compliance Interview
https://www.youtube.com/watch?v=Xx8YPUNBNWc
In this video, I give insight into what changes I have seen during the previous five years, how to identify fraud through analytics, the kind of industry crossovers there are from risk and compliance analytics to other business areas, how predictive analytics can be utilized to prevent fraud and manage risks, what has been the catalyst for the sudden huge growth in big data, where I see big data and analytics in five years’ time and what is needed within a big data team to make it successful.
UK Insurance Fraud Bureau: Cross-industry Organized Crime Fraud Networks Detection System
https://www.youtube.com/watch?v=5xOSbkMfNfs
The method was inspired by insurance fraud investigation/detective methods: Using a multi-level statistical network fuzzy matching approach, motor claims, motor insurance policies, and anti-fraud hotlist records are matched/linked together into network groups following links based on tokenized/encoded combinations of elements related to participant names, dates of birth and addresses, third parties involved (lawyers/witnesses/accident repair companies, doctors/clinics), bank accounts/cards, telephone numbers, and emails.
Supervised and unsupervised learning based on attributes of the groups created allows for the identification of unusual activity at the group-aggregated level that is very likely indicative of organized crime fraud, for example, a highly unusual network comprising 56 claims, 21 policies, three addresses, eight cars, and six people over a six-month period.
Examples of the type of network created on slide 19 in https://bit.ly/3MB1fkP and page 7 in https://bit.ly/3rY9m38
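The linking step can be pictured with a much simplified sketch: records that share an encoded field value are merged into the same network group using a union-find structure. This version uses exact matching on crudely normalized tokens rather than the multi-level statistical fuzzy matching described above, and the field names and helper functions are hypothetical.

```python
from collections import defaultdict


def normalize(value: str) -> str:
    """Crude token encoding: lowercase, keep letters and digits, sort the tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in value)
    return " ".join(sorted(cleaned.split()))


def link_records(records: dict[str, dict[str, str]]) -> list[set[str]]:
    """Group record ids that share any normalized field value (connected components)."""
    parent = {rid: rid for rid in records}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    first_seen: dict[tuple[str, str], str] = {}
    for rid, fields in records.items():
        for field, value in fields.items():
            key = (field, normalize(value))
            if not key[1]:
                continue  # empty values never create a link
            if key in first_seen:
                parent[find(rid)] = find(first_seen[key])
            else:
                first_seen[key] = rid

    groups: dict[str, set[str]] = defaultdict(set)
    for rid in records:
        groups[find(rid)].add(rid)
    return list(groups.values())
```

Group-level attributes such as the number of claims, policies, addresses, and people per group can then be computed from each returned set and passed to downstream learning.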
Fraud Triangle Text Analytics for Detection of Fraud in Electronic Communications
By trying to uncover employee behavioral traits indicative of fraud, this approach complements existing approaches that analyze structured data (accounting and operational transactions) and provides a different avenue for highlighting areas of concern that warrant investigation, in particular pointing the investigators at who (which employees or agents), where (which locations/departments), when (what time period), and what (which topics discussed in the electronic communications triggered the alert) to look for.
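A toy version of the scoring idea, assuming a small keyword list per fraud triangle dimension, could look like the sketch below. The lexicon, the scoring formula (share of words matching each dimension), and the ranking helper are all illustrative assumptions rather than the statistical method used in practice.

```python
import re
from collections import Counter

# Hypothetical keyword lists; a real lexicon would be far larger and curated by investigators.
LEXICON = {
    "pressure": {"deadline", "target", "debt", "bonus"},
    "opportunity": {"override", "nobody", "unchecked", "bypass"},
    "rationalization": {"deserve", "borrow", "temporary", "everyone"},
}


def triangle_scores(message: str) -> dict[str, float]:
    """Share of a message's words that fall in each fraud triangle dimension."""
    words = re.findall(r"[a-z']+", message.lower())
    counts = Counter(words)
    total = len(words)
    return {
        dim: (sum(counts[w] for w in terms) / total if total else 0.0)
        for dim, terms in LEXICON.items()
    }


def hotspots(messages: list[tuple[str, str]], top: int = 3) -> list[tuple[str, float]]:
    """Rank senders by their mean combined score across all three dimensions."""
    per_sender: dict[str, list[float]] = {}
    for sender, text in messages:
        per_sender.setdefault(sender, []).append(sum(triangle_scores(text).values()))
    ranked = sorted(
        ((sender, sum(vals) / len(vals)) for sender, vals in per_sender.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:top]
```

Grouping the same scores by location or time period instead of sender would give the "where" and "when" views in the same way.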
Education
Doctoral Research Fellowship in Advanced Quantitative Economics
Toulouse School of Economics - Toulouse, France
Master's Degree (Diplôme De Statisticien Economiste) in Economics, Data Science and Finance
ENSAE ParisTech - Paris, France
Master's Degree (Diplôme D'Ingénieur) in Engineering, Science, and Technology
Ecole Polytechnique - Paris, France
Certifications
RNNs, GRUs & LSTMs Deep Learning Time Series/Sequence Models
Coursera - deeplearning.ai
Computer Vision CNNs, ResNets and Neural Style Transfer
Coursera - deeplearning.ai
Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
Coursera - deeplearning.ai
Structuring Machine Learning Projects
Coursera - deeplearning.ai
Neural Networks and Deep Learning
Coursera - deeplearning.ai
Python Programming
Coursera - University of Michigan
Computing for Data Analysis Using R
Coursera - Johns Hopkins University
Skills
Libraries/APIs
Pandas, Scikit-learn, NumPy, SciPy, Ggplot2, Tidyverse, Caret, jQuery DataTables, XGBoost, NLopt, PyTorch, TensorFlow, Keras, PySpark
Tools
SPSS, SPSS Modeler, StatsModels, Dplyr, Tableau, Tableau Desktop Pro, Plotly, Spark SQL, ARIMA, GitHub, Jira, Confluence, Git, Jupyter, Microsoft Copilot, Microsoft Power BI, Amazon SageMaker
Languages
Python 3, R, SQL, SAS, Excel VBA, Python, Markdown
Paradigms
ETL, Business Intelligence (BI), Change Management, Azure DevOps, Autonomic Computing
Storage
MySQL, SQL Server 2008, Data Pipelines, Teradata, Apache Hive, PostgreSQL, Data Lakes, Redshift
Frameworks
RStudio Shiny, Apache Spark
Platforms
RStudio, Windows, Oracle, Gephi, Visual Studio Code (VS Code), SharePoint, Azure, Azure Synapse, Alteryx, Jupyter Notebook
Industry Expertise
Insurance, Teaching
Other
Data Science, Mathematics, Statistics, Econometrics, Data Analysis, Forecasting, Nonparametric Statistics, Time Series, Machine Learning, Text Mining, Predictive Analytics, Predictive Modeling, Organization, Artificial Intelligence (AI), Fraud Audits, Data Matching, Pattern Matching, Fraud Investigation, Unsupervised Learning, Optimization, Data Visualization, Data Analytics, Neural Networks, Clustering, Statistical Modeling, Education, Classification, Decision Trees, Data-driven Decision-making, Big Data, Dashboards, Quantitative Analysis, Regression Modeling, Regression, Data Scientist, Statistical Analysis, Data Reporting, Data Mining, Transformer Models, Time Series Analysis, Fraud Detection, Excel Modeling, Quantitative Economics, Microeconomics, Bayesian Statistics, Deep Learning, Sequence Models, Strategy, Agile Transformation, Natural Language Processing (NLP), General Ledgers, Financial Data, Digital, Fraud Prevention, Risk Models, Fuzzy Logic, Linked Data, Graphs, Social Network Analysis, Physics, Sales Forecasting, Risk Management, Performance, dygraphs, Text Classification, Performance Management, Risk Modeling, Performance Improvement, Machine Learning Operations (MLOps), Decision Modeling, Financial Modeling, Excel Macros, Data Engineering, Unstructured Data Analysis, Quantitative Finance, Benchmarking, AgentGPT, Data Scraping, Distributed Systems, Large Language Models (LLMs), Data Modeling, Demand Forecasting, Inventory Management, Game Theory, Convolutional Neural Networks (CNNs), Tableau Server, Process Simulation, Geospatial Analytics, Forensics, SAP, PeopleSoft, Information Retrieval, Discovery, ServiceNow, Stock Market, Generative Pre-trained Transformers (GPT), Azure Data Factory (ADF), Azure Blob Storage, Bayesian Inference & Modeling, Bnlearn, Profitability Optimization, Expert Systems, P&C Insurance, Team Mentoring, Hospitality, Generative Artificial Intelligence (GenAI), User Workflows, BERT, Artificial Neural Networks (ANN), Masking, Self-supervised Learning, PacMAP, Dimensionality Reduction