Keyvis Damptey

Verified Expert in Engineering

Data Scientist and Software Developer

Location

Atlanta, GA, United States

Toptal Member Since

August 16, 2019

Keyvis uses statistics and mathematics to uncover valuable information by identifying, testing, and verifying relationships among the factors that influence your business. This process reveals the nuances of the "lay of the land" in the costs, operations, or customer sentiment your organization must work with. From there, you can work together to measure the impact of actions and design strategies around your organization's goals.

Data Scientist, Data Analysis, Data Analytics, Exploratory Data Analysis, Data Cleansing, Predictive Analytics, Statistics, Analytics, Data Reporting, Natural Language Processing (NLP), Machine Learning, Python, Regex, NumPy, Pandas

Portfolio

Political Hack Institute

Django, React, D3.js, Python, Data Visualization, Graph Theory, Web Scraping...

Tactical Foresight Consulting, LLC

Spark, Hadoop, Neo4j, D3.js, JavaScript, R, Python, Text Classification...

TrustLab

Snowflake, Bazel, Streamlit, D3.js, Design-driven Development (D3)...

Experience

Statistical Modeling - 5 years
SQL - 4 years
Python - 4 years
Natural Language Processing (NLP) - 3 years
Neo4j - 3 years
Unsupervised Learning - 3 years
Data Visualization - 2 years
Graph Theory - 1 year

Availability

Full-time

Preferred Environment

R, Python, Linux, Docker, Git

The most amazing...

...AI I've made automatically discovered interrelated activities from justifications for financial advances. It then predicted the legal risk of those activities.

Work Experience

Founder

2019 - PRESENT

Political Hack Institute

Designed the platform and handled the source data.
Produced visualizations, researched statistical methods, and networked with politically active and interested individuals and organizations.
Developed the logo and marketing approach and coded the prototype website.
Created prototype website pages with a static file header.
Discovered data sources from many different sites.
Developed data pipelines for the initial prototyping of visualizations.
Designed a unified data schema in the initial stages of business planning.
Coded complementary visualizations that are ready to be used once the database is created.

Technologies: Django, React, D3.js, Python, Data Visualization, Graph Theory, Web Scraping, Visualization, Data Analysis, Text Analytics, Text Mining, Natural Language Processing (NLP), GraphDB, SciPy, Programming, User Interface (UI), EDA, Exploratory Data Analysis, Data Modeling, Data Scientist, REST APIs, Data Cleaning, HTML, Web Development, Data Scraping

Data Scientist

2019 - PRESENT

Tactical Foresight Consulting, LLC

Used Python and R for data collection and statistical modeling, leveraging unsupervised models when labeled data was scarce.
Determined and designed technological capabilities, showcasing the proof of concept (POC) of said capabilities to the client.
Created D3.js and Tableau visualizations for clients with reporting needs.
Built a program to parse court documents to count references to legislative statutes and detect novel combinations of laws.
Used Bayesian networks to visualize the factors influencing a ballot measure's pass rate.
Used NLP to create a graph of activities from data scraped from news articles.
Created an unsupervised system to detect key events in claim adjusters' notes and implemented it in code for parallel processing.
Created a system that detects the format of a text to infer the text's purpose.

Technologies: Spark, Hadoop, Neo4j, D3.js, JavaScript, R, Python, Text Classification, Classification, Data Science, Natural Language Toolkit (NLTK), SpaCy, Web Scraping, Visualization, Data Analysis, Text Analytics, Text Mining, Natural Language Processing (NLP), Statistical Modeling, Statistical Methods, Analytics, GraphDB, Flask, Artificial Intelligence (AI), Models, NumPy, Pandas, Scikit-learn, Statistics, SciPy, Sentiment Analysis, Programming, User Interface (UI), Data Analytics, EDA, Exploratory Data Analysis, Modeling, Web Applications, Data Cleansing, Feature Engineering, Classification Algorithms, Machine Learning, Regression Modeling, Predictive Modeling, Predictive Analytics, Data Scientist, Statistical Analysis, Quantitative Analysis, Regression, Labeling, PyTorch, REST APIs, Deep Learning, Jupyter Notebook, Data Cleaning, Language Models, Custom Models, Reports, Metrics, CSV, Large Language Models (LLMs), Plotly, HTML, Consulting, Optimization, XGBoost, Bash Script, Finance, Web Development, Python 3, Graph Databases, Neural Networks, Named-entity Recognition (NER), Transformer Models, Data Scraping, Minimum Viable Product (MVP), Classifier Development, Teamwork, Technical Leadership, Leadership, Machine Learning Algorithms, Mathematics, Team Leadership, Embeddings

Technical Threat Analyst

2023 - 2024

TrustLab

Developed and reviewed experimental designs and needed sample sizes in accordance with legal requirements.
Managed multiple projects and adjusted priorities as new clients and requests came in, leveraging Snowflake data sources.
Developed text analytics dashboards in GitHub using Streamlit, SpaCy, and D3.js, drawing on Snowflake data sources filled with open source data.
Presented graph analytical capabilities and their use cases. Highlighted business gaps and opportunities and proposed new solutions that could be re-used or turned into future capabilities for trust and safety projects.
Used Bazel and Docker containers to run and test local app instances.

Technologies: Snowflake, Bazel, Streamlit, D3.js, Design-driven Development (D3), Experimental Design, Python, Graphs, Graph Theory, Text Mining, SpaCy, Python 3, Graph Databases, Algorithms, Named-entity Recognition (NER), Data Mining, Data Scraping, Teamwork, Bayesian Statistics, Bayesian Inference & Modeling, Machine Learning Algorithms, Mathematics

Data Scientist

2021 - 2023

Nordstrom

Developed an optimization heuristic for time-series-based allocations under tight deadlines. The estimate used to compare the two solutions placed my heuristic within a couple of percentage points of the vendor-supported solution.
Created a custom data visualization dashboard that pulled information from AWS S3 and SQL databases (Teradata) and was configurable/customizable by the end users. Also maintained and updated existing dashboards with new data sources and metrics.
Enabled the dashboards to cover inventory locations, types, levels, and PO time in transit.
Built an information mining framework and program that returned datasets ready to be optimized in NetworkX, used to determine which sets of items fit by total volume in one building while minimizing the number of packages needed for multi-item orders.
Set initial project goals that allowed flexibility and future value for later projects. This enabled the reuse of past time series data for future testing and code updates, reducing AWS S3-related costs, computation, and rework time.
Developed filters based on inventory type (including departments and other SKU-related categories), selling channel, timestamp, location, and the season the inventory was intended for.

Technologies: Amazon S3 (AWS S3), Docker, GitLab, Data Science, Time Series, Data Visualization, Graph Theory, Data Analysis, Statistical Modeling, Statistical Methods, Analytics, GraphDB, Machine Learning, Artificial Intelligence (AI), Cloud, Models, Amazon Web Services (AWS), Anomaly Detection, NumPy, Pandas, Agile Data Science, Scikit-learn, Agile Workflow, Spark ML, NetworkX, Statistics, SciPy, Operations Research, Programming, Data Analytics, EDA, Exploratory Data Analysis, Consumer Behavior, Modeling, RStudio, Data Cleansing, PySpark, Machine Learning Operations (MLOps), Feature Engineering, Classification Algorithms, Regression Modeling, Large Data Sets, Parallelization, Predictive Modeling, Predictive Analytics, Data Scientist, Statistical Analysis, Forecasting, Quantitative Analysis, Regression, Tableau, Data Reporting, Dashboards, Jupyter Notebook, Data Cleaning, Custom Models, Reports, Data Pipelines, Metrics, CSV, Supervised Learning, Plotly, Inventory, Retail & Wholesale, PostgreSQL, A/B Testing, Optimization, Mentorship & Coaching, Time Series Analysis, XGBoost, Bash Script, Business Requirements, Apache Airflow, Python 3, Graph Databases, Algorithms, Neural Networks, NoSQL, Data Mining, Teamwork, Technical Leadership, Leadership, Machine Learning Algorithms

Data Scientist (Consultant)

2018 - 2018

MatchPoint

Proposed, built, and tested a framework of unsupervised methods to identify suggested suppliers.
Presented results in a clear manner and developed flowcharts of how the system works.
Used natural language processing dependency trees to create categories as a training set.
Extracted useful search features from the text, created classifications for matching and search problems, and worked on experiments that resulted in a successful unsupervised matching algorithm with approximately 96% accuracy.
Developed metaheuristics for creating and sourcing training datasets.

Technologies: Regex, SQL, Python, Text Classification, Classification, Data Science, Natural Language Toolkit (NLTK), Unsupervised Learning, Topic Modeling, Nonparametric Statistics, SpaCy, Data Analysis, Text Analytics, Text Mining, Statistical Modeling, Statistical Methods, Analytics, Artificial Intelligence (AI), Cloud, Models, Amazon Web Services (AWS), NumPy, Pandas, Agile Data Science, Scikit-learn, Agile Workflow, Statistics, SciPy, Sentiment Analysis, Programming, Data Analytics, EDA, Exploratory Data Analysis, Modeling, Data Cleansing, Feature Engineering, Classification Algorithms, Regression Modeling, Machine Learning, Large Data Sets, Predictive Modeling, Data Scientist, Statistical Analysis, Quantitative Analysis, Regression, Labeling, Jupyter Notebook, Data Cleaning, Custom Models, Clustering Algorithms, Reports, Data Matching, Consulting, Logistic Regression, XGBoost, Python 3, Neural Networks, Transformer Models, Data Scraping, Minimum Viable Product (MVP), Classifier Development, Microsoft Excel, Bayesian Statistics, Bayesian Inference & Modeling, Leadership, Machine Learning Algorithms, Generalized Linear Model (GLM), Mathematics, Embeddings

Data Scientist

2017 - 2018

Systematrix Solutions

Used Spark MLlib via PySpark for outlier detection on GraphX RDDs.
Presented and coded new algorithms for graph analytics using GraphX and Scala.
Used PySpark for fraud analytics on banking records via RDD transformations, filters, and joins.
Created, modified, and benchmarked machine learning algorithms for statistical inference on network properties and money laundering prediction in a Docker container.
Routinely flagged upcoming roadblocks to meeting project and customer needs before they became noticeable problems.
Took the initiative to develop and present data privacy policies, standards, processes, and local and international legal requirements.
Translated fraud investigators' goals into graph-property filters and traversals that extracted the essential subgraphs, surfacing explicitly fraudulent connections and reducing processing time for analytics.
Prescribed a strategic approach to handle changing algorithmic regulations, bust-out fraud, and takeover fraud.

Technologies: Spark, Hadoop, Neo4j, D3.js, JavaScript, Scala, SQL, Python, Data Science, Cypher, GraphX, Unsupervised Learning, Data Visualization, Graph Theory, Nonlinear Optimization, Visualization, Data Analysis, Statistical Modeling, Statistical Methods, Analytics, Parallel Programming, GraphDB, Flask, Django, Machine Learning, Artificial Intelligence (AI), Cloud, Models, Anomaly Detection, NumPy, Pandas, Agile Data Science, Scikit-learn, Agile Workflow, Spark ML, NetworkX, TensorFlow, Statistics, SciPy, Programming, User Interface (UI), Data Analytics, EDA, Exploratory Data Analysis, Modeling, Web Applications, Data Cleansing, PySpark, Data Modeling, Machine Learning Operations (MLOps), Feature Engineering, Regression Modeling, Classification Algorithms, Large Data Sets, Parallelization, Data Scientist, Statistical Analysis, Forecasting, Quantitative Analysis, Regression, Data Reporting, Dashboards, Data Cleaning, Reports, CSV, HTML, Consulting, Finance, Business Requirements, Web Development, Graph Databases, Algorithms, NoSQL, Data Mining, Unsupervised Fraud Detection, Classifier Development, Teamwork, Technical Leadership, Machine Learning Algorithms, Mathematics

Operational Intelligence Analyst

2015 - 2017

Stanford University

Used mathematical techniques and fit statistical models to analyze data related to business problems and visualized the results in Tableau dashboards and Neo4j.
Identified and visualized the needed contextual data, patterns, summary statistics, and trends using techniques including, but not limited to, graph analytics, non-parametric ensemble models, Bayesian inference, and natural language processing (NLP).
Adjusted the code for multicore parallel processing on computer clusters and used MapReduce functions to aggregate data for customer profiles to supplement the Neo4j database.
Used Cypher (Neo4j's query language) to add features such as fund amount to a graph database of transactions.
Automated a system to categorize any text using an unsupervised model, eliminating the need to find cluster centers manually and reducing the time needed to find density parameters.
Leveraged GloVe vectors (or Word2Vec) to classify the risk of activities extracted from text using NLP, then modeled their impact as a network/graph.
Constructed statistical frameworks and code using new machine learning programs and presented them at conferences and expos.
Transferred, aggregated, and updated data on approvers of advances, credit cards, purchase orders, payments, and other financial and banking transactions in the NoSQL database (MongoDB) using JavaScript and Python.
Visualized the data mentioned above in a Tableau dashboard.
Collaborated on multiple high-priority projects and made key contributions to the team's long-term strategy meetings.

Technologies: Tableau, MongoDB, SQL, Neo4j, R, Python, Text Classification, Classification, Data Science, Cypher, Natural Language Toolkit (NLTK), Unsupervised Learning, Topic Modeling, Nonparametric Statistics, Time Series, Data Visualization, Graph Theory, Nonlinear Optimization, Visualization, Data Analysis, Text Analytics, Text Mining, Natural Language Processing (NLP), Statistical Modeling, Statistical Methods, Analytics, Stanford CoreNLP, Parallel Programming, GraphDB, Artificial Intelligence (AI), Models, Anomaly Detection, NumPy, Pandas, Scikit-learn, Spark ML, NetworkX, TensorFlow, Statistics, SciPy, Sentiment Analysis, Programming, Data Analytics, EDA, Exploratory Data Analysis, Modeling, RStudio, Data Cleansing, PySpark, Data Modeling, Feature Engineering, Classification Algorithms, Machine Learning, Regression Modeling, Large Data Sets, Parallelization, Predictive Modeling, Predictive Analytics, Data Scientist, Statistical Analysis, Forecasting, Quantitative Analysis, Regression, Labeling, PyTorch, REST APIs, Deep Learning, Data Reporting, Dashboards, Financial Data Analytics, Jupyter Notebook, Data Cleaning, Language Models, Custom Models, Clustering Algorithms, Reports, Data Pipelines, Metrics, CSV, Large Language Models (LLMs), Call Centers, Supervised Learning, Plotly, Consulting, Mentorship & Coaching, Logistic Regression, Time Series Analysis, XGBoost, Finance, Business Requirements, Graph Databases, Neural Networks, NoSQL, Transformer Models, Data Mining, Data Scraping, Unsupervised Fraud Detection, Classifier Development, Teamwork, Microsoft Excel, Bayesian Statistics, Bayesian Inference & Modeling, Leadership, Machine Learning Algorithms, Generalized Linear Model (GLM), Mathematics, Embeddings

Experience

Multiproject Visuals

http://www.tacticalforesight.org

This site has visualizations from multiple small projects that showcase the breadth of skills I have to offer. Many are interactive and based on open data sources. Some required unique data transformations, as well.

Publicly Available Code

https://github.com/quantkeyvis/PublicFiles

Here is some of the impromptu code I have published to showcase my ad hoc coding style. The scripts include web scraping, text analysis, pseudonymous data generation, and visualization of maps.

Skillset

Languages

Python, Regex, SQL, R, JavaScript, Cypher, Python 3, Scala, HTML, Bash Script, Snowflake

Libraries/APIs

SciPy, NumPy, Pandas, D3.js, SpaCy, GraphX, Spark ML, Natural Language Toolkit (NLTK), Scikit-learn, NetworkX, PySpark, XGBoost, TensorFlow, React, Graph API, PyTorch, REST APIs

Paradigms

Data Science, Agile Workflow, Parallel Programming, Anomaly Detection, Design-driven Development (D3)

Storage

Neo4j, MongoDB, Graph Databases, NoSQL, Amazon S3 (AWS S3), Data Pipelines, PostgreSQL

Other

Natural Language Processing (NLP), Text Analytics, Text Mining, Unsupervised Learning, Statistical Modeling, Statistical Methods, Statistics, Topic Modeling, Analytics, Machine Learning, Data Analysis, Classification, Models, Data Analytics, EDA, Exploratory Data Analysis, Modeling, Data Cleansing, Feature Engineering, Regression Modeling, Classification Algorithms, Predictive Modeling, Predictive Analytics, Data Scientist, Quantitative Analysis, Regression, Data Reporting, Supervised Learning, Classifier Development, Time Series, Visualization, Agile Data Science, Graph Theory, Nonlinear Optimization, Nonparametric Statistics, Data Visualization, Text Classification, GraphDB, Artificial Intelligence (AI), Programming, Large Data Sets, Parallelization, BI Reports, Reporting, Data-driven Dashboards, Statistical Analysis, Forecasting, Labeling, Deep Learning, Dashboards, Data Cleaning, Language Models, Custom Models, Clustering Algorithms, Reports, Metrics, CSV, Inventory, Consulting, A/B Testing, Business Requirements, Neural Networks, Transformer Models, Data Mining, Minimum Viable Product (MVP), Teamwork, Machine Learning Algorithms, Generalized Linear Model (GLM), Mathematics, Embeddings, Web Scraping, Industrial Engineering, Operations Research, Sentiment Analysis, Cloud, User Interface (UI), Consumer Behavior, Graphs, Web Applications, Data Modeling, Machine Learning Operations (MLOps), Financial Data Analytics, Large Language Models (LLMs), Call Centers, Data Matching, Optimization, Mentorship & Coaching, Logistic Regression, Time Series Analysis, Finance, Web Development, Experimental Design, Algorithms, Data Scraping, Unsupervised Fraud Detection, Bayesian Statistics, Bayesian Inference & Modeling, Technical Leadership, Leadership, Team Leadership

Frameworks

Django, Spark, Hadoop, Flask, Streamlit

Tools

Stanford CoreNLP, Tableau, GitLab, Microsoft Excel, Git, Plotly, Apache Airflow, Bazel, Named-entity Recognition (NER)

Platforms

Linux, RStudio, Jupyter Notebook, Docker, Amazon Web Services (AWS)

Industry Expertise

Retail & Wholesale

Education

2008 - 2013

Bachelor's Degree in Industrial Engineering

University of Central Florida - Orlando, FL, USA

Certifications

AUGUST 2023 - PRESENT

Graph Data Science

Neo4j Graph Academy
