Keyvis Damptey

Statistics Developer in Atlanta, GA, United States

Member since August 16, 2019
Keyvis uses statistics and mathematics to discover valuable information by identifying, testing, and verifying relationships among the factors influencing your business. This process reveals the nuances of the "lay of the land" for the costs, operations, and customer sentiment your organization must work with. From there, you can measure the impact of actions and design strategies around your organization's goals.



  • Statistics 4 years
  • Statistical Methods 4 years
  • Statistical Modeling 4 years
  • Text Analytics 4 years
  • Natural Language Processing (NLP) 4 years
  • Data Science 4 years
  • Python 4 years
  • Machine Learning 4 years


Atlanta, GA, United States



Preferred Environment

R, Python, Linux, Docker, Git

The most amazing...

...AI I've made automatically discovered interrelated activities from justifications for financial advances. It then predicted the legal risk of those activities.


  • Data Scientist

    2021 - PRESENT
    • Developed an optimization heuristic for time-series-based allocations under tight deadlines; the estimate used to compare the two solutions found the heuristic to perform within a few percentage points of the vendor-supported solution.
    • Built a data-mining framework and program that returned datasets ready for optimization in NetworkX, determining which sets of items' total volume could fit in one building while minimizing the number of packages needed for multi-item orders.
    • Scoped initial project goals for flexibility and future value, enabling the reuse of past time-series data in later testing and reducing AWS S3 costs, computation, and rework time.
    Technologies: Amazon S3 (AWS S3), Docker, GitLab
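The NetworkX grouping step above can be sketched with a dependency-free stand-in: a union-find groups items that ever appear together in an order (the same result NetworkX connected components would give), and each group's total volume is checked against a building's capacity. All item names and volumes here are hypothetical.

```python
from collections import defaultdict

def group_multi_item_orders(orders):
    """Group items that ever appear together in an order (a union-find
    stand-in for the NetworkX connected-components step)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for items in orders:
        find(items[0])  # register singleton orders too
        for other in items[1:]:
            union(items[0], other)

    groups = defaultdict(set)
    for item in parent:
        groups[find(item)].add(item)
    return list(groups.values())

def fits_in_building(group, volumes, capacity):
    """Check whether a group's total volume fits one building."""
    return sum(volumes[item] for item in group) <= capacity
```

Orders `[["a", "b"], ["b", "c"], ["d"]]` yield the groups `{a, b, c}` and `{d}`, each of which can then be tested against a building's volume capacity.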
  • Founder

    2019 - PRESENT
    Political Hack
    • Designed the platform and handled the source data.
    • Produced visualizations, researched statistical methods, and networked with politically active and interested individuals and organizations.
    • Developed logo and marketing approach and coded the prototype website.
    • Created prototype website pages with static file header.
    • Discovered data sources from many different sites.
    • Developed data pipelines.
    • Designed unified data schema.
    • Coded complementary visualizations.
    Technologies: Django, React, D3.js, Python
  • Data Scientist

    2019 - PRESENT
    Tactical Foresight Consulting, LLC
    • Used Python and R for data collection and statistical modeling, leveraging unsupervised models when labeled data was scarce.
    • Determined and designed technological capabilities, showcasing proof-of-concept (POC) of said capabilities to the client.
    • Created D3.js and Tableau visualizations based on clients' reported needs.
    • Built a program to parse court documents, count references to legislative statutes, and detect novel combinations of laws.
    • Used Bayesian networks to visualize the factors influencing a ballot measure's pass rate.
    • Used NLP to create a graph of activities from scraped data from news articles.
    • Created an unsupervised system to detect key events in claim adjusters' notes, and implemented it in code for parallel processing.
    • Created a system to detect the format of a text in order to infer its purpose.
    Technologies: Spark, Hadoop, Neo4j, D3.js, JavaScript, R, Python
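The statute-counting work above can be illustrated with a minimal sketch. The citation regex below is hypothetical and greatly simplified (real court filings need a much broader citation grammar); novel combinations are flagged by comparing pairs of co-cited statutes against previously seen pairs.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, simplified citation pattern (e.g. "18 U.S.C. § 1341").
STATUTE_RE = re.compile(r"\b\d+\s+U\.S\.C\.\s+§\s*[\d-]+")

def count_statute_references(text):
    """Count each distinct statute citation in a document."""
    return Counter(m.group(0) for m in STATUTE_RE.finditer(text))

def novel_combinations(cited_statutes, known_pairs):
    """Flag pairs of statutes cited together that were not seen before."""
    return [pair for pair in combinations(sorted(cited_statutes), 2)
            if pair not in known_pairs]
```

Counting "Count One charges 18 U.S.C. § 1341; Count Two charges 18 U.S.C. § 1343." yields one reference to each statute, and the pair is reported as novel if it is absent from `known_pairs`.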
  • Data Scientist (Consultant)

    2018 - 2018
    • Proposed, created, and tested a framework of unsupervised methods to suggest supplier matches.
    • Presented results in a clear manner and developed flowcharts of how the system works.
    • Used natural language processing dependency trees to create categories for a training set.
    • Extracted useful search features from text, created classifications for matching and search problems, and ran experiments that resulted in a successful unsupervised matching algorithm with approximately 96% accuracy.
    • Developed metaheuristics for creating and sourcing training datasets.
    Technologies: Regex, SQL, Python
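The unsupervised supplier-matching idea can be sketched with the standard library's difflib in place of the original NLP pipeline; the catalog names and the 0.8 threshold below are purely illustrative.

```python
from difflib import SequenceMatcher

def best_supplier_match(raw_name, catalog, threshold=0.8):
    """Fuzzy-match a raw supplier string against a clean catalog.
    Returns (match, score); match is None when below the threshold."""
    def norm(s):
        # Lowercase and strip punctuation/extra whitespace before comparing.
        return " ".join(s.lower().replace(",", " ").replace(".", " ").split())

    scored = [(entry, SequenceMatcher(None, norm(raw_name), norm(entry)).ratio())
              for entry in catalog]
    match, score = max(scored, key=lambda pair: pair[1])
    return (match, score) if score >= threshold else (None, score)

catalog = ["Acme Industrial Supply", "Globex Corporation", "Initech LLC"]
```

After normalization, "ACME Industrial Supply, Inc." matches "Acme Industrial Supply", while a genuinely unknown vendor falls below the threshold and returns no match.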
  • Data Scientist

    2017 - 2018
    Systematrix Solutions
    • Used Spark MLlib via PySpark for outlier detection on GraphX RDDs.
    • Presented and coded new algorithms for graph analytics using GraphX and Scala.
    • Used PySpark for fraud analytics on banking records via RDD transformations, filters, and joins.
    • Created, modified, and benchmarked machine-learning algorithms for statistical inference on network properties and money laundering prediction in a Docker container.
    • Routinely provided qualitative insights into upcoming roadblocks to meeting project and customer needs before they became noticeable problems.
    • Took the initiative to develop and present data privacy policies, standards, processes, and local and international legal requirements.
    • Translated the fraud investigators' goals into graph-property filters and traversals that extracted essential subgraphs, surfacing explicitly fraudulent connections while also reducing analytics processing time.
    • Prescribed a strategic approach to handle changing algorithmic regulations, burst-out-fraud, and take-over-fraud.
    Technologies: Spark, Hadoop, Neo4j, D3.js, JavaScript, Scala, SQL, Python
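The subgraph extraction described above can be sketched in plain Python (the production work used GraphX/PySpark): a bounded breadth-first search keeps only accounts within a few hops of flagged ones, shrinking the graph before analytics run. Account names here are invented.

```python
from collections import deque

def fraud_subgraph(edges, flagged, max_hops=2):
    """Keep only edges whose endpoints lie within max_hops of a flagged
    account; a plain-Python stand-in for the GraphX filtering step."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set()).add(src)  # reachability ignores direction

    keep = set(flagged)
    frontier = deque((account, 0) for account in flagged)
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for neighbor in adj.get(node, ()):
            if neighbor not in keep:
                keep.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return [(s, d) for s, d in edges if s in keep and d in keep]
```

With transfers A→B→C→D and an unrelated X→Y, flagging A at two hops keeps only A→B and B→C, so downstream analytics touch far fewer edges.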
  • Operational Intelligence Analyst

    2015 - 2017
    Stanford University
    • Used mathematical techniques and fitted statistical models to analyze data related to business problems and visualized the results in Tableau dashboards and Neo4j.
    • Identified and visualized needed contextual data, patterns, summary statistics, and trends using techniques including graph analytics, non-parametric ensemble models, Bayesian inference, and natural language processing (NLP).
    • Adjusted the code for multicore parallel processing on computer clusters and used MapReduce functions to aggregate customer-profile data to supplement the Neo4j database.
    • Used Cypher (Neo4j QL) to add features such as fund amount to graph database of transactions.
    • Automated a system to categorize any text using an unsupervised model, eliminating the need to manually find cluster centers and reducing the time needed to find density parameters.
    • Leveraged GloVe (or Word2Vec) vectors to classify the risk of activities extracted from text via NLP, then modeled their impact as a network/graph.
    • Constructed statistical frameworks and code by utilizing new machine learning programs; I then presented them at conferences and expos.
    • Met with clients and listened to their needs in order to design solutions to those needs.
    • Transferred, aggregated, and updated data on approvers of advances, credit cards, purchase orders, payments, and other financial and banking transactions in a NoSQL database (MongoDB) using JavaScript and Python.
    • Visualized the above-mentioned data in a Tableau dashboard.
    • Collaborated on multiple high-priority projects and made key contributions to the team's long-term strategy meetings.
    • Solved problems with a user-friendly explanation of the methodology and with minimal oversight.
    Technologies: Tableau, MongoDB, SQL, Neo4j, R, Python
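The embedding-based risk classification above can be sketched with toy vectors: average a text's word vectors, then assign the risk label of the nearest prototype by cosine similarity. The 3-d embeddings, words, and labels below are invented for illustration; real GloVe/Word2Vec vectors have hundreds of dimensions.

```python
import math

# Toy 3-d vectors standing in for pretrained GloVe/Word2Vec embeddings.
EMBEDDINGS = {
    "wire": [0.9, 0.1, 0.0], "transfer": [0.8, 0.2, 0.1],
    "offshore": [0.7, 0.0, 0.2], "lunch": [0.0, 0.9, 0.1],
    "meeting": [0.1, 0.8, 0.2],
}

def embed(text):
    """Average the vectors of the known words in a text."""
    vecs = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(component) / len(vecs) for component in zip(*vecs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Prototype vectors built from phrases an analyst has already labeled.
PROTOTYPES = {"high": embed("offshore wire transfer"), "low": embed("lunch meeting")}

def risk_label(text):
    """Assign the risk label of the most similar prototype."""
    return max(PROTOTYPES, key=lambda label: cosine(embed(text), PROTOTYPES[label]))
```

Under these toy embeddings, "wire transfer to offshore account" lands near the high-risk prototype and "team lunch meeting" near the low-risk one.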


  • Multiproject Visuals

    This site has the visualizations from multiple small projects that showcase the breadth of skills I have to offer.

  • Publicly Available Code

    Here is some of the impromptu code that I have published to showcase my ad-hoc coding style.


  • Languages

    Python, Regex, SQL, R, JavaScript, Cypher, Scala
  • Libraries/APIs

    SciPy, NumPy, Pandas, D3.js, SpaCy, GraphX, Spark ML, NLTK, Scikit-learn, NetworkX, TensorFlow, React
  • Paradigms

    Data Science, Agile Workflow, Parallel Programming
  • Storage

    Neo4j, MongoDB, Amazon S3 (AWS S3)
  • Other

    Natural Language Processing (NLP), Text Analytics, Text Mining, Unsupervised Learning, Statistical Modeling, Statistical Methods, Statistics, Topic Modeling, Analytics, Machine Learning, Time Series, Visualization, Agile Data Science, Graph Theory, Nonlinear Optimization, Nonparametric Statistics, Data Visualization, Data Analysis, Web Scraping, Industrial Engineering, Operations Research, Sentiment Analysis
  • Frameworks

    Django, Spark, Hadoop, Flask
  • Tools

    Stanford CoreNLP, Tableau, Git, GitLab
  • Platforms

    Linux, Docker, Amazon Web Services (AWS)


  • Bachelor's Degree in Industrial Engineering
    2008 - 2013
    University of Central Florida - Orlando, FL, USA
