Ilya Prokin
Verified Expert in Engineering
Data Science Developer
Ilya is a researcher (Ph.D.), data scientist, CTO, and entrepreneur with expertise in applied data science and machine learning in manufacturing, finance, and biotech. He has published five scientific papers, improved stock market volatility prediction, developed MVPs, pitched startups, and built a strong data science community to discuss state-of-the-art DS topics. Ilya enjoys improving businesses with data, developing innovative ways to apply data science, and geeking out about optimization.
Preferred Environment
Linux, Visual Studio Code (VS Code), Python, Slack
The most amazing...
...part of my ride was building and exiting startups: VC-backed, end-to-end AI products and building a strong data science community that spread across France.
Work Experience
Lead Data Scientist
LoanSnap - AI for US mortgage
- Coordinated a team of data scientists and engineers. Conducted daily standups and project management.
- Provided weekly reports to senior leadership (CTO, directors of capital markets, and product).
- Led the data science section of company meeting presentations and enabled cross-company collaboration on data science initiatives.
- Participated in planning strategic initiatives and coordinated their execution. The data team's marketing recommendations doubled lead volume.
- Developed custom models optimizing revenue and cost-critical decisions across the entire sales pipeline and secondary market hedging activities.
- Gathered data from multiple online sources using customized web scraping solutions and performed competitor intelligence analysis.
Founder and Community Organizer
Data Breakfast France
- Built a strong data science community that meets every week to discuss state-of-the-art data science.
- Grew a great data science ecosystem with access to various deep expertise, including accomplished researchers, math Olympiad winners, and strong, competitive data scientists.
- Connected with experts across the country and helped data people find jobs.
Data Science Founder in Residence
Entrepreneur First & AptaDeep
- Chosen as one of the top 3% to join EF, a highly competitive program that only selects potential tech founders with top-notch skills.
- Provided weekly reports to entrepreneurs in residence and VC partners and eventually pitched to the investment committee for pre-seed funding.
- Developed an MVP using Python, HTML, CSS, and Bootstrap to create a SaaS artificial intelligence aptamer development platform.
- Coordinated with C-level executives of aptamer companies and secured POC/pilots.
- Oversaw topics such as business models, financial modeling, B2B sales, OKRs, market sizing, competition and defensibility analysis, early-stage growth, fundraising, investor decks, venture economics, communication, and customer development.
- Performed online data gathering for 360-degree analysis of various startup and news trends, leveraging Python for data manipulation, scraping, data analysis, and modeling.
Co-founder and CTO
NewsPill (ex-Sysmo)
- Improved stock market volatility prediction by applying machine learning to anomaly indicators on scraped internet chatter, technical, and contextual data.
- Redesigned a legacy algorithmic trading system with a reusable, structured code architecture, best practices, and design patterns.
- Supervised numerous data science-powered case studies such as Trump Mood Predictor (featured on French TV).
- Built infrastructure with AWS, Docker, Redis, SQL, Python, Flask, Gunicorn, Nginx, and GitLab.
- Built a chatbot framework for the easy creation of rule-based chatbots.
- Pitched the startup and contributed to securing funding with BPI & Rockstart AI. Our startup was featured on the BFM Business TV channel (French Bloomberg).
Senior Data Scientist
Dataswati AI for Manufacturing
- Built predictive models with uncertainty quantification for large French manufacturers, working with unevenly sampled time series.
- Built various automated data pipelines from raw data to automated cross-validation-based feature generation and selection to predictions.
- Integrated SOTA deep learning: CNN, LSTM, auto-encoders, and transfer learning.
- Served as a technology evangelist by writing a blog on medium.com, giving talks at meetups, and collaborating with the French Institute for Research in Computer Science and Automation (Inria).
- Customized algorithm implementations via optimization by Differential Evolution, a causal model of regime change, Wasserstein distance-based anomaly detection, and a new method of multi-domain transfer learning.
- Collected and scraped data from diverse online sources to augment datasets and enhance machine learning models with essential external data.
Researcher in Computational Biology and Neuroscience
Inria
- Developed a data-driven model of how biological neurons learn using various datasets, data cleaning, parsing, transformation, and modeling. Conducted numerical simulations of differential equations, optimization, and sensitivity analysis.
- Published five scientific papers in top journals: eLife, Scientific Reports, Nature.
- Used Python for data analysis (NumPy, SciPy, Pandas, scikit-learn, matplotlib, etc.) and numerical optimization (PyGMO). Redesigned the calculation module to use Python with F2PY (100x faster than Python + SciPy + NumPy).
Experience
Trump Mood Predictor
This web app served as a marketing tool for my first startup and illustrated the power of sentiment analysis for the stock market. Markets are often said to be driven by the so-called animal spirits of fear and greed. During the Trump presidency, his actions and tweets were moving the markets and rippling throughout the economy. We built this web app to illustrate some of the unstructured data processing and modeling techniques that we used to predict stock market volatility.
AptaDeep
• Develop 10x better aptamers (affinity, specificity, stability, or conformational changes)
• Optimize pre-SELEX, SELEX, post-SELEX, and post-production of aptamers, as well as custom non-SELEX processes
DeepProPhoto
In this project, I worked on the back and front end, AI model training, and data scraping.
PsyTrainer
https://t.me/psychotrainerbot
I contributed to the full-stack AI development. Technologies used are Telegram, Python, SQL, Metabase dashboards, Heroku/AWS, Falcon (fine-tuned with LoRA), and OpenAI's tech.
PsyTrainer—evolve your conversations, transform your beliefs, unlock your potential, and unfold the power of communication.
Personalized Books for Kids
CONTRIBUTIONS
• Full-stack Development: I handled full-stack development to ensure a seamless user experience.
• Cloud Infrastructure: I relied on AWS for scalability and reliability.
• AI-powered Content Creation: I used Python, PyTorch, TensorFlow, spaCy, and scikit-learn for AI-driven text and illustration generation.
• Data Insights: Metabase facilitated data visualization and business intelligence.
• Marketing: I used Google Ads to strengthen the marketing strategy for customer outreach.
KEY ADVANCEMENTS
• AI Illustrations: AI-generated personalized, captivating illustrations—your kid placed within a book.
• AI-generated Text: NLP models crafted engaging, educational narratives.
• Recommendations: ML algorithms offered tailored book suggestions.
Skills
Languages
Python, SQL, R, C++, Python 3
Libraries/APIs
Pandas, Scikit-learn, Natural Language Toolkit (NLTK), TensorFlow, SpaCy, NumPy, PyTorch, PySpark, LSTM, Keras, Matplotlib
Tools
Jupyter, Amazon SageMaker, Tableau, Slack, MATLAB, GitLab, Google Cloud AI, AWS CLI
Paradigms
Data Science, ETL
Storage
Data Pipelines, MySQL, PostgreSQL, Redis
Other
Optimization, Data Cleaning, Scientific Computing, Science, Deep Learning, Time Series Analysis, Time Series, Chatbots, Data Scraping, Research, Machine Learning, Data Analysis, Data Visualization, Computational Biology, Data Analytics, Artificial Intelligence (AI), Data Reporting, Linear Optimization, Statistical Analysis, Natural Language Processing (NLP), API Integration, OpenAI GPT-4 API, Data Modeling, Forecasting, Classification, Text Classification, OpenAI GPT-3 API, Neural Networks, CTO, Chatbot Conversation Design, AI Programming, Programming, User Interface (UI), Integration, Machine Learning Operations (MLOps), Language Models, ChatGPT, Team Leadership, Software Architecture, Computer Vision, Sentiment Analysis, Image Processing, Image Analysis, Deep Reinforcement Learning, Predictive Modeling, Probability Theory, Predictive Analytics, Frameworks, Data Manipulation, Analytics, Convolutional Neural Networks, Data Engineering, Financial Modeling, Biology, Genomics, Recommendation Systems, Data Strategy, Dashboards, Web Scraping, Real-time Data, PDF Scraping, Streamlit, GPT, Pricing Models, Data-driven Marketing, Generative Pre-trained Transformers (GPT), OCR, Metabase, BERT, Custom BERT, Large Language Model (LLM), Physics, 3D Reconstruction, F2PY, Sensitivity Analysis, Numerical Optimization, Options, Scraping, Gunicorn, Communication, Fundraising, Community, Business, Market Opportunity Analysis, Websites, Writing & Editing, Telegram Bots, Google Ads
Platforms
Azure, Linux, Amazon Web Services (AWS), Docker, Visual Studio Code (VS Code), Heroku
Frameworks
Flask
Industry Expertise
Web Design
Education
Ph.D. in Computer Science
Inria Rhône-Alpes︱INSA - Lyon, France
Master's Degree in Physics
University of Nizhny Novgorod - Nizhny Novgorod, Russia