Verified Expert in Engineering
Data Scientist and Developer
Vahan is a data scientist with 5+ years of experience building end-to-end ETL pipelines that integrate data from multiple sources. He is adept at leveraging cutting-edge tools in NLP, time series analysis, computer vision, geospatial data analysis, network analysis, and tabular data analysis to meet project needs. Vahan takes a holistic approach to data science consulting and enjoys diving deep into the business context underlying his data science projects.
Ubuntu, Visual Studio Code (VS Code), Jupyter, MongoDB, Python, ChatGPT, Stable Diffusion, MVP Design, Real Estate, Insurance
The most amazing...
...project I've developed uses various data sources and modeling modalities including NLP, CV, and networks to deliver social, political, and economic insights.
NLP Data Scientist
- Built an NLP pipeline with components that include synthetic dataset augmentation using GPT-3, few-shot topic classification using contrastive learning and transformer finetuning, and a suite of linguistic heuristics.
- Built a keyword extraction pipeline that extracts dyadic networks from company descriptions using morphological and dependency parsing, synthetic data augmentation with GPT-3, and few-shot classification with contrastive learning and transformers.
- Built and deployed interactive dashboards to demonstrate data extraction tools using Streamlit and GCP.
Hxr Eq LLC
- Researched the models and techniques used by major eCommerce websites for search ranking.
- Advised eCommerce retailers on the business implications of the models and techniques used in eCommerce search ranking.
- Advised on future work and development in eCommerce search ranking strategies.
ML and OpenAI Developer
HODL Media Inc.
- Developed an algorithm to filter cryptocurrency-related news search results.
- Deployed a pipeline that leverages several data retrieval APIs, transformer-based architectures, and the GPT-3 API in GCP.
- Consulted on the future deployment of NLP-driven solutions for information retrieval.
Sky Dust Intelligence B.V.
- Developed an AI framework that leverages GPT-3 and other transformer-based neural network architectures to automate email summarization, replies, and question answering.
- Developed a cloud-based Office Outlook add-in that leverages an AI framework for email automation.
- Advised the team on product development and natural language processing.
American University of Armenia
- Developed a transformers-driven NLP toolkit to analyze multi-language news and social media text data.
- Built a pipeline for real-time monitoring, analysis, and visualization of strategic information and psychological operations (PSYOPS).
- Consulted the government of Armenia on strategic information operations.
International Consultant on Social Media Data Quality Assessment
United Nations Statistics Division
- Developed a hybrid NLP-driven methodology to monitor social media data quality.
- Built an end-to-end ETL pipeline that gathers social media data using advanced automation bots and leverages state-of-the-art transformer-based architectures for text and image classification.
- Conceived and facilitated training seminars on a range of topics in data science and NLP.
- Contributed to the National Administrative Department of Colombia's (DANE) social media data strategy.
- Participated in international forums to present and discuss the results and prospects of the work undertaken.
Data Science Team Lead
UNDP Armenia National SDG Innovation Lab
- Developed supervised and unsupervised language models for Armenian, Russian, and English in various use cases.
- Designed, implemented, and managed end-to-end data science projects for various sectors, including tourism, labor, social services, etc.
- Oversaw and applied novel methods for unconventional data analysis of the sustainable development goals (SDG) implementation in Armenia and other countries.
- Represented Armenia in international forums on data science for international development.
Entrepreneur and Researcher
- Researched and modeled diversified revenue-sharing approaches for smallholder aggregation to reduce smallholder farmers' supply chain risk in agricultural production.
- Communicated with stakeholders in agriculture, finance, and international development to research, develop, and promote the concept.
- Developed a novel approach for risk management in smallholder agricultural production.
Machine Learning Analyst
- Built natural language processing models for a virtual call center assistant (chatbot).
- Developed recurrent neural networks and convolutional neural networks to forecast commodity prices, financial market indicators, and product sales.
- Created the novel Product2Vec and Customer2Vec models to predict customer churn.
Ministry of Defense of Republic of Armenia
- Developed code to analyze and visualize tactical, strategic, and administrative data.
- Conducted various tasks related to artillery reconnaissance, collaboration with foreign delegations, research, and speechwriting.
- Coordinated research by experts from MIT, Harvard, Oxford, and Cambridge universities.
The primary challenge in the project was working with low-resource languages and tiny datasets for supervised learning. The framework I designed to overcome this challenge combined dataset augmentation, using machine translation and generative autoregressive language models for paraphrase generation, with zero-shot classification and finetuning of pre-trained transformers such as XLM-RoBERTa.
I contributed to designing the tool to provide public policymakers in the tourism sector with real-time actionable intelligence and historical trend data to render decision-making more data-driven and evidence-based.
National Administrative Department of Colombia
My responsibilities in this project involved developing a hybrid NLP-driven methodology to monitor social media data quality and building an end-to-end ETL pipeline that gathers social media data using advanced automation bots and leverages transformer-based architectures for text and image classification.
The project and the insights gathered from it contributed to the social media data strategy of the National Administrative Department of Colombia (DANE).
NumPy, Pandas, Scikit-learn, XGBoost, CatBoost, TensorFlow, PyTorch, Natural Language Toolkit (NLTK), Keras
Microsoft Excel, Jupyter, Microsoft Power BI, BigQuery, OpenAI Gym
ETL, Data Science, Business Intelligence (BI), Asynchronous Programming
Jupyter Notebook, Ubuntu, Visual Studio Code (VS Code), Google Cloud Platform (GCP), Azure
Databases, MongoDB, NoSQL
Natural Language Processing (NLP), Data Scraping, Deep Learning, Machine Learning, EDA, Transformers, Artificial Intelligence (AI), Text Classification, Text Mining, Web Scraping, Dashboards, Data Analytics, Predictive Modeling, Data Collection, Data Analysis, Charts, Data Modeling, APIs, Web Crawlers, Scraping, Language Models, Data Processing Automation, MVP Design, GPT, Generative Pre-trained Transformers (GPT), Mathematics, Linear Algebra, Graph Theory, Mathematical Analysis, Microeconomics, Macroeconomics, Probability Theory, Statistics, Computer Vision, Data Visualization, Consulting, Time Series Analysis, Geospatial Data, Forecasting, Chatbots, Research, Social Network Analysis, Risk Models, Teamwork, Leadership, Generative Pre-trained Transformer 3 (GPT-3), IT Project Management, Networks, Geospatial Analytics, Hugging Face, Graphs, OpenAI, Time Series, ChatGPT, Stable Diffusion, Real Estate, Environment, Economics, Financial Mathematics, Quantitative Risk Modeling, Game Theory, Measure Theory, Supply Chain, International Trade, Entrepreneurship, Market Research & Analysis, History, Physics, English, Languages, Biology, Environmental Science, Art, Knowledge Graphs, Rankings
Project Management, Insurance
Bachelor of Science Degree in Mathematics with Economics
University College London | UCL - London, United Kingdom
High School Diploma in Secondary Education
John F. Kennedy Schule - Berlin, Germany