Vahan Martirosyan
Verified Expert in Engineering
Data Scientist and Developer
Abu Dhabi, United Arab Emirates
Toptal member since March 16, 2022
Vahan is a data scientist with over five years of experience building several end-to-end ETL pipelines that integrate data from multiple sources. He is adept at leveraging cutting-edge tools in NLP, time series analysis, computer vision, geospatial data analysis, network analysis, and tabular data analysis to meet project needs. Vahan employs a holistic approach to data science consulting and enjoys deep diving into the business context underlying his data science projects.
Portfolio
Experience
- Natural Language Processing (NLP) - 5 years
- Computer Vision - 5 years
- Time Series Analysis - 5 years
- Consulting - 5 years
- Generative Pre-trained Transformers (GPT) - 5 years
- ETL - 5 years
- Machine Learning - 5 years
- Geospatial Data - 4 years
Availability
Preferred Environment
Ubuntu, Visual Studio Code (VS Code), Jupyter, MongoDB, Python, ChatGPT, Stable Diffusion, Real Estate, OpenAI GPT-4 API
The most amazing...
...project I've developed uses various data sources and modeling modalities, including NLP, CV, and networks, to deliver social, political, and economic insights.
Work Experience
NLP Data Scientist
Grata Inc
- Built an NLP pipeline with components that include synthetic dataset augmentation using GPT-3, few-shot topic classification using contrastive learning and transformer fine-tuning, and a suite of linguistic heuristics.
- Built a keyword extraction pipeline that uses morphological and dependency parsing, synthetic data augmentation using GPT-3, and few-shot classification using contrastive learning and transformers to extract dyadic networks from company descriptions.
- Built and deployed interactive dashboards to demonstrate data extraction tools using Streamlit and GCP.
Data Scientist
Hxr Eq LLC
- Researched the models and techniques used by major eCommerce websites for search ranking.
- Advised eCommerce retailers on the business implications of the models and techniques used in eCommerce search ranking.
- Advised on future work and development in eCommerce search ranking strategies.
ML and OpenAI Developer
HODL Media Inc.
- Developed an algorithm to filter cryptocurrency-related news search results.
- Deployed a pipeline that leverages several data retrieval APIs, transformer-based architectures, and the GPT-3 API in GCP.
- Advised on the future deployment of NLP-driven solutions for information retrieval.
NLP Engineer
Sky Dust Intelligence B.V.
- Developed an AI framework that leverages GPT-3 and other transformer-based neural network architectures to automate email summarization, replies, and question answering.
- Developed a cloud-based Office Outlook add-in that leverages an AI framework for email automation.
- Advised the team on product development and natural language processing.
Co-researcher
American University of Armenia
- Developed a transformers-driven NLP toolkit to analyze multi-language news and social media text data.
- Built a pipeline for real-time monitoring, analysis, and visualization of strategic information and psychological operations (PSYOPS).
- Advised the government of Armenia on strategic information operations.
International Consultant on Social Media Data Quality Assessment
United Nations Statistics Division
- Developed a hybrid NLP-driven methodology to monitor social media data quality.
- Built an end-to-end ETL pipeline that gathers social media data using advanced automation bots. It also leverages state-of-the-art transformer-based architectures for text and image classification.
- Conceived and facilitated training seminars on a range of topics in data science and NLP.
- Contributed to the National Administrative Department of Colombia's (DANE) social media data strategy.
- Participated in international forums to present and discuss results and prospects of undertaken tasks.
Data Science Team Lead
UNDP Armenia National SDG Innovation Lab
- Developed supervised and unsupervised language models for Armenian, Russian, and English in various use cases.
- Designed, implemented, and managed end-to-end data science projects for various sectors, including tourism, labor, social services, etc.
- Oversaw and applied novel methods for unconventional data analysis of the sustainable development goals (SDG) implementation in Armenia and other countries.
- Represented Armenia in international forums on data science for international development.
Entrepreneur and Researcher
Impact Hub
- Researched and modeled diversified revenue-sharing approaches for smallholder aggregation to reduce smallholder farmers' supply chain risk in agricultural production.
- Communicated with stakeholders in agriculture, finance, and international development to research, develop, and promote the concept.
- Developed a novel approach for risk management in smallholder agricultural production.
Machine Learning Analyst
Ameriabank
- Built natural language processing models for a virtual call center assistant (chatbot).
- Developed recurrent neural networks and convolutional neural networks to forecast commodity prices, financial market indicators, and product sales.
- Created the novel Product2Vec and Customer2Vec models to predict customer churn.
Serviceman
Ministry of Defense of Republic of Armenia
- Developed code to analyze and visualize tactical, strategic, and administrative data.
- Conducted various tasks related to artillery reconnaissance, collaboration with foreign delegations, research, and speechwriting.
- Coordinated research by experts from MIT and from Harvard, Oxford, and Cambridge universities.
Experience
AI4Mulberry
https://www.sdglab.am/en/projects
The primary challenge in the project was working with low-resource languages and tiny datasets for supervised learning. The framework I designed to overcome this challenge entailed dataset augmentation using machine translation and generative autoregressive language models for paraphrase generation, along with zero-shot classification and fine-tuning of pre-trained transformers such as XLM-RoBERTa.
Travelinsights
https://www.travelinsights.ai/
I contributed to designing the tool to provide public policymakers in the tourism sector with real-time actionable intelligence and historical trend data to render decision-making more data-driven and evidence-based.
Edu2Work
https://edu2work.am/
National Administrative Department of Colombia
My responsibilities in this project involved developing a hybrid NLP-driven methodology to monitor social media data quality and building an end-to-end ETL pipeline that gathers social media data using advanced automation bots and leverages transformer-based architectures for text and image classification.
The project and the insights gathered from it contributed to the social media data strategy of the National Administrative Department of Colombia (DANE).
Education
Bachelor of Science Degree in Mathematics with Economics
University College London | UCL - London, United Kingdom
High School Diploma in Secondary Education
John F. Kennedy Schule - Berlin, Germany
Skills
Libraries/APIs
NumPy, Pandas, Scikit-learn, XGBoost, CatBoost, TensorFlow, PyTorch, Natural Language Toolkit (NLTK), Keras
Tools
Microsoft Excel, Jupyter, Microsoft Power BI, ChatGPT, BigQuery, OpenAI Gym
Languages
Python, SQL
Paradigms
ETL, Business Intelligence (BI), Asynchronous Programming
Platforms
Jupyter Notebook, Ubuntu, Visual Studio Code (VS Code), Google Cloud Platform (GCP), Azure
Storage
Databases, MongoDB, NoSQL
Industry Expertise
Project Management, Insurance
Frameworks
Flask
Other
Natural Language Processing (NLP), Data Scraping, Deep Learning, Machine Learning, EDA, Transformers, Artificial Intelligence (AI), Text Classification, Text Mining, Web Scraping, Dashboards, Data Science, Data Analytics, Predictive Modeling, Data Collection, Data Analysis, Charts, Data Modeling, APIs, Web Crawlers, Scraping, Language Models, Data Processing Automation, MVP Design, Generative Pre-trained Transformers (GPT), Mathematics, Linear Algebra, Graph Theory, Mathematical Analysis, Microeconomics, Macroeconomics, Probability Theory, Statistics, Computer Vision, Data Visualization, Consulting, Time Series Analysis, Geospatial Data, Forecasting, Chatbots, Research, Social Network Analysis, Risk Models, Teamwork, Leadership, Generative Pre-trained Transformer 3 (GPT-3), IT Project Management, Networks, Geospatial Analytics, Hugging Face, Graphs, OpenAI, Time Series, Stable Diffusion, Real Estate, Environment, Economics, Financial Mathematics, Quantitative Risk Modeling, Game Theory, Measure Theory, Supply Chain, International Trade, Entrepreneurship, Market Research & Analysis, History, Physics, English, Languages, Biology, Environmental Science, Art, Knowledge Graphs, Rankings, OpenAI GPT-4 API