Senior Staff | Data Science and Analytics
2019 - 2021
Doximity
- Developed NLP models to extract features from medical literature and match the content with users.
- Led the dimensional modeling team in structuring company-wide data.
- Used company data to tailor the product toward company goals.
Technologies: Python, Pandas, NumPy, A/B Testing, SciPy, Jupyter, Scikit-learn, NLTK, Natural Language Processing (NLP), Data Visualization, Looker, Snowflake, Amazon SageMaker, Machine Learning, BERT, Word2Vec, Torch, PyTorch, Deep Learning, ETL, Dimensional Modeling, Predictive Modeling, Data Analytics, Amazon Web Services (AWS), Business Intelligence (BI), Data Analysis, Agile Deployment, Big Data, Apache Airflow, Data Pipelines, Spark, PyCharm, Data Modeling, Project Management, People Management, Dask, APIs, Clustering, Python 3

Principal Data Scientist | Head of R&D Data Analytics Center
2016 - 2019
Siemens
- Recruited, hired, and led an international, multicultural team of 15 data experts, some of whom had direct reports of their own.
- Organized international teams in Japan, Singapore, Hong Kong, and Europe.
- Managed simultaneous project proposals, budgets, and timelines.
- Developed and maintained scalable ETL systems to ingest, process, and analyze petabyte-scale data streams from diverse sources.
- Produced interactive data visualization products with real-time analytics for resource allocation, predictive maintenance, and improvements to odometric systems.
- Authored intellectual property applications and academic publications.
Technologies: Amazon Web Services (AWS), Tableau, PostgreSQL, Python, Data Analytics, Data Visualization, Pandas, Scikit-learn, Looker, Natural Language Processing (NLP), SQL, Data Science, Snowflake, Time Series Analysis, NumPy, Amazon SageMaker, Torch, TensorFlow, BERT, Business Intelligence (BI), ETL, Machine Learning, Predictive Modeling, Time Series, People Management, Agile Deployment, Project Management, Big Data, Deep Learning, Data Analysis, SciPy, Redshift, Dimensional Modeling, Jupyter, Data Pipelines, pgAdmin, PyCharm, Data Modeling, Luigi, Dask, Tableau Server, APIs, Clustering, Neural Networks, Python 3, GIS

Head of Data Science
2015 - 2016
CrossEngage
- Recruited, hired, and led a team of ten data specialists.
- Designed an artificial intelligence technique for predicting and directing customer behavior.
- Invented new techniques for machine learning with imbalanced datasets, extending the SMOTE algorithm for online marketing.
- Presented our data products to clients and investors.
Technologies: R, Spark, Cassandra, PostgreSQL, Python, Data Analytics, Data Visualization, Pandas, Scikit-learn, Natural Language Processing (NLP), SQL, Data Science, Time Series Analysis, NumPy, Amazon Web Services (AWS), TensorFlow, Business Intelligence (BI), ETL, Machine Learning, Predictive Modeling, Time Series, People Management, A/B Testing, Data Analysis, SciPy, Big Data, Jupyter, Data Pipelines, PyCharm, Data Modeling, Deep Learning, Project Management, APIs, Clustering, Neural Networks, Python 3

Data Scientist
2013 - 2015
Rocket Internet
- Consulted for portfolio companies (HelloFresh, Zalando, Carmudi) to identify new opportunities to use their data.
- Oversaw the construction of scalable ETL pipelines to plan for growth.
- Created price prediction models for the global used vehicle market.
- Worked with product owners to determine KPIs with automated reporting.
- Segmented customers for marketing and churn prediction.
Technologies: Python, NumPy, SciPy, Machine Learning, R, PostgreSQL, Data Analysis, Data Visualization, Data Analytics, Business Intelligence (BI), ETL, Predictive Modeling, Big Data, Algorithmic Trading, Data Pipelines, PyCharm, Data Modeling, Deep Learning, Amazon Web Services (AWS), APIs, Clustering, Neural Networks, Python 3

Postdoctoral Researcher
2011 - 2013
Wageningen University & Research
- Supervised a team of six graduate students and managed international projects with teams from Japan, Egypt, Israel, Canada, the USA, and Europe.
- Applied machine learning to chemical systems to produce biochemical sensors.
- Acquired and analyzed data from "artificial noses" used as diagnostic instruments.
- Authored research articles in top journals and served as a peer reviewer.
Technologies: MATLAB, Python, Data Analytics, Data Visualization, SQL, Data Science, Machine Learning, Predictive Modeling, Data Analysis, Data Modeling, People Management, Python 3