- CTO | 2017 - PRESENT | Bot MD
Technologies: Python, Django, Web Sockets, Chatbots, PostgreSQL
- Led and managed a remote team of two back-end engineers, two Android engineers, and several freelancers for Bot MD, a clinical AI assistant for doctors, as part of Y Combinator's S18 batch.
- Spearheaded the development of a full-featured Android chat application with various productivity features for doctors.
- Built the chat engine from scratch, leveraging my deep understanding of linguistics and NLP.
- Technical Advisor | 2016 - PRESENT | Stravito
Technologies: Python, Elasticsearch, NLP, Machine Learning, Information Retrieval, Search Engines
- Provided technical expertise and advised on extracting information from unstructured text.
- Led a team of two engineers to build a customized text search algorithm for market research documents.
- Scientist | 2016 - PRESENT | Institute for Infocomm Research
Technologies: Machine Learning, NLP
- Researched novel techniques for improving state-of-the-art NLP systems.
- Advisor and Data Scientist in Residence | 2016 - PRESENT | Intelllex
Technologies: NLP, Information Retrieval, Machine Learning
- Advised and collaborated with the engineering team on topics and techniques related to natural language processing, information retrieval, and machine learning.
- Provided domain knowledge and input on product roadmaps.
- Technical Advisor | 2015 - PRESENT | AirPR, Inc.
Technologies: Python, Scala, Spark, Elasticsearch, Flask, Ruby on Rails, Java, MySQL, MongoDB, AWS, AWS Elastic MapReduce
- Built an automatic key phrase extraction module for PR news (soundbites).
- Designed customized author ranking algorithms for LinkedIn publishers using social and influence metrics.
- Improved the Elasticsearch relevance ranking algorithm by designing custom features and metrics, raising the relevance ranking of results by 30%.
- Implemented a state-of-the-art customized sentiment classifier for Tweets using crowdsourcing and ensemble methods.
- Built a data processing pipeline for handling millions of articles using Spark and Elasticsearch.
- Built an NLP pipeline for processing millions of news articles.
- Utilized techniques that included logistic regression, support vector machines (SVM), random forests, and ensemble methods.
- Visiting PhD Scholar | 2015 - 2016 | University of Washington
- Performed a variety of academic duties as a scholar in residence with the University of Washington Computer Science and Engineering department.
- Graduate Research Assistant | 2011 - 2016 | Carnegie Mellon University
Technologies: Java, C++, Python, Julia, LaTeX
- Assisted with the courses Introduction to Natural Language Processing (NLP) and Graduate Seminar on Advanced NLP.
- Pursued research interests in Machine Learning (ML), Natural Language Processing (NLP), and Computational Social Science (CSS).
- Applied NLP techniques to text mining and information extraction tasks.
- Built tools for the automatic discovery and analysis of decision making in the U.S. Supreme Court.
- Built tools to help political scientists analyze and explore speeches of U.S. presidential candidates.
- Gained expert knowledge of statistical models, probabilistic graphical models, MCMC and variational methods, deep learning, and topic modeling.
- Research Intern | 2013 - 2013 | Google, Inc.
Technologies: C++, Borg
- Worked with the Google Knowledge team to improve their state-of-the-art NLP pipeline.
- Proposed and implemented a novel model for joint inference on named entity recognition/tagging and coreference resolution.
- Developed efficient algorithms for performing inference in high-dimensional combinatorial spaces using dual decomposition.
- Utilized techniques including dual decomposition, support vector machines (SVM), conditional random fields (CRF), and graphical models.
- Research Officer | 2010 - 2011 | Institute for Infocomm Research
Technologies: Java, UIMA
- Built a state-of-the-art entity resolution system by leveraging unsupervised latent topic features.
- Designed a robust, high-precision acronym identification module using carefully crafted features.
- Ranked #3 in the 2011 Knowledge Base Population shared task.
- Utilized SVM, Naive Bayes, and Latent Dirichlet Allocation topic modeling, with UIMA as the framework for the NLP pipeline.