Data Scientist, 2014 - Present, SEEK, Ltd.
Technologies: Amazon Web Services (AWS), Spark, Hadoop, Python
- Performed data analysis and modeling, using Python, Pandas, and scikit-learn.
- Worked on a team building an ETL pipeline in Haskell.
- Implemented real-time data visualization using D3, Google Maps, and Tableau.
- Built an ETL pipeline moving 20 GB of JSON per day from AWS S3 to HDFS/Impala, using Hive and Hadoop Streaming.
- Crowdsourced a data curation project, using CrowdFlower.
- Built and deployed a batch-scored predictive model that processes six million records per week on SQL Server.
Data Scientist, 2013 - 2014, Predictive Match
Technologies: Amazon Web Services (AWS), RapidMiner, Python
- Built a recommender system for a real estate website, using Python.
- Developed a prioritization model for recommendation algorithms using RapidMiner.
- Implemented data analysis and visualization, using a Python data science stack.
- Performed clustering analysis of users, using scikit-learn.
- Collaborated using tools such as Trello and Bitbucket.