Lead Data Scientist | 2021 - PRESENT | Self-employed
Technologies: Scikit-learn, SpaCy, Natural Language Processing (NLP), Neural Networks, XGBoost
- Designed, implemented, and deployed a range of natural language processing models.
- Worked with stakeholders to understand use cases, the pathway to product development, and implementation using deployed models.
- Mentored and supported junior data scientists on the team.
Chief Architect | 2017 - PRESENT | Rocha Moy Trading
Technologies: Python, Julia, AWS, Options Trading, APIs, Web Scraping, Probability Theory, Machine Learning, Simulations, Data Integration
- Developed the API for probabilistic and algorithmic options trading with Interactive Brokers and TD Ameritrade. Specialties include data integration, task automation, portfolio simulations, risk mitigation, and strategy validation.
- Integrated data from many different sources, ranging from APIs to web scraping.
- Completely automated trade execution, scheduling of trades, and release of funds for trading.
Enterprise Lead Data Architect - Contractor | 2020 - 2021 | Toptal Client
Technologies: Python, AWS EMR, Spark, Snowflake
- Handled the architecture, development, and automation of distributed computing pipelines and data storage in the cloud for the enterprise.
- Automated scalable infrastructure in the cloud to respond to development and consumer demand.
- Co-managed and supervised a team of engineers, including designing and delegating tasks, mentoring, and overseeing work.
Enterprise Senior ETL and Data Engineer - Contractor | 2019 - 2020 | Toptal Client
Technologies: Oracle SQL, DocumentDB, Scala, Python, MongoDB, Spark SQL, Spark, Apache Kafka, Hadoop
- Designed, implemented, and deployed to production fully fledged distributed ETL jobs using the Spark Scala API.
- Worked with various sources and sinks of data, including disparate files, Hive tables, Mongo collections, and Kafka brokers.
- Served as the senior engineer and tech lead of the team, strengthening engineering and development processes, improving software quality control, and helping design stories for sprints.
Hadoop Proof of Concept for Atmospheric Sciences Project - Contractor | 2019 - 2020 | Toptal Client
Technologies: PySpark, Hadoop
- Built a cluster from scratch, adhering to the client's needs, to work with their home cluster.
- Designed and implemented generic and specific data architectures that met the client's query complexity and performance needs.
- Built PySpark and Python software layers of abstraction to allow the client to build on top of the current infrastructure.
Research Data Engineer | 2018 - 2019 | Nicklaus Children’s Hospital
Technologies: Impala, Hadoop, Spark, Scala, Python
- Further developed existing analytical and data workflows for users of R, Python, and Impala, establishing best engineering practices.
- Provided both ad hoc and systematically developed ETL and big data pipelines, along with validation and integration of varying data sources.
- Served as the research department's liaison to the IT and BI departments, providing guidance and expertise on analytical and data needs.
Technical Advisor | 2018 - 2018 | Insight Data Science
Technologies: PostgreSQL, Cassandra, HBase, Redis, Apache Kafka, Spark
- Worked with fellows and their data engineering projects on problem definition, systems architecture, and execution.
- Advised on technologies such as Spark, Kafka, Redis, HBase, Cassandra, and PostgreSQL.
- Conducted mock interviews with fellows on scalability concepts, algorithms, and CS fundamentals.
Senior Software Engineer | 2016 - 2017 | NexHealth
- Developed and deployed software to the client's site to perform data collection and server sync.
- Performed both database and web-based data integrations of electronic medical records back to NexHealth servers.
- Developed a smart SMS response system, allowing the user to interact with NexHealth products via SMS.
Data Scientist | 2016 - 2016 | QuaEra Insights
- Served as the lead data scientist on a consulting project, overseeing data management and modeling strategy.
- Used natural language processing to transform unstructured data into features and extract business intelligence.
- Built a recommendation engine expressed as business rules, potentially yielding savings on up to 50% of the business.
Data Engineering Fellow | 2015 - 2015 | Insight Data Science
Technologies: Amazon Web Services (AWS), Bootstrap, Hadoop, Apache Spark, Python
- Built themidgame-tube, a platform designed to discover YouTube influencers for brand names worldwide.
- Deployed Spark on Amazon EMR with HBase, ingesting and processing billions of data tuples.
- Attained linear scalability, tested with up to 20 nodes.
Data Analyst | 2015 - 2015 | Cartesian
Technologies: Alteryx, PostgreSQL, R, Python
- Supported managed analytics efforts, promoting best practices within batch workflows and data management.
- Conducted independent research into big data workflows, considering data mining and BI integration.
- Built short data pipelines that consumed APIs, transformed and loaded the data, and exposed data connections to BI tools.
Data Analytics Engineer | 2013 - 2015 | Daktari Diagnostics
Technologies: Microsoft SQL Server, JMP, SAS, R, Python
- Served as lead developer of mainstream data processing and data analysis applications in Python for Windows/Mac.
- Developed a calibration model for the Daktari CD4 testing device, improving the system's accuracy by 20-30%.
- Deployed machine learning models embedded in standalone applications to end users for data classification.