Chief Architect
2017 - PRESENT
Rocha Moy Trading
- Developed the API for probabilistic and algorithmic options trading with Interactive Brokers and TD Ameritrade. Specialties include data integration, task automation, portfolio simulations, risk mitigation, and strategy validation.
- Integrated many different data sources, ranging from APIs to web scraping.
- Fully automated trade execution, trade scheduling, and the release of funds for trading.
Technologies: Python, Julia, Amazon Web Services (AWS), Options Trading, APIs, Web Scraping, Probability Theory, Machine Learning, Simulations, Data Integration
Lead Data Scientist
2021 - 2022
Self-employed
- Designed, implemented, and deployed different natural language processing models.
- Worked with stakeholders to understand use cases, the pathway to product development, and implementation using deployed models.
- Mentored and supported junior data scientists on the team.
Technologies: Scikit-learn, SpaCy, Natural Language Processing (NLP), Neural Networks, XGBoost
Enterprise Lead Data Architect - Contractor
2020 - 2022
Toptal Client
- Handled the architecture, development, and automation of distributed computing pipelines and data storage in the cloud for the enterprise.
- Automated scalable infrastructure in the cloud to respond to development and consumer demand.
- Co-managed and supervised a team of engineers, including designing and delegating tasks, mentoring, and overseeing work.
Technologies: Python, AWS EMR, Spark, Snowflake
Enterprise Senior ETL and Data Engineer - Contractor
2019 - 2020
Toptal Client
- Designed, implemented, and deployed to production fully fledged distributed ETL jobs using the Spark Scala API.
- Worked with various sources and sinks of data, including disparate files, Hive tables, Mongo collections, and Kafka brokers.
- Served as the senior engineer and tech lead of the team, strengthening engineering and development processes, improving software quality control, and helping design stories for sprints.
Technologies: Oracle SQL, DocumentDB, Scala, Python, MongoDB, Spark SQL, Spark, Apache Kafka, Hadoop
Hadoop Proof of Concept for Atmospheric Sciences Project - Contractor
2019 - 2020
Toptal Client
- Built a cluster from scratch, adhering to the client's need to work with their home cluster.
- Designed and implemented generic and specific data architectures that met the client's query complexity and performance needs.
- Built PySpark and Python software layers of abstraction to allow the client to build on top of the current infrastructure.
Technologies: PySpark, Hadoop
Research Data Engineer
2018 - 2019
Nicklaus Children’s Hospital
- Developed existing analytical and data workflows for users of R, Python, and Impala, establishing best engineering practices.
- Provided ad hoc and systematically developed ETL and big data pipelines, validation, and integration of varying data sources.
- Served as the research department's liaison to the IT and BI departments, providing guidance and expertise on analytical and data needs.
Technologies: Impala, Hadoop, Spark, Scala, Python
Technical Advisor
2018 - 2018
Insight Data Science
- Worked with fellows and their data engineering projects on problem definition, systems architecture, and execution.
- Advised on technologies such as Spark, Kafka, Redis, HBase, Cassandra, and PostgreSQL.
- Conducted mock interviews with fellows on scalability concepts, algorithms, and CS fundamentals.
Technologies: PostgreSQL, Cassandra, HBase, Redis, Apache Kafka, Spark
Senior Software Engineer
2016 - 2017
NexHealth
- Developed and deployed software to the client's site to perform data collection and server sync.
- Performed both database and web-based data integrations of electronic medical records back to NexHealth servers.
- Developed a smart SMS response system allowing the user to interact with NexHealth products via SMS.
Technologies: Redis, PostgreSQL, Apache Spark, JavaScript, Scala, Python, Ruby on Rails (RoR)
Data Scientist
2016 - 2016
QuaEra Insights
- Served as the lead data scientist in a consulting project overseeing data management and modeling strategy.
- Used natural language processing to transform unstructured data into features and extract business intelligence.
- Built a recommendation engine expressed as business rules, potentially yielding savings on up to 50% of the business.
Technologies: Python
Data Engineering Fellow
2015 - 2015
Insight Data Science
- Built themidgame-tube, a platform designed to discover YouTube influencers on brand names worldwide.
- Deployed Spark on Amazon EMR with HBase, processing and ingesting billions of data tuples.
- Attained linear scalability, tested with up to 20 nodes.
Technologies: Amazon Web Services (AWS), Bootstrap, Hadoop, Apache Spark, Python
Data Analyst
2015 - 2015
Cartesian
- Aided managed analytics efforts, promoting best practices within batch workflows and data management.
- Conducted independent research into big data workflows considering data mining and BI integration.
- Built short data pipelines that consumed APIs, transformed and loaded data, and exposed data connections to BI tools.
Technologies: Alteryx, PostgreSQL, R, Python
Data Analytics Engineer
2013 - 2015
Daktari Diagnostics
- Served as lead developer of mainstream data processing and data analysis applications in Python for Windows/Mac.
- Developed a calibration model for the Daktari CD4 testing device, improving the system's accuracy by 20-30%.
- Deployed machine learning models embedded in standalone applications to end users for data classification.
Technologies: Microsoft SQL Server, JMP, SAS, R, Python