Pedro Henrique Rocha Moy, Machine Learning Developer in Miami Beach, FL, United States
Member since January 26, 2019
Pedro is a hybrid data scientist-engineer with more than six years of experience building data processing pipelines and machine learning models. His specialties include big data, natural language processing, reinforcement learning, algorithmic trading, time-series analysis and forecasting, and numerical optimization. He provides fully automated or hybrid solutions that meet business needs for accuracy, transparency, and interpretability.

Portfolio

  • Self-employed
    Scikit-learn, SpaCy, Natural Language Processing (NLP), Neural Networks, XGBoost
  • Rocha Moy Trading
    Python, Julia, AWS, Options Trading, APIs, Web Scraping, Probability Theory...
  • Toptal Client
    Python, AWS EMR, Spark, Snowflake

Location

Miami Beach, FL, United States

Availability

Full-time

Preferred Environment

Jupyter Notebook, Git, Linux

The most amazing...

...project I have worked on is my current trading system, which employs concepts from evolutionary and genetic computing to adapt to financial markets in real time.

Employment

  • Lead Data Scientist

    2021 - PRESENT
    Self-employed
    • Designed, implemented, and deployed different natural language processing models.
    • Worked with stakeholders to understand use cases, the pathway to product development, and implementation using deployed models.
    • Mentored and supported junior data scientists on the team.
    Technologies: Scikit-learn, SpaCy, Natural Language Processing (NLP), Neural Networks, XGBoost
  • Chief Architect

    2017 - PRESENT
    Rocha Moy Trading
    • Developed the API for probabilistic and algorithmic options trading with Interactive Brokers and TD Ameritrade. Specialties include data integration, task automation, portfolio simulations, risk mitigation, and strategy validation.
    • Integrated many different data sources from APIs to web scraping.
    • Completely automated trade execution, scheduling of trades, and release of funds for trading.
    Technologies: Python, Julia, AWS, Options Trading, APIs, Web Scraping, Probability Theory, Machine Learning, Simulations, Data Integration
  • Enterprise Lead Data Architect-Contractor

    2020 - 2021
    Toptal Client
    • Handled the architecture, development, and automation of distributed computing pipelines and data storage in the cloud for the enterprise.
    • Automated scalable infrastructure in the cloud to respond to development and consumer demand.
    • Co-managed and supervised a team of engineers, including designing and delegating tasks, mentoring, and overseeing work.
    Technologies: Python, AWS EMR, Spark, Snowflake
  • Enterprise Senior ETL and Data Engineer - Contractor

    2019 - 2020
    Toptal Client
    • Designed, implemented, and deployed to production full-fledged distributed ETL jobs using the Spark Scala API.
    • Worked with various sources and sinks of data, including disparate files, Hive tables, Mongo collections, and Kafka brokers.
    • Served as the senior engineer and tech lead of the team, strengthening engineering and development processes, improving software quality control, and helping design stories for sprints.
    Technologies: Oracle SQL, DocumentDB, Scala, Python, MongoDB, Spark SQL, Spark, Apache Kafka, Hadoop
  • Hadoop Proof of Concept for Atmospheric Sciences Project - Contractor

    2019 - 2020
    Toptal Client
    • Built a cluster from scratch, adhering to the client's need to work with its home cluster.
    • Designed and implemented generic and specific data architectures meeting the client's query complexity and performance needs.
    • Built PySpark and Python software layers of abstraction to allow the client to build on top of the current infrastructure.
    Technologies: PySpark, Hadoop
  • Research Data Engineer

    2018 - 2019
    Nicklaus Children’s Hospital
    • Developed existing analytical and data workflows for users of R, Python, and Impala, establishing best engineering practices.
    • Provided ad hoc and systematically developed ETL and big data pipelines, validation, and integration of varying data sources.
    • Served as the research department's liaison to the IT and BI departments, providing guidance and expertise on analytical and data needs.
    Technologies: Impala, Hadoop, Spark, Scala, Python
  • Technical Advisor

    2018 - 2018
    Insight Data Science
    • Worked with fellows and their data engineering projects on problem definition, systems architecture, and execution.
    • Advised on technologies such as Spark, Kafka, Redis, HBase, Cassandra, and PostgreSQL.
    • Conducted mock interviews with fellows on scalability concepts, algorithms, and CS fundamentals.
    Technologies: PostgreSQL, Cassandra, HBase, Redis, Apache Kafka, Spark
  • Senior Software Engineer

    2016 - 2017
    NexHealth
    • Developed and deployed software to the client's site to perform data collection and server sync.
    • Performed both database and web-based data integrations of electronic medical records back to NexHealth servers.
    • Developed a smart SMS response system allowing the user to interact with NexHealth products via SMS.
    Technologies: Redis, PostgreSQL, Apache Spark, JavaScript, Scala, Python, Ruby on Rails (RoR)
  • Data Scientist

    2016 - 2016
    QuaEra Insights
    • Served as the lead data scientist in a consulting project overseeing data management and modeling strategy.
    • Used natural language processing to transform unstructured data into features and extract business intelligence.
    • Built a recommendation engine expressed as business rules, potentially yielding savings on up to 50% of the business.
    Technologies: Python
  • Data Engineering Fellow

    2015 - 2015
    Insight Data Science
    • Built the themidgame-tube, a platform designed to discover YouTube influencers on brand names worldwide.
    • Deployed Spark on Amazon EMR with HBase, processing and ingesting billions of data tuples.
    • Attained linear scalability in performance tests with up to 20 nodes.
    Technologies: Amazon Web Services (AWS), Bootstrap, Hadoop, Apache Spark, Python
  • Data Analyst

    2015 - 2015
    Cartesian
    • Aided managed analytics efforts, promoting best practices within batch workflows and data management.
    • Conducted independent research into big data workflows considering data mining and BI integration.
    • Built short data pipelines consuming APIs, transforming, loading, and exposing data connections to BI tools.
    Technologies: Alteryx, PostgreSQL, R, Python
  • Data Analytics Engineer

    2013 - 2015
    Daktari Diagnostics
    • Served as the lead developer of mainstream data processing and data analysis applications in Python for Windows/Mac.
    • Developed a calibration model for the Daktari CD4 testing device, improving the system's accuracy by 20-30%.
    • Deployed machine learning models embedded in standalone applications to end users for data classification.
    Technologies: Microsoft SQL Server, JMP, SAS, R, Python

Experience

  • Continuous Edging and Hedging Equity Trading Strategy
    https://docs.google.com/presentation/d/1zkbfErfwbJvGBXFj9UWKDvq99wkj6EBvqniA4yFNu68/edit?usp=sharing

    This investigation explores reinforcement learning agents as a means of generating a diversified set of strategies that guarantees an optimal strategy exists for any market condition. The preliminary result demonstrates that the pool of agents provides the desired diversity, transforming the algorithmic trading challenge into a problem of selection (which may be tackled with AI methods such as evolutionary computing).

Skills

  • Languages

    Python, Julia, Scala, SQL, R, SAS, JavaScript, Bash, Snowflake
  • Storage

    NoSQL, MongoDB, Oracle SQL, Microsoft SQL Server, Redis, Cassandra, PostgreSQL, HBase, Apache Hive, Data Integration
  • Other

    Machine Learning, Distributed Systems, NLP, APIs, Data Architecture, Data Modeling, spaCy, DocumentDB, AWS, Dash, Deep Learning, Natural Language Processing (NLP), Data Engineering, Artificial Intelligence (AI), Algorithms, Algorithmic Trading, Optimization, Reinforcement Learning, Time Series Analysis, Forecasting, Cloud, Numerical Optimization, Sentiment Analysis, Streamlit, Neural Networks, Options Trading, Web Scraping, Probability Theory, Simulations
  • Frameworks

    AWS EMR, Bootstrap, Ruby on Rails (RoR), Spark, Apache Spark, Flask, Hadoop
  • Libraries/APIs

    PySpark, TensorFlow, PyTorch, Scikit-learn, XGBoost, Dask, SpaCy
  • Tools

    Spark SQL, JMP, Impala, Git, Gensim
  • Paradigms

    Parallel Programming, Distributed Computing, Data Science
  • Platforms

    Jupyter Notebook, Apache Kafka, Alteryx, Linux, Amazon Web Services (AWS)

Education

  • Master's degree in Computer Science (Machine Learning)
    2015 - 2017
    Georgia Institute of Technology - Atlanta, GA
  • Master's degree in Earth Science and Engineering (Geophysics)
    2010 - 2012
    King Abdullah University of Science and Technology - Saudi Arabia
  • Bachelor's degree in Mechanical Engineering
    2008 - 2010
    University of Massachusetts Lowell - Lowell, MA
