Pedro Henrique Rocha Moy
Verified Expert in Engineering
Machine Learning Developer
Miami, FL, United States
Toptal member since April 25, 2019
Pedro is a business-oriented, seasoned data scientist and data engineer with experience building and deploying distributed data pipelines and machine learning models in production at scale. His work covers the entire data lifecycle: design, construction, optimization, deployment, and monitoring of data architectures and machine learning models. Pedro's focus is to deliver solutions that are robust to changes in environment and data and flexible enough to address changes in business requirements.
Preferred Environment
Python, Scala, Amazon Web Services (AWS), Data Engineering, Data Science, Machine Learning, Big Data, Software Architecture
The most amazing...
...systems I've built are algorithmic and probabilistic trading systems. With a limited view of the world, probabilities are essential tools in risk management.
Work Experience
Chief Architect
Rocha Moy Trading
- Developed the API for probabilistic and algorithmic options trading with Interactive Brokers and TD Ameritrade. Specialties include data integration, task automation, portfolio simulations, risk mitigation, and strategy validation.
- Integrated a range of data sources, from APIs to web scraping.
- Fully automated trade execution, trade scheduling, and the release of funds for trading.
Lead Data Scientist
Self-employed
- Designed, implemented, and deployed different natural language processing models.
- Worked with stakeholders to understand use cases, the pathway to product development, and implementation using deployed models.
- Mentored and supported junior data scientists on the team.
Enterprise Lead Data Architect - Contractor
Toptal Client
- Handled the architecture, development, and automation of distributed computing pipelines and data storage in the cloud for the enterprise.
- Automated scalable infrastructure in the cloud to respond to development and consumer demand.
- Co-managed and supervised a team of engineers, including designing and delegating tasks, mentoring, and overseeing work.
Enterprise Senior ETL and Data Engineer - Contractor
Toptal Client
- Designed, implemented, and deployed to production full-fledged distributed ETL jobs using the Spark Scala API.
- Worked with various sources and sinks of data, including disparate files, Hive tables, Mongo collections, and Kafka brokers.
- Served as the senior engineer and tech lead of the team, strengthening engineering and development processes, improving software quality control, and helping design stories for sprints.
Hadoop Proof of Concept for Atmospheric Sciences Project - Contractor
Toptal Client
- Built a cluster from scratch, adhering to the client's need to work with their home cluster.
- Designed and implemented generic and specific data architectures meeting the client's query complexity and performance needs.
- Built PySpark and Python software layers of abstraction to allow the client to build on top of the current infrastructure.
Research Data Engineer
Nicklaus Children’s Hospital
- Developed existing analytical and data workflows for users of R, Python, and Impala, establishing best engineering practices.
- Provided ad hoc and systematically developed ETL and big data pipelines, validation, and integration of varying data sources.
- Served as the research department's liaison to the IT and BI departments, providing guidance and expertise on analytical and data needs.
Technical Advisor
Insight Data Science
- Worked with fellows and their data engineering projects on problem definition, systems architecture, and execution.
- Advised on technologies such as Spark, Kafka, Redis, HBase, Cassandra, and PostgreSQL.
- Conducted mock interviews with fellows on scalability concepts, algorithms, and CS fundamentals.
Senior Software Engineer
NexHealth
- Developed and deployed software to the client's site to perform data collection and server sync.
- Performed both database and web-based data integrations of electronic medical records back to NexHealth servers.
- Developed a smart SMS response system allowing the user to interact with NexHealth products via SMS.
Data Scientist
QuaEra Insights
- Served as the lead data scientist in a consulting project, overseeing data management and modeling strategy.
- Used natural language processing to transform unstructured data into features and extract business intelligence.
- Built a recommendation engine expressed as business rules, potentially yielding savings on up to 50% of the business.
Data Engineering Fellow
Insight Data Science
- Built themidgame-tube, a platform designed to discover YouTube influencers for brand names worldwide.
- Deployed Spark on Amazon EMR with HBase, processing and ingesting billions of data tuples.
- Attained linear scalability, tested with up to 20 nodes.
Data Analyst
Cartesian
- Aided managed analytics efforts, promoting best practices within batch workflows and data management.
- Conducted independent research into big data workflows considering data mining and BI integration.
- Built short data pipelines consuming APIs, transforming, loading, and exposing data connections to BI tools.
Data Analytics Engineer
Daktari Diagnostics
- Worked as the lead developer of mainstream data processing and data analysis applications in Python for Windows/Mac.
- Developed a calibration model for the Daktari CD4 testing device, improving the system's accuracy by 20-30%.
- Deployed machine learning models embedded in standalone applications to end users for data classification.
Experience
Continuous Edging and Hedging Equity Trading Strategy
https://docs.google.com/presentation/d/1zkbfErfwbJvGBXFj9UWKDvq99wkj6EBvqniA4yFNu68/edit?usp=sharing
Education
Executive MBA in Business Administration
University of Miami - Miami
Master's Degree in Computer Science (Machine Learning)
Georgia Institute of Technology - Atlanta, GA
Master's Degree in Earth Science and Engineering (Geophysics)
King Abdullah University of Science and Technology - Saudi Arabia
Bachelor's Degree in Mechanical Engineering
University of Massachusetts Lowell - Lowell, MA
Skills
Libraries/APIs
Microsoft HPC, PySpark, TensorFlow, PyTorch, Scikit-learn, XGBoost, Dask, SpaCy
Tools
ChatGPT, Amazon Elastic MapReduce (EMR), Spark SQL, JMP, Impala, Git, Gensim
Languages
Python, Julia, Scala, SQL, R, SAS, JavaScript, Bash, Snowflake
Storage
NoSQL, MongoDB, Oracle SQL, Microsoft SQL Server, Redis, Cassandra, PostgreSQL, HBase, Apache Hive, Data Integration
Paradigms
Functional Programming, Parallel Programming, Distributed Computing
Platforms
Docker, Jupyter Notebook, Apache Kafka, Alteryx, Linux, Amazon Web Services (AWS)
Frameworks
Bootstrap, Ruby on Rails (RoR), Apache Spark, Flask, Hadoop, Streamlit
Industry Expertise
Accounting
Other
Machine Learning, Distributed Systems, OpenAI GPT-4 API, Financial Modeling, Web App UI, APIs, Data Architecture, Data Modeling, DocumentDB, Dash, Deep Learning, Natural Language Processing (NLP), Data Science, Data Engineering, Artificial Intelligence (AI), Algorithms, Algorithmic Trading, Optimization, Reinforcement Learning, Time Series Analysis, Forecasting, Cloud, Numerical Optimization, Sentiment Analysis, Neural Networks, Options Trading, Web Scraping, Probability Theory, Simulations, Finance, Law, Entrepreneurship, Leadership, Big Data, Software Architecture, Generative Pre-trained Transformers (GPT), Data Analytics, Managed Analytics