Senior Python Web Scraping Specialist
2022 - 2022
Number Five House Ltd
- Developed and deployed a web scraping pipeline with Python and Selenium.
- Collected profile and network data from a large online social media platform.
- Performed ETL on the data. Used network analysis and machine learning to enrich collected information.
Technologies: Python, Web Scraping, Google Sheets API, Google Sheets

Undergraduate Researcher
2021 - 2022
Imperial College London
- Built a TensorFlow machine learning pipeline using Python to predict the properties of high-energy X-ray pulses at ultrafast rates.
- Deployed machine learning pipelines to the university's high-performance computing cluster using Secure Shell.
- Developed simulations of quantum many-body physics in Python and devised a new measurement scheme to analyze simulations.
- Used advanced statistics and machine learning, including restricted Boltzmann machine neural networks, to extract knowledge from our simulations.
- Co-authored two papers currently in preparation, both applying machine learning to different physics regimes.
Technologies: Python 3, HPCC Systems, Machine Learning, Applied Mathematics, TensorFlow, PyTorch, Data Analytics, Data Visualization, Data Science, Artificial Intelligence (AI), Python, APIs, Jupyter Notebook, Pandas, Pytest, Data Reporting, Statistical Modeling, Web Scraping, Data Analysis, Big Data, Google Sheets API, Google Sheets

Research Intern
2020 - 2020
The Institute of Cancer Research
- Developed an unsupervised learning pipeline to analyze genetic risk factor pathways for brain tumors in adults.
- Wrote and debugged R and Python code to contribute to interdisciplinary research.
- Implemented my pipeline on a high-performance computing cluster.
- Updated the legacy code to use Python 3 instead of Python 2.
- Performed tissue-specific analysis to find significant risk factors that would not be recognized as significant without accounting for tissue differences.
Technologies: R, Python 3, Data Science, Genetics, Machine Learning, Data Analytics, Data Visualization, Artificial Intelligence (AI), Python, SQL, Jupyter Notebook, Pandas, Data Engineering, Pytest, Data Reporting, Statistical Modeling, Tableau, Data Mining, Data Analysis, Big Data, STATA, Google Sheets API, Google Sheets