Rachel Park

Software Developer in Los Angeles, CA, United States

Member since April 21, 2020
Rachel is a big data professional experienced in various domains including robotics, biotech R&D, entertainment/media, and healthcare. With 8 years of experience in technologies for data mining and machine learning, she's a proactive leader with strengths in communication and collaboration. She is proficient in leveraging the Hadoop-Spark ecosystem, using ML knowledge to promote automated data solutions, and managing concurrent objectives to promote efficiency and influence positive outcomes.







Preferred Environment

Sublime Text, Anaconda, Atom, PyCharm, Visual Studio Code

The most amazing...

...thing I've developed is a machine/deep learning-based automatic query generator. It utilizes Spark MLlib and Keras.


Experience
  • Data Engineer

    2019 - 2020
    Hart Inc.
    • Designed and developed an ML-based health data search engine via automated schema prediction.
    • Modified existing databases to meet the unique needs and goals identified during the initial evaluation and planning process.
    • Wrote scripts and processes for data integration and bug fixes in Python, Scala, and Java.
    • Planned, engineered, configured, and deployed ML tooling and big data solutions while supporting software in a Hadoop-Spark ecosystem.
    Technologies: Apache Airflow, Scala, Python, Spark, Hadoop, PySpark
  • TechOps Engineer

    2018 - 2019
    Telescope Inc.
    • Built business logic for voting applications and directly supported some of the world's largest live shows, such as The Voice, American Idol, and Dancing with the Stars.
    • Advised clients and delivered versatile big data solutions using AWS (DynamoDB, S3, EC2, etc.) to meet their needs.
    • Wrote unit tests in Python to automate product validation.
    • Researched and developed the integration of smart home devices into the existing platform, provided prototypes as proofs of concept to executives and clients, and improved profit margins by launching new add-on projects for existing clients.
    Technologies: Amazon Web Services (AWS), Apache Kafka, Flume, REST APIs, RESTful Development, XML, SQL, JavaScript, Spark, Python
  • Development Engineer/Engineering Consultant

    2016 - 2018
    UCLA, various startups (Vortex Biosciences Inc., Ferrologix Inc., etc.)
    • Wrote code to automate statistical analysis on vision data (live/recorded microscopic images/videos) using data science and computer vision tools in Python, MATLAB, and R.
    • Remotely trained employees to use this code.
    • Addressed R&D issues from a data mining perspective while developing microfluidics platforms for medical applications.
    • Authored publications in peer-reviewed scientific journals.
    • Consulted for startup companies and assisted the director with project management.
    Technologies: Amazon Web Services (AWS), R, MATLAB, Linux, Tableau, SQL, Python
  • Robotics Researcher

    2014 - 2016
    • Developed algorithms to be tested on custom humanoid platforms using Simulink, Python, C++, Lua, ROS, LabVIEW, and COMSOL.
    • Maintained robot platforms using CAD, 3D rapid prototyping, and a CNC mill.
    • Competed in the DARPA Robotics Challenge Finals as Team THOR (USA, June 2015).
    • Competed in RoboCup as Team THORwIn (China, July 2015), winning 1st place in the adult-sized humanoid open platform league.
    • Served as the robotics education outreach activity coordinator.
    Technologies: Amazon Web Services (AWS), Java, Robot Operating System (ROS), C++, R, MATLAB, Python


Projects
  • Data Type Predictor

    Utilizing Spark MLlib, it predicts the data types of fields in a given database. It is useful when the inferred data type is inaccurate or insufficient, and it also serves as a key step in discovering primary-key/foreign-key relationships.

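    The profile doesn't spell out the predictor's internals, but the featurization idea behind ML-based type prediction can be sketched in plain Python. The function names and the rule-based fallback below are illustrative assumptions; the actual tool reportedly trains a model on Spark MLlib rather than using fixed rules.

    ```python
    from datetime import datetime

    def infer_type(values):
        """Guess a column's data type from a sample of string values.

        A stdlib-only stand-in: the fraction of values parseable as each
        candidate type would become features for a classifier in the real
        tool; here we simply pick the most specific type that fits all.
        """
        def parseable(v, cast):
            try:
                cast(v)
                return True
            except (ValueError, TypeError):
                return False

        n = len(values) or 1
        scores = {
            "integer": sum(parseable(v, int) for v in values) / n,
            "float": sum(parseable(v, float) for v in values) / n,
            "date": sum(parseable(v, lambda s: datetime.strptime(s, "%Y-%m-%d"))
                        for v in values) / n,
        }
        # Prefer the most specific type that fits every sampled value.
        for dtype in ("integer", "date", "float"):
            if scores[dtype] == 1.0:
                return dtype
        return "string"
    ```

    This matters for the primary/foreign key use case: a column stored as strings but uniformly parseable as integers is a much stronger key candidate than its declared type suggests.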
  • Semantic Type Predictor

    This is a PySpark implementation of the deep learning tool Sherlock. It predicts the semantic type of fields in a given database and is useful for cleaning data and matching schemas.

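    As a rough, stdlib-only illustration of the kind of character-level statistics a Sherlock-style semantic type model consumes (the published Sherlock feature vector is far larger; the names and feature choices here are assumptions for illustration only):

    ```python
    from collections import Counter
    from statistics import mean

    def char_features(values):
        """Simple character-level features over a column's values,
        of the sort fed to a semantic-type classifier."""
        joined = "".join(values)
        counts = Counter(joined)
        total = len(joined) or 1
        return {
            "frac_digits": sum(c for ch, c in counts.items() if ch.isdigit()) / total,
            "frac_alpha": sum(c for ch, c in counts.items() if ch.isalpha()) / total,
            "mean_len": mean(len(v) for v in values),
        }
    ```

    For example, a column of US ZIP codes yields all-digit, fixed-length features, which a trained model can distinguish from phone numbers or street addresses even when all three are stored as strings.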
  • Single Cell Image Identifier

    Given blood samples from terminal cancer patients, the goal is to identify circulating tumor cells among other cells (mostly white blood cells) and debris by analyzing cell morphology and applying machine learning techniques.
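    A minimal sketch of one morphology feature such a pipeline might compute, assuming segmented binary masks as input (the project's actual feature set is not described here, so these functions are illustrative):

    ```python
    import math

    def mask_area(mask):
        """Pixel count of a binary mask (rows of 0/1 values)."""
        return sum(sum(row) for row in mask)

    def circularity(area, perimeter):
        """4*pi*A / P^2: equals 1.0 for a perfect circle and drops toward 0
        for irregular shapes, making it a cheap discriminator between
        round cells and ragged debris."""
        return 4 * math.pi * area / perimeter ** 2
    ```

    Features like area and circularity, computed per segmented object, would then feed a classifier that separates tumor cells from white blood cells and debris.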


Skills
  • Languages

    Python 3, SQL, XML, Scala, Java, JavaScript, R, C++
  • Frameworks

    Spark, Hadoop, Django
  • Libraries/APIs

    Spark ML, PySpark, REST APIs
  • Tools

    Apache Airflow, Spark SQL, MATLAB, PyCharm, Atom, Sublime Text, Flume, Tableau
  • Paradigms

    ETL, RESTful Development
  • Platforms

    Amazon Web Services (AWS), Docker, Kubernetes, Visual Studio Code, Linux, Databricks, Anaconda, Apache Kafka
  • Storage

    MySQL, PostgreSQL
  • Other

    Robot Operating System (ROS), Machine Learning, Deep Learning


Education
  • Master of Science Degree in Mechanical Engineering
    2014 - 2016
    University of California, Los Angeles - Los Angeles, CA
  • Bachelor of Science Degree in Biomedical Engineering
    2009 - 2013
    Johns Hopkins University - Baltimore, MD


Certifications
  • Big Data Hadoop Certification
