David Ramirez Molina, Developer in Rosenberg, TX, United States

David Ramirez Molina

Verified Expert in Engineering

Cluster Management Developer

Location
Rosenberg, TX, United States
Toptal Member Since
March 19, 2024

David is a senior Linux system engineer in high-performance computing (HPC) with over ten years of experience developing, integrating, validating, and supporting HPC facilities across diverse industries and applications. He has worked for major companies such as Dell, where he designed and developed HPC prototype clusters and created technical documentation and collateral. David is Red Hat-certified, demonstrating his system administration and configuration management expertise.

Portfolio

TotalEnergies
Slurm Workload Manager, Rocky Linux, xCAT, VMware, Networking, Unix...
Dell
Red Hat Linux, Slurm Workload Manager, Networking, Unix, System Administration...
High Performance Computing Solutions - X-ISS
Cluster Management, xCAT, Linux, AWS IoT, VMware, Networking, Unix...

Availability

Full-time

Preferred Environment

Linux, Red Hat Linux

The most amazing...

...thing I've achieved is writing from scratch and maintaining an effective configuration management policy set to automate system operations.

Work Experience

HPC Linux System Administrator

2023 - 2024
TotalEnergies
  • Acted as the co-administrator of a multi-cluster HPC facility supporting research and development projects in oil, gas, and wind simulations and models.
  • Evaluated new applications and technologies, and handled proof-of-concept development, scientific software builds, benchmarks, and systems lifecycle management.
  • Participated in deploying xCAT-automated cluster environments with InfiniBand and Ethernet connectivity and with Lustre and General Parallel File System (GPFS) storage back-ends, under CentOS and Rocky Linux.
  • Handled system administration tasks, including hardware commissioning with Nvidia GPUs, Mellanox IB cards, cabling, respective software drivers, validation, and troubleshooting. Managed SLURM administration, account maintenance, and end-user support.
Technologies: Slurm Workload Manager, Rocky Linux, xCAT, VMware, Networking, Unix, System Administration, Cluster Management, Bash, Microsoft HPC, Cluster, Bash Script, Databases, Software Engineering, Red Hat Enterprise Linux, Programming, Confluence, VMware ESXi

Principal Systems Design Engineer

2022 - 2023
Dell
  • Handled the design, development, integration, validation, and benchmarking of HPC prototype clusters, as well as technical documentation and collateral.
  • Worked on the commissioning and lifecycle management of a cluster environment, including software builds, modules, and comparative tests, in continuous interaction with developers on the vendor side.
  • Carried out operations under Bright Cluster Manager and SLURM, with message passing interface (MPI), on a Red Hat Enterprise Linux 8 (RHEL 8) platform with a module-based software architecture.
  • Troubleshot and escalated issues, liaising with external testers.
Technologies: Red Hat Linux, Slurm Workload Manager, Networking, Unix, System Administration, Cluster Management, Bash, Microsoft HPC, Cluster, Bash Script, Software Engineering, Red Hat Enterprise Linux, Programming, Confluence

Systems Analyst (HPC)

2021 - 2021
High Performance Computing Solutions - X-ISS
  • Performed managed services for high-performance computing (HPC) sites.
  • Carried out cluster management, troubleshooting, and supervision using cluster managers such as Bright Cluster Manager and xCAT, schedulers such as Simple Linux Utility for Resource Management (SLURM) and PBS, and monitoring tools such as Nagios, Ganglia, and Zabbix.
  • Worked on the introductory setup and usage of AWS HPC provisioning environments such as Scale-Out Computing on AWS.
Technologies: Cluster Management, xCAT, Linux, AWS IoT, VMware, Networking, Unix, System Administration, Bash, Microsoft HPC, Cluster, Bash Script, Databases, Software Engineering, Red Hat Enterprise Linux, Programming

Linux Systems Administrator

2018 - 2021
CPAP.com
  • Acted as a member of the infrastructure staff supporting a high-availability medical eCommerce platform with extensive in-house development.
  • Contributed to multiple intranet and internet-facing LAMP (Linux, Apache, MySQL, PHP/Perl/Python) systems, CMS, back-office, Voice over Internet Protocol (VoIP) telephony, and warehouse operations under HIPAA compliance.
  • Managed an extensive structured documentation initiative using Atlassian Confluence. Co-administered Atlassian systems such as Bitbucket, Jira, Confluence, and e-mail systems, including Zimbra.
  • Handled configuration management using SaltStack. Worked on network monitoring and supervision using Checkmk.
Technologies: Linux, Confluence, Jira, VMware ESXi, Docker, SaltStack, VMware, Networking, Unix, System Administration, Windows, Bash, Bash Script, Databases, Software Engineering, Red Hat Enterprise Linux, Programming, Data Structures

Research Engineer | Analyst

2010 - 2018
Texas A&M University Engineering Experiment Station
  • Managed the Parasol Laboratory research infrastructure. Provided support for internal and external research collaborators and external dissemination of research products, ensuring the quality of service of the computing platform.
  • Provided system administration for high-performance parallel and heterogeneous systems: a Cray XE6m Supercomputer (NetApp Storage), GPU/CUDA HPC servers (Supermicro, IBM), Dell servers supporting KVM and VMware virtualization, and Rocks clusters.
  • Administered 100+ Linux VMs and workstations serving 130+ internal and remote users, and customized them for the laboratory's research projects.
  • Developed Red Hat Enterprise Linux, CentOS, Cray/SUSE, and Fedora platforms, including commissioning, OS and application migrations, backup system and security, and user support.
  • Provided identity, inventory, change, and configuration management. Performed hardware and software integration and web mastering for three major websites and various intranet servers.
  • Managed MySQL and PostgreSQL database servers, a content management server, revision control, and overall system administration, ensuring compliance with the State of Texas IT regulations.
Technologies: Bash Script, Bash, Linux, Microsoft HPC, Cluster, MediaWiki, pbs, CFEngine 3, System Administration, Subversion (SVN), Cluster Management, Fedora, Python 3, Networking, Databases, Software Engineering, Red Hat Enterprise Linux, Programming, Data Structures

Wiki on Linux Administration

https://thelinuxwiki.net
For over ten years, I developed a MediaWiki-based public access wiki with comprehensive support documentation for Linux administrators specializing in Red Hat Enterprise Linux (RHEL) distributions. It includes over 30,000 pages highlighting deployment and use cases for various open-source projects.

Languages

Bash Script, Bash, Python 3

Frameworks

CFEngine 3

Libraries/APIs

Microsoft HPC

Tools

MediaWiki, VMware, Cluster, Confluence, Jira, SaltStack, Subversion (SVN)

Platforms

Red Hat Enterprise Linux, Linux, Red Hat Linux, Fedora, Unix, Windows, AWS IoT, Docker, Debian Linux, Ubuntu

Other

Slurm Workload Manager, Rocky Linux, Cluster Management, Networking, System Administration, xCAT, Software Engineering, Programming, Data Structures, VMware ESXi, pbs

Storage

Databases

2008 - 2009

Master's Degree in Computer Science

Prairie View A&M University - Prairie View, Texas, USA

August 2016 - Present

Red Hat Certified System Administrator (RHCSA)

Red Hat
