
David Ramirez Molina
Verified Expert in Engineering
Cluster Management Developer
Rosenberg, TX, United States
Toptal member since March 19, 2024
David is a senior Linux system engineer in high-performance computing (HPC) with over ten years of experience developing, integrating, validating, and supporting HPC facilities across diverse industries and applications. He has worked for major companies like Dell, where he designed and developed HPC prototype clusters and created technical documentation and collateral. David is Red Hat-certified, demonstrating his system administration and configuration management expertise.
Portfolio
Experience
- Linux - 20 years
- Red Hat Linux - 15 years
- Bash - 15 years
- Cluster Management - 10 years
- Fedora - 10 years
- MediaWiki - 10 years
- Confluence - 4 years
- xCAT - 2 years
Availability
Preferred Environment
Linux, Red Hat Linux
The most amazing...
...thing I've achieved is writing from scratch and maintaining an effective configuration management policy set to automate system operations.
Work Experience
HPC Linux System Administrator
TotalEnergies
- Acted as the co-administrator of a multi-cluster HPC facility supporting research and development projects in oil, gas, and wind simulations and models.
- Evaluated new applications and technologies, and handled proof-of-concept development, scientific software builds, benchmarks, and systems lifecycle management.
- Participated in deploying xCAT-automated cluster environments with InfiniBand and Ethernet connectivity and with Lustre and General Parallel File System (GPFS) storage back ends, under CentOS and Rocky Linux.
- Handled system administration tasks, including hardware commissioning with NVIDIA GPUs, Mellanox IB cards, cabling, and their respective software drivers, as well as validation and troubleshooting. Managed SLURM administration, account maintenance, and end-user support.
Principal Systems Design Engineer
Dell
- Handled the development, integration, validation, design, and benchmarking of HPC prototype clusters, as well as technical documentation and collateral.
- Worked on the commissioning and lifecycle management of a cluster environment, including software builds, modules, and comparative tests, in ongoing interaction with developers on the vendor side.
- Carried out operations under Bright Cluster Manager and SLURM, with Message Passing Interface (MPI), on a Red Hat Enterprise Linux 8 (RHEL 8) platform with a module-based software architecture.
- Troubleshot and escalated issues, liaising with external testers.
Systems Analyst (HPC)
High Performance Computing Solutions - X-ISS
- Performed managed services for high-performance computing (HPC) sites.
- Carried out cluster management, troubleshooting, and supervision using tools including Bright Cluster Manager and xCAT, schedulers such as Simple Linux Utility for Resource Management (SLURM) and PBS, and monitoring tools such as Nagios, Ganglia, and Zabbix.
- Worked on the introductory setup and usage of AWS HPC provisioning environments such as Scale-Out Computing on AWS.
Linux Systems Administrator
CPAP.com
- Acted as a member of the infrastructure staff supporting a high-availability medical eCommerce platform with extensive in-house development.
- Contributed to multiple intranet and internet-facing LAMP (Linux, Apache, MySQL, PHP/Perl/Python) systems, CMS, back-office, Voice over Internet Protocol (VoIP) telephony, and warehouse operations under HIPAA compliance.
- Managed an extensive structured documentation initiative using Atlassian Confluence. Co-administered Atlassian systems such as Bitbucket, Jira, Confluence, and e-mail systems, including Zimbra.
- Handled configuration management using SaltStack. Worked on network monitoring and supervision using Checkmk.
Research Engineer | Analyst
Texas A&M University Engineering Experiment Station
- Managed the Parasol Laboratory research infrastructure. Provided support for internal and external research collaborators and external dissemination of research products, ensuring the quality of service of the computing platform.
- Provided system administration for high-performance parallel and heterogeneous systems: a Cray XE6m Supercomputer (NetApp Storage), GPU/CUDA HPC servers (Supermicro, IBM), Dell servers supporting KVM and VMware virtualization, and Rocks clusters.
- Administered 100+ Linux VMs and workstations with 130+ internal and remote users, and customized them for the laboratory's research projects.
- Developed Red Hat Enterprise Linux, CentOS, Cray/SUSE, and Fedora platforms, including commissioning, OS and application migrations, backup system and security, and user support.
- Provided identity, inventory, change, and configuration management. Performed hardware and software integration and web mastering for three major websites and various intranet servers.
- Managed MySQL and PostgreSQL database servers, a content management server, revision control, and overall system administration, ensuring compliance with the State of Texas IT regulations.
Experience
Wiki on Linux Administration
Education
Master's Degree in Computer Science
Prairie View A&M University - Prairie View, Texas, USA
Certifications
Red Hat Certified System Administrator (RHCSA)
Red Hat
Skills
Libraries/APIs
Microsoft HPC
Tools
Rocky Linux, MediaWiki, VMware, Cluster, Confluence, Jira, SaltStack, Subversion (SVN)
Languages
Bash Script, Bash, Python 3
Frameworks
CFEngine 3
Platforms
Red Hat Enterprise Linux, Linux, Red Hat Linux, Fedora, Unix, Windows, AWS IoT, Docker, Debian Linux, Ubuntu
Storage
Databases
Other
Slurm Workload Manager, Cluster Management, Networking, System Administration, xCAT, Software Engineering, Programming, Data Structures, VMware ESXi, PBS