Senior HPC Storage Administrator

General Atomics Intelligence

Posted 1 month ago
Top Secret
Unspecified
Unspecified
IT - Hardware
San Diego, CA (On-Site/Office)

General Atomics (GA), and its affiliated companies, is one of the world's leading resources for high-technology systems development ranging from the nuclear fuel cycle to remotely piloted aircraft, airborne sensors, and advanced electric, electronic, wireless and laser technologies.

We are seeking a Senior HPC Storage Administrator for the GA Energy Group to lead the design, deployment, and optimization of highly complex parallel storage systems, specifically BeeGFS, WEKA, Lustre, and Ceph. This role is critical in supporting high-performance computing (HPC) workflows for Fusion Energy research, Digital Twin projects, and advanced numerical modeling. This position is responsible for developing and maintaining the smooth operation of multi-user computer systems in support of fusion energy research activities.

DUTIES AND RESPONSIBILITIES:
  • Responsible for observing all laws, regulations and other applicable obligations wherever and whenever business is conducted on behalf of the Company. Expected to work in a safe manner in accordance with established operating procedures and practices.
  • Other duties as assigned or as required.
  • Architecture & Design: Apply advanced HPC and data concepts to plan, design, and deploy highly complex parallel file systems (Lustre, BeeGFS, WEKA) and scale-out object storage (Ceph).
  • HPC Workflow Optimization: Design and optimize computational workflows, data management, and software packages specifically for large scientific datasets.
  • Performance Benchmarking: Collaborate with science area experts to port, benchmark, and evaluate complex numerical models to ensure peak I/O performance across GPU-accelerated clusters.
  • Technical Leadership: Serve as a consultant and lead for information system planning, hardware/software recommendations, and code curation for both on-premises and cloud-integrated HPC resources.
  • System Operations: Plan and manage day-to-day operations of Linux/Unix and Windows operating systems and applications.
  • Fabric Management: Manage high-speed networking components, including InfiniBand fabrics and NetworkManager (nmcli) configurations for bonding and bridging.
  • Staff Development: Act as a technical lead, providing guidance to professional staff and participating in the selection and development of personnel.
We recognize and appreciate the value and contributions of individuals with diverse backgrounds and experiences and welcome all qualified individuals to apply.

Job Qualifications

  • Typically requires a bachelor's degree in information technology or a related discipline and fifteen or more years of progressive professional experience in an information technology department primarily in systems administration. May substitute equivalent working experience in the field in lieu of education.
  • HPC & Scheduler Expertise: Detailed technical expertise in advanced computing environments (HPC), including the use of SLURM for workload management, job configuration, and resource allocation.
  • Parallel Storage Mastery: Deep expertise with high-performance parallel file systems such as Lustre, BeeGFS, and WEKA, as well as managing large-scale Ceph clusters for object and block storage.
  • Linux Stack Proficiency: Extensive knowledge of the Linux stack, including NFS, ZFS, BTRFS, MDRaid (mdadm), and LVM.
  • Automation & Scripting: Demonstrated proficiency in Python, Ansible, and Bash scripting for system management and automation.
  • Virtualization & Containers: Experience with Linux KVM/QEMU, Docker, and Podman in production settings.
  • Leadership: Proven ability to resolve unusually complex technical problems and serve as a spokesperson or leader on major projects.
Desirable Qualifications:
  • Experience in a scientific research environment and familiarity with US government and international computer systems.
  • Expertise in the intercontinental movement of large scientific datasets on ultra-high-speed networks.
About Us
GA-CCRi maintains and deploys production systems for users across the Intelligence Community, Department of Defense, and commercial industry. We build and develop best-in-class, all-domain, globally focused situational awareness capabilities that process petabytes of data from numerous streaming data sources in near real time. Our systems apply state-of-the-art algorithms and machine learning techniques to extract features and fuse data from multiple phenomenologies to form a rich live view of objects in the sky, on the sea, and on the ground. These analytics are designed to determine not just where something is, but what it is, where it's been, and what it's doing. All of this "data to knowledge" is made available to end users in our own browser-based application for visualization, analysis, and understanding. We always want to do more, and that's where you come in!

Job Category
IT - Hardware
Clearance Level
Top Secret