HPC Systems Engineer

GeoYeti

Posted today

Job Requirements

Charlottesville, VA
Top Secret CI Polygraph
Career Level not specified
Salary not specified

Job Description

HPC Systems Engineer

Location: Charlottesville, VA

Clearance Required: Active TS (SCI eligibility)

At Bcore, our strength comes from how we deliver impact to the mission. Whether it's architecting critical IT solutions, producing actionable intelligence, or developing cutting-edge technology, we succeed because of the expertise, collaboration, and agility of our teams. Our Mission Services division combines enterprise IT, cloud solutions, DevSecOps, systems engineering, software development, and operational support. Bcore accelerates decisive advantage for warfighters and intelligence professionals by fusing human insight, rapid-fire engineering, precision-measured outcomes, and relentless grit into mission-ready solutions.

Do you want to join a team that is building tailored technical solutions to modernize our government's mission and our clients' business? Do you have a desire to change how people work? Are you interested in helping to protect our nation's cyber interests? Join our growing team supporting Army customer missions as an HPC Systems Engineer.

Responsibilities

What you get to do every day:
  • Build, configure, and maintain secure HPC clusters for simulations, scientific computing, and GPU workloads
  • Collaborate with infrastructure teams on cluster platforms, including schedulers, provisioning systems, high-speed interconnects, and distributed nodes
  • Configure and manage job schedulers (Slurm, PBS) with queue setup, resource policies, and job optimization
  • Support containerized workloads (Docker, Podman, Singularity/Apptainer)
  • Assist with cluster provisioning, node management, and initial build-out, including scheduler configuration and validation
  • Troubleshoot hardware, OS, scheduler, networking, and high-performance interconnect issues (e.g., InfiniBand)
  • Integrate compute nodes and hardware into clusters
  • Develop automation and operational tools using Bash, Python, or similar scripting
  • Support authentication and access control via LDAP or Kerberos
  • Analyze performance and identify bottlenecks across compute, storage, and network layers for distributed workloads (MPI/OpenMP)
  • Support GPU-enabled environments and CUDA-based workloads
  • Coordinate with engineering teams to improve cluster performance, stability, and scalability
  • Maintain documentation for configurations, procedures, and troubleshooting
  • Provide technical guidance on HPC best practices for mission workloads


Qualifications

Clearance Required: Active TS clearance (with SCI eligibility) and eligibility to obtain a CI Poly. **We are not able to upgrade or sponsor clearances.**

Certification Required: Ability to obtain DoD 8140 (8570) IAT Level II certification

Education/Experience:
  • Bachelor's degree in Engineering, Computer Science, or a related STEM field (experience may be accepted in lieu of a degree)
  • 6+ years of experience administering Linux-based systems in enterprise, research computing, or distributed compute environments, including configuration and troubleshooting of multi-node systems


Required Skills:


  • Experience supporting distributed compute environments with workload schedulers (e.g., Slurm, PBS, Torque, Grid Engine)
  • Experience supporting multi-node compute environments or HPC clusters
  • Professional experience administering Linux systems via CLI (RHEL derivatives preferred)
  • Experience with scripting and automation (Bash, Python, or similar)
  • Experience troubleshooting server hardware, OS, and distributed computing systems
  • Familiarity with cluster networking and high-speed interconnects
  • Experience diagnosing performance issues across compute, networking, and storage layers
  • Strong troubleshooting and documentation skills

What is ideal?

  • Experience administering multi-node HPC clusters and supporting distributed workloads
  • Knowledge of parallel file systems (e.g., Lustre, BeeGFS, GPFS)
  • Experience with parallel computing frameworks (MPI, OpenMP)
  • Experience with configuration management tools (Ansible, Puppet)
  • Experience supporting GPU-enabled environments and CUDA workloads
  • Familiarity with hybrid HPC architectures (on-prem + cloud, e.g., AWS)
  • Experience supporting HPC systems in research, lab, or mission environments
  • Experience working in DoD or IC environments


What you can expect from us

  • Great achievements do not go unnoticed at Bcore: we recognize them through service anniversaries, spot awards, and employee referral bonuses
  • You'll join a growing organization of passionate, top-shelf IT engineering professionals with extensive experience driving the technology revolution in the Intelligence Community
  • Highlights of our benefits include Health/Dental/Vision, 401(k) match and potential Profit Sharing, Universal Leave, STD/LTD/Life Insurance/Voluntary Life Insurance, Stipends, Referral Bonuses, and more!
  • Compensation is unique to each candidate; packages are based on education, experience, and other requirements

Bcore is proud to be an equal opportunity workplace. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, sexual orientation, or any other characteristic protected by law.

Job Category
IT - Software
Clearance Level
Top Secret
Employer
GeoYeti