GPU Systems Engineer 3

Base-2 Solutions, LLC

Posted today

Job Requirements

Bethesda, MD
Top Secret/SCI CI Polygraph
Career Level not specified
Salary not specified

Job Description

Position Summary

Support enterprise AI mission systems by designing, developing, and optimizing GPU clusters, with deep focus on operating systems, hardware, GPU platforms, and high-speed networking in a secure customer environment.
Essential Duties and Responsibilities
  • Design, configure, and maintain GPU clusters.
  • Collaborate with a multidisciplinary team to define and optimize architectures for performance, power efficiency, and required features.
  • Work closely with AI/ML engineers to integrate GPUs with Linux-based systems.
  • Optimize GPU drivers for compatibility, reliability, and performance.
  • Analyze GPU performance, identify bottlenecks, and develop strategies to improve efficiency across hardware and software layers.
  • Build and maintain debugging tools, profiling utilities, and performance analysis software for Linux environments.
  • Leverage Bash, Python, Ansible, Puppet, and Salt for tooling and automation.
  • Maintain technical documentation, architectural specifications, and Linux best practices.
  • Support ATO activities and ensure compliance with federal security standards.
Required Qualifications
  • Active TS/SCI with ability to obtain a CI Polygraph.
  • Bachelor's degree with a minimum of six years of experience in the category field. Three additional years of experience may be substituted for the bachelor's degree.
  • Experience managing NVIDIA GPU data center platforms, including DGX and HGX systems and H200, H100, and L4 GPUs.
  • Knowledge of enterprise server components, including storage/network controllers, HBAs, and SSDs.
  • Strong expertise with Linux distributions, including RHEL, Ubuntu, Oracle Linux, and Rocky Linux.
  • Excellent problem-solving skills and the ability to collaborate within a team.
  • Meet DoD 8570.11 IAT Level II certification requirements at a minimum; IAT Level III is also acceptable.
  • U.S. citizenship is required due to the nature of the government contracts supported.
Preferred Qualifications
  • Experience with Kubernetes cluster management and AI/ML workflow orchestration, including Argo, Airflow, and Kubeflow.
  • Familiarity with GPU virtualization and cloud computing.
  • Experience with Prometheus and Grafana for monitoring.
  • Knowledge of distributed resource scheduling systems such as Slurm, LSF, or similar tools.
Required Education and Experience Equivalency
Education: Years of Experience
  • High School Diploma/GED: 9
  • Associate's Degree: 9
  • Bachelor's Degree: 6
  • Master's Degree: 6
  • PhD: 6
Required Certifications
  • DoD 8570.11 IAT Level II certification: Security+ CE, CCNA-Security, GICSP, GSEC, or SSCP.
Required Security Clearance
  • Active TS/SCI with ability to obtain a CI Polygraph.


Pay & Benefit Highlights
Compensation
  • Competitive fixed salary or hourly pay (based on experience, skills, location, and internal equity).
  • Employee referral bonuses up to $10,000 per hired referral.
  • Additional bonus opportunities for exceptional performance and contributions to business development and company growth (role-dependent).
Health
  • 100% company-paid medical premiums for employees and eligible dependents.
  • Choose from multiple plan options with CareFirst, Kaiser, and UnitedHealthcare, including PPO, POS, HMO, and HSA-compatible plans.
  • 100% company-paid dental premiums for employees and eligible dependents.
  • 100% company-paid vision premiums for employees and eligible dependents.
Income Protection
  • 100% company-paid premiums for short-term disability.
  • 100% company-paid premiums for long-term disability.
  • 100% company-paid premiums for accidental death & dismemberment (AD&D).
  • 100% company-paid premiums for life insurance up to $200,000.
Retirement
  • 401(k) with immediate vesting: 4% company match plus a 4% non-elective company contribution (8% total).
  • 401(k) pre-tax and Roth options.
Leave
  • Up to 20 days of flexible paid time off (PTO).
  • 11 paid floating holidays.
Work-Life Balance
  • Flexible work schedules, including flex time and compressed work periods (contract- and project-dependent).