Job Requirements
Bethesda, MD
Top Secret/SCI CI Polygraph
Career Level not specified
Salary not specified
Job Description
Position Summary
Support enterprise AI mission systems by designing, developing, and optimizing GPU clusters, with deep focus on operating systems, hardware, GPU platforms, and high-speed networking in a secure customer environment.
Essential Duties and Responsibilities
- Design, configure, and maintain GPU clusters.
- Collaborate with a multidisciplinary team to define and optimize architectures for performance, power efficiency, and required features.
- Work closely with AI/ML engineers to integrate GPUs with Linux-based systems.
- Optimize GPU drivers for compatibility, reliability, and performance.
- Analyze GPU performance, identify bottlenecks, and develop strategies to improve efficiency across hardware and software layers.
- Build and maintain debugging tools, profiling utilities, and performance analysis software for Linux environments.
- Leverage Bash, Python, Ansible, Puppet, and Salt for tooling and automation.
- Maintain technical documentation, architectural specifications, and Linux best practices.
- Support ATO activities and ensure compliance with federal security standards.
Qualifications
- Active TS/SCI with ability to obtain a CI Polygraph.
- Bachelor's degree with a minimum of six years of experience in the category field. Three additional years of experience may be substituted for the bachelor's degree.
- Experience managing NVIDIA GPU data center platforms, including DGX and HGX systems and H200, H100, and L4 GPUs.
- Knowledge of enterprise server components, including storage/network controllers, HBAs, and SSDs.
- Strong expertise with Linux distributions, including RHEL, Ubuntu, Oracle Linux, and Rocky Linux.
- Excellent problem-solving skills and the ability to collaborate within a team.
- Meet DoD 8570.01-M IAT Level II certification requirements at a minimum; IAT Level III is also acceptable.
- U.S. citizenship is required due to the nature of the government contracts supported.
- Experience with Kubernetes cluster management and AI/ML workflow orchestration, including Argo, Airflow, and Kubeflow.
- Familiarity with GPU virtualization and cloud computing.
- Experience with Prometheus and Grafana for monitoring.
- Knowledge of distributed resource scheduling systems such as Slurm, LSF, or similar tools.
Education and Years of Experience
- High School Diploma/GED: 9 years
- Associate's Degree: 9 years
- Bachelor's Degree: 6 years
- Master's Degree: 6 years
- PhD: 6 years
Required Certifications
- DoD 8570.01-M IAT Level II certification: Security+ CE, CCNA-Security, GICSP, GSEC, or SSCP.
- Active TS/SCI with ability to obtain a CI Polygraph.
Pay & Benefit Highlights
Compensation
- Competitive fixed salary or hourly pay (based on experience, skills, location, and internal equity).
- Employee referral bonuses up to $10,000 per hired referral.
- Additional bonus opportunities for exceptional performance and contributions to business development and company growth (role-dependent).
- 100% company-paid medical premiums for employees and eligible dependents.
- Choose from multiple plan options with CareFirst, Kaiser, and UnitedHealthcare, including PPO, POS, HMO, and HSA-compatible plans.
- 100% company-paid dental premiums for employees and eligible dependents.
- 100% company-paid vision premiums for employees and eligible dependents.
- 100% company-paid premiums for short-term disability.
- 100% company-paid premiums for long-term disability.
- 100% company-paid premiums for accidental death & dismemberment (AD&D).
- 100% company-paid premiums for life insurance up to $200,000.
- 401(k) with immediate vesting: 4% company match plus a 4% non-elective company contribution (8% total).
- 401(k) pre-tax and Roth options.
- Up to 20 days of flexible paid time off (PTO).
- 11 paid floating holidays.
- Flexible work schedules, including flex time and compressed work periods (contract and project-dependent).