Job Requirements
Location: Remote
Clearance: Secret
Polygraph: Unspecified
Career Level: Not specified
Salary: Not specified
Job Description
The Site Reliability Engineer (SRE) / Subject Matter Expert (SME) - Computer Systems Engineer/Architect will provide senior-level reach-back expertise to support the reliability, scalability, performance, and operational resilience of the GEOMAP platform in secure cloud environments. This role focuses on improving service availability, monitoring, incident response, automation, and production stability across cloud-hosted and containerized systems supporting mission-critical geospatial capabilities for the U.S. Air Force.
The Site Reliability Engineer will collaborate across development, DevSecOps, cloud, database, testing, and support teams to identify systemic issues, reduce operational risk, and implement engineering solutions that improve long-term platform reliability.
*This position is contingent upon contract award.*
Responsibilities
- Provide senior-level engineering support to improve reliability, availability, performance, and maintainability of GEOMAP cloud-hosted systems and services.
- Analyze production issues, recurring incidents, and operational trends to identify root causes and recommend durable corrective actions.
- Support the design and implementation of monitoring, alerting, logging, and observability solutions across applications, infrastructure, and containerized services.
- Develop and recommend automation approaches that reduce manual effort, improve deployment consistency, and increase system resilience.
- Partner with software engineers, DevSecOps engineers, Kubernetes engineers, database engineers, and production support personnel to improve service health and release readiness.
- Support incident response, problem management, service restoration, and post-incident reviews for high-priority operational issues.
- Evaluate system performance, capacity, and scalability needs and provide recommendations for optimization and operational risk reduction.
- Assist in defining service reliability objectives, operational metrics, and support models for sustained mission operations.
- Contribute to infrastructure and platform engineering efforts involving cloud environments, CI/CD pipelines, container orchestration, and secure deployment patterns.
- Support architecture reviews, technical assessments, and engineering analyses related to reliability, recoverability, and production operations.
- Develop or refine runbooks, standard operating procedures, reliability engineering practices, and technical documentation.
- Provide reach-back support for surge requirements, complex production investigations, and priority modernization or stabilization efforts as directed.
- Perform other related duties as assigned.
Qualifications
- Active Secret clearance required.
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field; Master's degree preferred.
- Minimum of 8 years of experience supporting enterprise systems, cloud platforms, site reliability engineering, production engineering, systems engineering, or related technical roles.
- Experience supporting AWS environments, including monitoring, performance tuning, troubleshooting, incident response, and operational sustainment.
- Experience with Linux administration, scripting, and troubleshooting distributed applications in production environments.
- Experience with containerized systems and orchestration platforms such as Kubernetes.
- Experience supporting CI/CD pipelines, release automation, infrastructure-as-code, and operational reliability in Agile or DevSecOps environments.
- Experience with monitoring, logging, and alerting tools used to support enterprise application performance and infrastructure visibility.
- Strong analytical, troubleshooting, documentation, and communication skills, with the ability to translate operational issues into engineering improvements.
- Ability to work effectively across cross-functional teams in a mission-focused DoD environment.
Preferred
- Experience supporting AWS Cloud One or other secure federal cloud environments.
- Experience supporting geospatial or Esri-based platforms, including ArcGIS Enterprise or related technologies.
- Familiarity with service reliability practices such as SLIs, SLOs, error budgets, incident postmortems, and capacity planning.
- Experience with Risk Management Framework (RMF), STIG compliance, vulnerability remediation, and secure system hardening practices.
- AWS, Kubernetes, or other relevant cloud or reliability engineering certifications.
- Experience supporting technical refresh, platform modernization, or high-availability design initiatives in enterprise environments.
About Us
Diné Development Corporation (DDC) is a Navajo Nation-owned family of companies that provides government agencies and commercial organizations with high-quality IT, professional, environmental, and research and development services. DDC is dedicated to empowering the Navajo Nation and the communities we serve.
Benefits
Eligible full-time employees receive a comprehensive benefits package, including medical, dental, vision, life and disability coverage, retirement savings with company match, paid time off, voluntary supplemental benefits, and access to an employee assistance program. The package also includes educational assistance, with tuition reimbursement.
EEO Statement
This contractor and subcontractor shall abide by the requirements of 41 CFR 60-1.4(a), 60-300.5(a), and 60-741.5(a). These regulations prohibit discrimination against qualified individuals based on their status as protected veterans or individuals with disabilities, and prohibit discrimination against all individuals based on their race, color, religion, sex, sexual orientation, gender identity, national origin, or for inquiring about, discussing, or disclosing information about compensation, or any other basis prohibited by law. We participate in E-Verify.
We are DDC!