Job Requirements
Falls Church, VA
Clearance: Secret; Polygraph: Unspecified
Career Level not specified
Salary not specified
Job Description
Everforth ECS is seeking a Senior Tier-4 Model Serving Support Lead to work in the National Capital Region, covering the Pentagon, Falls Church, and Fairfax. Please note: This position is contingent upon contract award.
The War Data Platform (WDP) is a key initiative within the U.S. Department of War's (DoW) AI-First strategy introduced in early 2026. The WDP focuses on operational warfighting data and aims to accelerate the deployment of artificial intelligence (AI) on the battlefield. The WDP extends to Unclassified, Secret, and Top Secret environments, and supports collaboration between Combatant Commands, Joint Staff directorates, Senior Executive Service leaders, and operational analysts.
The Senior Tier-4 Model Serving Support Lead serves as the authoritative escalation owner for AI and machine learning model-serving pipelines, production endpoints, and model zoo operations across WDP Core Integration's full multi-enclave environment. This role bridges platform engineering, cybersecurity, and cross-service mission partners to sustain uninterrupted AI model-serving performance in direct support of DoW missions, Joint Staff analysts, Combatant Command elements, and Senior Executive Service leadership.
• Owns Tier-4 escalation coordination for artificial intelligence and machine learning model-serving pipelines, production endpoints, and model zoo operations within War Data Platform (WDP) Core Integration environments supporting Department of War missions, Joint Staff analysts, Combatant Command elements, and Senior Executive Service leadership.
• Directs escalation workflows by activating incident bridges, coordinating engineering response actions, validating operational impact, and aligning escalation playbooks with service-level agreement requirements.
• Applies Kubernetes, GitLab Continuous Integration, VMware environments, Elastic Stack, Prometheus metrics, Grafana dashboards, and enterprise observability tooling to diagnose serving failures, analyze telemetry, and guide stabilization activities across unclassified and higher-domain enclaves.
• Leads coordination with Platform One, Cloud One, multi-national engineering teams, and cross-service mission partners to maintain operational readiness for serving pipelines, cross-domain transfer workflows, API endpoints, and model-runtime components.
• Conducts structured post-incident analysis by collecting operational evidence, reconstructing failure sequences, validating remediation steps, and documenting mission-assurance considerations for future release cycles.
• Produces mission-critical deliverables including escalation playbooks, incident-response documentation, service-level alignment reports, operational risk assessments, and restoration summaries.
• Strengthens program value by reinforcing deployment consistency, advancing mission assurance posture, and sustaining operational continuity across all enclaves.
• Supports enterprise release operations by coordinating readiness checks, validating rollback pathways, and maintaining authoritative Tier-4 support artifacts required for uninterrupted artificial intelligence model-serving performance.
• Performs other duties as assigned.
Required Skills
• Current Secret security clearance with the ability to obtain and maintain a Top Secret (TS) security clearance with Sensitive Compartmented Information (SCI).
• 10 or more years of progressive experience in AI/ML platform operations, enterprise incident management, or senior IT support roles, with demonstrated responsibility for Tier-4 or equivalent escalation ownership in classified or federal government multi-enclave cloud environments.
• Hands-on experience applying enterprise observability and container orchestration tooling, including Kubernetes, GitLab CI, Elastic Stack, Prometheus, and Grafana, to diagnose AI/ML serving failures, analyze pipeline telemetry, and coordinate stabilization activities across Unclassified, Secret, and Top Secret network environments.
• Demonstrated experience coordinating with DoW-authorized DevSecOps platform environments such as Platform One or Cloud One, including participation in cross-enclave release readiness activities, rollback validation, and post-deployment stability verification for AI/ML model-serving workloads.
• CompTIA A+ certification or equivalent, demonstrating validated foundational knowledge of IT systems, hardware, software, and operational support practices.
• Strong problem-solving and decision-making capabilities, with a proven ability to weigh the relative costs and benefits of potential actions and identify the most appropriate solution.
• Highly developed interpersonal and oral/written communication skills, with the ability to effectively and professionally interact with a diverse set of stakeholders (from peers to end-users to executive management).
Desired Skills
• Active Top Secret (TS) security clearance with Sensitive Compartmented Information (SCI) eligibility.
• Advanced cloud or DevSecOps certification such as AWS SysOps Administrator (Professional), Certified Kubernetes Administrator (CKA), or AWS DevOps Engineer (Professional), demonstrating validated expertise in cloud-native incident response, container orchestration, and infrastructure operations in GovCloud or classified environments.
• Practical familiarity with AI/ML model zoo architecture and model registry operations, including working knowledge of how catalog state, API endpoint configuration, and cross-domain promotion readiness intersect with Tier-4 escalation triggers and serving runtime recovery procedures.
• Familiarity with cross-domain solution (CDS) architectures and data transfer controls governing multi-enclave AI/ML serving environments, including enclave-specific constraints across NIPRNet, SIPRNet, and JWICS as they apply to model pipeline continuity and escalation sequencing.
• Working knowledge of Zero Trust Architecture principles and Risk Management Framework (RMF) requirements as they apply to AI/ML serving runtime security, continuous monitoring integration, and cybersecurity incident coordination within DoW-accredited cloud environments.
ECS Federal LLC is an equal opportunity employer and does not discriminate or allow discrimination on the basis of any characteristic protected by law. All qualified applicants will receive consideration for employment without regard to disability, status as a protected veteran, or any other status protected by applicable federal, state, or local law.
is the federal segment of , a $4B global organization with over 10,000 employees. Our nearly 3,500 professionals deliver advanced technology solutions in data and AI, cybersecurity, and enterprise transformation, serving defense, intelligence, and federal civilian agencies.
Our work powers mission-critical outcomes, strengthens technology partnerships, and creates meaningful opportunities for our people. We are defined by a commitment to excellence in delivery, a culture of innovation, and an environment where talent can thrive and grow.
We value:
- Attracting and developing top talent and high-performing teams
- Fostering a culture that is engaging, accountable, and mission-driven
group id: 10112231A