Today

Dept of Homeland Security

Engineering - Systems

Arlington, VA (On-Site/Office)

Job Description
ECS is seeking an MLOps Integration Engineer to work in our Arlington, VA office.

Job Summary:

We are seeking an experienced MLOps Integration Engineer to design, deploy, and optimize machine learning pipelines supporting the secure, reliable, and efficient operation of AI models in production. The MLOps Integration Engineer will lead the automation of end-to-end ML workflows, from model deployment and versioning to monitoring, drift detection, and compliance logging. This role focuses on building scalable infrastructure and observability frameworks that ensure models remain performant, traceable, and aligned with mission and business objectives across cloud and on-premises environments.

Responsibilities:

Deploy and manage ML models in production using tools such as MLflow, Kubeflow, or AWS SageMaker, ensuring scalability, low latency, and availability.
Design and maintain dashboards using Grafana, Prometheus, or Kibana to track real-time and historical model performance metrics (e.g., accuracy, latency, throughput).
Build automated pipelines using tools like Evidently AI or Alibi Detect to identify data distribution shifts and initiate retraining or alerting mechanisms.
Implement centralized logging with ELK Stack or OpenTelemetry to capture inference events, system errors, and audit trails for debugging, compliance, and model governance.
Develop CI/CD pipelines using GitHub Actions, Jenkins, or Azure DevOps to automate model builds, testing, deployment, and rollback.
Apply secure-by-design principles to safeguard AI pipelines through encryption, access control, and compliance with frameworks such as GDPR, HIPAA, and NIST AI RMF.
Partner with data scientists, AI engineers, DevOps, and security teams to ensure seamless model integration and lifecycle management.
Optimize model inference performance through techniques such as quantization, pruning, and container orchestration for efficient resource utilization across AWS, Azure, or Google Cloud.
Develop comprehensive documentation for ML pipelines, observability configurations, and monitoring workflows to promote operational transparency and knowledge sharing.

Required Skills

Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related technical discipline.
Minimum of 5 years of experience in MLOps, DevOps, or software engineering, with emphasis on AI/ML systems.
Proven success deploying and maintaining ML models in production using MLflow, Kubeflow, or cloud AI platforms (AWS SageMaker, Azure ML, or Google Vertex AI).
Hands-on experience with observability and monitoring tools such as Prometheus, Grafana, or Datadog.
Proficiency in Python and SQL; familiarity with JavaScript or Go preferred.
Expertise in containerization and orchestration (Docker, Kubernetes) and CI/CD automation (GitHub Actions, Jenkins).
Working knowledge of time-series databases (InfluxDB, TimescaleDB) and logging frameworks (ELK Stack, OpenTelemetry).
Experience implementing drift detection tools (Evidently AI, Alibi Detect) and visualization libraries (Plotly, Seaborn).
Strong understanding of model performance metrics (precision, recall, F1, AUC) and statistical drift detection techniques (KS test, PSI).
Familiarity with AI security vulnerabilities such as data poisoning and adversarial attacks, with knowledge of mitigation tools like the Adversarial Robustness Toolbox (ART).
Strong problem-solving and debugging ability for complex ML system and pipeline issues.
Excellent collaboration and communication skills across cross-functional technical teams.
High attention to detail to ensure reliability, accuracy, and compliance in system reporting.
Must be a U.S. Citizen and eligible to obtain and maintain a Department of Homeland Security (DHS) EOD clearance (requires a favorable background investigation).

Desired Skills

Experience monitoring Large Language Models (LLMs) using tools such as LangSmith, Helicone, or similar observability frameworks.
Familiarity with compliance frameworks such as GDPR, HIPAA, and NIST AI RMF for secure data management and ethical AI operations.
Experience contributing to open-source MLOps projects or engaging in professional AI operations communities (e.g., #MLOps, #AIOps).
Knowledge of automated retraining pipelines, model version control, and feature store management (e.g., Feast, Tecton).
Professional certifications such as AWS Certified Machine Learning - Specialty, Azure AI Engineer Associate, or Google Cloud Professional ML Engineer.

ECS is an equal opportunity employer and does not discriminate or allow discrimination on the basis of any characteristic protected by law. All qualified applicants will receive consideration for employment without regard to disability, status as a protected veteran, or any other status protected by applicable federal, state, or local jurisdiction law.

ECS is a leading mid-sized provider of technology services to the United States Federal Government. We are focused on people, values and purpose. Every day, our 3300+ employees focus on providing their technical talent to support the Federal Agencies and Departments of the US Government to serve, protect and defend the American People.

group id: 10112231A
