Today
Top Secret
Mid Level Career (5+ yrs experience)
IT - Data Science
Scott AFB, IL • Omaha, NE
Responsibilities/Duties:
Build Scalable Data & ML Infrastructure
· Design and implement medallion architecture (Bronze/Silver/Gold) using Databricks for reliable data processing and ML model training
· Develop automated data pipelines that process structured and unstructured data from multiple sources into analytics-ready formats
· Create robust ETL/ELT workflows using Apache Spark and modern data engineering practices for both batch and streaming data
· Build and maintain data quality monitoring and validation systems across the entire data and ML lifecycle
Drive ML Platform Excellence
· Implement MLOps best practices including automated model training, validation, deployment, and monitoring using MLflow and Databricks workflows
· Design scalable ML inference systems that handle high-volume, low-latency predictions in production environments
· Create comprehensive monitoring and alerting systems for model performance, data drift, and system health
· Build self-service ML capabilities that enable data scientists to deploy and monitor their own models efficiently
Enable Advanced Analytics & Business Intelligence
· Design and maintain data models that support both machine learning workloads and business intelligence requirements
· Create integration points between ML systems and business intelligence platforms (Tableau, Power BI, Qlik Sense)
· Implement data governance standards and metadata management systems that ensure data quality and compliance
· Collaborate with analysts and data scientists to optimize data architecture for both predictive modeling and reporting needs
Ensure Data Quality & Governance
· Implement comprehensive data governance frameworks including data lineage tracking, quality monitoring, and compliance controls
· Design and maintain data catalogs and metadata management systems that enable efficient data discovery across the organization
· Establish data quality standards and automated testing frameworks for both analytical and ML workloads
· Work with stakeholders to define data definitions, business logic, and governance policies
Integrate with Enterprise Systems
· Build integrations with MAVEN Smart Systems (Palantir Foundry) environments to support operational and predictive analytics
· Connect Databricks-based systems with enterprise data warehouses, streaming platforms, and business applications
· Implement security and compliance controls that meet enterprise requirements while enabling self-service capabilities
· Collaborate with platform engineers to integrate ML systems with broader application architecture and infrastructure
Required Skills – What You’ll Bring:
· 5+ years of technical experience, including 3+ years building production data pipelines and ML infrastructure on distributed computing platforms such as Databricks
· Strong data engineering skills in Python, PySpark, and Spark SQL with experience implementing medallion architecture and modern data platform patterns
· Production ML systems experience including model deployment, monitoring, and MLOps practices in cloud environments
· Data architecture expertise with experience designing scalable data processing systems and implementing data governance frameworks
· Experience integrating with platforms such as Qlik, Tableau, Power BI, MAVEN Smart System (Palantir), or similar
Preferred Skills – What Would Set You Apart:
· Deep expertise in distributed computing, performance optimization, and large-scale data processing using Databricks and Apache Spark
· Advanced MLOps knowledge including automated retraining, model versioning, model testing frameworks, and production ML monitoring
· Experience conducting regression analysis and building predictive models for business applications with measurable impact
· Advanced statistical knowledge including experimental design, hypothesis testing, causal inference, and statistical modeling techniques
· Experience designing and building enterprise-level dashboards, reports, and self-service analytics platforms
· Analytics platform knowledge including experience with Advana / MAVEN Smart Systems (Palantir Foundry) or similar enterprise analytics environments