Today
Top Secret/SCI
IT - Database
Saint Louis, MO (On-Site/Office)
Full Description
Lead the design and implementation of the intelligent data fabric that transforms geospatial datasets into AI-enabled, discoverable resources. This role will architect the ETL pipelines, metadata enrichment systems, and graph database implementations that serve both human analysts and autonomous AI systems.
Responsibilities
• Lead data lake structure development with SQL and Elasticsearch integration
• Design and implement Apache Airflow ETL pipelines for diverse geospatial data sources
• Develop AI-powered metadata enrichment using NLP and semantic analysis
• Architect knowledge graph construction systems for anomaly detection and pattern discovery
• Implement comprehensive structured logging in JSON format for AI training datasets
• Design ontology-driven graph database population strategies
• Lead spatiotemporal analytics and activity-based intelligence tool development
• Ensure data quality monitoring and automated validation throughout ingestion processes
Qualifications
• TS/SCI Clearance
• 10+ years of experience in data engineering and ETL pipeline development
• Expertise with graph databases (Neo4j, Amazon Neptune) and ontology modeling
• Advanced knowledge of Elasticsearch, PostGIS, and distributed SQL systems
• Experience with Apache Airflow orchestration and workflow management
• Proficiency in Python, SQL, and data processing frameworks
• Experience with geospatial data formats and standards (OGC, ISO)
• Knowledge of metadata management and controlled vocabularies
Desired Qualifications
• 15+ years of experience, with publications and patents in data engineering
• Experience with AI/ML classification systems and automated processes
• Background in geospatial intelligence and NGA mission requirements
• Experience with FAISS vector databases and semantic similarity matching
• Knowledge of ICD 503 compliance and security audit trail requirements