Job Requirements
Falls Church, VA
Top Secret/SCI Polygraph Unspecified
Career Level not specified
Salary not specified
Job Description
What Impact You'll Have
GRVTY is seeking a motivated and experienced Data Scientist to perform data analysis via statistical and quantitative methods, develop visualizations to support decision making, and build software to enable thorough monitoring and management of data pipeline tasks. The ideal candidate brings deep expertise in data systems and a passion for solving complex analytical problems in support of Machine Learning model development and operational intelligence missions. This role will help accelerate the implementation of the NGA Maven Data Strategy and drive cross-departmental collaboration to enhance workflows and deliver greater productivity across labeling and imagery curation efforts.
What You'll be Owning
- Integrate emerging sensors and platforms into existing data pipelines and workflows, including:
  - Developing pipelines and relevant data structures for emerging capabilities
  - Incorporating new data types into existing data workflows
- Assess potential differences in metadata, data format, and data structure characteristics regarding changes to:
  - Databases and schemas
  - APIs and other ETL-related processes
  - Ingestion and movement of data within the existing data operations pipelines
- Conduct analysis of how emerging data impacts overall program data holdings from the perspective of:
  - Machine Learning model development
  - Test and evaluation use cases
  - Real-world operational use cases
- Evaluate and recommend solutions for partitioning new data types into training, test, and validation splits for effective ML model development and performance evaluation
- Lead the design and application of methods to identify, collect, process, and analyze large volumes of data to build and enhance products, processes, and systems
- Lead projects using advanced mathematical, statistical, and scientific techniques to:
  - Examine complex mission and operational problems
  - Create insights from data
- Build web scraping algorithms for imagery curation in accordance with customer data priorities
- Integrate multiple data and intelligence sources in various formats for imagery curation to address gaps and meet project priorities
- Support data analytics, data cleansing, analyst workflow efficiencies, and ETL transformation capabilities
- Utilize statistical and analytic methods to support answering intelligence requirements and mission objectives
- Build and integrate analytic tools and algorithms for imagery curation, acquisition, and chipping
- Design methods and mechanisms to track, report, and monitor mission-specific objectives to maintain custody of training data priorities
- Build software tools to manage labeling campaigns, including:
  - Tracking unlabeled and labeled data and campaign status
  - Transferring label task information via API-based movement between platforms
- Build software tools to:
  - Filter and visualize data geospatially
  - Allow feedback entry and data analysis
  - Integrate with existing data management platforms
- Conduct analysis of overall data holdings to support development of performant AI/ML models satisfying operational user requirements
- Evaluate, monitor, and provide recommendations on training, test, and validation data splits for effective model development and performance evaluation
What You Must Have
- Active TS/SCI Clearance with the ability to obtain a CI/Poly
- Experience working with AI/ML technologies and data systems
- Experience working with multiple file types including:
  - Geospatial file formats
  - JSON, XML, and related formats
- 3+ years of experience in quantitative analysis and data operations, including:
  - Developing visualizations and processing complex data to create data-driven insights
  - Data manipulation and ETL procedures
- Working knowledge of SQL and NoSQL database technologies
- Development experience in Python and other languages for data cleaning and manipulation
What Would be Nice to Have
- Experience applying Natural Language Processing (NLP) algorithms to extract data from documents
- Experience with enterprise NGA analytic modernization efforts such as SOM, Computer Vision, automated collection, or automated reporting, along with data standardization best practices
- Demonstrated expertise in math, statistics, and quantitative analysis, including analytic techniques such as classification, regression, clustering, data reduction, and causal modeling