Posted today

Job Requirements

Washington, DC
Clearance: Dept of Homeland Security; Polygraph: Unspecified
Career Level: not specified
$130,000 - $155,000

Job Description

Piper Companies is looking for a Data Engineer to join a government integrator in Washington, DC. This is a hybrid position and requires the candidate to be onsite 3 days a week and possess an active Top Secret or DHS clearance.

Essential Duties of the Data Engineer:
  • Design, develop, and optimize data pipelines and architectures that support data-driven decision-making across AI and ML initiatives
  • Collaborate with data scientists, analysts, and other stakeholders to ensure data availability, integrity, and quality
  • Implement ETL (Extract, Transform, Load) processes to integrate data from various sources into centralized systems
  • Design, develop, and maintain data models to support advanced analytics initiatives, AI/ML, Generative AI, and predictive analytics
  • Work with relational databases (e.g., Oracle, PostgreSQL, MySQL, Redshift) to support data integrity and consistency

Qualifications of the Data Engineer:
  • Bachelor's Degree in Mathematics, Computer Science, Information Systems, or a related discipline
  • 6+ years of progressive experience in data science, advanced analytics, data visualization, and reporting, with demonstrated ownership of analytical solutions from concept through delivery and operationalization
  • Proven ability to lead the design, development, and deployment of data-driven solutions, including AI/ML models, predictive analytics, and business intelligence products, in production environments
  • Advanced proficiency in Python for data manipulation, automation, and development of scalable analytical workflows
  • Strong expertise in SQL and relational databases (e.g., Oracle, PostgreSQL, MySQL), with the ability to design efficient data models and support complex data integration needs
  • Extensive experience developing automated data pipelines and analytics workflows using Python, R, SQL, and related tools, with a focus on scalability, reliability, and maintainability (e.g., Pandas, R Shiny)

Compensation for the Data Engineer:
  • $130,000 - $155,000 (based on experience)
  • Comprehensive benefits package: Cigna Medical, Cigna Dental, Vision, 401k w/ ADP, PTO, paid holidays, and sick leave as required by law

This job opens for applications on 4/6/26. Applications will be accepted for at least 30 days from the posting date.

#LI-HYBRID

#LI-BM2

About Us
Zachary Piper Solutions is a National Security focused technology services and consulting firm with a top-secret facility clearance. We support mission-critical initiatives on behalf of the Intelligence Community, Department of Defense, Department of Homeland Security, Department of Justice, Department of State, and a variety of Civilian Agencies. ZPS is dedicated to helping protect government networks against cyber threats and to maximizing the wide spectrum of intelligence and security-related technologies. Our dedicated support and proven experience drive results in support of our clients' mission objectives.

Job Category
IT - Database