Posted today
Public Trust
Early Career (2+ yrs experience)
Unspecified
IT - Data Science
Remote/Hybrid•Atlanta, GA (Off-Site/Hybrid)
CODE Plus, Inc., an experienced IT government contractor in Fairfax, VA, with offices in Huntsville, AL, and Oak Ridge, TN, has been in business for 30 years, serving agencies across the Federal sector. Our mission is to deliver high-quality, cost-effective solutions that empower our clients to achieve their goals. At CODEplus, we value teamwork, integrity, and technical excellence, and we pride ourselves on maintaining long-standing partnerships built on trust and results.
Our CODEplus team is looking for a Cloudera Data Engineer.
Job Requirements:
In this role, you’ll work with a variety of technologies across the Hadoop and Cloudera ecosystems to move, transform, and optimize data from multiple sources. You’ll partner with data engineers, analysts, and developers to make sure the data infrastructure is efficient, scalable, and secure—enabling smarter, faster decisions for our clients.
You’ll spend your days designing and deploying data solutions, improving data processing performance, and supporting the overall health of our data environment. This position is ideal for someone who enjoys hands-on technical work, problem-solving, and continuous learning in a collaborative, team-oriented setting.
What You’ll Do
• Design and build data pipelines to extract, transform, and load (ETL) large data sets from multiple sources into the Cloudera environment.
• Manage and optimize data infrastructure for high performance, reliability, and scalability across both on-premises and cloud environments.
• Develop and maintain scripts and workflows using Python, Java, Scala, or Pig to automate data processing and integration tasks.
• Collaborate with cross-functional teams to understand data requirements, develop solutions, and ensure data is accurate and available for analytics and reporting.
• Monitor and troubleshoot Cloudera environments, leveraging tools such as Cloudera Manager and Hue for system health, tuning, and debugging.
• Use generative AI tools (like GitHub Copilot, Codex, or Claude Code) to enhance development efficiency and code quality.
Basic Qualifications
• 3+ years of experience using Hadoop technologies (including Spark) to ingest, transform, and process data
• Experience with Cloudera installation, configuration, tuning, and administration
• Experience leveraging generative AI coding tools to assist in development
• Experience working in SQL (Hive, Spark SQL, or Impala) for querying and managing data within the Cloudera ecosystem
• Knowledge of Python, Java, Scala, or Pig for data engineering and automation
• Strong collaboration and communication skills, with the ability to work effectively in a cross-functional team environment
• Ability to obtain and maintain a Public Trust or Suitability/Fitness determination based on client requirements
• Bachelor’s degree
Nice to Have
• Experience managing and monitoring Cloudera data platforms using Cloudera Manager or Hue
• Experience managing infrastructure deployments in both on-premises and cloud environments
• Experience with public cloud platforms such as AWS or Microsoft Azure
• Ability to optimize queries and resource usage for better performance and efficiency