Posted today
Mid Level Career (5+ yrs experience)
IT - Data Science
Blu Omega is looking for a Cloudera Data Engineer. In this role, you’ll work with a variety of technologies across the Hadoop and Cloudera ecosystems to move, transform, and optimize data from multiple sources. You’ll partner with data engineers, analysts, and developers to make sure the data infrastructure is efficient, scalable, and secure, enabling smarter, faster decisions for our clients.
You’ll spend your days designing and deploying data solutions, improving data processing performance, and supporting the overall health of our data environment. This position is ideal for someone who enjoys hands-on technical work, problem-solving, and continuous learning in a collaborative, team-oriented setting.
Responsibilities:
Design and build data pipelines to extract, transform, and load (ETL) large data sets from multiple sources into the Cloudera environment.
Manage and optimize data infrastructure for high performance, reliability, and scalability across both on-premises and cloud environments.
Develop and maintain scripts and workflows using Python, Java, Scala, or Pig to automate data processing and integration tasks.
Collaborate with cross-functional teams to understand data requirements, develop solutions, and ensure data is accurate and available for analytics and reporting.
Monitor and troubleshoot Cloudera environments, leveraging tools such as Cloudera Manager and Hue for system health, tuning, and debugging.
Use generative AI tools (like GitHub Copilot, Codex, or Claude Code) to enhance development efficiency and code quality.
Qualifications
3+ years of experience using Hadoop technologies (including Spark) to ingest, transform, and process data
Experience with Cloudera installation, configuration, tuning, and administration
Experience developing and managing NiFi pipelines for data ingestion and transformation
Experience leveraging generative AI coding tools to assist in development
Experience working in SQL (Hive, Spark SQL, or Impala) for querying and managing data within the Cloudera ecosystem
Experience with public cloud platforms such as AWS or Microsoft Azure
Knowledge of Python, Java, Scala, or Bash for data engineering and automation
Strong collaboration and communication skills, with the ability to work effectively in a cross-functional team environment
Ability to obtain and maintain a Public Trust or Suitability/Fitness determination based on client requirements
Bachelor’s degree
Nice to Have
Experience with Terraform for infrastructure automation and deployment
Experience with CI/CD tools and DevOps best practices
Knowledge of data governance, metadata management, and data catalog tools
Ability to optimize queries and resource usage for better performance and efficiency