Senior Big Data Engineer
BBI
Date: 5 hours ago
City: Riyadh
Contract type: Full time
Responsibilities:
- Design, implement, and optimize data pipelines for batch and real-time data processing using Cloudera (Hadoop, Hive, Spark, Impala) and Informatica (PowerCenter, Cloud Data Integration).
- Build data extraction, transformation, and loading (ETL) workflows using Informatica PowerCenter for large-scale data integration from source systems (e.g., relational databases, flat files, APIs) into Cloudera Data Lake or data warehouse environments.
- Implement Spark jobs on Cloudera for distributed data processing and optimization of data workflows.
- Leverage Informatica for orchestrating ETL workflows, including data extraction, cleansing, transformation, and loading into data repositories (HDFS, Hive, SQL databases, etc.).
- Optimize the Informatica workflows to minimize runtime, ensure smooth data integration, and maintain high data quality.
- Utilize Hadoop and Spark on Cloudera to process large datasets and implement data transformations using MapReduce, Spark SQL, and PySpark.
- Leverage Impala for low-latency SQL queries on Hadoop, ensuring real-time access to processed data.
- Implement partitioning, bucketing, and indexing strategies in Hive and HBase to improve query performance on large datasets.
- Implement and enforce data quality rules within Informatica workflows, ensuring that all transformations meet the required standards for completeness, consistency, and accuracy.
- Ensure compliance with data governance and security protocols (e.g., encryption, masking, access control) in accordance with industry best practices.
- Automate and schedule ETL workflows using Informatica Server, integrating with Airflow, NiFi, or other workflow orchestration tools for scheduling and monitoring jobs.
- Utilize Cloudera Navigator for monitoring and auditing data processes within the Hadoop ecosystem.
- Perform regular tuning of the ETL pipelines, data flows, and SQL queries to ensure optimal performance.
Requirements:
- Bachelor’s degree in Computer Science, Engineering, or related field.
- 6+ years of experience in the same field.
- Proven experience with the Cloudera Distribution of Hadoop (CDH), including expertise in HDFS, Hive, Impala, Spark, and HBase.
- Strong hands-on experience with Informatica PowerCenter (ETL), EDC, IDQ, B2B, and Axon.
- Deep understanding of ETL best practices, data pipelines, and distributed computing technologies such as Spark, MapReduce, PySpark, and Hadoop ecosystem components.
- Advanced SQL skills for data manipulation, aggregation, optimization, and reporting across relational and non-relational data stores (e.g., SQL Server, MySQL, PostgreSQL, Hive, Impala).
- Experience in Python and SQL.
- Strong background in data warehousing principles and data modeling, including dimensional modeling (star schema, snowflake schema) and OLAP/OLTP considerations.