Job Summary
This role involves designing, developing, and maintaining data infrastructure and data pipelines so that data is efficiently collected, processed, stored, and made accessible for analysis and decision-making across the organization. The Data Engineering Lead is instrumental in supporting data-driven initiatives and in shaping the organization's data landscape. The ability to blend technical expertise with a strong understanding of business needs is essential.
Job Description
- Design, develop, and maintain data pipelines for extracting, transforming, and loading (ETL) data from various sources into data storage systems.
- Create efficient and scalable ETL processes to ensure the smooth flow of data and its availability for analysis.
- Integrate data from multiple sources, such as databases, APIs, third-party services, and logs, into a unified data ecosystem.
- Implement data quality checks and validation processes to ensure data accuracy and consistency.
- Develop and maintain data models, schemas, and database structures to optimize data storage and retrieval.
- Design and maintain data warehouses, data lakes, or other storage solutions for data analytics.
- Utilize big data technologies such as Hadoop, Spark, Kafka, and NoSQL databases for processing and analysing large volumes of data.
- Stay up to date with emerging data technologies and evaluate their applicability to the organization's data infrastructure.
- Create and maintain data documentation, data dictionaries, and data lineage to support data governance and audit requirements.
- Implement data governance best practices and adhere to data security and compliance standards.
- Optimize database performance, including query tuning, indexing, and data partitioning, to enhance system performance and reduce latency.
- Implement data compression, archiving, and data retention policies to manage storage costs.
- Implement data security measures, encryption, and access controls to protect sensitive data and ensure compliance with data protection regulations.
- Collaborate with data scientists, analysts, and other stakeholders to understand their data needs and provide the necessary infrastructure for data-driven decision-making.
- Work with cross-functional teams to define data requirements for new projects and initiatives.
- Set up monitoring and alerting systems to proactively identify and resolve data pipeline issues, data quality problems, or performance bottlenecks.
- Conduct root cause analysis and troubleshooting of data-related problems.
- Ensure that data infrastructure is scalable to accommodate growing data volumes and evolving business requirements.
- Continuously optimize data pipelines and infrastructure for improved performance and cost-efficiency.
- Implement automation for deployment, data pipeline scheduling, and routine maintenance tasks to increase operational efficiency.
- Keep the team and relevant stakeholders informed about data engineering best practices and new technologies.
- Provide training and knowledge sharing sessions to help others understand and utilize data systems effectively.
Qualifications
Bachelor's Degree in Computer and Information Science; experience in a similar environment, ideally at executive management level.