The Role
Job Description:
We are looking for a skilled Data Engineer to join our team. The ideal candidate will have strong experience in designing, building, and maintaining scalable data pipelines and architectures. You will play a critical role in managing data workflows, ensuring data integrity, and optimizing data processing.
Responsibilities:
* Data Pipeline Development: Design, build, and maintain scalable and efficient data pipelines to process and transform large datasets.
* ETL & Data Integration: Develop and optimize ETL (Extract, Transform, Load) workflows for structured and unstructured data sources.
* Big Data Processing: Work with PySpark and Pandas to handle large-scale data processing tasks (a brief illustrative sketch follows this list).
* Database Management: Design, implement, and manage relational (SQL) and non-relational databases for data storage and retrieval.
* Cloud Technologies: Leverage cloud platforms such as AWS, GCP, or Azure to deploy and manage data infrastructure.
* Collaboration: Work closely with data scientists, analysts, and software engineers to support analytical and machine learning projects.
* Data Quality & Performance Optimization: Ensure data accuracy, consistency, and security while optimizing performance.
* Monitoring & Troubleshooting: Identify and resolve data pipeline performance bottlenecks and failures.
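To give a flavor of the pipeline work described above, here is a minimal sketch of a batch ETL job in PySpark. It is purely illustrative and not an excerpt from our stack: the input and output paths, the events dataset, and all column names are hypothetical placeholders.

    # Minimal illustrative batch ETL sketch (paths and columns are placeholders).
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("daily_events_etl").getOrCreate()

    # Extract: read raw event data (placeholder location).
    raw = spark.read.parquet("s3://example-bucket/raw/events/")

    # Transform: clean, deduplicate, and aggregate to daily counts.
    events = (
        raw
        .filter(F.col("event_type").isNotNull())
        .dropDuplicates(["event_id"])
        .withColumn("event_date", F.to_date("event_timestamp"))
    )
    daily_counts = events.groupBy("event_date", "event_type").agg(
        F.countDistinct("user_id").alias("unique_users"),
        F.count("*").alias("event_count"),
    )

    # Load: write results partitioned by date for downstream analytics.
    daily_counts.write.mode("overwrite").partitionBy("event_date").parquet(
        "s3://example-bucket/curated/daily_event_counts/"
    )

    spark.stop()

Partitioning the output by date, as in this sketch, is a common design choice that keeps downstream queries and backfills cheap.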
Ideal Profile
Required Work Experience:
* 2+ years of experience in data engineering or a related field.
* Proven experience developing ETL pipelines and data processing workflows.
* Hands-on experience with PySpark, Pandas, and SQL.
* Experience working with big data technologies such as Apache Spark, Hadoop, or Kafka (preferred).
* Familiarity with cloud data solutions (AWS, GCP, or Azure).
Required Skills:
* Programming: Strong proficiency in Python (PySpark, Pandas) or Scala.
* Data Modeling & Storage: Experience with relational databases (PostgreSQL, MySQL, SQL Server) and NoSQL databases (MongoDB, Cassandra).
* Big Data & Distributed Computing: Knowledge of Apache Spark, Hadoop, or Kafka.
* ETL & Data Integration: Ability to develop efficient ETL processes and manage data pipelines (see the sketch after this list).
* Cloud Computing: Experience with AWS (S3, Redshift, Glue), GCP (BigQuery), or Azure (Data Factory, Synapse).
* Data Warehousing: Understanding of data warehousing concepts and best practices.
* Problem-Solving: Strong analytical skills to troubleshoot and optimize data pipelines.
* Communication: Proficiency in spoken English to collaborate effectively with US-based teams.
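As a small, hypothetical illustration of the ETL and data-quality work these skills support (again, not an excerpt from our codebase), the sketch below validates a Pandas extract before it is loaded downstream; the CSV file and column names are placeholders.

    # Hypothetical data-quality check on an extracted Pandas DataFrame
    # (file path and column names are illustrative placeholders).
    import pandas as pd

    REQUIRED_COLUMNS = {"order_id", "customer_id", "order_total", "order_date"}

    def validate_orders(df: pd.DataFrame) -> pd.DataFrame:
        """Raise on structural problems; return a cleaned frame otherwise."""
        missing = REQUIRED_COLUMNS - set(df.columns)
        if missing:
            raise ValueError(f"Missing required columns: {sorted(missing)}")

        # Drop exact duplicates and rows without a primary key.
        cleaned = df.drop_duplicates().dropna(subset=["order_id"])

        # Basic consistency checks before loading downstream.
        if (cleaned["order_total"] < 0).any():
            raise ValueError("Negative order totals found")

        cleaned["order_date"] = pd.to_datetime(cleaned["order_date"], errors="raise")
        return cleaned

    orders = pd.read_csv("orders_extract.csv")  # placeholder extract
    clean_orders = validate_orders(orders)
    print(f"{len(clean_orders)} valid rows out of {len(orders)} extracted")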
Education Requirements:
* Bachelor’s degree in Computer Science, Data Engineering, Information Technology, or a related field (preferred).
* Equivalent work experience in data engineering will also be considered.
What's on Offer?
* Work with a company that has a solid track record of success.
* Attractive salary & benefits.