Job Description
Data Pipeline Development
Develop, maintain, and optimize data pipelines and ETL processes to ensure
efficient data flow and integration.
Work with structured and unstructured data from various sources.
Data Management
Ensure data integrity, accuracy, and consistency across all data systems.
Implement and monitor data quality checks to maintain a consistently high
standard.
Collaboration
Work closely with data scientists, analysts, and other stakeholders to understand
data requirements and deliver solutions.
Assist in integrating data from different sources into a unified data warehouse.
Performance Optimization
Optimize data processing workflows and database performance.
Identify and resolve data-related issues and bottlenecks.
Documentation
Create and maintain documentation for data pipelines, processes, and data models.
Ensure all data engineering practices comply with company policies and industry
standards.
Job Requirements
Bachelor's degree in Computer Science, Engineering, or a related field.
2-4 years of experience in data engineering or a related role.
Proficiency in programming languages such as Python or Java.
Strong experience with SQL and relational databases (e.g., MySQL, PostgreSQL, SQL
Server).
Familiarity with big data technologies (e.g., Hadoop, Spark) and data processing
frameworks.
Experience with ETL tools (e.g., Talend, Apache NiFi) and data integration
processes.
Knowledge of cloud platforms (e.g., AWS, Azure, Google Cloud) and their data
services.
Strong problem-solving skills and attention to detail.
Excellent communication skills, both written and verbal.
Preferred Skills
Experience with NoSQL databases (e.g., MongoDB, Cassandra).
Knowledge of data warehousing concepts and tools (e.g., Snowflake, Redshift,
BigQuery).
Familiarity with containerization and orchestration tools (e.g., Docker,
Kubernetes).
Understanding of machine learning and data analytics concepts.
Experience with version control systems (e.g., Git) and CI/CD pipelines.
Certification in cloud data services or big data technologies.