
MentorX
Global specialty apparel brand retailer
POSITION SUMMARY:
The Data Engineer Intern will help put data at the forefront of all business processes and decisions. You will be part of a team of strong technologists passionate about our roadmap to deliver highly available and scalable data pipelines and solutions on Google Cloud Platform (GCP), allowing teams across Digital, Stores, Marketing, Supply Chain, Finance, and Human Resources to make real-time decisions based on consistent, complete, and correct data that meets our Data Quality rules. Your contributions will include supporting improvements to our data platform; building frameworks for security, automation, and optimization; and curating data for consumption by Data Science, Data Insights, and Advanced Analytics teams across the organization, using a suite of sophisticated cloud tools and open source technologies.
RESPONSIBILITIES:
- Guided work in the Google Cloud Platform (GCP) environment, including the following:
- Develop highly available, scalable data pipelines and applications to support business decisions
- Pipeline development and orchestration using Cloud Data Fusion/CDAP, Cloud Composer/Airflow/Python, Dataflow
- BigQuery table creation and query optimization
- Cloud Functions for event-based triggering
- Cloud Monitoring and Alerting
- Pub/Sub for real-time messaging
- Cloud Data Catalog build-out
- Work in an agile environment, applying SDLC principles and Scrum methodologies and using tools such as Jira, Wiki, Bitbucket/GitHub, and Bamboo
- Guided work on software and product security, scalability, general data warehousing principles, documentation practices, and refactoring and testing techniques
- Use Terraform for infrastructure automation and provisioning
QUALIFICATIONS:
- Bachelor's or Master's degree in MIS, Computer Science, or a related discipline
- Experience or training in Java or Python
- Understanding of big data ETL pipelines and familiarity with Dataproc, Dataflow, Spark, or Hadoop
- Proficiency in ANSI SQL