Data Platform - Migration Engineer

Remote, USA | Full-time | Posted Recently

Job Description

ROLE SUMMARY

We are looking for a Senior Data Platform / Migration Engineer to lead the modernization of an enterprise data ecosystem, including migration from Cloudera DataIQ DSS to MapR. This role requires deep expertise in large-scale distributed data systems, migration strategy, and performance optimization, with a strong focus on zero data loss, minimal downtime, and production stability.

KEY RESPONSIBILITIES

• Lead end-to-end migration of enterprise data lake from Cloudera (DataIQ, DSS, CDP) to MapR
• Define and execute migration strategy ensuring data integrity, minimal downtime, and rollback readiness
• Design and build scalable, production-grade data pipelines post-migration
• Optimize cluster performance including compute, storage, and resource utilization
• Partner with BI/reporting teams to ensure schema consistency and data availability
• Implement data validation frameworks to ensure accuracy and completeness post-migration
• Document architecture, runbooks, lineage, and operational procedures
• Collaborate with governance teams on data quality, lineage, and compliance requirements

REQUIRED SKILLS AND EXPERIENCE

• 8+ years in Data Engineering / Data Platform Engineering
• Strong hands-on experience with Cloudera (CDP, DSS, DataIQ) and/or MapR
• Strong hands-on experience with Apache Spark, Hive, Hadoop, HDFS
• Proven experience executing large-scale data lake migrations
• Strong programming skills in Python, Scala, or SQL
• Deep understanding of distributed data processing and storage systems
• Experience with ETL/ELT frameworks (Informatica, Talend, dbt, or similar)

PREFERRED QUALIFICATIONS

• Prior MapR implementation or certification
• Experience with streaming platforms (Kafka, Pulsar)
• Exposure to cloud-native data platforms (AWS S3, Azure Data Lake, Google Cloud Platform)
• Familiarity with data governance, lineage, and catalog tools
• Experience working in high-scale enterprise environments (multi-terabyte/petabyte)

CORE TECHNOLOGY STACK

Cloudera DSS / DataIQ / CDP, MapR, Apache Spark, Hive, Hadoop, HDFS, Kafka, Python, SQL, dbt, Informatica / Talend
