Description
Career Area:
Technology, Digital and Data
Job Description:
Your Work Shapes the World at Caterpillar Inc.
When you join Caterpillar, you're joining a global team who cares not just about the work we do – but also about each other. We are the makers, problem solvers, and future world builders who are creating stronger, more sustainable communities. We don't just talk about progress and innovation here – we make it happen, with our customers, where we work and live. Together, we are building a better world, so we can all enjoy living in it.
We are seeking an AWS Data Engineer to design, build, and maintain robust data pipelines and connectors across AWS and Coveo. This role has a strong focus on data quality, reliability, and validation, ensuring that data ingested into Coveo and downstream systems is accurate, complete, and production‑ready.
The engineer will work closely with search, data, and platform teams to support enterprise‑scale ingestion pipelines, improve data quality frameworks, and enable high‑quality search and content experiences.
Key Responsibilities
Design, develop, and maintain scalable data pipelines on AWS (batch and near‑real‑time).
Own data quality validation across pipelines, including:
Data completeness, freshness, consistency, and accuracy checks
Schema validation and anomaly detection
Implement automated data quality checks, alerts, and monitoring for pipeline failures or data issues.
Collaborate with search and platform teams to ensure high‑quality indexed data for Coveo sources.
Contribute to CI/CD pipelines and follow enterprise SDLC and security standards.
Produce clear documentation for pipelines, data contracts, and quality rules.
Perform in-depth analysis of search-indexed datasets to identify data quality issues and improve relevance, accuracy, and overall search performance.
Required Qualifications
Strong experience as a Data Engineer working on AWS.
Hands‑on experience building data pipelines using services such as:
AWS Glue, Bedrock, Lambda, Step Functions, S3, SNS/SQS (or similar)
Proficiency in Python (and/or PySpark) for data engineering and validation logic.
Strong understanding of data quality concepts, frameworks, and best practices.
Experience integrating data from multiple sources (APIs, databases, files).
Solid knowledge of ETL/ELT patterns and data modeling fundamentals.
Experience with logging, monitoring, and alerting for data pipelines.
Familiarity with CI/CD practices and version control (Git).
Preferred / Nice‑to‑Have Skills
Experience working with Coveo (sources, connectors, indexing, ingestion pipelines).
Prior experience supporting search platforms or content indexing pipelines.
Exposure to data quality frameworks or rule‑based validation approaches.
Knowledge of Snowflake.
Experience working in large enterprise or multi‑team environments.
Soft Skills
Strong problem‑solving and debugging skills.
Ability to work cross‑functionally with data, search, and platform teams.
Clear communication of data issues, risks, and remediation plans.
High ownership mindset with attention to detail, especially around data correctness.
Posting Dates:
June 4, 2026 - June 17, 2026
Caterpillar is an Equal Opportunity Employer. Qualified applicants of any age are encouraged to apply.

