Software Development Engineer, Central Reliability & Response Engineering
DESCRIPTION
Are you ready to be the guardian of Amazon's global digital infrastructure?
CRRE, within Amazon's Engine organization, is where engineering meets impact at massive scale.
We're the team that ensures millions of customers can shop, stream, and connect without missing a beat - 24/7, across the globe.
About the Role:
Imagine building systems that protect Amazon's services across 20+ global marketplaces - that's the scale we operate at.
You'll join one of our elite teams that sits at the intersection of innovation and reliability, where your code will serve as the backbone of Amazon's operational excellence.
Whether you're preventing service disruptions before they happen or enabling lightning-fast incident response, your solutions will directly safeguard the shopping experience for hundreds of millions of customers worldwide.
We're looking for exceptional Software Development Engineers to join our dynamic team in Dublin, Ireland - a tech hub that's home to some of Amazon's most critical reliability and resilience engineering initiatives.
If you get excited about building real-time systems that process petabytes of data daily and want to work where your code impacts millions of customers globally, we want to talk to you.
Based in our state-of-the-art Dublin office, you'll collaborate with builders across Amazon to ensure seamless customer experiences across our vast network of backend and frontend services.
This isn't just any software engineering role - it's an opportunity to solve complex problems at a scale few engineers ever experience, where every line of code you write has the potential to impact global commerce in real-time.
Key job responsibilities
- Design and implement large-scale systems processing petabytes of data daily
- Build and maintain high-quality, thoroughly tested software solutions
- Create tools and mechanisms that help service teams identify and prevent availability risks
- Develop real-time monitoring and analysis capabilities
- Collaborate with teams across Amazon to improve service resilience
- Participate in on-call rotations to support business-critical systems

A day in the life
You'll work in an agile environment, designing and implementing solutions that operate at Amazon scale.
This could involve:
- Building real-time data processing systems that analyze service health
- Developing mechanisms to surface and prevent reliability risks
- Creating actionable insights that help teams deploy changes safely
- Collaborating with service teams to implement resilience best practices
- Contributing to systems that process and analyze logs from thousands of services

About the team
You could join one of two specialized teams within Central Reliability & Response Engineering (CRRE):
Operational Intelligence (OI) Team:
- Owns Real-Time Log Analysis (RTLA), a critical platform used by thousands of internal customers
- Helps teams monitor and categorize service errors in real-time
- Enables root cause analysis within minutes of issues occurring
- Processes and analyzes massive amounts of log data daily

Resilience Insights and Safety Engineering (RISE) Team:
- Creates tools to help services maintain availability under any conditions
- Develops frameworks for assessing and improving service resilience
- Builds systems to ensure safe deployment of code and configuration changes
- Provides actionable insights for improving service reliability

BASIC QUALIFICATIONS
- Bachelor's degree or equivalent
- Experience programming with at least one modern language such as Java, C++, or C# including object-oriented design
- Experience contributing to the architecture and design (architecture, design patterns, reliability and scaling) of new and current systems

PREFERRED QUALIFICATIONS
- Experience with full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Experience building complex software systems that have been successfully delivered to customers
- Experience using or building tools in the Observability space, such as log analysis, tracing, or monitoring

Amazon is an equal opportunities employer.
We believe passionately that employing a diverse workforce is central to our success.
We make recruiting decisions based on your experience and skills.
We value your passion to discover, invent, simplify and build.
Protecting your privacy and the security of your data is a longstanding top priority for Amazon.
Please consult our Privacy Notice to learn more about how we collect, use, and transfer the personal data of our candidates.
Amazon is committed to a diverse and inclusive workplace.
Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers.
If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information.
If the country/region you're applying in isn't listed, please contact your Recruiting Partner.