Principal Systems Development Engineer, Managed Operations, Dublin
Client: Amazon Development Centre Ireland Limited - D94
Location: Dublin, Ireland
Job Category: Other
EU work permit required: Yes
Job Reference: 91a0110ac2b3
Job Views: 102
Posted: 21.01.2025
Expiry Date: 07.03.2025
Job Description: Do you enjoy balancing being hands-on, leading by example, with helping shape strategic direction?
Do the challenges that come with driving technical, business, and cultural change to improve the reliability, performance, and efficiency of one of the largest cloud providers excite you?
The Amazon Managed Operations (MO) organization was founded in April 2023, with the objective of reducing operational load and toil through long-term engineering projects.
MO is building a best-in-class engineering and operations team that will own the day-to-day operations for Amazon Regions, improving the availability, reliability, latency, performance, and efficiency of those Regions.
Amazon is looking for a highly motivated Principal Systems Development Engineer to drive technical operational efficiency across Amazon.
This role will tackle intrinsically hard problems, venturing beyond comfortable approaches when necessary.
You will learn, educate, and advocate; acquire expertise as needed; pioneer new spaces; and inspire others as to what's possible.
This role is internally focused and highly visible. It demands continuous learning and collaboration across departments within Amazon, and it will significantly affect the quality of life of both current and future customers and builders who directly or indirectly depend on Amazon's European Sovereign Cloud.
A day in the life
You'll balance your time between operating production systems and making long-term improvements to the reliability, availability, and performance of those software systems.
An example week could look like this: On Monday, you provide meaningful feedback on the most critical upcoming change whilst guiding the most senior technical talent in your organization to make more decisions without you.
On Tuesday, you identify a major reliability risk in the interplay between systems in your care and design a cohesive solution.
On Wednesday, you lead the design review with the relevant technical leaders, reaching consensus on a path forward.
On Thursday, you influence your senior management to take on goals and make investments to achieve that outcome.
On Friday, you begin developing the part of that system that will have the most impact on the reliability of the overall system.
Basic Qualifications
- 10+ years of experience in software development or related field
- Experience operating and troubleshooting reliable, scalable software systems
- Proficient in at least one modern programming language such as Java, TypeScript, Python, or Ruby
- Able to troubleshoot at all levels, from network to operating systems to software applications
- Proficient communicator across languages, cultures, and time zones
- Able to periodically travel to meet with internal engineering teams, leaders, and customers

Preferred Qualifications
- Highly proficient in operating 24x7 high-availability, distributed software applications
- Desire to dive deep into, and find opportunities to improve, the reliability, availability, and performance of distributed software systems
- Experience influencing and leading strategic efforts requiring work from multiple teams
- Experience actively mentoring individual engineers and managers
- Experience performance tuning software applications and optimizing fleet utilization
- Strong understanding of network fundamentals (DNS, DHCP, TCP/IP, routing, load balancing, load shedding)
- Proficient with Infrastructure as Code (such as CDK, CloudFormation, Puppet, Chef, Ansible, or similar)
- Proficient with operating services in AWS
- Experience with monitoring frameworks (such as CloudWatch, Datadog, Grafana, Elastic, or similar)
- Experience scripting operating system tasks in Bash, Python, etc.