Principal Systems Development Engineer, Managed Operations, Dublin

Client: Amazon Development Centre Ireland Limited - D94
Location: Dublin, Ireland
Job Category: Other
EU work permit required: Yes
Job Reference: 91a0110ac2b3
Job Views: 102
Posted: 21.01.2025
Expiry Date: 07.03.2025

Job Description:

Do you enjoy balancing being hands-on and leading by example with helping shape strategic direction? Do the challenges that come with driving technical, business, and cultural change to improve the reliability, performance, and efficiency of one of the largest cloud providers excite you?

The Amazon Managed Operations (MO) organization was founded in April 2023, with the objective to reduce operational load and toil through long-term engineering projects. MO is building the best-in-class engineering and operations team that will own the day-to-day operations for Amazon Regions, improving the availability, reliability, latency, performance, and efficiency of operating those Regions.

Amazon is looking for a highly motivated Principal Systems Development Engineer to drive technical operational efficiency across Amazon. This role will tackle intrinsically hard problems, venturing beyond comfortable approaches when necessary. You will learn, educate, and advocate, acquiring expertise as needed, pioneer new spaces, and inspire others as to what’s possible. This role is internally focused and highly visible. It demands continuous learning and collaboration across departments within Amazon, and it will significantly impact the quality of life for both current and future customers and builders who directly or indirectly depend on Amazon's European Sovereign Cloud.

A day in the life

You’ll balance your time between operating production systems and making long-term improvements to the reliability, availability, and performance of those software systems.
An example week could look like: On Monday, you provide meaningful feedback on the most critical upcoming change whilst guiding the most senior technical talent in your organization to make more decisions without you. On Tuesday, you identify a major reliability risk in the interplay between systems in your care and design a cohesive solution. On Wednesday, you lead the design review with the relevant technical leaders, reaching consensus on a path forward. On Thursday, you influence your senior management to take goals and make investments to achieve that outcome. On Friday, you begin developing the part of that system that will have the most impact on the reliability of the overall system.

Basic Qualifications

- 10+ years of experience in software development or a related field
- Experience operating and troubleshooting reliable, scalable software systems
- Proficient in at least one modern programming language such as Java, TypeScript, Python, or Ruby
- Able to troubleshoot at all levels, from network to operating systems to software applications
- Proficient communicator across languages, cultures, and time zones
- Able to periodically travel to meet with internal engineering teams, leaders, and customers

Preferred Qualifications

- Highly proficient in operating 24x7 high-availability, distributed software applications
- Desire to dive deep into, and find opportunities to improve, the reliability, availability, and performance of distributed software systems
- Experience influencing and leading strategic efforts requiring work from multiple teams
- Experience actively mentoring individual engineers and managers
- Experience performance tuning software applications and optimizing fleet utilization
- Strong understanding of network fundamentals (DNS, DHCP, TCP/IP, routing, load balancing, load shedding)
- Proficient with Infrastructure as Code (such as CDK, CloudFormation, Puppet, Chef, Ansible, or similar)
- Proficient with operating services in AWS
- Experience with monitoring frameworks (such as CloudWatch, Datadog, Grafana, Elastic, or similar)
- Experience scripting operating system tasks in Bash, Python, etc.