Job Description
The Network Alerts team in AWS is looking for System Development Engineers to help build systems that monitor the AWS network, one of the world's largest and most complex networks.
Tens of millions of customers rely on this network to use our retail websites, access content on their Kindles, and build applications and businesses on top of Amazon Web Services (AWS).
Our success depends on our world-class network infrastructure, and the Network Alerts team keeps that network reliable and fault-tolerant by diagnosing impairments.
What You'll Be Working On:
* Simplifying and optimizing systems, processes, and tools to make things better for our customers
About the Team
AWS Network Alerts owns the design, planning, delivery, and operation of the core monitoring and detection engines for AWS Network.
We work on some of the most challenging problems, involving huge amounts of data and thousands of variables that affect the AWS network, and we're looking for talented people who want to join us on this journey.
You'll join a diverse team of software, system, and data engineers, scientists, technical program managers, software managers, and other vital roles.
Key Job Responsibilities
* Maintaining your team's services, troubleshooting, and identifying root causes of any issues that arise within its systems and subcomponents
* Using testing, monitoring, and validation on your services, tools, and infrastructure to ensure continuous deployment with minimal interruption
* Identifying areas to optimize and refine, and developing automation and tools to reduce manual operations and meet business and customer requirements
* Proposing infrastructure architecture improvements and developing the capabilities to deliver them
Requirements
* Knowledge of computer and system engineering fundamentals (computer architecture, networking, storage, operating systems)
* 5+ years of professional software development experience
* Experience designing or architecting new and existing systems
* Hands-on experience with ops processes, systems engineering, tool development, and maintaining large-scale systems on bare-metal servers
* Experience programming with at least one modern language such as Python, Java, Golang, or Rust
Preferred Qualifications
* Strong experience building software following the software development life cycle (SDLC)
* Strong experience with infrastructure design, infrastructure as code, writing and managing pipelines, and automation
* Strong experience with Python, Java, or Golang
* Experience managing distributed stateful software on a fleet of bare-metal servers, with a solid understanding of networking and SLAs
* Experience working on a team that follows Scrum or Kanban