The Parallel Computing Service (PCS) team at AWS is seeking a Software Development Engineer to join the core Slurm team.
The role involves building and shipping services that focus on advancing PCS capabilities to run and scale high-performance computing (HPC) workloads using the open-source Slurm scheduler.
Key Responsibilities:
* Architect, develop, and maintain core functionality to manage high-performance computing clusters.
* Develop tools to streamline deployment, monitoring, and maintenance processes for the services owned by the team.
* Functionally decompose complex problems into simple, straightforward solutions.
* Limit the use of short-term workarounds. Build solutions with the proper level of complexity the first time, or at least minimize incidental complexity.
* Apply a broad range of design approaches, knowing when each is appropriate (and when it is not), to produce pragmatic solutions.
* Collaborate with the Slurm maintainers and open-source community to drive improvements and ensure alignment with industry best practices.
* Provide mentorship and share knowledge within the team to foster a collaborative, learning-oriented environment.
BASIC QUALIFICATIONS
* Proven software development experience in at least one programming language, with a focus on distributed systems.
* Non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems.
* Solid knowledge of Linux fundamentals.
* Experience with cloud-native technologies.
PREFERRED QUALIFICATIONS
* Bachelor's degree in computer science or equivalent.
* Several years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
* Experience programming in Java.
* Experience scripting in Python.
* Experience with Slurm or other HPC schedulers (LSF, PBS, GridEngine, etc.) and/or other HPC technologies.
* Experience mentoring junior software development engineers and driving engineering excellence.