You will be joining the team that is responsible for the end-to-end health (performance and reliability) of Meta's backbone networks. You will build tools and use automation to efficiently scale how we mitigate real-time impact to the network, identify and investigate long-term trends in the performance and risks of our backbone, and drive innovative solutions to monitor and improve Meta's current and future backbone network products. Our backbones continue to expand rapidly around the globe, driven most recently by the network demands of our AGI journey. We support both our 'Classic Backbone', which transports traffic destined to people using Meta's products, and our 'Express Backbone', which handles machine-to-machine traffic between our Data Centers.
Engineers who typically thrive in this role are hybrid software and network engineers who are curious about how systems work, how they fail, and how we can increase their reliability. You will have the opportunity to dig into interesting problems in the networking and software domains, at a scale that offers new challenges on a daily basis.
Responsibilities
1. Write and review code, develop documentation and capacity plans, and debug the hardest problems, live, on some of the largest and most complex networks and systems in the world.
2. Participate in a weekly on-call rotation and be an escalation contact for service incidents.
3. Perform deep dives on complex technical issues across networks, ranging from automated tooling to hardware failures and network issues.
4. Manage and maintain multi-vendor, multi-protocol backbone and edge networks.
5. Analyze data to diagnose network issues and identify their root causes.
6. Define, develop, and optimize automated network monitoring systems to mitigate and remediate network events.
7. Proactively find gaps that impact multiple teams, develop an execution plan, and drive the project directly and by influencing other teams.
8. Contribute to team growth and development through peer mentorship.
Minimum Qualifications
1. Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
2. 4+ years of experience coding in higher-level languages (e.g., Python, C++, Go).
3. 5+ years of experience with one or more of BGP, MPLS, IS-IS, or similar routing protocols, including knowledge of typical configurations and performance tuning.
Preferred Qualifications
1. 5+ years of experience understanding and mitigating network hardware and topology failures.
2. Expert knowledge of TCP/IP and IPv6.
3. Experience operating and designing SDN-based backbone networks.
4. Experience working in a multi-vendor network environment.
5. Experience with developing distributed systems and operating them at scale.
6. Experience with automation frameworks and tools such as Ansible, Puppet, or Chef.
7. Experience in configuration and maintenance of network devices and NMS, or applications such as web servers, load balancers, relational databases, storage systems, and messaging systems.
8. Experience learning software, frameworks and APIs.
9. Experience developing and understanding network device configuration for at least one vendor (Juniper, Cisco, Arista, Brocade, etc.).
10. Knowledge of routing and switching, including hardware design and forwarding and data planes.