Senior Software Engineer, Incident Response & Observability
Client:
Squarespace
Location:
Dublin, Ireland
Job Category:
Engineering
EU work permit required:
Yes
Job Reference:
65da1f98474e
Posted:
03.03.2025
Expiry Date:
17.04.2025
Job Description:
The Squarespace Incident Response & Observability team is looking for a Senior Software Engineer to lead the automation & experimentation efforts for detection, monitoring, and mitigation across Squarespace-powered systems, to protect our customers from product and service degradations, incidents, and outages. You will empower our engineering staff with a self-service Observability toolkit that gives them insight into our tech ecosystem and equips them to detect and triage issues.
Our team mission is to overhaul a Business Continuity program to standardize processes & workflows, mitigate risks, collect data insights from incident trends affecting business-critical uptime metrics, prepare and communicate Incident reports for a broad audience, and improve Incident Service Level Agreement metrics to meet our contractual commitments.
You will promote an accountability model for performance, availability & uptime indicators that help increase resiliency to Incident Response.
This is an opportunity to build real-time dashboards outlining all business-critical uptime system event signals (e.g. SLAs, SLIs & SLOs), powering the mission to unlock 1 million Monthly Active Sellers.
You will work with a diverse group, including Product Engineering, Infrastructure, Platform Engineering, Product Specialists, Customer Operations, Security, Legal, Enterprise, Data Science, Product Analytics, UX, and organizational leaders.
As a Senior Software Engineer, you are empowered to construct the foundational layer, including the design, implementation, and maintenance of systems & tools to guide and improve Incident Response & Observability at Squarespace.
This is a hybrid role working from our Dublin office 3 days per week. You will report to the Engineering Director.
You'll Get To…
* Develop incident alerts & observability automation, conduct analysis, create health metrics, lead investigations, and provide advisory support.
* Automate processes such as system & network log analysis to re-assemble and replay incident event history for root cause analysis & impact costs.
* Design and conduct tabletop exercises to assure organizational readiness in disaster recovery and business continuity programs.
* Establish processes, build a catalog of playbook documents, and implement a strategy for operational responses to incidents.
* Manage and contribute to efforts to build the next-generation Metrics Platform in the Cloud.
* Build/refine our Observability tools that support hundreds of engineers every day.
* Refine the Incident Commander processes and Incident Management training.
Who We're Looking For
* BS in Computer Science or Engineering, or equivalent professional experience.
* 8+ years of demonstrated experience as an engineer.
* Proficiency in at least one general-purpose programming or scripting language (e.g. Golang).
* In-depth technical understanding to assess incident risks & significance across a broader tech ecosystem.
* Willingness to participate in a regular on-call rotation.
Benefits:
* Health insurance with 100% covered premiums for you and your dependent children.
* Fertility and adoption benefits.
* Headspace mindfulness app subscription.
* Retirement benefits with employer match.
* Flexible paid time off.
* Up to 20 weeks of paid family leave.
* Equity plan for all employees.
* Commuter benefit in the form of reduced tax.
* Employee donation match to community organizations.
* 6 Global Employee Resource Groups (ERGs).
* Free lunch and snacks.
* Close proximity to cultural landmarks such as Dublin Castle and St. Patrick's Cathedral.