Senior Software Engineer, Incident Response & Observability, Dublin
Client: Squarespace
Location: Dublin, Ireland
Job Category: Other
EU work permit required: Yes
Job Reference: 65da1f98474e
Job Views: 155
Job Description: The Squarespace Incident Response & Observability team is looking for a Senior Software Engineer to lead automation and experimentation efforts for detection, monitoring, and mitigation across Squarespace-powered systems. The goal is to protect our customers from product and service degradations, incidents, and outages, and to give our engineering staff a self-service Observability toolkit for gaining insight into our tech ecosystem, so they are equipped to detect and triage issues.
Our team's mission is to overhaul the Business Continuity program: standardize processes & workflows; mitigate risks; collect data insights from incident trends affecting business-critical uptime metrics; prepare and communicate Incident reports for a broad audience, from individual contributors to C-suite executives; measure Incident frequency & volume, downtime costs, and security threats; and improve Incident Service Level Agreement metrics to meet our contractual commitments.
You will promote an accountability model for performance, availability & uptime indicators that helps make Incident Response more resilient.
This is an opportunity to build real-time dashboards outlining all business-critical uptime system event signals (i.e., SLAs, SLIs & SLOs), powering the mission to unlock 1 million Monthly Active Sellers.
You will work with a diverse group, including Product Engineering, Infrastructure, Platform Engineering, Product Specialists, Customer Operations, Security, Legal, Enterprise, Data Science, Product Analytics, UX, and organizational leaders.
As a Senior Software Engineer, you are empowered to construct the foundational layer, including the design, implementation, and maintenance of systems & tools to guide and improve Incident Response & Observability at Squarespace.
This is a hybrid role working from our Dublin office 3 days per week.
You will report to the Engineering Director.
You'll Get To…
Develop incident alerts & observability automation, conduct analysis, create health metrics, lead investigations, and provide advisory support.
Automate processes such as system & network log analysis to re-assemble and replay incident event history for root cause analysis & impact costs.
Design and conduct tabletop exercises to ensure organizational readiness for the disaster recovery and business continuity program.
Establish processes, build a playbook catalog, and implement a strategy for operational responses to incidents, to protect our customers and Squarespace.
Manage and contribute to efforts to build the next-generation Metrics Platform in the Cloud.
Build / refine our Observability tools that support hundreds of engineers every day.
Refine the Incident Commander processes and Incident Management training.
Who We're Looking For
BS in Computer Science or Engineering, or equivalent professional experience.
8+ years of demonstrated experience as an engineer.
Proficiency in at least one general-purpose programming or scripting language (e.g., Golang).
In-depth technical understanding to assess incident risks & significance across the broader tech ecosystem.
Regular on-call rotation expectations.
Benefits:
Health insurance with 100% covered premiums for you and your dependent children.
Fertility and adoption benefits.
Headspace mindfulness app subscription.
Retirement benefits with employer match.
Flexible paid time off.
Up to 20 weeks of paid family leave.
Equity plan for all employees.
Commuter benefit in the form of reduced tax.
Employee donation match to community organizations.
6 Global Employee Resource Groups (ERGs).
Free lunch and snacks.
Close proximity to cultural landmarks such as Dublin Castle and St. Patrick's Cathedral.