
Staff Software Engineer (Databases SRE)

Grafana Labs
staff
Location

Spain

Work Type

Remote

Open to applicants in

Spain

Seniority

staff

Posted

July 3, 2026


Total Compensation
€110,500
Yearly Savings (Comfortable)
€83,628

Job Description

  • We are looking for a Staff Software Engineer - SRE to help us support our highest value Grafana Cloud customers by increasing the reliability of our Cloud databases that are based on Mimir, Loki, Tempo, and Pyroscope. We provide these databases as a SaaS product from AWS, GCP, and Azure across all regions
  • The SRE team is embedded within the Mimir, Loki, and Tempo squads and focuses on ensuring that Grafana Cloud’s database products deliver exceptional reliability for our highest-SLA customers. In this role, you will:
  • Partner closely with product engineering squads (embedded model)
  • Own production reliability for high-SLA and complex customer environments
  • Design and implement automation to scale our reliability practices
  • Ensure we meet our SLO targets for our customers
  • Define and evolve per-tenant SLOs and reliability models
  • Proactively reduce SLO burn to prevent repeat incidents
  • Serve as a primary escalation point and on-call for relevant incidents
  • Lead customer-impacting incident response and post-incident reviews
  • Contribute to design docs and code reviews
  • Influence feature design to ensure production scalability and operability
  • Build automation to eliminate toil where needed
  • Improve alert quality and reduce noisy escalations
  • We seek a staff software engineer operating at the intersection of customer needs, production systems, and product engineering
  • Regular 1:1s with your manager and colleagues
  • Review and create SLOs, and proactively investigate ways to further reduce budget burn for those SLOs. This work can be self-directed or result from incident learnings, and may include improvements to monitoring, automation, self-healing, auto-scaling, etc
  • Improve observability of customers within their environments
  • Design and implement solutions so that the reliability and scalability of our environments keep pace with rapidly increasing demand
  • Develop fault-tolerant design patterns ensuring that we are considering reliability at all stages of the service lifecycle
  • Collaborate with our Engineering Leaders to help define and influence product strategy, roadmaps, and technical designs
  • Participate in PR reviews and collaborate with other engineers on their design docs
  • Teach others about Site Reliability Engineering and communicate best practices to be applied early in development of new features and functionality
  • Participate in Incident Response when applicable, including investigation through to resolution, PIR, and communication with customers via Bridge calls where necessary

Benefits

  • Vacation: Balance is key. Our team enjoys 30 days of paid vacation each year on top of national holidays, parental leave, and sick leave. We also take a breather on a number of Grafana Shutdown Days each year
  • Healthcare: We’re proud to provide health coverage or stipends for our colleagues in the US, UK, Canada, the Netherlands, Sweden, Singapore, and India
  • Retirement planning: There’s no time like the present to start saving for your future. We make employer contributions into the pension pots of our team members in the US, UK, Canada, the Netherlands, Sweden, and Germany
  • Professional development: On top of a $1,500 annual learning and development stipend, Grafanistas have thousands of on-demand courses at their fingertips to help them grow professionally. Want to attend a conference or training? Go ahead. Just pass on what you learned
  • Work location: The vast majority of our roles are fully remote. We focus on hiring the best talent and letting you work from the comfort of your home. If you fancy a change of scene, we’ll also reimburse you up to $175 a month for a personal co-working space
  • Choice of tech: There’s no one-size-fits-all when it comes to the tech required to do your job. Choose the laptop and accessories you need when you join us, and we’ll refresh them every three years
  • Mindfulness: When you join the team, you can sign up for a complimentary subscription to Headspace to take advantage of the benefits of mindfulness and meditation. Our wellbeing resource group also organizes sessions run by fellow Grafanistas or external trainers
  • Global Employee Assistance Program: We offer all team members a 100% confidential support service with 24/7/365 access to professionally qualified counsellors and specialists
  • Paid parental leave: Grafana offers paid parental leave to all eligible new parents. This offers Grafanistas time to bond with and care for their children in the first year after birth or adoption

You may not meet every requirement, and that’s okay. If this role excites you, we’d love you to raise your hand for what could be a truly career-defining opportunity

  • Experience with Linux operating system internals, and some knowledge of networking, cloud storage, and scaling
  • Strong Kubernetes experience in AWS, GCP, or Azure, and familiarity with infrastructure-as-code tooling (Helm, Terraform, Jsonnet, etc.)
  • Experience with one or more programming languages (e.g. Go, Python, Java)
  • We highly value those who are intellectually curious, who default to transparency, possess a high bias towards action, and who are also kind (this is important!)
  • Ability to partner deeply with product engineering teams
  • Excellent problem-solving and troubleshooting skills
  • 8+ years of engineering experience, with 4+ years in SRE/CRE/production engineering. Strong preference for those with formal customer reliability engineering experience
  • Strong experience designing and implementing SLOs
  • Ability to reason about performance, scaling, and failure modes
  • Comfortable working within an engineering team where individuals are encouraged to have a strong sense of autonomy and self-direction
  • Experience operating multi-tenant systems in production
  • Strong experience with technical leadership, leading a team through projects, mentoring other engineers on the team and serving as a force-multiplier
  • Experience participating calmly and actively in blame-free incident response, following up on actions, and writing high-quality PIRs (Post Incident Reviews, a.k.a. post-mortem documents)