Senior Software Engineer (Grafana Databases, Managed Services)

Grafana Labs
Location

Germany

Work Type

Remote

Open to applicants in

Germany

Seniority

senior

Posted

July 3, 2026


Total Compensation
€133,250
Yearly Savings (Comfortable)
€71,000
Job Description

  • The Managed Services team is a newly formed squad within the Databases department. It owns and operates shared, production-critical infrastructure that powers Grafana Cloud’s next-generation database products (Mimir, Loki, and Tempo). Today, this includes operating 100+ WarpStream clusters across multiple cloud providers and regions, with continued growth expected. WarpStream acts as the streaming backbone for ingestion and read/write decoupling across databases. It sits directly on the hot path for metrics, logs, and traces, handling high-throughput, multi-consumer workloads at massive scale
  • In addition to streaming infrastructure, the team works closely with high-volume analytical and storage systems that power query-heavy and aggregation-heavy workloads, where latency, compression behavior, storage layout, and scaling characteristics matter deeply
  • As a Senior Engineer on Managed Services, you will take ownership of running these systems in production. This involves:
  • Operating and evolving 100+ multi-cloud streaming clusters and related database infrastructure
  • Diagnosing and eliminating cross-layer failure modes (e.g., object storage latency, noisy neighbors, control-plane bottlenecks, and query performance regressions)
  • Designing safe upgrade and rollout strategies at scale
  • Improving observability, automation, and operational ergonomics
  • Partnering closely with database and platform teams to ensure safe scaling, partitioning, consumer fan-out, and query performance
  • Working directly with distributed systems behavior, Kubernetes scheduling dynamics, storage engines, and compression trade-offs
  • Serving as a primary escalation point and on-call for relevant incidents
  • Owning the relationship with all system vendors, including WarpStream Labs and others
  • As we are remote-first and our engineering organization is largely remote, we provide guidance and meet regularly using video calls, so an independent attitude and good communication skills are a must
  • This role blends deep distributed systems work with the opportunity to influence how the team approaches reliability, scaling, and operational excellence
  • Regular 1:1s with your manager and close collaboration with teammates across regions
  • Reviewing and defining SLOs for shared database infrastructure, proactively reducing error budget consumption through improvements to monitoring, automation, scaling strategies, and system design
  • Improving the diagnosability of core streaming and database systems in production, where possible
  • Implementing solutions that ensure reliability, scalability, and performance of high-throughput, multi-cloud infrastructure
  • Developing fault-tolerant patterns that account for distributed system realities such as storage latency, partition imbalance, noisy neighbors, and control-plane dependencies
  • Planning and executing safe upgrades and rollouts across dozens of production clusters
  • Collaborating with database and platform engineering leaders to influence architecture, roadmap priorities, and long-term strategy
  • Participating in PR review and contributing to design documents, automation, tooling, and code improvements that reduce operational risk
  • Sharing best practices and distributed systems knowledge with partner teams
  • Participating in incident response, from investigation through resolution and post-incident reviews (PIR)

Benefits

  • Vacation: Balance is key. Our team enjoys 30 days of paid vacation each year on top of national holidays, parental leave, and sick leave. We also take a breather on a number of Grafana Shutdown Days each year
  • Healthcare: We’re proud to provide health coverage or stipends for our colleagues in the US, UK, Canada, the Netherlands, Sweden, Singapore, and India
  • Retirement planning: There’s no time like the present to start saving for your future. We make employer contributions into the pension pots of our team members in the US, UK, Canada, the Netherlands, Sweden, and Germany
  • Professional development: On top of a $1,500 annual learning and development stipend, Grafanistas have thousands of on-demand courses at their fingertips to help them grow professionally. Want to attend a conference or training? Go ahead. Just pass on what you learned
  • Work location: The vast majority of our roles are fully remote, so we can hire the best talent and you can work from the comfort of your home. If you fancy a change of scene, we’ll also reimburse you up to $175 a month for a personal co-working space
  • Choice of tech: There’s no one-size-fits-all when it comes to the tech required to do your job. Choose the laptop and accessories you need when you join us, and we’ll refresh them every three years
  • Mindfulness: When you join the team, you can sign up for a complimentary subscription to Headspace to take advantage of the benefits of mindfulness and meditation. Our wellbeing resource group also organize sessions run by fellow Grafanistas or external trainers
  • Global Employee Assistance Program: We offer all team members a 100% confidential support service with 24/7, year-round access to professionally qualified counsellors and specialists
  • Paid parental leave: Grafana offers paid parental leave to all eligible new parents. This offers Grafanistas time to bond with and care for their children in the first year after birth or adoption

Requirements

You may not meet every requirement, and that’s okay. If this role excites you, we’d love you to raise your hand for what could be a truly career-defining opportunity

  • Proficiency in at least one programming language (Go preferred, but not required)
  • Clear communicator who can collaborate across teams and work autonomously
  • Strong Kubernetes experience in AWS, GCP, or Azure, and familiarity with infrastructure-as-code tooling (Helm, Terraform, Jsonnet, etc.)
  • Solid understanding of distributed systems design and large-scale system trade-offs
  • Working knowledge of Linux internals, networking, cloud storage, and performance/scaling behavior
  • Experience operating distributed systems in production (e.g., streaming systems, analytical databases, large-scale storage backends). Examples of these include Kafka, Redpanda, WarpStream, Postgres, ClickHouse, Snowflake, or Cassandra
  • 6+ years of engineering experience, including meaningful time in SRE, platform engineering, production engineering, infrastructure engineering, or distributed systems roles
  • Experience participating in blameless incident response and writing high-quality post-incident reviews
  • Curious, pragmatic, action-oriented, and kind (this is important!)