Site Reliability Engineer
Job Description
About Us
We're a product-focused startup with a tight-knit team of 14 engineers building tools that help teams make better decisions through great research. We're pragmatic, fast-moving, and obsessed with product quality.
As we grow, our infrastructure needs to grow with us. That means better observability, stronger systems, faster deploys, and smarter decisions about cloud spend. We're hiring someone to take ownership of this work and lay the foundation for long-term platform health.
What You'll Do
You'll be the first dedicated DevOps/Infra hire with end-to-end ownership of platform health, reliability, and scalability. You'll partner directly with our engineering team to improve our systems, reduce toil, and make infra a product in its own right.
Your scope will include:
Observability & Reliability
Define and maintain service SLOs, dashboards, and alerts
Improve incident detection and response
Establish best practices around reliability and error budgets
Infrastructure
Maintain and improve Terraform-managed infrastructure
Lead our migration of staging infrastructure to AWS
Scale systems to handle growth and changing workloads
Developer Experience & CI/CD
Increase pipeline reliability
Speed up deploy cycles and improve rollback confidence
Database Performance
Help identify and fix slow queries and optimize indexes
Support product teams with performance diagnostics
Cloud Cost Management
Monitor and optimize cloud spend
Build visibility and tooling to help teams make cost-aware decisions
You Might Be a Great Fit If You...
Have 4-8+ years of experience in DevOps, SRE, or Infrastructure roles
Have hands-on AWS experience (EC2, RDS, VPCs, etc.)
Are confident with Terraform, GitHub Actions, Docker, and PostgreSQL
Have a track record of improving observability and reducing incident response times
Have worked in high-autonomy, high-ownership environments
Are cost-conscious and can identify waste in infra and cloud spend
Love building leverage tools for engineers and treating infra as a product
Growth Path
This is a foundational hire. Today, the role is fully IC, but there's clear runway to grow into:
Platform leadership (tech lead or manager)
Head of Infra/SRE if we expand the team
Principal engineer focused on scale, reliability, and platform strategy
You'll have support and visibility from leadership, and the freedom to chart your path as the company grows.
Our Stack
Cloud: AWS
Infra-as-code: Terraform
CI/CD: GitHub Actions
Containers: Docker, lightweight Kubernetes
Monitoring: Datadog, Sentry
Database: PostgreSQL, Redis
App: Rails, React, Sidekiq
Why This Role?
Impact: You'll shape the systems and culture of how we build and run software.
Trust: High autonomy and low process. Make smart decisions and move fast.
People: No egos, just a team that values thoughtfulness, speed, and care.
Growth: Opportunity to grow with the company in whichever direction excites you.
Company Information
Location: Oakland, California, United States
Type: Hybrid