Checkly is looking for an experienced Site Reliability Engineer. This is a great opportunity to join an early-stage
company, influence the product roadmap, and help us do what we love most: building the best monitoring platform for developers.
Make our reliability product more reliable
Checkly is, in essence, a reliability company. People trust our software to alert them when their software goes "poof".
We use AWS Lambda/SQS/SNS/S3, Heroku, Postgres, Redis and soon ClickHouse to make this happen, from 20+ locations around the world.
Build & shape our SRE practices
You will play a key role in defining how we "do reliability". Together with your coworkers in the product engineering teams,
you will be responsible for:
- Observability of our backend platform: identify bottlenecks, track them and fix them.
- Optimize our performance and reduce error rates: from wild queries to slow queues to Heisenbugs.
- Streamline our on-call process and optimize our runbooks.
- Work with the product folks to have reliability baked into everything we do: define SLOs and SLAs and enforce them.
What we are looking for
- You have deep experience in operating and troubleshooting mission-critical SaaS environments as an SRE.
- You have deep working experience with AWS, SQL & OLAP databases and Node.js.
- You like to work in a growing company with experienced founders.
- You know how to communicate with coworkers and customers in English.
- You are quick to pick up on new stuff and enjoy the process of learning new things.
- You love making software!
Nice to have
- Experience with building SaaS tools for developers.
- Obsessed with browser automation.
What we offer
- Competitive salary.
- Working hours are flexible and we support families: you can pick up your kids without worrying about work.
- Work with the latest technologies.
- Contribute to open source.
- Modern laptop and equipment provided.
Salary and compensation
$60,000 to $100,000/year
Remote (GMT +3/-3)