Our Database Reliability Engineering (DRE) team develops Yelp’s database infrastructure, writing the automation that allows us to scale our MySQL and Cassandra clusters to serve hundreds of thousands of queries per second and enabling Yelp to connect users with great local businesses.
You'll be responsible for developing Yelp's data storage platform and keeping our underlying database infrastructure running smoothly in production. You'll design interfaces, automation, monitoring, and alerting to keep us stable, and you'll work closely with developers as they decide how to store their data and optimize performance.
We're looking for people with a passion for all things related to distributed systems, serving queries fast, uptime, scaling, and solving hard problems with the right tools. We have fun working on these challenges and are looking for others who do, too!
Where You Come In:
- Work closely with developers as they build new features and services
- Serve as a knowledge resource for our team's software and systems
- Help define best practices for storing data at Yelp
- Build next-generation cluster management tooling for Cassandra and MySQL
- Deliver easy, intuitive interfaces to our databases that keep developers moving fast
- Improve the observability of our database usage by instrumenting key systems
- Support and administer Cassandra and MySQL, as well as the stacks they run on
- Propose, test, and deploy database tuning and configuration changes
- Participate in our on-call rotation, acting as a point of contact for automated systems and highlighting availability issues when they can't be automatically resolved
What It Takes to Succeed:
- An experienced software engineer with an interest in databases or a database expert with strong software engineering skills
- Fluency in Python, Java, Scala, or a similar language—familiarity with more than one is a plus
- Proficiency with configuration management tools like Puppet, Chef, or Ansible
- Knowledge of best practices related to operating distributed systems in production—scaling, tuning, performance, and disaster recovery
- Comfort working with Linux
- Excellent communication skills
- Relevant industry experience operating distributed systems like Cassandra or databases like MySQL
What You'll Get:
- Full responsibility for projects from day one, an awesome team, and a dynamic work environment
- Competitive salary with equity in the company, a pension scheme, and an optional employee stock purchase program
- 25 days of paid holiday initially, rising to 29 with service
- Private health insurance, including dental and vision
- Flexible working hours and meeting-free Thursdays
- Regular 3-day hackathons and weekly learning groups, always with interesting topics
- Opportunities to participate in events and conferences throughout Europe and the US
- Public transportation season ticket loan and £50 per month toward any exercise of your choice
- Monthly personal development allowance
- Central location, a fully stocked kitchen, adjustable sitting/standing desks, quarterly offsites, locally roasted coffee, happy hours, and more!