Site Reliability Engineer

Oak Ridge National Laboratory

Date listed

3 weeks ago




The Information Technology Services Division in the Business Services Directorate at the Oak Ridge National Laboratory is seeking qualified applicants for a Site Reliability Engineer position in the Research and Development Systems Engineering group. The R&D Systems Engineering group exists to facilitate lab goals through systems engineering, integration, and support for the research community at ORNL. We run mid-range clusters, servers, workstations, and other services where science happens at the lab.

The Site Reliability Engineer is responsible for the compliance and reliability of the source control and CI/CD tooling used by teams and researchers across the lab and by collaborating organizations outside it. This position works with a team to deliver deployment, automation, monitoring, and management tooling that gives researchers and fellow engineers a stable, frictionless experience. The role also works with researchers and team members to understand their needs and eliminate points of friction.

Major Duties/Responsibilities

  • Support, monitor, and maintain production-grade source control and CI/CD applications and associated infrastructure
  • Maintain and enforce compliance with regulatory requirements and best practices
  • Work with customers to understand their needs and reduce points of friction
  • Work with customers and fellow engineers to promote best practices and available solutions
  • Serve as Subject Matter Expert and primary Point of Contact for GitLab and associated services
  • Work to improve and automate maintenance and best practice implementation
  • Work to eliminate off-hours handling of stability, compliance, and security events
  • Interface with vendors to escalate and resolve technical issues
  • Evaluate and resolve or further escalate technical issues escalated from the help desk
  • Evaluate new technology options and vendor products
  • Collaborate with research teams to ensure new environments meet requirements
  • Identify and document IT best practices that will improve the systems deployment function
  • Present and communicate complex technical concepts to small and medium-sized groups of scientists and engineers
  • Answer escalated helpline calls in addition to primary project work
  • Monitor systems performance

Basic Requirements

  • Demonstrated fluency managing UNIX/Linux Systems
  • Fluency in at least one scripting language such as Bash, Python, Go, or equivalent
  • Working knowledge of the Software Development Lifecycle
  • Deep understanding of Continuous Integration/Continuous Deployment (CI/CD) theory and methodologies
  • Working knowledge of security best practices as applied to Linux systems and software development
  • Working knowledge of the Git protocols and lifecycle
  • The ability to obtain and maintain a Department of Energy clearance, which requires US Citizenship

Preferred Qualifications

  • Bachelor’s degree in Computer Science, Cybersecurity, or a related technical subject, or an equivalent combination of education and experience
  • Strong knowledge of multiple operating systems
  • Experience with RHEL 7 and VMware 6+
  • Knowledge of networking fundamentals including TCP/IP, traffic analysis, common protocols, and network diagnostics
  • Experience with performance and diagnostic tools for benchmarking, analysis and tuning of systems, networking, and storage
  • Experience with Nagios, Zabbix, SolarWinds, Ganglia, and other network and device monitoring systems
  • Previous experience working in a government, scientific, or other highly technical environment
  • Demonstrated ability to balance complex research and security requirements
  • Background of contributing to open-source projects or avocational endeavors such as hacker/maker spaces
  • Technical documentation skills, including ability to prepare simple documentation web pages

This position will remain open for a minimum of 5 days, after which it will close when a qualified candidate is identified and/or hired.

We accept Word (.doc, .docx), Adobe (unsecured .pdf), Rich Text Format (.rtf), and HTML (.htm, .html) files up to 5 MB in size. Resumes from third-party vendors will not be accepted; these resumes will be deleted and the candidates submitted will not be considered for employment.

If you have trouble applying for a position, please email [email protected]

ORNL is an equal opportunity employer. All qualified applicants, including individuals with disabilities and protected veterans, are encouraged to apply. UT-Battelle is an E-Verify employer.
