The Information Technology Services Division in the Business Services Directorate at the Oak Ridge National Laboratory is seeking qualified applicants for a Site Reliability Engineer position in the Research and Development Systems Engineering group. The R&D Systems Engineering group exists to facilitate lab goals through systems engineering, integration, and support for the research community at ORNL. We run mid-range clusters, servers, workstations, and other services where science happens at the lab.
The Site Reliability Engineer is responsible for the compliance and reliability of the source control and CI/CD tooling used by teams and researchers across the lab and in collaboration with many organizations outside the lab. This position will work with a team to deliver deployment, automation, monitoring, and management tooling that gives both researchers and fellow engineers a stable and frictionless experience. In addition, this role will work with researchers and team members to understand their needs and eliminate points of friction.
- Support, monitor, and maintain production grade source control and CI/CD applications and associated infrastructure
- Maintain and enforce compliance with regulatory requirements and best practices
- Work with customers to understand their needs and reduce points of friction
- Work with customers and fellow engineers to promote best practices and available solutions
- Serve as Subject Matter Expert and primary Point of Contact for GitLab and associated services
- Work to improve and automate maintenance and best practice implementation
- Work to eliminate off-hours handling of stability, compliance, and security events
- Interface with vendors to escalate and resolve technical issues
- Evaluate and resolve or further escalate technical issues escalated from the help desk
- Evaluate new technology options and vendor products
- Collaborate with research teams to ensure new environments meet requirements
- Identify and document IT best practices that will improve the systems deployment function
- Present and communicate complex technical concepts to small and medium-sized groups of scientists and engineers
- Answer escalated helpline calls in addition to primary project work
- Monitor systems performance
- Demonstrated fluency managing UNIX/Linux Systems
- Fluency in at least one scripting language such as Bash, Python, Go, or equivalent
- Working knowledge of the Software Development Lifecycle
- Deep understanding of Continuous Integration/Continuous Deployment theory and methodologies
- Working knowledge of security best practices as applied to Linux systems and software development
- Working knowledge of the Git protocols and lifecycle
- The ability to obtain and maintain a Department of Energy clearance, which requires US Citizenship
- Bachelor’s degree in Computer Science, Cybersecurity, or a related technical subject, or an equivalent combination of education and experience
- Strong knowledge of multiple operating systems
- Experience with RHEL 7 and VMware 6+
- Knowledge of networking fundamentals including TCP/IP, traffic analysis, common protocols, and network diagnostics
- Experience with performance and diagnostic tools for benchmarking, analysis, and tuning of systems, networking, and storage
- Experience with Nagios, Zabbix, SolarWinds, Ganglia, and other network and device monitoring systems
- Previous experience working in a government, scientific, or other highly technical environment
- Demonstrated ability to balance complex research and security requirements
- Background of contributing to open-source projects or avocational endeavors such as hacker/maker spaces
- Technical documentation skills, including ability to prepare simple documentation web pages
This position will remain open for a minimum of 5 days, after which it will close when a qualified candidate is identified and/or hired.
We accept Word (.doc, .docx), Adobe (unsecured .pdf), Rich Text Format (.rtf), and HTML (.htm, .html) up to 5MB in size. Resumes from third party vendors will not be accepted; these resumes will be deleted and the candidates submitted will not be considered for employment.
If you have trouble applying for a position, please email [email protected]
ORNL is an equal opportunity employer. All qualified applicants, including individuals with disabilities and protected veterans, are encouraged to apply. UT-Battelle is an E-Verify employer.