Site Reliability Engineer

NetApp, Inc.

£80,000

NetApp, Inc., Windsor, Windsor and Maidenhead

  • Full time
  • Permanent
  • Onsite working

Posted today, 21 Sep

Closing date: Not specified

Job Ref: 5e1fa2afbcf84dcb8df6c207d47ce4c3

Full Job Description

As a seasoned software engineer, you will be involved in both SRE operations and monitoring with Dynatrace / Instana.

This position requires an understanding of different monitoring tools, along with the ability to design use cases and automate jobs.

You will take part in SRE operations such as OS patching and upgrades, facilitate P1 / P2 issues and triage them with the relevant teams, own the infrastructure, and work on Linux, Kubernetes, containerization, and AWS.

1. Strong Programming and Scripting Skills: Proficiency in languages such as Python and Java. Familiarity with scripting languages like Bash or PowerShell is also valuable.

2. System Administration and Networking: Understanding of Linux / Unix system administration, including troubleshooting, performance tuning, and networking concepts (TCP/IP, DNS, load balancing, etc.).

3. Cloud Technologies: Experience with cloud platforms like AWS.

4. Containerization and Orchestration: Understanding of containerization technologies like Docker and container orchestration platforms such as Kubernetes.

5. Monitoring and Alerting: Familiarity with monitoring and alerting tools such as Dynatrace, Prometheus, and Grafana. Experience in designing and implementing effective monitoring and alerting systems.

6. Incident Management and Troubleshooting: Experience in incident response, troubleshooting, and root cause analysis, with the ability to handle and resolve incidents promptly.

7. Collaboration and Communication: Strong interpersonal and communication skills, with the ability to work effectively in cross-functional teams. Good documentation skills for creating runbooks, operational guidelines, and incident response procedures.

8. Problem-Solving and Analytical Skills: Strong problem-solving abilities and the ability to analyze complex systems and identify areas for improvement. Attention to detail and the ability to troubleshoot and resolve issues efficiently.

Education

A bachelor's degree with a minimum of 1 year of experience is required.

Job Segment: Cloud, Linux, Software Engineer, Developer, Java, Technology, Engineering