Site Reliability Engineer

Capital on Tap, City of Westminster

Salary not available. View on company website.

  • Full time
  • Permanent
  • Onsite working

Posted: 17 Oct

Closing date: not specified

Job ref: 8a5a2667695c4a838b999eb1b00d54b8

Full Job Description

Capital on Tap was founded with the mission to help small business owners and make their lives easier. Today, we provide an all-in-one business credit card & spend management platform that helps business owners save time and money. Capital on Tap proudly serves over 200,000 businesses across the world and our goal is to help 1 million small businesses by 2030.

Why Join Us?

We empower you to be innovative and solve complex problems. Take ownership, make an impact, and thrive in our scaling and agile environment.

This is a hybrid role: the Site Reliability Team works from our London (Shoreditch) offices 1 day per week.

What You'll Be Doing

Our Site Reliability Engineers work closely with our Platform and Engineering teams to ensure our application infrastructure is robust and scalable. As a Site Reliability Engineer at Capital on Tap, you will be responsible for designing, building, and monitoring systems to maximise platform uptime and efficiency for the best possible end-user experience. You are also tasked with identifying and resolving potential outages and performance issues before they become a problem.

  • Manage Azure services and resources, Cloudflare edge security, and traffic management in code
  • Create, manage, and monitor development resources within Kubernetes clusters and serverless services (e.g. Function Apps, Automation Accounts) for Product Engineering Teams
  • Own Terraform / Ansible / Pulumi Infrastructure as Code for each Product Engineering team
  • Continuously identify opportunities for improvement in systems, processes, and technologies, and implement changes to improve the overall reliability and performance of the platforms
  • Improve monitoring to provide insights into uptime and availability, and work towards the agreed SLO
  • Own and lead the troubleshooting of incidents that impact the customer experience

Our Tech Stack

  • Cloud: Azure and GCP
  • Containerisation: Kubernetes, Docker
  • IaC: Terraform
  • CI/CD: Azure DevOps, Octopus Deploy
  • Monitoring: Datadog, Prometheus, Grafana
  • Scripting: Python, PowerShell, Bash

  • Experience in managing public cloud processes
  • Experience in Azure DevOps, Octopus, and other CI/CD tools
  • Experience in Python, PowerShell, Bash, or other scripting languages
  • Experience with Terraform
  • Experience working with a cloud monitoring solution (Datadog would be advantageous)
  • Experience with Kubernetes and Docker (advantageous)

We welcome, consider and encourage applications from anyone who shares our commitment to inclusivity. Join us in creating a space where authenticity thrives, and everyone can do their best work.

Great Work Deserves Great Perks

We try not to take ourselves too seriously (all the time) so we make sure our office is decked out with a pool table, arcade machine, beer tap, and a couple of office dogs thrown in for good measure. Check out our benefits:

  • Private Healthcare including dental and optician services through Vitality
  • Worldwide travel insurance through Vitality
  • Anniversary Rewards (£250, £500, £750, 4-week fully paid sabbatical)
  • Salary Sacrifice Pension Scheme up to 7% match
  • 28 days holiday (plus bank holidays)
  • Annual Learning and Wellbeing Budget
  • Enhanced Parental Leave
  • Cycle to Work Scheme
  • Season Ticket Loan
  • 6 free therapy sessions per year
  • Dog Friendly Offices
  • Free drinks and snacks in our offices