Willis Towers Watson

Site Reliability Engineer

Posted on Apr 1, Salt Lake City, UT

Our engineering team has built the largest private Medicare marketplace in the country. We passionately focus on the continuous improvement of the systems we build and the culture we promote. We build a platform that provides the best possible support to our customers who are shopping for insurance, and where our insurance carriers can be confident that their products are accurately and impartially represented.

The Role

We are looking to grow and lead our teams to include the Site Reliability Engineering discipline. We have spent many years growing and fostering a DevOps culture by bridging the divide between our Software and Infrastructure Engineering departments. We want the cross-functional teams we are building to include Site Reliability Engineers. To do that, we need someone to guide and curate our internal definition of success for this discipline. The ideal candidate can successfully coach managers on how to find, retain, and manage site reliability engineers while remaining detached from a direct reporting relationship. Success would also entail coaching engineers who want to specialize in this discipline on how to learn and master it. We operate in a complex multi-tenant hybrid cloud and on-premises infrastructure that spans both Windows and Linux OS. We strive for security, reliability, and automation in line with DevOps and Site Reliability Engineering principles. If you are passionate about helping team members grow and promoting a culture of learning and improvement through metrics and automation while sharing those lessons learned, we want to hear from you.


You will be responsible for helping our Client Product Family build out a robust new architecture in a hybrid cloud environment. This includes typical data processes like ETL, but with an approach that follows traditional software engineering practices such as source control, unit testing, continuous integration, and continuous deployment. The product family is also responsible for various REST / HTTP services, primarily built with .NET. You will not be limited to infrastructure and are invited to help the software development team with any of the tasks they are facing, but you will be the one responsible for the infrastructure and for ensuring that everything is resilient and monitored. In the same way that your responsibilities are not limited, our development teams are expected to help out with these kinds of activities as well, but they may require mentoring in areas where they are unfamiliar or inexperienced.

The Requirements

  • Bachelor's Degree required

  • Hands-on Engineering:
    • 5+ years of hands-on technical experience with many of the following technologies. At least 50% of the day-to-day work will be focused in this area:
      • Windows and Linux Servers
      • VMware
      • Cloud platforms, preferably with Azure
      • Active Directory
      • Secrets management with Consul and Vault or similar systems
      • Configuration management tools like Salt and Terraform
      • Firewalls and load balancers such as F5
      • Web servers including IIS, NGINX, and Tomcat
      • Application Performance Monitoring with tools like New Relic
      • Infrastructure monitoring with tools like Sensu, SolarWinds, or Nagios
      • Continuous Integration and Continuous Delivery with tools like TeamCity, Octopus Deploy, Concourse, or Azure DevOps
      • Log Aggregation tools like SumoLogic or Splunk
      • Network theory and protocols such as DNS, DHCP, Proxy Servers, and Firewalls
      • Security operations with tools for SAST, DAST, RAST, and WAF
  • Strong consideration will be given to the following:
    • Hands-on experience with infrastructure as code
    • Experience with infrastructure as it relates to data pipelines and processing
    • Experience with Apache Airflow
    • Data solutions experience with any of the following: Azure Data Lake, Blob Storage, Apache Spark, Python Pandas

EOE, including disability/vets