We have implemented changes to address intermittent timeouts or hangs that some customers experienced when connecting to Automated VM testing services. We are monitoring our services to ensure full functionality. Please reach out to support@saucelabs.com if you need assistance.
2020-April-27 Service Incident
Incident Report for Sauce Labs US West Data Center
Postmortem

Dates:

Monday, April 27, 13:15 - 14:02 PDT

What happened:

Our dashboard and REST API were not responding, and users were unable to log in. Sauce Connect tunnels were down.

Why it happened:

While we were decommissioning legacy Redis in-memory storage clusters, a misconfiguration of our REST API surfaced when the final legacy Redis node was shut down.

How we fixed it:

We redeployed the REST API services with the proper configuration to ensure they were referencing the new Redis cluster.
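
A misconfiguration of this kind can, in principle, be caught before shutdown by scanning service settings for references to hosts that are about to be retired. The following is a minimal sketch of such a check, not a description of our tooling: the function name `find_stale_references`, the setting names, and the hostnames are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple
from urllib.parse import urlparse


def find_stale_references(
    service_configs: Dict[str, Dict[str, str]],
    retiring_hosts: Iterable[str],
) -> List[Tuple[str, str, str]]:
    """Return (service, setting, host) for every URL-style setting
    that still points at a host scheduled for decommissioning."""
    retiring = {host.lower() for host in retiring_hosts}
    stale = []
    for service, settings in service_configs.items():
        for key, value in settings.items():
            # Non-URL values (e.g. "30") have no hostname and are skipped.
            host = urlparse(value).hostname
            if host and host in retiring:
                stale.append((service, key, host))
    return stale


# Hypothetical configuration, for illustration only.
configs = {
    "rest-api": {
        "REDIS_URL": "redis://redis-legacy-3.internal:6379/0",
        "TIMEOUT": "30",
    },
    "dashboard": {"REDIS_URL": "redis://redis-new-1.internal:6379/0"},
}

for service, key, host in find_stale_references(configs, ["redis-legacy-3.internal"]):
    print(f"{service}: {key} still references {host}")
```

A check like this only sees what is written in configuration; it would not detect clients that reach the old host through some other path, which is why a traffic review is still useful.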

What we are doing to prevent it from happening again:

We have implemented a new policy that requires approval by a four-member committee before any production service is decommissioned. The committee for a service turndown consists of a systems engineer, a network engineer, a software engineer from the team that owns the service, and a software lead or manager from the same team. We are also implementing a network traffic review of any production service before it is turned down: we will trace all production traffic to its source, determine whether it is still required, and decide whether it should be moved to a new location prior to decommissioning. Finally, we will first terminate services via firewall rules and then hold for 24 hours before fully decommissioning them, which enables a rapid return to service if something was missed.
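
The three conditions above (committee approval, traffic review, and a 24-hour firewall hold) can be expressed as a simple gate. This is an illustrative sketch under stated assumptions rather than our actual process tooling: the role labels, the `DecommissionRequest` class, and the `blockers` function are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional, Set

# Hypothetical labels for the four required approvers.
REQUIRED_ROLES = frozenset({
    "systems_engineer",
    "network_engineer",
    "owning_team_engineer",
    "owning_team_lead",
})
HOLD_PERIOD = timedelta(hours=24)


@dataclass
class DecommissionRequest:
    service: str
    approvals: Set[str] = field(default_factory=set)
    traffic_reviewed: bool = False
    firewalled_at: Optional[datetime] = None  # when firewall rules cut traffic


def blockers(request: DecommissionRequest, now: datetime) -> List[str]:
    """List the reasons the service may not yet be fully decommissioned."""
    reasons = []
    missing = REQUIRED_ROLES - request.approvals
    if missing:
        reasons.append("missing approvals: " + ", ".join(sorted(missing)))
    if not request.traffic_reviewed:
        reasons.append("traffic review not completed")
    if request.firewalled_at is None:
        reasons.append("service not yet blocked at the firewall")
    elif now - request.firewalled_at < HOLD_PERIOD:
        reasons.append("24-hour hold still in effect")
    return reasons
```

An empty list means every condition has been met. During the hold, reversing the firewall rules restores the service, which is the point of blocking traffic before removing anything.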

Posted Apr 29, 2020 - 09:43 PDT

Resolved
Dashboard, REST API and Sauce Connect tunnels have recovered.
All services are now fully operational.
Posted Apr 27, 2020 - 14:02 PDT
Update
We are taking remedial action. Sauce Connect tunnels and REST API are recovering.
We continue to investigate.
Posted Apr 27, 2020 - 13:52 PDT
Investigating
Our dashboard and REST API are not responding, and users are unable to log in.
Sauce Connect tunnels are not starting.
We are investigating.
Posted Apr 27, 2020 - 13:15 PDT
This incident affected: REST API (REST API VMs), Sauce Connect (Sauce Connect VM), Web Interface (Sauce UI, Analytics), and Manual Testing (Manual VM Testing).