2019-March-27 Service Incident
Incident Report for Sauce Labs US West Data Center
Postmortem

Dates:

March 27th, 2019 3:59 AM - 4:49 AM PDT

What happened:

Customers could not start tests or new tunnels. Both the Sauce Labs UI and saucelabs.com were unresponsive.

Why it happened:

The normal constraints on the volume of concurrent tests allowed to run on our service were inadvertently removed by a configuration change. Without these constraints, the number of running tests rapidly increased to the point that our service became unresponsive.
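For illustration only, the failure mode described above can be sketched as a concurrency limit that fails closed: if the configured limit is missing or invalid, the limiter refuses to start rather than treating the absence as "unlimited". This is a minimal sketch under stated assumptions, not a description of the actual Sauce Labs subsystem. The class `ConcurrencyLimiter`, the function `load_limiter`, and the config key `max_concurrent_tests` are all hypothetical names.

```python
import threading


class ConcurrencyLimiter:
    """Admit at most `max_concurrent` tests at a time (hypothetical helper)."""

    def __init__(self, max_concurrent):
        # Fail closed: a missing or non-positive limit is rejected outright
        # instead of being interpreted as "no limit".
        if (
            not isinstance(max_concurrent, int)
            or isinstance(max_concurrent, bool)
            or max_concurrent <= 0
        ):
            raise ValueError(
                f"invalid concurrency limit: {max_concurrent!r}"
            )
        self._max = max_concurrent
        self._running = 0
        self._lock = threading.Lock()

    def try_start(self):
        """Return True and count the test if capacity remains, else False."""
        with self._lock:
            if self._running >= self._max:
                return False
            self._running += 1
            return True

    def finish(self):
        """Release one slot when a test ends."""
        with self._lock:
            if self._running > 0:
                self._running -= 1

    @property
    def running(self):
        with self._lock:
            return self._running


def load_limiter(config):
    # `max_concurrent_tests` is an assumed key name for this sketch.
    # A config change that drops the key raises here instead of
    # silently removing the constraint.
    return ConcurrencyLimiter(config.get("max_concurrent_tests"))
```

With a limit of 2, a third call to `try_start()` is refused until `finish()` frees a slot, and `load_limiter({})` raises `ValueError` instead of admitting unbounded load.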

How we fixed it:

We corrected this by returning the subsystem config to its proper value and restarting the subsystem.

What we are doing to prevent it from happening again:

We’ve hardened our process around managing configuration changes to this subsystem.

Posted Apr 15, 2019 - 15:26 PDT

Resolved
This incident has been resolved.
Posted Mar 27, 2019 - 07:59 PDT
Monitoring
All services -- including automated tests, Sauce Connect tunnels, our REST API, website, and dashboard -- are behaving normally. We are continuing to monitor but expect a full recovery soon.
Posted Mar 27, 2019 - 04:49 PDT
Investigating
Automated and manual tests and Sauce Connect tunnels are failing to start. The Sauce Labs dashboard and saucelabs.com are also failing to load. We are investigating.
Posted Mar 27, 2019 - 04:16 PDT
This incident affected: Web Interface (Sauce UI, Real Device Cloud UI, Analytics, saucelabs.com), Manual Testing (Manual VM Testing), Sauce Connect (Sauce Connect VM), REST API (REST API VMs), and Automated VM Testing (Automated PC Testing, Automated Mac Testing, Automated iOS Simulator Testing, Automated Android Emulator Testing).