2020-February-23 Service Incident
Incident Report for Sauce Labs US West Data Center
Postmortem

Dates:

Sunday February 23, 9:07 PM - Monday February 24, 1:17 AM PT

What happened:

Our Mac, PC & Virtual Device clouds experienced elevated error rates. Sessions did not start correctly and there were intermittent errors with our Web UI.

Why it happened:

A Kubernetes node experienced a sudden spike in memory usage. As a result, the network proxy service on that node was unable to properly manage security rules, leading to inconsistent communication between it and other services used by the cluster.

How we fixed it:

We isolated and reset the unreliable node. Services were re-established on a healthy node, and the system began to recover.

What we are doing to prevent it from happening again:

We are implementing stronger limits for memory and CPU utilization on a per-node basis. We are also investigating ways to protect key services such as kube-proxy against memory and CPU spikes.
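
As a rough illustration of what per-node memory limits can mean in Kubernetes, the sketch below applies the documented node-allocatable rule: memory available to pods is node capacity minus the kubelet's kube-reserved, system-reserved, and eviction-threshold amounts. This is not a description of the actual configuration. The function names and every figure are hypothetical and chosen only for the example.

```python
# Illustrative sketch only: all figures are invented for the example and
# do not reflect any real cluster configuration.
MIB = 1024 * 1024


def allocatable_memory(capacity, kube_reserved, system_reserved, eviction_threshold):
    """Return the memory (in bytes) left for pods after node-level reservations.

    Mirrors the Kubernetes node-allocatable rule:
    allocatable = capacity - kube-reserved - system-reserved - eviction-threshold
    """
    allocatable = capacity - kube_reserved - system_reserved - eviction_threshold
    if allocatable < 0:
        raise ValueError("reservations exceed node capacity")
    return allocatable


def fits_on_node(pod_memory_requests, allocatable):
    """Return True if the summed pod memory requests fit within allocatable memory."""
    return sum(pod_memory_requests) <= allocatable


# Hypothetical 16 GiB node with headroom held back for node-level components.
node_allocatable = allocatable_memory(
    capacity=16384 * MIB,
    kube_reserved=1024 * MIB,
    system_reserved=512 * MIB,
    eviction_threshold=100 * MIB,
)
print(node_allocatable // MIB)  # 14748
```

Holding memory back in this way is one general approach to keeping node-level components responsive when workloads spike. Whether it suits a given cluster depends on how those components are deployed.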

Posted Mar 05, 2020 - 17:10 PST

Resolved
This incident has been resolved.
Posted Feb 24, 2020 - 01:23 PST
Monitoring
We have stabilised our service and are beginning to restore functionality. Sessions are being created; however, customers may still see extended wait times.
Posted Feb 24, 2020 - 00:18 PST
Update
Mac, PC and Virtual Device tests continue to experience startup issues and elevated error rates. Our engineers have eliminated several potential problems and are continuing to investigate.
Posted Feb 23, 2020 - 23:03 PST
Update
Customers will continue to see elevated error rates and problems starting Mac, PC and Virtual Device sessions. We are continuing to investigate.
Posted Feb 23, 2020 - 21:57 PST
Investigating
Our Mac, PC & Virtual Device clouds are experiencing elevated error rates. Sessions may not start correctly. In addition, our Web UI is experiencing intermittent errors. We are investigating.
Posted Feb 23, 2020 - 21:30 PST
This incident affected: Web Interface (Sauce UI), Manual Testing (Manual VM Testing), and Automated VM Testing (Automated PC Testing, Automated Mac Testing, Automated iOS Simulator Testing, Automated Android Emulator Testing).