Wait times are intermittently spiking to 100 or more seconds on all clouds. These spikes only last about three minutes and occur about once every two hours. We are investigating the cause.
2020-March-12 Resolved Service Incident
Incident Report for Sauce Labs US West Data Center
Postmortem

Dates:

Thursday, March 12, 11:42 PM - Friday, March 13, 12:47 AM PT

What happened:

Tests running on the U.S. Real Device Cloud were interrupted, and no new tests in the U.S. could be started.

Why it happened:

While upgrading our database, we used a “live migration” approach so that the production database would remain accessible to user tests. During the migration, we experienced unexpected side effects that impacted our U.S. real device IDs, interrupting running tests and preventing new ones from starting.

How we fixed it:

The migration was canceled, and we rolled back to the most recent backup.

What we are doing to prevent it from happening again:

  • Extending the timeline for the project in cooperation with our database provider to guarantee the necessary diligence for every step of the migration.
  • Implementing additional security and review procedures for any future changes.
  • Improving monitoring and alerting for our database instance so that alerts are triggered immediately if data is removed.
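
As an illustration of the last item, a data-removal check could compare each table's current row count against a recent baseline and flag any sharp drop. The following is only a minimal sketch under stated assumptions, not a description of our actual monitoring: the `CountSample` type, the `detect_data_loss` function, the table names, and the 1% threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CountSample:
    """Row counts for one table at two points in time (hypothetical type)."""
    table: str
    previous: int  # baseline row count from the last check
    current: int   # row count observed now


def detect_data_loss(samples, max_drop_ratio=0.01):
    """Return the names of tables whose row count fell by more than max_drop_ratio.

    max_drop_ratio is an assumed threshold (1% by default); a real deployment
    would tune it per table.
    """
    alerts = []
    for s in samples:
        if s.previous <= 0:
            continue  # no baseline to compare against
        drop = (s.previous - s.current) / s.previous
        if drop > max_drop_ratio:
            alerts.append(s.table)
    return alerts


# Example with made-up table names and counts.
samples = [
    CountSample("device_ids", previous=1000, current=400),
    CountSample("sessions", previous=1000, current=999),
]
flagged = detect_data_loss(samples)
```

In this example only `device_ids` exceeds the threshold, so it is the only table that would trigger an alert.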
Posted Mar 20, 2020 - 14:47 PDT

Resolved
Our Real Device Cloud was unavailable between 11:45 PM and 12:50 AM (PDT), causing failures for automated and live tests.
We have taken remedial action, and all services are now fully operational.
Posted Mar 12, 2020 - 23:45 PDT