Posted on March 14, 2013, 5:44 PM
At approximately 3:00 PM CST on Thursday, March 14, 2013, SCORM Cloud experienced a service disruption that lasted around 3 hours. Earlier in the week, we had changed how SCORM Cloud handles caching in order to improve system stability and performance.
Due to a mistake in the rollout of this change, we experienced import failures on one of our Amazon servers. The resulting excessive CPU load set off a series of cascading failures and revealed instability in how we use multiple availability zones within Amazon Web Services.
To rectify the situation, we took all SCORM Cloud servers offline to troubleshoot and determine the root cause. At this time, we believe the problem lies in our use of multiple availability zones. Until we resolve it, we’ve chosen to restrict SCORM Cloud to a single availability zone. This is a temporary measure while we investigate further, but it should help us avoid a repeat of this outage.
We realize the impact this may have had on your training and apologize for any inconvenience. If you have any questions, feel free to send them our way. We’re happy to help in any way that we can.