9. Managing system failures

From DARWIN

Associated cards

9.1. Supporting Development and Maintenance of Alternative Working Methods

The card supports the development and maintenance of Alternative Working Methods (AWMs) for use in case of system failure. A system failure is a situation in which a component essential to the continuity of the service offered by the organization is either lost or functioning in a degraded mode, and no backup, emergency or contingency solution is available by design. Applying an AWM means performing one or more activities within the organization in a way that differs markedly from what is described in existing procedures or practices, in order to bypass the constraint created by the system failure. Compared with normal operation, this may involve following different steps to perform the activity, using different tools, cooperating with different people, or all of the above.

INTERVENTION

  • Preparing organizations to ensure business continuity in the face of major system failures by supporting the development of alternative working methods (AWMs).
  • Making sure that relevant people in the organization are ready to identify and use the AWMs.
  • Making sure that the identification of AWMs is based on a thorough analysis of potential failure scenarios that cannot be managed with ordinary backup systems and that could compromise business continuity.

MECHANISM

  • Identifying possible AWMs in dedicated focus groups, based on three main principles:
    • Revising already existing AWMs
    • Reverting to “older” working methods, such as using older facilities characterized by a lower level of automation
    • Envisioning alternative uses of existing resources
  • Making sure relevant people in the organization are aware that AWMs are available
  • Making sure relevant people in the organization are sufficiently trained to use AWMs should they be needed
  • Tailoring the mechanism to situations before, during or after a crisis due to a major system failure.

OUTCOME

  • Organizations for which maintaining business continuity shortly after a major system failure is of critical importance
  • Organizations for which it is impossible to design in advance a backup system for every possible system failure
  • Organizations which are ready to provisionally reorganize their resources in the face of a system failure, even if this implies significant deviations from ordinary procedures, working methods and hierarchical structures
  • Organizations whose activities depend on critical infrastructure that may experience failures over which they do not have full control

CONTEXT

  • It should be clear who is in charge of deciding on the feasibility and sustainability of the alternative methods, as well as on the transition process (when to begin it and when to end it)
  • The development and maintenance of alternative working methods should be clearly communicated to all relevant members involved in the decisions.