4.3. Noticing brittleness
The interventions proposed here aim to help organisations identify sources of brittleness so that they can invest in correcting them.
Brittleness is experienced in situations of goal conflicts and trade-offs, or when there is competition for resources and a need to establish priorities under time pressure. Other difficulties emerge when an organisation struggles to manage functional interdependencies between its different parts, or when there is insufficient buffer capacity to provide additional resources. Noticing brittleness also means observing operational variability and comparing work-as-done with work-as-imagined, so as to reveal how the system might be operating at greater risk than expected. In addition, brittleness manifests itself when the organisation is unable to learn from past events, such as near misses and accidents.
- 1 Implementation
- 1.1 Introduction
- 1.2 Before a crisis
- 1.3 During a crisis
- 1.4 After a crisis
- 2 Understanding the context
- 2.1 Detailed objectives
- 2.2 Targeted actors
- 2.3 Expected benefits
- 2.4 Relation to adaptive capacity
- 2.5 Relation to risk management
- 2.6 Illustration
- 2.7 Implementation considerations
- 3 Relevant material
- 3.1 Relevant Practices, Methods and Tools
- 3.2 References
- 3.3 Terminology
- 4 Navigate in the DRMG
What is needed to notice brittleness:
- Engage personnel at all levels of the organisation in understanding and noticing brittleness.
- Create the conditions for personnel across the organisation to expose and discuss things that do not, or might not, go well in crisis situations.
- Implement recommended activities regularly to facilitate the personnel's capacity to notice and discuss brittleness.
- Rely on external experts if resilience or safety managers familiar with notions of resilience are not available.
- Select methods for the identification of possible sources of brittleness with the involvement of roles and actors at different levels in the organisation, making sure to account for an adequate diversity of perspectives. In order to achieve such diversity, combine individual interviews and workshop-based techniques, taking into account time constraints and availability of resources.
- Plan the methods around triggering questions to be used as a guide for the analysis (see examples of triggering questions below for the phases ‘Before’, ‘During’ and ‘After’ a crisis).
- Use the outcome of your analysis to revise your internal guidelines or to create ad-hoc ones.
Note: Brittleness is a useful concept because it can be easier to describe and notice how systems can break down. However, this focus on "what goes wrong" is complementary to the approach described in Identifying sources of resilience. It would be counter-productive to focus only on the negative aspects of systems and operations: it is fundamental to also understand the nature and characteristics of resilience and how it exists in the organisations considered.
Before a crisis
The assessment of potential sources of brittleness can be performed in two types of situations: (1) on a periodic basis, as part of established self-assessment activities; (2) in anticipation of specific events, to ensure resilience capabilities are in place. Relevant examples of the latter case include:
- An anticipated surge in demands (e.g., due to a seasonal peak of activity, or to the approach of an identified threat).
- A relevant change brought to the system of interest (e.g., a new technology, a new policy, a new role being introduced).
In all of these cases, the analysis should aim to reveal and discuss potential issues that the system under investigation might experience when handling a crisis. For those organisations which have already identified a list of mitigation measures in case of accidents and crises (e.g., in classic risk management activities), the assessment of brittleness should also focus on understanding what might go wrong when applying the mitigation measures.
What is needed to notice brittleness before a crisis
For both of the situations described above, noticing brittleness can be achieved by organising a short workshop or focus group for which:
- participants are introduced to principles of resilience,
- a facilitator leads a discussion about anticipated crisis situations and potential pitfalls,
- the discussion is guided by the triggering questions presented below (the full set or a selection of them).
In such a workshop or focus group, it is possible to use actual past events or fictional scenarios to ground and direct discussions (see Practice 1 for an example related to a surge in demand and Method 2 for an example associated with a technological change).
- Are there situations in which the resources we expect to have to respond to a crisis/emergency may not be available?
- What can we put in place to relieve, lighten, moderate, reduce and decrease stress or load?
- Where could we easily add extra capacity to remove stressors?
Lack of Information
- Can we anticipate situations in which we will lack the necessary information to handle a certain event?
- Do we have a protocol in place to gather the missing information?
- Can we anticipate situations in which we may experience uncertainty based on the history of our operations?
- Which processes and/or plans are insufficiently defined and may represent a source of uncertainty?
- What goal conflicts and trade-offs may arise or increase?
- In such situations, will we be able to establish priorities?
- Can some goals be temporarily relaxed or sacrificed to reduce the trade-offs?
Constraints and Bottlenecks
- What constrains us in our ability to execute?
- What conditions may push our system towards its limits?
- Who will be most heavily loaded/stressed?
- Can we anticipate situations in which our operations will be constrained by other organisations?
- Can we anticipate situations in which our operations act as a constraint for other organisations managing a crisis?
Difficulties to adjust
- Do we have the capacity to reallocate existing resources if needed? What may prevent us from reallocating them?
- Do we have a policy that allows us to modify normal operations when needed?
- Do we expect that major mismatches between official procedures and actual practices may occur?
Limits of mitigation plans
- If we have a safety/emergency plan, what can go wrong when applying the planned mitigation actions?
- What could prevent us from applying some of the mitigation actions?
During a crisis
During time-critical types of crisis, it may be difficult to use triggering questions as a checklist to be read step-by-step. However, it is important that all the professionals involved in the management of the crisis are fully aware of the topics addressed by the triggering questions and can consider such topics, even without reading them.
For crises that develop over a longer time (e.g., the Icelandic volcano eruption or the Ebola outbreak), it is possible to organise workshops or operational meetings to reflect with colleagues on the possible sources of brittleness, and to use the triggering questions to support the reflection. A facilitator can use the same approach during a drill or a simulation to guide the exercise and stimulate participants to notice brittleness.
- Do we need additional resources (human, technical, material) to manage the event?
- Are other parts of our organisation able to give up some of their resources to support us in managing the event?
Lack of information
- Is there additional information available to address the crisis that we are not considering?
- In case of lack of relevant information to handle the situation, can we put a protocol in place to gather the missing information?
- Can we ask the advice of a colleague who is not involved in the crisis, to support us in correctly interpreting the situation?
Constraints and Bottlenecks
- Are our operations during the crisis blocked by members of other organisations?
- Are we hindering the operations of the members of other organisations during the crisis?
Difficulties to Adjust
- Are we in a position to reconsider our priorities?
- Can we delay the achievement of some goals, in favour of more urgent ones?
- Can we consider deviations from normal procedures to manage the event?
Difficulties to learn from the crisis
- Are we able to capture experiences from the crisis in a format that supports the dissemination of “lessons learned” inside the organisation?
- Will the format of such “lessons learned” encourage remedial actions by the management?
Difficulties to learn from previous events
- Are we adequately considering “lessons learned” from the past?
After a crisis
Adverse events usually provide information that helps identify sources of brittleness (similarly to the way accidents and incidents can be used for safety-related purposes). However, it should be emphasised that analyses must focus on processes, i.e. how operations were conducted, rather than on outcomes, i.e. what the consequences were.
What is needed to notice brittleness after a crisis
Depending on the time of implementation, resources and objectives, organisations can:
- Conduct quick assessments based on methods such as the focus groups described in Practice 1, for instance during debriefing sessions.
- Conduct more in-depth analyses based on methods that focus on understanding operations in context (e.g., CTA – see Method 1). Data used in such analyses can come from data recorded during the crisis experienced, investigation reports or debriefings, whether it was an actual event or an exercise.
- Across longer timeframes, assessments need to be conducted about how the organisation has reacted after crisis events, for instance whether it has prioritised and invested resources in the analysis and enhancement of resilience. Failures to do so correspond to forms of brittleness (see Method 3).
- Were our resources (human, equipment, material) adapted to the scale of the event?
- Which resources, competences or strategies (if any) were missing?
Lack of Information
- Did we experience cases in which the information we had was insufficient to effectively handle the situation?
- Were there difficulties in putting in place protocols to gather the missing information?
- Did the crisis we experienced reveal wrong assumptions we had about the nature of threats we are exposed to, and about our capacity to handle them?
- Did the crisis we experienced challenge the plans we had established?
- What goal conflicts and trade-offs did we experience?
- Were the goal conflicts unusual or unexpected?
- Were we able to establish priorities?
- Did we sacrifice any goal in a way that reduced our ability to adapt to certain circumstances?
Constraints and Bottlenecks
- What were the bottlenecks?
- Were our operations dependent on others?
- Were the operations of others dependent on ours?
- Was collaboration with other organisations effective? If not, what were the constraints?
Difficulties to adjust
- Were we able to deploy or mobilise additional resources when needed? If not, what prevented us from doing so?
- Were other parts of the organisation able to give up some of their resources when needed? If not, what prevented them from doing so?
- Were we able to adjust goals and priorities when needed? If not, what prevented us from doing so?
- Were we able to modify normal operations when needed?
- Did we observe an excessive mismatch between official procedures and actual practices during operations?
Difficulties to learn from the crisis
- Were we sufficiently able to capture experiences from the crisis and collect them in a format easy to share inside the organisation?
- Were we sufficiently able to use these experiences to promote "after action review" inside the organisation?
Difficulties to learn from previous events
- Have past, potentially similar, events in our own organisation sufficiently helped us to be prepared for this crisis?
- Have similar events in other organisations or domains sufficiently helped us to be prepared for this crisis?
Limits of mitigation plans
- If a safety/emergency plan was available, what went wrong when applying the planned mitigation actions?
- Did we miss any mitigation action that would have been necessary?
- What prevented us from applying some of the mitigation actions?
- Did some mitigation actions prove insufficient to handle the associated hazards?
Understanding the context
Detailed objectives
As part of the assessment of resilience, noticing brittleness is the approach that aims at revealing and understanding deficiencies in and challenges to resilience in the system under consideration.
The opposite of a resilient system is a brittle one. Brittle systems break down especially in the face of surprising situations at the boundaries of what the system typically handles. In those situations, they are unable to accommodate even minor disturbances without ceasing to function. Examining the factors that undermine resilience is important in order to identify the most effective measures to actually enhance resilience and reduce brittleness. This assessment supports preparedness (e.g., related to planning or training) and the avoidance of situations that would result in potential harm or damage, for instance by anticipating potential bottlenecks in the response to a crisis situation.
Targeted actors
Managers are expected to implement the interventions in two ways:
- setting up regular activities that lead to discussions about brittleness and its identification;
- involving actors at all levels of the organisation, in particular team leaders and other operational personnel who are engaged in crisis management activities.
In addition, members of the organisation familiar with resilience notions (e.g., resilience or safety managers), possibly with the help of external experts, play a key role in conducting events and in leading and moderating discussions about brittleness.
Expected benefits
Understanding brittleness in the system allows organisations to address its sources and underlying factors and avoid situations that would result in potential harm or damage.
Relation to adaptive capacity
Noticing brittleness occurs through understanding when the system lacks adaptive capacity, or, more generally, faces challenges with adaptation. Through investigating brittleness, organisations can notice signs that indicate that their adaptive capacities are either eroding or ill-matched to the demands that are about to occur, allowing them to invest in order to adjust those capacities. This can happen before, during, or after a crisis event.
Relation to risk management
As part of the Resilience Engineering paradigm, noticing brittleness affords proactive safety management. Brittleness relates to how the system under investigation behaves under stress, more than to specific characteristics of the system or of threats. This approach contrasts with the traditional industrial safety paradigm of counting errors after accidents or mishaps and deriving specific risk-based interventions to reduce this count.
Illustration
Companies arrive on the fire scene and implement standard operating procedures for an active fire on the first floor of the building. The first ladder company initiates entry to the apartment on fire, while the second ladder company goes to the second floor to search for potentially trapped victims (the ‘floor above the fire’ is an acknowledged hazardous position). In the meantime, engine companies stretch hose-lines but experience various difficulties that delay their actions, especially because they cannot achieve optimal positioning of their apparatus on a heavily trafficked street. While all units are operating, conditions deteriorate because no water is yet being applied to the fire. The Incident Commander (IC) transmits an ‘all hands’ signal to the dispatcher, leading to the immediate assignment of additional companies. Almost at the same time, members operating above the fire transmit an ‘URGENT’ message over the radio. Although the IC tries to establish communication and get more information about the difficulties encountered, he has no uncommitted companies available to assist the members. Within less than a minute, a back-draft-type explosion occurs in the apartment on fire, engulfing the building’s staircase in flames and intense heat for several seconds, and erupting through the roof. As the members operating on the second floor had not been able to gain access to the apartment there due to various difficulties, they lacked both a refuge area (apartment) and an egress route (staircase). The second ladder company was directly exposed to life-threatening conditions.
In spite of its negative outcome, the situation described illustrates a practice of noticing brittleness during the response to a crisis. The Incident Commander (IC) recognised and signalled an ‘all hands’ situation in order to inform dispatchers that all companies were operating and to promptly request additional resources. ICs are particularly attentive to avoiding the risk of lacking the capacity to respond to immediate demands as well as to new demands. The ‘all hands’ signal is a recognition that the situation is precarious (brittle) because operations are vulnerable to any additional demands that may occur.
Implementation considerations
- Noticing brittleness requires that actors are familiarised with the principles of resilience. It is nonetheless a perspective and skill that can be learned (see Practice 1).
- Enhancing resilience also requires understanding why things go right. Noticing brittleness is a useful way to anticipate, react to, and learn from challenging situations, but should not be the sole focus of a resilience assessment.
- Because noticing brittleness focuses on how the system behaves under challenging situations, it is also different from understanding the threats or vulnerabilities of the system.
Some of the methods described can be carried out in short amounts of time, e.g., through workshops or focus groups (e.g., Practice 1, Method 2). However, they require:
- to be carried out by appropriately trained and knowledgeable people who can act as facilitators;
- to involve a sufficient diversity of participants to yield the most information and best results.
Cognitive Task Analysis (see Method 1) is a well-documented and widely practised method from the field of human factors. However, it is a resource- and knowledge-demanding method, best carried out by experts in the field.
Noticing brittleness requires that actors are familiarised with the principles of resilience. Resources need to be anticipated in order to develop the associated perspective and skills (see Practice 1).
Relevant Practices, Methods and Tools
- Brittleness assessment practices in industrial maintenance. Lay and Branlat (2014) describe how the necessary participants’ skills can be built through the use of study groups that aim at observing and discussing resilience and brittleness at play. A table in the document summarises examples of observations of brittleness at play. A workshop can be conducted prior to an anticipated peak season (increased demands and risk of events), during which a facilitator helps participants notice brittleness. The document describes a set of guiding questions.
- “All hands” alarm in firefighting operations. The ‘all hands’ signal is used by an Incident Commander and by the dispatcher to quickly request additional resources when all companies on site are busy. It is a recognition that the situation is precarious (brittle) because operations are vulnerable to any additional demands that may occur. See illustration in this card and Woods and Branlat (2011).
All of the methods below are relevant to both Noticing brittleness and Identifying sources of resilience; these topics simply represent different focuses of attention during the discussions. The corresponding cards can be used jointly during the implementation of the methods.
- Cognitive Task Analysis (CTA) - TRL 9 - CTAs are typically based on different techniques that capture aspects of the situations under consideration. Analyses can occur after situations were experienced. CTAs can be conducted during training situations, which provide rich and more controlled situations during which crisis-relevant data can be captured more easily. See Crandall, Klein, and Hoffman (2006).
- Resilience Engineering assessment guidance - TRL 6 - The method was developed as a complement to a traditional safety assessment, in the context of technological changes in the Air Traffic Management domain. It focuses on understanding the variability the system (people and technology) needs to handle in everyday operations and how it currently adapts and handles the more challenging situations, and, finally, on anticipating how adaptation might be hindered or improved after the implementation of the new technological system. The method relies on short workshops/interviews led by a resilience assessment expert and involving relevant stakeholders such as operators (direct users of the system or operators they interact with), managers and designers of the technology.
- Q4 Framework - TRL 2 - Visualisation to assess how the organisation is prioritising and investing in safety, and how it has reacted to adverse events. Assessment could also include measuring brittleness and evaluating the cost-effectiveness of countermeasures. See Woods, Herrera, Branlat and Woltjer (2013).
References
- Crandall, B., Klein, G. A., & Hoffman, R. R. (2006). Working Minds: A Practitioner’s Guide to Cognitive Task Analysis. Cambridge, MA: MIT Press.
- Lay, E., & Branlat, M. (2014). Noticing Brittleness, Designing for Resilience. In C. P. Nemeth & E. Hollnagel (Eds.), Resilience Engineering in Practice: the Road to Resilience. Farnham, UK: Ashgate.
- Woods, D. D., & Branlat, M. (2011). Basic Patterns in How Adaptive Systems Fail. In E. Hollnagel, J. Pariès, D. D. Woods, & J. Wreathall (Eds.), Resilience Engineering in Practice (pp. 127–144). Farnham, UK: Ashgate.
- Woods, D. D., Herrera, I., Branlat, M., & Woltjer, R. (2013). Identifying Imbalances in a Portfolio of Safety Metrics: The Q4-Balance Framework for Economy-Safety Tradeoffs. In I. Herrera, J. M. Schraagen, J. Van der Vorm, & D. Woods (Eds.), Proceedings of the 5th Resilience Engineering Association Symposium (pp. 149–154). Soesterberg, NL: Resilience Engineering Association.
Terminology
Brittleness describes how rapidly a system's performance declines when it nears and reaches its boundary conditions (Source: Woods, 2015).
- Buffer capacity
Size or kinds of disruptions the system can absorb or adapt to without a fundamental breakdown in performance. (adapted from Woods, 2006)
- Functional interdependence
Interrelationships (mutual dependence) between functions of a system.
- Operational variability
Variability and uncertainty are inherent in complex work such as disaster response; the conditions and challenges that manifest themselves are many and various. These can take the form of changes experienced in the daily life of operational units everywhere; or surprises that emerge from the interface of system elements that interact in unusual ways (e.g., hidden interactions); or challenges such as volcanic ash that defy prediction capabilities.
Work as done refers to that which people actually do [as part of their work], called Work-as-Done (WAD), in contrast to the assumptions or expectations of what other people do [as part of their work], called Work-as-Imagined (WAI) (Hollnagel, 2018, p. 17).
Work as imagined refers to the assumptions or expectations of what other people do [as part of their work], called Work-as-Imagined (WAI), in contrast to that which people actually do [as part of their work], called Work-as-Done (WAD). The term 'imagined' is not used in an uncomplimentary or negative sense but simply recognises that our descriptions of work will never completely correspond to work as it takes place in practice, that is, as it is actually done (Source: Hollnagel, 2018, pp. 17-18). It also covers how work is thought of either before it takes place, when it is being planned, or after it has taken place, when the consequences are being evaluated (Source: Wears and Hollnagel, 2015).
- Parent theme: Assessing resilience
- Resilience abilities
- Categories: Evaluation, Situation understanding, Learning lessons, Planning, Training
- Functions of crisis management: BEFORE, Preparation, Build knowledge of crisis situations, Anticipate demands in crisis response, DURING, Damage control and containment, Assess emergency and response, AFTER, Learning, Assess performance