When We Find the Cause of Errors, We Are Bound to Make Them Again

We are often exposed to mono-causal explanations when we read about safety and accidents. Sometimes the human is blamed as the one and only cause: it was “human error”.

✎ By Thomas Koester, Senior Human Factors Specialist, January 30th, 2020

Mono-Causal Explanations

This was seen recently in a LinkedIn post from Microsoft featuring the statement that “50 percent of all security breakdowns are caused by human error”. The statement referenced a September 2018 report by McKinsey & Company about the causes of security threats in companies related to their own employees and suppliers.

In other cases, technology is designated as the cause to blame, as seen in an article in Ingeniøren (August 20, 2019) with the headline: “US Navy drops touchscreens after fatalities”.
Both the Microsoft case and the US Navy case are examples illustrating the mono-causal approach, where accidents and mishaps are attributed to a single component in a complex system.
The mono-causal approach is problematic: it leads to interventions that are bound to fail. An intervention focusing on one single component alone doesn’t account for the inertia, complexity and mechanisms of the whole system.
The last 30 years of human factors-based safety research, and research into “human error”, clearly show that interventions focusing exclusively on the individual human in the system fail. The system just keeps reproducing the same mistakes, errors and accidents after the intervention – even after replacing the humans in the system with other humans.
And the same is the case if the focus is exclusively on any other single component of the system, for example the technology, rather than on the entire system.
But why then, after 30 years of research, is the mono-causal approach still alive? The answer is probably: because it is simple and intuitive and looks nice in headlines. One effect, one cause. If the cause can be eliminated by patching the system, then the problem must be solved. But this logic ignores the mechanisms and inertia of complex systems, and the simple and intuitive explanation becomes insufficient and, well, wrong.
This article explores the problems with mono-causal explanations and, by unfolding some brief highlights from those 30 years of research, makes it even clearer why the simple and intuitive explanation doesn’t work.

Who Should We Blame?


The expression “it was human error” was very popular as a mono-causal explanation in safety research and accident investigation, especially from the late 1970s to the early 1990s. Then, in the 1990s, the scope was extended, and the focus shifted from “human error” to “organisational error”, recognising the fact that humans are a product of the organisation they work in.
James Reason played an essential role in this thinking. He introduced models and frameworks describing how accidents are staged through causal chains. A lack of sufficient firewalls, well before the frontline operator’s action, can lead to accidents and mishaps – e.g., organisational pressures, incompatible goals, lack of supervision or simply poor management (Reason, 1990 and 1997). Blame shifted from the human frontline operator on site, where the action happens at the sharp end of the instrument, to the human managers in the corporate headquarters, distant from the action at the blunt end. The concept “sharp end – blunt end” was introduced by James Reason in his Accident Causation Model (Reason, 1990).

From Single Cause to Explanatory Factors


The concept “human error” was slowly taken out of service as a single-cause explanation in the science of human factors from around the year 2000. The 2000s introduced a new regime in safety and accident research. It focused on explanatory factors rather than single causal factors, and the focus shifted from the safety of a single component, such as humans or technology, to the system’s overall safety.


This approach could be seen in accident analysis tools used by civil aviation authorities, such as the ECCAIRS ADREP 2000 “Explanatory Factors” classification system. Years later it was also expressed in the concept of “resilience engineering” (Hollnagel et al., 2006). One of the points of resilience engineering is that system breakdowns and accidents should be prevented through good connections between the components of the system. Safety is obtained through strong systems rather than systems with strong components. Systems designed in this way can recover from single-component breakdown, which makes them less vulnerable and less sensitive to the performance of any single component.


Systems become much more robust when the focus is shifted from optimising the safety performance of each system component in isolation to optimising the overall system structure, i.e. the glue that binds the components together.

The Simple Explanations that Will Not Die


But even though modern safety and human factors research has moved far away from the mono-causal explanation format, it still pops up from time to time in the press, where it is used to explain the mechanics behind accidents and the rationale for preventing them. This way of looking at safety and accidents ignores the benefits of the system approach developed through the 1990s–2010s safety and human factors research, and reverts to favouring the simple one-to-one explanation.


The idea that we can identify one single component in the system, e.g. a person or a technology, as the cause – and solve the problem and make the system safe again by substituting this specific component with another person or technology – is outdated in research-based approaches. But it still lives on in the popular folk-culture approach to safety and accidents, and in the press.

The Pitfalls of the Mono-Causal Accident Understanding

I will present a deeper discussion of a specific example which, at first glance, as it is pictured in the headline of the article in Ingeniøren from August 20, 2019, looks mono-causal. But if you give the example a closer look and read the official accident analysis report, it unfolds a complex structure of explanations which cannot be captured by a mono-causal approach. The mono-causal approach has pitfalls. I will advocate for the system-based approach as an alternative to the simple surface understanding of the mechanisms behind accidents and mishaps – and as an alternative to the simple surface understanding of how to respond to them and prevent them from happening in the future.

The example is related to the replacement of one technology, touch screen control, with another, a physical throttle and traditional helm control, in the ship’s Integrated Bridge and Navigation System on board all DDG-51 class ships in 2020. This process was presented in Ingeniøren on August 20, 2019 with the headline: “US Navy drops touchscreens after fatalities”.
Technology alone cannot solve the problem. However, understanding something about how people interact with complex systems can.

Replacing One Technology with Another

The US Navy decided to exchange one ship control technology, a touch screen control, for another technology, a physical throttle and traditional helm control, in the ship’s Integrated Bridge and Navigation System on board all DDG-51 class ships in 2020. The decision was based on a combination of conclusions from analyses of several accidents, in which the control system was claimed to have played a contributing role in the sequence of events, and user insights from surveys in the fleet.


The decision is not bad in itself, because the touch screen control solution had significant technical flaws. But is it the one and only solution that can and will fix the problem and prevent any future accidents from happening? Most likely not.

The Lack of Human Factors Engineering

www.public.navy.mil/usff/Pages/usff-comprehensive-review.aspx

The report “Comprehensive Review of Recent Surface Force Incidents” unfolds a long list of factors contributing to these very unfortunate accidents with casualties. The problems with the touch screen control exemplify a much more general problem: a lack of Human Systems Integration (HSI) and Human Factors Engineering (HFE) when building ship systems.


Any attempt to put a technology into action in a system should, according to the HSI or HFE approach, include detailed analyses of the complete so-called socio-technical system the technology is going to be integrated into. This applies even to a rollback to old, well-known controls.

The SEPTIGON-model

A socio-technical system is a system made of human and technical components and the connections between these. One way of explaining socio-technical systems is with the SEPTIGON-model where seven nodes are all connected: Individual, Group, Technology, Physical environment, Organisation, Processes and Society and Culture (Koester, 2007).


In short, each node interacts with all the other nodes of the system. For example, the organisation or the physical environment could influence how individuals and technology interact. This interaction could in turn influence the corresponding individual–process interaction, and so on.
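The structure described above – seven nodes, each connected to every other node – is a complete graph. As a minimal illustration (not part of the original model description; the enumeration of pairwise links is my own sketch), the SEPTIGON nodes and their connections can be written out like this:

```python
from itertools import combinations

# The seven SEPTIGON nodes (Koester, 2007)
NODES = [
    "Individual", "Group", "Technology", "Physical environment",
    "Organisation", "Processes", "Society and Culture",
]

# Every node is connected to every other node: a complete graph,
# giving 7 * 6 / 2 = 21 pairwise interactions.
interactions = list(combinations(NODES, 2))
print(len(interactions))  # → 21

# Example: all interactions involving "Technology" – one link to
# each of the six other nodes.
tech_links = [pair for pair in interactions if "Technology" in pair]
print(len(tech_links))  # → 6
```

The point of the sketch is simply that a change to any one node (such as swapping out a technology) touches six direct links and, indirectly, every other interaction in the system.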

STReBa is an abbreviation for the systematic analysis of Socio-Technical Resources & Barriers


A generalised version of the SEPTIGON-model is STReBa, an analytic tool used in the design of products, services and environments. STReBa can help promote resources and eliminate or minimise barriers in a complex socio-technical system.

The touch screen control is a technology in the physical environment of the ship’s bridge, and the crew in this environment interact with the technology to control the ship. In doing so, they follow procedures given by the organisation and control the ship according to rules agreed by society – in this case, the IMO international rules for the prevention of collisions at sea. The touch screen is, therefore, one component in a much bigger socio-technical system. A full understanding of the touch screen’s role in this system would require a socio-technical approach, as laid out, for example, in the HSI and HFE methodologies.

Early Warnings


Touch controls are not necessarily bad in nature, but the integration of touch controls into the socio-technical systems of the DDG-51 class ships had failed. Why? The main reason, as pointed out by the report, was the absence of a structured HSI or HFE approach.


They could have used the HSI or HFE approach as an instrument to evaluate the benefits and drawbacks, and to identify risks and problems associated with the integration of touch controls into the specific socio-technical system of the DDG-51 class. The approach could also have suggested countermeasures and solutions, and it would have contributed an early and proactive pre-implementation identification of the flaws in the technology. Furthermore, the lack of familiarity and experience with touch controls, and the problems this gives rise to, would have been highlighted at an early stage, well before implementation.

Committing the Same Mistake Again


Touch screen controls are now being rolled back to a physical throttle and traditional helm control on the DDG-51 class. This is a decision to take one technology out of the socio-technical system of the DDG-51 and replace it with another, on the assumption that this will eliminate the problems related to the touch screen controls. However, even though the replacement is a well-known technology, the rollback still carries the inherent risk of bypassing the HSI or HFE approach.

Will the rollback to the old technology, being just another technology shift, introduce new and yet unseen problems? Was the observed risk associated with the change in technology from manual to touch controls, rather than with the technology itself? And will this risk be introduced again when changing back? You can find these answers in the HSI and HFE approach, where technologies are evaluated in terms of the socio-technical context they are going to be implemented in, rather than as stand-alone objects that can be exchanged without interfering with the rest of the system.

The socio-technical context is not only relevant when exchanging one technology for another, as in the US Navy example. It also applies to the understanding of the human element in complex systems. There is no simple explanation like “it was human error”, as in the Microsoft example. Accidents and mishaps should be explained from a multi-factorial perspective including all components in the system and how they interact. The system is bound to fail again if this approach is bypassed by a quick fix – if we only react to a single causal factor, be it the human or the technology.

References

  • Reason, J. (1990). Human Error. Cambridge: Cambridge University Press.

  • Reason, J. (1997). Managing the risks of organizational accidents. Aldershot: Ashgate.

  • Koester, T. (2007). Terminology Work in Maritime Human Factors. Situations and Socio-Technical Systems. Copenhagen: Frydenlund Publishers.

* * *

Next Steps

Get in Contact

If you want to learn more about the Design Psychology approach to design, please reach out to:

Director of Socio-Technical Systems
Senior Human Factors Specialist:

Thomas Koester
+45 3126 2072

Services

DPsy offers services delivering in-depth user insights based on socio-technical analyses.

Socio-technical analyses are crucial in situations where, for example, one technology is going to be replaced by another with the intention of enhancing the safety or performance of the system as a whole. It is important to understand how people will interact with the system after it has been changed, and what problems, risks or benefits you should be prepared for in this context. Read more about the socio-technically based user insight services at DPsy here:

Socio-Technical Services