Here are my annual ‘reflections’ from 2018. They are based on my work with clients and other things that cropped up in the year that made me think about what we are doing in process safety and human factors. As always, they are in no particular order (apart from the first one).
We passed the 30 year anniversary of the Piper Alpha disaster. It is probably the event that has most affected my career as it highlighted so many human factors and process safety issues. I know we have a better understanding of how accidents like Piper Alpha happen and how to control the risks but it is easy for things to get forgotten over time. I wrote two papers for the anniversary edition of Loss Prevention Bulletin. One looked at the role of ‘shared isolations’ (where an isolation is used for several pieces of work). The other was concerned with shift handover, which is one area that I worry industry has still not properly woken up to.
Control room design
One of my main activities this year has been to rewrite the EEMUA 201 guidance document on design of control rooms and human machine interfaces in the process industry. I have investigated a range of aspects of design and made a point of getting input from experienced control room operators, control room designers, ergonomists and regulators. This has highlighted how important the design is for the operator to maintain the situational awareness they need to perform their job safely and efficiently; and to detect problems early to avoid escalation. This is not just about providing the right data in the right format; but also making sure the operator is healthy and alert at all times so that they can handle the data effectively. A complication is that control rooms are used by many different people who have different attributes and preferences. I found that currently available guidance did not always answer the designers’ questions or address the operators’ requirements but I hope that the new version of EEMUA 201, which will be published in 2019, will make a valuable contribution.
Arguably a bigger issue than original design is the way control rooms are maintained and modified over their lifetime. There seems to be a view that adding “just another screen” or allowing the control room to become a storage area for any paperwork and equipment that people need a home for is acceptable. The control room operator’s role is highly critical and any physical modification or change to the tasks they perform or their scope of responsibility can have a significant impact. We, quite rightly, put a lot of emphasis on designing effective control rooms and so any change needs to be assessed and managed effectively taking into account all modes of operation including non-routine and emergency situations.
Safety critical maintenance tasks
Whilst I have carried out safety critical task analysis for many operating tasks over the years it is only more recently that I have had the opportunity to do the same for maintenance tasks. This has proven to be very interesting. A key difference when compared to operations is that most maintenance tasks are performed without reference to detailed procedures and there can be almost total reliance on competence of the technicians. In reality only a small proportion of maintenance tasks are safety critical, but analysis of these invariably highlights a number of potentially significant issues.
I have written a paper titled “Maintenance of bursting disks and pressure safety valves - it’s more complicated than you think.” It will be published in the Loss Prevention Bulletin in 2019. This highlights that these devices are often our last line of defence but we have no way of testing them in situ and so have to trust they will operate when required. However, there are many errors that can occur during maintenance, transport, storage and installation that can affect their reliability.
Another example of a safety critical maintenance task is testing of safety instrumented systems. This is likely to be my next paper because it is clear to me that often the testing that takes place is not actually proving reliability of the system. Another task I have looked at this year was fitting small bore tubing. It was assumed that analysing this apparently simple task would throw up very little but again a number of potential pitfalls were identified that were not immediately obvious.
Safety 2/Safety Different
I am a bit bemused by this supposedly “new” approach to safety, or at least the way it is being presented. The advocates tell us that focussing on success is far better than the “traditional” approach to safety, which they claim is focussed mainly on failure (i.e. accidents). The idea is that there are far more successes than failures so more can be learnt. Also, finding out how work is actually done instead of assuming or imagining we know what really happens is another key feature of these approaches.
I fully agree that there are many benefits of looking at how people do their job successfully and learning from that. But I do not agree that this is new. It seems to me that the people promoting Safety 2/Different have adopted a particular definition of safety, which does not, in my opinion, represent the full scope of what we have been doing. They suggest that safety has always been about looking at accidents and deciding how to prevent them happening again. There seems to be little or no acknowledgement of the many approaches taken in practice to manage risks. I certainly feel that I have spent most of my time in my 20+ year career understanding how people do their work, understanding the risks and working out the best way to support successful execution. And I have observed this in nearly every place I have ever worked. As an example, permit to work systems have been an integral part of the process industry for a number of decades. They encourage people to understand the tasks that are being performed, assess the risks and decide how the work can be carried out successfully and safely. This seems to fulfil everything that Safety 2/Different is claiming to achieve.
My current view is that Safety 2/Different is another useful tool in our safety/risk management toolbox. We should use it when it suits, but in many instances our “traditional” approaches are more effective. Overall I think the main contribution of Safety 2/Different is that it has given a label to something that we may have done more subconsciously in the past, and by doing that it can assist by prompting us to look at things a bit differently in order to see additional and/or better solutions.
Bow tie diagrams
I won’t say much about these as I covered this in last year’s Christmas email with an accompanying paper. But I am still concerned that bow tie diagrams are being oversold as an analysis technique. They offer an excellent way of visualising the way risks are managed but to do this they need to be kept simple and focussed.
I had a paper published in Loss Prevention Bulletin explaining how human bias can result in people having a misperception about how effective procedures can be at managing risk. This bias can affect people when investigating incidents and result in inappropriate conclusions and recommendations. The paper was provided as a free download by IChemE https://www.icheme.org/media/7205/lpb264_pg09.pdf.
I have written a few papers this year and have decided to share two:
1. Looking at the early stages of an emergency, pointing out that this is usually in the hands of your process operators, often with limited support. http://abrisk.co.uk/papers/
2. My views on Bowtie diagrams, which seem to be of great interest at the moment. I hope this might create a bit of debate. http://abrisk.co.uk/papers/Bowties&human_factors.pdf
My last two Christmas emails included some of my ‘reflections’ of the year. When I came to write some for 2017 I found that the same topics are being repeated. But interestingly I have had the opportunity to work on a number of these during the year with some of my clients. As always, these are in no particular order.
This is still a significant issue for industry. But it is a difficult one to address. There really is no short cut to reducing nuisance alarms during normal operations and floods of alarms during plant upsets. Adopting ‘Alerts’ (as defined in EEMUA 191) as an alternative to an alarm appears to be an effective ‘enabler’ for driving improvements. It gives people a means of dealing with something they think will be ‘interesting’ to an operator, but that is not so ‘important.’
During the year I have provided some support to a modification project. I was told the whole objective was simplification. But a lot of alarms were being proposed, with a significant proportion being given a high priority. Interestingly, no one admitted to being the person who had proposed these alarms; they had just appeared during the project, and it turned out the project did not have an alarm philosophy. We held an alarm review workshop and managed to reduce the count significantly. Some were deleted and others changed to alerts instead. The vast majority of the remaining alarms were given Low Priority.
I have had the chance to work with a couple of clients this year to review the way they implement process isolations. This has reinforced my previous observations that current guidance (HSG 253) is often not followed in practice. But having been able to examine some examples in more detail, it has become apparent that in many cases it is simply not possible to follow the guidance, and in some cases it would introduce more risk. The problem is that until we did this work people had ‘assumed’ that their methods were fully compliant both with HSG 253 and with their in-house standards, which were usually based on the same guidance.
I presented a paper at this year’s Hazards 27 on this subject, suggesting that keeping interlocks to the minimum and as simple as possible is usually better, whereas the current trend seems to be for more interlocks with increasing complexity. My presentation seemed to be well received, with several people speaking to me since saying they share my concerns. But, without any formal guidance on the subject it is difficult to see how a change of philosophy can be adopted in practice.
Human Factors in Projects
I presented a paper at EHF2017 on the subject of considering human factors in projects as early as possible. To do this human factors people need to be able to communicate effectively with other project personnel, most of whom will be engineers. Also, we need to overcome the widely held view that nothing useful can be done until later in a project when more detailed information is available.
I have had the opportunity to assist with several human factors reviews of projects this year. Several were conducted at what is often called the ‘Concept’ or ‘Select’ phase, which is very early. These proved to be very successful. We found plenty to discuss and were able to make a number of useful recommendations and develop plans for implementation. It is still too early to have the proof, but I am convinced this will lead to much better consideration of human factors in the design for these projects.
This has been a concern of mine for a very long time (since Piper Alpha in 1988). But I am frustrated that the process industry has done so little to improve the quality of handovers. It just seems to fall into the ‘too difficult’ category of work to do. It is a complex, safety critical activity performed at least twice per day. We need to manage all aspects of the handover process well, otherwise communication failures are inevitable, and some of these are likely to contribute to accidents.
I have worked with a couple of clients this year to review and improve their shift handover procedures. It is good to know some are starting to tackle this subject, but I am sure many more have work to do.
I hope you find some of this interesting. To finish, I would like to point you to a free paper available from Loss Prevention Bulletin, presenting Lessons from Buncefield.
This was on last year’s list, but it continues to be a hobby horse of mine. During the year I have had the opportunity to review in-house isolation standards for two companies. This work has further reinforced my view that there are many instances where following the guidance from HSE (HSG 253) is not achievable, and may sometimes increase risk when all factors are considered. My paper is an attempt to illustrate the real-life issues that operators and technicians have to deal with. Click here for the paper.
I am concerned that the use of interlocks is increasing dramatically with no real thought as to the benefit and potential risks. The problem is that there is no clear guidance to say what functions should be interlocked or how many interlocks should be used. And vendors are able and willing to sell ever more sophisticated and complicated interlocking solutions.
I believe that over use of interlocks encourages, or even forces, people to stop thinking about what they are doing, and they become focussed on identifying what they need to do to get the next key. I believe at some point this risk must outweigh the benefits of having interlocks in the first place.
I have tried to encourage clients on a number of occasions to reduce the number of interlocks in their design, but with little (or no) success. I think people feel that they cannot be criticised if they include the interlocks, and may be queried if they do not adopt the most ‘complete’ solution. I have submitted a paper on this subject, titled “Interlocking isolation valves – less is more,” to the Hazards 27 Conference, which takes place in May 2017.
Human Factors in Projects
Another repeat from last year. Human Factors in Projects (often known as Human Factors Engineering – HFE) is starting to become normal, which is definitely positive. I have helped two companies with generating in-house procedures for implementing HFE. In both cases the aim was to make implementation as simple as possible, whilst ensuring suitable focus was given to the most important issues.
One of the key messages is that HFE should be on the agenda as soon as possible for any project. I have had the opportunity to assist one client with two projects this year that were at a very early stage. In both cases the consensus of all participants was very positive.
I have submitted a paper titled “Human Factors Engineering at the early phases of a project” to the Ergonomics and Human Factors 2017 conference, which takes place in April.
Also, you may find this presentation on HFE interesting.
I have taken part in two investigations this year. Both highlighted human factors issues that I know crop up widely.
In one, scope creep on a maintenance task, combined with an over reliance on informal communications led to misunderstandings about plant status. The operating team, who were considered to be very competent and able, made some assumptions based on past experience, which turned out to be incorrect. The operating team were fully engaged in the investigation, and admitted that they were very disappointed with themselves for the errors they made, and wanted to understand why this had happened.
In the other, the plant was operating on the edge of its capability and multiple items of equipment were unavailable. When a problem occurred the operators perceived that their options to respond were very limited, and they reacted in a way that they thought was correct, but in hindsight simply exacerbated the problem. One thing that this investigation highlighted was how effective operators can be at ‘working around’ problems to keep the plant running. The unfortunate outcome of this is that the problems no longer appear to be so significant and so do not get resolved. However, as this incident demonstrated, this leaves the plant very vulnerable to events as there are not the safety margins available to cope.
I hope you find some of this interesting. To finish, I would like to remind you about a free publication from the Loss Prevention Bulletin summarising major accidents that have had their anniversary this year. It is available at http://www.icheme.org/lpb/free%20downloads.aspx
Management of Alarms
Alarms continue to cause problems. But I am pleased to see that most companies have started to recognise the need to modify their systems to reduce the frequency of nuisance alarms during normal operations and floods of alarms when things go wrong. And it is clear that improvements are being made.
I have assisted clients with setting up their alarm rationalisation programs and procedures; and I have been teaching a one day awareness course (based on EEMUA 191). From this I have made the following observations:
- Although it makes sense to focus on alarms, having a clear definition for an ‘alert’ can be a real enabler for people to see how they can improve their system. Whether it is a fear factor or some other concern, people find it difficult to say “that alarm can be removed.” However, they are happier to say “that alarm can be converted to an alert.” Of course we need to make sure that we don’t transfer a problem with alarms to a problem with alerts. But, we have a lot more flexibility with alerts, including how they are notified. For example, we can show them on separate summary pages, direct them to non-operational teams or automatically create daily alert reports. The result is that operators are not distracted by these lesser events if they are dealing with more important situations and ‘real’ alarms.
- EEMUA 191 introduces the concept of the “safety related” alarm (ISA 18.2 refers to them as “highly managed”). I find this term a bit confusing; and I think a lot of other people have struggled to identify which of their alarms fall into this priority. The reality is that many plants/sites will not have any alarms that satisfy the EEMUA definition of “safety related” and it is not just another priority. They are alarms that fill a gap where, in an ideal world, an automated response would be provided but this is deemed inappropriate. This means that the operator response to an alarm is considered to be a layer of protection. If credit is taken for this operator response the alarm is then considered “safety related” and it needs to be handled differently from all the other process alarms. If there is an automated protection device, the associated alarms will not be “safety related” and should be prioritised in the ‘normal’ way.
- We still have some disconnect between what the guidance says about alarms and what operators want. I think this is because, over the years, we have forced people to operate on alarms because they receive so many they don’t have the time to do anything else. When we suggest that alarm rates will be significantly reduced, operators cannot imagine how they will operate the plant if the alarms are not telling them about every little event that occurs.
- A solution to the concerns about reduced alarm frequency is to improve the quality of graphics on our control systems. This would make events more visible so that the operator does not feel they need an alarm. I think we quickly need to start looking at alarms and graphics together, which makes sense as together they make up the Human Machine Interface (HMI).
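The distinctions drawn in these observations between an alert, a normal process alarm and a “safety related” alarm can be sketched as a simple decision rule. This is only an illustration of the reasoning above, not the EEMUA 191 method itself. The `Signal` fields and the `classify` function are hypothetical names, and treating an alert as "no operator response required" is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    tag: str
    operator_response_required: bool     # assumption: alerts need no operator response
    credited_as_protection_layer: bool   # is credit taken for the operator's response?
    automated_protection: bool           # is there an automated protective device?


def classify(signal: Signal) -> str:
    """Return 'alert', 'alarm' or 'safety related alarm' for one signal."""
    if not signal.operator_response_required:
        # Interesting but not important: candidate for an alert page or daily report.
        return "alert"
    if signal.credited_as_protection_layer and not signal.automated_protection:
        # Operator response fills the gap where an automated response is absent.
        return "safety related alarm"
    # Everything else is prioritised in the normal way.
    return "alarm"
```

In a rationalisation workshop a rule like this would only be a prompt for discussion; the judgement about whether credit is really taken for the operator response still has to come from the risk assessment.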
This has been a hobby horse of mine for a while. In fact a number of people have contacted me this year having read my paper on the subject titled “process isolation – it’s more complicated than you think.”
I have had the chance to carry out task analysis for some process isolation activities during the year. This has led to some heated debate at times. Everyone is aware of the guidance from HSE (HSG 253) but is finding it difficult to apply in practice. My observations include:
- A lot of designers (and other non-operators) simply do not understand how isolations occur in practice. This was illustrated to me on a project where double block and bleed arrangements were provided. Whilst the block valves had been identified as requiring frequent access the bleeds were not and had been positioned out of reach. Clearly the designers did not recognise that the valves and bleed points were used together to form an isolation.
- It is quite common (especially on older plants) to require multiple points of isolation to perform relatively simple jobs. If every isolation needs to be proven via a bleed, there will be multiple breaks of containment to remove the blanks from each bleed point. It is not uncommon to be creating significantly more breaks of containment to prove an isolation than are involved in the job to be performed. Each break involves risk at the time of the break and on return to service. Also, it creates very high workload. Unfortunately, the guidance currently available does not provide a method of weighing up the overall risks so that sensible strategy can be selected.
- Overall, it appears that my paper from 2013 is still very valid. The last paragraph makes the point that “companies and individuals have accepted the guidance as relevant and correct but have not checked whether they can be applied in practice and/or whether the requirements are being followed. The concern is that this creates a large disconnect between theory and practice, which could result in risks being underestimated and hence improperly controlled. The solution is not simple, but being open about when the guidance cannot be followed will at least ensure alternative methods are developed that achieve similar levels of risk control.”
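The point about breaks of containment can be made concrete with a simple tally. The sketch below is not a risk assessment method (the guidance does not provide one, as noted above); it only counts how many breaks are created to prove an isolation and compares that with the breaks involved in the job itself. All names (`IsolationPoint`, `compare_breaks`) and the example figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IsolationPoint:
    name: str
    proven_via_bleed: bool  # proving means removing the blank from a bleed point


def compare_breaks(points, job_breaks):
    """Compare breaks of containment made to prove isolations with those in the job."""
    proving_breaks = sum(1 for p in points if p.proven_via_bleed)
    return {
        "proving_breaks": proving_breaks,
        "job_breaks": job_breaks,
        "proving_exceeds_job": proving_breaks > job_breaks,
    }


# Illustrative case: four isolation points, each proven via a bleed,
# for a job that itself involves two breaks of containment.
points = [IsolationPoint(f"V{i}", proven_via_bleed=True) for i in range(1, 5)]
result = compare_breaks(points, job_breaks=2)
```

A count like this says nothing about the severity of each break, but it does make visible the cases where proving the isolation creates more exposure than the work it protects.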
Human Factors in Projects
I believe it is a very positive development that human factors are now being given more consideration during the design of new process plant. I am convinced that this will result in better designs of process plant that will be easier to operate and maintain; with reduced risk of major accidents. Having been involved in quite a number of projects over recent years my observations include:
- You cannot start to consider human factors too early. There has been a perception by some people that it can only be done once a project reaches detailed design. I have never agreed with this and have been involved in two projects this year at the “Select” phase (pre-FEED). We have been very successful at identifying human factors issues that need to be addressed during the project, and by doing this early we have been able to make sure the solution is covered by the design rather than through softer controls (procedures, training and competence), which is often your only option if you do this later on.
- On the other hand, it is never too late. Whilst the preference must always be to start human factors input as early as possible, if this has not happened it is still worth doing something. Earlier this year I had to complete a human factors study on a project where the plant had already been built, although it had not been operated. By bringing together the designers, vendors and operators to discuss the potential human factors issues we identified a significant disconnect between what the plant had been designed to do and what the operator was expecting. Whilst it was too late to change anything, at least the operator knew what they had to do before start-up instead of having to learn everything on a live plant.
- We need to be careful that human factors in projects does not become an overly bureaucratic exercise. Unfortunately, on some of my projects I seem to spend more time ‘discussing’ specific details of what a standard may require instead of working towards the optimum solution for the project. I think this occurs partly because of the way some standards are written, and partly because of a general lack of knowledge amongst project personnel about what human factors is all about. Starting early and developing integration plans that are clear, concise and focussed on developing optimum solutions are the best ways, I think, of making sure human factors makes a valuable contribution.
Task Analysis and HAZOP
I wrote a paper a little while ago saying that we needed to create better linkages between the various safety studies carried out in the process industry. My view was that we are tending to do these things in isolation and missing something as a result. As an example, I felt that there must be useful links between task analysis performed as part of human factors and HAZOP performed as part of the process safety scope. I have had the opportunity to explore this idea a number of times this year. My observations include:
- HAZOP does often identify human errors within the causes of hazardous scenarios; and also procedures or training as risk controls. As a minimum, we have to make sure that we can demonstrate that the human factors associated with these causes and controls have been addressed. Task analysis is the obvious way of doing this.
- HAZOP usually differentiates between major accidents and lesser outcomes in a systematic and defensible way. Cross referencing these with our task analyses allows us to build a stronger case for the findings of our analyses and acceptance of the subsequent recommendations. I guess it helps us change perceptions of human factors so that it is not seen as so ‘abstract’ (wishy washy) and is more rooted in ‘proper’ engineering.
- Building these links between task analysis and HAZOP requires human factors people to start reading HAZOP reports. This is quite an undertaking. In fact, the size of many HAZOP reports makes it impossible for anyone to seriously sit and read them from front to back. Careful use of the ‘find’ function on Word or PDF; and a clear understanding of how the report is structured around nodes can help enormously. It is still not something to be taken lightly, but I do think this is a big part of making human factors more relevant and valuable.
- There is a big variation in the quality of HAZOP reports. One of the main problems is when similar issues are dealt with inconsistently throughout the report. The software available to assist with HAZOP seems to be starting to help in this regard. I don’t think there is much benefit in human factors people sitting through full HAZOPs, but they can work with the HAZOP leader at the start of the study to make sure there is a common understanding of what needs to be done to improve the links between HAZOP and human factors (particularly task analysis).
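The ‘find’ approach described in these observations can be scripted when a HAZOP report is available as rows of data rather than a long document. The sketch below assumes the report has been exported to a list of records with `node`, `cause` and `safeguards` fields; that layout, the keyword list and the function name are all hypothetical and would need adapting to the real report.

```python
import re
from collections import defaultdict

# Illustrative search terms only; a real study would agree these with the HAZOP leader.
KEYWORDS = ["operator", "human error", "procedure", "training"]


def find_human_factors_rows(rows, keywords=KEYWORDS):
    """Group HAZOP rows that mention a human factors keyword by node."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    by_node = defaultdict(list)
    for row in rows:
        # Search both the cause and the safeguards (risk controls) text.
        text = f"{row['cause']} {row['safeguards']}"
        if pattern.search(text):
            by_node[row["node"]].append(row)
    return dict(by_node)


rows = [
    {"node": "Node 1", "cause": "Operator opens wrong valve", "safeguards": "High level trip"},
    {"node": "Node 1", "cause": "Pump seal failure", "safeguards": "Gas detection"},
    {"node": "Node 2", "cause": "Control valve fails open", "safeguards": "Start-up procedure and training"},
]
hits = find_human_factors_rows(rows)
```

The rows returned for each node are the ones to cross reference against the task analyses; inconsistent wording in the report will still cause misses, so this supports rather than replaces reading the relevant nodes.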
I have had shift handover on my agenda for many years, ever since I studied the Piper Alpha accident as part of my PhD. I have been generally disappointed that industry has not taken the issue more seriously, especially as it has been cited as making a contribution to several other major accidents. I have suspected that it has generally fallen into the ‘too difficult’ category, largely because it is totally reliant on the behaviours of the people involved. However, I have worked with one of my clients this year to improve their procedures for shift handover, and to develop a short training course and present it to shift teams. My observations from this include:
- Communication at shift handover is far more difficult than most people think it is; and most people are not nearly as good at communicating as they think they are. The circumstances surrounding shift handover create particular challenges. In particular, the person finishing their shift will have just finished working 8 or 12 hours and is understandably keen to get home. However, the person receiving the handover, and who needs the information, does not know what they don’t know. Neither of these is conducive to effective communication.
- Individual personalities make a big difference. The shift teams I spoke to all complained about colleagues who gave poor handovers or did not appear interested when receiving a handover. It is easy to get into a ‘why bother’ frame of mind in these circumstances. But it was clear that people were often reluctant to challenge their colleagues because they did not want to create any tension given that they had to work together. I believe in a number of these cases the individuals involved had not realised that their behaviour was so critical simply because no one had ever told them.
- Preparation for the handover is absolutely crucial, and time has to be made available to do this well. Management has a very important role in making sure they communicate very clearly that this is a critical part of a shift worker’s job. They also need to make sure they do not ask for (or expect without asking) things to be done towards the end of shift that will limit the time available to prepare.
- A well-structured log sheet and end of shift handover report can make a great difference to the quality of shift handover. I am surprised at how many companies are using blank pieces of paper for these. I am also surprised at how many use only a chronological log or only an end of shift report during handover, as these two documents serve different purposes and are both needed. There is software available that can support shift handover, but it is of no value if the appropriate systems and behaviours are not in place.
I showed a couple of videos on the shift handover courses. The look on some of the operators’ faces and the “oh sh*t – that could have been us” comments highlighted to me that we all become complacent to risk. It is a natural human reaction and coping strategy. This is why we have to keep working to reduce risks, and I am sure that human factors working more closely with other elements of process safety provides us with the means of driving improvement.