Wishing you a very Happy Christmas

The following reflections are based on various things that cropped up in the last year. As always, they are in no particular order.

COVID-19

Let’s get this over with at the start. The pandemic has certainly caused me to think:

  • Managing risks is difficult enough. Mixing in politics and public opinion magnifies this enormously;
  • We have to remember what it means to be human. Restricting what we can do may reduce the immediate risk but life is for living and there has to be a balance, especially when you take into account the knock-on effects of measures taken;
  • Our plans for managing crises usually assume external support and bought in services. These are far more difficult to obtain when there is a global crisis and everyone wants the same.

I have not given a great deal of thought to how these points can be applied to my work but they do give a bit of perspective and have affected many aspects of work, including management of our 'normal' risk.

Critical Communications

I have shared my views on shift handover several times over the years and it is still something we need to improve significantly. However, this year’s pandemic has highlighted how much we rely on person to person communication in many different ways. Infection control and social distancing disrupted this greatly. Just look at how many Skype/Teams/Zoom calls you have had this year compared to last!

My concern is that companies are failing to recognise the importance of communication or the fact that a lot of it takes place informally. You can hold very successful meetings over the internet but with people working separately you miss all those chance encounters and opportunities to have a chat when passing. These are the times when more exploratory discussions take place. They are safe times when you can ask daft questions and throw in wild ideas. Even if most of the time is spent talking about the weather or football, they help you get to know your colleagues better, which fosters teamwork.

I posted an article on LinkedIn to highlight the issues with communication and COVID-19, which you can access at https://www.linkedin.com/pulse/dont-overlook-process-safety-andy-brazier. My concern was that companies had implemented the measures they needed to handle the personal health aspects of the pandemic but were not considering the knock-on effects on communication. This is a classic management of change issue. It is easy to focus on what you want or need to do. But you always need to be aware of the unintended consequences.

Posting my article led to discussions with software company eschbach and we wrote a whitepaper together. You can download it from their website at https://www.eschbach.com/en/blog/posts/manufacturing-in-a-crisis.php

An illustration of how companies do not always think about communication has arisen when looking at shift patterns. It is quite right that risks of fatigue from working shifts have to be managed, but that is not the only concern. For 12 hour shifts a fairly standard pattern is to work 2 days, 2 nights and then have 4 days off. This is simple and does well on fatigue calculations. The first day shift rotates through the days. This means that at one part of the cycle the first day is on a Saturday and the following week it is on a Sunday. The problem is that at the weekend the day staff are not present, so if important information is missed at the handover there is no one available to fill in the gaps or answer questions.
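The rotation described above follows from simple arithmetic: the pattern is 8 days long, one day longer than a calendar week, so the first day shift moves on by one weekday each cycle. A minimal sketch of that arithmetic is below. The start date and function names are hypothetical, chosen only for illustration, and it models a single shift team rather than a full roster.

```python
from datetime import date, timedelta

# Pattern described in the text: 2 day shifts, 2 night shifts, 4 days off.
PATTERN = ["Day", "Day", "Night", "Night", "Off", "Off", "Off", "Off"]
CYCLE_LENGTH = len(PATTERN)  # 8 days, one longer than a calendar week


def first_day_shifts(start: date, cycles: int) -> list[date]:
    """Return the date of the first day shift in each successive cycle."""
    return [start + timedelta(days=CYCLE_LENGTH * n) for n in range(cycles)]


def weekend_starts(start: date, cycles: int) -> list[date]:
    """Return the first day shifts that fall on a Saturday or Sunday."""
    return [d for d in first_day_shifts(start, cycles) if d.weekday() >= 5]


if __name__ == "__main__":
    start = date(2020, 1, 6)  # hypothetical start date (a Monday)
    for d in first_day_shifts(start, 7):
        print(d.isoformat(), d.strftime("%A"))
```

Over any seven consecutive cycles the first day shift lands once on each day of the week, so two of every seven returns from a break happen at a weekend when the day staff are not present.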

Alternative shift patterns are available that ensure the first day shift always happens on a weekday when the day workers are also present. The patterns are a bit more complicated and may involve working one or two additional shifts before having a break, so don’t score so well in the fatigue calculations. I am not saying that everyone should change to a shift pattern like that, but I am pointing out that we need to give more recognition to communication and put more effort into supporting it.

The table below shows a comparison of Standard vs an Alternative shift pattern (days of the week along the top).

Control room design

Last year’s news was the publication of the 3rd edition of EEMUA 201 “Control Rooms: A Guide to their Specification, Design, Commission and Operation.” Unfortunately, it is not available free unless you, or your employer, are a member of EEMUA. However, a free download is now available at https://www.eemua.org/Products/Publications/Checklists/EEMUA-control-rooms-checklist.aspx that includes a high level Human Factors Integration Plan template that can be used for new or modification control room projects (actually it is not a bad template for any type of project). It also includes a checklist for evaluating control rooms at either the design or operational stage. I have used the checklist a number of times this year and am pleased to confirm that it really is very effective and useful. You should really use it with the 201 guide, but even on its own the checklist shows you what to consider and it will probably be a useful way of persuading your employer to buy a copy of EEMUA 201.

On the subject of control rooms I published another article on LinkedIn this year titled “Go and tidy your (Control) room.” I used COVID-19 to reinforce the message I often give my clients about the state of their control rooms. My opinion is that we need to make sure our control room operators are always at the top of their game and having a pleasant and healthy place to work can help this. Unfortunately the message often seems to fall on deaf ears. Access the article at https://www.linkedin.com/pulse/go-tidy-your-control-room-andy-brazier/
 
Internet of Things (IoT)

This is a popular buzz word at the moment. The idea is that over the decades technology has given us more and more devices. Recently they have become smarter and so perform more functions autonomously. But there is even greater potential if they can be connected to each other, especially if there is an internet or cloud based system that can perform higher level functions.

Whilst I have a passing interest in the technology I am far more interested in how people fit into this future. The normal idea seems to be that people are just one of the ‘things’ that can be connected to the devices via the cloud. I have a number of problems with this. Firstly, ergonomics and human factors has shown us that technology often fails to achieve its potential because people cannot use it effectively or simply don’t want to. Perhaps more significantly, I feel that the current focus on the technology means that the potential to harness human strengths will be missed.

It is true that simple systems can be automated reasonably easily, and if they are used widely the investment in developing the technology can be justified. But automating more complicated systems is far more difficult. The driverless or autonomous car gives us a very good example. How many billions of pounds/dollars have already been spent on developing that technology? There will probably be a good return on this investment in the end because once it is working effectively it will result in many thousands or millions of car sales. Industrial and process systems are complicated and tend to be unique. It is unthinkable that such massive investment will be made into developing automation that can handle every mode of operation and handle every conceivable event. This is why we still need people, and will do for many years to come.

Although the focus is currently on the technology, I think the greatest advances are going to come from using IoT to support people rather than replace them. By understanding what people can do better than technology we will achieve much more reliable and efficient systems.

Loss Prevention Bulletin

Good news for all members of the Institution of Chemical Engineers (IChemE) is that they will have free access to Loss Prevention Bulletin from January. I believe this is a very significant step forward, making practical and accessible information about process safety readily available to so many more people. I have had a couple of papers published in the bulletin this year. They both use slightly quirky but tragic case studies to illustrate important safety messages. You can download them at

http://abrisk.co.uk/papers/2020%20lpb274_pg07%20Abergele%20Train%20Crash.pdf
http://abrisk.co.uk/papers/2020%20lpb275_pg11%20Dutch%20surfers.pdf
 
Kletz compendium

Finally, over the last couple of years a small group of us have been writing a compendium of Trevor Kletz’s work with the aim of introducing his stories to a new generation and bringing his ideas up to date. Pre-orders are now being taken with a publication date of 18th January 2021. More details are on the publisher’s website at https://www.elsevier.com/books/trevor-kletz-compendium/brazier/978-0-12-819447-8

I hope you enjoy reading my reflections of 2020 and that you have a happy and healthy 2021.

Wishing you a very Happy Christmas
Here are my ‘reflections’ from 2019 based on my work with clients and other things that cropped up in the year that made me think about what we are doing in process safety and human factors.  As always, these are in no particular order.

Control room design
The great news is that the new, 3rd edition of EEMUA 201 was published this year. It has been given the title “Control Rooms: A Guide to their Specification, Design, Commission and Operation.” I was the lead author of this rewrite, and it was fascinating for me to have the opportunity to delve deeper into issues around control room design; especially where theory does not match the feedback from control room operators.
I would love to be able to send you all a copy of the updated guide but unfortunately it is a paid for publication (free for some EEMUA members). However, I have just had a paper published in The Chemical Engineer describing the guide and this is available to download free at https://www.thechemicalengineer.com/features/changing-rooms.
Now it has been published my advice about how to use the updated guide is as follows:

  • If you are planning a new control room or upgrading or significantly changing an existing one you should be using the template Human Factors Integration Plan that is included as Appendix 1. This will ensure you consider the important human factors and follow current good practice;
  • If you have any form of control room and operate a major hazard facility you should conduct a review using the checklist that is included as Appendix 2. This will allow you to identify any gaps you may have between your current design and latest good practice.

If you have any comments or questions about the updated guide please let me know.

Quantifying human reliability
It has been a bit of a surprise to me that human reliability quantification has cropped up a few times this year. I had thought that there was a general consensus that it was not a very useful thing to attempt.
One of the things that has prompted discussions has come from the HSE’s guidance for assessors, which includes a short section that starts “When quantitative human reliability assessment (QHRA) is used…”. This has been interpreted by some people to mean that quantification is an expectation. My understanding is that this is not the case, but in the recognition that it still happens HSE have included this guidance to make sure any attempts to quantify human reliability are based on very solid task analyses.
My experience is that a good quality task and human error (qualitative) analysis provides all the information required to determine whether the human factors risks are As Low As Reasonably Practicable (ALARP). This means there is no added value in trying to quantify human reliability and the effort it requires can be counter-productive, particularly as applicable data is sparse (non-existent). Maybe the problem is that task analysis is not considered to be particularly exciting or sexy? Also, I think that a failure to fully grasp the concept of ALARP could be behind the problem.
My view is that demonstrating risks are ALARP requires the following two questions to be answered:

  1. What more can be done to reduce risks further?
  2. Why have these things not been done?

Maybe the simplicity of this approach is putting people off and they relish the idea of using quantification to conduct some more ‘sophisticated’ cost benefit analyses. But I really do believe that sticking to simple approaches is far more effective.
Another thing that has prompted discussions about quantification is that some process safety studies (particularly LOPA) include look-up tables of generic human reliability data. People feel compelled to use these to complete their assessment.
I see the use in other process safety studies (e.g. LOPA) as a different issue to stand alone human reliability quantification. There does seem to be some value in using some conservative figures (typically a human error rate of 0.1) to allow the human contribution to scenarios to be considered. If the results achieved do not appear sensible a higher human reliability figure can be used to determine how sensitive the system is to human actions.
It is possible to conclude that the most sensible approach to managing risks is to place higher reliance on the human contribution. If this is the case it is then necessary to conduct a formal and detailed task analysis to justify this; and to fully optimise Performance Influencing Factors (PIF) to ensure that this will be achieved in practice.
It is certainly worth looking through your LOPA studies to see what figures have been used for human reliability and whether sensible decisions have been made. You may find you have quite a lot of human factors work to do!
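The sensitivity check described above for LOPA can be sketched in a few lines. This is only an illustration of the usual arithmetic (initiating event frequency multiplied by the probability of failure on demand of each protection layer), not a substitute for a proper study. The function names and all of the figures are hypothetical; only the conservative human error rate of 0.1 comes from the text.

```python
from math import prod


def mitigated_frequency(initiating_per_year: float, pfds: list[float]) -> float:
    """Initiating event frequency multiplied by the probability of failure
    on demand (PFD) of each independent protection layer."""
    return initiating_per_year * prod(pfds)


def human_sensitivity(initiating_per_year, other_pfds, human_error_probs):
    """Scenario frequency for each candidate human error probability."""
    return {
        hep: mitigated_frequency(initiating_per_year, other_pfds + [hep])
        for hep in human_error_probs
    }


if __name__ == "__main__":
    # All figures below are hypothetical, chosen only to show the arithmetic.
    results = human_sensitivity(
        initiating_per_year=0.1,            # assumed initiating event frequency
        other_pfds=[0.01],                  # assumed non-human protection layer
        human_error_probs=[1.0, 0.1, 0.01], # no credit, conservative, higher reliance
    )
    for hep, freq in results.items():
        print(f"human error probability {hep}: {freq:.1e} per year")
```

If the result only looks acceptable when the human error probability is set below the conservative 0.1, that is the signal that a detailed task analysis and optimised Performance Influencing Factors are needed to justify the figure.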

Maintaining bursting discs and pressure safety valves
I am pleased to say that my paper titled “Maintenance of bursting disks and pressure safety valves - it’s more complicated than you think” was published in the Loss Prevention Bulletin in 2019. It highlights that these devices are often our last line of defence but we have minimal opportunities to test them in situ and so have to trust they will operate when required. However, there are many errors that can occur during maintenance, transport, storage and installation that can affect their reliability. http://abrisk.co.uk/papers/2019 LPB266_pg06 Bursting discs and pressure safety valves.pdf
Unfortunately I have still not written my next paper in the series, which will be on testing of Safety Instrumented Systems (SIS). It is clear to me that often the testing that takes place is not actually proving reliability of the system. Perhaps I will manage it in 2020.
However, I did have another paper published in The Chemical Engineer. It is actually a reprint of a paper published in Loss Prevention Bulletin in 2013, so many of you have seen it before. It is about process isolations being more complicated than you think. I know this is still a very relevant subject. http://abrisk.co.uk/papers/2019 TCE Degrees of separation.pdf

Inherent Safety
I have been aware of the general concept of Inherent Safety for a long time, with Trevor Kletz’s statement “what you don’t have can’t leak” explaining the main idea so clearly. However, I have looked a bit more deeply into the concept in recent months and am now realising it is not as simple as I thought.
One thing that I now understand is that an inherently safe solution is not always the safest option when all risks are taken into account. The problem is that it often results in risk being transferred rather than eliminated; resulting in arrangements that are more difficult to understand and control.
I am still sure that inherent safety is very important but maybe it is not thought about carefully enough. The problem seems to be a lack of tools and techniques. I am aware that it is often part of formal evaluations of projects at the early Concept stage (e.g. Hazard Study 0) but I see little evidence of it at later stages of projects or during operations and maintenance.
I have a couple of things going on at the moment where I am hoping we will develop the ideas about inherent safety a bit. They are:

  1. I am part of a small team writing a book - a Trevor Kletz compendium. We are aiming to introduce a new audience to his work and remind others who may not have looked at it for a while that much of it is still very relevant. A second, equally important aim is to review some of Trevor’s ideas in a current context (including inherent safety) and to use recent incidents to illustrate why they are still so important. We hope to publish late 2020, so watch this space.
  2. I am currently working on a paper for Hazards 30 with a client on quite an ambitious topic. It will be titled “Putting ‘Reasonably Practicable’ into managing process safety risks in the real world.” Inherent safety is an integral part of the approach we are working on.

I hope you enjoy reading my reflections of 2019 and that you have a happy and healthy 2020.

Andy

Here are my annual ‘reflections’ from 2018.  They are based on my work with clients and other things that cropped up in the year that made me think about what we are doing in process safety and human factors.  As always, they are in no particular order (apart from the first one).
Piper Alpha
We passed the 30 year anniversary of this disaster.  It is probably the event that has most affected my career as it highlighted so many human factors and process safety issues.  I know we have a better understanding of how accidents like Piper Alpha happen and how to control the risks but it is easy for things to get forgotten over time.  I wrote two papers for the anniversary edition of Loss Prevention Bulletin.  One looked at the role of ‘shared isolations’ (where an isolation is used for several pieces of work).  The other was concerned with shift handover, which is one area that I worry industry has still not properly woken up to.

Control room design
One of my main activities this year has been to rewrite the EEMUA 201 guidance document on design of control rooms and human machine interfaces in the process industry.  I have investigated a range of aspects of design and made a point of getting input from experienced control room operators, control room designers, ergonomists and regulators.  This has highlighted how important the design is for the operator to maintain the situational awareness they need to perform their job safely and efficiently; and to detect problems early to avoid escalation.  This is not just about providing the right data in the right format; but also making sure the operator is healthy and alert at all times so that they can handle the data effectively.  A complication is that control rooms are used by many different people who have different attributes and preferences.  I found that currently available guidance did not always answer the designers’ questions or address the operators’ requirements but I hope that the new version of EEMUA 201, which will be published in 2019, will make a valuable contribution.

Arguably a bigger issue than original design is the way control rooms are maintained and modified over their lifetime.  There seems to be a view that adding “just another screen” or allowing the control room to become a storage area for any paperwork and equipment that people need a home for is acceptable.  The control room operator’s role is highly critical and any physical modification or change to the tasks they perform or their scope of responsibility can have a significant impact.  We, quite rightly, put a lot of emphasis on designing effective control rooms and so any change needs to be assessed and managed effectively taking into account all modes of operation including non-routine and emergency situations.

Safety critical maintenance tasks
Whilst I have carried out safety critical task analysis for many operating tasks over the years it is only more recently that I have had the opportunity to do the same for maintenance tasks.  This has proven to be very interesting.  A key difference when compared to operations is that most maintenance tasks are performed without reference to detailed procedures and there can be almost total reliance on competence of the technicians.  In reality only a small proportion of maintenance tasks are safety critical, but analysis of these invariably highlights a number of potentially significant issues.

I have written a paper titled “Maintenance of bursting disks and pressure safety valves - it’s more complicated than you think.”  It will be published in the Loss Prevention Bulletin in 2019.  This highlights that these devices are often our last line of defence but we have no way of testing them in situ and so have to trust they will operate when required.  However, there are many errors that can occur during maintenance, transport, storage and installation that can affect their reliability.

Another example of a safety critical maintenance task is testing of safety instrumented systems.  This is likely to be my next paper because it is clear to me that often the testing that takes place is not actually proving reliability of the system.  Another task I have looked at this year was fitting small bore tubing.  It was assumed that analysing this apparently simple task would throw up very little but again a number of potential pitfalls were identified that were not immediately obvious.

Safety 2/Safety Different
I am a bit bemused by this supposedly “new” approach to safety, or at least the way it is being presented.  The advocates tell us that focussing on success is far better than the “traditional” approach to safety, which they claim is focussed mainly on failure (i.e. accidents).  The idea is that there are far more successes than failures so more can be learnt.  Also, finding out how work is actually done instead of assuming or imagining we know what really happens is another key feature of these approaches.

I fully agree that there are many benefits of looking at how people do their job successfully and learning from that.  But I do not agree that this is new.  It seems to me that the people promoting Safety 2/Different have adopted a particular definition of safety, which does not, in my opinion, represent the full scope of what we have been doing.  They suggest that safety has always been about looking at accidents and deciding how to prevent them happening again.  There seems to be little or no acknowledgement of the many approaches taken in practice to manage risks.  I certainly feel that I have spent most of my time in my 20+ year career understanding how people do their work, understanding the risks and working out the best way to support successful execution.  And I have observed this in nearly every place I have ever worked.  As an example, permit to work systems have been an integral part of the process industry for a number of decades.  They encourage people to understand the tasks that are being performed, assess the risks and decide how the work can be carried out successfully and safely.  This seems to fulfil everything that Safety 2/Different is claiming to achieve.

My current view is that Safety 2/Different is another useful tool in our safety/risk management toolbox.  We should use it when it suits, but in many instances our “traditional” approaches are more effective.  Overall I think the main contribution of Safety 2/Different is that it has given a label to something that we may have done more subconsciously in the past, and by doing that it can assist by prompting us to look at things a bit differently in order to see additional and/or better solutions.

Bow tie diagrams
I won’t say much about these as I covered this in last year’s Christmas email with an accompanying paper.  But I am still concerned that bow tie diagrams are being oversold as an analysis technique.  They offer an excellent way of visualising the way risks are managed but to do this need to be kept simple and focussed. 
 
And finally
I had a paper published in Loss Prevention Bulletin explaining how human bias can result in people having a misperception about how effective procedures can be at managing risk.  This bias can affect people when investigating incidents and result in inappropriate conclusions and recommendations.  The paper was provided as a free download by IChemE https://www.icheme.org/media/7205/lpb264_pg09.pdf.

I have written a few papers this year and have decided to share two of them:

 

1. Looking at the early stages of an emergency, pointing out that this is usually in the hands of your process operators, often with limited support.  http://abrisk.co.uk/papers/2017%20LPB254pg09%20-%20Emergency%20Procedures.pdf
2. My views on Bowtie diagrams, which seem to be of great interest at the moment.  I hope this might create a bit of debate. http://abrisk.co.uk/papers/Bowties&human_factors.pdf

My last two Christmas emails included some of my ‘reflections’ of the year.  When I came to write some for 2017 I found that the same topics are being repeated.  But interestingly I have had the opportunity to work on a number of these during the year with some of my clients.  As always, these are in no particular order.

Alarm management
This is still a significant issue for industry.  But it is a difficult one to address.  There really is no short cut to reducing nuisance alarms during normal operations and floods of alarms during plant upsets.  Adopting ‘Alerts’ (as defined in EEMUA 191) as an alternative to an alarm appears to be an effective ‘enabler’ for driving improvements.  It provides a means of dealing with something that people think will be ‘interesting’ to an operator, but that is not so ‘important.’

During the year I have provided some support to a modification project.  I was told the whole objective was simplification.  But a lot of alarms were being proposed, with a significant proportion being given a high priority.  Interestingly, no one admitted to being the person who had proposed these alarms, they had just appeared during the project, and it turned out the project did not have an alarm philosophy.  We held an alarm review workshop and managed to reduce the count significantly.  Some were deleted and others changed to alerts instead.  The vast majority of the remaining alarms were given Low Priority.

Process isolations
I have had the chance to work with a couple of clients this year to review the way they implement process isolations.  This has reinforced my previous observations that current guidance (HSG 253) is often not followed in practice.  But having been able to examine some examples in more detail, it has become apparent that in many cases it is simply not possible to follow the guidance, and in some cases it would introduce more risk.  The problem is that until we did this work people had ‘assumed’ that their methods were fully compliant both with HSG 253 and with their in-house standards, which were usually based on the same guidance.

Interlocks
I presented a paper at this year’s Hazards 27 on this subject, suggesting that keeping interlocks to the minimum and as simple as possible is usually better, whereas the current trend seems to be for more interlocks with increasing complexity.  My presentation seemed to be well received, with several people speaking to me since saying they share my concerns.  But, without any formal guidance on the subject it is difficult to see how a change of philosophy can be adopted in practice. 
 
Human Factors in Projects
I presented a paper at EHF2017 on the subject of considering human factors in projects as early as possible.  To do this human factors people need to be able to communicate effectively with other project personnel, most of whom will be engineers.  Also, we need to overcome the widely held view that nothing useful can be done until later in a project when more detailed information is available.
I have had the opportunity to assist with several human factors reviews of projects this year.  Several were conducted at what is often called the ‘Concept’ or ‘Select’ phase, which is very early.  These proved to be very successful.  We found plenty to discuss and were able to make a number of useful recommendations and develop plans for implementation.  It is still too early to have the proof, but I am convinced this will lead to much better consideration of human factors in the design for these projects.

Shift Handover
This has been a concern of mine for a very long time (since Piper Alpha in 1988).  But I am frustrated that the process industry has done so little to improve the quality of handovers.  It just seems to fall into the ‘too difficult’ category of work to do.  It is a complex, safety critical activity performed at least twice per day.  We need to manage all aspects of the handover process well, otherwise communication failures are inevitable, and some of these are likely to contribute to accidents.
I have worked with a couple of clients this year to review and improve their shift handover procedures.  It is good to know some are starting to tackle this subject, but I am sure many more have work to do.
 
I hope you find some of this interesting.  To finish, I would like to point you to a free paper available from Loss Prevention Bulletin, presenting Lessons from Buncefield.

Process Isolations
This was on last year’s list, but it continues to be a hobby horse of mine.  During the year I have had the opportunity to review in-house isolation standards for two companies.  This work has further reinforced my view that there are many instances where following the guidance from HSE (HSG 253) is not achievable, and may sometimes increase risk when all factors are considered.  My paper is an attempt to illustrate the real-life issues that operators and technicians have to deal with.  Click here for the paper.

Interlocks
I am concerned that the use of interlocks is increasing dramatically with no real thought as to the benefit and potential risks.  The problem is that there is no clear guidance to say what functions should be interlocked or how many interlocks should be used.  And vendors are able and willing to sell ever more sophisticated and complicated interlocking solutions.

I believe that over use of interlocks encourages, or even forces, people to stop thinking about what they are doing, and they become focussed on identifying what they need to do to get the next key.  I believe at some point this risk must outweigh the benefits of having interlocks in the first place.
I have tried to encourage clients on a number of occasions to reduce the number of interlocks in their design, but with little (or no) success.  I think people feel that they cannot be criticised if they include the interlocks, and may be queried if they do not adopt the most ‘complete’ solution.  I have submitted a paper on this subject, titled “Interlocking isolation valves – less is more”, to the Hazards 27 Conference, which takes place in May 2017.

Human Factors in Projects
Another repeat from last year.  Human Factors in Projects (often known as Human Factors Engineering – HFE) is starting to become normal, which is definitely positive.  I have helped two companies with generating in-house procedures for implementing HFE.  In both cases the aim was to make implementation as simple as possible, whilst ensuring suitable focus was given to the most important issues.

One of the key messages is that HFE should be on the agenda as soon as possible for any project.  I have had the opportunity to assist one client with two projects this year that were at a very early stage.  In both cases the consensus of all participants was very positive.

I have submitted a paper titled “Human Factors Engineering at the early phases of a project” to the Ergonomics and Human Factors 2017 conference, which takes place in April.

Also, you may find this presentation on HFE interesting.

Incident Investigations
I have taken part in two investigations this year.  Both highlighted human factors issues that I know crop up widely.

In one, scope creep on a maintenance task, combined with an over reliance on informal communications led to misunderstandings about plant status.  The operating team, who were considered to be very competent and able, made some assumptions based on past experience, which turned out to be incorrect.  The operating team were fully engaged in the investigation, and admitted that they were very disappointed with themselves for the errors they made, and wanted to understand why this had happened.

In the other, the plant was operating on the edge of its capability and multiple items of equipment were unavailable.  When a problem occurred the operators perceived that their options to respond were very limited, and they reacted in a way that they thought was correct, but in hindsight simply exacerbated the problem.  One thing that this investigation highlighted was how effective operators can be at ‘working around’ problems to keep the plant running.  The unfortunate outcome of this is that the problems no longer appear to be so significant and so do not get resolved.  However, as this incident demonstrated, this leaves the plant very vulnerable to events as there are not the safety margins available to cope.

I hope you find some of this interesting.  To finish, I would like to remind you about a free publication from the Loss Prevention Bulletin summarising major accidents that have had their anniversary this year.  It is available at http://www.icheme.org/lpb/free%20downloads.aspx

Andy