HAZARD IDENTIFICATION, RISK ASSESSMENT AND CONTROL MEASURES FOR MAJOR HAZARD FACILITIES
BOOKLET 4
Hazard identification, risk assessment and control measures for Major Hazard Facilities – Booklet 4
TABLE OF CONTENTS
1 Who is this booklet for?
2 What does the booklet aim to do?
3 Hazard identification, risk assessment and control measures introduction
4 Hazard identification
4.1 The importance of getting the hazard identification right
4.2 Features of HAZID
4.3 Hazard identification processes and techniques
4.4 Review, revision and typical problems
5 Risk assessment
5.1 Risk assessment aims
5.2 Examples of risk assessment methods
6 Control measures
6.1 Introduction
6.2 What is a control measure?
6.3 Understanding control measures
6.4 Selecting and rejecting control measures
6.5 Additional or alternative control measures
6.6 Defining performance indicators for control measures
6.7 Critical operating parameters
6.8 Involving employees in control measures
6.9 Control measures within the safety report and SMS
6.10 Reviewing and revising control measures
6.11 SMS - A suggested combination of key elements
1 Who is this booklet for?
This booklet has been produced for employers in control of a facility that has been classified by Comcare as a major hazard facility under Part 9 of the Occupational Health and Safety (Safety Standards) Regulations 1994.
2 What does the booklet aim to do?
This booklet provides guidance on key principles and issues to be taken into account when conducting an effective hazard identification and risk assessment at a major hazard facility (MHF) that is subject to Commonwealth legislation. It also describes the type and nature of control measures that the employer in control of an MHF should consider. These processes should be consistent with the safety management system (SMS). This booklet is a guide to the intent of the Regulations, but employers in control of an MHF should refer to the Regulations for specific requirements. In addition, this booklet refers to hazard identification and risk assessment techniques that may be appropriate for MHF purposes (and described in the facility SMS – refer to Booklet 3), but these may not be the only techniques in use at a facility; other techniques include job safety analysis and task analysis.
3 Hazard identification, risk assessment and control measures introduction
Hazard identification (HAZID) and risk assessment involve a critical sequence of information gathering and the application of a decision-making process. These assist in discovering what could possibly cause a major accident (hazard identification), how likely it is that a major accident would occur and the potential consequences (risk assessment), and what options there are for preventing and mitigating a major accident (control measures). These activities should also assist in improving operations and productivity and in reducing the occurrence of incidents and near misses. There are many different techniques for carrying out hazard identification and risk assessment at an MHF. The techniques vary in complexity and should match the circumstances of the MHF. Collaboration between management and staff is fundamental to achieving effective and efficient hazard identification and risk assessment processes.
4 Hazard identification
The Regulations require the employer, in consultation with employees, to identify:
a) all reasonably foreseeable hazards at the MHF that may cause a major accident; and
b) the kinds of major accidents that may occur at the MHF, the likelihood of a major accident occurring and the likely consequences of a major accident.
4.1 The importance of getting the hazard identification right
Major accidents by their nature are rare events, which may be beyond the experience of many employers. These accidents tend to be low frequency, high consequence events as illustrated in Figure 1 below. However, the circumstances or conditions that could lead to a major accident may already be present, and the risks of such incidents should be proactively identified and managed.
Figure 1: HAZID focus on rare events
HAZID must address potentially rare events and situations to ensure the full range of major accidents and their causes is identified. To achieve this, employers should:
a) identify and challenge assumptions and existing norms of design and operation to test whether they may contain weaknesses;
b) think beyond the immediate experience at the specific MHF;
c) recognise that existing controls and procedures cannot always be guaranteed to work as expected; and
d) learn lessons from similar organisations and businesses.
Some significant challenges in carrying out an effective HAZID are:
a) substantial time is needed to identify all hazards and potential major accidents and to understand the complex circumstances that typify major accidents;
b) the need for a combination of expertise in HAZID techniques, knowledge of the facility and systematic tools;
c) the possibility that a combination of different HAZID techniques may be needed, depending on the nature of the facility, to ensure that the full range of factors (e.g. human and engineering) is properly considered;
d) obtaining information on HAZID from a range of sources and opinions; and
e) ensuring objectivity during the HAZID process.
Comcare must be satisfied that hazard identification has been comprehensive and the risks are eliminated or controlled before granting a licence or certificate of compliance to operate an MHF.
4.2 Features of HAZID
Comcare’s expectations and some important features of HAZID
Comcare will expect:
a) a clear method statement or description of the HAZID process, defining when it was conducted, how it was planned and prepared, who was involved and what tools and resources were employed;
b) that the HAZID process was based on a comprehensive and accurate description of the facility, including all necessary diagrams, process information, existing conditions and modifications; and
c) that the overall HAZID process did not rely solely on data that was historical or reactive and that employers ensured that predictive methods were also used.
The HAZID process must identify hazards that could cause a potential major accident for the full range of operational modes, including normal operations, start-up, shutdown, and also potential upset, emergency or abnormal conditions. Employers should also reassess their HAZID whenever a significant change in operations has occurred or a new substance has been introduced. They should also consider incidents that have occurred elsewhere at similar facilities, both within the same industry and in other industries. Refer to the Safety Report and Safety Report Outline guidance material (booklet 4) for the definition of significant change.
Involving the right people
An effective HAZID process is dependent upon having the right people participating in the process. The employer should:
a) involve Health and Safety Representatives (HSRs) in selecting staff to participate in HAZID;
b) involve HSRs in determining if the HAZID techniques are suitable for the staff selected;
c) ensure participants understand the relevant HAZID methods so that they can fully participate in the process; and
d) be alert for hazards that can be revealed by the combination of knowledge from specialists in different work groups.
Features of a HAZID process
The following aims to demonstrate the main features of a HAZID. Although the HAZID process chosen by employers must suit the circumstances of the MHF, the features noted below are generally applicable to all processes.
Preparation: Prior to commencement of the HAZID, the following steps should be completed:
a) Agreement on the purpose and scope of the HAZID;
b) Appropriate personnel and HAZID tools identified;
c) Sufficient resources and time allocated;
d) Clearly defined reporting processes and study boundaries according to the purpose and scope;
e) Appropriate background information and studies collated, such as historical incident data;
f) An agreed interpretation of ‘major accident’ that is consistent with the Regulations and relevant to the facility.
System description: At the commencement of the HAZID, the complete system of assets, materials, human activities and process operations within the boundaries of the study should be clearly defined and understood, taking account of the original design, subsequent changes and current conditions. Typically, the system should be divided into distinct separate components or sections to enable manageable quantities of information to be handled at each stage.
Systematic evaluation and recording: The HAZID should move progressively through the system, applying the HAZID tools to each component or section in turn. All identified hazards and incidents should be recorded in some way. (See Figure 16 in this booklet for some examples of how hazard registers may be configured.) A checklist of guidewords, questions or issues should be considered at each stage. Some key questions and issues could be:
a) What is the design intent, what are the broad ranges of activities to be conducted, what is the condition of equipment, and what limitations apply to activities and operations?
b) What are the critical operating parameters? What process operations occur, and how could they deviate from the design intent or critical operating parameters? This should consider routine and abnormal operations, start-up, shutdown and process upsets.
c) What materials are present? Are they a potential source of major accidents in their own right? Could they cause an accident involving another material? Could two or more materials interact with each other to create additional hazards?
d) What operations, construction or maintenance activities occur that could cause or contribute towards hazards or accidents? How could these activities go wrong? Could other hazardous activities be introduced into this section by error or by work in neighbouring sections of the facility?
e) Could other materials, not normally or not intended to be present, be introduced into the process?
f) What equipment within the section could fail or be impacted by internal or external hazardous events? What are the possible events?
g) What could happen in this section to create additional hazards, e.g. temporary storage or road tankers?
h) Could a particular section of the facility interact with other sections (e.g. adjacent equipment, an upstream or downstream process, or something sharing a service) in such a way as to cause an accident?
Past, present and future hazards
To identify all hazards, the HAZID will need to consider past, present and future conditions, hazards and potential incidents. Past incidents, at the MHF or similar facilities, provide an indication of what has gone wrong in the past and what could go wrong in the future. A wide range of hazards and potential incidents will be present in the facility. New hazards and incidents could be created in the future as a result of planned or unplanned changes. The management of change process described in the SMS should identify new conditions during the planning of modifications or new activities. This should then trigger further HAZID studies and risk assessments, with the identification of control measures as appropriate. Figure 2 below illustrates the range of tools that can be used to identify past, present and future hazards.
Figure 2: Past, present and future hazards
4.3 Hazard identification processes and techniques
HAZID techniques
The flowchart below summarises all the steps needed in a HAZID process and how those steps relate to one another.
Figure 3: HAZID process
Examples of HAZID Techniques
HAZOP
Hazard and Operability Study (HAZOP) is a highly structured and detailed technique, developed primarily for application to chemical process systems. A HAZOP can generate a comprehensive understanding of the possible ‘deviations from design intent’ that may occur. However, HAZOP is less suitable for identification of hazards not related to process operations, such as mechanical integrity failures, procedural errors, or external events. HAZOP also tends to identify hazards specific to the section being assessed, while hazards related to the interactions between different sections may not be identified. Therefore, HAZOP may need to be combined with other hazard identification methods, or a modified form of HAZOP used, to overcome these limitations.
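As an illustration of how candidate ‘deviations from design intent’ are commonly enumerated in a HAZOP, the sketch below pairs guidewords with process parameters for one section of a facility. The guideword list, the parameter list and the section name are assumptions chosen for illustration; they are not taken from this booklet or the Regulations, and a real study would use the lists agreed by the HAZOP team.

```python
from itertools import product

# Hypothetical guidewords and parameters, for illustration only.
GUIDEWORDS = ["No", "More", "Less", "Reverse"]
PARAMETERS = ["flow", "pressure", "temperature"]

def candidate_deviations(section, guidewords=GUIDEWORDS, parameters=PARAMETERS):
    """Return one blank worksheet row per guideword/parameter pair."""
    return [
        {"section": section, "deviation": f"{g} {p}", "causes": [], "consequences": []}
        for g, p in product(guidewords, parameters)
    ]

# Hypothetical section name.
rows = candidate_deviations("Feed pump to reactor")
```

Some generated pairs will not be physically meaningful for a given section. In this sketch the study team would be expected to record that judgement against the row rather than delete the row, so the worksheet shows that each pair was considered.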
Equipment failure case definition
This method is a systematic approach to defining loss of containment events for all equipment within the study boundary. Process flow and equipment diagrams are studied systematically, and all equipment is assigned appropriate loss of containment scenarios, such as pinhole leaks, according to design, construction and operation. This form of hazard identification may be necessary for many major hazard facilities, to avoid missing potential scenarios, but is not sufficient on its own because it does not consider specific causes or circumstances. Therefore, this technique should only be used in combination with other techniques for MHF purposes.
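The assignment step described above can be sketched as a simple lookup from equipment type to loss of containment scenarios. Only ‘pinhole leak’ comes from the text; the other scenario names, the equipment types and the tag numbers are hypothetical placeholders. The sketch also flags equipment with no assigned scenarios, since the stated purpose of the method is to avoid missing potential scenarios.

```python
# Hypothetical scenario sets per equipment type; only "pinhole leak" is named
# in the text above, the rest are placeholders for illustration.
SCENARIOS_BY_TYPE = {
    "pipe": ["pinhole leak", "full-bore rupture"],
    "vessel": ["pinhole leak", "catastrophic failure"],
    "pump": ["seal leak"],
}

def define_failure_cases(equipment):
    """Assign scenarios to every equipment item and flag unknown types."""
    cases, unassigned = [], []
    for tag, eq_type in equipment:
        scenarios = SCENARIOS_BY_TYPE.get(eq_type)
        if scenarios is None:
            unassigned.append(tag)  # needs review, never silently dropped
            continue
        cases.extend((tag, s) for s in scenarios)
    return cases, unassigned

# Hypothetical equipment list read from a process flow diagram.
equipment = [("V-101", "vessel"), ("P-101", "pump"), ("E-101", "exchanger")]
cases, unassigned = define_failure_cases(equipment)
```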
Checklists
There are many established hazard checklists which can be used to guide the identification of hazards. Checklists offer straightforward and effective ways of ensuring that basic types of events are considered. Checklists may not be sufficient on their own, as they may not cover all types of hazards, particularly facility-specific hazards, and could also suppress lateral thinking. Again, this technique should only be used in combination with other techniques for MHF purposes.
What-If Techniques
This is typically a combination of the above techniques, often using a prepared set of ‘what-if’ questions on potential deviations and upsets in the facility. This approach is broader but less detailed than HAZOP.
Brainstorming
Brainstorming is typically an unstructured or partially structured group process, which can be effective at identifying obscure hazards that may be overlooked by the more systematic methods.
Task Analysis
This is a technique developed to address human factors, procedural errors and ‘man-machine interface’ issues. This type of hazard identification is useful for identifying potential problems relating to procedural failures, human resources, human errors, fault recognition, alarm response, etc. Task Analysis can be applied to specific jobs such as lifting operations, moving equipment off-line or to specific working environments such as control rooms. Task Analysis is particularly useful for looking at areas of a facility where there is a low fault-tolerance, or where human error can easily take a plant out of its safe operating envelope.
Failure Modes Effects Analysis (FMEA)
FMEA is a process for hazard identification where all conceivable failure modes of components or features of a system are considered in turn and undesired outcomes are analysed. This technique is quite specialised and may require expert assistance.
Failure Modes Effects and Criticality Analysis (FMECA)
FMECA is a highly structured technique that is usually applied to a complex item of mechanical or electrical equipment. The overall system is described as a set of sub-systems and each of these as a set of smaller sub-systems down to component level. Individual system, sub-system and component failures are systematically analysed to identify their causes (which are failures at the next lower-level system), and to determine their possible outcomes, which are potential causes of failure in the next higher-level system. This technique is quite specialised and usually requires expert assistance.
Fault Tree and Event Tree Analysis
Fault Trees describe loss of containment events in terms of the combinations of underlying failures that can cause them, such as a control system upset combined with failure of alarm or shutdown and relief systems. Event trees describe the possible outcomes of a hazardous event, in terms of the failure or success of control measures such as isolation and fire-fighting systems. Fault tree and event tree analysis is time-consuming, and it may not be practicable to use these methods for more than a small number of incidents.
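The arithmetic behind a simple fault tree and event tree can be sketched as follows. This is a minimal illustration, assuming the underlying failures are independent of one another; every probability is a hypothetical placeholder, not data from any facility. The fault tree is simplified to a single AND gate, and `math.prod` requires Python 3.8 or later.

```python
from math import prod

def and_gate(probabilities):
    """All inputs must fail: multiply probabilities (assumes independence)."""
    return prod(probabilities)

def or_gate(probabilities):
    """Any single input failing is enough: 1 minus the chance that none fail."""
    return 1.0 - prod(1.0 - p for p in probabilities)

# Fault tree: loss of containment requires a control system upset AND failure
# of the alarm AND failure of the relief system. Hypothetical values.
p_top = and_gate([0.1, 0.05, 0.01])

# Event tree: outcomes after the release, by success or failure of isolation
# and fire-fighting. Hypothetical branch probabilities.
p_isolation_fails = 0.1
p_firefighting_fails = 0.2
outcomes = {
    "isolated": 1 - p_isolation_fails,
    "not isolated, fire controlled": p_isolation_fails * (1 - p_firefighting_fails),
    "not isolated, fire escalates": p_isolation_fails * p_firefighting_fails,
}
```

The event tree branch probabilities sum to one, which is a useful check that no outcome has been left out. Where failures share a common cause, the independence assumption does not hold and the simple products above would understate the likelihood.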
Historical records of incidents
Databases of incidents and near misses that have occurred are a useful reference because they give a very clear indication of how incidents can occur. Employers should consider site history, company history, industry history and possibly even wider sources of historical information for this purpose.
Examples of major accidents and the role of multiple factors in those accidents
Texas City, USA, 2005. An explosion at a large refinery killed 15 workers and injured over 170 others. Equipment upgrades and SMS elements including process safety information, communications and training were targeted for improvements following the incident. The total cost of the plant upgrade was reported to be 1 billion dollars over 5 years.
Longford, Victoria, 1998. Two workers were killed and eight others injured in an explosion at a gas processing plant. As a result, many elements of the SMS were targeted for improvements including process safety information and communication of critical safety information.
Pasadena, USA, 1989. A fire and a series of explosions at a refinery complex resulted in 23 fatalities. Inadequate and unofficial isolation procedures, together with human error induced by poor ergonomics, played a role in causing the accident. The loss of life and scale of damage were increased due to poor plant layout and subsequent damage to fire-fighting systems.
Piper Alpha, UK, 1988. The accident was triggered by a small leak in a condensate pump system, which by itself would most likely have had only minor consequences. However, in combination with failures in management systems, design and equipment, the event resulted in the loss of 167 lives and destruction of the entire platform.
Bhopal, India, 1984. Half a million people were exposed and over 20,000 have died to date as a result of a release of methyl-isocyanate via a vent stack. A range of systems and equipment had been malfunctioning or had been taken out of service over a period leading up to the disaster. These included a safety system for scrubbing tank vent releases, which had been disabled because the plant was shut down and was not considered to be a risk. After the plant’s construction, a large ‘shanty town’ had grown up around it, but this had not led to any recognition of changes in risk.
Flixborough, UK, 1974. A modification was made to a bypass for one of a series of reactor vessels. Due to the urgency of the work, and the fact that there had been significant organisational change, the modification was designed and constructed inadequately. The bypass failed, releasing a large cloud of cyclohexane, which exploded and killed 28 persons. Not only was the new hazard not considered during the modification, it was not recognised during subsequent operations even though the bypass was seen to move as process pressure rose and fell.
Human factors and the nature of hazards – an overview
Human factors are defined as the interactions between people and the organisation, systems and equipment they interface with. Consideration of the effect of human factors on risk is sometimes defined as ‘fitting the work to the employees’ or ‘the science and practice of designing systems to fit people’. The subject of human factors is concerned with understanding the capacities and limitations of people in their jobs, and using this understanding to eliminate or control the effects of human weaknesses and exploit human strengths. In this context, human weaknesses may include limitations on information processing capabilities, while human strengths may include adaptability. Some facilities may find that human factors are a major contributor to the nature of hazards at the facility. In this case, the employer is expected to evaluate the causes thoroughly. Analysis of human errors may require procedural reviews or human factors analysis. From the major hazard control perspective, the role of people is critical to the safe operation of major hazard facilities and should be addressed in the safety report. Accordingly, employers should incorporate human factors into relevant aspects of the operation of major hazard facilities, including the SMS, hazard identification, risk assessment, control measures, the safety role of employees and contractors, emergency planning and training.
Human factor HAZID techniques
When identifying human factor hazards, the employer should examine the foreseeable major accidents and consider how human factors may contribute to or cause those accidents. It is important that employees with experience in the specific area being studied participate to identify the hazards that may be present. This should involve identifying the ways in which ‘just being human’ can influence the performance of individual tasks and roles. For example, a person may operate a wrong valve by mistake, or may inadvertently use the incorrect procedure. Examples of ‘being human’ include: memory limitations, visual acuity limitations, information processing problems, distraction, fatigue, decision-making biased by experience and knowledge, rigid problem solving and susceptibility to following group behaviour. These can all adversely influence human actions and decisions, leading to the possible creation of hazards.
It is important to acknowledge that human factors can introduce hazards at all levels, from performing individual operations and maintenance tasks, through to designing the facilities, writing the procedures and even setting standards and policy for the organisation. Human factor hazards can also be latent hazards, in that they will not be revealed until particular circumstances combine to make them obvious. For example, the transfer of a key operational supervisor into a non-operational project role may lead to a deficiency of knowledge within the operations team; this deficiency may not become apparent until an emergency occurs. Accordingly, while it is necessary to involve first-line operations and maintenance personnel in the hazard identification, it is also necessary to consider wider issues than the day-to-day roles and activities of these persons.
When considering the type and level of human factors input that is needed in hazard identification, employers should consider their specific circumstances, and in particular, the amount of reliance they place on human actions and decisions in the prevention and control of major accidents. Cases where detailed consideration of human factors might be appropriate include a process plant that requires employee action to prevent or control emergency situations, or a dangerous goods warehouse that relies heavily on procedural controls to ensure correct segregation of goods. In addition to calling upon the necessary range of operations personnel to take part in the hazard identification, it may also be appropriate to use persons having specialist human factors knowledge. This specialist knowledge may be essential if human factors hazards can influence critical safety controls.
Human factor HAZID techniques are evolving and are derived from engineering HAZID methods. They follow the same principles and can be conducted in conjunction with an engineering HAZID.
Task analysis
An important set of human factors techniques, which can be used in all areas of human factors consideration, is a set of methods collectively called ‘task analysis’. Task analysis is not only used in HAZID but is also a tool for risk assessment and development of control measures to accommodate human factors. Task analysis is used to study what a person, or team, is required to do, in terms of actions and/or mental processes to achieve a system goal. The information used in and derived from a task analysis will depend on the technique used and the objective of the analysis.
Examples of task analysis techniques include:
• Task simulation
• Questionnaires and structured interviews
• Link analysis
• Work safety analysis
• Hierarchical Task Analysis (HTA)
• Event/fault tree analysis
• Timeline analysis
• Human Factors HAZOP
HTA is one of the most commonly used task analysis techniques. It is used to systematically analyse a task or series of tasks. The outcomes of the HTA will depend on the reasons for its use. For example, if a new control room is being designed for a process facility, the design layout and equipment available in the control room should be tested to ensure that it is appropriate for handling all foreseeable operations (start-up, normal, abnormal). If HTA is used to assess workload, the information, processing and time requirements of the task, or tasks, should be tested.
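One way to picture the systematic breakdown that HTA produces is as a hierarchy of tasks and sub-tasks. The sketch below is an assumption about how such a breakdown might be recorded, not a method prescribed by this booklet; the `Task` class, the numbering scheme and the task names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    subtasks: list = field(default_factory=list)

def leaf_steps(task, prefix=""):
    """Return the numbered bottom-level steps in depth-first order."""
    if not task.subtasks:
        return [(prefix or "0", task.name)]
    steps = []
    for i, sub in enumerate(task.subtasks, start=1):
        number = f"{prefix}.{i}" if prefix else str(i)
        steps.extend(leaf_steps(sub, number))
    return steps

# Hypothetical task breakdown, for illustration only.
start_up = Task("Start up transfer pump", [
    Task("Prepare", [Task("Check tank level"), Task("Open suction valve")]),
    Task("Start pump"),
    Task("Confirm discharge pressure"),
])
steps = leaf_steps(start_up)
```

Each bottom-level step could then be examined in turn, for example for the information an operator needs at that step or the ways in which the step could be performed incorrectly.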
4.4 Review, revision and typical problems
Review and revision of hazard identification
The Regulations require that the safety report must be reviewed in particular circumstances and reviewing the HAZID is part of this requirement. Consequently, the overall HAZID process should include regular and proactive reviews to identify any new hazards and to refresh knowledge of existing hazards. The HAZID process should include a range of triggers for individual HAZID studies. These triggers may be a scheduled program of reviews or could arise from information gathered during regular safety meetings.
Use of the HAZID results
The value of a high-quality HAZID, and the major commitment of resources required, demand that the HAZID results be used effectively. Although the resources required are substantial, the cost of a major accident will be far greater than the cost of the process that could prevent it. The hazard register should facilitate the process of revisiting and updating the knowledge of hazards and incidents within the facility. The register should communicate clear linkages between the employer’s processes for hazard identification, risk assessment and the selection or rejection of control measures.
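The linkages a hazard register is meant to communicate can be sketched as a record that ties each hazard to its risk assessment and to the control measures selected or rejected. The field names, identifiers and example values below are hypothetical and do not reproduce any register layout shown in this booklet.

```python
from dataclasses import dataclass, field

@dataclass
class ControlDecision:
    control: str
    selected: bool
    reason: str  # why the control was selected or rejected

@dataclass
class HazardRegisterEntry:
    hazard_id: str
    description: str
    potential_major_accident: str
    risk_assessment_ref: str = ""  # link to the risk assessment record
    controls: list = field(default_factory=list)

    def unlinked(self):
        """Return the linkages still missing from this entry."""
        gaps = []
        if not self.risk_assessment_ref:
            gaps.append("risk assessment")
        if not self.controls:
            gaps.append("control measures")
        return gaps

# Hypothetical entry, for illustration only.
entry = HazardRegisterEntry("HZ-001", "Overfill of storage tank", "Pool fire")
entry.controls.append(
    ControlDecision("High-level trip", True, "Independent of operator action")
)
```

In this sketch, `entry.unlinked()` reports that the risk assessment link is still missing, which is the kind of gap a periodic review of the register could look for.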
Lateral thinking and realism in HAZID
History clearly shows that major accidents often arise from a set of complex conditions or coincidental events, which may include multiple failures in both equipment and procedures. The initial conditions that lead to these major accidents might have been relatively minor problems, but they developed into major accidents because other problems arose concurrently. Companies and individuals can exhibit ‘corporate blindness’ when identifying or reporting hazards by assuming that the systems and procedures in place only ever function as intended. It is possible that other safeguards or pure luck will prevent a major accident from occurring, but the employer should consider the possibility that several events may at some time combine to cause a serious incident. These situations arise in reality, and should be allowed for in HAZID.
Use of worst-case scenario in HAZID
Employers should include all foreseeable hazards in the HAZID and transparently analyse each event within the risk assessment. The employer should consider all possibilities, on a case-by-case basis, and document any assumptions regarding the definition of worst case. The employer should then be in a position, using risk assessment techniques, to define the worst-case scenario for the facility.
What constitutes the worst-case scenario can be difficult to define. The worst-case scenario is sometimes incorrectly deemed to be the largest event within the capacity of the on-site protection systems, simply on the basis that any event worse than this cannot be planned for. However, such events are merely the worst that has been allowed for during design and are not necessarily the worst that can occur. Most examples of major accidents given above clearly exceeded the design basis, which is why they resulted in such serious outcomes. Both the ‘design’ events and the true ‘worst case’ events are required to be considered.
It should also be recognised that the ‘worst case’ in terms of the distance of impact might not be the ‘worst case’ in terms of potential consequences. It may be necessary to consider both these cases. The worst-case scenario for one area of a facility may not be the same as that for another area of the same facility. This will depend on a large number of factors such as materials normally or not normally present, extreme process conditions, isolation systems that may fail, the proximity and the layout of vessels and the presence of personnel. Employers should consider all available information, including historical incident records, in deriving the worst-case scenario.
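The point that the worst case by impact distance may differ from the worst case by consequence, and may differ between areas, can be shown with a small sketch. The areas, scenario names and figures are hypothetical, and the count of people potentially affected is only one possible measure of consequence.

```python
# Hypothetical scenarios:
# (area, name, impact distance in metres, people potentially affected)
scenarios = [
    ("Tank farm", "Toxic release", 900, 3),
    ("Tank farm", "Vessel rupture", 250, 12),
    ("Warehouse", "Fire", 120, 5),
]

def worst_cases(scenarios):
    """For each area, name the worst scenario by distance and by consequence."""
    result = {}
    for area in sorted({s[0] for s in scenarios}):
        in_area = [s for s in scenarios if s[0] == area]
        result[area] = {
            "by_distance": max(in_area, key=lambda s: s[2])[1],
            "by_consequence": max(in_area, key=lambda s: s[3])[1],
        }
    return result

worst = worst_cases(scenarios)
```

For the hypothetical tank farm, the scenario with the greatest impact distance is not the one affecting the most people, so both would need to be carried into the risk assessment.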
Common mistakes in HAZID
A common mistake in hazard identification is to screen out or discard some incidents because they are perceived to be extremely unlikely or of low consequence. Incidents may be unlikely or of low consequence only as a result of the control measures in place. However, a key purpose of the HAZID and risk assessment process is to identify critical control measures and to determine their effectiveness. Therefore, it would be self-defeating to disregard incidents because control measures are in place.
All potential major accidents should be recorded during the HAZID. Any screening, analysis and assessment of the hazards, their consequences or the effectiveness of the controls should occur during the risk assessment. This separation ensures transparency and the ability to audit the entire process, which is necessary to justify the adequacy of control measures. In practice, the HAZID and risk assessment processes are often combined into one workshop (or similar), which makes this distinction difficult to maintain.
Other potential pitfalls that should be avoided include:
a) being too generic in identification of hazards and potential major accidents;
b) limiting the hazard identification to the immediate cause of potential major accidents, without determining the fundamental underlying cause;
c) attempting to conduct the risk assessment and assessment of control measures during the hazard identification. Except for very simple facilities, it is almost certainly better to separate hazard identification from the subsequent stages; this helps ensure a systematic HAZID process. However, in practice, the HAZID and risk assessment are often combined so that the risk assessors are aware of the hazards identified and any other information discussed during the HAZID;
d) widening the scope to include too great a range of incident types, such as all occupational health and safety issues. If these issues must be considered they should only be the issues relevant to controlling the risk of a major accident;
e) carrying out the HAZID with incomplete or inaccurate facility descriptive information;
f) proceeding with the study without first having developed, agreed and planned the approach and the method of recording. A pilot study on a selected area of the facility may be beneficial in deciding on an effective HAZID approach;
g) failing to be comprehensive and systematic with respect to the activities, operations and possible different states of each part of the facility;
h) failing to record important information discussed during the HAZID, e.g. assumptions, uncertainties or debated issues and gaps in knowledge;
i) allowing the hazard identification workshops to be dominated by individual persons or groups within the organisation; and
j) where HAZIDs are conducted across several sessions, failing to review previous session findings or remind participants of the scope and objectives.
5 Risk assessment
5.1 Risk assessment aims
The aims of risk assessment are to:
a) provide a basis for identifying, evaluating, defining and justifying the selection of control measures for eliminating or reducing risk, and to therefore lay the foundations for demonstrating the adequacy of the standards of safety proposed for the facility;
b) provide the employer and employees with sufficient objective knowledge, awareness and understanding of the risks of major accidents at the facility;
c) capture knowledge of risk of a major accident at the facility so it can be managed, disseminated and maintained. The management of knowledge generated in the risk assessment will also greatly assist the efficient development of a safety report for the facility, for example by handling assumptions and actions arising; and
d) provide practical effect to the employer's safety report philosophy. For example, if the employer intends to base the safety report largely on the facility’s compliance with specific codes or standards, the risk assessment should address corresponding issues such as the basis of the codes and standards and their applicability to the facility.
Creating and transferring knowledge using risk assessment
Understanding the risks of major accidents may be accompanied by uncertainty, but the risk assessment will be successful if it reduces this uncertainty to an acceptable or tolerable level. The results of risk assessment must be captured and disseminated to those who require the knowledge, to enable the uncertainty of the entire organisation to be reduced to an acceptable level.
An effective risk assessment should involve the processes of debating, analysing, sharing views and generating information and knowledge on the risk of major accidents and their means of control. It should include the active participation of employees or contractors who influence safe operations. Influence on safe operations is the definitive criterion for participation of employees and contractors; formal roles, or assumptions about someone’s role, are irrelevant to producing a high-quality risk assessment.
There are no limits to the activities or sources that can be used to understand the facility and its risks. For example, the risk assessment could incorporate information from incident investigations, discussions during safety meetings about hazards and ways of controlling them, condition monitoring programs, analysis of process behaviour, evaluation of process trends or deviations from critical operating parameters, procedure reviews or flood or weather records.
Risk assessment results are a useful input to training needs analyses. For example, if a procedure or task carried out by employees is an important control measure that can fail if there is inadequate employee knowledge, then the risk assessment should identify that risk and the need for that knowledge. It can then be a tool to assist in imparting that knowledge to employees, either by direct involvement in their defined roles or as a source of information to develop instruction or training sessions.
The above example illustrates the importance of knowledge management in the process of complying with the Regulations: the HAZID and risk assessment generate knowledge, and this knowledge should be captured and implemented via the safety management system (SMS) and the processes of consulting, informing, instructing and training. It must be recognised, however, that assessing safety is not the same as managing safety, and risk assessment is only worthwhile if it informs and improves the decision-making and implementation processes. Reducing uncertainty should balance the improvement in the effectiveness of decisions against the cost of additional assessment. The SMS should instigate risk assessments to maintain a comprehensive and up-to-date understanding of risk as the facility changes. Risk assessment may be triggered via the management of change process at the facility. The results of risk assessments would be expected to feed into the development of the SMS.
Identifying and evaluating control measures using risk assessment
The risk assessment should consider a range of control measures and provide a basis for the selection of control measures. Risk assessment can be a useful tool, which can save or optimise the use of resources, by determining the effectiveness and costs of different control options, improving the decision-making process and providing a basis for allocating resources in the most effective manner.
The risk assessment process should provide the following in relation to control measures:
a) identification or clarification of existing and potential control measure options;
b) evaluation of effects of control measures on risk levels;
c) basis for selection or rejection of control measures and the associated justification of adequacy; and
d) basis for defining performance indicators for selected control measures.
The range of control measures that should be considered in the risk assessment is addressed later in this guidance material. The risk assessment should evaluate the range of control measures in terms of viability and effectiveness to provide a basis for selection or rejection of each control measure:
a) Viability relates to the practicability of implementing the control measure within the facility; and
b) Effectiveness relates to the effect of the control measure on the level of risk. For example, the reliability and availability of control measures influence the likelihood of an incident occurring, while the functionality and survivability of the control measures during the incident influence the consequences.
Specific studies may be carried out as part of the risk assessment to evaluate these issues for individual or groups of control measures. By evaluating options for control measures within the risk assessment the employer should be able to determine what additional benefit is gained from introducing additional or alternative control measures. If these do not result in any reduction in risk, the basis for rejection is apparent. The employer should look for gaps in the existing control regime, where the introduction of further control measures may be necessary.
Using the risk assessment to set performance indicators
The risk assessment should generate information useful to the setting of performance indicators for the adopted control measures. For example:
a) matching performance indicators with the control measures: control measures with more rigorous performance standards are more likely to be associated with the high consequence hazards than the lower consequence hazards;
b) control measure functionality, including reliability, reflecting the scale of incidents being controlled;
c) reliability, or number of control measures, reflecting the likelihood of the corresponding incidents.
Overall framework and principles for risk assessment
Most forms of risk assessment attempt to address a set of fundamental questions, which helps ensure the risk assessment is comprehensive and systematic (see Figure 4).
Figure 4: Basic questions within risk assessment
The risk assessment should use assessment methods (quantitative or qualitative or both) that suit the hazards being considered. This means that the tools employed must be selected according to the nature of the risk. A tool that does not address any variability or uncertainty in the nature of the hazards and incidents identified can fail to generate the necessary understanding and provide no basis for differentiating between control measures. There is no single tool able to meet all the requirements for risk assessment, and all tools have limitations and weaknesses.
For example, if the dominant contributor to a major accident relates to aging of equipment and associated mechanical integrity problems, then an analysis of mechanical integrity, corrosion rates, breakdown data, reliability and inspection/testing/maintenance issues may be necessary to develop the required understanding. In such a case, a quantitative risk assessment (QRA), which is usually based on generic data, may not provide the necessary information or lead to effective solutions.
Similarly, if a facility employer has identified human error as a key risk driver, then a Task Analysis, Human Reliability Analysis, or detailed analysis of the operating procedures may be appropriate. Analysis of equipment condition and reliability in this case would probably not be effective.
For many facilities, several types of assessment may be required. In the interests of efficiency, it is desirable to clearly identify the types of detailed study required before following any particular route. Two basic tools can assist this process: preliminary/qualitative risk assessments and hazard or risk ranking. There are plenty of examples of both types of tool, but they all have a common purpose: to determine the nature of the risk in terms of the basic causes, likelihood, consequences and controls.
Where it is clear that the employer has insufficient knowledge of causes or likelihood, detailed studies may be needed. A preliminary evaluation should point towards the types of detailed study required. An appropriate ranking methodology allows the key areas to be identified and prioritised. It enables the employer to determine if the gaps in knowledge correspond to what may be major risk contributors. Priority should be given to those areas where a high risk is likely and there are also gaps in knowledge about the things giving rise to the risk. Some iteration may be required where the ranking of key areas is revisited following detailed assessment, to see if any hazards have increased in rank and now require more detailed study. Figure 5 aims to illustrate the relationship of preliminary evaluation, ranking and detailed studies.
Figure 5: Relationship between preliminary evaluations, ranking and detailed studies
The above discussion introduces the concept of a "tiered approach" that is frequently used in risk assessment. If a simple technique generates the information required by the Regulations and also generates sufficient understanding of the risk and the options for its control, further risk assessment may not be necessary. However, if substantial uncertainty remains, or the employer wishes to look at a range of options in greater detail, then further effort is justified and more detailed tools may be desirable. In general, greater assessment effort should result in a more quantitative, accurate and robust understanding, thereby allowing a more transparent and rational basis for decision-making. The key to the tiered approach is that, at each stage, the employer should compare the potential cost of increasing the detail of the assessment against the benefit that further assessment may give. In this context, the “benefit” may be a higher level of knowledge of the hazards and the risk, or may be a better understanding of the optimum means of controlling the risk. The tiered approach is described in Figure 6.
Figure 6: Tiered Approach to Risk Assessment
Some facilities may use a semi-quantitative risk assessment where qualitative brainstorming sessions of staff are combined with quantitative studies and information. If data and knowledge have been collected previously about the MHF and remain relevant to risk assessments under the MHF Regulations, it is acceptable to make use of that data and knowledge.
Cumulative assessment of hazards in the risk assessment
The risk assessment must consider hazards cumulatively and individually. The effects of several hazards occurring in combination must be considered. Many major accidents have been caused by the realisation of a number of hazards concurrently. For any accident there may be several independent hazards or combinations of hazards, each of which could lead to that accident, and several control measures which may be particularly critical because they may influence one or more of those hazards. The risk assessment should give an understanding of the total likelihood of each accident and the relative importance of each separate hazard and control measure. The potential for escalation of major accidents needs to be considered, since the consequences of an escalated event may be greater than those of an event in isolation, along with the consequences and their effects (e.g. number of injuries, extent of property damage).
A facility may have a range of major hazards that could lead to potential major accidents. Both the highest risk incidents and the overall profile of risks from all incidents must be determined, so that the risk can be shown to be adequately controlled. In cases where a large number of different hazards and potential accidents exist, the cumulative risk may be significant even if the risk arising from each event is low.
The "bow tie" diagram (Figure 7) is similar to a combined fault and event tree that shows how a range of causes, controls and consequences can be linked together and associated with each major accident scenario. Cumulative consideration of the hazards can be seen as the overall evaluation of interactions between different parts of a single bow tie or consideration of a range of bow ties together. Cumulative consideration of hazards enables the employer to assess the overall risk picture for the facility and to understand how different causes and events can combine to lead to an accident. It also enables the key causes and controls for the risks to be identified and evaluated in more detail if required.
Figure 7: Examples of bow tie diagram
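The point that many individually low risks can combine into a significant cumulative risk can be illustrated numerically. The following is a minimal sketch only: it assumes the incidents are statistically independent, the annual probabilities are hypothetical values chosen for illustration, and the function name is not taken from this booklet.

```python
import math


def probability_of_at_least_one(annual_probabilities):
    """Chance that at least one incident occurs in a year.

    Assumes the incidents are independent of each other, which is an
    illustrative simplification and will not hold for every facility.
    """
    none_occur = math.prod(1.0 - p for p in annual_probabilities)
    return 1.0 - none_occur


# Hypothetical facility: 50 potential incidents, each with a
# 1-in-10,000 chance of occurring in any given year.
incident_probabilities = [1e-4] * 50

cumulative = probability_of_at_least_one(incident_probabilities)
largest_single = max(incident_probabilities)

print(f"Largest single-incident probability: {largest_single:.1e}")
print(f"Cumulative annual probability:       {cumulative:.2e}")
```

In this hypothetical case the cumulative probability is close to fifty times the probability of any single incident, which is why the overall risk profile may need to be considered as well as the highest-ranked incidents.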
Uncertainty and assumptions in risk assessment
Handling uncertainties and assumptions in risk assessment is difficult but necessary. A complete and accurate understanding of the risk of major accidents is unlikely. Uncertainty cannot be eliminated and it will be necessary to make assumptions in some areas. The key is to record and test assumptions wherever possible and to explicitly recognise where the main gaps or uncertainties exist. Where important assumptions are made (e.g. assuming a control measure has a high level of effectiveness), the employer should ensure these assumptions are consistent throughout the safety report, and are implemented, tested and confirmed in practice.
Employers should consider using sensitivity analysis to test the robustness of the risk assessment results against variations within the key areas of uncertainty. This may involve changing key assumptions and determining if the changes in results would affect any decisions that have been made based on those results. Where sensitivity analysis or consideration of the gaps in knowledge indicates a significant level of uncertainty or poor confidence in the resulting decisions, further detailed assessment may be required.
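A simple form of sensitivity analysis can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed method: the decision rule (compare incident frequency with a tolerable frequency), the function names and every number are hypothetical.

```python
def incident_frequency(initiating_frequency, control_failure_probability):
    """Incident frequency = initiating event frequency x chance the control fails."""
    return initiating_frequency * control_failure_probability


def further_control_needed(frequency, tolerable_frequency):
    """Hypothetical decision rule: act if the frequency exceeds the tolerable value."""
    return frequency > tolerable_frequency


# All values below are hypothetical and for illustration only.
INITIATING_FREQUENCY = 0.1   # initiating events per year
TOLERABLE_FREQUENCY = 1e-3   # decision threshold chosen by the assessor
BASE_CASE_PFD = 0.005        # assumed probability the control fails on demand

base_decision = further_control_needed(
    incident_frequency(INITIATING_FREQUENCY, BASE_CASE_PFD),
    TOLERABLE_FREQUENCY,
)

# Vary the key assumption and record whether the decision changes.
sensitivity_results = {}
for pfd in (0.001, 0.005, 0.02, 0.05):
    frequency = incident_frequency(INITIATING_FREQUENCY, pfd)
    sensitivity_results[pfd] = further_control_needed(frequency, TOLERABLE_FREQUENCY)

decision_is_robust = all(
    decision == base_decision for decision in sensitivity_results.values()
)

for pfd, decision in sensitivity_results.items():
    print(f"Assumed failure probability {pfd}: further control needed = {decision}")
print(f"Decision robust across the tested range: {decision_is_robust}")
```

Here the decision changes within the tested range of the assumed control failure probability, which would suggest that this assumption deserves closer examination before the decision is relied upon.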
Is quantitative risk assessment required?
QRA is only one tool within the risk assessment toolkit. It involves calculation of the frequency and consequence of a range of hazardous events and numeric combination of these to estimate the risk in a numerical format to allow direct comparison of results, including a measurement of risk to nearby neighbours and society as a whole. However, it is not a requirement of the Regulations that a QRA is performed. The methods used must be appropriate to the hazards, the nature of the options available, the facility safety philosophy and the decisions that are required from the risk assessment. A decision not to perform QRA does not preclude the employer from carrying out specific quantitative calculations regarding frequencies, consequences or other aspects of the risk. If QRA assists understanding of the risks and the appropriateness of control measure options, then it should be considered as a tool to support the risk assessment. However, QRA may not provide all the answers and is typically best suited to differentiating design, layout, location and engineering options. QRA is generally considered to be most useful for quantifying offsite risk; however, it can be useful in assessing on-site risk if sufficient detail and an understanding of the reality of people’s response to accidents are included.
5.2 Examples of risk assessment methods
To recap, risk assessment must involve an investigation and analysis of the identified hazards and major accidents, so as to provide an understanding (and documentation) of the:
a) nature of each hazard and major accident
b) likelihood of each hazard causing a major accident
c) magnitude of each major accident
d) severity of consequences of each major accident to persons on-site and off-site
e) range of control measures available to control each major accident
f) effectiveness and viability of control measures for each major accident
g) individual and cumulative effects of hazards
Each of these aspects is discussed below, with examples to illustrate the concepts, together with discussion of simplified, overall, preliminary qualitative methods of risk assessment that may be used to focus the detail of the assessment onto the high-risk cases. This section is not intended to be a detailed or comprehensive description of risk assessment methods. The methods and figures shown below are purely selective examples to illustrate the approaches and are not a recommendation for any specific application.
Preliminary or qualitative risk assessment
The most common form of preliminary or qualitative risk assessment is a “risk matrix”, which assesses individual incidents in terms of categories, e.g. low, medium and high, according to their expected consequence and likelihood. An example of the risk matrix approach is provided in Australian Standard AS4360 (Risk Management). Risk matrices may need to be tailored to the requirements of the MHF Regulations as they are not typically designed for very low frequency events. Risk nomograms provide an alternative approach, although they are little used in practice (see Figures 8 and 9).
These methods can provide a relatively rapid understanding of the risk profile of the facility and can be based on judgment or be refined using more detailed information. However, the understanding gained will be relatively coarse, and the methods have limitations. For example, it is not easy to incorporate the effects of risk reduction measures within the risk matrix, and neither method is easy to use to assess cumulative hazards, in particular at facilities where a large number of hazards exist. To assess such issues, more detailed methods are likely to be required.
When using risk matrices or nomograms it is important to define individual incidents or scenarios on a consistent basis so that comparable events are assessed. For example, a risk matrix could be used to evaluate specific outcomes of incidents, individual incidents, or a specific cause of an incident. The likelihoods and consequences would be defined very differently depending on which definition of incident is used. Hence, the employer should decide how incidents will be defined and use the same approach for all incidents. A balance must be struck between defining events in sufficient detail and defining too many events to manage in the assessment.
Figure 8: Example of a risk matrix
Figure 9: Example of a risk nomogram
Ranking methods
Most forms of preliminary risk assessment can be used as a basis for ranking different incidents to establish their approximate order of importance. In the risk matrix example, a simple scoring system can be introduced to represent the combined effect of likelihood and consequence. For example, the highest-ranking incident is m.a.7 (i.e. major accident number 7) with a score or risk index of 16, closely followed by m.a.12 with a risk index of 15. The sum of the risk indices for all incidents is 76; therefore, the contribution of incident m.a.7 is 16/76 or about 21% of the cumulative risk. Note that the risk index on the matrix is the product of the numbers assigned to the rows and columns, not their sum.
An extension of the above scoring approach is to define a range of specific factors that affect the likelihood or consequences of each incident. For each factor, each incident may be given a score such as from 1 to 5 or a simple rating such as low, medium or high based on specific, established criteria. The scores for each incident are then added to give an overall likelihood, consequence or risk score for each incident.
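The scoring arithmetic described above can be sketched in a few lines. Only the risk indices of m.a.7 (16) and m.a.12 (15) and the total of 76 come from the text; the split of each index into a likelihood score and a consequence score, and all of the other incidents, are hypothetical values chosen so that the totals agree.

```python
# (likelihood score, consequence score) for each incident.
# Hypothetical score pairs: only the indices 16 and 15 and the total of 76
# are taken from the worked example in the text.
incident_scores = {
    "m.a.1": (2, 3),
    "m.a.2": (3, 4),
    "m.a.3": (3, 3),
    "m.a.5": (2, 5),
    "m.a.7": (4, 4),
    "m.a.9": (4, 2),
    "m.a.12": (3, 5),
}

# The risk index is the product of the two scores, not their sum.
risk_index = {
    name: likelihood * consequence
    for name, (likelihood, consequence) in incident_scores.items()
}

total_index = sum(risk_index.values())
ranking = sorted(risk_index, key=risk_index.get, reverse=True)

for name in ranking:
    share = risk_index[name] / total_index
    print(f"{name}: risk index {risk_index[name]}, {share:.0%} of cumulative index")
```

Sorting by risk index gives the ranking, and dividing each index by the total gives each incident's approximate contribution, which is one simple way of deciding where more detailed study should be directed.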
Investigating and analysing the nature of hazards and major accidents
A range of different techniques is available for investigating and analysing the nature of hazards and major accidents. The MHF Regulations do not prescribe a specific method; therefore, it is the employer’s responsibility to determine the most appropriate for the circumstances of the particular MHF. To assess the nature of hazards and potential major accidents requires knowledge of what may go wrong within the facility if measures to eliminate or prevent accidents are not present. Depending on the different types of hazards and potential outcomes, the employer may need to employ a combination of techniques to develop a complete understanding.
Techniques that may have been used for identifying hazards and accidents can sometimes also be used in the risk assessment to assist in understanding the nature, consequences and likelihood of the hazards and their control. For example, while HAZOP is primarily a tool for hazard identification, the HAZOP process can also include assessment of the causes of accidents, their likelihood and the consequences that may arise, so as to decide if the risk is acceptable, unacceptable or requires further study. However, within the scope of a combined HAZID and risk assessment workshop, this assessment would necessarily be coarse, qualitative and subjective and would in many cases need to be supplemented by more detailed assessment outside the workshop.
A HAZOP would not necessarily be the appropriate technique for detailed analysis of the causes of some other types of accidents (e.g. failures within complex electrical or mechanical equipment). In such cases, a failure mode effects and criticality analysis (FMECA) may be more useful, supplemented by whatever mechanical integrity information already exists for the systems within the facility’s maintenance and breakdown records.[1] Alternatively, a Fault Tree Analysis may provide the necessary understanding of the nature and causes of different types of hazard.[2] In many cases, an FMEA may be used to identify what can go wrong, and how low-level failures may affect higher-level systems.[3] A Fault Tree may then be used to show how low-level failures, combined with external aspects such as loss of power supply or human error, may combine to cause overall system failure. The Fault Tree can also be used, in principle, to estimate the likelihood or frequency of the failure occurring.
[1] See under the heading “Examples of HAZID techniques” in section 6.6 of this booklet for a brief explanation of an FMECA.
[2] Also see the reference above for a brief explanation of a fault tree and event tree analysis.
Figure 10: Example of a Fault Tree
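The way a Fault Tree can be used to estimate the likelihood of a top event may be illustrated with a minimal sketch. It assumes that the basic events are independent and uses the standard AND and OR gate relationships; the system (loss of cooling), the event names and all probabilities are hypothetical and are not taken from Figure 10.

```python
import math


def and_gate(probabilities):
    """All inputs must occur: multiply the probabilities (independence assumed)."""
    return math.prod(probabilities)


def or_gate(probabilities):
    """Any one input is sufficient: 1 minus the chance that none occurs."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


# Hypothetical basic-event probabilities per demand, for illustration only.
duty_pump_fails = 0.01
standby_pump_fails = 0.05
power_supply_lost = 0.001
operator_error = 0.02

# Pumping is lost through equipment failure only if both pumps fail.
both_pumps_fail = and_gate([duty_pump_fails, standby_pump_fails])

# Top event: cooling is lost if both pumps fail, OR power is lost,
# OR the operator makes a critical error.
loss_of_cooling = or_gate([both_pumps_fail, power_supply_lost, operator_error])

print(f"Both pumps fail:  {both_pumps_fail:.2e}")
print(f"Loss of cooling:  {loss_of_cooling:.2e}")
```

In this hypothetical tree the top event is dominated by operator error rather than by the redundant pumps, which shows how a Fault Tree can indicate the relative importance of different causes. Where events share a common cause, the independence assumption would not hold and the simple gate formulas would understate the likelihood.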
Investigating and analysing the likelihood of hazards causing major accidents
Likelihood analysis is a complex and potentially difficult process. The range of values in this process varies from “may occur within a plant lifetime”, through to “extremely rare” or “never known within industry”. Likelihood is highly dependent on a range of site-specific factors such as the number of equipment items, their condition, activity frequencies, the quality of the management system and human error levels.
Likelihood analysis should consist of a mixture of qualitative and quantitative information that, overall, gives an indication of likelihood of each incident. This may be based on calculations of basic task frequencies, analysis of how often errors are made, the reliability of safety devices, previous incident history or near miss data for the facility or industry sector. If historical data is used, it should be critically assessed and the sources of such information documented. If simple judgment or other techniques are used, the employer must record the assumptions and logic used to determine likelihoods.
Likelihood analysis flows directly from the preceding process of assessing the nature of hazards and accidents. Further evaluation of information generated in these processes may be carried out to derive an understanding of likelihood. For example, Fault Trees can be used to produce a qualitative or quantitative understanding of the likelihood of different hazards and how they may combine to lead to a specific accident.
[3] Also see the reference above for a brief explanation of an FMEA.
Event Trees may be used to determine what alternative outcomes may arise from an initial event, and the relative likelihood of each outcome. Again, it is possible to develop qualitative or quantitative event trees. Event trees also assist in defining the significant consequence scenarios, which need to be evaluated in detail. Both Event Trees and Fault Trees can be used to evaluate quantitatively or qualitatively what effect existing or potential control measures have on risk levels. The effects of control measures assumed in these assessments should be reflected in the performance indicators defined for the control measures.
Figure 11: Example of an Event Tree
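A quantitative Event Tree can be sketched by multiplying the frequency of the initial event by the probability of each branch along a path. The example below is an illustration only: the flammable release scenario, the branch structure and all numerical values are hypothetical and do not reproduce Figure 11.

```python
# Hypothetical values, for illustration only.
release_frequency = 1e-3       # initial event: releases per year
p_immediate_ignition = 0.1     # chance the release ignites immediately
p_delayed_ignition = 0.2       # chance of later ignition, given no immediate ignition

# Each outcome frequency is the initial frequency multiplied by the
# probabilities of the branches taken to reach that outcome.
outcome_frequencies = {
    "immediate ignition": release_frequency * p_immediate_ignition,
    "delayed ignition": (
        release_frequency * (1 - p_immediate_ignition) * p_delayed_ignition
    ),
    "no ignition": (
        release_frequency * (1 - p_immediate_ignition) * (1 - p_delayed_ignition)
    ),
}

for outcome, frequency in outcome_frequencies.items():
    share = frequency / release_frequency
    print(f"{outcome}: {frequency:.2e} per year ({share:.0%} of releases)")
```

Because the branches at each node are mutually exclusive and exhaustive, the outcome frequencies add up to the frequency of the initial event, which provides a simple check on the tree. A control measure can be represented as an additional branch, so that its assumed reliability is visible and can be reflected in its performance indicators.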
Human factors and likelihood
Human factors can have a major effect on the likelihood of breakdowns, hazards or overall incidents. It is very difficult to fully quantify the effects of human factors on likelihood; however, there is some robust general information on human error rates that may be utilised in risk assessment.
Investigating and analysing the magnitude and severity of accidents
The magnitude and severity of accidents can be determined using consequence and impact analysis. Consequence analysis involves calculation of the size and duration of the physical and/or chemical effects of accidents, while impact analysis involves determination of the harm done to people and property. There are complex computer packages available for consequence analysis, but also simple equations and nomogram techniques for some cases such as the radiation from pool fires or the toxic gas cloud formation from releases of chlorine. Typical consequences that need to be considered within a risk assessment include toxic exposure from gas clouds or smoke, inside or outside buildings.
The selective use of “worst-case” consequence modelling can improve the efficiency of a process when it is necessary to identify which areas of the facility can cause offsite effects. It is necessary to also consider less than worst-case conditions to develop a comprehensive understanding of the risk. The Regulations apply equally to onsite and offsite populations, and the worst-case scenarios for onsite and offsite populations may be very different.
The worst-case approach involves defining the credible combination of conditions giving rise to the maximum consequence zone for the identified accident, in relation to the target population. This can include defining release quantity, duration, pressure, composition, location, wind speed and other atmospheric conditions, time of ignition and functioning of control measures. It is common to assume the worst-case release quantity is the maximum vessel contents, released over a defined period of time. However, this assumption is not necessarily correct for all types of plant or storage area. Where there is a clear mechanism for releasing more than the maximum vessel inventory, this should be considered in the consequence analysis. Active control systems such as isolation valves and blowdown systems need to be assessed for the worst-case scenario. Passive control measures that are assured of functioning in the event of the worst-case accident may be included in the assessment.
The impact distance in all directions from the release point should be determined, allowing for the fact that the wind can blow from any direction. The impact distance is usually determined to a predefined “consequence criterion” which can be material and/or effect specific. For example, LPG flash fires will occur to a defined lower flammable limit while LPG can also produce fireballs or jet fires that can cause injury or damage from thermal radiation effects. Thermal radiation criteria are the same for all flammable materials. Employers should carefully define all relevant consequence criteria based on their definition of a major accident.
Below is an example of one method for illustrating the consequences and effects of a major accident. The example is a major accident involving a pool fire. This method may prove helpful during the risk assessment process and, if used, should be included in risk assessment documentation.
Figure 12: Pool fire consequences & effects
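The idea of an impact distance determined to a predefined consequence criterion can be illustrated with one of the simple equation-based techniques mentioned earlier. The sketch below uses a point-source thermal radiation model, in which the received heat flux falls off with the square of the distance. That model is an assumption chosen for illustration, not a method prescribed by this booklet. All numeric inputs (heat release rate, radiative fraction, transmissivity and the criterion values) are hypothetical placeholders; a real assessment would use validated consequence models and site-specific data.

```python
import math


def point_source_flux(heat_release_kw, radiative_fraction, distance_m,
                      transmissivity=1.0):
    """Heat flux (kW/m2) received at distance_m from a point-source fire.

    Assumes the radiated part of the fire's heat release spreads evenly
    over a sphere of radius distance_m (simplified point-source model).
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    radiated_kw = radiative_fraction * heat_release_kw
    return transmissivity * radiated_kw / (4.0 * math.pi * distance_m ** 2)


def impact_distance(heat_release_kw, radiative_fraction, criterion_kw_m2,
                    transmissivity=1.0):
    """Distance (m) at which the flux falls to the consequence criterion."""
    if criterion_kw_m2 <= 0:
        raise ValueError("criterion_kw_m2 must be positive")
    radiated_kw = radiative_fraction * heat_release_kw
    return math.sqrt(
        transmissivity * radiated_kw / (4.0 * math.pi * criterion_kw_m2)
    )


# Hypothetical pool fire: 50 MW total heat release, 30% of it radiated.
HEAT_RELEASE_KW = 50_000.0
RADIATIVE_FRACTION = 0.3

# Hypothetical criteria (kW/m2); the employer defines the real ones.
criteria = {"injury": 4.7, "equipment damage": 23.0}

distances = {
    name: impact_distance(HEAT_RELEASE_KW, RADIATIVE_FRACTION, value)
    for name, value in criteria.items()
}
for name, distance in distances.items():
    print(f"{name}: {distance:.1f} m in every direction from the fire")
```

Each distance is applied as a radius around the release point, consistent with determining the impact distance in all directions. A lower criterion value gives a larger impact distance.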
The employer should consider consequences under a range of meteorological conditions. Usually the worst-case meteorological condition for toxic or non-dense gas releases is high atmospheric stability and a low wind speed, typically experienced at night-time or in the very early morning. For dense gases, the worst-case condition is typically a high wind speed, which tends to occur at neutral atmospheric stability and during the day. Definitions of stability and other environmental conditions can be found in safety or meteorology literature. Ambient temperature and humidity may also affect the consequences of releases. In particular, high temperature can increase the flammable effect range of low volatility materials. Surface type and topography can also affect the consequence, for example a spill into water or onto sloping areas. For flammable materials the consequences should be analysed both when ignition occurs immediately following the release and when ignition occurs after sufficient delay for a flammable cloud to fully develop. Further factors to consider include day versus night conditions and extreme weather conditions such as flooding, storms and, for facilities located in cyclone-prone areas of Australia, cyclones.
To evaluate the impacts of major accidents on people, it is necessary to consider the number and distribution of potentially exposed people, and their characteristics. Variations in these factors should also be considered, such as temporary populations, maintenance crews and on-site populations for specific operational modes. A further factor that should be considered is that people such as emergency services or investigators may be present specifically because there is a developing incident.
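One way to check that the range of conditions described above is covered systematically is to enumerate the combinations before running any consequence model. The sketch below is illustrative only: the condition sets, labels and wind speeds are hypothetical examples, not values taken from this booklet or the Regulations.

```python
from itertools import product

# Hypothetical condition sets; each facility would define its own.
weather = [
    {"label": "stable, low wind", "stability": "F", "wind_m_s": 1.5},
    {"label": "neutral, high wind", "stability": "D", "wind_m_s": 8.0},
]
ignition = ["immediate", "delayed"]
population = ["day shift", "night shift", "maintenance shutdown"]


def build_cases(weather, ignition, population):
    """Return one case per combination of weather, ignition and population."""
    return [
        {"weather": w["label"], "ignition": i, "population": p}
        for w, i, p in product(weather, ignition, population)
    ]


cases = build_cases(weather, ignition, population)
print(f"{len(cases)} consequence cases to assess")
for case in cases[:3]:
    print(case)
```

Listing the cases explicitly makes it easier to record which combinations were modelled and which were screened out, and why.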
Presentation of risk assessment results Risk assessment results can be presented in many different ways (see the example below). Again it is important to provide the results showing explicitly which control measures are reflected in each result.
Figure 13: Example risk assessment results
Key achievements for a quality risk assessment
The risk assessment is conducted for all hazards and potential major accidents at a facility, ensuring that:
a) it is comprehensive, systematic, rigorous and transparent;
b) it generates all information required by the Regulations, and provides employers with sufficient knowledge to operate safely;
c) the knowledge is kept up to date, through review and revision;
d) the information is provided to persons who require it to work safely;
e) an appropriate group of employees is actively involved;
f) uncertainties are explicitly identified and reduced to an acceptable level;
g) all methods, results, assumptions and data reflect the nature of the hazards considered and are documented;
h) a range of control measures are considered and their effects on risk are explicitly addressed;
i) it supports the development of the safety management system;
j) it is used as a basis for adoption of control measures, including emergency planning; and
k) it is used as a basis for the demonstrations in the safety report.
6 Control measures
6.1 Introduction
The previous sections discussed key elements for the range of control measures that should be in place at an MHF. This section provides more detailed guidance on how to select and judge the effectiveness of specific control measures. Choosing the best control measures and being able to demonstrate their effectiveness is a critical feature of compliance with the Regulations.
6.2 What is a control measure?
A control measure is the part of a facility, including any system, procedure, process or device, that is intended to eliminate hazards, prevent hazardous incidents from occurring or reduce the severity of consequences of any incident that does occur. It is the principal tool that delivers safe operation. Control measures are not only physical equipment; they may include high-level procedures or detailed operating instructions and information systems. Control measures may be proactive, in that they eliminate, prevent or reduce the likelihood of incidents, or they may be reactive, in that they reduce the consequences of incidents. They must be implemented under and fully supported by the managerial elements of the SMS. The employer should identify control measures carefully for the MHF, to avoid unnecessary effort or confusion from assessing measures that are not relevant to major accidents. Understanding which parts of the facility are control measures, and how each actually controls or affects hazards and risks, is critical to safe operation. This understanding is also essential to the safety report and the associated justification of adequacy of the adopted control measures. Control measures can be regarded as the “barriers” between the hazards of an MHF, the occurrence of a major accident as a result of these hazards, and ultimately the harm that may be caused to people, property and the environment in the event of a major accident. This concept is illustrated in Figure 14.
Figure 14: Control measures as “barriers” to major accidents
Control measures can be identified while identifying hazards and during the risk assessment. Employers should be able to identify a range of control measures immediately, both the existing measures and possible alternatives. Checklists of "typical" control measures may be able to assist in the process, but these should not be used in isolation. The specific nature of each hazard and the associated part of the facility should be considered when identifying control measures. The table below is an example of the consequences and key control measures that might apply for a warehouse.
An example: Identification of scenarios and control measures, dangerous goods warehouse
Scenario: Flash or pool fires from puncturing drums containing flammable liquids.
Key controls: Drum inspection and handling procedures; ignition source control; fire-fighting equipment.
Scenario: Fires in packaged goods areas, in pallet storage stacks, or amongst general rubbish.
Key controls: Housekeeping; ignition source control; smoke detection and automatic vents.
Scenario: Fire escalation.
Key controls: Separation and segregation rules; stacking restrictions; fire-fighting equipment and emergency response.
6.3
Understanding control measures Each identified control measure should be clearly linked to the causes, hazards, major accidents or outcomes they are designed to control. Employers should understand the nature, scale and range of hazards and outcomes each control measure must deal with and the effects each control measure has on these factors. This understanding is required to cover the whole range of conditions that might exist at the facility. This knowledge provides a clear basis for defining which control measures are critical to safe operation. It also provides a basis for defining performance indicators and standards for control measures.
Using a risk control hierarchy to determine control measures
In an occupational health and safety context, risk control is often categorised according to an effectiveness hierarchy, often simply called the “risk control hierarchy”. The hierarchy lists the types of control measure in priority order, based on the extent to which each measure has an impact on risk. In the context of MHFs, a useful effectiveness hierarchy of control measures is as follows:
a) eliminate hazards;
b) prevent incidents;
c) reduce consequences; and
d) mitigate the harm.
The different categories are defined below.
The control "hierarchy" in an MHF context:
Control measures that eliminate a hazard are clearly the most effective. If practicable they should be selected in preference to any other type, as their existence removes the need for other controls.
Control measures for prevention are those intended to remove certain causes of incidents or reduce their likelihood. The corresponding hazard remains, but the frequency of incidents involving the hazard is lowered.
Control measures for reduction are those intended to reduce the severity (consequences) of incidents. They include reduction of inventory.
Control measures for mitigation are those that take effect in response to an incident to limit the consequences. They may include fire-fighting systems and, in particular, the emergency response plan. While they may be the "last line of defence", they remain necessary if risk cannot be reduced to a negligible level by other means.
Control measures can also be categorised as hardware (i.e. engineered systems) or software (e.g. management systems or operating/maintenance
procedures). Controls may also be grouped into categories that define the nature and spread of the control such as engineering, organisational, procedural and administrative controls. Whatever method of categorisation is employed, safe operation will depend on an appropriate balance of different types of control measures. These categorisations can help in determining the most effective control measures for a facility and in ensuring a range of measures is chosen so that one failure does not remove many controls. A single category of control measure will rarely be enough for a risk to be controlled as far as is reasonably practicable unless the elimination of the hazard has occurred. Most commonly, layers of protection will be required to reduce a risk so far as is reasonably practicable. For most facilities or items of equipment, there are numerous layers acting as barriers to eliminate, prevent, reduce or mitigate incidents. This is illustrated in Figure 15. Equipment integrity, operating and maintenance procedures are the "inner layers", and are the barriers normally relied on to ensure incidents do not occur. Systems that reduce or mitigate incidents are the "outer layers" which are relied on in abnormal or emergency conditions. A robust risk control regime will feature a range of risk control layers; the number and integrity of which should reflect the inherent level of hazard and risk within the protected part of the facility.
Figure 15: Layers of Protection
Layers shown in the figure: community emergency response; plant emergency response; physical protection and mitigation systems; automatic safety instrument system; critical alarms and operating procedures; basic process control system; process design.
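A common way to reason about layered barriers is to treat each independent layer as having some probability of failing when called upon, and to multiply these probabilities by the frequency of the initiating event. The sketch below illustrates that arithmetic. It is a simplified layer-of-protection style calculation that assumes the layers are fully independent, and every frequency and probability in it is a hypothetical placeholder rather than data from this booklet.

```python
from math import prod


def mitigated_frequency(initiating_per_year, failure_probabilities):
    """Frequency (per year) of the incident getting past every layer.

    Assumes the layers fail independently of each other, so the chance
    that all of them fail is the product of the individual probabilities.
    """
    for p in failure_probabilities.values():
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return initiating_per_year * prod(failure_probabilities.values())


# Hypothetical layers and probabilities of failure on demand.
layers = {
    "basic process control system": 0.1,
    "critical alarm and operator response": 0.1,
    "automatic safety instrumented system": 0.01,
}

all_layers = mitigated_frequency(0.5, layers)
without_sis = mitigated_frequency(
    0.5,
    {name: p for name, p in layers.items()
     if name != "automatic safety instrumented system"},
)
print(f"all layers working as assumed: {all_layers:.1e} per year")
print(f"safety instrumented system unavailable: {without_sis:.1e} per year")
```

If two layers can be disabled by the same mechanism, the independence assumption does not hold and the product understates the frequency, which is one reason common mode failures need to be identified.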
Examples of control measures are shown below, using the above categorisation. The table is illustrative only, and is not intended to be a complete list of possible controls for any facility. The categorisation shown is not intended to be rigid, and many controls may apply in more than one category.
Some examples of control measures Type
Engineering Controls
Administrative Controls
Elimination
• Mounding of LPG storage tanks.
• Inherently safe process concept.
• Substitution with nonhazardous materials.
• Feedstock quality specifications.
• Inherent design features, layout.
• Plant design procedures.
Prevention
• Impact and dropped object barriers.
• Operating procedures and instructions.
• Isolation valves to enable safe maintenance work.
• Maintenance and isolation procedures.
• Mechanical ventilation systems.
• Management of change.
• Process control systems.
• Corrosion and erosion probes.
• Materials specifications, corrosion allowance.
Reduction
• Physical barriers between incompatible materials.
• Spill containment and clean-up procedures.
• Secondary containment of hazardous substances.
• Ignition suppression equipment.
• Process emergency controls and alarms.
• Procedures.
• Shutdown, isolation and de-pressurisation systems.
• Bursting disks.
• Safety and relief valves.
• Bunds, other containment and drainage systems.
Mitigation
• Fire detection systems.
• Emergency alarms.
• Fire suppression and cooling systems.
• Emergency planning and procedures.
• Passive fire protection systems.
• Employer-owned buffer zones.
Control measures may vary for different stages of the facility's life cycle. For example, design and construction standards are important for new facilities, but as the facility ages more emphasis may be required on asset integrity management. Similarly, control measures may themselves have life cycles that may need to be considered. The balance and type of control measures are expected to be consistent with the employer’s overall safety philosophy. If the safety philosophy is based primarily on engineering controls there is less need for other controls such as administrative ones. On the other hand, if the safety philosophy is based on personnel knowledge and skills, then procedural and competency controls might be dominant, although there would need to be additional hardware controls. The assessment required to understand control measures, their function and their effects on hazards and associated risks, is driven by three factors: a) a highly complex reaction process, new technology, or complex process equipment may require detailed assessment to understand the control measures, whereas a simple system can be understood more rapidly and without using sophisticated methods of assessment; b) where there are numerous options available to control the associated risk, more effort is likely to be required to reach an understanding of the available controls, to differentiate the options in terms of their effects on risk and to provide a basis for selecting or rejecting options appropriately; and c) a high level of uncertainty regarding the nature of the hazard or risk or the behaviour of the control measures is likely to require greater effort to reach an overall understanding; e.g. Class 6.1 liquids are more straightforward to analyse than Class 2.3 toxic gases. The above concepts illustrate the issues that need to be considered in defining and understanding control measures. 
There may be many other issues that need to be considered in developing an understanding of control measures for a facility. For many facilities this may result in a significant amount of information. Therefore a simple method of linking and communicating the information together should be considered, for example "bow tie" diagrams or registers of hazards and controls. Figure 16 provides examples of how to use bow tie diagrams or registers to link and communicate control measure information. Alternatively, simple hazard management tables or diagrams can be developed.
Figure 16: Examples of presentation formats for hazard and control information a) "Bow Tie" diagram
b) Register of hazards and controls
6.4
Selecting and rejecting control measures There are several factors to consider when selecting or rejecting control measures. These factors have a bearing on the fundamentally important requirement to: a) justify the adequacy of control measures (where “adequacy” means “adequate to eliminate risk or reduce it so far as practicable”); b) identify potential common mode failures; and c) define performance indicators for the control measures. The text below sets out a series of core questions that the employer may consider using when selecting or rejecting control measures:
Core questions to ask when selecting or rejecting control measures
• Are there controls clearly linked to each hazard, or are there some hazards having no (or insufficient) control measures?
• Does the number of controls reflect the level of severity of the hazards? The extent of demonstration should be proportional to the level of risk.
• What is the functionality of a control measure against the relevant hazards? Is it sufficient to control the hazard in the intended manner, i.e. is it fit for purpose, will it suppress the hazard completely, prevent escalation or simply mitigate effects?
• What is the survivability of the control measure in an accident? Is the control measure able to function as intended during the types of accidents it is intended to reduce or mitigate?
• Is the reliability of individual control measures, and of all control measures in combination, appropriate to the level of risk presented by the associated hazards? Is function testing sufficiently frequent to detect failures, and will failures once detected be rectified sufficiently promptly?
• Has the hierarchy of control measures been considered, with measures to eliminate the hazard adopted first if practicable, followed by measures to prevent, reduce and mitigate?
• Is there a balance of different types of control measure for each hazard, i.e. is there a diversity of control measures?
• Are the control measures associated with individual hazards independent of each other, or can they all be disabled by the same mechanism?
• Are the control measures maintainable? For example, are they accessible, and can they be maintained (a counter-example is a safety valve with no means for removal or maintenance because it is the only one and must remain in service)?
• Are new control measures compatible with the facility, and any other control measures already in use?
• Can the control measures be implemented at the facility considering their availability and cost?
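Some of the core questions above can be screened automatically if hazards and controls are held in a register. The sketch below is a minimal illustration under stated assumptions: the `Control` record, the `depends_on` field for shared services and all register entries are hypothetical, and the checks cover only three questions (hazards with no controls, lack of diversity across the hierarchy, and controls that share a common dependency). It does not replace the judgement needed for functionality, survivability or reliability.

```python
from dataclasses import dataclass, field

HIERARCHY = ("eliminate", "prevent", "reduce", "mitigate")


@dataclass
class Control:
    name: str
    category: str                                  # one of HIERARCHY
    depends_on: set = field(default_factory=set)   # shared services, e.g. power


def review_hazard(hazard, controls):
    """Return a list of findings for one hazard and its linked controls."""
    if not controls:
        return [f"{hazard}: no control measures linked"]
    findings = []
    categories = {c.category for c in controls}
    if len(categories) == 1:
        only = next(iter(categories))
        findings.append(f"{hazard}: all controls are of type '{only}'")
    # A dependency shared by every control is a possible common mode failure.
    shared = set.intersection(*(c.depends_on for c in controls))
    if len(controls) > 1 and shared:
        findings.append(f"{hazard}: all controls depend on {sorted(shared)}")
    return findings


# Hypothetical register entries for illustration only.
register = {
    "drum puncture fire": [
        Control("drum handling procedure", "prevent"),
        Control("fire-fighting equipment", "mitigate", {"fire-water pump"}),
    ],
    "tank overfill": [
        Control("high-level alarm", "prevent", {"site power"}),
        Control("high-level trip", "prevent", {"site power"}),
    ],
    "forklift impact on pipework": [],
}

findings = [
    finding
    for hazard, controls in register.items()
    for finding in review_hazard(hazard, controls)
]
for finding in findings:
    print(finding)
```

A finding from a screen like this is a prompt for review, not a conclusion: a hazard with a single type of control may still be adequately controlled if the hazard has been eliminated.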
6.5
Additional or alternative control measures. Employers should objectively challenge how safety is achieved and consider ways to improve safety. This means that alternative control measures not currently in place must be considered alongside existing control measures, and either adopted or rejected according to the results of the risk assessment. In particular, additional controls should be considered if the risk is not reduced as far as practicable or hazards have been identified with no control measures in place. Alternative controls should be identified where the risk is not reduced as far as practicable. The importance of being prepared to challenge the "norms" of facility operation has been highlighted in past disaster inquiries such as the
Longford Royal Commission and the Cullen Inquiry into Piper Alpha. Therefore the employer should typically consider the following circumstances: a) existing control measures which are believed to be fully functional and appropriate; b) existing control measures which may have become disabled, degraded or deficient; c) existing control measures which function as intended but could be improved; d) control measures which were considered or used in the past and rejected for some reason; e) existing control measures which are to be replaced due to obsolescence or old age; f) new control measures which could replace or add to the existing range of control measures; and g) new control measures for modifications to the facility. For many existing facilities, there may be control measures that were adopted or rejected in the past without records to support those decisions. Employers should identify past decisions and control measures that need to be recorded and reviewed, to understand what was done in the past and why it was done, and to maintain the integrity of existing control measures in the future. This relates to the need for a knowledge base of the control measures on the facility and is an important part of justifying the adequacy of an existing facility in the safety report. Given the potentially large number of decisions and control measures for a typical MHF, which may have decades of operating experience, the employer will need to identify the critical areas that require review, and determine which areas need to be reviewed in brief or in detail. Circumstances where control measures would require review include: a) new operating conditions have arisen; b) knowledge of the basis for safe operation has been lost; c) there may have been a degradation in effectiveness of existing controls; d) the knowledge or technology employed is now outdated; and e) an incident occurred. 
The employer should identify both proven technology and newly developed options, as appropriate, and should not dismiss any option on the grounds that it is "unproven". The process of risk assessment should include the evaluation of new technologies and practices to determine if they are appropriate to the facility. A reasonable number of existing and alternative control measures should therefore be considered, depending on:
a) the scale and complexity of the facility;
b) the nature of the risk profile; and
c) the rate of development of new technologies and practices.
6.6 Defining performance indicators for control measures
Performance indicators for control measures will generally relate to some standards or target levels of performance (performance standards) to ensure safe operation. Performance indicators and the corresponding standards play a vital role in the justification of the adequacy of control measures. A performance indicator is defined as information that is used to measure the effectiveness of a control, e.g. a test of effectiveness, an indicator of failure, an action taken to report a failure or a corrective action taken in the event of a failure. An indicator is an objective measure, which shows current and/or past performance. A performance standard is defined as the target set for a performance indicator. The standard represents the performance required for the control measure or SMS element to be considered effective in managing the risk to ALARP (as low as reasonably practicable). Once the employer has decided which control measures are to be adopted, performance indicators must be defined for the control measures, which enable the employer to:
a) measure, monitor or test the effectiveness or failure of each control measure; and
b) determine the best reporting and corrective actions to be taken in the event of failure.
Performance indicators should measure not only how well the control measures can perform, but also how well the management system is monitoring and maintaining them. This shows how performance indicators for control measures overlap with the performance standards required for the SMS. Some performance indicators and their corresponding performance standards for engineered control measures may be adopted from manufacturer's recommendations; however, the employer should determine if these are appropriate to the specific conditions of the facility. Performance indicators take many forms, and can be quantitatively or qualitatively expressed.
One type of target is a desirable long-term goal or a limit, the breaching of which can be tolerated to a certain extent or under certain conditions in the short-term. Another type of target is one that needs to be achieved within a prescribed timeframe, or where there is zero tolerance for any breaches.
An example of performance indicators for control measures:
A hardware control measure has performance standards relating to its capacity and reliability, plus management system standards for inspection, testing and maintenance, which aim to assure that the capacity and reliability of the control measure are maintained. Performance indicators need to be set that measure performance against these standards. For example, for a pressure relief valve the performance indicators and standards may relate to:
• Min number on-line: x
• Min relief rate: y kg/s
• Max probability of failure: y%
• Max interval between tests: z yrs
One example of a performance indicator that provides a range of acceptable performance is a pre-alarm limit that can be exceeded for a period of abnormal operations provided this is monitored. A performance indicator that does not allow a range of acceptable performance is set at the level of the critical operating parameter (see the section on critical operating parameters).
Performance indicators for control measures should include the following considerations:
a) failure of any control measures - what are the performance requirements for functionality, availability, reliability and survivability of control measures that indicate how or how often the control measures may fail to perform, and what performance standards are required for any activities necessary to achieve these standards?
b) reporting of control measure failures - what activities are necessary to confirm or assure performance, what degree of reporting of failures is required, how quickly will the reporting system identify a failure, and what level of independent verification is needed in addition to routine assurance?
c) corrective action in the event of such failures - what steps are to be taken and how quickly following detection, and what performance standards are required of the corrective process?
Performance indicators can be defined at various levels, e.g.
there may be high-level performance indicators as well as lower level and detailed performance indicators. High-level indicators tend to address overall performance issues, for example:
a) employee perceptions, incident rates, improvement programs and availability of control measures, which may be taken as indicators of overall safety performance;
b) maintenance of operating conditions within a critical operating envelope, which may indicate overall integrity of the process control regime; and
c) total number of resources dedicated to testing, inspection and
maintenance of critical control measures.
Detailed performance indicators tend to relate to individual measures that, when combined, contribute to achieving overall high-level performance. At a detailed level, there are many different types of performance indicators that can be defined for each control measure. When specifying performance indicators or standards, it may be necessary to provide detail on “who, what, where, and when” for implementation of procedures and activities relating to these indicators and standards. The responsibility for implementation of performance indicators can be defined at a very specific level for each performance standard. For example, responsibility for operational parameters may lie with operations management teams. Where performance standards relate to control measures, they should be assessed as part of the justification of adequacy. It is also necessary to show that the control measures achieve the standard that has been set. In the simplest cases, performance standards may be industry standards, codes or norms. However, these need to be shown to be appropriate to the specific facility, and this can be done by a combination of techniques such as:
a) risk assessment results;
b) qualitative argument or reference to the basis for the standard; and
c) cost-benefit or cost-effectiveness analysis of options.
In more complex cases, where there may be no appropriate existing standards, the employer may need to demonstrate the suitability of the performance standard based solely on the risk assessment.
Some examples of performance indicators for control measures:
• Management system compliance levels as shown by audit.
• Test frequency/interval for safety-critical equipment.
• Average skill level of the operations shift personnel.
• Compliance level with operating procedures as shown by monitoring.
• Number of failures in specific safety devices.
• Number of times staffing levels fall below target minimum numbers.
• Number of times pressure, temperature etc exceed particular levels.
• Measured mechanical integrity (e.g. extent of corrosion).
• Detection and response times for unintended material releases.
• Sensitivity levels and response times for process alarms.
• Compliance levels with manufacturer's or design standards.
• Vibration levels in rotating equipment (e.g. compressors).
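The relationship between a performance indicator (what is measured) and a performance standard (the target it is compared with) can be sketched as a simple comparison. The example below is a minimal illustration, assuming a pressure relief valve like the one discussed earlier in this section. The `PerformanceStandard` record, the indicator names, the target values and the measured data are all hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Callable


@dataclass
class PerformanceStandard:
    indicator: str
    target: float
    meets: Callable[[float, float], bool]   # e.g. operator.ge for "at least"
    unit: str = ""


def assess(standards, measured):
    """Compare measured indicator values with their standards.

    Returns {indicator: "met" | "not met" | "no data"}. Missing data is
    reported separately so that it is not mistaken for compliance.
    """
    results = {}
    for s in standards:
        if s.indicator not in measured:
            results[s.indicator] = "no data"
        elif s.meets(measured[s.indicator], s.target):
            results[s.indicator] = "met"
        else:
            results[s.indicator] = "not met"
    return results


# Hypothetical standards for a pressure relief valve.
standards = [
    PerformanceStandard("valves on-line", 2, operator.ge),
    PerformanceStandard("relief rate", 5.0, operator.ge, "kg/s"),
    PerformanceStandard("probability of failure on demand", 0.01, operator.le),
    PerformanceStandard("years since last test", 2.0, operator.le, "yrs"),
]

# Hypothetical measurements; one indicator has not been recorded.
measured = {
    "valves on-line": 2,
    "relief rate": 5.6,
    "years since last test": 2.5,
}

results = assess(standards, measured)
for indicator, status in results.items():
    print(f"{indicator}: {status}")
```

Reporting "no data" as its own outcome reflects the point that indicators should also show how well the management system is monitoring the control measure, not only how the hardware performs.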
6.7
Critical operating parameters For the purposes of this booklet, a critical operating parameter (COP) is “the upper or lower performance limit of any equipment, process or procedure, compliance with which is necessary to avoid a major accident”. The sum total of COPs may define an overall safe operating envelope for the facility. Control measures are required to prevent COPs being exceeded; in general each COP will have at least one associated control measure, and each control measure will relate to at least one COP. A COP is a process or other variable that can be measured instantaneously, and where a breach of the safe operating envelope may be detected by exceeding the COP limit. This contrasts with performance standards, which are generally something against which performance may need to be tracked over a period of time, to determine whether operations are acceptably safe. Performance standards might include the number of times COPs are exceeded each year. Examples of COPs are the quantity, pressure, temperature or composition of a material in storage or process systems, or the manning level at the facility, or the number of fire-water pumps available on-line. Each COP may have an associated performance standard, such as the allowable number of times a soft target associated with the COP is exceeded. This performance standard then also relates to the associated control measure. There may be critical design or maintenance parameters that can be used in setting performance standards. The concept of COPs is illustrated in Figure 17.
Figure 17: Illustration of critical operating parameters
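The distinction between a COP (a limit checked against an instantaneous measurement) and a performance standard (something tracked over time, such as the number of times the COP is exceeded each year) can be sketched as follows. This is an illustration only: the `CriticalOperatingParameter` record, the limit values, the readings and the allowable number of exceedances are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CriticalOperatingParameter:
    name: str
    lower: Optional[float] = None
    upper: Optional[float] = None

    def is_breached(self, value):
        """True if value lies outside the lower or upper limit."""
        if self.lower is not None and value < self.lower:
            return True
        if self.upper is not None and value > self.upper:
            return True
        return False


def count_exceedances(cop, readings):
    """Count separate excursions, not individual out-of-limit readings."""
    count = 0
    previously_breached = False
    for value in readings:
        breached = cop.is_breached(value)
        if breached and not previously_breached:
            count += 1
        previously_breached = breached
    return count


# Hypothetical COP and readings for a storage vessel.
pressure = CriticalOperatingParameter("vessel pressure (kPa)", upper=1500.0)
readings = [1400.0, 1510.0, 1520.0, 1450.0, 1505.0, 1490.0]

MAX_EXCEEDANCES_PER_YEAR = 1   # hypothetical performance standard
excursions = count_exceedances(pressure, readings)
print(f"{excursions} excursions; standard met: "
      f"{excursions <= MAX_EXCEEDANCES_PER_YEAR}")
```

A lower limit works the same way, for example a minimum number of fire-water pumps available on-line.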
6.8
Involving employees in control measures The Regulations require employers to consult with employees and contractors (where practicable) in all decision-making processes associated with controlling risks. The employer should consider defining roles for employees in relation to adopting or reviewing control measures. Through this involvement, employees are able to provide their knowledge of how the facility is operated in practice and assist in identifying the control measures actually in place. The employees’ knowledge may also assist in providing an understanding of how control measures function in practice, and how they may fail or be defeated. Employees will be aware of issues such as compatibility and maintainability of alternative control measures and are vital to the process of selecting or rejecting control measures. The objective is to make use of employees’ knowledge and experiences in the working of the facility. In practice, only particular employees are involved in this way, however, all employees must be provided with information, instruction and training on the adopted control measures regardless of their involvement in the review and assessment activities. This ensures that all individuals understand the control measures to the extent necessary to perform their work safely. Evidence of genuine participation by employees in all aspects of selecting control measures will make an important contribution to the quality of the safety report.
6.9 Control measures within the safety report and SMS
Control measures should be systematically managed within the SMS and must be presented within the safety report. This information should include statements on the viability and effectiveness of the range of control measures considered, methods and results of the corresponding risk assessments, and the reasons for selection or rejection of control measures. It should also include the COPs and performance indicators for the adopted control measures and a justification of the adequacy of control measures, including the means by which performance is assured.
The SMS must relate to each activity used in the selection and ongoing maintenance of control measures. Each element of the SMS should have performance standards to provide regular monitoring of the effectiveness of each element. Consultative methods used to involve the people working at the facility to identify and develop control measures should be described.
The employer’s processes for adopting and managing control measures and their related information are illustrated in Figure 18. Examples of methods for recording information, such as a register of hazards and control measures, were discussed above. However, these can be expanded if necessary to include additional data on the control measures, for example consequence and likelihood information from the risk assessment, performance indicators, the responsibility for "who, what, when and how", and information to support the justification of adequacy.
Figure 18: Adopting and Managing Control Measures
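An expanded register entry of the kind described in section 6.9 could be represented as a simple record. This is a hedged sketch only: the field names, the `ControlMeasureRecord` class and the `missing_fields` helper are hypothetical and are not taken from the Regulations or from any prescribed register format.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class ControlMeasureRecord:
    """Hypothetical register entry linking a hazard to a control measure."""
    hazard: str
    control_measure: str
    consequence: str = ""             # from the risk assessment
    likelihood: str = ""              # from the risk assessment
    performance_indicators: List[str] = field(default_factory=list)
    who: str = ""                     # person or role responsible
    what: str = ""                    # task needed to keep the measure effective
    when: str = ""                    # frequency or due date
    how: str = ""                     # method or procedure used
    adequacy_justification: str = ""  # supporting information


def missing_fields(record: ControlMeasureRecord) -> List[str]:
    """Return the names of fields that are still empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Assumed example entry, for illustration only.
entry = ControlMeasureRecord(
    hazard="overfill of storage tank",
    control_measure="high-level alarm with operator response",
    consequence="loss of containment",
    likelihood="unlikely",
    performance_indicators=["alarm proof test passed"],
)
print(missing_fields(entry))
```

A check such as `missing_fields` is one possible way to show where a register entry still lacks the "who, what, when and how" or the information supporting the justification of adequacy.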
6.10 Reviewing and revising control measures
Control measures must remain valid or effective for the conditions at the MHF. Ordinarily, it is improbable that all control measures will always remain valid or effective, given changes at the facility and new knowledge about hazards, risks and control measure options. Reviews of control measures should be triggered whenever a situation arises that would indicate that control measures are no longer valid or effective, for example if there is a proposal to modify the facility, if there has been a major accident or if a control measure fails to meet the set performance standard. In addition, reviews of the safety report HAZID and risk assessment are required if requested by Comcare and at least every 5 years. It follows that an ongoing process of reviewing and revising control measures simplifies the 5-year requirement to review the safety report.
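The review triggers listed in section 6.10 can be expressed as a simple check. The sketch below is illustrative only: the function name, argument names and reason strings are assumptions. For brevity it combines the event-based triggers for control measure reviews with the Comcare-request and 5-year requirements that apply to the safety report HAZID and risk assessment, and the examples given in the text are not an exhaustive list of triggers.

```python
from datetime import date
from typing import List

REVIEW_INTERVAL_YEARS = 5  # "at least every 5 years"


def review_reasons(last_review: date,
                   today: date,
                   *,
                   modification_proposed: bool = False,
                   major_accident: bool = False,
                   performance_standard_missed: bool = False,
                   requested_by_comcare: bool = False) -> List[str]:
    """Return the reasons (if any) why a review should be started now."""
    reasons = []
    if modification_proposed:
        reasons.append("proposal to modify the facility")
    if major_accident:
        reasons.append("major accident")
    if performance_standard_missed:
        reasons.append("control measure failed its performance standard")
    if requested_by_comcare:
        reasons.append("requested by Comcare")

    # Periodic review: due on the fifth anniversary of the last review.
    try:
        due = last_review.replace(year=last_review.year + REVIEW_INTERVAL_YEARS)
    except ValueError:
        # last_review was 29 February and the target year is not a leap year
        due = last_review.replace(year=last_review.year + REVIEW_INTERVAL_YEARS,
                                  day=28)
    if today >= due:
        reasons.append("five-year review interval reached")
    return reasons


# Assumed example dates, for illustration only.
print(review_reasons(date(2022, 1, 1), date(2023, 1, 1), major_accident=True))
```

An empty list means that none of the modelled triggers applies; it does not by itself show that the control measures remain valid or effective.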
Investigating and analysing control measures
Throughout the above steps, the employer will be reflecting upon existing or potential new control measures in the determination of causes, likelihood, consequence and risk. It is essential to be explicit about what control measures are being included and how they are considered to affect risk levels. The investigation, adoption and rejection of control measures are discussed later in this guidance material.
6.11 SMS - A suggested combination of key elements
This section discusses the key SMS elements for the range of control measures at an MHF.⁴ This booklet provides guidance on the nature of control measures, selecting facility-specific measures and ensuring the effectiveness of the control measures. Refer to Booklet 3 for further information on the SMS. The actual configuration of control measures used must suit the facility. This suggested combination of key SMS elements for control measures is not intended as a template, but as an indication of the type of elements and features that should be in place at the facility (refer to Figure 19).
Figure 19: Suggested combination of key control measure SMS elements
⁴ This combination of key elements is mainly derived from the guidelines for process safety management issued by the US Department of Labor’s Occupational Safety and Health Administration (OSHA); see the reference in Appendix A under the topic heading “Role and development of an SMS”.