Before we begin
A word to the wise
There are circumstances, for example in complex systems engineering, where there are extensive sources of data on mean time between failures and so on which can support detailed and scientific assessment of risk. The purpose of this paper is to help people new to risk assessment and management to approach the discipline for a business, a charity, a project or other endeavour where such data are neither available nor particularly relevant.
I have been using this approach for more than 20 years for business, projects and programmes in life sciences, utilities, banks, building societies, insurance companies, charities, competitions, retailers, manufacturers, software companies, consultancies and IT service providers. I make no claims for uniqueness of the approach or even originality of the concept, but I know it works and this is how I explain it when training others in performing risk assessments.
This approach requires no aptitude for mathematics or science, but a little imagination and a desire to succeed in whatever terms you define success.
What is a risk?
Before we get into trouble by proposing our own definition, here are a few definitions of the noun (the word is also used as a countable noun and has multiple meanings as a verb) provided by authoritative sources:
- Cambridge English Dictionary: the possibility of something bad happening;
- Collins English Dictionary: If there is a risk of something unpleasant, there is a possibility that it will happen;
- The Free Dictionary: The possibility of suffering harm or loss; danger;
- British Standards Institution (ISO/IEC 27001): the combination of the risk of exposure and the impact = combination of (likelihood of the threat being able to expose an element(s) of the system) and impact;
- Oxford English Dictionary: The possibility that something unpleasant or unwelcome will happen;
- Managing Successful Programmes (UK Govt): An uncertain event or set of events which, should it occur, will have an effect on the achievement of objectives; a risk is measured by a combination of the probability of a perceived threat or opportunity occurring and the magnitude of its impact on objectives.
As they say on the BBC, other definitions are available. These definitions have a common theme in that they designate a risk as the possibility of a negative event occurring. Which is fine as far as it goes, but if we are to use the techniques associated with risk assessment and management to drive positive outcomes for our organisations or communities, would it not be a good idea to consider a wider definition? So taking inspiration from PRINCE2 (yes, I am that old), I propose the following definitions as the foundation of what follows:
- Risk: an event, which might occur in the future, that could require us to respond by changing what we were planning to do;
- Issue: a current event or situation which requires us to consider responding by changing what we are planning to do.
Using these definitions, we can consider both risks (future) and issues (current) as having either negative or positive impact on our plans.
Why bother when the future is unknowable?
Whilst the details of future events are impossible to predict, there are types of possible events which can be predicted in conceptual terms even if the details are somewhat opaque. By identifying such types of event, you can make a reasonable stab at working out what you might need to do differently should the risk become reality.
Changing plans in response to events is only sensible and reasonable. Spending a little time on a regular basis assessing what might happen (risks) and what is happening (issues) and considering whether we should change our plans as a result can do no harm. We have all heard sceptics suggesting that such activity is introspective and generally a waste of time when “we all know what we are doing”, haven’t we? Such arrogance is one of the key factors in the litany of project and business failures caused by events which external observers might, were they being unkind, consider entirely predictable.
Take the impact of ransomware as an example. I read recently of a successful business, which had steadily grown over more than 20 years, killed almost overnight by a ransomware attack. Why? The business had no reliable backups of its computer systems and thus no ability to rebuild. Is it harsh to consider this to be unacceptably negligent in the face of an all-too-common and well publicised form of extortion?
So let’s get real. There are lots of factors within and around our organisations, projects and communities which can impact us and our ability to succeed. Some we can affect, some we cannot, but it is frankly absurd to think we can succeed by ignoring them.
Oh yes, there is also the small matter that in most jurisdictions and for many types of organisation, there is a requirement (whether in law, regulatory requirements or applicable standards) to maintain a corporate risk register and to have effective risk-based controls in place to manage the most important risks facing the organisation.
Where to begin?
In summary, the process is cyclical and is based on both the common Plan-Do-Check-Act cycle and the less well-known but equally useful OODA Loop (Observe, Orientate, Decide, Act) developed by military strategist John Boyd.
The starting point is to define the scope of the assessment, which may be a single project or a whole organisation, and the relative criticality of each component part to the success of the whole. Thereafter begins a cycle of identifying risks (first pass) or reviewing the risks previously identified (all subsequent loops), assessing for each risk the likelihood of it occurring, the impact it might have and how visible its occurrence would be. These factors help us determine what action we need to plan in order to minimise negative impacts and maximise positive impacts. We then incorporate those action plans into the overall activity plan for the business, project or whatever and then, at regular intervals or as suggested by important milestones in our overall plan, we review the impact our actions have had on the risks we previously identified and the cycle starts again.
Agree scope & relative criticality
The first thing to make absolutely clear is that, at all times and at all costs, it is vital to avoid over-thinking… excessive analysis will cause only paralysis and momentum is a vital component of any successful endeavour. So this first step, which forms the foundation of everything that follows, should not require more than getting the right people together, preferably in person but remotely if circumstances dictate, for a 2-3 hour workshop. The goal of this workshop is to identify the activities, processes, functions, systems or other components of the overall undertaking (be it a business, project or whatever) that will form the scope of the assessment. Anything not explicitly identified as being in scope is, by implication, out of scope and, if you get this right, can safely be ignored when it comes to identifying risks.
Always use language which is familiar to the participants in both the assessment and the overall undertaking – avoid jargon, avoid pseudo-science and do not fall into the trap of seeking perfection. Try to keep the number of major components low, as the longer the list becomes, the more time consuming the process will be and momentum will be reduced. If your scope is very broad or large, break it down into manageable chunks that make sense to you.
The components should provide a good idea of how the business is structured and the participants in the risk assessment should be able to agree which criticality designation should apply to each component using the following definitions:
- CRITICAL: vital to day-to-day operations
- STRATEGIC: vital to implementation of long-term strategy
- TACTICAL: necessary to short- to medium-term performance
- OPERATIONAL: important to short-term performance
So, let’s take an example of a risk assessment for a Software-as-a-Service business. The major components could be described and designated as:
- Management – STRATEGIC
- Finance – CRITICAL
- Marketing – STRATEGIC
- Sales – TACTICAL
- Software development, maintenance and operations – CRITICAL
- Customer Service – OPERATIONAL
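For those who like to see such things written down precisely, the output of the scoping workshop can be captured in a very simple structure. The following is a minimal sketch in Python, not a prescribed tool: the names `Criticality`, `SCOPE`, `in_scope` and `components_with` are my own illustrative choices, and the components are simply those of the example above.

```python
from enum import Enum


class Criticality(Enum):
    """The four criticality designations, with their definitions."""
    CRITICAL = "vital to day-to-day operations"
    STRATEGIC = "vital to implementation of long-term strategy"
    TACTICAL = "necessary to short- to medium-term performance"
    OPERATIONAL = "important to short-term performance"


# Scope agreed in the workshop: component name -> designation.
# Anything not listed here is, by implication, out of scope.
SCOPE = {
    "Management": Criticality.STRATEGIC,
    "Finance": Criticality.CRITICAL,
    "Marketing": Criticality.STRATEGIC,
    "Sales": Criticality.TACTICAL,
    "Software development, maintenance and operations": Criticality.CRITICAL,
    "Customer Service": Criticality.OPERATIONAL,
}


def in_scope(component):
    """Return True only for components explicitly agreed as in scope."""
    return component in SCOPE


def components_with(level):
    """List the in-scope components carrying a given designation."""
    return [name for name, designation in SCOPE.items() if designation is level]


if __name__ == "__main__":
    for name, designation in SCOPE.items():
        print(f"{name}: {designation.name} ({designation.value})")
```

Keeping the scope as an explicit list makes the "not listed means not in scope" rule easy to apply when the risks are identified later.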
At the heart of any process for assessing risk must be a set of types of risk that can be easily understood by those conducting the assessment. Within each type there will be a wide variety of manifestations, which will most likely be different for each part of the business.
Hence, the types of risk that may be identified include:
- Changes to legislation or regulatory requirements;
- Changes in customer preferences;
- Changes to environmental conditions;
- Changes to societal norms;
- Failure of, or disruption to, a process or activity – this includes risks ranging in effect from the catastrophic failure of the entire process to minor disruption to a single step in the process;
- Failure of, or disruption to, a dependency – this includes risks ranging in effect from the collapse of a critical supplier of goods or services to the temporary failure of an information flow from another business process;
- Failure of, or disruption to, plant or equipment;
- Failure of, or disruption to, information technology or systems;
- Compromise of information security (confidentiality, integrity, availability);
- Project risks, which include risks associated with failing to deliver the solution as required at the point of delivery, risks associated with the solution itself and risks associated with its delivery.
In assessing the types of risk to which a physical or organisational component of the business may be subject, it is important to ensure that the assessment is well informed and based on verifiable evidence. Where possible and appropriate, the views of acknowledged experts should be called upon to ensure that the assessment of the nature and likelihood of a particular risk is as realistic as possible.
At this stage it is only necessary to record summary details for each risk. These details should include a name, which should convey something of the nature of the risk, and a one or two sentence description of the nature of the risk.
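As an illustration of how little needs to be recorded at this stage, the summary details could be held in a record like the one sketched below. This is an assumption about one convenient layout rather than a required format: the class name `RiskSummary`, the field names and the shortened type labels in `RISK_TYPES` are all hypothetical, and the example entry borrows the ransomware case mentioned earlier.

```python
from dataclasses import dataclass

# Shortened labels for the types of risk listed above (illustrative wording).
RISK_TYPES = (
    "Changes to legislation or regulatory requirements",
    "Changes in customer preferences",
    "Changes to environmental conditions",
    "Changes to societal norms",
    "Failure of, or disruption to, a process or activity",
    "Failure of, or disruption to, a dependency",
    "Failure of, or disruption to, plant or equipment",
    "Failure of, or disruption to, information technology or systems",
    "Compromise of information security",
    "Project risks",
)


@dataclass(frozen=True)
class RiskSummary:
    """Summary details for one identified risk (hypothetical layout)."""
    name: str         # should convey something of the nature of the risk
    risk_type: str    # one of RISK_TYPES
    component: str    # the in-scope component the risk relates to
    description: str  # one or two sentences on the nature of the risk

    def __post_init__(self):
        # Reject entries that would be useless when searching or reporting.
        if self.risk_type not in RISK_TYPES:
            raise ValueError(f"Unknown risk type: {self.risk_type!r}")
        if not self.name.strip() or not self.description.strip():
            raise ValueError("A risk needs both a name and a description")


example = RiskSummary(
    name="Ransomware attack without usable backups",
    risk_type="Compromise of information security",
    component="Software development, maintenance and operations",
    description=(
        "Systems are encrypted by an attacker and cannot be rebuilt "
        "because no reliable backups exist."
    ),
)
```

Likelihood, impact and visibility are deliberately left out here, since they are only assessed in the next step.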
Assess Likelihood of Occurrence and Business Impact
The probability of a risk occurring, its likelihood, should be described in simple terms.
In performing a risk assessment it is necessary to identify not only the immediate effects of the risk occurring but also the impact on the business of those effects. For example, the effect of a hard disk problem may be the corruption of some data stored on that disk, whilst the business impact of corrupt data relating to customer accounts could be significant cash flow problems and damage to the company’s reputation for excellence. In general, the assessment of each risk should consider the impact on (in alphabetical order, implying no relative importance):
- Financial performance of the company;
- Health and safety of employees and the public;
- Morale of employees;
- Productivity and process efficiency;
- Product Quality;
- Regulatory or Legislative Compliance;
- Reputation of the company with its customers, investors, staff and suppliers.
When assessing the impact of a risk it is important to ensure that the assessment is well informed and based upon verifiable evidence. Hence, expert opinion should be called upon where possible and appropriate to do so.
The impact of a risk occurring should also be described in simple terms, such as:
- Trivial – little discernible impact internally, no impact on external stakeholders, easily managed;
- (In)convenient – minor impact internally and on external stakeholders, can be managed with modest additional effort;
- Disruptive – causes disruption to the organisation and its stakeholders, which could be positive or negative and requires effort to be devoted to managing impact on operations and reputation;
- Harmful/Beneficial – significant impact on the organisation and external stakeholders, requires substantial effort from dedicated team to manage, using proven response plans and a trained team, including external expertise if required;
- Existential/Game-Changer – I hope this is self-explanatory! These risks need to be carefully planned for, the responses well rehearsed and the people involved withdrawn from routine business operations until the impact of the event has passed, which might require a significant period of time and careful, dedicated management of internal and external communications.
The combination of likelihood and impact for each risk can be viewed as a matrix that helps identify the level of severity the risk should be considered to have for the organisation.
This information provides the basis for a cost-benefit analysis, which will support decision-making on how each risk should be addressed, using the following guidelines:
- Risks falling in the red cells must be addressed with actions clearly defined in terms of responsibility, purpose and firm deadlines;
- Risks in the amber cells should probably be addressed and should definitely be monitored;
- Risks in the green cells can probably be accepted but that decision should be made explicitly by the right people.
In each case, properly informed decisions on what, if anything, should be done to address each risk should be made by the right people and recorded in an appropriate manner for future consultation and ease of maintenance. Generally speaking, if your list of risks is short, you could use a spreadsheet or table in a document, but my experience is that such mechanisms are not fit for purpose and are of limited use when it comes to searching and reporting. There are many specialist software tools available, not all of which are expensive or complex to use and if the subject of your risk assessment is important to you, your organisation or community, then a little investment up front can save a world of pain later on.
Before making any decisions on what action should be taken to address your risks, the final element of the assessment is to make a determination of how visible the risk would be should it occur. This is common practice in the development of medicines as adverse reactions can be hard to detect at all, and even harder to attribute to a single cause, so a great deal of effort is deployed to ensure that all possible side effects are known, studied and documented as part of the development process.
In cybersecurity, the same question needs to be asked because not all cybersecurity risks are necessarily obvious. For example, on average in 2020 it took over 190 days to learn that a malicious actor had compromised a private network. Ransomware is clearly obvious when it is triggered, because computers become unusable, but it can be sitting on your network for a long time before being triggered.
Hence, the decision on how best to address a risk should be informed by a realistic assessment of how visible the occurrence of the risk would be.
There is only value in implementing a risk response if the tangible and intangible benefits of doing so outweigh the tangible and intangible costs. In addition, the tangible and intangible costs of preparing the response and ultimately of deploying it must not outweigh the costs of taking no action. Since success in business involves a degree of risk-taking, there will be risks that the business is happy to accept in the expectation that doing so will result in improved profitability, market share or other tangible benefits.
The appropriate response to a risk takes a variety of forms and, in a well-managed undertaking, evolves over time as previous actions take effect and those effects become evident in terms of the likelihood, impact or visibility of the risk. Don’t forget the importance of maintaining momentum at this stage in the process, when fatigue and shortening attention spans are increasingly significant. So try to work through the risks in priority order and with a bit of pace. Short workshops are one of the best formats for this stage before assigning actions for detailed planning to appropriate individuals.
The kinds of response that should be considered could include:
- Acceptance – no specific action is required;
- Monitoring – where visibility is the major concern, putting specific monitoring tools, services or processes in place to increase visibility of both trigger events and the risks themselves;
- Communication – ensuring that relevant stakeholders are aware of the risk and understand their responsibilities should it occur;
- Mitigation – actions designed to reduce the likelihood or change the impact on the overall undertaking so that adverse impacts are reduced and positive impacts increased;
- Contingency or continuity planning – having defined processes for responding appropriately to the occurrence of the risk, operating the organisation whilst the impact of the risk continues, recovering from adverse impacts and returning the organisation to normal operations once the event has passed.
Clearly, these are not exclusive options and combinations may be required for a response to be properly effective.
Whatever response has been selected, explicit plans need to be drawn up to ensure that the right people do the right things at the right time to deliver the outcomes the organisation or project requires. As time passes and circumstances evolve, these plans will be subject to change like any plan, but that should never prevent us from producing a professional and robust plan of action.
Characteristics of a good risk response plan include:
- A clear and unambiguous description of the outcome that is required. The SMART structure can be very helpful in ensuring that the outcome is specific, measurable, agreed, realistic and timely;
- Identifying actors in terms of roles, responsibilities, skills and experience required;
- Describing each deliverable in terms of its purpose, structure, content, ownership of development, effort required, reviews/tests required and acceptance criteria that reviewers will use to determine its fitness for purpose. Dependencies on other deliverables, external events etc are also required as are clear start and completion dates;
- A statement of the pre-requisites for starting work on each deliverable.
Each risk response plan that is approved should then be incorporated into the overall plans for the undertaking so that management of risk is embedded and not considered as somehow disconnected from your true purpose. It is what you do to ensure that your true purpose is not diverted or disrupted unnecessarily.
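For readers who find a worked example helpful, the likelihood and impact matrix from the assessment step, and the advice to work through risks in priority order, can be sketched in a few lines of Python. Please treat this as a rough illustration built on assumptions of my own: the five likelihood labels, the idea of multiplying the two scores, the thresholds `amber_from` and `red_from` and the three example risks are all hypothetical. Only the impact labels come from the scale described earlier, and each organisation should agree its own matrix.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    # Hypothetical labels, assumed purely for illustration.
    RARE = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    ALMOST_CERTAIN = 5


class Impact(IntEnum):
    # Labels follow the impact scale used in the assessment step.
    TRIVIAL = 1
    INCONVENIENT = 2
    DISRUPTIVE = 3
    HARMFUL_OR_BENEFICIAL = 4
    EXISTENTIAL_OR_GAME_CHANGER = 5


def matrix_cell(likelihood, impact, amber_from=6, red_from=15):
    """Return the colour of the matrix cell for one risk.

    The score is likelihood multiplied by impact; the thresholds are
    assumptions and should be replaced by whatever the organisation agrees.
    """
    score = int(likelihood) * int(impact)
    if score >= red_from:
        return "red"
    if score >= amber_from:
        return "amber"
    return "green"


def priority_order(risks):
    """Sort (name, likelihood, impact) tuples, highest score first."""
    return sorted(risks, key=lambda r: int(r[1]) * int(r[2]), reverse=True)


# Made-up example risks.
risks = [
    ("Ransomware attack", Likelihood.LIKELY, Impact.EXISTENTIAL_OR_GAME_CHANGER),
    ("Key supplier delay", Likelihood.POSSIBLE, Impact.DISRUPTIVE),
    ("Office printer failure", Likelihood.LIKELY, Impact.TRIVIAL),
]

if __name__ == "__main__":
    for name, likelihood, impact in priority_order(risks):
        print(f"{name}: {matrix_cell(likelihood, impact)}")
```

A simple product like this will place a rare but existential risk in a green cell, which is one reason the decision to accept a green risk should still be made explicitly by the right people and not left to the arithmetic.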
One of the most critical factors in ensuring that contingency plans (aka continuity or disaster recovery plans) are effective is ensuring that the plans are both logically tested using desktop reviews and rehearsed on a regular basis using realistic simulations. This final point cannot be stressed too strongly. If you enjoy watching plays, comedy shows (except improvised shows of course), TV dramas or movies, one of the things you cannot see which is vital to the success of the production is that every aspect is rehearsed and tested beforehand to ensure it works. You would expect nothing less. Actors fluffing their lines is both embarrassing to them and infuriating to the audience, so they invest time and energy in reducing the risk of that happening. Contingency plans require the same effort in order to ensure that those involved in implementing them know their roles and everybody else’s, know what to do, when to do it and how to do it without having to look it up because they have rehearsed to the point of it being second nature.
It really does make the difference between success and failure.