Ever tossed a fair coin? I’d bet you have! At least in your childhood days, while deciding which team would bat first in a baseball or cricket match, or who would serve first in a badminton or tennis game. Every kid agrees to it because it’s unbiased. When you toss a fair coin, the chance of getting a head is 1/2 (0.5) or 50%. This is the ratio of the number of favorable outcomes (a head) to the number of all possible outcomes (a head or a tail).
A coin toss is perhaps the simplest introduction to probability, which expresses the chance or likelihood that a random variable takes a particular value. The random variable here is getting a head (that is, the number of heads obtained). Let’s note the random variable as X and the probability as P(X). Mathematically put:
Probability (Getting head) or P (X)
= Favorable number of outcomes of the event / Total number of possible outcomes
= 1 / 2 = 0.5 = 50%
Now, what’s the chance of getting a head when you toss a pair of coins together? In this case, the total number of possible outcomes is four: tail (first coin) and tail (second coin), tail and head, head and tail, and finally head and head. The random variable can now take more than one value, as shown in the table below:
As you can see, the possible values of X range from 0 to 2 (i.e., 0 heads, 1 head, or 2 heads). This leads to the concept of distribution. Looking at the above table, we can see that these values occur with frequencies of 1, 2, and 1, respectively. This is a distribution, or frequency distribution. One could say: a distribution is the set of possible values a random variable can take and how frequently these values occur.
Now, if I add probabilities to this random variable’s values, we get a probability distribution. This is depicted below:
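The counting described above can be reproduced in a few lines of Python. This is only a minimal sketch; the names `outcomes` and `frequency` are illustrative and do not come from any tool mentioned in this article.

```python
from collections import Counter
from itertools import product

# Every possible outcome of tossing two coins: (first coin, second coin).
outcomes = list(product("HT", repeat=2))

# X = number of heads in each outcome; count how often each value occurs.
frequency = Counter(outcome.count("H") for outcome in outcomes)

for heads in sorted(frequency):
    print(f"X = {heads} head(s): occurs {frequency[heads]} time(s)")
```

The four outcomes collapse into three values of X, with one, two, and one occurrence(s) respectively.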
As shown in the above table:
- We have all favorable outcomes for 0, 1, or 2 heads. These are represented in the first and second columns.
- All possible outcomes are obviously four, and that’s shown in the third column.
- P(X) is shown in the final column, and the probabilities are 25%, 50%, and 25% for 0, 1, and 2 heads, respectively. This is the probability distribution. Summed up, the probabilities equal one.
Hence, we can say that the probability distribution for a random variable describes how probabilities are distributed over the values of the random variable.
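To see the step from frequencies to probabilities in code, here is a small sketch that divides each frequency by the total number of outcomes. The function name `heads_distribution` is my own choice for illustration, and exact fractions are used so that the sum comes out as exactly one.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(num_coins):
    """Return {number of heads: probability} for fair coins tossed together."""
    outcomes = list(product("HT", repeat=num_coins))
    counts = Counter(outcome.count("H") for outcome in outcomes)
    total = len(outcomes)
    # Probability = favorable outcomes / all possible outcomes.
    return {heads: Fraction(count, total) for heads, count in sorted(counts.items())}


two_coins = heads_distribution(2)
for heads, probability in two_coins.items():
    print(f"P(X = {heads}) = {probability} = {float(probability):.0%}")

# The probabilities of a distribution always sum to one.
print("Sum of probabilities:", sum(two_coins.values()))
```

The same function accepts any number of coins, so it can be reused to check other coin-toss exercises.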
Discrete and Continuous Distribution
A distribution can be discrete or continuous. Discrete means the random variable takes separate, countable values, such as whole numbers (1 head or 2 heads). You don’t say that you will get 0.33 head! Considering another example, if you count the number of children in the households of a locality, you will come up with results such as 0 children, 1 child, 2 children, etc. You won’t get 0.57 child!
You may be laughing now – what’s 0.33 head or 0.57 child!? Good to see you smiling. Smiling lessens stress and helps in understanding.
Not all random variables, however, are discrete. For example, let’s say you are determining the distribution of age, weight, or height of people in a locality. Considering height, it can be anything: 5 feet, 5.5 feet, 5.85 feet, 6.1 feet, and so on. In such a case, the distribution is continuous. So, this distinction is important: at a high level, there are two types of random variables, discrete and continuous, with two respective types of probability distributions: discrete probability distributions and continuous probability distributions.
But how does all of this fit into Risk Management? Risk Managers don’t toss coins or count heads and tails in an experiment. Those are kids’ games and not for grown-up men or women! Perhaps, although child’s play teaches the basics neatly.
Probability Distribution and Risk Management
With the above basics, let’s consider another example to understand probability distribution from the perspective of risk management. You are going to a friend’s house. It may take you one hour to reach your destination if you encounter no obstacles. If there is heavy traffic, it’s possible that you may not get there for three hours. With less traffic, it’s more likely to take 2 hours. Hence, you can say there are three possibilities:
- Minimum (or Optimistic) travel duration = 1 hour
- Most likely travel duration = 2 hours
- Maximum (or Pessimistic) travel duration = 3 hours
In this case, the random variable (X) is the “travel duration.” Can you conclusively say which one of the estimates is correct? Unlikely, because other factors such as traffic conditions are involved. Now, if I add chances to these numbers, then we get a probability distribution. I’ve prepared the below video to explain in more detail [Duration: 05m:33s]. For a better audio-visual experience, you may want to go full HD and plug in your earphones.
Importance of Probability Distribution
In project risk management, the concept of probability distribution is applied to estimation. Continuing with our previous example, when we estimate, we take the most likely outcome of two hours, which is not correct because we’ve forgotten to consider other possibilities.
We can (and should!) consider possible scenarios, not just the most likely one. In other words, instead of saying an activity in a project is going to take “X” number of days, we can also consider other possible durations using a distribution. For each duration in the distribution, there is an associated probability.
This can be done for all the activities or tasks of the project, which in turn impacts the project schedule and cost. This enables us to build a more realistic plan.
Now that we have understood the basics of probability, distribution, and probability distribution, let’s look at the various types used in risk management.
Triangular Distribution
Triangular distribution is the most common type of distribution used. It is named triangular because of the shape of the curve, and it is typically used when there is no pre-existing data, only expert opinion or judgment.
Symmetrical Triangular Distribution
The below distribution is triangular and symmetrical.
By looking at the graph above, we can say: “There is approximately a 30% chance of the duration being 6 days, a full chance of the duration being 8 days, and also a 30% chance of the duration being 10 days.”
Asymmetrical Triangular Distribution
Do note that the triangle shown need not be symmetric. Asymmetrical diagrams are shown below:
From here, you can calculate the durations with respective chances or probabilities.
Let’s take another example of a project, one with a task of Product Requirement Documentation (PRD) Preparation with an estimated duration of 5 days. This is the most likely estimate, but we do not yet have the minimum and maximum values.
By using the Primavera Risk Analysis (PRA) software tool, the triangular distribution is depicted as below:
The durations can be 4, 5, or 6 days (shown on the X-axis). The minimum, most likely, and maximum values are entered when you perform a duration risk analysis. This is demonstrated in a video in the later part of this article.
While building the schedule model, this triangular distribution can be noted as Triangle (4, 5, 6) or Triangle (4; 5; 6).
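Outside a dedicated tool, Triangle (4, 5, 6) can be explored with Python’s standard library. The snippet below is a minimal sketch and not what Primavera Risk Analysis does internally; the seed and the sample size of 10,000 are arbitrary choices of mine.

```python
import random
import statistics

# Triangle (4, 5, 6): minimum, most likely, and maximum duration in days.
minimum, most_likely, maximum = 4, 5, 6

rng = random.Random(42)  # fixed, arbitrary seed so the run is repeatable

# Note the argument order of random.triangular: low, high, mode.
samples = [rng.triangular(minimum, maximum, most_likely) for _ in range(10_000)]

simulated_mean = statistics.mean(samples)
theoretical_mean = (minimum + most_likely + maximum) / 3

print(f"Simulated mean duration: {simulated_mean:.2f} days")
print(f"Theoretical mean:        {theoretical_mean:.2f} days")
print(f"Shortest / longest draw: {min(samples):.2f} / {max(samples):.2f} days")
```

Every sampled duration stays between the minimum and the maximum, and the draws cluster around the most likely value.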
Uniform Distribution
In a uniform (or rectangular) distribution, you use a minimum value and a maximum value, but no most likely value. The below example shows a uniform distribution.
Looking at it, we might say: The task has a minimum duration of 4 days, but a maximum duration of 12 days.
You can use uniform probability distributions when you can specify the extremes of uncertainty for the activity under consideration and when the intermediate values have equal chances of occurring. They are also suitable when you cannot draw any inference about the possible shape of the distribution.
Taking our previous example of the PRD Preparation task, which is estimated to be 5 days, using PRA, we have the following values for Uniform distribution:
As with the triangular distribution, while building the schedule model, this distribution can be noted as Uniform (4, 6).
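As a rough illustration of “equal chances” for intermediate values, the following sketch samples Uniform (4, 6) and checks how the draws split around the midpoint. The seed, sample size, and variable names are my own assumptions for the example.

```python
import random
import statistics

# Uniform (4, 6): only a minimum and a maximum duration, in days.
minimum, maximum = 4, 6

rng = random.Random(7)  # fixed, arbitrary seed so the run is repeatable
samples = [rng.uniform(minimum, maximum) for _ in range(10_000)]

# With equal chances everywhere, about half of the draws fall below the midpoint.
midpoint = (minimum + maximum) / 2
share_below_midpoint = sum(s < midpoint for s in samples) / len(samples)

print(f"Simulated mean duration: {statistics.mean(samples):.2f} days")
print(f"Share of draws below {midpoint} days: {share_below_midpoint:.1%}")
```

Unlike the triangular case, no value inside the range is favored over another.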
Beta Distribution
Beta distribution, like triangular distribution, also has three possible values: worst case, most likely, and best case. Like the triangular model, it also gives more weight to the most likely case. We have seen one example of Beta distribution in the earlier video.
Unlike the triangular distribution, the beta distribution has a smoother shape, and its tails taper off less quickly. A sample beta distribution curve is shown below:
Beta distribution can also be symmetric or asymmetric in shape. It is noted as, for example, Beta (6, 8, 10). As you can see above, there can be many values close to the most likely value, and the curve slowly tapers off towards the minimum and maximum ends.
Using the PRA software tool, for our task, PRD Preparation, a Beta (or BetaPert) distribution will come out as below:
Along with the triangular distribution, beta distribution is another frequently used probability distribution.
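For readers who want to experiment, Beta (6, 8, 10) can be approximated with the commonly used PERT parameterization of the beta distribution. This is a sketch under an assumption: the shape factor of 4 is the usual textbook choice, and the exact formula a given tool such as PRA applies may differ. The helper name `sample_pert` is hypothetical.

```python
import random
import statistics


def sample_pert(rng, minimum, most_likely, maximum, shape=4.0):
    """Draw one value from a PERT-style beta distribution.

    Assumption: the common PERT parameterization with shape factor 4;
    a particular risk tool may use a different formula.
    """
    span = maximum - minimum
    alpha = 1 + shape * (most_likely - minimum) / span
    beta = 1 + shape * (maximum - most_likely) / span
    # betavariate returns a value in [0, 1]; rescale it to [minimum, maximum].
    return minimum + rng.betavariate(alpha, beta) * span


rng = random.Random(2024)  # fixed, arbitrary seed so the run is repeatable
pert_samples = [sample_pert(rng, 6, 8, 10) for _ in range(10_000)]
triangular_samples = [rng.triangular(6, 10, 8) for _ in range(10_000)]

print(f"PERT mean:        {statistics.mean(pert_samples):.2f} days")
print(f"PERT spread (SD): {statistics.stdev(pert_samples):.2f} days")
print(f"Triangle (6, 8, 10) spread (SD): {statistics.stdev(triangular_samples):.2f} days")
```

Under this parameterization the spread comes out somewhat smaller than for a triangle with the same three points, which matches the idea that more values sit close to the most likely one.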
Normal Distribution
Normal distribution is defined by the mean of the planned (or remaining) duration of an activity and the standard deviation (SD) of that duration.
This distribution is used if there is historical information available. Normal distribution also has a bell-shaped curve like Beta, but considers SD to calculate the worst (and best) case scenarios.
For our example (task of PRD Preparation with a duration estimate of 5 days), we note the normal distribution as Normal (5,1), where 5 is the mean and 1 is the SD.
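Normal (5, 1) can be inspected directly with `statistics.NormalDist` from Python’s standard library. The sketch below is illustrative only; the choice of a 90% confidence level is my own and not taken from the example.

```python
from statistics import NormalDist

# Normal (5, 1): mean of 5 days and standard deviation (SD) of 1 day.
duration = NormalDist(mu=5, sigma=1)

# Chance that the task takes between 4 and 6 days (within one SD of the mean).
within_one_sd = duration.cdf(6) - duration.cdf(4)

# Duration that is not exceeded in 90% of cases.
p90 = duration.inv_cdf(0.90)

print(f"P(4 <= duration <= 6) = {within_one_sd:.1%}")
print(f"90% of outcomes finish within {p90:.2f} days")
```

Roughly 68% of the outcomes fall within one SD of the mean, which is how the SD translates into best and worst case scenarios.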
Discrete Distribution
In a discrete distribution model, the duration of an activity under consideration can have a number of integer values, but without any intermediate values. In other words, the distribution is discrete, rather than continuous as in a triangular, beta, or uniform distribution.
In the above sample, the activity has discrete distribution of values 6, 10, 18, and 20.
Considering our task of PRD Preparation, the discrete distribution will be seen as below with the PRA tool. The values are 2, 3, 4, and 5 with weighting factors of 10, 20, 30, and 50, respectively. This can be noted as Discrete ({2, 3, 4, 5}, {10, 20, 30, 50}).
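The notation Discrete ({2, 3, 4, 5}, {10, 20, 30, 50}) maps naturally onto weighted random choice. Note that the weighting factors as given add up to 110, so the sketch below treats them as relative weights and normalizes them into probabilities; whether a particular tool does the same is an assumption on my part.

```python
import random
from collections import Counter

values = [2, 3, 4, 5]        # possible durations in days
weights = [10, 20, 30, 50]   # weighting factors from the example

# Turn relative weights into probabilities that sum to one.
total_weight = sum(weights)
probabilities = [weight / total_weight for weight in weights]
expected_duration = sum(v * p for v, p in zip(values, probabilities))

rng = random.Random(11)  # fixed, arbitrary seed so the run is repeatable
draws = rng.choices(values, weights=weights, k=10_000)
observed = Counter(draws)

for value, probability in zip(values, probabilities):
    print(f"{value} days: P = {probability:.3f}, observed share = {observed[value] / len(draws):.3f}")
print(f"Expected duration: {expected_duration:.2f} days")
```

Only the four listed durations ever appear in the draws; no intermediate value such as 3.5 days is possible.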
Practical Example and Demonstration
With this understanding, let’s take a practical look using MS Project and Primavera Risk Analysis. The video [Duration: 05m:42s] demonstrates a project plan with fixed activity estimates. It’s next imported to the PRA tool and analyzed with various probability distributions for the activities of the project.
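The idea behind this kind of analysis can be approximated, in a very simplified form, with a small Monte Carlo simulation. The sketch below is not how MS Project or Primavera Risk Analysis is implemented. It assumes three activities performed strictly one after another with triangular distributions; apart from PRD Preparation with Triangle (4, 5, 6), the activity names and estimates are hypothetical.

```python
import random
import statistics

# Hypothetical serial activities: (name, minimum, most likely, maximum) in days.
activities = [
    ("PRD Preparation", 4, 5, 6),
    ("Design", 6, 8, 10),   # hypothetical activity
    ("Build", 8, 10, 15),   # hypothetical activity
]


def simulate_total_duration(rng):
    """One simulated project run: sample every activity and add the durations."""
    return sum(rng.triangular(low, high, mode) for _, low, mode, high in activities)


rng = random.Random(123)  # fixed, arbitrary seed so the run is repeatable
totals = sorted(simulate_total_duration(rng) for _ in range(10_000))

# Plan built from fixed estimates: the sum of the most likely durations.
fixed_estimate_total = sum(mode for _, _, mode, _ in activities)

# Share of simulated runs that finish within the fixed-estimate plan.
chance_on_plan = sum(t <= fixed_estimate_total for t in totals) / len(totals)

# Duration that 80% of the simulated runs do not exceed.
p80 = totals[int(0.80 * len(totals)) - 1]

print(f"Plan with fixed estimates: {fixed_estimate_total} days")
print(f"Simulated mean duration:   {statistics.mean(totals):.1f} days")
print(f"Chance of meeting the plan: {chance_on_plan:.0%}")
print(f"80% confidence duration:    {p80:.1f} days")
```

Because one of the hypothetical activities has more room to slip than to finish early, the simulated mean lands above the plan built from most likely values alone, which is the motivation for not relying on single-point estimates.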
Conclusion
Probability distribution is very important when you use quantitative risk analysis, which involves mathematical modeling and sampling. Managers or planners can also deploy advanced probability distributions such as lognormal distributions, cumulative distributions, and general distributions, among others. The above video explains a few of these.
We have come a long way and seen a number of examples. I propose just one more exercise. I promise it won’t be difficult, provided you have read the content sincerely. Going back to our first examples of coin tosses, can you answer these:
- What’s the probability distribution of getting a head when you toss three coins?
- What are the values that the random variable can take?
If you get four values for the random variable of getting a head, and all the probabilities in your distribution sum to one, then you have understood the concept well.
I welcome your thoughts, feedback, and suggestions in the comment section below.
*This article is dedicated to the memory of my father, the late Harendra Nath Dash, who passed away three years ago on June 11, 2019. He first introduced me to the concept of probability and statistics. It was mesmerizing then, and I still remember it. I wish this article to be a tribute to him and his teachings.