41 – How to Calculate Conditional Probability in Google Sheets

How often do people respond to your emails? How often do they open it? How often do they delete your emails before even opening it? Something that a marketing analyst aims to do is optimise their email campaigns. Something that would be beneficial for them is knowing the probability of a customer actually engaging with them. Even knowing the likelihood of this customer opening an email based on their past engagement would be really helpful. Conditional Probability is what can assist this analyst. By calculating the probability of an email being opened given that the customer has clicked on previous emails, this analyst can tailor their strategy.

After completing this post about Conditional Probability, it will be possible to leverage the maths behind data-driven decisions, boost open rates, and create personalised campaigns that resonate with an audience.

Definition

Firstly, what is Conditional Probability? It’s the likelihood of an event or outcome occurring, based on the occurrence of a previous event or outcome. It is calculated by multiplying the probability of the preceding event by the updated probability of the succeeding, or conditional, event.

Independent Events

Events can be independent, which means that each event, in a series of events, is not affected by any other events. A prime example of this is tossing a coin. Every time that you toss a coin, each toss is completely isolated from the previous toss. The side it landed on in the previous toss will not affect the next coin toss you do.

Let’s have a look at the Google Spreadsheet:

https://docs.google.com/spreadsheets/d/12HzdwkqLbfFiwKOtYpnBTnOVRZdp3aDruJAwEhso8eE/edit#gid=0

Example 1

So, what does this mean from a probability standpoint? We’re going to assume that a coin can never land on its side – even though I’ve been witness to the contrary when I was 11, but I was by myself so it doesn’t technically count *rolls eyes hard at my brothers*. What’s the probability that it lands on a Heads (H)? What about a Tails (T)? 50-50/50%/one in two.

Dependent Events

As a counter to independent events, we have dependent events. These events mean that they have been impacted by the events that precede it.

Example 1

An example of a dependent event is the probability outcome of a final goal that is changed based on changing variables through the series of events. One such example of this is drawing marbles from a bag.

In a bag, we know that we have 2 blue marbles and 3 red marbles. What are the chances of getting a blue marble? Let’s use some maths to figure it out.

Total number of marbles: 2 blue plus 3 red equals 5 marbles.

Number of blue marbles: 2

Probability of Blue: count of blue marbles / total number of marbles = 2 / 5 = 40%.

Example 2

Let’s assume that we’re not placing those marbles back in after we draw them. What happens to the probability of drawing a blue now? First, let’s look at it from a worded perspective.

New Total number of marbles: 1 blue plus 3 red equals 4 marbles.

New Number of blue marbles: 1

New Probability of Blue: count of blue marbles / total number of marbles = 1 / 4 = 25%.

Can you see how the events are dependent on one another and the probability of future events is now changing? Do you know what it would be called if we left the marbles in the bag for each draw? If we kept the marble in the bag after each draw then this would become an independent probability problem.

Let’s try looking at it in a visual form. This is called a Tree Diagram:

This is the original bag that we started with; 2 blue marbles and 3 red marbles.

What was the probability of drawing a blue marble? 2/5. What about red? 3/5. So, let’s set up the potential future outcomes. You can see that we’re starting with that one bag, containing 5 marbles, then transition to probability scenario 1 (blue track 40%) or probability scenario 2 (red track, 60%):

Let’s now move it to the next two and see what the next (dependent) probability draw is worth:

Can you work out the probability of each scenario? By this, I mean each bag of marbles on the very right-hand side of the above image. Since each stage has its own probability, we need to multiply each stage by the preceding stage. Here are the corresponding probability:

What’s the probability of drawing two blue marbles?

Stage 1 (2/5 = 40%) * Stage 2 (1/4 = 25%) = 10%

What’s the probability of drawing a blue marble then a red marble?

Stage 1 (2/5 = 40%) * Stage 2 (3/4 = 25%) = 30%

What’s the probability of drawing a red marble then a blue marble?

Stage 1 (3/5 = 60%) * Stage 2 (2/4 = 50%) = 30%

What’s the probability of drawing two red marbles?

Stage 1 (3/5 = 60%) * Stage 2 (2/4 = 50%) = 30%

Instead of having to wrap your brain around numbers in the future, you could use this tree diagram to work out the probability of this exact event that could happen to you in your life. Could, not will. There’s a probability and it’s probably near zero, but not zero.

If you don’t actually have the paper to write down a big tree like this though, there’s a theorem that you can use to calculate these types of events.

Bayes’ Theorem

Bayes’ Theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event.

Source: https://byjus.com/maths/conditional-probability/

Before we use this theorem to solve the above marble problem, I want to figure out some of the notation that we’re going to use (from the image above).

  • P(A) means “Probability Of Event A”

From the example above, Event A will be ‘draw a blue marble first’. So, P(A) = 2/5  = 40%.

  • P(B) means “Probability Of Event B”

This means that Event B will be ‘draw a blue marble second”. But we actually have two choices here based on the staging:

  • If we got a blue marble first, the probability is now 1/4 (25%), and
  • If we got a bed marble first, the probability is now 2/4 (50%)

This means we’re going to introduce one more symbol. This pipe symbol, ‘|’, means ‘given’.

  • P(B|A) means “Event B given Event A”

To put this in another way: since Event A has already happened, what is now the probability of B occurring?

P(B|A) is also called the “Conditional Probability” of B given A.

And in the case above:

P(B|A) = 1/4

So the probability of getting 2 blue marbles is:

2/5 * 1/4 = 1/10 (10%).

This is how it is written:

P( A and B ) = P( A ) * P ( B | A )

It is also written as:

P( A | B ) = P( A and B ) / P( B ) = [P( A ) * P ( B | A )] / P( B )

Using English words:

“Probability of event A and event B equals the probability of event A times the probability of event B given event A”

Example: Drawing 2 Aces from a Deck

Event A is drawing an Ace first, and Event B is drawing an Ace second.

For the first card the chance of drawing an Ace is 4 out of 52 (there are 4 Aces in a deck of 52 cards):

P(A) = 4/52

But after removing an Ace from the deck the probability of the 2nd card drawn is less likely to be an Ace (only 3 of the 51 cards left are 3 Aces):

P(B|A) = 3/51

And so:

P(A and B) = P(A) x P(B|A) =(4/52)x (3/51) = 12/2652 = 1/221

So the chance of getting 2 Aces is 1 in 221, or about 0.5%

Marketing Example

Within Marketing, there are various customer interactions that analysts track:

  • Clickthrough Rate: (Total clicks OR unique clicks ÷ Number of delivered emails) * 100
  • Open Rate: the percentage of email recipients who open a given email.
  • Conversion Rate: (Number of people who completed the desired action ÷ Number of total emails delivered) * 100
  • Bounce Rate: (Number of bounced emails ÷ Total number of emails sent) * 100
  • List Growth Rate: ([(Number of new subscribers) minus (Number of unsubscribes + email/spam complaints)] ÷ Total number of email addresses on your list]) * 100
  • Email Sharing/Forwarding Rate: (Number of clicks on a share and/or forward button ÷ Number of total delivered emails) * 100
  • Overall ROI: [($ in additional sales made minus $ invested in the campaign) ÷ $ invested in the campaign] * 100
  • Unsubscribe Rate: the percentage of email recipients unsubscribe from your send list after opening a given email.

Let’s do an analysis of the conversation rate by linking back to the original intent of this post – calculate the conditional probability of a customer responding positively to an email campaign. Specifically, let’s calculate the probability of an email being opened given that the customer has clicked on previous emails, this analyst can tailor their strategy. First transform our problem into maths:

P( A | B ) = [P( B | A ) * P( A )] / P( B )

P(email open | customer clicked a previous email) = [P(customer clicked a previous email | email open) * P(email opened)]/P(customer clicked a previous email)

x = (1.49% * 15.36%) / 9.06%

0.0253 or 2.53%

This figure is around the the industry standard, but there are some noted ways to fix this conversion rate:

  1. Understand Your Audience:
    1. Develop detailed buyer personas to understand your target audience’s needs, preferences, and pain points.
    2. Tailor your messaging and offers to address the specific needs of your audience.
  2. Clear Value Proposition:
    1. Clearly communicate the value of your product or service. Visitors should immediately understand what makes your offering unique and beneficial to them.
  3. Optimise Landing Pages:
    1. Ensure that your landing pages are well-designed, easy to navigate, and have a clear call-to-action (CTA).
    2. Test different variations of your landing pages to see which elements (headlines, images, CTAs) perform best.
  4. Compelling CTAs:
    1. Create persuasive and action-oriented calls-to-action (CTAs) that guide visitors toward the desired action.
    2. Use compelling language that emphasises the benefits of taking the desired action.
  5. Mobile Optimization:
    1. Ensure that your website and landing pages are optimised for mobile users. A growing number of users access websites on mobile devices, so a seamless mobile experience is crucial.
  6. A/B Testing:
    1. Test different elements of your marketing campaigns, such as headlines, images, CTAs, and colours, through A/B testing.
    2. Analyse the data to identify which variations perform better and implement the winning elements.
  7. Improve Page Load Speed:
    1. A slow website can lead to high bounce rates. Optimise your website’s speed to provide a better user experience.
  8. Build Trust:
    1. Use customer testimonials, reviews, and trust badges to build credibility.
    2. Clearly display privacy policies and security measures to reassure visitors.
  9. Personalisation:
    1. Implement personalization strategies to deliver tailored content based on user behaviour, preferences, and demographics.
  10. Optimise Forms:
    1. Simplify and streamline your forms to reduce friction. Only ask for essential information.
    2. Use a progress bar for longer forms to show users how much is left.
  11. Use Urgency and Scarcity:
    1. Create a sense of urgency or scarcity in your offers to encourage quicker decision-making.
    2. Limited-time promotions and exclusive deals can motivate potential customers to take action.
  12. Analytics and Measurement:
    1. Use analytics tools to track and measure your conversion rates.
    2. Analyse user behaviour on your website to identify areas for improvement.
  13. Email Marketing:
    1. Implement targeted email campaigns to nurture leads and guide them through the conversion funnel.
    2. Use personalised and relevant content in your email communication.
  14. Social Proof:
    1. Display social proof, such as customer testimonials, case studies, and user-generated content, to showcase the positive experiences of others.
  15. Continuous Optimisation:
    1. Marketing is an ongoing process. Regularly analyse data, gather feedback, and make continuous improvements to your strategies.

Homework

Exercise 1:

Can you work out what percent of your friends who like Chocolate ice cream also like Strawberry ice cream?

70% of your friends like Chocolate, and 35% like Chocolate AND like Strawberry.

Exercise 2:

45% of the children in a school have a dog, 30% have a cat, and 18% have a dog and a cat.

What percent of those who have a dog also have a cat?

Exercise 3:

A box contains 5 green pencils and 7 yellow pencils.

Two pencils are chosen at random from the box without replacement.

What is the probability they are different colours?

3 responses to “41 – How to Calculate Conditional Probability in Google Sheets”

Leave a comment