69 – AI Problems 2: Asking ChatGPT for Google Sheets Problems to Solve

In a previous post, I was considering how people might be using AI to solve their problems, or potentially create interesting & efficient solutions that they were unable to develop within a particular timespan. With Ai easily within reach of those who understand how to engage with it, I believe that there is the internal (and external) ethical debate around the value of its opportunity cost regarding cognitive lifting. Ideally, you have AI run analysis for you and you make the decisions based on that analysis. But, the systems of AI that we are using are only as good as the models and inputs that we are training them on.

Part of me likes to reject particular technical progress because the answers I’ve seen to the ‘why’ aren’t matching up to my thinking. However, instead of grabbing my luddite torch and running into the town centre, I want to be a part of the constructive criticism side and see how I can assist in evolving its understanding. Because, you know, this blog and my brain literally has that much power. Honestly, I’ve switched my thinking, quite strongly, over the past 10 years from ‘this is bad,’ or ‘this doesn’t work,’ into ‘this isn’t going to work because of this, this, and this, but let’s reconsider our approach from this angle and how it might address those problems.’

That was a long-winded way of saying, this is the second instalment of my AI Problems series. Actually, I understand what I was subconsciously doing. This series is about my metacognition and how I instinctively approach problems to solve them. That knowledge connection dump was all about me getting into the zone.

Here is the Sheet:

https://docs.google.com/spreadsheets/d/1wUP6IXJlcSCusLM07aRQVdYZI7-pX6vA_i60WJwRfCY/edit?gid=0#gid=0

Problem 1

Regression Analysis: Perform a regression analysis on a set of data to predict future values based on historical data.

Mock Data:

YearSales ($)
20151000
20161200
20171300
20181400
20191500

Approach

The moment my eyes see Regression Analysis, I am transported back to my university degree and the countless number of regressions that I have done. Not only just putting the numbers in and talking about the output, but digging into the maths behind how to perform one, visualising each component (regression, residuals, spreads, etc). My instinct is to observe the table information and revisit the question/task. I can see the years and sales (the only difference in the sales is the 20% increase from the first year, then the following years follow a $100 increase each subsequent year.

So, I’ll copy this data into Google Sheets and run the regression. There is no specific information regarding the type of regression, so I will just assume a linear regression and use the following function, where column B contains the years and column C contains the Sales:

=LINEST(C3:C7,B3:B7,1,1)

From here, we can use the table to gather the necessary information (dependent and independent variable coefficients) and use those values to forecast the next coming years:

From here, we can use the table to gather the necessary information (dependent and independent variable coefficients) and use those values to forecast the next coming years:

I’ll also add a graph to track the forecast over time:

Problem 2

Running Total: Calculate the running total of sales in a column.

Mock Data:

DateSales
01/01/2024100
02/01/2024200
03/01/2024150
04/01/2024250

Approach

After initially glancing at Running Total and spotting the few rows of sales data, I immediately think of subsequent cell + previous cell, then update previous cell to absolute reference previous cell. My brain starts wandering down the path of Arrayformulas and if there’s any other way that I can make this simple. Since I’m getting these different pathways created in my head, I’ll attempt to try different approaches to this problem.

I’ll create the output here, using the brute force approach so that I have a benchmark to compare it to:

01/01 = 100

01/01 + 02/01 = 100 + 200 = 300

01/01 + 02/01 + 03/01 = 100 + 200 + 150 = 300 + 150 = 450

01/01 + 02/01 + 03/01 + 04/01 = 100 + 200 + 150 + 250 = 450 + 250 = 700

First, I’ll use the approach that initially came to my head, which means that I need to create a new column titled Running Total 1:

Next, I’ll enter the first sales figure from column C:

Using simple addition, I can add the value in D3 to the value in C4. Using this approach, I’m always going to have the total value being counted for each row where there exists a numerical value in column C:

I can now copy this function all the way down and it looks like it matches the brute force benchmark:

For the second approach, I’m going to use the Scan function. This function looks though (or scans) an array and outputs an answer for each step, or each available value. A base use case for this function is running total. I’ll maintain the same logic in the LAMBDA expression that you need to enter so that this function makes more sense:

=SCAN(0,C3:C6,LAMBDA(row_neg1,col_1,row_neg1+col_1))

Problem 3

Monte Carlo Simulation: Use Monte Carlo simulation to analyse the probability distribution of different outcomes for a business decision.

Mock Data:

ScenarioSales Growth Rate (%)Market Share Change (%)
Scenario 152
Scenario 231
Scenario 373

Approach

Ooo, Monte Carlo Simulations. I’ve been interested in this concept since my mate Evan was telling me about it back in 2019 after he had transitioned into a Maths/Statistics degree from Economics. I know very little about it, except that there’s a model with ‘x’ variables and this simulation is about manipulating these different variables over a certain number of iterations. I don’t know how many iterations that is, but I’m going to assume that it’s more than the Central Limit Theorem rule of thumb (30 observations within the sample size). Even if it’s 30 multiplied by the number of variables and scenarios (30*2*3 = 180). Since I’m not sure of the number of iterations, I will set a limit to 750 (arbitrarily, halfway between 500 and 1,000). 

So, now that I’ve set myself this iteration limit, I’m going to reread the question and make sure that this new (self-prescribed) standard still holds.I need to analyse the probability distribution of different outcomes for a business decision. So, a business can make a decision based on the product/service that they sell. From the observable data table, the business has projected that scenario 1 will yield their Sales Growth Rate to be a 5% increase and a 2% increase in the Market Share Change. I’m also going to lock away the information that we’re dealing with percentage changes for both Sales Growth Rate and Market Share Change.

I can now take this data table to Google Sheets:

I’ll create a dynamic field for the number of simulations that I’d like to run (and update as I see fit):

Next, I’ll create a table of simulations for each scenario based on the information from the original data table and the number of simulations. The following Sales Growth Rate Simulation formula will take the corresponding Growth Rate (for scenario 1), and it multiplies that value by a percentage that is randomised. This randomisation is an output (between 0 and 1) as an array for the number of pre-determined simulations entered in B8. This value of 1+(0,1) is then multiplied by 10% to scale down the output and an additional 5% is taken off to increase the accuracy of the simulation variance, leaving a 10% differential:

=ARRAYFORMULA(C3*(1+RANDARRAY($B$8,1)*0.10.05))

The same formula, as above, is used for the Market Share Change across all scenarios:

=ARRAYFORMULA(D3*(1+RANDARRAY($B$8,1)*0.10.05))

Here is an output of those scenario simulations:

I’ll create a summary Scenario statistics table so that I can easily interpret the results for each variable in this system:

Sales Growth 1

Max:

=MAX(F$3:F)

Min:

=MIN(F$3:F)

Average:

=AVERAGE(F$3:F)

Standard Deviation

=STDEV(F$3:F)

Market Share 1

Max:

=MAX(G$3:G)

Min:

=MIN(G$3:G)

Average:

=AVERAGE(G$3:G)

Standard Deviation

=STDEV(G$3:G)

This form makes things a little easier for me to interpret, but I’ll go one step further and create some decision rules for each statistic that I’ve generated. For the Max, Min, and Average variables, I’ll get Google Sheets to report back for each variable with the following formula:

=INDEX($M$5:$Q$7,MATCH(MAX(N$5:N$7),N$5:N$7,0),1)

For the Standard Deviation variable, I am interested in the lowest variability as this points to how volatile (unpredictable) a system may be. So, I will change the MAX function in the MATCH function to MIN:

=INDEX($M$5:$Q$7,MATCH(MIN(Q$5:Q$7),Q$5:Q$7,0),1)

Summary / Analysis
Sales Growth Rate (%)Market Share Change (%)
MaxMinAvgStd DevMaxMinAvgStd Dev
Scenario 15.254.754.990.142.091.902.000.06
Scenario 23.152.853.000.091.050.951.000.03
Scenario 37.356.657.010.203.152.853.000.09
BestScen 3Scen 3Scen 3Scen 2Scen 3Scen 3Scen 3Scen 2
WorstScen 2Scen 2Scen 2Scen 3Scen 2Scen 2Scen 2Scen 3

After reading through these numbers of the various scenarios, I immediately think about balancing performance and risk. My brain starts to assess the average growth and the standard deviation, which signals to me whether a scenario favours risk-taking or a more cautious approach. I can feel different strategies forming in my head, so I’ll dive deeper into a comparison of these scenarios.

First, I’ll break it down using some hard numbers as a baseline for further exploration: Scenario 3: Highest average growth and market share, but also the highest standard deviation. This one clearly signals high risk, which might appeal to a risk-taker. Scenario 2: Lower averages paired with lower risk. A more stable option, likely preferred by a risk-averse decision-maker. Scenario 1: Moderate averages and risk, sitting in a balanced middle ground, making it attractive for someone with a risk-neutral approach.

I’ll approach this the same way I would when dealing with running totals or complex calculations. I’ll start with a straightforward comparison, identifying how risk and reward play off against each other. Scenarios with higher averages and higher risk may offer bigger rewards, but they require a tolerance for volatility. On the other hand, safer scenarios with lower risk and lower averages may not deliver the same growth, but they come with more stability.

Now, using these comparisons, I can begin crafting a decision-making process that aligns with specific business goals. Stability? Focus on the scenarios with lower standard deviations. Growth? Prioritise those with higher maximums, even if it means accepting more risk. This framework gives me a clear path for making an informed decision that considers both the potential rewards and the associated risks.

Problem 4

Automatically Update with a Formula: Use the ARRAYFORMULA to automatically calculate discounts on sales in one column.

Mock Data:

ProductSalesDiscount (%)
Laptop50010%
Coffee1505%
Keyboard3008%
T-Shirt10012%

Approach

There’s two main things that I consider when solving these kinds of problems. I need to consider the arrays that I want to work with and how to handle cells where there are no values (yet). I’ll start by copying the data into Google Sheets then I can start the formula for the Discount value. I read this problem as asking me ‘what is the discount value that has been applied for Product x?’ So, the output values should be 500 * 10% = 50, 150 * 5% = 7.5, 300 * 8% = 24, and 100 * 12% = 12.

If I perform the following operation, I will receive an output that will carry zeros down to the bottom of the sheet past the table:

=ARRAYFORMULA(C3:C*D3:D)

This means that I need to introduce the IF function so that I can search for cells where there are no values (positive or negative). I’ll track this by using Sales (column C) as the trigger for the formula. So, if there exists a row in column C where a number is present (IF(ISNUMBER(C3:C), then I want the discount operation (C3:C*D3:D) to be applied, otherwise don’t output anything:

=ARRAYFORMULA(IF(ISNUMBER(C3:C),(C3:C*D3:D),))

Just in case we want to track the selling price (after the discount has been applied), we can enter a similar formula into column F, with a slight adjustment (1 – the discounted percentage in column D):

=ARRAYFORMULA(IF(ISNUMBER(C3:C),(C3:C*(1D3:D)),))

Problem 5

Scenario Analysis: Conduct a scenario analysis to evaluate the impact of different market scenarios on investment portfolios or project returns.

Mock Data:

Scenario Investment Return (%)Market Growth Rate (%)
Base Case75
Pessimistic52
Optimistic108

Approach

The moment I see Scenario Analysis, my mind immediately starts thinking about different market conditions and how they might affect an investment portfolio. This takes me back to countless discussions I’ve had about risk management and portfolio diversification. It’s always intriguing to explore how different scenarios — from a booming economy to a sluggish market — can impact returns.

I’ll start by migrating the data over to Sheets and briefly analyse the available information. I can see the investment returns vary between 5% and 10%, while the market growth rates span 2% to 8%. I want to explore how these different conditions affect a portfolio — let’s say I’m working with an initial investment of $100,000. Now I can start thinking about how to represent the portfolio’s growth under each scenario.

Instinctively, I head straight for the simple calculations. I’ll compute the portfolio’s value after 1 year under each scenario, multiplying the initial investment by (1 + Investment Return). For the Optimistic scenario, that looks like:

=$A$4*(1+D4)

I’ll extend this formula to the other scenarios, and I now have the portfolio values:

After running these numbers, I can start drawing conclusions. The Optimistic scenario results in the highest gain of $10,000, whereas the Pessimistic scenario yields only $5,000. The Base Case, as expected, sits right in the middle with a gain of $7,000. These outcomes make sense: more optimistic market conditions drive higher returns, while a sluggish market caps your potential gains. But, as always, my brain starts to wander towards risk. Sure, a 10% return sounds fantastic, but what if the market growth doesn’t align with those optimistic expectations? Would the portfolio withstand a worse-than-expected scenario?

If we want to reflect market conditions in our portfolio’s growth, we might say that a strong market will boost returns, while a weak market could drag them down. One simple approach is to adjust the Investment Return by the Market Growth Rate. In essence, this means multiplying the Investment Return by the Market Growth Rate to simulate how much of the market’s overall performance is passed on to your investment.

Adjusted Return (%) = Investment Return (%) * (1 + Market Growth Rate (%))

=D4*(1+E4)

When I look at these updated results, the Market Growth Rate shows a noticeable impact on the portfolio. For the Optimistic scenario, the additional 0.8% from the market boost might seem small, but it adds an extra $800 to the overall gain. In contrast, the Pessimistic scenario adds just $100 more than we initially thought, underscoring the minimal effect of a slow-growing market.

This adjustment further solidifies the idea that the Optimistic scenario is appealing to risk-takers, especially when you factor in a strong market. Meanwhile, the Pessimistic scenario might provide more stability but at a lower gain — appealing to someone who is risk-averse. The Base Case, again, serves as a nice middle ground.

To get a clearer picture, I’ll chart these results in Google Sheets. A scatter plot will help me compare the portfolio value and gain/loss across the three scenarios. This will make it easier to digest the differences at a glance — and it’s always handy when explaining these concepts to someone less familiar with the numbers.

I’m a big fan of providing more detailed insights for this type of analysis, so I’ll add the trendline equation to the chart in case someone would like to use this in their digging:

Looking at the output, it’s clear that each scenario offers a different risk/reward profile. The Optimistic scenario provides the highest return but comes with the potential for disappointment if market growth doesn’t meet expectations. The Pessimistic scenario, on the other hand, represents a more cautious approach — smaller gains, but perhaps less worry about an economic downturn. And, as expected, the Base Case finds the middle ground, offering moderate returns with reasonable growth.

Now, when it comes to decision-making, these numbers offer an interesting dilemma. Should you go all-in on the Optimistic scenario and aim for maximum growth, or play it safe with a more modest return in the Pessimistic case? Personally, I’d want to consider my own risk tolerance here. If I’m a risk-taker, the high returns in the Optimistic case might appeal, but a risk-averse person may prefer the stability of the Base Case or even the Pessimistic scenario.

2 responses to “69 – AI Problems 2: Asking ChatGPT for Google Sheets Problems to Solve”

Leave a comment