In a previous post, I mentioned that I wanted to calculate the t-statistic and degrees of freedom, partly for completeness’ sake, but also for the case where I find myself needing to calculate them again for another project. Although this post isn’t specifically about t-statistics or degrees of freedom, it is about documenting something I can use for future reference. This saves me the time of tracking down the project I figured a concept out for and then deciphering what I had created (hello, my sometimes lazy documentation).
It was in that post that I used the equation option in a Google Chart to display the equation being used in a chart; the one that ‘looked similar to the Gini coefficient’ (feel free to do a text search on that if you want additional context). Whilst completing the work on that post, I realised that you can turn the linear regression function (LINEST) into a multi-degree polynomial regression function. As someone who likes to model as closely as possible to the data presented, and to dig down to any necessary (and achievable) level, this discovery presented an opportunity for some learning.
That’s what this post is about: showing you how to create higher-degree polynomial regressions and reproduce that same chart equation (or at least get very close; there is a slew of frustrated people on the internet trying to reconcile the equation shown in a Google Chart with the output of the built-in functions, e.g. LINEST()). Here is the spreadsheet for your reference:
https://docs.google.com/spreadsheets/d/1S-Zo_e2792Pb5kJFGZdSbP0d4VrRORJcPA2WoyJdKNM/edit?gid=0#gid=0
You could make your own randomised dataset by using the following formula, which outputs an array of numbers between 0 and 200 for 100 rows and then sorts them in ascending order. Because this relies on the randomisation within RANDBETWEEN, and the sorted values climb fairly steadily, the result should be a reasonably straight line in the grand scheme of things and therefore show a high correlation:
=SORT(ARRAYFORMULA(RANDBETWEEN(0, 200 + ROW(A2:A101)*0)),1,1)
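If you’d rather sanity-check that claim outside of Sheets, here is a minimal Python sketch of the same idea: 100 sorted random values between 0 and 200 correlate strongly with their row order. The seed and the by-hand Pearson calculation are my own choices for the sketch, not anything Sheets does:

```python
import random

random.seed(42)  # seed only so the sketch is reproducible
ys = sorted(random.randint(0, 200) for _ in range(100))
xs = list(range(1, 101))  # row numbers, standing in for the sort order

# Pearson correlation, computed by hand to keep the sketch dependency-free.
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
var_x = sum((x - mean_x) ** 2 for x in xs)
var_y = sum((y - mean_y) ** 2 for y in ys)
r = cov / (var_x * var_y) ** 0.5
print(round(r, 3))  # close to 1, since sorted data rises steadily
```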
1st-degree polynomial (linear)
Instead, let’s use the same dataset from the Applying Regression Discontinuity Design (RDD) in Google Sheets blog post to demonstrate how you can go to these different levels of analysis. I’ll start by pasting that data in then plotting it on a chart; x-axis = Pre-test Score, and y-axis = Final Exam Score:
I’ll clean this up by reducing the size of the data points by clicking, in the Chart editor, Customize > Series > Point size > 2px:
Now, let’s add a trendline (coloured red) to see how this looks from a linear perspective:
I can see that this is a relatively good fit, but there are more data points below the trendline before the halfway point and more data points above it after the halfway point. I’ll turn the Show R2 option on to see its current fit:
At 96%, this doesn’t look too bad. This trendline and its additional statistics are what we’ll be recreating using the LINEST() function. Just like setting up this chart, I’m going to assign the known y data (y-axis) to Final Exam Score (D2:D351) and the known x data (x-axis) to the Pre-test Score (B2:B351), and I want the intercept (b) to be calculated and the verbose statistics to be displayed:
=LINEST(D2:D351,B2:B351,1,1)
As an FYI, this model follows the ‘y=mx+b’ form. I say this in case there are some people who learned the ‘y=mx+c’ form. I will no doubt use the two interchangeably and apologise for any inevitable confusion.
We are now estimating this equation in the form:
Final Exam Score = b + a(Pre-test Score)
Where,
- b is the intercept, and
- a is the linear coefficient (for Pre-test Score)
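For intuition, here is a minimal Python sketch of the least-squares fit that LINEST performs in the linear case. The function name `ols_line` and the toy points are my own for illustration; the real data lives in the linked spreadsheet:

```python
def ols_line(xs, ys):
    """Return (m, b) minimising the squared residuals of y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y over the variance of x.
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    # The fitted line always passes through the mean point.
    b = mean_y - m * mean_x
    return m, b

# Points lying exactly on y = 2x + 1 recover m = 2 and b = 1.
m, b = ols_line([1, 2, 3, 4], [3, 5, 7, 9])
print(m, b)  # 2.0 1.0
```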
It’s a shame that the output doesn’t include the statistical information titles (SE, R2, etc.), but I’ll always add them so I remember what I’m looking at. You can see from the output above that the R2 aligns with the chart. Let’s see if the coefficients (slope and intercept) are aligned too. I’ll navigate back to the Chart editor and, under the Series section, select Use equation under the Label sub-section:
We can see all of the coefficients and R2 values are aligned:
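The R2 that the chart shows (and that LINEST returns in its verbose output) is the coefficient of determination, R² = 1 − SS_res/SS_tot. A minimal sketch with toy numbers, not the spreadsheet’s; the helper name `r_squared` is my own:

```python
def r_squared(ys, preds):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# A perfect fit gives 1.0; a slight miss gives a little less.
perfect = r_squared([3, 5, 7, 9], [3, 5, 7, 9])
near = r_squared([3, 5, 7, 9], [3.1, 5, 7, 8.9])
print(perfect, round(near, 4))  # 1.0 0.999
```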
2nd-degree polynomial
Now for the fun part; let’s make this chart and LINEST() function a 2nd-degree polynomial. I’ll start by completing this on the chart. Under the Type sub-section, I’ll select Polynomial:
The LINEST function analyses a series of known dependent values against a series of known independent values and computes a line of best fit through them. When you raise the independent values to an array of ‘n’ exponents, it generates an output that contains n independent variable columns. Essentially, you’re increasing the number of terms that your regression factors in. I like to visualise these concepts, so this image is the visual representation of the LINEST function; we are transforming the dependent and independent data into a y=mx+b form (using OLS to derive the regression):
When we take this regression to a higher degree, we’re duplicating the independent variable and transforming it (squaring it, for the second-degree polynomial) into that higher dimension:
If we could adjust this ‘linear estimation’ LINEST function, we would essentially be creating a ‘multi-variate estimation’ function along the lines of: MULTEST(dependent_values, independent_values_array1, independent_values_array2, calculate_b, return_all_statistics). However, wrapping LINEST in ARRAYFORMULA, as below, achieves the same effect as this customised formula:
=ARRAYFORMULA(LINEST(D2:D351,B2:B351^{1,2},1,1))
From this 2nd-degree polynomial regression output, we can see that we are now estimating this equation in the form:
Final Exam Score = c + b(Pre-test Score) + a(Pre-test Score)²
Where,
- c is the intercept (for Final Exam Score),
- b is the linear coefficient (for Pre-test Score), and
- a is the quadratic coefficient (for Pre-test Score squared).
For this purpose, it suggests that:
- a = -0.001: This means that for every one-unit increase in the square of the Pre-test Score, the Final Exam Score decreases by 0.001,
- b = 1.280: For each additional point in the Pre-test Score, the Final Exam Score increases by 1.280, and
- c = -12.416: The intercept, which indicates the expected Final Exam Score when the Pre-test Score is 0.
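To make the `B2:B351^{1,2}` step concrete: each x is expanded into the columns [x, x²], and OLS is then run on all columns at once. Below is a dependency-free Python sketch of that idea, solving the normal equations by Gaussian elimination; the `polyfit` name and the toy points are mine, and this mirrors (rather than reproduces) what Sheets does internally:

```python
def polyfit(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ..., cn] for y = c0 + c1*x + ... + cn*x^n."""
    n = degree + 1
    # Normal equations (X^T X) a = X^T y, where X has columns x^0 .. x^degree.
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, n):
            f = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= f * A[col][k]
            rhs[row] -= f * rhs[col]
    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(A[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (rhs[row] - s) / A[row][row]
    return coeffs

# Points lying exactly on y = 1 + 2x + 3x^2 recover the quadratic.
c0, c1, c2 = polyfit([0, 1, 2, 3, 4], [1, 6, 17, 34, 57], 2)
print(round(c0, 6), round(c1, 6), round(c2, 6))
```

Passing `degree=3` to the same function handles the cubic case in the next section.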
3rd-degree polynomial
Let’s make this chart and LINEST() function a 3rd-degree polynomial. I’ll start by completing this on the chart. Under the Type sub-section, I’ll select Polynomial and increase the Polynomial degree to 3:
And now to update the LINEST formula:
=ARRAYFORMULA(LINEST(D2:D351,B2:B351^{1,2,3},1,1))
We are now estimating this equation in the form:
Final Exam Score = d + c(Pre-test Score) + b(Pre-test Score)² + a(Pre-test Score)³
Where,
- d is the intercept,
- c is the linear coefficient (for Pre-test Score),
- b is the quadratic coefficient (for Pre-test Score squared), and
- a is the cubic coefficient (for Pre-test Score cubed).
For this purpose, it suggests that:
- a = -0.001: This means that for every one-unit increase in the cube of the Pre-test Score, the Final Exam Score decreases by 0.001,
- b = 0.151: This means that for every one-unit increase in the square of the Pre-test Score, the Final Exam Score increases by 0.151,
- c = -9.012: For each additional point in the Pre-test Score, the Final Exam Score decreases by 9.012, and
- d = 210.485: The intercept, which indicates the expected Final Exam Score when the Pre-test Score is 0.
Homework
Now that you’ve got the idea of increasing polynomial degrees, let’s see how you go with fitting the following regression types and explaining their outputs:
- 4th-degree polynomial
- 6th-degree polynomial
- Natural Log
- Exponential
- Note 1: You will need to take a few extra steps to transform the function,
- Note 2: You won’t get the exact answer, but it will be pretty close.
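As a hint for the exponential item, one common transform is to fit ln(y) = ln(A) + b·x with ordinary linear least squares and then exponentiate the intercept. Because the squared error is minimised in log space, the answer will be close to, not identical to, the chart’s exponential trendline, which is the spirit of Note 2. A sketch with a made-up dataset (the `fit_exponential` name is mine):

```python
import math

def fit_exponential(xs, ys):
    """Return (A, b) so that y ~ A * e^(b*x), via a log-linear OLS fit."""
    logs = [math.log(y) for y in ys]  # requires all y > 0
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    # Ordinary least squares on (x, ln y).
    b = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs)) \
        / sum((x - mean_x) ** 2 for x in xs)
    ln_a = mean_l - b * mean_x
    return math.exp(ln_a), b

# Points lying exactly on y = 2 * e^(0.5x) recover A = 2 and b = 0.5.
xs = [0, 1, 2, 3]
A, b = fit_exponential(xs, [2 * math.exp(0.5 * x) for x in xs])
print(round(A, 6), round(b, 6))  # 2.0 0.5
```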

