Hi Jim,

The method of calculating the r^2 value is described in the Excel helps under the topic heading "Equations for calculating trendlines." What the description there doesn't mention is what SSE and SST mean. The SSE is the sum of the squared residuals between the line and the data values, and this is the value that the least squares methodology is minimizing. The SST value is the variance of the data values. As a result of this definition the r-value is a measure of how well the line was able to fit the data. A value of 1 indicates a perfect fit, meaning that all the data points lie exactly on the line. A value of 0 means a "perfectly awful" fit.

For the linear case the r value is the same as the Pearson product moment correlation coefficient, the value returned by the Excel RSQ function, but this is only true for the first-order (linear) case, not in the general (curvilinear) case. The RSQ r value is defined as

r = cov(x,y)/(sx sy)

where cov is the covariance, sx and sy are the standard deviations of x and y.

The nice thing about the 1 - SSE/SST definition of r is that this value will continue to provide a "goodness of fit" measure even in the curvilinear case so that you can tell if adding more terms to the equation, or using different types of fits produces a quantitively better fit. One caveat regarding this though: Excel uses a "transformed regression model" for types of fits other than the power series polynomial fit. As a result the least squares solution will not be the true minimum variance solution but will for example be the minimum of the squares of the logs of the residuals if a log fit is being done. This of course means also that the r-value will be skewed by the transformation as well, but the value should still lie between 0 and 1. But you cannot assume that an r-value of 0.9 for a log fit represents a better fit than an r-value of 0.8 for a power series fit.

I hope this helps.