WSU STAT 360
Class Session 13 Summary and Notes

Finally we move forward to regression! Did I just utter an oxymoron?

This is a complex topic that will occupy our efforts for the next three weeks, and it will require every tool we have learned to this point in class. First, however, here is my list of talking points.

  1. In the first class period dealing with regression I simply showed how the problem of linear regression is formulated and how the "normal" equations enter the problem.
  2. I mentioned in class that the "normal" equations can become quite ill-conditioned, especially if we attempt to solve for too many high-order terms in a non-linear regression. The example I presented in class was better conditioned than I suggested, but I have another example, which follows this list of items.
  3. There are methods to solve the regression problem without forming the normal equations at all. If any of you are interested, I can explain an interesting approach called "singular value decomposition"; a short sketch contrasting the two routes follows this list.
  4. The parameters that result from a linear regression are, themselves, subject to some uncertainty, and I'll spend all of the next class period discussing measures of that -- all of the class period, that is, except the time set aside for the presentations described in the next item. In addition we will look directly at parameter space and see what it tells us about the linear regression problem.
  5. Each of you made a brief presentation of your first project(s) during class on the 19th. By next class period I expect that each of you will have two projects more or less finished, and have some firm results to show the class. Be prepared.
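
Here is a minimal sketch in Python (assuming numpy is available) of items 1 and 3 for a straight-line fit y = a + b*x: forming and solving the normal equations, and then letting an SVD-based solver do the same job. The x and y values are invented purely for illustration; they are not data from the class.

    # A sketch of a straight-line fit y = a + b*x done two ways: by forming and
    # solving the normal equations, and by numpy's SVD-based least-squares solver.
    # The x and y data below are made up purely for illustration.
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

    # Design matrix for the model y = a + b*x: a column of ones and a column of x.
    X = np.column_stack([np.ones_like(x), x])

    # Normal equations: (X^T X) (a, b)^T = X^T y
    beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

    # The same fit without ever forming X^T X; lstsq uses a singular value
    # decomposition internally, which is much better behaved numerically.
    beta_svd, *_ = np.linalg.lstsq(X, y, rcond=None)

    print(beta_normal)   # (a, b) from the normal equations
    print(beta_svd)      # (a, b) from the SVD route -- the same for this well-behaved problem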

An ill-conditioned matrix equation

We saw in class that the normal equations produce a matrix that is symmetric and looks much like the matrix below. The unknown parameters in this problem are 'a' and 'b.' The vector on the right hand side comes from observations.

| 1.01    100 | |a|   | 1 |
| 100    9980 | |b| = |110|

It is likely that we know the values of the elements in the matrix to high precision even though they result from measured data. These, as you will recall, are composed of the independent variables. The vector on the right hand side is composed of dependent and independent observations, and may have some uncertainty as to its true values. However, our computer will treat the value of 1 as 1.0000...000 to as many places as the machine has precision. This is very unrealistic. Consider the solution that we get for the above problem. It is...

using the inverse of the matrix...

| 1.01    100 |-1 | 1 |   |-12.78|
| 100    9980 |  |110| = |  0.14|

i.e. (a,b)=(-12.78,0.14)

Now what happens if we change the observation vector to (1.1,110)? That is, what happens to the solution vector (a,b) if we change one element of the observation vector by a mere 10%? Using exactly the same matrix inverse we find...

| 1.01    100 |-1 |1.1|   | -0.28 |
| 100    9980 |  |110| = |  0.014|

i.e. (a,b)=(-0.28,0.014)

Changing only one element of the input vector by 10% causes the two elements of the solution vector to change by roughly factors of 45 and 10 respectively. Unless we know the true values of all the observations involved, this sort of sensitivity to the data can make our results suspect, worthless, or even downright unphysical.
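
If you would like to check this yourself, here is a minimal sketch assuming numpy is available. It solves the system for both observation vectors and also prints the condition number of the matrix, a standard single-number warning of how badly this kind of amplification can behave.

    # Solve the ill-conditioned system for the original and perturbed
    # observation vectors, and check the condition number of the matrix.
    import numpy as np

    A = np.array([[  1.01,  100.0],
                  [100.0,  9980.0]])

    b_original  = np.array([1.0, 110.0])
    b_perturbed = np.array([1.1, 110.0])   # first element changed by 10%

    print(np.linalg.solve(A, b_original))    # approximately (-12.78,  0.14 )
    print(np.linalg.solve(A, b_perturbed))   # approximately ( -0.28,  0.014)

    # The condition number is on the order of 10^6 -- the warning sign that
    # small changes in the data can produce large changes in the solution.
    print(np.linalg.cond(A))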

If the elements of the matrix itself are uncertain, then we have an even more difficult situation. We can place confidence intervals on the solution using propagation of error or Monte Carlo simulations. More about this in the next class.
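
As a preview of that discussion, here is one way the Monte Carlo approach might look. This is only a sketch, assuming numpy; the 5% noise level is an arbitrary choice for illustration, not a value taken from the class data.

    # Monte Carlo idea: perturb the observation vector many times with random
    # noise, re-solve, and look at the spread of the resulting (a, b).
    import numpy as np

    rng = np.random.default_rng(0)
    A = np.array([[  1.01,  100.0],
                  [100.0,  9980.0]])
    b = np.array([1.0, 110.0])

    solutions = np.array([
        np.linalg.solve(A, b * (1.0 + 0.05 * rng.standard_normal(2)))  # 5% relative noise
        for _ in range(10000)
    ])

    # The standard deviations of a and b give a rough width for their
    # confidence intervals; for an ill-conditioned matrix they are large.
    print(solutions.mean(axis=0))
    print(solutions.std(axis=0))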


Link forward to the next set of class notes, for Friday, December 15, 1999. This is actually a summary of discussions of the messy vote in Florida.