WSU STAT 360
Class Session 13 Summary and Notes
Finally, we move forward to regression! Did I just utter an oxymoron?
This is a complex topic that will occupy our efforts during the next 3 weeks, and which will require the use of every tool we have learned to this point in class. First, however, here is my list of talking points.
An ill-conditioned matrix equation
We saw in class that the normal equations produce a matrix that is symmetric and looks much like the matrix below. The unknown parameters in this problem are 'a' and 'b.' The vector on the right-hand side comes from observations.
| 1.01    100 | |a|   |   1 |
| 100    9980 | |b| = | 110 |
It is likely that we know the values of the elements in the matrix to high precision even though they result from measured data. These, as you will recall, are built from the independent variables. The vector on the right-hand side is composed of dependent and independent observations, and may have some uncertainty as to its true values. However, our computer will treat the value of 1 as 1.0000...000 to as many places as the machine has precision. This is very unrealistic. Consider the solution that we get for the above problem. It is...
using the inverse of the matrix...
| 1.01    100 |^-1 |   1 |   | -12.78 |
| 100    9980 |    | 110 | = |   0.14 |
i.e. (a,b) = (-12.78, 0.14)
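The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The helper name `solve_2x2` is hypothetical (it is not something from class), and it uses the closed-form inverse of a 2x2 matrix rather than a general-purpose solver.

```python
def solve_2x2(m, rhs):
    """Solve m @ x = rhs for a 2x2 matrix m using its closed-form inverse."""
    (p, q), (r, s) = m
    det = p * s - q * r  # determinant of the matrix
    if det == 0:
        raise ValueError("matrix is singular")
    # inverse = (1/det) * [[s, -q], [-r, p]]
    x0 = (s * rhs[0] - q * rhs[1]) / det
    x1 = (-r * rhs[0] + p * rhs[1]) / det
    return x0, x1


matrix = [[1.01, 100.0], [100.0, 9980.0]]
a, b = solve_2x2(matrix, [1.0, 110.0])
print(f"a = {a:.2f}, b = {b:.2f}")  # a = -12.78, b = 0.14
```

The determinant here is 79.8, which is tiny compared with the product 1.01 * 9980 that it is built from. That near-cancellation is the first hint that the matrix is ill-conditioned.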
Now what happens if we change the observation vector to (1.1,110)? That is, what happens to the solution vector (a,b) if we change one element of the observation vector by a mere 10%? Using exactly the same matrix inverse we find...
| 1.01    100 |^-1 | 1.1 |   | -0.28  |
| 100    9980 |    | 110 | = |  0.014 |
i.e. (a,b) = (-0.28, 0.014)
Changing only one element of the input vector by 10% causes the two elements of the solution vector to shrink by factors of roughly 46 and 10, respectively. Unless we know the true values of all the observations involved, this sort of sensitivity to data can make our results suspect, worthless, or even downright unphysical.
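One way to put a number on this sensitivity is to compare the relative change in the solution with the relative change in the data, and to compute the condition number of the matrix. The sketch below is an illustration rather than something derived in class. The helper names are hypothetical, and the condition number is taken as the ratio of the largest to the smallest eigenvalue magnitude, a shortcut that is valid here only because the matrix is symmetric.

```python
import math


def solve_2x2(m, rhs):
    """Solve m @ x = rhs for a 2x2 matrix m using its closed-form inverse."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return ((s * rhs[0] - q * rhs[1]) / det,
            (-r * rhs[0] + p * rhs[1]) / det)


def norm(v):
    """Euclidean length of a 2-vector."""
    return math.hypot(v[0], v[1])


def condition_number_symmetric(m):
    """2-norm condition number of a symmetric 2x2 matrix: |lam_big / lam_small|."""
    (p, q), (_, s) = m
    trace = p + s
    det = p * s - q * q
    disc = math.sqrt(trace * trace - 4.0 * det)
    # Larger-magnitude eigenvalue first; get the other from det to avoid cancellation.
    lam_big = (trace + math.copysign(disc, trace)) / 2.0
    lam_small = det / lam_big
    return abs(lam_big / lam_small)


matrix = [[1.01, 100.0], [100.0, 9980.0]]
y_old, y_new = [1.0, 110.0], [1.1, 110.0]
x_old = solve_2x2(matrix, y_old)
x_new = solve_2x2(matrix, y_new)

# Relative change in the data versus relative change in the solution.
rel_in = norm([y_new[0] - y_old[0], y_new[1] - y_old[1]]) / norm(y_old)
rel_out = norm([x_new[0] - x_old[0], x_new[1] - x_old[1]]) / norm(x_old)
amplification = rel_out / rel_in
cond = condition_number_symmetric(matrix)

print(f"relative change in observations: {rel_in:.2e}")
print(f"relative change in solution:     {rel_out:.2e}")
print(f"amplification factor:            {amplification:.0f}")
print(f"condition number of the matrix:  {cond:.3g}")
```

The condition number is an upper bound on the amplification factor, so a value of about a million for this matrix warns us, before we look at any data, that small errors in the observations can be magnified enormously.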
If the elements of the matrix itself are uncertain, then we have an even more difficult situation. We can place confidence intervals on the solution using propagation of error or Monte Carlo simulations. More about this in the next class.
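As a preview of the Monte Carlo idea, the sketch below perturbs the observation vector with Gaussian noise, re-solves the system each time, and reports the spread in (a, b). Everything about the noise model is an assumption made for illustration: the standard deviation `SIGMA = 0.05` on each observation, the 10,000 trials, the fixed random seed, and the percentile-based 95% interval are hypothetical choices, not values from class. The matrix itself is held fixed here; perturbing its elements as well would follow the same pattern.

```python
import random


def solve_2x2(m, rhs):
    """Solve m @ x = rhs for a 2x2 matrix m using its closed-form inverse."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return ((s * rhs[0] - q * rhs[1]) / det,
            (-r * rhs[0] + p * rhs[1]) / det)


def percentile_interval(samples, lower=0.025, upper=0.975):
    """Return the (lower, upper) sample percentiles by simple sorting."""
    ordered = sorted(samples)
    n = len(ordered)
    return ordered[int(lower * n)], ordered[min(int(upper * n), n - 1)]


random.seed(360)          # fixed seed so the run is repeatable
MATRIX = [[1.01, 100.0], [100.0, 9980.0]]
OBSERVED = [1.0, 110.0]
SIGMA = 0.05              # assumed (hypothetical) uncertainty of each observation
TRIALS = 10_000

a_samples, b_samples = [], []
for _ in range(TRIALS):
    # Draw one plausible set of observations and solve for (a, b).
    noisy = [y + random.gauss(0.0, SIGMA) for y in OBSERVED]
    a, b = solve_2x2(MATRIX, noisy)
    a_samples.append(a)
    b_samples.append(b)

a_lo, a_hi = percentile_interval(a_samples)
b_lo, b_hi = percentile_interval(b_samples)
print(f"95% interval for a: ({a_lo:.2f}, {a_hi:.2f})")
print(f"95% interval for b: ({b_lo:.3f}, {b_hi:.3f})")
```

Even with this modest assumed noise, the interval for a comes out many times wider than the noise itself, which is the same sensitivity seen in the worked example.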
Link forward to the next set of class notes for Friday, December 15, 1999. This is actually a summary of discussions on the messy vote in Florida.