WSU STAT 360
Class Session 6 Summary and Notes October 13, 2000

Estimators and parameters

We have examined four important parameters and estimators in our investigations of sampling and probability. These are

  1. The population mean (Greek letter μ). This represents the arithmetic average of the random variable of interest. Often we do not actually know the value of this parameter in advance, although we have worked many problems by assuming that we do.
  2. The population standard deviation (Greek letter σ). This is the square root of the population variance. Once again, we may not know this value in advance. If the population follows a particular theoretical pdf, like the normal distribution, then we can calculate it.
  3. The sample mean (denoted by a symbol with a bar above it, such as x̄). This is the arithmetic average of a number of observations of the random variable. It is an unbiased estimator of the population mean. When we do not know the population mean, we can use the sample mean instead and provide confidence limits for how far the true population mean might be from it. When we do know the population mean from other information, we typically assume that the central limit theorem applies, and then calculate how likely the observed deviation of the sample mean from the population mean is. This is a test of significance.
  4. The sample standard deviation (usually denoted with the letter 's'). This is the square root of the sample variance. If the sample variance is calculated as the sum of squared deviations from the sample mean divided by the number of observations minus 1, then the sample variance is an unbiased estimator of the population variance.
  5. We examined least squares estimators in class and saw that the mean is a least squares estimator. It minimizes the sum of squared deviations between the observations and a single measure of the center of the distribution. The sum of squared errors is the square of the L2 norm of the deviations. By the term "norm" I mean a measure of size.
  6. To find least squares estimators, we differentiate the sum of squared errors with respect to the various parameters, and then look for parameter values at which the derivatives are zero. These are extrema and may represent minima. A short numerical sketch follows this list.
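
Here is a minimal numerical sketch of items 3 through 6, using made-up observations: the sample mean, the unbiased sample variance (divide by n - 1), and a check that the mean minimizes the sum of squared deviations.

    # Minimal sketch with made-up observations.
    import numpy as np

    x = np.array([9.8, 10.4, 10.1, 9.6, 10.7])    # hypothetical data

    xbar = x.mean()                                # sample mean, estimates mu
    s2 = ((x - xbar) ** 2).sum() / (len(x) - 1)    # unbiased estimator of sigma^2
    s = s2 ** 0.5                                  # sample standard deviation

    # Least squares property: the sum of squared deviations about any
    # other center c is larger than about the sample mean.
    def sse(c):
        return ((x - c) ** 2).sum()

    print(xbar, s)
    print(sse(xbar), sse(xbar + 0.5))              # second value is always larger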

A note from October 22, 1999, regarding a maximum likelihood estimator.


A maximum likelihood estimate using the exponential distribution

Another class of estimators is the maximum likelihood estimators. When attributes of a population are estimated by maximizing the likelihood, the estimates are the parameter values that maximize the probability of the data we observe in the sample. The example of a maximum likelihood estimator that I showed in class is instructive, so I repeat that argument below. Suppose, for the sake of illustration, that we have measured three times to failure for some device. Let these three times be a, b, and c. Assume that the data are to be modeled with an exponential distribution because we believe the expected rate of failure is constant. Before going any further, I should mention that there are two ways of writing the exponential distribution. These are...

P(y) = (1/β) e^(-y/β)
or...
P(y) = λ e^(-λy)

In the first form we interpret the parameter β as the expected time to failure, while in the second the parameter λ is the expected rate of failure. Let me choose the second form.
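
As a quick sanity check, the two forms agree whenever λ = 1/β; the numbers below are made up for illustration.

    # Check that the two ways of writing the exponential pdf agree
    # when lambda = 1/beta (values are hypothetical).
    import math

    beta = 500.0          # hypothetical expected time to failure
    lam = 1.0 / beta      # corresponding expected rate of failure
    y = 120.0             # hypothetical observed time

    p_form1 = (1.0 / beta) * math.exp(-y / beta)
    p_form2 = lam * math.exp(-lam * y)
    print(p_form1, p_form2)   # identical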

Assuming that the observations are independent of one another, we can write the joint probability distribution of our three observations as...

L(λ) = P(X1=a, X2=b, X3=c) = P(X1=a) P(X2=b) P(X3=c)
L(λ) = λ^3 e^(-λ(a+b+c))

Taking the derivative with respect to λ and setting the result to zero leaves us with the task of finding the roots of the equation...

dL/dλ = λ^2 e^(-λ(a+b+c)) (3 - λ(a+b+c)) = 0

Two solutions are λ = 0 and the limit λ → ∞. L is zero at both of these, so they represent minima and we are not interested in them. The remaining root satisfies 1/λ = (a+b+c)/3, that is, λ = 3/(a+b+c), and represents the maximum that we are searching for.
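
A small numerical sketch of this result, using hypothetical failure times a, b, and c: a grid search over the log-likelihood 3 log(λ) - λ(a+b+c) should land on the same λ as the closed form derived above.

    # Numerical check of the maximum likelihood estimate for the
    # exponential rate, using made-up failure times.
    import numpy as np

    a, b, c = 120.0, 310.0, 85.0            # hypothetical times to failure
    total = a + b + c

    lam_closed_form = 3.0 / total           # the MLE derived above

    lams = np.linspace(1e-4, 0.05, 100000)  # grid of candidate rates
    loglik = 3.0 * np.log(lams) - lams * total
    lam_grid = lams[np.argmax(loglik)]

    print(lam_closed_form, lam_grid)        # the two should agree closely
    print(1.0 / lam_closed_form)            # mean time to failure, (a+b+c)/3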

Notes: Regarding issues and examples from class.


Hazards and Extreme Events

This is probably the last installment on this issue.

Designing for extreme events is a two-step process.

  1. We have to decide what extreme event we are to design for. That is, what is the return period of the design event? Is it a hundred-year event, a thousand-year event, or something else?
  2. Then we must decide how big this event is; that is, what is its magnitude?

The first question is not too difficult to answer. We have, in fact, already covered all of the theory required to calculate it. Consider the following problem. Suppose that some event has a return period of 100 years. Then its probability of occurring in any specific year is p = 0.01. Using the geometric distribution, the probability that this event will not occur in an N-year period is...


P(N) = q^N = (1-p)^N

So if we calculate this for a 10-year period we get a probability a little greater than 0.90, and for 11 years it is a little less than 0.90. In other words, we calculate only about a 10% risk that the 100-year event will occur in a 10.5-year period. Now let's turn the problem around.
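
These numbers are easy to check in a couple of lines:

    # Probability that a 100-year event does not occur in an N-year period.
    p = 0.01
    for n in (10, 11):
        print(n, (1 - p) ** n)   # about 0.904 and 0.895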

We first choose the period over which we must limit the chances of design failure; let's suppose 50 years as an example. Then we decide what risk of failure we are willing to accept during this period. Typical choices would be 10%, 5%, or 1%; an engineer would choose smaller values for failures with deadly consequences. Let's choose 5% as an example here. Thus we wish to find the return period T for which the probability of no occurrence over 50 years remains at 0.95. We must solve the equation P(50) = 0.95:


(1 - p)^50 = 0.95, but since p = 1/T,

(1 - 1/T)^50 = 0.95

(1 - 1/T) = 0.95^(1/50)

T = 1/(1 - 0.95^(1/50)) = 975 years.

Therefore we must make our design tolerant of the 975-year event -- the 1000-year event for all practical purposes. At this point we have answered the first question in the list.
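
The same calculation in a short script, with the 50-year design life and 5% risk used above:

    # Return period T needed so that the chance of the design event
    # occurring during the design life stays at the accepted risk.
    design_life = 50       # years
    risk = 0.05            # acceptable probability of occurrence

    T = 1.0 / (1.0 - (1.0 - risk) ** (1.0 / design_life))
    print(T)               # about 975 years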

The second question is one that we began to play with in the last class. The specific example I have been using is extreme high temperatures in the Portland area. We are using actual data taken at the Portland airport from 1928 until 1996. So far, I have shown numerous graphs of these data, and I have spoken of some of their unusual characteristics, which suggest that it may not be appropriate to treat them as we have. Most notably, the measurements may not be independent samples of extreme temperature, even though they are taken about a year apart. Nevertheless, we will continue.

Refer to the spreadsheet named summer.xls for several graphics. On the sheet named "normal" is a normal probability plot of the extreme summer temperatures. Vining would consider this sufficient to demonstrate that the data are approximately normal, even though extreme temperatures should not follow a normal distribution, and the histogram of the data (shown on the sheet named "histogram") does not look normal either. The sheet named "reduced" is a plot of extreme summer temperature against the reduced deviate (-log(-log P)), which is what I would expect the extreme values to follow. They do not.

The normal probability plot shows a straight line fit to the data by regression (we will not encounter regression until chapter 6). However, our design probability is 1/T, or approximately 0.001; the corresponding value of the normal deviate (Z) is 3.08. If we extrapolate our estimated line to an ordinate of 3.08, the corresponding abscissa is 112.8F. Thus, 112.8F is the 1000-year extreme temperature, and we believe there is only a 5% risk of reaching this extreme temperature in the next 50 years.

The relationship of temperature to normal deviate is...

Y = 0.219*X - 21.626, where X = temperature.
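
A short sketch of this extrapolation, taking the fitted coefficients and the class value Z = 3.08 as given:

    # Extrapolate the normal probability plot line to the design deviate.
    slope, intercept = 0.219, -21.626   # Y = slope*X + intercept, X = temperature
    z_design = 3.08                     # normal deviate for p of about 0.001

    temp_1000yr = (z_design - intercept) / slope
    print(temp_1000yr)                  # about 112.8 F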

If we assume that the reduced deviate is a more reasonable model of extreme temperature, then we can use regression to find a linear relationship between temperature and the reduced deviate, and hence probability. A probability of 0.001 corresponds to a reduced deviate of 6.907. The relationship of temperature to the reduced deviate is...

Y = 0.266*X - 25.651, where X = temperature.

In this case I extrapolate to a temperature of 122F for the 1000-year event. This seems unbelievably high, but there is no reason it could not happen. For example, if the temperature happened to reach 104 on the inland plateau (Pendleton), and this air were forced to flow to sea level here in the Portland area, then adiabatic compression would raise the temperature to 122F. Only time will tell whether this ever happens.
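
The corresponding extrapolation for the reduced deviate, again taking the fitted coefficients above as given:

    # Extrapolate the reduced-deviate line to the design probability.
    import math

    slope, intercept = 0.266, -25.651        # Y = slope*X + intercept, X = temperature
    P = 0.999                                # non-exceedance probability, 1 - 0.001
    y_reduced = -math.log(-math.log(P))      # reduced deviate, about 6.907

    temp_1000yr = (y_reduced - intercept) / slope
    print(temp_1000yr)                       # about 122 F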


Projects: Wind Power

I have data regarding wind in an area of the country where people are installing electricity-generating windmills. These cannot be run when the winds exceed 50 mph, and they won't generate power at all below 10 mph. If someone in the class wishes to tackle this problem, please estimate the amount of time per year available for useful electrical generation. A further extension to this problem is to find expected values for the power generated and so forth.
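
For anyone taking on the project, one simple starting point is an empirical estimate: count the fraction of observations in the usable 10-50 mph band and scale to hours per year. The random gamma draw below is only a stand-in for the real data set; substitute the actual wind observations.

    # Rough empirical estimate of usable generating hours per year.
    import numpy as np

    rng = np.random.default_rng(0)
    wind_mph = rng.gamma(shape=2.0, scale=6.0, size=8760)  # stand-in for the real hourly data

    usable = (wind_mph >= 10) & (wind_mph <= 50)           # turbines run in this band
    hours_per_year = usable.mean() * 8760
    print(hours_per_year)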
