Reasoning and Fallacies

Copyright © 1990, 1999 Kevin T. Kilty, All Rights Reserved

Contents

Affirming the consequent
Post hoc ergo propter hoc
Cum hoc ergo propter hoc
Biased samples and experiments
Not carrying arguments to a conclusion
Analyzing small effects
Circular reasoning
Lack of internal consistency
Wrong models
Interpolation and extrapolation
Hidden complexity
Definitions
Irrelevances
Confirmation
Knowledge and belief

Introduction

An exchange between two announcers on the Mutual Broadcasting System during the 1990 Penn State-Notre Dame game.
Announcer #1: "What a barn burner of a game they're having in Pasadena! It's USC 38, UCLA 35. The team that has the ball last is going to win or lose."
Announcer #2: "Yeah, we've seen a lot of those!"

In fact, these two announcers have seen nothing but games like "those." As logicians they fail, but they are very good hedgers! Identifying cause and effect, making predictions, weighing arguments: logic in general is a slippery thing. The following examples illustrate this and classify the types of fallacies that ensnare people. A few of these are easy to identify and avoid; others are more subtle and more difficult to recognize.

I do not include categories like purposely leaving out information or fabricating data. These purely dishonest strategies are not traps for the analyst but, rather, means employed by dishonest folks to gull the unwary. The Committee for the Scientific Investigation of Claims of the Paranormal (CSICOP) investigates scams and dishonesty better than I could ever hope to do.

The categories I make here are interrelated and it may seem artificial to divide things as I have. Send any comments or complaints to K.T. Kilty

Affirming the consequent. This is the most basic fallacy of logic. Its essence can be explained with a simple example. It is perfectly valid to argue from the truth of premises to the truth of a conclusion in the following way: take as true a premise such as "Aristotle is a man"; add to it a true proposition such as "all men eat tomatoes"; and reach a valid conclusion, "Aristotle eats tomatoes." Affirming the consequent argues backward from the truth of a conclusion to the truth of one of the premises, like this: Gustav eats tomatoes; all men eat tomatoes; therefore, Gustav is a man, when in fact Gustav is the Dachshund who raids ripe tomatoes near the ground.
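
To make the asymmetry concrete, here is a minimal sketch in Python (my illustration, not part of the original text) that brute-forces the truth table for both argument forms: modus ponens comes out valid, affirming the consequent does not.

    # A brute-force truth-table check of two argument forms.
    # "Valid" means the conclusion is true in every case where all premises are true.
    from itertools import product

    def valid(premises, conclusion):
        """Return True if the argument form holds in every truth assignment."""
        for p, q in product([False, True], repeat=2):
            if all(f(p, q) for f in premises) and not conclusion(p, q):
                return False
        return True

    implies = lambda a, b: (not a) or b

    # Modus ponens: (P -> Q), P, therefore Q -- valid.
    print(valid([lambda p, q: implies(p, q), lambda p, q: p],
                lambda p, q: q))            # True

    # Affirming the consequent: (P -> Q), Q, therefore P -- invalid.
    print(valid([lambda p, q: implies(p, q), lambda p, q: q],
                lambda p, q: p))            # False (counterexample: P false, Q true)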

While the fallacy of arguing in this manner is obvious, many people, including myself, risk committing it when we look for confirming evidence for a hypothesis. We tend to work on determining what parameter values will best confirm the hypothesis, rather than determining what consequences of the hypothesis are testable.

Post hoc ergo propter hoc means "after this, therefore because of this." This fallacy is one in which cause and effect are misidentified simply because two things occur in a particular order.

Two things that occur in sequence do not necessarily make a cause-effect pair. Each might be related to a third phenomenon, or they may have occurred together by coincidence. The finest example of this fallacy is called the regression effect.
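
The regression effect is easy to see in a simulation. Here is a minimal sketch (the exam-score framing and numbers are my own, purely illustrative): scores are skill plus noise, nothing changes between two tests, and yet the worst performers on the first test "improve" on the second. Anyone who intervened between the tests would be tempted to credit the intervention.

    # Regression to the mean from pure chance: nothing changes between the two
    # tests, yet the worst performers on test 1 score closer to average on test 2.
    import random

    random.seed(1)
    n = 10000
    skill = [random.gauss(50, 10) for _ in range(n)]
    test1 = [s + random.gauss(0, 10) for s in skill]
    test2 = [s + random.gauss(0, 10) for s in skill]

    worst = sorted(range(n), key=lambda i: test1[i])[: n // 10]   # bottom 10% on test 1
    mean = lambda xs: sum(xs) / len(xs)
    print(mean([test1[i] for i in worst]))   # far below 50
    print(mean([test2[i] for i in worst]))   # closer to 50, with no intervention at all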

Why does this reasoning occur? It is insidious because it is programmed to occur automatically in all animals, including humans. In a hostile world survival depends on making connections between threats and danger, and often there is no penalty for classifying something as a danger when it is not one. For example, a dog bites into a newspaper which a sadistic newsboy has laced with a lit firecracker. After being frightened by the firecracker, the dog is forever afraid to bite newspapers again because doing so causes loud noises. Obviously, there is no penalty the dog pays for his unfounded fear of newsprint.

Related topic: Superstitions

Cum hoc ergo propter hoc means "with this, therefore because of this." Sometimes people confuse two things that occur together with a cause-effect pair. However, a statistical correlation between two items does not prove a cause-effect relationship. It may be that both are related to a third phenomenon, or it may be that they occur together coincidentally. A recent example of this, one with severe public health ramifications, was the mistaken identification of a Japanese Encephalitis outbreak in Malaysia.

Biased samples and experiments. Sometimes samples are intentionally biased in an effort to improve efficiency. The stratified sample, for instance, is just one of a group of biased sampling techniques known as importance samples. However, biased samples can lead to completely erroneous conclusions. The extreme attention paid to the selection of observations in both the Baltimore Case and the search for the Fifth Force shows how sensitive scientists are to charges of biased selection. The classic example of a biased sample leading to fallacious results is the prediction that Thomas Dewey would beat Harry Truman in the 1948 Presidential race. The surveys behind this incorrect prediction were biased by not including enough rural and middle-class people.
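
As a rough sketch of the mechanism (all the numbers below are hypothetical, invented only to illustrate the point, not taken from the 1948 surveys), a poll that under-reaches one segment of the population will miscall a race even when every individual response is recorded honestly:

    # How an unrepresentative sample skews a poll.  All numbers are hypothetical.
    import random

    random.seed(2)
    # Population: 40% rural (70% favor the incumbent), 60% urban (45% favor him).
    population = ([("rural", random.random() < 0.70) for _ in range(40000)] +
                  [("urban", random.random() < 0.45) for _ in range(60000)])

    true_support = sum(v for _, v in population) / len(population)

    # A survey that reaches rural voters only a quarter as often as urban ones.
    sample = [(g, v) for g, v in population
              if random.random() < (0.05 if g == "rural" else 0.20)]
    sampled_support = sum(v for _, v in sample) / len(sample)

    print(round(true_support, 3))     # about 0.55 -- the incumbent is actually ahead
    print(round(sampled_support, 3))  # noticeably lower -- the poll calls it wrong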

The biased sample has made its way into the lore of pathological science. Joseph B. Rhine experimented with ESP at Duke University in the 1930s. Talking about his work with the physicist Irving Langmuir, he described thousands of experiments in which the expected number of successes was 5/25 but his actual successes averaged 7/25. He claimed this indisputably proved the reality of ESP. However, he also admitted to Langmuir that he often included results not obtained by himself but submitted by reputable people. Some submissions were from people who didn't like him, he said; they made the results too low on purpose (less than 5/25), and he didn't include these results in his publications. If he had included them, the grand total might have come down to about 5/25 or so.
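
A short simulation shows how strong this selection effect is. The sketch below (mine, not Langmuir's or Rhine's) generates purely random guessing, 25 trials at a 1-in-5 chance, and then discards the low-scoring runs; the surviving average drifts well above 5/25 even though no ESP is present anywhere in the data.

    # Discarding low-scoring runs inflates a pure-chance result.
    # 25 guesses per run with a 1-in-5 chance of a hit: chance alone averages 5/25.
    import random

    random.seed(3)
    runs = [sum(random.random() < 0.2 for _ in range(25)) for _ in range(10000)]

    mean = lambda xs: sum(xs) / len(xs)
    print(mean(runs))                      # about 5.0 hits per 25
    kept = [r for r in runs if r >= 5]     # throw away the "suspect" low-scoring runs
    print(mean(kept))                      # noticeably above 5, with no ESP anywhere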

Paul Brodeur, who began much of the hysteria over microwaves and cancer, produced biased samples the likes of which bedevil discussions of the safety of electromagnetic radiation to the present day. For those who cannot see the fallacy in Brodeur's way of collecting correlations, I have included a short analogy.

Related topic: Selectivity

Not carrying arguments to their conclusion. The late Richard Feynman illuminated the role of this fallacy in the classical theory of para- and diamagnetism in The Feynman Lectures on Physics. The classical theory treated only the transient period immediately after imposing a magnetic field on a sample, and then applied the results uncritically, but incorrectly, to the long-term state of thermal equilibrium (Lectures, Vol. II, 34-6). Here is an example that doesn't take a Ph.D. in physics to understand.

Related topic: Selectivity.

Analyzing small effects. Pathological science originates most often in the analysis of small effects. I am certain this happened to Pons and Fleischmann, who thought they had discovered fusion in a test tube. They were looking at a very small effect that could have resulted entirely from noise, which is why other researchers couldn't repeat their results.

Something similar happened to Blondlot, who thought he had discovered N-rays; to Alexander Gurwitsch, who thought he had discovered mitogenic rays; and so forth. Two recent examples of analyzing small effects are the Palmdale Bulge and the Fifth Force.

Why does this error occur so often? Every experiment or measurement has some uncontrolled influence or lack of resolution that represents noise. Small samples of observations, particularly near the noise level, exhibit random excursions in amplitude, or small clusters of occurrences, that appear significant. People with a strong incentive to believe in a phenomenon have to be very cautious about assigning significance to these events, and have to guard against the tendency to cull favorable occurrences from their data for interpretation.
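
A small simulation illustrates the danger. In the sketch below (my construction, with arbitrary batch sizes), the data are pure noise, yet examining them in small batches and keeping only the striking ones yields a respectable collection of apparent "effects."

    # Pure noise, examined in small batches, regularly produces "effects" that look
    # significant if one is free to pick out the favorable batches afterward.
    import random
    import statistics

    random.seed(4)
    batches = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]

    striking = []
    for b in batches:
        m = statistics.mean(b)
        se = statistics.stdev(b) / len(b) ** 0.5
        if abs(m) > 2 * se:          # looks like a real effect at first glance
            striking.append(m)

    print(len(striking))    # typically a dozen or more "discoveries" in data that contain none
    print(striking[:3])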

Carrying more significant figures in data than the accuracy warrants is another way to become susceptible to errors of this sort. However, being credulous presents the greatest danger. Scientists have to be subjective and creative while they form theories, but then have to become merciless skeptics while they test them. Apparently this is a difficult role to assume.

Related topic: Selectivity

Circular reasoning. Circular reasoning means that someone presumes in advance the thing he or she intends to prove through the argument. You might think this is such an obvious error that no one could be ham-headed enough to commit it. However, quite bright people, including Isaac Newton himself, have engaged in it.

The fallacious reasoning about the outbreak of Japanese Encephalitis discussed under cum hoc reasoning is also an example of circular reasoning. The authorities presumed (perhaps unconsciously) that JE was the cause and then gathered data that had very little chance of rejecting the presumption. In effect they presumed their conclusion. Three other examples are the evidence gathered against Teresa Imanishi-Kari, estimating past climate from borehole temperatures, and deriving the speed of sound in the Principia. The second example is quite technical; a person who is not conversant with least squares analysis is unlikely to understand it.

Why does this error occur? The examples I have shown suggest three common circumstances. First, the offending presumption enters indirectly, as part of the study design or data reduction. This disguises the circular reasoning well enough that a person cannot recognize it easily; this is how circular reasoning appears in scientific work. Second, it occurs in long chains of reasoning or evidence. The human mind can comprehend only a limited amount of detail, and reasoning chains that are too complex cannot be seen to be circular; perhaps the courtroom is the most common arena for this. Finally, there is pressure to produce: Isaac Newton was engaged in bitter disputes with European rivals, and must have been under intense pressure to improve his results.

Data not internally consistent. Internal consistency is a subtle concept that many people do not fully understand. I gave it little attention myself until a physics professor who was not on my Ph.D. committee showed me that there was something wrong with data I was analyzing for my dissertation.

If data are not consistent internally with the premises of a study, then there is some fundamental flaw in the premises or perhaps in the data. Until the inconsistency is explained, there is no way to produce valid results.

A simple example occurred recently on The NewsHour with Jim Lehrer.

Related topic: Not carrying arguments to their logical conclusions.

Choosing a wrong model. We have no idea what things are; we can only determine how they behave. Doing so, though, requires a model, an analogy, or even a mathematical equation that describes how something works. Models are an extremely important part of making decisions and rendering judgments.

A researcher has models in mind when planning an experiment, reducing data, or analyzing results. A person walking down a city street has some subconscious model in mind that governs how to relate to other people. If these models are inappropriate, or simply wrong, then the scientist draws faulty conclusions and the pedestrian acts strangely.

Zen and the Art of Oil Explanation summarizes a very poor model of oil exploration, which a competent geologist pursued for over a decade.

Extrapolation and interpolation. Sometimes it is necessary to extend data to cover places in which no data were measured. Extrapolation is prediction outside the range of our data; interpolation is prediction within the range. Both extrapolation and interpolation can be done very badly, and both depend on having valid, complete models to interpret the data. One example of extrapolation and interpolation affecting the work we do and products we use is in assigning risks to chemical and radiation hazards.

Hidden complications. Rarely in the real world does an effect relate neatly to a single cause. More often there are messy details involved that make testing a hypothesis difficult. These hidden complications play an important part in experiments in agriculture. Fields are not uniform. They usually have fertility gradients which bias the results of experiments with chemicals, fertilizers, crops, and hybrids. For this reason field trials are often organized into Latin squares.

A Latin square is a square or rectangular plot divided into a grid of smaller squares or rectangles. Each treatment is assigned, in random order, exactly once in each row and once in each column. Analyzing results within individual rows and columns separates the hidden complications from the test (or controlled) variations.
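
For concreteness, here is a minimal sketch of how such a layout can be generated (the four treatments are hypothetical, and randomization schemes used in practice are more careful than this one):

    # A Latin square layout for a field trial.  Each treatment appears exactly once
    # in every row and every column, so a fertility gradient running across rows or
    # columns cannot systematically favor one treatment.
    import random

    treatments = ["A", "B", "C", "D"]          # e.g., four fertilizer formulations
    n = len(treatments)

    # Start from the cyclic square, then shuffle rows and columns to randomize.
    square = [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]
    random.shuffle(square)                     # permute rows
    cols = list(range(n))
    random.shuffle(cols)                       # permute columns
    square = [[row[c] for c in cols] for row in square]

    for row in square:
        print(" ".join(row))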

Confirmation. Often we hear a scientist, journalist, attorney, advocate, or politician claim that such-and-such confirms their particular point of view. However, confirmation is often a baffling and slippery concept. When do data confirm a theory? Is it possible to prove a theory? When can we honestly say that we know something is true? In effect, how does a person go about confirming a hypothesis?

The straightforward means is to gather confirming evidence. Each confirmation counts as evidence in favor of the hypothesis. Often there is so much evidence to gather that the hypothesis is never more than conditionally proved. An even greater problem is that this one-piece-at-a-time approach encourages people to form Ad Hoc explanations of unfavorable observations, and encourages the application of questionable corrections. In a seminar at the University of Utah a researcher stated, tongue in cheek, that although he felt bad about making so many favorable corrections to his measurements of background black-body radiation, he compensated by including enormous error bars in all his figures.

This straightforward approach is not always reasonable. There are pathological cases in which nothing but confirming evidence actually disproves a theory. Martin Gardner presented an example in an installment of Mathematical Games in Scientific American. His example runs something like this. Suppose you have 10 playing cards, ace through 10, shuffled randomly, and you hypothesize that in turning the ten cards over sequentially you will never turn a card over in the same order as its face value. In other words, hypothesize that you will not turn the ace over first, the 2 second, ..., or the 10 on the tenth turn. It is possible to turn the cards over in a way that confirms the hypothesis with each turn, but which disproves the hypothesis completely. For example, turn the 2 over first; that confirms the hypothesis. Turn the ace over next, and continue to turn over nine cards out of order. Each turn is a confirmation. Yet if the first nine confirming turns do not produce the 10, then the hypothesis is disproved without even making the last turn. Slippery stuff, this confirmation.
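
The example is easy to trace by hand or in a few lines of code. The sketch below walks through one ordering of the ten cards of the kind Gardner describes: nine confirmations in a row, and the hypothesis is already dead before the tenth card is turned.

    # Nine turns each confirm the hypothesis ("no card appears in the position
    # matching its value"), yet because the 10 is not among them it must land last,
    # where it matches its position, so the hypothesis is refuted before the final turn.
    deck = [2, 1, 4, 3, 6, 5, 8, 9, 7, 10]     # one ordering of ace (1) through 10

    for position, card in enumerate(deck, start=1):
        status = "confirms" if card != position else "refutes"
        print(f"turn {position:2d}: card {card:2d}  ->  {status}")
        if position == 9 and 10 not in deck[:9]:
            print("nine confirmations, but the 10 must be last: already refuted")
            break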

An alternative path toward confirmation is through the contrapositive. This is proving that all ravens are black by proving that "all non-black things are non-ravens." It sounds very clumsy, but it is a logically valid approach. Unfortunately, if the sheer volume of evidence in straightforward confirmation is daunting, the contrapositive is often worse.

In both direct confirmation and confirmation of the contrapositive, work stops the instant we find a powerful counterexample. Hypotheses are often easier to refute (reject) than to prove. In these cases the shortest course to analyzing a hypothesis is to gather evidence that rejects it. One school of philosophers, the falsificationists following Karl Popper, focuses its effort entirely on refutation. A hypothesis that cannot be refuted is not worth considering, since in their view it is untestable and unscientific. Unfortunately, hypotheses often depend on other hypotheses, called auxiliary hypotheses. Refuting evidence can actually refute one of the auxiliaries instead of the hypothesis in question. Thus, not everything the falsificationists find is the undiluted truth.

It is not valid logic to argue backward from a true conclusion to the truth of one of the premises. Yet proving a hypothesis from confirming observations is essentially this, and in scientific work it seems a reasonable way to proceed. Potential problems arise from using conclusions that are unrelated to a premise, which is simply a case of a wrong model or hidden complications. There is also the danger of using fudged evidence to affirm the consequent.

Refutation is the inverse of the confirming process. It is the inference known as modus tollens, or denying the consequent. The hypothesis under test becomes one premise in a chain of reasoning that leads to a conclusion which is provably false. Then we argue backward to show the absurdity of the hypothesis or of one of the links in the chain of reasoning.
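
Checked the same way as the earlier truth-table sketch (again my illustration), denying the consequent holds in every assignment, which is part of what makes refutation safer logical ground than confirmation:

    # Denying the consequent: (P -> Q), not Q, therefore not P -- valid in every assignment.
    from itertools import product

    implies = lambda a, b: (not a) or b

    def valid(premises, conclusion):
        return all(conclusion(p, q)
                   for p, q in product([False, True], repeat=2)
                   if all(f(p, q) for f in premises))

    print(valid([lambda p, q: implies(p, q), lambda p, q: not q],
                lambda p, q: not p))    # True: modus tollens is a valid form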

To summarize, it is possible to prove a hypothesis by examining all possible cases to see whether each confirms it. Ordinarily there are so many cases to consider that this is not practical. In contrast, there are situations in which proving the contrapositive requires fewer cases and is easier to pursue; but in most situations the number of cases here is prohibitive as well. It is always easier to refute a hypothesis by looking for a single piece of data that disproves it. Falsificationism has a lot to offer here. Yet, even in this instance, complicating factors might refute only a cartoonish version of a hypothesis, or an auxiliary to it, while the real hypothesis remains untested.

The next time you hear someone claim that such-and-such confirms a particular point of view, think about how difficult confirmation is.

Miscellaneous fallacies.

Proof by definition. Misdefined items lead people to mistaken conclusions. For example, non-profit organizations often use direct mail campaigns to solicit donations. Some will spend more than 90% of their budget in this way. Most contributors would be horrified to learn that 90% of their contributions produce more junk mail. However, accepted accounting procedures allow the organization to classify this mail as "public education." Without a clear definition of "public education," the administrators can claim that an audit of their organization proves they spend 90 cents of every dollar on education.

Convenient definitions. Sometimes people force a definition simply to have one. An example occurred recently in efforts to measure and compare the productivity of university research. The accepted definition of productivity is output per unit input. Unfortunately it is very difficult to measure the output of university research. So, a university in the Rocky Mountain States decided to measure productivity as the ratio of research funds garnered per researcher. Guess what? The university in question turned out to be the best in the United States when measured this way!

The definition used in this example is not only extremely convenient, it is backward: research funds are an input to research, not an output of it. We have to be wary of novel measures advanced by interested parties.

Irrelevance. Without any substantial argument to make, clever people sometimes make arguments that are trivially true and perhaps even irrelevant to the discussion at hand. The idea is to toss out so much information that it becomes difficult to find the relevant facts and get to the truth. Sometimes, however, people simply believe that statistics and facts have a logic of their own. Irrelevances appear in adversarial proceedings, especially in court, which prompts the opposing attorney to ask, "Relevance, your honor?"

There is a further class of irrelevance that I find especially annoying: the replacement of evidence with symbolism and gesture. For example, when one of their disciples is found to be utterly incompetent, or to have lied and deceived, it is common to hear advocates spin that this doesn't matter, that what really counts is the symbolism. Symbolism and gestures are under the control of interested parties and have no evidentiary value whatsoever. I have produced a brief table that represents the value of evidence.

Appeals to superstition and biases. I alluded to the unconscious growth of superstition through the regression effect earlier, in the section on Post Hoc fallacies. The manipulation of superstition and bias once it is in place is a time-honored tradition in the world of advocacy, trial proceedings, and politics. In a strange way the persistence of superstition, and of related evils like racism, bigotry, and intolerance, depends upon the same factors as Post Hoc thinking. If no penalty or bad consequence follows from superstitious thinking, then there is no effective means of eliminating it. Schumpeter, in Capitalism, Socialism, and Democracy, referred to responsibility and the consequences of action as sobering influences on irrational and superstitious behavior. Thus, he says, a reduced sense of reality and responsibility explains why the citizen "...expends less disciplined effort on mastering a political problem than he expends on a game of bridge." Advertisers recognize that no real consequences follow from the public believing their shameless propaganda, and this is why they engage in it so recklessly. Here the plaintiffs' bar occasionally marshals some discipline into the world, but, unfortunately, often by making its own use of superstition and prejudice.

When people are faced with decisions regarding topics in which they have no direct knowledge or expertise, they typically fall back upon prejudice and superstition. Lack of any penalty only encourages such behavior.

Knowledge and Belief

I have no desire to become entangled in a philosophical debate about when and how a person actually knows something. I can stand very little "theory of knowledge" theorizing. Richard Rorty has it right that the best thing to do with a "theory of knowledge" project is to abandon it. Knowledge is like obscenity: I'm hard-pressed to define it, but I know it when I see it. For those of you interested in wandering in this vacuum for a while, you may refer to some extensive philosophy lecture notes regarding the theory of knowledge problem.

However, I am interested in the idea that knowledge follows from justified true belief or justified true acceptance. A web author producing the dictionary of the mind refers to a Gettier problem as a counterexample to justified true belief. The examples of Gettier problems that he offers, and others I have read, are contrived, however. Better examples come from case histories, and particularly from the field of computer programming. Gettier problems often occur when trying to track down and fix "bugs" in software. Anyone who has ever grappled with a difficult bug fix will recognize this example.

Gettier-esque software bugs occur commonly in the development stage of software, when there are still many bugs in the code whose influences overlap. The bugs interact with one another. Suppose someone has reported a bug to me. I believe or accept this bug report because I am able to reproduce the strange behavior in the program. Because I understand both how the bug behaves and how the software is constructed, I can identify where the offending code is located. Sure enough, when I examine the relevant part of the code I find some incorrect logic that explains the bug. Now I am justified in my knowledge about the cause of the bug, because a consequence of my chain of reasoning explains its occurrence, or it "fits in" with my conception of the program and the bug. I fix the code. Now I am completely justified in believing that what I fixed actually fixed the problem. However, unless I subject the code to a rigorous test, I am in danger of making a Gettier error. The flaw I fixed was a genuine problem, but if many bugs overlap and interact it is possible, in fact likely, that I did not fix the reported bug. I would release the patched code only to have users report the same bug again.
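
Here is a contrived but concrete sketch of the situation (my own toy example, not drawn from any real project). The function contains two independent defects whose effects overlap; finding and repairing one of them is perfectly justified, yet the reported symptom survives the fix:

    # A Gettier-like bug hunt: two defects overlap in one function, so a justified,
    # genuine fix to one of them still leaves the reported symptom in place.

    def average_buggy(xs):
        total = 0
        for i in range(len(xs) - 1):    # defect 1: silently skips the last element
            total += xs[i]
        return total / (len(xs) - 1)    # defect 2: divides by the wrong count

    def average_patched(xs):
        total = 0
        for i in range(len(xs) - 1):    # defect 1 is still here...
            total += xs[i]
        return total / len(xs)          # ...even though defect 2 is genuinely fixed

    report = [2, 4, 6]                  # user reports: "the average comes out wrong"
    print(average_buggy(report))        # 3.0 instead of 4.0 -- bug reproduced
    print(average_patched(report))      # 2.0 -- a real defect was fixed, symptom remains

Only a test that exercises the whole behavior, rather than the repaired line alone, reveals that my justified belief did not amount to knowledge.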

Thus my fix fails, and as the bug recurs I puzzle over the code that I have just fixed. Perhaps, for a time, I may even continue to alter this part of the code, and recede farther from fixing the true error.

I have observed this same problem when identifying faults in machinery or systems. It occurs most often when control of the machine flows through parallel structures in hardware or software. Thus the path away from Gettier mistakes is the ability to distinguish between alternative possible causes, and actually to test for these alternatives in order to exclude them. The trouble is, how does one know when all the alternatives have been tested?

In engineering we subject systems to failure mode effects analysis (FMEA) to get a handle on the complexity of the problem and to identify the possible alternatives. Yet I doubt that we ever progress much beyond a preponderance-of-evidence standard regarding the correctness of our engineered systems.
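
As an illustration of what such an analysis looks like in miniature (the failure modes and scores below are invented for the example), FMEA commonly ranks failure modes by a risk priority number, the product of severity, occurrence, and detection scores:

    # A miniature failure mode effects analysis table with illustrative values.
    # The conventional risk priority number (RPN) is severity x occurrence x detection,
    # each scored 1 (best) to 10 (worst); high-RPN items get attention first.
    failure_modes = [
        # (failure mode,                  severity, occurrence, detection)
        ("sensor reads out of range",            7,          4,         3),
        ("relay contacts weld shut",             9,          2,         6),
        ("firmware watchdog never fires",        8,          3,         8),
    ]

    ranked = sorted(failure_modes, key=lambda m: m[1] * m[2] * m[3], reverse=True)
    for name, s, o, d in ranked:
        print(f"RPN {s * o * d:4d}  {name}")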