But if they do, they will also have to deny the following rather plausible principle about the value of knowledge :

Knowledge is Valuable (KIV): A rational person should never prefer to merely truly believe that p, rather than to know that p.*

Now, consider this lottery. Again, there is a lottery with n tickets where one and only one ticket will win. Each lottery ticket comes with a bar code that encodes whether the ticket won or lost. You receive ticket i.

Now consider the following two situations. In the first situation, you receive ticket i and you know the value of n. So you know you have a 1/n chance of winning. You believe that you will lose, and, in fact, the ticket is a loser. Do you therefore know that you lose? Again, most people say no.

In the second situation, you receive ticket i and are given a lottery ticket detector. The detector reads the bar code and outputs “winner” or “loser.” The detector is very, very reliable and is indeed more reliable than any other faculty (perceptual or otherwise) you have of gaining knowledge. Let us say that it has a probability of p of correctly outputting “winner” or “loser” when reading the bar code, and p is very high (but less than 1-1/n : we are talking about a truly massive lottery here). You don’t know how many tickets are in the lottery. You receive ticket i, your lottery ticket detector outputs “loser” and you believe that your ticket loses. Do you therefore know that you lose? I find it incredibly implausible that the answer here is no (note that the reasons people standardly give against knowledge in the normal lottery ticket do not apply, your belief here is (very) sensitive to the truth of what you believe, etc). At least, I find it very implausible for anyone who isn’t a radical skeptic.

But note that you should find it preferable to be in the first situation instead of the second. In the first situation, your credence that the ticket loses given that it is an n-ticket fair lottery plausibly should be 1-1/n. In the second situation, your credence that the ticket loses given a lottery ticket detector reading of “loser” plausibly should be p, which is stipulated to be *less* than 1-1/n.** In other words, your rational credence that you lose should be higher in the know-the-number-of-tickets situation than in the lottery-ticket-detector situation. It certainly seems like you are better informed about your chances of winning in the first situation than in the second, and so you should prefer the first situation to the second. But if the opponents of knowledge in the normal lottery case are correct, this means you should prefer merely truly believing that you lost to knowing that you lost. Someone who denies that you know in the lottery case therefore seems to have to either embrace skepticism or reject KIV.
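To make the comparison concrete, here is a minimal sketch with made-up numbers. The values of `n` and `p` are assumptions chosen only so that p is less than 1-1/n, as the example stipulates; nothing hangs on the particular figures.

```python
from fractions import Fraction


def credence_from_ticket_count(n: int) -> Fraction:
    """Credence that a given ticket loses in a fair n-ticket lottery with exactly one winner."""
    return 1 - Fraction(1, n)


# Hypothetical values: a billion-ticket lottery and a detector that is right
# 999,999 times in a million. Both are illustrative assumptions.
n = 1_000_000_000
p = Fraction(999_999, 1_000_000)

know_n = credence_from_ticket_count(n)  # first situation: you know n
detector = p                            # second situation: you only have the reading

print(float(know_n), float(detector))
print("first situation gives the higher credence:", know_n > detector)
```

Exact fractions are used so that the comparison is not blurred by floating-point rounding when 1/n is tiny.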

Here is another example : the concerned doctor. The concerned doctor is always worried about each new disease he reads about in the medical literature. To his horror, he reads about a new disease X that has just been discovered. It seems that people always catch X before they are 30, and never show any (macro) symptoms whatsoever until they are 40. But X is quite painful, and the doctor, who is 35, is therefore worried. Now consider the following two situations :

(1) The doctor reads in the article describing the disease that (amazingly) the researchers have also conclusively demonstrated that 1 in n people between the ages of 30 and 40 have this disease. The value of 1/n is vanishingly low, so the doctor is relieved, believes he does not have the disease, and (as a matter of fact) he doesn’t.

(2) The doctor reads in the article that there is a test for this disease that tells you whether you have the disease with probability p (and, again, p is very high). The doctor orders up the test, administers it to himself, and it comes up negative. The doctor is relieved, believes he does not have the disease, and (as a matter of fact) he doesn’t.

Now, similarly to before, let 1-1/n be greater than p. It seems that anyone who rejects knowledge in lottery cases will also reject that the doctor knows he is disease free in (1). It also seems that, if anyone knows anything, the doctor knows he is disease free in (2). But, again, shouldn’t the doctor be *more* relieved in (1) than in (2), since 1-1/n is higher than p? It seems that the doctor has more reason to think he is disease free in (1) than in (2) and hence should prefer (1) to (2). But, of course, this again violates KIV. The lesson here, it appears, is that sometimes it is more useful to know the base rate (alone) than it is to know the result of a test (alone).

Of course, in both cases we should prefer to know both the number of lottery tickets/base rate for the disease as well as the detector result/test result. But if we aren’t in that position, it seems like we should prefer the lottery ticket/base rate-derived belief over the detector/test result belief. So, again, we get a failure of KIV. Maybe that isn’t that big a deal : maybe we do have other reasons to reject KIV. But it is hard to see how a *weaker* version of KIV could be formulated that would avoid the above considerations. If so, then maybe we should be wondering just what is so special about knowledge after all (i.e., whether there is anything valuable about knowledge intrinsically, rather than just derivatively) or maybe we need to give up the (fairly reasonable) claim that we don’t know in lottery type cases.
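Here is a rough sketch of why having both pieces of information beats having either alone. It assumes, beyond anything stated above, that the test is right with probability p whether or not you have the disease (so p is both its sensitivity and its specificity), and the numbers are invented for illustration.

```python
from fractions import Fraction


def posterior_healthy(base_rate: Fraction, p: Fraction) -> Fraction:
    """Pr(no disease | negative test), by Bayes' theorem.

    Assumes the test is correct with probability p for sick and healthy people alike.
    """
    healthy = 1 - base_rate
    true_negative = p * healthy            # healthy and the test says negative
    false_negative = (1 - p) * base_rate   # sick but the test says negative
    return true_negative / (true_negative + false_negative)


# Hypothetical numbers: 1 in a million has the disease, test is 99.9% reliable.
base_rate = Fraction(1, 1_000_000)
p = Fraction(999, 1000)

base_rate_only = 1 - base_rate   # credence from the base rate alone, as in (1)
test_only = p                    # credence from the test alone, as in (2)
both = posterior_healthy(base_rate, p)

print(float(test_only), float(base_rate_only), float(both))
```

With these figures the ordering is test alone, then base rate alone, then both together, which matches the ranking of the situations given above.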

*Now, this may be too strong and so not that plausible. Potential counterexamples might involve situations where your knowledge may undermine other inferences you should make, where your merely truly believing something wouldn’t also undermine those other inferences. If so, then you can revise the Knowledge is Valuable premise with the proviso that it only applies when your degree of belief when you know is no higher than your degree of belief in what you merely truly believe. I think that weakening takes care of any counterexamples I can think of to KIV, and the lottery/detector example still works.

**I say plausibly because it may be that we shouldn’t have credences in those cases. A credence of 1-1/n for the claim that we lose seems to follow from the type of principle of indifference defended by Adam Elga in his Sleeping Beauty paper and Nick Bostrom in his Simulation paper. But principles of indifference often are pretty flaky, so maybe we shouldn’t have a credence of 1-1/n. Similarly, maybe we shouldn’t have a credence in the detector case because we don’t know “the base rate” of losing tickets and so can’t really compute the conditional probability of “ticket i loses given that detector reads ‘loser’.” I’m not sure what I think about either of these objections, but even if we don’t have a proper probability that we can compute, there is still an intuitive sense where knowing the number of tickets is better than knowing the detector result.

]]>

Pr(H|E)=Pr(E|H)*Pr(H)/Pr(E)=1*Pr(H)/.999. This is going to be very nearly Pr(H). And so we might think that the evidence *barely* confirms the hypothesis. This will be bad, because we do have cases of old evidence where we were quite confident in that old evidence, and where that old evidence *strongly* confirmed the hypothesis (again, this is taken to be the case for Mercury and GTR). And, on some measures of confirmation – measures that define the *degree* of confirmation of a hypothesis by some evidence – the evidence will indeed barely confirm the hypothesis. So now we have a quantitative problem of old evidence : how can old evidence that we are quite sure about strongly confirm a hypothesis?

Here are three of the most plausible measures of confirmation:

The difference measure : c(H,E)=Pr(H|E)-Pr(H)

The ratio measure : c(H,E)=log(Pr(H|E)/Pr(H))

The likelihood ratio measure : c(H,E)=log(Pr(E|H)/Pr(E|~H))

(The ratio and likelihood ratio measure are given in terms of logs in order to normalize them such that they give a value greater than 0 just in case Pr(H|E)>Pr(H)).

So, let’s see what happens for them when the probability of the evidence is high (=.999) and the hypothesis entails the evidence :

difference measure = Pr(H|E)-Pr(H) = Pr(E|H)*Pr(H)/Pr(E)-Pr(H)=Pr(H)/.999-Pr(H)=Pr(H)(1/.999-1)=Pr(H)*(.001001…). This value, obviously, will be quite low, and will not count as “strongly confirmed” on any satisfactory definition of “strongly confirmed” in terms of the difference measure.

ratio measure = log(Pr(H|E)/Pr(H)) = log(Pr(E|H)/Pr(E)) = log(1/.999). This value will be exceedingly small no matter your base for the logarithm, and will not count as “strongly confirmed” on any satisfactory definition of “strongly confirmed” in terms of the ratio measure.

likelihood ratio measure = log(Pr(E|H)/Pr(E|~H))=log(1/Pr(E|~H))=log(1/[Pr(~H|E)*Pr(E)/Pr(~H)])=log(Pr(~H)/[Pr(~H|E)*.999]). This value can be as high as we like, as long as Pr(~H) is sufficiently larger than Pr(~H|E). In other words, as long as the old evidence we are quite certain about strongly *disconfirms* the *negation* of the hypothesis under consideration, then we will get strong confirmation from the log-likelihood ratio measure, even for very certain evidence.
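The three calculations can be checked numerically. The following is only a sketch: the function name and the sample priors for H are my own illustrative assumptions, and Pr(E|~H) is recovered from the law of total probability rather than supplied independently.

```python
import math


def confirmation_measures(pr_h: float, pr_e: float, pr_e_given_h: float = 1.0) -> dict:
    """Difference, log-ratio, and log-likelihood-ratio measures of confirmation.

    Requires pr_e_given_h * pr_h < pr_e so that Pr(E|~H) is positive.
    """
    pr_h_given_e = pr_e_given_h * pr_h / pr_e  # Bayes' theorem
    # Law of total probability: Pr(E) = Pr(E|H)Pr(H) + Pr(E|~H)Pr(~H)
    pr_e_given_not_h = (pr_e - pr_e_given_h * pr_h) / (1 - pr_h)
    return {
        "difference": pr_h_given_e - pr_h,
        "ratio": math.log(pr_h_given_e / pr_h),
        "likelihood_ratio": math.log(pr_e_given_h / pr_e_given_not_h),
    }


# H entails E, and the old evidence has probability .999.
# The priors for H below are hypothetical sample values.
for prior in (0.5, 0.998):
    print(prior, confirmation_measures(pr_h=prior, pr_e=0.999))
```

Under these constraints Pr(E|~H) = (Pr(E)-Pr(H))/(1-Pr(H)), so the likelihood ratio measure varies with the prior of H, whereas the ratio measure stays fixed at log(1/.999) whatever the prior.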

And a little reflection shows us this is more or less what happens in the case of the perihelion advance of Mercury for GTR, and is exactly the condition when old evidence should strongly confirm a hypothesis. Take the second point first :

We obviously don’t want any highly certain piece of old evidence to strongly confirm any hypothesis that predicts it. Let’s say we are highly certain that the sun has appeared to rise and set at certain times in the past. Now let’s say we are considering a bold new hypothesis that the earth is, in fact, flat and rests on the back of a turtle swimming in an ocean, and two other turtles are swimming along on each side of him and lazily playing catch with the sun by tossing it back and forth at specified intervals. We obviously don’t want our highly certain evidence of the sun appearing to rise and set to strongly confirm this theory, even though this theory predicts it.

Now consider a relatively more plausible case. GTR, we know, “reduces” to classical physics in low-energy cases in the sense that it recovers Newton’s predictions. But we don’t say, and don’t want to say, that GTR’s correct prediction of, say, the orbit of Mars strongly confirms GTR, because we have another theory (Newton’s) that also predicts this. In other words, if we have an alternative theory that accords with our old evidence, then that old evidence doesn’t strongly confirm our new theory. And this is exactly what the log-likelihood ratio measure says. The negation of H (here, H is GTR) can be thought of as a disjunction of each member of a partition of theories (say, our theories of gravity). Newton’s theory will be a member of that partition. So a well-supported disjunct of ~H will entail E (when E is something like the orbit of Mars), and Pr(~H|E) won’t be “much smaller” than Pr(~H). So you won’t get strong confirmation according to the likelihood ratio measure. The likelihood ratio measure only provides strong confirmation for highly certain old evidence when there is not an alternative theory we have which also accounts for that evidence, which is exactly what we want. I’m glossing over some important questions and details here, but whatever. The likelihood ratio measure seems to provide strong confirmation in exactly the scenarios we want it to : when there isn’t an alternative theory that also predicts that old evidence.

And this, for instance, is the scenario we find ourselves in with the famous advance of the perihelion of Mercury. Newton’s classical mechanics and gravitational theory were unable to correctly predict this advance. The observation of the advance was pretty well known prior to Einstein’s formulation of GTR. So we can say that the old evidence strongly disconfirmed classical physics (in that Pr(Classical|Perihelion) was relatively low), but the probability of classical physics was relatively high (it was strongly confirmed in many other areas). So we have a case where, plausibly, Pr(~H) was a good deal higher than Pr(~H|E), which gives us strong confirmation for GTR by the advance of the perihelion of Mercury by the likelihood ratio measure, just as we want (and, I’m emphasizing again, I’m ignoring important questions and details, such as that classical physics + GTR don’t form a partition).

So, the likelihood ratio measure solves the quantitative problem of old evidence. It provides strong confirmation just in those cases we want it to : when we have a relatively well-confirmed alternative theory that is inconsistent with some piece of old evidence, and that our new theory correctly predicts. And it allows us to see how GTR was strongly confirmed by the advance of the perihelion of Mercury. So using the likelihood ratio measure as our measure of confirmation solves the quantitative problem of old evidence, and Bayesian confirmation theory is safe from the attacks of Glymour and Earman.

]]>

Ok, for Bayesians, confirmation is defined for a hypothesis and some evidence as so :

Pr(H|E)>Pr(H).

That is, if the conditional probability of the hypothesis given the evidence is greater than the prior probability of the hypothesis, then the evidence confirms the hypothesis. But, Glymour pointed out, we have a problem when the probability of the evidence is equal to 1, for theories which deductively entail that piece of evidence. Since the theory deductively entails the evidence, Pr(E|H)=1, and we also have Pr(E)=1, so we have :

Pr(H|E)=Pr(E|H)*Pr(H)/Pr(E)=1*Pr(H)/1=Pr(H). Thus, Pr(H|E) is not greater than Pr(H), and we don’t have confirmation. The problem is that, in traditional Bayesian thought, we learn some evidence when our certainty in that evidence goes to 1. At that point, we update our probabilities in the rest of what we believe by conditionalization, which proceeds as so : our new probability, Pnew, in the hypothesis is equal to Pnew(H)=Pold(H|E), where Pold(-|-) is our probability function before becoming certain of E. And this makes sense : if we become certain of E, then we should take it to be true and think what our probabilities should be for everything else given that E is true.

But, if this is so, then evidence we have already learned can never confirm any hypothesis. This is because if we have already learned the evidence, then our probability in that evidence will have already gone to 1. So, as was pointed out above, Pr(H|E) will equal Pr(H), and evidence we have already learned will not confirm any hypothesis. But this is bad, because “old evidence” – evidence we have already learned – does seem to confirm hypotheses : the old evidence of the advance of Mercury’s perihelion, for a classic example, confirmed GTR.

Several answers have been proposed, including simply using the old probability for our evidence before we learned it, or allowing Pr(E|H) to be less than 1 even when H entails E, and thus modeling confirmation by old evidence to proceed by “learning” that H deductively entails E. Neither option really succeeds. But one option readily does : ditch the strict conditionalization model provided above, and allow learning even when we do not become certain of the evidence. If this is so, then some evidence can be old, but still have a probability less than 1, and if it does then Pr(H|E)=Pr(E|H)*Pr(H)/Pr(E)=Pr(H)/Pr(E). This will always be greater than Pr(H) when Pr(E) is less than 1, so confirmation occurs. Success, the problem of old evidence goes away! Plus, there are many, many reasons to give up strict conditionalization :

It is dogmatic, for one : once your probability in some evidence goes to 1, there is no Bayesian mechanism for it to ever decrease. But surely sometimes you will have reason to become less certain of lots of ordinary, every-day evidence you were once certain of. There is no way to model this via strict conditionalization.

It is archaic, for another : it presumes an old-fashioned version of foundationalism that is no longer all that popular, even among foundationalists. However it gets spelled out, it usually requires there to be basic, perceptual beliefs that we can become absolutely certain of via our sense perceptions. But what beliefs are those?

It disrespects skepticism, for another : even if we think the skeptical challenge can be answered, there is a sense in which skeptical scenarios are live possibilities. We hope to provide reasons to think we don’t live in such a skeptical scenario, perhaps, but (few of us) think we can show that all such scenarios aren’t even possible. Now, it is tricky (and probably wrong) to equate probability 0 with “impossible.” But there is certainly a sense in which anything we assign a probability of 0 to is not a “live” possibility for us, and if we disregard problems with scenarios in which we have to decide between uncountably many possibilities, we can get away with equating probability 0 and impossibility. But if we do so, then we are counting every skeptical hypothesis as impossible. This is because if our probability in some evidence is 1, then our probability in the negation of the evidence is 0. But the negation of our evidence will be true in every (or almost every) skeptical hypothesis. So, by assigning a probability of 1 to our evidence, we are assigning a probability of 0 to every skeptical hypothesis.

And there are more such problems with requiring a probability of 1 to be applied to our evidence. We also have formal Bayesian methods (basically, Jeffrey conditionalization, but also some others) to update our beliefs when we “learn” some piece of evidence, but with less than certainty. So that seems the right path to go on independent grounds, and if we take it, then the problem of old evidence never arises, even from the start. Unfortunately, a new problem of old evidence lurks right around the corner.
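For readers who haven’t seen it, Jeffrey conditionalization replaces Pnew(H)=Pold(H|E) with a weighted average over E and ~E. Here is a minimal sketch; the function name and the sample probabilities are illustrative assumptions, not anything from the literature.

```python
def jeffrey_update(p_h_given_e: float, p_h_given_not_e: float, new_p_e: float) -> float:
    """Jeffrey conditionalization on the partition {E, ~E}:

    Pnew(H) = Pold(H|E) * Pnew(E) + Pold(H|~E) * Pnew(~E)
    """
    return p_h_given_e * new_p_e + p_h_given_not_e * (1 - new_p_e)


# Hypothetical old probabilities, with H entailing E (so Pold(H|~E) = 0).
old_p_h, old_p_e = 0.3, 0.6
old_p_h_given_e = old_p_h / old_p_e  # Bayes' theorem with Pr(E|H) = 1

# Learning E with less than certainty: our probability in E rises to .999, not 1.
print(jeffrey_update(old_p_h_given_e, 0.0, new_p_e=0.999))

# Strict conditionalization is the special case where Pnew(E) = 1.
print(jeffrey_update(old_p_h_given_e, 0.0, new_p_e=1.0))
```

Because the new probability of E stops short of 1, it can later go back down, which is exactly what strict conditionalization rules out.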

]]>

On the betting interpretation of our degrees of belief, the presence of such Dutch Books presents a reason for our degrees of belief (alternatively, credences) to be probabilities. If our credences in a proposition are defined as the prices we would be willing to pay for a (fair) bet on the truth of that proposition, or if our credences are measured by those prices, then a theorem due to de Finetti shows that if our credences are not (at least finitely-additive) probabilities, then a dutch book can be constructed against us. A converse theorem due to (I believe) Skyrms shows that if our credences are probabilities, then no such dutch book can be constructed against us. That is, if our credences are the prices of the bets we would view as fair, and we were to bet at those prices, then dutch books can be constructed against us if our credences are not probabilities, and we can avoid dutch books if they are probabilities. The betting interpretation of credences thus provides an easy way of establishing our credences as probabilities, allowing us to demonstrate various results in formal epistemology about our credences, and motivating the subjective, personalist interpretation of probability.
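A toy dutch book shows how the sure loss arises. This is a sketch under simple assumptions of my own: the agent has credences over a proposition A and its negation, treats credence times stake as the fair price of a bet paying the stake if the proposition is true, and buys both bets. The credence values are invented.

```python
def agent_net(credences: dict[str, float], true_prop: str, stake: float = 1.0) -> float:
    """Agent's net payoff after buying one bet on each proposition at price credence * stake.

    Each bet pays `stake` if its proposition is true and nothing otherwise.
    """
    total_price = sum(c * stake for c in credences.values())
    winnings = sum(stake for prop in credences if prop == true_prop)
    return winnings - total_price


# Credences in A and not-A that sum to 1.2, so they are not probabilities.
incoherent = {"A": 0.6, "not A": 0.6}
# Credences that sum to 1, as probabilities over A and not-A must.
coherent = {"A": 0.6, "not A": 0.4}

for outcome in ("A", "not A"):
    print(outcome, agent_net(incoherent, outcome), agent_net(coherent, outcome))
```

The incoherent agent loses about 0.2 whichever way A turns out, while the coherent agent breaks even on this pair of bets. If the credences summed to less than 1, the bookie would buy the bets from the agent rather than sell them.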

However, the betting interpretation itself runs into familiar problems. First, there are agents who may have fundamental reasons not to make bets of any kind. A religious person whose doctrine forbids gambling may refuse to make any bets whatsoever. The betting interpretation will thus deliver the result that the person assigns a credence of 0 to every proposition, despite the obvious possibility that the religious person may well have positive credences for many propositions. Second, there is no obvious reason that a rational person should be willing to bet on any and all propositions for which they have credences. Something is clearly right about the betting interpretation, but it can’t be that our credences are, or are measured by, our actual betting behavior (or actual dispositions to bet, or whatever).

In light of these difficulties, Alan Hajek and Lina Eriksson propose taking the concept of degree of belief to be a primitive concept that is not in need of, or perhaps cannot be, analyzed into some more basic notion. While their discussion has a certain appeal, I think a better approach is simply to move from analyzing our credences as our actual betting behavior to merely what we would view as a fair bet. A fair bet is defined as one where each side of the bet (for A, or against A) is equivalent in terms of expected value. We view a bet as a fair bet simply if we would be indifferent to either side of the bet. If we do this, then the objections raised above to the betting interpretation of degrees of belief fall away. After all, even if a person is religious, or risk averse, or for whatever other reason does not want to engage in various bets, there is no reason to suppose they may not still view bets as fair or not. They could view a bet as fair, without any corresponding behavior or disposition to bet on their part.

If we move to what we would view as a fair bet, rather than defining an agent as actually disposed to making bets on what they view as fair, then it might appear as if we lose our dutch book justification for having our credences be probabilities. After all, if we give up the traditional betting interpretation’s assumption that we would take any bet we view as fair, then even if a dutch book could be constructed against us there is no longer the result that we would end up with a sure loss (precisely because we would no longer automatically make this bet).

However, there still might be a dutch-book related argument for requiring our credences to be probabilities. Henry Kyburg argued that dutch books would never present a problem for a rational agent, because the rational agent could deduce from the dutch book alone that it resulted in a sure loss, and so would never accept that bet. For the purposes of this post, the rational agent could deduce that the bet was *not* fair, because it resulted in a sure loss. However, the nature of a dutch book is that every individual bet is one that the agent *would* view as fair. As a result, if a dutch book could be constructed against an agent, then the rational agent would think *both* that the dutch book was fair (because it was a combination of individual bets, each of which was fair) and that the dutch book was not fair (because the agent could deduce that accepting one side resulted in a sure loss, while accepting the other side resulted in a sure gain).

Given de Finetti’s theorem, then, if our credences are not probabilities then the agent would be in a position to believe a contradiction (that the bet was fair and was not fair). Given Skyrms’ converse theorem, if our credences are probabilities, then the agent would be safe from this contradiction. Given the paucity of reasons for our credences to not be probabilities – there may be reasons not to reveal our credences as probabilities, and to represent them non-probabilistically, but no reasons I know of for them not to be probabilities – these considerations should be sufficient to establish that our credences should be probabilities even without the sure-loss result of traditional dutch book arguments. So we can still have that our credences should be probabilities, even given the failures of the traditional betting interpretation of probabilities, so long as our credences are defined as/measured by what we would view as fair bets.

]]>

Now, whether or not it is a fallacy – as Peter Lewis (apparently falsely) claims – the argument form itself is kind of interesting. Let’s just call an argument which infers from repeated past failures to current (and perhaps future) skepticism of success a “pessimistic induction.” As Errol and Adam might recognize, Timothy Williamson makes a somewhat similar pessimistic induction (although he doesn’t call it this) in *Knowledge and its Limits* regarding an analysis of knowledge in terms of truth, belief and any other condition(s). This partly motivates his claim that no such analysis is possible – knowledge instead is semantically unanalyzable. John Norton, in “Causation as Folk Science,” makes another pessimistic induction on our past failures at explicating any adequate notion of causation. This motivates his seemingly correct relegation of (scientific) causation to the status of a folk science without any “fundamental” reality not derivative from acausal scientific theories (much as Newtonian gravitational force can be recovered from general relativity although gravity is not a Newtonian force in general relativity).

I’m sure there are many other such examples of a “pessimistic induction” out there, even if the one against scientific realism is the one that is associated with the actual phrase. So, this leads to some different questions. First, what degree of support should such arguments lend to their conclusion? Presumably they cannot be decisive or else the history of philosophy should probably lead us to abandon philosophical inquiry about any given topic. But in terms of establishing the burden of proof (for instance), they seem like they work. Second, what other examples of such “pessimistic inductions” can everyone think of? Any particularly good ones? Finally, if anyone has read those articles or *Knowledge and its Limits* did you think the pessimistic inductions that were employed in them worked for their respective aims?

The Laudan article is from *Philosophy of Science*, Vol. 48, No. 1 (Mar., 1981), pp. 19-49. The Peter Lewis article is from *Synthese* 129, pp. 371-380, 2001. The Norton article is from *Philosophers’ Imprint*, Vol. 3, No. 4 (Nov., 2003).

]]>

]]>