
The Lottery Paradox

Published on December 11, 2020

It's an article of faith that entering a lottery is a losing proposition.

Yet suppose a lottery were held in which participants pick a number from \(1\) to \(10,000\), the town keeps ten percent of ticket sales, and the rest is split amongst those picking the right number. Assume 15,000 people each pay one dollar for one ticket, and they choose their numbers randomly, displaying no preference for one number over another.

Then along comes Mary. She buys \(10,000\) different tickets. She writes \(1\) on the first ticket, \(2\) on the second, all the way to \(10,000\). On average, Mary will win. If this lottery is repeated many times, she will average a handy 16% return. 
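The 16% figure can be checked in a few lines. This is a minimal sketch that assumes, as in the setup above, that each random buyer picks uniformly, so the number of them holding the winning number is binomial. The function name `mary_return` and its parameters are illustrative labels, not anything from a library.

```python
def mary_return(n_numbers=10_000, n_random=15_000, rake=0.10):
    """Mary's expected return when she buys one ticket on every number."""
    p = 1.0 / n_numbers                              # chance a given random buyer holds the winner
    prize = (1.0 - rake) * (n_numbers + n_random)    # ticket sales less the town's cut
    # Exact identity for N ~ Binomial(n, p): E[1/(1+N)] = (1 - (1-p)^(n+1)) / ((n+1) p)
    expected_share = (1.0 - (1.0 - p) ** (n_random + 1)) / ((n_random + 1) * p)
    return prize * expected_share / n_numbers - 1.0  # Mary's outlay is n_numbers dollars


print(f"Mary's expected return: {mary_return():.1%}")
```

With the default arguments the result comes out a little above 16%, and with no random buyers at all it falls back to the -10% one would expect from the rake alone.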



I recently gave a talk at MIT on a use of this paradox - namely the elicitation of distributional estimates for Machine Learning. Previously I had realized that few people are aware of the possibility of excess return in a lottery, despite its publication (see article by Steven Moffitt, which I myself had missed) and proximity to Jensen's Inequality. This was true even for people designing algorithms that compete in something similar to a lottery (if you don't know what I'm talking about, poke around this site). 

So in this blog post I will first cover Mary's curious wealth creation, as a mathematical artifact, before inviting you to consider the application to real-time prediction. As an academic note, if you would like a technical treatment with more results, I'd refer you to Moffitt's papers. He is an erstwhile lottery arbitrageur and surely knows this better than I do. 

There are two other observations I'm making here that I haven't thoroughly researched for originality or otherwise. One is the paradox of indifference. The other is an interpretation of Kullback Leibler distance as excess return in a lottery. I'm not sure if those have been noticed before - let me know if you have leads. As a further aside, here are the slides and a video, if you prefer either of those formats to a blog.  



Positive Returns in a Negative Expectation Game?

Mary makes money! 

To be clear, this is a statistical arbitrage and not an arbitrage. There is no hard and fast guarantee that Mary will make back her $10,000 investment every time. On the contrary, she will lose if forced to share the prize with more than one other person. Also of note, the conditions must be right. If there are too many participants, her scheme fails due to the town take. 

But setting aside these concerns, lotteries such as this can be defeated. Here is a plot of Mary's average return as a function of the number of random ticket buyers. It is perhaps counter-intuitive that by buying lottery tickets, each one of which is perceived as an investment with a -10% return, one can wind up making a (positive) 16% return. This is the "lottery paradox," as I shall term it, though we will use lowercase as this isn't a standard naming.
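The shape of that curve can be tabulated directly. The sketch below reuses the binomial assumption from the setup (uniform random picks, one dollar tickets, a 10% rake); `expected_return` and the particular crowd sizes are illustrative choices.

```python
def expected_return(n_random, n_numbers=10_000, rake=0.10):
    """Mary's expected return against n_random uniformly random ticket buyers."""
    p = 1.0 / n_numbers
    prize = (1.0 - rake) * (n_numbers + n_random)
    # E[1/(1+N)] for N ~ Binomial(n_random, p), in closed form
    expected_share = (1.0 - (1.0 - p) ** (n_random + 1)) / ((n_random + 1) * p)
    return prize * expected_share / n_numbers - 1.0


crowd_sizes = [0, 1_000, 2_500, 5_000, 15_000, 30_000, 60_000, 100_000, 200_000]
returns = {n: expected_return(n) for n in crowd_sizes}
for n, r in returns.items():
    print(f"{n:>7,} random buyers: {r:+.1%}")
```

Under these assumptions the return is negative when the crowd is very small (there is too little money to win) and negative again when it is very large (the winning number is shared too many ways to cover the rake), with a positive band in between.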


This post considers the resolution of this paradox and its role in strategy for a continuous generalization of a lottery - one used to solicit distributional forecasts for some quantity of interest. (You may be surprised to learn that tens of thousands of these continuous lotteries occur every day, as algorithms compete to predict electricity, emergency room wait times, and many other live data sources). 

I should start, however, by mentioning that there is an entirely different (uppercase) Lottery Paradox, attributed to Henry E. Kyburg, Jr. That one is really an absurdity, not a paradox. Kyburg pointed out that if one's philosophy cannot discern the difference between a tiny chance of winning a lottery and the certainty of not winning, then we might be forced to conclude that nobody ever wins lotteries. (If that doesn't sound profound, notice that Kyburg's paradox counters arguments along the lines of "you shouldn't ascribe probabilities less than one percent because ... blah blah, model uncertainty, blah blah" and is therefore interesting - though for another day). 


Turning to the resolution of our (lowercase) lottery paradox, let us first consider a simpler setup. We replace the 10,000 possible choices with just two, labelled heads and tails, and the 15,000 random buyers with just two people, Alice and Bob. Both Alice and Bob flip a coin to make their choice. Mary buys both heads and tails. To further simplify, let us imagine that there is no 10% rake. It is a zero sum game. 

We can enumerate the possibilities. It is clear that if Alice and Bob both choose heads then, on average, Mary will profit. For in this scenario, she scoops the pool (for a net gain of 1/2 the pool) half the time and otherwise, she collects 1/3 (a net loss of 1/6 of the pool). Thus, she is well ahead. 

On the other hand, if Alice and Bob are lucky enough to choose different outcomes, then nobody wins on average.

Overall, Mary is winning. It should be clear that she benefits from Alice and Bob's uneven buying, when that occurs. The coin example is special because Alice and Bob have a decent chance of accidentally buying evenly, but obviously as we increase the number of players, and lottery numbers that can be chosen, that will cease to be the case. 
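The enumeration is small enough to do exactly. Here is a sketch using exact fractions, with the stakes as described: Alice and Bob put in one dollar each, Mary two, and there is no rake. The variable names are my own.

```python
from fractions import Fraction
from itertools import product

SIDES = ("H", "T")
pool = Fraction(4)   # Alice 1 + Bob 1 + Mary 2, no rake

mary_net = Fraction(0)
alice_net = Fraction(0)
for alice, bob, winner in product(SIDES, repeat=3):      # eight equally likely cases
    holders = 1 + (alice == winner) + (bob == winner)    # Mary always holds the winning side
    share = pool / holders
    mary_net += Fraction(1, 8) * (share - 2)
    alice_net += Fraction(1, 8) * ((share if alice == winner else 0) - 1)

print("Mary's expected net gain: ", mary_net)
print("Alice's expected net gain:", alice_net)
```

Mary's gain is exactly offset by the combined losses of Alice and Bob, as it must be in a zero sum game.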


Mary's Last Ticket

We can come at this a different way. Consider Mary's second ticket. It takes advantage of knowledge of Mary's first ticket - which we can suppose to be heads without any loss of generality. By choosing tails rather than heads, we reduce the expected number of other people we will split the pot with ... by one.

Let's check that. Mentally copying the leftmost diagram so that we see all eight possibilities, Mary will share with one other person on four occasions out of eight; with two people on two occasions out of eight, and the other two times she will win outright. She shares with an average of one other person. 

A fictional bad Mary, who nominates the same outcome on both tickets (say tails, tails) will find that each of her tickets shares the prize with two others (twice); three others (once) and one other (once) - because we have to include her other ticket in the count. Bad Mary shares with an average of two other people. 

Alice, on the other hand, will share with Mary on two occasions; and with Mary and Bob on two occasions. So she shares with an average of one and a half other people. The difference between Mary and Alice is half a person, as it must be, since Alice (in conjunction with Bob) is half Mary and half bad Mary. 
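These three averages can be confirmed by brute force over the eight cases. A small sketch, with helper names of my own choosing:

```python
from fractions import Fraction
from itertools import product

SIDES = ("H", "T")
cases = list(product(SIDES, repeat=3))   # (alice, bob, winner), all equally likely


def mean(values):
    values = list(values)
    return Fraction(sum(values), len(values))


# Mary holds one H and one T: her winning ticket shares with whoever matched it.
mary = mean((a == w) + (b == w) for a, b, w in cases)

# Bad Mary holds T twice: condition on T winning and count her other ticket too.
bad_mary = mean(1 + (a == "T") + (b == "T") for a, b, w in cases if w == "T")

# Alice: condition on Alice winning; she always shares with Mary, sometimes with Bob.
alice = mean(1 + (b == w) for a, b, w in cases if a == w)

print("Mary:", mary, " bad Mary:", bad_mary, " Alice:", alice)
```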

Now let's return to our original problem, the lottery with \(15,000\) other people buying tickets. As the ordering doesn't matter, we could imagine that Mary has bought all but one of her tickets and also, that \(14,999\) tickets have been sold to those choosing randomly, with only Alice left to buy a ticket. But, unlike Alice, Mary knows the \(9,999\) numbers to avoid - the ones she has already chosen. And what's true of Mary's last ticket is true of her first, again by symmetry, though we will return to that momentarily. 

As compared to the coin example, it is almost certain that Alice's choice will mean she splits the pot with exactly one more person than Mary will. The odds of Alice accidentally being Mary-like are much slimmer now - only one in 10,000. So at the risk of tempting the uppercase Lottery Paradox, we can simply ignore that possibility. Alice will share with almost exactly one more person than Mary. And what's true of Alice, the last random ticket buyer, is true of all others by symmetry.

What matters, of course, is not the average number of people we share the prize with but the ratio of Mary's payout to Alice's. This is approximately given by: \begin{equation} R = \frac{ E[1/M] } { E[1/(M+1)]} \end{equation}

where \(M\) is the random variable representing the number of people sharing the prize. The average is over ticket choices, one might say, though also over the usual probability space, since each lottery outcome is equally likely.
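To get a feel for the size of \(R\), here is a rough numerical sketch. It assumes \(M\) is Mary's own ticket plus a binomial number of random buyers on the same number (the 14,999 buyers other than Alice), and it truncates the sum at 60 co-winners, far into a negligible tail. The helper `binomial_pmf` is written out by hand rather than taken from a library.

```python
def binomial_pmf(n, p, k_max):
    """P(N = k) for k = 0..k_max, built iteratively to avoid huge binomial coefficients."""
    pmf = [(1.0 - p) ** n]
    for k in range(k_max):
        pmf.append(pmf[-1] * (n - k) / (k + 1) * p / (1.0 - p))
    return pmf


pmf = binomial_pmf(n=14_999, p=1e-4, k_max=60)   # random buyers other than Alice

# M = 1 + k counts Mary's ticket plus k random buyers on the winning number.
mary_share = sum(prob / (1 + k) for k, prob in enumerate(pmf))    # E[1/M]
alice_share = sum(prob / (2 + k) for k, prob in enumerate(pmf))   # E[1/(M+1)]
R = mary_share / alice_share
print(f"R is roughly {R:.2f}")
```

On these assumptions each of Mary's tickets is worth around 1.6 times one of Alice's, which is what separates a positive return from a clearly negative one.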

Mary's First Ticket

Hang on you say ... there was nothing special about Mary's first ticket. So how can this be? Have we replaced one paradox with another? 

The path is a little different, admittedly. One concedes that if Mary had stopped buying after one ticket, then that ticket would, of course, have been no better or worse than Alice's, Bob's or any other random ticket buyer. However, Mary's first ticket has been getting more valuable ever since, because it has been "lucky" not to attract more investment - at least from Mary. 

Conversely, if Alice were the first ticket buyer then she is initially no worse off than Mary's first ticket. However, Alice's first ticket will lose value as others happen across the same number - including Mary. Mary's first ticket would lose value also, but only due to the random buyers and not her own buying. It is a zero sum game, so Mary's ticket actually appreciates because the growing pool size more than compensates for growth in the number of people picking the same number, whereas in Alice's case the reverse is true. 

But wait ...

Two paradoxes whacked, and now a third pops up. Since Mary always wins, the average number of people Mary shares the prize with is also the average number of people who share the prize. Is it not strange that Alice, choosing randomly, winds up sharing the pot with (an average of) one more person than the average number of people who share the pot?  

The nuance is the difference between averaging over tickets and averaging over people. For Mary, these concepts coincide because she buys every ticket. However, when considering Alice's fortunes and those like her, the relevant average is over people. When five people all choose the same ticket, the number five gets counted five times. This clustering effect moves the mean number of winners sharing a prize (averaging over people like Alice) higher than the mean number of winners (averaged across ticket choices). 
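A quick simulation makes the two kinds of average concrete. This is only a sketch of one lottery with a fixed random seed; the constants follow the original setup and the variable names are mine.

```python
import random
from collections import Counter

random.seed(0)
N_NUMBERS, N_RANDOM = 10_000, 15_000

# How many random buyers landed on each number (missing keys count as zero).
picks = Counter(random.randrange(N_NUMBERS) for _ in range(N_RANDOM))

# Holders of each number: Mary's single ticket plus any random buyers.
holders = [1 + picks[k] for k in range(N_NUMBERS)]

# Average over numbers (Mary's view): every number is counted once.
avg_over_numbers = sum(holders) / N_NUMBERS

# Average over random buyers (Alice's view): a number held by five random
# buyers is counted five times.
avg_over_people = sum(picks[k] * holders[k] for k in range(N_NUMBERS)) / N_RANDOM

print(f"winners per number, averaged over numbers:       {avg_over_numbers:.2f}")
print(f"winners per number, averaged over random buyers: {avg_over_people:.2f}")
```

The first average is 25,000 tickets over 10,000 numbers by construction. The second should land roughly one higher, up to simulation noise, which is the "plus one" that Alice suffers.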

The Lottery Problem

Now we have several angles with which to see that Mary is a winner. Are you heading out the door to buy lottery tickets?

This lottery paradox contributes to the ability of lottery syndicates to profit from lotteries - though it is far from the whole story. There are many other effects influencing the vulnerability of a lottery, if we frame it that way, to the buy-every-combination attack. The size of the running jackpot and degree of non-uniformity in ticket choice (birthdays, etc) obviously play a role in assessing return.   

There are variants of the "attack" carried out by Mary. Depending on roll-down rules (i.e. countbacks) and prize distribution (minor prizes) it may be possible to buy subsets of tickets and profit. This line of reasoning leads to interesting set-covering problems, including one appropriately named the Lottery Problem.

The Lottery Problem challenges us to find a minimal set of lottery tickets that will ensure we match some, if not all, of the numbers drawn. For example, if a lottery asks us to choose six numbers, we might want the smallest collection of tickets that guarantees at least one of them has five of the drawn numbers. You can read the paper by Li and van Rees called Lotto Design Tables if you are interested in this area. 

Just this week, an interesting example of non-uniform buying arose in a South African lottery, where numbers 7, 8, 9 and 10 were all drawn, leading to the pot being split between twenty winners (story). This will no doubt be investigated and foul play is a possibility, but it is also consistent with laziness. 

Lotteries in Disguise?

Our considerations apply, to a lesser degree, to racetrack totalizator wagering. The pari-mutuel scheme is the usual name for this, but it can be viewed as a special case of a lottery - just one where normally the pot is split by a very large number of winners, since the number of outcomes is small. The lottery paradox is not really a thing, at least as far as win betting is concerned, because the density of buyers relative to outcomes is so high.  

However that isn't always the case when we consider combinatorial wagers. 

Let us imagine a race in which, by chance, all runners are regarded as roughly equally likely to win. Let us further suppose that patrons are able to place wagers on the first six horses to finish in the correct order. This lottery (of sorts) might also exhibit the same characteristics. The higher the entropy of the race (all horses having similar odds), the more the racetrack market starts to resemble a lottery. 

Another disguised lottery I am somewhat loath to bring up is COVID-19. Here we are on the right side of the lottery paradox. Nature plays like Alice, reducing overall deaths through randomness. The reason is that peak infection level is inversely proportional to contagiousness in the community. 

Let us imagine that Mary has the power to "even out" infectiousness across different regions or countries. Infectiousness, as we know, is driven by population density, behavior and (most importantly) whether people voted for Ralph Nader in 2000 (not kidding, see article). I'm not really sure how Mary could do this, but perhaps she could arrange for some combination of Donald Trump and Jacinda Ardern to set policy for both the United States and New Zealand (weighting Trump in proportion to population). The lottery paradox suggests that this will increase the aggregate number of deaths.

Beyond that contrived example, I note parenthetically that there are more serious convexity considerations and I refer the interested reader to this working paper where some analytic estimates are computed, and in which the following rough approximation of adjusted reproduction number is provided

\begin{equation} \text{susceptible multiplier} \approx 1 + \frac{\text{relative variation}}{R} \end{equation} where \(R\) is an estimated reproduction number and the relative variation is the standard deviation in infection rate divided by the mean infection rate.

There's Something (Else) About Mary

Returning to more pleasant themes, let us revisit the lottery paradox from just one more angle, emphasizing conditional expectation in the resolution of the paradox. Notice that for both Alice and Mary, what is important is not the mean number of winners but the conditional mean number of winners (conditional if you win).

The number of winners conditional on Alice winning is approximately the unconditional distribution shifted up by one (since we have to add Alice winning and previously she had only a 1 in 10,000 chance). Does this "plus one" shift happen with Mary? No.

In fact, Mary has chosen a strategy where revelation of her winning reveals no information whatsoever, and that is a useful way to think about her choice. No new information arrives. Thus no +1 shift. This explains (again) the differential between Mary and Alice. 

Let us suggest that the lottery gods are rewarding Mary for not only identifying the correct distribution (Alice does that too) but also representing it faithfully with a collection of perfectly chosen tickets. That is the real theme of this article. 

And the lottery gods are even more protective of Mary (if she identifies the correct distribution) if there is no rake. Mary's strategy is impregnable if the town takes no cut. Now Mary was not up against the stiffest competition, admittedly, since her opponents - the Alices and Bobs - aren't talking to each other and to the contrary, seem intent on stepping on each other's toes. So this theory hasn't really been tested yet. 

However, if we replace Alice and Bob by more worthy adversaries, it will still be the case that against any strategy (adopted by any combination of opponents in collusion and intent on her financial destruction) Mary will do no worse than break even in the long run. 

We can prove this using the harmonic mean inequality. This states that for any collection of positive real numbers \(q_1, q_2, \dots, q_n\), \begin{equation} n \left( 1/q_1 + \dots + 1/q_n \right)^{-1} \le \frac{q_1 + \dots + q_n}{n} \end{equation} where we recognize the right hand side as the arithmetic mean and the left side as the harmonic mean. 

Let \(q_k\) represent the number of tickets bought where number \(k\) is chosen, Mary's included, so that \(q_1 + \dots + q_n = 25,000\) with \(n = 10,000\). Mary's payout when \(k\) is drawn is inversely proportional to \(q_k\) and, averaging over all equally likely outcomes, her mean prize collect, assuming a one dollar ticket cost, is \begin{equation} \frac{1}{10,000}  \left( 25,000/q_1 + \dots + 25,000/q_n \right) \ge 25,000 \frac{10,000}{ q_1 + \dots + q_n} = 10,000 \end{equation} using the harmonic mean inequality. Thus her mean prize is no less than her outlay. 

Alternatively, one might bring out the slightly heavier machinery in the form of Jensen's Inequality. We reason that any strategy adopted by opponents can, when added to Mary's ticket buying, be viewed after normalization as a probability. The payout is the expectation of \(x \mapsto 1/x\) viewed as a random variable on \(\{1,\dots,10,000\}\). Non-negativity of Mary's return follows from \(E[1/X] \ge 1/E[X]\) since the right hand side is equal to Mary's investment. 
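The inequality is easy to poke at numerically. The sketch below uses a hypothetical small lottery with 100 numbers rather than 10,000, no rake, and two made-up arrangements of the opposing tickets; `mary_mean_collect` is an illustrative helper.

```python
def mary_mean_collect(opponent_counts):
    """Mary's average collect with no rake and one ticket on every number.

    opponent_counts[k] is how many opposing tickets sit on number k,
    however the opponents chose to arrange them.
    """
    q = [1 + c for c in opponent_counts]       # all tickets on each number, Mary's included
    pool = sum(q)
    return sum(pool / q_k for q_k in q) / len(q)


n = 100
even = [3] * n                     # opponents spread perfectly evenly
lumpy = [300] + [0] * (n - 1)      # the same money, all piled on one number
for name, counts in (("even", even), ("lumpy", lumpy)):
    collect = mary_mean_collect(counts)
    print(f"{name} opposition: Mary stakes {n} and collects {collect:.2f} on average")
```

Against perfectly even opposition Mary merely gets her money back, and any lumpiness in the opposing tickets only helps her, which is the content of the harmonic mean inequality.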

The Normal-ish Distribution

And now, as they say, for something completely different. Or not that different as it turns out - but before our story can continue I need to introduce you to a distribution you almost certainly haven't met yet.

Define, for integer \(i\), the transformation \begin{equation} g(i) = \left( -\log\left(1-\frac{i-1/2}{10000}\right) \right)^{1/4} \end{equation}

and notice what happens when we apply it to our tickets labelled \(i=1\) through \(i=10,000\). We shall call the result the normal-ish distribution. It is similar to normal but more tractable for our purposes. I have absolutely no idea if it has a real name. 


If you look very closely you may notice some departure from normality. Our transformation is made less obscure, incidentally, by noting it is a composition of three transforms. We map tickets into \((0,1)\), then use a standard method of converting a uniformly distributed variable into an exponentially distributed variable. Finally, we take a fourth root which almost converts an exponentially distributed variable into a normally distributed one. The three transforms are \begin{eqnarray*} i & \mapsto & \frac{i-1/2}{10000} \\ u & \mapsto & -\log\left(1-u\right) \\ t & \mapsto & t^{1/4} \\ \end{eqnarray*} You can see this recreates \(g\). 
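The three steps translate directly into code. This sketch applies them to all 10,000 tickets and reports the sample mean and standard deviation; only the standard library is used, and the function name `g` mirrors the notation above.

```python
import math
import statistics

N = 10_000


def g(i, n=N):
    """Map ticket number i in 1..n to a normal-ish value."""
    u = (i - 0.5) / n           # ticket -> (0, 1)
    t = -math.log(1.0 - u)      # uniform -> exponential
    return t ** 0.25            # exponential -> roughly normal


values = [g(i) for i in range(1, N + 1)]
mu = statistics.fmean(values)
sigma = statistics.pstdev(values)
print(f"mean {mu:.3f}, standard deviation {sigma:.3f}")
```

The mean comes out near 0.9 and the standard deviation near 0.25.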

A Continuous Lottery

Let's play a new game. In place of drawing a number between 1 and 10,000 to determine the winner of the lottery, we will instead draw a random number from the normal-ish distribution shown in blue above. Let's call this draw \(x\). The value \(x\) defines a winning interval \(I=(x-\epsilon, x+\epsilon)\) that we shall term the "hoop," where \(\epsilon\) is a fixed small number known to contestants in advance.

Those participants will choose a real number \(y\) and write it on their ticket before dropping it in the ballot box. All tickets in the hoop are declared winners, and the prize-money will be split evenly amongst them. 
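The settlement rule can be sketched as a small function. This is only an illustration under assumptions of mine: tickets are represented as `(owner, y)` pairs, and when nobody lands in the hoop the sketch simply pays nothing, a case the description above does not address.

```python
def settle(x, tickets, eps, prize):
    """Split `prize` evenly among tickets landing in the hoop (x - eps, x + eps).

    `tickets` is a list of (owner, y) pairs. Returns {owner: payout}.
    """
    winners = [owner for owner, y in tickets if abs(y - x) < eps]
    payouts = {}
    for owner in winners:
        # An owner with several tickets in the hoop collects one share per ticket.
        payouts[owner] = payouts.get(owner, 0.0) + prize / len(winners)
    return payouts


tickets = [("mary", 0.70), ("mary", 0.90), ("mary", 1.10), ("alice", 0.91), ("bob", 1.30)]
result = settle(x=0.905, tickets=tickets, eps=0.01, prize=5.0)
print(result)
```

In this toy draw one of Mary's tickets and Alice's ticket fall inside the hoop, so they split the prize equally.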

If the number of choices \(n=10,000\) is allowed to grow, and if the hoop shrinks at an appropriate rate (i.e. \(\epsilon \rightarrow 0\)) the continuous lottery will be very similar to the discrete one. We could make them the exact same game if we choose \(\epsilon\) very small and, instead of drawing \(x\) directly, draw \(i \in \{1,\dots, 10,000\}\) and then set \(x=g(i)\) since in that situation players might as well be choosing an integer \(i \in \{1,\dots, 10,000\}\) as before. Even without taking this artificial step the games are morally the same. 

Let us suppose that Mary has been able to discern the normal-ish distribution that is being used to settle this continuous lottery. Let us further suppose that Mary submits her view in the form of a vast number of choices \( y_1,\dots,y_n\) that collectively take on the exact same distribution. Our previous discussion of the lottery paradox, together with the seeming similarity between the two games, leads us to believe that her strategy is unbeatable. Mary will again best the uncoordinated opponents, whose random sampling leads them to collectively ascribe a lumpy, and therefore incorrect, estimate of probability.

But we aren't playing an equivalent game just for the sake of it. The continuous lottery is, I think, more suggestive of a second level of complexity, one that arises because the distribution is unknown or unknowable. In this new game, Mary must keep the lottery paradox in mind. However, she can also benefit from discerning a closer approximation to nature's true distribution than her opponents. Mary will win if the rest of the market exhibits inaccuracy - such as assuming a translation of the true distribution. 

I say this is more suggestive but of course you can always map back to the space \((0,1)\) or the set \(\{1,\dots,10,000\}\) and remove the assumption that the drawing of the lottery ticket from the hat is fair. However, I think we have slightly more intuition for how different opinions of probability are likely to differ from one another on the other side of the transformation \(g\). For example, one person's estimate might be a translation of another, or a rescaling, if both are normal, say. 

We're going to assume that Mary's opponents mis-estimate nature's true normal-ish distribution. There aren't too many cases where Mary's excess return can be computed in closed form, and the normal-ish distribution provides one such example while at the same time looking close enough to normal to ground our intuition. Why else would I have brought it up?  

The Zero-Impact Approximation

To proceed, I will assume that Mary is a small player. Her investments are negligible relative to everyone else's.

This assumption is more realistic in some applications than others. It can be accurate for combinatorial racetrack bets in Australia, because one is able to choose a large number of combinations and then specify a fixed total investment, rather than specifying a dollar amount per combination. Elsewhere, a higher minimal investment might make the assumption less realistic, though the calculation to follow is still useful. 

An estimate of Mary's expected payout when adopting distribution \(\pi\) can be provided by the following formula \begin{equation} E_{\pi} = \int_0^{\infty} \frac{ P(x) \pi(x) }{Q(x)} dx \end{equation} where \(Q(x)\) is the distribution of all tickets (including Mary, strictly speaking, but also assumed to be essentially untouched by Mary's small contribution). Here \(P(x)\) is the true underlying distribution from which \(x\) will be drawn.

Perhaps it is already obvious that Mary cannot be beaten if she adopts \(P(x)\). In the case \(\pi=P\) we take the original game as our cue, and make the change of variable \(x=g(u)\), noting that \(P(x) = 1/g'(u)\) and hence \(P(x)\,\pi(x)\,dx = du/g'(u)\). We also let \(q(u)\) denote the distribution on \((0,1)\) generating \(Q\) under the map \(g\), so that \(Q(x) = q(u)/g'(u)\). This lifts Mary's expected return back to (a continuous version of) the space on which the original game was played - one in which the objective probability and her choice are both uniform:

\begin{equation} E_{\pi} = \int_0^1 \frac{1}{q(u)/g'(u)} du/g'(u) = \int_0^1 \frac{1}{q(u)} du \ge \left( \int_0^1 q(u) du \right)^{-1} = 1 \end{equation} where the inequality is Jensen's Inequality, as we anticipated.

(By the way, in this exercise some technical laziness seems justified to keep this exposition on the lighter side. For example, we have assumed in our notation that \(P\) is absolutely continuous with respect to Lebesgue measure).

I merely want to convey the core message here - that Mary can adopt a strategy analogous to buying every ticket. The fact that the lottery draw might take place using a continuous, non-uniform distribution might be seen as smoke and mirrors - distracting us from the simple lottery paradox. Indeed, to keep the optical illusion metaphor going, you might want to view one game as the image of the other in a distorting mirror - the kind you find in an amusement park. 

The transform \(g\) we used, or its inverse rather, is almost a distributional transform. The distributional transform expresses the fact that "all continuous dice are equivalent". For us, the moral is that "all lotteries are equivalent" - at least those where the distribution is knowable. 

Mary's Reward for Precision

But assuming the distribution isn't known, at least to everyone, Mary's mean return (when she is more accurate than others, say) can be approximated for the normal-ish distribution, thus giving rise to an intuitive relationship between accuracy and reward. 

In fact, we will first consider exponential distributions, which is like stopping half way through the calculation of the map \(u \rightarrow g(10000 u)\). To keep things simple, we assume Mary makes no distributional error but her opponents get the rate parameter for the exponential distribution wrong.

That is to say that we assume a "market distribution" \(q(x) = \gamma e^{-\gamma x}\) capturing the opinions of everyone else, where \(\gamma \approx 1\), and true distribution \(p(x) = e^{-x} = \pi(x)\) that is (we shall suppose) correctly identified by Mary. It follows that Mary's mean payout is

\begin{eqnarray*} E_{\pi} & = & \int_0^{\infty} \frac{p(x)\pi(x)}{q(x)} dx \\ &= & \int_0^{\infty} \frac{e^{-x}e^{-x}}{\gamma e^{-\gamma x}} dx \\ & = & \frac{1}{\gamma} \int_0^{\infty} e^{-(2-\gamma)x} dx\\ & = & \frac{1}{\gamma(2-\gamma)} \\ & = & \frac{1}{(1-\delta^2)} \\ & \approx & 1 + \delta^2 \end{eqnarray*} where we have defined \(\delta\), with \(|\delta| \ll 1\), by \(\gamma = 1 + \delta\), showing that if others mis-estimate the parameter in the exponential distribution by \(\delta\), her return will be roughly \(\delta^2\). 
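As a sanity check on the closed form, the integral can be evaluated numerically. This is a plain midpoint-rule sketch with a truncated upper limit (both the limit and the step count are arbitrary choices of mine), and it needs \(\gamma < 2\) for the integral to converge.

```python
import math


def mary_payout_numeric(gamma, upper=60.0, steps=100_000):
    """Midpoint-rule estimate of the integral of p(x) * pi(x) / q(x) over (0, upper)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        p = math.exp(-x)                      # true density, also Mary's choice pi
        q = gamma * math.exp(-gamma * x)      # market density
        total += p * p / q * h
    return total


for delta in (0.05, 0.1, 0.2):
    gamma = 1.0 + delta
    numeric = mary_payout_numeric(gamma)
    closed_form = 1.0 / (gamma * (2.0 - gamma))
    print(f"delta={delta}: numeric {numeric:.5f}, "
          f"closed form {closed_form:.5f}, 1 + delta^2 = {1 + delta ** 2:.5f}")
```

The numeric and closed-form values should agree to several decimal places, with \(1+\delta^2\) a good approximation while \(\delta\) is small.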

In order to translate this result over to the sort-of-normal distribution, we need to know a little about that transformation. After taking the fourth root of Mary's data points, their mean and standard deviation will be approximately \(\mu=0.9\) and \(\sigma=1/4\) respectively. Furthermore, the change of variable \(v=(1+\delta) x\) informs how these quantities will change when we replace \(\pi(x)=e^{-x}\) with \(q(x)=(1+\delta)e^{-(1+\delta)x}\). Both will scale by \((1+\delta)^{-1/4} \approx (1-\delta/4)\). 

Mary's return varies as the square of the error of her opponents, or if you prefer, with their variance. 

To briefly illustrate, let us suppose that the "market" gets the location and scale of the almost normal distribution wrong by an amount equal to 10% of the standard deviation, or \(0.025\) in absolute terms. We set \((1+\delta)^{1/4}=1.1\), that is \(1+\delta = 1.1^4\), implying \(|\delta| \approx 0.46\). In turn, this suggests that Mary's return will be on the order of \(20\)%. 

On the other hand, if the other contestants are wrong by only 1% of the standard deviation, then Mary's return will be closer to \(0.2\)%, or \(20\) basis points in the lingua of finance. 
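The arithmetic behind these two figures can be reproduced as follows. The sketch takes the reading that reproduces \(|\delta| \approx 0.46\), namely that the market's scale is off by a factor of 1.1 (or 1.01) so that \(1+\delta\) is the fourth power of that factor; the function name and its return values are illustrative.

```python
def excess_return(scale_factor):
    """Return (delta, leading-order return, exact return) for a market whose
    normal-ish scale is off by `scale_factor`, e.g. 1.1 for ten percent."""
    delta = scale_factor ** 4 - 1.0             # from (1 + delta)^(1/4) = scale_factor
    approx = delta ** 2                         # leading-order excess return
    exact = 1.0 / (1.0 - delta ** 2) - 1.0      # from 1 / (gamma * (2 - gamma)), needs |delta| < 1
    return delta, approx, exact


for factor in (1.1, 1.01):
    delta, approx, exact = excess_return(factor)
    print(f"scale off by {factor}: delta = {delta:.4f}, "
          f"delta^2 = {approx:.2%}, exact = {exact:.2%}")
```

For the larger error \(\delta\) is no longer small, so the leading-order figure understates the exact one somewhat. For the 1% case the two are nearly indistinguishable.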

Mary Doesn't Care What Anyone Else Does

We have been assuming that Mary doesn't know the values chosen by her adversaries. If she does, and she also knows the true distribution, then naturally she will be tempted to "fill in the holes" of the market distribution \(q(x)\) rather than supplying her honest estimate of \(\pi(x)\). This is no different to someone betting on the one horse whose odds seem to be the most out of line, rather than betting on all horses.

And yet, I must mention one remarkable mathematical accident that arises in this context. Suppose the following conditions hold:

  1. There is no rake. Mary and company play a zero sum game;
  2. Mary is constrained to invest all her wealth (which if you think about it isn't really an imposition, since multiples of \(q(x)\) are riskless); and 
  3. Mary optimizes the logarithm of her posterior wealth (in the spirit of maximizing long run growth rate of her wealth).

Mary will in fact choose \(\pi=P\). In other words, she will ignore the market \(q(x)\)!

This is perhaps the most counter-intuitive fact of all, as it seems inconceivable that one should completely ignore the odds when choosing one's bets. It seems far more natural that as \(q(x)\) decreases, our investment near \(x\) should increase in order to take advantage of the increasing value for money (bet more on a horse if offered better odds).

However, the math doesn't lie and one way to approach this is via Lagrange multipliers in the discretized case. In the zero-impact approximation, we are trying to maximize a quantity proportional to \begin{equation} f(\pi) = \sum_{i=1}^{n} p_i \log\left( \frac{\pi_i}{q_i} \right) \end{equation} subject to \begin{equation} g(\pi) = \sum_{i=1}^{n} \pi_i - 1 = 0 \end{equation} It is apparent that the optimization doesn't see the \(q\) terms as they split from the logarithm and contribute only constant terms, so we know that Mary's strategy won't depend on what other people do - again assuming that her investment is small compared to others. The fact that Mary invests in proportion to her true belief can be established readily. The standard approach considers the first order Lagrange condition \begin{equation} \frac{\partial}{\partial \pi_i} \left( f - \lambda g \right) = 0 \end{equation} for Lagrange multiplier \(\lambda\). From this we observe that the quantity \begin{equation} p_i \frac{q_i}{\pi_i}\frac{1}{q_i} = \frac{p_i}{\pi_i}\end{equation} is equal to \(\lambda\) and therefore independent of \(i\). Thus \(\pi \propto p\) and it is clear what is going on. 

A more elementary approach considers transferring an incremental investment \(\epsilon\) from the first ticket choice to the second, assuming others have invested proportional to \(q_1\) and \(q_2\) on these tickets respectively. Setting \begin{equation} 0 = \frac{d}{d\epsilon} \left\{ p_1 \log \frac{\pi_1-\epsilon}{q_1} + p_2 \log \frac{\pi_2+\epsilon}{q_2} \right\} \end{equation} we arrive at the conclusion that \(\pi \propto p\) once again, independent of \(q\).
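A brute-force check may make this more believable. The sketch below uses a made-up three-outcome example and compares betting \(\pi = p\) against thousands of randomly generated alternative allocations, for two quite different markets \(q\). The probabilities, the markets and the helper names are all illustrative assumptions.

```python
import math
import random


def expected_log_wealth(p, pi, q):
    """Zero-impact objective: sum_i p_i * log(pi_i / q_i)."""
    return sum(p_i * math.log(pi_i / q_i) for p_i, pi_i, q_i in zip(p, pi, q))


def random_distribution(n, rng):
    """A random allocation over n outcomes, bounded away from zero."""
    weights = [rng.random() + 0.01 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(1)
p = [0.5, 0.3, 0.2]                               # true probabilities (illustrative)
markets = [[1 / 3, 1 / 3, 1 / 3], [0.7, 0.2, 0.1]]

results = []
for q in markets:
    honest = expected_log_wealth(p, p, q)
    best_rival = max(expected_log_wealth(p, random_distribution(3, rng), q)
                     for _ in range(10_000))
    results.append((honest, best_rival))
    print(f"market {q}: pi = p scores {honest:.4f}, best random rival {best_rival:.4f}")
```

Whatever the market looks like, no alternative allocation scores higher than the honest one. The market changes the level of the score but not which allocation maximizes it.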

Distance is Profit

It will not escape the reader that with \(p=\pi\), the mean return includes a term equal to entropy: \begin{equation} H = - \sum_{i=1}^n p_i \log(p_i) \end{equation} Mary is trying to reduce entropy, in this sense. In the case of a lottery, that means buying evenly. Moreover, in generality her return is proportional to the Kullback-Leibler divergence (also called relative entropy) - the standard distance between the true distribution that she identifies and the market distribution. \begin{equation} D(P,Q) = \sum_{i=1}^n p_i \log\left( \frac{p_i}{q_i} \right) \end{equation} The further away from the truth the market strays, according to this distance, the more Mary makes.
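Here is a small sketch of that relationship for a made-up three-outcome example. It computes \(D(P,Q)\) for markets at increasing distance from the truth and checks the decomposition into an entropy term and a term involving \(q\). The distributions and function names are illustrative.

```python
import math


def entropy(p):
    """H = -sum_i p_i * log(p_i)."""
    return -sum(p_i * math.log(p_i) for p_i in p if p_i > 0)


def kl_divergence(p, q):
    """D(P, Q) = sum_i p_i * log(p_i / q_i)."""
    return sum(p_i * math.log(p_i / q_i) for p_i, q_i in zip(p, q) if p_i > 0)


p = [0.5, 0.3, 0.2]                 # truth, correctly identified by Mary
exact = [0.5, 0.3, 0.2]             # a market that gets it right
near = [0.45, 0.33, 0.22]           # a market that is slightly off
uniform = [1 / 3, 1 / 3, 1 / 3]     # a market that is further off

for name, q in (("exact", exact), ("near", near), ("uniform", uniform)):
    print(f"{name:>7} market: expected log return {kl_divergence(p, q):.4f}")
```

When the market is exactly right Mary's expected log return is zero, and it grows as the market distribution moves further from the truth.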

This comes as no great surprise because the connection between investment returns and probabilistic distance measures is well established. For a recent review, see Applications of Entropy in Finance: A Review by Rongxi Zhou, Ru Cai and Guanqun Tong. The relationship between functionals of markets (stochastic processes representing them) and returns is the subject of Stochastic Portfolio Theory, pioneered by Robert Fernholz. 

Want to Play?

Is all of this theoretical? No. The larger vision is the subject of my new book Microprediction: Building An Open AI Network, published by MIT Press. Continuous lotteries are used by Intech for the best stock volatility predictions, and for transport prediction, and much more. If you'd like to pit your skills in continuous lotteries, head to the Python tutorials or the R language tutorials  or just the documentation.

Further Reading

  • Does it Pay to Buy the Pot in the Canadian 6/49 Lottery? Implications for Lottery Design (ssrn 2018). Steven Moffitt and William T. Ziemba. 
  • The excess return paradox is considered in A Method for Winning at Lotteries (ssrn, 2017). Steven Moffitt, William T. Ziemba. 
  • The indifference paradox was mentioned in Keeping Punters Log-Happy: Some Properties of a Pristine Parimutuel Market Clearing Mechanism (blog, 2013) Peter Cotton. (I noticed it in a past life optimizing parimutuel betting). 
  • The book Parimutuel Applications in Finance: New Markets for New Risks (amazon, 2007) covers the pioneering work of Ken Baron and Jeffrey Lange to adapt lotteries (e.g. linear bundling of Arrow Debreu states, as it were). 
  • Combining Probability Forecasts. Roopesh Ranjan and Tilmann Gneiting (tech report 2007) considers lack of calibration and sharpness of linear pooling. 
  • Lotto Design Tables. P. C. Li and G. H. J. van Rees (pdf) considers the lottery problem.