Monty Hall game: a host with limited budget

Abstract


In this paper we introduce a new version of the classical Monty Hall problem, in which the host tries to maximize the audience while restricted by a limited budget. This problem is related to the design of games with a predetermined outcome, and to decision-making under uncertainty when the agent does not know whether the received advice is favorable or not.

Introduction

The Monty Hall problem first appeared in a letter by S. Selvin to The American Statistician [1], and it is a nice and controversial problem for introductory courses in probability, statistics, and game theory. The problem can be posed as follows:

You are playing a game on a TV show. There are three doors. One of them has a car behind it, and the others have goats. You select one of the doors, say door No. 1. Before opening it, the host, who knows where the car is, opens another door, say No. 3, which has a goat. He then gives you the possibility to change, allowing you to pick door No. 2. What do you do?

The problem posed in this way may lead to a lot of controversy, mainly because we do not know whether the behavior of the host had anything to do with your first choice. As Gill states in [2], this is a problem of mathematical modeling: the answer is not a probability but a decision, and the decision must be made in a setting of uncertainty.

Perhaps the host would open a door with a goat only when your first choice was right. In that case, it would not be a good idea to change doors. But if the host always shows you a door with a goat after your first choice, then by changing your first choice you increase your probability of winning the car from 1/3 to 2/3. One of the easiest ways of seeing this is the following: imagine that you play this game repeatedly. Around 1/3 of the time your first choice is right, and then you do not win the car because you change doors; the other 2/3 of the time your first choice is wrong, and changing to the door the host did not open makes you win the car. So, 2/3 of the time you win the car.
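The long-run argument can be checked by simulation. Below is a minimal Monte Carlo sketch in Python (the function name and defaults are ours, not part of the original problem statement):

```python
import random

def classical_monty_hall(switch, trials=100_000, seed=0):
    """Estimate the player's winning probability in the classical game,
    where the host always opens a door with a goat."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)       # door hiding the car
        choice = rng.randrange(3)    # player's first choice
        # the host opens a goat door that is neither the car nor the choice
        opened = next(d for d in range(3) if d != car and d != choice)
        if switch:
            # switch to the remaining closed door
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += choice == car
    return wins / trials
```

Running it with `switch=True` gives an estimate close to 2/3, and with `switch=False` close to 1/3, matching the frequency argument above.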

There are many variants of this game; see the recent book by Rosenhouse [3]. Some of the analyses are based on game theory, although there is only one player, since the host’s behavior is completely determined. In all the variants we know of, the game is analyzed from the point of view of the player, and the host acts in an almost deterministic way (of course, sometimes the host makes a decision at random). In several variants, the host’s behavior corresponds to one of two main cases: the malicious host, who opens a door only when the player chooses the right door, and the benevolent host, who opens a door only when the player chooses one of the wrong doors; see [4, 5] and Chapter 5 in [3]. A combination of both cases can be found in [6], where it is assumed that the participant has no information on the proportion of times the host behaves as a malicious or a benevolent host, and he cannot change doors when the host does not open a door.

However, as Fernandez and Piron remark in [7], the host is another player in the game, with his own motivation, and he shows a door or not according to his own objectives. Hence, we wish to analyze here a different combination of both cases, based on the following assumptions about TV shows. First, we assume that the host tries to maximize the audience while minimizing the costs. Now, suppose the TV show wants to keep playing this game for a long time because of its success, and its budget is not compatible with giving away a car in 2/3 of the programs. Let us say that in the long run (by the law of large numbers) the host can give away a car with probability α in each show, where

$\frac{1}{3}\le \alpha \le \frac{2}{3}.$

We will restrict our analysis to this case, since lower values of α can easily be allowed by adding more doors, and higher values can be obtained by adding more doors and then opening more than one with goats. The general case follows in almost the same way.

Moreover, we assume that the host believes that the success of the game is based on the tense moment in which the host opens a door with a goat after the first choice of the player and gives the player the possibility of changing his choice. So the host plays with a strategy that fixes the player’s probability α of winning the car and maximizes the proportion of shows in which the host opens a door with a goat.

We can think of this problem as an inverse problem in game theory, since α is the minimax solution of the zero-sum game between the player and the host: it is the minimum expected prize that the player can win and the maximum expected prize that the host can pay.

Another difference from other versions of the Monty Hall problem is the following: the player will have the option of changing his first choice both when the host shows a door with a goat and when he does not show anything. This variant seems to have been overlooked in previous works, although in many real game shows the host offers the player the option to change his decision, asking repeatedly things like ‘Are you sure?’, ‘Do you want to change your answer?’, and ‘Is that your final answer?’ Usually, there is a key question which ends the option to change. We believe that it is more realistic to include both features in this game: the player can change his choice even if no door is opened, and the host is constrained in the number of times he can offer to open a door.

Finally, let us mention that this model is related to a more complex multi-agent problem, where each agent is a driver choosing among a few roads. A real-time device (like a radio station, GPS, or an intelligent transport system) can inform the driver of the state of the roads or not, reducing the travel-time uncertainty. However, several small accidents on the same road (cars stopped due to flat tires or lack of gasoline) are not interesting enough to catch the attention of the media compared with a major accident, or their effect on the traffic cannot be measured quickly, although the road starts to become congested. Here, the system acts as the host, and in the worst-case scenario it gives the minimum information about traffic, or the information reaches the agent when a change of roads is no longer possible due to communication delays. We refer the interested reader to [8-10] for models of route choice with real-time information and a discussion of the behavioral mechanisms of drivers in this kind of decision-making process.

The work is organized as follows: in Section ‘The Monty Hall problem and our model’, we introduce the parameters of the model and the rules of the game. In Section ‘Optimal strategies’, we formulate and solve an equivalent problem, and we find the optimal strategies of the host and the player. The equivalence of the problems is proved in Section ‘The dual problem’, where we solve the inverse problem, giving the optimal strategies as functions of the probability parameter α. We compare the payoffs when the player is not allowed to change doors if the host does not open an extra door in Section ‘A related model’, and we conclude in Section ‘Conclusions’.

The Monty Hall problem and our model

A variant of the problem

Let us state the conditions of the game:

1. There are three doors; one of them has a car behind it, and the others have goats.

2. The host keeps the player’s winning probability fixed at some $\alpha \in \left[\frac{1}{3},\frac{2}{3}\right]$.

3. The host knows where the car is, and his strategy will be based on two numbers, say:

• m = probability that the host shows a door with a goat given that the player’s first choice was right.

• b = probability that the host shows a door with a goat given that the player’s first choice was wrong.

We can associate as in [6] the letter m to malevolence, since the host is tempting the player to change his choice when the player has chosen the right door; and the letter b to benevolence, since the host is tempting the player to change his mind after he has chosen the wrong door.

The host will always use these same values of m and b.

4. The host wants to maximize the number of times he opens a door.

5. The host understands that in the long run both numbers m and b will be well estimated by the show’s followers, since in each TV show, when the game is over, all three doors are shown to the viewers in order to demonstrate the transparency of the game. So the host expects that the player will know m and b and will act in order to maximize his probability of winning.

6. The player’s strategy is based on two probabilities, say:

• c = probability that the player changes doors given that the host has opened a door with a goat.

• n = probability that the player changes doors given that the host has not opened a door with a goat.

• Both players will use the same values of m, b, c, and n in each game.

The problem now is the following: how must the probabilities m, b, c, and n be chosen by the host and the player? Recall that the host is trying to maximize the expected number of times he opens a door, keeping the probability α fixed, while the player is trying to win the car.

We have the following sequential game between the host and the player in each TV show:

• A car is hidden behind one of three doors and remains there until the game is finished.

• The player chooses a door.

• The host, knowing whether this choice was right or wrong, decides whether to open a door by using the probabilities m or b, respectively. If he opens a door, he shows one with a goat.

• The player can change his initial choice, knowing the host’s decision (whether a goat was shown or not), according to the probabilities c or n.

• The game finishes and the host opens all the doors.

• The player wins if and only if the car is behind the door he finally chose.
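The sequence of moves above can be sketched as a short simulation (Python; all identifiers are ours, and the tie-breaking rule for which goat door the host opens is arbitrary, since it does not affect the probabilities):

```python
import random

def play_round(m, b, c, n, rng):
    """Simulate one show; returns (host_opened, player_won)."""
    car = rng.randrange(3)
    choice = rng.randrange(3)
    first_right = choice == car
    # the host opens a goat door with probability m (first choice right) or b (wrong)
    opens = rng.random() < (m if first_right else b)
    if opens:
        opened = next(d for d in range(3) if d != car and d != choice)
        if rng.random() < c:   # change to the only other closed door
            choice = next(d for d in range(3) if d not in (choice, opened))
    elif rng.random() < n:     # change to one of the other two doors at random
        choice = rng.choice([d for d in range(3) if d != choice])
    return opens, choice == car

rng = random.Random(1)
results = [play_round(1.0, 1.0, 1.0, 0.0, rng) for _ in range(100_000)]
win_rate = sum(won for _, won in results) / len(results)
# with m = b = 1 and c = 1 we recover the classical always-switch game
```

With m = b = 1 and c = 1 the simulation reproduces the classical always-switch winning frequency of about 2/3.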

Observe that the host must pay the full value of the car when the player wins. The law of large numbers enables him to estimate the total prize money as the number of shows times the winning probability, although the variance introduces a serious risk when expensive prizes are involved. A classical way to deal with this uncertainty is to obtain specialized coverage from an insurance company; the cost will depend on the player’s winning probability.

Optimal strategies

Let us call $\mathbb{P}\left(m,b,c,n\right)$ the player’s probability to finally win the car when the host’s strategy is (m, b) and the player’s strategy is (c, n).

We define now

$\mathit{\text{PFW}}\left(m,b\right)=\underset{\left(c,n\right)}{\mathrm{max}}\phantom{\rule{2.77626pt}{0ex}}\mathbb{P}\left(m,b,c,n\right),$

which gives the best probability for the player to finally win the car if the host uses strategy (m, b).

Let us now calculate the probability for the host showing a door with a goat in terms of m and b. First of all we name some events as follows:

• O = ‘The host opens a door with a goat’.

• FR = ‘Player’s first choice was right’.

• FW = ‘Player’s first choice was wrong’.

• PW = ‘The player finally wins using his best possible strategy’.

• C = ‘The player changes his first choice’.

• NC = ‘The player does not change his first choice’.

We will write ${O}^{\mathrm{c}}$ for the complementary event of O.

According to the law of total probability, we have

$\mathbb{P}\left(O\right)=m·\mathbb{P}\left({F}_{\mathrm{R}}\right)+b·\mathbb{P}\left({F}_{\mathrm{W}}\right)=\frac{m}{3}+\frac{2b}{3}.$
(1)

Also, if (m, b) ≠ (0,0)

$\mathbb{P}\left({F}_{\mathrm{R}}|O\right)=\frac{\mathbb{P}\left({F}_{\mathrm{R}}\cap O\right)}{\mathbb{P}\left(O\right)}=\frac{\mathbb{P}\left({F}_{\mathrm{R}}\right)·\mathbb{P}\left(O|{F}_{\mathrm{R}}\right)}{\frac{m}{3}+\frac{2b}{3}}=\frac{\frac{m}{3}}{\frac{m}{3}+\frac{2b}{3}}.$
(2)

and if (m, b) ≠ (1,1)

$\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right)=\frac{\mathbb{P}\left({F}_{\mathrm{R}}\cap {O}^{\mathrm{c}}\right)}{\mathbb{P}\left({O}^{\mathrm{c}}\right)}=\frac{\mathbb{P}\left({F}_{\mathrm{R}}\right)·\mathbb{P}\left({O}^{\mathrm{c}}|{F}_{\mathrm{R}}\right)}{\frac{1}{3}\left(1-m\right)+\frac{2}{3}\left(1-b\right)}=\frac{\frac{1}{3}\left(1-m\right)}{\frac{1}{3}\left(1-m\right)+\frac{2}{3}\left(1-b\right)}.$
(3)

With all these definitions in place, the host wants to find

$\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\underset{\mathit{\text{PFW}}\left(m,b\right)\le \alpha }{arg max}\frac{m}{3}+\frac{2b}{3},$
(4)

that is, the host is maximizing the number of times that he opens a door, keeping the player’s winning probability bounded by α.

A dual problem

Let us consider the problem,

$\left(\stackrel{~}{m}\left(\beta \right),\stackrel{~}{b}\left(\beta \right)\right)=\underset{\frac{m}{3}+\frac{2b}{3}\ge \beta }{arg min}\mathit{\text{PFW}}\left(m,b\right),$
(5)

namely, the host chooses his probabilities in order to minimize the winning probability of the player, constrained to open the door in at least 100·β% of the shows.

As we will see in the next section (see Lemma 4.1), problems (4) and (5) are equivalent. However, the minimization problem (5) has a very simple constraint, and we can express one of the variables explicitly in terms of the other; problem (4), on the other hand, is constrained through PFW, which is given by a piecewise function.

Now, if the host deviates from the optimal strategy, there are two possibilities. If he opens a door more often, he is playing with higher values of m and/or b, and the dual problem shows that the player will win more games than the host’s budget allows (recall that the player can detect the values of m and b). If, on the other hand, he opens a door less often, the player’s winning probability decreases, and the host spends only part of the budget.

So we will now concentrate on solving problem (5); in Section ‘The dual problem’ we show that if this problem has a solution, it must be a solution of problem (4).

An expression for PFW

First of all, we compute PFW. Notice that when the host opens a door with a goat, the player has only two options: either he keeps his first choice, with winning probability $\mathbb{P}\left({F}_{\mathrm{R}}|O\right)$, or he changes to the other unopened door, with winning probability $1-\mathbb{P}\left({F}_{\mathrm{R}}|O\right)$. When the host does not open a door, the player can stay, with winning probability $\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right)$, or he can choose either of the other two doors, with winning probability $\frac{1-\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right)}{2}$.

Hence, we can calculate the probability for the player to finally win using the best possible strategy given the host’s strategy is (m, b) as follows:

$\begin{array}{lll}\mathbb{P}\left(\mathit{\text{PW}}\right)\hfill & =\hfill & \mathbb{P}\left(\mathit{\text{PW}}\cap O\right)+\mathbb{P}\left(\mathit{\text{PW}}\cap {O}^{\mathrm{c}}\right)\hfill \\ =\hfill & \mathbb{P}\left(O\right)\mathbb{P}\left(\mathit{\text{PW}}|O\right)+\mathbb{P}\left({O}^{\mathrm{c}}\right)\mathbb{P}\left(\mathit{\text{PW}}|{O}^{\mathrm{c}}\right).\hfill \end{array}$

Using the independence of C with FR and FW given O respectively, we have

$\mathbb{P}\left(\mathit{\text{PW}}|O\right)=\underset{c,n}{\text{max}}\left\{\mathbb{P}\left({F}_{\mathrm{R}}\cap \mathit{\text{NC}}|O\right)+\mathbb{P}\left({F}_{\mathrm{W}}\cap C|O\right)\right\}.$

However, since n appears only in the case that the host does not open a door, we get

$\mathbb{P}\left(\mathit{\text{PW}}|O\right)=\underset{0\le c\le 1}{\max }\left\{\left(1-c\right)\mathbb{P}\left({F}_{\mathrm{R}}|O\right)+c\,\mathbb{P}\left({F}_{\mathrm{W}}|O\right)\right\}=\max \left\{\mathbb{P}\left({F}_{\mathrm{R}}|O\right),\mathbb{P}\left({F}_{\mathrm{W}}|O\right)\right\},$
(6)

since we will take c = 0 (respectively, c = 1) when $\mathbb{P}\left({F}_{\mathrm{R}}|O\right)>\mathbb{P}\left({F}_{\mathrm{W}}|O\right)$ (resp., when $\mathbb{P}\left({F}_{\mathrm{R}}|O\right)<\mathbb{P}\left({F}_{\mathrm{W}}|O\right)$). Clearly, the player is indifferent when $\mathbb{P}\left({F}_{\mathrm{R}}|O\right)=\mathbb{P}\left({F}_{\mathrm{W}}|O\right)$.

Analogously, we get

$\mathbb{P}\left(\mathit{\text{PW}}|{O}^{\mathrm{c}}\right)=max\left\{\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right),\frac{1}{2}\mathbb{P}\left({F}_{\mathrm{W}}|{O}^{\mathrm{c}}\right)\right\}.$
(7)

When (m, b) = (0,0), we have

$\mathit{\text{PFW}}\left(0,0\right)=\mathbb{P}\left(\mathit{\text{PW}}\right)=\mathbb{P}\left(\mathit{\text{PW}}|{O}^{\mathrm{c}}\right)=\max \left\{\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right),\frac{1}{2}\mathbb{P}\left({F}_{\mathrm{W}}|{O}^{\mathrm{c}}\right)\right\}=\frac{1}{3},$

and when (m, b) = (1,1)

$\mathit{\text{PFW}}\left(1,1\right)=\mathbb{P}\left(\mathit{\text{PW}}\right)=\mathbb{P}\left(\mathit{\text{PW}}|O\right)=max\left\{\mathbb{P}\left({F}_{\mathrm{R}}|O\right),\mathbb{P}\left({F}_{\mathrm{W}}|O\right)\right\}=\frac{2}{3}.$

For other values of (m, b), using Equations 1, 2, 3, 6, and 7 we get

$\begin{array}{lll}\phantom{\rule{4em}{0ex}}\mathbb{P}\left(O\right)\hfill & =\hfill & \frac{1}{3}m+\frac{2}{3}b,\hfill \\ \phantom{\rule{3.2em}{0ex}}\mathbb{P}\left({O}^{\mathrm{c}}\right)\hfill & =\hfill & \frac{1}{3}\left(1-m\right)+\frac{2}{3}\left(1-b\right),\hfill \\ \phantom{\rule{1.6em}{0ex}}\mathbb{P}\left(\mathit{\text{PW}}|O\right)\hfill & =\hfill & max\left(\frac{\frac{1}{3}m}{\frac{1}{3}m+\frac{2}{3}b},1-\frac{\frac{1}{3}m}{\frac{1}{3}m+\frac{2}{3}b}\right),\hfill \\ \phantom{\rule{.8em}{0ex}}\mathbb{P}\left(\mathit{\text{PW}}|{O}^{\mathrm{c}}\right)\hfill & =\hfill & max\left(\frac{\frac{1}{3}\left(1-m\right)}{\frac{1}{3}\left(1-m\right)+\frac{2}{3}\left(1-b\right)},\frac{1-\frac{\frac{1}{3}\left(1-m\right)}{\frac{1}{3}\left(1-m\right)+\frac{2}{3}\left(1-b\right)}}{2}\right).\hfill \end{array}$

Then, $\mathbb{P}\left(\mathit{\text{PW}}\right)=\mathit{\text{PFW}}\left(m,b\right)$ is given by

$\begin{array}{lll}\mathit{\text{PFW}}\left(m,b\right)\hfill & =\hfill & \left(\frac{1}{3}m+\frac{2}{3}b\right)max\left(\frac{\frac{1}{3}m}{\frac{1}{3}m+\frac{2}{3}b},1-\frac{\frac{1}{3}m}{\frac{1}{3}m+\frac{2}{3}b}\right)\hfill \\ +\left(\frac{1}{3}\left(1-m\right)+\frac{2}{3}\left(1-b\right)\right)max\left(\frac{\frac{1}{3}\left(1-m\right)}{\frac{1}{3}\left(1-m\right)+\frac{2}{3}\left(1-b\right)},\frac{1-\frac{\frac{1}{3}\left(1-m\right)}{\frac{1}{3}\left(1-m\right)+\frac{2}{3}\left(1-b\right)}}{2}\right).\hfill \end{array}$

The function PFW

We now analyze the function PFW(m, b). The previous formula for PFW contains two maximum operators, so we can think of PFW as a piecewise-defined function. The changes occur on the lines b = m and $b=\frac{1}{2}m$.

Let us analyze the function in the segment $\frac{1}{3}m+\frac{2}{3}b=\beta$ inside the square [ 0,1] × [ 0,1]. In this segment we have $b=\left(\beta -\frac{1}{3}m\right)\frac{3}{2}$. Then, for each β we can introduce the function h1, β,

${h}_{1,\beta }\left(m\right)=\mathit{\text{PFW}}\left(m,\left(\beta -\frac{1}{3}m\right)\frac{3}{2}\right).$

Replacing in the formula for PFW, a simple computation gives the following piecewise expression for h1, β(m):

${h}_{1,\beta }\left(m\right)=\begin{cases}\beta \left(1-\frac{m/3}{\beta }\right)+\left(1-\beta \right)\frac{\left(1-m\right)/3}{1-\beta } & \text{if } m\le \beta ,\\[4pt] \beta \left(1-\frac{m/3}{\beta }\right)+\left(1-\beta \right)\frac{1-\frac{\left(1-m\right)/3}{1-\beta }}{2} & \text{if } \beta <m\le \frac{3\beta }{2},\\[4pt] \frac{m}{3}+\left(1-\beta \right)\frac{1-\frac{\left(1-m\right)/3}{1-\beta }}{2} & \text{if } \frac{3\beta }{2}<m\le 1.\end{cases}$

We will replace now the minimization problem (5) by a simpler one, which can be solved explicitly in terms of h1, β.

The minimization problem

Let us consider first the auxiliary problem

$\left(\stackrel{̂}{m}\left(\beta \right),\stackrel{̂}{b}\left(\beta \right)\right)=\underset{\frac{1}{3}m+\frac{2}{3}b=\beta }{arg min}\mathit{\text{PFW}}\left(m,b\right),$

and let us define the function h2 as

${h}_{2}\left(\beta \right)=\mathit{\text{PFW}}\left(\stackrel{̂}{m}\left(\beta \right),\stackrel{̂}{b}\left(\beta \right)\right).$
(8)

Observe that we are minimizing now on the boundary of the restriction of problem (5).

By inspecting h1, β we get

${h}_{1,\beta }\left(m\right)=\begin{cases}\beta -\frac{2m}{3}+\frac{1}{3} & \text{if } m\le \beta ,\\[2pt] \frac{\beta }{2}-\frac{m}{6}+\frac{1}{3} & \text{if } \beta <m\le \frac{3\beta }{2},\\[2pt] \frac{m}{2}-\frac{\beta }{2}+\frac{1}{3} & \text{if } \frac{3\beta }{2}<m\le 1.\end{cases}$

Observe that h1, β(m) decreases when $0\le m\le \frac{3\beta }{2}$, and increases when $\frac{3\beta }{2}<m\le 1$. We then find the following:

• If $\frac{3\beta }{2}\le 1$, then h1, β has a unique minimum at $m=\frac{3\beta }{2}$.

• If $\frac{3\beta }{2}\ge 1$, the function h1, β decreases in the whole interval [ 0,1] and has its minimum at m = 1.

Hence, in order to compute the function h2, let us note that

• If $\frac{3\beta }{2}\le 1$, then $\stackrel{̂}{m}\left(\beta \right)=\frac{3}{2}\beta$ which implies that $\stackrel{̂}{b}\left(\beta \right)=\frac{3\beta }{4}$.

• If $\frac{3\beta }{2}\ge 1$, $\stackrel{̂}{m}\left(\beta \right)=1$ which implies that $\stackrel{̂}{b}\left(\beta \right)=\left(\beta -\frac{1}{3}\right)\frac{3}{2}$.

Then h2 is given by

${h}_{2}\left(\beta \right)=\begin{cases}\frac{\beta }{4}+\frac{1}{3} & \text{if } \beta \le \frac{2}{3},\\[2pt] \frac{\beta }{2}+\frac{1}{6} & \text{if } \frac{2}{3}\le \beta \le 1,\end{cases}$
(9)

so h2 is increasing in β, which implies that $\left(\stackrel{~}{m}\left(\beta \right),\stackrel{~}{b}\left(\beta \right)\right)=\left(\stackrel{̂}{m}\left(\beta \right),\stackrel{̂}{b}\left(\beta \right)\right)$ is the unique solution of (5).

Hence, we have

$\begin{cases}\tilde{m}\left(\beta \right)=\frac{3\beta }{2},\quad \tilde{b}\left(\beta \right)=\frac{3\beta }{4} & \text{if } \beta \le \frac{2}{3},\\[2pt] \tilde{m}\left(\beta \right)=1,\quad \tilde{b}\left(\beta \right)=\left(\beta -\frac{1}{3}\right)\cdot \frac{3}{2} & \text{if } \frac{2}{3}\le \beta \le 1.\end{cases}$
(10)
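Equations 9 and 10 can be cross-checked numerically against a brute-force minimization of PFW along the segment $\frac{1}{3}m+\frac{2}{3}b=\beta$. A sketch in Python (all names ours; `pfw` restates the closed form for PFW derived above):

```python
def pfw(m, b):
    """Best winning probability of the player against host strategy (m, b)."""
    return max(m / 3, 2 * b / 3) + max((1 - m) / 3, (1 - b) / 3)

def host_optimal(beta):
    """Optimal host strategy from Equation 10 for opening frequency beta."""
    if beta <= 2 / 3:
        return 1.5 * beta, 0.75 * beta
    return 1.0, 1.5 * (beta - 1 / 3)

def h2(beta):
    """Player's winning probability against the optimal host, Equation 9."""
    return beta / 4 + 1 / 3 if beta <= 2 / 3 else beta / 2 + 1 / 6

def h2_brute(beta, steps=2000):
    """Minimize PFW over the segment m/3 + 2b/3 = beta with 0 <= m, b <= 1."""
    best = 1.0  # upper bound: PFW never exceeds 1
    for i in range(steps + 1):
        m = i / steps
        b = (3 * beta - m) / 2
        if 0 <= b <= 1:
            best = min(best, pfw(m, b))
    return best
```

For any β, `h2(beta)` agrees with the grid minimum and with `pfw(*host_optimal(beta))`, confirming that the minimizer on the segment is the one given in Equation 10.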

Host’s strategies

It is worth noting that for $0\le \beta \le \frac{2}{3}$ (or, equivalently, $\frac{1}{3}\le \alpha \le \frac{1}{2}$, as we show in Section ‘The dual problem’), the optimal strategy of the host consists of double malevolence, that is, $\tilde{m}\left(\beta \right)=2\tilde{b}\left(\beta \right)$. If $\frac{2}{3}<\beta \le 1$, or equivalently $\frac{1}{2}<\alpha \le \frac{2}{3}$, this is no longer possible, since $\tilde{m}\left(\beta \right)=1$ and $\tilde{b}\left(\beta \right)>\frac{1}{2}$.

The function h2 defined in (8) represents the winning probability of a player following his best strategy, given that the host is using his best strategy. It can be seen in (9) that this probability increases at a higher rate (slope $\frac{1}{2}$ instead of $\frac{1}{4}$) once the host can no longer keep the malevolence at twice the benevolence.

Player’s strategies

We are assuming that, in the long run, the player will estimate the host’s parameters m and b well. If the host opens an extra door, the player must compare $\mathbb{P}\left({F}_{\mathrm{R}}|O\right)$ with $\frac{1}{2}$:

• If $\mathbb{P}\left({F}_{\mathrm{R}}|O\right)<\frac{1}{2}$, then the player must change doors.

• If $\mathbb{P}\left({F}_{\mathrm{R}}|O\right)>\frac{1}{2}$, then the player must keep his first choice.

• If $\mathbb{P}\left({F}_{\mathrm{R}}|O\right)=\frac{1}{2}$, then the player is indifferent.

If the host does not open an extra door, the player must compare $\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right)$ with $\frac{1}{3}$ and now:

• If $\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right)<\frac{1}{3}$, then the player must change doors.

• If $\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right)>\frac{1}{3}$, then the player must keep his first choice.

• If $\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right)=\frac{1}{3}$, then the player is indifferent.

If the host is using an optimal strategy, then there are two possibilities for these parameters:

• If m = 2b, then if the host shows an extra door, $\mathbb{P}\left({F}_{\mathrm{R}}|O\right)=\frac{1}{2}$, and the player is indifferent about changing. In addition, if the host does not show an extra door, $\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right)<\frac{1}{3}$, and the player must change doors.

• If m = 1 and $b>\frac{1}{2}$, then if the host opens an extra door, $\mathbb{P}\left({F}_{\mathrm{R}}|O\right)<\frac{1}{2}$, and the player must change. In addition, if the host does not open an extra door, then $\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right)=0<\frac{1}{3}$, and so the player must change doors.

In summary, the player’s strategy can be to always change, and this is optimal whenever the host uses an optimal strategy.
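This can be verified symbolically: the always-changing player wins with probability $\frac{1+b}{3}$ regardless of m, and along the host’s optimal strategies this coincides with PFW. A sketch in Python (names ours):

```python
def pfw(m, b):
    """Best winning probability against host strategy (m, b)."""
    return max(m / 3, 2 * b / 3) + max((1 - m) / 3, (1 - b) / 3)

def always_change(m, b):
    """Winning probability of the player who always changes doors:
    he wins when the first choice was wrong and the host opened a door,
    or when it was wrong, no door was opened, and he picks the right
    one of the two remaining doors (probability 1/2)."""
    return 2 * b / 3 + (1 - b) / 3   # = (1 + b)/3, independent of m

def host_optimal(beta):
    """Optimal host strategy from Equation 10."""
    return (1.5 * beta, 0.75 * beta) if beta <= 2 / 3 else (1.0, 1.5 * (beta - 1 / 3))
```

Against a non-optimal host, always changing can be strictly suboptimal (for example at (m, b) = (0.5, 0.2)), which is why the equality only holds along the host’s optimal curve.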

The dual problem

Our goal now is to show the equivalence of solving (4) and (5). This gives the values of the desired optimal probabilities (m, b) of the host as functions of α.

Lemma 4.1.

For any $\alpha \in \phantom{\rule{0.3em}{0ex}}\left[\phantom{\rule{0.3em}{0ex}}\frac{1}{3},\frac{2}{3}\right]$, let

$\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\underset{f\left(m,b\right)\le \alpha }{arg max}g\left(m,b\right),$
(11)
$\left(\stackrel{~}{m}\left(\beta \right),\stackrel{~}{b}\left(\beta \right)\right)=\underset{g\left(m,b\right)\ge \beta }{arg min}f\left(m,b\right),$
(12)

where $\beta =g\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)$, $g\left(m,b\right)=\frac{m}{3}+\frac{2b}{3}$, and f(m, b) = PFW(m, b). Then

$\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\left(\stackrel{~}{m}\left(\beta \right),\stackrel{~}{b}\left(\beta \right)\right).$

Proof

Let $\beta =g\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)$; this β will appear in the constraint of problem (12), for which we will find $\left(\tilde{m}\left(\beta \right),\tilde{b}\left(\beta \right)\right)$. Observe that, by the constraint of problem (11), we have

$f\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)\le \mathrm{\alpha .}$
(13)

If β = 1, then the only possibility is that $\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\left(1,1\right)$. Moreover, (m, b) = (1,1) is the only point which satisfies the constraint in problem (12). Hence,

$\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\left(\stackrel{~}{m}\left(\beta \right),\stackrel{~}{b}\left(\beta \right)\right)=\left(1,1\right).$

Indeed, we get the classical Monty Hall problem and in this case f(m, b) = 2/3.

Let us consider now the case β < 1.

Take (m, b) such that g(m, b) > β; then f(m, b) > α, since otherwise $\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)$ would not maximize (11).

If g(m, b) = β < 1, then (m, b) ≠ (1,1). Since g is continuous and (m, b) is not a local maximum of g, there exists a sequence {(m n , b n )}n≥1 converging to (m, b) with g(m n , b n ) > β. In particular, f(m n , b n ) > α.

So, by continuity of f, we have f(m, b) ≥ α. Hence, we have shown that whenever g(m, b) ≥ β, then f(m, b) ≥ α.

Moreover, we have that $\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)$ satisfies

$g\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\beta ,$

and therefore

$f\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)\ge \mathrm{\alpha .}$

Then, by inequality (13), we get $f\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\alpha$, so $\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)$ is a minimizer for (12); since the minimizer is unique, $\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\left(\tilde{m}\left(\beta \right),\tilde{b}\left(\beta \right)\right)$.

To finish solving the original problem (4), we only need to find an expression for

$\beta \left(\alpha \right):=g\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\underset{f\left(m,b\right)\le \alpha }{\text{max}}g\left(m,b\right)$

since we know from Lemma 4.1 that $\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\left(\stackrel{~}{m}\left(\beta \left(\alpha \right)\right),\stackrel{~}{b}\left(\beta \left(\alpha \right)\right)\right)$. The next lemma will show that β(α) is the inverse function of h2 defined in (8).

Lemma 4.2.

Let K be any set, and let $f:K\to \mathbb{R}$ and $g:K\to \mathbb{R}$ be such that

$\begin{array}{ccc}\Phi \left(\alpha \right)& =& \underset{f\left(k\right)\le \alpha }{\text{max}}g\left(k\right)\\ \Psi \left(\beta \right)& =& \underset{g\left(k\right)\ge \beta }{\text{min}}f\left(k\right)\end{array}$
(14)

are well defined, and that one of them (say Ψ) is strictly increasing in the interval I. Then $\Psi :I\to \mathrm{Im}\left(\Psi \right)$ is invertible, and its inverse is $\Phi :\mathrm{Im}\left(\Psi \right)\to I$.

Proof.

Let α = Ψ(β); then there exists an element ${k}_{0}\in K$ such that $g\left({k}_{0}\right)\ge \beta$ and $f\left({k}_{0}\right)=\alpha$. Actually $g\left({k}_{0}\right)=\beta$, because if $g\left({k}_{0}\right)={\beta }^{\prime }>\beta$, then Ψ(x) ≡ α in the interval $\left[\beta ,{\beta }^{\prime }\right]$, which contradicts the fact that Ψ is strictly increasing.

Now, to prove the lemma it is enough to see that Φ(α) = β. Since $f\left({k}_{0}\right)\le \alpha$, we have $\Phi \left(\alpha \right)\ge g\left({k}_{0}\right)=\beta$. Suppose $\Phi \left(\alpha \right)={\beta }^{\prime }>\beta$. Then, there exists ${k}_{1}$ such that $f\left({k}_{1}\right)\le \alpha$ and $g\left({k}_{1}\right)={\beta }^{\prime }>\beta$. Therefore $\Psi \left({\beta }^{\prime }\right)\le \alpha =\Psi \left(\beta \right)$ with $\beta <{\beta }^{\prime }$, which is absurd because Ψ is strictly increasing.

Take f and g as in Lemma 4.1 and $K=\left[0,1\right]×\left[0,1\right]\subseteq {\mathbb{R}}^{2}$. The functions Φ and Ψ of Lemma 4.2 are well defined because of the continuity of f and g and the compactness of K. Then h2 defined in (8) can be seen as

${h}_{2}\left(\beta \right)=\underset{g\left(m,b\right)=\beta }{\text{min}}f\left(m,b\right),$

but, since h2 is strictly increasing, the same value is obtained by minimizing over the larger set:

${h}_{2}\left(\beta \right)=\underset{g\left(m,b\right)\ge \beta }{\text{min}}f\left(m,b\right).$

Then ${h}_{2}=\Psi$, and formula (9) shows that $\Psi :\left[0,1\right]\to \left[\frac{1}{3},\frac{2}{3}\right]$ is strictly increasing. By Lemma 4.2, $\beta \left(\alpha \right):\left[\frac{1}{3},\frac{2}{3}\right]\to \left[0,1\right]$ is its inverse.

By inverting h2, using $\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\left(\stackrel{~}{m}\left(\beta \left(\alpha \right)\right),\stackrel{~}{b}\left(\beta \left(\alpha \right)\right)\right)$ and (10) we get

$\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)=\begin{cases}\left(6\alpha -2,\,3\alpha -1\right) & \text{if } \frac{1}{3}\le \alpha \le \frac{1}{2},\\[2pt] \left(1,\,3\alpha -1\right) & \text{if } \frac{1}{2}\le \alpha \le \frac{2}{3}.\end{cases}$
(15)
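As a consistency check on Equation 15, applying PFW to $\left({m}^{\star }\left(\alpha \right),{b}^{\star }\left(\alpha \right)\right)$ must return α itself. A sketch in Python (names ours; `pfw` restates the closed form for PFW from Section ‘Optimal strategies’):

```python
def pfw(m, b):
    """Best winning probability of the player against host strategy (m, b)."""
    return max(m / 3, 2 * b / 3) + max((1 - m) / 3, (1 - b) / 3)

def host_strategy(alpha):
    """Optimal host strategy (m*, b*) from Equation 15, for 1/3 <= alpha <= 2/3."""
    if alpha <= 0.5:
        return 6 * alpha - 2, 3 * alpha - 1
    return 1.0, 3 * alpha - 1
```

Both branches agree at α = 1/2, where the strategy is (1, 1/2); the endpoints α = 1/3 and α = 2/3 recover (0, 0) (never open) and (1, 1) (the classical game).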

A related model

The same approach can be applied to the Monty Hall game described in [6]. In this version, the player is not allowed to change doors if the host does not open an extra door. If we call PFW c (m, b) the best probability for the player to finally win when the host uses strategy (m, b), then we find that

$\begin{array}{ll}\phantom{\rule{.5em}{0ex}}{\mathit{\text{PFW}}}_{\mathrm{c}}\left(m,b\right)& =\mathbb{P}\left(O\right)max\left(\mathbb{P}\left({F}_{\mathrm{R}}|O\right),1-\mathbb{P}\left({F}_{\mathrm{R}}|O\right)\right)+\mathbb{P}\left({O}^{\mathrm{c}}\right)\mathbb{P}\left({F}_{\mathrm{R}}|{O}^{\mathrm{c}}\right)\phantom{\rule{2em}{0ex}}\\ =\left(\frac{m+2b}{3}\right)max\left(\frac{m}{m+2b},1-\frac{m}{m+2b}\right)\phantom{\rule{2em}{0ex}}\\ \phantom{\rule{1em}{0ex}}+\left(1-\frac{m+2b}{3}\right)\frac{1-m}{3-\left(m+2b\right)}.\phantom{\rule{2em}{0ex}}\end{array}$

As before, we analyze the function on the segment $\frac{m+2b}{3}=\beta$ and define

${h}_{1,b}^{c}\left(m\right)={\mathit{\text{PFW}}}_{\mathrm{c}}\left(m,\left(\frac{3\beta -m}{2}\right)\right).$

After some calculations, we get

$h^{c}_{1,\beta}(m) = \begin{cases} \beta - \frac{2m}{3} + \frac{1}{3} & \text{if } m \le \frac{3\beta}{2},\\ \frac{1}{3} & \text{if } m \ge \frac{3\beta}{2}. \end{cases}$

So we have the following:

• If $\frac{3\beta}{2}\le 1$, then any $m\ge\frac{3\beta}{2}$ is a global minimizer of $h^{c}_{1,\beta}(m)$, and its minimum value is $\frac{1}{3}$.

• If $\frac{3\beta}{2}\ge 1$, then $h^{c}_{1,\beta}(m)$ attains its minimum at $m=1$, and its minimum value is $\beta-\frac{1}{3}$.
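The piecewise formula and the two bullet points can be checked numerically. The sketch below uses the algebraic simplification $\mathit{PFW}_{\mathrm{c}}(m,b) = \frac{1}{3}\max(m,2b) + \frac{1}{3}(1-m)$ of the displayed expression; function names are ours.

```python
# Check h^c_{1,beta}(m) against the piecewise formula and locate its minima.
# pfw_c uses the simplification (1/3)max(m, 2b) + (1/3)(1 - m); names are ours.

def pfw_c(m, b):
    return (max(m, 2 * b) + (1 - m)) / 3

def h1c(beta, m):
    # restrict pfw_c to the segment (m + 2b)/3 = beta
    return pfw_c(m, (3 * beta - m) / 2)

for beta in (0.2, 0.5, 0.8):
    m_lo, m_hi = max(0.0, 3 * beta - 2), min(1.0, 3 * beta)  # keep b in [0, 1]
    grid = [m_lo + (m_hi - m_lo) * k / 200 for k in range(201)]
    for m in grid:
        expected = beta - 2 * m / 3 + 1 / 3 if m <= 3 * beta / 2 else 1 / 3
        assert abs(h1c(beta, m) - expected) < 1e-9
    # minimum value: 1/3 when beta <= 2/3, else beta - 1/3 (at m = 1)
    min_val = min(h1c(beta, m) for m in grid)
    target = 1 / 3 if beta <= 2 / 3 else beta - 1 / 3
    assert abs(min_val - target) < 1e-9
```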

Repeating the previous argument, we define

$\left(\hat{m}_{c}(\beta), \hat{b}_{c}(\beta)\right) = \operatorname*{arg\,min}_{\frac{m+2b}{3}=\beta} \mathit{PFW}_{c}(m,b),$

and let us introduce the function ${h}_{2}^{c}$ defined as

$h_{2}^{c}(\beta) = \mathit{PFW}_{c}\left(\hat{m}_{c}(\beta), \hat{b}_{c}(\beta)\right).$
(16)

We get

$h_{2}^{c}(\beta) = \begin{cases} \frac{1}{3} & \text{if } \beta \le \frac{2}{3},\\ \beta - \frac{1}{3} & \text{if } \beta \ge \frac{2}{3}. \end{cases}$
(17)
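Formula (17) can be confirmed by brute-force minimization over the segment. A small sketch, again using the algebraic simplification $\frac{1}{3}\max(m,2b)+\frac{1}{3}(1-m)$ of $\mathit{PFW}_{\mathrm{c}}$ (function names are ours):

```python
# Brute-force check of h_2^c(beta): minimize PFW_c over the segment
# (m + 2b)/3 = beta with m, b in [0, 1]. Names are ours.

def pfw_c(m, b):
    return (max(m, 2 * b) + (1 - m)) / 3  # simplification of the defining formula

def h2c(beta, steps=2000):
    m_lo, m_hi = max(0.0, 3 * beta - 2), min(1.0, 3 * beta)  # keep b in [0, 1]
    ms = [m_lo + (m_hi - m_lo) * k / steps for k in range(steps + 1)]
    return min(pfw_c(m, (3 * beta - m) / 2) for m in ms)

for beta in (0.1, 0.4, 2 / 3, 0.75, 0.9, 1.0):
    target = 1 / 3 if beta <= 2 / 3 else beta - 1 / 3
    assert abs(h2c(beta) - target) < 1e-3
```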

Now, $h_{2}^{c}(\beta)$ is the proportion of times the player wins the car when both the player and the host use their optimal strategies (recall that β is the proportion of times an extra door is shown to the player). The comparison with $h_2(\beta)$ is shown in Figure 1.

Since $h_{2}^{c}(\beta)$ is strictly increasing on $\left[\frac{2}{3},1\right]$, the argument at the end of Section ‘The dual problem’ can be repeated to find $\beta_{c}(\alpha)\colon \left[\frac{1}{3},\frac{2}{3}\right]\to\left[\frac{2}{3},1\right]$, the inverse function of $h_{2}^{c}(\beta)$. Now, by using

$\left(m_{c}^{\star}(\alpha), b_{c}^{\star}(\alpha)\right) = \left(\tilde{m}_{c}(\beta_{c}(\alpha)), \tilde{b}_{c}(\beta_{c}(\alpha))\right),$

and $\tilde{m}_{c}(\beta)=1$, $\tilde{b}_{c}(\beta)=\frac{3\beta-1}{2}$ for $\beta\ge\frac{2}{3}$, we get

$\left(m_{c}^{\star}(\alpha), b_{c}^{\star}(\alpha)\right) = \left(1, \frac{3\alpha}{2}\right).$

This formula can be compared with (15).
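As a consistency check, with this strategy the player’s optimal winning probability is exactly α. A quick sketch, using the algebraic simplification $\frac{1}{3}\max(m,2b)+\frac{1}{3}(1-m)$ of $\mathit{PFW}_{\mathrm{c}}$ (names are ours):

```python
# Check that PFW_c(1, 3*alpha/2) = alpha for alpha in [1/3, 2/3].

def pfw_c(m, b):
    return (max(m, 2 * b) + (1 - m)) / 3  # simplification of the defining formula

for k in range(34, 67):
    alpha = k / 100  # sweep alpha over (1/3, 2/3)
    assert abs(pfw_c(1.0, 3 * alpha / 2) - alpha) < 1e-12
```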

We see that the player’s chances are worse than in the previous version of the game. From the host’s perspective, in this version he can pay the same prizes as before while opening an extra door more often. However, this is achieved by reducing the number of times he opens a door when the player’s first choice misses, in which case the player does not have the option to change doors, which is a very unfriendly policy.

Conclusions

We have analyzed a different version of the Monty Hall problem, where the player faces a host who has his own objective (to maximize the audience) and a limited budget. From the host’s perspective, he must solve an inverse problem in zero-sum game theory: to determine the payoffs of a game with a given minimax equilibrium. From the player’s point of view, this is a toy model of a decision process where the agent does not know whether the received information is beneficial or not (i.e., whether the host is benevolent or malevolent), although the player knows the probability of each behavior.

We show that the problem can be formulated both in terms of the expected prize that the host will pay (α) and in terms of the proportion of times he opens a door (β). The latter is more convenient for computations.

References

1. Selvin S: A problem in probability. Am. Statistician 1975, 29(1):67.

2. Gill R: The Monty Hall problem is not a probability puzzle (it’s a challenge in mathematical modelling). Statistica Neerlandica 2011, 65: 58–71. 10.1111/j.1467-9574.2010.00474.x

3. Rosenhouse J: The Monty Hall Problem. New York: Oxford University Press; 2009.

4. Granberg D: To switch or not to switch. In vos Savant M (ed.): The Power of Logical Thinking. New York: St. Martin’s Press; 1996.

5. Tierney J: Behind Monty Hall’s doors: puzzle, debate and answer? The New York Times, July 21, 1991.

6. Schuller JC: The malicious host: a minimax solution of the Monty Hall problem. J. Appl. Stat 2012, 39: 215–221. 10.1080/02664763.2011.580337

7. Fernandez L, Piron R: Should she switch? A game-theoretic analysis of the Monty Hall problem. Math. Mag 1999, 72: 214–217. 10.2307/2690884

8. Abdel-Aty MA, Abdalla MF: Examination of multiple mode/route choice paradigms under ATIS. IEEE Trans. Intell. Transportation Syst 2006, 7: 332–348. 10.1109/TITS.2006.880634

9. Ben-Elia E, Shiftan Y: Which road do I take? A learning-based model of route choice with real-time information. Transportation Res. Part A Policy Pract 2010, 44: 249–264. 10.1016/j.tra.2010.01.007

10. Gao S, Frejinger E, Ben-Akiva E: Adaptive route choices in risky traffic networks: a prospect theory approach. Transportation Res. Part C: Emerg. Technol 2010, 18: 727–740. 10.1016/j.trc.2009.08.001

Acknowledgements

AA is a fellow of University of Buenos Aires, and JPP is a member of CONICET. This research was partially supported by grants W276 and 20020100100400 from the University of Buenos Aires and by CONICET (Argentina) PIP 5478/1438.

Author information


Corresponding author

Correspondence to Juan P Pinasco.

Competing interests

The authors declare that they have no competing interests.



Alvarez, A., Pinasco, J.P. Monty Hall game: a host with limited budget. J. Uncertain. Anal. Appl. 2, 2 (2014). https://doi.org/10.1186/2195-5468-2-2