## Wednesday, July 18, 2012

### Problem with probilities - Monte Hall's Problem

* There once was a game show hosted by Monte Hall.  The game show at some point had the contestant choosing between one of three doors.

* Behind two of the doors there was a goat while behind the third one there was a car.

* The contestant would first choose one door.  Then Monte would reveal a goat from one of the two remaining doors (being how Monte Hall knew where the car was) and the contestant had the chance to choose a new door (the one that was not opened) at that time.

* After that the doors were openned and the goat or car was revealed behind the door that was finally chosen.  If the contestant had chosen the door with the car then the contestant won the car.

* So the question was asked, "What is the ideal strategy for this game?"

* A clever columnist responded by stating that the optimal stategy was to always change your door after one goat is revealed.

* Why is this?

* Simply, if you stick with you original door as a strategy then your probability of being right is 1/3.

* However, if you change your vote after one of the goats in revealed then you have increased your probability of getting the car to 2/3.

* One way to think about this is, if the probability of getting a car from the initial initial_guess is 1/3 then the strategy of keeping with your original initial_guess must yield the same results.

* However, if getting a car by sticking with your original initial_guess is 1/3 then by switching your initial_guess you must be therefore getting the remaining probabily of getting a car (ie. 1-1/3=2/3)

* Don't believe it? I did not at first either!

* Well let's write a simulation to check the expected results of the two strategies

* First let's imagine 100,000 repeated games (a decent sampling for an 'unknown' probability distribution)

clear
set obs 100000

* The car is behind one of the doors: (1,2,3)

gen car = ceil(runiform()*3)
tab car
* Looks good

* The contestant initially initial_guesses one of the doors:

gen initial_guess = ceil(runiform()*3)
tab initial_guess

* This also looks good.

* Now a bit more tricky we must find which door is revealed.

* The revealed door cannot be the door the contestant chose and it cannot be the door that has the car.

* Let's just do a while loop and add one to bad initial_guesses till we draw a door not drawn by the contestant.

gen revealed = ceil(runiform()*3)

* Loop through twice adding one to any revealed value that is not legitamate
forv i = 1/2 {
replace revealed = mod(revealed+1,3)+1 if revealed == initial_guess | revealed == car
}

* This count command will count how many random revealed doors are not legitamate.
count if revealed == initial_guess | revealed == car
* The loop successfully worked

tab revealed
* We can see revealed still looks like it is performing well (in terms of equal sampling from each of the doors)

* Now let's formulate three strategies
* 1. Choose a door and stick with it
* 2. Choose a door then always switch
* 3. Choose a door then fair flip a coin, if heads switch.

* For strategy one it is easy
gen door1_choice = initial_guess

gen strat1_car = 0
replace strat1_car = 1 if door1_choice == car

* For stragey two it is a bit more complicated.  First we need to figure out how to identify which alternative is chosen.
gen door2_choice = mod(initial_guess+1,3)+1
* That is if the empty value is 1 plus the initial choice
replace door2_choice = mod(door2_choice+1,3)+1 if revealed==door2_choice
* But there is the possibility that the new choice will be the door revealed which is not good.
* However, in the modular system if we add +1 and +1 to a initial draw then we have gone through all of the potential problems.
count if revealed==door2_choice | initial_guess == door2_choice
* No violations of the strategy

gen strat2_car = 0
replace strat2_car = 1 if door2_choice == car

*   Finally strategy three: choose a door randomly from the two doors remaining
* In order to execute these random draws of the remaining choices we will use a repeated loop to keep drawing random draws until all of the observations have draws not equal to the revealed draw.

gen door3_choice = ceil(runiform()*3)

count if door3_choice == revealed
while r(N) > 0 {
replace door3_choice = ceil(runiform()*3) if door3_choice == revealed
count if door3_choice == revealed
}

tab door3_choice

gen strat3_car = 0
replace strat3_car = 1 if door3_choice == car

* Before executing this next bit of code, check your intuition.

* What do you expect to be the probabilities of each strategy?  Strategy 1 and 2 have beed discussed, what about three.

* Ok, drum roll....

sum strat?_car

* Surpising?  Perhaps not.

* Interesting how different strategies are clearly so much better than other strategies ;)

* Note, btw that strategy 1 and 2 have the same standard deviation.  That is because for a Bournoulli random variable X, var(X)=P(1-P).  Where P is the probability of X=1.