Imagine that you have 50 pennies labeled 1 through 50. You flip each one 10 times and write on each coin how many times it landed heads up.
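If you do not have 50 pennies handy, the experiment can be sketched in R. This is a minimal simulation that assumes every penny is fair; the names `n_coins`, `tosses`, `p_heads`, and `results`, and the seed, are my own choices for illustration.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

n_coins <- 50   # pennies labeled 1 through 50
tosses  <- 10   # flips per penny
p_heads <- 0.5  # assumption: every penny is fair

# One draw per penny: the number of heads in 10 flips
results <- rbinom(n_coins, tosses, p_heads)

# Tally how many pennies landed on each possible head count (0 through 10)
head_counts <- table(factor(results, levels = 0:tosses))
head_counts
```

The position of each value in `results` plays the role of the label written on the coin.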
I should have said that we are looking for some lucky pennies, because I am going to use the lucky pennies we find in a magic trick later. To find a lucky penny, I need to decide how many heads I need to observe to be surprised.
I am going to cheat, because I happen to remember just enough theoretical statistics to know that when you observe a group of boolean trials, you can use the binomial distribution to calculate the probability of observing any number of one type of outcome (heads) out of a known number of trials. (We are not going to show the equation, but it’s a combinatorial expression with factorials, so it is pretty.)
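R will evaluate the binomial distribution for us, so we can look at the probabilities without writing out the equation. A small sketch, assuming a fair coin and 10 tosses (the variable names are mine):

```r
tosses  <- 10
p_heads <- 0.5  # assumption: a fair coin

# Probability of seeing exactly k heads in 10 tosses, for k = 0, 1, ..., 10
prob_heads <- dbinom(0:tosses, size = tosses, prob = p_heads)
names(prob_heads) <- 0:tosses

round(prob_heads, 4)
```

The probabilities sum to 1, and the most likely single outcome is 5 heads.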
For the sake of brevity, I am going to tell you it takes a lot to surprise me.
I am going to be a traditionalist here and decide I will be surprised if there are 8 or more heads out of 10 tosses.
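To see what that rule implies, here is a rough sketch of the chance that a single fair penny shows 8 or more heads, and how many of the 50 pennies we should expect to clear the bar by luck alone. The name `heads_needed` is hypothetical, and the fair-coin probability is an assumption.

```r
n_coins      <- 50
tosses       <- 10
p_heads      <- 0.5  # assumption: fair coins
heads_needed <- 8    # surprised at 8 or more heads

# P(X >= 8) is the upper tail beyond 7 heads
p_surprise <- pbinom(heads_needed - 1, tosses, p_heads, lower.tail = FALSE)
p_surprise  # 56/1024 = 0.0546875

# Expected number of "lucky" pennies among 50 fair ones
expected_lucky <- n_coins * p_surprise
expected_lucky
```

So even if every penny is perfectly ordinary, we should expect between two and three of them to look lucky.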
I can confidently carry these pennies into my magic show knowing they will continue to perform as well there as they did here.
## [1] 2.049443e-86
## [1] 0.0006958708
## [1] 0.05337477
## [1] 0.04684365
As statisticians, we know that 10 tosses is not a big enough sample for a good estimate, so let’s make it bigger.
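With more tosses, the surprise threshold has to move too. The sketch below compares a few candidate thresholds by their upper-tail probability. I am assuming 1000 tosses per coin here; `candidates` and `tail_prob` are names I made up for the illustration.

```r
tosses_large <- 1000  # assumption: number of tosses per coin
p_heads      <- 0.5   # assumption: fair coins

# Candidate thresholds and the chance of a fair coin reaching each one
candidates <- 524:528
tail_prob  <- pbinom(candidates - 1, tosses_large, p_heads, lower.tail = FALSE)

data.frame(threshold = candidates, tail_prob = round(tail_prob, 4))
```

Under these assumptions, the tail probability crosses 0.05 between 526 and 527 heads.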
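n_coins <- 50  # assumed: the 50 pennies from the start of the post
tosses_large <- 1000  # assumed: tosses per coin in the larger experiment
p_heads <- 0.5  # assumed: fair coins
surprise_threshhold_large <- 526
results_large <- rbinom(n_coins, tosses_large, p_heads)  # heads per coin
picks_large <- which(results_large >= surprise_threshhold_large)  # coins at or above the threshold
length(picks_large)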
## [1] 4
## [1] 5 15 26 39
## [1] 528 533 531 539
The mistake is not in the initial sample, the results of flipping the 50 coins. It is the belief that the coins that behaved in a surprising way will continue to behave in a surprising way.
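We can check that belief by flipping the lucky coins again. The sketch below repeats the whole experiment many times and compares how the picked coins did when they were chosen with how they did on a second round. The function `one_show`, the number of repetitions, and the seed are my own choices, and I am assuming 50 fair coins with 1000 tosses each.

```r
set.seed(2024)  # arbitrary seed so the sketch is reproducible

n_coins <- 50
tosses_large <- 1000  # assumption: tosses per coin
p_heads <- 0.5        # assumption: fair coins
surprise_threshhold_large <- 526

# One run: pick the lucky coins, then flip only those coins again
one_show <- function() {
  first  <- rbinom(n_coins, tosses_large, p_heads)
  picks  <- which(first >= surprise_threshhold_large)
  second <- rbinom(length(picks), tosses_large, p_heads)
  c(first = sum(first[picks]), second = sum(second), n = length(picks))
}

sims   <- replicate(1000, one_show())  # 3 x 1000 matrix
totals <- rowSums(sims)

# Average heads per lucky coin, when picked and when flipped again
avg_first  <- totals[["first"]] / totals[["n"]]
avg_second <- totals[["second"]] / totals[["n"]]
c(avg_first = avg_first, avg_second = avg_second)
```

The first average is at least 526 by construction, because that is how the coins were chosen. The second average falls back toward 500, because the coins were never special.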
Number of candidate predictor variables affected the number of noise variables that gained entry to the model
Predictors | Noise | %Noise |
---|---|---|
12 | 0.43 | 20 |
18 | 0.96 | 40 |
24 | 1.44 | 46 |
Number of candidate predictor variables affected the number of noise variables that gained entry to the model
Predictors | Noise | %Noise |
---|---|---|
12 | 0.47 | 35 |
18 | 0.93 | 59 |
24 | 1.36 | 62 |
Number of candidate predictor variables affected the number of noise variables that gained entry to the model
Predictors | Actual | Noise | %Noise |
---|---|---|---|
12 | 1.70 | 0.43 | 20 |
18 | 1.64 | 0.96 | 40 |
24 | 1.66 | 1.44 | 46 |
Number of candidate predictor variables affected the number of noise variables that gained entry to the model
Predictors | Actual | Noise | %Noise |
---|---|---|---|
12 | 0.86 | 0.47 | 35 |
18 | 0.87 | 0.93 | 59 |
24 | 0.83 | 1.36 | 62 |
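The same effect can be sketched with a small simulation of variable selection. This is not the design behind the tables above: the sample size, effect sizes, and number of real predictors are all hypothetical, and I am using R's `step()`, which selects by AIC rather than by a significance threshold.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

n_obs   <- 200  # hypothetical sample size
n_real  <- 3    # hypothetical predictors that truly affect y
n_noise <- 9    # predictors unrelated to y (12 candidates in total)

# Independent standard normal predictors
X <- matrix(rnorm(n_obs * (n_real + n_noise)), nrow = n_obs)
colnames(X) <- c(paste0("real", 1:n_real), paste0("noise", 1:n_noise))

dat   <- data.frame(X)
dat$y <- drop(X[, 1:n_real] %*% rep(0.3, n_real)) + rnorm(n_obs)

# Forward selection by AIC, starting from the intercept-only model
null_model <- lm(y ~ 1, data = dat)
full_scope <- reformulate(colnames(X), response = "y")
selected   <- step(null_model,
                   scope = list(lower = ~1, upper = full_scope),
                   direction = "forward", trace = 0)

# Which candidates made it in, and how many of them are pure noise?
chosen         <- attr(terms(selected), "term.labels")
n_noise_chosen <- sum(grepl("^noise", chosen))
chosen
n_noise_chosen
```

Wrapping this in `replicate()` and varying `n_noise` would give averages comparable in spirit to the Noise column, though the numbers will depend on the assumptions above.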
\(N = 900\)
Predictors | Noise \(\alpha = 0.0016\) | Noise \(\alpha = 0.15\) |
---|---|---|
12 | 2 | 2 |
18 | 3 | 3 |
24 | 4 | 4 |
50 | 8 | 10 |
100 | 16 | 17 |