class: center, middle, inverse, title-slide

.title[
# Confidence intervals with bootstrapping
]

.subtitle[
## STA35B: Statistical Data Science 2
]

.author[
### Akira Horiguchi
Figures taken from [IMS], Cetinkaya-Rundel and Hardin text
]

---

### Based on Ch 12 of IMS

.pull-left[
We'll now talk a bit about *confidence intervals*:
<!-- * Core idea: use a sample proportion to estimate a population proportion -->

* A confidence interval provides a plausible range of values (an interval) where we expect to find the true population proportion
* To construct such an interval from our data, we can use the same randomization idea we used to test the null hypothesis. We saw how to use randomization to see whether a difference in sample proportions was due to chance
  * useful for yes/no questions (e.g., Does this vaccine make it less likely to get a disease? Does drinking caffeine affect athletic performance?)
* We will now discuss how to *estimate* the value of an unknown parameter (e.g., How much less likely am I to get a disease if I get a vaccine? How much faster can I run if I have caffeine?)
]

.pull-right[
<!-- The technique we'll focus on is called *bootstrapping* -->
We will construct confidence intervals using a technique called *bootstrapping*

* Goal is to understand the variability inherent in a statistic (e.g., the sample mean)
* If we can understand how different sample means are from the population mean, then we can make decisions about what was due to random chance vs. what is a true property of the population
]

---

.pull-left[
#### Medical consultant study

Consider the following setting: People seek out a medical consultant for navigating the donation of a liver.

* The average complication rate for liver donor surgeries is 10%
* One consultant claims that her clients have only had 3 complications in the 62 liver surgeries she has facilitated
* Is this strong evidence that her work meaningfully contributes to reduced complications?
* Let `\(p\)` = true complication rate for liver donors working with this consultant
* We estimate `\(p\)` using data; label the estimate as `\(\hat p\)`
* In this sample, the complication rate is `\(\hat p = 3/62 \approx 0.048\)`.
]

.pull-right[
What can we infer from this estimate?

* It is NOT possible to assess the consultant's claim using these data.
* The claim is about a *causal* connection, but we are only looking at *observational data*.
* There could be confounders (e.g., she may refuse to take patients who are likely to have failed surgeries; or people who hire medical consultants may be richer or healthier, and so have fewer complications)
* What we *can* do is get a sense of the consultant's true rate of complications.
]

---

.pull-left[
#### Bootstrapping: the main idea

Suppose we have `\(x_1, \dots, x_n\)` as a random sample from a population

* If we **resample** from `\(x_1, \dots, x_n\)` with replacement, this mimics drawing new samples from the true population
* For the medical consultant setting, this means imagining we have a big bucket of 62 index cards, with 3 that say "complication" and 59 that say "no complication"
* We shuffle the bucket, reach in, draw a card, record what it says, then put the card *back into the bucket*, and continue
* Since we are randomly sampling from our original *sample*, and that sample was itself a random sample from the population, we can get an idea of sample-to-sample variability
]

.pull-right[
<img src="boot1prop2.png" width="675" />

- Each resample is called a *bootstrap sample*
- The diagram shows `\(k\)` bootstrap samples
- Typically each bootstrap sample has the same number of observations as the original sample (though this is not required)
]

---

<img src="boot1propboth.png" width="80%" />

---

.pull-left[
* What happens if we take 10,000 bootstrap samples of the medical consultant data?
* Remember that the original data had 62 observations, 3 of which had "complications"; `\(\hat p = 3/62 \approx 0.0484\)`.

<img src="lec15_files/figure-html/fig-MedConsBSSim-1.png" width="432" />
]

.pull-right[
* Since the 2.5th percentile is at 0 and the 97.5th percentile is at 0.113, we are 95% confident that, in the population, the true probability of a complication for this consultant's clients is between 0% and 11.3%.
* We were asked to compare this to the national rate of 10%.
* Since our interval of 0-11.3% *includes* 10%, we cannot say that the consultant's work was associated with a lower risk of complications; the difference could just be due to chance
* Even if the interval did not include 10%, we could not make a claim about causality.
]

---

.pull-left[
#### Example: tappers and listeners

Consider this study: a person conducts an experiment using the "tapper-listener" game.

* Goal: pick a simple, well-known song; tap the tune on your desk; and see if the other person can guess the song
* Data: 120 tappers and 120 listeners; 50% of tappers expected that the listener would be able to guess the song.
  - Is 50% a reasonable guess?
  - In the study, 3 / 120 (`\(\hat p = 0.025\)`) listeners were able to guess the song
  - Given this, what are typical values one could expect for the proportion?
* We can use bootstrapping as before: imagine we have a jar with 120 marbles, of which 3 are green (guessed correctly) and 117 are red (could not guess the song)
]

.pull-right[
<!-- * Bootstrapping corresponds to shuffling the jar, grabbing a marble, recording the response, then putting the marble back in, and repeating; e.g.
the first 5 times we get -->
<!-- | W | W | W | R | W | -->
<!-- |:-----:|:-----:|:-----:|:-------:|:-----:| -->
<!-- | Wrong | Wrong | Wrong | Correct | Wrong | -->

* We draw 10,000 bootstrap samples (each of 120 marbles, with replacement) and visualize the resulting proportions:

<img src="lec15_files/figure-html/fig-tappers-bs-sim-1.png" width="432" />

* We expect that between 0% and 5.83% of listeners are able to guess the tapper's tune
]

---

.pull-left[
#### Bootstrap confidence interval

**Confidence interval**: plausible range of values for (unknown) population parameter `\(p\)`

**Bootstrap procedure**: if we have `\(n\)` observations, responses in two categories, with initial estimate `\(\hat p\)` of the proportion in category #1

- Randomly draw `\(n\)` observations from the original sample **with replacement**
- Each resample is called a "bootstrap sample"; let's index the bootstrap samples as `\(i=1,2, \ldots, m\)`, where `\(m\)` could be 100 or 1000 or 10000, etc.
- Each bootstrap sample produces its own proportion estimate `\(\hat p_{boot, i}\)`
<!-- - Each time we randomly sample `\(n\)` observations with replacement, we get the `\(i\)`-th "bootstrap sample", each one with a different estimate `\(\hat p_{boot, i}\)` -->
- We examine the **distribution** of the `\(\hat p_{boot, i}\)` (dot plot, histogram, ...)
- The `\(\hat p_{boot,i}\)` values will be centered around the original estimate `\(\hat p\)`
- The original `\(\hat p\)` is centered around the population `\(p\)`
- Thus an interval estimate for `\(p\)` can be computed using the `\(\hat p_{boot, i}\)`
]

.pull-right[
More formally, the 95% bootstrap confidence interval for parameter `\(p\)` can be estimated using the (ordered) `\(\hat p_{boot, i}\)` values

* Call `\(a\)` = 2.5th percentile of the bootstrapped proportions, `\(b\)` = 97.5th percentile of the bootstrapped proportions
* 95% bootstrapped confidence interval: `\((a, b)\)` = those values between `\(a\)` and `\(b\)`
]

---

.pull-left[
#### Example: YouTube videos

Want to estimate the proportion of YouTube videos taking place outdoors

* We sample 128 videos and find 37 take place outdoors
* Want to estimate the proportion of all YouTube videos which take place outdoors via a bootstrap confidence interval; we get the following

<img src="lec15_files/figure-html/unnamed-chunk-5-1.png" width="432" />
]

.pull-right[
* What is the relevant statistic and parameter for this problem?
  - Statistic: sample proportion `\(\hat p = 37/128 \approx 0.289\)`; parameter: population proportion (`\(p\)`; unknown)
* If we want to be 90% confident that between `\(a\)`% and `\(b\)`% of YouTube videos take place outdoors, how should we find `\(a\)` and `\(b\)`?
  - We want 5% of the bootstrapped proportions to be below `\(a\)`, and 5% to be above `\(b\)`
  - The interval should be roughly centered at `\(\hat p \approx 0.289\)` (so, `\(\hat p \approx (a + b)/2\)`)
  - From the graph, we see that `\(a\approx 0.22\)` and `\(b\approx 0.35\)` are reasonable choices.
]
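
---

The bootstrap procedure and the percentile interval from the preceding slides can be sketched in a few lines of code. This is only an illustration, not the code used to produce the figures in these slides: it assumes Python with NumPy, and the function name `bootstrap_ci_proportion`, its parameters, and the seed are hypothetical choices made for this example.

```python
import numpy as np


def bootstrap_ci_proportion(successes, n, level=0.95, m=10_000, seed=0):
    """Percentile bootstrap interval for a proportion (illustrative sketch)."""
    rng = np.random.default_rng(seed)  # seed is an arbitrary assumption
    # Original sample: 1 = in the category of interest, 0 = otherwise.
    sample = np.concatenate([np.ones(successes), np.zeros(n - successes)])
    # Draw m bootstrap samples of size n, with replacement (one per row).
    resamples = rng.choice(sample, size=(m, n), replace=True)
    # One bootstrapped proportion per bootstrap sample.
    p_boot = resamples.mean(axis=1)
    # Cut off equal tails on each side, e.g. 2.5% and 97.5% for level=0.95.
    tail = (1 - level) / 2
    a, b = np.quantile(p_boot, [tail, 1 - tail])
    return a, b, p_boot


# Medical consultant data: 3 complications in 62 surgeries, 95% interval.
a, b, _ = bootstrap_ci_proportion(successes=3, n=62, level=0.95)
print(f"consultant 95% interval: ({a:.3f}, {b:.3f})")

# YouTube data: 37 outdoor videos out of 128, 90% interval.
a, b, _ = bootstrap_ci_proportion(successes=37, n=128, level=0.90)
print(f"YouTube 90% interval: ({a:.3f}, {b:.3f})")
```

The exact endpoints vary slightly with the seed and with `m`, but they should land near the values read off the histograms. For the consultant data the lower endpoint is 0 because more than 2.5% of the bootstrap samples contain no complications at all.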