Question

Topic: Research/Metrics

Sample Size Calculations For The Un-statistical

Posted by Anonymous on 125 Points
Hi all,

I want to test an e-mailing initiative and would like my results to be statistically significant.

I am trying to figure out how many people I need to contact in order to achieve the following:

90% confidence
10% margin of error

My expected response rate is about 0.1% (very low, I know..!)

Any thoughts? I am very un-mathematical, so an answer for dummies would be ideal.

Thanks all,

RESPONSES

  • Posted on Member
    What's the universe you are sampling? If we're talking millions of people, you should be OK with 300-400 respondents, assuming it's just an A/B test -- that is, you are only testing one variable. You need 150-200 getting A and 150-200 getting B.

    The real key, though, will be making absolutely certain that the sample from which you draw your respondents is truly representative of the larger universe to which you want to project results. If you only sample current customers, for example, you can't then project results to non-customers.

    And you have to remember that when you have a self-selecting group of respondents (as you appear to), there's an automatic bias based on the fact that they CHOSE to respond. That by itself may make the sample non-representative of your universe.

    Without knowing more, this is about the best I can do. I didn't actually do the calculation to be certain you'll get 90% confidence, but I've done enough of these to be able to wing it like this and not be far off.

    You'll also want to control to be sure that the self-selection rate is about the same in both A and B groups. (It depends on what exactly you're testing as to whether this is important.)

    In order to get 400 respondents at 0.1% response rate, you'll need to mail to 400,000 people initially, of course. Just be sure you've got a good list going in. That means that the 400,000 are truly representative of the 25 million in your real target audience. That's usually the hardest part.
  • Posted by wnelson on Member
    If I am interpreting your question properly, you want a sample size large enough that the sample estimates the true response rate to within 10% with a confidence of 90%. Your expected return rate is 0.001 (0.1%), so you want to estimate the true response rate to within 0.01%, or 0.0001. Since this involves a bunch of trials with return/no return as the answer for each, this is a binomial distribution. Since the probability of success is so low and I expect the sample size to be large, the binomial can be approximated by the Poisson distribution, and the sample size for a Poisson distribution can be estimated using the standard normal distribution. That being said, here is the formula you will use:

    n = Z*SQR(n*p)/E

    SQR is the square root function
    ^2 is the “squared” function
    n = sample size
    p = expected probability = 0.001
    E = desired error = 0.0001
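    Z = the standard normal statistic for the tail area (1-confidence)/2. For 90% confidence Z is about 1.645; the calculation here uses Z = 2, which is closer to 95% confidence and so errs on the safe side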

    Dragging you through the statistics was bad enough. I won’t drag you through the algebra. So the final equation is:

    n = Z^2*p/(E*E)

    And the answer is
    n = 4*0.001/.00000001 = 400,000

    This will net you an estimate of the real return rate to within 10%. BUT, this is only true if the true value of the return rate is close to 0.1%. If you find your return rate is drastically different, then you need to recalculate. For instance, if your return rate is actually 1%, then the sample size required to be within 10% of the true value is:

    n = 4*0.01/0.000001 = 40,000

    What this means is that if your sample results yield a HIGHER return rate, your margin of error is less than 10%, meaning you are estimating the return rate more accurately. You only have to take more samples if your return rate is much less than 0.1%.

    I hope this helps.

    Wayde
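
For anyone who would rather not do the arithmetic by hand, here is a minimal Python sketch of the final equation above, n = Z^2*p/(E*E), with E taken as 10% of the expected rate. The function name `sample_size` and its parameters are made up for illustration. It defaults to Z = 2 as used in the post; for a strict 90% two-sided confidence level the usual value is about 1.645, which gives a smaller n.

```python
import math

def sample_size(p, relative_error=0.10, z=2.0):
    """Approximate sample size from n = Z^2 * p / E^2.

    p              -- expected response rate (e.g. 0.001 for 0.1%)
    relative_error -- margin of error as a fraction of p (0.10 = 10%)
    z              -- standard normal statistic (2.0 as in the post;
                      about 1.645 for 90% two-sided confidence)
    """
    e = relative_error * p            # absolute margin of error on the rate
    n = z ** 2 * p / e ** 2
    # Round off floating-point noise before taking the ceiling.
    return math.ceil(round(n, 6))

for rate in (0.001, 0.002, 0.01):
    print(f"rate {rate:.3%}: mail about {sample_size(rate):,} people")
```

With these defaults the three rates reproduce the figures quoted in the thread: 400,000, 200,000 and 40,000.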
  • Posted by wnelson on Accepted
    If you double your response rate to 0.2%, the sample size for a 10% expected error is halved, to 200,000. So, if you send out 200,000 eMails and get from 360 to 440 responses (400 plus or minus 10%), you can be 90% confident that the actual response rate is about 0.2% with similar eMail tactics on a similar audience.

    Wayde
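
The 360 to 440 range can be checked with a short Python sketch, assuming the same Poisson approximation used earlier in the thread: the expected number of responses is n*p, its standard deviation is roughly the square root of that, and the band is Z standard deviations either side. The function name `response_band` is hypothetical, and Z = 2 is the value used in the thread rather than the stricter 1.645 for 90% confidence.

```python
import math

def response_band(n, p, z=2.0):
    """Range of response counts expected from n mailings at rate p.

    Uses the Poisson approximation: mean = n*p, std dev = sqrt(n*p).
    """
    expected = n * p
    half_width = z * math.sqrt(expected)
    return expected - half_width, expected + half_width

low, high = response_band(200_000, 0.002)
print(f"{low:.0f} to {high:.0f} responses")  # prints "360 to 440 responses"
```

A count well outside that band would suggest the true response rate is not 0.2%, in which case the sample size is worth recalculating with the observed rate.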
