Question
Topic: Research/Metrics
Estimating Total Usage Incidence From Samples
Group #1:
Total count: 300,000
Sample available: 30,000
Sample Size % of Total: 10.0%
Groups 2 and 3 are laid out the same way:
Group 2:
5,000 (total)
2,000 (sample)
40.0% (sample %)
Group 3:
40,000 (total)
10,000 (sample)
25.0% (sample %)
Total:
345,000 (total)
42,000 (sample)
12.2% (sample %)
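As a quick check of the percentages above, here is a minimal Python sketch (Python rather than Excel, purely for illustration). The counts are the ones listed above; the `groups` dictionary and the `sample_pct` helper are names made up for this example.

```python
# Total counts and sample counts as listed above.
groups = {
    "Group 1": (300_000, 30_000),
    "Group 2": (5_000, 2_000),
    "Group 3": (40_000, 10_000),
}

def sample_pct(total, sample):
    """Sample size as a percentage of the total count."""
    return 100.0 * sample / total

for name, (total, sample) in groups.items():
    print(f"{name}: {sample_pct(total, sample):.1f}%")

# Combined figures across the three groups.
grand_total = sum(total for total, _ in groups.values())
grand_sample = sum(sample for _, sample in groups.values())
print(f"Total: {grand_sample:,} of {grand_total:,} = "
      f"{sample_pct(grand_total, grand_sample):.1f}%")
```

The combined sample share is 42,000 / 345,000, which rounds to 12.2%.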
Next, I use individual names to count duplicates within each group to get numbers like the following:
Group 1:
Uniques: 24,000
Single Use Sample: 20,000
Dual Use Sample: 3,000
3+ Use Sample: 1,000
…using the same layout for other groups:
Group 2:
1,150 (uniques)
1,000 (1x)
100 (2x)
50 (3+x)
Group 3:
7,100 (uniques)
6,000 (1x)
1,000 (2x)
100 (3+x)
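To make the counting step concrete, here is a minimal Python sketch of how the uniques and 1x / 2x / 3+ figures could be tallied from a list of names. It assumes one name per use record; the `usage_buckets` helper and the toy names are hypothetical, not my real data.

```python
from collections import Counter

def usage_buckets(names):
    """Return uniques and 1x / 2x / 3+ person counts for a list of use records."""
    uses_per_person = Counter(names)           # name -> number of records
    freq = Counter(uses_per_person.values())   # number of uses -> number of people
    return {
        "uniques": len(uses_per_person),
        "1x": freq[1],
        "2x": freq[2],
        "3+": sum(n for uses, n in freq.items() if uses >= 3),
    }

# Hypothetical toy sample: one name per use record.
sample = ["ann", "bob", "ann", "cy", "dee", "bob", "ann", "eve"]
print(usage_buckets(sample))
# {'uniques': 5, '1x': 3, '2x': 1, '3+': 1}
```

The 1x, 2x and 3+ counts always add up to the uniques, which matches the group figures above (for example, 20,000 + 3,000 + 1,000 = 24,000 for Group 1).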
Adding across the groups gives an incorrect “additive total”:
“Additive Total”:
32,250 (uniques)
27,000 (1x)
4,100 (2x)
1,150 (3+x)
If I actually combine the raw, individual name data from all three groups, I might get true, deduped total numbers like:
30,900 (uniques)
25,000 (1x)
4,400 (2x)
1,500 (3+x)
You’ll notice that combining the three groups increases the incidence of multi-use persons (versus the “additive total”), because some of the same users appear in more than one group. This, in turn, reduces the uniques and one-time-use people relative to the “additive total” figures.
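The difference between the “additive total” and the deduped total can be reproduced on a tiny scale. This is only a sketch with hypothetical toy names (not my real data), assuming one name per use record; `usage_buckets` is a helper invented for the illustration.

```python
from collections import Counter

KEYS = ("uniques", "1x", "2x", "3+")

def usage_buckets(names):
    """Uniques and 1x / 2x / 3+ person counts for a list of use records."""
    per_person = Counter(names)
    freq = Counter(per_person.values())
    return {
        "uniques": len(per_person),
        "1x": freq[1],
        "2x": freq[2],
        "3+": sum(n for uses, n in freq.items() if uses >= 3),
    }

# Hypothetical samples; "ann", "bob" and "cy" each appear in two groups.
group_samples = {
    "Group 1": ["ann", "bob", "bob", "cy"],
    "Group 2": ["ann", "dee"],
    "Group 3": ["bob", "eve", "cy"],
}

# "Additive total": bucket each group separately, then add the buckets.
per_group = [usage_buckets(names) for names in group_samples.values()]
additive = {k: sum(b[k] for b in per_group) for k in KEYS}

# Deduped total: pool the raw names first, then bucket once.
pooled = [name for names in group_samples.values() for name in names]
combined = usage_buckets(pooled)

print("additive:", additive)  # {'uniques': 8, '1x': 7, '2x': 1, '3+': 0}
print("combined:", combined)  # {'uniques': 5, '1x': 2, '2x': 2, '3+': 1}
```

As with the real figures, pooling moves people out of the uniques and 1x counts and into the multi-use counts.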
I point this out because the next step, and ultimate objective, is to estimate the uniques and the single, dual and 3+ use counts for each of the groups individually and in total. Here again, I can’t use straight-line math to get to a total: as the quantity of users increases, the uniques and single-use persons come in lower than a linearly estimated total would suggest.
As an example: I can’t take the 24,000 uniques in the Group 1 sample and divide by the 10% sample size to get an estimated 240,000 total uniques, because many of the sampled uniques would show up again if we actually had the full Group 1 roster of 300,000. The real uniques value, from the full data set, might be 180,000 or 210,000.
Similarly, when combining the samples across groups, I can’t apply the 12.2% sample factor (42,000 / 345,000) to the combined sample uniques of 30,900 to get an estimated total of 253,821, because the multi-use counts would increase in the full data set, decreasing the uniques and one-time-use individuals.
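To illustrate why the straight-line scale-up overshoots, here is a small simulation sketch in Python. Everything in it is an assumption made up for the illustration: a hypothetical population of 20,000 people with 1 to 4 use records each, and a 10% random sample of the records. It is not a model of my actual data.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 20,000 people, each with 1 to 4 use records.
records = []
for person_id in range(20_000):
    uses = random.choice([1, 1, 1, 2, 2, 3, 4])
    records.extend([person_id] * uses)

true_uniques = len(set(records))  # 20,000 by construction

# Draw a 10% sample of the use records (not of the people).
fraction = 0.10
sample = random.sample(records, int(len(records) * fraction))
sample_uniques = len(set(sample))

# Straight-line scale-up: divide sample uniques by the sample fraction.
linear_estimate = sample_uniques / fraction

print(f"true uniques:    {true_uniques:,}")
print(f"sample uniques:  {sample_uniques:,}")
print(f"linear estimate: {linear_estimate:,.0f}")
```

In this setup the linear estimate lands well above the true number of uniques, because a person with several records is more likely to be caught in the sample than a person with one, yet is counted only once in the full data set.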
So that was a long, but (hopefully) illustrative, way of getting to the question: “What’s an Excel formula that can be applied to the individual sample groups, and to the combined sample totals, to estimate the full data set values?”
It's worth noting that sample sizes might vary from 10% to 100% across the groups, and across the other sets of user groups that I need to compare against. The bottom line is that I need estimates for this set of groups, and for several other varied sets of groups, so I can compare the absolute estimated uniques and 1x, 2x and 3+ use counts across the sets.
Thanks for your help!