Selection Bias

Also known as: Sampling bias (related)

Selection bias occurs when the participants or data in a study are not representative of the larger population because they were selected in a non-random way. This distorts the results, so conclusions drawn from the sample may not hold for the general population.


The Psychology Behind It

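To learn the truth about a group (e.g., "Do Americans like pizza?"), you need a sample that represents it, ideally a random one. If you only ask people inside a pizza restaurant, you will overestimate how much Americans like pizza, because people who dislike pizza rarely end up there. That is selection bias.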

It happens because true randomness is hard. We naturally sample whoever is convenient, available, or willing. This creates a "distorted mirror" where we think we are seeing the world, but we are only seeing a specific slice of it.
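The pizza example can be sketched as a small simulation. Everything numeric here is an assumption made up for illustration: the 60% "true" share of pizza fans, and the guess that a pizza fan is five times as likely as anyone else to be found inside a pizza restaurant. The function name simulate_pizza_poll is likewise hypothetical.

```python
import random


def simulate_pizza_poll(pop_size=100_000, sample_size=1_000, seed=0):
    """Compare a random sample with a 'pizza restaurant' sample."""
    rng = random.Random(seed)

    # Assumed population: each person likes pizza with probability 0.60.
    population = [rng.random() < 0.60 for _ in range(pop_size)]

    # Assumed selection mechanism: pizza fans are 5x as likely to be
    # sitting in the restaurant when the pollster walks in.
    presence_weights = [5.0 if likes else 1.0 for likes in population]

    # Simple random sample: every person has the same chance of selection.
    random_sample = rng.sample(population, sample_size)

    # Convenience sample: drawn in proportion to restaurant presence.
    restaurant_sample = rng.choices(
        population, weights=presence_weights, k=sample_size
    )

    return {
        "true_rate": sum(population) / pop_size,
        "random_estimate": sum(random_sample) / sample_size,
        "restaurant_estimate": sum(restaurant_sample) / sample_size,
    }


if __name__ == "__main__":
    for name, value in simulate_pizza_poll().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the random sample lands close to the true rate, while the restaurant sample overshoots it by a wide margin, even though both samples are the same size.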

Real-World Examples

The 1936 Literary Digest Poll

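The magazine mailed roughly 10 million ballots, received about 2.4 million responses, and predicted that Alf Landon would comfortably beat FDR. Instead, FDR won in a landslide. Why? The magazine drew its mailing list from its own subscribers, automobile registrations, and telephone directories. In 1936, during the Great Depression, these groups skewed wealthy, and wealthier voters leaned against FDR. On top of that, only people motivated enough to mail a ballot back were counted. The result was a sample that left out much of the poorer majority.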

Online Reviews

Product reviews are heavily biased. Who writes a review? Mostly people who loved the product (5 stars) or hated it (1 star). The large middle group who thought it was "okay" rarely bothers to write. As a result, reviews show a more polarized picture than buyers' actual opinions.
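A rough sketch of how this self-selection reshapes the star distribution. The opinion shares and the review-writing probabilities below are invented for illustration, not measured from any real platform, and simulate_reviews is a hypothetical helper.

```python
import random
from collections import Counter

STARS = [1, 2, 3, 4, 5]


def simulate_reviews(n_buyers=50_000, seed=1):
    """Return (share of each rating among buyers, share among reviews)."""
    rng = random.Random(seed)

    # Assumed true opinions: most buyers sit in the middle.
    opinion_weights = [0.05, 0.15, 0.40, 0.30, 0.10]

    # Assumed chance of writing a review, given the buyer's opinion:
    # strong feelings make a review far more likely.
    review_prob = {1: 0.30, 2: 0.05, 3: 0.01, 4: 0.05, 5: 0.30}

    opinions = rng.choices(STARS, weights=opinion_weights, k=n_buyers)
    reviews = [s for s in opinions if rng.random() < review_prob[s]]

    def shares(values):
        counts = Counter(values)
        return {s: counts[s] / len(values) for s in STARS}

    return shares(opinions), shares(reviews)


if __name__ == "__main__":
    buyers, reviews = simulate_reviews()
    for s in STARS:
        print(f"{s} stars: buyers {buyers[s]:.2f}, reviews {reviews[s]:.2f}")
```

With these made-up numbers, 3-star opinions are the most common among buyers but nearly vanish from the published reviews, while 1-star and 5-star ratings dominate.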

Medical Studies

If a study of a new weight-loss drug recruits volunteers, it tends to attract people who are already motivated to lose weight. The drug might work for them but fail for the less motivated general population.

Consequences

Selection bias can lead to:

False Conclusions: We believe something is true in general when it holds only for a specific subgroup.
Bad Policy: Laws are passed based on the loud voices of a selected few (lobbyists, activists) rather than the silent majority.
Algorithm Bias: AI trained on biased data (e.g., resumes of mostly men) can learn to replicate that bias (e.g., favoring male candidates).

How to Mitigate It

Randomize, randomize, randomize.

Random Sampling: Ensure every member of the population has an equal chance of being selected.
Check the Source: Ask, "Who is in this dataset? Who is excluded? Why?"
Weighting: If you know your sample is biased (e.g., too many men), reweight the data so each group counts in proportion to its share of the true population.
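The weighting idea can be sketched as a simple post-stratification: compute the average within each group, then combine the group averages using the population shares instead of the sample shares. The survey data, the 50/50 population split, and the function name poststratified_mean are all assumptions for illustration.

```python
def poststratified_mean(sample, population_shares):
    """Reweight group means by known population shares.

    sample: list of (group, value) pairs.
    population_shares: dict mapping group -> share of the true population.
    """
    totals, counts = {}, {}
    for group, value in sample:
        totals[group] = totals.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1

    # Weighting cannot invent data for a group nobody sampled.
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no sampled members for groups: {sorted(missing)}")

    return sum(
        share * totals[group] / counts[group]
        for group, share in population_shares.items()
    )


# Hypothetical survey: 80 men (60 say yes) and 20 women (5 say yes).
sample = (
    [("men", 1)] * 60 + [("men", 0)] * 20
    + [("women", 1)] * 5 + [("women", 0)] * 15
)

raw_mean = sum(value for _, value in sample) / len(sample)
weighted_mean = poststratified_mean(sample, {"men": 0.5, "women": 0.5})

print(raw_mean)       # 0.65, dominated by the oversampled men
print(weighted_mean)  # 0.5, i.e. 0.5 * 0.75 + 0.5 * 0.25
```

Weighting only corrects for the characteristics you weight on. If the people sampled within a group differ from the rest of that group, the bias remains, and a group with no sampled members cannot be recovered at all, which is why the sketch raises an error in that case.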

Conclusion

Selection bias reminds us that "data" is not "truth." Data is only as good as the method used to collect it. If the net is flawed, the catch will be flawed.

Related Biases

Neglect of Probability

Ludic Fallacy

Sampling Bias

Survivorship Bias

Texas Sharpshooter Fallacy

Pareidolia