«Winwood Reade is good upon the subject,» said Holmes. «He remarks that, while the individual man is an insoluble puzzle, in the aggregate he becomes a mathematical certainty. You can, for example, never foretell what any one man will do, but you can say with precision what an average number will be up to. Individuals vary, but percentages remain constant. So says the statistician.»

«The Sign of Four» by Sir Arthur Conan Doyle

I’m still exploring R and, damn, it’s cool. For example, I wrote a just-for-fun function that gives me the odds ratios, their confidence intervals, and a few other pieces of information. Using

`my_stats_oddsRatio(20, 80, 10, 90, "bubblegum", "no bubblegum", "cancer", "no cancer")`

results in the output

Odds Ratio for table:

| | cancer | no cancer |
|---|---|---|
| bubblegum | 20 | 80 |
| no bubblegum | 10 | 90 |

The odds for "cancer" for "bubblegum" is 0.25.
The odds for "cancer" for "no bubblegum" is 0.11.

Odds ratios and 95%-Confidence Intervals
Odds ratio "bubblegum" vs. "no bubblegum" is: 2.25 [0.99; 5.09]
Odds ratio "no bubblegum" vs. "bubblegum" is: 0.44 [0.2; 1.01] (Careful: CI contains 1.)

Odds ratios
With "bubblegum" it is 2.25 times more likely to "cancer" than "no bubblegum".
With "no bubblegum" it is 0.44 times more likely to "cancer" than "bubblegum".

Yule's Y (standardized Odds Ratio [-1; 1]) is 0.2, or -0.2 for the reverse order.

Phi correlation coefficient is: 0.14

The Binomial Effect Size Display shows for "bubblegum" and "cancer": 57% and for "no bubblegum" and "cancer": 43%, i.e. a difference of 14 percentage points. If 100 people had a choice between the two options, it would make a difference in the lives of 14 of them.
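For anyone who wants to check the numbers, here is a minimal sketch of how they could be computed. It is not the actual `my_stats_oddsRatio()` code: the function name `odds_ratio_sketch` and its arguments are made up for illustration, and it assumes a Wald confidence interval on the log scale, Yule's Y computed from the square root of the odds ratio, and a Binomial Effect Size Display of 50% plus or minus half of phi.

```r
# Hypothetical helper, not the my_stats_oddsRatio() function from the post.
# n11 = group 1 with outcome,  n12 = group 1 without outcome,
# n21 = group 2 with outcome,  n22 = group 2 without outcome.
odds_ratio_sketch <- function(n11, n12, n21, n22, conf_level = 0.95) {
  odds1 <- n11 / n12
  odds2 <- n21 / n22
  or <- odds1 / odds2

  # Assumption: Wald interval on the log scale (all four cells must be > 0)
  se_log_or <- sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
  z <- qnorm(1 - (1 - conf_level) / 2)
  ci <- exp(log(or) + c(-1, 1) * z * se_log_or)

  # Yule's Y rescales the odds ratio to the range [-1, 1]
  yule_y <- (sqrt(or) - 1) / (sqrt(or) + 1)

  # Phi coefficient of the 2x2 table
  phi <- (n11 * n22 - n12 * n21) /
    sqrt((n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22))

  # Binomial Effect Size Display: 50% plus/minus half of phi
  besd <- 0.5 + c(1, -1) * phi / 2

  list(odds = c(odds1, odds2), or = or, ci = ci,
       yule_y = yule_y, phi = phi, besd = besd)
}

res <- odds_ratio_sketch(20, 80, 10, 90)

# Turn the numbers into a sentence, with a warning if the CI contains 1
ci_note <- if (res$ci[1] <= 1 && res$ci[2] >= 1) " (Careful: CI contains 1.)" else ""
cat(sprintf("Odds ratio is: %.2f [%.2f; %.2f]%s\n",
            res$or, res$ci[1], res$ci[2], ci_note))
```

Under these assumptions, the rounded values for the bubblegum table agree with the output above: an odds ratio of 2.25 with an interval of roughly 0.99 to 5.09, Yule's Y of 0.2, and phi of 0.14.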

I’m not exactly sure whether all the calculations are correct (if not, please leave a comment), but being able to take the results and use them in generated sentences is surprisingly cool. I wonder why statistics software such as SPSS does not provide this kind of output. Sure, I see the advantage of providing only the numbers, especially for experts, but putting the output into sentences and adding some checks and background information could make stats more accessible to lots of people.

Cool 🙂