Making sense of scientific news stories

There’s an old expression in the newspaper business: “If your mother says she loves you, check it out.” In other words, make sure your bullshit detector is always on. Be skeptical of what you’re told, of what you read. Cross-check your facts with other sources. What applies in the newsroom applies tenfold on the internet, where anybody is free to post any damned thing they want to.
Unknown

Our “information age” confronts us with hundreds upon hundreds of news stories each day, and a lot of them are about scientific findings. News sites love reporting those, and links to these reports get shared on Twitter, Facebook, etc.

However, while science is mankind’s greatest invention, the reporting of scientific findings in the media is often seriously (over)simplified and biased. And it gets worse on social networks.

That’s easy to see if you are a scientist yourself. Just look at how the news media cover your area of expertise: it will make you cringe. And that’s not because your branch of science is the hardest or the most misunderstood. The same oversimplification and biases hold true for any other science. Only there you often don’t notice it.

It is hard to evaluate reported scientific findings without:

access to the original article published in an academic journal, to check what information in the news report was oversimplified, distorted, fabricated, or otherwise biased
a solid understanding of the scientific methods, to check the scientific merit of the work done (reputable academic journals employ peer review, i.e., other scientists from the same field evaluate the quality, but it is possible that mistakes are not spotted)
a solid understanding of the (sub-)domain, to see how the results fit into the overall picture of results (after all, the study might be a fluke or an outlier)

But there are some questions you can ask of published scientific results.

Given that I am a psychologist, I focus on reports on studies or surveys about human behavior and experience, expanding something I have written on a German blog. These are particularly interesting for many people — few things are as interesting as information about ourselves. But at least some of the questions should hold true for other scientific domains as well.

Be a Skeptical Optimist

Goodwin and Goodwin (2013) characterize the attitude of scientists as “skeptical optimists”:

Research psychologists are skeptical optimists — optimistic about discovering important things about behavior, but skeptical of claims made without solid empirical support.
Goodwin and Goodwin (2013)

The attitude of a skeptical optimist is important. Don’t be a cynic looking only for flaws, but demand proof and check the reported information for yourself.

Don’t assume that the person who links to it has checked the findings (even if this person augments his/her name with a “doctor”)

Social media makes it easy to link to information, but few make the effort to check the information for themselves, whether they have a scientific background or not. So be careful about relying on the reputation of whoever refers you to it. After all:

distributing does not mean agreeing
sharing can be based on how well the piece fits the person’s interests or prior opinions, not on its scientific relevance
a doctorate does not mean content expertise in the topic at hand
What, for example, can a person with a degree in English Literature say about the scientific merit of a psychological study?

Ask yourself: What does the source tell you? Ignore the reputation of the people who pointed you to this source — it must stand on its own or fall.

Keep in mind that scientists are (only) human and have their own biases

Some scientists, including professors, could be better classified as lobbyists than as objective seekers of truth. And there is a lot of latitude when it comes to doing research: in the choice of topic, definitions, research questions, operationalization/methods, data analysis, interpretation, and much more. Science is a social enterprise: other scientists have to check the findings, and data never speak for themselves. There even are some cases of deliberate fraud that took years to be discovered.

So, don’t mistake science for yet another religion or its findings for gospel. Knowledge is constructed, there is a better and a worse (see A different view on discussions), and science, correctly done, can make much headway toward the ‘better’ side, but it’s not a given.

Ask yourself: Who did the research and what might their agenda be? Do they have conflicts of interest? Are they actively trying to achieve social change?

Be very wary of secondary or tertiary sources

Scientific findings get published in academic journals. There are some really bad journals out there, but most journals set high quality standards and use peer review to enforce them. News reports, on the other hand, are often guided by the interests of the news organization and the reporters involved. It does not really matter which news source you consult: if you do not have access to the original article, you can expect heavy simplification and biases. After all:

The complexity of studies on human behavior rarely if ever fits into a single newspaper headline.
Reporters select what is reported, choosing the information they consider most interesting for their readers, the best fit for their publication, or the best fit for the overall narrative, which sometimes massively skews the actual findings.
‘Interestingness’ might conflict with what the findings really say
A story or a real life example or an (anecdotal or outright invented) case might be a method for reporters to convey the message they want to convey, but it might also massively distort the findings.

Ask yourself: Which findings that are in the actual study are not reported? Why?

If there are no qualifiers, it’s most likely wrong

Psychological research is never black and white. While differences can often be found, there is usually an overlap between the groups. Even if gender differences are found, e.g., “men/women are better at …”, it does not mean that all men/women are better than all women/men, just that on average, one group is better/worse than the other. So any headline without qualifiers is a red flag.
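To put a number on this overlap, here is a minimal sketch in Python. It assumes that both groups are normally distributed with the same standard deviation, and the differences fed into it are purely illustrative values, not results from any study. The function name prob_superiority is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def prob_superiority(d: float) -> float:
    """Chance that a random member of the higher-scoring group beats a
    random member of the other group, given a mean difference of d
    standard deviations. Assumes normal distributions with equal spread."""
    return NormalDist().cdf(d / sqrt(2))


# Illustrative differences, expressed in standard deviations.
for d in (0.0, 0.2, 0.5, 0.8):
    print(f"difference of {d:.1f} SD: {prob_superiority(d):.0%}")
```

Under these assumptions, even with a difference of half a standard deviation, a randomly picked member of the “worse” group still beats a randomly picked member of the “better” group in more than a third of all pairings.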

Ask yourself: Did they dumb down the results for public consumption? Did they remove complexity because they think you would not understand it?

Just because it’s plausible does not mean that it’s right

What is plausible depends in part on the fit of the information to your prior knowledge and opinions. However, this does not mean that it’s true. Humans have some severe biases when it comes to dealing with information. Furthermore, some of the most interesting findings are counter-intuitive.

Ask yourself: How would you evaluate the study (methods, etc.) if the results were the other way around? How much is your belief in the veracity of the results determined by your liking of the results, because they fit your world-view?

Is there really an effect from A to B?

There are good reasons why the experiment (in the scientific sense!) is the silver bullet in science, at least for determining causality. Only if you randomize people into two or more groups, give these groups different treatments/interventions (with one group receiving nothing or a placebo), and find systematic differences between these groups can you assume that there is a causal effect of the treatment. The treatment can be anything: a new tool to use, getting money for sharing knowledge, whatever. However, not all interventions are possible, due to ethical, legal, or practical reasons. So in some cases, the results are just correlations: if A changes, then B changes. However, this does not mean that A causes B. It is also possible that B leads to A (even the order in time is no guarantee; perhaps A can simply be measured earlier), or that an unknown factor C leads to both A and B, and so on.
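The case of the unknown factor C can be played through in a small simulation. This is only a sketch with invented data: A and B are each built as “C plus random noise”, so by construction neither influences the other. All names and numbers (simulate, pearson, the sample size, the seed) are assumptions made for this illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n=5000, seed=1):
    """A and B both depend on a hidden factor C, but not on each other."""
    rng = random.Random(seed)
    c = [rng.gauss(0, 1) for _ in range(n)]
    a = [ci + rng.gauss(0, 1) for ci in c]  # A = C + noise
    b = [ci + rng.gauss(0, 1) for ci in c]  # B = C + noise
    return a, b, c


a, b, c = simulate()
print(f"correlation of A and B: {pearson(a, b):.2f}")

# Subtracting C leaves only the two independent noise terms.
a_rest = [ai - ci for ai, ci in zip(a, c)]
b_rest = [bi - ci for bi, ci in zip(b, c)]
print(f"after removing C:       {pearson(a_rest, b_rest):.2f}")
```

In this setup, A and B should correlate at around 0.5 although changing A would do nothing at all to B. Once C is taken out, the correlation should all but vanish. In a real study, of course, nobody hands you C.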

Unfortunately, humans love causality. We search for it and (think we) find it even if it is not really there. And news reports love causality too. Reporters want to show causal chains, to show how something affects something else, even if the research itself is based only on a correlation.

This is a problem because readers think that they can affect B when they change A, but a correlation does not guarantee it. Another frequent problem is that A is made responsible for B, although other factors might be involved and A has nothing to do with B per se. For example, just imagine a study that shows that women earn less after marriage. Are they being punished by their employers for being married? Or are other factors, e.g., going part time and doing less overtime, responsible?

Thus, check whether the results are based on correlations or on a controlled experiment. And note that all survey results are based on correlations! They cannot show that something leads to something else.

Ask yourself: Which other factors could be responsible that were not assessed?

Significant findings vs. relevant findings

Psychological research usually examines differences due to a specific treatment, and it uses groups as the basis of comparison. The reason is that human beings differ and you have to control for this natural variation. Psychology also frequently uses statistical tests that estimate how likely differences as large as the observed ones between the groups (e.g., with vs. without treatment) would be if only chance were at work (perhaps the groups happen to differ on something that was not assessed but that influences the results). If chance differences of this size are unlikely (the usual threshold is less than 5%), the term “significant” is used. It is then assumed that the differences due to the treatment are ‘really there’, that the treatment really had an effect. However, this method is heavily criticized, for good reasons. Among others, just because a difference is unlikely to be due to chance (i.e., significant) does not mean that it is also relevant. Even an extremely small difference becomes significant if you assess a lot of people.

Thus, an important question is whether the differences are actually relevant. How large is the difference between the groups due to the treatment, and how much do the people in each group vary? Just imagine you have a “significant” difference that is actually only between a 3.4 and a 3.6 on a scale of work satisfaction from 1 to 10. Or imagine that you have a large difference, but there is such a large variance within the groups that while you improve the overall result, a lot of people are actually worse off than before. Besides the standard deviation, histograms are informative here. There is also the effect size, which quantifies the strength of the treatment’s effect.
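The 3.4 vs. 3.6 example can be worked through with a few lines of Python. This is a rough sketch, not the analysis of any real study: it uses a simple z-test with a known, shared standard deviation, and the standard deviation of 2.0 as well as the group sizes are assumptions made for the illustration. The effect size shown is Cohen’s d, i.e., the difference in means divided by the standard deviation.

```python
from math import erfc, sqrt


def two_sided_p(mean_a, mean_b, sd, n_per_group):
    """Two-sided p-value of a z-test for two equally sized groups
    that share a known standard deviation."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = (mean_b - mean_a) / standard_error
    return erfc(abs(z) / sqrt(2))


def cohens_d(mean_a, mean_b, sd):
    """Standardized mean difference: the gap in units of the spread."""
    return (mean_b - mean_a) / sd


MEAN_A, MEAN_B = 3.4, 3.6  # the work satisfaction scores from the text
SD = 2.0                   # assumed spread, not taken from any study

for n in (50, 500, 5000):
    p = two_sided_p(MEAN_A, MEAN_B, SD, n)
    d = cohens_d(MEAN_A, MEAN_B, SD)
    print(f"n = {n:>4} per group: p = {p:.4f}, d = {d:.2f}")
```

The effect size stays at 0.1 in every row. Only the number of participants changes, and with 5000 people per group that alone is enough to push the same tiny difference below the 5% threshold.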

Also note that this thinking requires a causal relationship. If the results are based on a correlation, there is no reason to assume that you can change one by changing the other, or that you can make one responsible for the differences on the other.

Ask yourself: Was there really a statistically significant difference between the groups? And are the reported differences really relevant? What does it mean in practice? For example, how many people are affected and how?

Were the results replicated?

Given that psychology is a probabilistic science (we look at likelihoods), it is possible that results happen due to chance. Publication practices amplify this problem, as significant results are far more likely to be published. So, while the chance results get printed in scientific journals, the non-significant findings are rarely seen. A way to be at least somewhat sure of the validity of the results is to look for replications. Did other scientists do similar studies and did they come to the same conclusions?
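How chance results and selective publication work together can be sketched in a small simulation. Everything here is invented for the illustration: 2000 studies of a treatment that does nothing at all, 20 people per group, a simple z-test with a known standard deviation of 1, and a rule that only results below the 5% threshold get “published”.

```python
import random
from math import erfc, sqrt


def run_null_study(rng, n_per_group=20):
    """One study in which the treatment truly does nothing.
    Returns the observed mean difference and its two-sided p-value."""
    control = [rng.gauss(0, 1) for _ in range(n_per_group)]
    treated = [rng.gauss(0, 1) for _ in range(n_per_group)]
    diff = sum(treated) / n_per_group - sum(control) / n_per_group
    z = diff / sqrt(2 / n_per_group)  # known SD of 1 in both groups
    return diff, erfc(abs(z) / sqrt(2))


def simulate_literature(n_studies=2000, alpha=0.05, seed=7):
    """Run many null studies and 'publish' only the significant ones."""
    rng = random.Random(seed)
    studies = [run_null_study(rng) for _ in range(n_studies)]
    published = [(d, p) for d, p in studies if p < alpha]
    return studies, published


studies, published = simulate_literature()
share = len(published) / len(studies)
mean_abs = sum(abs(d) for d, _ in published) / len(published)
print(f"{share:.1%} of the null studies reach significance")
print(f"average size of the 'published' differences: {mean_abs:.2f} SD")
```

Roughly one in twenty of these studies should come out significant although there is nothing to find. And anyone who reads only the “published” ones sees sizable differences in every single study. A replication is the simplest check against this.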

Ask yourself: How do these research results tie into prior research? Did other researchers doing similar work come to the same conclusions? If not, why not? What did they do differently?

Are the results applicable to your situation?

As mentioned, studies work with groups, and often with very specific ones. In psychology these are mostly psychology students, as they have to participate for course credit. Also, the studies use ‘artificial’ situations to avoid confounding factors: the only difference between the groups should be the treatment. There are also practical constraints when it comes to the treatment itself. For example, studies dealing with learning often do not use the typical duration of a lesson but are usually shorter. And almost no studies examine long-term usage of specific treatments (outside of clinical psychology and the like).

While all this “artificial” control is beneficial to really narrow the research down to the influence of a specific treatment, it makes it harder to apply the results to real life. Can you really generalize from, e.g., a one-time lab study in which psychology students learn verbs for 20 minutes to learning in general? What happens when you do it for days or months?

Ask yourself: Under which conditions were the results obtained? Who was studied? What did they do? Can this be generalized to human beings in general? Which conditions are needed for the results to occur? How similar is the situation in the study to the situation I deal with? What are possible differences? Does the treatment work long-term? Or will the effect wear off quickly (i.e., a newness effect that dissipates quickly)?

Concluding Remarks

Note that while this posting is at times very critical of psychological research, psychology is actually a very powerful discipline — if done right.

It also does not mean that psychology is arbitrary and that ‘anything goes’. It does not. First of all, some studies are better than others and there is a more right and a less right (see A different view on discussions). Second, human behavior is very complex. There are a lot of possible conditions that can influence behavior, and not all influencing factors are known.

However, I am very critical of the way much psychological research is conducted and of the implications some researchers claim. And I am even more critical of the sensationalistic reporting style of many journalists, who sacrifice scientific accuracy and honesty for a good story. And don’t get me started on Twitter. Even worse is when people claiming to have a PhD/doctorate tweet links to journal articles without having evaluated the findings themselves. But that’s a posting in itself.

Literature

Goodwin, C. J., & Goodwin, K. A. (2013). Research in Psychology: Methods and Design (7th ed.). Singapore: John Wiley & Sons.
