Science, Fraud, and the Flow of Data

“Do not send another eMail to Mr. [Important]. He does not communicate with people on your level.”
Comment overheard in a large company with a hierarchy fetish

Thinking more about the fraud triangle as applied to scientific misconduct, I had another look at the report on the misconduct (i.e., outright fraud) of Diederik Stapel(*). What struck me was how a rather basic principle was violated: one or more levels of the hierarchy were bypassed.

In some cases of outright fraud, he would have PhD students collect data (apparently with paper questionnaires) and then have student assistants enter the data. Those student assistants then sent the data files to him, and to him only.

In a second variant Mr Stapel altered data in existing datasets. These data actually were collected and were usually entered by a student assistant, who then forwarded the dataset to Mr Stapel. After manipulating the data Mr Stapel passed the dataset to the PhD student, who had been required to set down his or her expectations about the outcomes in advance. The Committees are in possession of datasets entered by student assistants and the datasets ultimately given to the PhD student. The differences between the datasets clearly show the changes made.
https://www.commissielevelt.nl/wp-content/uploads_per_blog/commissielevelt/2012/11/120695_Rapp_nov_2012_UK_web.pdf

That’s rather strange: a professor getting the data files before the PhD students who conducted the studies do. But hey, it was explained by his devotion to his students, his dedication, his wish to be able to support them. And it’s very helpful for PhD students to get their data, collected on paper, in digital form, without having to enter the data themselves (I did it a couple of times, and I know, it sucks).

Mr Stapel relieved many of his PhD students of work by engaging a student assistant for data entry. The PhD student would frequently collect the data in person, but the student assistant would give the data directly to Mr Stapel, and it might be altered before being forwarded to the PhD student. […] The PhD students perceived the help they received in data entry and processing as a luxury (not necessarily to be taken for granted) because it saved much time. The practice sometimes also provoked a degree of jealousy among PhD students, which in some cases was exacerbated by the alleged quality of the data. Otherwise Mr Stapel’s practice of ‘sitting on the data’ was also appreciated as a sign of great commitment to his PhD students’ research. Mr Stapel observed in this connection in his interview with the Noort Committee that he was personally convinced that he was helping his PhD students. By his own account the collection of the data was the greatest chore in research, which PhD students must be helped through as quickly as possible. The paradox involved in ‘helping’ through falsification became apparent to him only later.
https://www.commissielevelt.nl/wp-content/uploads_per_blog/commissielevelt/2012/11/120695_Rapp_nov_2012_UK_web.pdf

But the issue remains: why skip a couple of hierarchy levels? Why isn’t the file checked by the PhD student or a Post-Doc first?

I mean, if you see data going this way:

[Figure: data flow when the data was entered. The PhD level (and the Post-Doc level, not displayed here) was completely bypassed.]

shouldn’t this be somewhat suspicious?

No.

Not if the person committing the fraud did — deliberately or by habit — the best he (or she) could to hide the fraud.

But now you know that this is one way fraudsters commit fraud. They get the data you collected and have the opportunity to manipulate it. They have the data before you do. It does not take more than that.

But it also takes no more than a simple BCC to get a copy of the original data from the student assistant, and to compare it with the version you are later given.

And if changes were made, even for good reasons, there is rarely a reason not to state those decisions when reporting the data.
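The comparison itself needs very little tooling. Here is a minimal sketch in Python, assuming both versions are available as CSV text and that each row carries a unique identifier. The column name `participant_id`, the function names, and the example values are made up for illustration; none of them come from the report.

```python
import csv
import io


def load_rows(text, id_column="participant_id"):
    """Parse CSV text into a dict keyed by the (assumed) ID column."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[id_column]: row for row in reader}


def diff_datasets(original, received, id_column="participant_id"):
    """List every difference as (id, column, original_value, received_value)."""
    orig_rows = load_rows(original, id_column)
    recv_rows = load_rows(received, id_column)
    differences = []
    for pid in sorted(set(orig_rows) | set(recv_rows)):
        o = orig_rows.get(pid)
        r = recv_rows.get(pid)
        if o is None or r is None:
            # A whole row was added or dropped between the two versions.
            differences.append((
                pid,
                "<row>",
                "present" if o is not None else "missing",
                "present" if r is not None else "missing",
            ))
            continue
        for column in sorted(set(o) | set(r)):
            if o.get(column) != r.get(column):
                differences.append((pid, column, o.get(column), r.get(column)))
    return differences


# Made-up example: the score of participant 2 differs between the versions.
ORIGINAL = "participant_id,condition,score\n1,A,3\n2,B,4\n3,A,2\n"
RECEIVED = "participant_id,condition,score\n1,A,3\n2,B,6\n3,A,2\n"

for difference in diff_datasets(ORIGINAL, RECEIVED):
    print(difference)
```

Values are compared as plain strings, so a harmless reformatting (say, `4` versus `4.0`) would also show up. For a first look that is probably what you want: every change gets listed, and you decide afterwards which ones matter.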

So, stop for a moment and look around. Have you ever compared the data as actually collected, whether via questionnaire or instrument, with the data you received for analysis?

Perhaps it’s time to take a look at it. And if you do not like what you see, here’s something that might help you. Ah, and another issue: one of the reasons Mr. Stapel (he lost his PhD) was able to get away with fraud for so long was that no one expected him to commit fraud. Not on this level, anyway. He was so friendly and supportive to his PhD students, but he treated any critical question about the data (quality) as a lack of trust. I’m doing another posting on fraudster strategies to remain undetected, but if that sounds awfully familiar … well, in dubio pro reo, but still, have a close look. Fraud has a habit of becoming public, if not now, then when you least want it, and you will be tainted by the fraud of a supervisor or adviser even if you yourself did nothing wrong.

(*) If you haven’t heard of the case (unlikely if you work in psychology), he did some really messed up stuff (e.g., sat down with an empty spreadsheet and completely fabricated data, used his position of authority to quell critical questions, and much, much more). And he did a lot of damage. Strange thing: apparently he is a public speaker about scientific misconduct now. Yeah, let’s ask the fox to guard the hen-house. What did the report say?

The last thing that colleagues, staff, and students would suspect is that, of all people, the department’s scientific star, and faculty dean, and the man who personally taught his department’s scientific ethics course, would systematically betray that trust.

So yeah, let’s give him a similar position. What can go wrong? Some people think he’s practically a hacker. He’s not. A hacker tests the security of systems, but does no harm. He did. He did not try to test or to expose weaknesses. He tried to make his career by playing the system. He was caught. People suffered. He got off very lightly. Even if the damage to co-authors and PhD students is ignored, he received 2.2 million euros in research money ($2.9 million!), money that did not go to his competitors. And he was sentenced to 120 hours of community service. 120 hours! That’s an hourly wage of about €18,333 (about $24k per hour). Few CEOs can claim that amount of money.
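If you want to check the hourly figure, it is just the research money divided by the hours of community service. A quick sketch in Python, using only the amounts quoted above (the variable names are mine):

```python
# Amounts as quoted in the text above.
grant_eur = 2_200_000   # research money in euros
grant_usd = 2_900_000   # the same amount in dollars, as given in the post
service_hours = 120     # community service

eur_per_hour = grant_eur / service_hours
usd_per_hour = grant_usd / service_hours

print(f"{eur_per_hour:,.0f} EUR per hour")
print(f"{usd_per_hour:,.0f} USD per hour")
```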
