Data Sharing in Science and Academic Attrition Rates

“The vaccine. How is it progressing?”
“I regret that it failed to stabilize the latest sample you provided.”
“I see.”
“I have a team of Vorta doctors working night and day to find a cure.”
“Have them document their efforts and then eliminate them.”
“Activate their clones and order them to continue their predecessors’ work. Perhaps a fresh perspective will speed matters along.”
“Of course.”
Founder and Weyoun in Star Trek DS9: “Penumbra”

During the Q&A of a presentation about data-sharing in science, the question of having a data archive within a research institute came up. At first, I couldn’t place the question, given that the talk was about

  1. the long-term availability of data, and (thus also)
  2. making the data usable for other researchers not directly involved in the research (yet).

An in-house solution does not make sense for long-term availability, as the institute can fold once its funding ends. Not to mention the technical overhead, like the need for backups outside the institute itself and the security requirements this entails. What would be the advantage of an in-house data archive? A devil-may-care attitude? As long as we exist, we keep control over our data, and if we fold, we’ll just have lost our jobs and we couldn’t care less? And even stranger, why make the data usable for other researchers not directly involved in the research yet? Shouldn’t the fellow researchers in that institute, working with the same data, understand it in any case?

Then I found one interpretation … and not a nice one.

An in-house data archive, which allows non-involved researchers to easily work with the data, makes sense — terrible sense — in the context of the high attrition rates of PhDs and post-docs in Academia.

Unfortunately, much of Academia is like a pyramid scheme(*). There are way too many PhDs and post-docs for the few tenured positions that are available. Many PhDs and post-docs will have to leave academia, often without having acquired useful qualifications for non-academic work during their years at universities/research institutes. Or even if they did, it’s hard to prove. PhDs and post-docs are often used as a cheap and ubiquitously available workforce: to do the actual research, (ghost)write papers or gift authorships to their supervisors, deal with the teaching load, and much more. Highly qualified, yet low-paid disposable drones.

But usually not immediately disposable drones.

The downside of this treatment is that many PhDs and post-docs leave before their contract ends. Frequently they leave quite suddenly, or even “surprisingly” for many supervisors. Given that the legal period of notice is much shorter than the time usually needed to publish a paper, especially when revisions are required, this is a problem for a ‘publish or perish’ culture. Especially if the leaving PhDs/post-docs are the only real authors and they tell their supervisors that they are leaving only when they actually quit. This happens and, depending on the supervisor, is actually a very smart move.

If the actual author leaves, especially when leaving academia, and the supervisor has never seen the data behind the paper, this can effectively kill the paper. Depending on the project and the supervisor relationship, this situation is not uncommon, and it does real damage to department heads and other supervisors. Even tenured supervisors are subject to ‘publish or perish’, given that most department heads need research grants to maintain their status and do ‘their’ research. And grant proposals require published papers with them as authors.

In this situation, an in-house data archive makes terrible sense, provided PhDs or post-docs have to enter their research data in a way that someone else (= the next drone) can easily analyze it. The data is then easily available and understandable without requiring any investment of the supervisor’s resources. It makes PhDs/post-docs easily replaceable even when they suddenly quit. Thus it strongly reduces the costs of high attrition in Academia. The PhD/post-doc leaves? Who cares, just hire the next one, the data is documented in a way that it explains itself.

Of course, this interpretation of the need for an in-house data archive is pure speculation. But from my view on academia, based on what I have seen, heard, and read in the last few years, it can make sense for some departments and institutes. Unfortunately.

So, personally, I hope that in-house data archives are not used. They do not solve the long-term availability problem and have a terrible potential to be misused to deal with the symptoms of high academic attrition rates. Instead of addressing only the symptoms, as lessening the attrition damage would do, academia needs to address the causes. And this requires a major overhaul. Currently, much of Academia is parasitic on society. Even ignoring the ‘research waste’ it produces, it (mis-)uses large groups of highly qualified people to do the actual research for a few years without offering them a future when their time is up. It needs to allow for a career path for every new PhD student (which in many cases means strongly reducing the number of PhD positions) and/or allow for the development of non-academic qualifications for those who want to leave Academia.

But this is much more difficult to achieve than creating an in-house data archive.


(*) Academia has also been compared with a drug gang. Actually, there are good arguments for this comparison.
