Use-by data


It’s not just food that goes off.

The decision of when to publish is an agonising one. Is there a complete story? Is the data the best quality it can possibly be? Will it be enough to convince a dispassionate reader? It’s a lot like asking someone out on a date – the fear of rejection can hold you back (and just like dating, publishing scientific work usually involves a lot of rejection and emotional bruising).

In fact, the temptation to dither can be irresistible. Maybe you should just try one more new assay to shore up a conclusion. Maybe you should repeat things a couple more times to see if you can get that perfect image. Maybe you should start a new set of experiments to bulk out the story a bit more and stop it looking lightweight.

Sometimes it’s not you either. Group dynamics can play a part. Perhaps there’s an over-cautious senior author who won’t publish anything until they feel it’s “ready”. Perhaps the first author has already left the group and is finding it hard to commit time. Perhaps there’s disagreement over what the story actually is, or different standards of proof are being applied by different members of the group.

Basically, there are lots and lots of reasons to postpone. The early bird gets the worm, goes the proverb, but the second mouse gets the cheese. And even when you overcome the urge to be over-cautious, assembling a good story – let alone navigating the choppy waters of peer review – can be a draining experience that leaves little appetite for another bout at its conclusion.

The problem though is that data doesn’t stay fresh forever. It may not come with a use-by date, but it still has one. Eventually the techniques you used to get it, or the resolution you achieved, will no longer be acceptable at the top tier. Things get stale when they’re left for too long. You need to publish work within a certain time period of it being generated, because it’s not going to be right forever and it may not be of acceptable quality forever.

In fact, delay publishing too long and you usually end up repeating and reworking your own results, or get bogged down in mission creep. These are surely two of the commonest and most wasteful errors one can make as a researcher – having to throw out data and show what you already know all over again simply in order to make it palatable, or continuing to develop the story long past its natural end point and losing control of the narrative. You could call this latter point the George R R Martin principle – the “Game of Thrones” books serve as a warning for what happens when you can’t bring yourself to end things when you should.

It is natural for older work to be revised and enhanced by new approaches. It is good when a lab can publish work that brings conclusions from its previous publications to a new and higher level of resolution. It is disastrous when a lab ends up publishing work that supersedes its own unpublished data – this is nothing less than putting out one paper for the price of two.

To be sure, TIR is not advocating doing rushed work or sloppy work. It’s important to generate the highest possible grade of data and only publish it when it’s reached a natural conclusion. But if you’re the type of person who is agonising about whether your data is good enough to publish, then it is almost certainly already good enough. You wouldn’t sit on food, and you shouldn’t sit on data.
