Agreed – he’s one of several obvious absentees; Francisco Mojica would be another, for doing some of the real pioneering work. Quite why the Siksnys paper was delayed so long is unclear – it might well come down to snobbery/Matthew Effect issues related to address lines, but I wouldn’t want to speculate further than that.

It’s a personal opinion, but for me, the fusion of the crRNA and the tracrRNA in the Doudna/Charpentier paper is the masterstroke, and that’s something that only appears in that publication (neither Siksnys nor Zhang made the same jump). Demonstrating that Cas9 was programmable was a logical extension once its activity had been characterised, but realising that the system could be simplified even further required both a profound knowledge of the relevant RNA biology and structural biology, and some outside-the-box thinking.

Anyhows, I’m sure this will be a fun topic for pub discussions in years to come..! 🙂

Hi Anupam,

Sorry for the slow reply.

Soooo, much of the below is paraphrased from “Naked Statistics” by Charles Wheelan, which I can highly recommend if you’re not already familiar with it.

– Any given experiment (with multiple technical and biological replicates) will give us a sample mean, which is an approximation of the true population mean.

– If we do multiple experiments, each with multiple technical and biological replicates, we will end up with a set of sample means.

– The sample means will themselves form an (approximately) normal distribution around the true population mean.

– Crucially, the underlying data does not have to be normally distributed for this to apply – this is the Central Limit Theorem.

– The more samples we take, the closer their means will come to a normal distribution, and the larger the dataset within each sample, the tighter that distribution will be (because each sample mean has been estimated more accurately).

– The power of a normal distribution is that we know what proportion of the observations lie within 1 SD of the mean (~68%), what proportion lie within 2 SD of the mean (~95%), and so on.

– The standard error is the term that describes the dispersion of these sample means.

– I.e. the standard deviation measures the dispersion in the sample; the standard error measures the dispersion of the sample means. The standard error is the standard deviation of the sample means, and can be estimated as SD/√n.

– It’s therefore easy to see why the standard error is a richer readout on your data, because it provides an estimate of the variation in the sample means, and therefore an indication of how close you might be to the true population mean.

– The thing to ask, though, is this: how many people use the standard error because it provides a measure of the dispersion of the sample means, and how many use it just because it gives a lower value than the standard deviation? In biological sciences, I’d wager the latter is often more likely. In fact, I’d bet that much of the time the standard error is used, it’s (erroneously) not even being applied to means.

– In biological sciences, people with only a basic grasp of statistics (like myself) want to show where the centre of their data – obtained using multiple technical replicates, multiple biological replicates, and across multiple independent experiments – lies, and how much variation there is around that point. Median and SD describe those things.
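The points above are easy to convince yourself of empirically. Here’s a small, illustrative Python sketch (the population, sample sizes, and variable names are my own choices, not from the book): draw repeated samples from a decidedly non-normal population, and the sample means pile up in a normal-looking distribution around the true mean, with a spread that matches SD/√n:

```python
import random
import statistics

random.seed(42)

# Population: exponential with mean 1.0 and SD 1.0 (decidedly non-normal).
POP_MEAN = 1.0
POP_SD = 1.0
N_EXPERIMENTS = 1000   # number of repeated "experiments"
SAMPLE_SIZE = 50       # replicates per experiment

# Each experiment yields one sample mean.
sample_means = []
for _ in range(N_EXPERIMENTS):
    sample = [random.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.mean(sample))

# The sample means cluster around the true population mean...
grand_mean = statistics.mean(sample_means)

# ...and their spread (the standard error) shrinks like SD/sqrt(n).
observed_se = statistics.stdev(sample_means)
predicted_se = POP_SD / SAMPLE_SIZE ** 0.5

print(f"grand mean of sample means: {grand_mean:.3f} (true mean {POP_MEAN})")
print(f"observed SE: {observed_se:.3f}, predicted SD/sqrt(n): {predicted_se:.3f}")
```

Increasing SAMPLE_SIZE tightens the distribution of the means (smaller SE); increasing N_EXPERIMENTS just makes the normal shape of that distribution more obvious.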

Does that sound reasonable?
