In all fairness, they put it very, very delicately, and we have simply read between the lines. We are the ones who have been throwing around words like "liars/fakers/cheaters" — the exact terminology Kevin Franck used was very considerate and more suggestive of an "unconscious bias," i.e., without intention. We are the ones who have suggested something more nefarious may have happened. Either way, the result is the same (if true).
In a way, that's almost worse, IMO. It's like they think they can easily pull the wool over everyone's eyes.
Here's the problem. They could have designed the trial in such a way that there would be an incentive to be honest. Just as an example (I'm simplifying the numbers; in actuality it would depend on the number of applicants, etc.):
Let's say they said to themselves the following. We are already using Thornton-Raffin confidence intervals to measure changes in repeated word tests. Let's do the same thing to compare each applicant's >=6 month test to their screening test.
It's really important to us to get good data here. We know that people are desperate, and that there's social media buzz — on top of our own messaging — signaling that we want low word scores. So let's create a rigorous standard.
Let's actually bother to read the Thornton-Raffin paper (instead of just grabbing numbers from the 95% table). Let's construct rigorous 80% confidence intervals (smaller error margins). You have to score (incentive!) within the 80% confidence interval of the score from the test you took before you even knew about the trial (so basically zero chance of the >=6 month WR score being faked).
You may try your hardest and your second score may still land outside the 80% confidence interval. That's not fair, and it doesn't make you a bad person, but we need participants who produce consistent repeat data.
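To make that 80% criterion concrete: Thornton and Raffin model word-recognition scores as binomial and use a variance-stabilizing arcsine transform to decide whether two scores differ by more than test-retest noise. The sketch below is my own simplified continuous approximation of that idea, not their published tables; the 50-word list size and the z = 1.282 cutoff (the two-sided 80% normal quantile) are illustrative assumptions.

```python
import math

def arcsine_transform(correct, total):
    # Variance-stabilizing arcsine transform of a proportion (in radians).
    # Under a binomial model, its variance is approximately 1/(4*total),
    # independent of the true score.
    return math.asin(math.sqrt(correct / total))

def scores_consistent(score1, score2, n_words=50, z=1.282):
    # z = 1.282 -> two-sided 80% criterion (vs. 1.96 for 95%).
    t1 = arcsine_transform(score1, n_words)
    t2 = arcsine_transform(score2, n_words)
    # Standard error of the difference of two transformed scores,
    # each based on n_words test items.
    se = math.sqrt(1 / (4 * n_words) + 1 / (4 * n_words))
    return abs(t1 - t2) <= z * se

# Hypothetical applicant: historical 40/50 vs. screening 36/50 or 28/50.
print(scores_consistent(40, 36))  # True  (within test-retest noise)
print(scores_consistent(40, 28))  # False (suspiciously large drop)
```

The point of the narrower 80% band is exactly the incentive effect described above: a deliberately sandbagged screening score is far more likely to fall outside it than an honest retest would be.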
Notice that my strategy is bad-incentive proof. I could have all sorts of nefarious, desperate intentions. I am encouraged to score my best.
To make this even better, you could adopt this standard without revealing it publicly. That way, there's a higher probability that the >=6 month score reflects the participant trying their hardest.
Between this test being stable and the audiogram being stable, there would be essentially no worry about frauds.
The only scenario I can think of that wouldn't be Frequency Therapeutics' fault is if the patients didn't lie at screening, but were paranoid that if their baseline scores were too good, they would be kicked out. That's unlikely to be the case, and it's easily fixed by just insisting to them that the motivation is to do your best throughout the trial.
I really hope they did do something like this, but just got burned. If I can think of it, they definitely can.