tl;dr: The company realized that saving the science was more important than saving the egos of management, so their whole approach is to show investors that they fucked up the Phase 2a trial design by encouraging people to post low screening scores (which were then used as baselines. wtf!), but that the science and the reliability of future single-dose studies are still something to be optimistic about. Whether you agree with that or not is up to you.
------------------------------------------------------------------------------------
Detailed explanation:
I will try to help you out with the last paragraph. As
@Diesel alluded to, in open-label trial 111, there is no available information (that I'm aware of, either) on whether or not the patients knew which ear was injected. My guess is that they did and that it was the worse ear, but that's just a guess.
Anyways, what do they mean by this picture (and the remark about trial 111)?
View attachment 45393
When people do the WR test, they have a baseline and then a later data point (in this case day 90). Call the baseline score X and the day 90 score Y. In theory, if someone received a placebo (ignoring many other factors and assuming the person is essentially a robot taking the same test twice), then X and Y should be close. Given X, a 95% confidence interval is a range of numbers that should contain Y about 95% of the time. Though it's an abuse of mathematical language, roughly speaking: 95% of the time, Y should land in this interval. Anyway, there's a process for calculating these 95% confidence intervals for Y, given each possible baseline score X.
Okay, so what are they pointing out here? A 95% confidence interval is constructed so that if L is the minimum and U is the maximum of the interval, then, by pure chance alone, Y should exceed U about 2.5% of the time and fall below L about 2.5% of the time.
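To make that concrete, here's a minimal Python sketch of one way such an interval could be built, assuming a toy binomial model where the WR test has 50 words and each word on the retest is a coin flip with probability estimated from the baseline. That's my simplification for illustration, not necessarily how the company actually computes its intervals (real methods are more careful about extreme scores and test-retest variability).

```python
from scipy.stats import binom

def retest_interval(baseline_correct, n_words=50, level=0.95):
    """Rough 95% interval for a retest WR score, assuming the retest score
    is Binomial(n_words, p) with p estimated from the baseline score.
    This is a stand-in toy model, not the company's actual method."""
    p_hat = baseline_correct / n_words
    alpha = 1 - level
    lower = binom.ppf(alpha / 2, n_words, p_hat)       # L: ~2.5th percentile
    upper = binom.ppf(1 - alpha / 2, n_words, p_hat)   # U: ~97.5th percentile
    return int(lower), int(upper)

# Example: a baseline of 20/50 words correct
print(retest_interval(20))   # -> roughly (13, 27) under this toy model
```

Point being: the interval depends on the baseline X, which is why every placebo patient gets their own L and U.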
What ended up happening in Phase 2a, and why they are calling red flags on themselves (total mismanagement), is this: we "should" see about 2.5% of the placebo patients exceed the upper bound U of their respective 95% confidence interval (I put "should" in quotes because the placebo sample size was only n=21, so it's only a rough approximation; the intervals also differ across patients because they depend on each patient's baseline score X). Instead, they saw 16% of the placebo patients exceed U. In other words, it's unbelievable that, by pure chance, placebos just happened to improve that much. That's their whole point -- the study design was very poor.
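And here's why 16% vs. 2.5% screams "not chance." A rough back-of-the-envelope, again in Python: if each of the 21 placebo patients independently had a 2.5% chance of exceeding their U, how likely is it that roughly 3 or 4 of them would (16% of 21 is about 3.4; the release only gives the percentage, so the exact count is my guess)?

```python
from scipy.stats import binom

n_placebo = 21      # placebo arm size from the release
p_exceed = 0.025    # chance any one patient beats their upper bound by luck

# 16% of 21 is roughly 3-4 patients (my guess at the count, not stated)
for k in (3, 4):
    prob = 1 - binom.cdf(k - 1, n_placebo, p_exceed)  # P(at least k exceed U)
    print(f"P(at least {k} of {n_placebo} exceed U by chance) approx {prob:.3f}")
```

Under that toy calculation you get roughly a 1-2% chance of three or more exceedances and well under 1% for four or more, which is the sense in which "the placebos just got lucky" doesn't hold up.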
In essence, they are trying to say "the science is/might be okay; we just really fucked up the trial." This is a smart move, given the braindead decision not to include a lead-in from the start.
Finally, how does this compare to trial 111, where there were no true placebos but the untreated ear was treated as a placebo data point? They are basically saying, for reference (which strengthens their argument that the trial was fucked), that 0% of the untreated (placebo-like) ears in trial 111 saw improvements exceeding the upper bound U of their 95% confidence interval.
If you look at the
May 13 press release (see below), they point out that in the trials where they correctly utilized a lead-in (i.e. patients had to take a screening test, and that score had to match another score from >6 months earlier, confirming the person really had stable word scores; the baseline was then assessed later, once the incentive to score low was gone), shit did not hit the fan.
View attachment 45395