I'm going through it now, but suffice to say I think there's some stuff here we haven't seen before. Already learned something new:
"Furthermore, 2 of the 6 FX-322-treated responders demonstrated a >5 dB at both 6 and 8 kHz on day 90 post injection (Figure 9B)".
I had no idea they had 2 patients responding at 6 kHz on the audiogram. Was anyone else aware of this? I knew the pharmacodynamics showed it gets that far but didn't realise it was in any therapeutic quantity. Probably wasn't statistically significant but I still think that could be really promising. I find this quite exciting because the biggest dip on my audiogram is at 6 kHz!
Analysis, Part I
So I ran a Fisher's Exact Test for myself on this result and computed the p-value. I was lazy, so I just used GraphPad to do the calculations for me.
Assumptions:
- Two categorical variables: Variable 1 is Treatment (FX-322 or Placebo) and Variable 2 is Responder (Yes or No, where Yes means >5 dB improvement at both 6 and 8 kHz on day 90). This creates a 2x2 grid in which each of the n=23 patients lands in exactly 1 of the 4 cells.
- One-tailed test. In other words, the p-value measures how likely it is to see an "improvement" from FX-322 at least as extreme as the one observed. Concretely: with 15 treated ears and 8 placebo ears, what is the probability that at least 2 of the 15 treated ears are responders if the null hypothesis (no improvement) is true? A small p-value means that such a result would be unlikely under the null hypothesis, so the null is harder to believe. A large p-value means the result is easy to explain by chance, which is weaker evidence. Therefore, the closer p is to 0, the better, and the closer p is to 1 (100%), the worse. With only 2 responders, we should obviously expect the p-value to be large.
Notes:
- They performed this same test on the 6/15 responders to look for statistical significance. Therefore, it's a very reasonable test. They also performed a one-tailed test.
Results:
After a bunch of math, p=0.415. For interpretation, we need p < .05 to be considered statistically significant. Also, note that p=.05 for the same test with "Responder" only requiring >5 dB improvement at 8 kHz. As we can see, that's a big difference in the p-value.
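For anyone who wants to check these numbers without GraphPad, here is a minimal sketch of the one-tailed calculation in plain Python. It assumes the placebo group had 0 responders in both cases (my assumption, though it is the only split that reproduces the p-values above), and the function name `fisher_one_tailed` is just something I made up for illustration.

```python
from math import comb

def fisher_one_tailed(treated_yes, treated_n, placebo_yes, placebo_n):
    """One-tailed Fisher's exact p-value: with the table margins fixed,
    the probability of seeing at least `treated_yes` responders in the
    treated group if treatment made no difference."""
    total_n = treated_n + placebo_n
    total_yes = treated_yes + placebo_yes
    denom = comb(total_n, total_yes)
    upper = min(treated_n, total_yes)
    p = 0.0
    for k in range(treated_yes, upper + 1):
        # Hypergeometric probability of exactly k treated responders
        p += comb(treated_n, k) * comb(placebo_n, total_yes - k) / denom
    return p

# ASSUMPTION: 0 of the 8 placebo ears were responders in either case.
p_6_and_8 = fisher_one_tailed(2, 15, 0, 8)  # >5 dB at both 6 and 8 kHz
p_8_only = fisher_one_tailed(6, 15, 0, 8)   # >5 dB at 8 kHz only

print(round(p_6_and_8, 3), round(p_8_only, 3))  # 0.415 0.05
```

With zero placebo responders the sum collapses to a single term, so the 6 and 8 kHz case is just C(15,2)/C(23,2) = 105/253.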
This small-sample, crude test tells me something pretty big: 8 kHz is right around the boundary of where the drug action reaches. It is nice that 2/15 saw improvement at 6 kHz, but I think our expectations should be low. Moreover, to @Aaron91's point about the impact of dosing volume being similar because of Fick's law, that makes me think that the best we'll see from multiple dosing cohorts is added improvement at 8 kHz. I expect that 6/15 ratio to improve, but the 2/15 ratio at 6 kHz to barely improve in the Phase 2a study.
tl;dr: they need a new formulation or a fundamentally different drug delivery technique to reach lower. Still encouraging.
Also, check this out:
"In addition, three subjects reported an improvement in tinnitus."
That's at least 50% of patients who didn't have a WR ceiling effect seeing improvement in their tinnitus.
Analysis, Part II
I ran the same test as above, but with "Responder" changing to "tinnitus improvement" (Yes or No). My conservative assumption is that of the n=23 people in the study, there were only 3 tinnitus improvers in total, and that all 3 came from the 15 treated ears.
The same one-tailed test procedure reveals a p-value of p=0.2569. For the two-tailed test (a more conservative p-value, which also considers the possibility that the drug worsened tinnitus), we have p=0.5257.
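Here is a rough sketch of how both tinnitus p-values can be reproduced in plain Python. It assumes 3 improvers among the 15 treated ears and 0 among the 8 placebo ears, as stated above. For the two-tailed value it adds up every possible table that is no more likely than the observed one. That is one common convention for a two-sided Fisher's test, and I'm assuming it is what GraphPad does, since it matches the number I got. The helper name `table_prob` is made up for illustration.

```python
from fractions import Fraction
from math import comb

def table_prob(k, treated_n, placebo_n, total_yes):
    """Exact hypergeometric probability that exactly k of the
    total_yes improvers fall in the treated group (margins fixed)."""
    return Fraction(
        comb(treated_n, k) * comb(placebo_n, total_yes - k),
        comb(treated_n + placebo_n, total_yes),
    )

# ASSUMPTION: all 3 tinnitus improvers were treated ears, none placebo.
treated_n, placebo_n = 15, 8
observed, total_yes = 3, 3

# Every table consistent with the fixed margins
k_min = max(0, total_yes - placebo_n)
k_max = min(treated_n, total_yes)
probs = {k: table_prob(k, treated_n, placebo_n, total_yes)
         for k in range(k_min, k_max + 1)}

p_obs = probs[observed]
# One-tailed: tables at least as favourable to the drug as the observed one
one_tailed = sum(p for k, p in probs.items() if k >= observed)
# Two-tailed: tables no more likely than the observed one, either direction
two_tailed = sum(p for p in probs.values() if p <= p_obs)

print(round(float(one_tailed), 4), round(float(two_tailed), 4))  # 0.2569 0.5257
```

Using `Fraction` keeps the "no more likely than observed" comparison exact, so there is no floating-point tie-breaking to worry about.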
Caveats: This assumes that tinnitus was perfectly assessed and that only 3 people in the whole study improved. I have no idea if this is true or not. Either way, if the 3 tinnitus improvers came from the 6 responders, that's not bad for one dose.