Thanks to Ken and Bill for explaining some of the statistical issues, but they discuss only the lower and upper prediction percentiles (typically the 5%ile and 95%ile). Mike has also mentioned the central tendency.

An arguably more important percentile for model evaluation in a VPC is the 50%ile (the median). This gives you the clearest idea of how well the model predicts the central tendency of the observations and can give you direct insight into model misspecification and how it might be addressed. See the tutorial by Nguyen et al. (2017) for examples.

VPCs are most easily evaluated by comparing each observed percentile with its corresponding prediction percentile. Unfortunately, some commonly used VPC tools do not include the prediction percentile by default, so users are left to guess how well the observed and predicted percentiles agree. Hint to VPC tool developers: please help users by including the prediction percentiles by default.
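For anyone who wants to see that comparison concretely, here is a minimal sketch in Python with NumPy. It is not taken from any particular VPC tool: the function name `percentiles_by_bin`, the binning, and the synthetic log-normal data are all assumptions made purely for illustration. It computes the 5th, 50th and 95th percentiles of the observations in each bin alongside the same percentiles of the simulated values pooled across replicates.

```python
import numpy as np

def percentiles_by_bin(obs, sim, bin_id, probs=(5, 50, 95)):
    """Return {bin: (observed percentiles, predicted percentiles)}.

    obs    : (n,) observed values
    sim    : (n_rep, n) simulated replicates of the same design
    bin_id : (n,) integer bin label for each observation
    """
    out = {}
    for b in np.unique(bin_id):
        in_bin = bin_id == b
        observed = np.percentile(obs[in_bin], probs)
        # Pool the simulated values in this bin across all replicates.
        predicted = np.percentile(sim[:, in_bin].ravel(), probs)
        out[int(b)] = (observed, predicted)
    return out

# Synthetic data (an assumption for illustration only): 4 time bins,
# 50 observations per bin, 200 simulated replicates.
rng = np.random.default_rng(0)
bin_id = np.repeat(np.arange(4), 50)
typical = np.array([10.0, 6.0, 3.0, 1.0])[bin_id]
obs = typical * np.exp(rng.normal(0.0, 0.3, size=bin_id.size))
sim = typical * np.exp(rng.normal(0.0, 0.3, size=(200, bin_id.size)))

for b, (observed, predicted) in percentiles_by_bin(obs, sim, bin_id).items():
    print(b, "observed:", np.round(observed, 2), "predicted:", np.round(predicted, 2))
```

Plotting the two sets of percentiles as paired lines against time is then enough to judge agreement by eye, with the median pair being the first thing to check.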

Nguyen TH, Mouksassi MS, Holford N, Al-Huniti N, Freedman I, Hooker AC, et al. Model Evaluation of Continuous Data Pharmacometric Models: Metrics and Graphics. CPT: Pharmacometrics & Systems Pharmacology. 2017;6(2):87-109.

This is a great example of the kind of terminology debate that the ASA / ISOP Statistics and Pharmacometrics special interest group (SxP) is trying to tackle.

As Mats and Bill point out, the common usage within our community is to say that the percentiles (5th, 95th) are “prediction intervals” and the interval estimates / uncertainty around these percentiles are “confidence intervals”.

But as Ken points out, these terms do not strictly correspond to the statistical definition of each if you take into account what the VPC procedure is actually doing.
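To make "what the VPC procedure is actually doing" concrete, here is a rough sketch in Python with NumPy. It assumes the common approach of simulating replicate datasets from the fitted model, taking a percentile within each replicate, and then summarising how that percentile varies across replicates. The data are synthetic and the 2.5% / 97.5% limits for the band are an illustrative choice, not something prescribed in this thread.

```python
import numpy as np

# Synthetic values for a single time bin (an assumption for illustration):
# 500 simulated replicates, each with 40 observations, plus one observed set.
rng = np.random.default_rng(1)
n_rep, n_obs = 500, 40
sim_bin = 5.0 * np.exp(rng.normal(0.0, 0.3, size=(n_rep, n_obs)))
obs_bin = 5.0 * np.exp(rng.normal(0.0, 0.3, size=n_obs))

# Step 1: the 5th, 50th and 95th percentiles within each replicate.
per_rep = np.percentile(sim_bin, [5, 50, 95], axis=1)   # shape (3, n_rep)

# Step 2: the spread of each of those percentiles across replicates.
# This band is what is commonly labelled a "confidence interval" on a VPC.
band = np.percentile(per_rep, [2.5, 97.5], axis=1)      # shape (2, 3)

# Step 3: the same percentiles of the observed data, for visual comparison.
obs_pct = np.percentile(obs_bin, [5, 50, 95])

for name, lo, hi, o in zip(("5th", "50th", "95th"), band[0], band[1], obs_pct):
    print(f"{name}: simulated band [{lo:.2f}, {hi:.2f}], observed {o:.2f}")
```

Seen this way, the band is simply a summary of simulated percentiles under the model, which helps explain why the labels borrowed from formal statistics fit it only loosely.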

The VPC is a model diagnostic procedure for the observed data and provides a visual check of whether the model is capturing central tendencies and dispersion in our data. (BTW, I *know* there are debates about the usefulness or otherwise of VPC plots. I’m not going to address that here and I suggest we don’t disappear down *that* rabbit hole.) We are NOT trying to make probabilistic statements about how likely the observed percentiles are to fall within the intervals around the simulated percentiles. So if a reviewer raises a question based on our use of statistically woolly terms like “prediction interval” or “confidence interval”, we should be ready to put up our hands and admit that the terms we are using do not imply those statistical properties.

We could advocate changing the terminology used, but that may not gain traction in the community after this length of time. But we *should* be cognizant of what these things are, what they’re for, what the formal statistical terminology implies, and what our use (or maybe misuse) is or isn’t implying.

The ASA / ISOP SxP group has had a session accepted at this year’s ACOP meeting, where we hope to surface a few of these thorny issues and debate our use of terminology in pharmacometrics, the statistical interpretation of that terminology, and whether it *really* matters. If you’re interested, please come along and be prepared to engage in the discussion!

