A Brief Review of LQAS and B-LQAS
LQAS
Classical LQAS is primarily a classification procedure [16]. In its simplest form, the goal is to classify the unknown prevalence of a binary indicator as greater than or equal to some critical threshold, p*, or less than this threshold. To do so, the number of cases, Y, in a simple random sample of size n is compared to a predefined decision rule, d. If fewer than d cases are observed, then the prevalence is classified as low (p < p*). Otherwise, it is classified as high (p ≥ p*). The sample size and decision rule are chosen to achieve error probabilities of α and β, where the former is the maximum acceptable probability of a false negative (or a false low) over some range of prevalences and the latter is the maximum acceptable probability of a false positive (or a false high) over some other range of prevalences.
To define these ranges, LQAS uses upper and lower thresholds, p_U and p_L. The sample size and decision rule are chosen so that the probability of a false negative when the true prevalence is greater than or equal to p_U is less than or equal to α. Likewise, the probability of a false positive when the true prevalence is less than or equal to p_L is less than or equal to β. In the industrial literature, these errors are referred to as the consumer and producer risks. In health applications, generally p_U is chosen to be equal to the critical threshold p* and p_L is chosen to reflect the desired detectable deviation from that threshold [16]. However, some have suggested using p* = p_L [17], which might be appropriate depending on the application. When deciding how to implement the procedure, it is important that the investigator keep in mind what p represents, particularly if p is the prevalence of an undesirable outcome.
The Operating Characteristic (OC) Curve completely summarizes any LQAS design. For a given value of p, this is defined as

    OC(p) = Pr(Y < d | n, p) = \sum_{y=0}^{d-1} \binom{n}{y} p^{y} (1-p)^{n-y}    (1)

where Y is assumed to be binomially distributed with parameters n and p. Plotting this quantity for the entire range of p yields the desired curve. A satisfactory LQAS design will have the following properties: OC(p) ≤ α for all p ≥ p_U, and OC(p) ≥ 1 - β for all p ≤ p_L.
For example, in Figure 1, we see an OC curve with n = 200 and d = 14, which corresponds to the LQAS design used by Deitchler et al [2,3] to classify the prevalence of malnutrition with a critical threshold p* = 0.10 and with p_U = 0.10 and p_L = 0.05. In that application, α was set to 0.10 and β to 0.20. We see that the OC curve is less than α = 0.10 at the upper threshold and greater than 1 - β = 0.80 at the lower threshold, and thus meets the design requirements.
B-LQAS
B-LQAS is similar to classical LQAS in that the final goal is to decide whether to classify the prevalence as greater than or equal to some threshold p*, or less than this threshold, so that appropriate action can be taken. However, in the Bayesian context, we allow a prior distribution on the parameter, p, to be part of the analysis. As with classical LQAS, we choose the sample size, n, and the decision rule, d, to achieve certain criteria before observing the data. In contrast to classical LQAS, these criteria are based on posterior properties of our decision, or what we believe after observing the data. For example, it might be important to know the probability that we have made the correct decision, or classification; there are two such probabilities, depending on which decision we make.
To get a better understanding of the intuition behind this approach, consider Figure 2, where we have plotted a hypothetical prior distribution of the prevalence of malnutrition. This distribution has mean 8.5% with 77% of its mass between 5% and 10%. Overlaid on this plot are two OC curves. The solid curve corresponds to a classical OC curve with d = 14, or the decision rule that we would choose using classical considerations. When we consider the prior distribution, we might argue that it makes less sense to use this decision rule, which prioritizes error control above 10% and below 5%, since we seldom expect to see a prevalence that high or that low. The dashed line corresponds to a classical OC curve with d = 28. With the chosen prior, this appears to be a better design, as it prioritizes the region of largest prior mass when choosing a decision rule. That is, the design prioritizes correct classification of prevalences which are most likely given our prior beliefs. Hence, prior beliefs about the parameter of interest should play a vital role in determining an appropriate design, and in explaining its properties.
For the sake of illustration, in this paper we assume the conjugate Beta prior on the prevalence, p ~ Beta(a, b), to demonstrate the effect on the OC curves. That is, we let the prior distribution, π(p), take the structural form

    \pi(p) = \frac{p^{a-1} (1-p)^{b-1}}{B(a, b)}, \quad 0 < p < 1,

where a, b > 0 and B(a, b) is the beta function [18]. The Beta provides a rich family of distributions, allowing for a range of flexible prior shapes. Further, there is some precedent for its use [10]. The parameters a and b control the shape of the prior distribution. To aid interpretation, we might think about a and b as the prior number of successes and failures, respectively. Therefore, a large value of b relative to a yields a distribution that concentrates its mass toward low prevalences.
For pedagogical reasons, we choose these parameters to reflect a variety of potential prior beliefs (see Figure 3A). For example, when a = 1 and b = 1, the prior is completely flat, which might correspond to a lack of prior knowledge or possibly prior indifference. When a = 2 and b = 10, the prior density has most of its mass below 30%, which is a realistic assumption, as malnutrition prevalence is rarely as high as 30%. For example, in the CE-DAT global database of over 1400 malnutrition surveys conducted in emergency situations, only 59 reported a prevalence as high as 30% in children 6-59 months [19]. However, we also look at the cases where a = 4 and b = 2 and where a = b = 5, reflecting a prior belief that the prevalence is in fact quite high, even though this is probably quite unlikely in the present context. The properties of a B-LQAS design can once again be formalized in the OC curves, although we now focus our attention on the Bayes OC curves. A key difference between classical LQAS and B-LQAS is the reliance on not one, but two curves to determine appropriate designs, since we need to condition on either a high or a low classification. In this paper, we define the following Bayes OC curves
    BOC_P(x) = Pr(p ≥ x | Pass, n, d)    (2)

    BOC_F(x) = Pr(p < x | Fail, n, d)    (3)

where the event Pass = {Y ≥ d}, the event Fail = {Y < d}, and Y is the number of "successes" in a sample of size n. Plotting (2) and (3) as a function of x yields the desired curves. We can write

    BOC_P(x) = \frac{\int_x^1 \sum_{y=d}^{n} f(y | p) \pi(p) dp}{\int_0^1 \sum_{y=d}^{n} f(y | p) \pi(p) dp},

with the analogous expression for BOC_F(x), where π(p) is the prior distribution of p and f(y | p) is the sampling distribution of Y given the parameter, p. As a result, the Bayes OC curves are less straightforward to calculate than the classical OC curves, as some integration over the unknown parameter is required. In some cases, these integrals can be analytically intractable, in which case one would have to appeal to numerical methods to evaluate the expressions. In any single application we ultimately take only a single action, but we need to consider both Bayes OC curves. Note that in the case of malnutrition, if Y ≥ d, this indicates a high burden of malnutrition. Therefore, the use of the word Pass is not intuitive here. However, we might think of Pass as "qualifying for humanitarian aid" to facilitate the interpretation. We continue with this notation to provide a unified framework.
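With the conjugate Beta(a, b) prior, no numerical integration is actually needed: the posterior given Y = y is Beta(a + y, b + n - y), so each Bayes OC curve is a mixture of posterior Beta tail probabilities weighted by the beta-binomial prior predictive distribution of Y. The following is a minimal sketch (our own helper, assuming the curves are the conditional probabilities Pr(p ≥ x | Pass) and Pr(p < x | Fail), and assuming `scipy` and `numpy`):

```python
import numpy as np
from scipy.stats import beta, betabinom

def bayes_oc(x, n, d, a, b):
    """Bayes OC curves for a Beta(a, b) prior.

    Returns (BOC_P(x), BOC_F(x)) with
      BOC_P(x) = Pr(p >= x | Pass = {Y >= d}),
      BOC_F(x) = Pr(p <  x | Fail = {Y <  d}),
    using the Beta(a + y, b + n - y) posterior and the beta-binomial
    prior predictive weights Pr(Y = y).
    """
    y = np.arange(n + 1)
    w = betabinom.pmf(y, n, a, b)        # prior predictive Pr(Y = y)
    tail = beta.sf(x, a + y, b + n - y)  # Pr(p >= x | Y = y)
    passed = y >= d
    boc_p = np.sum(w[passed] * tail[passed]) / np.sum(w[passed])
    boc_f = np.sum(w[~passed] * (1 - tail[~passed])) / np.sum(w[~passed])
    return boc_p, boc_f
```

Evaluating `bayes_oc` over a grid of x values for each of the four priors reproduces curves of the kind shown in Figure 3B.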
The interpretation of each of these curves allows us to make probabilistic statements about the parameter of interest, given the results of our diagnostic procedure, such as statements like those made by Bilukha and Blanton [5], which in their context are incorrect. Intuitively, it would be desirable to control the probability that the prevalence is low when we say it is high (or declare a Pass), for example, and vice versa. Using the Bayes OC curves, we can choose n and d so that the Bayes classification errors are controlled. For an analogue to classical LQAS, we can enforce the following:

    BOC_F(p_U | n, d) ≥ 1 - α    (4)

    BOC_P(p_L | n, d) ≥ 1 - β    (5)

where p_U = p* and p_L is some lower critical threshold. However, it is also possible to choose p_L = p_U = p*, which might be more appealing to some practitioners. This latter case is discussed at length in the context of Phase II clinical trials by Wang et al [20]. Ultimately, the choice depends on the application and the priorities of the investigators. We discuss this issue further in the next section.
In Figure 3B, we see the Bayes OC curves plotted as a function of the prevalence threshold (x in equations (2) and (3)) with n = 200 and d = 14. When α = 0.10 and β = 0.20, we see that the constraints posed in (4) and (5) for p_U = 0.10 and p_L = 0.05 are met for all considered priors. That is, BOC_F(0.10 | n, d) ≥ 0.90 and BOC_P(0.05 | n, d) ≥ 0.80 for each of the four priors. Hence, when we choose d = 14, which corresponds to the classical solution, we achieve reasonable Bayesian properties as well.
Note that when we let p_L = p_U = 0.10, the error at the upper and lower thresholds increases slightly as compared to the case when p_U = 0.10 and p_L = 0.05 for the considered priors. That is, BOC_P(p_L = 0.10 | n, d) decreases from essentially one to just over 0.80 in the case when a = 2 and b = 10, which is still within the design constraints. This is important, however, as it highlights the tradeoff between classification precision and accuracy in these applications. Namely, by classifying as above p_U = p* or below p_L, we are asking for slightly imprecise results, but doing so with high accuracy. However, if we classify as above or below p_L = p_U = p*, we are asking for highly precise results, but doing so with less accuracy. We revisit this notion in more depth below.
Maximizing the Figure of Merit
In general, the approach outlined above is feasible yet time consuming, as it might require an investigator to look at a range of sample sizes and decision rules to arrive at a given design. A more automated design selection is achieved by using the following Figure of Merit (FOM),

    FOM(n, d) = Pr(p ≥ p_U, Pass) + Pr(p < p_L, Fail)    (6)

and for a given sample size, one might choose the decision rule to maximize this quantity. In the decision theoretic literature, this quantity is known as the Bayes Risk of a zero-one utility (negative loss) function [1]. When p_U = p_L, the FOM of a given design is the average probability of correct classification given a prior π(p). When p_L < p_U, dividing the above quantity by a factor of 1 - Pr(p_L < p < p_U) yields the average probability of correct classification of priority locales. This scaling of the FOM does not affect the maximization, but can help with interpreting the result. Interestingly, (6) can be rewritten as

    FOM(n, d) = BOC_P(p_U | n, d) Pr(Pass) + BOC_F(p_L | n, d) Pr(Fail),

which shows that an optimal design is one that maximizes a weighted average of the Bayes OC curves, where the weights are the marginal probabilities of passing and failing the procedure. The marginal distribution of Y is also referred to as the prior predictive distribution; that is, the predictive distribution of Y given only our prior assumptions. Hence, we weight more heavily the Bayes OC curve which has the greater prior predictive probability of occurring.
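The maximization over decision rules can be sketched with the same beta-binomial weighting used for the Bayes OC curves. This is our own illustration (helper names `fom` and `best_rule` are not from the paper), assuming a Beta(a, b) prior and `scipy`/`numpy`:

```python
import numpy as np
from scipy.stats import beta, betabinom

def fom(n, d, a, b, p_u, p_l):
    """Figure of Merit: Pr(p >= p_U, Pass) + Pr(p < p_L, Fail) under a
    Beta(a, b) prior, using the Beta(a + y, b + n - y) posterior."""
    y = np.arange(n + 1)
    w = betabinom.pmf(y, n, a, b)        # prior predictive Pr(Y = y)
    hi = beta.sf(p_u, a + y, b + n - y)  # Pr(p >= p_U | Y = y)
    lo = beta.cdf(p_l, a + y, b + n - y) # Pr(p <  p_L | Y = y)
    return np.sum(w[y >= d] * hi[y >= d]) + np.sum(w[y < d] * lo[y < d])

def best_rule(n, a, b, p_u, p_l):
    """Decision rule d in 0..n maximizing the FOM for a fixed sample size."""
    return max(range(n + 1), key=lambda d: fom(n, d, a, b, p_u, p_l))
```

For instance, with the flat prior (a = b = 1), n = 200 and p_L = p_U = 0.10, `best_rule` recovers a rule at np* = 20, consistent with the discussion of the indifference prior below.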
Continuing with our example, Figure 4A shows the plot of the FOM as a function of d for the situations when n = 200, p_U = 0.10, p_L = 0.05 and 0.10, and four prior distributions. When p_L = 0.05, the optimal decision rule hovers around d = 14, or the same as the classical LQAS solution. Yet, it is important to note that both when a = 5, b = 5 and when a = 4, b = 2, the optimal decision rules are less than 14. Further, even though the curves are very nearly flat in the displayed range of rules, these are true maxima, because the prior mass below p_L is non-zero. Scaling the maximum FOM appropriately reveals that the average probability of correct classification of priority locales is close to 100%, indicating the appropriateness of the design for detecting extremes (i.e. areas where p ≤ p_L or p ≥ p_U).
When p_L = p_U = 0.10, the maximum FOM decreases, albeit slightly, and the optimal decision rule increases to d = 17 or d = 20, depending on the prior. Therefore, if our prior belief is that the malnutrition prevalence is low, we require a greater number of malnourished children in our sample to be convinced otherwise. But if we believe that the malnutrition prevalence is high, we will need fewer malnourished children in our sample to be convinced that the prevalence is indeed low, thus possibly triggering an earlier intervention. This is a consequence of incorporating prior information into our analysis. In either case, it is important to realize that the optimal design does a good job of classifying areas. That is, in both cases, the maximum FOM is greater than or equal to 90%. Therefore, with a sample of size n = 200, the probability that we correctly classify an area is at least 0.90.
It is interesting to note that when a = 1 and b = 1, the indifference prior, the optimal decision rule is equal to 20, or np*. Both Bilukha and Blanton [5] and Rhoda et al [17] have suggested using d ≈ np* for the decision rule in the classical setting. The use of a flat prior with p_L = p_U = p* gives a Bayesian justification for such a choice, although the use of a flat prior does not ordinarily make sense for this application, as it is uncommon for the prevalence of acute malnutrition to reach as high as 30%, much less 80% or 90% [19]. We discuss this in more depth in the following sections.
Balancing Accuracy and Precision
The FOM measures the overall accuracy of the B-LQAS procedure, and it is attractive to constrain our procedure to achieve at least a minimum FOM. Namely, we constrain the FOM so that

    FOM(n, d) ≥ 1 - δ    (7)

where the parameter δ controls the overall level of accuracy of the procedure. This might be considered a more appealing design metric than α and β. Of course, when p_L = p_U, constraints (4) and (5) imply

    FOM(n, d) ≥ (1 - β) Pr(Pass) + (1 - α) Pr(Fail).

Hence, in this special case, the optimal design which meets (7) might be chosen as that design for which a weighted average of the producer and consumer risks, weighted according to the prior belief of passing or failing, is greater than or equal to 1 - δ. If α = β, then we have that α = β = δ, further simplifying the parameterization.
The precision demanded of the procedure impacts the accuracy. That is, the choice of p_U and p_L affects the properties of the design. Formally, define the precision as 1 - |p_U - p_L|. When p_U = p_L, the precision is equal to one. But as p_L deviates from p_U, the precision decreases. Indeed, when at their maximal difference, the precision is zero. In our example, p_U = 0.10 and p_L ranges from 0.05 to 0.10, so that the precision ranges from 0.95 to 1.00.
In Figure 4B, we plot the maximum average probability of correct classification of priority locales (or the appropriately scaled FOM) as a function of the precision, fixing p_U = 0.10 and allowing p_L to vary from 0.05 to 0.10. Therefore, when the precision is equal to 0.95, this corresponds to p_L = 0.05 and p_U = 0.10. When the precision is equal to one, then p_L = p_U = 0.10. Assume that we want a design that achieves an overall accuracy of 0.95 (1 - δ = 0.95). We see that for three of the four considered priors, the maximum FOM is well above 0.95 for all considered precisions, and therefore we should on average correctly classify over 95% of locales with these procedures. However, for the situation when a = 2 and b = 10, which is likely the more realistic prior for this application, the maximum FOM drops below 0.95 as p_L approaches p_U, or the precision approaches one. Hence, it is not always possible to achieve the desired level of accuracy for all precisions, short of increasing the sample size, illustrating the trade-off between the two.