Bayes-LQAS: classifying the prevalence of global acute malnutrition
© Olives and Pagano; licensee BioMed Central Ltd. 2010
Received: 20 July 2009
Accepted: 9 June 2010
Published: 9 June 2010
Lot Quality Assurance Sampling (LQAS) applications in health have generally relied on frequentist interpretations for statistical validity. Yet health professionals often seek statements about the probability distribution of unknown parameters to answer questions of interest. The frequentist paradigm does not pretend to yield such information, although a Bayesian formulation might. This is the source of an error made in a recent paper published in this journal. Many applications lend themselves to a Bayesian treatment, and would benefit from such considerations in their design. We discuss Bayes-LQAS (B-LQAS), which allows for incorporation of prior information into the LQAS classification procedure, and thus shows how to correct the aforementioned error. Further, we pay special attention to the formulation of Bayes Operating Characteristic Curves and the use of prior information to improve survey designs. As a motivating example, we discuss the classification of Global Acute Malnutrition prevalence and draw parallels between the Bayes and classical classification schemes. We also illustrate the impact of informative and non-informative priors on the survey design. Results indicate that using a Bayesian approach allows the incorporation of expert information and/or historical data and is thus potentially a valuable tool for making accurate and precise classifications.
The frequentist approach to statistical inference assumes that a parameter of interest is a fixed and unobservable quantity. The goal is to make inference about this fixed value, given an assumed sampling distribution of the data. For example, one might estimate the prevalence of disease in a population and calculate a confidence interval about the estimate to reflect the statistical uncertainty associated with the estimation; or test a hypothesis about the value of the prevalence and report a p-value to determine significance. The attributes of these methods are judged a priori, or before observing any data. For example, a 95% confidence interval will capture the true parameter value on average 95% of the time. Similarly, a hypothesis test is designed to a certain power function, which determines the potential errors. Yet, once the data have been observed, a posteriori the probability that the true parameter lies within that interval is zero or one, and the result of the hypothesis test is either correct or not. Unfortunately, we do not know which.
The Bayesian approach instead treats the parameter, ϕ, as a random quantity and combines the observed data, X, with prior information through Bayes' theorem:

$$\Pr(\phi \mid X) = \frac{\Pr(X \mid \phi)\,\Pr(\phi)}{\Pr(X)}$$

The relevant pieces of this expression are the likelihood, Pr(X|ϕ), and the prior distribution, Pr(ϕ), which, together with Pr(X), yield the posterior distribution, Pr(ϕ|X). Both ingredients must be specified for valid conclusions in this context. The posterior distribution probabilistically describes the behavior of the unknown parameter given the prior and observed data, and serves as the basis for Bayesian analysis. Certain applications naturally lend themselves to a Bayesian approach. Consider monitoring the prevalence of acute malnutrition amongst children 6-59 months of age within a particular area. At any given time, there may be a true value of the prevalence of acute malnutrition (i.e. the number of children acutely malnourished in the area divided by the total number of children in the area). However, if one were to consider the prevalence over a six-month period, this value would fluctuate as children age, thereby entering or exiting the cohort, or as their nutritional statuses change. Thus, it may be more realistic to model the prevalence of malnutrition as a random quantity over time rather than a fixed quantity.
Deitchler et al found Lot Quality Assurance Sampling (LQAS) to be a useful tool to monitor the prevalence of Global Acute Malnutrition, defined as Weight-for-Height-Z-score < -2 standard deviations, in emergency situations [2,3]. With any LQAS application, the goal is to classify the population prevalence as above or below predefined thresholds by comparing the number of failures in a random sample to a specific decision rule. For example, one might be interested in determining with a high degree of confidence whether the prevalence of acute malnutrition is greater than 10% in a given population of children less than 5 years of age. To accomplish this goal, randomly sample 200 children from within the population. If more than thirteen children exhibit signs of acute malnourishment, then classify the prevalence of acute malnutrition as greater than 10% in that population. The choice of sample size and decision rule determines the degree to which one can rely on this classification. The LQAS procedure is discussed in depth in the next section. Recently, Bilukha, and Bilukha and Blanton, substantially criticized these designs in this journal. The main problem with their criticism is exemplified in Bilukha and Blanton's suggestion to use as an alternative measure of risk, "the statistical probability of the true population value's exceeding the threshold," conditional on the number of malnourished children in a sample. When the prevalence is treated as a constant, as it is in the model in their paper, this is a measure with little meaning. The authors fall short of specifying the necessary assumptions to make what is clearly a Bayesian statement; namely, no mention is made of the prior distribution. The reader is left to assume that the authors did not consider this aspect in their calculations.
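As a rough numerical check on the sampling example in this paragraph, the following sketch computes how often the rule "classify as greater than 10% when at least 14 of 200 sampled children are malnourished" would misclassify. The binomial model (simple random sampling), the comparison prevalence of 5%, and the function names are assumptions made for illustration only.

```python
from math import comb


def binom_pmf(y, n, p):
    """Pr(Y = y) for Y ~ Binomial(n, p)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def prob_classified_high(n, d, p):
    """Pr(Y >= d): chance that the sample triggers a 'high' classification."""
    return sum(binom_pmf(y, n, p) for y in range(d, n + 1))


n, d = 200, 14  # sample size and decision rule from the example in the text

# True prevalence exactly 10%: chance of wrongly classifying as low.
false_low_at_10 = 1.0 - prob_classified_high(n, d, 0.10)
# True prevalence 5% (illustrative value): chance of wrongly classifying as high.
false_high_at_05 = prob_classified_high(n, d, 0.05)

print(f"Pr(classify low  | p = 0.10) = {false_low_at_10:.3f}")
print(f"Pr(classify high | p = 0.05) = {false_high_at_05:.3f}")
```

Both quantities condition on a fixed prevalence, so they are frequentist error rates; neither is a probability statement about the prevalence itself.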
It is attractive to have the ability to make a probability statement about the prevalence of malnutrition, even though it does require that more structure be imposed on the model. The Deitchler et al LQAS designs serve as a natural place to begin this investigation. Classical LQAS has generally relied on frequentist statistical principles, particularly in its application in health (see Robertson and Valadez for over 800 examples, all of which take a frequentist approach). Further, to the best of our knowledge, nowhere in the health literature have Bayesian considerations been incorporated into the LQAS procedure. Yet, Bayes-LQAS (B-LQAS) is well-established in the industrial literature, where it is known as Bayesian Acceptance Sampling (see , and references therein). As early as the 1960's, Brush examined classical and Bayesian risks for a variety of sampling plans. Brush, and Sharma and Bhutani, emphasize the importance of examining both the classical and Bayes risks when deciding upon a sampling design. Fan, and Sheng and Fan, consider B-LQAS for binomial testing and outline an empirical Bayes approach to choosing a prior based on historical data.
More recently, Moskowitz considers B-LQAS under quadratic and step-loss functions. Fitzgerald looks at B-LQAS plans under an assumed mixture prior, and he bases B-LQAS designs on Bayes OC curves and average OC curves. These curves were first introduced by Easterling. Now, B-LQAS has made a transition into the economics and operations research literature. For example, in 1996 Lattimore et al used B-LQAS to monitor drug use in Illinois. In that application, the sampling plan was determined by minimizing expected cost, as advocated by Moskowitz et al [14,15].
In this paper, we discuss the potential benefits of using B-LQAS in health applications. As a running example, we discuss an application to acute malnutrition, motivated by LQAS designs proposed by Deitchler et al to classify the prevalence of malnutrition [2,3]. We show how to approach the classification problem from a Bayesian perspective, show some of the advantages of this approach, and discuss the parallels between the classical and Bayesian approach.
A Brief Review of LQAS and B-LQAS
Classical LQAS is primarily a classification procedure. In its simplest form, the goal is to classify the unknown prevalence of a binary indicator as greater than or equal to some critical threshold, p*, or less than this threshold. To do so, the number of cases, Y, in a simple random sample of size n is compared to a predefined decision rule, d. If fewer than d cases are observed, then the prevalence is classified as low (p < p*). Otherwise, it is classified as high (p ≥ p*). The sample size and decision rule are chosen to achieve error probabilities of α and β, where the former is the maximum acceptable probability of a false negative (or a false low) over some range of prevalences and the latter is the maximum acceptable probability of a false positive (or false high) over some other range of prevalences.
To define these ranges, LQAS uses upper and lower thresholds, p_U and p_L. The sample size and decision rule are chosen so that the probability of a false negative when the true prevalence is greater than or equal to p_U is less than or equal to α. Likewise, the probability of a false positive when the true prevalence is less than or equal to p_L is less than or equal to β. In the industrial literature, these errors are referred to as the consumer and producer risks. In health applications, p_U is generally chosen to be equal to the critical threshold p*, and p_L is chosen to reflect the desired detectable deviation from that threshold. However, some have suggested using p* = p_L, which might be appropriate depending on the application. When deciding on how to implement the procedure, it is important that the investigator keep in mind what p represents, particularly if p is the prevalence of an undesirable outcome.
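The classical design criteria can be sketched as a small search over decision rules. The example below is a minimal illustration, assuming a binomial sampling model and hypothetical error limits of α = 0.10 and β = 0.20 (these limits, and all function names, are assumptions, not values stated in the text). Because the false-low probability is largest at p_U and the false-high probability is largest at p_L, it is enough to check the risks at those two thresholds.

```python
from math import comb


def binom_cdf(k, n, p):
    """Pr(Y <= k) for Y ~ Binomial(n, p); an empty sum gives 0 when k < 0."""
    return sum(comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(0, k + 1))


def classical_risks(n, d, p_lower, p_upper):
    """Return (false-low risk at p_upper, false-high risk at p_lower)."""
    false_low = binom_cdf(d - 1, n, p_upper)          # Pr(Y < d  | p = p_U)
    false_high = 1.0 - binom_cdf(d - 1, n, p_lower)   # Pr(Y >= d | p = p_L)
    return false_low, false_high


def admissible_rules(n, p_lower, p_upper, alpha, beta):
    """All decision rules d whose risks at the thresholds meet alpha and beta."""
    rules = []
    for d in range(0, n + 2):
        false_low, false_high = classical_risks(n, d, p_lower, p_upper)
        if false_low <= alpha and false_high <= beta:
            rules.append(d)
    return rules


# Illustrative inputs: n and the thresholds follow the running example;
# alpha and beta are assumed values.
rules = admissible_rules(200, p_lower=0.05, p_upper=0.10, alpha=0.10, beta=0.20)
print("Decision rules meeting both limits:", rules)
```

If no rule is admissible for a given n, the usual remedy is to increase the sample size or to widen the gap between p_L and p_U.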
B-LQAS is similar to classical LQAS in that the final goal is to decide whether to classify the prevalence as greater than or equal to some threshold p*, or less than this threshold, so that appropriate action can be taken. However, in the Bayesian context, we allow a prior distribution of the parameter, p, to be part of the analysis. As with classical LQAS, we choose the sample size, n, and the decision rule, d, to achieve certain criteria when performing the classification before observing the data. In contrast to classical LQAS, these criteria are based on posterior properties of our decision, or what we believe after observing the data. For example, it might be important to know the probability that we have made the correct decision, or classification, and we have the choice of two probabilities, depending on which decision we make.
We take the prior distribution of p to be a Beta distribution with density

$$\pi(p) = \frac{p^{a-1}(1-p)^{b-1}}{B(a,b)}, \qquad 0 \le p \le 1,$$

where a, b > 0 and B(a,b) is the beta function. The Beta provides a rich family of distributions, allowing for a range of flexible prior shapes. Further, there is some precedent for its use. The parameters a and b control the shape of the prior distribution. To aid interpretation, we might think about a and b as the prior number of successes and failures, respectively. Therefore, a large value of b relative to a concentrates the prior mass at low prevalences, giving a distribution with a long right tail.
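To get a feel for what a given choice of a and b implies, one can summarize the prior by its mean and by the prior probability that the prevalence exceeds a threshold. The sketch below is restricted to positive integer a and b, for which the Beta distribution function reduces to a binomial tail sum; the function names and the 10% threshold are illustrative assumptions.

```python
from math import comb


def beta_cdf_int(x, a, b):
    """Pr(p <= x) for p ~ Beta(a, b) with positive integer a and b.

    Uses the identity I_x(a, b) = Pr(Bin(a + b - 1, x) >= a), which holds
    for integer shape parameters.
    """
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(a, m + 1))


def beta_mean(a, b):
    """Prior mean of p under Beta(a, b)."""
    return a / (a + b)


# Two of the priors discussed in the text: flat, and Beta(2, 10).
for a, b in [(1, 1), (2, 10)]:
    prob_above = 1.0 - beta_cdf_int(0.10, a, b)
    print(f"Beta({a},{b}): mean = {beta_mean(a, b):.3f}, "
          f"Pr(p >= 0.10) = {prob_above:.3f}")
```

For non-integer shape parameters the same summaries would need a numerical incomplete beta routine instead of this identity.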
where π(p) is the prior distribution of p and f(y|p) is the sampling distribution of Y given the parameter, p. As a result, the Bayes OC curves are less straightforward to calculate than the classical OC curves, as some integration over the unknown parameter is required. In some cases, these integrals can be analytically intractable, in which case one would have to appeal to numerical methods to evaluate the expressions. In any single application we ultimately take only a single action, but we need to consider both Bayes OC curves. Note that in the case of malnutrition, if Y ≥ d, this indicates a high burden of malnutrition. Therefore, the use of the word Pass is not instinctual. However, we might think of Pass as "qualifying for humanitarian aid" to facilitate the interpretation. We continue with this notation to provide a unified framework.
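With a Beta prior and binomial sampling, the integration over p can be done in closed form, so the two Bayes OC quantities are inexpensive to evaluate. The sketch below assumes they take the form Pr(p ≥ p* | Y ≥ d) and Pr(p < p* | Y < d) with a single threshold p*; that form, the restriction to integer a and b, and the function names are assumptions made for illustration rather than a reproduction of the paper's displayed equations.

```python
from math import comb, exp, lgamma


def log_beta(u, v):
    """log B(u, v) via log-gamma, for numerical stability."""
    return lgamma(u) + lgamma(v) - lgamma(u + v)


def prior_predictive_pmf(y, n, a, b):
    """Pr(Y = y): Binomial(n, p) averaged over p ~ Beta(a, b) (beta-binomial)."""
    log_choose = lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
    return exp(log_choose + log_beta(a + y, b + n - y) - log_beta(a, b))


def posterior_prob_at_least(p_star, y, n, a, b):
    """Pr(p >= p_star | Y = y); the posterior is Beta(a + y, b + n - y).

    Integer a and b only: I_x(A, B) = Pr(Bin(A + B - 1, x) >= A).
    """
    big_a, m = a + y, a + b + n - 1
    cdf = sum(comb(m, j) * p_star**j * (1 - p_star) ** (m - j)
              for j in range(big_a, m + 1))
    return min(1.0, max(0.0, 1.0 - cdf))  # clamp floating-point round-off


def bayes_oc(n, d, p_star, a, b):
    """Return (Pr(p >= p_star | Y >= d), Pr(p < p_star | Y < d)), 1 <= d <= n."""
    pmf = [prior_predictive_pmf(y, n, a, b) for y in range(n + 1)]
    post = [posterior_prob_at_least(p_star, y, n, a, b) for y in range(n + 1)]
    high = sum(pmf[y] * post[y] for y in range(d, n + 1)) / sum(pmf[d:])
    low = sum(pmf[y] * (1.0 - post[y]) for y in range(d)) / sum(pmf[:d])
    return high, low


high, low = bayes_oc(n=200, d=14, p_star=0.10, a=2, b=10)
print(f"Pr(p >= 0.10 | Y >= 14) = {high:.3f}")
print(f"Pr(p <  0.10 | Y <  14) = {low:.3f}")
```

Each conditional probability is a ratio of a joint probability to a prior predictive probability, which is why both the prior and the sampling distribution enter the calculation.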
where p_U = p* and p_L is some lower critical threshold. However, it is also possible to choose p_L = p_U = p*, which might be more appealing to some practitioners. This latter case is discussed at length in the context of Phase II clinical trials by Wang et al. Ultimately, the choice depends on the application and the priorities of the investigators. We discuss this issue further in the next section.
Hence, when we choose d = 14, which corresponds to the classical solution, we achieve reasonable Bayesian properties as well.
Note that when we let p_L = p_U = 0.10, the error at the upper and lower thresholds increases slightly as compared to the case when p_U = 0.10 and p_L = 0.05 for the considered priors. That is, 1 - BOC P < (p = 0.10 | n,d) decreases from essentially one, to just over 0.80, in the case when a = 2 and b = 10, which is still within the design constraints. This is important, however, as it highlights the tradeoff between classification precision and accuracy in these applications. Namely, by classifying as above p_U = p* or below p_L, we are asking for slightly imprecise results, but doing so with high accuracy. However, if we classify as above or below p_U = p*, we are asking for highly precise results, but doing so with less accuracy. We revisit this notion in more depth below.
Maximizing the Figure of Merit
which shows that an optimal design is one that maximizes a weighted average of the Bayes OC curves, where the weights are the marginal probabilities of passing and failing the procedure. The marginal distribution of Y is also referred to as the prior predictive distribution, that is, the predictive distribution of Y given only our prior assumptions. Hence, we weight more heavily the Bayes OC curve which has the greater prior predictive probability of occurring.
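The verbal description above can be turned into a direct search for the decision rule that maximizes the Figure of Merit (FOM). The sketch below assumes the single-threshold case p_L = p_U = p* and writes the FOM as Pr(Y ≥ d) Pr(p ≥ p* | Y ≥ d) + Pr(Y < d) Pr(p < p* | Y < d), which simplifies to a sum of joint probabilities. That form, the Beta prior with integer a and b, and the function names are assumptions for illustration.

```python
from math import comb, exp, lgamma


def log_beta(u, v):
    """log B(u, v) via log-gamma."""
    return lgamma(u) + lgamma(v) - lgamma(u + v)


def prior_predictive_pmf(y, n, a, b):
    """Pr(Y = y) under the beta-binomial prior predictive distribution."""
    log_choose = lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
    return exp(log_choose + log_beta(a + y, b + n - y) - log_beta(a, b))


def posterior_prob_at_least(p_star, y, n, a, b):
    """Pr(p >= p_star | Y = y) for integer a, b (posterior Beta(a+y, b+n-y))."""
    big_a, m = a + y, a + b + n - 1
    cdf = sum(comb(m, j) * p_star**j * (1 - p_star) ** (m - j)
              for j in range(big_a, m + 1))
    return min(1.0, max(0.0, 1.0 - cdf))


def fom_by_rule(n, p_star, a, b):
    """FOM(d) for d = 0..n+1 under the rule 'classify high if Y >= d'."""
    pmf = [prior_predictive_pmf(y, n, a, b) for y in range(n + 1)]
    post = [posterior_prob_at_least(p_star, y, n, a, b) for y in range(n + 1)]
    joint_high = [pmf[y] * post[y] for y in range(n + 1)]         # Pr(Y=y, p>=p*)
    joint_low = [pmf[y] * (1.0 - post[y]) for y in range(n + 1)]  # Pr(Y=y, p<p*)
    return [sum(joint_low[:d]) + sum(joint_high[d:]) for d in range(n + 2)]


def best_rule(n, p_star, a, b):
    """Return (d, FOM(d)) for the rule with the largest FOM."""
    fom = fom_by_rule(n, p_star, a, b)
    d_opt = max(range(len(fom)), key=fom.__getitem__)
    return d_opt, fom[d_opt]


for a, b in [(1, 1), (2, 10)]:
    d_opt, fom = best_rule(200, 0.10, a, b)
    print(f"Beta({a},{b}) prior: optimal d = {d_opt}, FOM = {fom:.3f}")
```

Under this form, raising d by one moves the outcome Y = d from the "high" to the "low" classification, so the maximizing rule is the smallest count at which the posterior probability of p ≥ p* reaches one half.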
When p_L = p_U = 0.10, the maximum FOM decreases, albeit slightly, and the optimal decision rule increases to d = 17 or d = 20, depending on the prior. Therefore, if our prior belief is that the malnutrition prevalence is low, we require a greater number of malnourished children in our sample to be convinced otherwise. But if we believe that the malnutrition prevalence is high, we need fewer malnourished children in our sample to be convinced that the prevalence is indeed high, possibly triggering an earlier intervention. This is a consequence of incorporating prior information into our analysis. In either case, it is important to realize that the optimal design does a good job of classifying areas. That is, in both cases, the maximum FOM is greater than or equal to 90%. Therefore, with a sample of size n = 200, the probability that we correctly classify an area is at least 0.90.
It is interesting to note that when a = 1 and b = 1, the indifference prior, the optimal decision rule is equal to 20, or np*. Both Bilukha and Blanton and Rhoda et al have suggested using d ≈ np* for the decision rule in the classical setting. The use of a flat prior with p_L = p_U = p* gives a Bayesian justification for such a choice, although the use of a flat prior does not ordinarily make sense for this application, as it is uncommon for the prevalence of acute malnutrition to reach as high as 30%, much less 80% or 90%. We discuss this in more depth in the following sections.
Balancing Accuracy and Precision
Hence, in this special case, the optimal design which meets (7) might be chosen as that design for which a weighted average of the producer and consumer risks, weighted according to the prior belief of passing or failing, is greater than or equal to 1 - δ. If α = β, then we have that α = β = δ, further simplifying the parameterization.
The precision demanded of the procedure impacts the accuracy. That is, the choice of p_U and p_L affects the properties of the design. Formally, define the precision as 1 - |p_U - p_L|. When p_U = p_L, the precision is equal to one. But as p_L deviates from p_U, the precision decreases. Indeed, when at their maximal difference, the precision is zero. In our example, p_U = 0.10 and p_L ranges from 0.05 to 0.10, so that the precision ranges from 0.95 to 1.00.
In Figure 4B, we plot the maximum average probability of correct classification of priority locales (or the appropriately scaled FOM) as a function of the precision, fixing p_U = 0.10 and allowing p_L to vary from 0.05 to 0.10. Therefore, a precision of 0.95 corresponds to p_L = 0.05 and p_U = 0.10. When the precision is equal to one, then p_L = p_U = 0.10. Assume that we want a design that achieves an overall accuracy of 0.95 (1 - δ = 0.95). We see that for three of the four considered priors, the maximum FOM is well above 0.95 for all considered precisions, and therefore we should on average correctly classify over 95% of locales with these procedures. However, for the situation when a = 2 and b = 10, which is likely the more realistic prior for this application, the maximum FOM drops below 0.95 as p_L approaches p_U, or the precision approaches one. Hence, it is not always possible to achieve the desired level of accuracy for all precisions, short of increasing the sample size, which illustrates the tradeoff between the two.
In this paper, we describe the basic framework for performing Bayes-LQAS, using as an example an application to acute malnutrition. The benefits of using such a method include the ability to incorporate mild or strong prior beliefs about the underlying distribution, based either on historical data or even expert opinion, and the provision of a principled framework for accumulating data, which can be used in subsequent surveys to inform decision making.
Further, B-LQAS allows for the investigator to make probabilistic statements about the prevalence itself, given the outcome of the classification procedure, which classical LQAS does not. Using the FOM allows for the selection of a design with optimal a priori probabilities of correct classification.
We also see the inherent tradeoff between accuracy and precision. This tradeoff is not unique to the Bayesian framework, of course. Indeed, it is this very tradeoff that motivates the use of upper and lower thresholds to evaluate error in the classical LQAS framework. This is due to the fact that it is impossible to make completely accurate classifications for all values of p, barring an infinite sample or a complete census. An important aspect of this tool which we have not discussed is its potential as a routine tool for monitoring population health. Indeed, the Figure of Merit approach can be easily adapted to incorporate historical or routine data. The above formulation is simple by construction, as we wish only to illustrate the potential of B-LQAS. More complex modeling is required to exploit the full utility of this method for monitoring health programs over time. For use with panel data, or repeated cross-sectional surveys over regular intervals, the extension of the above method needs investigating.
Clearly, the choice of prior distribution is an important element of B-LQAS. One alternative to complete specification of the prior is to let the data influence its shape via empirical Bayes procedures (see , pg. 122-126 for further discussion). Regardless, the prior can have minor or major influence on the chosen design, depending on the situation. In the example we present, the sample size for the survey is relatively large. However, it is not uncommon to use much smaller sample sizes when performing LQAS (e.g. n = 19). In this case, the prior distribution will impact the choice of design more heavily. Most importantly, the prior should accurately reflect prior beliefs and should not be chosen to subvert the classification procedure.
The authors thank for their contributions Joseph Valadez, Megan Deitchler and Bethany Hedt, who commented on an early version of this manuscript. The project described was supported by Award Number R56EB006195 from the National Institute of Biomedical Imaging and Bioengineering and the Office of The Director, National Institutes of Health (OD). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Biomedical Imaging And Bioengineering or the National Institutes of Health.
- Berger J: Statistical Decision Theory and Bayesian Analysis. 2nd edition. Springer; 1985.
- Deitchler M, Deconinck H, Bergeron G: Precision, time, and cost: a comparison of three sampling designs in an emergency setting. Emerg Themes Epidemiol. 2008, 5: 6. 10.1186/1742-7622-5-6
- Deitchler M, Valadez J, Egge K, Fernandez S, Hennigan M: A field test of three LQAS designs to assess the prevalence of acute malnutrition. International Journal of Epidemiology. 2007, 36 (4): 858. 10.1093/ije/dym092
- Bilukha O: "Old" and "new" cluster designs in emergency field surveys: in search of a one-fits-all solution. Emerging Themes in Epidemiology. 2008, 5: 7. 10.1186/1742-7622-5-7
- Bilukha O, Blanton C: Interpreting results of cluster surveys in emergency settings: is the LQAS test the best option?. Emerging Themes in Epidemiology. 2008, 5: 25. 10.1186/1742-7622-5-25
- Robertson SE, Valadez JJ: Global review of health care surveys using lot quality assurance sampling (LQAS), 1984-2004. Social Science & Medicine. 2006, 63 (6): 1648-1660.
- Brush G: A comparison of classical and Bayes producer's risk. Technometrics. 1986, 28: 37-46. 10.2307/1269605
- Sharma K, Bhutani R: A comparison of classical and Bayes risks when the quality varies randomly. Microelectronics and Reliability. 1992, 32: 493-495. 10.1016/0026-2714(92)90479-5
- Fan D: Bayesian acceptance sampling scheme for pass-fail components. Communications in Statistics-Theory and Methods. 1991, 20: 2351-2355. 10.1080/03610929108830637
- Sheng Z, Fan D: Bayes attribute acceptance-sampling plan. IEEE Trans Reliab. 1992, 41: 307-309. 10.1109/24.257799
- Moskowitz H, Tang K: Bayesian variable acceptance-sampling plans: quadratic loss function and step-loss function. Technometrics. 1992, 34: 340-347. 10.2307/1270040
- Fitzgerald M, Martz H, Parker R: Bayesian single-level binomial and exponential reliability demonstration test plans. International Journal of Reliability, Quality and Safety Engineering. 1999, 6: 123-137. 10.1142/S0218539399000139
- Easterling R: On the use of prior distributions in acceptance sampling. Annals of Reliability and Maintainability. 1970, 9: 31-35.
- Lattimore P, Baker J, Matheson L: Monitoring drug use using Bayesian acceptance sampling: the Illinois experiment. Operations Research. 1996, 44: 274-285. 10.1287/opre.44.2.274
- Moskowitz H, Plante R, Tang K: Multistage multiattribute acceptance sampling in serial production systems. IEEE Trans. 1986, 16: 130-137.
- Valadez JJ: Assessing Child Survival Programs in Developing Countries: Testing Lot Quality Assurance Sampling. Harvard University Press; 1991.
- Rhoda D, Fernandez SA, Fitch DJ, Lemeshow S: LQAS: user beware. International Journal of Epidemiology. 2010, 39: 60-68. 10.1093/ije/dyn366
- Casella G, Berger R: Statistical Inference. 2nd edition. Duxbury/Thomson Learning; 2002.
- Centre for Research on the Epidemiology of Disasters (CRED), Universite catholique de Louvain: CE-DAT: The Complex Emergency Database. 2009, Brussels, Belgium, http://www.cedat.org
- Wang YG, Leung DHY, Li M, Tan SB: Bayesian designs with frequentist and Bayesian error rate considerations. Stat Methods Med Res. 2005, 14 (5): 445-456. 10.1191/0962280205sm410oa
- Colosimo BM, del Castillo E: Bayesian Process Monitoring, Control and Optimization. Chapman & Hall/CRC; 2007.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.