Analytic perspective
Early efforts in modeling the incubation period of infectious diseases with an acute course of illness
Emerging Themes in Epidemiology, volume 4, Article number: 2 (2007)
Abstract
The incubation period of infectious diseases, the time from infection with a microorganism to onset of disease, is directly relevant to prevention and control. Since explicit models of the incubation period enhance our understanding of the spread of disease, previous classic studies were revisited, focusing on the modeling methods employed and paying particular attention to relatively unknown historical efforts. The earliest study on the incubation period of pandemic influenza was published in 1919, providing estimates of the incubation period of Spanish flu using the daily incidence on ships departing from several ports in Australia. Although the study explicitly dealt with an unknown time of exposure, the assumed periods of exposure, which had an equal probability of infection, were too long, and thus, likely resulted in slight underestimates of the incubation period.
After the suggestion that the incubation period follows a lognormal distribution, Japanese epidemiologists extended this assumption to estimates of the time of exposure during a point source outbreak. Although the reason why the incubation period of acute infectious diseases tends to reveal a right-skewed distribution has been explored several times, the validity of the lognormal assumption is yet to be fully clarified. At present, various different distributions are assumed, and the lack of validity in assuming a lognormal distribution is particularly apparent in the case of slowly progressing diseases. The present paper indicates that (1) analysis using well-defined short periods of exposure with appropriate statistical methods is critical when the exact time of exposure is unknown, and (2) when assuming a specific distribution for the incubation period, comparisons using different distributions are needed in addition to estimations using different datasets, analyses of the determinants of incubation period, and an understanding of the underlying disease mechanisms.
Background
The incubation period is defined as the time from exposure to onset of disease [1], and when limited to infectious diseases, corresponds to the time from infection with a microorganism to symptom development. According to a rigorous descriptive review [2], historical descriptions of the incubation period can be traced back to the mid-16th century when Girolamo Fracastoro (Fracastorius) (1478–1553), an Italian physician, documented for the first time the incubation period of rabies in 1546 [3]. The incubation period of infectious diseases ranges from a few hours, which is common for toxic food poisoning, to a few decades as seen in the case of tuberculosis, AIDS and variant Creutzfeldt-Jakob disease (vCJD). Since symptom onset reflects pathogen growth and invasion, excretion of toxins and initiation of host-defense mechanisms, the length of the incubation period varies largely according to the replication rate of the pathogen, the mechanism of disease development, the route of infection and other underlying factors.
During the incubation period of acute infectious diseases, which is subsequently followed by a symptomatic period, it should be noted that the infected host can be infectious. Whereas the incubation and symptomatic periods are distinguished by symptom onset, other epidemiologic terms are distinguished by acquisition of infectiousness. That is, the time from infection to acquisition of infectiousness is referred to as the latent period, which is subsequently followed by the infectious period [4]. These two concepts are clearly separated by definition and are not directly related. The incubation period of infectious diseases offers various insights into clinical and public health practices, as well as being important for epidemiologic and ecological studies.
To enhance our understanding of the incubation period distributions of infectious diseases, it is useful to revisit previous efforts and reassess explicit models. In particular, it is of practical importance to reanalyze historical works to clarify the present day implications. This paper discusses relatively unknown historical efforts, paying particular attention to diseases with an acute course of illness. Previous classic works on models of incubation period are discussed, including the earliest method to estimate the incubation period using incomplete data, the earliest attempt to model the distribution, and estimations of the time of exposure during an outbreak with a common harmful influence and a very brief time of exposure (i.e., a point source outbreak).
Analysis
The usefulness of understanding the incubation period
Before entering into details of historical works on the incubation period, the various uses of the incubation period distribution are briefly discussed. Table 1 summarizes a number of common examples, presenting historical as well as recent major uses [1, 5–27]; however, it is worth noting that this list does not cover all utilities in full.
In clinical practice, the incubation period is useful not only for making rough guesses as to the causes and sources of infection of individual cases [5], but also for developing treatment strategies to extend the incubation period (e.g., antiretroviral therapy for HIV infection [1]) and for performing early projection of disease prognosis when the incubation period is clearly associated with clinical severity due to dose-response mechanisms (e.g., diseases caused by exotoxin) [6, 7]. Moreover, during an outbreak of a newly emerged directly transmitted disease, the incubation period distribution permits determination of the length of quarantine required for a potentially exposed individual (i.e., by restricting movement of an exposed individual for a duration sufficiently longer than the incubation period) [10]. Further, if the time lag between acquiring infectiousness and symptom onset appears long (i.e., if the incubation period is relatively long compared to the latent period), it implies that isolation measures (e.g., restriction of movement until the infectious individual loses infectiousness) are likely to be ineffective, complicating disease control [1, 11].
Understanding the incubation period distribution also enables statistical estimation of the time of exposure during a point source outbreak [12] as well as hypothesis testing to determine whether the outbreak has ended [13]; the former is discussed below. The distribution is also useful in statistical approaches of epidemic curve reconstruction and short-term predictions of slowly progressing diseases; the backcalculation method uses the incubation period to estimate HIV prevalence and project the future incidence of AIDS [14, 15]. During the last decade, this method has also been extended to prion diseases such as Bovine Spongiform Encephalopathy (BSE) [16, 17], vCJD [18–22] and Kuru [23]. Although backcalculation is not discussed in this paper, several rigorous reviews have been published with regard to diseases with a long incubation period [15, 17, 22, 28]. This approach has also recently diverged to quantification of the transmission potential of diseases with an acute course of illness [24] and infectiousness relative to disease-age [25]. Moreover, in cases such as the short and long incubation periods of Plasmodium vivax malaria in temperate zones, the incubation period also enhances ecological understanding of adaptation strategies; in temperate zones, clearly separate bimodal peaks with approximate lengths of 2 and 50 weeks are observed [26, 27], helping malaria transmission continue over the winter season when transmission is usually greatly reduced due to seasonal entomologic characteristics.
The earliest model developed using incomplete data
Whereas the incubation period is conveniently extracted from specific data indicating the time of exposure, i.e., experimental inoculation data and case travel histories [2], most infection events are not directly observable for diseases transmitted by nonsexual direct contact. Thus, it is often difficult to determine the incubation period without explicit information on the time of exposure. The majority of epidemiologic data only informs us that exposure (i.e., infection) occurred in a defined period; such data are referred to as interval censored [29]. This is a common concern for acute infectious diseases transmitted by droplets and droplet nuclei and, most notably, was discussed in detail during the outbreak of severe acute respiratory syndrome (SARS) [30, 31]. Previous studies on the population dynamics of influenza tend to make assumptions with regard to the incubation period distribution without employing observed data [32, 33], perhaps mainly due to difficulties in identifying the time of exposure. The incubation period distribution of influenza remained almost unknown until a recent study reanalyzed the data of influenza transmission on an aircraft with a short duration of flight [34, 35]. Assuming a Weibull distribution, this study estimated the mean (and standard deviation (SD)) incubation period as 1.48 (0.47) days [35]. Because the sample size of this estimate was limited (i.e., 37 secondary cases) and no other estimates are currently available, the present paper revisits a historical study on this topic.
The earliest study concerned with estimating the incubation period of influenza was published by Anderson Gray McKendrick (1876–1943) and J. Morison in the Indian Journal of Medical Research in 1919 [36]. Dr. McKendrick, a physician and epidemiologist, applied various mathematical methods to the field of medicine and is a known pioneer in the biomathematics of infectious disease epidemiology [37–39]. Whereas Dr. McKendrick, in collaboration with William Ogilvy Kermack (1898–1970) [40, 41], is relatively well known as the first to suggest the deterministic epidemic model given by differential equations, his analysis of the incubation period of Spanish flu preceded this, and remains relatively unknown even among specialists (see Online Additional File 1 for the original). Except for this work, no other historical study on influenza has explicitly accounted for the unknown time of exposure or identified the time of exposure in a specific setting (as in the above mentioned study documenting transmission on an aircraft [34, 35]).
In Dr. McKendrick's study, an attempt was made to infer the incubation period of pandemic influenza using the daily incidence of cases on ships departing, with incubating individuals, from several ports in Australia. The incidence was recorded according to the date of voyage after departure. The original epidemiologic data was based on observations of 92 departing voyages, summarized by Dr. John Howard Lidgett Cumpston (1880–1954), Director of Quarantine of the Commonwealth of Australia [42] (the original material is available online [43]). In this dataset, onset of 64, 17, 5 and 2 cases was observed on the 1st, 2nd, 3rd and 4th day of voyage, i.e., after departure, on the documented ships (Figure 1). No influenza case developed symptoms on or after the 5th day of voyage and the observed cases were thought to have been exposed to influenza before departure. Since the data for each voyage mainly included only a few initial cases that developed influenza on board, it is assumed that potential secondary transmission on board was negligible, and potential asymptomatic transmission was also ignored (detailed information on the observed and excluded secondary transmissions are documented in the original [43] and Dr. McKendrick also addressed the issue of secondary transmission by limiting the number of cases per ship). Further technical details are given in the Additional File 2.
Using the data in Figure 1 (with a total of N cases), the number of cases, G(t), t days after departure was modeled as:
where f(t) and F(t) are the probability density and cumulative distribution functions of an incubation period of length t (see Additional File 2). From this, Dr. McKendrick suggested that the mean incubation period was 32.71 hours, which is consistent with a recent estimate [35]. However, this value was likely slightly underestimated, because the model implicitly assumed the possible time of exposure as ranging from time 0 to infinity before departure; it has been extensively discussed that data assuming long possible periods of exposure is likely to be uninformative. Moreover, in a recent work on SARS concerned with analysis of data with short periods of exposure [44], the equal probability of exposure for each possible date was likely to have overestimated the variance of the incubation period distribution [31]. Thus, to obtain a precise estimate of the incubation period, appropriate censoring methods with well-defined short periods of exposure are needed in addition to a large sample size [30, 45]. However, despite these technical concerns, it is remarkable that Dr. McKendrick was able to estimate the incubation period of pandemic influenza considering the unknown time of exposure in the given data.
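The role of the exposure window can be illustrated with a minimal Python sketch. It is not Dr. McKendrick's model: it assumes a lognormal incubation period and an equal probability of exposure at every time within a known window, and the function names and example values are hypothetical. Under these assumptions, the likelihood contribution of one case is the density averaged over the window, which reduces to a difference of two values of the cumulative distribution function.

```python
import math
from statistics import NormalDist

def lognormal_cdf(t, mu, sigma):
    """Cumulative distribution F(t) of a lognormal incubation period."""
    if t <= 0:
        return 0.0
    return NormalDist().cdf((math.log(t) - mu) / sigma)

def interval_censored_likelihood(onset, exp_start, exp_end, mu, sigma):
    """Likelihood of symptom onset at time `onset` when exposure is assumed
    to be uniformly distributed over [exp_start, exp_end]."""
    width = exp_end - exp_start
    if width <= 0:
        raise ValueError("exposure window must have positive width")
    longest = onset - exp_start   # incubation period if exposed at window start
    shortest = onset - exp_end    # incubation period if exposed at window end
    return (lognormal_cdf(longest, mu, sigma)
            - lognormal_cdf(shortest, mu, sigma)) / width

# Hypothetical case: onset on day 2 of a voyage, exposure during the last
# day before departure (day -1 to day 0); mu and sigma are illustrative.
contribution = interval_censored_likelihood(2.0, -1.0, 0.0, mu=0.5, sigma=0.4)
```

Summing the logarithm of such contributions over all cases gives a log-likelihood that can be maximized over mu and sigma; the shorter the window, the closer each contribution is to the density at a single, well-identified incubation period.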
Classic right-skewed distribution
After Dr. McKendrick's initial work and his use of implicit assumptions to determine the incubation period distribution, John R. Miner (1892–unknown), a biologist and epidemiologist at Johns Hopkins University, is believed to have documented the first explicit model of the incubation period distribution [46]. Dr. Miner collected epidemic records of several outbreaks of typhoid fever, claiming that the length of the incubation period clearly differs by source of infection (i.e., comparing water- and foodborne outbreaks, he found that the foodborne outbreaks had a much shorter incubation period, most likely reflecting dose-response phenomena). During his analysis, Dr. Miner paid close attention to variance in the incubation period, describing a distribution that always skewed to the right. In calculating "moments" of the incubation period in a waterborne outbreak at the Old Salem Chautauqua, 1916, he used the following equation to explain the epidemic curve:
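where y and x are the expected number of cases and time after exposure, respectively (Figure 2). The general form of eqn. (2) is referred to as Pearson's type I distribution, which is given by [47]:

$$y = y_{0}\left(1 + \frac{x}{a_{1}}\right)^{m_{1}}\left(1 - \frac{x}{a_{2}}\right)^{m_{2}} \tag{3}$$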
where $m_{1}/a_{1} = m_{2}/a_{2}$. During the early 20th century, it was deemed useful to apply Karl Pearson's (1857–1936) "system of frequency curves" to observed data, because the parameters could be arithmetically obtained from moments determined by the descriptive statistics; a "moment" refers to the expected value of a positive integral power of a random variable (i.e., the nth moment of a distribution is the expected value of the nth power of the deviation from a fixed value). Among Pearson's curves, the type I distribution is the most standard and relatively flexible, and can represent a right-skewed distribution [47]. Although no other works concerned with models of the incubation period have been identified, Major Greenwood (1880–1949) applied Pearson's type III distribution to the distribution of the serial interval (i.e., the time from symptom onset in a primary case to symptom onset in a secondary case [48]) of measles within a number of households [49].
Lognormal distribution proposed by Philip Sartwell
The epidemiologist Philip E. Sartwell (1908–1999), who previously acted as chairman of the Department of Epidemiology, Johns Hopkins School of Hygiene and Public Health, contributed most to the foundation of incubation period distribution modeling [50]. Dr. Sartwell initially found that the incubation period of acute infectious diseases tends to follow a lognormal distribution [12], and applied the distribution to various diseases [51, 52]. Observing that the distributions often skewed to the right, Dr. Sartwell suggested the use of two parameters (i.e., an estimated "median", which is also the geometric mean due to the characteristics of the lognormal distribution, and a "dispersion factor" as a measure of variability) rather than the sample mean and standard deviation. The lognormal distribution has a probability density function (pdf) of:

$$f(x; \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^{2}}{2\sigma^{2}}\right) \tag{4}$$

for x > 0, where μ and σ are the mean and standard deviation of the variable's logarithm [53]. The coefficient of variation (CV), a dimensionless number, is a measure of dispersion of the distribution given by:

$$\mathrm{CV} = \sqrt{\exp(\sigma^{2}) - 1} \tag{5}$$
Figure 3 shows the frequency distributions of the incubation periods of measles and poliomyelitis based on careful observations of the times of exposure and onset [54, 55] (the maximum likelihood method was used to obtain Fig. 3 and will be discussed later). Both incubation periods were reasonably described by lognormal distributions, yielding maximum likelihood estimates of μ and CV of 2.47 log(days) and 28.0% for measles and 2.37 log(days) and 47.4% for poliomyelitis. The goodness-of-fit to the lognormal distribution was then visually assessed by drawing lognormal quantile plots (Figs. 3B and 3D).
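As a worked illustration of how these parameters relate to Dr. Sartwell's summary measures, the following Python sketch converts a pair (μ, CV) into σ, the median and the dispersion factor. It relies on the standard lognormal relations CV = sqrt(exp(σ²) − 1), median = exp(μ) and dispersion factor = exp(σ); the function name is hypothetical and the inputs are the estimates quoted above.

```python
import math

def lognormal_summary(mu, cv):
    """Convert the log-scale mean and the coefficient of variation of a
    lognormal distribution into other commonly reported summaries."""
    sigma = math.sqrt(math.log(1.0 + cv ** 2))   # invert CV = sqrt(exp(sigma^2) - 1)
    return {
        "sigma": sigma,                           # SD of log incubation period
        "median": math.exp(mu),                   # also the geometric mean
        "dispersion_factor": math.exp(sigma),     # antilog of sigma
        "mean": math.exp(mu + sigma ** 2 / 2.0),  # arithmetic mean
    }

measles = lognormal_summary(2.47, 0.280)
polio = lognormal_summary(2.37, 0.474)
print(round(measles["median"], 1), round(polio["median"], 1))
```

Under these relations the quoted estimates correspond to median incubation periods of roughly 11.8 days for measles and 10.7 days for poliomyelitis.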
Even at present, it is frequently assumed that the incubation period of acute infectious diseases follows lognormal distribution [25, 56, 57]. Using the lognormal assumption for incubation period, Dr. Sartwell further developed a method to estimate the time of exposure during a point source outbreak [52]. Since the contribution of Dr. Sartwell has been revisited several times elsewhere [2, 58] and is relatively well known among experts in this field, similar and directly relevant models proposed by Japanese epidemiologists are discussed in the following.
Lognormal models proposed by Japanese epidemiologists
Dr. Sartwell's suggestion on the tendency for the incubation period to follow lognormal distribution largely influenced early theoretical epidemiologic studies in Japan, especially those related to estimations of the time of exposure during a point source outbreak. The earliest Japanese work appeared immediately after Dr. Sartwell's first publication and was conducted by Takeshi Hirayama (1923–1995), an epidemiologist who, later in life, worked mainly on the epidemiology of various cancers [59, 60]. The theoretical basis of Dr. Hirayama's method is illustrated in Figure 4, the logic of which is explained in the following.
Since all cases in a point source outbreak share the same time of exposure, the epidemic curve, which is drawn according to the time of onset (i.e., incidence), is equivalent to the incubation period distribution (Figure 4). Suppose that the median point of the case frequency was observed x days after exposure and, further, that α percentile points are taken on both sides of the observed distribution (upper and lower percentiles α), with the distances from the median to the two percentile points being a and b days, respectively. The following relationship is then given (because the logarithm follows a normal distribution):
This is rearranged as:
Consequently, the time of exposure can be inferred using the distance from the time of exposure to the median, x, by taking the distances to any equal percentiles on both sides:
This estimator is theoretically the same as that suggested by Dr. Sartwell in his later work [52]. Although this model can theoretically assume any α (for 0 < α < 0.5), Dr. Hirayama implicitly suggested the use of α = 0.16 to obtain a precise estimate of x and a small SD, but this suggestion was made based on observational experience alone and an analytical expression for the SD was unfortunately lacking. Since recall bias is unavoidable in retrospective epidemiologic studies of food poisoning requiring huge efforts of food traceback [61], this method appears to be very useful in determining the most plausible time of exposure and narrowing down the amount of information to be traced. A similar method has been applied to the epidemiology of cancer and other chronic diseases [62, 63].
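The logic of this percentile-based estimator can be sketched in Python. The labeling used here is an assumption made for illustration: a is taken as the distance from the median to the lower percentile point and b as the distance to the upper one. If the logarithm of the incubation period is normally distributed, the two percentile points are equidistant from the median on the log scale, ln x − ln(x − a) = ln(x + b) − ln x, so that x² = (x − a)(x + b) and x = ab/(b − a). The function name and the example dates are hypothetical.

```python
def estimate_exposure_time(lower_onset, median_onset, upper_onset):
    """Estimate the time of exposure of a point source outbreak from three
    points of the epidemic curve: the lower alpha percentile, the median,
    and the upper alpha percentile of the times of onset (calendar time)."""
    a = median_onset - lower_onset   # distance from median to lower percentile
    b = upper_onset - median_onset   # distance from median to upper percentile
    if a <= 0 or b <= a:
        raise ValueError("the epidemic curve must be right-skewed (b > a > 0)")
    x = a * b / (b - a)              # distance from exposure to the median
    return median_onset - x

# Hypothetical outbreak: lower percentile on day 12, median on day 15,
# upper percentile on day 19.5 of the calendar month.
print(estimate_exposure_time(12.0, 15.0, 19.5))  # 6.0
```

In this example the estimated incubation periods at the three points are 6, 9 and 13.5 days, each 1.5 times the previous one, as expected under a lognormal assumption. When b is only slightly larger than a, the denominator is small and the estimate becomes unstable, which is in line with the sensitivity to the choice of α discussed later in the text.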
Another lognormal assumption was made by a research group on Theoretical Epidemiology at Osaka City University Medical School, mainly and initially led by Kazuya Horiuchi and Hiroshi Sugiyama [64, 65]. The methodology has been frequently applied to field data in Japan [66, 67] and is relatively well known among Japanese public health workers [68–70]. Dr. Horiuchi examined the validity and precision of Dr. Hirayama's method using Monte Carlo simulations, claiming that the method could be improved further [71] and suggesting that the incubation period should be assumed to follow a "noncentral" lognormal distribution when an epidemic curve is used [64]. That is, although Dr. Hirayama used the distance from the time of exposure to the median (x in eqn. (6)), this is unknown information in field observations, and thus, Dr. Horiuchi and his colleagues suggested the use of x − C, where C is the time of exposure. This permitted the more convenient use of calendar time. For example, if X is a random variable following a noncentral lognormal distribution, ln(X − C) should follow a normal distribution, $N(\mu, \sigma^{2})$, and consequently, the following t becomes a linear function of ln(X − C):
When we assume that the random variable X is a function of t, eqn.(9) can be rearranged as:
Further, considering different values of t, e.g., t+h, yields:
Using eqns. (10) and (11), an estimate of C was obtained by graphically plotting these two functions on vertical and horizontal axes, respectively, and then finding the intersection. Estimation of the time of exposure using similar assumptions was extensively discussed in Japan during the 1950s and 60s. These discussions included the following: (i) the definition of the incubation period (e.g., which to use as the time of onset during a foodborne outbreak, the onset of diarrhea or fever? [72, 73]), (ii) extension of the estimation method when the data is truncated [68], (iii) the influence of host- and pathogen-related factors and routes of infection on the incubation period [74], and (iv) outbreaks that include cases resulting from human-to-human secondary transmissions (e.g., shigellosis [75]).
More modern studies employing lognormal distribution
Although the studies described above have offered useful and practical methods based on an understanding of the characteristics of the lognormal distribution, the classic methods likely included sampling errors and did not achieve acceptable precision. Indeed, it has been pointed out that the estimates obtained using the methods of Drs. Sartwell and Hirayama largely depend on optional percentile points, α [76], while that proposed by Dr. Horiuchi and his colleagues is also thought to be highly sensitive to an optional value, h [77]. Thus, estimates of the time of exposure should be addressed statistically by precise solution of the three-parameter lognormal distribution [78, 79]. Accordingly, the maximum likelihood method was proposed [77, 80, 81]. Although Dr. Hill was the first to document the application of the maximum likelihood method [80], it unfortunately remained relatively unknown, especially among Japanese epidemiologists, until Toshiro Tango, a statistician at the National Institute of Public Health, Japan, attempted to propagate the method and propose reasonable estimators during the 1990s [77, 81]. Letting γ be the time of exposure, the pdf of the three-parameter lognormal distribution is given by:

$$f(x; \gamma, \mu, \sigma) = \frac{1}{(x-\gamma)\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln(x-\gamma) - \mu)^{2}}{2\sigma^{2}}\right) \tag{12}$$

for x > γ. Other parameters are as in eqn. (4). The likelihood function is given by the product of the pdf over the observed times of onset, $x_{i}$:

$$L(\gamma, \mu, \sigma) = \prod_{i=1}^{n} f(x_{i}; \gamma, \mu, \sigma) \tag{13}$$

where n is the total number of cases observed in an outbreak. Although maximum likelihood estimates of γ, μ and σ are obtained by minimizing the negative logarithm of eqn. (13), it is often the case that the iteration does not converge [82], and thus, Dr. Tango proposed his estimators [77, 81]. Assuming that γ is known, maximum likelihood estimators of μ and σ are given by:

$$\hat{\mu}(\gamma) = \frac{1}{n}\sum_{i=1}^{n} \ln(x_{i}-\gamma) \tag{14}$$

$$\hat{\sigma}^{2}(\gamma) = \frac{1}{n}\sum_{i=1}^{n} \left(\ln(x_{i}-\gamma) - \hat{\mu}(\gamma)\right)^{2} \tag{15}$$

Using these, the maximum log likelihood is given as a function of γ:

$$\ln L^{*}(\gamma) = -n\hat{\mu}(\gamma) - n\ln\hat{\sigma}(\gamma) - \frac{n}{2}\left(\ln 2\pi + 1\right) \tag{16}$$

which is the profile likelihood of γ; the estimate of γ maximizes eqn. (16). A Bayesian method was also proposed by Dr. Hill, in addition to the maximum likelihood method [80].
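The profile likelihood approach can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Dr. Tango's estimator: for each candidate time of exposure γ, μ and σ are replaced by the sample mean and (maximum likelihood) standard deviation of ln(x − γ), and γ is then chosen by a simple grid search. The function names, the search range (`earliest`), the safety margin and the grid size are hypothetical choices. The margin is needed because the likelihood of the three-parameter lognormal distribution is known to grow without bound as γ approaches the earliest time of onset, so the search looks for an interior maximum only.

```python
import math

def profile_loglik(gamma, onsets):
    """Profile log-likelihood of the time of exposure gamma, with mu and
    sigma replaced by their closed-form estimators for that gamma."""
    if gamma >= min(onsets):
        return float("-inf")            # every onset must follow exposure
    logs = [math.log(x - gamma) for x in onsets]
    n = len(logs)
    mu_hat = sum(logs) / n
    var_hat = sum((v - mu_hat) ** 2 for v in logs) / n
    if var_hat <= 0.0:
        return float("-inf")            # degenerate sample (identical onsets)
    return (-n * mu_hat - 0.5 * n * math.log(var_hat)
            - 0.5 * n * (math.log(2.0 * math.pi) + 1.0))

def estimate_exposure(onsets, earliest, margin=0.05, steps=2000):
    """Grid search over [earliest, min(onsets) - margin] for the gamma with
    the highest profile log-likelihood. Returns (gamma_hat, log-likelihood)."""
    latest = min(onsets) - margin
    if latest <= earliest:
        raise ValueError("search range is empty")
    best_gamma, best_ll = earliest, profile_loglik(earliest, onsets)
    for k in range(1, steps + 1):
        g = earliest + (latest - earliest) * k / steps
        ll = profile_loglik(g, onsets)
        if ll > best_ll:
            best_gamma, best_ll = g, ll
    return best_gamma, best_ll
```

Given a list of onset times in calendar days, `estimate_exposure(onsets, earliest=0.0)` returns the grid point with the highest profile log-likelihood; plotting `profile_loglik` over the grid is a useful check that the reported maximum is an interior peak rather than an artifact of the boundary.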
The validity of a lognormal assumption
Despite rigorous studies, it should be noted that we have limited explicit explanations for the biological validity of assuming a lognormal distribution for the incubation period. The fundamental biological reason to assume a lognormal distribution is related to an inoculation study of ectromelia virus (mouse pox) [83], which suggested exponential growth of pathogens within the host during the initial phase. Another similar study suggested that a fixed threshold likely exists when the host response is observed [84]. Based on these findings, pathogen growth in inoculation experiments was modeled using the birth-death process, supporting a right-skewed distribution of the incubation period and its long tail [85–87]. Also, given similar results from further birth-death process models [76, 88] and another previous model [89], what we have learnt to date can be described as follows: if the growth rate of a microorganism is implicitly assumed to follow a normal distribution, and if there is a fixed threshold of pathogen load at which symptoms are revealed due to the host response, exponential growth of microorganisms should result in an incubation period sufficiently approximated by a lognormal distribution.
Given the above reasonable explanations, a previous Japanese study examined 86 outbreak records for which the date of infection was known and the population exposed was homogeneous [90]. By assessing the goodness-of-fit, 61 out of the 86 examples (70.9%) were accepted as lognormal at a 5% level of significance or better, from which it was concluded that, in general, the lognormal distribution represents the incubation period of acute infectious diseases [90, 91]. Through such efforts, the validity of the lognormal assumption has been supported by the accumulated experience of Dr. Sartwell and the above mentioned Japanese epidemiologists. It may also be true that the lognormal distribution was preferred because of its statistical usefulness (as described in the above Japanese study). However, the host-defense mechanism, which is almost entirely responsible for symptom onset, was later shown to be far more complex than previously expected. For example, fever is induced by very complex reactions and by several factors including circulating cytokines such as interleukin-2 [92]. Thus, whereas the lognormal distribution may be applied to the incubation periods of many acute infectious diseases, it is necessary to bear in mind that the assumption is supported only by previous experience.
A further critique of the lognormal assumption
Until recently, the validity of assuming a lognormal distribution had not been explicitly compared with that of other distributions. As discussed above, a Weibull distribution with a threshold (i.e., a three-parameter Weibull distribution) was assumed for the incubation period of influenza [35]. Such a study indicates that a simple lognormal assumption does not always hold in practice. Other studies have assumed a gamma distribution for the incubation periods of SARS and smallpox [24, 30, 93–95], and regarding the latter, a lognormal distribution has also been assumed [25, 96]. Figure 5 compares the quantile plots of lognormal and gamma distributions for the incubation period of smallpox, showing that both fit almost equally well with the observed data. For both distributions, the $\chi^{2}$ goodness-of-fit test revealed no significant deviation between the observed and predicted values ($\chi^{2}_{12}$ = 11.6, p = 0.48 and $\chi^{2}_{12}$ = 16.8, p = 0.16 for lognormal and gamma distributions, respectively). However, the two-parameter Weibull distribution did not represent well the probability density function of the incubation period (p < 0.001). These discussions imply that comparisons using different distributions are needed; it is important to at least compare the goodness-of-fit of different and arbitrarily chosen distributions for acute infectious diseases.
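One simple way to set up such a comparison is sketched below in Python. It is not the χ² test reported above: as a lighter stand-in, it fits a lognormal and a gamma distribution to the same incubation periods and compares their log-likelihoods, which are directly comparable because both distributions have two parameters. To keep the sketch within the standard library, the gamma parameters are obtained by moment matching rather than by maximum likelihood, which is an assumption that slightly disadvantages the gamma fit. All function names are hypothetical.

```python
import math

def lognormal_loglik(data, mu, sigma):
    """Log-likelihood of a two-parameter lognormal distribution."""
    return sum(-math.log(x * sigma * math.sqrt(2.0 * math.pi))
               - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2) for x in data)

def gamma_loglik(data, shape, scale):
    """Log-likelihood of a gamma distribution (shape-scale form)."""
    return sum((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale) for x in data)

def fit_lognormal(data):
    """Closed-form maximum likelihood estimates of mu and sigma."""
    logs = [math.log(x) for x in data]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma

def fit_gamma_moments(data):
    """Moment-matching estimates of the gamma shape and scale."""
    mean = sum(data) / len(data)
    var = sum((x - mean) ** 2 for x in data) / len(data)
    return mean ** 2 / var, var / mean

def compare_fits(data):
    """Log-likelihood of each fitted distribution; higher means a better fit."""
    mu, sigma = fit_lognormal(data)
    shape, scale = fit_gamma_moments(data)
    return {"lognormal": lognormal_loglik(data, mu, sigma),
            "gamma": gamma_loglik(data, shape, scale)}
```

A small difference between the two log-likelihoods, as would be expected for data such as those in Figure 5, indicates that the observations alone do not discriminate between the candidate distributions.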
The validity of the lognormal assumption is particularly lacking for slowly progressing diseases. One important reason for this is that the mechanisms of disease development for AIDS and prion diseases, for example, are far more complicated than those of acute infectious diseases. In the case of AIDS, where a Weibull distribution is frequently assumed for the incubation period [15], symptom onset is induced by immunodeficiency resulting from HIV infection followed by various opportunistic infections. For BSE and vCJD, various distributions have been assumed for the backcalculation, permitting some uncertainty analyses [19, 20, 22, 97]. Although the disease mechanisms of vCJD are yet to be clarified, considering within-host dynamics it is evident that the incubation period cannot be explained by the above simple explanation [98, 99]. That is, for these diseases, the above mentioned explanation for the lognormal assumption is not justified, and thus, the choice of distribution for the incubation period needs to be carefully assessed using sensitivity and uncertainty analyses. Indeed, various right-skewed distributions are often used in sensitivity analysis, revealing whether or not the final results depend on the arbitrarily chosen standard distribution for the incubation period [19, 20, 97].
Two conclusions can be drawn from the above discussions. First, the lognormal assumption does not always hold. Thus, as long as we continue to rely on observed frequencies and arbitrarily chosen specific distributions, it is essential that comparisons using different distributions are made; any assumptions should be explicitly evaluated by means of significance tests and visual assessments. Second, the biological validity of assuming specific distributions for the incubation period remains an open question [100, 101], and thus, further information is needed. For example, within-host dynamics would help clarify disease onset mechanisms in the most explicit way [102]. Moreover, if information associated with within-host dynamics is not available, an accumulation of distributions obtained using different datasets would be of interest, as would examination of various characteristic factors (e.g., dose-response mechanisms [6, 7, 9], and variable susceptibility due to age, race and genetic factors (for example, see [45, 94])).
Conclusion
The present study revisited previous works concerned with models of the incubation period of acute infectious diseases. In particular, the following were highlighted: (i) the earliest modeling effort conducted using incomplete data of a pandemic influenza, (ii) the explicit distribution of the incubation period, (iii) the application of a lognormal assumption to estimations of the time of exposure during a point source outbreak, and (iv) the validity of assuming lognormal distribution for the incubation period. Although it was not highlighted in the present paper, Norman T. J. Bailey also formed a framework using a chain binomial model, which is useful for household transmission data [103, 104]. This method estimates the incubation period as the sum of the mean latent period, which follows normal distribution, and a further fixed infectious period; however, the estimated period does not precisely imply the incubation period, but rather is closer to what is presently referred to as the serial interval [48, 105]. That is, the incubation period that can be extracted from household transmission data remains to be clarified.
The lessons that can be learnt from the presented discussion are as follows: (I) although it is historically remarkable that the incubation period of pandemic influenza was assessed based on an explicit understanding of an unknown time of exposure, the assumed periods of exposure were too long and equal probability of exposure was assumed for each possible date. Well-defined short periods of exposure are needed to decipher the incubation period distribution using appropriate statistical methods. Taking this point into account will be critically important in estimating the incubation period of newly emerging diseases in the future. (II) The epidemiologic usefulness of the lognormal assumption was highlighted with respect to the basic characteristics of the lognormal distribution, but this assumption is likely to remain unwarranted until details of disease mechanisms are fully clarified; thus, this assumption may be merely an approximation of the right-skewed distribution. For example, considering the mechanisms of disease development, the lognormal assumption does not hold for HIV/AIDS and prion diseases. However, this limitation of the lognormal assumption does not imply that such approximation of the incubation period distribution is meaningless. Rather, it suggests that when parametric models are assumed, it is at least necessary to compare the goodness-of-fit for several distributions in order to overcome some of the uncertainty. Various datasets on the same disease would also help assess the uncertainty. Further, it would be informative if the determinants could be clarified even by simple stratifications (e.g., with respect to sex, age and genetic factors). Ideally, assumptions in the future should be supported by a detailed understanding of the underlying disease mechanisms provided by observations of within-host dynamics.
Since the incubation period of infectious diseases is directly relevant to prevention and control, and because such knowledge can enhance our theoretical understanding of the spread of disease, further clarifications of the above points are deemed necessary.
Abbreviations
AIDS: Acquired Immunodeficiency Syndrome
CV: Coefficient of variation
HIV: Human Immunodeficiency Virus
pdf: Probability density function
SARS: Severe Acute Respiratory Syndrome
vCJD: variant Creutzfeldt-Jakob disease
References
 1.
Brookmeyer R: Incubation period of infectious diseases. In Encyclopedia of Biostatistics. Edited by: Armitage P, Colton T. New York: Wiley; 1998:2011-2016.
 2.
Armenian HK, Lilienfeld AM: Incubation period of disease. Epidemiol Rev. 1983, 5: 1-15.
 3.
Fracastorii H: De sympathia et antipathia rervm liber vnvs. De contagione et contagiosis morbis et eorum curatione, Libri III. Venetiis: apud heredes Lucaeantonij Iuntae Florentini; 1546 (in Latin. Translation and notes by Wright WC: On Contagion, Contagious Diseases and Their Cure) New York: GP Putnam and Sons; 1930.
 4.
Anderson RM, May RM: Infectious Diseases of Humans: Dynamics and Control. Oxford: Oxford University Press; 1991.
 5.
Mandell GL, Bennett JE, Dolin R: Principles and Practice of Infectious Diseases. 6th edition. Philadelphia, PA: Elsevier Churchill Livingstone; 2004.
 6.
Tateno I: Incubation period and the initial symptoms of tetanus: A clinical assessment of the problem of the passage of tetanus toxin to the central nervous system. Jpn J Exp Med. 1963, 33: 149-158.
 7.
Nishiura H: Incubation period as a clinical predictor of botulism: analysis of previous izushi-borne outbreaks in Hokkaido, Japan, from 1951 to 1965. Epidemiol Infect. 2007, 135: 126-130. 10.1017/S0950268806006169
 8.
Glynn JR, Bradley DJ: The relationship between infecting dose and severity of disease in reported outbreaks of Salmonella infections. Epidemiol Infect. 1992, 109: 371-388.
 9.
Nishiura H, Halstead SB: Natural history of dengue virus (DENV)-1 and DENV-4 infections: reanalysis of classic studies. J Infect Dis. 2007, 195: 1007-1013. 10.1086/511825
 10.
Farewell VT, Herzberg AM, James KW, Ho LM, Leung GM: SARS incubation and quarantine times: when is an exposed individual known to be disease free? Stat Med. 2005, 24: 3431-3445. 10.1002/sim.2206
 11.
Fraser C, Riley S, Anderson RM, Ferguson NM: Factors that make an infectious disease outbreak controllable. Proc Natl Acad Sci USA. 2004, 101: 6146-6151. 10.1073/pnas.0307506101
 12.
Sartwell PE: The distribution of incubation periods of infectious diseases. Am J Hyg. 1950, 51: 310-318. (reprinted in Am J Epidemiol 1995, 141: 386–394)
 13.
Brookmeyer R, You X: A hypothesis test for the end of a common source outbreak. Biometrics. 2006, 62: 61-65. 10.1111/j.1541-0420.2005.00421.x
 14.
Brookmeyer R, Gail MH: A method for obtaining short-term projections and lower bounds on the size of the AIDS epidemic. J Am Stat Assoc. 1988, 83: 301-308. 10.2307/2288844
 15.
Brookmeyer R, Gail MH: AIDS Epidemiology: A Quantitative Approach (Monographs in Epidemiology and Biostatistics). New York: Oxford University Press; 1994.
 16.
Anderson RM, Donnelly CA, Ferguson NM, Woolhouse ME, Watt CJ, Udy HJ, MaWhinney S, Dunstan SP, Southwood TR, Wilesmith JW, Ryan JB, Hoinville LJ, Hillerton JE, Austin AR, Wells GA: Transmission dynamics and epidemiology of BSE in British cattle. Nature. 1996, 382: 779-788. 10.1038/382779a0
 17.
Donnelly CA, Ferguson NM, Ghani AC, Anderson RM: Extending backcalculation to analyse BSE data. Stat Methods Med Res. 2003, 12: 177-190. 10.1191/0962280203sm337ra
 18.
Cousens SN, Vynnycky E, Zeidler M, Will RG, Smith PG: Predicting the CJD epidemic in humans. Nature. 1997, 385: 197-198. 10.1038/385197a0
 19.
Valleron AJ, Boelle PY, Will R, Cesbron JY: Estimation of epidemic size and incubation time based on age characteristics of vCJD in the United Kingdom. Science. 2001, 294: 1726-1728. 10.1126/science.1066838
 20.
Huillard d'Aignaux JN, Cousens SN, Smith PG: The predictability of the epidemic of variant Creutzfeldt-Jakob disease by backcalculation methods. Stat Methods Med Res. 2003, 12: 203-220. 10.1191/0962280203sm328ra
 21.
Ghani AC, Ferguson NM, Donnelly CA, Anderson RM: Predicted vCJD mortality in Great Britain. Nature. 2000, 406: 583-584. 10.1038/35020688
 22.
Ghani AC, Ferguson NM, Donnelly CA, Anderson RM: Short-term projections for variant Creutzfeldt-Jakob disease onsets. Stat Methods Med Res. 2003, 12: 191-201. 10.1191/0962280203sm327ra
 23.
Huillard d'Aignaux JN, Cousens SN, Maccario J, Costagliola D, Alpers MP, Smith PG, Alperovitch A: The incubation period of kuru. Epidemiology. 2002, 13: 402-408. 10.1097/00001648-200207000-00007
 24.
Eichner M, Dietz K: Transmission potential of smallpox: estimates based on detailed data from an outbreak. Am J Epidemiol. 2003, 158: 110-117. 10.1093/aje/kwg103
 25.
Nishiura H, Eichner M: Infectiousness of smallpox relative to disease age: Estimates based on transmission network and incubation period. Epidemiol Infect. 2007, in press. (doi: 10.1017/S0950268806007618)
 26.
Nishiura H, Lee HW, Cho SH, Lee WG, In TS, Moon SU, Chung GT, Kim TS: Estimates of short and long incubation periods of Plasmodium vivax malaria in the Republic of Korea. Trans R Soc Trop Med Hyg. 2007, 101: 338-343. 10.1016/j.trstmh.2006.11.002
 27.
Tiburskaja NA, Sergiev PG, Vrublevskaja OS: Dates of onset of relapses and the duration of infection in induced tertian malaria with short and long incubation periods. Bull World Health Organ. 1968, 38: 447-457.
 28.
Brookmeyer R: AIDS, epidemics, and statistics. Biometrics. 1996, 52: 781-796. 10.2307/2533042
 29.
Little RJA, Rubin DB: Statistical analysis with missing data. 2nd edition. New York: John Wiley and Sons; 2002.
 30.
Donnelly CA, Ghani AC, Leung GM, Hedley AJ, Fraser C, Riley S, Abu-Raddad LJ, Ho LM, Thach TQ, Chau P, Chan KP, Lam TH, Tse LY, Tsang T, Liu SH, Kong JH, Lau EM, Ferguson NM, Anderson RM: Epidemiological determinants of spread of causal agent of severe acute respiratory syndrome in Hong Kong. Lancet. 2003, 361: 1761-1766. 10.1016/S0140-6736(03)13410-1
 31.
Donnelly CA, Fisher MC, Fraser C, Ghani AC, Riley S, Ferguson NM, Anderson RM: Epidemiological and genetic analysis of severe acute respiratory syndrome. Lancet Infect Dis. 2004, 4: 672-683. 10.1016/S1473-3099(04)01173-9
 32.
Elveback LR, Fox JP, Ackerman E, Langworthy A, Boyd M, Gatewood L: An influenza simulation model for immunization studies. Am J Epidemiol. 1976, 103: 152-165.
 33.
Rvachev LA, Longini IM: A mathematical model for the global spread of influenza. Math Biosci. 1985, 75: 3-22. 10.1016/0025-5564(85)90064-1
 34.
Moser MR, Bender TR, Margolis HS, Noble GR, Kendal AP, Ritter DG: An outbreak of influenza aboard a commercial airliner. Am J Epidemiol. 1979, 110: 1-6.
 35.
Ferguson NM, Cummings DA, Cauchemez S, Fraser C, Riley S, Meeyai A, Iamsirithaworn S, Burke DS: Strategies for containing an emerging influenza pandemic in Southeast Asia. Nature. 2005, 437: 209-214. 10.1038/nature04017
 36.
McKendrick AG, Morison J: The determination of incubation periods from maritime statistics, with particular reference to the incubation period of influenza. Indian J Med Res. 1919, 7: 364-371.
 37.
McKendrick AG, Morison J: The determination of incubation periods from maritime statistics, with particular reference to the incubation period of influenza. Indian J Med Res. 1919, 7: 364-371.
 38.
Dietz K: Introduction to McKendrick (1926). Application of mathematics to medical problems. In Breakthroughs in Statistics: Volume III. Edited by: Kotz S, Johnson NL. New York: Springer; 1997:17-26.
 39.
Gani J: Anderson Gray McKendrick. In Statisticians of the Centuries. Edited by: Heyde CC, Seneta E. New York: Springer; 2001:323-327.
 40.
Kermack WO, McKendrick AG: A contribution to the mathematical theory of epidemics. Proc R Soc Lond Ser A. 1927, 115: 700-721. (Reprinted in Bull Math Biol 1991, 53: 33–55)
 41.
Anderson RM: Discussion: the Kermack-McKendrick epidemic threshold theorem. Bull Math Biol. 1991, 53: 3-32. 10.1016/S0092-8240(05)80039-4
 42.
Cumpston JHL: Service Publication No. 18. Influenza and maritime quarantine in Australia. Melbourne: Commonwealth of Australia, Quarantine Service; 1919.
 43.
FluWeb Historical Influenza Database. http://influenza.sph.unimelb.edu.au (accessed on 1 May 2007)
 44.
Meltzer MI: Multiple contact dates and SARS incubation periods. Emerg Infect Dis. 2004, 10: 207-209.
 45.
Cai QC, Xu QF, Xu JM, Guo Q, Cheng X, Zhao GM, Sun QW, Lu J, Jiang QW: Refined estimate of the incubation period of severe acute respiratory syndrome and related influencing factors. Am J Epidemiol. 2006, 163: 211-216. 10.1093/aje/kwj034
 46.
Miner JR: The incubation period of typhoid fever. J Infect Dis. 1922, 31: 296-301.
 47.
Elderton WP: Frequency Curves and Correlation. 4th edition. Washington DC: Harren Press; 1953.
 48.
Fine PE: The interval between successive cases of an infectious disease. Am J Epidemiol. 2003, 158: 1039-1047. 10.1093/aje/kwg251
 49.
Greenwood M: The infectiousness of measles. Biometrika. 1949, 36: 1-8.
 50.
The Editors: In Memoriam: Philip E. Sartwell (1908–1999). Am J Epidemiol. 2000, 151: 439.
 51.
Sartwell PE: The incubation period of poliomyelitis. Am J Public Health. 1952, 42: 1403-1408.
 52.
Sartwell PE: The incubation period and the dynamics of infectious diseases. Am J Epidemiol. 1966, 83: 204-216.
 53.
Johnson NL, Kotz S: Continuous univariate distributions. Volume 1. Boston: Houghton Mifflin Co.; 1970.
 54.
Goodall EW: Incubation period of measles. Brit Med J. 1931, 1: 73-74.
 55.
Casey AE: The incubation period in epidemic poliomyelitis. J Am Med Assoc. 1942, 120: 805-807.
 56.
Brookmeyer R, Blades N, Hugh-Jones M, Henderson DA: The statistical analysis of truncated data: application to the Sverdlovsk anthrax outbreak. Biostatistics. 2001, 2: 233-247. 10.1093/biostatistics/2.2.233
 57.
Brookmeyer R, Johnson E, Barry S: Modelling the incubation period of anthrax. Stat Med. 2005, 24: 531-542. 10.1002/sim.2033
 58.
Armenian HK: Invited commentary on "The distribution of incubation periods of infectious disease". Am J Epidemiol. 1995, 141: 385.
 59.
Hirayama T: Epidemiology (Ekigaku). Tokyo: Sekibundo Press; 1958. (in Japanese)
 60.
Kanamitsu M, Okada H, Kohno R, Shigematsu I, Hirayama T: Epidemiology and its application (Ekigaku To Sonoouyou). Tokyo: Nanzando; 1966. (in Japanese)
 61.
Majkowski J: Strategies for rapid response to emerging foodborne microbial hazards. Emerg Infect Dis. 1997, 3: 551-554.
 62.
Armenian HK, Lilienfeld AM: The distribution of incubation periods of neoplastic diseases. Am J Epidemiol. 1974, 99: 92-100.
 63.
Horner RD, Samsa G: Criteria for the use of Sartwell's incubation period model to study chronic diseases with uncertain etiology. J Clin Epidemiol. 1992, 45: 1071-1080. 10.1016/0895-4356(92)90147-F
 64.
Horiuchi K, Nakai S, Ueshima S, Sugiyama H: Theoretical epidemiologic study on the estimation of the time of exposure (Part 1). Jpn J Public Health (Nippon Koshu Eisei Zasshi). 1956, 3: 184-186. (in Japanese)
 65.
Horiuchi K, Oki Y, Sugiyama H: Statistical study of the incubation period of acute communicable diseases. Theoretical Epidemiology (Riron Ekigaku). 1959, 6: 5-17.
 66.
Yamamoto K, Oki Y, Ueshima I, Ueki T: A theoretical, epidemiological estimation of the time of exposure to the source of infection: Application of an estimation method to mass incidence caused by food poisoning. J Osaka City Med Center (Osaka Shiritsu Daigaku Igaku Zasshi). 1958, 7: 1040-1043. (in Japanese with English abstract)
 67.
Yoshida K, Matsui Y, Kimura K: Some questions concerning the practical application to the graphical method of estimating the time of exposure to the source of infection. Jpn J Public Health (Nippon Koshu Eisei Zasshi). 1960, 7: 967-973. (in Japanese with English abstract)
 68.
Oki Y: Studies on the incubation period of acute infectious diseases from the view point of theoretical epidemiology. J Osaka City Med Center (Osaka Shiritsu Daigaku Igaku Zasshi). 1960, 9: 2341-2368. (in Japanese with English abstract)
 69.
Kondo K: Exposure analysis Estimating the time of exposure from case onset information. J Clin Exp Med (Igaku No Ayumi) 1972, 82:576. (in Japanese)
 70.
Sakamoto K: Incubation period distribution of communicable disease. In Epidemiology and epidemiological model (Ekigaku To Ekigaku Model). Kyoto: Kinpodo; 1985:290-298. (in Japanese)
 71.
Horiuchi K: Theoretical epidemiologic study on the estimation of the time of exposure (Part 2). Theoretical Epidemiology (Riron Ekigaku). 1956, 3: 35-42. (in Japanese)
 72.
Horiuchi K: Speculation on the incubation period. Theoretical Epidemiology (Riron Ekigaku). 1957, 4: 9-16. (in Japanese)
 73.
Mizuno H, Ito A, Hotta I, Morita J: An experimental study on applicable limits of the estimating-method for the exposed point of infection. Jpn J Public Health (Nippon Koshu Eisei Zasshi). 1960, 7: 391-394. (in Japanese with English abstract)
 74.
Kato H: Epidemiological studies on the incubation periods of infectious diseases. II. Influence of the biological factors upon the incubation periods. Sapporo Med J (Sapporo Igaku Zasshi). 1955, 7: 260-266. (in Japanese with English abstract)
 75.
Kimura K: On the variability in the distribution of incubation periods in the case of mass occurrence of dysentery. Mie Med J. 1960, 10: 87-102.
 76.
Meynell GG: Interpretation of distributions of individual response times in microbial infections. Nature. 1963, 198: 970-973. 10.1038/198970b0
 77.
Tango T: Maximum likelihood estimation of date of infection in an outbreak of diarrhea due to contaminated foods assuming lognormal distribution for the incubation period. Nippon Koshu Eisei Zasshi. 1998, 45: 129-141. (in Japanese with English abstract)
 78.
Giesbrecht F, Kempthorne O: Maximum likelihood estimation in three-parameter lognormal distribution. J R Stat Soc Ser B. 1976, 38: 257-264.
 79.
Cohen AC, Whitten BT: Estimation in the three-parameter lognormal distribution. J Am Stat Assoc. 1980, 75: 399-404. 10.2307/2287466
 80.
Hill BM: The three-parameter lognormal distribution and Bayesian analysis of a point-source epidemic. J Am Stat Assoc. 1963, 58: 72-84. 10.2307/2282955
 81.
Tango T: Topics II: Foodborne outbreak caused by Escherichia coli O157. In Introduction to Statistical Model (Tokei Model). Tokyo: Asakura Publishing Co; 2000:6-17. (in Japanese)
 82.
Cohen AC: Three-parameter estimation. In Lognormal Distribution Theory and Applications. Edited by: Crow EL, Shimizu K. New York: Marcel Dekker; 1988:113-137.
 83.
Fenner F: The pathogenesis of the acute exanthems. An interpretation based upon experimental investigation with mousepox (infectious ectromelia of mice). Lancet. 1948, ii: 915-920. 10.1016/S0140-6736(48)91599-2
 84.
Meynell GG, Meynell EW: The growth of microorganisms in vivo with particular reference to the relation between dose and latent period. J Hyg (Lond). 1958, 56: 323-346.
 85.
Gart JJ: Some stochastic models relating time and dosage in response curves. Biometrics. 1965, 21: 583-599. 10.2307/2528543
 86.
Williams T: The distribution of response times in a birth-death process. Biometrika. 1965, 52: 581-585.
 87.
Williams T: The basic birth-death model for microbial infection. J R Stat Soc Ser B. 1965, 27: 338-360.
 88.
Armitage P, Meynell GG, Williams T: Birth-death and other models for microbial infection. Nature. 1965, 207: 570-572. 10.1038/207570a0
 89.
Puri PS: A class of stochastic models of response after infection in the absence of defense mechanism. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (June 21-July 18, 1965 and December 27, 1965-January 7, 1966). Edited by: le Cam LM, Neyman J. Los Angeles: University of California Press; 1967:511-535.
 90.
Kondo K: The lognormal distribution of the incubation time of exogenous diseases. Genetic interpretations and a computer simulation. Jinrui Idengaku Zasshi. 1977, 21: 217-237.
 91.
Lawrence RJ: The lognormal as event-time distribution. In Lognormal Distribution Theory and Applications. Edited by: Crow EL, Shimizu K. New York: Marcel Dekker; 1988:211-228.
 92.
Netea MG, Kullberg BJ, Van der Meer JW: Circulating cytokines as mediators of fever. Clin Infect Dis. 2000, 31: S178-S184. 10.1086/317513
 93.
Kuk AY, Ma S: The estimation of SARS incubation distribution from serial interval data using a convolution likelihood. Stat Med. 2005, 24: 2525-2537. 10.1002/sim.2123
 94.
Cowling BJ, Muller MP, Wong IO, Ho LM, Louie M, McGeer A, Leung GM: Alternative methods of estimating an incubation distribution: Examples from severe acute respiratory syndrome. Epidemiology. 2007, 18: 253-259. 10.1097/01.ede.0000254660.07942.fb
 95.
Riley S, Ferguson NM: Smallpox transmission and control: spatial dynamics in Great Britain. Proc Natl Acad Sci USA. 2006, 103: 12637-12642. 10.1073/pnas.0510873103
 96.
Nishiura H, Eichner M: Interpreting the epidemiology of postexposure vaccination against smallpox. Int J Hyg Environ Health. 2007, in press. (doi: 10.1016/j.ijheh.2007.01.029)
 97.
Ferguson NM, Donnelly CA, Woolhouse ME, Anderson RM: The epidemiology of BSE in cattle herds in Great Britain. II. Model construction and analysis of transmission dynamics. Philos Trans R Soc Lond B Biol Sci. 1997, 352: 803-808. 10.1098/rstb.1997.0063
 98.
Nowak MA, Krakauer DC, Klug A, May RM: Prion infection dynamics. Integr Biol. 1998, 1: 3-15. 10.1002/(SICI)1520-6602(1998)1:1<3::AID-INBI2>3.0.CO;2-9
 99.
Masel J, Jansen VA, Nowak MA: Quantifying the kinetic parameters of prion replication. Biophys Chem. 1999, 77: 139-152. 10.1016/S0301-4622(99)00016-2
 100.
Dietz K: Epidemics and rumors: A survey. J R Stat Soc Ser A. 1967, 130: 505-528. 10.2307/2982521
 101.
Morgan BJT, Watts SA: On modelling microbial infections. Biometrics. 1980, 36: 317-321. 10.2307/2529985
 102.
Nowak MA, May RM: Virus Dynamics: Mathematical Principles of Immunology and Virology. New York: Oxford University Press; 2000.
 103.
Bailey NTJ: A statistical method of estimating the periods of incubation and infection of an infectious disease. Nature. 1954, 174: 139-140. 10.1038/174139a0
 104.
Bailey NTJ: The mathematical theory of infectious diseases and its applications. 2nd edition. London: Griffin; 1975.
 105.
Nishiura H: Epidemiology of a primary pneumonic plague in Kantoshu, Manchuria, from 1910 to 1911: statistical analysis of individual records collected by the Japanese Empire. Int J Epidemiol. 2006, 35: 1059-1065. 10.1093/ije/dyl091
Acknowledgements
The author thanks Klaus Dietz for useful discussions and Dr. Lance Sanders for collating ref. [42] during the preparation of this study. He received funding support from the Banyu Life Science Foundation International for his research in Germany and from the Japanese Ministry of Education, Science, Sports and Culture in the form of a Grant-in-Aid for Young Scientists (#18810024, 2006).
Additional information
Competing interests
The author(s) declare that they have no competing interests.
Authors' contributions
HN carried out paper reviews, proposed the study, performed mathematical analyses and drafted the manuscript. The author has read and approved the final manuscript.
Keywords
 Influenza
 Incubation Period
 Lognormal Distribution
 Severe Acute Respiratory Syndrome
 Prion Disease