The coefficient of cyclic variation: a novel statistic to measure the magnitude of cyclic variation
© Fulford; licensee BioMed Central Ltd. 2014
Received: 28 April 2014
Accepted: 9 September 2014
Published: 2 October 2014
Periodic or cyclic data of known periodicity are frequently encountered in epidemiological and biomedical research: for instance, seasonality provides a useful experiment of nature while diurnal rhythms play an important role in endocrine secretion. There is, however, little consensus on how to analyse these data and less still on how to measure association or effect size for the often complex patterns seen.
A simple statistic, easily derived from Fourier regression models, provides a readily understood measure of cyclic variation in a wide variety of situations.
The coefficient of cyclic variation or similar statistics derived from the variance of a Fourier series could provide a universal means of summarising the magnitude of periodic variation.
where θ_i is the angle in radians corresponding to the point in the cycle at which the ith point was measured, α_j and γ_j are the coefficients to be estimated and P is the number of pairs of terms in the truncated Fourier series. The β^T x_i term is the linear combination of the other covariates (if any) fitted by the regression. Such models are simple to use, may be implemented with almost any statistical software and have found many and varied applications (e.g. a thorough presentation by Fernández et al.; possibly the first implementation was by Bliss), although they are possibly still not as widely used as they should be. Their form is naturally cyclic and smooth, avoiding the unrealistic steps introduced when the period is discretised. Also, by varying the point at which the Fourier series is truncated, it is possible to determine the degree of detail fitted and avoid over-parameterising the model. Indeed, higher terms represent higher frequencies, which are often noise: by filtering them out the method automatically provides its own smoothing. By contrast, models based on discretising the cycle almost always become less realistic the more parsimonious the model. If, as is often the case, observations are reasonably uniformly distributed across the cycle, all the Fourier terms will be almost orthogonal to one another and to the intercept, thus greatly simplifying model building. Furthermore, the Fourier representation is mathematically versatile, allowing us under certain circumstances to deconvolve the underlying cyclic pattern when all we can observe is the cumulative effect of exposure to its influence.
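As an illustration of how such a model might be fitted, the following is a minimal sketch in Python using ordinary least squares. It assumes the model has the usual truncated Fourier form, with an intercept plus a sin(jθ) and a cos(jθ) term for each j = 1, ..., P; the pairing of α_j with the sine and γ_j with the cosine is an assumption, as are the function names `fourier_design` and `fit_fourier` and the simulated data. No other covariates are included.

```python
import numpy as np

def fourier_design(theta, n_pairs):
    """Design matrix: intercept, then sin(j*theta), cos(j*theta) for j = 1..n_pairs."""
    cols = [np.ones_like(theta)]
    for j in range(1, n_pairs + 1):
        cols.append(np.sin(j * theta))
        cols.append(np.cos(j * theta))
    return np.column_stack(cols)

def fit_fourier(theta, y, n_pairs):
    """Least-squares estimates ordered as [intercept, a1, g1, a2, g2, ...]."""
    X = fourier_design(theta, n_pairs)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated (hypothetical) data: one observation per subject on a random day of the year.
rng = np.random.default_rng(0)
day = rng.uniform(0.0, 365.25, size=500)
theta = 2.0 * np.pi * day / 365.25          # position in the annual cycle, in radians
y = 1.0 + 0.3 * np.sin(theta) + 0.1 * np.cos(theta) + rng.normal(0.0, 0.2, size=500)

# Fit two pairs of terms; the second pair should be close to zero here.
coef = fit_fourier(theta, y, n_pairs=2)
```

Raising or lowering `n_pairs` changes the point at which the series is truncated, and hence the amount of detail fitted.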
All models of periodicity by their very nature require more than one parameter to describe them: at least one parameter each is needed in order to fit phase and amplitude. Fourier regression is no exception. Consequently the extent of periodicity is not generally represented by a single model parameter; a summary statistic needs to be derived from the fitted model. This article is concerned with the search for a suitable statistic to summarise and compare the amplitude of cyclic patterns.
A common choice of statistic to measure the magnitude of cyclic patterns is the crude difference between the maximum and minimum values of the mean. While this has its uses it also has its drawbacks: it is not always as straightforward as it may seem to locate and measure the extrema accurately or to estimate the standard error of the difference. It also focuses on one narrow aspect of the periodic function and ignores information over much of the cycle.
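For comparison, the peak-to-trough difference of a fitted Fourier curve can be approximated numerically. The sketch below evaluates the curve on a dense grid and takes the difference between its largest and smallest values; the function name `peak_to_trough`, the grid size and the coefficient values are illustrative assumptions, and the coefficients are assumed to multiply sin(jθ) and cos(jθ) respectively.

```python
import numpy as np

def peak_to_trough(alpha, gamma, n_grid=3600):
    """Approximate max minus min of sum_j alpha[j]*sin(j*t) + gamma[j]*cos(j*t)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    fitted = np.zeros_like(theta)
    for j, (a, g) in enumerate(zip(alpha, gamma), start=1):
        fitted += a * np.sin(j * theta) + g * np.cos(j * theta)
    return fitted.max() - fitted.min()

# With a single pair of terms the range is twice the amplitude sqrt(3^2 + 4^2) = 5.
single = peak_to_trough([3.0], [4.0])
```

With more than one pair of terms there is no such closed form, which is one reason the extrema can be awkward to locate and their difference awkward to attach a standard error to.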
Another obvious choice, at least for continuous dependent variables, would be the partial R-squared. Pewsey et al. review a number of exotic correlation coefficients devised for circular data. These, however, essentially restrict their attention to the first pair of Fourier terms and do not take account of covariates. R-squared can be thought of as the variance of the fitted values expressed on the scale of the overall variance of the dependent variable in the sample. While this can be a logical scale to work on, it can also present a number of problems of interpretation. Often a large component of the variance is due to measurement imprecision; a scale based on the arbitrary degree of noise can have little meaning. Furthermore both precision of measurement and the underlying variance within the population will often vary between studies thus invalidating direct comparison of R-squared values. The statistical power of studies of cyclic patterns is often greatly improved by observing each individual at several different points in the cycle. In such multi-level designs there is more than one variance to consider and a difficult choice to be made as to which provides the most relevant scale on which to measure the magnitude of the periodicity.
A better choice of scale is therefore needed. We desire something familiar, universal and stable. Simply expressing the explained variance as its square root (i.e. as the standard deviation) places it on the scale of the original variable. That would be familiar and stable but not universally useful: while it might be useful when comparing the same variable measured in different studies, it is usually useless when comparing different variables even within the same study. Another approach to standardising the scale is to divide the standard deviation by the mean. The coefficient of variation, frequently used to assess assay precision, is of this form, although obviously in this case it is error rather than explained variation that is being standardised. This approach is particularly appropriate when, as is often the case, proportional changes in the variable in question are important. Such variables are usually analysed on the logarithmic scale.
The ccv is thus a simple function of the parameters in the Fourier regression model. Its standard error and confidence intervals may be readily calculated either using the delta method or the bootstrap (Stata’s nlcom command, for instance, can be used to estimate the statistic and its confidence interval based on the delta method – StataCorp, College Station, TX).
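The bootstrap route can be sketched as follows. This is an illustration under stated assumptions rather than the article's own implementation: the statistic is taken to be the standard deviation of the fitted cyclic component, sqrt(½ Σ (α_j² + γ_j²)), computed for an outcome analysed on the log scale, which holds when observations are spread roughly uniformly over the cycle. The names `cyclic_sd` and `bootstrap_ci` and the simulated data are hypothetical.

```python
import numpy as np

def cyclic_sd(theta, y, n_pairs):
    """SD of the fitted cyclic component, from least-squares Fourier coefficients."""
    cols = [np.ones_like(theta)]
    for j in range(1, n_pairs + 1):
        cols += [np.sin(j * theta), np.cos(j * theta)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    # Drop the intercept; half the sum of squared coefficients is the variance.
    return np.sqrt(0.5 * np.sum(coef[1:] ** 2))

def bootstrap_ci(theta, y, n_pairs, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap interval, resampling observations with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        stats[b] = cyclic_sd(theta[idx], y[idx], n_pairs)
    lo, hi = np.quantile(stats, [(1 - level) / 2, (1 + level) / 2])
    return lo, hi

# Simulated (hypothetical) log-scale outcome with a true cyclic SD of sqrt(0.05).
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2.0 * np.pi, size=400)
log_y = 2.0 + 0.3 * np.sin(theta) + 0.1 * np.cos(theta) + rng.normal(0.0, 0.2, size=400)

estimate = cyclic_sd(theta, log_y, n_pairs=1)
lower, upper = bootstrap_ci(theta, log_y, n_pairs=1)
```

Resampling individual observations is appropriate only for single-level data; with repeated measures on the same individuals, whole individuals would need to be resampled instead.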
The seasonal variation in vitamin B6 biomarkers among Gambian women
There are numerous ways to exploit the simple (but apparently little known) formula for the variance of a Fourier series given in equation (3). I suggest that in many, probably the great majority, of cases when seasonal or diurnal patterns are investigated in epidemiological studies, the ccv or a related statistic will provide a simple and useful measure of the size of periodic variation. Its simplicity gives this approach the potential to provide a universally recognised standard statistic for reporting in such studies.
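For reference, the standard result for the variance of a truncated Fourier series can be written out as below. Equation (3) itself is not reproduced in this text, so the notation here is an assumption: μ denotes the mean level, and α_j and γ_j are taken to multiply the sine and cosine terms respectively. The result assumes θ is uniformly distributed over the cycle.

```latex
f(\theta) = \mu + \sum_{j=1}^{P} \bigl( \alpha_j \sin j\theta + \gamma_j \cos j\theta \bigr),
\qquad
\operatorname{Var}[f]
  = \frac{1}{2\pi} \int_{0}^{2\pi} \bigl( f(\theta) - \mu \bigr)^{2} \, d\theta
  = \frac{1}{2} \sum_{j=1}^{P} \bigl( \alpha_j^{2} + \gamma_j^{2} \bigr)
```

The cross terms vanish because the sine and cosine functions of different orders are mutually orthogonal over a full cycle, while each squared sine or cosine term averages to one half.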
This work was funded by the UK Medical Research Council.
- Nelson W, Tong YL, Lee JK, Halberg F: Methods for cosinor-rhythmometry. Chronobiologia. 1979, 6: 305-323.
- Fernández JR, Hermida RC, Mojón A: Chronobiological analysis techniques. Application to blood pressure. Phil Trans R Soc A. 2009, 367: 431-445. doi:10.1098/rsta.2008.0231
- Bliss CI: Periodic regression in biology and climatology. Bull Conn Agric Exp Station New Haven. 1958, 615: 1-55.
- Cox NJ: Speaking Stata: in praise of trigonometric predictors. Stata J. 2006, 6 (4): 561-579.
- Fulford AJC, Rayco-Solon P, Prentice AM: Statistical modelling of the seasonality of preterm delivery and intrauterine growth restriction in rural Gambia. Paediatr Perinat Epidemiol. 2006, 20 (3): 251-259. doi:10.1111/j.1365-3016.2006.00714.x
- Pewsey A, Neuhäuser M, Ruxton GD: Correlation and Regression. Circular Statistics in R. Oxford: Oxford University Press; 2013, 149-170.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.