 Analytic perspective
Mitigation of biases in estimating hazard ratios under non-sensitive and non-specific observation of outcomes–applications to influenza vaccine effectiveness
Emerging Themes in Epidemiology volume 18, Article number: 1 (2021)
Abstract
Background
Non-sensitive and non-specific observation of outcomes in time-to-event data affects event counts as well as the risk sets, thus biasing the estimation of hazard ratios. We investigate how imperfect observation of incident events affects the estimation of vaccine effectiveness based on hazard ratios.
Methods
Imperfect time-to-event data contain two classes of events: a portion of the true events of interest; and false-positive events mistakenly recorded as events of interest. We develop an estimation method utilising a weighted partial likelihood and probabilistic deletion of false-positive events, assuming that the sensitivity and the false-positive rate are known. The performance of the method is evaluated using simulated and Finnish register data.
Results
The novel method enables unbiased semiparametric estimation of hazard ratios from imperfect time-to-event data. False-positive rates that are small can be approximated to be zero without inducing bias. The method is robust to misspecification of the sensitivity as long as the ratio of the sensitivity in the vaccinated and the unvaccinated is specified correctly and the cumulative risk of the true event is small.
Conclusions
The weighted partial likelihood can be used to adjust for outcome measurement errors in the estimation of hazard ratios and effectiveness but requires specifying the sensitivity and the false-positive rate. In the absence of exact information about these parameters, the method works as a tool for assessing the potential magnitude of bias given a range of likely parameter values.
Introduction
Outcome measurement errors are common in epidemiological studies and may bias the estimated effects of exposures or interventions on health outcomes. When a binary outcome such as presence/absence of infection is measured with error, the problem is called outcome misclassification [1]. The impact of outcome misclassification on estimation of risk ratios has been studied thoroughly [2,3,4]. Nevertheless, the same lessons cannot be readily adopted when estimating hazard ratios from time-to-event data because imperfectly observed event times affect not only the event counts but may also bias the at-risk times and thus the risk set sizes.
A particular problem arises when estimating vaccine effectiveness as the relative reduction in the infection hazard. If infection-induced immunity reduces or removes the risk of subsequent infection with the same pathogen, non-sensitive measurement of infection inflates the risk set. For example, influenza is likely to immunise the human host at least temporarily, and in practice not all infections in a large population are recorded. Moreover, false-positive records may occur due to imperfect specificity of diagnostic procedures.
Yang et al. [5] addressed estimation of vaccine effectiveness under non-specific observation of influenza infection using a subset of acute respiratory infections as a validation set on disease aetiology. An expectation–maximisation algorithm was developed to account for the uncertainty in the aetiology of infections outside the validation set [5]. Although the validation data carried information on the specificity, perfect sensitivity was assumed.
Meier et al. [6] focused on detection of chronic outcomes such as human immunodeficiency virus infection, which if initially missed can still be detected by later testing. A full-likelihood approach was developed to estimate the hazard ratio under repeated usage of an imperfect laboratory test, based on a proportional hazards (PH) model in discrete time and assuming the test sensitivity and specificity are known [6]. However, this method cannot be applied under imperfect observation of incident events, such as influenza infection, which by standard laboratory tests can only be detected up to one week after symptom onset [7].
The role of non-sensitive and non-specific observation of incident infection outcomes on the estimation of hazard ratios has thus not been fully covered in previous literature. We here study how outcome measurement errors affect the estimation of vaccine effectiveness based on hazard ratios. We modify the standard partial likelihood under the PH model [8] to adjust for outcome measurement errors in time-to-event data, assuming the sensitivity and the false-positive rate are known. We explore the magnitude of bias when the measurement errors are not corrected for and evaluate the robustness of effectiveness estimates to misspecification of the sensitivity and the false-positive rate. We implement the new method in R [9] (see Additional file 1: R script) and use simulated and Finnish register data to show its performance. Our work is motivated by the Finnish policy of estimating influenza vaccine effectiveness each season from register data [10], which do not include influenza-negative test results and thus do not allow for a retrospective design such as the widely used test-negative design [11]. Therefore, we here focus exclusively on cohort studies.
Methods
True and false-positive events
We consider the sensitivity of outcome measurement as the conditional probability for the true event of interest being recorded in the data. When the sensitivity is less than 1, some true events may not be recorded. Additionally, other events may be mistakenly recorded as events of interest. Such false-positive records make outcome measurement non-specific. Here, the true event means influenza infection while false-positive events are any other (non-influenza) events incorrectly recorded as influenza.
For any one subject, let the hazard of the true event at time \(t\) be \(\lambda (t)\). We assume that the true event occurs at most once during the study period, and if it occurs, is recorded with sensitivity \(se\). False-positive events may occur repeatedly at rate \(\kappa (t)\). Originally, the data may thus comprise more than one event per subject. In the study, however, based on the assumption that the true event is unique, each subject’s follow-up is set to end at his/her first recorded event or censoring, whichever occurs first. The data under study therefore comprise at most one recorded event per subject (Fig. 1). The event’s status as true or false positive is indistinguishable by observation.
If the true event occurs but is not recorded, the subject’s follow-up continues beyond the true event time, erroneously lengthening the at-risk time in the study. By contrast, should a false-positive event occur before the true event, the subject’s follow-up ends prematurely and the true event time is not part of the data under study (Fig. 1). This also shows that minor violations of the above assumption of uniqueness would not compromise the study.
True and observed survival functions
The true survival function \(S(t)\) is the probability of escaping the true event beyond time \(t\). The observed survival function \(\tilde{S}(t)\) is the probability of avoiding both detection of the true event and the occurrence of any false-positive events beyond \(t\). Assuming constant sensitivity (\(se\)) and false-positive rate (\(\kappa\)), the relationship between the survival functions is (cf. Additional file 2: Web Appendix)
\[\tilde{S}(t)=\left[1-se\left(1-S(t)\right)\right]\cdot {e}^{-\kappa t}. \tag{1}\]
The first right-hand-side term is the complement probability of the true event having occurred and been recorded by \(t\). The second term is the probability of no false-positive events having occurred by \(t\).
True and observed hazards
Given expression (1), the relation between the hazard \(\tilde{\lambda}(t)\) of the recorded event (observed hazard) and the hazard \(\lambda (t)\) of the true event (true hazard) follows (cf. Additional file 2: Web Appendix):
\[\tilde{\lambda }(t)=se\cdot w(t)\cdot \lambda (t)+\kappa , \tag{2}\]
where the weight \(w(t)\) is defined as
\[w(t)=\frac{S(t)}{1-se\left(1-S(t)\right)}.\]
The observed hazard is thus the sum of the hazard of recording the true event and the falsepositive rate. The sensitivity \(se\) accounts for the possibility of not recording the true event. The weight \(w(t)\) equals the ratio of the true survival probability and the survival probability observed in absence of false positives and adjusts for the fact that a true but unrecorded event may have already removed the subject from the study’s risk set before time \(t\). It holds that \(w(t)\le 1\).
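The relations above can be checked numerically. The following is a minimal sketch, assuming a constant true hazard purely for illustration (the article's simulations use an epidemic-shaped hazard); the function names are hypothetical and are not taken from the authors' R script. It evaluates the observed survival function, the weight and the observed hazard, and compares the observed hazard with a finite-difference derivative of the log observed survival.

```python
import math

def true_survival(t, lam):
    """True survival S(t) under a constant true hazard lam (illustrative assumption)."""
    return math.exp(-lam * t)

def observed_survival(t, lam, se, kappa):
    """Observed survival: [1 - se * (1 - S(t))] * exp(-kappa * t)."""
    return (1.0 - se * (1.0 - true_survival(t, lam))) * math.exp(-kappa * t)

def weight(t, lam, se):
    """Weight w(t) = S(t) / [1 - se * (1 - S(t))]; never exceeds 1."""
    s = true_survival(t, lam)
    return s / (1.0 - se * (1.0 - s))

def observed_hazard(t, lam, se, kappa):
    """Observed hazard: se * w(t) * lambda(t) + kappa."""
    return se * weight(t, lam, se) * lam + kappa

def numerical_hazard(t, lam, se, kappa, h=1e-5):
    """Observed hazard as -d/dt log(observed survival), by central difference."""
    upper = math.log(observed_survival(t + h, lam, se, kappa))
    lower = math.log(observed_survival(t - h, lam, se, kappa))
    return -(upper - lower) / (2.0 * h)

# Hypothetical parameter values, chosen only to exercise the functions.
lam, se, kappa, t = 0.002, 0.04, 1e-6, 100.0
closed_form = observed_hazard(t, lam, se, kappa)
finite_diff = numerical_hazard(t, lam, se, kappa)
```

With perfect sensitivity (`se = 1`) the weight reduces to 1 and the observed hazard is simply the true hazard plus the false-positive rate.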
Vaccine effectiveness
We compare the true hazards between two groups defined by vaccination as binary exposure. Specifically, \({\lambda }_{0}(t)\) and \({\lambda }_{1}(t)\) denote the true hazards for unvaccinated and vaccinated subjects, respectively, as functions of time since season onset. The estimand of interest is vaccine effectiveness (\(VE\)) defined as the relative reduction in the infection hazard [12]:
\[VE=1-\frac{{\lambda }_{1}(t)}{{\lambda }_{0}(t)}.\]
In this paper, the true hazards in unvaccinated and vaccinated subjects are assumed to be proportional over time so that the \(VE\) estimand is constant. If \(\kappa =0\), it follows from (2) that
\[{\tilde{\lambda }}_{v}(t)={se}_{v}\cdot {w}_{v}(t)\cdot {\left(1-VE\right)}^{v}\cdot {\lambda }_{0}(t), \tag{3}\]
where
\[{w}_{v}(t)=\frac{{S}_{v}(t)}{1-{se}_{v}\left(1-{S}_{v}(t)\right)}\]
and \(v=0\) (unvaccinated) or \(1\) (vaccinated).
Weighted partial likelihood under imperfect sensitivity in absence of false positives
The data under study comprise \(n\) recorded events in a cohort of unvaccinated and vaccinated subjects. The event times of the \(n\) cases are \({t}_{1}<{t}_{2}<\dots <{t}_{n}\). Let \({\tilde{N}}_{0}({t}_{i})\) and \({\tilde{N}}_{1}({t}_{i})\) denote the numbers of unvaccinated and vaccinated subjects in the risk set at \({t}_{i}\). Of note, a subject’s vaccination status may change over time [10].
Under the PH assumption, the hazard ratio and thus \(VE\) can be estimated from complete and perfectly measured time-to-event data by maximising the standard partial likelihood [8]. Here, we adjust the partial likelihood to allow estimation of \(VE\) under imperfect sensitivity. When \({se}_{0}\) and \({se}_{1}\) are known, the partial likelihood of \(VE\) is
\[L(VE)=\prod_{i=1}^{n}{L}_{i}(VE), \tag{4}\]
where \({L}_{i}(VE)\) is the conditional probability for the event occurring to case \(i\) given the risk set at \({t}_{i}\), and \({v}_{i}\) is 0 if case \(i\) is unvaccinated and 1 otherwise. Using (3), \({L}_{i}(VE)\) is obtained as
\[{L}_{i}(VE)=\frac{{\left[{se}_{1}{w}_{1}({t}_{i})\left(1-VE\right)\right]}^{{v}_{i}}{\left[{se}_{0}{w}_{0}({t}_{i})\right]}^{1-{v}_{i}}}{{\tilde{N}}_{0}({t}_{i})\,{se}_{0}{w}_{0}({t}_{i})+{\tilde{N}}_{1}({t}_{i})\,{se}_{1}{w}_{1}({t}_{i})\left(1-VE\right)}. \tag{5}\]
Unlike the standard partial likelihood, (5) depends on weights \({w}_{0}(t)\) and \({w}_{1}(t)\), which correct for the too large risk set following from imperfect sensitivity. Using the Kaplan–Meier estimate \(\widehat{\tilde{S}}(t)\) for \(\tilde{S}(t;se,\kappa =0)\), computed in each vaccination group \(v\), leads to plug-in weights \({\widehat{w}}_{0}(t)\) and \({\widehat{w}}_{1}(t)\) (cf. Additional file 2: Web Appendix):
\[{\widehat{w}}_{v}(t)=\frac{{\widehat{\tilde{S}}}_{v}(t)-\left(1-{se}_{v}\right)}{{se}_{v}\cdot {\widehat{\tilde{S}}}_{v}(t)},\quad v=0,1. \tag{6}\]
\(VE\) is estimated by maximising (4) and its standard error (\(SE\)) can be obtained using the Fisher information. If \({se}_{0}={se}_{1}=1\), (4) simplifies to the standard partial likelihood.
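To illustrate the estimation step, the sketch below maximises a weighted partial likelihood in which each subject at risk contributes relative hazard \({se}_{0}{w}_{0}({t}_{i})\) if unvaccinated and \({se}_{1}{w}_{1}({t}_{i})(1-VE)\) if vaccinated. It is not the authors' R implementation: the data layout (one tuple per recorded event), the function names and the golden-section optimiser are assumptions made for a compact example, and no standard error is computed. The log-likelihood is concave in \(\beta =\log (1-VE)\), which is why a simple one-dimensional search suffices here.

```python
import math

def plug_in_weight(s_obs, se):
    """Plug-in weight from an observed survival estimate when kappa = 0:
    (S_obs - (1 - se)) / (se * S_obs)."""
    return (s_obs - (1.0 - se)) / (se * s_obs)

def log_partial_likelihood(beta, events, se0, se1):
    """Weighted log partial likelihood as a function of beta = log(1 - VE).

    Each event is a tuple (v, n0, n1, w0, w1): vaccination status of the case,
    and the risk-set sizes and weights of both groups at the event time.
    """
    total = 0.0
    for v, n0, n1, w0, w1 in events:
        r0 = se0 * w0                    # relative hazard of an unvaccinated subject
        r1 = math.exp(beta) * se1 * w1   # relative hazard of a vaccinated subject
        total += math.log(r1 if v == 1 else r0) - math.log(n0 * r0 + n1 * r1)
    return total

def estimate_ve(events, se0, se1, lo=-10.0, hi=5.0, tol=1e-10):
    """Maximise over beta by golden-section search; return VE = 1 - exp(beta)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if log_partial_likelihood(c, events, se0, se1) > log_partial_likelihood(d, events, se0, se1):
            b = d   # maximum lies in [a, d]
        else:
            a = c   # maximum lies in [c, b]
    return 1.0 - math.exp((a + b) / 2.0)

# Toy data (hypothetical): 3 vaccinated and 6 unvaccinated cases,
# constant risk sets of 100 per group and unit weights.
toy_events = [(1, 100, 100, 1.0, 1.0)] * 3 + [(0, 100, 100, 1.0, 1.0)] * 6
ve_equal_se = estimate_ve(toy_events, se0=0.04, se1=0.04)
ve_differential_se = estimate_ve(toy_events, se0=0.05, se1=0.03)
```

In this toy example the estimate under equal sensitivities coincides with the standard partial-likelihood estimate, whereas assuming a lower sensitivity in the vaccinated pulls the estimate downwards, in line with the dependence on the ratio of the two sensitivities discussed in the article.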
Probabilistic deletion of false-positive events
If false-positive events occur, i.e. if \(\kappa >0\), semiparametric estimation using the weighted partial likelihood is not directly applicable as the true hazard \({\lambda }_{0}(t)\) does not cancel out of expression (5). We propose an approach that retains only a portion of the \(n\) recorded events by approximating the time-varying probability of the recorded event being a true event.
The probability that an event observed at time \({t}_{i}\) is a true event is given by the ratio of the hazard of recording the true event to the observed hazard. This probability is (cf. Additional file 2: Web Appendix)
\[{p}_{v}({t}_{i})=\frac{{se}_{v}\cdot {w}_{v}({t}_{i})\cdot {\lambda }_{v}({t}_{i})}{{\tilde{\lambda }}_{v}({t}_{i})}=1-\frac{\kappa }{{\tilde{\lambda }}_{v}({t}_{i})}. \tag{7}\]
We suggest approximating \({\tilde{\lambda }}_{v}\left({t}_{i}\right)\) over a short time window (\(\Delta {t}_{i}\)) centred around \({t}_{i}\) as the number of events observed (\({\tilde{D}}_{v,i}\)) per person-time (\({\tilde{N}}_{v}({t}_{i})\cdot \Delta {t}_{i}\)) and, hence, an approximation to \({p}_{v}({t}_{i})\) is given by
\[{p}_{v}({t}_{i})\approx 1-\kappa \cdot \frac{{\tilde{N}}_{v}({t}_{i})\cdot \Delta {t}_{i}}{{\tilde{D}}_{v,i}}. \tag{8}\]
Subsequently, any event observed at \({t}_{i}\) is retained in the data with probability \({p}_{v}({t}_{i})\), corresponding to censoring events at each \({t}_{i}\) with probability \(1-{p}_{v}({t}_{i})\). In analogy to multiple imputation, the above procedure is repeated a number of times to produce replicate data sets. Each resulting set of time-to-event data is analysed as in absence of false positives using the weighted partial likelihood. At the end, the \(VE\) estimates are pooled taking into account the within- and the between-imputation variability [13].
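The deletion and pooling steps can be sketched as follows. This is an illustrative outline rather than the authors' code: the retention probability is computed as one minus the false-positive rate times person-time divided by the number of events in the window, truncation of that probability to the interval from 0 to 1 is an added safeguard, and pooling is assumed to follow the standard multiple-imputation combining rules of Rubin (total variance equals the mean within-replicate variance plus \((1+1/m)\) times the between-replicate variance). All function names are hypothetical.

```python
import math
import random
import statistics

def retention_probability(kappa, n_at_risk, window, n_events):
    """Approximate probability that a recorded event is a true event:
    1 - kappa * (person-time in the window) / (events in the window).
    Truncated to [0, 1] (an assumption of this sketch)."""
    if n_events <= 0:
        raise ValueError("need at least one recorded event in the window")
    p = 1.0 - kappa * n_at_risk * window / n_events
    return min(1.0, max(0.0, p))

def thin_events(probabilities, rng):
    """Return one flag per recorded event: True if the event is retained,
    False if it is treated as censored in this replicate."""
    return [rng.random() < p for p in probabilities]

def pool_estimates(estimates, variances):
    """Combine replicate estimates with Rubin's rules (assumed here).
    Returns the pooled estimate and its standard error."""
    m = len(estimates)
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)        # mean of squared standard errors
    between = statistics.variance(estimates)   # sample variance across replicates
    return pooled, math.sqrt(within + (1.0 + 1.0 / m) * between)

# Hypothetical numbers: 5 events among 1000 subjects at risk in a 7-day window.
p_true = retention_probability(kappa=1e-5, n_at_risk=1000, window=7.0, n_events=5)
replicate_flags = thin_events([p_true] * 5, random.Random(1))
pooled_ve, pooled_se = pool_estimates([0.2, 0.3, 0.4], [0.01, 0.01, 0.01])
```

Each replicate produced by `thin_events` would then be analysed with the weighted partial likelihood as if no false positives were present, and the replicate-specific estimates and squared standard errors passed to `pool_estimates`.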
Simulation study
Setup
We conducted a simulation study to assess the performance of the proposed methods. Briefly, true event times were simulated according to hazards \({\lambda }_{0}\left(t\right)\) (unvaccinated) and \(\left(1-VE\right)\cdot {\lambda }_{0}(t)\) (vaccinated), where \({\lambda }_{0}\left(t\right)\) mimicked the force of infection in a Susceptible-Infected-Removed epidemic [14] with cumulative risk of 0.25 (alternatively 0.81) over a 196-day influenza season (cf. Additional file 2: Web Appendix). Two separate cohorts were considered, comprising 50,000 (30% vaccinated at season onset) and 1,000,000 (50% vaccinated) individuals, corresponding to the cohort sizes of Finnish children and elderly, respectively [10, 15, 16]. \(VE\) was 10%, 30%, 50%, 70% or 90%.
For each individual, observed true events were realised by retaining simulated true events with sensitivities \({se}_{0}\) (unvaccinated) and \({se}_{1}\) (vaccinated). Values \({se}_{0}={se}_{1}=0.04\) were based on a Finnish study of the 2009/10 influenza season [17]. Alternatively, values \({se}_{0}=0.05\) and \({se}_{1}=0.03\) were employed to investigate differential sensitivity. A false-positive event time was sampled from the exponential distribution with rates corresponding to 2% or 16% of all recorded events in the unvaccinated being false-positive. The smaller of the observed true and false-positive event times was used as the recorded event time for the individual.
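The data-generating steps for a single individual can be sketched as below. For brevity the sketch assumes a constant baseline hazard, whereas the article uses an epidemic-shaped hazard; the function name and argument names are hypothetical, and vaccination status is treated as fixed over the season.

```python
import math
import random

def simulate_record(vaccinated, lam0, ve, se0, se1, kappa, season_days, rng):
    """Simulate one individual's recorded outcome.

    Returns (time, event): the recorded event time and True, or the end of
    the season and False if nothing was recorded (administrative censoring).
    Assumes a constant baseline hazard lam0 (simplification of this sketch).
    """
    hazard = lam0 * (1.0 - ve) if vaccinated else lam0
    se = se1 if vaccinated else se0
    # True event time; it never occurs if the hazard is zero.
    true_time = rng.expovariate(hazard) if hazard > 0 else math.inf
    # The true event enters the data only with probability se.
    recorded_true = true_time if rng.random() < se else math.inf
    # First false-positive event time at constant rate kappa.
    false_positive = rng.expovariate(kappa) if kappa > 0 else math.inf
    event_time = min(recorded_true, false_positive)
    if event_time < season_days:
        return event_time, True
    return season_days, False

# Baseline hazard giving a cumulative risk of 0.25 over a 196-day season.
lam0 = -math.log(1.0 - 0.25) / 196.0
rng = random.Random(2021)
records = [simulate_record(False, lam0, 0.5, 1.0, 1.0, 0.0, 196.0, rng)
           for _ in range(20000)]
event_fraction = sum(event for _, event in records) / len(records)
```

With perfect sensitivity and no false positives, as in this usage example, the fraction of unvaccinated individuals with a recorded event should be close to the cumulative risk of 0.25.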
For each setting (cohort size, \(VE\), \({se}_{0}\), \({se}_{1}\) and \(\kappa\)), \(10^{4}\) repeated datasets were simulated. For each dataset, ten random subsets were created by retaining events with probability \(p(t)\) as in (8). Adjusted \(VE\) estimates were computed with the same values of \({se}_{0}\), \({se}_{1}\) and \(\kappa\) as used in simulation. In addition, naïve \(VE\) estimates were obtained by incorrectly assuming perfect sensitivity (\({se}_{0}={se}_{1}=1\)) and/or absence of false positives (\(\kappa =0\)). Finally, the ten dataset-specific estimates were pooled resulting in \(10^{4}\) estimates of \(VE\) and \(SE\) per setting.
We report the bias as the difference between the mean of \(VE\) estimates (\(\widehat{VE}\)) and the true \(VE\). We compare the mean of the \(SE\) estimates (\(\widehat{SE}\)) with the empirical standard error of the \(VE\) estimates (\({SE}_{\widehat{VE}}\)). The estimation error (\({\sqrt{MSE}}_{\widehat{VE}}\)) is assessed as the root-mean-squared error between the \(VE\) estimates and the true \(VE\). The empirical coverage probability of the 95% confidence interval (CI) was estimated as the percentage of \(10^{4}\) CIs that included the true \(VE\).
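These performance measures are straightforward to compute from the replicate estimates. The sketch below is one possible implementation with hypothetical names; it assumes symmetric Wald intervals on the \(VE\) scale, which may differ from how the intervals were constructed in the article (for example, on the log hazard ratio scale).

```python
import math
import statistics

def performance(ve_hat, se_hat, ve_true, z=1.96):
    """Summarise simulation results.

    ve_hat: VE estimates, se_hat: their estimated standard errors,
    ve_true: the VE used to simulate the data.
    Coverage uses Wald intervals ve_hat +/- z * se_hat (assumption).
    """
    errors = [x - ve_true for x in ve_hat]
    covered = sum(1 for x, s in zip(ve_hat, se_hat) if abs(x - ve_true) <= z * s)
    return {
        "bias": statistics.mean(errors),
        "mean_se": statistics.mean(se_hat),
        "empirical_se": statistics.stdev(ve_hat),
        "rmse": math.sqrt(statistics.mean([e * e for e in errors])),
        "coverage_pct": 100.0 * covered / len(ve_hat),
    }

# Toy input (hypothetical), only to show the call.
summary = performance([0.4, 0.5, 0.6], [0.1, 0.1, 0.1], ve_true=0.5)
```

For an unbiased estimator the root-mean-squared error and the empirical standard error are approximately equal, which is the comparison drawn in the results that follow.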
Estimation of vaccine effectiveness under imperfect sensitivity in absence of false-positive events
Tables 1 and 2 show the adjusted and naïve \(VE\) estimates under non-differential sensitivity (\({se}_{0}={se}_{1}=0.04\)) and differential sensitivity (\({se}_{0}=0.05\), \({se}_{1}=0.03\)), respectively, with \(\kappa =0\) and cumulative risk of 0.25 in the unvaccinated. Table 3 and Additional file 2: Table S1 (see Additional file 2: Web Appendix) show the corresponding estimates under cumulative risk of 0.81.
The adjusted \(\widehat{VE}\)’s are unbiased. Therefore, the estimation error is equal to the standard error (\({\sqrt{MSE}}_{\widehat{VE}}={SE}_{\widehat{VE}}\)). Because the uncertainty in the plug-in weights is not taken into account, the standard errors are slightly underestimated (\(\widehat{SE}<{SE}_{\widehat{VE}}\)), leading to smaller than nominal CI coverage probabilities.
Under non-differential sensitivity, the naïve \(\widehat{VE}\)’s underestimate the true \(VE\). The bias is stronger when the cumulative risk is high (0.81). When the cumulative risk is small (0.25), the estimation errors in the naïve and adjusted estimates are similar. However, as standard errors may be small, even slight biases can lead to poor CI coverage. Under differential sensitivity (\({se}_{0}>{se}_{1}\)) and small cumulative risk, the naïve \(\widehat{VE}\)’s overestimate the true \(VE\). The estimation error in the naïve estimates exceeds the one in the adjusted estimates, indicating that the estimation is not robust to gross misspecification of \({se}_{0}\) and \({se}_{1}\). The error attenuates when the cumulative risk is high.
Estimation of vaccine effectiveness under imperfect sensitivity and false-positive events
Table 4 and Additional file 2: Table S2 (see Additional file 2: Web Appendix) show the adjusted and naïve \(VE\) estimates under non-differential sensitivity (\({se}_{0}={se}_{1}=0.04\)) and differential sensitivity (\({se}_{0}=0.05\), \({se}_{1}=0.03\)), respectively, with cumulative risk of 0.25 and false-positive proportion of 2% among unvaccinated. In general, the results correspond to the situation without false positives. The adjusted \(\widehat{VE}\)’s are essentially unbiased. As the naïve \(\widehat{VE}\)’s do not differ much between settings with and without false positives, a false-positive rate corresponding to a false-positive proportion of 2% among unvaccinated has essentially no effect on the estimation. However, under the higher false-positive proportion (16%), naïve estimates are biased but adjusted estimation performs well (Table 5).
Influenza vaccine effectiveness in the Finnish elderly
This section presents estimates of influenza vaccine effectiveness in Finnish elderly in 2016/17, a season dominated by influenza subtype A/H3N2. Briefly, a nationwide cohort of individuals aged 65 years and above was monitored through the season (196 days), using data collected as part of healthcare routines. The outcome was laboratory-confirmed influenza, which is a non-sensitive measurement of influenza infection as not everyone seeks healthcare or is swabbed. The register-based cohort study design is described in detail elsewhere [10]. For simplicity, we here focus on outcome measurement errors assuming absence of other sources of bias such as exposure measurement errors or confounding.
The cohort totalled 1,160,986 individuals of whom 47% were vaccinated during the season. There were 8389 recorded events of which 3346 occurred in vaccinated individuals. Unlike in the simulation study, \(VE\), \({se}_{0}\), \({se}_{1}\) and \(\kappa\) were unknown. The sensitivities (\({se}_{0}\), \({se}_{1}\)) were set at 0.04 (cf. Shubin et al. [17]). The false-positive rate (\(\kappa\)) was deemed very small and thus approximated as 0.
The estimated cumulative risks over the season were 0.20 (unvaccinated) and 0.16 (vaccinated; Fig. 2a). The linear relation between the log–log transformed survival functions supports the PH assumption (Fig. 2b). The adjusted \(VE\) estimate at \(t=196\) (days) was 23% (95% CI 20–26%; Fig. 2c) similar to the naïve (\({se}_{0}={se}_{1}=1\)) estimate 21% (95% CI 17–24%; Fig. 2d).
The estimates were mainly affected by the ratio \({se}_{1}/{se}_{0}\) unless \({se}_{0}\) was chosen very small (Fig. 2d). Assuming \({se}_{0}={se}_{1}=0.01\), for instance, implied rather unrealistic cumulative risks of 0.81 (unvaccinated) and 0.64 (vaccinated). The corresponding \(VE\) estimate was 37% (95% CI 34%–39%). If vaccinated cases were assumed to be less likely detected (\({se}_{1}/{se}_{0}<1\)), the \(VE\) estimates were smaller than 23%, and vice versa. For example, if \({se}_{1}=0.04\) and \({se}_{0}=0.05\), \(VE\) was 2% with 95% CI from –3% to 6%, indicating that vaccination may not have been effective.
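The link between an assumed sensitivity and the implied cumulative risk can be made explicit. In absence of false positives, the probability of a recorded event by time \(t\) equals the sensitivity times the true cumulative risk, so the implied true cumulative risk is the recorded cumulative risk divided by the assumed sensitivity. The short sketch below, with a hypothetical function name, back-calculates the recorded cumulative risks from the rounded figures reported above (0.20 and 0.16 at a sensitivity of 0.04) and rescales them to a sensitivity of 0.01.

```python
def implied_cumulative_risk(recorded_risk, se):
    """True cumulative risk implied by a recorded cumulative risk and an
    assumed sensitivity se, when there are no false positives (kappa = 0)."""
    return recorded_risk / se

# Recorded cumulative risks back-calculated from the reported (rounded) values.
recorded_unvaccinated = 0.04 * 0.20
recorded_vaccinated = 0.04 * 0.16

risk_unvaccinated = implied_cumulative_risk(recorded_unvaccinated, 0.01)
risk_vaccinated = implied_cumulative_risk(recorded_vaccinated, 0.01)
```

This gives approximately 0.80 and 0.64, close to the 0.81 and 0.64 reported above; the small gap is plausibly due to rounding of the reported cumulative risks.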
Discussion
Motivated by the Finnish policy of evaluating influenza vaccine effectiveness from register data, we developed a weighted partial likelihood approach with probabilistic deletion of false positives to adjust for outcome measurement errors. A simulation study demonstrated that the new method enables unbiased estimation of hazard ratios from time-to-event data when the underlying sensitivity of outcome measurement and the false-positive rate are known. In practice, false-positive rates that are small in relation to the true hazard can be approximated to be zero without inducing bias. Moreover, the analysis of empirical data showed that the method is robust to misspecification of the sensitivity parameters as long as their ratio (\({se}_{1}/{se}_{0}\)) is set correctly and the cumulative risk of the true event is small.
We assumed the influenza vaccine’s mode of action is “leaky”, i.e. that vaccination provides only partial protection [12, 18, 19]. The appropriate effect measure of effectiveness is therefore the relative reduction in the infection hazard, assumed here to be constant over one influenza season. When estimating effectiveness based on the risk ratio, it has previously been shown that bias is determined by the ratio of the two sensitivity parameters [4]. We demonstrated that the same applies when effectiveness is based on the hazard ratio but only if the cumulative risk of the outcome is small, i.e. if the outcome is rare so that the risk set is largely unaffected by the occurrence of events.
Unlike in studies that exclusively refer to sensitivity as performance of a utilised laboratory test (e.g. [2]), register-based studies (e.g. [20]) use sensitivity in a broader sense as resulting from recording accuracy, healthcare seeking behaviour, swabbing policy and the sensitivity of diagnostic procedures. Based on a priori knowledge on surveillance practice in Finland and register data on laboratory-confirmed influenza, Shubin et al. [17] estimated that only 1 in 25 infections were ascertained and that the sensitivity varied across age, region and season. We expect that there are also differences by vaccination status but well-founded values for the sensitivity parameters \({se}_{0}\) and \({se}_{1}\) are not yet available. A study that closely follows a representative sample of the population through an influenza season and continuously validates the individuals’ infection status would be needed.
False-positive events result from diagnostic procedures with imperfect specificity. In register-based studies, the false-positive rate is additionally influenced by the accuracy of recording, sampling strategy, the rate of non-influenza but influenza-like illness, and healthcare seeking behaviour. Although the new method allows accounting for time-varying and differential occurrence of false positives (cf. Additional file 2: Web Appendix), the presented simulation study used constant false-positive rates corresponding to 2% or 16% of all recorded events in the unvaccinated being false-positive.
The simulation study results show that the impact of relatively small false-positive rates is negligible. Yang et al. [5] developed an expectation–maximisation algorithm to estimate \(VE\) under non-specific observation of incident events when all true events are observed and validation data are available. Similarly to our method, their approach employs empirical hazards to address the problem of false positives. While Yang et al. allow repeated events, in our application the at-risk time is censored at the first recorded event. The data might then run short of true events if the false-positive rate is excessively high.
Conclusion
The presented semiparametric method can be used to adjust for outcome measurement errors in the estimation of hazard ratios and effectiveness but requires specifying the sensitivity and the false-positive rate. In the absence of exact information about these parameters, we consider our method as a tool for assessing the potential magnitude of bias given a range of parameter values, possibly stratified by appropriate covariates. The method would allow adjustment for confounders as in the PH model. Finally, although we considered an infectious disease epidemic, the applicability of the method is wider as long as the PH assumption holds.
Availability of data and materials
The simulated datasets are available from the corresponding author on reasonable request.
Abbreviations
CI: Confidence interval
SE: Standard error
VE: Vaccine effectiveness
References
1. Hill HA, Kleinbaum DG. Bias in Observational Studies. In: Armitage P, Colton T, editors. Encyclopedia of Biostatistics. 2nd ed. Hoboken: Wiley; 2005.
2. Orenstein EW, De Serres G, Haber MJ, Shay DK, Bridges CB, Gargiullo P, et al. Methodologic issues regarding the use of three observational study designs to assess influenza vaccine effectiveness. Int J Epidemiol. 2007;36(3):623–31.
3. Nauta JJ, Beyer WE, Kimp EP. Toward a better understanding of the relationship between influenza vaccine effectiveness against specific and non-specific endpoints and vaccine effectiveness against influenza infection. Epidemiol Biostatistics Public Health. 2017;14(4):637–46.
4. De Smedt T, Merrall E, Macina D, Perez-Vilar S, Andrews N, Bollaerts K. Bias due to differential and non-differential disease and exposure misclassification in studies of vaccine effectiveness. PLoS ONE. 2018;13(6):e0199180.
5. Yang Y, Halloran ME, Chen Y, Kenah E. A pathway EM-algorithm for estimating vaccine efficacy with a non-monotone validation set. Biometrics. 2014;70(3):568–78.
6. Meier AS, Richardson BA, Hughes JP. Discrete proportional hazards models for mismeasured outcomes. Biometrics. 2003;59(4):947–54.
7. Carrat F, Vergu E, Ferguson NM, Lemaitre M, Cauchemez S, Leach S, et al. Time lines of infection and disease in human influenza: a review of volunteer challenge studies. Am J Epidemiol. 2008;167(7):775–85.
8. Cox DR. Regression Models and Life-Tables. J Roy Stat Soc B. 1972;34(2):187–220.
9. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria; 2019.
10. Baum U, Auranen K, Kulathinal S, Syrjänen R, Nohynek H, Jokinen J. Cohort study design for estimating the effectiveness of seasonal influenza vaccines in real time based on register data: the Finnish example. Scand J Public Health. 2020;48(3):316–22.
11. Sullivan SG, Feng S, Cowling BJ. Potential of the test-negative design for measuring influenza vaccine effectiveness: a systematic review. Expert Rev Vaccines. 2014;13(12):1571–91.
12. Halloran ME, Longini IM, Struchiner CJ. Estimability and interpretation of vaccine efficacy using frailty mixing models. Am J Epidemiol. 1996;144(1):83–97.
13. Rubin DB. Multiple imputation After 18+ Years. J Am Statist Associat. 1996;91(434):473–89.
14. Diekmann O, Heesterbeek JAP. Mathematical epidemiology of infectious diseases: model building, analysis, and interpretation. New York: Wiley; 2000.
15. Baum U, Kulathinal S, Auranen K, Nohynek H. Effectiveness of 2 influenza vaccines in nationwide cohorts of Finnish 2-year-old children in the seasons 2015–2016 through 2017–2018. Clin Infect Dis. 2020;71(8):e255–61.
16. Hergens MP, Baum U, Brytting M, Ikonen N, Haveri A, Wiman A, et al. Mid-season real-time estimates of seasonal influenza vaccine effectiveness in persons 65 years and older in register-based surveillance, Stockholm County, Sweden, and Finland, January 2017. Euro Surveill. 2017;22(8):pii=30469.
17. Shubin M, Virtanen M, Toikkanen S, Lyytikäinen O, Auranen K. Estimating the burden of A(H1N1)pdm09 influenza in Finland during two seasons. Epidemiol Infect. 2014;142(5):964–74.
18. Smith PG, Rodrigues LC, Fine PE. Assessment of the protective efficacy of vaccines against common diseases using case-control and cohort studies. Int J Epidemiol. 1984;13(1):87–93.
19. Lipsitch M. Challenges of vaccine effectiveness and waning studies. Clin Infect Dis. 2019;68(10):1631–3.
20. Kwong JC, Buchan SA, Chung H, Campitelli MA, Schwartz KL, Crowcroft NS, et al. Can routinely collected laboratory and health administrative data be used to assess influenza vaccine effectiveness? Assessing the validity of the flu and other respiratory viruses research (FOREVER) Cohort. Vaccine. 2019;37(31):4392–400.
Acknowledgements
We thank Elizabeth Halloran of the Fred Hutchinson Cancer Research Center and the University of Washington for her insightful comments that helped to improve the manuscript.
Funding
This work was supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 634446.
Author information
Contributions
UB formulated the original research question. The concepts and methods were developed by all authors. UB implemented the methods in R and drafted the manuscript, with revisions suggested by SK and KA. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
The presented analysis of Finnish register data is covered by the research permit THL/607/6.02.00/2016, which was approved by the Institutional Review Board of the Finnish Institute for Health and Welfare in 2016.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Supplementary Information
Additional file 1.
R script. Description of data: The file provides four R functions and a working example (at the end of the script) allowing the realisation of the weighted partial likelihood approach and probabilistic deletion of false-positive events. sirdata() returns a piecewise constant hazard mimicking the force of infection in a Susceptible-Infected-Removed epidemic. simdata() simulates time-to-event data and estdata() estimates the vaccine effectiveness and its standard error as described in the main text. outdata() combines the data simulation and estimation steps and returns the performance measures presented in the manuscript.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Baum, U., Kulathinal, S. & Auranen, K. Mitigation of biases in estimating hazard ratios under non-sensitive and non-specific observation of outcomes–applications to influenza vaccine effectiveness. Emerg Themes Epidemiol 18, 1 (2021). https://doi.org/10.1186/s12982-020-00091-z
Keywords
 Influenza
 Outcome measurement error
 Proportional hazards model
 Survival analysis
 Vaccine effectiveness