Emerging Themes in Epidemiology (Open Access, Methodology)
Testing Bias in Clinical Databases: Methodological Considerations

Background: Laboratory testing in clinical practice is never a random process. In this study we evaluated testing bias for neutrophil counts in clinical practice by using results from requested and non-requested hematological blood tests.
Methods: This study was conducted using data from the Utrecht Patient Oriented Database. This clinical database is unique, as it contains not only physician-requested data but also data that were not requested by the physician and were measured as a result of requests for other hematological parameters. We identified adult patients hospitalized in 2005 with at least two blood tests during admission, and contrasted requests for general blood profiles with requests specifically for neutrophil counts in scenario analyses. Possible effect modifiers were diagnosis and glucocorticoid use.
Results: A total of 567 patients with requested neutrophil counts and 1,439 patients with non-requested neutrophil counts were analyzed. The mean absolute neutrophil count at admission was 7.4 × 10⁹/l for requested counts and 8.3 × 10⁹/l for non-requested counts (p-value < 0.001). Cardiovascular disease as underlying disease explained 83.2% of this difference, and glucocorticoid use explained 4.5%.
Conclusion: Requests for neutrophil counts in clinical databases are associated with underlying disease, and with cardiovascular disease in particular. The results from our study show the importance of evaluating testing bias in epidemiological studies obtaining data from clinical databases.


Background
In recent years, large health care databases have been used increasingly and have become important tools in epidemiological research [1,2]. Their advantages are that large amounts of clinical data are available at relatively low cost and that these databases usually reflect daily practice [3,4]. However, in contrast to randomized clinical trials, where data collection is well controlled, bias should always be considered when using routinely collected data in automated databases, and methodological issues should be taken into account [3,5-7].
Laboratory testing in clinical practice is never a random process, as the physician has reasons to perform a test. Because of patient burden and costs, physicians selectively request tests for patients with a high probability of abnormalities and less frequently for patients with a low probability [8]. Such selective processes might induce testing bias in clinical database studies. There are several strategies to minimize testing bias, including selection of proper patient populations, measuring outcomes for all study participants, blind testing, or using imputation techniques to deal with missing data [8-10], but these techniques do not provide insight into the size and direction of testing bias.
One example where testing bias might occur is in physicians' requests for blood tests. Neutrophil counts in peripheral blood are considered a useful biomarker for disease severity in many conditions [11-14]. However, testing bias might occur because of underlying disease or medication use, as neutrophil counts differ in several diseases and clinical observations have shown that patients using glucocorticoids often have higher neutrophil counts. Requesting neutrophil counts specifically for certain diseases or for glucocorticoid users might cause testing bias in clinical databases. The aim of this study was to evaluate testing bias for neutrophil counts in clinical practice by using results from requested and non-requested hematological blood tests.

Setting
This study was conducted using data from the Utrecht Patient Oriented Database (UPOD). UPOD is an infrastructure of relational databases comprising administrative data on patient characteristics, laboratory test results, medication orders, discharge diagnoses and medical procedures for all patients treated at the University Medical Center (UMC) Utrecht, a 1,042-bed tertiary teaching hospital in the center of the Netherlands. Each year, approximately 165,000 patients are treated during more than 28,000 hospitalizations, 15,000 day-care treatments, and 334,000 outpatient visits. UPOD data acquisition and data management are in accordance with current Dutch privacy and ethical regulations. A more complete description of UPOD has been published elsewhere [15].
UPOD is a unique clinical database as it contains results of hematological blood tests measured with Cell-Dyn Sapphire automated blood cell analyzers (Abbott Diagnostics, Santa Clara, California, USA) [15]. A feature of this analyzer is that it measures all hematological parameters irrespective of whether these are requested or not [15]. The non-requested parameters are measured because each hematological test is technically linked to the other hematological tests, so that all are conducted automatically when any one of them is requested. In other words, UPOD contains requested (Figure 1) and non-requested test results (Figure 2). Although non-requested neutrophil counts are not reported to the clinician, these neutrophil counts are collected in UPOD.

Scenarios
By comparing the measured hematological parameters with the routine hospital laboratory reporting system, which reports laboratory results to physicians, neutrophil counts can be categorized as requested or non-requested. Neutrophil counts appearing in the laboratory reporting system were categorized as requested; all other neutrophil counts were categorized as non-requested. Using these data, we conducted two scenario analyses. Scenario 1 reflects the situation in a typical clinical database, where all blood tests were requested. With scenario 2 we were able to study testing bias by including non-requested blood tests in our analysis.
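The categorization described above can be illustrated with a minimal Python sketch. It is not the UPOD implementation: the `Measurement` record, the `classify_requests` function and the sample identifiers are all hypothetical, and the only rule taken from the text is that a count is requested if it appears in the laboratory reporting system and non-requested otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One analyzer result (hypothetical record layout, not the UPOD schema)."""
    patient_id: str
    sample_id: str
    neutrophils: float  # absolute neutrophil count, x 10^9/l


def classify_requests(measured, reported_sample_ids):
    """Label each measured count as 'requested' or 'non-requested'.

    A count is 'requested' when its sample also appears in the laboratory
    reporting system; every other measured count is 'non-requested'.
    """
    reported = set(reported_sample_ids)
    return {
        m.sample_id: "requested" if m.sample_id in reported else "non-requested"
        for m in measured
    }


# Illustrative data only; these are not values from the study.
measured = [
    Measurement("p1", "s1", 7.2),
    Measurement("p2", "s2", 8.9),
    Measurement("p3", "s3", 5.4),
]
reported_sample_ids = ["s1", "s3"]
labels = classify_requests(measured, reported_sample_ids)
```

In this sketch, scenario 1 would use only the samples labeled requested, and scenario 2 only those labeled non-requested.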

Study population
The source population comprised 3,467 adult (18 years or older) users and non-users of glucocorticoids who were hospitalized in the UMC Utrecht in 2005 and had at least two hematological blood tests spanning a period of at least one day. For each glucocorticoid user, the first blood test during admission and the last blood measurement during in-hospital glucocorticoid use were selected for analysis. Up to four unexposed patients were sampled for each glucocorticoid user according to calendar time (at most 15 days before or after the test date of the user), neutrophil count at time of admission (at most 2 × 10⁹ neutrophils/l above or below the neutrophil count of the user) and days between the two blood samples (at most two days more or fewer than for the glucocorticoid user). According to our laboratory normal reference range for neutrophils (1.6-8.3 × 10⁹/l), there is large inter-individual variation in the absolute neutrophil count. Using two blood tests, we were able to study testing bias in both blood tests separately, but also in the change in neutrophil count during hospitalization for each patient. Within the source population, we contrasted patients with both blood tests requested and patients with both blood tests non-requested, one test at time of admission and one at the end of hospitalization. For all participants the discharge diagnosis was defined according to the ICD-9-CM code [16].
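The three sampling criteria for unexposed patients can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the `Patient` record and both function names are hypothetical, and the sketch simply keeps the first four eligible candidates, whereas the text does not specify how patients were chosen when more than four were eligible.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Patient:
    """Hypothetical record holding the three matching variables."""
    patient_id: str
    test_date: date
    admission_neutrophils: float  # x 10^9/l
    days_between_tests: int


def is_eligible_match(user, candidate, max_days=15, max_count_diff=2.0,
                      max_interval_diff=2):
    """Check the three criteria: calendar time, admission count, test interval."""
    return (
        abs((candidate.test_date - user.test_date).days) <= max_days
        and abs(candidate.admission_neutrophils
                - user.admission_neutrophils) <= max_count_diff
        and abs(candidate.days_between_tests
                - user.days_between_tests) <= max_interval_diff
    )


def sample_matches(user, unexposed, max_matches=4):
    """Return up to max_matches eligible unexposed patients.

    Assumption: the first eligible candidates are taken in list order.
    """
    eligible = [c for c in unexposed if is_eligible_match(user, c)]
    return eligible[:max_matches]


# Illustrative data only; these are not patients from the study.
user = Patient("u1", date(2005, 3, 10), 7.5, 5)
unexposed = [
    Patient("c1", date(2005, 3, 1), 8.0, 4),    # eligible
    Patient("c2", date(2005, 4, 20), 7.5, 5),   # test date too far away
    Patient("c3", date(2005, 3, 12), 10.0, 5),  # admission count too different
    Patient("c4", date(2005, 3, 20), 6.0, 7),   # eligible
    Patient("c5", date(2005, 3, 10), 7.5, 9),   # test interval too different
]
matched_ids = [p.patient_id for p in sample_matches(user, unexposed)]
```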

Data analysis
Student t-tests, Mann-Whitney tests, and chi-square tests were used to test for differences between groups, as appropriate. Confounding was studied using logistic regression. The absolute neutrophil count was categorized into tertiles to obtain three equally sized groups. These three groups were defined as an increase, decrease or no change in neutrophil count, where no change was the comparator group. Potential confounding factors were included in a multivariable logistic model in sequence of decreasing confounding strength. Potential confounders that were studied were age, gender, number of days between blood samples, length of hospitalization, death during hospitalization, and diagnosis. All variables that changed the regression coefficient for glucocorticoid use by less than ten percent were excluded from the model. Of these potential confounders, only diagnosis had a substantial effect on the comparison between scenarios. Glucocorticoid use was studied because of the ongoing discussion about the effect of glucocorticoids on the neutrophil count [12,17,18]. Subsequently, linear regression analysis was used to estimate the proportion of bias associated with diagnostic subgroups and glucocorticoid use. The beta-coefficient for the contrasted scenarios was calculated for all patients in the study population as well as for only those patients exposed to one factor (for example a diagnostic subgroup or glucocorticoid use). The proportion of bias explained by one factor was calculated as the weighted fraction of beta-coefficients. All analyses were conducted using SPSS for Windows, version 14.0 (SPSS Inc., Chicago, Illinois, USA).
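The ten percent change-in-estimate criterion can be sketched in a few lines of Python. The sketch is a simplification and not the SPSS procedure used in the study: it compares each adjusted coefficient with the crude coefficient one variable at a time, instead of adding variables in sequence, and the function names and all coefficient values are hypothetical.

```python
def relative_change(beta_crude, beta_adjusted):
    """Relative change in the coefficient after adjusting for one variable."""
    return abs(beta_adjusted - beta_crude) / abs(beta_crude)


def select_confounders(beta_crude, adjusted_betas, threshold=0.10):
    """Keep variables that change the coefficient by at least `threshold`.

    Returned names are ordered by decreasing confounding strength.
    """
    changes = {
        name: relative_change(beta_crude, beta)
        for name, beta in adjusted_betas.items()
    }
    ranked = sorted(changes.items(), key=lambda item: item[1], reverse=True)
    return [name for name, change in ranked if change >= threshold]


# Hypothetical coefficients for glucocorticoid use; not results from the study.
beta_crude = 0.50
adjusted_betas = {
    "age": 0.49,        # 2% change, excluded
    "gender": 0.51,     # 2% change, excluded
    "diagnosis": 0.35,  # 30% change, retained
}
retained = select_confounders(beta_crude, adjusted_betas)
```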

Results
A total of 567 patients with requests for the absolute neutrophil count (scenario 1) and 1,439 patients with non-requested neutrophil counts (scenario 2) were identified. The absolute neutrophil count was most frequently requested in the context of a leukocyte differential request, which includes the absolute counts of neutrophils, eosinophils, lymphocytes, monocytes, and basophils (99.8% of all neutrophil count requests). Of patients with requested neutrophil counts, 97.2% also had a hemoglobin request. For patients with non-requested neutrophil counts, 96.1% of the requests were for hemoglobin. Hemoglobin values were lower when requested compared with non-requested hemoglobin values (Table 1).
For the first blood test, lower neutrophil counts were found for patients with requested neutrophil counts compared with non-requested neutrophil counts (Table 1). Comparable neutrophil counts were found in the second blood test. For both blood tests, there were more patients with neutropenia and fewer patients with neutrophilia for requested neutrophil counts. Studying the change in the absolute neutrophil count during hospitalization for each patient, patients with non-requested neutrophil counts had a mean decrease of 0.50 × 10⁹ neutrophils/l compared with a slight increase of 0.14 × 10⁹ neutrophils/l for patients with requested neutrophil counts (p-value 0.008, Table 1, Figure 3).
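The two quantities compared here, the classification of a count against the reference range and the per-patient change between the two blood tests, can be computed as in the sketch below. It assumes that neutropenia and neutrophilia mean counts below and above the laboratory reference range of 1.6-8.3 × 10⁹/l given in the study population section; the text does not state the cut-offs explicitly, and the function names and example values are hypothetical.

```python
REFERENCE_LOW = 1.6   # lower limit of the reference range, x 10^9/l
REFERENCE_HIGH = 8.3  # upper limit of the reference range, x 10^9/l


def classify_count(count):
    """Classify an absolute neutrophil count against the reference range.

    Assumption: values outside the range define neutropenia/neutrophilia.
    """
    if count < REFERENCE_LOW:
        return "neutropenia"
    if count > REFERENCE_HIGH:
        return "neutrophilia"
    return "normal"


def mean_change(pairs):
    """Mean per-patient change (second test minus first test)."""
    changes = [second - first for first, second in pairs]
    return sum(changes) / len(changes)


# Illustrative (first test, second test) pairs; not values from the study.
example_pairs = [(8.0, 7.0), (6.0, 6.0)]
example_change = mean_change(example_pairs)
```

A negative mean change corresponds to a decrease in neutrophil count during hospitalization.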
Overall, the main diagnostic subgroups were cardiovascular disease (28.9%), neoplasms (14.5%), and respiratory disease (4.5%). Neutrophil counts were more often requested for patients suffering from cardiovascular or respiratory diseases, whereas diagnoses for non-requested neutrophil counts were much more diffuse, with multiple diagnoses (Table 1). There were no differences in absolute neutrophil count or change in neutrophil count among patients with neoplasms and respiratory disease (Table 2). However, among patients with cardiovascular disease there was a lower absolute neutrophil count in the first blood test for requested counts compared with non-requested neutrophil counts. When cardiovascular patients were excluded from analysis, the absolute neutrophil counts in the first blood test were equal, at 8.1 × 10⁹/l in both scenarios (Figure 4). Cardiovascular disease explained 83.2% of the difference in absolute neutrophil count between scenarios in the first blood test (p-value for effect modification 0.002). Incorporating glucocorticoid use in the linear regression model showed that diagnosis was far more important than glucocorticoid use (p-value for effect modification 0.240). Taking diagnosis into account, 4.5% of the difference in absolute neutrophil count between scenarios in the first blood test could be explained by glucocorticoid use.
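The data analysis section defines the proportion of bias explained by one factor as a weighted fraction of beta-coefficients, without giving the exact weighting. The sketch below shows one possible reading, in which the subgroup coefficient is weighted by the subgroup's share of the study population; this weighting is an assumption, as are the function names and all numbers, which are not results from the study.

```python
from statistics import mean


def scenario_beta(requested, non_requested):
    """Coefficient for scenario in a linear regression of count on scenario.

    With a single 0/1 predictor (0 = requested, 1 = non-requested), the
    least-squares slope equals the difference in group means.
    """
    return mean(non_requested) - mean(requested)


def proportion_explained(beta_all, beta_factor, n_all, n_factor):
    """Assumed weighting: subgroup share of patients times ratio of betas."""
    return (n_factor / n_all) * beta_factor / beta_all


# Hypothetical counts (x 10^9/l) for all patients and for one subgroup.
beta_all = scenario_beta([6.0, 7.0, 8.0], [7.0, 8.0, 9.0])      # 1.0
beta_factor = scenario_beta([6.0, 7.0, 8.0], [8.0, 9.0, 10.0])  # 2.0
share = proportion_explained(beta_all, beta_factor, n_all=100, n_factor=30)
```

Under this reading, a factor explains a large share of the bias when the difference between scenarios is concentrated in a subgroup that also makes up a substantial part of the population.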
With respect to the absolute neutrophil count in the second blood test, there were no differences between the scenarios, either in the overall analysis or in the main diagnostic subgroups. An increase in neutrophil count of 2.1 × 10⁹/l was shown for requested neutrophil counts in cardiovascular patients, whereas non-requested counts revealed a decrease of 0.4 × 10⁹ neutrophils/l (Table 2). Excluding cardiovascular disease from analysis, the change in neutrophil count was comparable in both scenarios with a mean decrease of 0.9 × 10⁹/l for each patient with requested neutrophil counts and a mean decrease of 0.5 × 10⁹/l for each patient including only non-requested neutrophil counts (p-value = 0.211).

Discussion
In this study, we used UPOD to study bias in neutrophil testing, as this database contains both requested and non-requested neutrophil counts. Among tests for which neutrophil counts were requested, hemoglobin was also requested in 97% of cases. Of the non-requested neutrophil test results, 96% were generated by hemoglobin requests. Therefore, hemoglobin requests approximate random testing and can be used as a comparator group. Absolute neutrophil counts differed for requested tests (scenario 1) compared with non-requested tests (scenario 2), which leads to the conclusion that testing bias was present in this study.
The bias in absolute neutrophil count in the first blood test could be explained for 83.2% by cardiovascular disease. This finding could reflect the role of neutrophils in cardiovascular disease [13,19,20]. After excluding cardiovascular disease from analysis, there were no differences in absolute neutrophil count or change in neutrophil count for each patient. This could be explained by the fact that the absolute neutrophil count was mainly requested in the context of a leukocyte differential count. Similar findings were observed for the change in neutrophil count. For the second neutrophil test, no differences were found between the scenarios. The second neutrophil test was at the end of hospitalization. The first neutrophil test, at time of admission, is more informative because the patients are likely to be severely ill at that point. At time of the second blood test the difference in absolute neutrophil count has evened out, as patients are healthier at the end of hospitalization.
Figure 2. Is unique for UPOD as this includes non-requested neutrophil counts. These non-requested neutrophil counts are measured because this test is conducted automatically when one hematological test, for example hemoglobin, is requested.
The results of this study are in accordance with other studies finding that data are not missing at random [21,22]. Further research is needed to study the clinical relevance of the bias found in this study. Distributions of diagnostic subgroups and testing guidelines might vary between health care institutions. As a consequence, the generalizability of clinical implications, such as the association with cardiovascular disease in this study, might be limited. However, testing bias is an issue in all centers and should be evaluated so that it can be adjusted for. Among laboratory tests, a randomly tested parameter, such as hemoglobin in this study, could serve as a comparator group to study testing bias. With knowledge about the size and direction of testing bias, strategies such as imputation techniques [8,21] could adjust for this bias in order to obtain an unbiased risk estimate in epidemiological studies.
With the development of automated machines for routine analysis, more parameters are measured than requested. When these non-requested parameters are collected, testing randomness is introduced. UPOD contains requested neutrophil counts and non-requested neutrophil counts, as well as other non-requested hematological blood tests. Therefore, the database is especially suitable to study and adjust for testing bias in clinical research questions. When conducting studies with laboratory markers in UPOD, correction factors for requested testing can be added to the statistical model to minimize testing bias in order to obtain an unbiased risk estimate.
A classic example of testing bias is the association between thrombosis and use of oral contraceptives. Many studies have traditionally stated that the size of this association is overestimated because of diagnostic suspicion bias and referral bias, both types of testing bias [23,24]. However, a case-control study with the same referral and diagnostic strategies for cases and controls showed that neither type of bias played a major role in previous studies, and that the risk of thrombosis while using oral contraceptives is not solely due to bias [9]. This example and the results from our study show the importance of evaluating testing bias in epidemiological studies obtaining data from clinical databases.
Figure 4. The effect of cardiovascular disease on the absolute neutrophil count of the first blood test. The higher neutrophil count in scenario 2 is explained by a high neutrophil count among cardiovascular patients in scenario 2. Excluding cardiovascular disease from analysis, there was no difference between the scenarios. Scenario 1 included requested neutrophil counts, scenario 2 included non-requested neutrophil counts.