Development of a quality assessment tool for systematic reviews of observational studies (QATSO) of HIV prevalence in men having sex with men and associated risk behaviours



Systematic reviews based on the critical appraisal of observational and analytic studies on HIV prevalence and risk factors for HIV transmission among men having sex with men are very useful for health care decisions and planning. Such appraisal is particularly difficult, however, as the quality assessment tools available for use with observational and analytic studies are poorly established.


We reviewed the existing quality assessment tools for systematic reviews of observational studies and developed a concise quality assessment checklist to help standardise decisions regarding the quality of studies, with careful consideration of issues such as external and internal validity.


A pilot version of the checklist was developed based on epidemiological principles, reviews of study designs, and existing checklists for the assessment of observational studies. The Quality Assessment Tool for Systematic Reviews of Observational Studies (QATSO) Score consists of five items: external validity (1 item), reporting (2 items), bias (1 item) and confounding factors (1 item). Expert opinions were sought, and the tool was tested on manuscripts that fulfilled the inclusion criteria of a systematic review. Like all assessment scales, QATSO may oversimplify and generalise information, yet it is inclusive, simple and practical to use, and allows comparability between papers.


We have developed a specific tool that allows researchers to appraise and guide the quality of observational studies; it can be modified for similar studies in the future.


Epidemiological evidence-based research is becoming an increasingly important basis for health care decisions and planning. There is a dearth of reviews of observational and analytic studies on HIV prevalence and risk factors for HIV transmission among men having sex with men (MSM), and this is particularly the case in mainland China [1–3]. We sought to conduct a rigorous systematic review summarising HIV prevalence data in MSM and to measure their associated high risk behaviours in China, with the aim of providing systematic and comprehensive data for policymakers to devise appropriate plans for health promotion and interventions to control the spread of HIV in the target population.

A number of consensus statements have previously been prepared to encourage higher quality of reporting, including recommendations for reporting systematic reviews (QUOROM)[4], randomized trials (CONSORT)[5], studies of diagnostic tests (STARD)[6], meta-analyses of observational studies (MOOSE)[7] and observational epidemiological studies (STROBE)[8, 9]. However, all these were aimed at authors of reports, not at those seeking to assess the validity of what they read [10]. Of particular relevance here are the MOOSE and STROBE statements, both of which were developed as checklists designed to assist authors when writing up analytical observational studies, to support editors and reviewers when considering such articles for publication, and to help readers when critically appraising published articles [8]. However, there remains a clear disparity between the quality of tools available to aid the critical appraisal of observational studies when compared with those available for controlled trials, making the systematic review of the former particularly difficult. We believe a quality assessment tool is the key to any systematic review as it allows original research to be objectively appraised and evaluated, in order to inform subsequent decisions regarding inclusion by evaluating, ranking, or scoring the relevant studies [11–13].

A study conducted by Mallen and coworkers [14] in 2006 revealed that quality assessment tools were grossly under-utilised in the evaluation of observational studies: only 13 out of 40 articles from 2003–2004 used published checklists or quality assessment tools such as NHS CRD [15, 16], MOOSE [7], the Downs and Black checklist [17] and the Newcastle-Ottawa Scale [18]. Of such tools, the Newcastle-Ottawa Scale (NOS) is one of the more comprehensive instruments for assessing the quality of non-randomised studies in meta-analyses: the 8-item instrument consists of three subscales, namely selection of subjects (4 items), comparability of subjects (1 item) and assessment of outcome/exposure (3 items). Despite having been recommended by the Cochrane Non-Randomized Studies Methods Working Group [19], it is only partly validated and is primarily used to appraise cohort studies and case-control studies [18]. In short, our major challenge is that each study is to some extent unique, and that a quality checklist may consequently not include items that may be considered relevant for the purposes of the intended meta-analysis. We therefore set out to develop a concise quality assessment checklist to help standardise decisions regarding the quality of studies, with careful consideration of issues such as external and internal validity.

Results and discussion


Often both internal and external validity are assessed together during methodological quality assessment as interpretation of the findings of a study depends on design, conduct and analyses (internal validity), as well as on populations, interventions and outcome measures (external validity). The information gained from quality assessment is crucial in determining the strength of inferences and in assigning grades to recommendations generated within a review.

Our team proposed to identify case-control studies, cross-sectional studies with case-control design in the questions, and those intervention studies that address prevalence rates. A pilot version of the checklist was developed based on epidemiological principles, reviews of study designs, and existing checklists for the assessment of observational studies. It was later modified in light of preliminary and pilot application. The final tool, abbreviated as QATSO Score, covers the following aspects (Additional file 1):

1) External validity (1 item) – addresses the extent to which the findings from the study can be generalised to the population from which the study subjects are derived.

2) Reporting (2 items) – assesses whether the information provided in the paper is sufficient to allow a reader to make an unbiased assessment of the findings of the study. One of the items is specific for prevalence studies.

3) Bias (1 item) – addresses bias in the measurement of the outcomes in a study.

4) Confounding (1 item) – addresses whether studies have applied adjustment for confounding in the analysis. This item is specific to studies concerning association of risk factors.

Although the QATSO Score consists of five items, users may select 4–5 items depending on the type of studies being evaluated. Studies achieving 67% or more of the maximum score will be regarded as "good" quality; 34–66% as "fair"; and below 33% as "poor".
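The grading rule can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the text: each item is scored 0 or 1, the percentage is taken over the items actually applied, and the item names are hypothetical labels rather than the wording of Additional file 1. Because the published bands leave small gaps between 33% and 34% and between 66% and 67%, the sketch simply treats anything under 34% as "poor" and anything under 67% as "fair"; with 4 or 5 binary items no attainable percentage falls in those gaps.

```python
from typing import Dict


def qatso_percentage(item_scores: Dict[str, int]) -> float:
    """Return the percentage of applied items that were satisfied.

    Assumption: every item is scored 0 (not met) or 1 (met).
    """
    if not 4 <= len(item_scores) <= 5:
        raise ValueError("QATSO is applied with 4 or 5 items")
    if any(score not in (0, 1) for score in item_scores.values()):
        raise ValueError("each item must be scored 0 or 1 in this sketch")
    return 100.0 * sum(item_scores.values()) / len(item_scores)


def qatso_grade(percentage: float) -> str:
    """Map a percentage to the quality bands given in the text."""
    if percentage >= 67:
        return "good"
    if percentage >= 34:
        return "fair"
    return "poor"  # assumption: everything below 34% is treated as "poor"


# Hypothetical prevalence study: the confounding item is not applied,
# so only four items are scored. Item names are illustrative only.
example_study = {
    "external_validity": 1,
    "reporting_general": 1,
    "reporting_response_rate": 0,
    "measurement_bias": 1,
}
example_percentage = qatso_percentage(example_study)
print(example_percentage, qatso_grade(example_percentage))
```

In this hypothetical case three of four applied items are met, giving 75% and a "good" rating.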


Experts from the Hong Kong Branch of the Chinese Cochrane Centre and local HIV researchers (see acknowledgement) were invited to provide comments on the content validity of the assessment tool. The tool was then pilot-tested by two independent reviewers to test the consistency of the quality ratings. The two reviewers were asked to assess 10 observational studies selected at random from a group of 30 identified during a systematic review of HIV prevalence in MSM and associated risk factors. The reviewers were given guidance on the interpretation of the checklist items before reviewing the papers. Inter-rater reliability was shown to be good (Pearson coefficient = 0.86).
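For readers who wish to repeat this kind of reliability check, the following is a minimal Python sketch of the Pearson coefficient between two reviewers' scores. The ratings shown are invented for illustration and are not the data from this pilot; the function name is likewise our own.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one rater's scores are constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical QATSO percentages given by two reviewers to the same 10 papers.
reviewer_1 = [75, 50, 100, 25, 75, 60, 80, 40, 100, 50]
reviewer_2 = [75, 50, 80, 25, 100, 60, 80, 20, 100, 75]
print(round(pearson_r(reviewer_1, reviewer_2), 2))
```

Note that the Pearson coefficient measures linear association between the two sets of scores rather than exact agreement on each paper.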

In order to evaluate the practicality of the tool, the time used to assess each paper was recorded. The reviewers reported that they took an average of 10.4 ± 4.6 minutes to assess one paper with the QATSO Score, compared with 23.0 ± 4.5 minutes (p < 0.001) when applying a validated, lengthy checklist (comprising 27 items) reported elsewhere [11].


We searched articles published in English and Chinese languages in the following electronic databases: MEDLINE (1966 to December 2006), EMBASE (1980 to December 2006), ProQuest Social Science Journal (1989 to December 2006), Anthropology (1984 to 1996), China Journal Net (1994 to December 2006) and Wan Fang Data (1998 to December 2006). To retrieve publications reporting HIV prevalence and risk behaviours among MSM in Mainland China, we used a combined search strategy that included the following terms as both medical subject heading (MeSH) terms and text words: "prevalence", "epidemiology", "HIV infections", "Acquired Immunodeficiency Syndrome", "AIDS", "MSM", "male having sex with male", "men having sex with men", "men who have sex with men", "homosexuality, male", "gay", "homosexual", "bisexual", "queer", "male sex worker", "male sexual worker", "sexual risk behaviour", "sexual behaviour", "risk taking", "risk factors", "protective factors", "China" and "Tibet". We manually searched for review articles and abstracts from the reference lists of identified articles. Additional reports obtained from known experts in the field, through our contacts and other professionals, were also included for review.

Data were abstracted independently onto a standardized form by two reviewers. Data abstracted included study design, time period of study, place of origin, study setting, HIV prevalence, information source for exposure measurement, total number of persons in each group, and odds ratios or risk ratios, with and without adjustment for potential confounders. Conflicts in data abstraction were resolved by consensus. Data reporting conforms to the Meta-analysis of Observational Studies in Epidemiology (MOOSE) study group guidelines [7]. QATSO was then applied to assess the standard of each paper that fulfilled the inclusion criteria.

During this process, we found that QATSO may over-simplify and generalise information one could extract from a published manuscript, an issue inherent in all quality assessment tools. For example, the relative importance of individual items will be lost through a summation of items represented by a total score. A careful balance has to be struck so that the final scale is inclusive and allows comparability between papers, yet is simple and practical to use. Secondly, any attempts at summarising quality on, for example, the inclusion or exclusion of a particular item, will invariably lose the significance of that item's magnitude. For example, a reported response rate per se does not necessarily mean that the response rate is satisfactory (item three in the scale); we therefore selected an arbitrary 60% response rate as a cut-off for acceptable quality. However, it is important to emphasise that the objective of this tool is to appraise and guide study quality; actual analyses are conducted in the next phase of systematic review or meta-analysis which will be reported elsewhere.
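The loss of information caused by a cut-off can be made concrete with a short Python sketch. It assumes that the response-rate item scores 1 when a rate is reported and is at least 60%, and 0 otherwise (including when no rate is reported); whether the boundary is inclusive and how unreported rates are handled are our assumptions, and the study names and rates are invented.

```python
from typing import Dict, Optional

RESPONSE_RATE_CUTOFF = 0.60  # the arbitrary 60% cut-off described in the text


def response_rate_item(response_rate: Optional[float],
                       cutoff: float = RESPONSE_RATE_CUTOFF) -> int:
    """Score the response-rate item as 0 or 1.

    Assumptions: an unreported rate (None) scores 0, and a rate equal
    to the cut-off counts as acceptable.
    """
    if response_rate is None:
        return 0
    if not 0.0 <= response_rate <= 1.0:
        raise ValueError("response rate must be a proportion between 0 and 1")
    return 1 if response_rate >= cutoff else 0


# Hypothetical studies: A and B receive the same item score although
# their response rates differ widely, while A and C differ by only 2 points.
hypothetical_rates: Dict[str, Optional[float]] = {
    "study_A": 0.61,
    "study_B": 0.95,
    "study_C": 0.59,
    "study_D": None,  # response rate not reported
}
item_scores = {name: response_rate_item(rate)
               for name, rate in hypothetical_rates.items()}
print(item_scores)  # {'study_A': 1, 'study_B': 1, 'study_C': 0, 'study_D': 0}
```

The example shows why the item score alone cannot convey the magnitude of the response rate.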


Few quality assessment tools for the systematic review of observational studies are available and relevant for HIV prevalence in MSM and associated risk behaviours. We have developed a specific tool that researchers who wish to conduct similar systematic reviews can adopt to ensure that studies reach a level of quality that permits their inclusion in meta-analyses.


  1. Zhang BC, Chu QS: HIV and HIV/AIDS in China. Cell Res. 2005, 15: 858-64. 10.1038/

  2. Frankis J, Flowers P: Men who have sex with men (MSM) in public sex environments (PSES): a systematic review of quantitative literature. AIDS Care. 2005, 17: 273-88. 10.1080/09540120412331299799

  3. Colby D, Cao NH, Doussantousse S: Men who have sex with men and HIV in Vietnam: a review. AIDS Educ Prev. 2004, 16: 45-54. 10.1521/aeap.

  4. Moher D, Cook DJ, Eastwood S, Olkin I, Rennie D, Stroup DF: Improving the quality of reports of meta-analyses of randomised controlled trials: the QUOROM statement. Quality of reporting of meta-analyses. Lancet. 1999, 354: 1896-900. 10.1016/S0140-6736(99)04149-5

  5. Deeks JJ, Dinnes J, D'Amico R, Sowden AJ, Sakarovitch C, Song F, Petticrew M, Altman DG: Evaluating non-randomised intervention studies. Health Technol Assess. 2003, 7: iii-173.

  6. West S, King V, Carey TS, Lohr KN, McKoy N, Sutton SF, Lux L: Systems to Rate the Strength of Evidence. Evidence Report/Technology Assessment No. 47. 2002. Agency for Healthcare Research and Quality, Rockville, MD. AHRQ Publication No. 02-E016.

  7. Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, Moher D, Becker BJ, Sipe TA, Thacker SB: Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA. 2000, 283: 2008-12. 10.1001/jama.283.15.2008

  8. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP: The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: Guidelines for Reporting Observational Studies. PLoS Medicine. 2007, 4 (10): e296. 10.1371/journal.pmed.0040296

  9. Altman D, Egger M, Pocock S, Vandenbroucke JP, von Elm E: Strengthening the reporting of observational epidemiological studies. STROBE Statement: Checklist of Essential Items Version 3. 2005

  10. Sanderson S, Tatt ID, Higgins JP: Tools for assessing quality and susceptibility to bias in observational studies in epidemiology: a systematic review and annotated bibliography. Int J Epidemiol. 2007, 36: 666-676. 10.1093/ije/dym018

  11. Systematic reviews in healthcare: meta-analysis in context. Edited by: Egger M, Davey Smith G, Altman DG. 2000, BMJ Publishing Group, London; 2.

  12. Altman DG: Systematic Review on Prognostic Study. BMJ. 2001, 323: 224-8. 10.1136/bmj.323.7306.224

  13. West S, King V, Carey TS, Lohr KN, McKoy N, Sutton SF, Lux L: Systems to rate the strength of scientific evidence. In Evidence report/technology assessment No. 47 (prepared by the Research Triangle Institute – University of North Carolina Evidence-based Practice Center under Contract No. 290-97-0011). 2002, AHRQ Publication No. 02-E016, Agency for Healthcare Research and Quality, Rockville, MD; 2002.

  14. Mallen C, Peat G, Croft P: Quality assessment of observational studies is not commonplace in systematic reviews. J Clin Epidemiol. 2006, 59: 765-769. 10.1016/j.jclinepi.2005.12.010

  15. Deeks J, Glanville J, Sheldon T: Undertaking Systematic Reviews of Effectiveness: CRD Guidelines for Those Carrying Out or Commissioning Reviews. York, England: NHS Centre for Reviews and Dissemination, University of York; 1996, CRD Report 4

  16. Khan KS, Riet G, Popay J, Nixon J, Kleijnen J: Conducting the review Study quality assessment.

  17. Downs SH, Black N: The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies in health care intervention. J Epidemiol Community Health. 1998, 52: 377-84.

  18. Wells GA, Shea B, O'Connell D, Petersen J, Welch V, Losos M, Tugwell P: The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomized studies in meta-analyses. Department of Epidemiology and Community Medicine, University of Ottawa, Canada

  19. The Cochrane Collaborative Review Group on HIV Infection and AIDS: Editorial Policy: Inclusion and Appraisal of Experimental and Non-experimental (Observational) Studies.


We would like to thank Tang JL, The Hong Kong Branch of the Chinese Cochrane Centre, and Professor SS Lee, Professor of Infectious Diseases, The Chinese University of Hong Kong, for their valuable comments on the development of the quality assessment tool. Sincere thanks also go to Miss Amie Bingham, who helped to edit and proof-read this manuscript.

Author information

Corresponding author

Correspondence to William CW Wong.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

WCWW conceived the original idea for the study, was involved in the design and testing of the QAT, and in the drafting and write-up of the final report. CSKC was involved in the testing of the QAT, as well as conducting the literature search and contributing to the write-up of the final report. GH was involved in designing the QAT, as well as in the preparation and writing of the final report. All authors have read and approved the manuscript.

Electronic supplementary material


Additional file 1: Quality assessment checklist for observational studies (QATSO Score) concerning HIV prevalence/risk behaviours among MSM. (DOC 38 KB)

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Wong, W.C., Cheung, C.S. & Hart, G.J. Development of a quality assessment tool for systematic reviews of observational studies (QATSO) of HIV prevalence in men having sex with men and associated risk behaviours. Emerg Themes Epidemiol 5, 23 (2008).

  • Acquired Immunodeficiency Syndrome
  • Health Care Decision
  • Quality Assessment Tool
  • Methodological Quality Assessment
  • Pilot Version