Introduction
Delirium is a neuropsychiatric syndrome reflecting acute brain dysfunction that occurs frequently in intensive care units (ICUs) [1]. It may be induced by physiological stress related to a systemic pathology, to critical care interventions, or to other specific factors (e.g., sleep disturbances, light pollution at night). Characteristics such as age, sex, disease severity, and mechanical ventilation are risk factors [1, 2]. The incidence is about 30%, with significant variations (from 10 to 80%) depending on the admission cause and the definition used to characterize it [3]. This disorder is challenging to assess reliably because of its varied symptomatology, including fluctuating mental status, disturbance in consciousness, attention and judgment disorders, disorientation, circadian disturbances, and psychomotor slowing and/or agitation, which could lead to under-recognition [4, 5]. The reference diagnosis is based on the Diagnostic and Statistical Manual of Mental Disorders (DSM) criteria after a psychiatric evaluation. To facilitate the diagnosis and recognition of this disorder by ICU physicians in their everyday practice, scales were developed in the 2000s, including the Confusion Assessment Method for the Intensive Care Unit (CAM-ICU) and the Intensive Care Delirium Screening Checklist (ICDSC). Both scales were validated by comparison with the DSM criteria [6].
Delirium is associated with increased mortality, prolonged hospital stay, prolonged mechanical ventilation, and increased risk of long-term cognitive impairment [7–10]; therefore, it is a major therapeutic issue. Current research on delirium in ICUs focuses on the evaluation of prevention and treatment strategies including various pharmacological or non-pharmacological interventions that have been evaluated in many randomized controlled trials (RCTs) [11]. However, few studies have raised the issue of the definition of delirium, which seems to be heterogeneous in the published literature despite the existing tools. The definition of an outcome such as delirium may be an important source of heterogeneity and variation in the intervention effect, making comparison between trials on this topic difficult [12].
In this study, we aimed to evaluate the heterogeneity in the definition of delirium used in RCTs included in meta-analyses evaluating the prevention or treatment of delirium in ICUs and to explore whether the intervention effect varies depending on the definition used.
Methods
Study design
Our study used a meta-epidemiological approach, the reference method for identifying biases in RCTs [13]. This method consists of assessing whether a given characteristic is associated with the intervention effect in a sample of meta-analyses [14]. First, a systematic review was conducted to identify meta-analyses assessing prevention or treatment strategies of delirium in ICUs. This systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines (Additional file 1: Table S1). Then, we evaluated the definition of delirium in each included trial report and classified the definition as validated or not validated. Finally, we compared intervention effects between trials reporting a validated definition and those that did not by using meta-epidemiological analyses.
Search strategy
We identified systematic reviews with meta-analyses evaluating prevention or treatment strategies of delirium in ICUs by searching PubMed and the Cochrane Database of Systematic Reviews on March 4, 2022. The detailed search strategy is reported in Additional file 1: Information S1. We also manually searched the “Emergency and Critical care” and “Dementia and Cognitive improvement” review groups of Cochrane.
Two reviewers (LC and CA) independently assessed the eligibility of retrieved references after removing duplicates. Discrepancies were resolved by discussion with a third reviewer (AD) to reach consensus.
Eligible meta-analyses included RCTs of adults hospitalized in medical or surgical ICUs, assessed an intervention for preventing or treating delirium, and evaluated delirium as a primary or secondary outcome. We included network meta-analyses if direct comparisons were available and focused on these. If a systematic review included several comparisons evaluating different types of interventions, we included each comparison corresponding to our eligibility criteria.
We excluded meta-analyses dedicated to neurobehavioral manifestations in neurological patients or to delirium in alcohol withdrawal, systematic reviews without meta-analysis, protocols and meta-epidemiological studies or overviews. We also excluded meta-analyses including fewer than three RCTs because three is the minimum to conduct meta-epidemiological analyses.
For each selected meta-analysis of delirium, we included only RCTs and excluded RCTs of children and those not conducted in an ICU. We then removed duplicates. We did not consider the same RCT to be a duplicate if, across meta-analyses, a different definition of delirium was used, the RCT was conducted in different populations, or the experimental or control interventions were different. For example, a three-arm RCT could be included twice if one meta-analysis considered the comparison of arms A and B and another considered the comparison of arms A and C. An RCT evaluating delirium with different definitions could also be included twice if one meta-analysis considered the incidence of delirium based on one definition and another considered the incidence of delirium based on the second definition.
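The duplicate rule above can be illustrated with a minimal Python sketch. This is not the authors' procedure as implemented; the record fields (`trial_id`, `definition`, `population`, `experimental_arm`, `control_arm`) are hypothetical names chosen for illustration, and the sketch simply treats two entries as duplicates only when all of these fields match.

```python
from typing import NamedTuple

class TrialEntry(NamedTuple):
    """One RCT as it appears in one meta-analysis (all field names are hypothetical)."""
    trial_id: str          # e.g., a registry number or first author and year
    definition: str        # delirium definition behind the extracted outcome
    population: str
    experimental_arm: str
    control_arm: str

def deduplicate(entries):
    """Keep the first occurrence of each distinct entry.

    Two entries are duplicates only if trial, definition, population
    and both arms are identical, mirroring the rule described in the text.
    """
    seen = set()
    unique = []
    for entry in entries:
        if entry not in seen:
            seen.add(entry)
            unique.append(entry)
    return unique

entries = [
    TrialEntry("T1", "CAM-ICU", "surgical ICU", "A", "B"),
    TrialEntry("T1", "CAM-ICU", "surgical ICU", "A", "C"),  # other arm: kept
    TrialEntry("T1", "CAM-ICU", "surgical ICU", "A", "B"),  # true duplicate: dropped
    TrialEntry("T2", "CAM-ICU", "medical ICU", "X", "placebo"),
    TrialEntry("T2", "ICDSC", "medical ICU", "X", "placebo"),  # other definition: kept
]
unique_entries = deduplicate(entries)
```

In this toy input, the three-arm trial T1 contributes two entries and the trial T2 contributes one entry per definition, whereas the repeated A versus B comparison is counted once.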
Two reviewers (LC and CA/AL) independently extracted data; any disagreements were resolved by discussion with a third reviewer (AD). Two data collection forms were developed and used: one for meta-analyses and one for individual trials.
For each meta-analysis, the following data were extracted:
- General characteristics: date of publication, journal, funding sources
- Number of studies included in the meta-analysis of delirium
- Interventions assessed in the experimental and control groups
- Delirium outcome evaluated: incidence, duration, delirium- or coma-free days, severity
- Tool used to assess risk of bias
- Method for pooling data
- Results of the meta-analysis of delirium: combined estimate with confidence intervals (CIs) and heterogeneity assessed with the I² statistic and the Cochran Q chi-square test
- Whether and how review authors discussed the definition of delirium in included studies
For each trial included in the meta-analysis of delirium, we collected:
- General characteristics: date of publication, journal, funding sources, reporting of registration in a clinical trial registry (e.g., ClinicalTrials.gov)
- Population characteristics: sample size, main inclusion and exclusion criteria
- Details on the experimental and control interventions
- Primary outcome defined in the trial
- Assessment of risk of bias using the Cochrane Risk of Bias 1 (RoB1) tool [15]: random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessors, incomplete outcome data, selective reporting
- Definition of delirium used: DSM criteria, scale (CAM-ICU, ICDSC, NEECHAM, other), set of symptoms, physician appreciation, other, not reported. If an RCT did not report delirium as an outcome but the authors of the meta-analysis interpreted an outcome as delirium, we considered the definition as not reported.
Data for RCTs were extracted directly from the RCT report. If the full text was not available, we contacted the authors and, if they did not answer, collected the data from the meta-analysis. The risk of bias was extracted from meta-analyses. Because meta-analyses used different tools, we relied on the Cochrane RoB1 tool, which is a reference tool and was the most frequently used. We re-evaluated the risk of bias from the RCT report by using this tool when another tool was used in the meta-analysis.
Definition of delirium in included RCTs
We evaluated whether the authors used a validated definition based on the literature, including a list of assessment tools to measure delirium with COSMIN ratings published by the Network for Investigations of Delirium: Unifying Scientists (NIDUS) [17]. We considered as validated definitions the DSM criteria, as this is the gold standard, and tools that had been compared and validated against the DSM criteria (CAM-ICU, ICDSC, NEECHAM and Delirium Rating Scale-Revised-98 (DRS-R98)) [6, 18–22]. Non-validated definitions were non-validated scales (RASS or not reported), a set of symptoms, physician appreciation or the definition not reported.
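As an illustration of this classification, the following Python sketch maps an extracted definition label to one of the two categories. The label spellings and the function name are assumptions made for the example; only the list of validated tools is taken from the text.

```python
# Definitions treated as validated in the text: the DSM criteria and tools
# validated against them. The exact label spellings are an assumption.
VALIDATED_DEFINITIONS = {"DSM", "CAM-ICU", "ICDSC", "NEECHAM", "DRS-R98"}

def classify_definition(label):
    """Return "validated" or "non-validated" for an extracted definition label.

    Anything outside the validated set (non-validated scales such as RASS,
    a set of symptoms, physician appreciation) and a missing label
    (definition not reported) are classified as non-validated.
    """
    if label is None:
        return "non-validated"
    normalized = label.strip().upper()
    return "validated" if normalized in VALIDATED_DEFINITIONS else "non-validated"

examples = ["CAM-ICU", "icdsc", "RASS", "set of symptoms", None]
categories = [classify_definition(label) for label in examples]
```

Treating a missing label as non-validated follows the choice stated in the text that a definition not reported is grouped with the non-validated definitions.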
Data synthesis
We estimated the intervention effect for the incidence of delirium with odds ratios (ORs) calculated from the number of patients presenting the outcome and the number of patients analyzed in the experimental and control groups. Outcome events were re-coded so that an OR < 1 indicated a beneficial effect of the experimental intervention. For delirium duration and the number of delirium- or coma-free days, we estimated a standardized mean difference by dividing the difference in means between groups by the standard deviation among participants. DerSimonian and Laird random-effects models were used to combine intervention effects across RCTs within each meta-analysis. Heterogeneity across trials was assessed by the I² statistic and the Cochran Q chi-square test.
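The calculations described in this paragraph can be sketched in a few lines of Python. The analyses in this study were run in R, so the code below is only an illustrative re-implementation of the standard formulas, not the authors' code. The function names are hypothetical, and the 0.5 continuity correction for tables with an empty cell is an assumption that the text does not specify.

```python
import math

def log_odds_ratio(events_exp, n_exp, events_ctl, n_ctl):
    """Log OR and its variance from a 2x2 table (OR < 1 favors the experimental arm).

    Assumption: 0.5 is added to every cell when any cell is zero.
    """
    a, b = events_exp, n_exp - events_exp
    c, d = events_ctl, n_ctl - events_ctl
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def dersimonian_laird(effects, variances):
    """DerSimonian and Laird random-effects pooling with Cochran Q and I²."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Method-of-moments estimate of the between-trial variance, truncated at 0
    tau2 = max(0.0, (q - df) / c) if k > 1 and c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "se": math.sqrt(1.0 / sum(w_re)),
            "tau2": tau2, "q": q, "i2": i2}

# Toy data: (events, analyzed) in the experimental arm, then in the control arm
trials = [(10, 100, 20, 100), (15, 120, 25, 118), (8, 60, 9, 62)]
log_ors, variances = zip(*(log_odds_ratio(*t) for t in trials))
result = dersimonian_laird(log_ors, variances)
pooled_or = math.exp(result["pooled"])
```

The pooled estimate is computed on the log scale and exponentiated at the end; a 95% CI would be obtained the same way from `pooled ± 1.96 × se`.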
For these analyses, we focused on meta-analyses of the incidence of delirium including trials comparing an intervention to a placebo or usual care. We excluded meta-analyses comparing two active interventions because the direction of bias may be uncertain in that case. We also excluded meta-analyses evaluating the same research question (same intervention and control group) if they had three or more RCTs in common.
Primary analysis
The primary meta-epidemiological analysis followed the two-step method described by Sterne et al. [23]. We compared intervention effects between RCTs using a validated definition and those using a non-validated one as follows. For each meta-analysis, we first estimated the ratio of ORs (ROR), defined as the ratio of the OR from trials reporting a non-validated definition to the OR from those reporting a validated definition, by using a random-effects meta-regression model to incorporate between-trial heterogeneity. An ROR < 1 indicated larger intervention effect estimates in trials using a non-validated versus a validated definition. Then, we estimated the combined ROR and its 95% CI by using a random-effects meta-analysis model. Heterogeneity across RORs was assessed with the I² statistic, the Cochran Q chi-square test, and the between-meta-analysis variance τ².
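The two steps can be sketched as follows in Python. This is a simplified illustration rather than the model used in the study: within each meta-analysis, the log ROR is approximated here by the difference between the DerSimonian and Laird pooled log ORs of the two subgroups, whereas the text describes a random-effects meta-regression. All function names are hypothetical.

```python
import math

def dl_pool(effects, variances):
    """DerSimonian and Laird random-effects pooled estimate and its variance."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return pooled, 1.0 / sum(w_re)

def log_ror(trials):
    """Step 1 for one meta-analysis.

    trials: list of (log_or, variance, uses_validated_definition).
    Returns (log ROR, variance), or None when the meta-analysis does not
    contain at least one trial of each kind and so cannot contribute.
    """
    non_validated = [(y, v) for y, v, validated in trials if not validated]
    validated = [(y, v) for y, v, validated in trials if validated]
    if not non_validated or not validated:
        return None
    est_nv, var_nv = dl_pool(*zip(*non_validated))
    est_v, var_v = dl_pool(*zip(*validated))
    # log ROR = log OR (non-validated) - log OR (validated)
    return est_nv - est_v, var_nv + var_v

def combined_ror(meta_analyses):
    """Step 2: pool the log RORs across meta-analyses; returns (ROR, lower, upper)."""
    contributions = [r for r in map(log_ror, meta_analyses) if r is not None]
    estimate, variance = dl_pool([r[0] for r in contributions],
                                 [r[1] for r in contributions])
    half_width = 1.96 * math.sqrt(variance)
    return (math.exp(estimate),
            math.exp(estimate - half_width),
            math.exp(estimate + half_width))

# Toy input: two meta-analyses contribute, the third has no non-validated trial
ma1 = [(math.log(0.5), 0.1, False), (0.0, 0.1, True)]
ma2 = [(math.log(0.5), 0.1, False), (0.0, 0.1, True)]
ma3 = [(math.log(0.8), 0.1, True), (math.log(0.9), 0.2, True)]
ror, ci_low, ci_high = combined_ror([ma1, ma2, ma3])
```

A meta-analysis without at least one trial in each group is skipped in the first step, which is why the two-step method uses fewer meta-analyses than were collected.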
Subgroup and sensitivity analyses for the primary analysis
We conducted subgroup analysis by type of intervention assessed (pharmacological or non-pharmacological) and tested interaction with a random-effects meta-regression model.
To control for potential confounders, we conducted sensitivity analyses by adjusting the meta-regression model for each item of the RoB1 tool, sample size and publication date. The cutoff chosen for publication date was 2010 because the definition of delirium evolved with the development of scales such as CAM-ICU and ICDSC, which were used more often after 2010 in ICUs and in research, leading to more screening, prevention and treatment of delirium.
Secondary analyses
The secondary meta-epidemiological analyses were conducted using another method, a one-step multilevel logistic regression model with random effects described by Siersma et al. [24]. Two comparisons were conducted: first, we compared intervention effects between trials using a validated definition and those using a non-validated one. Then, we compared four definition categories that we considered the most representative and relevant categories. These four categories were the DSM criteria as the reference category, the CAM-ICU, non-validated scales and definition not reported. Further details on these secondary analyses can be found in Additional file 1: Information S2.
Analyses were performed with R version 4.1.3 (R Core Team, 2022; R Foundation for Statistical Computing, Vienna, Austria; https://www.R-project.org), except the multilevel analyses, which were performed with SAS version 9.4 (SAS Institute Inc., Cary, NC, USA).
Discussion
In this systematic review of meta-analyses including RCTs assessing prevention or treatment strategies of delirium in ICUs, the definition of delirium was heterogeneous across trials, and one-fifth did not report how they defined delirium. We attempted to assess the impact of this heterogeneity on intervention effect estimates by using a meta-epidemiological approach and found no significant difference between trials using a validated definition and those using a non-validated one in our primary analysis. However, this analysis included few studies and may lack power. The secondary analysis, based on a multilevel model including more studies, found significantly larger intervention effects in trials using a non-validated definition than those using a validated one, which suggests an association between the definition used and intervention effect.
This is the first study evaluating the heterogeneity in the definition of delirium and its association with the intervention effect by a meta-epidemiological approach, the reference method to identify bias [13]. Our sample included mostly recent meta-analyses. The definition of delirium was extracted directly from the RCTs because this information was seldom reported in meta-analyses despite its importance. We classified delirium definitions into different categories specified a priori based on a literature review and considered as a validated definition the DSM criteria because it is the gold standard and CAM-ICU, ICDSC, NEECHAM and DRS-R98 because they have been validated in numerous countries and publications [6, 18–22]. Our classification of validated and non-validated definitions is consistent with the NIDUS list of assessment tools for delirium screening [17]. We used two different approaches for the meta-epidemiological analysis and performed sensitivity analyses accounting for important confounding factors.
However, our study has limitations. First, the search strategy might have missed some meta-analyses, but this should not have introduced bias. Second, because meta-analyses covered the same research area, many RCTs were included in several meta-analyses and duplicates had to be removed, which left fewer RCTs available for analysis. Third, it was not reported in included RCTs whether delirium was systematically assessed by research personnel or as part of routine care. Assessment in the context of routine care may lead to under-recognition even when using a validated tool. However, because RCTs are experimental studies with delirium reported as a primary or secondary outcome, we can reasonably assume that it was systematically assessed. Fourth, concerning the meta-epidemiological analysis, only a small number of meta-analyses were included, particularly in the primary analysis, so this analysis may lack power. However, we cannot exclude that a difference might exist, especially because the secondary analysis revealed a significant difference. Finally, the development and validation of new tools such as CAM-ICU and ICDSC were concomitant and contributed to an evolution in practices in ICUs with more screening and treatment of delirium. In 2017, the ABCDEF bundle guidelines were published, providing evidence-based guidance for physicians to optimize ICU patient care. Management of delirium is a large part of this bundle, which has been increasingly used in day-to-day care. We tried to collect information on this bundle in trials, but it was seldom reported.
A previous systematic review analyzed the outcomes in RCTs evaluating prevention or treatment strategies of delirium in ICUs [25]. The authors found heterogeneity and multiplicity in outcomes, the most frequent tools used being the CAM-ICU and the ICDSC, which is consistent with our results. They suggested that delirium should be screened regularly with a reliable tool and that a core outcome set be developed to inform delirium research. Another recent systematic review evaluated the heterogeneity in design and analysis of ICU delirium outcome [26]. The authors included RCTs with delirium as primary outcome, evaluated by a validated tool, based on the NIDUS assessment tool list and the DSM criteria, in agreement with our classification. The most frequent tool was the CAM-ICU, which is also consistent with our findings. The authors suggested developing specific methods for statistical analyses and reporting in RCTs of delirium that, if used by most researchers, could improve the quality of clinical trials and the comparison between them. However, they did not raise the issue of the heterogeneity among the definitions of delirium used and included only RCTs using a validated definition. We believe that harmonization in the definition of delirium is the first step to improve the quality of the research on this topic, allowing the research community and physicians to talk about the same thing when considering delirium.
Our study showed heterogeneity in the definition of delirium, with nine different definitions reported, even if CAM-ICU was the most frequently used. The gold standard to define delirium is the DSM criteria, but this evaluation should be made by a psychiatrist. Before the development of tools such as CAM-ICU, ICDSC or NEECHAM, physicians and researchers used various symptoms such as agitation or confusion to define this disorder, thus increasing the heterogeneity in definitions used. One-fifth of trials of delirium did not define delirium. Although bad reporting does not mean bad methods [27], the lack of definition limits the interpretation of results including the comparison with other trials. This is why we chose to consider a not-reported definition as non-validated.
Concerning the impact of the heterogeneity in delirium definitions on the intervention effect, the primary meta-epidemiological analysis did not show any significant difference, although the difference was in the direction we expected, which suggests that trials using a non-validated definition may overestimate the intervention effect. However, this analysis lacks power given the small number of studies included. The two-step approach is the most used and the reference method for meta-epidemiological analyses [28], but it is restrictive because it requires including at least one RCT using a validated definition and one using a non-validated definition within each meta-analysis, thus reducing the number of contributing meta-analyses (seven in our study). This constraint does not exist with the multilevel approach, which allows for the inclusion of more meta-analyses. Use of this multilevel model revealed a significant difference in the same direction, which supports the possible existence of larger estimates of intervention effects in trials using a non-validated definition. The secondary analysis with different categories suggests that the difference in intervention effects between trials using a validated and a non-validated definition may be driven by the trials not reporting the definition used. Not reporting the definition of an outcome may reflect a lack of rigor in these trials, which could partly explain the results.
Sensitivity analysis for the primary analysis did not reveal significant differences, but after adjustment for the items of the RoB1, RORs were closer to one. Hence, the non-significantly larger estimates in trials using a non-validated definition may reflect a weaker methodology. In previous meta-epidemiological studies, an inadequate sequence generation, the absence of allocation concealment or lack of blinding was associated with an overestimation of intervention effect [29–32].
Improving how delirium is defined is an important way to limit research waste [33–35] because it would facilitate the comparison between RCTs, leading to better-quality systematic reviews and meta-analyses on this topic [36, 37]. It may result in a better understanding of this disorder and a better evaluation of the efficacy of therapeutic or preventive interventions. For physicians, it would also improve the diagnosis of delirium, thus resulting in more efficient patient care in ICUs.
The NIDUS proposal listing tools for defining delirium was an important step. However, it includes 34 assessment tools for delirium screening, diagnosis or severity and 5 brief screening tools. Although this catalogue helps clarify the definition of delirium and facilitates the comparisons between trials, there is still a large panel of tools used, which results in heterogeneity with a possible association with the intervention effect. Physicians and researchers should agree on which validated tool should be used to define delirium. The development of a core outcome set may be useful, as suggested by Rose et al. [25].