Background
The study of the epidemiology of disturbed behavioural phenotypes in critically ill patients is both novel and challenging [1]. In part, this is because such phenotypes are multifaceted manifestations of a poorly understood underlying neurocognitive state. Thus, they cannot be measured and can only be described by words. Moreover, the frequently cited reference for their description [2] is itself an aggregation of words, provides limited guidance for the systematic identification of behavioural phenotypes, and was not intended for use in the intensive care unit (ICU). Furthermore, the widely used screening tools for cognitive and behavioural dysfunction in critically ill patients, the Confusion Assessment Method for the Intensive Care Unit (CAM-ICU) [3] and the Intensive Care Delirium Screening Checklist (ICDSC) [4], only include limited guidance for the identification of behavioural phenotypes.
Notwithstanding the above difficulties, using CAM-ICU and ICDSC, several studies have reported that phenotypes of delirium do exist and may be associated with different morbidity and mortality rates [5‐7].
Such studies have also suggested that there might be clinical value in further investigating the characteristics, prevalence, trajectory, treatment, and outcomes of disturbed behavioural phenotypes, a similar but not identical concept to delirium [8‐14]. In this regard, natural language processing (NLP) of caregivers’ notes has recently emerged as a screening tool for behaviour in critically ill patients [8‐14]. NLP has also been recently used to describe the syndrome of NLP-diagnosed behaviour disturbance (NLP-Dx-BD) [15]. Different from delirium screening tools, the identification of this syndrome requires the application of NLP to screen caregiver notes for the purpose of detecting words or phrases that describe disturbed behaviour (see “Methods” section). Such NLP-Dx-BD is a condition observed in critically ill patients and has high sensitivity for the identification of patients who later go on to be treated with antipsychotic medications in the ICU [8].
Moreover, although NLP can be used to read caregiver’s clinical notes for the purpose of identifying words and phrases such as “combative” or “confused”, these words may also be indicative of different phenotypes of NLP-Dx-BD. Thus, “combative” implies an agitated phenotype, while, in the absence of words to indicate psychomotor agitation, “confused” implies a non-agitated phenotype, and, finally, the presence of both phenotypes within a defined reporting period (e.g., 24 h) implies a combined phenotype.
We hypothesised that NLP could be used to detect words and phrases associated with distinct behavioural phenotypes and that such NLP-defined phenotypes would be associated with clinically important differences in other patient characteristics. Accordingly, we used NLP to study the characteristics, prevalence, trajectory, treatment, and outcomes of phenotypes of NLP-Dx-BD in a cohort of critically ill patients.
Methods
Study design
We performed a non-interventional, retrospective study of a cohort of critically ill adult patients (≥ 18 years old) admitted to the three ICUs of a university affiliated hospital in Melbourne, Australia between 1 May 2019 and 31 December 2020. The study was approved by the Austin Hospital Human Research Ethics Committee (LNR/19/Austin/38), which waived the requirement for informed consent. For patients with multiple admissions, only the first was included. No other exclusion criteria were applied. During the study period, all patients received care designed to reduce the risk of developing delirium including family visits, dimmed lights at night, minimal interaction to facilitate night-time sleep, and the use of visual and auditory aids as required.
Data collection and manipulation
We obtained the patient clinical progress notes entered into the electronic health record (EHR) by doctors, nurses, physiotherapists, and other allied health professionals. We analysed these notes using NLP techniques. As previously described [8, 15], each progress note was converted into sentence vectors, tokenised and searched for the presence of words indicative of agitated or non-agitated behaviour (Natural Language Toolkit; NLTK 3.5) [16], a process equivalent to the first step of Large Language Model Generative Pre-trained Transformer strategies recently popularised by OpenAI.
The words used in our NLP model were derived from the results of a previously published survey of clinical staff who were asked to identify words, terms or expressions that they would associate with disturbed behaviour and possible delirium [17].
In this study, we further categorised each of these words as being suggestive of an agitated or a non-agitated behavioural disturbance state, while the presence of negation or resolution words was also determined (Table 1). Accordingly, all notes for the same patient were aggregated and categorised into four groups of NLP-diagnosed behavioural disturbance (NLP-Dx-BD): (1) agitated, when only agitated words were present in available notes; (2) non-agitated, when only non-agitated words were present in available notes; (3) combined, when both agitated and non-agitated words were present in available notes; and (4) no disturbance, when no agitated or non-agitated words were present in any note available.
Table 1
Words suggestive of behavioural disturbance
Category | Words
Agitated | Agitated, Agitation, Aggression, Aggressive, Combative, Endangering, Paranoia, Paranoid, Restrained, Restraint, Shackled, Violence, Uncooperative, Violent
Non-agitated | Confused, Confusion, Delirium, Delirious, Disorientation, Disorganized, Disorganised, Disorientated, Distraction, Disturbed, Delusion, Fluctuating, Inattention, Incoherent
Negation | No, Not, Nill, Nil
Resolution | Resolved, Resolving, Cleared, Clearing, Ceased
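The word search and four-group categorisation described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the study used NLTK 3.5, whereas this sketch uses a plain regular-expression tokeniser so that it stays self-contained. The word lists are abridged from Table 1, and the rules for applying negation words (a look-back window of three tokens, `NEGATION_WINDOW`) and resolution words (any resolution word in a sentence cancels that sentence's hits) are assumptions made for illustration, since the text does not specify them. The function names `scan_note` and `classify_patient` are hypothetical.

```python
import re

# Word lists abridged from Table 1; a full run would include every term.
AGITATED = {"agitated", "agitation", "aggressive", "combative", "violent", "restrained"}
NON_AGITATED = {"confused", "confusion", "delirium", "delirious", "disorientated", "inattention"}
NEGATION = {"no", "not", "nil", "nill"}
RESOLUTION = {"resolved", "resolving", "cleared", "clearing", "ceased"}

# Assumption (not from the paper): a negation word within the three
# preceding tokens cancels a behaviour word.
NEGATION_WINDOW = 3


def tokenise(sentence):
    """Lower-case a sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def scan_note(note):
    """Return (has_agitated, has_non_agitated) for one progress note."""
    agitated = non_agitated = False
    for sentence in re.split(r"[.!?\n]+", note):
        tokens = tokenise(sentence)
        # Assumption: a resolution word anywhere in the sentence cancels its hits.
        if RESOLUTION.intersection(tokens):
            continue
        for i, tok in enumerate(tokens):
            if tok not in AGITATED and tok not in NON_AGITATED:
                continue
            preceding = tokens[max(0, i - NEGATION_WINDOW):i]
            if NEGATION.intersection(preceding):
                continue  # e.g. "not agitated"
            if tok in AGITATED:
                agitated = True
            else:
                non_agitated = True
    return agitated, non_agitated


def classify_patient(notes):
    """Aggregate all notes for one patient into one of the four groups."""
    flags = [scan_note(n) for n in notes]
    any_agitated = any(a for a, _ in flags)
    any_non_agitated = any(n for _, n in flags)
    if any_agitated and any_non_agitated:
        return "combined"
    if any_agitated:
        return "agitated"
    if any_non_agitated:
        return "non-agitated"
    return "no disturbance"
```

For example, under these assumptions a patient with one note reading "Patient combative overnight." and another reading "Remains confused this morning." would fall in the combined group, whereas a single note reading "Not agitated." would fall in the no disturbance group.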
We use the term NLP-Dx-BD instead of delirium for the sake of accuracy. Importantly, we wish to emphasise that the words in Table 1 are intuitively associated with cognitive deterioration, which is also recognised in delirium. However, we do not have sufficient data to exclude the possibility that, in a very few individuals, agitated behaviour was in fact due to underlying severe pain and/or dementia.
Baseline and outcome data were obtained from the Australian and New Zealand Intensive Care Society Adult ICU Patient Database run by the Centre for Outcome and Resource Evaluation [18]. Data detailing the use of antipsychotic medications were obtained from the hospital’s electronic medication management system.
Exposure
The primary exposure for a given patient was the presence, at any time in the notes recorded during their ICU stay, of one or more words indicating agitated behaviour, non-agitated behaviour, or both (combined behavioural disturbance).
Outcomes
The primary outcome of this study was the use of antipsychotic medications.
Although such treatment is controversial [19‐22], antipsychotic medication use was chosen as the primary outcome measure because we were studying behaviour and considered it likely that such medications would be used differently according to the presence or absence of an agitated behavioural phenotype [23, 24].
Secondary outcomes included ICU and hospital mortality and 28-day mortality censored at hospital discharge as well as ICU and hospital length of stay and duration of mechanical ventilation.
Statistical analysis
All continuous data are reported as median (25th–75th percentile) and categorical data as number and percentage. Baseline and clinical characteristics and outcomes of the patients were compared among the groups using the Fisher exact test for categorical variables and the Kruskal–Wallis test for continuous variables.
Multivariable logistic regression models were used to assess the impact of the exposure on the use of antipsychotic medications and hospital clinical outcomes, according to each phenotype. The models were adjusted for age, type of admission, and the Australian and New Zealand (ANZ) Risk of Death (ANZROD) after log transformation [25]. ANZROD is a local recalibration of the APACHE III score which adjusts for the persistently lower than expected mortality in ANZ and contains variables such as admission diagnosis and pre-existing comorbidities [25]. As previously demonstrated, ANZROD is an accurate outcome predictor and explains most of the mortality found in ICUs in Australia and New Zealand [26]. Effect estimates were reported as odds ratios (OR) with 95% confidence intervals (CI). To account for immortal time bias, we fitted a time-dependent Cox proportional hazards model for the primary outcome and hospital mortality that considered all measurements available in each note. For the primary outcome, only exposures occurring before the first outcome (the first time the patient received an antipsychotic) were included in the model, to avoid exposures measured after the medication had been given. Because of pairwise comparisons, the significance level was adjusted using the Bonferroni method and p < 0.01 was considered statistically significant. All analyses were complete-case analyses and were conducted in R v.4.0.2 (R Foundation, Vienna, Austria) [27].
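To make the time-dependent exposure concrete, the following Python sketch shows one way note-level phenotypes could be arranged into the (start, stop, state, event) rows that a time-dependent Cox model typically consumes. It is an illustration under stated assumptions, not the authors' code (their analyses were run in R): times are expressed as hours since ICU admission, a phenotype is assumed to be carried forward from one note until the next note that changes it, and the function name `build_intervals` and its parameters are hypothetical. Notes written at or after the first antipsychotic dose are dropped, mirroring the rule that only exposures preceding the outcome enter the model.

```python
def build_intervals(note_states, discharge, first_antipsychotic=None):
    """Build counting-process rows for one patient.

    note_states: list of (hours_since_admission, phenotype) taken from notes.
    discharge: hours from admission to the end of follow-up.
    first_antipsychotic: hours to the first antipsychotic dose, or None if never given.
    Returns a list of (start, stop, phenotype, event) tuples.
    """
    # Follow-up ends at the first outcome if there is one, otherwise at discharge.
    end = first_antipsychotic if first_antipsychotic is not None else discharge
    event_at_end = int(first_antipsychotic is not None)

    rows = []
    # Assumption: every patient starts with no recorded disturbance at time 0.
    current_start, current_state = 0.0, "no disturbance"
    for ts, state in sorted(note_states):
        if ts >= end:
            break  # exposure recorded at or after the outcome is ignored
        if state != current_state:
            if ts > current_start:
                rows.append((current_start, ts, current_state, 0))
            current_start, current_state = ts, state
    # The final interval carries the event indicator.
    rows.append((current_start, end, current_state, event_at_end))
    return rows
```

Because each interval is attributed to the phenotype recorded before it began, time spent before a phenotype was first documented is not credited to that phenotype, which is the mechanism by which this layout addresses immortal time bias.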
Discussion
Key findings
We applied Natural Language Processing (NLP) to evaluate the characteristics, prevalence, trajectory, treatment, and outcomes of behavioural disturbance phenotypes in critically ill patients. We identified three major phenotypes: agitated, non-agitated, and combined. We found that each of these three phenotypes was associated with different patient characteristics. Of these, the combined phenotype was the most common, followed by the agitated phenotype. However, the trajectory of patients was such that movement from one phenotype to another in the first 48 h was common (65.66%). Moreover, during this time, it was uncommon for non-agitated patients to develop an agitated phenotype, many agitated patients achieved resolution of their NLP-Dx-BD, and both non-agitated and agitated phenotype patients typically transitioned to a combined phenotype. We also found significant differences in the use of antipsychotic medication. Thus, in time-variant multivariable models and after adjustment, patients with an agitated component to their NLP-Dx-BD were significantly more likely to receive antipsychotic medications overall compared with those patients with non-agitated NLP-Dx-BD. Finally, we found that patients with the combined phenotype had longer unadjusted duration of invasive ventilation, ICU stay, and hospital stay and greater mortality. However, once multivariable models were applied with hospital mortality as the outcome and groups as time-dependent variables, it was patients with the agitated phenotype who were more likely to die than patients with other phenotypes, especially among those patients receiving mechanical ventilation.
Relationship to previous studies
Characterisation
Behavioural disturbance phenotypes based on NLP analysis of caregiver notes have not been previously defined. However, in a parallel fashion, three phenotypes of delirium have been previously described: hyperactive, hypoactive and mixed [28‐31]. Although there is no gold standard for the identification of these phenotypes, the linguistic constructs and psychomotor descriptors used in previous studies appear broadly consistent with the categorisation of the NLP-Dx-BD search terms used in our study [32]. Nonetheless, NLP-Dx-BD phenotypes are focussed on unstructured continuous observation of behaviour by caregivers, and their relationship with structured intermittent screening tool-based assessment is unknown [3, 33].
Prevalence
The prevalence of NLP-Dx-BD phenotypes in critically ill patients is unknown and ours is the first study to explore this concept. However, studies using structured intermittent screening tools reported that between 0.3 and 45.9% of patients displayed agitation, 0.5–91% displayed a non-agitated state and 1–69.5% displayed both [7, 20, 22]. The prevalence of the NLP-Dx-BD phenotypes identified in our study falls within these ranges.
Trajectory
To our knowledge, the trajectory of behavioural phenotypes of critically ill patients has not been reported. Moreover, studies of the epidemiology and treatment of delirium using intermittent structured screening tool-based assessments have not reported on their dynamic nature, thus implying stability [34, 35]. In contrast, and for the first time, we have demonstrated in detail that behavioural phenotypes, as identified by NLP, are unstable within the first 48 h of critical care admission.
Treatment and outcomes
Antipsychotic medications remain widely used for the treatment of disturbed behaviour in critically ill patients [36, 37], with 70% of antipsychotic medication used in acute care prescribed for its treatment [38]. Further, several studies have reported higher rates of administration for agitated patients [23, 39]. In our study, we also found antipsychotic medication use was significantly more likely in the combined NLP-Dx-BD group, followed by the agitated group (both groups having an agitated component). This implies that it is the presence of agitation that drives antipsychotic drug prescription.
Using intermittently applied structured-assessment tools, a previous study of non-ventilated patients suggested that mortality did not differ between the agitated, non-agitated and combined states [40]. However, in our study, after appropriate adjustments and assessing the group as a time-dependent variable (an adjustment not applied to previous studies), ICU mortality was highest for the NLP-Dx-BD agitated group. This suggests that NLP assessment of continuous caregiver observation may identify a cohort of patients that differs from that identified by intermittently applied structured-assessment tools.
Implications of study findings
Our findings imply that NLP-Dx-BD may be used to identify three clinically relevant phenotypes. Further, our findings suggest that the combined phenotype may be dominant. However, they also suggest that the early trajectory of these phenotypes is complex with dynamic changes from one phenotype to the other, indicating that such phenotypes are unstable. Moreover, on multivariable analysis and overall, patients with an agitated component to their NLP-Dx-BD (agitated phenotype or combined phenotype) appear significantly more likely to receive antipsychotic medications. These observations imply that future studies of pharmacologic intervention should primarily focus on patients who present an agitated component to their phenotype, either in isolation or in a combination (combined phenotype).
Strengths and limitations
Our study has several strengths. It is the first to use NLP to study the prevalence of behavioural phenotypes in critically ill patients. It is also the first to describe the early trajectory of these phenotypes in critically ill patients and to demonstrate their dynamic and unstable nature. Moreover, to the best of our knowledge, our study presents the first detailed information on the use of antipsychotic medications according to phenotype, thus providing further evidence of the validity of the NLP-Dx-BD construct. Finally, with current software, as soon as a key word describing an agitated state is entered into the electronic notes, that entry can be used to trigger alerts to clinicians, thus facilitating randomisation into interventional trials.
We acknowledge several limitations. First, our study was undertaken in a large intensive care unit system involving three ICUs within a university affiliated tertiary hospital in a resource-rich country. Therefore, our findings may not apply to other intensive care units in low- or middle-income countries or to other ICUs with a different approach to the management of disturbed behaviour. Moreover, patients were not assessed for the presence or absence of delirium by independent adjudication personnel. However, we investigated behavioural disturbances. This is different in focus, concept, and technique from the assessment of delirium [8]. Thus, we cannot comment on the relationship between our findings and their relevance to intermittently applied delirium screening tools. Future investigations of this relationship may be of interest. The use of medications in the treatment of behavioural disturbance in critically ill patients is controversial. However, antipsychotic medications remain an important tool for moderating behaviour, are a reasonable proxy for disturbed behaviour (the target of our investigation), and are widely used in critically ill patients [22, 41]. Finally, clinicians may be unfamiliar with NLP techniques, which might generate scepticism about our findings. However, clinicians are the very generators of the words we used in our study; NLP-based technology is fast becoming accepted in response to the arrival of Large Language Model Generative Pre-trained Transformer strategies, and our study is the first step toward machine learning approaches to the behavioural management of critically ill patients.
Conclusions
For the first time, we demonstrated that Natural Language Processing of electronic caregiver notes enables the identification and characterisation of NLP-Dx-BD phenotypes in critically ill patients. Moreover, we found that the combined phenotype was dominant and that, in the first 48 h, it was common for critically ill patients within the study cohort to transition between behavioural phenotypes of NLP-Dx-BD. Importantly, after adjustment, patients with the agitated phenotype either in isolation or within a combined phenotype were significantly more likely to receive antipsychotic medications. These findings have important implications for our understanding of the epidemiology of phenotypes of disturbed behaviour in critically ill patients and for trial design in this field.