DTTC treatment will be administered on an individualized basis by SLPs who were trained in DTTC by the research team, as described below. Treatment will take place in clinic rooms or in a quiet space at the child’s home. Concurrent treatment addressing other areas, such as functional communication and language goals, will be permitted during this period. Participants will receive 24 h of treatment, provided at no cost to the family. All sessions will be audio and video recorded for reliability and scoring purposes. In publications and presentations, we will report the number of participants initially recruited and tested, as well as the number who did and did not meet inclusionary criteria.
DTTC protocol
Participants in both conditions will receive DTTC treatment. In DTTC treatment, the target is accurate speech movement gestures in production of real words or phrases, rather than targeting accurate individual speech sounds as is common in traditional approaches for speech sound disorders [9]. Words and/or phrases are selected on an individual basis to target specific speech movement patterns and to be functional and motivating for each child (see Stimuli section below).
In DTTC, treatment targets are practiced along a temporally based production hierarchy that provides varying degrees of support to facilitate speech accuracy [9]. The levels of the hierarchy are (1) Simultaneous Production, in which the child and clinician produce the target at the same time; (2) Direct Imitation, in which the child produces the target immediately following the clinician’s model; (3) Delayed Imitation, in which the child produces the target following a brief delay after the clinician’s model; and (4) Spontaneous Production, in which the child produces the target in response to questions or phrases. At lower levels of the hierarchy (i.e., Simultaneous Production, Direct Imitation), words are initially practiced at a reduced rate of speech to allow more time for the planning/programming of speech movements; practice gradually moves towards a regular rate as the child gains accuracy. Furthermore, prior to advancing along the temporal hierarchy, treatment targets are practiced with varied prosody (e.g., producing a target in a happy, sad, mad, loud, or soft voice, or as a question) to introduce practice variability and promote greater speech motor learning. Based on the individual needs of the child, the clinician provides multisensory cues (e.g., verbal, visual, tactile, gestural cues) to support production accuracy, combined with frequent, specific feedback related to movement accuracy (i.e., knowledge of performance feedback). As the child becomes more accurate, clinician support and feedback are faded to indicate only the accuracy of productions (i.e., knowledge of results feedback), and feedback is provided at a lower frequency.
During DTTC treatment sessions, a small set of treatment targets is practiced using a modified block schedule. Five words or phrases (i.e., treatment targets) will be practiced within any given treatment session, with one target practiced at a time in either a small block (15–24 productions) or a large block (25–40 productions). The objective of each session is to complete 2–3 blocks for each treatment target, for a total of 10–15 blocks across the session. A minimum of 50 productions per session is required for each participant, although we anticipate 100–200 productions per session. Clinicians will record each time a target is practiced and whether it was practiced in a small or large block.
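The block recording described above could be supported by a simple session log. The following is a minimal sketch, not part of the study protocol: it assumes the 15–24 and 25–40 ranges stated for small and large blocks, and the names `SessionLog`, `classify_block`, and `MIN_PRODUCTIONS_PER_SESSION` are hypothetical.

```python
from dataclasses import dataclass, field

# Block-size ranges and session minimum taken from the protocol text.
# All names below are illustrative assumptions, not study software.
SMALL_BLOCK = range(15, 25)   # 15-24 productions
LARGE_BLOCK = range(25, 41)   # 25-40 productions
MIN_PRODUCTIONS_PER_SESSION = 50


def classify_block(n_productions: int) -> str:
    """Label a block as 'small', 'large', or 'out_of_range'."""
    if n_productions in SMALL_BLOCK:
        return "small"
    if n_productions in LARGE_BLOCK:
        return "large"
    return "out_of_range"


@dataclass
class SessionLog:
    """Hypothetical record of the blocks practiced in one session."""
    blocks: list = field(default_factory=list)  # (target, n_productions) pairs

    def record_block(self, target: str, n_productions: int) -> str:
        """Store one block and return its size label."""
        self.blocks.append((target, n_productions))
        return classify_block(n_productions)

    def total_productions(self) -> int:
        return sum(n for _, n in self.blocks)

    def meets_minimum(self) -> bool:
        """True if the session reached the required 50 productions."""
        return self.total_productions() >= MIN_PRODUCTIONS_PER_SESSION

    def blocks_per_target(self) -> dict:
        """Count blocks per target, to compare against the 2-3 block goal."""
        counts = {}
        for target, _ in self.blocks:
            counts[target] = counts.get(target, 0) + 1
        return counts
```

A log of this kind would let a clinician confirm after each session whether the 50-production minimum was met and how many blocks each target received.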
At the beginning of each block, the clinician elicits a target in direct imitation and then uses clinical decision-making to determine the appropriate level of the temporal hierarchy depending on the degree of support needed for the targeted word/phrase (most support = Simultaneous; least support = Spontaneous). Practice may advance from Simultaneous Production to Direct Imitation once the child has accurately produced a treatment target 5–15 times at a regular rate of speech and the clinician has introduced varied prosody. After achieving 5–15 accurate productions at Direct Imitation with varied prosody, a target will be practiced at the level of Delayed Imitation. Practice will advance from Delayed Imitation to Spontaneous Production once a child has accurately produced a target 10–15 times with varied prosody. Once a word/phrase has been accurately produced at the level of Spontaneous Production 10/10 times across three consecutive sessions, that treatment target will be graduated out of active practice. The clinician will then refer to the Treatment Target bank to determine the next word or phrase item to include in treatment.
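The advancement and graduation criteria above can be summarized as a small set of rules. The sketch below is illustrative only and rests on two assumptions that the text does not state outright: that ranges such as "5–15" are bounds within which the clinician sets a criterion for a given child, and that "10/10 times across three consecutive sessions" means 10 of 10 accurate productions in each of the three most recent sessions. The names `Level`, `may_advance`, and `is_graduated` are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    SIMULTANEOUS = 1
    DIRECT_IMITATION = 2
    DELAYED_IMITATION = 3
    SPONTANEOUS = 4


# Allowed bounds for the clinician-chosen accuracy criterion at each level
# (assumed interpretation of the "5-15" and "10-15" ranges in the text).
CRITERION_RANGE = {
    Level.SIMULTANEOUS: (5, 15),
    Level.DIRECT_IMITATION: (5, 15),
    Level.DELAYED_IMITATION: (10, 15),
}


def may_advance(level, accurate_count, varied_prosody, criterion,
                regular_rate=True):
    """Return True if a target may move to the next level of the hierarchy."""
    if level == Level.SPONTANEOUS:
        return False  # top of the hierarchy; graduation is checked separately
    low, high = CRITERION_RANGE[level]
    if not low <= criterion <= high:
        raise ValueError(f"criterion must be between {low} and {high}")
    # The regular-rate requirement is stated only for Simultaneous Production.
    if level == Level.SIMULTANEOUS and not regular_rate:
        return False
    return bool(varied_prosody) and accurate_count >= criterion


def is_graduated(spontaneous_scores):
    """Check graduation from (accurate, attempted) pairs, one per session,
    in chronological order, all collected at the Spontaneous level."""
    last_three = spontaneous_scores[-3:]
    return len(last_three) == 3 and all(s == (10, 10) for s in last_three)
```

In practice the clinician's judgment determines the starting level within each block, so rules like these would at most serve as a fidelity check on recorded decisions.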
Dynamic assessment will be used to evaluate whether a word or phrase is ready for active treatment (e.g., stimulable with clinician support) and is an appropriate substitute for a graduated treatment target. At the end of a treatment session during which a target has been graduated out of active practice, the clinician will select one word/phrase from the set of 20 potential treatment targets and briefly practice this target with the child. If the child responds to clinician cueing (e.g., verbal, tactile, visual, temporal cues), this word/phrase will be added to active treatment. If the child repeatedly attempts a word/phrase but productions remain equally inaccurate despite clinician cues, that target will not be introduced to treatment and the process will be repeated with a different word/phrase.
Clinicians
Community clinicians will be recruited to administer the assessment and treatment protocols. We anticipate having up to 35 different community clinicians across North America who practice in community clinics, schools, and private practice, primarily in the continental United States. Clinicians will undergo a rigorous application and training process including completion of 20 h of didactic and applied learning modules focused on assessment, treatment, research ethics, and study-specific protocols. Following the training and prior to collecting data, clinicians will achieve 90% fidelity in administration of the DEMSS and DTTC, achieve 90% accuracy or higher on an examination that assesses knowledge of the RCT’s experimental protocols and procedures, and achieve 90% reliability in making perceptual judgments of segmental and prosodic accuracy.
Stimuli
Stimuli will consist of individualized treatment targets that will be included in treatment sessions and a generalization corpus containing a common set of words that all children will produce. Treatment targets will consist of 20 words or phrases that may be targeted over the course of treatment. They will be selected by evaluating each child’s performance across assessment tasks, including their phonetic inventory, word shape inventory, lexical stress patterns, individual error profile with a specific focus on vowel errors, and responses to clinician cueing within dynamic assessment tasks (e.g., DEMSS). In addition, the child and/or their family will be consulted to ensure that potential treatment items are highly functional and motivating to the greatest extent possible. Targeted areas will include accuracy of movement gestures in a range of syllable and word shapes, vowel accuracy, coarticulatory contexts including a range of consonant and vowel transitions, and varied stress patterns.
The generalization corpus will be used to examine carryover of treatment gains. This set consists of 45 words/phrases that will not be included in treatment for any participant. These words/phrases have been divided into 15 low complexity targets, 15 medium complexity targets, and 15 high complexity targets. Target complexity was determined according to word structure, segmental features, and lexical/phrasal stress. Word structure of targets follows a hierarchy of increasing complexity determined by syllable/word shape (e.g., CV, VC, CVC, CVCV, VCVC, CCVC, CVCVC, etc.), syllable number (e.g., monosyllabic, bisyllabic, multisyllabic words and phrases), and presence of adjacent consonants (e.g., CCVC; CVC CVC). Segmental complexity accounts for age of acquisition (i.e., Early 8, Mid 8, Late 8; [50]), consonant features (i.e., place, manner, voicing), and vowel characteristics (i.e., vowel height/advancement; monophthong, diphthong). Complexity of movement transitions considers whether targets contain the same or different consonant–vowel sequences (i.e., same consonants/vowels, varied consonants/same vowels, same consonant/varied vowels, varied consonants/varied vowels) and the extent of place/manner/voicing changes across phoneme sequences. Suprasegmental complexity is based on whether lexical and phrasal stress follows either a trochaic or iambic pattern. See Table 1 for details regarding each level of complexity. Children with severe CAS will produce the 30 low and medium complexity targets. Those with moderate CAS will produce the 30 medium and high complexity targets. The child’s complexity level will be determined from their initial speech assessments using a checklist that clinicians will complete. The 15 medium complexity non-treatment words will be elicited from all participants.
Table 1
Target complexity framework
Complexity level | Word structure | Segmental features | Movement transitions | Stress pattern |
Low Complexity | CV, VC, CVC, reduplicated CVCV | Consonants: early consonants. Vowels: simple vowels, diphthongs | Same consonant/vowel sequences | Trochaic |
Medium Complexity | CVC, CVCV, VCVC | Consonants: early, mid, and some late consonants; homorganic consonant clusters. Vowels: simple vowels, diphthongs | Same and varied combinations of consonants (differing by place, manner, and/or voicing) and vowels | Mostly trochaic, some iambic |
High Complexity | Multisyllabic words, phrases | Consonants: early, mid, and late consonants; homorganic and heterorganic consonant clusters. Vowels: simple vowels, diphthongs | Varied combinations of consonants and vowels | Trochaic and iambic |
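The assignment of generalization targets by CAS severity described above amounts to a lookup over the three complexity levels. The following is a minimal sketch under stated assumptions: the severity labels, the function name `select_generalization_targets`, and the `corpus` dictionary layout are hypothetical, and the clinician checklist that determines severity is not modeled.

```python
# Complexity levels produced by each severity group, per the protocol text.
LEVELS_BY_SEVERITY = {
    "severe": ("low", "medium"),
    "moderate": ("medium", "high"),
}
ITEMS_PER_LEVEL = 15


def select_generalization_targets(severity, corpus):
    """Return the 30 generalization items for a child.

    corpus is a hypothetical dict mapping 'low', 'medium', and 'high'
    to lists of 15 words/phrases each.
    """
    if severity not in LEVELS_BY_SEVERITY:
        raise ValueError(f"unknown severity: {severity!r}")
    selected = []
    for level in LEVELS_BY_SEVERITY[severity]:
        items = corpus[level]
        if len(items) != ITEMS_PER_LEVEL:
            raise ValueError(
                f"expected {ITEMS_PER_LEVEL} {level} items, got {len(items)}"
            )
        selected.extend(items)
    return selected
```

Because both groups include the medium level, the 15 medium complexity items are the only ones shared by every participant.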
Probe data
Treatment outcomes will be measured using probe data, which will be collected before, during, and following the intervention phase. The data collection schedule is presented in Table 2. Probe words consist of 50 items that include 20 individualized potential treatment targets and a 30-item generalization corpus. At each data collection session, one list of probe words will be presented in a randomized order with each word produced once. Probe data will be elicited in direct imitation (e.g., “Say, ‘apple’”) with no feedback or cues provided. The procedure for probe data collection is identical in baseline, treatment, and follow-up phases.
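Assembling a randomized probe list of this kind can be sketched as follows. This is an illustration rather than the study's procedure: the text does not specify how the order is randomized, so the use of Python's `random.Random` and the optional `seed` parameter are assumptions, and `build_probe_list` is a hypothetical name.

```python
import random


def build_probe_list(treatment_targets, generalization_targets, seed=None):
    """Combine 20 treatment targets and 30 generalization items into one
    50-item list in random order, with each item appearing once."""
    if len(treatment_targets) != 20:
        raise ValueError("expected 20 potential treatment targets")
    if len(generalization_targets) != 30:
        raise ValueError("expected 30 generalization items")
    items = list(treatment_targets) + list(generalization_targets)
    # A seeded generator makes a given session's order reproducible;
    # seed=None draws a fresh order each time.
    rng = random.Random(seed)
    rng.shuffle(items)
    return items
```

Recording the seed alongside each probe session would allow the presentation order to be reconstructed later for reliability scoring.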
Table 2
Data collection schedule
Week | Tx sessions (2 sessions/week) | Probes and measures | Purpose | Tx sessions (4 sessions/week) | Probes and measures | Purpose |
-3 to -2 | | P1 (× 1), ICS, FOCUS 34 | Baseline accuracy, intelligibility and participation | | P1 (× 1), ICS, FOCUS 34 | Baseline accuracy, intelligibility and participation |
-1 | | P2 (× 3) | | | P2 (× 3) | |
1 | Tx1 Tx2 | P3 (× 1) | Baseline accuracy (pre-Tx1) | Tx1 Tx2 Tx3 Tx4 | P3 (× 1) | |
2 | Tx3 Tx4 | - | | Tx5 Tx6 Tx7 Tx8 | P4 (× 1) | Accuracy after 6 Tx sessions (pre-Tx7) |
3 | Tx5 Tx6 | - | | Tx9 Tx10 Tx11 Tx12 | - | |
4 | Tx7 Tx8 | P4 (× 1) | Accuracy after 6 Tx sessions (pre-Tx7) | Tx13 Tx14 Tx15 Tx16 | P5 (× 1) | Accuracy after 12 Tx sessions (pre-Tx13) |
5 | Tx9 Tx10 | - | | Tx17 Tx18 Tx19 Tx20 | P6 (× 1) | Accuracy after 18 Tx sessions (pre-Tx19) |
6 | Tx11 Tx12 | - | | Tx21 Tx22 Tx23 Tx24 | P7 (× 2), ICS, FOCUS 34 | Accuracy after 24 Tx sessions (1-day post-Tx24), intelligibility and participation |
7 | Tx13 Tx14 | P5 (× 1) | Accuracy after 12 Tx sessions (pre-Tx13) | | P8 (× 3) | 1-week post-Tx accuracy |
8 | Tx15 Tx16 | - | | | | |
9 | Tx17 Tx18 | - | | | | |
10 | Tx19 Tx20 | P6 (× 1) | Accuracy after 18 Tx sessions (pre-Tx19) | | P9 (× 2), ICS, FOCUS 34 | 4-week post-Tx accuracy, intelligibility and participation |
11 | Tx21 Tx22 | - | | | | |
12 | Tx23 Tx24 | P7 (× 2), ICS, FOCUS 34 | Accuracy after 24 Tx sessions (1-day post-Tx24), intelligibility and participation | | | |
13 | | P8 (× 3) | 1-week post-Tx accuracy | | | |
16 | | P9 (× 2), ICS, FOCUS 34 | 4-week post-Tx accuracy, intelligibility and participation | | | |
18 | | | | | P10 (× 3) | 12-week post-Tx accuracy |
24 | | P10 (× 3) | 12-week post-Tx accuracy | | | |