Introduction
A non-coding hexanucleotide repeat expansion in the gene
C9orf72 is the most common cause of both amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD) [
17,
70]. ALS and FTLD exist on a disease spectrum, characterized by overlap in diagnosis, neuropathology, and genetic risk variants [
11,
21,
37,
40,
61]. It has been estimated that up to 97% of ALS patients demonstrate TDP-43 pathology, while approximately 50% of FTLD patients have TDP-43 inclusions [
5]. Additionally, the
C9orf72 repeat expansion has been shown to contribute to the mis-localization of TDP-43 from the nucleus to the cytoplasm [
91], a key feature of TDP-43 pathology. Other studies have demonstrated that expansions in
C9orf72 may lead to disease through at least three main mechanisms: reduced expression of
C9orf72 transcripts, presence of RNA foci, and/or inclusions of dipeptide repeat (DPR) proteins [
3,
17,
57]. However, the clinical and pathological variability across the disease spectrum remains largely unexplained.
Recently, the cerebellum has garnered increased attention in the ALS/FTLD research field. Traditionally, the cerebellum, which is clinically unaffected, was thought of as a region primarily spared from TDP-43 pathology and neuronal loss; however, newer studies have shown that TDP-43 protein levels are reduced in this region [
63]. In fact, evidence from neuropathological characterization and imaging studies indicates that the cerebellum is involved in both ALS and FTLD [
66,
73]. Furthermore,
C9orf72 RNA foci and DPRs are widely observed throughout the cerebellum [
26,
27], and features of the
C9orf72 repeat expansion have been associated with clinico-pathological variables of the disease, specifically in this region [
16,
71,
83]. Several studies have evaluated transcriptomic alterations in the cerebellum across the ALS/FTLD disease spectrum. For example, using a gene expression assay, we described a cerebellar upregulation of homeobox (
HOX) genes in patients with a
C9orf72 repeat expansion [
24]. Additional studies have used RNA sequencing (RNAseq) to identify transcriptome-wide differentially expressed genes and alternative splicing events. One such study compared ALS patients with the repeat expansion to sporadic ALS subjects and controls and found pervasive gene expression changes in the cerebellum [
67]. In this region, differentially expressed genes were involved in neuron development, protein localization, and transcription pathways [
67]. They also discovered widespread splicing dysregulation, specifically alternative splicing of cassette exons. The genes that contained these alternative splicing events were primarily involved in cell signaling, cellular trafficking, synaptic function, and RNA processing [
67]. Despite these efforts, studies of the cerebellar transcriptome have been limited by sample size and/or the computational tools to evaluate splicing that were available at the time.
Regardless, transcriptomic and functional studies have consistently demonstrated that alternative splicing and dysregulation of RNA splicing are key drivers of the ALS/FTLD disease spectrum [
2,
13,
35,
44,
65,
79,
88,
92]. In fact, these studies have not only identified dysregulation of cassette exon splicing, but also the presence of cryptic exons in ALS/FTLD patients with TDP-43 pathology [
51,
67]. Cryptic exons can be defined as exons that are present within a normally intronic region, often leading to a loss of expression, possibly by nonsense-mediated decay, or to the production of a dysfunctional protein. These studies have shown that nuclear, endogenous TDP-43 binds to and represses these cryptic exons [
51]. One example of a cryptic exon that can be present in the brain of FTLD patients with TDP-43 pathology is in the gene
STMN2, where loss of TDP-43 leaves a cryptic exon unrepressed, leading to a decrease in both RNA and protein levels and an increase in truncated
STMN2 transcripts [
41,
56]. This has been observed in patient tissue, allowing
STMN2 to serve as a marker for this pathology [
41,
56,
68]. Moreover, two studies have indicated that inclusion of a cryptic exon in
UNC13A confers disease risk, where the cryptic exon is found in the absence of nuclear, endogenous TDP-43 [
10,
52].
In summary, the cerebellum exhibits ample C9orf72 pathology, but minimal neurodegeneration in autopsy tissue, making it a prime region for post-mortem study of C9orf72 transcriptomics. This gives us the opportunity to uncover findings that may have remained hidden when focusing on primary affected regions with extensive neuronal loss and that could hint at mechanisms underlying the resilience of the cerebellum to neurodegeneration despite evidence of TDP-43 dysfunction. Therefore, to capture genes, transcripts, or pathways that may contribute to the heterogeneity of the ALS/FTLD disease spectrum, here we profiled the cerebellar transcriptome of patients with or without a C9orf72 repeat expansion, as well as controls. We included subjects for whom post-mortem tissue was available through the Mayo Clinic Brain Bank. Using differential gene expression, co-expression network analysis, and differential splicing, we aimed to determine expression and splicing alterations in this disease spectrum.
Discussion
In this study, we describe transcriptomic alterations in the cerebellum of patients with a
C9orf72 repeat expansion. Traditionally, the cerebellum has not been considered a primary affected region in either ALS or FTLD. However, recent studies have suggested that the cerebellum is susceptible to pathological accumulation of RNA foci and DPR proteins that are produced directly from the
C9orf72 repeat expansion [
26]. We and others have shown transcriptomic changes in the cerebellum of individuals with ALS and/or FTLD who harbor the
C9orf72 repeat expansion [
24,
33,
67,
84]. Consistent with previous results, we observed that
C9orf72 expression is significantly decreased (Fig.
1d) in the cerebellum of c9 patients [
84]. Additionally, through differential gene expression analysis and co-expression analysis (DarkRed module), we found that the expression of
HOX genes is upregulated in the cerebellum of patients with a
C9orf72 repeat expansion, confirming what we demonstrated using a gene expression assay [
24]. The
HOX gene family plays an important role in neuronal development [
53,
69]. We hypothesized that the activation of this gene family may be a compensatory mechanism for loss of
C9orf72 expression [
24]. Here, we confirm that this upregulation is specific to c9 patients (Fig.
1a, b). Along these lines, we found several interesting modules of co-expressed genes that were associated with c9 status and contained multiple disease-relevant genes. For example, the Blue module, which was enriched for small molecule metabolic processes, was negatively associated with the c9 group when compared to controls and contained 19 disease-associated genes, including
C9orf72. Moreover, multiple of these c9-associated modules were enriched for RNA processing pathways (Turquoise, Royal Blue). These results provide additional evidence that these molecular pathways have a role in the disease process and suggest that they might be relevant in the cerebellum. In addition to these differential expression and co-expression analyses, our deconvolution analysis revealed a potential reduction of oligodendroglial-lineage cells in individuals with a
C9orf72 repeat expansion. This reduction is of particular interest because studies have previously implicated oligodendrocytes in the ALS/FTLD disease spectrum [
22,
23,
34,
82,
87]. Regardless, we acknowledge that our observations are highly preliminary. We realize, for instance, that the marker used for our immunostaining experiment, OLIG2, can stain other cell types and may have a preference for oligodendrocyte precursor cells [
54,
62,
86,
89]. As such, further studies are required to clarify if specific oligodendrocyte cell lineage populations (e.g., oligodendrocyte precursor cells, pre-myelinating oligodendrocytes, or myelinating oligodendrocytes) are affected and, if so, how this relates to
C9orf72-associated diseases.
Additionally, as RNA processing and splicing are key drivers of the ALS/FTLD disease spectrum, we focused on alternative splicing. We found multiple alternative splicing events in genes that have been previously implicated in ALS and/or FTLD (Fig.
2c). This includes an exon skipping event in
CAMTA1, a gene that contains variants that have been implicated as a clinical modifier of ALS survival time [
25]. We also found an exon skipping event in
DCTN1, which is a gene that harbors variants that confer risk of both ALS and FTLD. Protein levels of DCTN1 have been shown to be reduced in an ALS mouse model and to regulate TDP-43 aggregation [
18,
43,
58,
85]. In addition, we identified increased inclusion of an annotated exon in
SS18L1, which has been associated with familial ALS [
80]. Finally, we found a complex splicing event located in intron 1 of
C9orf72 (Fig.
2d), where there seems to be an increased presence of a cryptic splicing event in c9 patients (chr9:27,567,164–27,572,766; c9-PSI = 21%, non-c9-PSI = 0.9%). In this cluster, 11 splice junctions were identified. Interestingly, in multiple
C9orf72 transcripts, the repeat expansion is located in intron 1 [
84]. More detailed analyses need to be performed to fully understand this complex splicing event.
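As an aid to reading the PSI values quoted above, the following is a minimal sketch of how a percent spliced in (PSI) value can be derived for each junction in a cluster. It assumes a cluster-based definition, in which PSI is the share of a cluster's junction reads that support one junction. The function name `junction_psi` and the read counts are hypothetical illustrations and are not taken from our data or analysis pipeline.

```python
def junction_psi(junction_counts):
    """Return percent spliced in (PSI) for each junction in one cluster.

    PSI is taken here as the junction's read count divided by the total
    read count of all junctions in the same cluster, multiplied by 100.
    """
    total = sum(junction_counts.values())
    if total == 0:
        # No reads support any junction: PSI is undefined, so report None.
        return {name: None for name in junction_counts}
    return {name: 100.0 * count / total
            for name, count in junction_counts.items()}


# Hypothetical read counts for a three-junction cluster (illustration only).
example_cluster = {"canonical": 150, "cryptic": 40, "other": 10}
psi = junction_psi(example_cluster)
print(psi["cryptic"])
```

Under this definition, the PSI values of all junctions in a cluster sum to 100, so an increase in a cryptic junction implies a matching decrease across the remaining junctions of that cluster.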
Previously, alternative splicing of cassette exons was shown to be present at high levels in c9 patients compared to both sporadic ALS patients and controls [
67]. We specifically looked at cassette exon splicing in our data and revealed further evidence of dysregulated exon skipping in c9 patients, where we detected a high number of these events in c9 patients compared to both non-c9 patients and controls. We were particularly interested in cryptic splicing as this appears to be a crucial pathogenic mechanism across this disease spectrum [
7,
10,
51,
52,
68]. We identified an increased presence of cryptic splicing events in the c9 group compared to both the non-c9 and control groups in our cryptic cassette and annotation-based analyses. Overall, we identified 4 cryptic cassette exons and 77 cryptic splicing events based on annotations in c9 patients (Figs.
3a,
4b). In non-c9 patients, we detected 1 and 4 cryptic events, respectively. Presently, we can only speculate why few cryptic events were found in non-c9 patients. It seems plausible that, unlike our c9 group, our non-c9 group is more heterogeneous in terms of underlying mechanisms (e.g., due to various genetic and environmental factors), which might hamper our ability to detect relatively rare events in the cerebellum of non-c9 patients.
Of note,
STK10,
EFR3A, and
MAP4K3 had large effect sizes and/or low inclusion in controls.
STK10, serine-threonine kinase 10, is involved in Rhoc GTPase activity [
42], which in the brain has been shown to be crucial for neuronal development [
29,
77].
EFR3A, Eighty-five Requiring 3A, has been suggested to play an important role in synaptic development in the human fetal brain [
59]. Both of these events, like the
HOX genes, further implicate a role for neuronal development in the cerebellum of patients with the
C9orf72 expansion. Relevant to neurological diseases, rare variants in
EFR3A have been shown to cause autism, and knockout of the gene causes spiral ganglion neurons to degenerate in mouse models [
31].
MAP4K3 is also a serine-threonine kinase that plays an important role in the TFEB pathway [
12], in turn functioning as a protein kinase, with other suggested roles in cell death and autophagy [
72]. Genes of the
MAPK family have been implicated in ALS generally as they play important roles in cell proliferation and survival. Notably, multiple inhibitors of the MAPK pathway have been tested in various models of ALS, including those targeting MAP4K, which have been shown to increase motor neuron survival time in
SOD1 and mutant TDP-43 induced pluripotent stem cell lines [
90]. Based on our current analyses, these splicing events could be studied further to determine their functional consequences and how they may contribute to c9-related diseases. In addition, it is important to consider that these events may only be present in specific cell types. Assessing cryptic splicing may be a challenge in single-cell or single-nuclei RNAseq data; however, recently one group completed a single-cell analysis of the
STMN2 and
KALRN cryptic exons in the context of
C9orf72 and identified specific vulnerable cell types [
28]. Future studies, similar to our present study, will need to be performed to identify more of these potentially pathogenic cryptic splicing events.
Though we know that the cerebellum is primarily spared from TDP-43 pathology, TDP-43 protein levels may be reduced in FTLD-TDP patients in this region [
63]. The cerebellum does, however, contain other pathological inclusions in c9 patients that are ubiquitin and p62 positive [
1,
64,
81], in addition to RNA foci and DPR proteins produced from the repeat expansion. Some of these c9-specific pathologies colocalize with and sequester RNA-binding proteins in the cerebellum [
14,
26], including hnRNPs, which have been shown to play a role in suppressing cryptic exons, such as hnRNP K [
7,
75]. We hypothesized that this could explain the increase in cryptic exons in c9 patients but not non-c9 patients. Therefore, we correlated the expression of genes that encode RNA-binding proteins with our cryptic splicing events, which demonstrated that at least some of these cryptic events are associated with the expression of various genes that encode RNA-binding proteins. This aligns with the fact that several cryptic splicing events we detected (e.g., in
TPCN1 and
SNHG14) are in genes known to be regulated by RNA-binding proteins like hnRNP K [
7].
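To illustrate the kind of expression-to-splicing correlation described above, the following is a minimal sketch of a rank-based (Spearman) correlation between the expression of one RNA-binding protein gene and the PSI of one cryptic junction across samples. The choice of a Spearman correlation is an assumption made for illustration only, as are the function names and the sample values; none of these are taken from our data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values (illustration only).
rbp_expression = [5.1, 6.3, 7.0, 8.2, 9.5]   # e.g., normalized expression
junction_psi = [2.0, 3.5, 3.0, 10.0, 12.0]   # PSI (%) of one cryptic junction
rho = spearman(rbp_expression, junction_psi)
print(round(rho, 3))
```

A rank-based measure is used in this sketch because PSI values are bounded and often skewed. As noted in the text, such a correlation reflects RNA-level association only and does not establish a regulatory relationship.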
Notably, we detected primarily positive correlations between
TARDBP and c9 cryptic exon junctions. When looking at our differential expression data, we did show that
TARDBP was differentially expressed, with increased RNA expression in c9 patients, contrary to what has been shown at the protein level. TDP-43 autoregulation is well established [
4], and therefore, it may not be surprising that we saw increased expression levels in response to previously reported decreased protein levels [
63]. Overall, there appeared to be 2 primary clusters of genes that were either consistently positively or negatively correlated with cryptic exon inclusion. In addition to
TARDBP, this includes genes like
HNRNPA2B1,
FUS,
SRSF1, and
ALYREF (Fig.
6a–c). While further study is needed to evaluate the specific effects of each of the proteins that these genes encode, our data does suggest that these non-TDP-43 RNA-binding proteins may also play an important role in splicing alterations. One should be mindful, however, of the fact that we examined RNA expression-based correlations of select RNA-binding proteins, which, as demonstrated by TDP-43, are not necessarily indicative of protein levels or RNA-binding protein pathology. Additionally, we would like to stress that future large-scale studies that integrate RNA foci burden and/or DPR protein levels should be performed to specifically investigate the direct role of these pathological features on altered splicing.
In the current study, we performed a transcriptome-wide comparison of differentially expressed genes and WGCNA modules to the frontal cortex, a primary affected region. Importantly, many relevant genes (e.g.,
C9orf72 and
TARDBP) as well as pathways (e.g., small molecule metabolic processes and protein modification) were identified in both regions. This demonstrates that we can capture vital biological changes in the cerebellum that are observed in primary affected regions, like the frontal cortex, while also capturing unique transcriptomic alterations (e.g.,
HOX genes). Additional studies have previously characterized the transcriptome of primary affected brain regions in patients with a
C9orf72 repeat expansion [
33,
67]. In line with our findings, these studies detected decreased levels of
C9orf72 in patients with a
C9orf72 expansion, while no differences in
HOX genes were reported [
33,
67]. Interestingly, evaluation of alternative splicing specifically in
C9orf72 patient brain tissue suggested that splicing alterations were more profound in the cerebellum compared to the frontal cortex [
67]. More well-designed, large-scale transcriptomic studies of
C9orf72 patients are needed to allow in-depth comparisons across brain regions, which may reveal critical similarities and differences.
Our study raises many additional questions that require more investigation. Newer sequencing technologies, such as long-read sequencing and single-cell/nuclei sequencing, should be capable of addressing some of the questions. Specifically, long-read sequencing technologies could aid in identifying and/or interpreting alternative splicing events, especially complex events, like the ones in
EFR3A or
C9orf72. This technology is a powerful way to discover disease-relevant transcript variants and cryptic splicing events that may not be detectable with short-read sequencing. At the same time, single-cell/nuclei studies can assist in increasing our understanding of gene expression, splicing, and cellular proportion differences. For example, while our bulk RNAseq data has allowed us to estimate changes in some cell types, this approach has a limited ability to comprehensively interrogate cell-type-specific gene expression and splicing changes. As others have shown, unique cell types respond differently to pathology [
76]. Future single-cell/nuclei studies in the cerebellum will be essential for revealing cell-type-specific disease mechanisms and vulnerable cell populations, similar to a recent study of the frontal cortex [
47]. This would help to determine whether a given event is mainly present in a specific type of cell and to elucidate whether differences in cellular proportions might, in part, contribute to some of the observed findings (e.g., splicing events). Combining these technologies (long-read + single-cell/nuclei sequencing) could facilitate the prioritization of disease-relevant splicing events and cell-specific alterations.
To conclude, we discovered abundant transcriptomic changes in the cerebellum of patients with a C9orf72 repeat expansion. Analysis of gene expression emphasized a role for genes involved in neuron development and RNA processing. Additionally, we detected an increase in cryptic exons in the cerebellum of c9 patients, which may be due to c9-specific pathologies in the cerebellum, possibly impacting various RNA-binding proteins. Overall, this study further emphasizes a role for the cerebellum in c9-related diseases and suggests that c9-specific pathologies may contribute to cryptic splicing, particularly in neuroanatomical regions without characteristic TDP-43 pathology.