Introduction
In 2020, female breast cancer (BC) was the most common malignant tumor worldwide and the leading cause of cancer-related death in women (Sung et al.
2021). It is estimated that 287,850 new cases of female BC and 43,250 deaths will occur in 2022 in the US alone (Siegel and Miller
2022). In recent years, surgery, radiotherapy, chemotherapy, immunotherapy, endocrine, and targeted therapy have been widely applied for the treatment of this heterogeneous malignancy, leading to significant progress. However, the global burden of BC remains considerable, necessitating the early identification of high-risk patients and precision medicine-inspired solutions for improved prognosis.
Phosphatidylinositol 3-kinases (PI3Ks) phosphorylate the third hydroxyl of the phosphatidylinositol ring, and they can be classified into three types based on structure (I, II, and III) (Mosele et al.
2020). Type I are the most widely studied, acting in various important biological processes, such as cell growth, proliferation, metabolism, and migration through the PI3K/AKT/mTOR pathway (Samuels et al.
2005). The PIK3CA gene encodes the catalytic subunit of class IA PI3Ks (p110α) (Zardavas et al.
2014). Mutation of PIK3CA can lead to abnormal catalytic activity of PI3Ks and, consequently, promote carcinogenesis in various tissues (Herberts et al.
2020; Hu et al.
2021; Jin et al.
2021; Ugai et al.
2021). PIK3CA somatic mutations occur in approximately 30% of BC patients, being more common in hormone receptor (HR)-positive tumors (Lv et al.
2020). Approximately 80% of PIK3CA mutations occur in the helical and kinase domains, with E542K and E545K in exon 9 as well as H1047R and H1047L in exon 20 as the most common variants (Dirican et al.
2016). In the past few years, PIK3CA has emerged as a promising target for BC treatment, with alpelisib and fulvestrant receiving approval for the treatment of PIK3CA-mutated, endocrine-resistant HR+/HER2− locally-advanced or metastatic BC (De Mattos-Arruda
2020). Moreover, several large clinical trials have reported an association of PIK3CA mutations with favorable prognosis and clinicopathological BC features (Kalinsky et al.
2009; Loi et al.
2013). Therefore, further study of the differences in gene expression between PIK3CA-mutated and wild-type BC should provide valuable insight for predicting prognosis and guiding clinical decision-making.
In light of the impact that the tumor immune microenvironment (TIME) has on tumorigenesis and metastasis, there has been increasing interest in immunotherapy for BC. Immune checkpoint inhibitors (ICIs) targeting cytotoxic T lymphocyte-associated protein 4 (CTLA4), programmed cell death-1 (PD-1), and programmed cell death ligand-1 (PD-L1) have shown promising efficacy in certain tumor types, leading to their entry into the clinic (Gaynor et al.
2022). The immunoregulatory effects of CTLA4 antagonists tremelimumab and ipilimumab have been confirmed in small-scale BC cohorts (Zhu et al.
2021). Clinical trials of CTLA4 antagonist monotherapy or in combination with other immunomodulators are ongoing, with the clinical benefits of CTLA4 inhibition in BC expected to be confirmed in the future. Anti-PD-1/PD-L1 immunotherapy confers a survival benefit to some metastatic triple-negative BC patients (Adams et al.
2019; Schmid et al.
2017). The TIME, which is composed of infiltrating immune cells and various other cell types, is strongly correlated with tumor progression and ICI response (Miller et al.
2021; Oliver et al.
2019; Taylor et al.
2017). Tumor-infiltrating lymphocytes (TILs) are major indicators of immune infiltration in the TIME and can inhibit tumor growth (Bagbudar et al.
2022). Enhancing cytotoxic T cell responses may suppress tumor growth and improve patients survival (Rupp et al.
2022). Studies have shown that BC tumors with a higher proportion of TILs are more sensitive to chemotherapy, radiotherapy, as well as immunotherapy, leading to improved prognosis (Adams et al.
2014; Denkert et al.
2018). Some studies have reported the effect of different mutations on ICI response (Chen et al.
2021a; Collins et al.
2022). Currently, the main challenge of immunotherapy remains the exploration of reliable indicators to predict its potential efficacy.
Hence, we explored the mutation characteristics of PIK3CA in BC samples from the Cancer Genome Atlas Database (TCGA,
https://portal.gdc.cancer.gov/) and our institution. Subsequently, a PIK3CA mutation-driven immune signature (PDIS) was developed based on TCGA data, and we validated its capacity for BC patient risk stratification using the Gene Expression Omnibus database (GEO,
https://www.ncbi.nlm.nih.gov/geo/). The signature presented herein may be employed to evaluate the TIME, optimize clinical benefit, and predict patient prognosis.
Materials and methods
Data collection and processing
Somatic mutation profile, RNA-seq data of BC and normal samples (FPKM-normalized format), as well as corresponding clinical features (including age, survival time, survival status, TNM stage, pathological stage, and ER/PR/HER2 receptor status) were collected from TCGA. Patients without survival information were excluded from subsequent analyses. We implemented the following inclusion criteria for cohorts from the GEO database: (1) samples were derived from human BC; (2) RNA-seq expression data was available; (3) clinical and prognostic information of patients was available; (4) the number of samples was greater than 50.
PIK3CA mutation analysis of BC samples from local hospital
To validate the results of mutation analysis in TCGA, we obtained surgically resected female BC samples from the Department of Oncology at Second Affiliated Hospital of Xi’an Jiaotong University obtained from January 2019 to July 2022. Screening criteria included: (1) no preoperative neoadjuvant therapy, (2) postoperative pathologically confirmed BC, and (3) pathological tissues eligible for molecular typing. This study was approved by the Ethics Committee of the Second Affiliated Hospital of Xi’an Jiaotong University. The surgical procedure was performed in accordance with the Declaration of Helsinki. All patients signed an informed consent form. We employed the allele-specific polymerase chain reaction (AS-PCR) method for PIK3CA mutation detection in patient samples. DNA was extracted from formalin-fixed, paraffin-embedded (FFPE) samples using the QIAamp DNA FFPE Tissue Kit (Qiagen, Cat No./ID: 56,404). Five hotspot mutations of PIK3CA (H1047R, H1047L, E542K, E545K, E545D) were then examined as per the manufacturer’s instructions using the Human PI3K Gene Mutation Fluorescence PCR diagnostic kit (Amoy Diagnostics, Xiamen, China) on an SLAN-96S fluorescent quantitative PCR instrument. Based on the obtained Ct values, we divided samples into negative, weakly positive, and strongly positive, with the latter two considered to harbor mutations.
Somatic mutation analysis and clinical validation
BC mutation data, including mutated genes, mutation types, and mutation sites, were downloaded from TCGA “Masked Somatic Mutation” database. The waterfall function in the “maftools” package was applied to obtain the mutation landscape of the TCGA-BRCA cohort. To understand the distribution of PIK3CA mutations in BC and their correlation with clinicopathological factors, PIK3CA mutation samples and related information were extracted from the downloaded mutation data, with clinical samples obtained from our local hospital used for comparison and validation. Correlations between PIK3CA mutations and clinical variables were analyzed using Pearson’s Chi-square test or Fisher’s exact test (Shao et al.
2021).
Identification of differentially expressed genes (DEGs) driven by PIK3CA mutations
RNA-seq data of PIK3CAMUT and PIK3CAWT patient tumor samples were extracted. After normalization, the “edgeR” package was further used to identify DEGs driven by PIK3CA mutations in BC patients. The screening criteria were set as the adjusted P < 0.05, log2|fold change (FC)|> 0.
Consensus clustering
Single-sample gene set enrichment analysis (ssGSEA) was used to analyze the infiltration of 22 immune cell types and the activity of seven immune-related pathways in tumor tissue samples (Hänzelmann et al.
2013). Based on the extent of immune infiltration, we applied the “ConsensusClusterPlus” package for consensus clustering to divide BC patients into two groups of high- and low-immune activity (Seiler et al.
2010).
Weighted gene co-expression network analysis
The “WGCNA” package was used to construct a gene co-expression network to aggregate genes with highly correlated expression and identify gene modules closely related to immune subtypes (Liu et al.
2022). Firstly, a similarity matrix was constructed based on expression data, and the Pearson correlation coefficient was calculated to evaluate the similarity between genes (Langfelder and Horvath
2008). An adjacency matrix was then constructed based on the above matrix. The formula was as follows: aij = power (sij, β) =|sij|
β, where ajj represents the correlation strength between gene i and gene j, sij is the Pearson correlation coefficient between the two genes, and β is the soft threshold (power) (Zhang and Horvath
2005). By constructing a topological overlap matrix (TOM), hierarchical clustering analysis of genes was performed, and genes with similar expression patterns were divided into gene modules using dynamic branch cut methods (Tian et al.
2020). Each module contained at least 200 genes, and modules with correlations greater than or equal to 0.75 were merged. Finally, the Pearson correlation coefficient between the MEs of each module and the clinical traits was calculated, and the module most related to the immune subtypes were screened for subsequent analysis. Module eigengene (ME, The first principal component in each module) described the overall level of gene expression in the Module. Module membership (MM, Correlation of all gene expression profiles with eigengene of this module) and gene significance (GS, The absolute value of the correlation between genes and phenotypic traits) were used to evaluate the correlation between genes in the module and the module itself, along with the correlation with the corresponding traits of the module (Tian, et al.
2020).
Construction and validation of PDIS
A Venn diagram was used to intersect genes driven by the PIK3CA mutation and the module most relevant to immunity. The obtained genes were termed PIK3CA mutation-driven immune genes (PDIGs). Univariate Cox regression analysis was performed on the obtained PDIGs to screen those related to overall survival (OS) of BC patients. Then, using the “glmnet” and “survival” packages, least absolute shrinkage and selection operator (LASSO) was conducted on the above genes. By controlling the penalty coefficient λ, the coefficients of some genes less related to prognosis were compressed to 0, and the coefficients of genes significantly related to prognosis were retained greater than 0. Then a multivariate model was constructed to further confirm the genes independently associated with prognosis. Risk score = h(t, X) = h
0(t) × e
Ʃ (coefi*Expri), where h0(t), coefi, and Expri are the constant, regression coefficient, and gene expression level, respectively (Ren et al.
2022). Patients in TCGA training and GEO validation cohorts were divided into high- and low-risk groups according to the corresponding median risk score, of which the high-risk patients are those with PDIS score above the median and the low-risk patients are those with PDIS score below the median. Kaplan–Meier (KM) survival analysis was performed using the “survival” and “survminer” packages to explore the capacity of PDIS and PIK3CA mutation status or regions to differentiate prognosis between risk groups. Other recently published prognostic models of BC were retrieved from PubMed (
https://pubmed.ncbi.nlm.nih.gov/). The “survivalROC” package was used to plot the receiver operating characteristic curves (ROC), whereafter the area under the ROC curve (AUC) and C-index were used to evaluate and compare the predictive validity of multiple models.
Establishment and verification of predictive nomogram
Combining with risk score and other clinicopathological features, R packages “rms” and “regplot” were used to construct a nomogram to quantitatively predict the survival rate of BC patients. Then, using the calibration function and “survivalROC” package, calibration and ROC curves of 1-, 3-, 5-year survival were drawn to verify the predictive ability of the nomogram. Decision curve analysis (DCA) was performed on different prognostic factors using the “ggDCA” package.
Functional enrichment analysis
DEGs between the high- and low-risk groups were screened with log
2|FC|> 1 and an adjusted
P < 0.05 as the standard. The “clusterProfiler” package was then used to perform Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis to explore the potential function of DEGs (Yu et al.
2012). Next, gene set enrichment analysis (GSEA) was used to explore the signaling pathways associated with different risk subgroups.
Mutation landscape and TMB analysis
The “maftools” package was used to visualize mutations in the high- and low-risk groups as well as to compare the two groups in this regard. The total number of mutations per megabase in each sample was calculated to obtain tumor mutational burden (TMB) (Wan et al.
2020). We compared TMB between the two subgroups and determined the association between TMB and survival.
Evaluation of differences in immune infiltration and checkpoint factor expression
We used the “GSVA” and “GSEABase” packages to conduct ssGSEA, on the basis of which we employed the ESTIMATE algorithm to evaluate immune score, stromal score, estimate score, and tumor purity (Yoshihara et al.
2013). Using the LM22 gene signature matrix downloaded from the CIBERSORT website as a reference (
https://cibersortx.stanford.edu/), we determined the relative proportions of 22 immune cell types in each tumor sample (Ren et al.
2021). Finally, the relationship between risk scores, immune scores, immune cell infiltration abundance, and immune checkpoint expression level was analyzed using the “ggplot2” package. Immune function-related genes were collected from TISIDB (an integrated repository portal for tumor-immune system interactions,
http://cis.hku.hk/TISIDB/), whereafter the correlation between risk scores and the above genes was analyzed and visualized using the “limma”, “reshape2”, and “RColorBrewer” R packages, to explore the potential mechanism of immune cell infiltration.
Drug response analysis
IC50 refers to the half-inhibitory concentration of the detected drug and is negatively correlated with the effect of chemotherapeutic drugs. We used the “pRRophetic” package and its functions “car”, “ridge”, “preprocessCore”, “genefilter”, and “sva” to calculate IC50 (Pang et al.
2021). Tumor Immune Dysfunction and Exclusion (TIDE) is a newly developed computational method that identifies the potential for tumor immune escape based on transcriptomic data (Jiang et al.
2018). FPKM gene expression data were Z-score normalized, and TIDE scores were then calculated for BC patients, with high TIDE scores predicting poor ICI efficacy. Accordingly, we analyzed the differences in drug response between patients in high- and low-risk groups, visualizing them via the “ggplot2” package.
Statistical analysis
All statistical analyses and charts were generated using R software (version 4.2.0), SPSS software (version 18.0), and Excel (Microsoft Corporation, California). The R package in this article is from CRAN (Comprehensive R Archive Network) or Bioconductor (
https://www.bioconductor.org/) for download. KM survival analysis and log-rank tests were used to compare survival between high- and low-risk groups. A two-tailed
P-value less than 0.05 was used as the threshold for statistical significance.
Discussion
BC is a heterogeneous disease. The molecular subtyping of immunohistochemical markers (ER, PR, HER2, Ki67) has greatly improved prognosis prediction and treatment decision-making (Yeo and Guan
2017). With the development of high-throughput sequencing technology, molecular subtype-based models, such as Oncotype DX, MammaPrint, RecurIndex, Endopredict, and PAM50, have been developed and applied in clinical trials as well as in clinical practice for BC diagnosis, individualized treatment, and survival prediction (Barzaman et al.
2020; Nicolini et al.
2018; Sun et al.
2021). However, these widely used models often neglect the effect of genetic differences on TIME heterogeneity. TIME refers to all immune components within the tumor microenvironment (TME), which plays an established central role in cancer development and progression, having predictive value for BC prognosis and immunotherapy response (Baxevanis et al.
2021; Byrne and Savas
2020; Xu et al.
2021). Tumor genetic heterogeneity, including single-nucleotide variants, short indels, and copy number variants, is involved in establishing TIME heterogeneity (Jia et al.
2022). As the most frequently mutated gene in BC, PIK3CA has gradually become the focus of targeted therapy, but the signature of PIK3CA-driven immune activity driven has not yet been studied.
In the present study, we observed that PIK3CA was most frequently mutated gene in the TCGA-BRCA cohort, with 34.49% samples harboring PIK3CA mutations. H1047R, H1047L, E542K, and E545K accounted for 61.95% of the identified mutations, among which H1047R was the most common, which was basically consistent with the conclusions of other studies (A et al.
2021; Chang et al.
2021; Martínez-Sáez et al.
2020). The clinical relevance of PIK3CA mutations has been extensively studied and is thought to be associated with favorable clinicopathological factors, such as smaller tumor size, HR positivity, and lower grade (Loi, et al.
2013; Sabine et al.
2014; Zardavas et al.
2018). However, no significant association between PIK3CA mutations and clinicopathological features was observed in the TCGA cohort. The characteristics of PIK3CA mutations were validated in clinical samples from a local hospital. The mutation frequency of PIK3CA was 40.83%, with H1047R once again accounting for the highest proportion of mutations. Interestingly, although there remained no link between PIK3CA mutation status and clinicopathological features, we found significant differences in ER, PR status, and molecular subtypes between PIK3CA mutation exon 9 and exon 20, which may provide a basis for individualized endocrine and targeted therapy for patients with different subtypes.
Many previous studies have reported that PIK3CA mutation status may be related to the prognosis of BC patients, but its predictive significance has remained controversial (Baselga et al.
2017; Di Leo et al.
2018; Loi, et al.
2013; Mosele, et al.
2020). Hence, we identified DEGs between PIK3CA
MUT and PIK3CA
WILD tumors. Two immune subtypes (Immune-H and Immune-L) with differential immune infiltration were distinguished via consensus clustering. The key modules and genes were screened via WGCNA analysis and overlapped with PIK3CA-mutated DEGs to yield PDIGs. After univariate Cox regression, LASSO, and multivariate Cox regression analyses, 16 PDIGs closely related to prognosis were selected to establish PDIS. Validation in the GEO database and comparison with the ROC and C-index of other immune (Chen et al.
2021b; Peng et al.
2021; Zhang et al.
2021), ferroptosis (Liu et al.
2021), and glycolysis-related (Zhang et al.
2020) prognostic models published in recent years demonstrated the prognostic value of the PDIS. Analysis of various prognosis-associated factors revealed that neither PIK3CA mutation status, nor factors commonly employed in clinical practice could independently predict BC prognosis. Meanwhile, the PDIS was an independent predictor of prognosis. Taken together with previous studies, the impact of PIK3CA mutation status or regions on BC prognosis remains largely uncertain to date. Similarly to our work, a number of studies have reported no meaningful association between the two (Loibl et al.
2014; Papaxoinis et al.
2015; Sabine, et al.
2014; Zardavas, et al.
2018). For example, a large research pooling 19 early-stage BC studies revealed that although PIK3CA mutations were associated with better OS in a univariate analysis, this relationship was no longer significant after correction via multivariate analysis, and prognosis was not significantly different between patients carrying mutations in the helical versus kinase domain (Zardavas, et al.
2018). Paradoxically, PIK3CA mutations were reported to predict both a favorable or unfavorable prognosis, varying based on molecular subtype and tumor stage (Deng et al.
2015; Elfgen et al.
2019; Liu et al.
2014; Mosele, et al.
2020; Pang et al.
2014). Still, certain studies suggest biological differences between tumors harboring mutations in the helical or kinase domain, which may be associated with different degrees of invasiveness, yet the specific prognostic value remains undetermined (Barbareschi et al.
2007; Lai et al.
2008; Lerma et al.
2008; Mangone et al.
2012; Pang et al.
2009). In summary, compared with PIK3CA mutation status alone, PDIS exhibits a higher predictive ability for BC patient survival. Moreover, by combining PDIS with ER, PR, HER2 status, and stage, the nomogram constructed can predict the survival probability of patients more accurately and quantitatively than PDIS alone.
Functional analysis revealed that DEGs in the high- and low-risk groups were associated with immune-related biological processes and pathways, as well as that there were differences in the functional pathways involved in the two risk groups that may be associated with BC development. Therefore, we analyzed the immunological profiles of PDIS-based risk groups. The low-risk group possessed a high proportion of infiltrating immunostimulatory cells, such as activated NK cells, CD8 T cells, and follicular helper T cells, which play a role in anti-tumor immunity. Among them, CD8 T cells and follicular helper T cells are important TILs, whose abundant infiltration is widely recognized as a marker of favorable prognosis in BC (Salemme et al.
2021). Conversely, regulatory T cells exhibited relatively abundant infiltration in the low-risk group, exerting an immunosuppressive effect, which has been previously reported as associated with poor prognosis (Martinez et al.
2019; Shash et al.
2021). Immune cells, represented by M2 macrophages, exhibited more abundant infiltration in the high-risk group. Some data suggest that M2 contributes to poor prognosis by stimulating angiogenesis and inflammation, enhancing tumor growth, invasion, and metastasis, whilst promoting immunosuppression (Choi et al.
2018; Mehta et al.
2021). Taken together, it is reasonable to speculate that low-risk patients exhibit a stronger anti-tumor immune response and may respond better to immunotherapy.
ICIs, which block immune checkpoint-mediated suppression to trigger robust anti-tumor immunity, have emerged as one of the most effective forms of immunotherapy, with proven benefit in a variety of cancers (Bagchi et al.
2021; Himmel et al.
2020). However, the low immunogenicity of BC often compromises ICI efficacy, highlighting the need for identifying patients expected to benefit. Herein, we observed a significant increase in the expression of nine common immune checkpoint factors in low-risk patients, including PD-L1, PD-1, and CTLA4, suggesting that this subgroup might benefit from ICIs. It is worth noting that it is recommended to combine PDIS with other indicators commonly used in ICI sensitivity evaluation during BC treatment (including ER, BR, HER2 receptor status, pathological stage, etc.) to play a more valuable role. For example, ICIs are usually used in the treatment of TNBC patients, but the therapeutic effect is limited. TNBC patients stratified by PDIS showed different immune activity. Among the low-risk TNBC population, they may be more likely to benefit from the treatment of ICIs, adding new evidence to the patient’s treatment plan and increasing the patient’s confidence. TMB is considered as a valuable biomarker for predicting ICI response, yet its predictive value varies significantly across cancers (Chan et al.
2019). Our results showed that TMB was generally low in BC, with only 1.8% (17/ 922) exhibiting a TMB greater than 10 mut/Mb in the high-risk group. Further, there was no significant difference in OS between H-TMB and L-TMB, which is consistent with previous findings in BC (McGrail et al.
2021; Wang et al.
2020). H-TMB is associated with a favorable ICI response and prognosis, presumably because mutations give rise to neoantigens that are immunogenic and thus are more likely to trigger T cell responses (Zheng
2022). A threshold of 13 mut/Mb is commonly used in H-TMB cancers, such as melanoma and non-small cell lung cancer. However, as an immunologically “cold” tumor with a low TMB, BC is extremely sensitive to the selection based on a TMB cutoff. In our study, using the median TMB (0.87 mut/Mb) was thus not appropriate. Large prospective studies are needed to determine appropriate cutoff that will allow the effective use of TMB in predicting the response to ICIs and prognosis in BC patients. The value of our PDIS for predicting response immunotherapy and chemotherapy was further explored. The TIDE was reported as an accurate predictor of the response to anti-PD-1 or -CTLA4 when compared to other indicators (e.g., PD-L1 levels and TMB) (Jiang, et al.
2018). We found that TIDE scores were higher in the high-risk group and were associated with poorer ICIs efficacy. The sensitivity of different risk groups to some common chemotherapeutics and targeted therapy drugs varied. Therefore, we have reason to believe that despite the shortcomings of PDIS, it is undeniable that it provides a new potential perspective for the treatment of BC. PDIS is not only an effective biomarker for predicting immunotherapy response, but may also help guide devising chemotherapy regimens in BC patients.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.