Mouse tumour models are the most widely used pre-clinical research tool in oncology [1] and play an important role in the discovery and development of anticancer drugs [2]. A critical novel approach to cancer treatment is immunotherapy, in which the host immune system is boosted to destroy cancer cells. Understanding how the immune system interacts with tumours is therefore crucial for developing personalised immunotherapies and cancer treatments. Cancer genomics research has been revolutionised by advances in next-generation sequencing (NGS). As costs continue to drop, the demand for sequencing of mouse cancers is increasing, as is the need for robust analysis pipelines [3]. To date, however, most analytical tools and bioinformatics pipelines for sequencing data have been developed with a focus on humans; they do not account for species-specific differences in genome structure and experimental setup, and they have not been systematically validated in the mouse context [3]. The Genome Analysis Toolkit (GATK) is the gold standard in germline variant discovery. It was originally developed for human genetics, and only recently has its scope expanded to include other organisms and somatic variant calling [4]. The GATK team is actively working on expanding access to other species, but developing and validating new, robust, and reliable tools is not an easy process. For instance, Funcotator, the functional annotation tool in GATK, does not currently support non-human genomes. Nevertheless, the GATK best practices can be adapted to the analysis of non-human organisms. A critical challenge in developing a robust tumour genome analysis pipeline is choosing not only the appropriate analysis methods for somatic variant discovery and biomarker identification, but also the correct file formats, genome references and annotations. In this study, we propose a genome analysis pipeline designed specifically for mouse tumours. The pipeline comprises three main components: a data analysis workflow for somatic variant discovery using whole-genome sequencing (WGS) data, a workflow for differential expression analysis using RNA-sequencing (RNA-seq) data, and a workflow for neoepitope prediction through neoantigen analysis. Neoepitopes are the MHC (major histocompatibility complex)-presented targets of immune responses against cancer. The pipeline is based on standards and best practices, and is configured to integrate mouse references and annotations. All file formats and genome references used, and all tools, methods, algorithms and packages included in our pipeline, are current standards in NGS data analysis and have been quantitatively evaluated with regard to accuracy, precision, and reliability [5–8].
The thiopurine drugs are purine antimetabolites widely used in the treatment of haematological cancers and autoimmune disorders, and in organ transplant recipients. The thiopurine thioguanine, also known as 6-thioguanine (6TG), is used to treat acute myeloid leukaemia, acute lymphocytic leukaemia, and chronic myeloid leukaemia [9]. Thiopurines are converted into thioguanine nucleotides that are incorporated into DNA in competition with normal guanine, inducing mutations through single-nucleotide mismatching [10]. We recently applied the pipeline to show that treating low-mutation melanoma with low-dose 6TG in a pre-clinical mouse model is highly effective in reactivating T cells to attack the cancer and mildly increases the tumour mutational burden (TMB) [11]. Moreover, combining 6TG with immune-checkpoint inhibitors (ICI), which block the interaction between inhibitory receptors on T cells and their ligands, enhances the response to ICI therapy [11]. Here, we describe the pipeline applied in that study and extend its results by analysing the potential for MHC class I antigen presentation of the identified tumour mutational space and investigating how this space correlates with tumour control.
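For orientation, TMB is commonly reported as the number of somatic mutations per megabase of callable sequence. The sketch below illustrates that normalisation only. The function name, the mutation counts and the callable genome size are hypothetical values chosen for illustration; they are not results or parameters from this study, and the pipeline may define its callable region differently.

```python
def tumour_mutational_burden(n_somatic_mutations: int, callable_bases: int) -> float:
    """Return TMB as somatic mutations per megabase of callable sequence."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    megabases = callable_bases / 1_000_000
    return n_somatic_mutations / megabases


# Hypothetical inputs for illustration only (not data from this study):
# an assumed callable genome size and assumed somatic mutation counts.
ASSUMED_CALLABLE_BASES = 2_700_000_000

untreated_tmb = tumour_mutational_burden(5_400, ASSUMED_CALLABLE_BASES)
treated_tmb = tumour_mutational_burden(8_100, ASSUMED_CALLABLE_BASES)

print(f"untreated: {untreated_tmb:.1f} mut/Mb, treated: {treated_tmb:.1f} mut/Mb")
```

Normalising by callable bases rather than by total genome length keeps TMB values comparable between samples that differ in sequencing coverage.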