TCGA HNSCC Marker Paper

Posted by sufang on Fri, 02/27/2015 at 10:08 | Oral Cancer, HNSCC, TCGA

Comprehensive genomic characterization of head and neck squamous cell carcinomas (pdf 4292; PubMed Link)
The Cancer Genome Atlas profiled 279 head and neck squamous cell carcinomas (HNSCCs) to provide a comprehensive landscape of somatic genomic alterations. Here we show that human-papillomavirus-associated tumours are dominated by helical domain mutations of the oncogene PIK3CA, novel alterations involving loss of TRAF3, and amplification of the cell cycle gene E2F1. Smoking-related HNSCCs demonstrate near universal loss-of-function TP53 mutations and CDKN2A inactivation with frequent copy number alterations including amplification of 3q26/28 and 11q13/22. A subgroup of oral cavity tumours with favourable clinical outcomes displayed infrequent copy number alterations in conjunction with activating mutations of HRAS or PIK3CA, coupled with inactivating mutations of CASP8, NOTCH1 and TP53. Other distinct subgroups contained loss-of-function alterations of the chromatin modifier NSD1, WNT pathway genes AJUBA and FAT1, and activation of oxidative stress factor NFE2L2, mainly in laryngeal tumours. Therapeutic candidate alterations were identified in most HNSCCs.
Five Figures in the Main Text:
Figure 1 | DNA copy number alterations. a, Copy number alterations by anatomic site and HPV status for squamous cancers. Lung squamous cell carcinoma (LUSC, n = 358) and cervical squamous cell carcinoma (CESC, n = 114). b, Unsupervised analysis of copy number alteration of HNSCC (n = 279) with associated characteristics. The rectangle indicates chromosome 7 amplifications in the purple cluster. NA, not available. 
Figure 2 | Significantly mutated genes in HNSCC. Genes (rows) with significantly mutated genes (identified using the MutSigCV algorithm; q < 0.1) ordered by q value; additional genes with trends towards significance are also shown. Samples (columns, n = 279) are arranged to emphasize mutual exclusivity among mutations. Left, mutation percentage in TCGA. Right, mutation percentage in COSMIC (‘upper aerodigestive tract’ tissue). Top, overall number of mutations per megabase. Colour coding indicates mutation type.
Figure 3 | Candidate therapeutic targets and driver oncogenic events. Alteration events for key genes are displayed by sample (n = 279). TSG, tumour suppressor gene.
Figure 4 | Integrated analysis of genomic alterations. a, b, Samples (n = 279) are displayed in columns and grouped by gene expression (a) or methylation (b) subtype (sub.). Unadjusted two-sided Fisher’s exact test P values assess the association of each genomic alteration. Methylation probe location of CpG islands, shores and shelves are shown on the left of b. Annotation shows HPV status and subtype (16, 33 and 35). CN, copy number.
Figure 5 | Deregulation of signalling pathways and transcription factors. Key affected pathways, components and inferred functions, are summarized in the main text and Supplementary Information section 7 for n = 279 samples. The frequency (%) of genetic alterations for HPV(–) and HPV(+) tumours are shown separately within sub-panels and highlighted. Also see Supplementary Fig. 7.15. Pathway alterations include homozygous deletions, focal amplifications and somatic mutations. Activated and inactivated pathways/genes, and activating or inhibitory symbols are based on predicted effects of genome alterations and/or pathway functions. 
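The Figure 4 caption refers to unadjusted two-sided Fisher’s exact test P values, where "unadjusted" means no multiple-testing correction has been applied. As a rough illustration only, and not the consortium's actual pipeline, the sketch below computes such a P value for a single 2x2 table using only the Python standard library. The function name and the altered/unaltered counts are hypothetical; only the group sizes (36 HPV(+) and 243 HPV(–) tumours) are taken from the title of Figure S2.3.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def table_prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = table_prob(a)
    # Sum over every table that is no more probable than the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (table_prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical counts, for illustration only (not taken from the paper):
# rows = HPV(+) / HPV(-) tumours, columns = altered / not altered.
hpv_pos_altered, hpv_pos_unaltered = 20, 16      # 36 HPV(+) tumours
hpv_neg_altered, hpv_neg_unaltered = 40, 203     # 243 HPV(-) tumours
p = fisher_exact_two_sided(hpv_pos_altered, hpv_pos_unaltered,
                           hpv_neg_altered, hpv_neg_unaltered)
print(f"two-sided Fisher's exact P = {p:.3g}")
```

In practice a library routine such as `scipy.stats.fisher_exact` would normally be used instead; the hand-written version is shown only so the example runs without extra dependencies.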

Ten Parts of Supplementary Information: 
S1: Biospecimen collection and clinical data
S1.1. Biospecimen collection and clinical data  
S1.2. HPV detection methods  
S1.3. Survival analysis  **
Figure S1.1. HPV status as a function of clinical and molecular characteristics  
Figure S1.2. Receiver operating characteristic (ROC) curves in HPV-associated miRNAs in oropharyngeal HNSCC
Figure S1.3. DNA methylation signatures of HPV
Figure S1.4. Survival analysis for select clinical and genomic variables
Figure S1.5. Survival analysis for platform-specific subtypes 
Table S1.1. Summary of clinical data 
Data file S1.1. Data freeze clinical dataset (This file contains clinical and demographic information for all patients)
Data file S1.2. Summary of HPV detection results (This file contains the results of molecular analyses used to determine the HPV status of all patients. These include data from in situ hybridization, p16 staining, RNA and DNA sequencing, and the MassArray assay.)
Data file S1.3. Mutation signatures by HPV status (This file contains counts of base changes seen in APOBEC and smoking mutation signatures for HPV(+) and HPV(–) subjects. Fisher’s exact test p-values are also shown.)
S2: Copy number analysis  
S2.1. SNP array-based copy number analysis
S2.2. Structural alterations  
Figure S2.1. GISTIC 2.0 analysis of significantly reoccurring focal alteration in 279 HNSCC tumors
Figure S2.2. GISTIC amplification and deletion peaks in lung squamous cell and cervical squamous cell carcinoma
Figure S2.3. Comparison of GISTIC 2.0 analyses of 243 HPV(–) and 36 HPV(+) head and neck tumors
Figure S2.4. Number of copy number segments in HPV(+) and HPV(–) samples
Data file S2.1. GISTIC amplification and deletion peak annotation in head and neck squamous cell (all samples, by HPV status, and by site), lung squamous cell carcinoma, cervical squamous cell carcinoma (results from GISTIC analyses for head and neck (HNSC), lung, and cervical squamous cell carcinoma. Additional analyses for HNSC were conducted by HPV status and tumor site.)
Data file S2.2. Fisher’s exact test p-values for frequency comparisons of significantly reoccurring alterations by HPV status and site (results of analyses of frequency comparisons of significantly reoccurring DNA copy number alterations by HPV status and tumor site. Fisher’s exact test p-values are also shown.)
S3: RNA sequencing  
S3.1. RNA sequencing and expression quantification  
S3.2. RNA-Seq for confirmation of somatic alterations reported in whole exome sequencing  
S3.3. Gene fusion detection  
S3.4. RNA-Seq for gene splicing and viral integration  
Figure S3.1. RNA-Seq for confirmation of somatic alterations reported in whole exome sequencing  
Figure S3.2. FGFR3-TACC3 fusion event
Figure S3.3. EGFR vIII mutant sample
Figure S3.4. Exon 14 skipping in MET
Figure S3.5. Alterations of CDKN2A gene structure, copy number, and expression of its protein coding transcripts p16INK4A and p14ARF
Figure S3.6. Integration of DNA mutation type, copy number, and gene expression for CDKN2A
Figure S3.7. Alterations of FAT1 gene structure, copy number, and expression
Figure S3.8. Integration of DNA mutation type, copy number, and gene expression for FAT1
Figure S3.9. Integration of DNA mutation type, copy number, and gene expression for predicted driver genes relevant to HNSCC
Figure S3.10. Distribution of HPV integration breakpoints across the host genome
Figure S3.11. The KLK12 gene documents recurrent alternate transcription in HNSCC
Figure S3.12. Heterogeneous TP63 isoform usage in HNSCC
Data file S3.1. RNA-Seq predicted fusions (contains the results of fusion detection analyses performed with MapSplice.)
Data file S3.2. Viral integration sites (information about HPV viral integration sites based on the analysis of RNA and DNA sequencing data.)
Data file S3.3. SigFuge clustering results for alternatively spliced genes (the results of SigFuge analyses to detect differential expression of multiple gene isoforms. Uncorrected p-values are shown.)
S4: DNA sequencing: exome and genome  
S4.1. Exome sequencing, high-pass whole genome sequencing, and data processing
S4.2. Mutation validation
S4.3. Low pass whole genome sequencing 
Figure S4.1. Mutation validation counts by allelic fraction for HNSCC  
Figure S4.2. Predicted coding impact by transcript base position and functional domain for selected genes
Data file S4.1. Summary of multiple MUTSIG analyses (the results of MutSig analyses to detect significantly mutated genes. Additional analyses were conducted by gene expression subtype, tumor site, and HPV status. Mutation counts by site and HPV status are also shown, as are Fisher’s exact test p-values.) 
Data file S4.2. Structural aberration calls from BreakDancer and Meerkat  
S5: Molecular Subtypes and Subset Analyses  
S5.1. Detection of previously validated gene expression subtypes in HNSCC and correlation with lung squamous cell carcinoma
S5.2. Validation of selected genomic alterations of the gene expression subtypes
S5.3. Subset analyses by genomic platform
Figure S5.1. Comparison of gene expression patterns in squamous cell carcinomas of the upper aerodigestive tract
Figure S5.2. Comparison of select genes and expression subtype centroids for squamous cell carcinomas of the upper aerodigestive tract
Figure S5.3. DNA copy number in chromosome 7 by gene expression subtype
Figure S5.4. DNA copy number and gene expression of canonical oncogenes in chromosome 3q by gene expression subtype
Figure S5.5. Gene expression heatmap for 37 normal samples
Figure S5.6. miRs that are differentially abundant between tumor and adjacent normal samples
Figure S5.7. miRs that are differentially abundant between HPV(+) and HPV(–) samples
Figure S5.8. miRs that are differentially abundant between different anatomic sites
Data file S5.1. Summary of RNA differential abundance analyses
Data file S5.2. Summary of miRNA differential abundance analyses
Data file S5.3. Epigenetically silenced genes in head and neck squamous cell carcinoma
Data file S5.4. Results of all pair-wise comparisons of DNA methylation levels between tumor sites, HPV(+) smokers and non-smokers, HPV(+) and HPV(–) samples, and oropharynx only HPV(+) and HPV(–) samples
Data file S5.1: This file contains the results of SAM analyses to identify differentially expressed genes. False discovery rate q-values are shown for tumor vs. normal, as well as comparisons based on tumor site, HPV status, and smoking status.
Data file S5.2: This file contains the results of SAMseq analyses to identify differentially expressed miRNAs. False discovery rate q-values and other summary statistics are shown for tumor vs. normal, as well as comparisons based on tumor site, HPV status, smoking status, and miRNA subtype.
Data file S5.3: This file contains the results of analyses that identified epigenetically silenced genes based on gene expression levels in methylated and unmethylated samples. Test statistics, and corrected and uncorrected p-values are also shown, as are correlations of methylation and expression levels.
Data file S5.4: This file contains the results of analyses to identify differentially methylated genes. Test statistics, and corrected and uncorrected p-values are shown for comparisons based on tumor site, HPV status, and smoking status.
S6: Reverse phase protein array analysis 
S6.1. Methods and statistical analysis
Figure S6.1. Protein expression of p16, pRb, and E2F1 by HPV status 
Figure S6.2. RPPA analysis of EGFR as a function of EGFR amplification  
Data file S6.1. RPPA antibodies  
Data file S6.2. Data freeze samples with RPPA data available  
Data file S6.1: This file contains information about the 160 antibodies that were used in the reverse phase protein array analyses.
Data file S6.2: This file lists the barcodes for the n = 200 samples for which reverse phase protein array analyses were performed.
S7: Pathways and integrated analysis  
S7.1. MEMo analysis of co-occurring and mutually exclusive genomic events
S7.2. Genomic aberrations in gene expression subtypes
S7.3. Exploratory clustering / Unsupervised analysis of genomic platforms
S7.4. Supervised integrated analysis of miRNA, gene expression, and copy number
S7.5. Integrated pathway analysis using PARADIGM and PARADIGM-SHIFT
S7.6. Somatic alteration in therapeutic targets
Figure S7.1. Co-occurrence and mutual exclusivity of select genomic events
Figure S7.2. DNA copy number and gene expression in chromosome 11q
Figure S7.3. DNA copy number and gene expression for HLA class 1 and lymphocyte signature genes
Figure S7.4. Unsupervised clustering of reverse phase protein array data by non-negative matrix factorization (NMF) clustering
Figure S7.5. Correlation of RPPA subtypes (by NMF clustering) and mutations
Figure S7.6. Unsupervised clustering of miRNA-Seq data
Figure S7.7. Covariates, EMT scores and differentially abundant miRNAs by unsupervised cluster
Figure S7.8. DNA methylation subtypes are associated with somatic mutations, EMT score, and target gene expression
Figure S7.9. Cluster of clusters analysis
Figure S7.10. Decreased copy number and expression of miR-100-5p and let-7c-5p are correlated with increased CDK6 and E2F1 expression in head and neck cancer
Figure S7.11. Subtypes defined by PARADIGM integrated pathway levels
Figure S7.12. Enriched sub-network for features significantly differentiated between HPV(+) and HPV(–) samples
Figure S7.13. PARADIGM-SHIFT analysis of NFE2L2
Figure S7.14. PARADIGM-SHIFT analysis of NOTCH family genes
Figure S7.15. Diversity and frequency of genetic changes leading to deregulation of signaling pathways and transcription factors in HPV(–), part 1, and HPV(+), part 2, HNSCC
Table S7.1. miRNAs associated with NSD1-depleted/hypomethylated cluster
Table S7.2. Increased mRNA expression associated with decreased miR-100 and let-7c expression in deleted genomic regions
Table S7.3. Copy number loss of miR-100 and let-7c in tumor specimens
Data file S7.1. Associations of integrated genomic events
Data file S7.2. Summary of class labels from different platforms
Data file S7.3. Summary of pathway activation
Data file S7.1: This file contains p-values for the mutual exclusivity modules analyses presented in Figure S7.1. In addition, uncorrected and corrected Fisher’s exact test p-values are shown for the associations presented in Figures 4A and 4B.
Data file S7.2: This file summarizes information about the subtypes identified by the RNA, miRNA, methylation, reverse phase protein array, and PARADIGM analyses. Two-way tables show counts for all pairs of subtypes. Fisher’s exact test p-values are also presented.
Data file S7.3: This file identifies patients that exhibit alterations in the pathways described in Figures 5, S7.15 part 1, and S7.15 part 2. For each patient, specific alterations are shown based on output from the cBioPortal.
S8: DNA methylation profiling 
S9: miRNA sequencing 
Table S9.1. Priorities for resolving annotation ambiguities for aligned miRNA-Seq reads 
S10: Batch effects analysis  
S10.1. Methods  
S10.2. Results by platform  
Figure S10.1. Hierarchical clustering for miRNA expression from miRNA-Seq data  
Figure S10.2. PCA: First two principal components for miRNA expression from miRNA-Seq data, with samples connected by centroids according to batch ID
Figure S10.3. PCA: First two principal components for miRNA expression from miRNA-Seq data, with samples connected by centroids according to tissue source site
Figure S10.4. Hierarchical clustering plot for DNA methylation HM450 data
Figure S10.5. PCA for DNA methylation with samples connected by centroids according to batch ID
Figure S10.6. PCA for DNA methylation with samples connected by centroids according to tissue source site
Figure S10.7. Hierarchical clustering for mRNA expression from RNA-Seq data
Figure S10.8. PCA: First two principal components for RNA-Seq, with samples connected by centroids according to batch ID
Figure S10.9. PCA: First two principal components for RNA-Seq, with samples connected by centroids according to tissue source site
Figure S10.10. Hierarchical clustering for SNP6 data
Figure S10.11. PCA: First two principal components for SNP6, with samples connected by centroids according to batch ID
Figure S10.12. PCA: First two principal components for SNP6, with samples connected by centroids according to tissue source site

