
An integrated tumor, immune and microbiome atlas of colon cancer

Sep 01, 2023

Nature Medicine volume 29, pages 1273–1286 (2023)


The lack of multi-omics cancer datasets with extensive follow-up information hinders the identification of accurate biomarkers of clinical outcome. In this cohort study, we performed comprehensive genomic analyses on fresh-frozen samples from 348 patients affected by primary colon cancer, encompassing RNA, whole-exome, deep T cell receptor and 16S bacterial rRNA gene sequencing on tumor and matched healthy colon tissue, complemented with tumor whole-genome sequencing for further microbiome characterization. A type 1 helper T cell, cytotoxic, gene expression signature, called the Immunologic Constant of Rejection, captured the presence of clonally expanded, tumor-enriched T cell clones and outperformed conventional prognostic molecular biomarkers, such as the consensus molecular subtype and the microsatellite instability classifications. Quantification of genetic immunoediting, defined as a lower number of neoantigens than expected, further refined its prognostic value. We identified a microbiome signature, driven by Ruminococcus bromii, associated with a favorable outcome. By combining the microbiome signature and the Immunologic Constant of Rejection, we developed and validated a composite score (mICRoScore), which identifies a group of patients with excellent survival probability. The publicly available multi-omics dataset provides a resource for better understanding colon cancer biology that could facilitate the discovery of personalized therapeutic approaches.

Although a substantial amount of research has been conducted on biomarkers for primary colon cancer, current clinical guidelines in the United States and Europe (including the National Comprehensive Cancer Network and European Society for Medical Oncology guidelines) rely only on tumor–node–metastasis staging and the detection of DNA mismatch repair (MMR) deficiency or microsatellite instability (MSI), in addition to standard clinicopathological variables, to determine treatment recommendations1,2. MSI is caused by somatic or germline defects of MMR genes and leads to the accumulation of somatic mutations and neoantigens, resulting in immune recognition and a high density of tumor-infiltrating lymphocytes3.

The strength of the in situ adaptive immune reaction, as captured, for example, by assessing the density and spatial distribution of T cells (Immunoscore), is associated with a reduced risk of relapse and death independently of other clinicopathological variables, including MSI status4,5.

However, despite the overwhelming evidence of the prognostic effect of the Immunoscore and other immune-related parameters in colon cancer6,7, the research community has observed a lack of association between gene expression-based estimates of immune response and patient survival in The Cancer Genome Atlas (TCGA) colon adenocarcinoma (COAD) cohort8,9,10. TCGA, for its richness and curation of genomic data, represents the preeminent dataset for omics analyses; however, collecting comprehensive clinical data, including survival outcomes, was neither a primary objective of TCGA nor a practical possibility given its worldwide scope and time constraints11. As such, the limited patient follow-up data associated with TCGA-COAD and other TCGA datasets have hindered statistically rigorous survival analyses11. In addition, TCGA did not include dedicated assays for T cell receptor (TCR) repertoire analysis or microbiome characterization, which were subsequently performed using bulk DNA and RNA sequencing (RNA-seq) data, and it includes only a few healthy solid tissue (for example, healthy colon) samples12,13. Moreover, as TCGA initially focused on cataloging the genomic and molecular alterations occurring in cancer cells, sample inclusion criteria based on stringent tumor purity thresholds were imposed14, potentially skewing the population toward less immune- or stroma-rich tumor samples.

[...] 0.1% in the tumor, which are at least 32 times higher in the tumor compared to normal) are highlighted. i, Correlation of proportion of tumor-enriched T cell clones in the tumor (in percent) with ICR score. Pearson's r and P value of the correlation are indicated in the plot. All P values are two-sided.

[...] >12 per Mb. Overall P value is calculated by log-rank test. c, Scatter-plot of ICR score by genetic immunoediting (GIE) value for ICR-high and ICR-low samples. Number of samples in each quadrant is indicated in the graph. Gray area delineates ICR scores from 5–9. d, Kaplan–Meier for OS by IES. Censor points are indicated by vertical lines and corresponding table of number of patients at risk in each group is included below the Kaplan–Meier plot. Overall P value is calculated by log-rank test. e, Violin plot of IES by productive TCR clonality (immunoSEQ) (left) and MiXCR-derived TCR clonality (right). Spearman correlation statistics are indicated above each plot. Significance within ICR low and high is indicated. Center line, box limits and whiskers represent the median, interquartile range and 1.5× interquartile range, respectively. P values are two-sided, n reflects the independent number of samples.

[...] > 2) (Fig. 5c and annotated in Supplementary Table 5). No major difference in α diversity (the variety and abundance of species within an individual sample) was observed between tumor and healthy samples (Extended Data Fig. 7b) and only a modestly reduced microbial diversity was observed in ICR-high versus ICR-low tumors (Extended Data Fig. 7b). Selenomonas and Selenomonas 3 were the taxa most significantly increased in ICR-high versus -low tumors (Fig. 5e, Extended Data Fig. 7c and Supplementary Table 6). In terms of survival analysis, the highest number of nominally significant associations was obtained using tumor data (rather than healthy colon data) and OS as the end point (Extended Data Fig. 7d and Supplementary Table 7).

[...] >20-fold coverage of at least 99% of targeted exons and >70-fold in at least 81% targeted exons. In healthy samples, sequencing achieved >20-fold coverage of at least 94% of targeted exons and >30-fold in at least 84% targeted exons. Adaptor trimming was performed using the tool trimadap (v.0.1.3). ConPair was run to evaluate concordance and estimate contamination between matched tumor–normal pairs. In eight of the pairs a mismatch was detected and for five pairs, a potential contamination was indicated. HLA typing data were used to validate these results. All potential mismatches and contaminations were excluded, retaining 281 patients for data analysis.

[...] >2 µg) and sample selection was exclusively based on DNA availability. TCR sequencing was performed using extracted DNA of 114 primary tissue samples and ten matched healthy colon tissues with sufficient DNA available.

[...] >0.1% were defined as tumor-enriched sequences, as previously implemented by Beausang et al.75. The fraction of tumor-enriched TCR sequences in the tumor was calculated by dividing the number of productive templates of tumor-enriched sequences by the total number of productive templates per tumor sample. Pearson's correlation coefficient between the fraction tumor-enriched TCR sequences and ICR score was calculated.

[...] >1% in the general population. After these technical exclusion criteria, biological filters were applied, including selection of nonsynonymous mutations (frame shift deletions, frame shift insertions, inframe deletions, inframe insertions, missense mutations, nonsense mutations, nonstop mutations, splice site and translation start site mutations). The resulting number of variants/mutations per Mb (capture size is 40 Mb) per sample is referred to as the nonsynonymous TMB.
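
As a rough illustration of the nonsynonymous TMB definition above, the sketch below counts nonsynonymous variants for one sample and divides by the stated 40 Mb capture size. The function name and the MAF-style label strings are assumptions chosen for illustration; the text lists the mutation categories but not how they are encoded, and the upstream technical filters are not reproduced here.

```python
# Capture size in megabases, as stated in the Methods text.
CAPTURE_SIZE_MB = 40.0

# Assumption: variants carry MAF-style class labels. The categories come from
# the text; these exact label strings are illustrative only.
NONSYNONYMOUS_CLASSES = {
    "Frame_Shift_Del", "Frame_Shift_Ins", "In_Frame_Del", "In_Frame_Ins",
    "Missense_Mutation", "Nonsense_Mutation", "Nonstop_Mutation",
    "Splice_Site", "Translation_Start_Site",
}


def nonsynonymous_tmb(variant_classes, capture_size_mb=CAPTURE_SIZE_MB):
    """Return nonsynonymous mutations per Mb for one sample (hypothetical helper)."""
    if capture_size_mb <= 0:
        raise ValueError("capture size must be positive")
    count = sum(1 for label in variant_classes if label in NONSYNONYMOUS_CLASSES)
    return count / capture_size_mb


# Hypothetical sample: 480 missense calls plus 20 silent calls (not counted).
sample = ["Missense_Mutation"] * 480 + ["Silent"] * 20
print(nonsynonymous_tmb(sample))  # 480 / 40 = 12.0
```

Silent calls are ignored because they are not in the nonsynonymous set, so only the 480 missense calls contribute to the numerator.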
Next, to identify the most frequently mutated genes in our cohort that might play a role in cancer, we excluded variants that are predicted to be tolerated according to SIFT annotation or benign according to PolyPhen (polymorphism phenotyping). Finally, all artifact genes, which are typically encountered as bystander mutations in cancer that are mutated for example as a consequence of a high homology of sequences in the gene, were excluded76. The OncoPlot function from ComplexHeatmap (v.2.1.2) was used to visualize the most frequent somatic mutations.

[...] >5% of the tumor samples) with frequencies detected in previously published datasets containing colon cancer samples (TCGA-COAD and NHS-HPFS) as well as reported cancer driver genes32 or colon oncogenic mediators38. First, we extracted genes with a nonsynonymous mutation frequency >5% in the AC-ICAM cohort. Subsequently, only genes that are likely involved in cancer development, as described in the section ‘Cancer-related gene annotation’, were retained. All artifact genes (mutations typically encountered as bystander mutations in cancer that are mutated for example as a consequence of a high homology of sequences in the gene) were excluded. Genes that have previously been reported as colon cancer oncogenic mediators38 or cancer driver genes for colorectal cancer (COADREAD)32 were also excluded. Finally, only genes with a mutation frequency <5% in the NHS-HPFS colon cancer cohort37 and <5% in TCGA-COAD36 were maintained. As a final filter, only genes that had a nonsynonymous mutation frequency at least twofold higher in AC-ICAM compared to TCGA-COAD were labeled as potentially new in colon cancer.

[...] > 0.4) or MSS (MANTIS score ≤ 0.4).

[...] > 500 nM, were used as criteria to infer neoantigens. Predicted neoantigens were used to calculate the GIE value. We calculated the GIE value by taking the ratio between the number of observed versus the number of expected neoantigens.
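
The observed-to-expected ratio can be sketched as follows. The Methods base the expected count on a linear relationship between TMB and neoantigen number; this example stands in for that with an ordinary least-squares line fitted across the cohort, which is an assumption, since the exact fitting procedure is not given in this excerpt. The function names and the three-sample toy cohort are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("TMB values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gie_values(tmb, neoantigens):
    """Observed / expected neoantigen count per sample (hypothetical helper).

    Assumption: 'expected' is the value of a cohort-wide least-squares line
    of neoantigen count against TMB, evaluated at the sample's TMB.
    """
    slope, intercept = fit_line(tmb, neoantigens)
    values = []
    for t, observed in zip(tmb, neoantigens):
        expected = slope * t + intercept
        # A non-positive expectation makes the ratio meaningless.
        values.append(observed / expected if expected > 0 else math.nan)
    return values


# Toy cohort: the fitted line is y = 3x + 1, so expected counts are 1, 4, 7.
tmb = [0.0, 1.0, 2.0]
neo = [2, 2, 8]
gie = gie_values(tmb, neo)
print(gie)
```

In this toy cohort the second sample has half as many neoantigens as the line predicts (a ratio of 0.5), which under the definition above would count as evidence of immunoediting.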
The expected number of neoantigens was based on the assumption of a linear relationship between TMB and the number of neoantigens. We therefore assumed that samples with a lower frequency of neoantigens than expected (lower GIE values) display evidence of immunoediting. A higher frequency of neoantigens than expected indicates a lack of immunoediting; see the calculations section for details.

[...] >60× coverage per sample. The median (across samples) of the average target coverage (per sample) was 76× (range of 50–92).

[...] > ±0.3. Clusters among the networks (groups of at least three correlated genera using the cutoffs specified above) were defined via a fast greedy clustering algorithm. All co-occurrence networks were made using the R package ‘NetCoMi (v.1.1.0) – Network Construction and Comparison for Microbiome Data’84 and visualized using Cytoscape (v.3.9.1).

[...] >0) and ‘low-risk’ (<0) groups as performed in the training set. Therefore, no cutoff optimization occurred in the validation phase.

[...] >2 μg). Securing additional funds allowed us to perform WGS and 16S rRNA sequencing and to expand the WES and TCR analyses to any sample with sufficient DNA available. No specific power calculation was performed at that time and the targeted sample size was based on the estimated number of samples that could be retrieved from LUMC (n = 400), which compared favorably with the sample size of similar studies in the field.

[...] >90% to detect a 10% mutational frequency in 90% of genes86.

[...] >80% for an HR of 0.5 with a two-sided α of 0.05. With 154 OS events in the whole cohort, our study has a power of 90% for an HR of 0.59 (assuming two groups of equal size) and a power of 90% for an HR of 0.57 (assuming groups with unequal sample size, 2:1) with a two-sided α of 0.05.
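
The power figures quoted for 154 OS events can be checked approximately with Schoenfeld's formula for the log-rank test. Whether the authors used this particular approximation is not stated in the excerpt, so treat the sketch as an independent sanity check; the function name and its parameters are illustrative.

```python
from math import log, sqrt
from statistics import NormalDist


def logrank_power(events, hazard_ratio, fraction_group1=0.5, alpha=0.05):
    """Approximate two-sided log-rank power (Schoenfeld approximation).

    events          -- total number of observed events
    hazard_ratio    -- hazard ratio to be detected
    fraction_group1 -- share of patients in the first group (0.5 = equal sizes)
    """
    if not 0 < fraction_group1 < 1:
        raise ValueError("fraction_group1 must be strictly between 0 and 1")
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)
    p1, p2 = fraction_group1, 1 - fraction_group1
    # Standardized effect size: sqrt(d * p1 * p2) * |ln(HR)|.
    z_effect = sqrt(events * p1 * p2) * abs(log(hazard_ratio))
    return normal.cdf(z_effect - z_alpha)


equal_groups = logrank_power(154, 0.59)                 # two groups of equal size
unequal_groups = logrank_power(154, 0.57, 2.0 / 3.0)    # 2:1 allocation
print(equal_groups, unequal_groups)
```

Under this approximation both scenarios come out close to 0.90, in line with the 90% power stated in the text.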

[...] 0.1% in the tumor, that are at least 32 times more abundant in the tumor compared to the normal.

[...] >12/Mb) versus Low (<12/Mb) TMB. b, Same as a, but only including ICR Medium. c, Kaplan–Meier curves for OS by GIE status. d, Same as c in ICR Medium patients. Overall P value is calculated by log-rank test and P value corresponding to HR is calculated using Cox proportional hazards regression (a–d). e, Stacked bar charts of mutational load category (top) and MSI status (bottom) per IES. f, Kaplan–Meier curves for OS (left) and PFS (right) stratified by AJCC pathological stage (I, II, III) within IES4. Stratification was not performed for stage IV due to the limited number (n = 2). g, Stacked bar chart of distribution of AJCC Pathological Tumor Stage by IES. h, Multivariate Cox proportional hazards model for OS including IES (ordinal, IES1, IES2, IES3, IES4) and AJCC Pathological Tumor Stage (ordinal, Stage I, II, III, IV). P values corresponding to HR calculated by Cox proportional hazards regression analysis are indicated. i, Violin plot represents TCR clonality as determined by MiXCR in ICR Medium samples. Center line, box limits, and whiskers represent the median, interquartile range and 1.5× interquartile range, respectively. P value calculated by unpaired, two-sided t-test. j, Results of the multiple linear regression model showing the respective contributions of productive TCR clonality (X1) and (X2) for prediction of IES (Y). Corresponding significance of the effects are indicated in the scatter-plots (left). k, Local Polynomial Regression Fitting of productive TCR clonality by IES (ordinal variable). The gray band reflects the 95% confidence interval for predictions of the local polynomial regression model. All P values are two-sided; n reflects the independent number of samples in all panels. Overall Survival (OS). Tumor Mutational Burden (TMB). Genetic Immunoediting (GIE). ImmunoEditing Score (IES).

[...] > 0). d, Concordance index of optimal multivariate Cox regression model per dataset. The cross-validation performance highlights the mean concordance of 10 different folds with the optimal hyperparameters (gamma and lambda), that is, the same parameters as the optimal model. e, Forest plot with HR (center), corresponding 95% confidence intervals (error bars), and P value calculated by Cox proportional hazards regression analysis for OS, using: 1) the 16S MBR score in AC-ICAM, 2) WGS R. bromii abundance, 3) PCR-based R. bromii abundance, 4) 16S Ruminococcus 2 relative abundance and 5) MBR score calculated using WGS data. f, Heat map of Spearman correlation between the relative abundance of the MBR classifier taxa in tumor samples and immune traits. Only correlations with an FDR > 0.1 are visualized. An additional row is added for Ruminococcus 2 showing all correlations, unfiltered for FDR. * The taxonomical order is indicated between brackets, as family was unassigned. g, Kaplan–Meier curve for PFS in AC-ICAM, with all patients stratified by mICRoScore High vs Low. HR and P value are calculated using Cox proportional hazards regression. h, AJCC pathological stage within the mICRoScore High group in AC-ICAM and within TCGA-COAD. i, Kaplan–Meier curve for PFS in AC-ICAM, with all patients with ICR High stratified by mICRoScore. Overall P value is calculated by log-rank test and P value corresponding to HR is calculated using Cox proportional hazards regression. Overall Survival (OS), Progression-Free Survival (PFS). All P values are two-sided; n reflects the independent number of samples in all panels.