```{r source}
source("src/libs.R")
source("global_variables.R", local = knitr::knit_global())
source("src/process_authors.R")
```
```{r process_authors}
parsed_authors <- process_authors("authors.yaml", with_degrees = FALSE)
```
#
## Abstract {.unnumbered}
`r paste0(parsed_authors$abstract, collapse = " \n")`
\newpage
```{=latex}
\begin{runninglinenumbers}
```
## Introduction {.unnumbered}
Diffuse large B-cell lymphoma (DLBCL) is an aggressive and heterogeneous lymphoma for which standard-of-care R-CHOP (rituximab with cyclophosphamide, vincristine, doxorubicin, and prednisone) immunochemotherapy results in long-term remission in more than 60% of patients. [@sehnDiffuseLargeBcell2021] However, outcomes are poor for the 30-40% of patients with primary refractory or relapsed disease (rrDLBCL) even after salvage therapy and autologous stem cell transplant (ASCT). [@roviraPrognosisPatientsDiffuse2015; @crumpOutcomesRefractoryDiffuse2017] The landscape of coding and non-coding somatic variants in DLBCL at diagnosis is well established,[@morinFrequentMutationHistonemodifying2011; @pasqualucciAnalysisCodingGenome2011; @arthurGenomewideDiscoverySomatic2018; @schmitzGeneticsPathogenesisDiffuse2018; @chapuyMolecularSubtypesDiffuse2018] and several studies have examined the mutational landscape of cohorts of rrDLBCL to compare the post-treatment genetic landscape to that of diagnostic DLBCL, identifying genes that are mutated more frequently in rrDLBCL, such as *MS4A1*, *TP53*, *NFKBIE*, *FOXO1*, *CREBBP*, and *KMT2D*.[@rushtonGeneticEvolutionaryPatterns2020; @morinGeneticLandscapesRelapsed2016; @mareschalWholeExomeSequencing2016; @trinhAnalysisFOXO1Mutations2013] Although several of these mutations are prognostic at diagnosis for the likelihood of relapse, they are insufficient to explain the poor outcomes experienced by rrDLBCL patients.
Tumor evolution is usually considered to follow one of two models: linear or branching evolution. Linear evolution is inferred when only the relapse tumor harbors exclusive variants not found at diagnosis, implying direct descent of the relapse from the diagnostic tumor. Branching evolution is characterized by exclusive variants in both diagnostic and relapse tumors. In the transformation of follicular lymphoma (FL) to aggressive DLBCL (tFL), this branching pattern of evolution is considered evidence of a persistent common precursor cell (CPC) that is ancestral to both lymphomas.[@kridelHistologicalTransformationProgression2016a; @okosunIntegratedGenomicAnalysis2014] To date, studies of DLBCL tumor evolution have leveraged circulating tumor DNA (ctDNA) and/or limited targeted capture space to examine the evolutionary dynamics of relapse in small cohorts, providing some evidence that branching evolution predominates even when tumor pairs are genetically very similar.[@rushtonGeneticEvolutionaryPatterns2020; @schererDistinctBiologicalSubtypes2016; @leeMutationalProfileClonal2021; @juskeviciusDistinctGeneticEvolution2016] However, the degree to which persistent CPC populations might contribute to DLBCL relapse is not yet known.
Critically, no studies have yet examined the evolution of the mutation landscape together with overall tumor biology, which can be evaluated through gene expression profiling (GEP)-based cell-of-origin (COO)[@alizadehDistinctTypesDiffuse2000; @scottDeterminingCelloforiginSubtypes2014] and dark-zone signature (DZsig)[@ennishiDoubleHitGeneExpression2018; @alduaijMolecularDeterminantsClinical2022] classification. More recently, genetics-based classifiers have been developed that leverage co-occurrence of somatic variants to identify shared biology within DLBCL. Intriguingly, the three studies that described genetics-based groups converged on 5-7 highly overlapping subgroups.[@schmitzGeneticsPathogenesisDiffuse2018; @chapuyMolecularSubtypesDiffuse2018; @wrightProbabilisticClassificationTool2020; @lacyTargetedSequencingDLBCL2020; @morinMolecularProfilingDiffuse2022] The LymphGen algorithm is currently the only publicly available tool for assigning genetics-based subgroups to an individual biopsy.[@wrightProbabilisticClassificationTool2020] These classification systems are becoming the foundation for precision medicine in DLBCL, and while the current assumption is that the features that underlie the classification of each tumor would be fixed over time, this requires formal testing.
Here, we examined a large population-based cohort of rrDLBCL and confirmed that response rate and outcomes to salvage (immuno)chemotherapy and ASCT are superior for patients with late relapses relative to primary refractory or early relapse. To examine the genetic and evolutionary relationships between diagnostic and rrDLBCL underlying these clinical differences, we assembled a cohort of `r all_summary$total_patients[1]` patients with multiple DLBCL biopsies, and interrogated them with a combination of fluorescence *in situ* hybridization (FISH) for recurrent rearrangements, GEP for COO and DZsig, and/or whole genome (WGS, `r num_per_seq_type["genome"]` patients) or whole exome sequencing (WES, `r num_per_seq_type["capture"]` patients) of two or more tumors per patient. Clonal evolution analyses showed an association between the time to relapse and genetic divergence, with late relapses exhibiting a pattern of deep branching evolution. This also revealed an unexpected pattern of convergent evolution among divergent tumor pairs. Our findings of divergent evolution in late relapses suggest that these are effectively *de novo* aggressive disease, therefore retaining chemosensitivity and driving superior outcomes in this group.
## Results {.unnumbered}
### Late relapses have superior outcomes
Considering prior observations that outcomes to salvage therapies are related to progression/relapse timing,[@gisselbrechtSalvageRegimensAutologous2010; @wangLateRelapsesPatients2019] we first sought to confirm this observation in a large population-based patient cohort ("outcomes cohort"). We identified 221 patients with *de novo* DLBCL treated with frontline R-CHOP(-like) therapy who experienced DLBCL progression or relapse (Table S1-2). All patients received salvage chemotherapy (89% received GDP (gemcitabine, dexamethasone, cisplatin) +/- rituximab)[@crumpRandomizedComparisonGemcitabine2014] with intention-to-treat with consolidative ASCT in patients with (immuno)chemotherapy responsive disease. Patients were categorized into three relapse timing categories: primary refractory disease was defined as progression or relapse within 9 months of diagnosis, approximating 3 months post-end of treatment and consistent with the definition provided by Hitz et al.[@hitzOutcomePatientsPrimary2015] Late relapses were defined as more than 24 months after diagnosis, with this timing reflecting the definition of EFS24, a validated end point in which patients event-free 24 months following immunochemotherapy collectively have superior disease-related outcomes.[@maurerEventfreeSurvival242014] Early relapses were defined as those occurring between 9 and 24 months from diagnosis. We found significant differences in both response rates ([@fig-1]A) and the proportion of patients who ultimately received consolidative ASCT ([@fig-1]B), demonstrating superior (immuno)chemosensitivity of tumors of patients experiencing late relapses. Patients experiencing late relapse had significantly superior progression-free survival (PFS) and overall survival (OS) relative to patients who experienced primary refractory or early relapse when considering either time from first progression/relapse ([@fig-1]C-D) or from receipt of ASCT ([@fig-1]E-F).
Outcome differences persisted after adjusting for age at diagnosis and IPI at relapse (Extended Data Figure 1). Differences in outcomes were driven by both the proportion of patients receiving ASCT and the outcomes following ASCT. While there was no significant difference between the proportion of patients with early and late relapse receiving ASCT, post-ASCT outcomes were significantly superior in the late relapse group. The overall superior outcomes from the time of progression in the early relapse vs primary refractory group were related to the greater proportion of patients receiving ASCT, as there were no differences in post-ASCT outcomes between these groups.
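
The three relapse timing categories defined above can be illustrated with a minimal R sketch. The function name `classify_relapse_timing` and its input (months from diagnosis to first progression or relapse) are hypothetical and not part of the analysis code for this manuscript. How events falling exactly at 9 or 24 months are assigned is an assumption made here for illustration.

```r
# Hypothetical helper: assign a relapse timing category from the number of
# months between diagnosis and first progression/relapse.
# Assumption: events at exactly 9 months count as primary refractory and
# events at exactly 24 months count as early relapse.
classify_relapse_timing <- function(months_to_relapse) {
  cut(
    months_to_relapse,
    breaks = c(-Inf, 9, 24, Inf),
    labels = c("Primary Refractory", "Early Relapse", "Late Relapse"),
    right = TRUE  # intervals are (-Inf, 9], (9, 24], (24, Inf)
  )
}

# Toy values, not patient data
timing <- classify_relapse_timing(c(4, 9, 15, 24, 60))
table(timing)
```

Using `cut()` keeps the category order fixed as a factor, which is convenient when the same ordering is needed for tables and survival plots.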
![`r paste0(parsed_authors$figure_legends[1])`](figures_files/figure-pdf/fig-1-1-edited.pdf){#fig-1 fig-pos='h'}
### Late relapses are highly divergent
To examine the underlying tumor biology and patterns of evolution driving the superior outcomes observed in late relapses, we assembled a cohort of patients who experienced relapse with available serial DLBCL biopsies ("molecular characterization cohort"). While a uniform curative treatment approach at time of relapse was important to demonstrate a relationship between relapse timing and (immuno)chemosensitivity in the outcomes cohort, the criterion for inclusion in the molecular characterization cohort was sufficient material from multiple biopsies (and matched constitutional DNA for WGS/WES). Thus, patients who were not treated with intention to consolidate with ASCT were included along with patients with prior indolent lymphoma as long as multiple DLBCL biopsies were available. The distribution of time to relapse in the molecular characterization cohort differs from that observed in the outcomes cohort, reflecting historic patterns of obtaining a biopsy to confirm relapse less frequently in primary refractory disease and availability of constitutional DNA. In total, `r all_summary["total_patients"]` patients were identified, of which 29 had prior follicular lymphoma (FL; 22.5%), two had prior marginal zone lymphoma (MZL; 1.6%), and one had prior chronic lymphocytic leukemia (CLL; 0.8%). Among the `r all_summary["is_denovo"]` patients with apparently *de novo* DLBCL at diagnosis, 11 had a subsequent FL diagnosis (11.3%) and four had a subsequent MZL (including extranodal MZL of mucosa-associated lymphoid tissues (MALT); 4.1%). A total of six patients had discordant low-grade bone marrow lymphoid infiltrates at diagnosis and two at relapse.
Depending on tissue availability, each pair of biopsies was interrogated with a combination of FISH for *MYC*, *BCL2*, and/or *BCL6* rearrangements, digital GEP (NanoString DLBCL90) for COO and DZsig classification,[@ennishiDoublehitGeneExpression2019; @scottDeterminingCelloforiginSubtypes2014] and/or WGS or WES (Extended Data Figure 2 and Table S3-4). Of the 129 total patients, 73 had sufficient material for WGS (N=68) or WES (N=5) ([@fig-2]), and `r overlap["total"]` were also included in the outcomes cohort (`r overlap["Primary Refractory"]` primary refractory, `r overlap["Early Relapse"]` early relapses, and `r overlap["Late Relapse"]` late relapses). Of these 73 patients, two late relapse patients had prior DLBCL biopsies that were not available for sequencing, while the remainder of patients had the first and at least one subsequent DLBCL biopsy sequenced.
![`r paste0(parsed_authors$figure_legends[2])`](figures_files/figure-pdf/fig-2-1.pdf){#fig-2 height=80% fig-pos='p'}
The use of FFPE tissues for most samples resulted in variable sequencing depth (mean `r round(mean_coverage["genome", "mean_depth"], digits = 1)`X across WGS samples and `r round(mean_coverage["capture", "mean_depth"], digits = 1)`X in exomes; Extended Data Figure 3A and Table S5). Although tumors in the late relapse category had significantly lower depth of coverage on average, there was no correlation of total mutation burden with coverage (Extended Data Figure 3B), indicating that sequencing depth was sufficient to comprehensively detect clonal variants. We also performed deep targeted DNA sequencing of a panel of genes relevant for LymphGen classification ("LySeqST", Table S6) on multiple biopsies subjected to WGS from `r lst_count["two_or_more_lst"]` patients and on a single biopsy from another `r lst_count["one_lst"]` patients. The LySeqST assay achieved a mean depth of `r round(lst_coverage)`X across its capture space (Extended Data Figure 4A and Table S7). The lower variant allele frequencies (VAFs) of variants detected by LySeqST alone vs. genomes demonstrate that the assay enhanced detection of subclonal variants that fall below the limit of detection of WGS (Extended Data Figure 4B).
Next, we explored the overall divergence of mutations by comparing the number of shared (common between both biopsies) and exclusive (present in only one biopsy) mutations between the first two DLBCL biopsies in each patient. For this and all subsequent analyses, we pooled the LySeqST and WGS variant calls and only retained variants at positions with a sequencing depth of at least 10 unique molecules in all tumors from the same patient. While primary refractory and early relapse tumors have a rich landscape of variants shared between tumors ([@fig-3]A middle), many late relapses have few, with most mutations exclusive to either the diagnostic or relapse biopsy ([@fig-3]A top and bottom). In both primary refractory and early relapse disease, the number of mutations shared between tumors is strongly correlated with the total number of variants identified at either diagnosis or relapse with slopes nearing unity, demonstrating that most variants are shared between tumors ([@fig-3]B). In contrast, the number of shared variants in late relapses is only weakly correlated to the total mutation burden in either tumor ([@fig-3]B). Comparing the percentage of exclusive variants in each tumor to the time between biopsies reveals a clear linear trend, where tumor pairs separated by many years have very few shared variants ([@fig-3]C and Table S8). This trend is consistent when considering time to relapse as a categorical variable (Extended Data Figure 5), when the absolute number of exclusive mutations is considered (Extended Data Figure 6), and is independent of genome coverage (Table S9). The linear relationship between exclusive variants and relapse timing was also consistent when transformed FL tumors were considered separately from *de novo* DLBCL (Extended Data Figure 7). These results are consistent with a branching model of evolution, where late relapse tumor pairs arise from a distant common ancestor harboring few lymphoma-defining mutations.
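
The shared versus exclusive comparison described above can be sketched in a few lines of base R. This is an illustrative sketch only: the function `summarise_pair`, the variant key format, and the toy depth values are hypothetical and do not reproduce the variant-calling pipeline used in this study. It assumes each variant is represented by a unique key and that a depth value is available for every key in both tumors.

```r
# Hypothetical sketch: count shared and exclusive variants for one tumor pair,
# keeping only positions covered by at least `min_depth` molecules in both tumors.
summarise_pair <- function(dx_variants, rl_variants, dx_depth, rl_depth,
                           min_depth = 10) {
  all_variants <- union(dx_variants, rl_variants)
  # Depth vectors are named by variant key, so they can be indexed by name
  covered <- all_variants[dx_depth[all_variants] >= min_depth &
                            rl_depth[all_variants] >= min_depth]
  dx <- intersect(dx_variants, covered)
  rl <- intersect(rl_variants, covered)
  n_shared <- length(intersect(dx, rl))
  n_excl_dx <- length(setdiff(dx, rl))
  n_excl_rl <- length(setdiff(rl, dx))
  data.frame(
    shared = n_shared,
    exclusive_dx = n_excl_dx,
    exclusive_rl = n_excl_rl,
    pct_exclusive_dx = 100 * n_excl_dx / length(dx),
    pct_exclusive_rl = 100 * n_excl_rl / length(rl)
  )
}

# Toy data (not from the cohort): variant keys and per-tumor depth at each key
keys <- c("chr1:100:A>T", "chr2:200:G>C", "chr3:300:C>T",
          "chr4:400:T>G", "chr5:500:G>A")
dx_depth <- setNames(c(30, 25, 12, 8, 40), keys)
rl_depth <- setNames(c(28, 22, 15, 35, 9), keys)
dx_variants <- keys[c(1, 2, 3, 4)]
rl_variants <- keys[c(1, 5, 4)]

pair_summary <- summarise_pair(dx_variants, rl_variants, dx_depth, rl_depth)
pair_summary
```

Applying the depth filter before comparing the two call sets matters: a variant that is simply uncovered in one tumor would otherwise be counted as exclusive to the other.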
![`r paste0(parsed_authors$figure_legends[3])`](figures_files/figure-pdf/fig-3-1-edited.pdf){#fig-3 height=80% fig-pos='p'}
Given the high degree of divergence observed in some late relapse tumors, we used RNAseq data to identify functional expressed IG receptor rearrangements and confirm clonal relatedness of tumor pairs (Table S8). All `r mixcr_list$primary_refractory_IGH$total` primary refractory and `r mixcr_list$early_relapse_IGH$total` early relapse patients had concordant IGHV gene usage, while one of `r mixcr_list$late_relapse_IGH$total` late relapses was discordant ([@fig-3]D). Light chain rearrangements were more frequently discordant, which may suggest ongoing receptor editing ([@fig-3]E).[@collinsImmunoglobulinLightChain2018] The lone patient with discordant heavy chain rearrangements also had discordant light chain rearrangements, suggesting these tumors are not clonally related and the second DLBCL is effectively *de novo*.
### Temporal dynamics of structural variants
Rearrangements of *MYC*, *BCL2*, and *BCL6* are important drivers of aggressive lymphoma biology and contribute to disease and genetics-based classification.[@wrightProbabilisticClassificationTool2020; @alaggio5thEditionWorld2022a; @campoInternationalConsensusClassification2022] *BCL2* rearrangement status was concordant in all patients tested ([@fig-4]A and Table S3), consistent with the established origin of *BCL2* rearrangements during V(D)J recombination in early B cell differentiation.[@tsujimoto1418Chromosome1985] In `r sv_list["bcl2_ba", "Concordant"]` patients where WGS identified *BCL2* breakpoints in two or more tumors, the breakpoints were always identical (Table S11).
![`r paste0(parsed_authors$figure_legends[4])`](figures_files/figure-pdf/fig-4-1.pdf){#fig-4 height=80% fig-pos='p'}
In contrast, *MYC* rearrangement status was discordant between timepoints in `r total_discordant["MYC", "total_discordant"]`/`r total_discordant["MYC", "total_assay"]` patients tested (`r total_discordant["MYC", "percent_discordant"]`%), and for *BCL6* in `r total_discordant["BCL6", "total_discordant"]`/`r total_discordant["BCL6", "total_assay"]` patients tested (`r total_discordant["BCL6", "percent_discordant"]`%) with BA FISH ([@fig-4]A). As a proportion of patients in which any tumor harbored a rearrangement, the rate of discordance is substantial, with `r total_discordant["MYC", "percent_discordant_pos"]`% of `r total_discordant["MYC", "ever_positive"]` *MYC*-translocated patients discordant between tumors and `r total_discordant["BCL6", "percent_discordant_pos"]`% of `r total_discordant["BCL6", "ever_positive"]` *BCL6*-translocated patients. Interestingly, in all 10 patients where *BCL6* rearrangements were identified by WGS at multiple timepoints, the breakpoints were identical. Persistent *BCL6*::IGH rearrangements were only found in primary refractory and early relapse, while all late relapses involved persistent *BCL6* rearrangements with non-IG partners.
*MYC* breakpoints were identified by WGS in multiple tumors from 6 patients, one of which was cryptic to FISH.[@hiltonDoublehitSignatureIdentifies2019] Two patients, both late relapses, had different translocation partners at diagnosis and relapse ([@fig-3]B and Table S8), demonstrating independent acquisition of *MYC* translocations in each tumor on the background of the existing *BCL2* translocation in both of these patients, making all of these high-grade B-cell lymphoma with *MYC* and *BCL2* rearrangements. A third late relapse patient had an identical *BCL6*::*MYC* translocation in both tumors, and two patients with early relapse and one with primary refractory disease also had identical *MYC* breakpoints in both tumors. These findings suggest that in patients experiencing late relapse, the *MYC*-translocated aggressive lymphoma is effectively eradicated by treatment, while the indolent CPC harboring a *BCL2* translocation or other variants can persist for many years, and new *MYC* translocations may occur on development of a subsequent aggressive lymphoma. That these late relapses with discordant *MYC* rearrangements both harbored *MYC*::IGH translocations at diagnosis, while the late relapse with a concordant *MYC* rearrangement had a persistent *BCL6*::*MYC* translocation, is consistent with previous studies suggesting that *MYC* rearrangements involving the immunoglobulin (IG) loci have a more aggressive phenotype than those with non-IG rearrangements,[@alduaijMolecularDeterminantsClinical2022; @rosenwaldPrognosticSignificanceMYC2019] indicating that a CPC may harbor such non-IG *MYC* rearrangements and acquire additional variants at relapse to reproduce the aggressive phenotype. Our findings indicate that *BCL6* rearrangements follow a similar pattern.
### Biological consistency of tumor pairs
We next evaluated the consistency of molecular subgroups using GEP and LymphGen over time. First, using digital GEP (the NanoString DLBCL90 assay)[@scottDeterminingCelloforiginSubtypes2014; @ennishiDoubleHitGeneExpression2018] to compare COO and DZsig from `r d90_assays` patients and considering only frank changes in COO classification (*i.e.* a switch from GCB to ABC or *vice versa*), we observed a high level of concordance between diagnosis and relapse (Table S3). None of `r coo_list["Primary Refractory", "total_assayed"]` primary refractory patients, only one of `r coo_list["Early Relapse", "total_assayed"]` early relapse patients (`r coo_list["Early Relapse", "percent_discordant"]`%), and only `r coo_list["Late Relapse", "Discordant"]` of `r coo_list["Late Relapse", "total_assayed"]` late relapses (`r coo_list["Late Relapse", "percent_discordant"]`%) were discordant ([@fig-4]C). Comparison of the NanoString linear predictor scores (LPS) between timepoints revealed a weaker correlation in late relapse patients, possibly indicating additional biological divergence not captured by this binary classification ([@fig-4]D). A similar trend was observed in DZsig scores applied to GCB or COO-unclassified tumors (Extended Data Figure 8).
To evaluate consistency in genetic subgroup assignment, we compared LymphGen classifications across diagnostic/relapse tumor pairs. In total, this yielded a genetic classification for 80% of tumors. In all relapse timing categories, LymphGen classifications were highly concordant, with discordance mainly occurring in patients with overlapping composite or "Other" (not assigned to any subgroup with sufficient confidence) classifications ([@fig-4]E, Table S12). However, there was a single early relapse patient out of `r sum(lg_table["Early Relapse", ])` (`r round(lg_table["Early Relapse", "Frank"]/sum(lg_table["Early Relapse", ])*100, digits = 1)`%) with a frank discordance (BN2 to MCD), and `r lg_table["Late Relapse", "Frank"]` discordant cases among `r sum(lg_table["Late Relapse", ])` late relapses (`r round(lg_table["Late Relapse", "Frank"]/sum(lg_table["Late Relapse", ])*100, digits = 1)`%). Thus, there was high consistency of both GEP and genetic classification despite the mutational divergence observed in late relapses.
### Convergent evolution in divergent pairs
The relative consistency of molecular subgroups as proxies for tumor biology is at odds with our observation that late relapses share relatively few mutations with the diagnostic tumor. To reconcile these differences, we performed phylogenetic analyses of all available tumors from each patient, leveraging all coding mutations alongside non-coding mutations in regions known to be affected by aberrant somatic hypermutation (aSHM; Table S13).[@arthurGenomewideDiscoverySomatic2018] In primary refractory tumors, the vast majority of somatic variants are found in the shared phylogenetic "trunk" (clonal in both tumors; [@fig-5]A). In early relapse tumors, the trunk is comparatively smaller and branching evolution gives rise to exclusive mutations in both diagnostic and relapse tumors ([@fig-5]B). In late relapses, few mutations are in the trunk, with substantial divergence on each branch ([@fig-5]C-D). In one patient described earlier, the trunk comprised a single shared coding mutation ([@fig-5]E). This patient had 7.5 years between diagnosis and relapse, is among only 5 tumors with frank discordance of LymphGen and COO classifications, and has different inferred IGH and IGK/L rearrangements ([@fig-3]D-E), together providing strong evidence that these tumors are not clonally related and thus arose independently.
![`r paste0(parsed_authors$figure_legends[5])`](figures_files/figure-pdf/fig-5-1.pdf){#fig-5 height=80% fig-pos='p'}
In addition, we noted a tendency for divergent tumor pairs to harbor exclusive mutations among the same genes. In the representative MCD-classified early relapse tumor pair, each tumor has independently acquired mutations in *BTG1*, *PIM1*, and *ETV6* ([@fig-5]B); the representative late relapse BN2 tumor pair in *CD70* and *STAT3* ([@fig-5]C); and the representative late relapse EZB tumor pair in *FOXO1*, *KLHL6*, *BTG2*, and *MYC* ([@fig-5]D). To examine this phenomenon more broadly, we used variant calls from `r length(unique(divergent$patient_id))` patients (`r divergent_count["Early Relapse"]` early relapse, `r divergent_count["Late Relapse"]` late relapse) with divergent patterns of evolution, defined as having at least 25% of mutations exclusive to each tumor, and identified truncal (shared among all tumors from the same patient) and exclusive mutations. In total, 28 genes had two or more truncal mutations ([@fig-6]A and Table S14). Some genetic subgroup-defining mutations are among these, including *MYD88*^L265P^ mutations in 4/5 mutated patients and *CREBBP* KAT domain mutations in 5/5 mutated patients. Loci affected by aSHM, including *BCL2*, *IGLL5*, and *BTG2*, had a high number of both truncal and exclusive mutations, suggesting that aSHM can be an early shared event but continues after divergence. Other genetic subgroup-defining mutations were less frequently truncal, including *NOTCH2* PEST domain truncating mutations (1/3), *EZH2*^Y646^ (0/2), and *TET2* mutations (2/7). As individual mutations such as *EZH2*^Y646^ may be considered for treatment stratification or prognosis, this finding underscores the importance of re-characterizing late relapses.
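
The divergence threshold used above can be expressed as a short R sketch. The function `is_divergent` and the toy counts are hypothetical. The sketch also assumes one particular reading of the definition, namely that at least 25% of the mutations detected in each tumor of a pair are exclusive to that tumor; the analysis code for this manuscript may apply the threshold differently.

```r
# Hypothetical sketch: flag a tumor pair as divergent when at least `threshold`
# of the mutations in EACH tumor are exclusive to that tumor.
# Assumption: the fraction is computed per tumor as exclusive / (shared + exclusive).
is_divergent <- function(n_shared, n_exclusive_dx, n_exclusive_rl,
                         threshold = 0.25) {
  frac_dx <- n_exclusive_dx / (n_shared + n_exclusive_dx)
  frac_rl <- n_exclusive_rl / (n_shared + n_exclusive_rl)
  frac_dx >= threshold & frac_rl >= threshold
}

# Toy counts for three made-up tumor pairs (not patient data)
pairs <- data.frame(
  patient_id = c("P1", "P2", "P3"),
  n_shared = c(90, 40, 5),
  n_exclusive_dx = c(5, 20, 60),
  n_exclusive_rl = c(10, 10, 80)
)
pairs$divergent <- with(
  pairs,
  is_divergent(n_shared, n_exclusive_dx, n_exclusive_rl)
)
pairs
```

In this toy example the second pair is not flagged because only the diagnostic tumor exceeds the threshold, which shows why the requirement is applied to both tumors rather than to the pair as a whole.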
![`r paste0(parsed_authors$figure_legends[6])`](figures_files/figure-pdf/fig-6-1-edited.pdf){#fig-6 height=80% fig-pos='p'}
In total, `r length(constrained$patient_id)` of `r length(unique(divergent$patient_id))` patients with divergent patterns of tumor evolution had tumors that independently acquired mutations in two or more of the same lymphoma-related genes (Table S15-16). *KMT2D*, *PIM1*, *ACTB*, and *ETV6* were among genes most frequently recurrently mutated in tumor pairs ([@fig-6]B). We compared LymphGen classifications of patients in which these mutations recurred to the LymphGen subgroup with which each feature is associated. Some features, such as MCD-related *PIM1* and *ETV6*, recurred in patients with MCD-classified tumors, while *ACTB*, associated with the ST2 LymphGen class, only recurred in patients without ST2-classified tumors ([@fig-6]B). Lastly, we examined the relationship between LymphGen classifications and prior or subsequent low-grade disease. The genetic similarities between different LymphGen classes and different low-grade mature B-cell lymphomas have been noted previously, and it has been speculated that this similarity reflects shared evolutionary history and CPC features in each LymphGen subgroup-indolent lymphoma pairing.[@morinMolecularProfilingDiffuse2022; @wrightProbabilisticClassificationTool2020] As expected based on this model, patients with FL at any time in their disease course had DLBCL tumors predominantly classified as EZB, while the few MZL/MALT lymphomas occurred in patients with BN2- and ST2-classified DLBCLs ([@fig-6]C). Overall, our findings are consistent with a shared CPC origin for DLBCL at diagnosis, relapse and any prior or subsequent indolent disease, and indicate that the few mutations in the CPC population constrain the possible genetic features acquired in each tumor, resulting in biologically consistent disease even in patients where the DLBCLs share few somatic variants.
## Discussion {.unnumbered}
By leveraging multiple metrics of tumor evolution including cytogenetics, GEP, and unbiased genome- and exome-wide sequencing, we have established distinct patterns of tumor evolution that correlate strongly with the timing of DLBCL relapse. The high rate of mutations exclusive to both diagnostic and relapse biopsies shows that branching evolution predominates in late relapses, strongly supporting the existence of persistent CPC populations capable of giving rise to multiple DLBCLs over time. However, GEP- and genetics-based classifications remained largely consistent, suggesting that the earliest clonal mutations in a CPC constrain the biology of the subsequent DLBCL(s). This constrained evolution may be the basis for the remarkable convergence of the three studies that have defined the genetic subgroups of DLBCL.[@schmitzGeneticsPathogenesisDiffuse2018; @chapuyMolecularSubtypesDiffuse2018; @lacyTargetedSequencingDLBCL2020; @morinMolecularProfilingDiffuse2022] Subgroup-defining mutations used in the LymphGen classification were sometimes among the inferred CPC mutations identified, while others were not consistently clonal, suggesting that additional genomic aberrations or other non-genetic features, such as DNA methylation or tumor microenvironment, that are responsible for establishing distinct CPC populations are still to be discovered. Furthermore, these CPC mutations appear to constrain the set of loci that acquire mutations during tumorigenesis. As some of these loci include regions affected by aSHM, constrained mutations are not always pathogenic drivers.
However, aSHM is well known to target the transcriptional start sites of highly actively transcribed genes, including in normal memory B cells,[@machadoDiverseMutationalLandscapes2022; @alvarez-pradoBroadAtlasSomatic2018] and many aSHM loci are strongly associated with genetic subgroups,[@wrightProbabilisticClassificationTool2020] and therefore provide footprints of the phenotypic states cells have passed through *en route* to DLBCL.
Our observations of tumor evolution in DLBCL have both similarities to, and differences from, those made previously in studies of the transformation of indolent lymphomas, including CLL and FL. In Richter's transformation (RT) of CLL, evolution follows a more linear pattern, and evidence for subclones that will eventually seed RT has been observed at diagnosis.[@nadeuDetectionEarlySeeding2022] Parry *et al.* compared RT to DLBCL using the Harvard genetics-based classification system and found that, while RTs clonally unrelated to the preceding CLL clustered with *de novo* DLBCLs, the clonally related RTs clustered separately, demonstrating a genetic uniformity to DLBCLs originating from a CLL-like CPC.[@parryEvolutionaryHistoryTransformation2023; @chapuyMolecularSubtypesDiffuse2018] However, the Harvard classification lacks a *NOTCH1*-driven subgroup, which is hypothesized to be most similar to CLL, so the relationship between RT and DLBCL genetic subgroups remains underexplored. In contrast, several studies of FL transformation have demonstrated branching patterns of evolution similar to what we have observed in DLBCL, where FL and tFL originate from a shared ancestral CPC, and no evidence of the eventual tFL-seeding subclone has been found in diagnostic FL samples.[@kridelHistologicalTransformationProgression2016a; @okosunIntegratedGenomicAnalysis2014] As expected based on genetic similarities and the proposed CPC origin of indolent and aggressive lymphomas, we demonstrated, for the first time, that DLBCLs were classified into the genetic subgroups with the most similarity to the indolent lymphoma diagnosed in each patient. Importantly, the entire spectrum of DLBCL genetic subgroups was observed in the late relapses of patients without any history of indolent disease.
This demonstrates that distinct CPCs may provide the substrate for all genetic subgroups, and that patients do not have to manifest overt indolent disease in order for the DLBCL relapse to exhibit constrained evolution.
The patterns of DLBCL tumor evolution observed here help explain the responses to salvage (immuno)chemotherapy observed at disease relapse in DLBCL. In primary refractory disease, the pattern of tumor evolution suggests that innate chemoresistance is present at diagnosis, with little change in the composition of mutations upon treatment ([@fig-6]D). In this study and others, these tumors do not typically respond to further (immuno)chemotherapy-based salvage regimens and outcomes are poor,[@crumpOutcomesRefractoryDiffuse2017] while alternatives to chemotherapy have been shown to produce superior outcomes in this patient population.[@lockeAxicabtageneCiloleucelSecondLine2022; @abramsonLisocabtageneMaraleucelSecondline2022] The population of primary refractory patients should therefore be the focus in identifying both genetic and non-genetic mechanisms of resistance to front-line immunochemotherapy.
In contrast, our observations of the biology of late relapse are consistent with elimination of the original DLBCL but persistence of a CPC harboring a very small number of mutations. These CPC populations subsequently give rise to a genetically divergent DLBCL with a large number of newly acquired mutations ([@fig-6]D). Although they share genetic features, the repertoire of driver mutations in the relapse is not preserved. As applications of precision medicine are being explored in DLBCL with emphasis on the rrDLBCL population, genomic analysis of relapsed samples is warranted. As these late relapses are effectively chemotherapy-naïve, immunochemotherapy-based regimens may remain a rational treatment option.
## Acknowledgements {.unnumbered}
`r paste0(parsed_authors$acknowledgements, collapse = " \n")`
## Competing Interests {.unnumbered}
`r paste0(parsed_authors$cois, collapse = " \n")`