Introduction

For decades, human tumor-derived cell lines have been a cornerstone of cancer research and have shaped our understanding of the genetic and epigenetic changes that drive malignancy. In addition to molecular and cell biology studies, cancer cell lines have been used extensively in areas such as drug screening and biomarker discovery.1, 2, 3, 4, 5 For research, cell culture offers distinct advantages: an ample supply of live cells, easy control of experimental factors and the availability of common reference model systems. However, questions have been raised about the clinical relevance of findings obtained with cancer cell lines.1 Biological issues, such as the monoclonal nature of cell lines and the absence of tumor stroma, and technical factors, including cross-contamination and culture adaptation, limit direct comparison with in vivo tumors. Nevertheless, many cell lines harbor genetic and epigenetic aberrations that are also found in matching cancer tissue biopsies.6, 7, 8, 9, 10 Through cell line authentication and careful selection of suitable cell lines for the specific research question, some of the abovementioned limitations can be addressed in the study design. Improved genetic and epigenetic characterization of a set of cell lines from the same type of cancer will help scientists to choose the best research tool.

Colorectal cancer (CRC) is a heterogeneous disease with three different, but partly overlapping, molecular phenotypes reflecting different forms of DNA instability. The chromosomal instability (CIN) pathway is the most common phenotype, accounting for 85% of all sporadic CRCs.11, 12, 13 The malignant cells in CIN tumors are typically aneuploid and show large-scale chromosomal rearrangements. The microsatellite instability (MSI) phenotype represents 15% of all CRCs and is caused by various deficiencies in the DNA mismatch-repair system, leading to a large increase in the mutation rate.14, 15 Cancers with the CpG island methylator phenotype (CIMP) exhibit aberrant DNA methylation, leading to concordant promoter hypermethylation of multiple genes.16 A precise definition of this phenotype and a unified panel of markers for classification remain to be established. Here, a panel of representative genotype-authenticated colon cancer cell lines is further classified according to genetic and epigenetic molecular phenotypes.

Results

Overview of colon cancer cell lines

A major obstacle to the validity of data generated from cancer cell lines is potential cross-contamination. Recently, a database containing >400 cross-contaminated or misidentified cell lines was published by the International Cell Line Authentication Committee.17 The cell lines included in the present study were analyzed by short tandem repeat (STR) profiling. For all cell lines, profiles were compared against publicly available databases. Cell lines with no publicly available STR profile were subjected to a clustering analysis along with the profiles of >100 different cancer cell lines in order to check for inappropriate similarities. As a combined result, three of the initial 27 colon cancer cell lines were discarded due to cross-contamination (Figure 1).

Figure 1

Colon cancer cell lines STR profiling. Hierarchical clustering of cell lines based on STR length of three alleles of nine STR markers. Gray color in the heatmap indicates missing allele. AMEL marker indicates sex chromosomes present. Cell lines found misclassified and excluded from further analysis are highlighted in red. Cell line pairs previously known to be derived from the same patients are highlighted in gray.

As previously known, HCT-15/DLD-1 and HT-29/WiDr are derived from the same patient.18, 19 However, considering their widespread use, all four were subjected to analyses. As expected, there were no genetic differences within the pairs. In addition, two sets of cell lines derived from primary tumor and metastasis from the same patient are included here: SW480 (primary) and SW620 (lymph node), and IS1 (primary) and IS3 (peritoneal metastasis). SW480 and SW620 carried identical mutation profiles, but had epigenetic differences. IS1 was homozygous, whereas IS3 was heterozygous for the KRAS mutation.

The 24 cell lines included in this study varied in appearance and growth characteristics (Figure 2). The fastest growing cultures were those of Caco-2, COLO 320, DLD-1, HCT-15, HCT-116, HT-29 and TC71 (doubling time 20–24 h). Although most cell lines formed quasi-monolayers, EB, FRI, IS3, LS1034, SW1116 and V9P formed dense ‘cell islands’ and were also the slowest growing cultures. Cell line origins are listed in Table 1.

Figure 2

Colon cancer cell lines vary in growth rate and morphology. Phase-contrast micrographs depict the individual cell cultures 24 h after trypsinization and seeding. Fast-growing cancer cell lines are indicated with a yellow dot and slower-growing cell lines are indicated by a red dot. The remaining cell lines had an intermediate growth rate. Scale bar, 100 μm.

Table 1 Colon cancer cell line origins

MSI and CIN status in colon cancer cell lines

Using the BAT-25 and BAT-26 mononucleotide repeat markers, 9/24 cancer cell lines were found to be MSI (Table 2). CIN was found to be mutually exclusive with MSI and was the most common phenotype with 15/24 cell lines (Table 2).

Table 2 Colon cancer cell lines classified by the molecular pathways CIN, MSI and CIMP, and mutation status of cancer critical genes

CIMP in colon cancer cell lines

Classification of colon cancer cell lines into CIMP-positive and -negative samples was based on CIMP panel 1 suggested by Issa16 (CDKN2A (p16), MINT1, MINT31 and MLH1) and CIMP panel 2 suggested by Weisenberger et al.20 (CACNA1G, IGF2, NEUROG1, RUNX3 and SOCS1). Among the 24 colon cancer cell lines, 13 and 9 were classified as CIMP positive by panels 1 and 2, respectively (Table 2). In accordance with previous findings, panel 2 displayed a bimodal distribution of the number of methylated genes, as illustrated in Figure 3a. Figure 3b shows the CIMP status compared with the two other molecular pathways, CIN and MSI, as well as cancer gene mutations.

Figure 3

CIMP in colon cancer cell lines. (a) The status of CIMP panel 1 (Issa,16 left) and panel 2 (Weisenberger et al.,20 right) is illustrated. Panel 2 displayed a bimodal distribution of the number of methylated markers, identifying a distinct group of colon cancer cell lines with frequent DNA methylation. (b) Molecular profiles of colon cancer cell lines. A total of 10 markers in 2 preselected panels were tested for CIMP-related DNA methylation in 24 colon cancer cell lines. Green and red signify unmethylated and methylated samples, respectively. CIMP-positive samples are indicated in purple and CIMP-negative samples in light blue. Samples with CIN or MSI, or with BRAF, KRAS, PIK3CA, PTEN and/or TP53 mutations, are marked in black. (c) Venn diagrams illustrate the association between the three CRC phenotypes CIN, MSI and CIMP panel 1 (left) and CIMP panel 2 (right) in colon cancer cell lines.

BRAF, KRAS, PIK3CA, PTEN and TP53 mutations in colon cancer cell lines

In order of decreasing frequency, TP53, KRAS, PIK3CA, BRAF and PTEN are among the most commonly altered genes in CRC. A subset of the present panel of cell lines has previously been characterized for TP53 mutations.21 We found that TP53 was the most commonly mutated gene, affecting 17/24 cell lines. Three cell lines had frame-shift or nonsense mutations, whereas the remainder had missense mutations (Table 2). The SIFT Human Protein tool and the IARC TP53 database were used to assess the functional impact of these substitutions.22, 23 All 17 cell lines carried mutations predicted to be ‘damaging/non-functional’. Notably, SW480 and SW620 each carried two different TP53 mutations: the ‘tolerated/increased activity’ P309S and the ‘damaging/non-functional’ R273H substitutions. Although TP53 is polymorphic at codon 72 and several studies have suggested increased cancer susceptibility for carriers of the TP53 P72 variant, this association is uncertain.24, 25, 26 Of our 24 cell lines, 8 had at least 1 such allele. Full details on all mutations are listed in Supplementary Table 1.

Hyperactivating KRAS mutations were found in 15 cancer cell lines, five of which were homozygous (Table 2 and Supplementary Table 1). BRAF mutations were found in another five cell lines and were, as expected, mutually exclusive with those of KRAS. All BRAF-mutated cell lines retained a wild-type BRAF copy. In total, 20/24 cell lines harbored mutations in either KRAS or BRAF.

The PIK3CA gene had hyperactivating mutations in 11 samples (Table 2). SW948 was the only cell line homozygous for the mutant allele. PTEN encodes a tumor-suppressor protein counteracting the phosphoinositide 3-kinase (PI3K) complex.27 Two samples, CO-115 and TC71, had PTEN mutations leading to premature stop codons. In summary, 13/24 cell lines had PI3K/AKT hyperactivating disruptions.

Epigenetic and genetic stratification of colon cancer cell lines

Venn diagrams illustrate the overlap between the three developmental pathways (Figure 3c). CIMP panel 1/Issa16 overlapped with the majority of cell lines with MSI and BRAF mutation (Table 3). For CIMP panel 2/Weisenberger et al.,20 the binary logistic regression analysis revealed a significant association between a positive phenotype and MSI (P=0.03), and a CIMP-negative phenotype and TP53 mutation (P=0.01), in agreement with previous findings in CRC.16, 20 CIMP-positive cell lines were associated with mutations in BRAF and PIK3CA (borderline significance P=0.07 and 0.06, respectively). Results are summarized in Table 3 and are illustrated in Figure 3c.

Table 3 Associations between CIMP status and other molecular features

Discussion

We present here the profiles of key epigenetic and genetic features of 24 colon cancer cell lines, which are depicted in Supplementary Figure 1.

There is still no consensus regarding how the CIMP phenotype should be classified.28 Several gene panels exist and optimal marker thresholds are not yet determined. Furthermore, there is no consensus regarding whether CIMP consists of two subgroups,20 three subgroups29 or four subgroups.30 Many have also struggled to confirm the bimodal distribution of the number of methylated markers first described by Ogino et al.31 This was in recent years elegantly reproduced by Weisenberger et al.,20 who used a comprehensive approach to identify suitable markers for CIMP classification and, at the same time, demonstrated that this is indeed a distinct subgroup of colorectal tumors. In the present study, we have analyzed the two most commonly used CIMP panels,16, 20 and in spite of the limited sample number a bimodal distribution was confirmed for CIMP panel 2, supporting the use of this Weisenberger-derived panel also for classifying colon cancer cell lines. This was further supported by the association found between CIMP and BRAF mutations, which is in agreement with results from primary CRCs.20

Despite the fact that MSI is frequently caused by promoter hypermethylation of MLH1 in sporadic CRC, and that MSI largely overlaps with the CIMP phenotype, only three out of seven MSI CIMP-positive cell lines had MLH1 promoter methylation. Among the remaining four, DLD-1/HCT-15 carry disrupted MSH6, HCT-116 has mutated MLH1,32, 33 and TC71 is derived from a patient with a history of hereditary non-polyposis CRC syndrome, and consequently has a so-far unknown germline mismatch-repair deficiency.34

Interestingly, two MSI cell lines were scored as CIMP negative across both epimarker panels. Mutations in KRAS have previously been associated with a CIMP-low phenotype29, 35 and it is possible that these cell lines would be reclassified as a CIMP-low phenotype, a subgroup associated with MSI, if the threshold criteria were changed or if markers that have the ability to separate CIMP-high from CIMP-low phenotype were taken into account. Indeed, both LoVo and LS-174T harbored KRAS mutations. We have previously shown that these cell lines harbor promoter methylation of the majority of promoters included in a six-gene DNA methylation biomarker panel for early detection of CRC.4

TP53 is a gene that is pivotal in maintaining genome integrity and in inducing apoptosis in cells damaged beyond repair.36 Seven of our cell lines presented wild-type TP53. Alternative mechanisms for the deregulation of this tumor suppressive P53 signaling circuit are ATM loss or MDM2 hyperactivation.37 The PI3K/AKT pathway is known to induce MDM2 activity and could thus contribute to the loss of TP53 in some of the seven wild-type cell lines.38

KRAS and BRAF are proto-oncogenes in the RAS–RAF–mitogen-activated protein kinase pathway relaying pro-proliferative signaling. In a previous study, BRAF and KRAS mutation status, as well as CIMP status, were determined for 12 of the 24 cell lines reported here.39 With the exception of a KRAS mutation detected in SW948, all data were in agreement with these results. RAS–RAF–mitogen-activated protein kinase signaling promotes growth and is hyperactivated in a large fraction of colorectal carcinomas. KRAS and BRAF mutations are the most common alterations, but alterations to NRAS, EGFR, ERBB2 and ERBB3 are also known to contribute to pathway activation in CRC.32, 40 In our panel, 4/24 lines were negative for both KRAS and BRAF hotspot mutations. Interestingly, the same four cell lines were negative for PIK3CA/PTEN alterations as well. Activating PIK3CA mutations and loss or mutational inactivation of PTEN are typical aberrations, leading to hyperactivation of the pro-tumorigenic PI3K/AKT pathway.27 In the case of SW48, a hyperactivating mutation in EGFR has been described.32 However, for the remaining three cell lines, Caco-2, COLO 320 and V9P, alternative mechanisms must be driving the malignant growth.

In summary, we report an epigenetic and genetic profiling of a large panel of colon cancer cell lines. By comparing the two most cited CIMP panels in the literature, we support the panel of Weisenberger et al.20 as the most suitable choice for CIMP evaluation in colon cancer cell lines, also suggesting that colon cancer cell lines might be relevant model systems for studying the CIMP phenotype. The genetic and epigenetic information provided in the present study should aid in the selection of representative colon cancer cell lines for future research.

Materials and methods

Colon cancer cell lines

Twenty-seven colon cancer cell lines were initially included in the present study. HCT-116, HCT-15, LoVo, RKO, SW1116, SW48, SW620, SW948, NCI-H508 and WiDr were purchased from the American Type Culture Collection (ATCC, Manassas, VA, USA). ALA, Caco-2, CO-115, COLO 320, DLD-1, EB, FRI, HT-29, IS1, IS2, IS3, LS1034, LS-174T, TC7, TC71, SW480 and V9P were kindly provided by collaborators. Cell lines were cultured in medium with added fetal bovine serum, glutamine, penicillin and streptomycin, and were maintained in humidified incubators at 37 °C and 5% CO2, as described in Supplementary Table 2. Before collection, cultures were tested for mycoplasma infection using MycoAlert (Lonza, Walkersville, MD, USA) according to the manufacturer’s protocol.

DNA was isolated using either a standard phenol/chloroform procedure or a magnetic beads approach (the Maxwell 16 DNA Purification kits, Promega Corporation, Madison, WI, USA, and MagAttract DNA Mini M48 kit, Qiagen Inc., Valencia, CA, USA). DNA was STR profiled using the AmpFLSTR Identifiler PCR Amplification Kit (Life Technologies, Carlsbad, CA, USA). The resulting cancer cell line STR profiles were cross-compared and, where available, matched against the online databases of the ATCC and the German Collection of Microorganisms and Cell Cultures (Braunschweig, Germany). Hierarchical clustering of STR data was performed using Euclidean distances and average linkage clustering in Partek Genomics Suite 6.6 (Partek Inc., St Louis, MO, USA; Figure 1). ALA, CO-115, EB, FRI, IS1, IS2, IS3, TC7, TC71 and V9P are non-commercial cell lines and their STR profiles will be provided upon request. Three of the 27 cancer cell lines were found to be misclassified. ALA and IS2 had profiles identical to SW480/SW620 and LS1034, respectively. TC7 had an STR profile incompatible with its origin as a Caco-2 subclone.41 Consequently, ALA, IS2 and TC7 were excluded from further analysis.
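The clustering step can be illustrated with a minimal Python sketch of agglomerative clustering using Euclidean distances and average linkage. This is not the Partek implementation used in the study. The function name `average_linkage`, the cell line labels and the allele lengths are hypothetical, and missing alleles are assumed to have been handled before the profiles are passed in.

```python
from itertools import combinations
from math import dist


def average_linkage(profiles):
    """Cluster profiles with Euclidean distance and average linkage.

    profiles: dict mapping a cell line name to an equal-length tuple of
    STR allele lengths. Returns the merges in the order they occur, each
    as (members_a, members_b, linkage_distance).
    """
    clusters = [(name,) for name in profiles]
    merges = []

    def linkage(a, b):
        # Average linkage: mean pairwise distance between the two clusters.
        total = sum(dist(profiles[p], profiles[q]) for p in a for q in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest linkage distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((a, b, linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical allele lengths, invented for illustration only.
toy_profiles = {
    "line_A": (12, 14, 9, 11),
    "line_B": (12, 14, 9, 11),
    "line_C": (10, 13, 8, 12),
    "line_D": (15, 16, 7, 9),
}
for members_a, members_b, distance in average_linkage(toy_profiles):
    print(members_a, members_b, round(distance, 3))
```

In this toy input the first merge joins `line_A` and `line_B` at distance zero, which is the kind of identical-profile signal that would prompt a closer look at two supposedly unrelated cell lines.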

Micrographs of live cell cultures were captured with an Eclipse TS100 microscope equipped with a × 10 phase-contrast objective using accompanying NIS-Elements F Package 2.21 software (all from Nikon, Tokyo, Japan). Resulting images were imported into Photoshop CS4 (Adobe Systems, Mountain View, CA, USA), cropped and color matched.

MSI status and CIN phenotype

The MSI status was determined by analyzing the BAT-25 and BAT-26 mononucleotide repeat loci as previously described.42 BAT-25 and BAT-26 represent two out of the five markers in the Bethesda panel and have been shown to correctly identify 97% of MSI-high cases.43, 44 With the notable exception of LoVo, there was full concordance between the two markers. LoVo lacked the BAT-26 locus altogether, but was classified as MSI based on BAT-25 fragment length and in accordance with another study.32 The CIN status for all cell lines was retrieved from previous data from us and others,45, 46, 47, 48 and was in concordance with the present MSI data and gene mutation data.

CpG island methylator phenotype

Genomic DNA was subjected to bisulfite-mediated conversion using the EpiTect Bisulfite Kit from Qiagen, and DNA methylation was determined by quantitative methylation-specific PCR, as previously described,4 in 10 CIMP-defined promoters belonging to two distinct panels. CIMP panel 1/Issa16 consisted of CDKN2A (p16), MINT1, MINT2, MINT31 and MLH1,49 and CIMP panel 2/Weisenberger et al.20 consisted of CACNA1G, IGF2, NEUROG1, RUNX3 and SOCS1. 6-FAM-labeled probes were purchased from Life Technologies and primers were purchased from BioNordika Norway AS (Oslo, Norway). For all samples, three replicates were run for each of the genes and the median value was used for data analysis. The repetitive ALU sequence (ALU-C4) was used to normalize for the amount of bisulfite-converted DNA input.50 A methylated reference (CpGenome Universal Methylated DNA, Millipore, Billerica, MA, USA) was used to generate a 1:5 dilution series (32.5–0.052 ng) constituting the standard curve. All samples were censored after cycle 35 according to the protocol from Life Technologies. The percent of methylated reference (PMR) was calculated as the median GENE:ALU ratio of each sample divided by the median GENE:ALU ratio of the positive control, multiplied by 100. Samples were considered positive for methylation at a PMR threshold of 10, in accordance with previous publications.20, 49 In CIMP panel 1, promoter hypermethylation of MINT2 occurred in all cell lines and was therefore non-informative for CIMP classification. According to previously established criteria, CIMP-positive samples for panel 1 were defined as having three or four methylated markers and CIMP-negative samples as having zero to two methylated markers.49 For panel 2, CIMP-positive cell lines were defined as harboring at least 3 of the 5 markers methylated and CIMP-negative samples as having a maximum of 2 methylated markers.20
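The arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch, not the analysis code used in the study. The function names and all input values are hypothetical. It assumes that the GENE:ALU ratio is formed per replicate before the median is taken, and that the cutoff of 10 is inclusive. Neither detail is stated explicitly in the text.

```python
from statistics import median

# Marker panels as named in the text; MINT2 is left out of panel 1
# because it was methylated in all cell lines (non-informative).
PANEL_1 = ("CDKN2A", "MINT1", "MINT31", "MLH1")
PANEL_2 = ("CACNA1G", "IGF2", "NEUROG1", "RUNX3", "SOCS1")


def dilution_series(start_ng=32.5, factor=5, points=5):
    """Input amounts of a 1:5 standard curve: 32.5, 6.5, 1.3, 0.26, 0.052 ng."""
    return [start_ng / factor**i for i in range(points)]


def pmr(sample_gene, sample_alu, ref_gene, ref_alu):
    """Percent of methylated reference from replicate quantities.

    Assumption: ratios are computed per replicate, then the median is taken.
    """
    sample_ratio = median(g / a for g, a in zip(sample_gene, sample_alu))
    ref_ratio = median(g / a for g, a in zip(ref_gene, ref_alu))
    return 100 * sample_ratio / ref_ratio


def cimp_positive(pmr_by_gene, panel, pmr_cutoff=10, min_methylated=3):
    """A sample is CIMP positive when at least three panel markers are methylated.

    Assumption: a marker counts as methylated when PMR >= pmr_cutoff.
    """
    methylated = sum(pmr_by_gene[gene] >= pmr_cutoff for gene in panel)
    return methylated >= min_methylated


# Hypothetical PMR values for one cell line, invented for illustration.
example = {"CACNA1G": 80, "IGF2": 45, "NEUROG1": 12, "RUNX3": 3, "SOCS1": 0}
print(cimp_positive(example, PANEL_2))
```

The same `min_methylated=3` default covers both panels, because panel 1 requires three or four of four markers and panel 2 requires at least three of five.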

Mutation analyses of BRAF, KRAS, PIK3CA, PTEN and TP53 cancer genes

Total DNA was subjected to PCR amplification followed by Sanger sequencing as previously described.51 Resulting sequences were compared with the consensus coding sequence retrieved from the University of California, Santa Cruz genome browser, hg19 (accessed November 2012).52, 53 All mutations are annotated at the protein level according to previously described nomenclature.54 The mutational hotspots BRAF codon V600, KRAS codons G12, G13 and Q61, and PIK3CA codons E542, E545 and H1047 were analyzed. The entire coding regions of PTEN and TP53 were examined. Mutation calls were verified by resequencing using different sets of primers. The functional impact of amino acid substitutions was assessed using the SIFT Human Protein tool with default parameters (accessed March 2013) and the IARC TP53 database (accessed June 2013).22, 23 All mutation data were compared with data available from COSMIC (accessed January 2013).32 Two hyperactivating mutations, KRAS A146T (LS1034) and PIK3CA P449T (HT-29/WiDr), have previously been reported in regions not covered in the present study.55, 56 These two mutations were included for statistical comparisons.
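As a small illustration of the sequencing coverage described above, the following Python sketch checks whether a protein-level substitution falls within a region examined in the study. The helper name `covered` and the lookup tables are hypothetical conveniences. Only simple single-residue substitutions such as G12D are parsed, and frame-shift notation is not handled.

```python
import re

# Hotspot codons analyzed per gene, as listed in the text.
HOTSPOTS = {
    "BRAF": {600},
    "KRAS": {12, 13, 61},
    "PIK3CA": {542, 545, 1047},
}
# Genes whose entire coding region was examined.
FULL_CODING = {"PTEN", "TP53"}


def covered(gene, protein_change):
    """Return True if a substitution lies in a region sequenced in the study."""
    if gene in FULL_CODING:
        return True
    # Expect one-letter residue, codon number, one-letter residue or stop (*).
    match = re.fullmatch(r"[A-Z](\d+)[A-Z*]", protein_change)
    if match is None:
        raise ValueError(f"unsupported mutation format: {protein_change!r}")
    return int(match.group(1)) in HOTSPOTS.get(gene, set())


print(covered("BRAF", "V600E"))    # codon 600 is an analyzed hotspot
print(covered("PIK3CA", "P449T"))  # codon 449 lies outside the hotspots
print(covered("TP53", "R273H"))    # whole coding region examined
```

A check like this makes explicit why a mutation such as PIK3CA P449T had to be taken from earlier reports rather than from the present sequencing data.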

Statistical analyses

SPSS 16.0 was used for performing statistical analyses (IBM, Armonk, NY, USA). Binary logistic regression analysis was used to examine associations between CIMP and genetic features. All P-values were derived from two-tailed tests, and a significance threshold of 0.05 was applied. No correction for multiple testing was performed. As cell lines HCT-15/DLD-1 and HT-29/WiDr are derived from the same patient and most likely from the same tumor,18, 19 only HCT-15 and HT-29 were included in the statistical analyses.
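For a single binary predictor, binary logistic regression has a closed-form solution: the coefficient equals the log odds ratio of the 2x2 table and its standard error is the square root of the summed reciprocal cell counts. The Python sketch below uses that shortcut together with a two-sided Wald test. It is not the SPSS procedure used in the study, the function name `logistic_2x2` is hypothetical, and the counts are made up for illustration and do not come from the study's tables.

```python
from math import erfc, log, sqrt


def logistic_2x2(pos_with, neg_with, pos_without, neg_without):
    """Logistic regression of a binary outcome on one binary feature.

    Arguments are cell counts: outcome-positive and outcome-negative lines
    carrying the feature, then the same for lines without it. All counts
    must be positive, otherwise the estimate is undefined.
    Returns (coefficient, odds_ratio, two_sided_p) from the Wald test.
    """
    counts = (pos_with, neg_with, pos_without, neg_without)
    if min(counts) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (pos_with * neg_without) / (neg_with * pos_without)
    coefficient = log(odds_ratio)
    std_error = sqrt(sum(1 / c for c in counts))
    z = coefficient / std_error
    # Two-sided P-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return coefficient, odds_ratio, p_value


# Made-up counts: CIMP-positive/negative lines with and without a feature.
coefficient, odds_ratio, p_value = logistic_2x2(6, 2, 3, 11)
print(round(odds_ratio, 2), round(p_value, 3))
```

With small panels such as the one studied here, cell counts of zero are common. The closed form breaks down in that case, which is one reason a dedicated statistics package is the safer choice for the actual analysis.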