Introduction

The sensitive and accurate detection of vaccine-specific T cells is a fundamental requirement for use of T cell monitoring assays to facilitate the development of immunotherapeutics within clinical trials [1–3]. The CIMT Immunoguiding Program (CIP) is one of a number of recent international initiatives that are committed to quality assurance and harmonization of the most commonly used techniques for cellular immunomonitoring, such as staining with HLA-peptide multimers, flow cytometric determination of intracellular cytokines and the IFN-γ ELISPOT assay [4–7]. Through a series of proficiency testing panels, critical variables in assay protocols have been identified and corrected, resulting in assay harmonization [8–10]. In addition, the panels provide the opportunity of external validation for the participating laboratories. Here we report on the analysis of the influence of serum supplement to ELISPOT test media across different protocols.

Many laboratories use either human or fetal calf serum in their ELISPOT test medium to maintain the viability and function of T cells. However, serum batches are known to have unique compositions and immunological properties. They require pre-testing, using control samples from representative donors with known reactivities, to identify batches with low background reactivity and optimal antigen-specific spot production [11]. As serum pools are by definition limited in size, rounds of testing to identify optimal serum batches have to be repeated on a regular basis. The use of pre-tested serum batches, therefore, is a restricting factor in its own right. The use of different batches in different laboratories may limit comparability between institutions; similarly, results may differ even when generated within one center after switching to a new serum batch. It is also well described that serum composition can change over time [12, 13], which suggests that serum properties also change during long-term storage. Consequently, if it were possible to undertake ELISPOT assays under serum-free conditions without loss of sensitivity, this would be likely to reduce the variability of the assay and improve consistency over time and between laboratories.

As yet there has been no collaborative study to evaluate the need for serum in the medium used for the ELISPOT assay across a variety of different protocols. CIP has, therefore, initiated a proficiency panel with the specific aim of identifying the impact of serum-supplemented medium versus serum-free medium on the number of viable cells, background spot production and test performance. Here, we report that the use of media without serum does not significantly affect the number of viable cells, background spot production, or detection rates, even when different ELISPOT protocols are used. However, the omission of serum requires ELISPOT protocol optimization for each specific serum-free medium.

Materials and methods

Preparation of PBMC samples and peptides

Anonymized buffy coats, collected from HLA-A*0201 positive healthy donors during routine blood donation, were obtained from the National Blood Service in accordance with local regulatory requirements. Peripheral blood mononuclear cells (PBMC) were prepared in the laboratories of the Cancer Sciences Division, University of Southampton, UK. PBMC were isolated using Ficoll Paque™ Plus density gradient separation (GE Healthcare Lifesciences, Buckinghamshire, UK) and washed twice in RPMI 1640 supplemented with 1 mM sodium pyruvate and 1% penicillin–streptomycin-l-glutamine (Invitrogen Ltd, Paisley, UK). PBMC were counted and stored in 1 ml aliquots at 1–2 × 10⁷/vial, depending on the total cell number, in 40% RPMI, 50% human AB serum and 10% DMSO, using a Nalgene controlled rate freezing container overnight at −80°C before transfer to liquid nitrogen vapor phase for storage until required. The human AB serum used in cryopreservation had been batch-tested for use with minimal background in the ELISPOT.

Two synthetic peptides (Peptide Protein Research, Fareham, UK) for HLA-A*0201-restricted viral epitopes were used as model antigens (influenza matrix protein 58–66, GILGFVFTL, and human CMV pp65 495–503, NLVPMVATV). Screening was performed at two central laboratories (Southampton, UK and Leiden, The Netherlands) using IFN-γ ELISPOT to identify influenza- and CMV-positive reactivities. Five different donors were selected, including donors with no, low, medium or high numbers of detectable T cells (range 0–100/10⁵ PBMC) specific for one or both model antigens. A total of seven antigen-specific T cell responses were identified by extensive pretesting.

Participants and panel design

The ELISPOT proficiency panel consisted of 18 participants from 7 European countries (Denmark, Germany, Poland, Sweden, Switzerland, The Netherlands and the UK).

Coded PBMC samples (2 vials from each of 5 donors, D1–D5) and synthetic peptides (aliquots of 1 μg/μl to be used at a final concentration of 1 μg/ml) were shipped on dry ice, together with instructions for testing. Participating centers were asked to test the PBMC using an IFN-γ ELISPOT assay separately under two conditions: (1) using medium containing 5–10% serum, and (2) using one of the three commonly used serum-free lymphocyte media [AIM-V from Gibco/Invitrogen, C.T.L serum-free test medium from Cellular Technology Limited, or X-Vivo 15 from BioWhittaker]. They were also asked to incorporate the four minimum requirements established in earlier panel phases [10]: (1) not to use allogeneic APCs, (2) to add ≥4 × 10⁵ lymphocytes per well, (3) to introduce a resting (recovery) time of 2–24 h before adding cells to the ELISPOT plate, and (4) to perform the tests with triplicate wells. Cells plus medium alone were used as negative control. A template pipetting scheme was provided for consistency. Otherwise, assays were performed according to locally established operating procedures and reagents. Cells were counted immediately after thawing and again after resting. On completion of the assays, including plate reading and auditing, participants completed a raw data report form and a questionnaire to provide full details of the reagents and protocol used. The tests were performed within 12 weeks of sample shipment. No central plate analysis was performed.

Analysis of results

Of the 18 participating laboratories, one center was unable to recover a sufficient number of cells to perform the analysis, and another completed only half of the requested analysis. Sixteen complete data sets were, therefore, collected. All but one of the laboratories considered themselves to be “experienced” or “professional” in the use of the ELISPOT assay.

Response definition: the detection of influenza- and/or CMV-specific reactivities was based on acceptance criteria that had been established in previous CIP ELISPOT panels [10]. Where insufficient cells were recovered to perform the test in triplicate, the results were eliminated from further analysis. Wells containing cells plus medium only and no peptide were used as control triplicates. Wells containing cells, medium plus peptides were defined as “experimental” triplicates. An experimental triplicate was considered to be positive relative to the medium control triplicate following a two-sided Student’s t test for unpaired samples, when (1) P < 0.05 and (2) the mean spot number was greater than threefold the mean spot number of the medium control triplicate. Antigen-specific spots were determined by subtracting the mean spot number in the medium control triplicate from the mean spot number in the experimental triplicate. The statistical comparisons of the number of viable cells, background spots and detection rates in serum-supplemented and serum-free media were performed using a two-sided Student’s t test for unpaired samples (P < 0.05).
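
The response definition can be illustrated with a short Python sketch. It is not the analysis script used in the panel: the function names are hypothetical, the pooled-variance form of Student's t test is assumed, and, rather than computing a P value, |t| is compared with the tabulated two-sided 5% critical value for 4 degrees of freedom (2.776), which corresponds to P < 0.05 only when two triplicates are compared.

```python
from math import sqrt
from statistics import mean, variance

# Two-sided critical value of Student's t for alpha = 0.05 and
# 3 + 3 - 2 = 4 degrees of freedom (tabulated value, triplicates only).
T_CRIT_DF4 = 2.776


def is_positive(experimental, control, fold=3.0):
    """Apply both criteria to one experimental and one control triplicate.

    Hypothetical helper: (1) |t| exceeds the critical value (P < 0.05) and
    (2) the experimental mean is greater than `fold` times the control mean.
    """
    if len(experimental) != 3 or len(control) != 3:
        raise ValueError("both conditions must be tested in triplicate")
    m_exp, m_ctl = mean(experimental), mean(control)
    # Pooled variance for two groups of equal size.
    pooled_var = (variance(experimental) + variance(control)) / 2
    if pooled_var == 0:
        significant = m_exp != m_ctl
    else:
        t = (m_exp - m_ctl) / sqrt(pooled_var * (1 / 3 + 1 / 3))
        significant = abs(t) > T_CRIT_DF4
    return significant and m_exp > fold * m_ctl


def antigen_specific_spots(experimental, control):
    """Mean experimental spot number minus mean medium-control spot number."""
    return mean(experimental) - mean(control)
```

With made-up spot counts, `is_positive([30, 34, 32], [2, 3, 1])` satisfies both criteria, whereas `is_positive([5, 6, 7], [2, 3, 1])` fails the threefold criterion because the experimental mean equals, but does not exceed, three times the control mean.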

Results

The impact of serum on the number of viable cells

Viable cells were counted immediately after thawing and again after the mandatory resting phase in the assay. Twelve centers performed manual counting using a microscope and trypan blue exclusion. Three centers used Guava-readers and one used a CASY cell counter, while also providing complementary data sets for manual counting. The mean number of viable cells based on trypan blue exclusion (for all 5 donors) immediately after thawing was 10.6 ± 3.3 × 10⁶ per vial, with a minimum of 4.3 × 10⁶ and a maximum of 18.1 × 10⁶ cells being reported by the participants. After resting, the mean number of viable cells was decreased compared to direct counting following thawing, with an average cell loss of approximately 31–40% (Table 1). However, there was no significant difference between the results obtained in the presence of serum (mean 7.3 × 10⁶ cells) compared to resting in the absence of serum (mean 6.4 × 10⁶ cells) (Table 1; Fig. 1a). We concluded that the number of viable cells was not compromised in serum-free medium.
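
As a quick check of the arithmetic, the reported cell loss can be reproduced from the group means given above. The sketch assumes that the loss was computed from these means (the per-donor calculation behind Table 1 may differ), and the function name is hypothetical.

```python
def percent_cell_loss(after_thaw, after_rest):
    """Percentage of viable cells lost between thawing and the end of resting."""
    if after_thaw <= 0:
        raise ValueError("cell count after thawing must be positive")
    return 100.0 * (after_thaw - after_rest) / after_thaw


# Mean viable cells per vial (x 10^6), taken from the text above.
MEAN_AFTER_THAW = 10.6
loss_with_serum = percent_cell_loss(MEAN_AFTER_THAW, 7.3)  # rested with serum
loss_serum_free = percent_cell_loss(MEAN_AFTER_THAW, 6.4)  # rested without serum
```

Rounded to whole percentages, these two values bracket the approximately 31–40% loss stated in the text.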

Table 1 Number of viable PBMC after thawing and resting
Fig. 1

Overall results from sixteen IFNγ-ELISPOT protocols using serum-supplemented and serum-free media. a Recovery of viable PBMC using trypan blue and a manual haemocytometer, following resting for 2–20 h. b Background spot production (spots per 100,000 PBMC) in the cells plus medium only control. c Detection rates, shown as % of responses detected among the seven possible donor–antigen combinations. The mean values for all samples (D1–D5) and all labs (n = 16) were used to plot minimum and maximum value, the range (error bars), inter-quartile range (boxes), median (horizontal line) and mean (black triangle) for the whole group under serum supplemented or serum-free conditions

With the exception of one laboratory (ID09), all PBMC were thawed at room temperature or 37°C. We did not observe any correlation between the numbers of viable cells and the type of medium used for thawing, the counting method, and the duration of the resting phase [8 centers included a short resting phase in their protocol (2–6 h) and 8 centers used an overnight resting (14–20 h)]. Three laboratories (ID03, ID07 and ID08) reported the use of serum in their thawing medium, and it cannot be excluded that these conditions influenced the test results. However, the participants were instructed to use the same medium for the resting phase as they would be using in the ELISPOT assay. For serum-free conditions, this meant that cells were rested in the absence of serum for 2–21 h. A detailed list of thawing conditions is available in Supplementary Table 1A.

Background spot production under serum-supplemented and serum-free conditions

We then compared the induction of non-specific background spots under serum and serum-free conditions. For this, background spot frequency was calculated for each center for each of the five donors (D1–D5) under both conditions. As background spot production in the wells containing cells plus medium only was similar in quantity and distribution among the group of 16 laboratories for all 5 donors (not shown), the average values for all 5 donors, with serum and without serum, were then compared directly (Fig. 1b). The mean background spot values with and without serum were not significantly different (4.9 and 4.7 per 100,000 PBMC, respectively), suggesting that the presence of serum in the medium does not affect background spot production. The median background spot production, which is less affected by the three outliers with high background, was 1.1 (serum) and 1.8 (serum-free) spots per 100,000 PBMC. The background values observed for the majority of laboratories were below two spots per 100,000 PBMC.

Detection rates in serum-supplemented and serum-free conditions

We asked all participating centers to test influenza- and CMV-reactivities of the five donor PBMC samples (seven donor–antigen combinations) by performing the IFN-γ ELISPOT assay using medium containing 5–10% serum and using serum-free medium. Three of the 16 laboratories routinely used serum-free medium in their protocol, with the remaining 13 laboratories preferring the use of pre-selected serum. Although the majority of participants (12/13) used human AB serum, seven different brands and three non-commercially produced sera were reported, whereas three laboratories did not specify the source of their serum (Supplementary Table 1B). This emphasizes the potential difficulties of standardizing this reagent.

The detection of influenza- and/or CMV-specific reactivities was subject to the strict acceptance criteria that we had previously defined ([10]; “Materials and methods”). Laboratory performance was expressed as the detection rate, calculated as the percentage of all potential responses that were detected (Fig. 1c). In the whole group of centers, detection rates were not significantly different for serum and serum-free conditions (P = 0.33, Student’s t test). Under test conditions with serum, 63.7% of all responses were detected for the whole group, with seven laboratories reporting >85% detection rates. Under serum-free conditions, 52.9% of the responses were detected for the whole group, and 4 laboratories reported >85% detection rates. Comparison of the performance of individual laboratories revealed that nine centers reported approximately equal detection rates (<15% difference) under both conditions, two centers generated better results and five centers encountered decreased detection rates using serum-free medium (>15% difference; Supplementary Table 2A). Importantly, the majority (13/16) of laboratories had optimized their protocols for use with serum and not for serum-free conditions. These laboratories combined serum with one of the following media: RPMI (6×), IMDM (4×), X-Vivo 15 (2×) and Iscove (1×). The three laboratories that normally used serum-free protocols added serum to X-Vivo 15 (2×) or RPMI (1×). The analysis of these small data sets did not reveal differences in the detection of positive responses in relation to the type of medium used.
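
The detection rate and the comparison between the two conditions can be sketched as follows. The function names are hypothetical, and the sketch assumes that the 15% threshold refers to a difference in percentage points between the two detection rates, which the text does not state explicitly.

```python
def detection_rate(n_detected, n_possible=7):
    """Percentage of the possible donor-antigen responses that were detected."""
    if not 0 <= n_detected <= n_possible:
        raise ValueError("n_detected must lie between 0 and n_possible")
    return 100.0 * n_detected / n_possible


def compare_conditions(rate_serum, rate_serum_free, threshold=15.0):
    """Classify a laboratory's serum-free result relative to its serum result.

    Assumes the threshold is expressed in percentage points.
    """
    diff = rate_serum_free - rate_serum
    if abs(diff) < threshold:
        return "equal"
    return "better" if diff > 0 else "decreased"
```

For example, a laboratory detecting 6 of 7 responses has a detection rate of about 85.7% and therefore exceeds the 85% mark; if the same laboratory detected only 4 of 7 responses without serum, `compare_conditions` would classify the result as "decreased".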

We then assessed whether the ability to detect antigen-specific T cells correlated with the level of background IFN-γ production (number of spots in the medium only control). As shown in Fig. 2, the antigen-specific detection rates increased as background spot production decreased, with similar coefficients of correlation of −0.66 for serum-supplemented and −0.72 for serum-free conditions. The correlation was independent of whether the medium contained serum, indicating that serum is not the major factor contributing to the background spot frequency.
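
A correlation of this kind could be computed as in the sketch below. The text does not state which coefficient was used, so Pearson's r is an assumption here, and the per-laboratory values are invented purely for illustration; they are not the panel data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-laboratory values (NOT data from this panel):
# mean background (spots per 100,000 PBMC) and detection rate (%).
background = [0.5, 1.0, 2.0, 4.0, 10.0, 20.0]
detection = [100.0, 85.7, 85.7, 71.4, 42.9, 14.3]
r = pearson_r(background, detection)  # negative: higher background, lower rate
```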

Fig. 2

Inverse correlation between background and detection rates. Mean detection rates for each lab, expressed as the % positive responses as defined in “Materials and methods”, are plotted against the mean cells plus medium only background (spots per 100,000 PBMC) using serum-supplemented (open triangles) and serum-free (filled circles) media. Under both conditions, detection rate decreases as background increases. Correlation coefficients: serum, −0.66; serum-free, −0.72

Qualitative and quantitative comparisons between laboratories

Comparisons of individual performances in detecting antigen-specific T cells were made to examine inter-laboratory variability, and to identify any protocol commonalities that might impact on the performance. For this, results from the assays performed under the preferred condition were used, i.e., serum-supplemented medium and serum-free medium for 13 and 3 labs (ID01, 04 and 16), respectively (Supplementary Table 1A).

From the group of 16 participating centers, six individual laboratories were identified as “high performers”, by having successfully detected 6 or 7 positive T cell responses (detection rate >85%) using their preferred protocols (Table 2A). The spot frequencies reported demonstrate that in spite of the concordance between these six laboratories in detecting influenza- and CMV-specific reactivities, there was less agreement in the quantification of antigen-specific T cells. Variation between the 6 “high performers” ranged from 74–153 spots/100,000 PBMC for the highest response D3/CMV (mean frequency 116/100,000 PBMC) to 1–13 spots/100,000 for the lowest response D2/CMV (mean frequency 5/100,000 PBMC).

We investigated the possibility that the differences in spot frequencies reported by participants might be influenced by the total number of cells seeded per well. As the number of antigen-specific spots and the number of cells seeded per well do not necessarily show a linear correlation, spot numbers reported by laboratories using different numbers of cells per well might not be directly comparable. In the case of the six high performing labs, five labs seeded 400,000 cells per well and one lab (ID09) used 450,000 PBMC per well. The latter laboratory reported spot frequencies which were above the mean for all six successfully detected donor–antigen combinations (Table 2), suggesting that the total number of seeded PBMC may influence the outcome. Based on this, we investigated the role of cell number and test performance in the whole group of 16 labs. A total of 11 labs seeded 400,000 PBMC per well, 3 labs seeded 450,000 PBMC (ID08, ID09 and ID21) and 2 labs seeded 500,000 PBMC (ID05 and ID13). However, no statistically significant difference between the three subgroups was observed for (1) the number of background spots, (2) the number of antigen-specific spots or (3) the detection rate (not shown). The chance of detecting a response correlated with the estimated frequency of antigen-specific cells within the sample (Table 2), confirming our previous findings [10]. A summary of all results reported by the 16 participating centers for both conditions (serum vs. serum-free) is provided in Supplementary Tables 2A and B.
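
Spot frequencies in this report are expressed per 100,000 PBMC. A minimal sketch of that conversion from a per-well count is given below; the function name and the example count are hypothetical. The conversion is a simple linear scaling and, for the reason discussed above, does not make results obtained with different cell numbers per well strictly comparable.

```python
def spots_per_100k(mean_spots_per_well, cells_per_well):
    """Linearly scale a mean spot count per well to spots per 100,000 PBMC."""
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    return mean_spots_per_well * 100_000 / cells_per_well


# Hypothetical example: a mean of 464 spots in wells seeded with 400,000 PBMC.
frequency = spots_per_100k(464, 400_000)
```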

Table 2 Reported frequencies of antigen-specific T cells

Finally, comparison of individual protocol variables between the high performing laboratories did not identify any commonalities (Table 3); the protocols of these six laboratories were varied, with different thawing conditions, counting methods and resting periods. There were also differences in the test medium, serum sources, type of ELISPOT plates, antibody combinations and ELISPOT reader used. This demonstrates that ELISPOT assays can be successfully performed under completely different conditions, provided that these are optimized for each laboratory.

Table 3 Comparison of ELISPOT protocols used by high performing laboratories

Discussion

We have used a multicenter approach to study the impact of serum use in a broad variety of ELISPOT protocols by comparing the performance of 16 different laboratories. Numbers of viable cells after thawing and resting were similar for both conditions (Fig. 1a), demonstrating that cell quality was not compromised in serum-free media. Similarly, there was little difference between the levels of background spot production under serum and serum-free conditions (Fig. 1b), suggesting that serum is not a major factor contributing to background spot detection. Overall detection rates (expressed as the percentage of detected among potentially detectable responses) did not differ significantly between the two conditions (Fig. 1c). The fact that four laboratories were able to detect >85% responses without serum clearly illustrates that excellent results can be obtained with serum-free protocols. However, five laboratories (38%) found that the detection rates decreased when they switched from using their serum-containing test medium to a serum-free medium. It has to be emphasized that 13/16 of the laboratories in this panel employ ELISPOT protocols that had been optimized for the use of serum supplemented medium. Based on these findings we recommend the use of serum-free protocols to centers that perform ELISPOT assays for monitoring immune responses on a regular basis, as this should eliminate the problems associated with batch-testing and serum stability, and assist in the process of assay harmonization. Our observations are in line with those from a recent, single center study [11], which concluded that removal of serum and addition of low-dose IL-7 led to increased test performance when applied to a specific ELISPOT protocol. During the review process of our paper an additional inter-laboratory study was published by Zhang et al. [14], where the use of eight different sera was compared to the use of the same serum-free medium in a single optimized protocol. 
The results showed that the signal-to-noise ratio increased in the presence of serum, and confirmed that good performance can be achieved using an optimized serum-free protocol. In contrast, our study of a variety of different locally established protocols clearly shows that removal of serum can increase the performance in some laboratories, while decreasing performance in others (i.e., increased background spot production or lower numbers of antigen-specific spots), and that ELISPOT assays can be successfully performed under completely different conditions, provided that these are optimized for each laboratory. We conclude that not all protocols support high performance under serum-free conditions without re-optimization.

The mean detection rate for the tests performed under the routinely established protocol conditions was 65%, representing the detection of a mean of 4.55 of the possible seven responses. This relatively low detection rate is due to two factors. First, in this panel phase, two of the seven responses that could have been detected (the two donor–antigen combinations D2/CMV and D4/Flu) were, on average, at or below six antigen-specific spots/100,000 PBMC (Table 2). In order to successfully detect these two very low responses, the background spot production should have been less than or equal to two spots, which is below the mean background spot production usually found in an average lab (CIP, unpublished data). We, therefore, expected an “average” lab to detect only five of the seven responses, which would have been a detection rate of 71.4%. The second reason for the low overall detection rate was that very strict response definition criteria were applied, which we had defined in previous rounds of inter-laboratory testing [10]. When we applied the more commonly used response definition criterion of twofold (instead of threefold) over background, the overall detection rate increased from 65 to 71.8% (for the preferred condition of each lab), clearly showing that the group of participants was experienced and representative of an average laboratory.

The experience from the CIP ELISPOT proficiency panel program and previous findings from Janetzki [8] strongly suggested that background critically influences the detection rate. The data from this panel confirm that there is a distinct relationship between the background spot frequency and detection rates (Fig. 2) reinforcing the importance of optimizing assay protocols to generate low backgrounds.

Six of the 16 participating laboratories were able to detect 86–100% of the positive responses (6/7 or 7/7) using their favoured conditions and were identified as “high performers” (Table 2). Although the performances of these laboratories were qualitatively equivalent, the antigen-specific T cell frequencies were highly variable, ranging between 1 and 13 spots/100,000 PBMC for the lowest response (D2/CMV) and between 74 and 153 spots/100,000 PBMC for the highest response (D3/CMV). However, these six high performing laboratories reported results with less variation than the remaining ten laboratories (Table 2). High inter-center variability has been observed previously in proficiency panels organized by us and others [8, 10], especially with low and moderate frequencies of T cells, and it is clear that quantitative comparison of results reported across institutions still remains a major problem in the field. Results from the current panel continue to illustrate the difficulty in decreasing inter-center variability in the ELISPOT assay, even among experienced laboratories. To address this issue, broadly available standard samples are urgently needed and may even require the use of the same protocols [14]. This, together with the implementation of external quality testing projects and the generation of a commonly accepted framework to report data from immunomonitoring assays [15], is a major international challenge for achieving comparability of data across institutions.

Comparison of the protocols used by high performing laboratories revealed that these were different from one another (Table 3); there was no indication that any one factor in the protocol had critically influenced performance with respect to the detection rate of T cell reactivities. On the other hand, laboratories that used protocols that had many commonalities could have different detection rates. Similar observations have been made previously by us and others [8, 10] and suggest that high performance depends on many factors, including the protocol components used by each different laboratory [14] as well as yet unidentified parameters. For example, the laboratory environment in which assays are performed may play an important role. Immunological knowledge, experience, training, high levels of quality control and assay validation [16–18] are likely to be the key components of good assay performance. CIP will now begin to systematically investigate the influence of lab environment, quality assurance and test validation on assay performance.

The current CIP panel was designed to compare the effects of serum supplementation or depletion on different locally established IFN-γ ELISPOT protocols, and did not include a systematic comparison of commercially available media for lymphocyte culture and testing. A number of CIP laboratories have actively participated in an independent proficiency panel organized by the Cancer Research Institute’s Cancer Vaccine Consortium (CRI-CVC), which aimed to compare 4 serum-free lymphocyte media with the optimized laboratory medium supplemented with serum [19]. Although the design of the CRI-CVC panel differed significantly from that of the study reported here, both programs reached similar results, which supports the overall conclusion on the use of serum-free test media in ELISPOT assays. Recent literature suggests that the protocols and conditions (time lines, media, serum supplement, etc.) used for preparing and freezing cellular material critically influence the cell quality and immunological function [20–22]. These variables have not yet been systematically investigated within a group applying a range of individual protocols. CIP and CRI-CVC will, therefore, collaboratively organize a systematic analysis of these variables in a representative testing group to generate a basis for the rational development of highly sensitive ELISPOT assays with low background spot induction.