Gastric carcinoma

Initiative: Oesophageal and gastric carcinoma cluster. Number of modules: 60

Biomarker diagnostics – PD-L1 expression

Clinical question

Which diagnostic test is preferred for determining PD-L1 expression in patients with metastatic adenocarcinoma of the gastro-oesophageal junction or the stomach?

Recommendation

Preferably perform the validated diagnostic test for which your own centre is certified to determine PD-L1 expression in patients with adenocarcinoma of the stomach and gastro-oesophageal junction.

Considerations

Advantages and disadvantages of the intervention and the quality of the evidence

 

One study reported overall survival as an outcome measure (Park, 2020). Several factors were included in a multivariable analysis to predict overall survival. PD-L1 expression determined with the 22C3 antibody on the Agilent autostainer (HR 2.63) and with the SP-263 antibody on the Ventana stainer (HR 2.20) were both included as factors, and no large difference was found between the two antibody and stainer combinations. Patients with PD-L1 expression in this study appeared to have a better prognosis than patients without this expression. It is difficult to interpret these data correctly.

 

One study examined diagnostic accuracy (Kim, 2021). Kim (2021) reported false positives and false negatives for the biopsy specimens, with the resection specimens serving as the gold standard. It is not clearly described which antibody and which stainer were used for the resection specimens that served as the gold standard.

 

Seven studies examined diagnostic concordance (Ahn, 2021; Dabbagh, 2021; Kim, 2021; Park, 2020; Ma, 2018; Narita, 2021; Yeong, 2022). Four studies examined overall agreement and positive and negative agreement between different antibodies on different stainers (Ahn, 2021; Dabbagh, 2021; Kim, 2021; Park, 2020). Kim (2021) compared PD-L1 expression with the 22C3 antibody on the Dako stainer against the 22C3 antibody on the Ventana stainer and the SP-263 antibody on the Ventana stainer. Agreement was highest between the 22C3 antibody on the Dako stainer and the 22C3 antibody on the Ventana stainer.

In addition, five studies examined concordance and correlation (Ahn, 2021; Ma, 2018; Park, 2020; Narita, 2021; Yeong, 2022). Ma (2018) examined concordance between PD-L1 expression with the SP-142 antibody on the Ventana stainer and the 28-8 antibody on the Leica Bond stainer, both for all cells and for tumor cells only. Concordance between these antibody and stainer combinations was highest for tumor cells.

Park (2020) examined PD-L1 expression with the 22C3 antibody on the Agilent Autostainer and the SP-263 antibody on the Ventana stainer for CPS and TPS; the correlation was higher for TPS.

Yeong (2022) examined the correlation between PD-L1 expression with the 22C3, SP-142 and 28-8 antibodies, each on the Leica Bond stainer. The correlation was highest between the 22C3 antibody and the SP-142 antibody on the Leica Bond stainer.

Ahn (2021) and Narita (2021) both examined concordance between the 22C3 antibody on the Dako stainer and the 28-8 antibody on the Dako stainer, and found high concordance at different CPS cut-offs (five and ten), with kappa values of 0.837-0.899.

 

For all outcome measures the certainty of the evidence is very low. This is due to the observational design of all studies and the risk of bias arising from unclear patient selection and test strategies. The included studies were generally conducted in Asia, which raises the question of how comparable the tests, test strategies and incidences are to the Dutch setting and, consequently, how generalisable the results are.

In addition, many different antibodies on different stainers are compared with each other, so the results of the studies cannot be pooled. The lack of a good reference standard means that the outcome measure diagnostic accuracy cannot be assessed properly. At the same time, the absence of this standard is the reason this question was posed: to see whether a clearer line can be drawn in the antibodies and stainers used to determine PD-L1 expression.

 

Values and preferences of patients (and, where applicable, their caregivers)

No research has been done into the values and preferences of patients regarding the different stainers. The working group therefore did not take this aspect into account when formulating the recommendation.

 

Costs (resource use)

The working group found no information on cost differences between the different stainers. The working group therefore did not take this aspect into account when formulating the recommendation. The working group expects that the recommendation will have no relevant impact on healthcare costs other than the costs that would be incurred anyway for the use of a stainer.

 

Acceptability, feasibility and implementation

The working group considers the recommendation acceptable to both healthcare providers and patients. Because the Dako stainer is hardly used, the working group expects that its use will not be easy to implement in the Netherlands. The recommendation is in line with current practice.

 

Rationale for the recommendation: weighing of arguments for and against the diagnostic procedure

  1. Based on the results of the CheckMate 649 study, patients with metastatic and advanced gastric adenocarcinoma, gastro-oesophageal junction adenocarcinoma and oesophageal adenocarcinoma with a PD-L1 CPS of 5 or higher are eligible for first-line treatment with immune checkpoint inhibitors in combination with chemotherapy (Janjigian, 2021). In this study the PD-L1 test was performed with the 28-8 antibody on the Dako stainer, which should be regarded as the ‘gold standard’.
  2. In the Netherlands the Ventana stainer is used most often, followed by the Leica Bond stainer. Use of the Dako stainer cannot easily be implemented in the Netherlands.
  3. The (scarce) available literature provides insufficient evidence to draw clear conclusions about the correlation in prognostic value, diagnostic accuracy, diagnostic concordance and inter-observer variability between the PD-L1 test performed with the 28-8 antibody on the Dako stainer (‘gold standard’) and the tests commonly used in the Netherlands, which are performed with various antibodies on the Ventana and Leica Bond stainers.
  4. A well-designed and adequately powered study comparing the PD-L1 test with the 28-8 antibody on the Dako stainer against the 22C3, SP-142 or SP-263 antibodies on the Ventana and Leica Bond stainers is needed to arrive at a well-substantiated policy on the use of the PD-L1 CPS test in the Netherlands.

Evidence base

Since 2022, based on the results of the CheckMate 649 study, it has been possible to treat a proportion of patients with metastatic gastric carcinoma in the first line, depending on the result of the PD-L1 test, with anti-PD-(L)1 in combination with chemotherapy or anti-CTLA-4 (Janjigian, 2021). At present there is no uniform guideline on how the PD-L1 test should be performed. In the Netherlands PD-L1 expression is tested in different ways, with different types of antibodies such as 22C3, SP-142 or SP-263. In addition, the Ventana stainer is used most often in the Netherlands, followed by the Leica Bond stainer. From a Dutch perspective the Dako stainer is regarded as the ‘gold standard’, but it is rarely if ever used in the Netherlands. The lack of uniformity can lead to unnecessary variation in care.

Ideally, patient-related outcomes would be assessed for different test-treatment strategies to identify the best combination of test and treatment for these patients. In all likelihood, studies of this type on this topic will be scarce. Therefore, diagnostic accuracy studies and diagnostic studies examining agreement between tests are also considered, to determine which test is most effective for determining PD-L1 expression.

Overall survival

Very low GRADE

The evidence is very uncertain about the effect of SP-263 antibody with the Ventana stainer compared with 22C3 antibody on Agilent autostainer for indicating PD-L1 expression on overall survival in patients with advanced gastric cancer.

 

Source: Park (2020)

 

Diagnostic accuracy

Very low GRADE

The evidence is very uncertain about the diagnostic accuracy of SP-263 antibody with Ventana stainer, 22C3 antibody with Ventana stainer and 22C3 antibody with Dako stainer for indicating PD-L1 expression in patients with advanced gastric cancer.

 

Source: Kim (2021)

 

Diagnostic concordance

Very low GRADE

The evidence is very uncertain about the diagnostic concordance between different diagnostic tests for indicating PD-L1 expression in patients with advanced (metastatic) gastric cancer.

 

Source: Ahn (2021); Dabbagh (2021); Kim (2021); Ma (2018); Narita (2021); Park (2020); Yeong (2022)

 

Inter-observer reliability

Very low GRADE

The evidence is very uncertain about the inter-observer reliability between SP-263 antibody with the Ventana stainer compared with 22C3 antibody on Agilent autostainer and the SP-142 antibody with the Ventana stainer compared with 28-8 antibody with the Leica bond stainer indicating PD-L1 expression in patients with advanced (metastatic) gastric cancer.

 

Source: Park (2020); Ma (2018)

Description of studies

To give a clear overview of the tests included in each study, the tests are presented in table 1.

 

Table 1. Overview of PD-L1 tests in included studies

| Study | Included PD-L1 expression test(s) |
|---|---|
| Ahn (2021) | 22C3 antibody with Dako stainer; 28-8 antibody with Dako stainer |
| Dabbagh (2021) | 22C3 antibody with Ventana stainer; SP-263 antibody with Ventana stainer |
| Kim (2021) | SP-263 antibody with Ventana stainer; 22C3 antibody with Ventana stainer; 22C3 antibody with Dako stainer |
| Ma (2018) | SP-142 antibody with Ventana stainer; 28-8 antibody with Leica Bond stainer; E1L3N antibody with Leica Bond stainer |
| Narita (2021) | 22C3 antibody with Dako stainer; 28-8 antibody with Dako stainer |
| Park (2020) | SP-263 antibody with Ventana stainer; 22C3 antibody with Agilent autostainer |
| Yeong (2022) | 22C3 antibody with Leica Bond stainer; SP-142 antibody with Leica Bond stainer; 28-8 antibody with Leica Bond stainer |

 

Ahn (2021) performed a retrospective analysis of a cohort of patients with advanced gastric cancer who were treated at the Samsung Medical Center, Korea, between 1997 and 2020. Primary gastric tumor specimens were assessed for PD-L1 expression using the 22C3 antibody with the Dako stainer and the 28-8 antibody with the Dako stainer. Two pathologists evaluated the slides; when evaluating the 28-8 Dako slides, they were blinded to the 22C3 Dako results.

In total, samples from 55 patients were assessed. Of the 55 patients, 33 (60%) were younger than 60 years and 65% were male. According to the Lauren classification, the tumor was intestinal in 14 patients (25%), diffuse in 33 patients (60%), mixed in five patients (10%) and indeterminate in three patients (5%).

Ahn (2021) reported overall percentage agreement, positive percentage agreement, negative percentage agreement and concordance between 22C3 Dako and 28-8 Dako at a CPS cut-off of ten.

 

Dabbagh (2021) analysed cases of gastric and gastroesophageal junction carcinoma collected from Jordanian patients who underwent a total or partial gastrectomy at King Hussein Cancer Center in Amman, Jordan, between 2010 and 2018. Cases with insufficient tumor cells were excluded. Cases were assessed for PD-L1 expression using the 22C3 antibody with the Ventana stainer and the SP-263 antibody with the Ventana stainer. Each antibody was interpreted independently by one of the three authors, with a washout interval of at least one month between readings.

In total, 96 cases were included in the study. Of the 96 patients, 43 (45%) were younger than 60 years and 59 (61%) were male. Regarding tumor stage, 23 patients (24%) had clinical stage I-II, 46 patients (48%) had stage III-IV and in 27 patients (28%) the clinical stage was not available.

Dabbagh (2021) reported overall (OPA), negative (NPA) and positive (PPA) percentage agreement between 22C3 Ventana and SP-263 Ventana at a CPS cut-off of ten.

 

Kim (2021) performed a retrospective analysis of a cohort of patients with advanced gastric cancer who underwent a gastrectomy at Asan Medical Center in Korea between 2014 and 2017. Biopsy and operative specimens were assessed for PD-L1 expression using the 22C3 antibody with the Ventana stainer, the SP-263 antibody with the Ventana stainer and the 22C3 antibody with the Dako stainer, and the results were compared. Two pathologists scored the slides independently.

The cohort consisted of 100 patients with a median age of 60 years, and 60 percent were male. The tumor location was as follows: upper region (19%), middle region (17%), lower region (50%), and entire region (14%).

Kim (2021) reported OPA, PPA and NPA between the different tests for the operative and biopsy specimens at a Combined Positive Score (CPS) cut-off of five.

Kim (2021) also reported OPA, PPA and NPA between resection and matched biopsied specimens for the different tests. False negatives and false positives for biopsy specimens were reported for the different tests where operative specimens were used as gold standard.

 

Ma (2018) performed an analysis on a consecutive series of surgically resected samples from patients with primary advanced gastric cancer at the Xijing Digestive Hospital in China, collected between 2011 and 2012. Patients were diagnosed with advanced gastric cancer (stage I-III) by pathologists based on hematoxylin and eosin staining. Patients did not receive any treatment before surgery. In total, samples were collected from 315 patients, and three consecutive sections were cut from each specimen. PD-L1 staining with the SP-142 antibody (Spring Bioscience) on the Ventana stainer, the 28-8 antibody (Abcam) on the Leica Bond stainer and the E1L3N antibody (Cell Signaling Technology) on the Leica Bond stainer was compared. Ma (2018) did not define a reference standard. Two pathologists analysed PD-L1 expression on tumor and stromal/immune cells.

The SP-142 group consisted of 122 patients with a mean age of 59.4 years. The 28-8 group consisted of 106 patients with a mean age of 59.6 years and the E1L3N group consisted of 24 patients with a mean age of 55.7 years. Tumor type was as follows: adenocarcinoma (91%), squamous carcinoma (1%), mucocellular carcinoma (2%), other types of carcinoma (6%).

Ma (2018) reported diagnostic concordance between SP-142 antibody and 28-8 antibody for total cells and tumor cells for cut-off value of 1%, 5% and 10%. Ma (2018) also reported inter-observer reliability.

 

Narita (2021) performed an analysis of samples collected from gastrectomy specimens of patients with esophagogastric cancer at Aichi Cancer Center (ACC) Hospital in Japan. Patients were diagnosed with esophagogastric adenocarcinoma (stage I-IV), underwent the gastrectomy at the ACC hospital between 2009 and 2010, had sufficient tumor content in formalin-fixed paraffin-embedded (FFPE) samples and received no systemic chemotherapy before surgery.

PD-L1 staining using the 22C3 antibody with the Dako stainer and the 28-8 antibody with the Dako stainer, were compared. Two pathologists evaluated the immunostaining.

In total, specimens from 226 patients were included. Median age was 65 years (range 32-86) and 162 patients (72%) were male. Regarding tumor stage, 100 patients (44%) had stage I, 39 patients (17%) had stage II, 58 patients (26%) stage III and 29 patients (13%) stage IV. Of the 226 patients, 87 (38%) had a diffuse tumor histology and 139 patients (62%) had intestinal tumor histology.

Narita (2021) reported diagnostic concordance between the 22C3 Dako stainer and 28-8 Dako stainer with a CPS cut off point of five and ten, using kappa scores.

 

Park (2020) performed an analysis of tissue samples from patients with stage II and III gastric cancer who underwent surgical resection at Seoul National University Bundang hospital between 2006 and 2013. PD-L1 was stained using the 22C3 antibody (Dako) with the Agilent Autostainer and the SP-263 (Ventana) with the Ventana stainer. Park (2020) did not define a reference standard.

In total, samples from 379 patients were included in the analysis. Of these, 251 patients (66.2%) were younger than 65 years and 62.8 percent were male. According to Lauren’s criteria, 35.6 percent had the intestinal subtype, 56.2 percent the diffuse subtype and 7.7 percent the mixed subtype.

Park (2020) reported overall survival, overall percentage of agreement, positive percentage of agreement and negative percentage of agreement between the 22C3 antibody and SP-263 antibody at the center of the tumor and at the invasive margin for CPS cut-off value of 10 or higher. Park (2020) also reported correlation coefficient of CPS between the assays and the interobserver variation between five pathologists for CPS of ten or higher.

 

Yeong (2022) performed an analysis of tissue samples from patients obtained via biopsy or resection of gastric cancer at the National University Hospital (NUH) in Singapore between 1997 and 2019. Samples recorded to be suitable for research and with sufficient tissue for analysis were identified by the department of pathology. Samples were developed in a tissue microarray. PD-L1 was stained using 22C3 antibody with the Leica Bond Stainer, SP142 antibody with the Leica Bond Stainer and 28-8 antibody with the Leica Bond Stainer.

In total, samples from 344 patients with a median age of 68 years were included in the analysis. Of the included patients, 85.5 percent were of Chinese ethnicity. According to Lauren’s criteria, 50.3 percent of the patients had the intestinal subtype.

Yeong (2022) reported detection rate for PD-L1 positivity at CPS cut-off of five, diagnostic concordance between different antibody assays at CPS cut-off of five and correlation between different antibody assays for CPS and TPS.

 

Results

 

Overall survival

One study reported overall survival (OS) (Park, 2020).

Park (2020) reported PD-L1 expression determined with the 22C3 Agilent stainer and with the SP-263 Ventana stainer as prognostic factors for OS. Median overall survival was more than 125 months. The hazard ratio (HR) was 2.63 (95% CI 1.26-5.48) for PD-L1 expression with the 22C3 Agilent stainer and 2.20 (95% CI 1.06-4.57) for PD-L1 expression with the SP-263 Ventana stainer.
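As a consistency check on the reported intervals, the standard error of the log hazard ratio can be recovered from each 95% confidence interval. The sketch below assumes the intervals are Wald intervals that are symmetric on the log scale, which the source does not state; the function names are illustrative and not taken from Park (2020).

```python
import math


def log_hr_standard_error(lower: float, upper: float, z: float = 1.96) -> float:
    """SE of ln(HR), assuming a Wald interval that is symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def implied_hazard_ratio(lower: float, upper: float) -> float:
    """Point estimate implied by a log-symmetric interval (geometric mean of the bounds)."""
    return math.sqrt(lower * upper)


# Hazard ratios and 95% confidence intervals as reported for Park (2020).
reported = {
    "22C3 Agilent": (2.63, 1.26, 5.48),
    "SP-263 Ventana": (2.20, 1.06, 4.57),
}

for assay, (hr, lower, upper) in reported.items():
    se = log_hr_standard_error(lower, upper)
    implied = implied_hazard_ratio(lower, upper)
    print(f"{assay}: reported HR {hr:.2f}, implied HR {implied:.2f}, SE(ln HR) {se:.3f}")
```

If the implied and reported hazard ratios agree to two decimals, the interval is at least internally consistent with a log-scale Wald construction. The two intervals overlap almost entirely, so these data alone do not separate the assays as prognostic factors.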

 

Diagnostic accuracy

False positives

One study reported false positives in biopsy specimens at a CPS cut-off of five (Kim, 2021). Kim (2021) reported four false positives (4%) with 22C3 on the Ventana stainer, no false positives with SP-263 on the Ventana stainer and one false positive (1%) with 22C3 on the Dako stainer. Resection specimens were used as the “gold standard”.

 

False negatives

One study reported false negatives in biopsy specimens at a CPS cut-off of five (Kim, 2021). Kim (2021) reported three false negatives with 22C3 on the Ventana stainer, four false negatives with SP-263 on the Ventana stainer and five false negatives with 22C3 on the Dako stainer. Resection specimens were used as the “gold standard”.

 

Diagnostic concordance

Overall agreement

Seven studies reported overall agreement (Ahn, 2021; Dabbagh, 2021; Kim, 2021; Ma, 2018; Narita, 2021; Park, 2020; Yeong, 2022).

 

Four studies reported overall percentage of agreement (Ahn, 2021; Dabbagh, 2021; Kim, 2021; Park, 2020).

Ahn (2021) reported an overall percentage of agreement of 96 percent between 22C3 Dako Stainer and 28-8 Dako Stainer at a CPS cut-off of 10.

 

Dabbagh (2021) reported overall percentage of agreement between 22C3 Ventana Stainer and SP-263 Ventana Stainer for CPS ≥ 10 of 92.5 percent.

 

Kim (2021) reported overall percentage of agreement for operation specimens and biopsy specimens for CPS with cut-off five. The overall percentage of agreement between 22C3 Dako Stainer and SP-263 Ventana Stainer was 64 percent for operation specimens. Overall percentage of agreement between 22C3 Dako Stainer and 22C3 Ventana Stainer was 88 percent for operation specimens.

Kim (2021) reported an overall percentage of agreement between 22C3 Dako Stainer and SP-263 Ventana Stainer of 62 percent for biopsy specimens. Overall percentage of agreement between 22C3 Dako Stainer and 22C3 Ventana Stainer was 85 percent for biopsy specimens.

 

Park (2020) reported overall percentage of agreement for CPS ≥ 10 at the center of the tumor and at the invasive margin. The overall percentage of agreement between 22C3 Agilent autostainer and SP-263 Ventana stainer for CPS ≥ 10 at the center of the tumor was 99.2 percent and at the invasive margin 98.7 percent.
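The three agreement measures used in these studies can all be computed from one paired 2×2 table. The sketch below is illustrative only: the counts are hypothetical and not taken from any included study, and it assumes one assay is designated as the reference, so that PPA and NPA are calculated relative to that assay's positive and negative calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairedCalls:
    """Paired positive/negative calls of a comparator assay against a reference assay."""
    both_positive: int
    reference_only_positive: int   # reference positive, comparator negative
    comparator_only_positive: int  # reference negative, comparator positive
    both_negative: int

    @property
    def total(self) -> int:
        return (self.both_positive + self.reference_only_positive
                + self.comparator_only_positive + self.both_negative)


def overall_percent_agreement(t: PairedCalls) -> float:
    """Share of all samples on which the two assays give the same call."""
    return 100 * (t.both_positive + t.both_negative) / t.total


def positive_percent_agreement(t: PairedCalls) -> float:
    """Share of reference-positive samples that the comparator also calls positive."""
    return 100 * t.both_positive / (t.both_positive + t.reference_only_positive)


def negative_percent_agreement(t: PairedCalls) -> float:
    """Share of reference-negative samples that the comparator also calls negative."""
    return 100 * t.both_negative / (t.both_negative + t.comparator_only_positive)


# Hypothetical counts for 100 paired samples at a single CPS cut-off.
example = PairedCalls(both_positive=20, reference_only_positive=5,
                      comparator_only_positive=3, both_negative=72)
print(f"OPA {overall_percent_agreement(example):.1f}%")
print(f"PPA {positive_percent_agreement(example):.1f}%")
print(f"NPA {negative_percent_agreement(example):.1f}%")
```

When few samples are positive, OPA is dominated by the negative calls. A high OPA can therefore coexist with a low PPA, which is why the three measures are reported separately.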

 

Five studies reported overall agreement (diagnostic concordance) using kappa (Ahn, 2021; Ma, 2018; Narita, 2021; Park, 2020; Yeong, 2022).

Ahn (2021) reported concordance at CPS 10 between 22C3 Dako and 28-8 Dako using kappa. The reported kappa value was 0.899.

 

Ma (2018) reported agreement between SP-142 Ventana stainer and 28-8 Leica Bond stainer for total cells and tumor cells at cut-off values of 1%, 5% and 10% (table 2).

 

Table 2. Overall agreement between SP-142 Ventana stainer and 28-8 Leica Bond stainer (Ma, 2018)

| Cut-off value | Concordance between SP-142 Ventana and 28-8 Leica Bond for total cells | Concordance between SP-142 Ventana and 28-8 Leica Bond for tumor cells |
|---|---|---|
| 1% | K=0.740 | K=0.813 |
| 5% | K=0.816 | K=0.810 |
| 10% | K=0.823 | K=0.830 |

K: Cohen’s kappa coefficient
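Cohen's kappa corrects the observed agreement between two assays for the agreement expected by chance, given each assay's marginal positivity rate. The following is a minimal sketch of the standard formula, kappa = (p_o - p_e) / (1 - p_e), applied to a hypothetical 2×2 table; the counts are not taken from Ma (2018).

```python
def cohens_kappa(table):
    """Cohen's kappa for a square contingency table (rows: assay A, columns: assay B)."""
    size = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of samples on the diagonal.
    observed = sum(table[i][i] for i in range(size)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical paired calls: [[both positive, A only], [B only, both negative]].
hypothetical = [[20, 5],
                [3, 72]]
print(f"kappa = {cohens_kappa(hypothetical):.3f}")
```

In this hypothetical table the raw agreement is 92 percent, while kappa is noticeably lower because most samples are negative on both assays and would agree by chance alone.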

 

Narita (2021) reported diagnostic concordance between 22C3 Dako stainer and 28-8 Dako stainer at CPS with a cut-off point of five and ten. The reported Kappa value for CPS five was 0.881 and the Kappa value for CPS ten was 0.837.

 

Park (2020) reported correlation between 22C3 Agilent autostainer and SP-263 Ventana stainer for CPS and TPS at the center of the tumor, at the invasive margin and overall (table 3).

 

Table 3. Correlation between 22C3 Agilent autostainer and SP-263 Ventana stainer (Park, 2020)

| | Correlation between 22C3 Agilent autostainer and SP-263 Ventana stainer for CPS | Correlation between 22C3 Agilent autostainer and SP-263 Ventana stainer for TPS |
|---|---|---|
| At the center of the tumor | K=0.916 | K=0.951 |
| At the invasive margin | K=0.912 | K=0.935 |
| Overall | Spearman ρ=0.914 | Spearman ρ=0.943 |

K: Cohen’s kappa coefficient; Spearman ρ: Spearman’s rho
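Spearman's rho measures how well two assays rank the same samples, independently of the absolute scores. The sketch below computes it as the Pearson correlation of average ranks, which handles the tied scores that are common in CPS data. The scores are hypothetical, and nothing here reproduces the analysis of Park (2020).

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return covariance / (spread_x * spread_y)


# Hypothetical CPS values for the same six samples scored with two assays.
cps_assay_a = [0, 1, 5, 10, 40, 5]
cps_assay_b = [0, 2, 4, 15, 35, 8]
print(f"Spearman rho = {spearman_rho(cps_assay_a, cps_assay_b):.3f}")
```

A high rho shows that the assays order samples similarly, but not that they agree at a given cut-off. This is why correlation and kappa are reported side by side in table 3.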

 

Yeong (2022) reported diagnostic concordance between 22C3 Leica Bond stainer, 28-8 Leica Bond stainer and SP-142 Leica Bond stainer (table 4).

 

Table 4. Overall agreement between 22C3 Leica Bond stainer and 28-8 Leica Bond stainer and SP-142 Leica Bond stainer (Yeong, 2022)

| | Concordance between 22C3 Leica Bond stainer and 28-8 Leica Bond stainer for CPS ≥ 5 | Concordance between 22C3 Leica Bond stainer and SP-142 Leica Bond stainer for CPS ≥ 5 |
|---|---|---|
| Accuracy | 73.3% | 80.8% |
| Concordance | Gwet’s K=0.598 | Gwet’s K=0.735 |

Gwet’s K: Gwet’s kappa

 

Yeong (2022) also reported correlation between different assays for CPS, TPS and immune cells (IC) (table 5).

 

Table 5. Correlation between different assays for CPS and TPS (Yeong, 2022)

| | Correlation between 22C3 Leica Bond stainer and 28-8 Leica Bond stainer | Correlation between 28-8 Leica Bond stainer and SP-142 Leica Bond stainer | Correlation between 22C3 Leica Bond stainer and SP-142 Leica Bond stainer |
|---|---|---|---|
| CPS | Spearman ρ=0.392 | Spearman ρ=0.213 | Spearman ρ=0.409 |
| TPS | Spearman ρ=0.381 | Spearman ρ=0.180 | Spearman ρ=0.417 |

Spearman ρ: Spearman’s rho

 

Negative percentage of agreement

Four studies reported negative percentage of agreement (Ahn, 2021; Dabbagh, 2021; Kim, 2021; Park, 2020).

Ahn (2021) reported negative percentage of agreement between 22C3 Dako and 28-8 Dako for CPS 10 of 100 percent.

 

Dabbagh (2021) reported negative percentage of agreement between 22C3 Ventana Stainer and SP-263 Ventana Stainer for CPS ≥ 10 of 91 percent.

 

Kim (2021) reported negative percentage of agreement for operation specimens and biopsy specimens for CPS with cut-off five (Kim, 2021). The negative percentage of agreement between 22C3 Dako Stainer and SP263 Ventana Stainer was 96.4 percent for operation specimens. Negative percentage of agreement between 22C3 Dako Stainer and 22C3 Ventana Stainer was 99 percent for operation specimens.

Kim (2021) reported negative percentage of agreement between 22C3 Dako Stainer and SP263 Ventana Stainer of 96.6 percent for biopsy specimens. Negative percentage of agreement between 22C3 Dako Stainer and 22C3 Ventana Stainer was 94.3 percent for biopsy specimens.

Park (2020) reported negative percentage of agreement for CPS and TPS ≥ 10 at the center of the tumor and at the invasive margin. The negative percentage of agreement between 22C3 Agilent autostainer and SP-263 Ventana stainer for CPS ≥ 10 at the center of the tumor was 99.7 percent and at the invasive margin 99.4 percent.

The negative percentage of agreement between 22C3 Agilent autostainer and SP-263 Ventana stainer for TPS ≥ 10 at the center of the tumor was 99.7 percent and at the invasive margin 98.6 percent. 

 

Positive percentage of agreement

Four studies reported positive percentage of agreement (Ahn, 2021; Dabbagh, 2021; Kim, 2021; Park, 2020).

Ahn (2021) reported positive percentage of agreement between 22C3 Dako and 28-8 Dako for CPS 10 of 95 percent.

 

Dabbagh (2021) reported positive percentage of agreement between 22C3 Ventana Stainer and SP-263 Ventana Stainer for CPS ≥ 10 of 96.3 percent.

 

Kim (2021) reported positive percentage of agreement for operation specimens and biopsy specimens for CPS with cut-off five. The positive percentage of agreement between 22C3 Dako Stainer and SP263 Ventana Stainer was 24.4 percent for operation specimens. Positive percentage of agreement between 22C3 Dako Stainer and 22C3 Ventana Stainer was 54.4 percent for operation specimens.

Kim (2021) reported positive percentage of agreement between 22C3 Dako Stainer and SP263 Ventana Stainer of 12.2 percent for biopsy specimens. Positive percentage of agreement between 22C3 Dako Stainer and 22C3 Ventana Stainer was 16.7 percent for biopsy specimens.

Park (2020) reported positive percentage of agreement for CPS and TPS ≥ 10 at the center of the tumor and at the invasive margin. The positive percentage of agreement between 22C3 Agilent autostainer and SP-263 Ventana stainer for CPS ≥ 10 at the center of the tumor was 94.6 percent and at the invasive margin 94.0 percent.

The positive percentage of agreement between 22C3 Agilent autostainer and SP-263 Ventana stainer for TPS ≥ 10 at the center of the tumor was 100 percent and at the invasive margin 96.7 percent.

 

Inter-observer variability

Two studies reported interobserver variability (Ma, 2018; Park, 2020).

Ma (2018) reported an inter-pathologist correlation for PD-L1 expression in tumor cells of R² = 0.9805 for the SP-142 Ventana stainer and R² = 0.9853 for the 28-8 Leica Bond stainer. For PD-L1 expression in immune/stromal cells, the inter-pathologist correlation was R² = 0.5653 for the SP-142 Ventana stainer and R² = 0.5745 for the 28-8 Leica Bond stainer.

 

Park (2020) reported interobserver variation between five pathologists for PD-L1 expression at CPS ≥ 10 as Fleiss’ kappa (K): K = 0.224 with the 22C3 Agilent autostainer and K = 0.140 with the SP-263 Ventana stainer.
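Fleiss' kappa extends chance-corrected agreement to more than two raters. The sketch below implements the standard formula for a table of per-slide category counts. The ratings are hypothetical (five raters, positive or negative at a single cut-off) and are not the data of Park (2020).

```python
def fleiss_kappa(counts):
    """Fleiss' kappa. counts[i][j] is the number of raters assigning slide i to category j.

    Every slide must be rated by the same number of raters.
    """
    n_slides = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each slide needs the same number of ratings")

    # Per-slide agreement: proportion of rater pairs that agree on that slide.
    per_slide = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    mean_agreement = sum(per_slide) / n_slides

    # Chance agreement from the overall share of ratings in each category.
    n_categories = len(counts[0])
    shares = [
        sum(row[j] for row in counts) / (n_slides * n_raters)
        for j in range(n_categories)
    ]
    chance_agreement = sum(p * p for p in shares)
    return (mean_agreement - chance_agreement) / (1 - chance_agreement)


# Hypothetical ratings: [raters calling CPS >= 10, raters calling CPS < 10] per slide.
ratings = [[5, 0], [4, 1], [3, 2], [0, 5], [1, 4], [2, 3]]
print(f"Fleiss kappa = {fleiss_kappa(ratings):.3f}")
```

Values near zero mean the raters agree no more often than chance would predict. This puts the reported values of 0.224 and 0.140 in context.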

 

Level of evidence of the literature

 

The level of evidence regarding the outcome measure overall survival was downgraded by three levels because of study limitations (-1; risk of bias because of unclear selection and unclear interpretation of index test and reference standard), applicability (-1; bias due to indirectness because the index tests are not directly compared) and number of included patients (-1; imprecision because of low sample size).

Therefore the evidence was graded as very low.

 

The level of evidence regarding the outcome measure diagnostic accuracy was downgraded by three levels because of study limitations (-1; risk of bias because of unclear reference standard results), applicability (-1; bias due to indirectness because the reference standard in the study is not corresponding with the PICRO) and number of included patients (-1; imprecision because of low sample size).

Therefore the evidence was graded as very low.

 

The level of evidence regarding the outcome measure diagnostic concordance was downgraded by three levels because of study limitations (-1; risk of bias because of unclear participant selection and interpretation of index or comparator tests), applicability (-1; bias due to indirectness because not all diagnostics tests in the studies correspond with the tests in the PICRO) and number of included patients (-1; imprecision because of low sample size).

Therefore the evidence was graded as very low.

 

The level of evidence regarding the outcome measure inter-observer reliability was downgraded by three levels because of study limitations (-1; risk of bias because of unclear participant selection and interpretation of index tests), applicability (-1; bias due to indirectness because not all diagnostics tests in the studies correspond with the tests in the PICRO) and number of included patients (-1; imprecision because of low sample size).

Therefore the evidence was graded as very low.
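The downgrading arithmetic used for all four outcome measures can be sketched as a walk down an ordered scale. The snippet assumes a starting level of "high", which the three-level downgrade to "very low" implies but the text does not state. The level names follow GRADE, while the function and variable names are illustrative.

```python
GRADE_LEVELS = ["high", "moderate", "low", "very low"]


def downgrade(start: str, deductions: dict) -> str:
    """Lower a GRADE level by the summed deductions, never going below 'very low'."""
    index = GRADE_LEVELS.index(start) + sum(deductions.values())
    return GRADE_LEVELS[min(index, len(GRADE_LEVELS) - 1)]


# Deductions as described in the text for the outcome measure overall survival.
overall_survival = {
    "study limitations (risk of bias)": 1,
    "applicability (indirectness)": 1,
    "number of included patients (imprecision)": 1,
}
# Assumption: the assessment starts at "high".
print(downgrade("high", overall_survival))
```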

A systematic review of the literature was performed to answer the following question: What is the diagnostic accuracy and concordance of the PD-L1 test with 28-8 antibody on Ventana stainer, or the SP-142 or SP-263 antibody on Ventana stainer or Leica Bond stainer, in relation to the reference PD-L1 test with 22C3 antibody on Dako stainer to detect PD-L1 expression in patients with adenocarcinoma of the stomach, gastro-oesophageal junction and oesophagus, and is there a difference in overall survival between different test strategies?

 

P: Patients with adenocarcinoma of the stomach, gastro-oesophageal junction and oesophagus

I: PD-L1 test with 28-8 antibody, SP-142 antibody or SP-263 antibody on Ventana stainer or Dako stainer

C: Comparison with any other PD-L1 expression test

R: PD-L1 test with 22C3 antibody on Dako stainer

O: Diagnostic accuracy (sensitivity, negative predictive value, specificity, positive predictive value) for Combined Positive Score (CPS) with a cut-off of five or ten or for Tumor Proportion Score (TPS) with a cut-off of one; diagnostic concordance (percentage of agreement, positive percentage of agreement, negative percentage of agreement) for CPS with a cut-off of five or ten or TPS with a cut-off of one; interobserver variability; overall survival

Timing and setting: When there is an indication for first-line systemic treatment

 

Relevant outcome measures

The guideline development group considered overall survival, sensitivity, negative predictive value and positive percentage of agreement as critical outcome measures for decision making and specificity, positive predictive value, (negative) percentage of agreement and interobserver variability as important outcome measures for decision making.

 

A priori, the working group did not define the outcome measures listed above but used the definitions used in the studies.

 

Because diagnostic accuracy is influenced by multiple factors and a gold or reference standard is lacking, the working group did not predefine a clinically relevant difference but interpreted the diagnostic accuracy and diagnostic concordance between tests in the context of the included studies.

Search and select (Methods)

The databases Medline (via OVID) and Embase (via Embase.com) were searched with relevant search terms until 29-08-2022. The detailed search strategy is depicted under the tab Methods. The systematic literature search resulted in 23 hits. Studies were selected based on the following criteria:

 

  • The study population had to meet the criteria as defined in the PICRO;
  • At least the index test or one of the comparator tests had to be as defined in the PICRO;
  • One or more outcomes had to be reported as defined in the PICRO;
  • Articles had to be written in English or Dutch.

 

Eleven studies were initially selected based on title and abstract screening. After reading the full text, four studies were excluded (see the table with reasons for exclusion under the tab Methods), and seven studies were included.

 

Results

Seven studies were included in the analysis of the literature. Important study characteristics and results are summarized in the evidence tables (see appendix). The assessment of the risk of bias is summarized in the risk of bias tables (see appendix).

  1. 1 - Ahn S, Kim KM. PD-L1 expression in gastric cancer: interchangeability of 22C3 and 28-8 pharmDx assays for responses to immunotherapy. Mod Pathol. 2021 Sep;34(9):1719-1727. doi: 10.1038/s41379-021-00823-9. Epub 2021 May 17. PMID: 34002009.
  2. 2 - Dabbagh TZ, Sughayer MA. PD-L1 Expression Harmonization in Gastric Cancer Using 22C3 PharmDx and SP263 Assays. Appl Immunohistochem Mol Morphol. 2021 Jul 1;29(6):462-466. doi: 10.1097/PAI.0000000000000902. PMID: 33480602.
  3. 3 - Kim SW, Jeong G, Ryu MH, Park YS. Comparison of PD-L1 immunohistochemical assays in advanced gastric adenocarcinomas using endoscopic biopsy and paired resected specimens. Pathology. 2021 Aug;53(5):586-594. doi: 10.1016/j.pathol.2020.10.015. Epub 2021 Feb 3. PMID: 33546812.
  4. 4 - Ma J, Li J, Qian M, Han W, Tian M, Li Z, Wang Z, He S, Wu K. PD-L1 expression and the prognostic significance in gastric cancer: a retrospective comparison of three PD-L1 antibody clones (SP142, 28-8 and E1L3N). Diagn Pathol. 2018 Nov 21;13(1):91. doi: 10.1186/s13000-018-0766-0. PMID: 30463584; PMCID: PMC6249875.
  5. 5 - Narita Y, Sasaki E, Masuishi T, Taniguchi H, Kadowaki S, Ito S, Yatabe Y, Muro K. PD-L1 immunohistochemistry comparison of 22C3 and 28-8 assays for gastric cancer. J Gastrointest Oncol. 2021 Dec;12(6):2696-2705. doi: 10.21037/jgo-21-505. PMID: 35070399; PMCID: PMC8748031.
  6. 6 - Park Y, Koh J, Na HY, Kwak Y, Lee KW, Ahn SH, Park DJ, Kim HH, Lee HS. PD-L1 Testing in Gastric Cancer by the Combined Positive Score of the 22C3 PharmDx and SP263 Assay with Clinically Relevant Cut-offs. Cancer Res Treat. 2020 Jul;52(3):661-670. doi: 10.4143/crt.2019.718. Epub 2020 Jan 10. PMID: 32019283; PMCID: PMC7373862.
  7. 7 - Janjigian YY, Shitara K, Moehler M, Garrido M, Salman P, Shen L, Wyrwicz L, Yamaguchi K, Skoczylas T, Campos Bragagnoli A, Liu T, Schenker M, Yanez P, Tehfe M, Kowalyszyn R, Karamouzis MV, Bruges R, Zander T, Pazo-Cid R, Hitre E, Feeney K, Cleary JM, Poulart V, Cullen D, Lei M, Xiao H, Kondo K, Li M, Ajani JA. First-line nivolumab plus chemotherapy versus chemotherapy alone for advanced gastric, gastro-oesophageal junction, and oesophageal adenocarcinoma (CheckMate 649): a randomised, open-label, phase 3 trial. Lancet. 2021 Jul 3;398(10294):27-40. doi: 10.1016/S0140-6736(21)00797-2. Epub 2021 Jun 5. PMID: 34102137; PMCID: PMC8436782.
  8. 8 - Yeong J, Lum HYJ, Teo CB, Tan BKJ, Chan YH, Tay RYK, Choo JR, Jeyasekharan AD, Miow QH, Loo LH, Yong WP, Sundar R. Choice of PD-L1 immunohistochemistry assay influences clinical eligibility for gastric cancer immunotherapy. Gastric Cancer. 2022 Jul;25(4):741-750. doi: 10.1007/s10120-022-01301-0. Epub 2022 Jun 4. PMID: 35661944; PMCID: PMC9226082.

Evidence table for diagnostic test accuracy studies

 

Research question: What is the diagnostic accuracy and concordance of the PD-L1 test with 28-8 antibody on Ventana Stainer versus the PD-L1 test with SP-142 or SP-263 antibody on Ventana stainer in relation to the reference PD-L1 test with 22C3 antibody on Dako stainer to detect PD-L1 expression in patients with metastatic gastric adenocarcinoma and is there a difference in overall survival between different test strategies?

Study reference

Study characteristics

Patient characteristics

 

Index test

(test of interest)

Reference test

 

Follow-up

Outcome measures and effect size

Comments

Ahn, 2021

Type of study: Retrospective cohort study

 

Setting and country: Single center study, South Korea

 

Funding and conflicts of interest: The authors declare no conflicts of interest.

 

Inclusion criteria:

- Patients with advanced gastric cancer who were treated at the Samsung Medical Center between 1997 and 2020

 

Exclusion criteria:

Not reported

 

N=55

 

Prevalence: NA

 

Age < 60 years: N=33 (60%)

Age ≥ 60 years: N=22 (40%)

 

Sex

Male: 65%

Female: 35%

 

Other important characteristics:

 

Lauren classification:

Intestinal: N=14 (25%)

Diffuse: N=33 (60%)

Mixed: N=5 (10%)

Indeterminate: N=3 (5%)

 

Tumor stage:

I: N=4 (7%)

II: N=7 (13%)

III: N=33 (60%)

IV: N=11 (20%)

 

Tissue type:

Stomach resection: N=49 (89%)

Stomach biopsy: N=2 (4%)

Peritoneal biopsy: N=4 (7%)

 

 

Index test:

PharmDx 22C3 assay on Dako autostainer

 

Cut-off point(s):

For tumor cells, positive PD-L1 staining was defined as complete and/or partial circumferential linear cellular membrane staining at any intensity. Immune cells were scored as proportion of tumor area covered with any discernible PD-L1 staining of any intensity in immune cells.

CPS was calculated by dividing the number of PD-L1-stained cells (tumor cells and immune cells) by the total number of viable tumor cells and multiplying by 100.
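
The CPS calculation described here can be sketched as a small function. This is an illustrative sketch only: the function and parameter names are hypothetical, the example counts are invented, and the cap at 100 reflects the usual scoring convention for CPS, which is treated here as an assumption because the study description does not mention it.

```python
def combined_positive_score(pos_tumor_cells, pos_immune_cells, viable_tumor_cells, cap=100.0):
    """CPS = PD-L1-stained cells (tumor + immune) / viable tumor cells x 100.

    `cap` is an assumption: by convention the score is reported as at most 100,
    although the raw ratio can exceed it when many immune cells stain.
    """
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    score = 100.0 * (pos_tumor_cells + pos_immune_cells) / viable_tumor_cells
    return min(score, cap)


# Hypothetical counts: 8 stained tumor cells and 4 stained immune cells
# among 200 viable tumor cells.
cps = combined_positive_score(8, 4, 200)
print(cps, cps >= 10)
```

In this hypothetical case the score falls below the CPS cut-off of 10 used in the study, so the sample would be classified as negative at that cut-off.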

 

Comparator test:

28-8 PharmDx assay on Dako autostainer

 

Cut-off point(s):

For tumor cells, positive PD-L1 staining was defined as complete and/or partial circumferential linear cellular membrane staining at any intensity. Immune cells were scored as proportion of tumor area covered with any discernible PD-L1 staining of any intensity in immune cells.

CPS was calculated by dividing the number of PD-L1-stained cells (tumor cells and immune cells) by the total number of viable tumor cells and multiplying by 100.

 

 

Reference test:

Not applicable

 

 

Cut-off point(s):

Not applicable

 

 

Time between the index test and reference test: Assays were performed on the same tissue blocks.

 

For how many participants were no complete outcome data available?

No missing data reported.

 

 

Outcome measures and effect size (include 95%CI and p-value if available):

 

Overall Percentage Agreement

CPS cut-off 10

96% (95%CI 87-99)

 

Negative Percentage Agreement

CPS cut-off 10

100% (95%CI 73-100)

 

Positive Percentage Agreement

CPS cut-off 10

95% (95%CI 84-99)

 

Diagnostic concordance

CPS cut-off 10

K=0.899
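
Assuming K here denotes Cohen's kappa (the table does not say so explicitly), the statistic corrects the observed agreement between two assays for the agreement expected by chance. The following is a minimal sketch for a dichotomised result (positive or negative at a given cut-off); the function name and the counts are hypothetical and are not data from the study.

```python
def cohens_kappa(both_pos, a_only, b_only, both_neg):
    """Cohen's kappa for two assays scored positive/negative on the same cases."""
    n = both_pos + a_only + b_only + both_neg
    if n == 0:
        raise ValueError("no cases")
    p_obs = (both_pos + both_neg) / n            # observed agreement
    a_pos = both_pos + a_only                    # positives on assay A
    b_pos = both_pos + b_only                    # positives on assay B
    # Agreement expected by chance from the marginal positive/negative rates.
    p_exp = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / n ** 2
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical counts, not taken from the study.
kappa = cohens_kappa(both_pos=40, a_only=5, b_only=5, both_neg=50)
print(round(kappa, 3))
```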

 

 

 

 

 

Authors conclusion: In conclusion, we conducted a comparison of PD-L1 CPS between 22C3 pharmDx and 28-8 pharmDx assays in patients with gastric cancer and showed that the two assays are highly comparable at various CPS cutoffs. This study provides evidence for the potential interchangeability of these two assays in gastric cancer.

 

 

Dabbagh, 2021

Type of study: Retrospective cohort study

 

Setting and country: Single center study, Amman, Jordan

 

Funding and conflicts of interest: The authors declare no conflicts of interest.

 

Inclusion criteria:

- Patients with gastric or gastro-esophageal junction adenocarcinoma who underwent total or partial gastrectomy at King Hussein Cancer Center between 2010 and 2018

 

Exclusion criteria:
- Cases that demonstrated a low number of TCs in all available blocks (<100 viable cells) due to neoadjuvant chemotherapy

 

N=99

 

Prevalence: NA

 

Age < 60 years: 45%

Age ≥ 60 years: 55%

 

Sex

Male: 61%

Female: 39%

 

Other important characteristics:

 

Clinical tumor stage:

I-II: N=23 (24%)

III-IV: N=46 (48%)

Not available: N=27 (28%)

 

Index test:

22C3 antibody clone on Ventana Benchmark Ultra system

 

Cut-off point(s):
Scoring of PD-L1 was based on CPS, which is defined as the ratio of the sum of PD‐L1 membrane stained TCs and membrane/cytoplasmic stained macrophages and lymphocytes in the tumor microenvironment to the total TCs present, multiplied by 100. Immunoreactivity was considered negative if the CPS was <1.

 

Comparator test:

SP-263 antibody clone on Ventana Benchmark Ultra system

 

Cut-off point(s):

Scoring of PD-L1 was based on CPS, which is defined as the ratio of the sum of PD‐L1 membrane stained TCs and membrane/cytoplasmic stained macrophages and lymphocytes in the tumor microenvironment to the total TCs present, multiplied by 100. Immunoreactivity was considered negative if the CPS was <1.

 

 

Reference test:

22C3 antibody clone on Agilent Autostainer Link 48.

 

 

Cut-off point(s):

Scoring of PD-L1 was based on CPS, which is defined as the ratio of the sum of PD‐L1 membrane stained TCs and membrane/cytoplasmic stained macrophages and lymphocytes in the tumor microenvironment to the total TCs present, multiplied by 100. Immunoreactivity was considered negative if the CPS was <1.

 

Time between the index test and reference test: Assays were performed on the same tissue blocks.

 

For how many participants were no complete outcome data available?

 

N=5 (5%)

 

Reason(s): Change in tumor field across the constructed slides.

 

Outcome measures and effect size (include 95%CI and p-value if available):

 

Overall Percentage Agreement

CPS cut-off 10

92.5%

 

Negative Percentage Agreement

CPS cut-off 10

91%

 

Positive Percentage Agreement

CPS cut-off 10

96.3%

 

Diagnostic concordance between 22C3 Ventana and SP-263 Ventana: 0.932 (95%CI 0.90-0.95)

 

 

 

 

 

Authors conclusion: In conclusion, we conducted a comparison of PD-L1 CPS between 22C3 pharmDx and 28-8 pharmDx assays in patients with gastric cancer and showed that the two assays are highly comparable at various CPS cutoffs. This study provides evidence for the potential interchangeability of these two assays in gastric cancer.

 

 

Kim, 2021

Type of study: Retrospective cohort study

 

Setting and country: Single center study, South Korea

 

Funding and conflicts of interest: The study was supported by a grant from the Asan Institute for Life Sciences. The authors state no conflicts of interest.

 

Inclusion criteria:

- Patients with advanced gastric cancer who underwent a gastrectomy between 2014 and 2017

 

Exclusion criteria:

- Patients who received neoadjuvant therapy

- Patients whose preoperative cancer tissue was not available

- Slides with positive inflammatory cells in areas of necrosis

 

N=100

 

Prevalence: NA

 

Median age in years [range]: 60 [31-82]

 

Sex

Male: 60%

Female: 40%

 

Other important characteristics:

 

Tumor location

Upper: 19%

Middle: 17%

Lower: 50%

Entire: 14%

 

WHO classification

Tubular: 55%

Poorly cohesive: 32%

Mucinous: 9%

Others: 4%

 

Index test:

Ventana PD-L1 22C3

 

Cut-off point(s):

A minimum of 100 viable tumor cells was considered adequate for assessing PD-L1 positivity. The Combined Positive Score (CPS) was used with a four-tiered scoring system, applied with the following cut-offs: <1; 1 to <5; 5 to 50; and >50.
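
The four-tiered scheme can be read as a simple binning rule. The sketch below takes the cut-offs literally as listed (so a CPS of exactly 50 falls in the third tier); the function name is hypothetical and the handling of boundary values is an assumption, since the table does not spell it out.

```python
def cps_tier(cps):
    """Assign a CPS value to one of the four tiers: <1; 1 to <5; 5 to 50; >50."""
    if cps < 0:
        raise ValueError("CPS cannot be negative")
    if cps < 1:
        return "<1"
    if cps < 5:
        return "1 to <5"
    if cps <= 50:   # assumption: exactly 50 belongs to the '5 to 50' tier
        return "5 to 50"
    return ">50"


# Hypothetical scores used only to show the boundaries.
tiers = [cps_tier(score) for score in (0.5, 1, 4.9, 5, 50, 60)]
print(tiers)
```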

 

Comparator test:

Ventana PD-L1 SP263

 

Cut-off point(s):

A minimum of 100 viable tumor cells was considered adequate for assessing PD-L1 positivity. The Combined Positive Score (CPS) was used with a four-tiered scoring system, applied with the following cut-offs: <1; 1 to <5; 5 to 50; and >50.

 

Comparator test 2:

Dako PD-L1 IHC 22C3 PharmDX (Agilent Technologies, USA)

 

 

Cut-off point(s):

A minimum of 100 viable tumor cells was considered adequate for assessing PD-L1 positivity. The Combined Positive Score (CPS) was used with a four-tiered scoring system, applied with the following cut-offs: <1; 1 to <5; 5 to 50; and >50.

 

Reference test:

Operation specimens

 

 

Cut-off point(s):

Not reported

 

 

Time between the index test and reference test: Not reported

 

For how many participants were no complete outcome data available?

N=1 (1%)

 

Reason(s): Among biopsy specimens, exclusion due to lack of tumor cells.

Outcome measures and effect size (include 95%CI and p-value if available):

 

Operation specimens – Overall Percentage Agreement

CPS cut-off 1

22C3 Dako versus SP263 Ventana: 71%

22C3 Dako versus 22C3 Ventana: 70%

 

CPS cut-off 5

22C3 Dako versus SP263 Ventana: 64%

22C3 Dako versus 22C3 Ventana: 88%

 

CPS cut-off 50

22C3 Dako versus SP263 Ventana: 84%

22C3 Dako versus 22C3 Ventana: 98%

 

Operation specimens – Positive Percentage Agreement

CPS cut-off 1

22C3 Dako versus SP263 Ventana: 58.5%

22C3 Dako versus 22C3 Ventana: 45%

 

CPS cut-off 5

22C3 Dako versus SP263 Ventana: 24.4%

22C3 Dako versus 22C3 Ventana: 54.5%

 

CPS cut-off 50

22C3 Dako versus SP263 Ventana: 11.1%

22C3 Dako versus 22C3 Ventana: 50%

 

Operation specimens – Negative Percentage Agreement

CPS cut-off 1

22C3 Dako versus SP263 Ventana: 94.3%

22C3 Dako versus 22C3 Ventana: 86.7%

 

CPS cut-off 5

22C3 Dako versus SP263 Ventana: 96.4%

22C3 Dako versus 22C3 Ventana: 92.1%

 

CPS cut-off 50

22C3 Dako versus SP263 Ventana: 100%

22C3 Dako versus 22C3 Ventana: 99%

 

Biopsy specimens – Overall Percentage Agreement

CPS cut-off 1

22C3 Dako versus SP263 Ventana: 61%

22C3 Dako versus 22C3 Ventana: 63%

 

CPS cut-off 5

22C3 Dako versus SP263 Ventana: 62%

22C3 Dako versus 22C3 Ventana: 85%

 

CPS cut-off 50

22C3 Dako versus SP263 Ventana: 88%

22C3 Dako versus 22C3 Ventana: 96%

 

Biopsy specimens – Positive Percentage Agreement

CPS cut-off 1

22C3 Dako versus SP263 Ventana: 96.4%

22C3 Dako versus 22C3 Ventana: 32.1%

 

CPS cut-off 5

22C3 Dako versus SP263 Ventana: 12.2%

22C3 Dako versus 22C3 Ventana: 16.7%

 

CPS cut-off 50

22C3 Dako versus SP263 Ventana: 0%

22C3 Dako versus 22C3 Ventana: 0%

 

Biopsy specimens – Negative Percentage Agreement

CPS cut-off 1

22C3 Dako versus SP263 Ventana: 47.2%

22C3 Dako versus 22C3 Ventana: 75%

 

CPS cut-off 5

22C3 Dako versus SP263 Ventana: 96.6%

22C3 Dako versus 22C3 Ventana: 94.3%

 

CPS cut-off 50

22C3 Dako versus SP263 Ventana: 89.8%

22C3 Dako versus 22C3 Ventana: 98%

 

Overall Percentage Agreement between resection and matched biopsied specimens

CPS cut-off 1

22C3 Ventana: 93%

SP263 Ventana: 100%

22C3 Dako: 86%

 

CPS cut-off 5

22C3 Ventana: 93%

SP263 Ventana: 96%

22C3 Dako: 90%

 

CPS cut-off 50

22C3 Ventana: 100%

SP263 Ventana: 88%

22C3 Dako: 96%

 

Positive Percentage Agreement between resection and matched biopsied specimens

CPS cut-off 1

22C3 Ventana: 88.5%

SP263 Ventana: 100%

22C3 Dako: 67.5%

 

CPS cut-off 5

22C3 Ventana: 72.7%

SP263 Ventana: 91%

22C3 Dako: 38.5%

 

CPS cut-off 50

22C3 Ventana: 100%

SP263 Ventana: 44%

22C3 Dako: 0%

 

Negative Percentage Agreement between resection and matched biopsied specimens

CPS cut-off 1

22C3 Ventana: 94.6%

SP263 Ventana: 100%

22C3 Dako: 98.3%

 

CPS cut-off 5

22C3 Ventana: 95.5%

SP263 Ventana: 100%

22C3 Dako: 97.7%

 

CPS cut-off 50

22C3 Ventana: 100%

SP263 Ventana: 97.5%

22C3 Dako: 98%

 

 

False positives

CPS cut-off 1

22C3 Ventana: 2

SP263 Ventana: 0

22C3 Dako: 0

 

CPS cut-off 5

22C3 Ventana: 4

SP263 Ventana: 0

22C3 Dako: 1

 

CPS cut-off 50

22C3 Ventana: 0

SP263 Ventana: 2

22C3 Dako: 2

 

False negatives

CPS cut-off 1

22C3 Ventana: 3

SP263 Ventana: 0

22C3 Dako: 13

 

CPS cut-off 5

22C3 Ventana: 3

SP263 Ventana: 4

22C3 Dako: 5

 

CPS cut-off 50

22C3 Ventana: 0

SP263 Ventana: 10

22C3 Dako: 2

 

 

Authors conclusion: We have demonstrated a significantly low correlation among three PD-L1 IHC assays using the SP263 and 22C3 antibody clones in AGC patients, demonstrating that they cannot be used interchangeably in clinical practice. Our data have also revealed that the PD-L1 status between endoscopic Bx and Op specimens shows the best agreement when the SP263 assay is used with a CPS1 cut-off, suggesting SP263 may provide the most representative results for the evaluation of PD-L1 status in AGC.

Because clinical trials combined each anti-PD-L1 drug with a particular PD-L1 assay, several different standardised assays exist.

 

PD-L1 expression varies within the same tumor and is usually assessed on a single tumor biopsy.

Operation specimens offer a larger area of tumor to assess and therefore give a more precise interpretation of PD-L1 expression.

Ma, 2018

Type of study: Cohort study

 

Setting and country: Single center study, China

 

Funding and conflicts of interest: Authors declare to have no competing interests. Work was supported by grants from the National Natural Science Foundation of China

 

Inclusion criteria:

- Patients diagnosed with advanced gastric cancer (stages I-III) by pathologists based on hematoxylin and eosin staining

 

Exclusion criteria:

Not reported

 

SP142: N=122

28-8: N=106

E1L3N: N=24

 

Prevalence: NA

 

Mean age in years:

SP142: 59.4

28-8: 59.6

E1L3N: 55.7

 

Male sex:

SP142: 74.6%

28-8: 76.4%

E1L3N: 66.7%

 

Tumor location – Cardia

SP142: N=33 (27.5%)

28-8: N=30 (28.6%)

E1L3N: N=6 (25%)

 

Body:

SP142: N=31 (25.8%)

28-8: N=27 (25.7%)

E1L3N: N=7 (29.2%)

 

Antrum:

SP142: N=54 (45%)

28-8: N=46 (43.8%)

E1L3N: N=10 (41.7%)

 

Upper 2/3:

SP142: N=0 (0%)

28-8: N=0 (0%)

E1L3N: N=0 (0%)

 

Whole:

SP142: N=2 (1.7%)

28-8: N=2 (1.9%)

E1L3N: N=1 (4.2%)

 

Type of tumor -

Adenocarcinoma:

SP142: N=111 (91%)

28-8: N= 96 (96%)

E1L3N: N=22 (91.7%)

 

Squamous carcinoma:

SP142: N=1 (0.8%)

28-8: N=1 (0.9%)

E1L3N: N=0 (0%)

 

Mucocellular carcinoma:

SP142: N=2 (1.6%)

28-8: N=2 (1.9%)

E1L3N: N=0 (0%)

 

Others:

SP142: N=8 (6.6%)

28-8: N=7 (6.6%)

E1L3N: N=2 (8.4%)

 

T-stage –

1a & 1b:

SP142: N=7 (5.7%)

28-8: N=7 (6.6%)

E1L3N: N=0 (0%)

 

T-Stage 2:

SP142: N=14 (11.5%)

28-8: N=13 (12.3%)

E1L3N: N=3 (12.5%)

 

T-stage 3:

SP142: N=56 (45.9%)

28-8: N=49 (46.2%)

E1L3N: N=12 (50%)

 

T-stage 4a & 4b:

SP142: N=45 (36.9%)

28-8: N=37 (34.9%)

E1L3N: N=9 (37.5%)

 

Index test:

PD-L1 staining using clone SP142 (1:100; Spring Bioscience Corp, rabbit IgG) applied using the Ventana Benchmark platform

 

 

Cut-off point(s):

Tumor cell positivity was scored as the percentage of tumor cells exhibiting membrane staining of any intensity. Immune cell positivity was scored as the proportion of tumor area, including associated intratumor and contiguous peritumor stroma, occupied by PD-L1 stained immune cells at any intensity.

Total cell positivity was scored as the percentage of positive cells, including tumor and immune/stromal cells, in all cells. Percentage of tumor cells showing positivity was recorded as less than or equal to 1, 5, 10, 25% or greater than 25%.

 

Comparator test:

PD-L1 staining using clone 28-8 (1:300; Abcam, rabbit IgG) applied using the Leica Bond platform (Bond™ Epitope Retrieval ER1 solution)

 

Cut-off point(s):

Tumor cell positivity was scored as the percentage of tumor cells exhibiting membrane staining of any intensity. Immune cell positivity was scored as the proportion of tumor area, including associated intratumor and contiguous peritumor stroma, occupied by PD-L1 stained immune cells at any intensity.

Total cell positivity was scored as the percentage of positive cells, including tumor and immune/stromal cells, in all cells. Percentage of tumor cells showing positivity was recorded as less than or equal to 1, 5, 10, 25% or greater than 25%.

 

Comparator test 2:

PD-L1 staining using clone E1L3N (1:200; Cell Signaling Technology Inc., rabbit IgG) applied using the Leica Bond platform (Bond™ Epitope Retrieval ER2 solution)

 

Cut-off point(s):

Tumor cell positivity was scored as the percentage of tumor cells exhibiting membrane staining of any intensity. Immune cell positivity was scored as the proportion of tumor area, including associated intratumor and contiguous peritumor stroma, occupied by PD-L1 stained immune cells at any intensity.

Total cell positivity was scored as the percentage of positive cells, including tumor and immune/stromal cells, in all cells. Percentage of tumor cells showing positivity was recorded as less than or equal to 1, 5, 10, 25% or greater than 25%.

 

Describe reference test: NA

 

 

 

Cut-off point(s): NA

 

 

Time between the index test and reference test: Not applicable (tissue specimens were prepared from the same surgically resected GC specimen)

 

For how many participants were no complete outcome data available: Not reported

 

Outcome measures and effect size (include 95%CI and p-value if available):

 

Positivity detection in total cells at 5% cut-off:

SP142 Ventana: 32/315 (10%)

28-8 Leica Bond: 34/315 (10.8%)

E1L3N Leica Bond: 14/315 (4.4%)

 

Positivity detection in total cells at 10% cut-off:

SP142 Ventana: 19/315 (6%)

28-8 Leica Bond: 14/315 (4%)

E1L3N Leica Bond: 8/315 (2.5%)

 

Diagnostic concordance between SP142 Ventana and 28-8 Bond for total cells at different cut-off values:

1%: K=0.740

5%: K=0.816

10%: K=0.823

 

Positivity detection in tumor cells at 5% cut-off:

SP142 Ventana: 27/315 (8.6%)

28-8 Bond: 31/315 (9.8%)

E1L3N Bond: 14/315 (4.4%)

 

Positivity detection in tumor cells at 10% cut-off:

SP142 Ventana: 16/315 (5%)

28-8 Bond: 15/315 (4.7%)

E1L3N Bond: 9/315 (2.8%)

 

Diagnostic concordance between SP142 and 28-8 for tumor cells at different cut-off values:

1%: K=0.813

5%: K=0.810

10%: K=0.830

 

Positivity detection in immune/stromal cells at 1% cut-off:

SP142 Ventana: 58/315 (18.41%)

28-8 Bond: 24/315 (7.62%)

E1L3N Bond: 1/315 (0.3%)

 

Inter-pathologist correlation for tumor cells:

SP142 Ventana: R² = 0.9805

28-8 Leica Bond: R² = 0.9853

 

Inter-pathologist correlation for immune/stromal cells:

SP142 Ventana: R² = 0.5653

28-8 Leica Bond: R² = 0.5745

 

Authors conclusion: A highly concordant result was observed for clones SP142 and 28-8 in tumor cells, particularly when we used a 5% cut-off value, whereas clone SP142 showed a distinctive advantage in staining immune/stromal cells.

Narita, 2021

Type of study: Retrospective cohort study

 

Setting and country: Single center study, Japan

 

Funding and conflicts of interest: The authors reported various grants and personal fees from Ono Pharma, Bristol-Myers Squibb, AstraZeneca, Eli Lilly, Yakult Honsha, Daiichi Sankyo, Taiho, MSD, Novartis, Takeda, Chugai Merck Bio Pharma, Bayer, Lilly Japan, Sanofi, Dainippon Sumitomo Pharma, Array BioPharma, MSD Oncology, Sysmex, Medical & Biological Laboratories, Mitsubishi Tanabe Pharma, Nippon Kayaku, Merck Sharp & Dohme, ArcherDx, Pfizer, Roche/Ventana, Agilent/Dako, Thermo Fisher Scientific, Solasia Pharma, Parexel International.

 

Inclusion criteria:

- Patients with histological diagnosis of esophagogastric adenocarcinoma (stage I-IV)

- Underwent gastrectomy at a single institution from 2009 to 2010

- Sufficient tumor content in formalin-fixed paraffin-embedded samples collected from gastrectomy specimens

- No systemic chemotherapy before surgery

 

Exclusion criteria:

Not reported

 

N=226

 

Prevalence: NA

 

Age median (range): 65 years (32-86)

 

Sex, male: N=162 (72%)

 

Tumor location:

Esophagogastric junction: N=12 (5%)

Upper third: N=47 (21%)

Middle third: N=85 (38%)

Lower third: N=82 (36%)

 

TNM stage:

I: N=100 (44%)

II: N=39 (17%)

III: N=58 (26%)

IV: N=29 (13%)

 

Index test:

PD-L1 staining using 22C3 antibody with the Dako stainer

 

Cut-off point(s):

PD-L1 positivity was evaluated by the CPS, which was defined as the number of PD-L1 stained cells as a proportion of the total number of tumor cells, multiplied by 100.

 

Comparator test:

PD-L1 staining using 28-8 antibody with the Dako stainer

 

Cut-off point(s):

PD-L1 positivity was evaluated by the CPS, which was defined as the number of PD-L1 stained cells as a proportion of the total number of tumor cells, multiplied by 100.

 

Reference test: NA

 

 

 

Cut-off point(s): NA

 

 

 

 

 

Time between the index test and reference test: Not applicable (tissue specimens were prepared from the same surgically resected GC specimen)

 

For how many participants were no complete outcome data available: Not reported

Outcome measures and effect size (include 95%CI and p-value if available):

 

Diagnostic concordance

CPS 5:

Kappa: 0.881

 

Diagnostic concordance

CPS 10:

Kappa: 0.837

 

Authors conclusion: In conclusion, our study demonstrated that the PD-L1 CPS in gastric cancer patients is highly concordant between the 22C3 and 28-8 PharmDx assay using various CPS cutoffs. This study suggests the potential interchangeability of these two assays to determine PD-L1 expression levels in gastric cancer patients.

 

 

 

Park, 2020

Type of study: Retrospective cohort study

 

Setting and country: Single center study, Korea

 

Funding and conflicts of interest: This study was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education. Conflict of interest relevant to this article was not reported.

 

Inclusion criteria:

- Patients with stage II and III gastric cancer who underwent surgical resection at Seoul National University Bundang Hospital between 2006 and 2013.

 

Exclusion criteria:

Not reported

 

N=379

 

Prevalence: NA

 

Age (in years):

< 65: N=251 (66.2%)

≥ 65: N=128 (33.8%)

 

Sex, male: N=238 (62.8%)

 

Tumor stage (pTNM):

Stage II: N=180 (47.5%)

Stage III: N=199 (52.5%)

 

Tumor location:

Upper third: N=96 (25.3%)

Middle third: N=80 (21.1%)

Lower third: N=182 (48.1%)

GEJ: N=5 (1.3%)

Entire: N=16 (4.2%)

 

Lauren classification:

Intestinal: N=135 (35.6%)

Diffuse: N=213 (56.2%)

Mixed: N=29 (7.7%)

Indeterminate: N=2 (0.5%)

 

Index test:

PD-L1 staining using 22C3 antibody (Dako) with the Agilent Autostainer Link

 

Cut-off point(s):

CPS was defined as the total number of tumor cells and immune cells (including lymphocytes and macrophages) stained with PD-L1 divided by the number of all viable tumor cells, then multiplied by 100.

Each countable array core section contained at least 100 viable GC cells.

 

Comparator test:

PD-L1 staining using the SP263 antibody (Ventana) with the Ventana stainer

 

Cut-off point(s):

CPS was defined as the total number of tumor cells and immune cells (including lymphocytes and macrophages) stained with PD-L1 divided by the number of all viable tumor cells, then multiplied by 100.

Each countable array core section contained at least 100 viable GC cells.

 

Reference test: NA

 

 

 

Cut-off point(s): NA

 

 

 

 

 

Time between the index test and reference test: Not applicable (tissue specimens were prepared from the same surgically resected GC specimen)

 

For how many participants were no complete outcome data available: Not reported

Outcome measures and effect size (include 95%CI and p-value if available):

 

Outcome measures for CPS

Number of PD-L1 positive cases for CPS ≥ 10 at center of tumor:
22C3 Agilent: 37 (9.8%)
SP263 Ventana: 36 (9.5%)

 

Number of PD-L1 positive cases for CPS ≥ 10 at invasive margin:
22C3 Agilent: 50 (13.2%)
SP263 Ventana: 49 (12.9%)

 

Number of PD-L1 negative cases for CPS ≥ 10 at center of tumor:
22C3 Agilent: 342 (90.2%)
SP263 Ventana: 343 (90.5%)

 

Number of PD-L1 negative cases for CPS ≥ 10 at invasive margin:
22C3 Agilent: 329 (86.8%)
SP263 Ventana: 330 (87.1%)

 

Overall percentage of agreement between 22C3 and SP263 for CPS ≥ 10:

At center of tumor: 99.2% (lower 95% CI 97.7)
At invasive margin: 98.7% (lower 95% CI 97.0)

 

Positive percentage of agreement between 22C3 and SP263 for CPS ≥ 10:

At center of tumor: 94.6% (lower 95% CI 81.8)
At invasive margin: 94.0% (lower 95% CI 83.5)

 

Negative percentage of agreement between 22C3 and SP263 for CPS ≥ 10:

At center of tumor: 99.7% (lower 95% CI 98.4)
At invasive margin: 99.4% (lower 95% CI 97.8)

 

Correlation coefficient of CPS between the two assays at center of tumor: 0.916

 

Correlation coefficient of CPS between the two assays at invasive margin: 0.912

 

Spearman correlation between 22C3 Agilent and SP263 Ventana for CPS:

Spearman ρ: 0.914

 

Outcome measures for TPS

Number of PD-L1 positive cases for TPS ≥ 10 at center of tumor:
22C3 Agilent: 28 (7.4%)
SP263 Ventana: 29 (7.7%)

 

Number of PD-L1 positive cases for TPS ≥ 10 at invasive margin:
22C3 Agilent: 30 (7.9%)
SP263 Ventana: 34 (9.0%)

 

Number of PD-L1 negative cases for TPS ≥ 10 at center of tumor:
22C3 Agilent: 351 (92.6%)
SP263 Ventana: 350 (92.3%)

 

Number of PD-L1 negative cases for TPS ≥ 10 at invasive margin:
22C3 Agilent: 349 (92.1%)
SP263 Ventana: 345 (91%)

 

Overall percentage of agreement between 22C3 and SP263 for TPS ≥ 10:

At center of tumor: 99.7% (lower 95% CI 98.5)
At invasive margin: 98.4% (lower 95% CI 96.6)

 

Positive percentage of agreement between 22C3 and SP263 for TPS ≥ 10:

At center of tumor: 100% (lower 95% CI 87.7)
At invasive margin: 96.7% (lower 95% CI 82.8)

 

Negative percentage of agreement between 22C3 and SP263 for TPS ≥ 10:

At center of tumor: 99.7% (lower 95% CI 98.4)
At invasive margin: 98.6% (lower 95% CI 96.7)

 

Correlation coefficient for TPS between 22C3 and SP263 assays at center of tumor: 0.951

 

Correlation coefficient for TPS between 22C3 and SP263 assays at invasive margin: 0.935

 

Spearman correlation between 22C3 Agilent and SP263 Ventana for TPS:

Spearman ρ: 0.943

 

Interobserver variation between five pathologists for CPS ≥ 10: Fleiss’ Kappa (lower 95% CI)

22C3: 0.224 (8.0)

SP263: 0.140 (2.4)

 

Authors conclusion: In conclusion, the 22C3 pharmDx and SP263 assay showed high agreement for the same GC specimens, but expression heterogeneity and interobserver variability were also found to be higher than assay variability. In addition, the higher cut-off value with the CPS method resulted in greater interchangeability between the 22C3 pharmDx and SP263 assay.

 

 

 

Yeong, 2022

Type of study: Cross-sectional analysis of retrospective cohort

 

Setting and country: Single center study, Singapore

 

Funding and conflicts of interest: Authors received honoraria or grants from Bristol-Myers Squibb, Lilly, Roche, Taiho, AstraZeneca, Merck, Eisai, Bayer, Novartis, Paxman Coolers, MSD. One author had advisory activity with Abbvie, Amgen, AstraZeneca, BMS, Ipsen and Novartis and one author is speaker with Bayer, Eisai, Lilly, Sanofi and Taiho.

Study is supported by the National Medical Research Council and partially funded by the National Medical Research Council Open Fund-Large Collaborative Grant

Inclusion criteria:

- Archival formalin-fixed paraffin-embedded (FFPE) tissue samples from patients obtained via biopsy or resection of gastric cancer at the National University Hospital (NUH) in Singapore between 1997 and 2019

- Samples were developed in a tissue microarray (TMA)

- Samples consisted of both surgical resection specimens and biopsies

- Cases recorded to be suitable for research and with sufficient tissue for analysis were identified by the department of Pathology of the NUH Singapore

 

Exclusion criteria:

Not reported

 

N=344

 

Prevalence: NA

 

Median age in years (IQR): 68 (16.25)

 

Sex: 68.6% Male

 

Chinese ethnicity: 85.5%

 

Subtype (Lauren classification) – intestinal: N=173/344 (50.3%)

 

Describe index test:

PD-L1 staining using 22C3 antibody on Leica Bond stainer

 

Cut-off point(s):

Combined Positive Score (CPS) was calculated as the number of PD-L1 staining tumor and immune cells divided by the total viable tumor cells multiplied by 100.

Tumor Proportion Score (TPS) was calculated as the percentage of tumor cells showing staining relative to all tumor cells present in the sample.

Immune Cells (IC) score was calculated as the proportion of tumor area occupied by PD-L1-positive tumor-infiltrating immune cells.

PD-L1 scoring and analysis were performed by a pathologist following a detailed protocol.
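The CPS and TPS definitions above can be written out as a short calculation. This is a minimal sketch, assuming cell counts are available per sample; the function names and the example counts are hypothetical, and the cap of CPS at 100 follows the usual scoring convention rather than anything stated in this table.

```python
def combined_positive_score(pos_tumor_cells, pos_immune_cells, viable_tumor_cells):
    """CPS: PD-L1 staining tumor and immune cells / viable tumor cells x 100."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    score = 100.0 * (pos_tumor_cells + pos_immune_cells) / viable_tumor_cells
    # Assumption: CPS is conventionally capped at 100 (not stated in the table).
    return min(score, 100.0)


def tumor_proportion_score(pos_tumor_cells, viable_tumor_cells):
    """TPS: percentage of tumor cells showing PD-L1 staining."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    return 100.0 * pos_tumor_cells / viable_tumor_cells


def is_positive(score, cutoff):
    """A sample is called positive when its score reaches the cut-off."""
    return score >= cutoff


# Hypothetical sample: 3 staining tumor cells, 4 staining immune cells,
# 100 viable tumor cells.
cps = combined_positive_score(3, 4, 100)
tps = tumor_proportion_score(3, 100)
positive_at_cps5 = is_positive(cps, 5)
```

In this hypothetical sample the immune cells lift the CPS above the cut-off of 5 while the TPS stays below it, which illustrates why the two scores are not interchangeable.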

 

 

Comparator test:

PD-L1 staining using SP142 antibody on Leica Bond stainer

 

Cut-off point(s):

Combined Positive Score (CPS) was calculated as the number of PD-L1 staining tumor and immune cells divided by the total viable tumor cells multiplied by 100.

Tumor Proportion Score (TPS) was calculated as the percentage of tumor cells showing staining relative to all tumor cells present in the sample.

Immune Cells (IC) score was calculated as the proportion of tumor area occupied by PD-L1-positive tumor-infiltrating immune cells.

PD-L1 scoring and analysis were performed by a pathologist following a detailed protocol.

 

Comparator test 2:

PD-L1 staining using 28-8 antibody on Leica Bond stainer

 

Cut-off point(s):

Combined Positive Score (CPS) was calculated as the number of PD-L1 staining tumor and immune cells divided by the total viable tumor cells multiplied by 100.

Tumor Proportion Score (TPS) was calculated as the percentage of tumor cells showing staining relative to all tumor cells present in the sample.

Immune Cells (IC) score was calculated as the proportion of tumor area occupied by PD-L1-positive tumor-infiltrating immune cells.

PD-L1 scoring and analysis were performed by a pathologist following a detailed protocol.

 

Describe reference test: NA

 

 

 

 

 

Time between the index test and reference test: Not reported

 

For how many participants were no complete outcome data available: Not reported

Outcome measures and effect size (include 95%CI and p-value if available):

 

Proportion of positivity of PD-L1 at CPS cut-off of 5:

22C3: N=46 (13.4%)

28-8: N=100 (29.1%)

SP142: N=68 (19.8%)

 

Diagnostic concordance between 22C3 and 28-8 assays at CPS cut-off 5:

Accuracy: 73.3%

K: 0.598

 

 Diagnostic concordance between 22C3 and SP142 assays at CPS cut-off 5:

Accuracy: 80.8%

K: 0.735
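For orientation, overall agreement ("accuracy") and a chance-corrected agreement statistic can be derived from a 2x2 cross-tabulation of two assays at one cut-off. The sketch below computes overall percent agreement and Cohen's kappa; it assumes that K denotes a kappa-type statistic, which the table does not specify. The counts are hypothetical and are not the study's data.

```python
def agreement_stats(both_pos, a_only, b_only, both_neg):
    """Overall agreement and Cohen's kappa for two binary assay results.

    both_pos: positive on assay A and assay B
    a_only:   positive on A only
    b_only:   positive on B only
    both_neg: negative on both
    """
    n = both_pos + a_only + b_only + both_neg
    if n == 0:
        raise ValueError("no samples")
    observed = (both_pos + both_neg) / n
    a_pos, b_pos = both_pos + a_only, both_pos + b_only
    a_neg, b_neg = n - a_pos, n - b_pos
    # Agreement expected by chance from the marginal positivity rates.
    expected = (a_pos * b_pos + a_neg * b_neg) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical cross-tabulation of 100 samples at a single cut-off.
overall, kappa = agreement_stats(both_pos=40, a_only=10, b_only=10, both_neg=40)
```

Kappa is lower than raw agreement because it discounts the agreement two assays would reach by chance given their positivity rates, so assays with very different positivity rates can show high accuracy but modest kappa.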

 

Correlation between 22C3 and 28-8 assays:

CPS: 0.392

TPS: 0.381

IC: 0.230

 

Correlation between 28-8 and SP142 assays:

CPS: 0.213

TPS: 0.180

IC: 0.133

 

Correlation between 22C3 and SP142:

CPS: 0.409

TPS: 0.417

IC: 0.347

Authors' conclusion: The percentage of PD-L1 positive samples at various CPS cut-offs for the 28-8 assay was approximately twofold higher than that of the 22C3 assay, with only moderate concordance between 22C3 and 28-8 assays at CPS ≥ 5. These findings do not support the interchangeability of the assays for determining PD-L1 status of gastric adenocarcinoma at the clinically relevant CPS cut-off of ≥ 5.

 

a  Using operative specimen as gold standard

 

Risk of bias assessment diagnostic accuracy studies (QUADAS II, 2011)

 

Research question: What is the diagnostic accuracy and concordance of the PD-L1 test with 28-8 antibody on Ventana Stainer versus the PD-L1 test with SP-142 or SP-263 antibody on Ventana stainer in relation to the reference PD-L1 test with 22C3 antibody on Dako stainer to detect PD-L1 expression in patients with metastatic gastric adenocarcinoma and is there a difference in overall survival between different test strategies?

Study reference

Patient selection

 

 

Index test

Reference standard

Flow and timing

Comments with respect to applicability

Ahn, 2021

Was a consecutive or random sample of patients enrolled?

Yes

 

Was a case-control design avoided?

Yes

 

Did the study avoid inappropriate exclusions?

Unclear, no exclusion criteria

 

 

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

 

If a threshold was used, was it pre-specified?

Yes

 

 

 

Is the reference standard likely to correctly classify the target condition?

No, no reference standard available, ‘reference/comparator test’ was 22C3 Dako

 

Were the reference standard results interpreted without knowledge of the results of the index test?

Yes

 

 

 

Was there an appropriate interval between index test(s) and reference standard?

Tests (assays) were performed on the same tissue.

 

Did all patients receive a reference standard?

Yes

 

Did patients receive the same reference standard?

Yes

 

Were all patients included in the analysis?

Yes

 

Are there concerns that the included patients do not match the review question?

No

 

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

No

 

Are there concerns that the target condition as defined by the reference standard does not match the review question?

No

 

 

CONCLUSION:

Could the selection of patients have introduced bias?

 

 

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

 

RISK: LOW

 

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

 

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

 

 

RISK: LOW

 

Dabbagh, 2021

Was a consecutive or random sample of patients enrolled?

Yes

 

Was a case-control design avoided?

Yes

 

Did the study avoid inappropriate exclusions?

Unclear, no clear definition of in- and exclusion criteria.

 

 

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

 

If a threshold was used, was it pre-specified?

Unclear

 

 

 

Is the reference standard likely to correctly classify the target condition?

No, no reference standard available

 

Were the reference standard results interpreted without knowledge of the results of the index test?

Yes

 

 

 

Was there an appropriate interval between index test(s) and reference standard?

Tests (assays) were performed on the same tissue.

 

Did all patients receive a reference standard?

Yes

 

Did patients receive the same reference standard?

Yes

 

Were all patients included in the analysis?

Yes

 

Are there concerns that the included patients do not match the review question?

No

 

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

No

 

Are there concerns that the target condition as defined by the reference standard does not match the review question?

No

 

 

CONCLUSION:

Could the selection of patients have introduced bias?

 

 

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

 

RISK: UNCLEAR

 

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

 

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

 

 

RISK: LOW

 

Kim, 2021

Was a consecutive or random sample of patients enrolled?

Yes

 

Was a case-control design avoided?

Yes

 

Did the study avoid inappropriate exclusions?

Yes

 

 

Were the index test results interpreted without knowledge of the results of the reference standard?

Unclear

 

If a threshold was used, was it pre-specified?

Yes

 

 

 

Is the reference standard likely to correctly classify the target condition?

Unclear, Kim (2021) used the operative specimens as gold standard but did not report which antibody was used to stain the operative specimens.

 

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

 

 

 

Was there an appropriate interval between index test(s) and reference standard?

Unclear

 

Did all patients receive a reference standard?

Yes

 

Did patients receive the same reference standard?

Yes

 

Were all patients included in the analysis?

Yes

 

Are there concerns that the included patients do not match the review question?

No

 

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Yes, this study does not include PD-L1 test with SP-142 antibody on Ventana stainer

 

Are there concerns that the target condition as defined by the reference standard does not match the review question?

No

 

CONCLUSION:

Could the selection of patients have introduced bias?

 

 

RISK: LOW

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

 

RISK: UNCLEAR

 

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

 

RISK: HIGH

CONCLUSION

Could the patient flow have introduced bias?

 

 

RISK: UNCLEAR

 

Ma, 2018

Was a consecutive or random sample of patients enrolled?

Yes

 

Was a case-control design avoided?

Yes

 

Did the study avoid inappropriate exclusions?

Unclear, no clear in- and exclusion criteria

 

 

Were the index test results interpreted without knowledge of the results of the reference standard?

Unclear, three pathologists reviewed all slides

 

If a threshold was used, was it pre-specified?

Yes

 

 

 

Is the reference standard likely to correctly classify the target condition?

No, no reference standard available

 

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

 

 

 

Was there an appropriate interval between index test(s) and reference standard?

Not applicable, tissue specimens were prepared from the same surgically resected GC specimen

 

Did all patients receive a reference standard?

No

 

Did patients receive the same reference standard?

No

 

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Yes, small percentage of patients included with other types of carcinoma than adenocarcinoma

 

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Yes, it does not include the PD-L1 test with 22C3 antibody on Dako stainer

 

Are there concerns that the target condition as defined by the reference standard does not match the review question?

No

 

 

CONCLUSION:

Could the selection of patients have introduced bias?

 

 

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

 

RISK: UNCLEAR

 

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

 

RISK: UNCLEAR

CONCLUSION

Could the patient flow have introduced bias?

 

 

RISK: LOW

 

Narita, 2021

Was a consecutive or random sample of patients enrolled?

Yes

 

Was a case-control design avoided?

Yes

 

Did the study avoid inappropriate exclusions?

Yes

 

 

Were the index test results interpreted without knowledge of the results of the reference standard?

Unclear

 

If a threshold was used, was it pre-specified?

Yes

 

 

 

Is the reference standard likely to correctly classify the target condition?

No, no reference standard available

 

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

 

 

 

Was there an appropriate interval between index test(s) and reference standard?

Not applicable, tissue specimens were prepared from the same surgically resected specimen

 

Did all patients receive a reference standard?

Yes

 

Did patients receive the same reference standard?

Yes

 

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

No

 

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

No

 

Are there concerns that the target condition as defined by the reference standard does not match the review question?

No

 

 

CONCLUSION:

Could the selection of patients have introduced bias?

 

 

RISK: LOW

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

 

RISK: UNCLEAR

 

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

 

RISK: UNCLEAR

CONCLUSION

Could the patient flow have introduced bias?

 

 

RISK: LOW

 

Park, 2020

Was a consecutive or random sample of patients enrolled?

Yes

 

Was a case-control design avoided?

Yes

 

Did the study avoid inappropriate exclusions?

Unclear, no clear in- and exclusion criteria were reported

 

Were the index test results interpreted without knowledge of the results of the reference standard?

Unclear, two pathologists reviewed slides.

 

If a threshold was used, was it pre-specified?

Yes

 

 

Is the reference standard likely to correctly classify the target condition?

No, no reference standard available

 

Were the reference standard results interpreted without knowledge of the results of the index test?

Not applicable

 

 

Was there an appropriate interval between index test(s) and reference standard?

Not applicable, tissue specimens (TMA slides) were prepared from the same surgically resected GC specimen

 

Did all patients receive a reference standard?

No

 

Did patients receive the same reference standard?

No

 

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Yes, not clear how many specimens of patients were included with other types of carcinoma than adenocarcinoma.

 

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Yes, the PD-L1 test with 22C3 antibody was performed on the Agilent Autostainer

 

Are there concerns that the target condition as defined by the reference standard does not match the review question?

No

 

 

CONCLUSION:

Could the selection of patients have introduced bias?

 

 

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

 

RISK: UNCLEAR

 

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

 

RISK: UNCLEAR

CONCLUSION

Could the patient flow have introduced bias?

 

 

RISK: LOW

 

Yeong, 2022

Was a consecutive or random sample of patients enrolled?

No

 

Was a case-control design avoided?

Yes

 

Did the study avoid inappropriate exclusions?

Unclear, no clear exclusion criteria were reported

 

 

Were the index test results interpreted without knowledge of the results of the reference standard?

Unclear

 

If a threshold was used, was it pre-specified?

Yes

 

 

 

Is the reference standard likely to correctly classify the target condition?

Unclear

 

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

 

 

 

Was there an appropriate interval between index test(s) and reference standard?

Unclear

 

Did all patients receive a reference standard?

No

 

Did patients receive the same reference standard?

No, all patients received the three tests

 

Were all patients included in the analysis?

Yes

 

Are there concerns that the included patients do not match the review question?

No

 

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

No

 

Are there concerns that the target condition as defined by the reference standard does not match the review question?

No

 

 

CONCLUSION:

Could the selection of patients have introduced bias?

 

 

RISK: LOW

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

 

RISK: UNCLEAR

 

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

 

RISK: UNCLEAR

CONCLUSION

Could the patient flow have introduced bias?

 

 

RISK: UNCLEAR

 

 

Authorization date and validity

Last reviewed: 01-07-2021

Last authorized: 01-10-2023

Planned reassessment: 01-11-2023

Initiative and authorization

Initiative:
  • Cluster Oesofagus- en maagcarcinoom
Authorized by:
  • Nederlandse Internisten Vereniging
  • Nederlandse Vereniging van Maag-Darm-Leverartsen
  • Nederlandse Vereniging voor Heelkunde
  • Nederlandse Vereniging voor Keel-Neus-Oorheelkunde en Heelkunde van het Hoofd-Halsgebied
  • Nederlandse Vereniging voor Nucleaire geneeskunde
  • Nederlandse Vereniging voor Pathologie
  • Nederlandse Vereniging voor Radiologie
  • Nederlandse Vereniging voor Radiotherapie en Oncologie
  • Nederlandse Vereniging van Ziekenhuisapothekers
  • Stichting voor Patiënten met Kanker aan het Spijsverteringskanaal

General information

The development/revision of this guideline module was supported by the Kennisinstituut van de Federatie Medisch Specialisten (www.demedischspecialist.nl/kennisinstituut) and was funded by the Kwaliteitsgelden Medisch Specialisten (SKMS). The funder had no influence whatsoever on the content of the guideline module.

Composition of the working group

For the development of the guideline modules, a multidisciplinary cluster was established in 2021, consisting of representatives of all relevant specialties (see the composition of the cluster) involved in the care of patients with oesophageal and gastric carcinoma.

 

Cluster steering group

  • Mr. Prof. Dr. P.D. (Peter) Siersema (chair), gastroenterologist, Erasmus MC, Rotterdam; NVMDL
  • Ms. Dr. A. (Annemarieke) Bartels – Rutten, radiologist, NKI-AVL, Amsterdam; NVvR
  • Mr. Prof. Dr. M.I. (Mark) van Berge Henegouwen, surgeon, Amsterdam UMC, Amsterdam; NVvH
  • Mr. Prof. Dr. R. (Richard) van Hillegersberg, surgeon, UMC Utrecht, Utrecht; NVvH
  • Mr. Dr. M.C.C.M. (Maarten) Hulshof, radiation oncologist, Amsterdam UMC, Amsterdam; NVRO
  • Ms. Dr. H.W.M. (Hanneke) van Laarhoven, internist, Amsterdam UMC, Amsterdam; NIV
  • Ms. Dr. E.M. (Liesbeth) Timmermans, board member of the Stichting voor Patiënten met Kanker aan het Spijsverteringskanaal; SPKS (until 1 December 2022)
  • Mr. Dr. E. (Erik) Vegt, nuclear medicine physician, Erasmus MC, Rotterdam; NVNG

Cluster expert group

  • Mr. W.W. (Weibel) Braunius, MD, otorhinolaryngologist, UMC Utrecht, Utrecht; NVKNO
  • Mr. Dr. M.J. (Marc) van Det, surgeon, Ziekenhuisgroep Twente; NVvH
  • Ms. Dr. S.S. (Suzanne) Gisbertz, surgeon, Amsterdam UMC, Amsterdam; NVvH
  • Ms. Dr. N.C.T. (Nicole) van Grieken, pathologist, Amsterdam UMC, Amsterdam; NVVP
  • Mr. Dr. R. (Ronald) Hoekstra, internist, Ziekenhuisgroep Twente; NIV
  • Mr. R. (Remco) Huiszoon MBA, patient expert by experience, Stichting voor Patiënten met kanker aan het Spijsverteringskanaal; SPKS
  • Mr. Dr. P.M. (Paul) Jeene, radiation oncologist, Radiotherapiegroep; NVRO
  • Mr. Dr. S.M. (Sjoerd) Lagarde, surgeon, Erasmus MC, Rotterdam; NVvH
  • Mr. Dr. R.W.F. (Roelof) van Leeuwen, hospital pharmacist, Erasmus MC, Rotterdam; NVZA
  • Mr. Dr. S.L. (Sybren) Meijer, pathologist, Amsterdam UMC, Amsterdam; NVVP
  • Ms. Dr. B. (Bianca) Mostert, internist, Erasmus MC, Rotterdam; NIV
  • Ms. Dr. C.T. (Kristel) Muijs, radiation oncologist, UMCG, Groningen; NVRO
  • Ms. Dr. R.E. (Roos) Pouw, gastroenterologist, Amsterdam UMC, Amsterdam; NVMDL
  • Ms. Dr. L. (Luidmila) Peppelenbosch – Kodach, pathologist, NKI-AVL, Amsterdam; NVVP
  • Ms. H. (Heidi) Rütten, MD, radiation oncologist, Radboud UMC, Nijmegen; NVRO
  • Ms. Dr. M. (Marije) Slingerland, internist, LUMC, Leiden; NIV
  • Ms. Prof. Dr. V.M.C.W. (Manon) Spaander, gastroenterologist, Erasmus MC, Rotterdam; NVMDL

With support from:

  • Ms. Dr. C. (Charlotte) Gaasterland, adviser, Kennisinstituut van de Federatie Medisch Specialisten
  • Ms. S.N. (Sarah) van Duijn MSc, junior adviser, Kennisinstituut van de Federatie Medisch Specialisten
  • Ms. M. (Miriam) te Lintel Hekkert MSc, junior adviser, Kennisinstituut van de Federatie Medisch Specialisten

Declarations of interest

The Code for the prevention of improper influence due to conflicts of interest (Code ter voorkoming van oneigenlijke beïnvloeding door belangenverstrengeling) was followed. All cluster members declared in writing whether, in the past three years, they had had direct financial interests (employment with a commercial company, personal financial interests, research funding) or indirect interests (personal relationships, reputation management). During the development or revision of a module, changes in interests are reported to the chair. The declaration of interests is reconfirmed during the commentary phase.

An overview of the cluster members' interests and the assessment of how any interests were handled can be found in the table below. The signed declarations of interest can be requested from the secretariat of the Kennisinstituut van de Federatie Medisch Specialisten.

 

Cluster steering group

Cluster member

Position

Ancillary positions

Declared interests

Action taken

Siersema (Chair)

Gastroenterologist, Erasmus MC Rotterdam

 

Editor in Chief, Endoscopy

Research funding/advisory board without influence on this guideline

No restriction

Timmermans

- Board member of SPKS (Stichting voor Patiënten met kanker aan het Spijsverteringskanaal), 5 hours per week

- Behavioural science lecturer, general practice training programme, Eerstelijnsgeneeskunde, Radboudumc

Unpaid volunteer work as SPKS board member (15 hours per week)

None

No restriction

Van Laarhoven

Head of the Department of Medical Oncology, Amsterdam UMC

- Scientific council KWF (unpaid)

- Chair ESMO upper GI faculty (unpaid)

- Member ESMO Leadership Generation programme (unpaid)

- Member EORTC upper GI strategy committee (unpaid)

- Research funding/medication: Amgen, AstraZeneca, Bayer, BMS, Celgene, Janssen, Lilly, Merck, MSD, Nordic Pharma, Philips, Roche, Servier

No restriction

Bartels

Radiologist, Antoni van Leeuwenhoek

None

None

No restriction

Van Berge Henegouwen

Surgeon, oesophageal and gastric surgery, Amsterdam UMC

Professor of oesophageal and gastric surgery, Universiteit van Amsterdam

None

- Olympus: study funding (researcher-initiated grant)

Stryker: study funding (researcher-initiated grant)

The outcomes of the guideline have no influence on these companies or studies

- Consultancy for several companies (payment to Amsterdam UMC), not related to the guideline.

No restriction

Hulshof

Radiation oncologist, Amsterdam UMC

None

None

No restriction

Van Hillegersberg

Surgeon, UMC Utrecht

Proctor, Intuitive Surgical; consultant, Medtronic

- Board of DUCA, DICA

No restriction

Vegt

Nuclear medicine physician, Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam

None

- ZonMw grant for the PLASTIC study (healthcare efficiency programme) on the cost-effectiveness of FDG-PET/CT and laparoscopy in gastric carcinoma.

No restriction

 

Cluster expert group

 

Guideline oesophageal carcinoma: Module 1 ‘Adjuvant immunotherapy in oesophageal carcinoma’

Cluster member

Position

Ancillary positions

Declared interests

Action taken

Mostert

Medical oncologist (internist), Erasmus MC

Consultancy for: BMS, Lilly, Servier

- BMS: phase 2 study of nivolumab during active surveillance of oesophageal carcinoma; Sanofi: cabazitaxel in AR-v7-positive prostate carcinoma patients; Pfizer: DLA in breast carcinoma patients treated with CDK4/6

The NIV was asked to delegate additional reviewers during the commentary phase for the modules on immunotherapy and chemoradiation.

Hoekstra

Medical oncologist (internist), Ziekenhuisgroep Twente (ZGT)

Member of the Concilium Medicinae Internae (unpaid)

- As medical oncologist, involved in enrolling patients in clinical studies in oesophageal and gastric carcinoma. Currently the Critics-2 study and the Lyrics study

No restriction

Meijer

Pathologist, Amsterdam UMC

None

None

No restriction

Guideline gastric carcinoma: Module 2 ‘Diagnostics of M stage’

Cluster member

Position

Ancillary positions

Declared interests

Action taken

Gisbertz

Oesophageal and gastric cancer surgeon, Amsterdam UMC

None

None

No restriction

Van Det

Gastrointestinal surgeon

Ziekenhuisgroep Twente (ZGT)

- Proctor/instructor for Intuitive Surgical for robot-assisted upper-GI operations such as:

- Oesophageal resections

- Gastric resections

- Diaphragmatic hernia.

None

No restriction

 

Guideline gastric carcinoma: Module 3 ‘Adjuvant therapy in gastric carcinoma’

Cluster member

Position

Ancillary positions

Declared interests

Action taken

Rutten

Radiation oncologist, Radboud UMC

None

None

No restriction

Slingerland

Medical oncologist (internist), LUMC

None

- Advisory board Lilly and BMS

The NIV was asked to delegate additional reviewers during the commentary phase for the modules on immunotherapy and chemoradiation.

 

Guideline gastric carcinoma: Module 4 ‘Biomarker diagnostics: PD-L1 testing’

Cluster member

Position

Ancillary positions

Declared interests

Action taken

Slingerland

Medical oncologist (internist), LUMC

None

- Advisory board Lilly and BMS

No restriction

Van Grieken

Pathologist, Amsterdam UMC (location VUmc), Amsterdam

Secondment to the expert panel on polyps, BVO-DK, screening organisation for the colorectal cancer population screening programme (3 hours/week)

- KWF: identification of markers for response to immunotherapy (project leader)

- KWF: CRITICS-II clinical trial for resectable gastric carcinoma

- ZonMw: effect of chemotherapy in patients with microsatellite-unstable resectable gastric carcinoma (project leader)

No restriction

Kodach

Pathologist, NKI/AVL

None

Participation in a study on inter-observer variability of PD-L1 CPS in gastric carcinomas, funded by BMS; fee paid to the employer AVL/NKI

No restriction

Input of the patient perspective

Attention was paid to the patient perspective through the delegation from the Stichting voor Patiënten met kanker aan het Spijsverteringskanaal (SPKS). The input obtained was taken into account when drafting the clinical questions, choosing the outcome measures, and drafting the considerations. The draft module was also submitted to SPKS for comment, and any comments received were reviewed and incorporated.

 

Qualitative estimate of possible financial consequences under the Wkkgz

In accordance with the Wet kwaliteit, klachten en geschillen zorg (Wkkgz), a qualitative estimate was made for the guideline of whether the recommendations may lead to substantial financial consequences. For this assessment, guideline modules were tested on several domains (see the flowchart on the Richtlijnendatabase).

The qualitative estimate shows that there are probably no substantial financial consequences; see the table below.

 

Module

Outcome of estimate

Explanation

Module 1 ‘Adjuvant immunotherapy’

No financial consequences

The assessment shows that the recommendation is not widely applicable (<5,000 patients) and is therefore not expected to have substantial financial consequences for collective expenditure.

Module 2 ‘M staging’

No financial consequences

The assessment shows that the recommendation is not widely applicable (<5,000 patients) and is therefore not expected to have substantial financial consequences for collective expenditure.

Module 3 ‘Adjuvant therapy’

No financial consequences

The assessment shows that the recommendation is not widely applicable (<5,000 patients) and is therefore not expected to have substantial financial consequences for collective expenditure.

Module 4 ‘Biomarker diagnostics: PD-L1 expression’

No financial consequences

The assessment shows that the recommendation is not widely applicable (<5,000 patients) and is therefore not expected to have substantial financial consequences for collective expenditure.

Methods

AGREE

This guideline module was drawn up in accordance with the requirements stated in the report Medisch Specialistische Richtlijnen 2.0 of the advisory committee on guidelines of the Raad Kwaliteit. This report is based on the AGREE II instrument (Appraisal of Guidelines for Research & Evaluation II; Brouwers, 2010).

 

Need-for-update, prioritisation and clinical questions

During the need-for-update phase, the cluster took stock of the validity of the modules within the cluster. In addition to the scientific societies and patient organisations involved, other stakeholders were also approached for this purpose in June 2021.

For each module, it was indicated whether it is valid, can be merged with another module, is obsolete and can be withdrawn, or is no longer valid and must be revised. There was also the opportunity to propose new topics for modules that fit one (or more) of the guidelines belonging to the cluster. The modules that one or more parties flagged as ‘not valid’ proceeded to the prioritisation phase. This module was prioritised by the cluster.

 

For the prioritised modules, the cluster revised or drafted concept clinical questions and then finalised them.

 

Outcome measures

After drafting the search question belonging to the clinical question, the cluster identified which outcome measures are relevant to the patient, considering both desirable and undesirable effects. A maximum of eight outcome measures was used. The cluster rated these outcome measures according to their relative importance for decision-making about recommendations as crucial (critical for decision-making), important (but not crucial), or unimportant. The cluster defined clinically (patient) relevant differences, at least for the crucial outcome measures.

 

Method of literature summary

A detailed description of the strategy for searching and selecting literature can be found under ‘Zoeken en selecteren’ (Search and select) in the Onderbouwing (Evidence base) section. Where possible, data from different studies were pooled in a random-effects model. Review Manager 5.4 was used for the statistical analyses. The assessment of the certainty of the scientific evidence is explained below.

 

Assessing the certainty of the scientific evidence

The certainty of the scientific evidence was determined according to the GRADE method. GRADE stands for ‘Grading of Recommendations Assessment, Development and Evaluation’ (see http://www.gradeworkinggroup.org/). The basic principles of the GRADE methodology are: naming and prioritising the clinically (patient) relevant outcome measures, a systematic review per outcome measure, and an assessment of the certainty of evidence per outcome measure based on the eight GRADE domains (domains for downgrading: risk of bias, inconsistency, indirectness, imprecision, and publication bias; domains for upgrading: dose-response relationship, large effect, and residual plausible confounding).

GRADE distinguishes four levels for the quality of the scientific evidence: high, moderate, low and very low. These levels refer to the degree of certainty about the literature conclusion, in particular the degree of certainty that the literature conclusion adequately supports the recommendation (Schünemann, 2013; Hultcrantz, 2017).

 

GRADE level and definition

High

  • there is high certainty that the true treatment effect lies close to the estimated treatment effect;
  • it is very unlikely that the literature conclusion will change in a clinically relevant way when results of new large-scale research are added to the literature analysis.

Moderate

  • there is moderate certainty that the true treatment effect lies close to the estimated treatment effect;
  • it is possible that the conclusion will change in a clinically relevant way when results of new large-scale research are added to the literature analysis.

Low

  • there is low certainty that the true treatment effect lies close to the estimated treatment effect;
  • there is a real chance that the conclusion will change in a clinically relevant way when results of new large-scale research are added to the literature analysis.

Very low

  • there is very low certainty that the true treatment effect lies close to the estimated treatment effect;
  • the literature conclusion is very uncertain.
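
As an illustration only, the grading logic described above (four levels, five domains that can lower the rating and three that can raise it) can be sketched in a few lines of Python. This is a simplified sketch, not the procedure used by the cluster: the function name `grade_certainty`, the representation of a judgement as a number of levels per domain, and the clamping to the four-level scale are assumptions made for this example. The starting level is passed in as an input because the text above does not specify it.

```python
# Minimal sketch of a GRADE-style certainty rating; all names are hypothetical.
LEVELS = ["very low", "low", "moderate", "high"]

DOWNGRADE_DOMAINS = {
    "risk of bias", "inconsistency", "indirectness",
    "imprecision", "publication bias",
}
UPGRADE_DOMAINS = {
    "dose-response relationship", "large effect",
    "residual plausible confounding",
}


def grade_certainty(start, downgrades=None, upgrades=None):
    """Return the certainty level after applying downgrades and upgrades.

    start      -- starting level (an input here; the text does not fix it)
    downgrades -- dict mapping a downgrade domain to the number of levels lost
    upgrades   -- dict mapping an upgrade domain to the number of levels gained
    """
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    unknown = (set(downgrades) - DOWNGRADE_DOMAINS) | (set(upgrades) - UPGRADE_DOMAINS)
    if unknown:
        raise ValueError(f"unknown GRADE domain(s): {sorted(unknown)}")
    index = LEVELS.index(start) - sum(downgrades.values()) + sum(upgrades.values())
    # Assumption: the result is clamped to the four-level scale.
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


# Example: evidence starting at "high", lowered one level each for
# risk of bias and imprecision.
print(grade_certainty("high", {"risk of bias": 1, "imprecision": 1}))
```

In practice each domain judgement is a reasoned assessment by the guideline panel rather than a mechanical count, so the sketch only shows how the individual judgements combine into one of the four levels.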

When assessing (grading) the strength of the scientific evidence in guidelines according to the GRADE methodology, thresholds for clinical decision-making play an important role (Hultcrantz, 2017). These are the thresholds that, when crossed, would prompt a change in the recommendation. To determine the thresholds for clinical decision-making, all relevant outcome measures and considerations must be weighed.

The thresholds for clinical decision-making are therefore not directly comparable with the minimal clinically important difference (MCID).

Particularly in situations where an intervention has no important disadvantages and the costs are relatively low, the threshold for clinical decision-making regarding the effectiveness of the intervention may lie at a lower value (closer to the null effect) than the MCID (Hultcrantz, 2017).

Considerations (from evidence to recommendation)

To arrive at a recommendation, other aspects are important in addition to (the quality of) the scientific evidence, and these are also weighed: additional arguments from, for example, biomechanics or physiology, patient values and preferences, costs (resource use), acceptability, feasibility and implementation. These aspects are systematically listed and assessed (weighed) under the heading ‘Considerations’ and may be (partly) based on expert opinion. A structured format was used, based on the evidence-to-decision framework of the international GRADE Working Group (Alonso-Coello, 2016a; Alonso-Coello, 2016b). This evidence-to-decision framework is an integral part of the GRADE methodology.

Formulating recommendations

The recommendations answer the clinical question and are based on the available scientific evidence, the most important considerations, and a weighing of the favourable and unfavourable effects of the relevant interventions. The strength of the scientific evidence and the weight the cluster assigns to the considerations together determine the strength of the recommendation. In accordance with the GRADE methodology, a low certainty of evidence for the conclusions of the systematic literature analysis does not rule out a strong recommendation a priori, and weak recommendations are also possible when the certainty of evidence is high (Agoritsas, 2017; Neumann, 2016). The strength of the recommendation is always determined by weighing all relevant arguments together. For each recommendation, the cluster has described how it arrived at the direction and strength of the recommendation.

The GRADE methodology distinguishes between strong and weak (or conditional) recommendations. The strength of a recommendation refers to the degree of certainty that the benefits of the intervention outweigh the harms (or vice versa), considered across the whole spectrum of patients for whom the recommendation is intended. The strength of a recommendation has clear implications for patients, clinicians and policymakers (see the overview below). A recommendation is not a dictate: even a strong recommendation based on high-quality evidence (GRADE rating HIGH) will not always apply under all possible circumstances and for every individual patient.

Implications of strong and weak recommendations for different guideline users

For patients

  • Strong recommendation: most patients would choose the recommended intervention or approach, and only a small number would not.
  • Weak (conditional) recommendation: a considerable proportion of patients would choose the recommended intervention or approach, but many would not.

For clinicians

  • Strong recommendation: most patients should receive the recommended intervention or approach.
  • Weak (conditional) recommendation: there are several suitable interventions or approaches. The patient should be supported in choosing the intervention or approach that best fits his or her values and preferences.

For policymakers

  • Strong recommendation: the recommended intervention or approach can be regarded as standard policy.
  • Weak (conditional) recommendation: policymaking requires extensive discussion involving many stakeholders. Local differences in policy are more likely.

Organisation of care

During the development of the guideline module, explicit attention was paid to the organisation of care: all aspects that are preconditions for delivering care (such as coordination, communication, (financial) resources, staffing and infrastructure). Preconditions relevant to answering this specific clinical question are mentioned under the considerations. More general, overarching or additional aspects of the organisation of care are addressed in the module Organisatie van zorg (Organisation of care).

Comment and authorisation phase

The draft guideline module was submitted to the (scientific) societies and patient organisations involved for comment.

The comments were collected and discussed with the cluster. In response to the comments, the draft guideline module was revised and finalised by the cluster. The final guideline module was submitted to the participating (scientific) societies and patient organisations for authorisation, and was authorised or approved by them.

References

Agoritsas T, Merglen A, Heen AF, Kristiansen A, Neumann I, Brito JP, Brignardello-Petersen R, Alexander PE, Rind DM, Vandvik PO, Guyatt GH. UpToDate adherence to GRADE criteria for strong recommendations: an analytical survey. BMJ Open. 2017 Nov 16;7(11):e018593. doi: 10.1136/bmjopen-2017-018593. PubMed PMID: 29150475; PubMed Central PMCID: PMC5701989.

Alonso-Coello P, Schünemann HJ, Moberg J, Brignardello-Petersen R, Akl EA, Davoli M, Treweek S, Mustafa RA, Rada G, Rosenbaum S, Morelli A, Guyatt GH, Oxman AD; GRADE Working Group. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 1: Introduction. BMJ. 2016 Jun 28;353:i2016. doi: 10.1136/bmj.i2016. PubMed PMID: 27353417.

Alonso-Coello P, Oxman AD, Moberg J, Brignardello-Petersen R, Akl EA, Davoli M, Treweek S, Mustafa RA, Vandvik PO, Meerpohl J, Guyatt GH, Schünemann HJ; GRADE Working Group. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 2: Clinical practice guidelines. BMJ. 2016 Jun 30;353:i2089. doi: 10.1136/bmj.i2089. PubMed PMID: 27365494.

Brouwers MC, Kho ME, Browman GP, Burgers JS, Cluzeau F, Feder G, Fervers B, Graham ID, Grimshaw J, Hanna SE, Littlejohns P, Makarski J, Zitzelsberger L; AGREE Next Steps Consortium. AGREE II: advancing guideline development, reporting and evaluation in health care. CMAJ. 2010 Dec 14;182(18):E839-42. doi: 10.1503/cmaj.090449. Epub 2010 Jul 5. Review. PubMed PMID: 20603348; PubMed Central PMCID: PMC3001530.

Hultcrantz M, Rind D, Akl EA, Treweek S, Mustafa RA, Iorio A, Alper BS, Meerpohl JJ, Murad MH, Ansari MT, Katikireddi SV, Östlund P, Tranæus S, Christensen R, Gartlehner G, Brozek J, Izcovich A, Schünemann H, Guyatt G. The GRADE Working Group clarifies the construct of certainty of evidence. J Clin Epidemiol. 2017 Jul;87:4-13. doi: 10.1016/j.jclinepi.2017.05.006. Epub 2017 May 18. PubMed PMID: 28529184; PubMed Central PMCID: PMC6542664.

Medisch Specialistische Richtlijnen 2.0 (2012). Adviescommissie Richtlijnen van de Raad Kwaliteit. http://richtlijnendatabase.nl/over_deze_site/over_richtlijnontwikkeling.html

Neumann I, Santesso N, Akl EA, Rind DM, Vandvik PO, Alonso-Coello P, Agoritsas T, Mustafa RA, Alexander PE, Schünemann H, Guyatt GH. A guide for health professionals to interpret and use recommendations in guidelines developed with the GRADE approach. J Clin Epidemiol. 2016 Apr;72:45-55. doi: 10.1016/j.jclinepi.2015.11.017. Epub 2016 Jan 6. Review. PubMed PMID: 26772609.

Schünemann H, Brożek J, Guyatt G, et al. GRADE handbook for grading quality of evidence and strength of recommendations. Updated October 2013. The GRADE Working Group, 2013. Available from http://gdt.guidelinedevelopment.org/central_prod/_design/client/handbook/handbook.html.

Search strategy

The search strategies are available on request. Please contact the Richtlijnendatabase.
