Diagnostic tests

Publication date: 25-09-2023

Reviewed for validity: 25-07-2023

Clinical question

Which diagnostic tests should the lead clinician use in secondary care when giant cell arteritis is suspected?

Recommendation

Perform at least one diagnostic test in patients with suspected giant cell arteritis, preferably following the diagnostic flowchart for giant cell arteritis (Figure 1).

When giant cell arteritis is clinically suspected, consider performing a diagnostic test as soon as possible after glucocorticoids are started; see Table 2.

Considerations

Benefits and harms of the intervention and quality of the evidence

The BSR guideline reports that most studies examined the diagnostic value of ultrasound (n=16) or c-MRI/MRA (n=7) (Mackie, 2020). Comparison of the ultrasound ‘halo sign’ with the clinical diagnosis of giant cell arteritis (GCA) yielded a pooled sensitivity of 79% (95% confidence interval (CI) 73%-84%) and a specificity of 94% (95% CI 90%-96%). The BSR guideline committee graded this evidence as GRADE ‘moderate’ because of the risk of bias. Comparison of cranial MRI (vessel wall oedema and vessel wall ‘contrast enhancement’) with a clinical diagnosis of GCA yielded a pooled sensitivity of 75% (95% CI 69%-80%) and a specificity of 89% (95% CI 84%-93%). The BSR guideline committee graded this evidence as GRADE ‘low’ because of the risk of bias and the fact that 5/6 studies were performed by the same research group.

In the additional literature search mentioned above, most studies (n=5) describe the diagnostic value of the ultrasound ‘halo sign’ with the clinical diagnosis according to the current standard as reference. Sensitivity ranges from 33% to 98% across the studies, and specificity from 55% to 100%. Lack of standardisation of ultrasound plays an important role here. Among other things, the studies imaged the temporal artery with ultrasound probes of varying and predominantly low frequency (4-18 MHz), which makes the studies difficult to compare. Ideally, the temporal artery is now examined with probes with a frequency ≥ 18 MHz. In addition, higher-quality (high-end) ultrasound machines are now available than those used in the studies. This may have a positive effect on the current sensitivity and specificity of ultrasound. Appendix I of this guideline provides further background. Appendix III describes the terminology.

The two studies identified in the literature search that described the diagnostic value of cranial MRI (3D), with the clinical diagnosis according to the current standard as reference, show sensitivity ranging from 73% to 80% and specificity ranging from 93% to 100%. The certainty of evidence for these outcome measures in the various comparisons is graded as low according to the GRADE method. This is partly because the results may be biased, as the operator was not blinded to the index and/or reference test, and because the study population is small.

The evidence for the diagnostic test values of FDG PET/CT, based on the literature search described above, is graded as very low because it rests on a single study with the clinical diagnosis as gold standard.

Several studies use the 1990 ACR criteria as reference. However, these criteria are no longer adequate as a reference, because they are outdated and do not do justice to the full spectrum of GCA (see the further explanation below under the heading on translation to Dutch practice). In addition, several studies used a TAB and imaging tests as reference. In those studies this may affect the level of sensitivity and specificity found for imaging techniques. Circular reasoning arises when a specific imaging test is used as the gold standard within a study and is also a component of the classification criteria used. Unfortunately this also applies to the TAB, which is currently regarded as the gold standard and is part of the 1990 ACR classification criteria, whereas the sensitivity of this test is moderate and its specificity cannot, in principle, be 100% because the interobserver agreement (kappa) between pathologists is 0.62-0.81 (Luqmani, 2016; Grossman, 2015; Zhou, 2009). Another important aspect is that the proportion of patients with suspected GCA who are actually diagnosed with GCA (the prevalence) differs between studies. This may give a distorted picture of the diagnostic test values.

Given all the factors above, the overall certainty of evidence is low to very low for ultrasound as well as for the FDG PET/CT scan and the (cranial) MRI/MRA scan.

Diagnostic process in (suspected) GCA: translation of considerations and recommendations to Dutch practice, based on 1) the BSR guideline, 2) the additional search/literature and 3) Dutch expertise

The working group considers that neither the clinical diagnosis nor a TAB is a perfect reference standard for evaluating the diagnostic accuracy of ultrasound, the FDG PET/CT scan and the c-MRI/MRA scan in GCA, because neither is 100% accurate. A clinical diagnosis is based on a combination of symptoms, physical examination findings and results of specific laboratory tests, none of which is disease-specific for GCA. They therefore offer insufficient certainty to establish or exclude a diagnosis of GCA. In addition, it remains essential to exclude alternative diagnoses such as malignancy or infection as the cause.

The working group considers the 1990 ACR classification criteria for temporal arteritis unsuitable for current GCA diagnostics, for the following reasons: 1) The criteria are outdated and were developed only for inclusion in scientific research. 2) The criteria have not been validated for making the diagnosis. 3) Only cranial features (headache, abnormal temporal artery on physical examination and histology of a temporal artery biopsy) are included in the criteria, whereas arterial involvement often extends beyond the temporal artery (such as LV-GCA). 4) More diagnostic tests besides TAB are now available, such as ultrasound, FDG PET/CT scan and (cranial) MRI/MRA scan. 5) CRP is not included in the criteria. 6) The criteria are based on a small number of GCA patients (N=214) from North America, without a validation cohort, and were compared only with other vasculitides. Applying these outdated criteria can therefore lead either to wrongly not classifying a patient as GCA or to classifying a patient as GCA too readily (and wrongly); an example is sinusitis that meets ≥ 3 ACR 1990 criteria through the combination of age ≥ 50 years, headache and ESR ≥ 50, without abnormalities of the temporal artery or the temporal artery biopsy.

In 2017, revised classification criteria for GCA were drawn up, in which multiple cranial and extracranial symptoms, imaging tests of large arteries (ultrasound, FDG PET/CT scan, MRI/MRA scan), physical examination of extracranial arteries and an elevated CRP were added to the original 1990 criteria (Dejaco, 2017; Wiberg, 2022).

In 2022, new ACR/EULAR GCA classification criteria were developed and published (Ponte, 2022). Before these criteria may be applied, the following two points must be taken into account: the classification criteria are applied to classify patients with a diagnosis of medium-vessel to large-vessel vasculitis as GCA; and alternative diagnoses (so-called vasculitis mimics) must be excluded before the criteria may be applied. In addition, the absolute requirement must be met: age at diagnosis ≥ 50 years. The classification criteria consist of the sum of the scores for 10 items (various symptoms, physical examination findings, results of the laboratory tests ESR and CRP, imaging findings, TAB result) with different weights. A minimum score of 6 is required to classify a patient as GCA. However, like the older and revised criteria, these ACR/EULAR criteria serve only as classification criteria (and therefore not as diagnostic criteria), for homogeneous inclusion of patients in scientific studies. Applying these new classification criteria in clinical practice as diagnostic criteria for establishing the diagnosis of GCA is therefore also advised against.

Following the new BSR guideline, there is now sufficient evidence to state that all patients with GCA should undergo at least one confirmatory diagnostic test. According to the BSR this can be a TAB or ultrasound of both the temporal and axillary arteries. TAB and ultrasound differ, however, in their diagnostic values for GCA: TAB has relatively greater “rule-in” value and ultrasound relatively greater “rule-out” value (Mackie, 2020). The BSR recommendation that all patients with GCA should have at least one confirmatory diagnostic test is adopted for the Dutch setting. When GCA is suspected, one or more diagnostic tests are therefore always indicated, depending on the degree of clinical suspicion (low, intermediate or high), to establish or reject the diagnosis. As already stated, the BSR guideline expresses a preference for a TAB or, if expertise is available, (colour Doppler) ultrasound of the temporal and axillary arteries, or a combination of both tests. The 2018 European (EULAR) recommendations prefer ultrasound, provided that adequate ultrasound equipment and expertise are available (Dejaco, 2018). Because advanced imaging, such as the FDG PET/CT scan and c-MRI/MRA scan, is more widely available and accessible in the Netherlands than in the United Kingdom, and because these tests are also preferred for specific manifestations or clinical questions, their use will have a prominent role in Dutch practice, and on this point the guideline deviates from the BSR guideline. For example, when LV-GCA with aortic involvement is suspected, the FDG PET/CT scan is preferred over ultrasound, provided that a specific protocol and expertise for assessment are available.
FDG PET/CT also offers possibilities for assessing involvement of various cranial arteries in GCA, but this too requires a specific scan protocol (longer scanning of the head) or the use of new digital, more sensitive PET camera systems. This makes FDG PET/CT a possible alternative to ultrasound of the cranial arteries. Future developments in PET/CT scanners and specific inflammation PET tracers are expected to contribute further to extending this application. In line with both the BSR and EULAR recommendations, ultrasound is included as the first diagnostic test in the diagnostic process when predominantly cranial GCA (C-GCA) is suspected, provided that sufficient expertise and adequate ultrasound equipment are available, with a TAB as an alternative test option. When predominantly LV-GCA is suspected, the FDG PET/CT scan is the preferred investigation, but ultrasound can also be used as the first test (see Figure 1 and Appendix I).

The BSR guideline includes a diagnostic flowchart for C-GCA (see Figure 1 of the BSR guideline; Mackie, 2020). The working group decided not to adopt this flowchart one-to-one in this guideline, because as such it is not applicable to the Dutch setting and does not distinguish sufficiently between predominantly C-GCA and LV-GCA. For this reason a new diagnostic flowchart was developed consisting of 2 parts: clinical suspicion of predominantly C-GCA (part 1) and clinical suspicion of predominantly LV-GCA (part 2). The flowchart is based on the BSR guideline and other available scientific evidence, and was then adapted to the Dutch setting and expertise (Figure 1 with explanatory notes). The working group advises applying this flowchart in the diagnosis of GCA in Dutch practice. In contrast to the BSR, a distinction should first be made on clinical grounds between suspicion of predominantly C-GCA and LV-GCA, in order to arrive at the appropriate follow-up tests. Because the diagnostic pathway is currently evolving, the working group emphasises that the flowchart is a dynamic scheme that must be adapted on the basis of new insights and experience. For example, the position of the c-MRI/MRA scan is not yet sufficiently supported by scientific literature, but a number of hospitals have by now gained expertise with it in patients with C-GCA.

In line with the BSR guideline and the EULAR recommendations (Dejaco, 2018), the working group considers that the clinical probability of GCA (low, intermediate or high suspicion) should be estimated before the most suitable confirmatory diagnostic test(s) are selected. The GCA Probability Score (GCAPS) appears to be a promising tool for estimating the probability of GCA, but its use in Dutch practice is not yet advised because of insufficient scientific validation (see Appendix II in the module History and physical examination (rheumatologist/internist)).

Estimating the clinical probability of GCA is most relevant when predominantly C-GCA is suspected and is therefore included only in part 1 of the flowchart. The estimate is made on the basis of clinical findings and considerations (history, physical examination, laboratory tests such as ESR and CRP) and experience/expertise, because fixed vignettes defining what counts as a low, intermediate or high probability cannot yet be formulated reliably for clinical practice.

A detailed explanation of the specific diagnostic tests that can be performed as part of the diagnostic process can be found in Appendix I. This appendix also contains practical information on laboratory testing, even though this is not covered by the clinical question.

Influence of glucocorticoids (GC) on the reliability of diagnostic tests

The use of GC has a negative effect on the reliability of all diagnostic tests: the sensitivity of the test gradually decreases. This creates the risk of a false-negative test, so that the diagnosis of GCA is wrongly rejected, with all the adverse consequences for the patient. The BSR guideline gives no recommendation on the time frame after starting GC within which the diagnostic tests should be performed. The 2018 EULAR recommendations, by contrast, advise performing imaging tests as soon as possible, preferably within 1 week, after starting GC (Dejaco, 2018). However, no distinction is made there between the different diagnostic tests. The specific recommendation per test (Table 2) is therefore based on Dutch expertise and the available literature (see Appendix II for the rationale):

Table 2: Recommendations on the indicative time frame after starting GC within which the diagnostic test should preferably be performed (see Appendix II):

Diagnostic test	Indicative time frame after starting GC
TAB	≤ 7 days
Ultrasound	≤ 3 days
c-MRI/MRA scan	≤ 5 days
FDG PET/CT scan	≤ 3 days
CT angiography (CTA) scan	≤ 3-5 days
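
The indicative time frames in Table 2 can be turned into calendar dates with a small lookup. This is a minimal sketch for illustration only: the dictionary keys, the function name and the example start date are assumptions, and for CTA the upper bound of the 3-5 day range is stored.

```python
from datetime import date, timedelta

# Indicative windows in days after starting GC, copied from Table 2.
# CTA is given as a range (3-5 days); the upper bound is stored here.
INDICATIVE_WINDOW_DAYS = {
    "TAB": 7,
    "ultrasound": 3,
    "c-MRI/MRA": 5,
    "FDG PET/CT": 3,
    "CTA": 5,
}


def latest_preferred_date(test: str, gc_start: date) -> date:
    """Return the last day that still falls within the indicative window."""
    return gc_start + timedelta(days=INDICATIVE_WINDOW_DAYS[test])


# Hypothetical GC start date, chosen only for illustration.
gc_start = date(2023, 9, 1)
for test_name in INDICATIVE_WINDOW_DAYS:
    print(test_name, latest_preferred_date(test_name, gc_start))
```

The windows are indicative, so a date computed this way is a planning aid and does not replace clinical judgement.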

Values and preferences of patients (and, where applicable, their carers)

For the patient, the main goal of the various diagnostic tests is to establish a reliable diagnosis of GCA or to reject it with high certainty. Ultrasound is non-invasive and therefore the least burdensome test for the patient. In addition, it allows a diagnosis to be made or excluded quickly, often on the day of the examination. When c-MRI/MRA is performed, the patient should be aware that he or she must lie still for a prolonged time on the scanner table in a long MRI tunnel. This can be a disadvantage for patients with claustrophobia. Another disadvantage of MRI is the use of a contrast agent (gadolinium), which in rare cases can cause an allergic reaction. For an FDG PET/CT scan, a radioactive fluid is administered into the bloodstream. This substance is not dangerous and leaves the body within a few hours. The FDG PET/CT examination itself does involve a limited radiation dose. An advantage of this scan is that the whole body is imaged in a single examination. A disadvantage is that the patient must lie still for 30 minutes on average (with new PET devices at most 20 minutes, and with some only 10 minutes), preferably with the arms raised. The most invasive test is the TAB. This test can be accompanied by complications such as bleeding or infection. In addition, establishing the diagnosis takes longer than with the other tests. Patients probably prefer ultrasound.

Costs (resource use)

Diagnostic tests entail costs, which are charged to the patient's treatment pathway. Exact uniform amounts cannot be given because they can differ per hospital and department. In addition, costs may change over the years with increasing use and changing technology. Roughly, ultrasound is the cheapest investigation, followed by MRI and TAB, and the most expensive is the FDG PET/CT scan.

Acceptability, feasibility and implementation

The advantages and disadvantages of the different diagnostic tests affect acceptability, feasibility and implementation. Important points are that TAB and the FDG PET/CT scan are both accessible in Dutch practice. For a TAB, sufficient expertise of both the surgeon and the pathologist is essential, and the biopsy length must meet the minimum requirement of at least 1 cm after fixation. The requirement for sufficient expertise also applies to the nuclear medicine physician for the FDG PET/CT scan and to the radiologist for c-MRI/MRA. By contrast, ultrasound currently has too limited availability and the specific c-MRI/MRA is hardly available in the Netherlands. When ultrasound or c-MRI/MRA is used as a diagnostic test, it is important that these investigations are performed by imaging specialists (such as a physician, vascular technologist, clinical neurophysiology (KNF) technologist, etc.) who have sufficient expertise, experience and training to perform them to a high standard (with the correct settings) and then to interpret the images reliably according to defined quality requirements. At present the supply of ultrasound and c-MRI/MRA, including the required staff with the corresponding expertise, is still too limited in the Netherlands. This may negatively affect the speed of implementation and the feasibility of this guideline and therefore requires extra attention (see the module Organisation of care).

Besides expertise, the duration of an investigation is also an important aspect of feasibility and implementation. Performing an ultrasound or c-MRI/MRA takes approximately 60 minutes (excluding image interpretation). An FDG PET/CT scan (excluding preparation and assessment by the nuclear medicine physician) and a TAB (procedure by the surgeon, excluding logistics and assessment by the pathologist) take approximately 30 minutes. However, an FDG PET/CT scan involves additional patient preparation time, so the examination requires a greater time investment than the other diagnostic tests. New developments with faster total-body PET/CT scanners (owing to significantly higher resolution and a single field of view covering the whole body) will make faster PET/CT sessions possible in the future.

Rationale for the recommendations:
In the diagnostic process, the initial (a priori) clinical suspicion is central and guides the rest of the diagnostic process. However, making only a ‘clinical diagnosis’ of GCA based on a combination of symptoms, physical examination and laboratory findings should be avoided. This recommendation therefore states that when GCA is clinically suspected, at least one additional diagnostic test should be performed, preferably following the diagnostic flowchart for GCA (Figure 1).
As regards the choice of diagnostic test, TAB is still considered a good option, but no longer the standard first choice when C-GCA is suspected. The diagnostic test results of TAB vary in the literature, but do not give 100% accuracy. In addition, the test is invasive and shows only a unilateral segment of the temporal artery. The EULAR 2018 recommendations state that ultrasound is the first-choice diagnostic test, because of its non-invasive character and the possibility of imaging an extended course of both temporal arteries as well as other arteries accessible to ultrasound that are involved in C-GCA. A precondition is that adequate ultrasound equipment and C-GCA ultrasound expertise are present with both the requesting clinician and the operator. In addition, ultrasound must be explicitly embedded in the diagnostic process and its quality assured. “Incidental” examinations by requesting clinicians and operators inexperienced in this area should also be avoided (see Appendix I). Despite the limitations mentioned, TAB remains the first alternative to ultrasound when C-GCA is suspected, followed by the FDG PET/CT scan and c-MRA.

A limited number of studies have examined the diagnostic test values of imaging techniques for LV-GCA. LV-GCA was not included in the search for the current guideline. However, the working group considers it important to make a recommendation on the use of diagnostic tests when predominantly LV-GCA is clinically suspected, because the GCA subtype determines the choice of diagnostic tests. A definitive recommendation cannot be given on the basis of the limited literature available. However, on the basis of current scientific insights and expertise in Dutch practice, the FDG PET/CT scan or ultrasound is recommended as the first test when predominantly LV-GCA is clinically suspected. Available data show a high sensitivity and specificity of this technique in LV-GCA. When the FDG PET/CT scan is available together with expertise in PET/CT assessment of LV-GCA, this technique is preferred over ultrasound, because the whole body, and thus almost all arteries potentially involved in LV-GCA, can be imaged. In addition, conditions in the differential diagnosis can often be demonstrated or excluded with an FDG PET/CT scan. It is, however, an expensive technique and involves radiation exposure. Ultrasound can therefore readily be performed as a first step, with the important limitation that the aorta cannot be adequately assessed for this clinical question. Alternatively, CTA can be considered, particularly if a vascular complication such as an aneurysm, dissection or stenosis is suspected.
Although the question of how GC use influences diagnostic outcomes was not included in the search for the current guideline, the working group considers it relevant to include this recommendation in the guideline: when there is high clinical suspicion of predominantly C-GCA, it is recommended to start GC within 24 hours, particularly to prevent ophthalmological complications. The availability of diagnostic tests must not delay the start of GC when C-GCA is clinically suspected. When predominantly LV-GCA is suspected, starting GC immediately is not strictly necessary in the absence of cranial or ophthalmological/neurological manifestations. This often leaves more time for the diagnostic pathway. Soon after GC is started, the diagnostic yield of a diagnostic test will decrease. Based on the currently available literature, it is recommended that when GCA is clinically suspected a diagnostic test be performed as soon as possible after starting GC: ultrasound and FDG PET/CT scan within 3 days after the start of GC, TAB within 7 days and c-MRI/MRA within 5 days. No data are available for CTA, but 3-5 days, in line with the findings for MRA, would be a pragmatic recommendation.

Evidence base

Background

Within the Netherlands there is practice variation in the diagnostic process for GCA. Traditionally the diagnosis has been made by temporal artery biopsy (TAB), but non-invasive imaging in the form of ultrasound, molecular imaging (FDG PET/CT scan) and cranial MRI/MRA scan (c-MRI/MRA) is increasingly used. These techniques offer many advantages, but can be applied safely only when they are well embedded in the diagnostic process and when expertise is available for performing and interpreting them. One challenge is that patients may present to different specialties. In addition, expertise regarding GCA varies per practice. There is also variation in the availability, standardisation and quality assurance of diagnostic tests. This variation leads to a heterogeneous diagnostic process that can potentially lead to incorrect or missed diagnoses. Yet a rapid and correct diagnosis is a prerequisite for adequate treatment and for preventing irreversible organ damage (such as visual loss). This underlines the importance of this Dutch multidisciplinary guideline. Vessel wall involvement on imaging can also be present in the absence of local clinical signs, and vice versa (Bosch, 2022; Michailidou, 2019).

Conclusions / Summary of Findings

Low GRADE

Evidence suggests that the sensitivity and specificity of the ultrasound ‘halo sign’ for the diagnosis of GCA in patients with suspected GCA range from 33% to 98% and from 55% to 100%, respectively, using the current clinical standard as reference.

Sources: Molina Collade (2021); Mukhtyar (2020); González Porto (2020); Conway (2019); Imfeld (2020).

Very low GRADE

The evidence is uncertain about the sensitivity and specificity of the ultrasound ‘halo sign’ for the diagnosis of GCA in patients with suspected GCA, using temporal artery biopsy as reference.

Sources: Mukhtyar (2020).

Low GRADE

Evidence suggests that the sensitivity and specificity of 3D cranial MRI for the diagnosis of GCA in patients with suspected GCA range from 73% to 80% and from 93% to 100%, respectively, using the current clinical standard as reference.

Sources: Rodriguez-Régent (2020), Poilon (2020).

Very low GRADE

Evidence is uncertain about the sensitivity and specificity of FDG PET/CT-scan for the diagnosis of GCA in patients with suspected GCA, using the current clinical standard as reference.

Sources: Imfeld (2020).

Summary of literature

First, we describe the results of the BSR guideline (2020).

Then we describe the results of any new studies found in our current update.

BSR guideline

Diagnostic accuracy may be expressed as sensitivity or specificity, or as likelihood ratios; this information can be combined with the pre-test probability (established on clinical grounds) to select and interpret the results of confirmatory diagnostic tests. Compared to temporal artery biopsy, imaging tests such as color Doppler ultrasound have the advantage of access to both superficial temporal arteries in their entirety, and of evaluation of other arteries accessible by ultrasound, such as the axillary arteries. In the BSR guideline, most diagnostic accuracy studies have focused on the role of ultrasound (n=16) or MRI (n=7). One study addressed the role of FDG PET/CT-scan, and another study examined the role of the FDG PET/CT-scan and CT angiography for GCA diagnosis.
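
As a worked illustration of how likelihood ratios combine with a pre-test probability, the sketch below uses the pooled halo-sign estimates quoted in this guideline (sensitivity 79%, specificity 94%, versus clinical diagnosis). The 50% pre-test probability and the function names are illustrative assumptions, not values or methods taken from the BSR guideline.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Pooled halo-sign estimates versus clinical diagnosis, as quoted in the text.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.79, specificity=0.94)

# Hypothetical pre-test probability, chosen only for illustration.
pre_test = 0.50
p_after_positive = post_test_probability(pre_test, lr_pos)
p_after_negative = post_test_probability(pre_test, lr_neg)

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
print(f"Probability after a positive test: {p_after_positive:.2f}")
print(f"Probability after a negative test: {p_after_negative:.2f}")
```

With these inputs a positive result raises the probability from 0.50 to about 0.93 and a negative result lowers it to about 0.18; a different pre-test probability shifts both figures.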

Ultrasound ‘halo sign’ vs. current practice

Seven studies (519 patients with suspected GCA, of whom 169 were diagnosed with GCA) compared the ultrasound ‘halo sign’ with a clinical diagnosis of GCA, giving a pooled sensitivity of 79% (95% CI: 73%-84%) and pooled specificity of 94% (95% CI: 90%-96%) (Aschwanden, 2013; Diamantopoulos, 2014; Habib, 2012; Karahaliou, 2006; Nesher, 2002; Reinhard, 2004; Schmidt, 1997). Quality of evidence (QoE) was +++; downgrading was performed because of risk of bias in 4/7 studies. One of these studies included 12 patients with a final diagnosis of LV-GCA (Diamantopoulos, 2014).

Ultrasound ‘halo sign’ vs. temporal artery biopsy

Five studies (185 patients with suspected GCA, of whom 57 were diagnosed with GCA) compared the ultrasound ‘halo sign’ with temporal artery biopsy, giving a pooled sensitivity of 74% (95% CI: 63%-83%) and pooled specificity of 81% (95% CI: 73%-88%) (Nesher, 2002; Reinhard, 2004; Schmidt, 1997; LeSar, 2002; Murgatroyd, 2003). QoE was +; downgrading was performed because of high risk of bias in all 5 studies, and because of inconsistency. Patients with LV-GCA were not evaluated in these studies.

Ultrasound ‘compression sign’ vs. current practice

Two studies (140 patients with suspected GCA, of whom 67 were diagnosed with GCA) compared the ultrasound ‘compression sign’ of temporal arteries with ACR criteria-based diagnosis of GCA, giving a pooled sensitivity of 79% (95% CI: 67%-88%) and a pooled specificity of 100% (95% CI: 95%-100%) (Aschwanden, 2013; 2015). QoE ++; downgrading was performed for risk of bias in one of the studies, and for the fact that both studies were performed by the same research group. The ACR criteria for GCA, which are not suitable for clinical diagnosis, served as reference standard in both studies.

Ultrasound vs. clinical practice

Three studies (560 patients with suspected GCA, of whom 327 had a clinical diagnosis of GCA) compared the diagnostic performance of ultrasound abnormality (defined as any one of halo sign, stenosis or occlusion) with clinical diagnosis of GCA, giving a pooled sensitivity of 61% (95% CI: 56%-67%) and pooled specificity of 86% (95% CI: 81%-90%) (Luqmani, 2016; Schmidt, 1997; Pfadenhauer, 2003). QoE ++; downgrading was performed for risk of bias in all three studies, and for inconsistency.

Four studies (563 patients with suspected GCA, of whom 180 had a positive temporal artery biopsy) compared the diagnostic performance of ultrasound abnormality (defined as any one of halo sign, stenosis or occlusion) with temporal artery biopsy, giving a pooled sensitivity of 81% (95% CI: 74%-86%) and pooled specificity of 74% (95% CI: 70%-79%) (19, 62, 66, 67). QoE ++; downgrading was performed for risk of bias in three of the four studies, and for imprecision.

A positive temporal artery biopsy, showing features of inflammation characteristic of GCA such as giant cells or panarteritis (Lie, 2018), confirms the diagnosis of GCA. Although the true sensitivity of temporal artery biopsy is not precisely known, it is accepted that its sensitivity is substantially less than 100%; this is supported by the histological observation of skip lesions in some cases. An imperfect reference standard would result in underestimation of the diagnostic accuracy of ultrasound. When using clinical diagnosis as a reference standard it is important that this is made independently of the index test result in order to avoid bias; this may be done by blinding of the diagnostician to the index test result. Notably, a large prospective UK study assessing the diagnostic value of ultrasound addressed this issue by blinding the patient, the treating clinician and the investigator to the ultrasound result (Luqmani, 2016). Ultrasound was found to be more sensitive but less specific than biopsy for diagnosis of GCA, was cost-effective and provided scope for reducing the number of patients who need a temporal artery biopsy (Luqmani, 2016). Overall, the pooled positive and negative likelihood ratios for ultrasound appear to support its use either for ruling out GCA in low-probability cases or for confirming GCA in high-probability cases (Appendix E and Figure 1 of the BSR publication (Mackie, 2020)). Ultrasound of the axillary arteries might add extra diagnostic information to ultrasound of the temporal arteries (Schmidt, 2018).
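
The effect of an imperfect reference standard can be illustrated numerically. The sketch below assumes that the errors of the index test and the reference test are conditionally independent given true disease status, and every number in it is hypothetical rather than taken from the studies cited. It shows how an index test with a true specificity of 95% appears less specific when the reference test misses part of the true cases.

```python
def apparent_accuracy(prevalence: float,
                      se_index: float, sp_index: float,
                      se_ref: float, sp_ref: float = 1.0) -> tuple[float, float]:
    """Sensitivity and specificity of an index test as measured against an
    imperfect reference, assuming conditionally independent test errors."""
    diseased, healthy = prevalence, 1.0 - prevalence

    # Reference-positive group: true cases detected plus false alarms.
    ref_pos = diseased * se_ref + healthy * (1.0 - sp_ref)
    both_pos = diseased * se_ref * se_index + healthy * (1.0 - sp_ref) * (1.0 - sp_index)

    # Reference-negative group: missed cases plus true non-cases.
    ref_neg = diseased * (1.0 - se_ref) + healthy * sp_ref
    both_neg = diseased * (1.0 - se_ref) * (1.0 - se_index) + healthy * sp_ref * sp_index

    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical scenario: the reference test detects only 70% of true cases
# (for example because of skip lesions) but never gives a false positive.
app_sens, app_spec = apparent_accuracy(
    prevalence=0.40, se_index=0.80, sp_index=0.95, se_ref=0.70, sp_ref=1.0
)
print(f"Apparent sensitivity: {app_sens:.3f}")
print(f"Apparent specificity: {app_spec:.3f}")
```

In this scenario the apparent sensitivity stays at 0.800, but the apparent specificity falls from the true 0.950 to 0.825, because true cases missed by the reference test are counted as false positives of the index test.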

Cranial MRI vs. current practice

Six studies (500 patients with suspected GCA, of whom 268 were finally diagnosed with GCA) compared cranial artery MRI (vessel wall oedema and contrast enhancement) with clinical diagnosis, giving a pooled sensitivity of 75% (95% CI: 69%-80%) and a pooled specificity of 89% (95% CI: 84%-93%) (Bley, 2007; Bley, 2005; Geiger, 2010; Klink, 2014; Siemonsen, 2015; Rheaume, 2017). QoE ++; downgrading was performed for risk of bias in five of the studies, and for the fact that five of the six studies were performed by the same research group; sensitivity was somewhat lower in the study performed by a separate group (Rheaume, 2017).

Cranial MRI vs. temporal artery biopsy

Five studies (397 patients with suspected GCA, of whom 171 had positive temporal artery biopsy) compared cranial artery MRI (vessel wall oedema and contrast enhancement) with temporal artery biopsy, giving a pooled sensitivity of 94% (95% CI: 90%-97%) and specificity of 79% (95% CI: 73%-84%) (Bley, 2007; Bley, 2005; Geiger, 2010; Klink, 2014; Rheaume, 2017). QoE +; downgrading was performed for risk of bias in five of the studies, for inconsistency, and for likely publication bias.

Overall, MRI of the cranial arteries appears to be potentially useful for ruling out GCA if the result is negative, but false positive test results could occur, such that MRI of the cranial arteries would not be first choice for a confirmatory test in GCA (Rheaume, 2017). Other issues of relevance to cranial vascular MRI are low availability of high-resolution 3T MRI equipment and expertise, higher costs and possible adverse effects of contrast agents.

In contrast to the 2010 guideline, where the authors outlined that imaging techniques are promising for diagnosis and monitoring of GCA (Dasgupta, 2010), in this guideline there is now sufficient evidence, taken together, to state that all patients with GCA should have at least one confirmatory diagnostic test, which could be either temporal artery biopsy, or temporal and axillary artery ultrasound. However, temporal artery biopsy and ultrasound differ in their positive and negative likelihood ratios for GCA, with biopsy having relatively greater “rule-in” value and ultrasound having relatively greater “rule-out” value (Appendix E). Selection of the most appropriate confirmatory diagnostic test(s) therefore requires an assessment of the pre-test probability as outlined elsewhere (Dejaco, 2018); if both ultrasound and biopsy are possible, an approach to this is suggested in Figure 1 of the BSR guideline by Mackie (2020).
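The relation between likelihood ratios, pre-test probability and the "rule-in" or "rule-out" value of a test can be illustrated with a short calculation. The sketch below is illustrative only: it uses the pooled point estimates for ultrasound versus temporal artery biopsy reported earlier in this summary (sensitivity 81%, specificity 74%), and the pre-test probabilities are hypothetical values chosen for the example, not figures from the guideline. The function names are likewise assumptions made for this example.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity.

    Note: LR+ is undefined when specificity equals 1.0 (division by zero).
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pretest, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    pretest_odds = pretest / (1.0 - pretest)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Pooled point estimates for ultrasound vs. temporal artery biopsy (see above).
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.81, specificity=0.74)

# Hypothetical pre-test probabilities, for illustration only.
for pretest in (0.10, 0.50, 0.80):
    after_positive = post_test_probability(pretest, lr_pos)
    after_negative = post_test_probability(pretest, lr_neg)
    print(
        f"pre-test {pretest:.0%}: "
        f"after positive test {after_positive:.0%}, "
        f"after negative test {after_negative:.0%}"
    )
```

A test with a high positive likelihood ratio moves a high pre-test probability towards confirmation, whereas a low negative likelihood ratio moves a low pre-test probability towards exclusion. The calculation ignores the uncertainty around the pooled estimates.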

The ultrasound ‘halo sign’ diminishes in size during the first week of glucocorticoid therapy, indicating that sensitivity of the test is likely to depend on the delay between initiation of glucocorticoid therapy and the ultrasound test (Luqmani, 2016). Ultrasound is operator-dependent and requires adequate training. Ultrasound performs best in the “fast-track” setting, assuming rapid access, good technical equipment and high expertise with this method. With ultrasound, the non-compressible ‘halo sign’ is the most important finding suggesting GCA (Chrysidis, 2018). Temporal artery biopsy should be performed by a surgeon experienced in this procedure, and samples should be at least 1 cm in length post-fixation. The pathologist evaluating the biopsy should be experienced in diagnosing GCA. Data from the TABUL study (Luqmani, 2016) suggested significant variation between pathologists in the interpretation of temporal artery biopsy histology, so where biopsy findings are ambiguous (e.g., low-level inflammation restricted to the adventitia), discussion between the requesting clinician and the pathologist is desirable. In the absence of inflammatory infiltrate, a report of healed arteritis is not sufficient to diagnose GCA. Isolated vasa vasorum vasculitis is not diagnostic of GCA. Contralateral biopsy may slightly increase the yield of temporal artery biopsy, but is usually unnecessary. Biopsy may remain positive for several weeks after initiation of glucocorticoid therapy (Jakobsson, 2016).

Evaluation of involvement of the aorta and its proximal branches (LV-GCA)

FDG PET/CT-scan vs. current practice

One study (24 patients with suspected GCA, of whom 15 were diagnosed with GCA) compared FDG PET/CT-scan with clinical diagnosis of GCA, giving a sensitivity of 67% (95% CI: 38%-88%) and a specificity of 100% (95% CI: 66% to 100%) (Lariviere, 2016). QoE ++; downgraded because of indirectness and publication bias.

FDG PET/CT-scan vs. temporal artery biopsy

One study (69 patients with suspected GCA/PMR, of whom 13 had biopsy evidence of GCA) compared vascular 18F-glucose uptake in thorax and legs on FDG PET/CT-scan with temporal artery biopsy, giving a sensitivity of 77% (95% CI: 46% to 95%) and specificity of 66% (95% CI: 52% to 78%). Comparing vascular 18F-glucose uptake in thorax on FDG PET/CT-scan with temporal artery biopsy gave a sensitivity of 54% (95% CI: 25% to 81%) and specificity of 86% (95% CI: 74% to 94%). QoE +; downgraded because of risk of bias, indirectness and imprecision (Blockmans, 2000).

CT angiography vs. current practice

One study (24 patients with suspected GCA, of whom 15 were diagnosed with GCA) compared CT angiography (CTA) with clinical diagnosis of GCA, giving a sensitivity of 73% (95% CI: 45%-92%) and specificity of 78% (95% CI: 40%-97%) (Lariviere, 2016). QoE ++; downgraded for indirectness and publication bias. CTA can reveal wall thickening with contrast enhancement in biopsy-proven GCA (Prieto-González, 2012). There is also experience with CTA for accurate assessment of luminal diameter for large vessel stenosis in Takayasu arteritis (Yamada, 1998).

No studies of MR angiography for the diagnosis of LV-GCA were found meeting our criteria, but there is experience with MRI for detection of vessel wall oedema reflective of inflammation and accurate assessment of luminal diameter for large vessel dilatation and stenosis in diseases of the major arteries, such as Takayasu arteritis. Gadolinium-enhanced MR angiography may help identify aortitis in the large-vessel vasculitides but appears to be very sensitive to glucocorticoid therapy (Adler, 2017).

In addition to showing inflammation of the large vessels, whole body FDG PET/CT-scan may detect malignancy or infection so can be of use in the differential diagnosis of GCA. Contrast-enhanced CT of the chest and abdomen is also often used in clinical practice to screen for deep infection or occult malignancy. Moreover, aortic wall thickening on a contrast CT might help to identify GCA, albeit with lower sensitivity than FDG PET/CT-scan and could also potentially have uses in settings where FDG PET/CT-scan is unavailable (Lariviere, 2016; Vaidyanathan, 2018; Muto, 2014). Additional advantages of FDG PET/CT-scan and CT therefore include potential value in the workup of alternative diagnoses such as malignancy and infection.

As well as detecting axillary artery involvement for diagnosis of large-vessel involvement in GCA, vascular ultrasound may also be able to visualize the carotid arteries and obtain more limited views of the subclavian arteries, vertebral arteries, and parts of the aorta, but a higher level of operator expertise is required for these studies.

Overall, there is indirect evidence for the use of imaging tests to evaluate involvement of the aorta and its proximal branches in GCA, but the published evidence is extrapolated from other diseases such as Takayasu arteritis (Dejaco, 2018) and there is currently insufficient evidence from prospective studies of suspected GCA to yield precise estimates of diagnostic accuracy for these tests.

Update

An overview of the results is shown in Table 1.

Description of studies

Halo-sign ultrasound vs clinical practice and/or TAB:

Molina Collada (2021) performed a retrospective cohort study to investigate the diagnostic value of the ultrasound (US) ‘halo sign’ compared with the gold standard for GCA diagnosis (i.e., clinical confirmation after 6-month follow-up). Patients with suspected GCA were referred to the fast-track pathway and underwent US examination within 24 h. Temporal artery biopsy (TAB) and FDG PET/CT-scan were performed at the discretion of the treating clinician. Patients underwent bilateral US examination of the three temporal artery (TA) segments (common superficial TA and its parietal and frontal branches) and of the extracranial (carotid, subclavian, and axillary) arteries. All examinations were performed by the same assessor, using an Esaote MyLab8 with a high-frequency (12-18 MHz) transducer. Ethical approval for this study was obtained. Importantly, the ultrasonographer was not blinded to clinical data.

In total, 64 patients who were referred to the fast-track pathway were included in the analysis. Of these, 65% were female, and the mean age was 75.3 years. The prevalence of GCA was 26.5% (17 patients were diagnosed with GCA). Limitations of this study were the retrospective design, the lack of blinding of assessors to clinical data, and the use of steroids at baseline by 40-52% of the included patients.

Mukhtyar (2020) performed a cohort study to investigate the diagnostic value of the US ‘halo sign’ compared with the gold standard for GCA diagnosis and with TAB. Patients were eligible for inclusion if they had an US within 7 days and a TAB within 28 days of commencing high-dose prednisolone. The first author performed US on a Toshiba Viamo US machine with a linear transducer (4–14 MHz) using tissue harmonic imaging mode. TAB was performed under local anaesthesia by an ophthalmic surgeon. The following definitions of results were used: US was defined as positive in the presence of non-compressible vessel wall oedema (the ‘halo sign’) in longitudinal and transverse views, stenosis or obstruction. TAB was defined as positive in the presence of intramural inflammatory infiltrate. Clinical decisions were recorded as GCA if clinicians chose to treat patients with the hospital-approved Norwich regimen for prednisolone. In total, 25 patients were included in the analysis. Detailed information about patient characteristics is missing. Twenty of them were initially treated as GCA; after 100 days, 16 of them were treated as GCA (i.e., prevalence 64%). Limitations of this study were the relatively small sample size and the fact that US was performed before TAB (it is possible that TAB was not performed if US was negative).

González Porto (2020) performed a prospective cohort study to investigate the diagnostic value of the US ‘halo sign’ compared with the gold standard for GCA diagnosis (i.e., clinical confirmation after 6-month follow-up). Patients with suspected temporal arteritis were eligible for inclusion. At the second visit, patients underwent two diagnostic tests: (1) colour Doppler ultrasound of both temporal arteries using a Mindray Z6 ultrasound machine with a 7L4P linear probe, with all US examinations performed by the same assessor; and (2) temporal artery biopsy, performed 36 h after US. Ethical approval for this study was obtained.

In total, 57 patients were included, of whom 4 dropped out because they refused biopsy (n=1) or died (n=3). Of the included patients, 21/57 were diagnosed with GCA (prevalence of 37%). Of the patients with GCA, 62% were female, and their mean age was 77 years. A limitation of this study was that assessors were not blinded to clinical data.

Conway (2019) performed a study using prospectively collected data from a registry. The aim of the study was to evaluate the performance of temporal artery ultrasound (TA US) compared to expert clinical judgement in patients presenting with suspected GCA. Patients were eligible for inclusion if they were referred with a new presentation of GCA and had both a TA US and a TAB. TA US was performed on a Philips iU22 scanner using a high-frequency linear array 12-MHz transducer. The examination was considered positive for temporal arteritis if any segment of the TA demonstrated circumferential hypoechoic mural thickening (the ‘halo’ sign). The clinical diagnosis was made by a rheumatologist at 6 months. Importantly, the rheumatologist was not blinded to the other results. Ethical approval for this study was obtained.
In total, 291 patients were recruited from the registry. Of these, 256 were followed up for at least 6 months. However, only 169 of them had both a TA US and a TAB. In addition, seven of them had non-arterial TAB specimens. Therefore, only 162 patients were included in the analysis. Of the included patients, 123/162 were diagnosed with GCA (prevalence of 76%). Of the patients with GCA, 59% were female, and their mean age was 71 years. Limitations of this study are that the reference standard (i.e., clinical diagnosis) is not described in detail and that the assessors (i.e., rheumatologists) were not blinded to the outcome of TA US.

Cranial MRI vs clinical practice

Rodriguez-Régent (2020) performed a prospective cohort study to investigate the diagnostic value of cranial MRI (CUBE T1, 3D) compared with the gold standard for GCA diagnosis. Patients with suspected GCA who underwent a cranial MRI as part of the diagnostic process were included. Exclusion criteria were a previous TAB or glucocorticoid (GC) therapy >48 before the MRI scan. All patients underwent the ‘usual’ diagnostic process as well. Based on this usual work-up, the medical doctor established the final diagnosis. MRI was performed in 1 center on a 3T unit (MR 750, GE Healthcare). The images were reviewed independently by two experienced neuroradiologists, blinded to all other data. Ethical approval for this study was obtained.
In total, 32 patients were included, of whom 78% were female; the mean age was 70.2 years. Of the included patients, 10/32 were diagnosed with GCA (prevalence of 31%). A limitation of this study is that the reference standard (i.e., clinical diagnosis) is not described in detail. In addition, 3T MRI is not widely available.

Poillon (2020) performed a prospective cohort study to investigate the diagnostic values of a cranial MRI (2D and 3D) compared with the clinical diagnosis according to the ACR criteria in patients with suspected GCA. If patients met the in- and exclusion criteria, they were eligible for inclusion. Cranial MRI was performed on a 3 T Philips Ingenia or a 3 T General Electric Discovery MRI with a dedicated 32-channel head coil. The clinical diagnosis of GCA was made in the case of a positive TAB. In patients with a negative TAB, a review of the clinical and biological charts including response to corticosteroid therapy (but excluding MRI findings) was performed by an interdisciplinary panel of rheumatologists and internists not involved in the management of the patient. By consensus, they established a final diagnosis of GCA based on ACR criteria. Importantly, the rheumatologist was blinded to the other results. Ethical approval for this study was obtained.
In total, 79 patients were included, of whom 51 were diagnosed with GCA (prevalence of 64%). Of all patients, 47% were female, and the median age was 75 years. A limitation of this study is that no 95% confidence intervals were presented for the diagnostic values, and it was not possible to calculate these. In addition, 3T MRI is not widely available.

Ultrasound and FDG PET/CT vs. clinical practice or TAB

Imfeld (2020) performed a (prospective) cohort study to investigate the diagnostic value of ultrasound (US) and FDG PET/CT-scan compared with the clinical diagnosis according to the ACR criteria in patients with suspected GCA. Patients were eligible for inclusion if they were referred with a new presentation of GCA and had both a US and an FDG PET/CT-scan. The FDG PET/CT-scan was performed with a GE Discovery STE 16 or Siemens Biograph. A cut-off value of 1.3 was used. US was performed with a Philips iU22 with 17-5 MHz and 9-3 MHz broadband transducers. The final diagnosis of GCA was made if TAB was positive, if patients fulfilled the 1990 ACR criteria, or if they fulfilled at least two out of five ACR criteria in combination with typical ‘vasculitic’ US findings in accordance with the OMERACT definition or vasculitic findings on FDG PET/CT-scan or MRI.

Importantly, the rheumatologist was blinded to the other results. Ethical approval for this study was obtained.

In total, 102 patients were included, of whom 68 were diagnosed with GCA (prevalence of 66%). Of these patients, 65% were female, and the median age was 75 years. One limitation of this study is that no 95% confidence intervals were presented for the diagnostic values, and it was not possible to calculate these.

Results

Results of the included studies are described per comparison. An overview of the results is shown in Table 1.

Ultrasound ‘halo sign’ vs. current practice

Molina Collada (2021) described the diagnostic values of ultrasound (US) with current practice as reference. The study showed that 15/17 GCA patients had a positive US (true positives), and that 45/47 patients without GCA had a negative US (true negatives). 2/17 GCA patients had a negative US (false negatives), and 2/47 patients without GCA had a positive US (false positives). This results in a sensitivity of 88% (95%CI 64% to 99%), and a specificity of 96% (95%CI 85% to 98%).

The study of Mukhtyar (2020) described the diagnostic values of US with current practice as reference. The study showed that 14/14 patients with a positive US were true positives, and that 9/11 patients with a negative US were true negatives. 2/11 patients with a negative US were false negative, and none of them were classified as false positive. This results in a sensitivity of 88% (95%CI 62% to 98%), and a specificity of 100% (95%CI 66% to 100%).

The study of González Porto (2020) described the sensitivity and specificity values of US (i.e., halo sign, temporal stenosis, arterial occlusion) with current practice as reference. The sensitivity (95%CI) for US halo sign, temporal stenosis and arterial occlusion was 33% (15% to 57%), 14% (3% to 36%) and 10% (1% to 30%), respectively. The specificity (95%CI) for US halo sign, temporal stenosis and arterial occlusion was 69% (51% to 83%), 94% (81% to 99%) and 97% (85% to 100%), respectively.

Conway (2019) described the sensitivity and specificity values of temporal artery US (i.e., halo sign) with current practice as reference. 65/123 patients with GCA had a positive US, resulting in a sensitivity of 53% (95%CI 44% to 62%). 28/39 patients without GCA had a negative US, resulting in a specificity of 72% (95%CI 55% to 85%).
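The point estimates above follow directly from the reported 2x2 counts. The sketch below recomputes sensitivity and specificity for Molina Collada (2021) and Conway (2019) and adds an exact (Clopper-Pearson) binomial confidence interval. It is illustrative only: the included studies do not all state which interval method they used, so the exact method is an assumption, and the function names are chosen for this example.

```python
from math import comb


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""

    def solve(func, target):
        # func is increasing in p on [0, 1]; bisect until func(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if func(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower bound: P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else solve(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2.0
    )
    # Upper bound: P(X <= k | p) = alpha / 2.
    upper = 1.0 if k == n else solve(
        lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2.0
    )
    return lower, upper


# 2x2 counts as reported in the text above.
molina = dict(tp=15, fn=2, tn=45, fp=2)
conway = dict(tp=65, fn=58, tn=28, fp=11)

for name, counts in (("Molina Collada", molina), ("Conway", conway)):
    sens, spec = sensitivity_specificity(**counts)
    ci_low, ci_high = clopper_pearson(counts["tp"], counts["tp"] + counts["fn"])
    print(f"{name}: sensitivity {sens:.0%} ({ci_low:.0%} to {ci_high:.0%}), "
          f"specificity {spec:.0%}")
```

With small groups, such as the 17 GCA patients in Molina Collada (2021), the interval is wide, which is the imprecision referred to in the GRADE assessment.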

Imfeld (2020) described the sensitivity and specificity of US compared with current clinical practice as reference standard. This resulted in a sensitivity of 57%, and a specificity of 97%.

Overall, based on 5 studies including 449 patients, the sensitivity ranges from 33% to 98%, and the specificity ranges from 55% to 100%.

Ultrasound ‘halo sign’ vs. TAB

The study of Mukhtyar (2020) described the diagnostic values of ultrasound with temporal artery biopsy (TAB) as reference. The study showed that 7/8 patients with a positive TAB had a positive US (true positives), and that 10/17 patients with a negative TAB had a negative US (true negatives). 1/8 patients with a positive TAB had a negative US (false negative), and 7/17 patients with a negative TAB had a positive US (false positives). This results in a sensitivity of 88% (95%CI 47% to 100%), and a specificity of 59% (95%CI 33% to 82%).

Cranial MRI vs. current practice

Rodriguez-Régent (2020) described the diagnostic values of cranial MRI (i.e., mural enhancement in 3D post-contrast CUBE T1) compared with the clinical diagnosis as reference. 8/10 patients with GCA had a positive cranial MRI, compared with none of the patients without GCA. This resulted in a sensitivity of 80% and a specificity of 100%.

Poillon (2020) described the diagnostic values of cranial MRI (i.e., 2D contrast-enhanced vessel-wall, axial-only 3D contrast-enhanced vessel-wall, and reformatted 3D contrast-enhanced vessel-wall imaging) compared with the clinical diagnosis as reference. This resulted in a sensitivity of 70%, 73% and 80%, and a specificity of 85%, 93% and 100% for 2D, axial-only 3D, and reformatted 3D, respectively.

Overall, based on 2 studies including 111 patients, the sensitivity ranges from 73% to 80%, and the specificity ranges from 93% to 100% using a 3D MRI.

FDG PET/CT-scan vs. current practice

Imfeld (2020) described the sensitivity and specificity of FDG PET/CT-scan compared with current clinical practice as reference standard. This resulted in a sensitivity of 72%, and a specificity of 85%. When US and FDG PET/CT-scan were combined, the sensitivity increased to 88%, and the specificity decreased to 82%.
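The direction of this change (higher sensitivity, lower specificity) is what is expected when a patient is classified as positive if either test is positive. The sketch below illustrates this with the single-test values reported for Imfeld (2020). It rests on a strong assumption that is not taken from the study: that the two tests err independently of each other given disease status. The study itself presumably derived the combined values from patient-level data, so the calculation is an illustration and not a reconstruction of the authors' method.

```python
def combine_either_positive(sens_a, spec_a, sens_b, spec_b):
    """Accuracy of the rule 'positive if test A or test B is positive'.

    Assumes the tests are conditionally independent given disease status
    (an assumption made for this illustration).
    """
    # A case is missed only if both tests miss it.
    sensitivity = 1.0 - (1.0 - sens_a) * (1.0 - sens_b)
    # A non-case is correctly negative only if both tests are negative.
    specificity = spec_a * spec_b
    return sensitivity, specificity


# Single-test values reported for Imfeld (2020) in the text above.
ultrasound = (0.57, 0.97)   # (sensitivity, specificity)
fdg_pet_ct = (0.72, 0.85)   # (sensitivity, specificity)

combined_sens, combined_spec = combine_either_positive(*ultrasound, *fdg_pet_ct)
print(f"combined sensitivity {combined_sens:.0%}, specificity {combined_spec:.0%}")
```

Under this assumption the combined values come out close to the reported 88% and 82%, which is consistent with an "either test positive" rule; it does not show that the tests are in fact independent.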

Table 1: Summary of results.

Diagnostic test	Study	Reference test	Number of patients	Sensitivity	Specificity
Ultrasound ‘halo sign’	Molina Collada (2021)	Clinical diagnosis, 6 months FU	N=64	88%	96%
	Mukhtyar (2020)	Clinical diagnosis, 100 wks FU	N=25	88%	100%
	González Porto (2020)	Clinical diagnosis, 6 months FU	N=57	33%	69%
	Conway (2019)	Clinical diagnosis, 6 months FU	N=162	53%	72%
	Imfeld (2020)	TAB, ACR criteria, or partial ACR criteria plus imaging (ultrasound, FDG PET/CT-scan or MRI)	N=102	57%	97%
	Mukhtyar (2020)	TAB	N=25	88%	59%
Ultrasound temporal stenosis	González Porto (2020)	Clinical diagnosis, 6 months FU	N=57	14%	94%
Ultrasound arterial occlusion	González Porto (2020)	Clinical diagnosis, 6 months FU	N=57	10%	97%
MRI 3D	Rodriguez-Régent (2020)	Clinical diagnosis	N=32	80%	100%
	Poillon (2020)	ACR criteria / TAB	N=79	70% (2D), 80% (3D)	85% (2D), 100% (3D)
FDG PET/CT-scan	Imfeld (2020)	TAB, ACR criteria, or partial ACR criteria plus imaging (ultrasound, FDG PET/CT-scan or MRI)	N=102	72%; 88% (PET/CT-scan or ultrasound)	85%; 82% (PET/CT-scan or ultrasound)

FU= follow-up, TAB= temporal artery biopsy

Level of evidence of the literature

The level of evidence of the literature from the BSR-guideline (Mackie, 2020) was not assessed in the current summary of literature. We refer to this publication for detailed information.

The level of evidence (GRADE method) of literature included in the update was determined per comparison and diagnostic outcome measure and was based on results from diagnostic accuracy studies and therefore starts at level “high”. Subsequently, the level of evidence was downgraded if there were relevant shortcomings in one of the several GRADE domains: risk of bias, inconsistency, indirectness, imprecision, and publication bias.

Ultrasound ‘halo sign’ vs. current practice

The level of evidence regarding the outcome measure sensitivity and specificity was downgraded by 2 levels because of risk of bias (1 level, assessors were not blinded in 3/4 studies), and imprecision (1 level, wide ranges).

Ultrasound ‘halo sign’ vs. TAB

The level of evidence regarding the outcome measures sensitivity and specificity was downgraded by 3 levels because of risk of bias (1 level, assessors were aware of the result of the index test) and imprecision (2 levels, only one study of 25 patients).

Cranial MRI vs. current practice

The level of evidence regarding the outcome measures sensitivity and specificity was downgraded by 2 levels because of risk of bias (1 level, potential selection bias in one of the included studies) and imprecision (1 level, only two studies including 111 patients in total).

FDG PET/CT-scan vs. current practice

The level of evidence regarding the outcome measures sensitivity and specificity was downgraded by 3 levels because of risk of bias (1 level, potential selection bias in the included study) and imprecision (2 levels, only one study of 102 patients).

Search and select

An update of the literature search of the BSR guideline was performed for the current clinical question. This concerns a search for new literature up to and including 2021. Articles published in mid-2022 were no longer included in this search.

What is the value of diagnostic tests compared to current practice in the diagnosis of giant cell arteritis (GCA)?

P: patients with suspected GCA

I: additional diagnostic tests (advanced imaging, MRI, FDG-PET, duplex and B-mode ultrasound of cranial and/or extracranial arteries, halo sign, halo score, halo-to-Doppler ratio, compression sign, stenosis, occlusion, intima-media thickness, adventitia thickness and slope sign, very-high frequency ultrasound, very-high resolution ultrasound)

C: current practice (clinical diagnosis (without formal criteria), ACR classification criteria and temporal artery biopsy result; GCA with/without extra-cranial)

O: diagnostic value, diagnostic accuracy, diagnosis, reliability, utility, visual loss, patient reported outcomes

Relevant outcome measures

The guideline development group considered diagnostic value and diagnostic accuracy to be critical outcome measures for decision making, and diagnosis, reliability, utility, visual loss and patient reported outcomes to be important outcome measures for decision making.

The working group defined the outcome measures as follows: Diagnostic studies included direct outcomes (true positives, true negatives, false positives, false negatives, sensitivities and specificities; complications of the index test and of the reference standard; resource use), the number of studies and quality assessment related to each of these outcomes and the effect estimate (i.e. number of individuals classified per 1000 people) according to different pretest probabilities [low (<20%), intermediate (20–50%) and high (>50%) pretest probability].
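The effect estimate "number of individuals classified per 1000 people" can be derived from sensitivity, specificity and a pretest probability. The sketch below is illustrative only: it uses the pooled cranial MRI estimates versus clinical diagnosis reported earlier in this summary (sensitivity 75%, specificity 89%), and the three pretest probabilities are hypothetical single values picked from within the low, intermediate and high categories; they are not taken from the guideline.

```python
def classify_per_1000(pretest, sensitivity, specificity, population=1000):
    """Expected true/false positives and negatives in a tested population."""
    diseased = population * pretest
    healthy = population - diseased
    true_pos = diseased * sensitivity
    false_neg = diseased - true_pos
    true_neg = healthy * specificity
    false_pos = healthy - true_neg
    # Rounded for display; rounded counts may not sum exactly to the population.
    return {
        "TP": round(true_pos),
        "FN": round(false_neg),
        "TN": round(true_neg),
        "FP": round(false_pos),
    }


# Hypothetical example values within the low (<20%), intermediate (20-50%)
# and high (>50%) pretest probability categories.
scenarios = {"low": 0.10, "intermediate": 0.35, "high": 0.70}

for label, pretest in scenarios.items():
    counts = classify_per_1000(pretest, sensitivity=0.75, specificity=0.89)
    print(f"{label} ({pretest:.0%}): {counts}")
```

The same sensitivity and specificity produce very different numbers of false positives and false negatives across the three categories, which is why the outcome is reported per pretest probability.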

Search and select (Methods)

The databases Medline (via OVID) and Embase (via Embase.com) were searched (updated) with relevant search terms from 2018 until May 2021. The detailed search strategy is depicted under the tab Methods. The systematic literature search resulted in 860 hits. Studies were selected based on the following criteria:

- patients with suspected GCA,

- comparison between intervention (i.e., laboratory test or additional diagnostic tests) and control (i.e., current practice: clinical diagnosis (without formal criteria), ACR classification criteria and temporal artery biopsy result; GCA with/without extra-cranial),

- At least one of the outcome measures of interest, as described in the PICO, is reported.

Thirty-six studies were initially selected based on title and abstract screening. After reading the full text, twenty-nine studies were excluded (see the table with reasons for exclusion under the tab Methods), and seven studies were included.

Results

Seven studies were included in the current update of the analysis of the literature. Important study characteristics and results are summarized in the evidence tables. The assessment of the risk of bias is summarized in the risk of bias tables.

References

Adler S, Sprecher M, Wermelinger F, Klink T, Bonel H, Villiger PM. Diagnostic value of contrast-enhanced magnetic resonance angiography in large-vessel vasculitis. Swiss Med Wkly. 2017 Feb 17;147:w14397. Doi: 10.4414/smw.2017.14397. PMID: 28322426.
Aschwanden M, Daikeler T, Kesten F, Baldi T, Benz D, Tyndall A, Imfeld S, Staub D, Hess C, Jaeger KA. Temporal artery compression sign - a novel ultrasound finding for the diagnosis of giant cell arteritis. Ultraschall Med. 2013 Feb;34(1):47-50. Doi: 10.1055/s-0032-1312821. Epub 2012 Jun 12. PMID: 22693039.
Aschwanden M, Imfeld S, Staub D, Baldi T, Walker UA, Berger CT, Hess C, Daikeler T. The ultrasound compression sign to diagnose temporal giant cell arteritis shows an excellent interobserver agreement. Clin Exp Rheumatol. 2015 Mar-Apr;33(2 Suppl 89):S-113-5. Epub 2015 May 26. PMID: 26016760.
Bley TA, Uhl M, Carew J, Markl M, Schmidt D, Peter HH, Langer M, Wieben O. Diagnostic value of high-resolution MR imaging in giant cell arteritis. AJNR Am J Neuroradiol. 2007 Oct;28(9):1722-7. Doi: 10.3174/ajnr.A0638. Epub 2007 Sep 20. PMID: 17885247.
Bley TA, Weiben O, Uhl M, Vaith P, Schmidt D, Warnatz K, Langer M. Assessment of the cranial involvement pattern of giant cell arteritis with 3T magnetic resonance imaging. Arthritis Rheum. 2005 Aug;52(8):2470-7. Doi: 10.1002/art.21226. PMID: 16052572.
Bley TA, Markl M, Schelp M, Uhl M, Frydrychowicz A, Vaith P, Peter HH, Langer M, Warnatz K. Mural inflammatory hyperenhancement in MRI of giant cell (temporal) arteritis resolves under corticosteroid treatment. Rheumatology (Oxford). 2008 Jan;47(1):65-7. doi: 10.1093/rheumatology/kem283. PMID: 18077491.
Blockmans D, Stroobants S, Maes A, Mortelmans L. Positron emission tomography in giant cell arteritis and polymyalgia rheumatica: evidence for inflammation of the aortic arch. Am J Med. 2000 Feb 15;108(3):246-9. Doi: 10.1016/s0002-9343(99)00424-6. PMID: 10723979.
Bosch P, Dejaco C, Schmidt WA, Schlüter KD, Pregartner G, Schäfer VS. Association of ultrasound-confirmed axillary artery vasculitis and clinical outcomes in giant cell arteritis. Semin Arthritis Rheum. 2022 Oct;56:152051. doi: 10.1016/j.semarthrit.2022.152051. Epub 2022 Jun 15. PMID: 35780722.
Chrysidis S, Duftner C, Dejaco C, Schäfer VS, Ramiro S, Carrara G, Scirè CA, Hocevar A, Diamantopoulos AP, Iagnocco A, Mukhtyar C, Ponte C, Naredo E, De Miguel E, Bruyn GA, Warrington KJ, Terslev L, Milchert M, D'Agostino MA, Koster MJ, Rastalsky N, Hanova P, Macchioni P, Kermani TA, Lorenzen T, Døhn UM, Fredberg U, Hartung W, Dasgupta B, Schmidt WA. Definitions and reliability assessment of elementary ultrasound lesions in giant cell arteritis: a study from the OMERACT Large Vessel Vasculitis Ultrasound Working Group. RMD Open. 2018 May 17;4(1):e000598. Doi: 10.1136/rmdopen-2017-000598. PMID: 29862043; PMCID: PMC5976098.
Dasgupta B, Borg FA, Hassan N, Alexander L, Barraclough K, Bourke B, Fulcher J, Hollywood J, Hutchings A, James P, Kyle V, Nott J, Power M, Samanta A; BSR and BHPR Standards, Guidelines and Audit Working Group. BSR and BHPR guidelines for the management of giant cell arteritis. Rheumatology (Oxford). 2010 Aug;49(8):1594-7. Doi: 10.1093/rheumatology/keq039a. Epub 2010 Apr 5. PMID: 20371504.
Dejaco C, Ramiro S, Duftner C, Besson FL, Bley TA, Blockmans D, Brouwer E, Cimmino MA, Clark E, Dasgupta B, Diamantopoulos AP, Direskeneli H, Iagnocco A, Klink T, Neill L, Ponte C, Salvarani C, Slart RHJA, Whitlock M, Schmidt WA. EULAR recommendations for the use of imaging in large vessel vasculitis in clinical practice. Ann Rheum Dis. 2018 May;77(5):636-
Dejaco C, Duftner C, Buttgereit F, Matteson EL, Dasgupta B. The spectrum of giant cell arteritis and polymyalgia rheumatica: revisiting the concept of the disease. Rheumatology (Oxford). 2017 Apr 1;56(4):506-515. doi: 10.1093/rheumatology/kew273. PMID: 27481272.
Diamantopoulos AP, Haugeberg G, Lindland A, Myklebust G. The fast-track ultrasound clinic for early diagnosis of giant cell arteritis significantly reduces permanent visual impairment: towards a more effective strategy to improve clinical outcome in giant cell arteritis? Rheumatology (Oxford). 2016 Jan;55(1):66-70. doi: 10.1093/rheumatology/kev289. Epub 2015 Aug 18. PMID: 26286743.
Diamantopoulos AP, Haugeberg G, Hetland H, Soldal DM, Bie R, Myklebust G. Diagnostic value of color Doppler ultrasonography of temporal arteries and large vessels in giant cell arteritis: a consecutive case series. Arthritis Care Res (Hoboken). 2014 Jan;66(1):113-9. Doi: 10.1002/acr.22178. PMID: 24106211.
Durling B, Toren A, Patel V, Gilberg S, Weis E, Jordan D. Incidence of discordant temporal artery biopsy in the diagnosis of giant cell arteritis. Canadian journal of ophthalmology. Journal canadien d'ophtalmologie. 2014 Apr;49(2):157-161.
Geiger J, Bley T, Uhl M, Frydrychowicz A, Langer M, Markl M. Diagnostic value of T2-weighted imaging for the detection of superficial cranial artery inflammation in giant cell arteritis. J Magn Reson Imaging. 2010 Feb;31(2):470-4. Doi: 10.1002/jmri.22047. PMID: 20099359.
Grossman C, Barshack I, Bornstein G, Ben-Zvi I. Is temporal artery biopsy essential in all cases of suspected giant cell arteritis? Clin Exp Rheumatol. 2015 Mar-Apr;33(2 Suppl 89):S-84-9. Epub 2015 May 26. PMID: 26016755.
Habib HM, Essa AA, Hassan AA. Color duplex ultrasonography of temporal arteries: role in diagnosis and follow-up of suspected cases of temporal arteritis. Clin Rheumatol. 2012 Feb;31(2):231-7. doi: 10.1007/s10067-011-1808-0. Epub 2011 Jul 9. PMID: 21743987.
Hauenstein C, Reinhard M, Geiger J, Markl M, Hetzel A, Treszl A, Vaith P, Bley TA. Effects of early corticosteroid treatment on magnetic resonance imaging and ultrasonography findings in giant cell arteritis. Rheumatology (Oxford). 2012 Nov;51(11):1999-2003. doi: 10.1093/rheumatology/kes153. Epub 2012 Jul 6. PMID: 22772317.
Jakobsson K, Jacobsson L, Mohammad AJ, Nilsson JÅ, Warrington K, Matteson EL, Turesson C. The effect of clinical features and glucocorticoids on biopsy findings in giant cell arteritis. BMC Musculoskelet Disord. 2016 Aug 24;17(1):363. doi: 10.1186/s12891-016-1225-2. PMID: 27558589; PMCID: PMC4997683.
Karahaliou M, Vaiopoulos G, Papaspyrou S, Kanakis MA, Revenas K, Sfikakis PP. Colour duplex sonography of temporal arteries before decision for biopsy: a prospective study in 55 patients with suspected giant cell arteritis. Arthritis Res Ther. 2006;8(4):R116. doi: 10.1186/ar2003. PMID: 16859533; PMCID: PMC1779378.
Klink T, Geiger J, Both M, Ness T, Heinzelmann S, Reinhard M, Holl-Ulrich K, Duwendag D, Vaith P, Bley TA. Giant cell arteritis: diagnostic accuracy of MR imaging of superficial cranial arteries in initial diagnosis-results from a multicenter trial. Radiology. 2014 Dec;273(3):844-52. doi: 10.1148/radiol.14140056. Epub 2014 Aug 6. PMID: 25102371.
Laskou F, Coath F, Mackie SL, Banerjee S, Aung T, Dasgupta B. A probability score to aid the diagnosis of suspected giant cell arteritis. Clin Exp Rheumatol. 2019 Mar-Apr;37 Suppl 117(2):104-108. Epub 2019 Feb 15. PMID: 30767870.
LeSar CJ, Meier GH, DeMasi RJ, Sood J, Nelms CR, Carter KA, Gayle RG, Parent FN, Marcinczyk MJ. The utility of color duplex ultrasonography in the diagnosis of temporal arteritis. J Vasc Surg. 2002 Dec;36(6):1154-60. doi: 10.1067/mva.2002.129648. PMID: 12469046.
Lariviere D, Benali K, Coustet B, Pasi N, Hyafil F, Klein I, Chauchard M, Alexandra JF, Goulenok T, Dossier A, Dieude P, Papo T, Sacre K. Positron emission tomography and computed tomography angiography for the diagnosis of giant cell arteritis: A real-life prospective study. Medicine (Baltimore). 2016 Jul;95(30):e4146. doi: 10.1097/MD.0000000000004146. PMID: 27472684; PMCID: PMC5265821.
Lie JT. Illustrated histopathologic classification criteria for selected vasculitis syndromes. American College of Rheumatology Subcommittee on Classification of Vasculitis. Arthritis Rheum. 1990 Aug;33(8):1074-87. doi: 10.1002/art.1780330804. PMID: 1975173.
Luqmani R, Lee E, Singh S, Gillett M, Schmidt WA, Bradburn M, Dasgupta B, Diamantopoulos AP, Forrester-Barker W, Hamilton W, Masters S, McDonald B, McNally E, Pease C, Piper J, Salmon J, Wailoo A, Wolfe K, Hutchings A. The Role of Ultrasound Compared to Biopsy of Temporal Arteries in the Diagnosis and Treatment of Giant Cell Arteritis (TABUL): a diagnostic accuracy and cost-effectiveness study. Health Technol Assess. 2016 Nov;20(90):1-238. doi: 10.3310/hta20900. PMID: 27925577; PMCID: PMC5165283.
Mackie SL, Dejaco C, Appenzeller S, Camellino D, Duftner C, Gonzalez-Chiappe S, Mahr A, Mukhtyar C, Reynolds G, de Souza AWS, Brouwer E, Bukhari M, Buttgereit F, Byrne D, Cid MC, Cimmino M, Direskeneli H, Gilbert K, Kermani TA, Khan A, Lanyon P, Luqmani R, Mallen C, Mason JC, Matteson EL, Merkel PA, Mollan S, Neill L, Sullivan EO, Sandovici M, Schmidt WA, Watts R, Whitlock M, Yacyshyn E, Ytterberg S, Dasgupta B. British Society for Rheumatology guideline on diagnosis and treatment of giant cell arteritis. Rheumatology (Oxford). 2020 Mar 1;59(3):e1-e23. doi: 10.1093/rheumatology/kez672. PMID: 31970405.
Maleszewski JJ, Younge BR, Fritzlen JT, Hunder GG, Goronzy JJ, Warrington KJ, Weyand CM. Clinical and pathological evolution of giant cell arteritis: a prospective study of follow-up temporal artery biopsies in 40 treated patients. Mod Pathol. 2017 Jun;30(6):788-796. doi: 10.1038/modpathol.2017.10. Epub 2017 Mar 3. PMID: 28256573; PMCID: PMC5650068.
Marvisi C, Accorsi Buttini E, Vaglio A. Aortitis and periaortitis: The puzzling spectrum of inflammatory aortic diseases. Presse Med. 2020 Apr;49(1):104018. doi: 10.1016/j.lpm.2020.104018. Epub 2020 Mar 28. PMID: 32234379.
McDonnell PJ, Moore GW, Miller NR, Hutchins GM, Green WR. Temporal arteritis. A clinicopathologic study. Ophthalmology. 1986 Apr;93(4):518-30. doi: 10.1016/s0161-6420(86)33706-0. PMID: 3703528.
Melville AR, Donaldson K, Dale J, Ciechomska A. Validation of the Southend giant cell arteritis probability score in a Scottish single-centre fast-track pathway. Rheumatol Adv Pract. 2021 Dec 15;6(1):rkab102. doi: 10.1093/rap/rkab102. PMID: 35059557; PMCID: PMC8765789.
Michailidou D, Rosenblum JS, Rimland CA, Marko J, Ahlman MA, Grayson PC. Clinical symptoms and associated vascular imaging findings in Takayasu's arteritis compared to giant cell arteritis. Ann Rheum Dis. 2020 Feb;79(2):262-267. doi: 10.1136/annrheumdis-2019-216145. Epub 2019 Oct 24. PMID: 31649025.
Monti S, Bartoletti A, Bellis E, Delvino P, Montecucco C. Fast-Track Ultrasound Clinic for the Diagnosis of Giant Cell Arteritis Changes the Prognosis of the Disease but Not the Risk of Future Relapse. Front Med (Lausanne). 2020 Dec 8;7:589794. doi: 10.3389/fmed.2020.589794. PMID: 33364248; PMCID: PMC7753207.
Monti S, Floris A, Ponte CB, Schmidt WA, Diamantopoulos AP, Pereira C, Vaggers S, Luqmani RA. The proposed role of ultrasound in the management of giant cell arteritis in routine clinical practice. Rheumatology (Oxford). 2018 Jan 1;57(1):112-119. doi: 10.1093/rheumatology/kex341. PMID: 29045738.
Murgatroyd H, Nimmo M, Evans A, MacEwen C. The use of ultrasound as an aid in the diagnosis of giant cell arteritis: a pilot study comparing histological features with ultrasound findings. Eye (Lond). 2003 Apr;17(3):415-9. doi: 10.1038/sj.eye.6700350. PMID: 12724706.
Nesher G, Shemesh D, Mates M, Sonnenblick M, Abramowitz HB. The predictive value of the halo sign in color Doppler ultrasonography of the temporal arteries for diagnosing giant cell arteritis. J Rheumatol. 2002 Jun;29(6):1224-6. PMID: 12064840.
Neuman LM, van Nieuwland M, Vermeer M, Boumans D, Colin EM, Alves C. External validation of the giant cell arteritis probability score in the Netherlands. Clin Exp Rheumatol. 2022 May;40(4):787-792. doi: 10.55563/clinexprheumatol/ckvbpg. Epub 2021 Nov 29. PMID: 34874827.
Nielsen BD, Gormsen LC, Hansen IT, Keller KK, Therkildsen P, Hauge EM. Three days of high-dose glucocorticoid treatment attenuates large-vessel FDG uptake in large-vessel giant cell arteritis but with a limited impact on diagnostic accuracy. Eur J Nucl Med Mol Imaging. 2018 Jul;45(7):1119-1128. doi: 10.1007/s00259-018-4021-4. Epub 2018 Apr 18. PMID: 29671039.
Reinhard M, Schmidt D, Hetzel A. Color-coded sonography in suspected temporal arteritis-experiences after 83 cases. Rheumatol Int. 2004 Nov;24(6):340-6. doi: 10.1007/s00296-003-0372-6. Epub 2003 Nov 5. PMID: 14600785.
Patil P, Williams M, Maw WW, Achilleos K, Elsideeg S, Dejaco C, Borg F, Gupta S, Dasgupta B. Fast track pathway reduces sight loss in giant cell arteritis: results of a longitudinal observational cohort study. Clin Exp Rheumatol. 2015 Mar-Apr;33(2 Suppl 89):S-103-6. Epub 2015 May 26. PMID: 26016758.
Pfadenhauer K, Weber H. Duplex sonography of the temporal and occipital artery in the diagnosis of temporal arteritis. A prospective study. J Rheumatol. 2003 Oct;30(10):2177-81. PMID: 14528514.
Ponte C, Grayson PC, Robson JC, Suppiah R, Gribbons KB, Judge A, Craven A, Khalid S, Hutchings A, Watts RA, Merkel PA, Luqmani RA; DCVAS Study Group. 2022 American College of Rheumatology/EULAR Classification Criteria for Giant Cell Arteritis. Arthritis Rheumatol. 2022 Nov 8. doi: 10.1002/art.42325. Epub ahead of print. PMID: 36350123.
Prieto-González S, Depetris M, García-Martínez A, Espígol-Frigolé G, Tavera-Bahillo I, Corbera-Bellata M, Planas-Rigol E, Alba MA, Hernández-Rodríguez J, Grau JM, Lomeña F, Cid MC. Positron emission tomography assessment of large vessel inflammation in patients with newly diagnosed, biopsy-proven giant cell arteritis: a prospective, case-control study. Ann Rheum Dis. 2014 Jul;73(7):1388-92. doi: 10.1136/annrheumdis-2013-204572. Epub 2014 Mar 24. PMID: 24665112.
Prieto-González S, Arguis P, García-Martínez A, Espígol-Frigolé G, Tavera-Bahillo I, Butjosa M, Sánchez M, Hernández-Rodríguez J, Grau JM, Cid MC. Large vessel involvement in biopsy-proven giant cell arteritis: prospective study in 40 newly diagnosed patients using CT angiography. Ann Rheum Dis. 2012 Jul;71(7):1170-6. doi: 10.1136/annrheumdis-2011-200865. Epub 2012 Jan 20. PMID: 22267328.
Pugh D, Grayson P, Basu N, Dhaun N. Aortitis: recent advances, current concepts and future possibilities. Heart. 2021 Oct;107(20):1620-1629. doi: 10.1136/heartjnl-2020-318085. Epub 2021 Feb 16. PMID: 33593995.
Rhéaume M, Rebello R, Pagnoux C, Carette S, Clements-Baker M, Cohen-Hallaleh V, Doucette-Preville D, Stanley Jackson B, Salama Sargious Salama S, Ioannidis G, Khalidi NA. High-Resolution Magnetic Resonance Imaging of Scalp Arteries for the Diagnosis of Giant Cell Arteritis: Results of a Prospective Cohort Study. Arthritis Rheumatol. 2017 Jan;69(1):161-168. doi: 10.1002/art.39824. PMID: 27483045.
Schmidt WA, Kraft HE, Vorpahl K, Völker L, Gromnica-Ihle EJ. Color duplex ultrasonography in the diagnosis of temporal arteritis. N Engl J Med. 1997 Nov 6;337(19):1336-42. doi: 10.1056/NEJM199711063371902. PMID: 9358127.
Schmidt WA, Seifert A, Gromnica-Ihle E, Krause A, Natusch A. Ultrasound of proximal upper extremity arteries to increase the diagnostic yield in large-vessel giant cell arteritis. Rheumatology (Oxford). 2008 Jan;47(1):96-101. doi: 10.1093/rheumatology/kem322. PMID: 18077499.
Schmidt WA. Ultrasound in the diagnosis and management of giant cell arteritis. Rheumatology (Oxford). 2018 Feb 1;57(suppl_2):ii22-ii31. doi: 10.1093/rheumatology/kex461. PMID: 29982780.
Sebastian A, Tomelleri A, Kayani A, Prieto-Pena D, Ranasinghe C, Dasgupta B. Probability-based algorithm using ultrasound and additional tests for suspected GCA in a fast-track clinic. RMD Open. 2020 Sep;6(3):e001297. doi: 10.1136/rmdopen-2020-001297. PMID: 32994361; PMCID: PMC7547539.
Siemonsen S, Brekenfeld C, Holst B, Kaufmann-Buehler AK, Fiehler J, Bley TA. 3T MRI reveals extra- and intracranial involvement in giant cell arteritis. AJNR Am J Neuroradiol. 2015 Jan;36(1):91-7. doi: 10.3174/ajnr.A4086. Epub 2014 Aug 28. PMID: 25169925; PMCID: PMC7965928.
Walter MA, Melzer RA, Schindler C, Müller-Brand J, Tyndall A, Nitzsche EU. The value of [18F]FDG-PET in the diagnosis of large-vessel vasculitis and the assessment of activity and extent of disease. Eur J Nucl Med Mol Imaging. 2005 Jun;32(6):674-81. doi: 10.1007/s00259-004-1757-9. Epub 2005 Mar 4. PMID: 15747154.
Wiberg F, Naderi N, Mohammad AJ, Turesson C. Evaluation of revised classification criteria for giant cell arteritis and its clinical phenotypes. Rheumatology (Oxford). 2021 Dec 24;61(1):383-387. doi: 10.1093/rheumatology/keab353. PMID: 33871583; PMCID: PMC8742823.
Yamada I, Nakagawa T, Himeno Y, Numano F, Shibuya H. Takayasu arteritis: evaluation of the thoracic aorta with CT angiography. Radiology. 1998 Oct;209(1):103-9. doi: 10.1148/radiology.209.1.9769819. PMID: 9769819.
Zarka F, Rhéaume M, Belhocine M, Goulet M, Febrer G, Mansour AM, Troyanov Y, Starnino T, Meunier RS, Chagnon I, Routhier N, Bénard V, Ducharme-Bénard S, Ross C, Makhzoum JP. Colour Doppler ultrasound and the giant cell arteritis probability score for the diagnosis of giant cell arteritis: a Canadian single-centre experience. Rheumatol Adv Pract. 2021 Nov 10;5(3):rkab083. doi: 10.1093/rap/rkab083. PMID: 34859177; PMCID: PMC8633428.
Zhou L, Luneau K, Weyand CM, Biousse V, Newman NJ, Grossniklaus HE. Clinicopathologic correlations in giant cell arteritis: a retrospective study of 107 cases. Ophthalmology. 2009 Aug;116(8):1574-80. doi: 10.1016/j.ophtha.2009.02.027. Epub 2009 Jun 4. PMID: 19500846; PMCID: PMC2721017.

Evidence tables

Research question: What is the value of diagnostic tests compared to current practice in the diagnosis of giant cell arteritis (GCA)?

Study reference	Study characteristics	Patient characteristics	Index test (test of interest)	Reference test	Follow-up	Outcome measures and effect size	Comments
Molina Collada, 2021	Type of study^[1]: Retrospective observational study; Setting and country: Hospital, Spain; Funding and conflicts of interest: None.	Inclusion criteria: Patients referred to the FTP underwent US examination within 24 h per protocol. The study was performed in routine daily practice conditions including consecutive unselected patients. Exclusion criteria: -; N=64; Prevalence: 26.5%; Mean age: 75.3 years; Sex: 65% female; Other important characteristics: -	Describe index test: Ultrasound, halo sign positive; Cut-off point(s): n.a.	Describe reference test^[2]: The gold standard for GCA diagnosis was clinical confirmation after 6-month follow-up. Cut-off point(s): n.a.	Time between the index test and reference test: 6 months; For how many participants were no complete outcome data available? N=0 (0%); Reasons for incomplete outcome data described? n.a.	Outcome measures and effect size (include 95%CI and p-value if available)⁴: Halo sign positive vs. clinical; Sensitivity: 88% (64% to 99%); Specificity: 96% (85% to 99%)	Retrospective design. Patients used steroids at baseline. Ultrasonographer was not blinded to clinical data. Low prevalence.
Mukhtyar, 2020	Type of study: Cohort; Setting and country: Hospital, UK; Funding and conflicts of interest: First author receives funding from NIHR for 3 hours every week.	Inclusion criteria: US within 7 days and TAB within 28 days of commencing high-dose prednisolone; Exclusion criteria: -; N=25; Prevalence: 16/25 = 64%; Mean age ± SD: not mentioned; Sex: not mentioned; Other important characteristics: -	Describe index test: US halo positive; Cut-off point(s): n.a.; Comparator test: TAB positive; Cut-off point(s): n.a.	Describe reference test: Clinical GCA after 100 days; Cut-off point(s): n.a.	Time between the index test and reference test: Max. 21 days; For how many participants were no complete outcome data available? N=0 (0%); Reasons for incomplete outcome data described? n.a.	Outcome measures and effect size (include 95%CI and p-value if available): Halo sign positive vs. TAB; Sensitivity: 88% (47% to 99%); Specificity: 59% (33% to 82%); Halo sign positive vs. clinical; Sensitivity: 88% (62% to 98%); Specificity: 100% (66% to 100%)	Patient characteristics are not mentioned. Small sample size. Possible selection bias, as patients may have dropped out because of a negative US. High prevalence.
González Porto, 2020	Type of study: Prospective study; Setting and country: Hospital, Mexico; Funding and conflicts of interest: None.	Inclusion criteria: Patients with suspected temporal arteritis. Exclusion criteria: -; N=57; Prevalence: 37%; Mean age ± SD: 77.4 (8.6)*; Sex: 62% female*; *patients with GCA.	Describe index test: US halo positive; Cut-off point(s): n.a.; Comparator test: TAB positive; Cut-off point(s): n.a.	Describe reference test: Clinical diagnosis after 6 months of follow-up; Cut-off point(s): n.a.	Time between the index test and reference test: 6 months; For how many participants were no complete outcome data available? N=4 (7%); Reasons for incomplete outcome data described? Died during follow-up, and one refused biopsy	Outcome measures and effect size (include 95%CI and p-value if available): Halo sign positive vs. clinical; Sensitivity: 33% (15% to 57%); Specificity: 69% (51% to 83%); Halo sign positive vs. temporal stenosis; Sensitivity: 14% (3% to 36%); Specificity: 84% (81% to 99%); Halo sign positive vs. arterial occlusion; Sensitivity: 10% (1% to 30%); Specificity: 97% (85% to 100%)	Assessors not blinded to outcomes of other parameters.
Rodriguez-Régent, 2020	Type of study: Prospective cohort study; Setting and country: Hospital, France; Funding and conflicts of interest: Not mentioned.	Inclusion criteria: Patients with a clinical suspicion of GCA who underwent MRI as part of the diagnostic process. Exclusion criteria: previous TAB or GC therapy >48 hours before the MRI scan. N=32; Prevalence: 10/32 = 31%; Mean age ± SD: 70 (12) years; Sex: 78% female; Other important characteristics: -	Describe index test: MRI CUBE T1 (3D); Cut-off point(s): n.a.; Comparator test: n.a.; Cut-off point(s): n.a.	Describe reference test: Clinical diagnosis based on TAB. Cut-off point(s): n.a.	Time between the index test and reference test: Not mentioned. For how many participants were no complete outcome data available? N=0 (0%); Reasons for incomplete outcome data described? n.a.	Outcome measures and effect size (include 95%CI and p-value if available): MRI vs. clinical diagnosis; Sensitivity: 80% (44% to 97%); Specificity: 100% (85% to 100%)	Assessors were blinded. No precise description of the reference standard; the time between measurements is also not mentioned. Outcomes only explained in text.
Conway, 2019	Type of study: Prospective registry; Setting and country: Hospital, Ireland; Funding and conflicts of interest: Not mentioned.	Inclusion criteria: New presentation of suspected GCA with both a TA US and a TAB. Exclusion criteria: -; N=291; Prevalence: 123/162 (78%); Mean age ± SD: 71 (10) years*; Sex: 59% female*; *patients with GCA.	Describe index test: Temporal artery ultrasound; Cut-off point(s): n.a.; Comparator test: Temporal artery biopsy; Cut-off point(s): n.a.	Describe reference test: Clinical diagnosis after 6 months of follow-up. Cut-off point(s): -	Time between the index test and reference test: 6 months; For how many participants were no complete outcome data available? N=129 (44%); Reasons for incomplete outcome data described? No follow-up of 6 months (n=35); only TA US or TAB (n=87); non-arterial TAB (n=7)	Outcome measures and effect size (include 95%CI and p-value if available): TA US vs. clinical diagnosis; Sensitivity: 52.8% (43.7% to 61.9%); Specificity: 71.8% (54.9% to 84.5%)	Rheumatologists were not blinded to the results of the TA US or TAB. Steroids were used before TA US or TAB. * correction in analysis.
Imfeld, 2020	Type of study: Cohort study; Setting and country: Hospital, Switzerland; Funding and conflicts of interest: None.	Inclusion criteria: Patients with suspicion of GCA presenting at the university hospital of Basel who received both US and PET/CT for work-up. Exclusion criteria: -; N=102; Prevalence: 68/102 (66%); Median age: 75 years*; Sex: 65% female*; *patients with GCA.	Describe index test: US; Cut-off point(s): n.a.; Comparator test: PET/CT scan; Cut-off point(s): 1.3	Describe reference test: TAB positive and ACR criteria; Cut-off point(s): n.a.	Time between the index test and reference test: Not mentioned in text. For how many participants were no complete outcome data available? N=0 (0%); Reasons for incomplete outcome data described? Only data present of patients who underwent the process.	Outcome measures and effect size (include 95%CI and p-value if available): US vs. clinical; Sensitivity: 0.97; Specificity: 0.57; PET/CT vs. clinical; Sensitivity: 0.85; Specificity: 0.72; US or PET/CT vs. clinical; Sensitivity: 0.82; Specificity: 0.88	No 95%CI mentioned. US and PET/CT readers were blinded to clinical diagnostic data and complementary imaging results.
Poilon, 2020	Type of study: Prospective cohort; Setting and country: Multicentre, France; Funding and conflicts of interest: None.	Inclusion criteria: (a) patients aged 50 years or older; (b) suspicion of GCA based on the presence of the following criteria: constitutional symptoms (anorexia, weight loss, fever, asthenia, malaise), new onset headache or neck pain, induration or pain of the temporal artery, jaw claudication, tongue or swallowing disorders, polymyalgia rheumatica, ophthalmological findings (transient visual loss or diplopia, unilateral or bilateral visual loss from anterior ischemic optic neuropathy or occlusion of the central retinal artery, diplopia secondary to ocular motor palsy or extraocular muscle ischemia), abnormal laboratory results (erythrocyte sedimentation rate (ESR), C-reactive protein (CRP) level); (c) completion of an MRI with CE-VW sequences. Exclusion criteria: the absence of both 2D and 3D CE-VW sequences; N=79; Prevalence: 51/79 (65%); Mean age ± SD: 75 (9.5) years; Sex: 47% female	Describe index test: MRI 2D and 3D; Cut-off point(s): For detailed information, see Table 1 of the article. Comparator test: -; Cut-off point(s): -	Describe reference test: In case of a positive TAB, based on the ACR criteria. Cut-off point(s): n.a.	Time between the index test and reference test: Not mentioned in text. For how many participants were no complete outcome data available? N=0 (0%); Reasons for incomplete outcome data described? n.a., strict criteria.	Outcome measures and effect size (include 95%CI and p-value if available): 2D vs. clinical; Sensitivity: 0.70; Specificity: 0.85; 3D vs. clinical; Sensitivity: 0.80; Specificity: 1.00	- Low sample size - More males than females? - Assessors were blinded to clinical data - 3-T MRI not widely available - GC could be used - No 95% confidence intervals
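
The sensitivity, specificity and prevalence figures in the table above are all proportions derived from a 2x2 table of index test result against reference diagnosis. As a minimal sketch of that arithmetic, the Python snippet below computes the three proportions and an approximate 95% confidence interval. The counts are hypothetical and are not taken from any of the included studies. The Wilson score interval is an assumption made for illustration; the individual studies may have used other methods (for example exact binomial intervals), so their reported intervals need not match this calculation.

```python
from math import sqrt


def proportion(successes: int, total: int) -> float:
    """Return successes / total, guarding against an empty denominator."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successes / total


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    p = proportion(successes, total)
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summarise a 2x2 table: index test (rows) vs. reference diagnosis (columns)."""
    diseased = tp + fn       # reference standard positive
    non_diseased = tn + fp   # reference standard negative
    return {
        "sensitivity": proportion(tp, diseased),
        "specificity": proportion(tn, non_diseased),
        "prevalence": proportion(diseased, diseased + non_diseased),
        "sensitivity_ci": wilson_interval(tp, diseased),
        "specificity_ci": wilson_interval(tn, non_diseased),
    }


# Hypothetical counts, chosen only to illustrate the calculation.
result = diagnostic_accuracy(tp=40, fp=5, fn=10, tn=45)
print(result)
```

With small samples, as in several of the studies above, the interval is wide, which is why point estimates from single studies should be interpreted cautiously.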


Risk of bias assessment diagnostic accuracy studies (QUADAS II, 2011)

Research question: What is the value of diagnostic tests compared to current practice in the diagnosis of giant cell arteritis (GCA)?

Study reference	Patient selection	Index test	Reference standard	Flow and timing	Comments with respect to applicability
Molina Collada, 2021	Was a consecutive or random sample of patients enrolled? Yes, patients referred to the FTP. The study was performed in routine daily practice conditions including consecutive unselected patients. Was a case-control design avoided? Yes, retrospective cohort Did the study avoid inappropriate exclusions? No, no exclusions	Were the index test results interpreted without knowledge of the results of the reference standard? Unclear, ultrasonographer was aware of clinical data. If a threshold was used, was it pre-specified? n.a.	Is the reference standard likely to correctly classify the target condition? Yes, gold standard Were the reference standard results interpreted without knowledge of the results of the index test? Unclear, not mentioned in text.	Was there an appropriate interval between index test(s) and reference standard? Yes, 6 months Did all patients receive a reference standard? Yes, retrospective cohort Did patients receive the same reference standard? Yes, gold standard. Were all patients included in the analysis? Yes, retrospective cohort.	Are there concerns that the included patients do not match the review question? No, patients with suspected GCA Are there concerns that the index test, its conduct, or interpretation differ from the review question? No. Are there concerns that the target condition as defined by the reference standard does not match the review question? No.
CONCLUSION: Could the selection of patients have introduced bias? RISK: LOW	CONCLUSION: Could the conduct or interpretation of the index test have introduced bias? RISK: Unclear, assessor was aware of clinical data.	CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias? RISK: Unclear, not mentioned that assessor was aware of index test.	CONCLUSION Could the patient flow have introduced bias? RISK: LOW
Mukhtyar, 2020	Was a consecutive or random sample of patients enrolled? Unclear, patients were included if they had an US and TAB. Was a case-control design avoided? Yes, cohort study Did the study avoid inappropriate exclusions? No, no exclusions	Were the index test results interpreted without knowledge of the results of the reference standard? Yes, first US thereafter TAB If a threshold was used, was it pre-specified? N.a.	Is the reference standard likely to correctly classify the target condition? No, first US thereafter TAB and clinical diagnosis. Were the reference standard results interpreted without knowledge of the results of the index test? No	Was there an appropriate interval between index test(s) and reference standard? Yes Did all patients receive a reference standard? Yes Did patients receive the same reference standard? Yes Were all patients included in the analysis? Yes	Are there concerns that the included patients do not match the review question? no Are there concerns that the index test, its conduct, or interpretation differ from the review question? no Are there concerns that the target condition as defined by the reference standard does not match the review question? no
	CONCLUSION: Could the selection of patients have introduced bias? RISK: Unclear, as it could be that patients with a negative US did not have a TAB.	CONCLUSION: Could the conduct or interpretation of the index test have introduced bias? RISK: Low, first US thereafter TAB.	CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias? RISK: High, assessors were aware of US outcome.	CONCLUSION Could the patient flow have introduced bias? RISK: Low
González Porto, 2020	Was a consecutive or random sample of patients enrolled? Yes, patients with suspected temporal arteritis. Was a case-control design avoided? Yes, prospective cohort. Did the study avoid inappropriate exclusions? Yes, no exclusion criteria.	Were the index test results interpreted without knowledge of the results of the reference standard? No, it is mentioned that assessors were aware of the outcome of the other tests. If a threshold was used, was it pre-specified? N.a.	Is the reference standard likely to correctly classify the target condition? Yes Were the reference standard results interpreted without knowledge of the results of the index test? No.	Was there an appropriate interval between index test(s) and reference standard? Yes, first US and biopsy, diagnosis after 6 months. Did all patients receive a reference standard? Yes Did patients receive the same reference standard? Yes, performed by the same assessor Were all patients included in the analysis? No, 4 were not included: 3 died and 1 refused a biopsy	Are there concerns that the included patients do not match the review question? No Are there concerns that the index test, its conduct, or interpretation differ from the review question? No Are there concerns that the target condition as defined by the reference standard does not match the review question? No
	CONCLUSION: Could the selection of patients have introduced bias? RISK: LOW	CONCLUSION: Could the conduct or interpretation of the index test have introduced bias? RISK: High, assessors were not blinded.	CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias? RISK: High, assessors were not blinded.	CONCLUSION Could the patient flow have introduced bias? RISK: LOW
Rodriguez-Régent, 2020	Was a consecutive or random sample of patients enrolled? Yes, patients with suspected temporal arteritis. Was a case-control design avoided? Yes, prospective cohort Did the study avoid inappropriate exclusions? Yes.	Were the index test results interpreted without knowledge of the results of the reference standard? Yes, assessors were blinded If a threshold was used, was it pre-specified? N.a.	Is the reference standard likely to correctly classify the target condition? Unclear, no specific description of reference standard. Were the reference standard results interpreted without knowledge of the results of the index test? Yes	Was there an appropriate interval between index test(s) and reference standard? Unclear, not mentioned Did all patients receive a reference standard? Yes Did patients receive the same reference standard? Yes Were all patients included in the analysis? Yes	Are there concerns that the included patients do not match the review question? No Are there concerns that the index test, its conduct, or interpretation differ from the review question? No Are there concerns that the target condition as defined by the reference standard does not match the review question? No
	CONCLUSION: Could the selection of patients have introduced bias? RISK: LOW	CONCLUSION: Could the conduct or interpretation of the index test have introduced bias? RISK: LOW	CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias? RISK: Unclear	CONCLUSION Could the patient flow have introduced bias? RISK: Low, only time interval is lacking.
Conway, 2019	Was a consecutive or random sample of patients enrolled? No, only 162 of the 291 patients were included in the analysis. Was a case-control design avoided? Yes, registry data Did the study avoid inappropriate exclusions? No.	Were the index test results interpreted without knowledge of the results of the reference standard? No, rheumatologists were not blinded. If a threshold was used, was it pre-specified? N.a.	Is the reference standard likely to correctly classify the target condition? No, rheumatologists were not blinded. Were the reference standard results interpreted without knowledge of the results of the index test? N.a.	Was there an appropriate interval between index test(s) and reference standard? Yes Did all patients receive a reference standard? Yes Did patients receive the same reference standard? Yes Were all patients included in the analysis? No, only 162 of the 291	Are there concerns that the included patients do not match the review question? No Are there concerns that the index test, its conduct, or interpretation differ from the review question? No Are there concerns that the target condition as defined by the reference standard does not match the review question? No
	CONCLUSION: Could the selection of patients have introduced bias? RISK: High, selected population	CONCLUSION: Could the conduct or interpretation of the index test have introduced bias? RISK: High, no blinding	CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias? RISK: High, no blinding	CONCLUSION Could the patient flow have introduced bias? RISK: High, selected population
Imfeld, 2020	Was a consecutive or random sample of patients enrolled? Yes, patients with suspected GCA Was a case-control design avoided? Yes Did the study avoid inappropriate exclusions? Yes.	Were the index test results interpreted without knowledge of the results of the reference standard? Yes, assessors were blinded. If a threshold was used, was it pre-specified? N.a.	Is the reference standard likely to correctly classify the target condition? Yes, assessors were blinded. Were the reference standard results interpreted without knowledge of the results of the index test? n.a.	Was there an appropriate interval between index test(s) and reference standard? Yes, if not it is described in the text. Did all patients receive a reference standard? Yes Did patients receive the same reference standard? Yes Were all patients included in the analysis? Yes, but only patients meeting the criteria were included.	Are there concerns that the included patients do not match the review question? no Are there concerns that the index test, its conduct, or interpretation differ from the review question? No Are there concerns that the target condition as defined by the reference standard does not match the review question? No
	CONCLUSION: Could the selection of patients have introduced bias? RISK: LOW	CONCLUSION: Could the conduct or interpretation of the index test have introduced bias? RISK: LOW	CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias? RISK: LOW	CONCLUSION Could the patient flow have introduced bias? RISK: Unclear, potential selection bias.
Poillon, 2020	Was a consecutive or random sample of patients enrolled? No, patients were selected according to strict criteria. Was a case-control design avoided? Yes, cohort Did the study avoid inappropriate exclusions? Yes.	Were the index test results interpreted without knowledge of the results of the reference standard? Yes, assessors were blinded If a threshold was used, was it pre-specified? Detailed description.	Is the reference standard likely to correctly classify the target condition? Yes Were the reference standard results interpreted without knowledge of the results of the index test? Yes	Was there an appropriate interval between index test(s) and reference standard? Unclear, not mentioned in text. Did all patients receive a reference standard? Yes Did patients receive the same reference standard? Yes. Were all patients included in the analysis? Yes	Are there concerns that the included patients do not match the review question? no Are there concerns that the index test, its conduct, or interpretation differ from the review question? no Are there concerns that the target condition as defined by the reference standard does not match the review question? no
	CONCLUSION: Could the selection of patients have introduced bias? RISK: High, potential selection bias.	CONCLUSION: Could the conduct or interpretation of the index test have introduced bias? RISK: Low, assessors were blinded.	CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias? RISK: Low, assessors were blinded.	CONCLUSION Could the patient flow have introduced bias? RISK: Low

Judgments on risk of bias are dependent on the research question: some items are more likely to introduce bias than others, and may be given more weight in the final conclusion on the overall risk of bias per domain:

Patient selection:

Consecutive or random sample has a low risk to introduce bias.
A case control design is very likely to overestimate accuracy and thus introduce bias.
Inappropriate exclusion is likely to introduce bias.

Index test:

This item is similar to “blinding” in intervention studies. The potential for bias is related to the subjectivity of index test interpretation and the order of testing.
Selecting the test threshold to optimise sensitivity and/or specificity may lead to overoptimistic estimates of test performance and introduce bias.

Reference standard:

When the reference standard is not 100% sensitive and 100% specific, disagreements between the index test and reference standard may be incorrect, which increases the risk of bias.
This item is similar to “blinding” in intervention studies. The potential for bias is related to the subjectivity of reference standard interpretation and the order of testing.

Flow and timing:

If there is a delay or if treatment is started between index test and reference standard, misclassification may occur due to recovery or deterioration of the condition, which increases the risk of bias.
If the results of the index test influence the decision on whether to perform the reference standard or which reference standard is used, estimated diagnostic accuracy may be biased.
All patients who were recruited into the study should be included in the analysis, if not, the risk of bias is increased.
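The way signalling questions feed into a per-domain judgement can be sketched as follows. This is a minimal illustration, assuming the general QUADAS-2 convention that a domain is at low risk when all applicable signalling questions are answered "yes" and that any "no" flags potential bias. The names `Answer` and `default_domain_risk` are hypothetical, and the output is only a starting point: as noted above, reviewers may weight items differently for a given research question, so the final judgement in the table can differ.

```python
from enum import Enum


class Answer(Enum):
    YES = "yes"
    NO = "no"
    UNCLEAR = "unclear"
    NA = "n.a."


def default_domain_risk(answers: list[Answer]) -> str:
    """Provisional risk of bias for one domain, before reviewer weighting.

    Assumed rule: any "no" -> "high" (potential bias, to be reviewed),
    otherwise any "unclear" -> "unclear", otherwise "low".
    Questions marked n.a. are ignored.
    """
    applicable = [a for a in answers if a is not Answer.NA]
    if any(a is Answer.NO for a in applicable):
        return "high"
    if any(a is Answer.UNCLEAR for a in applicable):
        return "unclear"
    return "low"


# Hypothetical index-test domain: interpretation was blinded, no threshold used.
print(default_domain_risk([Answer.YES, Answer.NA]))
# Hypothetical reference-standard domain: blinding not reported.
print(default_domain_risk([Answer.YES, Answer.UNCLEAR]))
```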

Judgement on applicability:

Patient selection: there may be concerns regarding applicability if patients included in the study differ from those targeted by the review question, in terms of severity of the target condition, demographic features, presence of differential diagnosis or co-morbidity, setting of the study and previous testing protocols.

Index test: if index tests methods differ from those specified in the review question there may be concerns regarding applicability.

Reference standard: the reference standard may be free of bias but the target condition that it defines may differ from the target condition specified in the review question.

Table of excluded studies

Author and year	Reason for exclusion
van der Geest, 2021	Patients with GCA
van der Geest, 2020	Not in line with PICO
Duftner, 2018	Search until 2017
Grayson, 2018	Not suspected GCA
Moragas Solanes, 2019	Not only GCA
Noumegni, 2021	Agreement between 12- and 22MHz
Junek, 2021	Not in line with PICO
Jese, 2021	Patients with GCA
Schramm, 2019	Patients with GCA
Nielsen, 2020	Other reference standard
Hop, 2020	Not in line with PICO
Ponte, 2020	Not in line with PICO
Monti, 2020	Not in line with PICO
Yip, 2020	Patients with GCA
Monti, 2018	Not in line with PICO
Gribbons, 2020	Patients with GCA
Vaidyanathan, 2018	Not in line with PICO
Hay, 2019	Reference standard not in line with PICO
Conway, 2018	P not in line with PICO
Banerjee, 2020	Monitoring disease activity
Ford, 2020	Monitoring
Malik, 2020	Not in line with PICO
Zou, 2019	Article retraction
Emamifar, 2020	Only 3 patients with GCA
Sebastian, 2021	Reference standard not completely in line with PICO
Maz, 2021	Not in line with PICO
Van der Geest, 2019	Post-hoc of study included in BSR guideline

Literature search strategy

Embase

No.	Query	Results
#12	#9 NOT #11	726
#11	#9 AND #10	51
#10	('meta analysis'/exp OR 'meta analysis (topic)'/exp OR metaanaly:ti,ab OR 'meta analy':ti,ab OR metanaly:ti,ab OR 'systematic review'/de OR 'cochrane database of systematic reviews'/jt OR prisma:ti,ab OR prospero:ti,ab OR (((systemati OR scoping OR umbrella OR 'structured literature') NEAR/3 (review* OR overview)):ti,ab) OR ((systemic NEAR/1 review):ti,ab) OR (((systemati OR literature OR database* OR 'data base') NEAR/10 search):ti,ab) OR (((structured OR comprehensive* OR systemic) NEAR/3 search):ti,ab) OR (((literature NEAR/3 review):ti,ab) AND (search:ti,ab OR database:ti,ab OR 'data base':ti,ab)) OR (('data extraction':ti,ab OR 'data source':ti,ab) AND 'study selection':ti,ab) OR ('search strategy':ti,ab AND 'selection criteria':ti,ab) OR ('data source':ti,ab AND 'data synthesis':ti,ab) OR medline:ab OR pubmed:ab OR embase:ab OR cochrane:ab OR (((critical OR rapid) NEAR/2 (review* OR overview* OR synthes)):ti) OR ((((critical OR rapid) NEAR/3 (review OR overview* OR synthes)):ab) AND (search:ab OR database:ab OR 'data base':ab)) OR metasynthes:ti,ab OR 'meta synthes':ti,ab) NOT (('animal'/exp OR 'animal experiment'/exp OR 'animal model'/exp OR 'nonhuman'/exp) NOT 'human'/exp) NOT ('conference abstract'/it OR 'conference review'/it OR 'editorial'/it OR 'letter'/it OR 'note'/it)	557389
#9	#8 AND [1-6-2018]/sd NOT ('conference abstract'/it OR 'editorial'/it OR 'letter'/it OR 'note'/it) NOT (('animal experiment'/exp OR 'animal model'/exp OR 'nonhuman'/exp) NOT 'human'/exp)	777
#8	#6 AND #7	4552
#7	'diagnostic procedure'/exp OR 'sensitivity and specificity'/de OR sensitiv:ab,ti OR specific:ab,ti OR predict:ab,ti OR 'roc curve':ab,ti OR 'receiver operator':ab,ti OR 'receiver operators':ab,ti OR likelihood:ab,ti OR 'diagnostic error'/exp OR 'diagnostic accuracy'/exp OR 'diagnostic test accuracy study'/exp OR 'inter observer':ab,ti OR 'intra observer':ab,ti OR interobserver:ab,ti OR intraobserver:ab,ti OR validity:ab,ti OR kappa:ab,ti OR reliability:ab,ti OR reproducibility:ab,ti OR ((test NEAR/2 're-test'):ab,ti) OR ((test NEAR/2 'retest'):ab,ti) OR 'reproducibility'/exp OR accuracy:ab,ti OR 'differential diagnosis'/exp OR 'validation study'/de OR 'measurement precision'/exp OR 'diagnostic value'/exp OR 'reliability'/exp OR 'predictive value'/exp OR ppv:ti,ab,kw OR npv:ti,ab,kw OR diagnos:ti,ab	22724263
#6	#1 AND #5	4584
#5	#2 OR #3 OR #4	2362446
#4	('deoxyglucose'/exp OR 'deoxyglucose':ab,ti,kw OR 'desoxyglucose':ab,ti,kw OR 'deoxy glucose':ab,ti,kw OR 'desoxy glucose':ab,ti,kw OR 'deoxy d glucose':ab,ti,kw OR 'desoxy d glucose':ab,ti,kw OR '2deoxyglucose':ab,ti,kw OR '2deoxy d glucose':ab,ti,kw OR 'fluorodeoxyglucose':ab,ti,kw OR 'fluorodesoxyglucose':ab,ti,kw OR 'fludeoxyglucose':ab,ti,kw OR 'fluordeoxyglucose':ab,ti,kw OR 'fluordesoxyglucose':ab,ti,kw OR '18fluorodeoxyglucose':ab,ti,kw OR '18fluorodesoxyglucose':ab,ti,kw OR '18fluordeoxyglucose':ab,ti,kw OR 'fdg':ab,ti,kw OR '18fdg':ab,ti,kw OR '18f dg':ab,ti,kw OR '18f fdg':ab,ti,kw OR 'fdgpet':ab,ti,kw OR (('fluor':ab,ti,kw OR '2fluor':ab,ti,kw OR 'fluoro':ab,ti,kw OR 'fluorodeoxy':ab,ti,kw OR 'fludeoxy':ab,ti,kw OR 'fluorine':ab,ti,kw OR '18f':ab,ti,kw OR '18flu':ab,ti,kw) AND ('glucose':ab,ti,kw OR 'galactose':ab,ti,kw))) AND ('positron emission tomography'/exp OR 'pet':ab,ti,kw OR 'petscan':ab,ti,kw OR 'fdgpet':ab,ti,kw OR ('emission':ab,ti,kw AND 'tomogra*':ab,ti,kw))	74001
#3	'echography'/exp OR 'color ultrasound flowmetry'/exp OR ultraso:ab,ti OR sonograph:ab,ti OR echograph:ab,ti OR echotomograph:ab,ti	1175440
#2	'nuclear magnetic resonance imaging'/exp OR ('magnetic resonance':ab,ti AND (image:ab,ti OR images:ab,ti OR imaging:ab,ti)) OR mri:ab,ti OR mris:ab,ti OR nmr:ab,ti OR mra:ab,ti OR mras:ab,ti OR zeugmatograph:ab,ti OR 'mr tomography':ab,ti OR 'mr tomographies':ab,ti OR 'mr tomographic':ab,ti OR 'proton spin':ab,ti OR ((magneti:ab,ti OR 'chemical shift':ab,ti) AND imaging:ab,ti) OR fmri:ab,ti OR fmris:ab,ti	1297872
#1	'giant cell arteritis'/exp OR 'aortitis'/exp OR 'giant cell arteritis':ti,ab,kw OR (((temporal OR giant) NEAR/3 arteritis):ti,ab,kw) OR horton:ti,ab,kw OR gca:ti,ab,kw OR aortitis:ti,ab,kw OR ((('large vessel' OR 'single organ') NEAR/2 (vasculitis OR arteritis)):ti,ab,kw)	19478

Ovid/Medline

#	Searches	Results
1	Giant Cell Arteritis/ or Aortitis/ or giant cell arteritis.ti,ab,kf. or ((temporal or giant) adj3 arteritis).ti,ab,kf. or horton.ti,ab,kf. or gca.ti,ab,kf. or aortitis.ti,ab,kf. or ((large vessel or single organ) adj2 (vasculitis or arteritis)).ti,ab,kf.	13766
2	exp magnetic resonance imaging/ or ("magnetic resonance" and (image or images or imaging)).ti,ab,kf. or mri.ti,ab,kf. or mris.ti,ab,kf. or nmr.ti,ab,kf. or mra.ti,ab,kf. or mras.ti,ab,kf. or zeugmatograph.ti,ab,kf. or "mr tomography".ti,ab,kf. or "mr tomographies".ti,ab,kf. or "mr tomographic".ti,ab,kf. or "proton spin".ti,ab,kf. or ((magneti or "chemical shift") and imaging).ti,ab,kf. or fmri.ti,ab,kf. or fmris.ti,ab,kf.	832204
3	exp Ultrasonography/ or ultraso.ti,ab,kf. or sonograph.ti,ab,kf. or echograph.ti,ab,kf. or echocardiograph.ti,ab,kf. or echotomograph*.ti,ab,kf.	719107
4	(exp Deoxyglucose/ or deoxyglucose.tw. or desoxyglucose.tw. or deoxy-glucose.tw. or desoxy-glucose.tw. or deoxy-d-glucose.tw. or desoxy-d-glucose.tw. or 2deoxyglucose.tw. or 2deoxy-d-glucose.tw. or fluorodeoxyglucose.tw. or fluorodesoxyglucose.tw. or fludeoxyglucose.tw. or fluordeoxyglucose.tw. or fluordesoxyglucose.tw. or 18fluorodeoxyglucose.tw. or 18fluorodesoxyglucose.tw. or 18fluordeoxyglucose.tw. or fdg.tw. or 18fdg.tw. or 18f-dg.tw. or 18f-fdg.tw. or fdg18.tw. or fdgpet.tw. or ((fluor or 2fluor or fluoro or fluorodeoxy or fludeoxy or fluorine or 18f or 18flu) and (glucose or galactose)).tw.) and (exp Positron-Emission Tomography/ or pet.tw. or petct.tw. or petscan.tw. or fdgpet.tw. or (emission and tomogra).tw.)	46238
5	2 or 3 or 4	1515688
6	1 and 5	2187
7	limit 6 to yr="2018 -Current"	580
8	exp Diagnosis/ or diagnos.ti,ab. or exp "Sensitivity and Specificity"/ or (Sensitiv* or Specific).ti,ab. or (predict or ROC-curve or receiver-operator).ti,ab. or (likelihood or LR).ti,ab. or exp Diagnostic Errors/ or (inter-observer or intra-observer or interobserver or intraobserver or validity or kappa or reliability).ti,ab. or reproducibility.ti,ab. or (test adj2 (re-test or retest)).ti,ab. or "Reproducibility of Results"/ or accuracy.ti,ab. or Diagnosis, Differential/ or Validation Study/	13121800
9	7 and 8	418
10	9 not ((exp animals/ or exp models, animal/) not humans/) not (letter/ or comment/ or editorial/)	381
11	(meta-analysis/ or meta-analysis as topic/ or (metaanaly* or meta-analy* or metanaly).ti,ab,kf. or systematic review/ or cochrane.jw. or (prisma or prospero).ti,ab,kf. or ((systemati or scoping or umbrella or "structured literature") adj3 (review* or overview)).ti,ab,kf. or (systemic adj1 review).ti,ab,kf. or ((systemati or literature or database* or data-base) adj10 search).ti,ab,kf. or ((structured or comprehensive* or systemic) adj3 search).ti,ab,kf. or ((literature adj3 review) and (search or database* or data-base)).ti,ab,kf. or (("data extraction" or "data source") and "study selection").ti,ab,kf. or ("search strategy" and "selection criteria").ti,ab,kf. or ("data source" and "data synthesis").ti,ab,kf. or (medline or pubmed or embase or cochrane).ab. or ((critical or rapid) adj2 (review or overview* or synthes)).ti. or (((critical or rapid) adj3 (review or overview* or synthes)) and (search or database* or data-base)).ab. or (metasynthes or meta-synthes*).ti,ab,kf.) not (comment/ or editorial/ or letter/ or ((exp animals/ or exp models, animal/) not humans/))	494714
12	10 and 11	17
13	10 not 12	364
14	"Society for Rheumatology guideline on diagnosis and treatment of giant cell arteritis" [Article Title]	6
15	10 and 14	1
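
The numbered search lines above combine earlier result sets with Boolean operators ("2 or 3 or 4", "1 and 5", "10 not 12"). As a minimal sketch of that set logic, the Python snippet below uses small, entirely hypothetical sets of record IDs; the variable names and IDs are assumptions for illustration and do not come from the actual database search.

```python
# Hypothetical record IDs per search line; the real sets live in the database.
giant_cell_arteritis = {101, 102, 103, 104, 105, 106}  # line 1
mri = {101, 102, 201}                                  # line 2
ultrasound = {103, 104, 202}                           # line 3
fdg_pet = {105, 203}                                   # line 4

imaging = mri | ultrasound | fdg_pet                   # line 5: 2 or 3 or 4
gca_imaging = giant_cell_arteritis & imaging           # line 6: 1 and 5

# Line 7 limits line 6 to recent years; modelled here as an intersection.
published_since_2018 = {101, 103, 104, 105, 106}       # hypothetical
recent = gca_imaging & published_since_2018            # line 7

diagnostic_filter = {101, 103, 104, 301}               # line 8 (hypothetical)
diagnostic = recent & diagnostic_filter                # line 9: 7 and 8

# Line 10 removes animal-only studies, letters, comments and editorials.
excluded_types = {104}                                 # hypothetical
eligible = diagnostic - excluded_types                 # line 10

review_filter = {101, 401}                             # line 11 (hypothetical)
systematic_reviews = eligible & review_filter          # line 12: 10 and 11
other_designs = eligible - systematic_reviews          # line 13: 10 not 12

print(sorted(systematic_reviews), sorted(other_designs))
```

Lines 12 and 13 split line 10 into two non-overlapping parts, which is why the reported counts add up: 17 + 364 = 381.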

[1] In the case of a case-control design, the patient characteristics must be described per group (cases and controls). NB: case-control studies will overestimate accuracy (Lijmer et al., 1999).

[2] The reference standard is the test that definitively establishes whether or not someone has the disease. Ideally, the reference standard is the gold standard (100% sensitive and 100% specific). Note: this is not the “comparison test/index 2”.

[4] Describe the statistical parameters for the comparison of the index test(s) with the reference test, and for the comparison between the index tests (if two or more index tests are compared).
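
The statistical parameters for comparing an index test with a reference standard are usually derived from a 2x2 table. The sketch below shows one way to compute the common ones in Python. The function name `diagnostic_accuracy` and the counts are hypothetical and chosen only for illustration; they are not data from any study in this guideline.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table (index test vs reference standard).

    tp/fn: reference-positive patients with a positive/negative index test.
    fp/tn: reference-negative patients with a positive/negative index test.
    Assumes specificity is neither 0 nor 1, so the ratios are defined.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Hypothetical counts, for illustration only.
result = diagnostic_accuracy(tp=40, fp=5, fn=10, tn=95)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Predictive values depend on disease prevalence in the study sample, which is one reason case-control designs (footnote 1) can give a distorted picture of test performance.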

Accountability

Assessment date and validity

Publication date: 25-09-2023

Assessed for validity: 25-07-2023

Initiative and authorisation

Initiative:

Nederlandse Vereniging voor Reumatologie

Authorised by:

Koninklijk Nederlands Genootschap voor Fysiotherapie
Nederlands Oogheelkundig Gezelschap
Nederlandse Internisten Vereniging
Nederlandse Vereniging voor Reumatologie
Verpleegkundigen en Verzorgenden Nederland
Vasculitis Stichting

General information

Authorisation of this guideline has been coordinated with the Nederlands Huisartsen Genootschap.

The development/revision of this guideline module was supported by the Kennisinstituut van de Federatie Medisch Specialisten (www.demedischspecialist.nl/kennisinstituut) and was funded by the Kwaliteitsgelden Medisch Specialisten (SKMS).

The funder had no influence whatsoever on the content of the guideline module.

Composition of the working group

To develop the guideline module, a multidisciplinary working group was established in 2019, consisting of representatives of all relevant specialties involved in the care of patients with giant cell arteritis (see Composition of the working group below).

Composition of the working group

Working group

Prof. Dr. E. Brouwer, rheumatologist, working at UMC Groningen, NVR, chair of the working group.
Drs. D. Boumans, rheumatologist, working at Ziekenhuisgroep Twente (until December 2022), NVR.
Prof. Dr. J. van der Laken, rheumatologist, working at Amsterdam UMC, NVR.
Dr. A. van der Maas, rheumatologist, working at Sint Maartenskliniek, NVR.
Dr. M. Sandovici, rheumatologist, working at UMC Groningen, NVR.
Dr. K. Visser, rheumatologist, working at Hagaziekenhuis, NVR.
Dr. W. Eizenga, general practitioner, NHG.
Dr. A.E. Hak, internist-clinical immunologist, working at Amsterdam UMC, NIV/NVvAKI.
Dr. D.J. Mulder, internist, vascular medicine specialist, working at UMC Groningen, NIV/NVIVG.
Dr. J.W. Pott, ophthalmologist, working at UMC Groningen, NOG.
Ms O. Vos, nurse specialist, working at Tergooi MC, V&VN.
Mr H. Spijkerman, physiotherapist, KNGF.
Ms M. Deinema, patient representative, Vasculitis Stichting.

Sounding board group

Dr. D. Paap, physiotherapist, working at UMC Groningen, KNGF.
Dr. R. Ruiter, internist in geriatric medicine, clinical pharmacologist, working at Maasstad ziekenhuis, NIV/OGK.
Dr. T. Balvers, neurologist, working at LUMC, NVN.
Prof. Dr. R.H.J.A. Slart, nuclear medicine physician, working at UMC Groningen, NVNG.
Drs. G. Mecozzi, surgeon, working at UMC Groningen, NVT.
Dr. B.R. Saleem, surgeon, working at UMC Groningen, NVvH.
Ms M. Esseboom, exercise therapist, VvOCM.
Ms J. Korlaar-Luigjes, nurse specialised in rheumatology and vasculitis, working at Meander MC, V&VN.
Ms M. van Engelen, patient representative, Vasculitis Stichting.
Ms O. van Eden, patient representative, Vasculitis Stichting.

With support from

Drs. I. van Dusseldorp, literature specialist, Kennisinstituut van de Federatie Medisch Specialisten.
Dr. M.M.A. Verhoeven, advisor, Kennisinstituut van de Federatie Medisch Specialisten.

Declarations of interest

The Code ter voorkoming van oneigenlijke beïnvloeding door belangenverstrengeling (code for the prevention of improper influence due to conflicts of interest) was followed. All working group members declared in writing whether, in the past three years, they had had direct financial interests (employment with a commercial company, personal financial interests, research funding) or indirect interests (personal relationships, reputation management). During the development or revision of a module, changes in interests are reported to the chair. The declaration of interests is reconfirmed during the comment phase.

An overview of the interests of the working group members, and the judgement on how any interests were handled, is given in the table below. The signed declarations of interest can be requested from the secretariat of the Kennisinstituut van de Federatie Medisch Specialisten.

Working group member	Position	Additional positions	Reported interests	Action taken
Prof. Dr. E. Brouwer (chair)	internist, rheumatologist	- Since 2 April 2022: board member of the foundation Auto-immune Research Collaboration Hub (ARCH)		None
Drs. D. Boumans	rheumatologist	- Member of OMERACT-US Large Vessel Vasculitis - Consulting rheumatologist at Prisma Netwerk Siilo B.V.	-	None
Prof. Dr. J. van der Laken	rheumatologist	-	-	None
Dr. A. van der Maas	rheumatologist	Member of the scientific advisory board of Medidact; activities: advising on copy/topics, occasionally writing articles and now and then a foreword/editorial	1. Our hospital is conducting research into PMR. A short proof-of-concept study of rituximab in PMR is ongoing, so far without external funding. No parties with a financial interest in the outcome are involved. 2. We are also setting up an RCT of methotrexate in patients recently diagnosed with PMR. ReumaNederland has awarded a grant for this study. They have no financial interest in its outcome. 3. Our department participates in a sponsor-initiated study of sarilumab in PMR. Our department receives a fee for including and treating patients in this study. Apart from that there is no additional financial interest.	None
Dr. M. Sandovici	rheumatologist	-	PI ReumaNederland project: “A novel disease model for Giant Cell Arteritis: the antibody-independent role of B cells in the pathogenesis of Giant Cell Arteritis”	None
Dr. K. Visser	rheumatologist	NVR quality committee, unpaid; EULAR working group on recommendations regarding rheumatic immune-related adverse events of immunotherapy, unpaid	HAGA rheumatology: sarilumab study, Sanofi	None
Dr. W. Eizenga	General practitioner			None
Dr. A.E. Hak	Internist, clinical immunologist		Dr. A.E. Hak is coordinator of the Vasculitis Expertisecentrum AMC (currently Amsterdam UMC, location AMC), which is recognised by the Ministry of VWS. This centre of expertise cares for patients with GCA and is asked for expertise by other centres (alongside other forms of vasculitis). In this context there are regular consultations with the Vasculitis Stichting (patient organisation).	None
Dr. D.J. Mulder	Internist, vascular medicine specialist	-	-	None
Dr. J.W. Pott	ophthalmologist	Ophthalmology residency trainer (opleider) at UMCG, secretary of the working group Nederlandse Neuro-ophthalmology (NeNDS), board member of the European Neuro-ophthalmology Society (EUNDS)		None
Ms O. Vos	nurse specialist	-	-	None
Mr H. Spijkerman	Physiotherapist	Member of the steering group Netwerk Multipele Sclerose Groningen (unpaid), member of Geriatrie Netwerk Groningen (unpaid), student in the post-HBO master's programme in geriatric physiotherapy in Breda. Expected graduation date: June 2020 (unpaid)		None
Ms M. Deinema	patient representative		--	None

Patient perspective

The patient perspective was addressed by inviting patient associations to take part in the written bottleneck analysis and by delegating a member of the patient association to the working group. The report of the bottleneck analysis was discussed in the working group. The input obtained was taken into account when drafting the clinical questions, choosing the outcome measures and drafting the considerations. The draft guideline was also submitted to the patient association for comment, and any comments received were reviewed and incorporated.

Wkkgz & qualitative estimate of possible substantial financial consequences

Qualitative estimate of possible financial consequences under the Wkkgz

In accordance with the Wet kwaliteit, klachten en geschillen zorg (Wkkgz), a qualitative estimate was made of whether the recommendations in this guideline might lead to substantial financial consequences. For this assessment, the guideline modules were tested on several domains (see the flow chart on the Richtlijnendatabase).

The qualitative estimate shows that there are probably no substantial financial consequences; see the table below.

Module	Outcome of estimate	Explanation
Referral module	No substantial financial consequences	Although the assessment shows that the recommendation(s) are broadly applicable (5,000-40,000 patients), it also shows that the vast majority (±90%) of healthcare providers and healthcare professionals already meet the standard. No substantial financial consequences are therefore expected.
Diagnostics module	No substantial financial consequences	Although the assessment shows that the recommendation(s) are broadly applicable (5,000-40,000 patients), it also shows that the vast majority (±90%) of healthcare providers and healthcare professionals already meet the standard. No substantial financial consequences are therefore expected.
Pharmacological treatment module	No substantial financial consequences	Although the assessment shows that the recommendation(s) are broadly applicable (5,000-40,000 patients), it also shows that the vast majority (±90%) of healthcare providers and healthcare professionals already meet the standard. No substantial financial consequences are therefore expected.
Non-pharmacological treatment module	No substantial financial consequences	Although the assessment shows that the recommendation(s) are broadly applicable (5,000-40,000 patients), it also shows that the vast majority (±90%) of healthcare providers and healthcare professionals already meet the standard. No substantial financial consequences are therefore expected.
Monitoring module	No substantial financial consequences	Although the assessment shows that the recommendation(s) are broadly applicable (5,000-40,000 patients), it also shows that the vast majority (±90%) of healthcare providers and healthcare professionals already meet the standard. No substantial financial consequences are therefore expected.

Methods

AGREE

This guideline module was drawn up in accordance with the requirements stated in the report Medisch Specialistische Richtlijnen 2.0 of the Adviescommissie Richtlijnen of the Raad Kwaliteit. This report is based on the AGREE II instrument (Appraisal of Guidelines for Research & Evaluation II; Brouwers, 2010).

Bottleneck analysis and clinical questions

During the preparatory phase, the working group inventoried the bottlenecks in patient care by means of a written bottleneck analysis. Bottlenecks were also submitted through this written analysis by VvOCM, NIV, ZN, NVZ, NVR, IGJ, KNGF, VIG, NVZA, ReumaNederland, V&VN, Nationale Vereniging ReumaZorg Nederland, NHG, KNMP and NOG. A report of the analysis is included under related products (Appendix 1).

Based on the outcomes of the bottleneck analysis, the working group drafted the clinical questions and then finalised them.

Outcome measures

After formulating the search question for each clinical question, the working group inventoried which outcome measures are relevant to the patient, considering both desirable and undesirable effects. A maximum of eight outcome measures was used. The working group rated these outcome measures according to their relative importance for decision-making on recommendations as critical (crucial for decision-making), important (but not critical) or unimportant. The working group also defined, at least for the critical outcome measures, which differences it considered clinically (patient) relevant.

Method of the literature summary

A detailed description of the strategy for searching and selecting literature, and of the risk-of-bias assessment of the individual studies, can be found under ‘Search and select’ in the evidence base. The assessment of the strength of the scientific evidence is explained below.

Assessing the strength of the scientific evidence

The strength of the scientific evidence was determined using the GRADE method. GRADE stands for ‘Grading of Recommendations Assessment, Development and Evaluation’ (see http://www.gradeworkinggroup.org/). The basic principles of the GRADE method are: naming and prioritising the clinically (patient) relevant outcome measures, a systematic review per outcome measure, and an assessment of the strength of evidence per outcome measure based on the eight GRADE domains (domains for downgrading: risk of bias, inconsistency, indirectness, imprecision and publication bias; domains for upgrading: dose-response relationship, large effect and residual plausible confounding).

GRADE distinguishes four levels of quality of the scientific evidence: high, moderate, low and very low. These levels refer to the degree of certainty about the literature conclusion, in particular the degree of certainty that the literature conclusion adequately supports the recommendation (Schünemann, 2013; Hultcrantz, 2017).

GRADE	Definition
High	there is high certainty that the true effect of treatment lies close to the estimated effect of treatment; it is very unlikely that the literature conclusion will change in a clinically relevant way when results of new large-scale research are added to the literature analysis.
Moderate	there is moderate certainty that the true effect of treatment lies close to the estimated effect of treatment; it is possible that the conclusion will change in a clinically relevant way when results of new large-scale research are added to the literature analysis.
Low	there is low certainty that the true effect of treatment lies close to the estimated effect of treatment; there is a real chance that the conclusion will change in a clinically relevant way when results of new large-scale research are added to the literature analysis.
Very low	there is very low certainty that the true effect of treatment lies close to the estimated effect of treatment; the literature conclusion is very uncertain.

When assessing (grading) the strength of the scientific evidence in guidelines according to the GRADE method, thresholds for clinical decision-making play an important role (Hultcrantz, 2017). These are the thresholds that, when crossed, would give reason to change the recommendation. To determine the thresholds for clinical decision-making, all relevant outcome measures and considerations must be weighed. The thresholds for clinical decision-making are therefore not directly comparable with the minimal clinically important difference (MCID). Particularly in situations where an intervention has no important disadvantages and the costs are relatively low, the threshold for clinical decision-making regarding the effectiveness of the intervention may lie at a lower value (closer to the null effect) than the MCID (Hultcrantz, 2017).
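
The rating logic in this section (four levels, five domains for downgrading, three for upgrading) can be illustrated with a small Python sketch. This is a deliberately simplified assumption, not the working group's procedure: the function `grade_certainty`, the numeric scoring and the choice of starting level are hypothetical, and real GRADE assessments rest on judgement rather than arithmetic.

```python
LEVELS = ["very low", "low", "moderate", "high"]

DOWNGRADE_DOMAINS = {
    "risk of bias", "inconsistency", "indirectness",
    "imprecision", "publication bias",
}
UPGRADE_DOMAINS = {
    "dose-response relationship", "large effect",
    "residual plausible confounding",
}


def grade_certainty(start, downgrades, upgrades=None):
    """Return the certainty level after applying domain judgements.

    start: starting level (assumed to be chosen by the assessor).
    downgrades/upgrades: dict mapping a domain to a number of levels.
    """
    upgrades = upgrades or {}
    unknown = (set(downgrades) - DOWNGRADE_DOMAINS) | (set(upgrades) - UPGRADE_DOMAINS)
    if unknown:
        raise ValueError(f"Unknown GRADE domain(s): {sorted(unknown)}")
    score = LEVELS.index(start) - sum(downgrades.values()) + sum(upgrades.values())
    # Clamp to the four available levels.
    return LEVELS[max(0, min(score, len(LEVELS) - 1))]


# Example: one level down for risk of bias.
print(grade_certainty("high", {"risk of bias": 1}))
```

Keeping the domains in explicit sets makes a mistyped domain name fail loudly instead of silently changing the rating.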

Considerations (from evidence to recommendation)

To arrive at a recommendation, aspects other than (the quality of) the scientific evidence are also important and are weighed, such as additional arguments from, for example, biomechanics or physiology, patient values and preferences, costs (resource use), acceptability, feasibility and implementation. These aspects are systematically listed and assessed (weighed) under the heading ‘Considerations’ and may be based (in part) on expert opinion. A structured format was used, based on the evidence-to-decision framework of the international GRADE Working Group (Alonso-Coello, 2016a; Alonso-Coello, 2016b). This evidence-to-decision framework is an integral part of the GRADE method.

Formulating recommendations

The recommendations answer the clinical question and are based on the available scientific evidence, the most important considerations, and a weighing of the favourable and unfavourable effects of the relevant interventions. The strength of the scientific evidence and the weight the working group assigns to the considerations together determine the strength of the recommendation. In line with the GRADE method, a low strength of evidence for conclusions in the systematic literature analysis does not rule out a strong recommendation a priori, and weak recommendations are also possible when the strength of evidence is high (Agoritsas, 2017; Neumann, 2016). The strength of the recommendation is always determined by weighing all relevant arguments together. For each recommendation, the working group has recorded how it arrived at the direction and strength of the recommendation.

The GRADE method distinguishes between strong and weak (or conditional) recommendations. The strength of a recommendation refers to the degree of certainty that the benefits of the intervention outweigh the harms (or vice versa), across the whole spectrum of patients for whom the recommendation is intended. The strength of a recommendation has clear implications for patients, clinicians and policy makers (see the table below). A recommendation is not a dictate: even a strong recommendation based on high-quality evidence (GRADE rating HIGH) will not always apply under all possible circumstances and for every individual patient.

Implications of strong and weak recommendations for different guideline users
	Strong recommendation	Weak (conditional) recommendation
For patients	Most patients would choose the recommended intervention or approach, and only a small number would not.	A considerable proportion of patients would choose the recommended intervention or approach, but many patients would not.
For clinicians	Most patients should receive the recommended intervention or approach.	There are several suitable interventions or approaches. The patient should be supported in choosing the intervention or approach that best fits his or her values and preferences.
For policy makers	The recommended intervention or approach can be regarded as standard policy.	Policy-making requires extensive discussion involving many stakeholders. Local differences in policy are more likely.

Organisation of care

In the bottleneck analysis and during the development of the guideline module, explicit attention was paid to the organisation of care: all aspects that are preconditions for providing care (such as coordination, communication, (financial) resources, staffing and infrastructure). Preconditions relevant to answering this specific clinical question are mentioned under the considerations. More general, overarching or additional aspects of the organisation of care are covered in the module Organisatie van zorg (Organisation of care).

Comment and authorisation phase

The draft guideline module was submitted for comment to the (scientific) associations and (patient) organisations involved. The comments were collected and discussed with the working group. In response to the comments, the working group revised and finalised the draft guideline module. The final guideline module was submitted to the participating (scientific) associations and (patient) organisations for authorisation and was authorised or approved by them.

References

Agoritsas T, Merglen A, Heen AF, Kristiansen A, Neumann I, Brito JP, Brignardello-Petersen R, Alexander PE, Rind DM, Vandvik PO, Guyatt GH. UpToDate adherence to GRADE criteria for strong recommendations: an analytical survey. BMJ Open. 2017 Nov 16;7(11):e018593. doi: 10.1136/bmjopen-2017-018593. PubMed PMID: 29150475; PubMed Central PMCID: PMC5701989.

Alonso-Coello P, Schünemann HJ, Moberg J, Brignardello-Petersen R, Akl EA, Davoli M, Treweek S, Mustafa RA, Rada G, Rosenbaum S, Morelli A, Guyatt GH, Oxman AD; GRADE Working Group. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 1: Introduction. BMJ. 2016 Jun 28;353:i2016. doi: 10.1136/bmj.i2016. PubMed PMID: 27353417.

Alonso-Coello P, Oxman AD, Moberg J, Brignardello-Petersen R, Akl EA, Davoli M, Treweek S, Mustafa RA, Vandvik PO, Meerpohl J, Guyatt GH, Schünemann HJ; GRADE Working Group. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 2: Clinical practice guidelines. BMJ. 2016 Jun 30;353:i2089. doi: 10.1136/bmj.i2089. PubMed PMID: 27365494.

Brouwers MC, Kho ME, Browman GP, Burgers JS, Cluzeau F, Feder G, Fervers B, Graham ID, Grimshaw J, Hanna SE, Littlejohns P, Makarski J, Zitzelsberger L; AGREE Next Steps Consortium. AGREE II: advancing guideline development, reporting and evaluation in health care. CMAJ. 2010 Dec 14;182(18):E839-42. doi: 10.1503/cmaj.090449. Epub 2010 Jul 5. Review. PubMed PMID: 20603348; PubMed Central PMCID: PMC3001530.

Hultcrantz M, Rind D, Akl EA, Treweek S, Mustafa RA, Iorio A, Alper BS, Meerpohl JJ, Murad MH, Ansari MT, Katikireddi SV, Östlund P, Tranæus S, Christensen R, Gartlehner G, Brozek J, Izcovich A, Schünemann H, Guyatt G. The GRADE Working Group clarifies the construct of certainty of evidence. J Clin Epidemiol. 2017 Jul;87:4-13. doi: 10.1016/j.jclinepi.2017.05.006. Epub 2017 May 18. PubMed PMID: 28529184; PubMed Central PMCID: PMC6542664.

Medisch Specialistische Richtlijnen 2.0 (2012). Adviescommissie Richtlijnen van de Raad Kwaliteit. http://richtlijnendatabase.nl/over_deze_site/over_richtlijnontwikkeling.html

Neumann I, Santesso N, Akl EA, Rind DM, Vandvik PO, Alonso-Coello P, Agoritsas T, Mustafa RA, Alexander PE, Schünemann H, Guyatt GH. A guide for health professionals to interpret and use recommendations in guidelines developed with the GRADE approach. J Clin Epidemiol. 2016 Apr;72:45-55. doi: 10.1016/j.jclinepi.2015.11.017. Epub 2016 Jan 6. Review. PubMed PMID: 26772609.

Schünemann H, Brożek J, Guyatt G, et al. GRADE handbook for grading quality of evidence and strength of recommendations. Updated October 2013. The GRADE Working Group, 2013. Available from http://gdt.guidelinedevelopment.org/central_prod/_design/client/handbook/handbook.html.

Schünemann HJ, Oxman AD, Brozek J, Glasziou P, Jaeschke R, Vist GE, Williams JW Jr, Kunz R, Craig J, Montori VM, Bossuyt P, Guyatt GH; GRADE Working Group. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ. 2008 May 17;336(7653):1106-10. doi: 10.1136/bmj.39500.677199.AE. Erratum in: BMJ. 2008 May 24;336(7654). doi: 10.1136/bmj.a139. Schünemann, A Holger J [corrected to Schünemann, Holger J]. PubMed PMID: 18483053; PubMed Central PMCID: PMC2386626.

Wessels M, Hielkema L, van der Weijden T. How to identify existing literature on patients' knowledge, views, and values: the development of a validated search filter. J Med Libr Assoc. 2016 Oct;104(4):320-324. PubMed PMID: 27822157; PubMed Central PMCID: PMC5079497.
