Diagnostic strategy in adults with acute appendicitis

Reviewed: 01-07-2019

Clinical question

What is the optimal diagnostic strategy in adults with suspected acute appendicitis?

Recommendation

Recommendation 1

Perform ultrasound in every patient with suspected appendicitis.

Recommendation 2

In case of a negative or inconclusive ultrasound and a high clinical suspicion of acute appendicitis without a clear alternative diagnosis, perform:

a CT with intravenous contrast; or
an MRI in young adults, particularly in women of childbearing age.

In case of a low(er) clinical suspicion of acute appendicitis without a clear alternative diagnosis and a negative or inconclusive ultrasound, consider reassessment, repeating the ultrasound if needed, when symptoms persist or worsen.

Perform a diagnostic laparoscopy in case of a high clinical suspicion of acute appendicitis and an inconclusive MRI or CT scan.

Recommendation 3

Consider an additional CT with intravenous contrast in patients with a suspected abscess or large inflammatory mass on ultrasound.

See also the flowchart 'Diagnostics in adults' among the related products under 'Apply'.
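
The three recommendations above can be read as a small decision procedure. The following is a minimal sketch of that reading, intended only to illustrate the order of the steps and not as a clinical decision tool. The function name, parameter names and returned strings are hypothetical, and situations the recommendations do not address (for example a clear alternative diagnosis) are returned as "not covered by these recommendations".

```python
from typing import Optional

ULTRASOUND_RESULTS = {"positive", "negative", "inconclusive"}


def next_diagnostic_step(
    ultrasound: Optional[str],
    high_clinical_suspicion: bool,
    clear_alternative_diagnosis: bool = False,
    abscess_or_mass_on_ultrasound: bool = False,
    ct_or_mri: Optional[str] = None,
) -> str:
    """Return the next step suggested by Recommendations 1 to 3 (illustrative only)."""
    # Recommendation 1: every patient with suspected appendicitis gets an ultrasound.
    if ultrasound is None:
        return "perform ultrasound"
    if ultrasound not in ULTRASOUND_RESULTS:
        raise ValueError(f"unknown ultrasound result: {ultrasound!r}")

    # Recommendation 3: suspected abscess or large inflammatory mass on ultrasound.
    if abscess_or_mass_on_ultrasound:
        return "consider additional CT with intravenous contrast"

    # A positive ultrasound suffices on its own (stated in the Considerations).
    if ultrasound == "positive":
        return "no further imaging needed"

    # From here on the ultrasound is negative or inconclusive (Recommendation 2).
    if clear_alternative_diagnosis:
        return "not covered by these recommendations"
    if not high_clinical_suspicion:
        return "consider reassessment, repeating ultrasound if symptoms persist or worsen"
    if ct_or_mri is None:
        return "CT with intravenous contrast, or MRI in young adults"
    if ct_or_mri == "inconclusive":
        return "diagnostic laparoscopy"
    return "not covered by these recommendations"
```

For example, `next_diagnostic_step("inconclusive", high_clinical_suspicion=True)` returns the second-line imaging step, and passing `ct_or_mri="inconclusive"` as well returns diagnostic laparoscopy.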

Considerations

Advantages and disadvantages of the intervention and the quality of the evidence

The working group considers the diagnostic accuracy of history taking, physical examination and laboratory tests sufficient to judge whether a patient with acute abdominal pain qualifies for additional diagnostics, admission or reassessment because of suspected appendicitis.

Additional imaging is indicated to establish the definitive diagnosis of acute appendicitis. The accuracy of imaging increases as the pre-test probability increases. Good preselection based on history, physical examination and laboratory tests therefore increases the accuracy of imaging and is essential.

Imaging

Using imaging in every patient with suspected appendicitis results in lower costs per patient, fewer complications and fewer negative appendectomies (Lahaye, 2015). The study by Leeuwenburgh (2013) showed that the sensitivity and negative predictive value of ultrasound followed by CT or MRI are higher than those of ultrasound alone. The study by Reuvers (2016) described the results of clinical reassessment instead of immediate additional imaging, which can be applied with good results in selected patients.

Several studies show that the positive predictive value and specificity of ultrasound for detecting appendicitis are high. With a positive ultrasound, ultrasound alone suffices. With a negative or inconclusive ultrasound, the next steps must be discussed. Ultrasound followed by CT yields a higher negative predictive value. Alternatively, reassessment can be chosen.

Reassessment

In an RCT among patients with acute abdominal pain who received a non-definitive diagnosis after assessment in the emergency department, Onur (2008) compared morbidity between admission for observation (n=50) and reassessment over the following three days at intervals of 8 to 12 hours (n=55). Total morbidity was 10% in the admission group and 7.2% in the reassessment group.

Toorenvliet (2010) conducted a prospective study of 500 patients with acute abdominal pain who were not admitted after assessment in the emergency department and were reassessed within 24 hours. At reassessment, six patients (1.2%) received a diagnosis that ideally would have been made at the initial assessment. Three of these patients had perforated acute appendicitis. The six patients underwent surgery and, after recovering from the operation, were discharged from hospital without complications.

The study by Toorenvliet (2010) was designed as a prospective study in which daily management was not influenced by the study. Given that a faster diagnosis would have been desirable in only six patients (1.2%), it can be stated that clinical evaluation allows a good judgment of whether a patient qualifies for reassessment.

Imaging:

The quality of the evidence is low to very low. For some comparisons, no studies were available that compared different tests within the same study population, for example ultrasound versus MRI. The diagnostic accuracy of ultrasound and MRI was therefore extracted from different study populations, so no direct comparison can be made and there is indirectness.

The studies and outcomes in the available literature are very heterogeneous, which lowers the quality of evidence for the outcome measures. Prevalence varies across studies, and it is often unclear how patients were selected and which diagnostics preceded imaging. In addition, many studies do not make clear how long the follow-up for the reference test was, particularly for patients who did not undergo surgery, or which information was available when a diagnostic test was assessed.

Reassessment:

The quality of the evidence was not assessed. No systematic literature search was performed, because no comparative studies were expected to be found that could answer which patients, after clinical evaluation, have an indication for reassessment, or for additional diagnostics, the next day. Consequently, no conclusions are stated. The recommendations are therefore based solely on considerations drawn up by the working group members on the basis of practical knowledge and, where possible, supported by a non-systematic literature review.

Values and preferences of patients (and, where applicable, their caregivers)

Imaging:

The main goal for the patient is a correct and rapid diagnosis while avoiding unnecessary diagnostics, with attention to the burden on, and the resilience of, the patient group: no negative appendectomy, but also no unjustified delay in treatment.

Ultrasound involves no radiation exposure or risk of contrast nephropathy and is well tolerated by patients; its drawback, however, is lower sensitivity and negative predictive value. A major advantage is that ultrasound is available everywhere, including outside office hours. When CT or MRI is added to the diagnostic pathway after a negative or inconclusive ultrasound, sensitivity and negative predictive value increase.

Drawbacks of CT are radiation exposure and the use of intravenous contrast, which carries a risk of kidney damage and allergic reactions.

A drawback of MRI for appendicitis is that the technique is not yet widely available across all hospitals and, for now, is less available particularly at night and on weekends. In addition, interpreting an MRI requires specific training, and far fewer radiologists are proficient in it (Leeuwenburgh, 2012). Another drawback of MRI is the confined space, which means that patients with claustrophobia and/or severe obesity cannot undergo MRI. Patients with a neurostimulator or pacemaker also cannot undergo MRI, or only with precautions. The major advantage of MRI over CT is the absence of radiation exposure.

Reassessment:

Using reassessment in the diagnostic process leads to more correct diagnoses. After reassessment within 24 hours, 30% of patients receive a diagnosis different from the initial diagnosis made after the first assessment in the emergency department. Reassessment leads to a change in management in 17% of cases and to surgical treatment in 4% (Toorenvliet, 2010).

In 1.2% of patients returning for reassessment within 24 hours, a diagnosis is found for which rapid diagnostics would have been desirable. These patients underwent surgery and, after recovering from the operation, were discharged from hospital without complications.

Using reassessment in the diagnostic process counters unnecessary use of additional diagnostics. By making use of time, the natural course of the disease can be followed. Mild, self-limiting disease will not require additional diagnostics.

Costs (resource use)

Imaging:

Using imaging in every patient with suspected appendicitis results in lower costs per patient (Lahaye, 2015). Ultrasound is cheaper and more readily available than CT and MRI. Because ultrasound alone suffices in a number of patients, this reduces costs compared with a diagnostic pathway in which ultrasound is skipped.

Reassessment:

Using reassessment in the diagnostic process results in less additional diagnostics and limits the number of admissions. Reassessment is cheaper than admission. With reassessment, some patients need no additional diagnostics because the natural course shows that the disease is mild.

Acceptability for other relevant stakeholders

The use of imaging and reassessment in the diagnostic process of a patient with suspected acute appendicitis is standard care and widely accepted.

An additional deciding factor in favour of reassessment may be that not every hospital has ultrasound, CT or MRI directly available outside office hours and on weekends.

Feasibility and implementation

Imaging:

The use of ultrasound for suspected acute appendicitis is standard care and widely accepted. The same applies to CT. Expertise with MRI for appendicitis varies. Moreover, not every hospital will have staff available outside office hours and on weekends who are proficient in performing and interpreting an MRI.

Reassessment:

The diagnostic accuracy of history taking, physical examination and laboratory tests is sufficient to judge whether a patient qualifies for additional diagnostics, admission or reassessment.

The lower costs, wide availability and absence of radiation exposure make ultrasound the examination of first choice in patients with suspected appendicitis.

Rationale / balance between the arguments for and against the intervention

When the ultrasound is negative or inconclusive and clinical suspicion of appendicitis is high, further investigation is needed. The accuracy of ultrasound followed by MRI is equal to that of ultrasound followed by CT. MRI involves no radiation exposure but is less readily available (expertise, available resources).

Supporting evidence

Background

Diagnostics for appendicitis consist of clinical evaluation, laboratory tests and imaging (ultrasound, CT scan, MRI scan). With a low suspicion of acute appendicitis, one may choose after clinical evaluation to reassess the patient the next day instead of performing additional diagnostics. Patients with acute appendicitis can generally be identified well, but patients presenting at an early stage of the disease are harder to distinguish from patients with other (self-limiting) causes of abdominal complaints. The reliability of imaging is also lower at an early stage of disease. Whether a patient qualifies for reassessment or for immediate additional diagnostics depends on clinical evaluation.

Conclusions / Summary of Findings

Low GRADE

The sensitivity, specificity, positive predictive value and negative predictive value of contrast CT appear to be somewhat higher than those of ultrasound for diagnosing acute appendicitis in adults with suspected acute appendicitis.

Sources: (Van Randen, 2008)

Very low GRADE

It is unclear whether the sensitivity, specificity, positive predictive value and negative predictive value of ultrasound differ from those of MRI or CT for diagnosing acute appendicitis in adults with suspected acute appendicitis.

Sources: (Giliaca, 2017; Lourenco, 2016; Pare, 2015; Petkovska, 2016; Repplinger, 2016; Ziedses, 2016)

Low GRADE

The sensitivity, specificity, positive predictive value and negative predictive value of MRI versus contrast CT for the diagnosis of acute appendicitis in adults appear comparable.

Sources: (Repplinger, 2018)

GRADE

No studies were found that reported the outcome measure inconclusive results for contrast CT.

Sources: (-)

Low GRADE

The sensitivity, specificity, positive predictive value and negative predictive value of the work-up ultrasound plus contrast CT appear comparable to those of the work-up ultrasound plus MRI for diagnosing acute appendicitis in adults with suspected acute appendicitis.

Sources: (Leeuwenburgh, 2013)

Low GRADE

The sensitivity, specificity, positive predictive value and negative predictive value of the work-up ultrasound plus contrast CT appear comparable to those of a contrast CT without ultrasound for diagnosing acute appendicitis in adults with suspected acute appendicitis.

Sources: (Atema, 2015)

Low GRADE

The sensitivity, specificity, positive predictive value and negative predictive value of the work-up ultrasound plus contrast CT appear comparable to those of an MRI without ultrasound for diagnosing acute appendicitis in adults with suspected acute appendicitis.

Sources: (Leeuwenburgh, 2013)

Low GRADE

The sensitivity, specificity, positive predictive value and negative predictive value of the work-up ultrasound plus MRI appear comparable to those of an MRI without ultrasound for diagnosing acute appendicitis in adults with suspected acute appendicitis.

Sources: (Leeuwenburgh, 2013)

GRADE

No studies were found that reported the outcome measure inconclusive results for the work-up ultrasound plus MRI or ultrasound versus contrast CT.

Sources: (-)

Summary of literature

Description of studies

A total of three systematic reviews (Giliaca, 2017; Van Randen, 2008; Repplinger, 2016) and 10 additional studies (Atema, 2015; Leeuwenburgh, 2013; Lourenco, 2016; Pare, 2015; Parida, 2017; Petkovska, 2016; Poortman, 2009; Repplinger, 2018; Toorenvliet, 2010; Ziedses, 2016) were included for the adult subgroup. Most studies were prospective cohort studies. In most studies, the course of symptoms or the (pathological) findings after surgery served as the reference test.

Ultrasound versus contrast CT or MRI

The review by Giliaca (2017) included 17 cohort studies with 2,778 patients who underwent ultrasound. The median prevalence of acute appendicitis in the studies was 76%. In most studies, findings after surgery or follow-up served as the reference test.

In addition, Pare (2015) described the accuracy of ultrasound in 451 young men aged 18 to 39 years who had undergone surgery for a diagnosis of acute appendicitis. The prevalence of acute appendicitis was 94%. CT, and CT after inconclusive ultrasound, were also used in some patients, but only the ultrasound results were reported. Pathology findings were used as the reference test. Lourenco (2016) described ultrasound results in 354 adults, of whom 19.8% had acute appendicitis. As the reference test, ultrasound was compared with CT, MRI and surgical findings. Patients who had not undergone surgery were classified as negative for appendicitis unless they were readmitted or visited the emergency department.

The review by Van Randen (2008) included 6 comparative studies in which ultrasound was compared with CT in 671 patients. The overall prevalence of acute appendicitis was 50%. No studies comparing ultrasound with CT were found after the search date of this review. In most studies, the course of symptoms or the (pathological) findings after surgery served as the reference test. In some studies, patients were additionally followed up by telephone. One study provided no information on the reference test.

Repplinger (2016) included 10 studies reporting the accuracy of MRI in 838 patients. The mean prevalence of acute appendicitis in these studies was 58%. In the included studies, the course of symptoms or the (pathological) findings after surgery served as the reference test. The results were supplemented with the study by Ziedses (2016), which examined the accuracy of MRI without contrast in 112 patients with a prevalence of 26%. Follow-up at 4 months or the histological findings were used as the reference standard. Petkovska (2016) reported MRI results in 403 patients, of whom 253 were adults, with a prevalence of 17%. The symptoms or the (pathological) findings after surgery served as the reference test. If a patient did not undergo surgery, the patient was interviewed by telephone after at least 8 weeks, or the medical record was reviewed after at least 6 months of follow-up.

MRI versus contrast CT

The study by Repplinger (2018) compared CT with MRI in 198 patients who underwent both modalities. The prevalence of acute appendicitis was 32%. Clinical follow-up or the pathology findings after surgery served as the reference test.

Work-up (ultrasound plus contrast CT or ultrasound plus MRI) versus MRI, CT, or work-up ultrasound plus contrast CT versus ultrasound plus MRI

A number of studies examined the accuracy of a work-up. Leeuwenburgh (2013) examined a work-up in which patients underwent a CT scan after an inconclusive ultrasound. In addition, all patients (n=230) also underwent an MRI scan. The study prevalence was 51%. The reference test combined the results of pathology, surgery, clinical information, ultrasound and CT imaging, and at least 3 months of follow-up.

The study by Atema (2015) compared a work-up of CT after an inconclusive or negative ultrasound with immediate CT. All patients (n=422) underwent both ultrasound and CT. The study prevalence was 59%, and surgical and pathology findings or follow-up data were used as the reference test.

Toorenvliet (2010) included 250 patients with suspected acute appendicitis, of whom 117 underwent ultrasound, 2 underwent CT and 20 underwent CT after an inconclusive ultrasound. The study prevalence was 31%. The definitive diagnosis was based on pathological and surgical findings, or on clinical and radiological diagnosis combined with clinical follow-up.

Poortman (2009) examined a work-up in which patients underwent a CT scan after an inconclusive or negative ultrasound. 151 patients underwent ultrasound and 60 underwent ultrasound followed by CT. The study prevalence was 61%. The reference test was surgical or conservative treatment.

Results

Ultrasound versus contrast CT or MRI

In a meta-analysis by Van Randen (2008), 6 studies were included that compared CT and ultrasound directly. A direct comparison within the same population is methodologically the best comparison for test accuracy. Because no studies in children directly compared ultrasound with MRI, accuracy studies were included and the test characteristics of ultrasound and MRI are described as indirect evidence.

Ultrasound versus CT

In a meta-analysis by Van Randen (2008), 6 studies were included that compared CT and ultrasound directly in 671 patients. Two of these studies used unenhanced CT scans. A direct comparison within the same population is methodologically the best comparison for test accuracy. The analysis shows that CT is superior to ultrasound in patients with suspected appendicitis. The pooled sensitivity was 78% (95% CI 67% to 86%) for ultrasound versus 91% (95% CI 84% to 95%) for CT, and the pooled specificity was 83% (95% CI 76% to 88%) for ultrasound versus 90% (95% CI 85% to 95%) for CT. The positive predictive value was 82% for ultrasound versus 90% for CT, and the negative predictive value was 79% for ultrasound versus 91% for CT. The authors also concluded that test accuracy is strongly related to the prevalence of appendicitis: the lower the prevalence in the population, the lower the accuracy of imaging.
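
The relation between sensitivity, specificity, prevalence and the predictive values can be made concrete with Bayes' rule. The sketch below is an illustration, not the method used by Van Randen (2008); the function name is hypothetical. It assumes the pooled sensitivity and specificity quoted above and the overall prevalence of 50% reported for that review, and it shows only the arithmetic dependence of the predictive values on prevalence. The 20% prevalence in the last lines is a made-up value for illustration.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) as fractions, derived with Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled values quoted above, at the review's overall prevalence of 50%.
ppv_us, npv_us = predictive_values(0.78, 0.83, 0.50)  # ultrasound
ppv_ct, npv_ct = predictive_values(0.91, 0.90, 0.50)  # CT
print(round(ppv_us * 100), round(npv_us * 100))  # 82 79
print(round(ppv_ct * 100), round(npv_ct * 100))  # 90 91

# Hypothetical lower prevalence (20%): the PPV of ultrasound drops.
ppv_low, _ = predictive_values(0.78, 0.83, 0.20)
print(ppv_low < ppv_us)  # True
```

Under these assumptions the computed values round to the predictive values reported for ultrasound and CT.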

Ultrasound

The meta-analysis by Giliaca (2017) included 17 cohort studies with 2,778 patients who underwent ultrasound. The pooled sensitivity was 69% (95% CI 59% to 78%) and the pooled specificity was 81% (95% CI 73% to 88%). The pooled positive and negative predictive values were not reported. Based on the median prevalence of 76% in the studies, the pooled positive and negative predictive values are 82% and 45%, respectively. With a pre-test probability of 76%, per 1000 patients 7 are incorrectly classified as having acute appendicitis (false positive) and 84 are incorrectly classified as not having acute appendicitis (false negative).

Lourenco (2016) reported the accuracy of ultrasound in 354 patients with a study prevalence of 20%. The sensitivity, specificity, positive predictive value and negative predictive value were 48.4%, 97.9%, 83.6% and 89.6%, respectively. 288 patients (81.4%) had an inconclusive ultrasound.

Pare (2015) reported the accuracy of ultrasound in 451 young men aged 18 to 39 years with a study prevalence of 94% and found a sensitivity of 57% (95% CI 46% to 67.6%) and a positive predictive value of 98% (95% CI 93.8% to 100%). Specificity and negative predictive value were not reported.

MRI

The review by Repplinger (2016) pooled 10 studies reporting the accuracy of MRI in 838 patients. The sensitivity was 97% (95% CI 92% to 99%) and the specificity was 96% (95% CI 89% to 98%). The positive and negative predictive values were 96% (95% CI 92% to 99%) and 96% (95% CI 91% to 98%), respectively. The mean prevalence in the study population was 57.7%. With this pre-test probability, per 1000 patients 17 were incorrectly classified as having acute appendicitis (false positive) and 17 were incorrectly classified as not having acute appendicitis (false negative).
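
The per-1000 misclassification counts follow from simple arithmetic on sensitivity, specificity and pre-test probability. The sketch below reproduces that arithmetic for the MRI figures quoted above; the function name and the rounding to whole patients are assumptions made for illustration, not a description of how the guideline authors computed their numbers.

```python
def misclassified_per_cohort(sensitivity: float, specificity: float,
                             prevalence: float, cohort: int = 1000):
    """Return (false positives, false negatives) in a cohort, rounded to whole patients."""
    with_disease = cohort * prevalence
    without_disease = cohort - with_disease
    false_pos = (1 - specificity) * without_disease  # healthy, but test positive
    false_neg = (1 - sensitivity) * with_disease     # diseased, but test negative
    return round(false_pos), round(false_neg)


# MRI: sensitivity 97%, specificity 96%, pre-test probability 57.7%.
fp, fn = misclassified_per_cohort(0.97, 0.96, 0.577)
print(fp, fn)  # 17 17
```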

Ziedses (2016) examined the accuracy of MRI without contrast in 112 patients with a prevalence of 26%. The sensitivity, specificity, positive predictive value and negative predictive value were 89%, 100%, 100% and 96%, respectively. 9 patients (8%) had an inconclusive ultrasound.

Petkovska (2016) reported MRI results in 403 patients, of whom 253 were adults, with a prevalence of 17%. The sensitivity, specificity, positive predictive value and negative predictive value were 97.1% (95% CI 85.1% to 99.9%), 99.5% (95% CI 97.5% to 100%), 97.1% (95% CI 85.1% to 99.9%) and 99.5% (95% CI 97.5% to 100%), respectively.

Quality of evidence

The quality of evidence for the outcome measures sensitivity, specificity, positive predictive value, negative predictive value and inconclusive results for ultrasound versus CT was downgraded by two levels to low because of risk of bias (unclear patient selection, unclear follow-up time for the reference test, unclear which information was available when the ultrasound or CT was assessed) and inconsistency (heterogeneity between studies, such as the use of different CT techniques). The quality of evidence for ultrasound versus MRI was downgraded by three levels to very low because of risk of bias (unclear patient selection, unclear follow-up time for the reference test), indirectness (diagnostic accuracy as an intermediate step towards patient-relevant consequences) and inconsistency (heterogeneity between studies).

MRI versus contrast CT

One study (Repplinger, 2018) compared the accuracy of MRI versus contrast CT in 198 patients who underwent both modalities. It showed that the diagnostic accuracy of MRI and CT for the diagnosis of acute appendicitis is comparable. The pooled sensitivity was 96.9% (95% CI 88.2% to 99.5%) for MRI versus 98.4% (95% CI 90.5% to 99.9%) for CT, and the pooled specificity was 81.3% (95% CI 73.5% to 87.3%) for MRI versus 89.6% (95% CI 82.8% to 94.0%) for CT. The positive predictive value was 71.3% (95% CI 60.4% to 80.2%) for MRI versus 81.8% (95% CI 71.0% to 89.4%) for CT, and the negative predictive value was 98.2% (95% CI 93.0% to 99.7%) for MRI versus 99.2% (95% CI 94.8% to 100%) for CT. Inconclusive results were not reported.

Quality of evidence

The quality of evidence for the outcome measures sensitivity, specificity, positive predictive value and negative predictive value for MRI versus CT was downgraded by two levels to low because of risk of bias (unclear patient selection, unclear follow-up time for the reference test) and imprecision (small number of patients). Because no studies described inconclusive results, the quality of evidence for inconclusive results was not assessed.

Work-up (ultrasound plus contrast CT or ultrasound plus MRI) versus MRI, CT, or work-up ultrasound plus contrast CT versus ultrasound plus MRI

Work-up: ultrasound with CT versus ultrasound with MRI versus immediate MRI

Leeuwenburgh (2013) examined a work-up in which patients underwent a contrast CT scan after an inconclusive or negative ultrasound. In addition, all patients (n=230) also underwent an MRI scan. The diagnostic accuracy of the work-up with CT was comparable to that of the work-up with MRI and to that of MRI without ultrasound (Table 2).

Table 2 Sensitivity, specificity, positive predictive value and negative predictive value of ultrasound, MRI, work-up ultrasound with CT and work-up ultrasound with MRI

Imaging	Sensitivity, %	Specificity, %	Positive predictive value, %	Negative predictive value, %
Ultrasound	70	94	93	80
MRI	97	93	93	96
Work-up ultrasound with contrast CT	97	91	91	97
Work-up ultrasound with MRI	98	88	88	98

Work-up: ultrasound with CT versus CT

The study by Atema (2015) compared a work-up of contrast CT after an inconclusive or negative ultrasound with immediate CT. All patients (n=422) underwent both ultrasound and CT. The study prevalence was 59%. Inconclusive results were not reported. The diagnostic accuracy of the work-up with CT was comparable to that of CT without ultrasound (Table 3).

Table 3 Sensitivity, specificity, positive predictive value and negative predictive value of CT and work-up ultrasound with CT

Imaging	Sensitivity, % (95% CI)	Specificity, % (95% CI)	Positive predictive value, % (95% CI)	Negative predictive value, % (95% CI)
CT	95 (91 to 97)	87 (81 to 92)	92 (87 to 95)	92 (86 to 95)
Work-up ultrasound with contrast CT	96 (93 to 98)	77 (70 to 83)	86 (81 to 90)	93 (87 to 96)

Work-up: ultrasound with CT after inconclusive ultrasound

Toorenvliet (2010) and Poortman (2009) reported the diagnostic accuracy of CT after ultrasound.

Toorenvliet (2010) included 250 patients with suspected acute appendicitis, of whom 117 underwent ultrasound, 2 underwent CT and 20 underwent CT after an inconclusive ultrasound. The study prevalence was 31%. The sensitivity, specificity, positive predictive value and negative predictive value were 91%, 98%, 94% and 97%, respectively, for ultrasound without CT versus 93%, 99%, 98% and 98% for CT after an inconclusive ultrasound.

Poortman (2009) examined a work-up in which patients underwent a CT scan after an inconclusive or negative ultrasound. 151 patients underwent ultrasound and 60 underwent CT after an inconclusive or negative ultrasound. The study prevalence was 61%. The sensitivity, specificity, positive predictive value and negative predictive value were 91%, 86%, 90% and 71%, respectively, for ultrasound without CT versus 93%, 86%, 92% and 100% for CT after an inconclusive or negative ultrasound. 29 ultrasounds (19.2%) were inconclusive.

Quality of evidence

The quality of evidence for the outcome measures sensitivity, specificity, positive predictive value and negative predictive value for the work-up (ultrasound plus CT / ultrasound plus MRI) versus MRI, CT or another work-up (ultrasound plus CT / ultrasound plus MRI) was downgraded by two levels to low because of risk of bias (unclear patient selection, unclear follow-up time for the reference test) and imprecision (wide confidence intervals). Because no studies reported inconclusive results for the work-up ultrasound plus MRI or ultrasound versus contrast CT, the quality of evidence for this outcome measure could not be assessed.

Search and selection

To answer the clinical question, a systematic literature analysis was performed for the following search questions:

PICO 1

What is the diagnostic accuracy of ultrasound compared with MRI or contrast CT for diagnosing acute appendicitis in adults with suspected acute appendicitis?

P (Patients): patients (adults) with acute appendicitis;

I (Intervention): ultrasound;

C (Comparison): MRI or contrast CT;

Reference test: course of symptoms or the (pathological) findings after surgery;

O (Outcomes): sensitivity, specificity, positive predictive value, negative predictive value, inconclusive results.

PICO 2

What is the diagnostic accuracy of MRI compared with CT for diagnosing acute appendicitis in adults with suspected acute appendicitis?

P (Patients): patients (adults) with acute appendicitis;

I (Intervention): MRI;

C (Comparison): contrast CT;

Reference test: course of symptoms or the (pathological) findings after surgery;

O (Outcomes): sensitivity, specificity, positive predictive value, negative predictive value, inconclusive results.

PICO 3

What is the diagnostic accuracy of a work-up (ultrasound plus CT or ultrasound plus MRI) compared with MRI or contrast CT, or of a work-up ultrasound plus CT compared with a work-up ultrasound plus MRI, for diagnosing acute appendicitis in adults with suspected acute appendicitis?

P (Patients): patients (adults) with acute appendicitis;

I (Intervention): work-up (ultrasound plus contrast CT or ultrasound plus MRI);

C (Comparison): MRI, contrast CT or work-up (other than under I);

Reference test: course of symptoms or the (pathological) findings after surgery;

O (Outcomes): sensitivity, specificity, positive predictive value, negative predictive value, inconclusive results.

Relevante uitkomstmaten

De werkgroep achtte sensitiviteit en negatief voorspellende waarde (een) voor de besluitvorming cruciale uitkomstmaten; en specificiteit, positief voorspellende waarde, inconclusieve uitkomsten (een) voor de besluitvorming belangrijke uitkomstmaten.

Table 1 Consequences of diagnostic test outcomes

Outcome	Consequences	Relevance
True positives (TP)	The patient is correctly diagnosed with acute appendicitis and receives treatment.	Critical
True negatives (TN)	The patient is correctly not diagnosed with acute appendicitis and correctly receives no treatment.	Important
False positives (FP)	The patient is incorrectly diagnosed with acute appendicitis and receives unnecessary treatment. The symptoms persist.	Important
False negatives (FN)	The patient is incorrectly not diagnosed with acute appendicitis and incorrectly receives no treatment. Further investigation into the cause of the symptoms follows.	Critical
Inconclusive results	Follow-up imaging (MRI or CT), delaying the final diagnosis.	Important

The working group defined a difference of 10% in sensitivity, specificity, positive predictive value, negative predictive value or inconclusive results as a clinically (patient-)relevant difference.
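As an illustration of how this threshold could be applied, the sketch below compares two accuracy estimates against the 10% criterion. It assumes the 10% is meant as an absolute difference in proportion (the guideline does not state whether it is absolute or relative), and the function name is hypothetical. The example values are the pooled ultrasound and CT estimates from the Van Randen (2008) meta-analysis reported in the evidence table.

```python
# Assumption: the working group's 10% is an absolute difference (0.10).
RELEVANT_DIFFERENCE = 0.10


def is_clinically_relevant(value_a, value_b, threshold=RELEVANT_DIFFERENCE):
    """Return True when two proportions differ by at least the threshold."""
    return abs(value_a - value_b) >= threshold


# Pooled estimates from Van Randen (2008): ultrasound versus CT.
print(is_clinically_relevant(0.78, 0.91))  # sensitivity: difference 0.13
print(is_clinically_relevant(0.83, 0.90))  # specificity: difference 0.07
```

Under this reading, the difference in sensitivity would meet the criterion and the difference in specificity would not.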

Search and selection (Methods)

On 6 June 2018, the databases Medline (via OVID) and Embase (via Embase.com) were searched with relevant search terms for systematic reviews (SRs), randomized controlled trials (RCTs), observational comparative studies and cohort studies, published from 2008 onwards, that reported on the diagnostic accuracy of ultrasound, MRI, CT or a step-up approach (an MRI or CT scan performed after an inconclusive ultrasound) for the diagnosis of acute appendicitis. The search strategy is provided under the Verantwoording (Justification) tab. The literature search yielded 479 hits. Studies were selected on the basis of the following criteria: systematic reviews, RCTs or observational comparative studies that reported at least one of the following outcome measures: sensitivity, specificity, positive predictive value, negative predictive value or inconclusive results. When such studies were not available, non-comparative studies were also included.

Based on title and abstract, 107 studies were initially preselected. After review of the full text, 72 studies were excluded (see the exclusion table under the Verantwoording (Justification) tab) and 35 studies were definitively selected. These 35 studies were included in the literature analysis, of which 3 systematic reviews and 10 additional studies reported results for the adult subgroup. The most important study characteristics and results are presented in the evidence tables. The assessment of individual study design (risk of bias) is presented in the risk-of-bias tables.

References

Atema JJ, Gans SL, Van Randen A, Laméris W, van Es HW, van Heesewijk JP, van Ramshorst B, Bouma WH, Ten Hove W, van Keulen EM, Dijkgraaf MG, Bossuyt PM, Stoker J, Boermeester MA. Comparison of Imaging Strategies with Conditional versus Immediate Contrast-Enhanced Computed Tomography in Patients with Clinical Suspicion of Acute Appendicitis. Eur Radiol. 2015 Aug;25(8):2445-52. doi: 10.1007/s00330-015-3648-9. Epub 2015 Apr 24. PubMed PMID: 25903701; PubMed Central PMCID: PMC4495262.
Giljaca V, Nadarevic T, Poropat G, Nadarevic VS, Stimac D. Diagnostic Accuracy of Abdominal Ultrasound for Diagnosis of Acute Appendicitis: Systematic Review and Meta-analysis. World J Surg. 2017 Mar;41(3):693-700. doi: 10.1007/s00268-016-3792-7. Review. PubMed PMID: 27864617.
Lahaye MJ, Lambregts DM, Mutsaers E, Essers BA, Breukink S, Cappendijk VC, Beets GL, Beets-Tan RG. Mandatory imaging cuts costs and reduces the rate of unnecessary surgeries in the diagnostic work-up of patients suspected of having appendicitis. Eur Radiol. 2015 May;25(5):1464-70. doi: 10.1007/s00330-014-3531-0. Epub 2015 Jan 16. PubMed PMID: 25591748.
Leeuwenburgh MM, Wiarda BM, Wiezer MJ, Vrouenraets BC, Gratama JW, Spilt A, Richir MC, Bossuyt PM, Stoker J, Boermeester MA; OPTIMAP Study Group. Comparison of imaging strategies with conditional contrast-enhanced CT and unenhanced MR imaging in patients suspected of having appendicitis: a multicenter diagnostic performance study. Radiology. 2013 Jul;268(1):135-43. doi: 10.1148/radiol.13121753. Epub 2013 Mar 12. PubMed PMID: 23481162.
Lourenco P, Brown J, Leipsic J, Hague C. The current utility of ultrasound in the diagnosis of acute appendicitis. Clin Imaging. 2016 Sep-Oct;40(5):944-8. doi: 10.1016/j.clinimag.2016.03.012. Epub 2016 Apr 2. PubMed PMID: 27203288.
Onur OE, Guneysel O, Unluer EE et al (2008) Outpatient follow-up or active clinical observation in patients with nonspecific abdominal pain in the emergency department. A randomized clinical trial. Minerva Chir 63:915
Pare JR, Langlois BK, Scalera SA, Husain LF, Douriez C, Chiu H, Carmody K. Revival of the use of ultrasound in screening for appendicitis in young adult men. J Clin Ultrasound. 2016 Jan;44(1):3-11. doi: 10.1002/jcu.22282. Epub 2015 Jul 14. PubMed PMID: 26178008.
Parida S, Nayak B, Mohanty J. Determination of sensitivity and specificity of ultrasonography in acute appendicitis: comparison with per-operative findings and histopathological report. Asian journal of pharmaceutical and clinical research. 2017;10(9):151-5. doi: http://dx.doi.org/10.22159/ajpcr.2017.v10i9.18873
Petkovska I, Martin DR, Covington MF, Urbina S, Duke E, Daye ZJ, Stolz LA, Keim SM, Costello JR, Chundru S, Arif-Tiwari H, Gilbertson-Dahdal D, Gries L, Kalb B. Accuracy of Unenhanced MR Imaging in the Detection of Acute Appendicitis: Single-Institution Clinical Performance Review. Radiology. 2016 May;279(2):451-60. doi: 10.1148/radiol.2015150468. Epub 2016 Jan 25. PubMed PMID: 26807893.
Poortman P, Oostvogel HJ, Bosma E, Lohle PN, Cuesta MA, de Lange-de Klerk ES, Hamming JF. Improving diagnosis of acute appendicitis: results of a diagnostic pathway with standard use of ultrasonography followed by selective use of CT. J Am Coll Surg. 2009 Mar;208(3):434-41. doi: 10.1016/j.jamcollsurg.2008.12.003. PubMed PMID: 19318006.
Repplinger MD, Levy JF, Peethumnongsin E, Gussick ME, Svenson JE, Golden SK, Ehlenbach WJ, Westergaard RP, Reeder SB, Vanness DJ. Systematic review and meta-analysis of the accuracy of MRI to diagnose appendicitis in the general population. J Magn Reson Imaging. 2016 Jun;43(6):1346-54. doi: 10.1002/jmri.25115. Epub 2015 Dec 22. Review. PubMed PMID: 26691590; PubMed Central PMCID: PMC4865442.
Repplinger MD, Pickhardt PJ, Robbins JB, Kitchin DR, Ziemlewicz TJ, Hetzel SJ, Golden SK, Harringa JB, Reeder SB. Prospective Comparison of the Diagnostic Accuracy of MR Imaging versus CT for Acute Appendicitis. Radiology. 2018 Aug;288(2):467-475. doi: 10.1148/radiol.2018171838. Epub 2018 Apr 24. PubMed PMID: 29688158; PubMed Central PMCID: PMC6067821.
Reuvers JR, Rijbroek A. Acute appendicitis: liever tweede echo dan CT of MRI. Ned Tijdschr Geneeskd. 2016;160:A9603 en D64.
Toorenvliet BR, Bakker RFR, Flu HC et al (2010) Standard Outpatient Re-Evaluation for Patients Not Admitted to the Hospital After Emergency Department Evaluation for Acute Abdominal Pain. World J Surg 34:480-486. DOI 10.1007/s00268-009-0334-6
Toorenvliet BR, Wiersma F, Bakker RF, Merkus JW, Breslau PJ, Hamming JF. Routine ultrasound and limited computed tomography for the diagnosis of acute appendicitis. World J Surg. 2010 Oct;34(10):2278-85. doi: 10.1007/s00268-010-0694-y. PubMed PMID: 20582544; PubMed Central PMCID: PMC2936677.
Van Randen A, Bipat S, Zwinderman AH, Ubbink DT, Stoker J, Boermeester MA. Acute appendicitis: meta-analysis of diagnostic performance of CT and graded compression US related to prevalence of disease. Radiology. 2008 Oct;249(1):97-106. doi: 10.1148/radiol.2483071652. Epub 2008 Aug 5. PubMed PMID: 18682583.
Ziedses des Plantes CMP, van Veen MJF, van der Palen J, Klaase JM, Gielkens HAJ, Geelkerken RH. The Effect of Unenhanced MRI on the Surgeons' Decision-Making Process in Females with Suspected Appendicitis. World J Surg. 2016 Dec;40(12):2881-2887. doi: 10.1007/s00268-016-3626-7. PubMed PMID: 27495315; PubMed Central PMCID: PMC5104813.

Evidence tables

Table of quality assessment for systematic reviews of diagnostic studies

Study First author, year	Appropriate and clearly focused question? Yes/no/unclear	Comprehensive and systematic literature search? Yes/no/unclear	Description of included and excluded studies? Yes/no/unclear	Description of relevant characteristics of included studies? Yes/no/unclear	Assessment of scientific quality of included studies? Yes/no/unclear	Enough similarities between studies to make combining them reasonable? Yes/no/unclear	Potential risk of publication bias taken into account? Yes/no/unclear	Potential conflicts of interest reported? Yes/no/unclear
Repplinger, 2016	Yes	No, search date not reported	Included: yes, excluded: no	Yes	Yes	Yes	No	No
Randen, 2008	Yes	Yes	Yes	Yes	Yes, but not reported	Yes	No	No
Giljaca, 2017	Yes	Yes	Included: yes, excluded: no	Yes	Yes	Yes	No	No

Evidence table for systematic reviews of diagnostic test accuracy studies

Study reference	Study characteristics	Patient characteristics	Index test (test of interest)	Reference test	Follow-up	Outcome measures and effect size	Comments

Repplinger, 2016

PS., study characteristics and results are extracted from the SR (unless stated otherwise)

SR and meta-analysis

Literature search up to: not reported

A: Nitta 2005

B: Cobben 2009

C: Singh 2009

D: Inci 2011

E: Chabanova 2011

F: Inci 2011

G: Heverhagen 2012

H: Zhu 2012

I: Leeuwenburgh, 2013

J: Avcu 2013

Study design: SR of prospective cohort studies

Setting and Country:

SR: USA

A: Japan

B: The Netherlands

C: USA

D: Turkey

E: Denmark

F: Turkey

G: Germany

H: China

I: The Netherlands

J: Turkey

Source of funding and conflicts of interest:

National Center for Advancing Translational Sciences (NCATS); contract grant numbers: UL1TR000427; KL2TR000428; Contract grant sponsor: National Institute on Aging; contract grant number: K23AG038352; Contract grant sponsor: National Institute on Drug Abuse; contract grant number: K23DA032306; Contract grant sponsor: National Institute of Mental Health; contract grant number: T21MH18029; Contract grant sponsor: National Institute for Diabetes and Digestive and Kidney Diseases; contract grant number: K24DK102595

Conflicts of interest are not reported

Inclusion criteria SR: published in the past decade to best represent current imaging protocols including diffusion-weighted imaging and images obtained with free breathing. Studies were only included if they specifically dealt with the diagnosis of acute appendicitis using MRI, although specific imaging sequences were not required for inclusion. Additionally, articles were only included if they had well-defined and acceptable reference standards such as a single imaging comparator or clinical follow-up (surgical findings, histopathological findings, clinic visits, phone interviews, et cetera). Finally, articles were required to provide absolute numbers of true positives, false positives, true negatives, false negatives, and equivocal cases so that pooled statistics could be calculated.

Exclusion criteria SR: Case reports, case series, and review articles were excluded. Studies restricted to specific subpopulations, such as children or pregnant women, were excluded because of potentially significant clinical heterogeneity due to the substantial anatomical differences compared to the general population and the potential for spectrum bias.

10 studies included with 838 patients

Important patient characteristics:

Number of patients

A: 37

B: 138

C: 40

D: 85

E: 48

F: 119

G: 52

H: 41

I: 223

J: 55

Mean age (range)

A: 37.1 (16–69)

B: 29 (6–80)

C: 34 (11–69)

D: 26.5 (14–72)

E: 37.1 (18–70)

F: 27 (17–72)

G: 44.7 (18–88)

H: 41.5 SD 11.3

I: 35 (IQR 25–50)

J: 35.6 (17–83)

Sex % female

A: 52

B: 56

C: Unknown

D: 47

E: 60

F: 36

G: 40

H: 56

I: 59

J: 43

Describe index test and cut-off point(s):

MRI

A: 0.5T Gyroscan (Philips); T1 SE, T2 FSE, T2 with fat saturation

B: 1.0T (Siemens); T1 SGRE, T2 FSE, T2 FSE with fat saturation

C: 1.5T Excite Twinspeed (GE Medical Systems); T2 SSFSE with fat saturation, T2 FSE with fat saturation, STIR, pre-gadolinium T1, post-gadolinium T1

D: 1.5T Avanto (Siemens); T1 FSE, T2 FSE with and without fat saturation

E: 0.23T and 0.6T Panorama (Philips), 1.5T Infinion (Philips), 1.5T Achieva (Philips); T1 SE, T2 FSE, STIR

F: 1.5T Avanto (Siemens); T1 FSE, T2 FSE with and without fat saturation, DWI

G: 1.5T Magnetom Sonata (Siemens); STIR, T2 FSE, bSSFP, T1 fat-saturated SGRE (before and after IV contrast). IV butylscopolamine used to prevent peristalsis.

H: 1.5T Achieva Nova Dual (Philips); T2 FSE, bSSFP with fat saturation

I: 1.5T Magnetom Avanto (Siemens); T2 FSE with and without fat saturation, DWI

J: 1.5T Magnetom Symphony (Siemens); DWI, bSSFP, STIR

Describe reference test and cut-off point(s):

A: Surgical pathology or clinical follow up

B: Surgical pathology or clinical follow up

C: Final diagnosis at hospital discharge

D: Surgical pathology or clinical follow up

E: Surgical pathology or clinical follow up

F: Surgical pathology or clinical follow up

G: Surgical pathology or clinical follow up

H: Surgical pathology

I: Expert panel reviewing surgery and clinical follow up

J: Surgical pathology or clinical follow up

Prevalence (%)

(based on reference test at specified cut-off point)

A: 78

B: 45

C: 30

D: 67

E: 63

F: 66

G: 25

H: 80

I: 52

J: 71

Mean prevalence of 57.7% (95% CI 44.7–70.7%)
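The summary figures for this review can be cross-checked against the per-study values listed above. The following is a minimal sketch that recomputes the total number of patients and the unweighted mean prevalence; the variable names are illustrative, and the confidence interval is not recomputed because the review's method for it is not described here.

```python
from statistics import mean

# Per-study values for studies A-J as listed in the evidence table.
n_patients = {"A": 37, "B": 138, "C": 40, "D": 85, "E": 48,
              "F": 119, "G": 52, "H": 41, "I": 223, "J": 55}
prevalence_pct = {"A": 78, "B": 45, "C": 30, "D": 67, "E": 63,
                  "F": 66, "G": 25, "H": 80, "I": 52, "J": 71}

total_patients = sum(n_patients.values())
mean_prevalence = mean(prevalence_pct.values())  # unweighted mean across studies

print(f"Total patients: {total_patients}")
print(f"Mean prevalence: {mean_prevalence:.1f}%")
```

Both results agree with the reported 838 patients and mean prevalence of 57.7%, which suggests the reported mean is an unweighted average across studies.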

For how many participants were no complete outcome data available?

Not described

Endpoint of follow-up:

Not described

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

0.97 (95% CI 0.92–0.99)

Outcome measure-2

Specificity

0.96 (95% CI 0.89–0.98)

Outcome measure-3

PPV

0.96 (95% CI 0.92–0.99)

Outcome measure-4

NPV

0.96 (95% CI 0.91–0.98)

Outcome measure-5

Inconclusive results

Not described

Study quality (ROB): Using the QUADAS-2 assessment tool, the included studies were generally deemed to be at low or uncertain risk-of-bias.

Place of the index test in the clinical pathway: patients underwent MRI if they had clinical findings suggestive of equivocal acute appendicitis.

Choice of cut-off point: Not described

Author’s conclusion

MRI has a high sensitivity and specificity for the diagnosis of appendicitis, similar to that reported previously for computed tomography.

Heterogeneity:

For each observable study characteristic, a subgroup was created split at the median value of the included studies. These categories (and values) were: proportion of females enrolled (51%), average age of study participants (35.6 years), and prevalence of appendicitis (64.5%). In the case of scanner field strength, values were dichotomized as either using only a 1.5T scanner (n = 7) or not (n = 3). We present results partitioned by these study characteristics in Table 3. Overall, these differences appear small, with the largest discrepancy occurring when comparing specificity for studies using only 1.5T scanners versus those that used other field strengths, having a pooled specificity of 93.9% (95% CI: 89.8–96.4) and those not 87.8% (95% CI: 48.9–98.2).

Randen, 2008

PS., study characteristics and results are extracted from the SR (unless stated otherwise)

SR and meta-analysis

Literature search up to February 2006

A: Balthazar, 1994

B: Pickuth, 2000

C: Wise, 2001

D: Kan, 2001

E: Poortman, 2003

F: Keyzer, 2005

Study design: cohort, prospective

Setting and Country:

A: USA

B: Germany

C: USA

D: USA

E: The Netherlands

F: Belgium

Source of funding and conflicts of interest:

Inclusion criteria SR: (a) prospective cohort design compared graded compression US and CT in the same patient population, (b) more than 10 patients were included, (c) surgery and/or clinical follow-up was used as reference standard, and (d) data were reported to calculate a 2x2 contingency table for graded compression US and CT.

Exclusion criteria SR: Studies evaluating only pediatric patients were not eligible.

6 studies included

Important patient characteristics:

Number of patients

A: 100

B: 120

C: 100

D: 31

E: 226

F: 94

Total: 671

Age, mean (range)

A: 38 (15-82)

B: (8-81)

C: 38 (18-86)

D: 34 (18-57)

E: 26 (3-89)

F: 38 (16-81)

Sex (% females)

A: 65

B: 53

C: 74

D: 84

E: 55

F: 63

Describe index test and cut-off point(s):

US (1) vs CT (2)

A-1: Linear, 5 and/or 7.5

A-2: Nonhelical, oral (800 mL), intravenous (200 mL)

B-1: Curved, 3.5; linear, 5–7.5

B-2: Helical, multidetector, Rectal (1000 mL)

C-1: NR

C-2: Helical, single detector, oral (400–500 mL), intravenous (125 mL), rectal (800–1200 mL)

D-1: Curvilinear, 3.5; linear, 5

D-2: Helical, multidetector, rectal (750 mL), oral (23 patients), intravenous (7 patients)

E-1: Linear, 5–12; curved, 5–2

E-2: Helical, single detector, oral contrast

F-1: Convex, 3.75; linear, 8

F-2: Helical, multidetector, oral contrast

Describe reference test and cut-off point(s):

A: Surgery, discharge diagnosis, clinical follow-up

B: Surgery, 6-mo clinical follow-up

C: Surgery, discharge diagnosis, phone follow-up after 1 and 3 mo

D: Not specified

E: Surgery, mean 13-mo follow-up

F: Surgery, 1-mo clinical follow-up

Prevalence (%)

(based on reference test at specified cut-off point)

A: 54

B: 77

C: 24

D: 13

E: 66

F: 32

Overall: 50%

For how many participants were no complete outcome data available?

Endpoint of follow-up:

A: NR

B: 6 months

C: 3 months

D: NR

E: 13 months

F: 1 month

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

US: 0.78 (0.67, 0.86)

CT: 0.91 (0.84, 0.95)

P=0.017

Outcome measure-2

Specificity

US: 0.83 (0.76, 0.88)

CT: 0.90 (0.85, 0.94)

P=0.037

Outcome measure-3

PPV*

US: 0.82

CT: 0.90

Outcome measure-4

NPV*

US: 0.79

CT: 0.91

Outcome measure-5

Inconclusive results

US: NR

CT: NR

*Not reported, calculated using the overall prevalence and overall sensitivity and specificity
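The footnote's calculation can be reproduced with Bayes' theorem. The sketch below is an illustration, not the guideline authors' own script: the function name is hypothetical, and it assumes the overall prevalence of 50% and the pooled sensitivity and specificity reported above.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled estimates from Van Randen (2008) at the overall prevalence of 50%.
for name, sens, spec in [("US", 0.78, 0.83), ("CT", 0.91, 0.90)]:
    ppv, npv = predictive_values(sens, spec, prevalence=0.50)
    print(f"{name}: PPV={ppv:.2f}, NPV={npv:.2f}")
```

The output matches the tabulated values (US: PPV 0.82, NPV 0.79; CT: PPV 0.90, NPV 0.91). Because both predictive values depend on prevalence, they would shift in a population with a different pretest probability.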

Study quality (ROB): The Quality Assessment of Diagnostic Accuracy Studies included in Systematic Reviews tool was used as a guideline for evaluation of study quality. When applying Quality of Reporting of Meta-analyses criteria for selection, no studies were eligible.

Place of the index test in the clinical pathway:

Diagnostic tool

Choice of cut-off point: not reported

Author’s conclusion

In head-to-head comparison studies of diagnostic imaging, CT had a better test performance than did graded compression US in diagnosing appendicitis. Ignoring the relationship between prevalence (pretest probability) and diagnostic value may lead to an inaccurate estimation of diagnostic performance.

Personal remarks

The two studies with unenhanced CT (oral contrast) both had a lower sensitivity compared to enhanced CT.

Heterogeneity: not reported

Giljaca, 2017

PS., study characteristics and results are extracted from the SR (unless stated otherwise)

SR and meta-analysis

Literature search up to October, 2014

A: Bendeck, 2002

B: Fergusson, 2002

C: Flum, 2005

D: Gökçe, 2011

E: Grodzinski, 2004

F: John, 2011

G: Khanzada, 2009

H: Köksal, 2009

I: Kurane, 2008

J: Memisoglu, 2010

K: Peixoto, 2011

L: Saeed, 2009

M: Sezer, 2012

N: Sharma, 2007

O: Stunell, 2008

P: Uebel, 1996

Q: West, 2006

Study design: cohort

(prospective and retrospective)

Setting and Country:

University Hospital Centre Rijeka, Kresimirova, Croatia

Source of funding and conflicts of interest:

Inclusion criteria SR:

We included studies which evaluated diagnostic accuracy of US for the diagnosis of AA which contained enough data for 2 × 2 table and where the histopathology report of the operative specimen was defined as the reference standard.

Exclusion criteria SR: insufficient data for 2 × 2 table, reference standard other than histopathology report, studies which included <10 patients, studies in which the period between performing the index test and reference standard was longer than one week, and studies older than the year 1994.

17 studies included

Important patient characteristics:

Number of patients

A: 105

B: 176

C: 144

D: 235

E: 112

F: 238

G: 195

H: 184

I: 60

J: 196

K: 156

L: 170

M: 91

N: 118

O: 30

P: 538

Q: 30

Age, mean (range)

A: NR

B: NR

C: NR

D: 28

E: NR

F: 28

G: 28

H: 24

I: NR

J: 27

K: NR

L: NR

M: 31

N: NR

O: NR

P: 28

Q: 27

Sex (% females)

A: 67

B: NR

C: NR

D: 45

E: NR

F: 45

G: 38

H: 48

I: 48

J: 38

K: 47

L: 38

M: 52

N: 43

O: 100

P: 58

Q: 83

Describe index test and cut-off point(s):

Describe reference test and cut-off point(s):

Surgery or follow-up

Prevalence (%)

(based on reference test at specified cut-off point)

A: 91

B: 73

C: 50

D: 82

E: 76

F: 81

G: 88

H: 93

I: 38

J: 83

K: 84

L: 69

M: 85

N: 76

O: 67

P: 68

Q: 57

Median prevalence: 76%

For how many participants were no complete outcome data available?

Endpoint of follow-up:

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

69% (95% CI 59–78%)

Outcome measure-2

Specificity

81% (95% CI 73–88%)

Outcome measure-3

PPV*

92%

Outcome measure-4

NPV*

45%

Outcome measure-5

Inconclusive results

*Not reported, calculated using the median prevalence and overall sensitivity and specificity
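As a worked check of this footnote, the predictive values follow from Bayes' theorem with the median prevalence p = 0.76, sensitivity Se = 0.69 and specificity Sp = 0.81 reported above. The symbols p, Se and Sp are notation introduced here for illustration.

```latex
\begin{align*}
\mathrm{PPV} &= \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)}
  = \frac{0.69 \times 0.76}{0.69 \times 0.76 + 0.19 \times 0.24}
  = \frac{0.5244}{0.5700} \approx 0.92 \\
\mathrm{NPV} &= \frac{Sp\,(1 - p)}{Sp\,(1 - p) + (1 - Se)\,p}
  = \frac{0.81 \times 0.24}{0.81 \times 0.24 + 0.31 \times 0.76}
  = \frac{0.1944}{0.4300} \approx 0.45
\end{align*}
```

The low NPV largely reflects the high median prevalence (76%) in the included studies: with so few patients without appendicitis, even a moderate false-negative rate outweighs the true negatives.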

Study quality (ROB): quality assessment of diagnostic accuracy studies assessment tool (QUADAS-2). High risk-of-bias in all the included studies.

Place of the index test in the clinical pathway: first choice

Choice of cut-off point:

Author’s conclusion

Abdominal ultrasound does not seem to have a role in the diagnostic pathway for diagnosis of AA in suspected patients. The summary sensitivity and specificity of US do not exceed that of physical examination.

Patients that require additional diagnostic workup should be referred to more sensitive and specific diagnostic procedures, such as computed tomography.

Sensitivity analysis of indeterminate results did not show a significant effect on summary results of sensitivity and specificity, i.e., the confidence regions of ROC curves of main results and sensitivity analysis results overlap.

CI: confidence interval; CT: computed tomography; MRI: magnetic resonance imaging; NPV: negative predictive value; NR: not reported; PPV: positive predictive value; SD: standard deviation; SR: systematic review; US: ultrasound

Risk-of-bias assessment diagnostic accuracy studies (QUADAS II, 2011)

Study reference	Patient selection	Index test	Reference standard	Flow and timing	Comments with respect to applicability

Pare, 2016

Was a consecutive or random sample of patients enrolled?

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Yes

Were the index test results interpreted without knowledge of the results of the reference standard?

Unclear

If a threshold was used, was it pre-specified?

Unclear

Is the reference standard likely to correctly classify the target condition?

Unclear

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Yes

Were all patients included in the analysis?

Are there concerns that the included patients do not match the review question?

Yes

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Yes

Are there concerns that the target condition as defined by the reference standard does not match the review question?

Yes

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: HIGH

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: UNCLEAR

CONCLUSION

Could the patient flow have introduced bias?

RISK: HIGH

Repplinger, 2018

Was a consecutive or random sample of patients enrolled?

Yes

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Yes

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: LOW

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: LOW

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: LOW

Leeuwenburgh, 2013

Was a consecutive or random sample of patients enrolled?

Yes

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Not used

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: LOW

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Atema, 2015

Was a consecutive or random sample of patients enrolled?

Yes

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Unclear

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: LOW

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Toorenvliet, 2010

Was a consecutive or random sample of patients enrolled?

Yes

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Unclear

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Yes

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: LOW

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Poortman, 2009

Was a consecutive or random sample of patients enrolled?

Yes

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: LOW

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Lourenco, 2016

Was a consecutive or random sample of patients enrolled?

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: HIGH

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: LOW

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the patient flow have introduced bias?

RISK: UNCLEAR

Parida, 2017

Was a consecutive or random sample of patients enrolled?

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Were the index test results interpreted without knowledge of the results of the reference standard?

Unclear

If a threshold was used, was it pre-specified?

Unclear

Is the reference standard likely to correctly classify the target condition?

Unclear

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

Was there an appropriate interval between index test(s) and reference standard?

Unclear

Did all patients receive a reference standard?

Unclear

Did patients receive the same reference standard?

Were all patients included in the analysis?

Unclear

Are there concerns that the included patients do not match the review question?

Yes

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: HIGH

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the patient flow have introduced bias?

RISK: UNCLEAR

Ziedses, 2016

Was a consecutive or random sample of patients enrolled?

Yes

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Yes

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Unclear

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: LOW

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the patient flow have introduced bias?

RISK: LOW

Ziedses, 2016

Was a consecutive or random sample of patients enrolled?

Unclear

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: LOW

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the patient flow have introduced bias?

RISK: LOW

Evidence table for diagnostic test accuracy studies

Study reference

Study characteristics

Patient characteristics

Index test

(test of interest)

Reference test

Follow-up

Outcome measures and effect size

Comments

Pare, 2015

Type of study: retrospective cohort analysis

Setting: The study was conducted at an urban academic medical center that has more than 130,000 annual visits.

Country: USA

Conflicts of interest: not reported

Inclusion criteria: Consecutive young male patients, 18–39 years old, who had been hospitalized from the ED between June 2006 and September 2011 with an admitting diagnosis of appendicitis and who had undergone a surgical procedure.

Exclusion criteria: Women were excluded from the study because of the possibility that pelvic pathology could have complicated the diagnosis. Patients were also excluded from the study if they had undergone imaging prior to their ED evaluation or did not undergo surgery.

N=451

Of the 451 patients with appendicitis, 86 had undergone US examination (39 underwent only initial US, and 47 underwent both initial US and subsequent CT). Of the remaining patients, 306 had undergone CT alone, and 59 had had no imaging performed.

Prevalence:

US: 94%

CT: 98.4%

US+CT: 98%

Mean age ± SD:

US: 22.2 (5.3)

CT: 28.7 (5.5)

US+CT: 20.3 (3.5)

BMI, mean ± SD:

US: 25.4 (5.4)

CT: 27.1 (5.4)

US+CT: 26.1 (5.3)

US, CT or US+CT

Describe index test:

US scanning was performed by using Philips iU22 machines (Philips Ultrasound, Bothell, WA) equipped with L12–5 and C5–1 transducers.

Cut-off point(s):

If the appendix was visualized and there was any suggestion of appendicitis, the results were considered positive, even if the dictated radiology report was equivocal. If the appendix was not visualized, the results were considered negative.

Comparator test:

CT, no information provided.

Of the 39 patients whose results were negative for appendicitis on initial US, 7 underwent no additional imaging but proceeded directly to surgery; of those 7 patients, 4 were found to have a pathologically normal appendix (ie, false-positive US results). The remaining 32 patients with negative results for appendicitis on initial US subsequently underwent diagnostic CT scanning; their CT results were classified as true positives (ie, false-negative US results).

Describe reference test:

All results were compared with those in pathology reports.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

US: 57% (95% CI 46–67.6)

Outcome measure-2

Specificity

Outcome measure-3

PPV

US: 98% (95% CI 93.8–100)

Outcome measure-4

NPV

Outcome measure-5

Inconclusive results

Not reported

Author’s conclusion

Screening US should be considered first for diagnosing appendicitis because of its high positive predictive value, but even if US results are negative for appendicitis, one should not exclude the possible existence of pathology because US has poor sensitivity in this situation. We speculate that the use of screening US can decrease radiation exposure, imaging costs, and LOS.

Personal remarks

No outcomes reported for CT or US+CT.

Repplinger, 2018

Type of study: prospective cohort study

Setting: academic medical center

Country: USA

Conflicts of interest: Activities related to the present article: disclosed no relevant relationships.

Inclusion criteria: Patients were eligible for enrollment if they were at least 12 years old and had been ordered to undergo CT for evaluation for appendicitis during study hours (in general, weekdays 7 am to 11 pm and weekends 7 am to 3 pm until May 2014, at which point an in-house MR technologist was available 24 hours per day).

Exclusion criteria: contraindications to either gadolinium-based contrast material administration or MR imaging (eg, metallic implant) or the inability to provide informed consent or assent.

N=198

Prevalence: 32.3% (64 of 198)

Mean age ± SD: 31.6 (14.2)

Sex: 58% F

Describe index test:

CT vs MRI

CT examinations of the abdomen and pelvis were performed with a 64 × 0.625-mm detector configuration scanner (GE Healthcare, Waukesha, Wis) after administration of oral contrast material (Gastrografin; Bracco Diagnostics, Princeton, NJ) and intravenous iohexol (Omnipaque 300; GE Healthcare); imaging was performed in the portal venous phase (SmartPrep with automated scan initiation).

MRI

MR imaging was performed with clinical 1.5-T units (Signa HDxt with a CRM or TwinSpeed gradients, Discovery MR450w; GE Healthcare) by using an eight-channel or 12-channel phased-array body coil.

Cut-off point(s):

The study included interpretations by three radiologists; the consensus interpretation (meaning that at least two radiologists agreed on the presence or absence of appendicitis) was used as the primary outcome measure. A score of 3 or higher was set a priori as reflecting a positive test result for image interpretation.

Describe reference test:

A composite reference standard of surgical and histopathologic results and clinical follow-up was used, arbitrated by an expert panel of three investigators.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

32 of the 230 (13.9%)

Reasons for incomplete outcome data described?

Incomplete MRI n=6

Lost to follow up n=6

Training scans n=20

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

CT: 98.4% (95% CI: 90.5%, 99.9%)

MRI: 96.9% (95% CI: 88.2%, 99.5%)

Outcome measure-2

Specificity

CT: 89.6% (95% CI: 82.8%, 94.0%)

MRI: 81.3% (95% CI: 73.5%, 87.3%)

Outcome measure-3

PPV

CT: 81.8 (95% CI 71.0, 89.4)

MRI: 71.3 (95% CI 60.4, 80.2)

Outcome measure-4

NPV

CT: 99.2 (95% CI 94.8, 100)

MRI: 98.2 (95% CI 93.0, 99.7)

Outcome measure-5

Inconclusive results

Not reported

Author’s conclusion

The diagnostic accuracy of MR imaging was similar to that of CT for the diagnosis of acute appendicitis.
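The confidence intervals in this table are taken from the publications, which do not always state how they were computed. As a minimal sketch of how a 95% CI for a sensitivity or specificity can be obtained from raw counts, the example below uses the Wilson score interval. This choice is an assumption: the study authors may have used an exact method, so results can differ slightly. The function name `wilson_ci` and the counts in the example are hypothetical.

```python
import math


def wilson_ci(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts: 63 true positives among 64 patients with the condition.
low, high = wilson_ci(63, 64)
print(f"Sensitivity {63 / 64:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The Wilson interval is used here because it behaves sensibly for proportions close to 0% or 100%, which is common for sensitivity and NPV in this table.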

Leeuwenburgh, 2013

Type of study: prospective cohort study

Setting: The study was performed in an academic hospital and five large teaching hospitals in the Netherlands.

Country: The Netherlands

Conflicts of interest: Financial activities related to the present article: none to disclose.

Inclusion criteria: adult patients (age ≥18 years) who, prior to imaging, were clinically suspected of having acute appendicitis on the basis of medical history and physical and laboratory examination findings.

Exclusion criteria: Patients with a contraindication for MR imaging and critically ill patients who needed intensive vital organ function monitoring were not invited to participate in this study. Pregnant women were also excluded because the imaging protocol included CT.

N=230

Prevalence: 51.3%

Mean age:

Men: 36 (IQR 25-50)

Women: 34 (IQR 24-49)

Sex: 40% M / 60% F

Describe index test:

Conditional CT/MRI

A staff radiologist or radiologic resident performed a US examination by using the graded compression technique. A curved 3.5–5.0 MHz array and a linear 10-MHz array were used.

In case of negative or inconclusive US imaging results, CT was performed.

All CT scans were performed by using a 4, 16, or 64 multi–detector row CT scanner. CT protocols were based on the following: effective level of 165 mAs, 120 kV, maximum 2.5-mm collimation, maximum 3-mm section width, 0.5-second rotation time, and a 125-mL intravenous contrast agent injection after a 60-second delay at 3 mL/sec. No oral or rectal contrast agents were used.

MRI

For study purposes, all patients underwent MR imaging (breath-hold rapid acquisition with relaxation enhancement and spectral selection attenuated inversion recovery rapid acquisition with relaxation enhancement, free breathing diffusion-weighted imaging, without contrast agent) within 2 hours, in addition to the conventional imaging with US and/or CT. The examinations were performed with a 1.5-T MR imager (Magnetom Avanto 1.5 T, Siemens Medical Systems, Forchheim, Germany; or Intera 1.5 T, Philips Medical Systems, Best, the Netherlands).

Cut-off point(s):

We did not specify strict radiologic criteria for the diagnosis of appendicitis; readers evaluated the appendix diameter, presence of an appendicolith, periappendiceal fat infiltration, periappendiceal fluid, absence of gas in the appendix, destruction of the appendiceal wall structure, restricted diffusion of the appendiceal wall, restricted diffusion of the appendiceal lumen, and restricted diffusion of focal fluid collections. The final judgment on the presence of appendicitis was left to the radiologist or supervised radiologic resident who interpreted the images.

Describe reference test:

Expert panels were composed of two surgeons and one radiologist who assigned a final diagnosis based on histopathologic findings or clinical information, imaging findings from US and CT, surgery, and at least 3 months follow-up.

Time between the index test and reference test:

At least 3 months.

For how many participants were no complete outcome data available? The imaging protocol was violated in one patient who had direct CT performed without initial US imaging, and in seven patients in whom no CT was performed after a negative or inconclusive US examination. In seven patients, an MR examination could not be performed because of claustrophobia or unexpected technical failure. The imaging results of these patients were included in data analysis.

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

US only: 70%

Conditional CT: 97%

Conditional MRI: 98%

Immediate MRI: 97%

Outcome measure-2

Specificity

US: 94%

Conditional CT: 91%

Conditional MRI: 88%

MRI: 93%

Outcome measure-3

PPV*

US: 93%

Conditional CT: 91%

Conditional MRI: 88%

MRI: 93%

Outcome measure-4

NPV*

US: 80%

Conditional CT: 97%

Conditional MRI: 98%

MRI: 96%

*Not reported, calculated.

Outcome measure-5

Inconclusive results

US: 106

Conditional CT: 5

(conditional) MRI: 0

Author’s conclusion

The accuracy of conditional or immediate MR imaging was similar to that of conditional CT in patients suspected of having appendicitis, which implied that strategies with MR imaging may replace conditional CT for appendicitis detection.
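Where PPV and NPV are marked as calculated rather than reported, the table does not state the method. One common approach, shown here as a minimal sketch, derives both from sensitivity, specificity and prevalence with Bayes' theorem. Values computed from the underlying 2×2 counts may differ from this because of rounding. The function name `predictive_values` and the example inputs are hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity and prevalence (all 0..1)."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    # Expected fractions of the whole population in each 2x2 cell.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive value undefined: no positive or no negative results")
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test with 90% sensitivity and specificity at two prevalences.
for prevalence in (0.50, 0.10):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

The two prevalences show why predictive values from studies with very different prevalence cannot be compared directly: the same test yields a much lower PPV when the condition is rarer.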

Atema, 2015

Type of study: prospective cohort study

Setting: 5 hospitals (2 academic)

Country: The Netherlands

Conflicts of interest: The authors of this manuscript declare no relationships with any companies, whose products or services may be related to the subject matter of the article. This study has received funding by The Dutch Organization for Health Research and Development (ZonMW).

Inclusion criteria: adult patients presenting with acute abdominal pain at the emergency department between March 2005 and November 2006. Only patients with a clinical suspicion of acute appendicitis, based on medical history, physical examination, and laboratory tests, were included in the analyses.

Exclusion criteria:-

N=422

US+CT: 199 (47%)

Prevalence: 59%

Mean age (range): 40 (19-89)

Sex: 54% F

Describe index test:

Conditional CT strategy (CT only after inconclusive or negative ultrasound findings)

All standardized ultrasound examinations were performed using a curved 3.5–5.0 MHz array and a linear 10 MHz array.

Cut-off point(s):

Comparator test:

Immediate CT strategy (CT in all without prior ultrasound)

The CT parameters for the different CT systems in the original multicenter study were effective mAs level 165, 120 kV, (4×) 2.5-mm collimation, (4×) 3-mm slice width and 0.5-s rotation time, and 125 ml iodinated contrast was given intravenously at 3 ml/s after a 60-s delay. No orally or rectally administered contrast agents were used.

Cut-off point(s):

Describe reference test:

A final diagnosis of acute appendicitis was predominantly based on surgical findings, obtained histopathology, and follow-up data.

Cut-off point(s):

Time between the index test and reference test:

at least 6 months

For how many participants were no complete outcome data available?

1,101 consecutive patients presented with acute abdominal pain, 80 of whom had to be excluded because of incomplete case record forms. Of the remaining 1,021 patients, 422 (41 %) had a clinical suspicion of acute appendicitis

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

Conditional CT: 96% (95% CI 93-98)

Immediate CT: 95% (95% CI 91-97)

Outcome measure-2

Specificity

Conditional CT: 77% (95% CI 70-83)

Immediate CT: 87% (95% CI 81-92)

Outcome measure-3

PPV

Conditional CT: 86% (95% CI 81-90)

Immediate CT: 92% (95% CI 87-95)

Outcome measure-4

NPV

Conditional CT: 93% (95% CI 87-96)

Immediate CT: 92% (95% CI 86-95)

Outcome measure-5

Inconclusive results

Author’s conclusion

A conditional CT strategy correctly identifies as many patients with appendicitis as an immediate CT strategy, and can halve the number of CTs needed. However, conditional CT imaging results in more false positives.

Toorenvliet, 2010

Type of study: prospective cohort study

Setting: middle-sized teaching hospital

Country: The Netherlands

Conflicts of interest: not reported

Inclusion criteria: All consecutive patients with acute abdominal pain evaluated at the ED by a resident of the surgical department between June 2005 and July 2006 were included in the study.

Exclusion criteria: Patients who were evaluated at another hospital for the same complaint, patients with abdominal pain due to trauma, and those who had undergone additional radiological examination (US or CT) prior to surgical consultation were excluded.

N=802

Appendicitis: after primary evaluation at the ED, 164 patients were suspected to have appendicitis. The proposed strategy was an open appendectomy 99 times, an admission to the hospital for re-evaluation 32 times, and an outpatient re-evaluation the next day 33 times. A total of 139 patients underwent additional radiological imaging after the primary evaluation. Of these, 117 patients had US only, 2 patients had CT only, and 20 patients had US as well as CT. Twenty-five patients did not undergo additional radiological imaging on the day of the primary evaluation.

Prevalence: Of the 250 patients who had appendicitis as the referring diagnosis, 78 (31.2%) were ultimately determined to have appendicitis.

Characteristics are provided only for patients with a final diagnosis of appendicitis.

Mean age ± SD: 22.5 (18.1)

Sex: 41.2% F

Other important characteristics:

Describe index test:

Routine US, limited CT and clinical re-evaluation

An initial management proposal was then made based on the clinical diagnosis. All clinical parameters, the clinical diagnosis and strategy were registered on a study form. After conferring with the consulting surgeon about each case (mostly over the phone), a decision was made whether or not to perform additional radiological examination. US was always the primary examination of choice. It was, however, at the radiologist’s discretion to decide if CT would be a more suitable primary examination when taking into account the patient characteristics (i.e., a high BMI) and the nature of the suspected condition (e.g., acute mesenteric ischemia). When an US was inconclusive, a CT of the abdomen was subsequently made.

For US, the abdomen was examined with an ATL HDI 5000 US system (Philips Medical Systems). All abdominal organs were examined, with special attention to the appendix, using the graded compression technique. For CT a GE LightSpeed QX/i 4-slice CT (Milwaukee, WI) was used.

Cut-off point(s):

Describe reference test:

The final diagnosis (FD) was based on intraoperative findings or pathological examination of the resected organs. If patients did not undergo an operation, the final diagnosis was made by the clinical and/or radiological diagnosis in combination with the clinical response to medical therapy at standard re-evaluation and follow-up.

All patients that were not admitted to the surgical ward after surgical consultation at the ED were given appointments for re-evaluation at the outpatient clinic within 24 h. There, the diagnosis and management strategies were reassessed by the consultant surgeon or a surgical resident under the supervision of a consultant surgeon. Additional radiological examinations were carried out if deemed necessary.

Cut-off point(s):

Time between the index test and reference test: 24h

For how many participants were no complete outcome data available?

During the study period 972 patients were evaluated. Of these, 49 patients (5.0%) were excluded when they did not show up for their re-evaluation appointment, and another 121 (12.4%) patients were excluded as their study forms were missing or incomplete. For 23 of these patients no follow-up details were acquired (2.4%). Of the 147 patients excluded from analysis for whom follow-up was successful, seven had acute appendicitis and were treated at our own hospital. They were excluded because the study forms were missing or incomplete and therefore the effect of diagnostics on management could not be assessed. In total, 802 patients were eligible for analysis.

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

US only: 0.91

Conditional CT: 0.93

Outcome measure-2

Specificity

US only: 0.98

Conditional CT: 0.99

Outcome measure-3

PPV

US only: 0.94

Conditional CT: 0.98

Outcome measure-4

NPV

US only: 0.97

Conditional CT: 0.98

Outcome measure-5

Inconclusive results

Author’s conclusion

A diagnostic pathway using routine US, limited CT, and clinical re-evaluation for patients with acute abdominal pain can provide excellent results for the diagnosis and treatment of appendicitis.

Personal remarks

Of the patients diagnosed with appendicitis, 69 were <17 years old and 50 were adults.

Poortman, 2009

Type of study: prospective cohort study

Setting: All patients between the ages of 18 and 80 years who had presented to the emergency department with symptoms of acute appendicitis were eligible for this study.

Country: The Netherlands

Conflicts of interest: not reported

Inclusion criteria:

Patients with typical signs of acute appendicitis (ie, history, physical examinations findings, and laboratory test results) who needed acute operation (within 24 hours) and who had been admitted between 8 AM and 10 PM were included in the study.

Exclusion criteria: Pregnant patients, patients with claustrophobia, and patients with a previous appendectomy were not included.

N=151

US: 151

CT: 60

Prevalence: 60.9%

Mean age (range):

US: 29 (18-80)

CT: 30 (18-74)

Sex:

US: 44% Male

CT: 39% Male

Other important characteristics:

BMI, mean (range)

US: 23.6 (15.6-40.7)

CT: 25.9 (17.1-40.7)

Describe index test:

US and conditional CT if US was negative or inconclusive

US (HDI 3000; ATL-Philips Medical Systems) was performed using the graded-compression technique, with 3.5-MHz and 5-MHz convex- and 7.5-MHz linear-array transducers, according to body size.

All multidetector CT examinations were performed by using an eight-detector-row CT machine (Philips Medical Systems). Scanning was performed with the following parameters: 0.5 seconds per rotation time, 2-mm collimation, and 40 mm/sec table increment (pitch 1.25).

Cut-off point(s):

Both US and CT assessments were based on criteria derived from reports in the literature (1, 15). Direct visualization of an incompressible appendix with an outer diameter 6 mm and echogenic incompressible periappendicular inflamed tissue with or without an appendicolith was the primary criterion to establish a diagnosis of acute appendicitis. A fluid-filled appendix, hyperemia within the appendiceal wall at color Doppler sonography, pericecal fluid, and abscess were considered as possible positive criteria for acute appendicitis. US was considered negative for appendicitis only if a normal appendix could be entirely identified. If the appendix could not be visualized, the result of US was considered inconclusive and an additional CT was performed.

Comparator test:

Cut-off point(s):

Describe reference test:

The reference standard was operation or conservative treatment. Imaging tests and therapy (hospitalization for operation, observation before discharge from hospital) were performed within 6 to 12 hours of patient arrival at the emergency department. Diagnostic performances of US and CT were compared with the reference standard for each patient.

Time between the index test and reference test:

6-12h

For how many participants were no complete outcome data available?

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

US only: 0.91

Conditional CT: 0.93

Outcome measure-2

Specificity

US only: 86% (95% CI 76-93)

Pathway: 86% (95% CI 76-93)

Outcome measure-3

PPV

US only: 90% (95% CI 81-95)

Pathway: 92% (95% CI 85-96)

Outcome measure-4

NPV

US only: 71% (95% CI 60-80)

Pathway: 100% (95% CI 93-97)

Outcome measure-5

Inconclusive results

US only: 29 (19.2%)

Author’s conclusion

A diagnostic pathway using primary graded compression US and complementary multidetector CT in a general community teaching hospital yields a high diagnostic accuracy for acute appendicitis without adverse events from delay in treatment. Although US is less accurate than CT, it can be used as a primary imaging modality, avoiding the disadvantages of CT. For those patients with negative US and CT findings, observation is safe.

Lourenco, 2016

Type of study: retrospective cohort study

Setting: large quaternary hospital within the Providence Health Care authority (St. Paul's or Mount St. Joseph Hospitals) in Vancouver

Country: Canada

Conflicts of interest: None declared

Inclusion criteria: Only adult patients who received sonographic investigation as the initial imaging modality were included

Exclusion criteria:

Patients imaged with other modalities or who did not receive imaging were excluded

N=354

Prevalence: 19.8%

Mean age: 30.5 years

Sex: 76.3% Female

Describe index test:

Ultrasound

Cut-off point(s):

Positive: 6-mm or larger diameter aperistaltic, non-compressible hyperemic blind-ending structure with origin adjacent to the cecal pole

Negative: complete visualization of the compressible blind-ending structure with diameter less than 6 mm adjacent to the cecal pole

Equivocal: appendix not identified

Describe reference test:

US was compared to CT or MR imaging, and US to surgical results. Patients who did not undergo surgery were considered to not have had appendicitis, with discharge notes considered gold standard. Patient re-admission to hospital or subsequent emergency visits within the subsequent 3 months were recorded.

Time between the index test and reference test:

3 months

For how many participants were no complete outcome data available?

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

48.4%

Outcome measure-2

Specificity

97.9%

Outcome measure-3

PPV

83.8%

Outcome measure-4

NPV

89.6%

Outcome measure-5

Inconclusive results

288 (81.4%)

Author’s conclusion

Ultrasound is a test commonly performed in the workup of acute appendicitis. Unfortunately, despite its benefits, the utility of ultrasound in this setting is often compromised due to commonly yielding an equivocal result. In this retrospectively designed study, we attempted to explore the role ultrasound plays in the diagnosis of appendicitis. We demonstrate that even equivocal ultrasound results, when combined with simple clinical factors, can be useful in the exclusion of appendicitis, with an excellent NPV (96.2%) when pretest risk factors indicate a low pretest likelihood of disease.
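The authors' point that a negative result is most useful when the pretest likelihood is low can be illustrated with likelihood ratios. The following is a minimal sketch that assumes sensitivity and specificity stay fixed across risk groups, which is a simplification. It uses the sensitivity (48.4%) and specificity (97.9%) from this row. It does not reproduce the study's 96.2% NPV, which referred to equivocal results combined with clinical factors. The function names and the pretest probabilities are hypothetical.

```python
def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = P(negative test | disease) / P(negative test | no disease)."""
    if not 0.0 <= sensitivity <= 1.0 or not 0.0 < specificity <= 1.0:
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1]")
    return (1 - sensitivity) / specificity


def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a post-test probability via odds."""
    if not 0.0 <= pretest_probability < 1.0:
        raise ValueError("pretest_probability must be in [0, 1)")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


lr_negative = negative_likelihood_ratio(0.484, 0.979)
# Hypothetical pretest probabilities for low, intermediate and high suspicion.
for pretest in (0.05, 0.20, 0.50):
    remaining = post_test_probability(pretest, lr_negative)
    print(f"pretest {pretest:.0%}: probability of appendicitis "
          f"after a negative US {remaining:.1%}")
```

With a low-sensitivity test the negative likelihood ratio is only moderately below 1, so a negative result leaves a substantial residual probability unless the pretest probability was already low.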

Parida, 2017

Type of study: prospective cohort study

Setting: Department of Radiodiagnosis in SCB Medical College Hospital (MCH), Cuttack

Country: India

Conflicts of interest: Not reported

Inclusion criteria: All the patients with clinical and laboratory diagnosis of acute appendicitis were included in the study.

Exclusion criteria:

Patients who were unfit for the surgery, cases with appendicular lump, cases with peritonitis, and recurrent appendicitis were excluded from the study. Patients more than 75 years of age and uncooperative patients were also excluded from the study.

N=100

Prevalence: 77%

Age: range 2-67

Sex: M:F ratio 1:5

Describe index test:

Ultrasound

First, general survey of the patient’s abdomen was performed with 2-5 MHz curvilinear probe; then, the examination of the right lower quadrant by graded compression technique with 3-12 MHz linear probe was done.

Cut-off point(s):

Describe reference test:

Pre-operative findings and histopathology reports

Cut-off point(s):

Time between the index test and reference test: NR

For how many participants were no complete outcome data available?

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

96.1%

Outcome measure-2

Specificity

95.7%

Outcome measure-3

PPV

89.7%

Outcome measure-4

NPV

88%

Outcome measure-5

Inconclusive results

Author’s conclusion

High resolution sonography with graded compression is a very useful diagnostic tool for diagnosis of appendicitis in problematic cases and in women in their reproductive period. It is also helpful in detecting complications of appendicitis and other abdominal diseases that mimic acute appendicitis.

Ziedses, 2016

Type of study: prospective cohort study

Setting: large regional teaching hospital

Country: The Netherlands

Conflicts of interest: Not reported

Inclusion criteria:

Female patients aged 12 to 55 years who presented at the emergency department with a clinical suspicion of appendicitis.

Exclusion criteria: Patients were excluded if informed consent was not obtained, if they were pregnant, or if they had a known contraindication for MRI.

N=112

Prevalence: 26%

Median age (range): 22 (12-54)

Describe index test:

Unenhanced MRI

At inclusion the patients underwent a complete routine surgical examination including patients’ history, physical examination and blood tests. After this workup all patients underwent MRI.

All patients underwent MRI on a 1.5-Tesla superconductive magnet (Gyroscan Intera, Philips Medical Systems, The Netherlands). T2-weighted Turbo Spin Echo images in coronal and sagittal direction and transverse T1-weighted Gradient Echo images were obtained.

Cut-off point(s):

Describe reference test:

The definitive histological diagnosis or outcome at four months’ follow-up.

Time between the index test and reference test:

4 months

For how many participants were no complete outcome data available?

Sixteen out of 128 patients were excluded from this study; nine patients underwent emergency surgery, six of whom had appendicitis and for seven other patients the MRI system was not available.

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

89%

Outcome measure-2

Specificity

100%

Outcome measure-3

PPV

100%

Outcome measure-4

NPV

96%

Outcome measure-5

Inconclusive results

9 (8%)

Author’s conclusion

We believe that MRI should perhaps be standard in all female patients during their reproductive years with suspected appendicitis. It avoids an operation in 32% of cases and allows earlier planning for patients with an equivocal clinical picture.
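The predictive values in the Ziedses (2016) entry above can be cross-checked against the reported sensitivity, specificity and prevalence with Bayes' rule. The sketch below is illustrative only: the function name `predictive_values` is hypothetical, and the inputs are the rounded figures from the table (sensitivity 89%, specificity 100%, prevalence 26%), so small rounding differences from the study's own counts are possible.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) computed from sensitivity, specificity and prevalence.

    All arguments are proportions between 0 and 1. The denominators are
    assumed to be non-zero (i.e. the test yields some positive and some
    negative results).
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded figures taken from the evidence table entry above.
ppv, npv = predictive_values(sensitivity=0.89, specificity=1.00, prevalence=0.26)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")  # prints "PPV = 100%, NPV = 96%"
```

The result agrees with the reported PPV of 100% and NPV of 96%. The same function can be re-run with other prevalence values to see how the predictive values shift with the pre-test probability.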

Petkovska, 2016

Type of study: retrospective cohort study

Setting: University of Arizona College

Country: USA

Conflicts of interest: None declared

Inclusion criteria:

(a) came to the ED with a clinical presentation of appendicitis and (b) underwent an accelerated MR imaging protocol.

Exclusion criteria: (a) were younger than 3 years or 50 years or older and (b) did not meet the described inclusion criteria

N=403: 253 adults and 150 children, including 48 pregnant women.

Prevalence: 16.6%

Mean age (range): 21 (3-49)

Sex: 25% male

Other important characteristics:

Describe index test:

MRI

All MR examinations were performed with either a 1.5- or 3.0-T system (Magnetom Aera or Magnetom Skyra; Siemens Medical Solutions, Erlangen, Germany) equipped with 45 mT/m gradients operating at a slew rate of 200 T/m/sec. Two 18-channel torso phased-array anterior coils and a 32-channel table-integrated posterior coil were used for signal reception.

Cut-off point(s):

Findings were classified as (a) definitely acute appendicitis, (b) probably acute appendicitis, (c) indeterminate, (d) probably not acute appendicitis, and (e) definitely not acute appendicitis. To facilitate statistical analysis, all patients with an MR imaging report that described the findings as “definitely acute appendicitis” or “probably acute appendicitis” were classified as having positive findings for acute appendicitis. All patients with an MR imaging report that described the findings as “definitely not acute appendicitis” or “probably not acute appendicitis” were classified as having negative findings for acute appendicitis.
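The dichotomisation of the five report categories described above can be sketched as a simple lookup. This is an illustration, not the study's analysis code: the function name `dichotomize` is hypothetical, and returning `None` for "indeterminate" is an assumption, since the entry does not state how indeterminate reports were handled.

```python
from typing import Optional

# Report categories that the entry above counts as positive or negative.
POSITIVE = {"definitely acute appendicitis", "probably acute appendicitis"}
NEGATIVE = {"definitely not acute appendicitis", "probably not acute appendicitis"}


def dichotomize(category: str) -> Optional[bool]:
    """Map a five-level MR report category to True (positive) or False (negative).

    Returns None for "indeterminate"; this handling is an assumption,
    because the source does not describe it.
    """
    label = category.strip().lower()
    if label in POSITIVE:
        return True
    if label in NEGATIVE:
        return False
    if label == "indeterminate":
        return None
    raise ValueError(f"unknown report category: {category!r}")


print(dichotomize("Probably acute appendicitis"))
print(dichotomize("definitely not acute appendicitis"))
```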

Describe reference test:

In all patients who underwent surgical intervention, intraoperative surgical and histopathologic assessment served as the reference standard for the presence or absence of acute appendicitis. In patients who did not undergo surgical intervention, clinical follow-up served as the reference standard. Clinical follow-up was obtained via (a) phone interview of patients, with a minimum of 8 weeks follow-up after MR imaging to determine any interval appendectomy or diagnosis of acute appendicitis, or (b) medical record review, with a minimum of 6 months follow-up after MR imaging.

Time between the index test and reference test:

6 months

For how many participants were no complete outcome data available?

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

Children: 96.9% (95% CI 83.8% to 99.9%)

Adults: 97.1% (95% CI 85.1% to 99.9%)

Pregnant: 85.7% (95% CI 42.1% to 99.6%)

Outcome measure-2

Specificity

Children: 99.2% (95% CI 95.4% to 100%)

Adults: 99.5% (95% CI 97.5% to 100%)

Pregnant: 100% (95% CI 91.4% to 100%)

Outcome measure-3

PPV

Children: 96.9% (95% CI 83.3% to 99.9%)

Adults: 97.1% (95% CI 85.1% to 99.9%)

Pregnant: 100% (95% CI 54.1% to 100%)

Outcome measure-4

NPV

Children: 99.2% (95% CI 95.4% to 100%)

Adults: 99.5% (95% CI 97.5% to 100%)

Pregnant: 97.6% (95% CI 87.4% to 99.9%)

Outcome measure-5

Inconclusive results

Author’s conclusion

MR imaging is a highly sensitive and specific test in the evaluation of patients younger than 50 years with acute RLQ pain that uses a rapid imaging protocol performed without intravenous or oral contrast material.

Personal remarks

There were 150 patients aged 18 years or younger (mean age, 12 years; age range, 3–18 years) and 48 pregnant patients (mean age, 25 years; age range, 15–37 years).
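The wide confidence intervals in the pregnant subgroup of Petkovska (2016) reflect the small number of cases. The sketch below computes an exact (Clopper-Pearson) binomial interval using only the standard library. Two things are assumptions and not stated in the entry: that the study used an exact binomial method, and that the 85.7% sensitivity corresponds to 6 of 7 cases (inferred from the percentage). The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for x successes in n trials."""

    def solve(g):
        # g is increasing on [0, 1]; bisect for the point where g equals alpha / 2.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if g(mid) < alpha / 2:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else solve(lambda p: 1 - binom_cdf(x - 1, n, p))
    # Upper bound: the p at which P(X <= x) equals alpha / 2, solved in q = 1 - p.
    upper = 1.0 if x == n else 1 - solve(lambda q: binom_cdf(x, n, 1 - q))
    return lower, upper


# Assumed counts: 6 true positives among 7 pregnant patients with appendicitis.
low, high = clopper_pearson(6, 7)
print(f"sensitivity 6/7: 95% CI {low:.1%} to {high:.1%}")  # 42.1% to 99.6%
```

Under these assumptions the interval matches the reported 42.1 to 99.6, which suggests, but does not confirm, that exact binomial intervals were used.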

Accountability

Review date and validity

Last reviewed: 01-07-2019

Module	Coordinating body/bodies	Year of authorization	Next assessment of guideline currency	Frequency of currency assessment	Who monitors currency	Relevant factors for changes in recommendation
Diagnostics in adults	NVvH and NVvR	2019	2024	Once every 5 years	NVvH and NVvR	-

The working group has not been kept in place to assess whether this guideline is still current. By 2024 at the latest, the board of the Nederlandse Vereniging voor Heelkunde will determine whether the modules of this guideline are still current. A maintenance plan has been described at module level. In drafting the guideline, the working group estimated for each module the maximum period within which reassessment should take place and formulated any points of attention relevant to a future revision (update). The guideline's validity lapses earlier if new developments give cause to start a revision process.

The Nederlandse Vereniging voor Heelkunde is the coordinating body for this guideline and bears primary responsibility for assessing whether it is still current. The other scientific societies participating in this guideline, and users of the guideline, share this responsibility and inform the coordinating body of relevant developments in their fields.

Initiative and authorization

Initiative:

Nederlandse Vereniging voor Heelkunde

Authorized by:

Nederlandse Vereniging van Spoedeisende Hulp Artsen
Nederlandse Vereniging voor Heelkunde
Nederlandse Vereniging voor Kindergeneeskunde
Nederlandse Vereniging voor Medische Microbiologie
Nederlandse Vereniging voor Obstetrie en Gynaecologie
Nederlandse Vereniging voor Pathologie
Nederlandse Vereniging voor Radiologie
Patiëntenfederatie Nederland

General information

Guideline development was supported by the Kennisinstituut van de Federatie Medisch Specialisten (https://www.demedischspecialist.nl/kennisinstituut) and was funded from the Kwaliteitsgelden Medisch Specialisten (SKMS). The funder had no influence whatsoever on the content of the guideline.

Aim and target group

Aim

This guideline is intended to establish an evidence-based policy for the care of patients with acute appendicitis in secondary care.

Target group

This guideline is written for all members of the professional groups involved in the care of patients with acute appendicitis, both children and adults. These include surgeons, paediatric surgeons, radiologists, paediatricians, gynaecologists and emergency physicians. A secondary target group consists of primary care providers involved in the care of patients with acute appendicitis, including general practitioners, nurse practitioners and physician assistants.

Composition of the working group

For the revision of the guideline, a multidisciplinary working group was established in 2017, consisting of representatives of all relevant specialties involved in the care of patients with acute appendicitis.

Working group:

Dr. C.C. van Rossem, gastrointestinal surgeon, Maasstad Ziekenhuis, on behalf of NVvH (chair)
Drs. A.L. van den Boom, fellow in gastrointestinal surgery, UMCG, on behalf of NVvH
Drs. W.J. Bom, physician-researcher in surgery, Amsterdam UMC location AMC, on behalf of NVvH
Drs. M.E. Bos, resident in emergency medicine, VUmc region, Westfriesgasthuis location, on behalf of NVSHA
Dr. A.A.W. van Geloven, gastrointestinal surgeon, Tergooi, on behalf of NVvH
Dr. R.R. Gorter, fellow in paediatric surgery, Amsterdam UMC, on behalf of NVvH
Dr. B.C. Jacod, gynaecologist-perinatologist, OLVG, on behalf of NVOG
Drs. M. Knaapen, physician-researcher in paediatric surgery, Amsterdam UMC, on behalf of NVvH
R. Lammers, MSc, policy adviser, Patiëntenfederatie Nederland
Drs. A.H.J. van Meurs, general paediatrician, HagaZiekenhuis, on behalf of NVK
Dr. J. Nederend, radiologist, Catharina Ziekenhuis Eindhoven, on behalf of NVvR
Dr. J.B.C.M. Puylaert, radiologist, Haaglanden Medisch Centrum, on behalf of NVvR

Composition of the sounding board group:

Dr. A.K. van der Bij, clinical microbiologist, Diakonessenhuis, NVMM
Dr. R. Bakx, paediatric surgeon, Amsterdam UMC, on behalf of NVvH

With support from:

Dr. S.N. Hofstede, adviser, Kennisinstituut van de Federatie Medisch Specialisten
L. Boerboom MSc, literature specialist, Kennisinstituut van de Federatie Medisch Specialisten
D.M.J. Tennekes, executive secretary, Kennisinstituut van de Federatie Medisch Specialisten

Declarations of interest

The KNMG code for preventing improper influence due to conflicts of interest was followed. All working group members declared in writing whether, in the past three years, they had had direct financial interests (employment with a commercial company, personal financial interests, research funding) or indirect interests (personal relationships, reputation management, knowledge valorisation). An overview of the working group members' interests and the assessment of how any interests were handled is given in the table below. The signed declarations of interest can be requested from the secretariat of the Kennisinstituut van de Federatie Medisch Specialisten.

Working group member	Position	Additional positions	Declared interests	Action taken
Van Rossem	Gastrointestinal surgeon, Maasstad Ziekenhuis	None	None	No action
Van Geloven	Gastrointestinal surgeon, Tergooi	None	None	No action
Gorter	Fellow in paediatric surgery, Amsterdam UMC	Researcher in paediatric surgery, VUmc & AMC	Project leader of the APAC study (non-operative treatment of appendicitis in children). ZonMw file number: 843002708	No action
Van Meurs	General paediatrician, Juliana Kinderziekenhuis (HagaZiekenhuis)	Teaching medical students at LUMC	None	No action
Jacod	Gynaecologist, Radboud UMC	Secretary of the working group Samenwerking Obstetrie-Anesthesiologie, NVOG-NVA, unpaid	None	No action
Puylaert	Radiologist, HMC	None	None	No action
Nederend	Radiologist, Catharina Ziekenhuis Eindhoven	Screening radiologist for the population screening programme, paid; Secretary of the Abdominal Radiology Section, NVvR, unpaid	Research into the value of MRI in PIPAC treatment, partly funded (unrestricted grant) by Bracco Imaging Europe B.V.	No action
Bos	Resident in emergency medicine, VUmc region, Westfriesgasthuis location	General member of the NVSHA congress committee, unpaid	None	No action
Van den Boom	Fellow in gastrointestinal surgery	None	Principal investigator of the APPIC trial (short versus long antibiotic treatment after appendectomy for complex appendicitis), funded by ZonMw (Goed Gebruik Geneesmiddelen)	No action
Bom	Physician-researcher in surgery, AMC	None	I am paid from the EPOCH study, funded by ZonMw. This is an RCT on the prevention of wound infections. It is in no way related to the appendicitis guideline. I therefore have no interests in externally funded research.	No action
Knaapen	Physician-researcher in paediatric surgery, Amsterdam UMC	None	Coordinating investigator of the APAC study (non-operative treatment of appendicitis in children). ZonMw file number: 843002708	No action
Lammers	Policy adviser, Patiëntenfederatie	None	None	No action
Hofstede	Adviser, Kennisinstituut van de Federatie Medisch Specialisten	None	None	No action
Van Enst	Senior adviser, Kennisinstituut van de Federatie Medisch Specialisten	Member of the GRADE working group/Dutch GRADE Network	None	No action

Sounding board group member	Position	Additional positions	Declared interests	Action taken
Van der Bij	Clinical microbiologist, Diakonessenhuis Utrecht/MSBD	Chair of the NVMM quality control committee, unpaid	None	No action
Bakx	Paediatric surgeon, Kinderchirurgisch centrum Amsterdam	Chair of the NVvH guidelines committee, unpaid; board member of Stichting spoedeisende hulp bij kinderen, unpaid; APLS instructor, unpaid	Principal investigator of the APAC study (non-operative treatment of appendicitis in children). ZonMw file number: 843002708	No action

Patient perspective input

Attention was paid to the patient perspective by including a delegate of a patient organisation in the working group. The draft guideline was also submitted to the Patiëntenfederatie for comment.

Method of development

Evidence based

Implementation

Throughout the different phases of guideline development, the implementation of the guideline (module) and the practical feasibility of the recommendations were taken into account. Explicit attention was paid to factors that may promote or hinder the introduction of the guideline in practice. The implementation plan can be found among the related products. The working group also developed internal quality indicators to monitor and strengthen the application of the guideline in practice (see Indicator development).

Working method

AGREE

This guideline was drawn up in accordance with the requirements set out in the report Medisch Specialistische Richtlijnen 2.0 of the Adviescommissie Richtlijnen of the Raad Kwaliteit. This report is based on the AGREE II instrument (Appraisal of Guidelines for Research & Evaluation II; Brouwers, 2010), an instrument that is widely accepted internationally. For a step-by-step description of how an evidence-based guideline is produced, see the step-by-step plan Ontwikkeling van Medisch Specialistische Richtlijnen of the Kennisinstituut van de Federatie Medisch Specialisten.

Bottleneck analysis

During the preparatory phase, the chair of the working group and the adviser made an inventory of the bottlenecks. The working group assessed the recommendations from the previous guideline (NVvH, 2010) for the need for revision. Stakeholders were also invited to a bottleneck meeting (invitational conference). Because of the low number of registrations (three: IGZ, NVA and the Patiëntenfederatie), the meeting was cancelled. Stakeholders were asked to respond to the framework in writing. Bottlenecks were submitted in writing by NVKC, NVSHA, NVvH, NVZ and V&VN. A report of this is included under related products. The working group then drew up a long list of bottlenecks and prioritised them on the basis of (1) clinical relevance, (2) the availability of (new) high-quality evidence, and (3) the expected impact on quality of care, patient safety and (macro) costs.

Clinical questions and outcome measures

Based on the results of the bottleneck analysis, the chair and the adviser drafted the clinical questions. These were discussed with the working group, after which the working group established the final clinical questions. The working group then determined, for each clinical question, which outcome measures are relevant to the patient, considering both desirable and undesirable effects. The working group rated these outcome measures according to their relative importance for decision-making about recommendations as crucial (critical for decision-making), important (but not crucial) or unimportant. In addition, at least for the crucial outcome measures, the working group defined which differences it considered clinically (patient) relevant.

Strategy for searching and selecting literature

An exploratory search was first carried out for existing foreign guidelines, systematic reviews (Medline (OVID)) and literature on patient preferences (patient perspective; Medline (OVID)). Next, for each individual clinical question, specific search terms were used to search for published scientific studies in (various) electronic databases. Additional studies were sought through the reference lists of the selected articles. The initial search targeted studies with the highest level of evidence. The working group members selected the articles found by the search on the basis of predefined selection criteria. The selected articles were used to answer the clinical question. The databases searched, the search strategy and the selection criteria applied can be found in the module for the clinical question concerned. The search strategies for the exploratory search and for the patient perspective are included under related products.

Quality assessment of individual studies

Individual studies were assessed systematically against predefined methodological quality criteria in order to estimate the risk of biased study results (risk of bias). These assessments can be found in the risk-of-bias (RoB) tables. The RoB instruments used are validated instruments recommended by the Cochrane Collaboration: AMSTAR for systematic reviews; Cochrane for randomised controlled trials; ACROBAT-NRS for observational studies; QUADAS II for diagnostic studies.

Summarising the literature

The relevant research data from all selected articles were presented clearly in evidence tables. The main findings from the literature were described in the summary of the literature. Where there were enough studies and sufficient similarity (homogeneity) between them, the data were also summarised quantitatively (meta-analysis) using Review Manager 5.

Assessing the strength of the scientific evidence

A) For intervention questions (questions about therapy or screening)

The strength of the scientific evidence was determined using the GRADE method. GRADE stands for ‘Grading Recommendations Assessment, Development and Evaluation’ (see http://www.gradeworkinggroup.org/).

GRADE distinguishes four levels for the quality of the scientific evidence: high, moderate, low and very low. These levels refer to the degree of certainty about the literature conclusion (Schünemann, 2013).

GRADE	Definition
High	There is high certainty that the true effect of treatment lies close to the estimated effect of treatment as stated in the literature conclusion; it is very unlikely that the literature conclusion will change when results of new large-scale research are added to the literature analysis.
Moderate*	There is moderate certainty that the true effect of treatment lies close to the estimated effect of treatment as stated in the literature conclusion; it is possible that the conclusion will change when results of new large-scale research are added to the literature analysis.
Low	There is low certainty that the true effect of treatment lies close to the estimated effect of treatment as stated in the literature conclusion; there is a real chance that the conclusion will change when results of new large-scale research are added to the literature analysis.
Very low	There is very low certainty that the true effect of treatment lies close to the estimated effect of treatment as stated in the literature conclusion; the literature conclusion is very uncertain.

*In 2017 the Dutch GRADE Network decided that the preferred Dutch wording for the second-highest grade is ‘redelijk’ rather than ‘matig’.

B) For questions about diagnostic tests, harm or adverse effects, aetiology and prognosis

The strength of the scientific evidence was likewise determined using the GRADE method: GRADE-diagnostics for diagnostic questions (Schünemann, 2008), and a generic GRADE method for questions about harm or adverse effects, aetiology and prognosis. The generic GRADE method applied the basic principles of GRADE: naming and prioritising the clinically (patient) relevant outcome measures, a systematic review per outcome measure, and an assessment of the strength of evidence based on the five GRADE criteria (starting point high; downgrading for risk of bias, inconsistency, indirectness, imprecision and publication bias).
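The downgrading logic described above (start at high, lower the level for each of the five criteria) can be sketched as follows. This is a simplified illustration, not the working group's procedure: the function name `grade_certainty` is hypothetical, and the convention that each criterion lowers the rating by one or two levels, with "very low" as the floor, follows general GRADE practice and is not stated in this text.

```python
# Certainty levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

# The five downgrading criteria named in the text.
CRITERIA = (
    "risk of bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication bias",
)


def grade_certainty(downgrades):
    """Return the certainty level after applying downgrades, starting at 'high'.

    `downgrades` maps a criterion name to the number of levels to
    subtract (assumed here to be 0, 1 or 2).
    """
    unknown = set(downgrades) - set(CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    total = sum(downgrades.values())
    # Never drop below the lowest level.
    return LEVELS[max(0, len(LEVELS) - 1 - total)]


# Hypothetical example: one level each for risk of bias and imprecision.
print(grade_certainty({"risk of bias": 1, "imprecision": 1}))  # low
```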

Formulating the conclusions

For each relevant outcome measure, the scientific evidence was summarised in one or more literature conclusions, with the level of evidence determined according to the GRADE methodology. The working group members drew up the balance for each intervention (overall conclusion), weighing the favourable and unfavourable effects for the patient. The overall strength of evidence is determined by the lowest strength of evidence found for any of the crucial outcome measures. For complex decisions in which many additional arguments (considerations) play a role alongside the conclusions from the systematic literature analysis, no overall conclusion was drawn. In that case, the favourable and unfavourable effects of the interventions were weighed together with all additional arguments under the heading 'Considerations'.
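The rule that the overall strength of evidence equals the lowest level found among the crucial outcome measures can be sketched as a minimum over an ordered scale. The function name `overall_certainty` and the example outcomes are hypothetical and serve only to illustrate the rule.

```python
# Certainty levels ordered from lowest to highest.
GRADE_LEVELS = ["very low", "low", "moderate", "high"]


def overall_certainty(outcomes):
    """Return the lowest certainty level among the crucial outcome measures.

    `outcomes` is an iterable of (name, importance, certainty) tuples,
    where importance is "crucial", "important" or "unimportant".
    """
    crucial = [level for _, importance, level in outcomes if importance == "crucial"]
    if not crucial:
        raise ValueError("at least one crucial outcome measure is required")
    return min(crucial, key=GRADE_LEVELS.index)


# Hypothetical outcome measures, for illustration only.
example = [
    ("sensitivity", "crucial", "moderate"),
    ("specificity", "crucial", "low"),
    ("inconclusive results", "important", "very low"),
]
print(overall_certainty(example))  # low
```

Outcome measures rated only "important" do not lower the overall level in this sketch, which is why the result is "low" and not "very low".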

Considerations (from evidence to recommendation)

In arriving at a recommendation, other aspects besides (the quality of) the scientific evidence are important and are taken into account, such as the expertise of the working group members, the patient's values and preferences, costs, availability of facilities and organisational matters. Where these aspects are not part of the literature summary, they are stated and assessed (weighed) under the heading ‘Considerations’.

Formulating recommendations

The recommendations answer the clinical question and are based on the available scientific evidence, the most important considerations, and a weighing of the favourable and unfavourable effects of the relevant interventions. The strength of the scientific evidence and the weight the working group assigns to the considerations together determine the strength of the recommendation. In line with the GRADE methodology, low strength of evidence in the systematic literature analysis does not a priori rule out a strong recommendation, and weak recommendations are also possible when the strength of evidence is high. The strength of the recommendation is always determined by weighing all relevant arguments together.

Preconditions (organisation of care)

In the bottleneck analysis and in the development of the guideline, the organisation of care was explicitly taken into account: all aspects that are preconditions for providing care (such as coordination, communication, (financial) resources, staffing and infrastructure). Preconditions that are relevant to answering a specific clinical question form part of the considerations for that clinical question.

Indicator development

Alongside the development of the draft guideline, internal quality indicators were developed to monitor and strengthen the application of the guideline in practice. More information about the method of indicator development can be requested from the Kennisinstituut van de Federatie Medisch Specialisten (secretariaat@kennisinstituut.nl).

Knowledge gaps

During the development of this guideline, a systematic search was made for research whose results contribute to answering the clinical questions. For each clinical question, the working group considered whether (additional) scientific research is desirable in order to answer it. An overview of the topics for which (additional) scientific research is considered important is described as a recommendation in the Knowledge gaps (under related products).

Comment and authorization phase

The draft guideline was submitted for comment to the (scientific) societies and (patient) organisations involved. The comments were collected and discussed with the working group. In response to the comments, the draft guideline was amended and finalised by the working group. The final guideline was submitted to the participating (scientific) societies and (patient) organisations for authorization and was authorized or approved by them.

References

Brouwers MC, Kho ME, Browman GP, et al. AGREE Next Steps Consortium. AGREE II: advancing guideline development, reporting and evaluation in health care. CMAJ. 2010;182(18):E839-42. doi: 10.1503/cmaj.090449. Epub 2010 Jul 5. Review. PubMed PMID: 20603348.

Medisch Specialistische Richtlijnen 2.0 (2012). Adviescommissie Richtlijnen van de Raad Kwaliteit. Link: https://richtlijnendatabase.nl/over_deze_site/richtlijnontwikkeling.html

Schünemann H, Brożek J, Guyatt G, et al. GRADE handbook for grading quality of evidence and strength of recommendations. Updated October 2013. The GRADE Working Group, 2013. Available from http://gdt.guidelinedevelopment.org/central_prod/_design/client/handbook/handbook.html.

Schünemann HJ, Oxman AD, Brozek J, et al. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ. 2008;336(7653):1106-10. doi: 10.1136/bmj.39500.677199.AE. Erratum in: BMJ. 2008;336(7654). doi: 10.1136/bmj.a139. PubMed PMID: 18483053.

Wessels M, Hielkema L, van der Weijden T. How to identify existing literature on patients' knowledge, views, and values: the development of a validated search filter. J Med Libr Assoc. 2016 Oct;104(4):320-324. PubMed PMID: 27822157; PubMed Central PMCID: PMC5079497.

Search justification

Search strategies are available on request. Please contact the Richtlijnendatabase.
