Diagnostic strategy in pregnant women with acute appendicitis

Reviewed: 01-07-2019

Clinical question

What is the optimal diagnostic strategy in pregnant women with suspected acute appendicitis?

Recommendation

Recommendation 1

Perform ultrasound in every pregnant patient with suspected appendicitis.

Recommendation 2

Perform MRI when the ultrasound is inconclusive.

When the ultrasound and/or MRI is negative, refer the patient to a gynaecologist for further management.

Admit the patient when the MRI is inconclusive and decide on management in consultation with a gynaecologist.

See also the flowchart 'Diagnostiek bij zwangere vrouwen' (Diagnostics in pregnant women) among the related products under 'toepassen' (apply).

Considerations

Advantages and disadvantages of the intervention and the quality of the evidence

Imaging:

The working group considers the diagnostic accuracy of history taking, physical examination, and laboratory tests insufficient to establish a definitive diagnosis of acute appendicitis. Additional imaging is therefore indicated to establish the definitive diagnosis. The accuracy of imaging increases as the pretest probability rises. Careful preselection based on history, physical examination, and laboratory tests increases the accuracy of imaging and is therefore essential.

Using imaging in every patient with suspected appendicitis leads to lower costs per patient, fewer complications, and fewer negative appendectomies (Lahaye, 2015). The sensitivity and specificity of ultrasound are high. In pregnant women, however, a considerable proportion of examinations is inconclusive, far more than in children and adults; in the studies by Konrad (2015) and Lehnert (2012) this was over 90%.

Re-evaluation:

Onur (2008) conducted an RCT in patients with acute abdominal pain who had not received a definitive diagnosis after assessment in the emergency department. The trial compared morbidity between admission for observation (n=50) and re-evaluation over the following three days at intervals of 8 to 12 hours (n=55). Total morbidity was 10% in the admission group and 7.2% in the re-evaluation group.

Toorenvliet (2010) conducted a prospective study of 500 patients with acute abdominal pain who were not admitted after assessment in the emergency department and were re-evaluated within 24 hours. In six patients (1.2%), re-evaluation yielded a diagnosis that ideally would have been made at the initial assessment. Three of these patients had perforated acute appendicitis. All six patients underwent surgery and were discharged without complications after recovery.

The study by Toorenvliet (2010) was designed as a prospective study in which daily management was not influenced by the study. Since an earlier diagnosis would have been desirable in only six patients (1.2%), it can be stated that clinical evaluation allows a sound judgement of whether a patient is eligible for re-evaluation.

The working group is therefore of the opinion that the diagnostic accuracy of history taking, physical examination, and laboratory tests is sufficient to judge whether a patient with acute abdominal pain and suspected appendicitis is eligible for additional diagnostics, admission, or re-evaluation.

Quality of evidence

Imaging:

The quality of the evidence is low to very low. For some comparisons, no studies were available that compared different tests within the same study population. The diagnostic accuracy of these tests was therefore extracted from different study populations, so no direct comparison can be made and the evidence is indirect.

The studies and outcomes in the available literature are highly heterogeneous, which lowers the quality of evidence for the outcome measures. Prevalence varies across studies, and it is often unclear how patients were selected and which diagnostics preceded imaging. In addition, many studies do not make clear how long the follow-up for the reference test was, particularly for patients who did not undergo surgery, or which information was available when a diagnostic test was assessed.

Re-evaluation:

The quality of the evidence was not assessed. No systematic literature search was performed, because no comparative studies were expected that could determine in which patients, after clinical evaluation, re-evaluation or additional diagnostics the next day is indicated. Consequently, no conclusions are stated. The recommendations are therefore based solely on considerations drawn up by the working group members from practical knowledge, supported where possible by a non-systematic literature review.

Values and preferences of patients (and, where applicable, their caregivers)

Imaging:

The most important goal for the patient is a correct and prompt diagnosis: no negative appendectomy, but also no unjustified delay in treatment.

Ultrasound involves no radiation exposure and is well tolerated by patients. Another advantage is that other causes of the symptoms (gallstones, cholecystitis, pyelonephritis, kidney stones) can also be assessed well during ultrasound. A disadvantage is the high percentage of inconclusive examinations.

A disadvantage of MRI for appendicitis is that the technique is not yet widely available in all hospitals and is less accessible mainly at night and at weekends. In addition, interpreting an MRI requires specific training, and far fewer radiologists are proficient in interpreting an MRI (Leeuwenburgh, 2012). Another disadvantage of MRI is the narrow bore, which means that patients with claustrophobia and/or severe obesity cannot undergo MRI. Patients with a neurostimulator or pacemaker also cannot undergo MRI, or only with precautions. A good-quality MRI requires the patient to lie still. Movement of the unborn child can also cause artefacts that complicate interpretation.

The working group advises against CT, given the high radiation exposure, the use of intravenous contrast, and the possible consequences of these for the unborn child.

Re-evaluation:

The most important goal for the patient is a correct and prompt diagnosis and treatment, with unnecessary diagnostics kept to a minimum. Using CT as a diagnostic tool entails radiation exposure.

Using re-evaluation in the diagnostic process yields more correct diagnoses. After re-evaluation within 24 hours, 30% of patients received a diagnosis different from the initial diagnosis made at first assessment in the emergency department. Re-evaluation led to a change in management in 17% of cases and to surgical treatment in 4% (Toorenvliet, 2010).

In 1.2% of the patients who returned for re-evaluation within 24 hours, a diagnosis was found for which faster diagnostics would have been desirable. These patients underwent surgery and were discharged without complications after recovery.

Using re-evaluation in the diagnostic process counters unnecessary use of additional diagnostics. By making use of time, the natural course of the disease can be followed. Mild, self-limiting disease will not require additional diagnostics.

Costs (resource use)

Imaging:

Using imaging in every patient with suspected appendicitis leads to lower costs per patient (Lahaye, 2015). Ultrasound is cheaper and more readily available than CT and MRI. Because ultrasound alone suffices in a number of patients, costs are reduced compared with a diagnostic pathway that skips ultrasound.

Re-evaluation:

Using re-evaluation in the diagnostic process leads to less additional diagnostics and limits the number of admissions. Re-evaluation is cheaper than admission. With re-evaluation, a number of patients need no additional diagnostics, because the natural course shows that the disease is mild.

Acceptability to other relevant stakeholders

Re-evaluation:

The use of imaging and re-evaluation in the diagnostic process for a patient with suspected acute appendicitis is standard care and widely accepted. An additional deciding factor in favour of re-evaluation may be that not every hospital has ultrasound, CT, or MRI directly available outside office hours and at weekends.

Feasibility and implementation

Imaging:

The use of ultrasound for suspected acute appendicitis is standard care and widely accepted. Expertise with MRI for appendicitis varies. Moreover, not every hospital will have staff available outside office hours and at weekends who are proficient in performing and interpreting an MRI.

Re-evaluation:

The diagnostic accuracy of history taking, physical examination, and, where applicable, laboratory tests is sufficient to judge whether a patient is eligible for additional diagnostics, admission, or re-evaluation.

Rationale/balance between the arguments for and against the intervention

Its lower cost, wide availability, and absence of radiation exposure make ultrasound the examination of first choice in pregnant women with suspected appendicitis.

The diagnostic accuracy of MRI in pregnant women is high. The working group advises against CT, given the high radiation exposure, the use of intravenous contrast, and the possible consequences of these for the unborn child.

Evidence base

Background

The diagnostic work-up for appendicitis consists of clinical evaluation, laboratory tests, and imaging (ultrasound, CT scan, MRI scan). When suspicion of acute appendicitis is low, one may choose after clinical evaluation to re-evaluate the patient the next day instead of performing additional diagnostics. Patients with acute appendicitis are generally easy to identify, but patients who present at an early stage of the disease are harder to distinguish from patients with other (self-limiting) causes of abdominal complaints. In pregnant women, identifying acute appendicitis is more difficult because the appendix shifts position during pregnancy; appendicitis can therefore cause pain at a different location, leading to an atypical presentation. Whether a patient is eligible for re-evaluation or for immediate additional diagnostics depends on clinical evaluation.

Conclusions / Summary of Findings

Low GRADE

It is unclear whether there is a difference in sensitivity, specificity, positive predictive value, and negative predictive value between ultrasound and MRI or contrast CT for diagnosing acute appendicitis in pregnant women with suspected acute appendicitis. Ultrasound appears to result in a large number of inconclusive results in pregnant women.

Sources: (Basaran, 2009; Burke, 2015; Burns, 2017; Duke, 2016; Israel, 2008; Kazemini, 2017; Kereshi, 2018; Konrad, 2015; Petkovska, 2016; Segev, 2016; Wi, 2018)

Very low GRADE

It is unclear whether there is a difference in sensitivity, specificity, positive predictive value, and negative predictive value between MRI and contrast CT for diagnosing acute appendicitis in pregnant women with suspected acute appendicitis.

Sources: (Basaran, 2009; Burke, 2015; Duke, 2016; Kereshi, 2018; Petkovska, 2016; Wi, 2018)

Very low GRADE

It is unclear whether there is a difference in sensitivity, specificity, positive predictive value, and negative predictive value between a work-up (ultrasound plus contrast CT or ultrasound plus MRI) and MRI or CT for diagnosing acute appendicitis in pregnant women with suspected acute appendicitis.

Sources: (Amitai, 2016; Burns, 2017; Konrad, 2015; Patel, 2017; Shetty, 2010)

Very low GRADE

It is unclear whether there is a difference in sensitivity, specificity, positive predictive value, and negative predictive value between a work-up of ultrasound plus contrast CT and a work-up of ultrasound plus MRI for diagnosing acute appendicitis in pregnant women with suspected acute appendicitis.

Sources: (Amitai, 2016; Burns, 2017; Konrad, 2015; Patel, 2017; Shetty, 2010)

Summary of the literature

Description of studies

In total, two systematic reviews (Duke, 2016; Basaran, 2009) and 13 additional studies (Amitai, 2016; Burke, 2015; Burns, 2017; Israel, 2008; Kazemini, 2017; Kereshi, 2018; Konrad, 2015; Lehnert, 2012; Patel, 2017; Petkovska, 2016; Segev, 2016; Shetty, 2010; Wi, 2018) were included for the subgroup of pregnant women. Most were retrospective cohort studies that examined the accuracy of imaging through chart review. In almost all studies, the clinical course or the (pathological) findings after surgery served as the reference test.

Ultrasound versus contrast CT or MRI

The review by Duke (2016) included 30 cohort studies, 12 of which covered 933 pregnant women who underwent MRI. Some of these studies reported the use of MRI after an inconclusive ultrasound. The mean prevalence of acute appendicitis in the studies of pregnant women was 10.5%, and clinical follow-up or pathological findings were used as the reference test. In addition, Wi (2018) described the accuracy of MRI in 125 patients; the prevalence of acute appendicitis was 19.2%, and patient records or pathology results were used as the reference test. Kereshi (2018) described MRI results in 204 pregnant women, 7.4% of whom had acute appendicitis. However, when a patient had previously had an ultrasound at another institution, that ultrasound was taken into account in the assessment. Information from patient records on surgical and pathological findings, emergency department visits during the work-up, or a visit to the gynaecologist was used as the reference test.

Burke (2015) reported results for 709 pregnant women who underwent MRI; 9.3% had acute appendicitis. Petkovska (2016) reported MRI results in 403 patients, including 48 pregnant women with a prevalence of 17%. For the reference test, information such as pathology outcomes was taken from patient records.

The review by Basaran (2009) included 3 studies with a total of 173 patients that reported the accuracy of CT. Two studies included pregnant women with suspected acute appendicitis; one study included pregnant women with various indications. The prevalence of acute appendicitis was 37%, and the reference test was surgical (pathological) findings or clinical follow-up. The review did not assess the methodological quality of the studies. These results were supplemented with Shetty (2010), who included women with CT (n=4), ultrasound (n=12), or both (n=23), 12.8% of whom had acute appendicitis. For the reference test, information such as pathology outcomes or clinical course was taken from patient records.

Israel (2008) compared ultrasound with MRI in 33 pregnant women, 5 (15%) of whom had acute appendicitis. The radiologists who interpreted the MRI were aware of the clinical findings and the ultrasound. For the reference test, information such as pathology outcomes or alternative diagnoses was taken from patient records.

The study by Kazemini (2017) reported the accuracy of ultrasound in 58 pregnant women who had undergone appendectomy. The prevalence of acute appendicitis was 65.5%, and pathology outcomes were used as the reference test. Segev (2016) compared the accuracy of ultrasound in pregnant women with that in non-pregnant women, all of whom had undergone appendectomy. Pathology outcomes were used as the reference test. Only the results for the 92 pregnant women were extracted to answer this research question. The prevalence was 29%.

Lehnert (2012) reported the number of inconclusive ultrasounds in 99 women; the prevalence was 7.1%. The (pathological) findings after surgery served as the reference test. If a patient did not undergo surgery, the medical record was reviewed.

MRI versus contrast CT

Because no studies in pregnant women directly compared MRI with CT, the test characteristics of CT and MRI are described as indirect evidence (see CT or MRI under the heading 'Ultrasound versus contrast CT or MRI').

Work-up (ultrasound plus contrast CT or ultrasound plus MRI) versus MRI, CT, or a work-up

The study by Burns (2017) reported the accuracy of MRI in 63 women, 52 of whom had also had an ultrasound before the MRI. The prevalence was 20.6%. The (pathological) findings after surgery served as the reference test. If a patient had not undergone appendectomy, appendicitis was ruled out when the woman did not present again with symptoms of appendicitis during the pregnancy.

Konrad (2015) examined the accuracy of MRI and ultrasound. Of the 140 women, 117 had an ultrasound and 114 had an MRI: 8 to confirm the ultrasound results, 83 because the appendix was not visible on ultrasound, and 23 as first-line imaging. Acute appendicitis was diagnosed in 11%. The (pathological) findings after surgery served as the reference test. If a patient did not undergo surgery, the medical record was reviewed.

Patel (2017) reported the accuracy of ultrasound followed by MRI in 42 women. Both were used to diagnose acute appendicitis (prevalence 11.9%). The (pathological) findings after surgery served as the reference test. If a patient did not undergo surgery, the medical record was reviewed for relevant admissions over a follow-up of 6 months.

In the study by Amitai (2016), all women (n=49) had both ultrasound and MRI, and 10% were diagnosed with acute appendicitis. Shetty (2010) reported results for 39 women, 23 of whom had had ultrasound plus CT. The findings after surgery served as the reference test. If a patient did not undergo surgery, clinical follow-up was used.

Results

Ultrasound versus contrast CT or MRI

Israel (2008) compared ultrasound with MRI in 33 pregnant women, 5 (15%) of whom had acute appendicitis. However, the radiologists who interpreted the MRI were aware of the clinical findings and the ultrasound, so no direct comparison was made between ultrasound and MRI. The sensitivity, specificity, positive predictive value, and negative predictive value of ultrasound were 20%, 100%, 100%, and 99%, respectively. For MRI these were 80%, 100%, 100%, and 97%, respectively. The appendix was not seen on ultrasound in 29 (88%) patients; on MRI the appendix was not visible in 16 women (48%).

Because no studies in pregnant women directly compared ultrasound with CT or MRI, accuracy studies were included and the test characteristics of ultrasound, CT, and MRI are described as indirect evidence.

Ultrasound

The study by Kazemini (2017) reported the accuracy of ultrasound in 58 pregnant women, 65.5% of whom had acute appendicitis. Konrad (2015) examined the accuracy of MRI and ultrasound; of the 140 women, 117 had an ultrasound. The study by Burns (2017) reported the accuracy of MRI in 63 women, 52 of whom had also had an ultrasound before the MRI. Segev (2016) reported on 92 pregnant women, 29% of whom had acute appendicitis. Lehnert (2012) reported the number of inconclusive ultrasounds in 99 women. The sensitivity, specificity, positive predictive value, negative predictive value, and number of inconclusive or non-visualised ultrasounds in these studies are shown in Table 2.

Table 2 Sensitivity, specificity, positive predictive value, and negative predictive value of the ultrasound studies

Study	Sensitivity	Specificity	Positive predictive value	Negative predictive value	Inconclusive/not visualised
Kazemini, 2017 (n=58)	94% (95% CI 87 to 98)	97% (95% CI 96 to 98)	81%	99%	4 (6.9%, inconclusive)
Konrad, 2015 (n=117)	100%	95%	96%	100%	109 (93.2%, not visualised)
Burns, 2017 (n=52)	100%	99.5%	93.3%	100%	Not reported
Segev, 2016 (n=92)	86.8%	99.2%	94.4%	99.7%	Not reported
Lehnert, 2012 (n=99)	Not reported	Not reported	Not reported	Not reported	96 (97%, inconclusive)

MRI

The review by Duke (2016) included 30 cohort studies, 12 of which covered 933 pregnant women who underwent MRI. Some of these studies reported the use of MRI after an inconclusive ultrasound. The pooled sensitivity, specificity, positive predictive value, and negative predictive value were 94% (95% CI 87 to 98), 97% (95% CI 96 to 98), 81%, and 99%, respectively. The prevalence of acute appendicitis in the study population was 10.5%. At this pretest probability, per 1000 pregnant women 27 would be incorrectly classified as having acute appendicitis (false positive) and 6 would be incorrectly classified as not having acute appendicitis (false negative). The study results were homogeneous (I² 0.2% for sensitivity and 19.1% for specificity).
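
The per-1000 figures follow from the pooled estimates and the prevalence. A minimal Python sketch of that arithmetic, assuming a hypothetical cohort of 1000 pregnant women and rounding to whole patients (the function name `misclassified_per_1000` is illustrative and not part of the guideline):

```python
def misclassified_per_1000(prevalence, sensitivity, specificity, cohort=1000):
    """Expected false positives and false negatives in a cohort (assumed size 1000)."""
    diseased = cohort * prevalence          # women with acute appendicitis
    healthy = cohort - diseased             # women without acute appendicitis
    false_negatives = diseased * (1 - sensitivity)
    false_positives = healthy * (1 - specificity)
    return round(false_positives), round(false_negatives)


# Pooled MRI estimates reported for Duke (2016): prevalence 10.5%,
# sensitivity 94%, specificity 97%.
fp, fn = misclassified_per_1000(0.105, 0.94, 0.97)
print(fp, fn)  # prints: 27 6
```

The same function can be reused with other prevalence values to see how the number of misclassified patients shifts with the pretest probability.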

In addition, Wi (2018), Kereshi (2018), Burke (2015), and Petkovska (2016) reported the sensitivity, specificity, positive predictive value, and negative predictive value of MRI (Table 3). In Kereshi (2018), however, an ultrasound was taken into account when interpreting the MRI if the patient had had an ultrasound at another institution before the MRI. Kereshi (2018) found 6 MRIs with inconclusive results. In Burke (2015), the appendix was not visible on 207 (29.2%) of the MRIs.

Table 3 Sensitivity, specificity, positive predictive value, and negative predictive value of the MRI studies

Study, n	Sensitivity, % (95% CI)	Specificity, % (95% CI)	Positive predictive value, % (95% CI)	Negative predictive value, % (95% CI)	Inconclusive results, n (%)
Duke, 2016 (review, 12 studies, n=933)	94 (87 to 98)	97 (96 to 98)	81	99	Not reported
Wi, 2018 (n=125)	100	95	96	100	Not reported
Kereshi, 2018 (n=204)	100	99.5	93.3	100	6 (2.9%)
Burke, 2015 (n=709)	86.8	99.2	94.4	99.7	207 (29.2%)
Petkovska, 2016 (n=48)	85.7 (42.1 to 99.6)	100 (91.4 to 100)	100 (54.1 to 100)	97.6 (87.4 to 99.9)	Not reported

Contrast CT

The review by Basaran (2009) included 3 studies with a total of 173 patients that reported the accuracy of CT and found a pooled sensitivity of 86% (95% CI 64% to 97%) and a pooled specificity of 97% (95% CI 86% to 100%). The positive and negative predictive values were not reported. They were calculated from the mean prevalence of 37%: the positive predictive value was 94% and the negative predictive value 93%.
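
Predictive values can be derived from sensitivity, specificity, and prevalence with Bayes' rule. The sketch below illustrates that derivation under the assumption that the rounded pooled estimates and the mean prevalence of 37% are the inputs; the function name `predictive_values` is hypothetical. With these rounded inputs the positive predictive value comes out at about 94% and the negative predictive value at about 92%, slightly below the 93% stated in the text, presumably because the original calculation used unrounded values.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) from prevalence, sensitivity, and specificity via Bayes' rule."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    true_neg = (1 - prevalence) * specificity
    false_neg = prevalence * (1 - sensitivity)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded pooled CT estimates reported for Basaran (2009); using rounded
# inputs is an assumption of this sketch.
ppv, npv = predictive_values(0.37, 0.86, 0.97)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}")
```

Because prevalence enters both formulas, the same sensitivity and specificity give a lower positive predictive value in populations with a lower pretest probability.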

Quality of evidence of the literature

The quality of evidence for the outcome measures sensitivity, specificity, positive predictive value, negative predictive value, and inconclusive results for ultrasound versus MRI or CT was downgraded by two levels to low because of risk of bias (unclear patient selection, unclear follow-up time for the reference test, and index tests interpreted with knowledge of other tests) and indirectness (diagnostic accuracy as an intermediate step towards patient-relevant consequences).

MRI versus contrast CT

Because no studies in pregnant women directly compared MRI with CT, accuracy studies were included and the test characteristics of CT and MRI are described as indirect evidence (see 'Ultrasound versus contrast CT or MRI').

Quality of evidence of the literature

The quality of evidence for the outcome measures sensitivity, specificity, positive predictive value, negative predictive value, and inconclusive results for MRI versus CT was downgraded by three levels to very low because of risk of bias (unclear patient selection, unclear follow-up time for the reference test, and index tests interpreted with knowledge of other tests), indirectness (diagnostic accuracy as an intermediate step towards patient-relevant consequences), and imprecision (few patients with a CT scan).

Work-up (ultrasound plus contrast CT or ultrasound plus MRI) versus MRI or CT, and work-up ultrasound plus contrast CT versus work-up ultrasound plus MRI

Because no studies in pregnant women directly compared MRI or CT with a work-up (ultrasound plus CT or ultrasound plus MRI), accuracy studies were included and the test characteristics of the work-ups ultrasound plus MRI and ultrasound plus CT are described as indirect evidence. In most studies, both an ultrasound and an MRI or a CT were performed.

Work-up: ultrasound with MRI

The study by Burns (2017) reported the accuracy of MRI in 63 women, 52 of whom had also had an ultrasound before the MRI. The prevalence was 20.6%. The sensitivity, specificity, positive predictive value, and negative predictive value were 75%, 100%, 100%, and 93.2%, respectively. Konrad (2015) examined the accuracy of MRI and ultrasound. Of the 140 women, 117 had an ultrasound and 114 had an MRI: 8 to confirm the ultrasound results, 83 because the appendix was not visible on ultrasound, and 23 as first-line imaging. The sensitivity, specificity, positive predictive value, and negative predictive value of MRI with or without ultrasound were 100%, 98%, 89%, and 100%, respectively. On MRI, 23 (20%) appendices were not seen. Patel (2017) reported the accuracy of ultrasound followed by MRI in 42 women. The sensitivity, specificity, positive predictive value, and negative predictive value were 60%, 91.9%, 50%, and 94.4%, respectively. In the study by Amitai (2016), all women (n=49) had both ultrasound and MRI, and 10% were diagnosed with acute appendicitis. The positive predictive value was 83.3% and the negative predictive value 100%.

Work-up: ultrasound with CT

Shetty (2010) reported results for 39 women, of whom 4 had had a CT and 23 an ultrasound plus CT. The sensitivity of CT (with or without ultrasound) was 100%; the other outcome measures were not reported.

Quality of evidence of the literature

The quality of evidence for the outcome measures sensitivity, specificity, positive predictive value, and negative predictive value for the work-up (ultrasound plus CT or ultrasound plus MRI) versus ultrasound, MRI, or CT was downgraded by three levels to very low because of risk of bias (unclear patient selection, unclear follow-up time for the reference test, and index tests interpreted with knowledge of other tests), indirectness (diagnostic accuracy as an intermediate step towards patient-relevant consequences), and imprecision (few patients who underwent a work-up). Because no studies described inconclusive results, the quality of evidence for inconclusive results was not assessed.

Search and selection

To answer the clinical question, a systematic literature analysis was performed for the following search questions:

PICO 1

What is the diagnostic accuracy of ultrasound compared with MRI or contrast CT for diagnosing acute appendicitis in pregnant women with suspected acute appendicitis?

P (Patients): patients (pregnant women) with acute appendicitis;

I (Intervention): ultrasound;

C (Comparison): MRI or contrast CT;

Reference test: clinical course or the (pathological) findings after surgery;

O (Outcomes): sensitivity, specificity, positive predictive value, negative predictive value, inconclusive results.

PICO 2

What is the diagnostic accuracy of MRI compared with CT for diagnosing acute appendicitis in pregnant women with suspected acute appendicitis?

P (Patients): patients (pregnant women) with acute appendicitis;

I (Intervention): MRI;

C (Comparison): contrast CT;

Reference test: clinical course or the (pathological) findings after surgery;

O (Outcomes): sensitivity, specificity, positive predictive value, negative predictive value, inconclusive results.

PICO 3

What is the diagnostic accuracy of a work-up (ultrasound plus CT or ultrasound plus MRI) compared with MRI or contrast CT, or of a work-up of ultrasound plus CT compared with a work-up of ultrasound plus MRI, for diagnosing acute appendicitis in pregnant women with suspected acute appendicitis?

P (Patients): patients (pregnant women) with acute appendicitis;

I (Intervention): work-up (ultrasound plus contrast CT or ultrasound plus MRI);

C (Comparison): MRI, contrast CT, or a work-up (other than under I);

Reference test: clinical course or the (pathological) findings after surgery;

O (Outcomes): sensitivity, specificity, positive predictive value, negative predictive value, inconclusive results.

Relevant outcome measures

The working group considered sensitivity and negative predictive value to be outcome measures critical for decision-making, and specificity, positive predictive value, and inconclusive results to be outcome measures important for decision-making.

Table 1 Consequences of diagnostic test characteristics

Outcomes	Consequences	Relevance
True positives (TP)	Patient is correctly diagnosed with acute appendicitis and receives treatment.	Critical
True negatives (TN)	Patient is correctly not diagnosed with acute appendicitis and correctly receives no treatment.	Important
False positives (FP)	Patient is incorrectly diagnosed with acute appendicitis and receives unnecessary treatment. The symptoms persist.	Important
False negatives (FN)	Patient is incorrectly not diagnosed with acute appendicitis and incorrectly receives no treatment. Further investigation into the cause of the symptoms follows.	Critical
Inconclusive results	Follow-up imaging (MRI or CT) with a delay in the final diagnosis.	Important

The working group defined a 10% difference in sensitivity, specificity, positive predictive value, negative predictive value, or inconclusive results as a clinically (patient) relevant difference.

Search and selection (Method)

On 6 June 2018, the databases Medline (via OVID) and Embase (via Embase.com) were searched with relevant search terms for systematic reviews (SRs), randomized controlled trials (RCTs), observational comparative studies, and cohort studies published from 2008 onwards that reported on the diagnostic accuracy of ultrasound, MRI, CT, or a step-up approach in which an MRI or CT scan was performed after an inconclusive ultrasound for the diagnosis of acute appendicitis. The search strategy is shown under the tab Verantwoording (Accountability). The literature search yielded 479 hits. Studies were selected using the following criteria: systematic reviews (SRs), RCTs, and observational comparative studies that reported at least one of the following outcome measures: sensitivity, specificity, positive predictive value, negative predictive value, or inconclusive results. When such studies were not available, non-comparative studies were also included.

Based on title and abstract, 107 studies were initially preselected. After the full text was consulted, 72 studies were excluded (see the exclusion table under the tab Verantwoording) and 35 studies were definitively selected. These 35 studies were included in the literature analysis; of these, 2 systematic reviews and 12 additional studies reported results for the subgroup of pregnant women. The main study characteristics and results are presented in the evidence tables. The assessment of the individual study design (risk of bias) is presented in the risk-of-bias tables.

References

Amitai MM, Katorza E, Guranda L, Apter S, Portnoy O, Inbar Y, Konen E, Klang E, Eshet Y. Role of Emergency Magnetic Resonance Imaging for the Workup of Suspected Appendicitis in Pregnant Women. Isr Med Assoc J. 2016 Oct;18(10):600-604. PubMed PMID: 28471619.
Basaran A, Basaran M. Diagnosis of acute appendicitis during pregnancy: a systematic review. Obstet Gynecol Surv. 2009 Jul;64(7):481-8; quiz 499. doi: 10.1097/OGX.0b013e3181a714bf. Review. PubMed PMID: 19545456.
Burke LM, Bashir MR, Miller FH, Siegelman ES, Brown M, Alobaidy M, Jaffe TA, Hussain SM, Palmer SL, Garon BL, Oto A, Reinhold C, Ascher SM, Demulder DK, Thomas S, Best S, Borer J, Zhao K, Pinel-Giroux F, De Oliveira I, Resende D, Semelka RC. Magnetic resonance imaging of acute appendicitis in pregnancy: a 5-year multiinstitutional study. Am J Obstet Gynecol. 2015 Nov;213(5):693.e1-6. doi: 10.1016/j.ajog.2015.07.026. Epub 2015 Jul 26. PubMed PMID: 26215327.
Burns M, Hague CJ, Vos P, Tiwari P, Wiseman SM. Utility of Magnetic Resonance Imaging for the Diagnosis of Appendicitis During Pregnancy: A Canadian Experience. Can Assoc Radiol J. 2017 Nov;68(4):392-400. doi: 10.1016/j.carj.2017.02.004. Epub 2017 Jul 18. PubMed PMID: 28728903.
Duke E, Kalb B, Arif-Tiwari H, Daye ZJ, Gilbertson-Dahdal D, Keim SM, Martin DR. A Systematic Review and Meta-Analysis of Diagnostic Performance of MRI for Evaluation of Acute Appendicitis. AJR Am J Roentgenol. 2016 Mar;206(3):508-17. doi: 10.2214/AJR.15.14544. Review. PubMed PMID: 26901006.
Israel GM, Malguria N, McCarthy S, Copel J, Weinreb J. MRI versus ultrasound for suspected appendicitis during pregnancy. J Magn Reson Imaging. 2008 Aug;28(2):428-33. doi: 10.1002/jmri.21456. PubMed PMID: 18666160.
Kazemini A, Reza Keramati M, Fazeli MS, Keshvari A, Khaki S, Rahnemai-Azar A. Accuracy of ultrasonography in diagnosing acute appendicitis during pregnancy based on surgical findings. Med J Islam Repub Iran. 2017 Aug 29;31:48. doi: 10.14196/mjiri.31.48. eCollection 2017. PubMed PMID: 29445677; PubMed Central PMCID: PMC5804447.
Kereshi B, Lee KS, Siewert B, Mortele KJ. Clinical utility of magnetic resonance imaging in the evaluation of pregnant females with suspected acute appendicitis. Abdom Radiol (NY). 2018 Jun;43(6):1446-1455. doi: 10.1007/s00261-017-1300-7. PubMed PMID: 28849364.
Konrad J, Grand D, Lourenco A. MRI: first-line imaging modality for pregnant patients with suspected appendicitis. Abdom Imaging. 2015 Oct;40(8):3359-64. doi: 10.1007/s00261-015-0540-7. PubMed PMID: 26338256.
Lehnert BE, Gross JA, Linnau KF, Moshiri M. Utility of ultrasound for evaluating the appendix during the second and third trimester of pregnancy. Emerg Radiol. 2012 Aug;19(4):293-9. doi: 10.1007/s10140-012-1029-0. Epub 2012 Feb 28. PubMed PMID: 22370694.
Patel D, Fingard J, Winters S, Low G. Clinical use of MRI for the evaluation of acute appendicitis during pregnancy. Abdom Radiol (NY). 2017 Jul;42(7):1857-1863. doi: 10.1007/s00261-017-1078-7. PubMed PMID: 28194513.
Petkovska I, Martin DR, Covington MF, Urbina S, Duke E, Daye ZJ, Stolz LA, Keim SM, Costello JR, Chundru S, Arif-Tiwari H, Gilbertson-Dahdal D, Gries L, Kalb B. Accuracy of Unenhanced MR Imaging in the Detection of Acute Appendicitis: Single-Institution Clinical Performance Review. Radiology. 2016 May;279(2):451-60. doi: 10.1148/radiol.2015150468. Epub 2016 Jan 25. PubMed PMID: 26807893.
Onur OE, Guneysel O, Unluer EE, et al. Outpatient follow-up or active clinical observation in patients with nonspecific abdominal pain in the emergency department. A randomized clinical trial. Minerva Chir. 2008;63:915.
Shetty MK, Garrett NM, Carpenter WS, Shah YP, Roberts C. Abdominal computed tomography during pregnancy for suspected appendicitis: a 5-year experience at a maternity hospital. Semin Ultrasound CT MR. 2010 Feb;31(1):8-13. doi: 10.1053/j.sult.2009.09.002. Epub 2010 Jan 14. PubMed PMID: 20102691.
Segev L, Segev Y, Rayman S, Nissan A, Sadot E. The diagnostic performance of ultrasound for acute appendicitis in pregnant and young nonpregnant women: A case-control study. Int J Surg. 2016 Oct;34:81-85. doi: 10.1016/j.ijsu.2016.08.021. Epub 2016 Aug 20. PubMed PMID: 27554180.
Wi SA, Kim DJ, Cho ES, Kim KA. Diagnostic performance of MRI for pregnant patients with clinically suspected appendicitis. Abdom Radiol (NY). 2018 Jun 4. doi: 10.1007/s00261-018-1654-5. (Epub ahead of print) PubMed PMID: 29869102.

Evidence tables

Table of quality assessment for systematic reviews of diagnostic studies

Study

First author, year

Appropriate and clearly focused question?

Yes/no/unclear

Comprehensive and systematic literature search?

Yes/no/unclear

Description of included and excluded studies?

Yes/no/unclear

Description of relevant characteristics of included studies?

Yes/no/unclear

Assessment of scientific quality of included studies?

Yes/no/unclear

Enough similarities between studies to make combining them reasonable?

Yes/no/unclear

Potential risk of publication bias taken into account?

Yes/no/unclear

Potential conflicts of interest reported?

Yes/no/unclear

Duke, 2015

Yes

Included: yes, excluded: no

Yes

Basaran, 2009

Yes

Included: yes, excluded: no

Yes

Evidence table for systematic reviews of diagnostic test accuracy studies

Study reference

Study characteristics

Patient characteristics

Index test

(test of interest)

Reference test

Follow-up

Outcome measures and effect size

Comments

Duke, 2015

PS., study characteristics and results are extracted from the SR (unless stated otherwise)

SR and meta-analysis

Literature search up to October 2014

A: Theilen, 2015

B: Ramalingam, 2015

C: Fonseca, 2014

D: Rapp, 2013

E: Jang, 2011

F: Massalli, 2011

G: Vu, 2009

H: Pedrosa, 2009

I: Oto, 2009

J: Israel, 2008

K: Birchard, 2005

L: Incesu, 1997

Study design: cohort studies (3 prospective, 9 retrospective)

Setting and Country: Department of Medical Imaging, University of Arizona, USA

Source of funding and conflicts of interest:

Not reported

Inclusion criteria SR: Studies in which MRI was used for the diagnosis of appendicitis were included in our analysis if the pathologic findings, clinical follow-up, or both were used as the reference standard; if true-positive, true-negative, false-positive, and false-negative results, or sufficient data to calculate these values, were included; and if the reviewers of the MRI studies were blinded to the results of previous ultrasound or CT examinations. Articles that also categorized cases of appendicitis as inconclusive or equivocal were included if explicit criteria were provided to define why a result was inconclusive.

Exclusion criteria SR: case reports; non-peer-reviewed meeting abstracts or posters; studies in which there were no original cases of appendicitis, such as review articles or studies in which healthy volunteers underwent imaging to determine the rate of visualization of a healthy appendix; and studies in which diagnoses were made on the basis of ultrasound or CT findings that were subsequently evaluated by unblinded MRI reviewers.

30 studies included, of which 12 studies included pregnant patients

Important patient characteristics:

Number of patients

A: 171

B: 102

C: 31

D: 212

E: 18

F: 40

G: 19

H: 148

I: 118

J: 33

K: 29

L: 12

Mean age:

A: NR

B: 26.2

C: NR

D: 26

E: 31.7

F: 28

G: 31

H: 29

I: 24.7

J: 25.6

K: 25

L: 28

Describe index test and

cut-off point(s):

MRI

A: 1.5 T (Espree and Symphony, Siemens Healthcare)

B: 1.5 T (Achieva and Intera, Philips Healthcare)

C: NR

D: 1.5 T (Espree and Avanto, Siemens Healthcare) or 1.5 T (Signa and Excite HD, GE Healthcare)

E: 1.5 T (Achieva, Philips Healthcare)

F: 1.5 T (Avanto, Siemens Healthcare)

G: 1.5 T (Signa, GE Healthcare)

H: 1.5 T (Vision, Siemens Healthcare) or 1.5 T (Signa, GE Healthcare)

I: 1.5 T (Signa, GE Healthcare)

J: 1.5 T (Signa, GE Healthcare)

K: 1.5 T (Symphony, Sonata, or Vision, Siemens Healthcare)

L: 1.0 T (Harmony, Siemens)

The equivocal cases were counted as positive cases if the results of MRI reported that inflammatory changes were present in the right lower quadrant despite nonvisualization of the appendix, and they were counted as negative cases if the appendix was not visualized but no inflammatory changes were seen.

Describe reference test and cut-off point(s):

Pathologic findings, clinical follow-up, or both were used as the reference standard.

Prevalence (%)

A: 7.6

B: 7.8

C: 35.5

D: 4.5

E: 27.8

F: 12.5

G: 10.5

H: 9.5

I: 8.5

J: 15.2

K: 10.3

L: 25

Overall: 98/ 933 = 10.5%

For how many participants were no complete outcome data available?

Not reported

Endpoint of follow-up:

Not reported

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

0.94 (0.87–0.98)

I²: 0.2%

Outcome measure-2

Specificity

0.97 (0.96–0.98)

I²: 19.1 %

Outcome measure-3

PPV

Not reported

TP: 92

FP: 22

80.7%

Outcome measure-4

NPV

Not reported

TN: 813

FN: 6

99.3%

Study quality (ROB): With the use of the Quality Assessment Tool for Diagnostic Accuracy questionnaire, the mean quality score was 10 of 14, with a maximum score of 12 and a minimum score of 7.*

*Based on 30 included studies of which 12 among pregnant patients.

Place of the index test in the clinical pathway: MRI was used for the diagnosis of appendicitis.

Author’s conclusion

MRI has a high accuracy for the diagnosis of acute appendicitis, for a wide range of patients, and may be acceptable for use as a first-line diagnostic test.

Heterogeneity: There was variability in the MRI technique used, although nearly all studies included a combination of multiplanar T2-weighted imaging, most studies using imaging both with and without fat suppression but some using only non-fat-saturated or only fat-saturated T2-weighted imaging. Six studies included DWI, whereas another six studies involved imaging performed after IV administration of contrast medium.

Basaran, 2009

PS., study characteristics and results are extracted from the SR (unless stated otherwise)

SR and meta-analysis

Literature search up to August 2008

A: Wallace, 2008

B: Lazarus, 2007

C: Ames Castro, 2001

Study design: 1 cohort (C), 2 case-control (A, B), retrospective

Setting and Country: Department of Obstetrics and Gynecology, Kulu State Hospital, Ankara, Turkey

Source of funding and conflicts of interest:

The authors have disclosed that they have no financial relationships with or interests in any commercial companies pertaining to this educational activity; funding not reported.

Inclusion criteria SR: Surgical pathology and/or clinical follow-up were used as the diagnostic reference standard for appendicitis.

Exclusion criteria SR: Gravid patients with diagnoses other than suspected appendicitis were excluded, either because not all the imaging studies were ordered with a suspicion of appendicitis, or the indication of imaging was not specified. The intent of these exclusions was to obtain a more uniform set of patients with suspected appendicitis.

3 studies included

Important patient characteristics:

Number of patients

A: 86

B: 80

C: 7

Gestational age

A: NR

B: 5-40 weeks (mean 23)

C: 20-38 weeks (mean 28)

Describe index test and

cut-off point(s):

A: Helical CT

B: 3 different (1, 4, 16 detector row)

C: Helical CT

Describe reference test and cut-off point(s):

A: Surgical pathology and/or clinical follow-up

B: Surgical pathology and/or clinical follow-up

C: Surgical pathology and/or clinical follow-up

Prevalence (%)

A: 49/ 86 = 57%

B: 13/ 80 = 16.3%

C: 2/ 7 = 28.6%

Overall: 64/ 173 = 37%

For how many participants were no complete outcome data available?

Not reported

Endpoint of follow-up:

Not reported

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

0.86 (0.64-0.97)

I²: 46.4%

Outcome measure-2

Specificity

0.97 (0.86-1.00)

I²: 0%

Outcome measure-3

PPV

Calculated using prevalence

0.94

Outcome measure-4

NPV

Calculated using prevalence

0.93

Study quality (ROB): not reported.

The summary assessment of the diagnostic accuracy for CT was limited overall by the small number and retrospective nature of the studies, heterogeneity among study samples, and poor methodologic quality in the original studies.

Place of the index test in the clinical pathway: all patients have had an US before CT scanning.

Author’s conclusion

This review is limited by the small number and retrospective nature of the available studies.

With these limitations in mind, CT seems to be highly sensitive and specific for the diagnosis of appendicitis in pregnancy, and its use should be considered when the results of ultrasonography are normal or inconclusive and appendicitis is suspected.

Heterogeneity: Indication for imaging was suspected appendicitis in studies A and C and multiple indications for study B.

Personal remarks: SR is of poor quality

CI: confidence interval; CT: computed tomography; GA: gestational age; MRI: magnetic resonance imaging; NPV: negative predictive value; NR: not reported; PPV: positive predictive value; SD: standard deviation; SR: systematic review; US: ultrasound
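The predictive values in the table above are derived in two ways: from the pooled 2x2 counts (Duke) and from sensitivity, specificity and prevalence (Basaran). The following sketch reproduces both calculations with the standard formulas. The function names are hypothetical and chosen for illustration only; the numbers are the ones reported in the table.

```python
def predictive_values_from_counts(tp, fp, tn, fn):
    """PPV and NPV directly from the cells of a 2x2 table."""
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


def predictive_values_from_prevalence(sensitivity, specificity, prevalence):
    """PPV and NPV from sensitivity, specificity and prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Duke: pooled counts for MRI in pregnant patients, as listed in the table.
duke_ppv, duke_npv = predictive_values_from_counts(tp=92, fp=22, tn=813, fn=6)
print(f"Duke (MRI): PPV {duke_ppv:.1%}, NPV {duke_npv:.1%}")

# Basaran: pooled sensitivity and specificity for CT, overall prevalence 64/173.
basaran_ppv, basaran_npv = predictive_values_from_prevalence(
    sensitivity=0.86, specificity=0.97, prevalence=64 / 173
)
print(f"Basaran (CT): PPV {basaran_ppv:.2f}, NPV {basaran_npv:.2f}")
```

Because the sensitivity and specificity entered for Basaran are already rounded, the recomputed values can differ slightly in the second decimal from the tabulated ones.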

Risk-of-bias assessment diagnostic accuracy studies (QUADAS II, 2011)

Study reference

Patient selection

Index test

Reference standard

Flow and timing

Comments with respect to applicability

Israel, 2008

Was a consecutive or random sample of patients enrolled?

Yes

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes, but MR images were interpreted prospectively by one of five attending radiologists certified by the American Board of Radiology, with full knowledge of the clinical and US findings.

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

Was there an appropriate interval between index test(s) and reference standard?

Unclear

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Yes

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Yes

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: LOW

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: HIGH

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: UNCLEAR

CONCLUSION

Could the patient flow have introduced bias?

RISK: LOW

Kazemini, 2017

Was a consecutive or random sample of patients enrolled?

Yes

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Yes

Was there an appropriate interval between index test(s) and reference standard?

Unclear

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Yes

Were all patients included in the analysis?

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: LOW

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Segev, 2016

Was a consecutive or random sample of patients enrolled?

Yes

Was a case-control design avoided?

Did the study avoid inappropriate exclusions?

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Yes

Were all patients included in the analysis?

Are there concerns that the included patients do not match the review question?

Yes

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Yes

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: LOW

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: LOW

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: UNCLEAR

CONCLUSION

Could the patient flow have introduced bias?

RISK: HIGH

Lehnert, 2012

Was a consecutive or random sample of patients enrolled?

Unclear

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: LOW

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: UNCLEAR

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Wi, 2018

Was a consecutive or random sample of patients enrolled?

Unclear

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Unclear

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Unclear

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Unclear

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: UNCLEAR

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Kereshi, 2018

Was a consecutive or random sample of patients enrolled?

Unclear

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Unclear

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: LOW

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Burns, 2017

Was a consecutive or random sample of patients enrolled?

Yes

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Unclear

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Unclear

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Yes, 52 patients received both US and MRI.

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Patel, 2017

Was a consecutive or random sample of patients enrolled?

Unclear

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Yes

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: LOW

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: LOW

Konrad, 2015

Was a consecutive or random sample of patients enrolled?

Unclear

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Yes

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: LOW

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: LOW

Burke, 2015

Was a consecutive or random sample of patients enrolled?

Yes

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Unclear

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: LOW

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Shetty, 2010

Was a consecutive or random sample of patients enrolled?

Unclear

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Unclear

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Unclear

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Unclear

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Yes, 23 patients received CT+US

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Amitai, 2016

Was a consecutive or random sample of patients enrolled?

Unclear

Was a case-control design avoided?

Yes

Did the study avoid inappropriate exclusions?

Unclear

Were the index test results interpreted without knowledge of the results of the reference standard?

Yes

If a threshold was used, was it pre-specified?

Yes

Is the reference standard likely to correctly classify the target condition?

Yes

Were the reference standard results interpreted without knowledge of the results of the index test?

Was there an appropriate interval between index test(s) and reference standard?

Unclear

Did all patients receive a reference standard?

Yes

Did patients receive the same reference standard?

Were all patients included in the analysis?

Unclear

Are there concerns that the included patients do not match the review question?

Are there concerns that the index test, its conduct, or interpretation differ from the review question?

Are there concerns that the target condition as defined by the reference standard does not match the review question?

CONCLUSION:

Could the selection of patients have introduced bias?

RISK: UNCLEAR

CONCLUSION:

Could the conduct or interpretation of the index test have introduced bias?

RISK: LOW

CONCLUSION:

Could the reference standard, its conduct, or its interpretation have introduced bias?

RISK: LOW

CONCLUSION

Could the patient flow have introduced bias?

RISK: UNCLEAR

Evidence table for diagnostic test accuracy studies

Study reference

Study characteristics

Patient characteristics

Index test

(test of interest)

Reference test

Follow-up

Outcome measures and effect size

Comments

Israel, 2008

Type of study: cohort study

Setting: Department of radiology at Yale New Haven Hospital

Country: USA

Conflicts of interest:

Not reported

Inclusion criteria: Between July 2004 and June 2006, 45 pregnant patients were referred for imaging to the department of radiology at Yale New Haven Hospital for a clinical suspicion of appendicitis. The decision to obtain imaging was made by the patient’s primary care physician or the emergency room physician.

Exclusion criteria:

Of the 45 patients nine underwent US without MRI and three underwent MRI without US.

N=33

Prevalence: 15%

Age: range: 18–36 years, mean: 25.6 years

Other important characteristics: A total of 12 patients were in the first trimester, 16 in the second trimester, and five in the third trimester.

Describe index test:

US vs MRI

US was performed using a 5-MHz to 10-MHz linear transducer, on either a Philips iu-22 or HDI 5000 unit using the graded compression technique described by Puylaert.

Cut-off point(s): The criterion for establishing the diagnosis of acute appendicitis was direct visualization of a non-compressible appendix with a diameter of 6 mm or more at the point of maximal tenderness with or without the presence of an appendicolith, surrounding inflammation, or abscess formation. The diagnostic criteria for negative findings on sonography were visualization of a normal compressible appendix less than 6 mm in diameter, with no evidence of inflammation, phlegmon, or abscess. US exams in which the appendix was not visualized were considered indeterminate.

Comparator test: MRI

Patients were imaged in the supine position without sedation, anesthesia, or intravenous or oral contrast. In all patients, a single-shot fast spin-echo (SSFSE) sequence (TR = 1800–3200 msec, TE = ∞, matrix size = 256 x 256, slice thickness 5 mm, interslice gap = 0.5 mm, field of view = 32–40 cm) in the axial, sagittal, and coronal planes through the abdomen and pelvis (localized to the right lower quadrant) was performed on a 1.5T MRI scanner (Signa; General Electric, Milwaukee, WI, USA). At the discretion of the interpreting attending radiologist, a fat-suppressed T2-weighted fast spin-echo (FSE) sequence (TR = 5000 msec, TE = 100 msec, number of excitations (NEX) = 2, echo train length = 16, matrix size = 256 x 192, slice thickness = 5 mm, interslice gap = 0.5–10 mm, field of view = 24–36 cm) was performed in the axial plane in 16 patients. In addition, five patients also underwent T1-weighted imaging in the axial plane.

Cut-off point(s): A normal appendix was diagnosed if the appendix measured 6 mm or less and was without wall thickening or periappendiceal fluid. Appendicitis was diagnosed when the appendiceal diameter was ≥7 mm and the lumen of the appendix was distended and fluid-filled and/or there was evidence of surrounding inflammation manifested by increased signal intensity of periappendiceal tissues on T2-weighted sequences. The appendix may or may not have contained appendicoliths (defined as focal low signal intensity filling defects on T2-weighted images). A nonvisualized appendix without inflammation or fluid in the expected location of the appendix was considered indeterminate.

Describe reference test:

Each patient’s medical record was reviewed. Final pathology reports were used for disease confirmation in patients who underwent appendectomy. For patients who did not undergo surgery, the medical records were evaluated for alternative diagnoses.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

US 20%

MRI 80%

Outcome measure-2

Specificity

US 100%

MRI 100%

Outcome measure-3

PPV

US 100%

MRI 100%

Outcome measure-4

NPV

US 99%

MRI 97%

Outcome measure-5

Inconclusive results

US

In 29 patients (88%; 29/33), the US was interpreted as indeterminate (an appendix was not identified).

MRI

In 16 patients (48%; 16/33), the MRI was interpreted as indeterminate (an appendix was not identified).

Author’s conclusion

Based on a relatively small number of true positives, our data suggests that MRI is very useful for the diagnosis and exclusion of appendicitis in pregnant women.

Personal remarks

MR images were interpreted prospectively by one of five attending radiologists certified by the American Board of Radiology, with full knowledge of the clinical and US findings.

Kazemini, 2017

Type of study: cohort study

Setting: Imam Khomeini hospital, Tehran

Country: Iran

Conflicts of interest:

Inclusion criteria: pregnant women admitted to the emergency department, who were highly suspected of having acute appendicitis and underwent surgical exploration from January 2014 to January 2016

Exclusion criteria: Records that did not include pathological assessment or were incomplete were excluded. Those patients treated conservatively were also excluded.

N=58

Prevalence: 65.5%

Mean age ± SD: 29.1±4.94 years

Other important characteristics: There were 37 (63.8%), 15 (25.9%), and 6 (10.3%) patients in the first, second, and third trimesters of pregnancy, respectively. The mean gestational age of patients was 13±8.96 weeks.

Describe index test:

Cut-off point(s): at least one of the Puylaert criteria had to be present, which are as follows: (1) non-compressible, swollen appendix with a diameter greater than 7 mm and wall thickness greater than 3 mm; (2) appendicolith; (3) an increase and hyperechogenicity of periappendiceal fat; (4) lack of normal wall layer; (5) appendiceal abscess; and (6) periappendiceal fluid collection.

Describe reference test:

Surgically resected samples were evaluated and confirmed through histological evaluation.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

63 pregnant patients underwent surgery, while 5 were excluded due to their

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

US 80%

Outcome measure-2

Specificity

US 75%

Outcome measure-3

PPV

US 91.4%

Outcome measure-4

NPV

US 52.9%

Outcome measure-5

Inconclusive results

4 (6.9%)

Author’s conclusion

Ultrasonography is the initially preferred imaging modality in pregnant women suspected of having acute appendicitis with an acceptable sensitivity; however, application of other imaging modalities such as CT scan or MRI is recommended after inconclusive ultrasonography results.

Segev, 2016

Type of study: case-control

Setting: tertiary medical center

Country: Israel

Conflicts of interest: None

Inclusion criteria: Prospective data recorded in the database of a tertiary medical center were retrospectively reviewed to identify all pregnant women who underwent appendectomy for presumed acute appendicitis in 2000-2014. The control group consisted of nonpregnant women of reproductive age who underwent appendectomy at the same medical center in 2004-2007. Only patients who underwent ultrasound scanning as part of their preoperative evaluation were included in the analysis.

Exclusion criteria: -

N Pregnant: 92*

*Only results for the pregnant women were extracted

Prevalence: 29%

Mean age: 28 years

Other important characteristics: median gestational age, 19 weeks (IQR: 14-26 weeks).

Describe index test:

Cut-off point(s):

Ultrasound reports were categorized as positive if they contained a radiologic diagnosis of “appendicitis” or “perforated appendicitis”, based on visualization of an abnormal appendix, with or without secondary signs (free fluid, pericecal inflammatory fat stranding, phlegmon, et cetera). Reports lacking a definitive diagnosis of appendicitis (“appendix not visualized”, “equivocal”, “further evaluation needed” and “other”) were considered negative.

Describe reference test:

Final pathologic diagnosis

Cut-off point(s):

Pathology reports were defined as positive if they reported acute appendicitis (simple, gangrenous, or perforated appendicitis or periappendicular abscess formation), and negative if no acute inflammation was found (i.e., negative appendectomy).

Time between the index test and reference test:

For how many participants were no complete outcome data available?

Preoperative diagnostic ultrasound scans were performed in 67 patients in the pregnant group (73%); only these patients were included in further analysis.

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

65%

Outcome measure-2

Specificity

86%

Outcome measure-3

PPV

94%

Outcome measure-4

NPV

40%

Outcome measure-5

Inconclusive results

Author’s conclusion

There appears to be no difference in the ability of ultrasound to predict the diagnosis of acute appendicitis between pregnant women and nonpregnant women of reproductive age. Therefore, similar preoperative imaging algorithms may be used in both patient populations.

Lehnert, 2012

Type of study: retrospective cohort study

Setting: Harborview Medical Center, University of Washington

Country: USA

Conflicts of interest: not reported

Inclusion criteria: imaging records from 99 consecutive pregnant women over the age of 16 from 2001 to 2011 who presented during the second (≥14 weeks gestation) or third trimester for right lower quadrant (RLQ) ultrasound to evaluate the appendix.

Exclusion criteria: Nonpregnant patients, pregnant patients at less than 14 weeks gestational age, and patients with a history of appendectomy were excluded.

N=99

Prevalence: 7.1%

Mean age ± SD: 28 years ±6.6 years

Other important characteristics: Gestational age ranged from 14 to 38 weeks. The mean gestational age at presentation was 23 weeks (±7 weeks)

Describe index test:

Ultrasounds were performed using a 5–12 MHz linear transducer when possible or a 1–5 MHz curved array transducer if necessary in larger patients. The graded compression technique described by Puylaert was employed. Images were recorded in real time by the sonographers and presented for radiologist review on an ultrasound Picture Archiving and Communication System.

Cut-off point(s):

The criteria for establishing the diagnosis of acute appendicitis at ultrasound was visualization of a non-compressible appendix measuring ≥7 mm in diameter. A negative examination was confirmed by visualization of a compressible appendix <7 mm in diameter. US exams in which the appendix was not visualized were considered nondiagnostic for appendicitis.

Describe reference test: Operative notes and pathology results were reviewed to determine if the appendix was inflamed when resected. Pathologic confirmation of appendix inflammation served as the gold standard for appendicitis. The medical records were reviewed for patients who were not managed surgically to confirm that the diagnosis of appendicitis was not subsequently made.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

Outcome measure-2

Specificity

Outcome measure-3

PPV

Outcome measure-4

NPV

Outcome measure-5

Inconclusive results

96 (97%)

Author’s conclusion

In summary, ultrasound of the RLQ in the second and third trimester of pregnancy is of limited utility due to infrequent visualization of the appendix. This modality as first-line imaging may result in unnecessary cost and delay in diagnosis in this patient population.

Wi, 2018

Type of study: retrospective cohort study

Setting: Bundang Medical Center

Country: Korea

Conflicts of interest: None

Inclusion criteria: 125 pregnant patients with clinically suspected appendicitis underwent 1.5 T MRI to diagnose or exclude acute appendicitis.

Exclusion criteria: -

N=125

Prevalence: 19.2%

Age: 20 to 44 years; mean 32 years

Other important characteristics: first trimester, n = 6; second trimester, n = 89; third trimester, n = 30

Describe index test:

MRI

MRI was performed using a 1.5 Tesla system (Magnetom Sonata; Siemens, Erlangen, Germany) with a six-element phased-array surface coil.

Describe reference test:

Patients were classified as pathologically acute appendicitis positive or pathologically acute appendicitis negative according to the clinical records.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

100%

Outcome measure-2

Specificity

95%

Outcome measure-3

PPV

Calculated: 96%

Outcome measure-4

NPV

Calculated: 100%

Outcome measure-5

Inconclusive results

Author’s conclusion

MRI has high accuracy for the diagnosis of acute appendicitis in pregnant patients. Therefore, MRI is recommended for use as a first-line diagnostic test for pregnant patients with clinically suspected appendicitis.

Kereshi, 2018

Type of study: retrospective cohort study

Setting: Beth Israel Deaconess Medical Center, tertiary care academic center

Country: USA

Conflicts of interest: None

Inclusion criteria: A retrospective cohort study was conducted of pregnant women who underwent unenhanced MRI for suspected appendicitis at our tertiary care academic center between January 1st, 2010 and September 1st, 2015. A query was initiated on PACS for all MRI abdomen and pelvis exams performed without intravenous contrast during this time period. Based on the exam indication and information in the electronic medical record, the exams were selected to be part of the study if the patient was pregnant and was having the MRI performed to evaluate for acute appendicitis.

Exclusion criteria: -

N=204

Prevalence: 7.4%

Age: 17–47 years old, mean 29 years old

Other important characteristics: 65 (32%) patients were in their first trimester (0–12 weeks), with a range from 3 to 12 weeks; 104 (51%) patients were in their second trimester (13–27 weeks), with a range from 13 to 27 weeks; and 35 (17%) patients were in their third trimester (28 weeks to birth), with a range from 28 to 37 weeks.

Describe index test:

MRI

Noncontrast exam on a 1.5T scanner (Espree or Symphony, Siemens Medical Solutions, Iselin, NJ; or Signa Twinspeed or Signa Excite; GE Healthcare)

Because of manufacturer discontinuation, the use of oral contrast agent was stopped part way through the study. 136 MRI scans were performed with the oral administration of contrast, and appendix was visualized in 121 of these (89.0%). 68 MRI scans did not use oral contrast, and appendix was visualized in 55 of these (80.9%).

Additionally, if the patient had a preceding US from our institution or an outside hospital, the US reports were reviewed to determine if the appendix was visualized, if appendicitis was present, and if there were any additional findings on US that could explain the patient’s pain.

Cut-off point(s):

On MRI, appendicitis is typically diagnosed after visualization of a distended appendiceal lumen greater than or equal to 7 mm containing T2 hyperintense fluid. Peri-appendiceal inflammatory fat stranding supports the diagnosis of acute appendicitis, and is most evident as T2 hyperintense signal around the appendix, particularly on fat suppression sequences. When visualized, appendiceal hyperintense signal on diffusion-weighted imaging also supports the diagnosis of an acute inflammatory process. A T2 hyperintense wall suggestive of edema with thickening is also indicative of appendicitis, and comparison can be made to the wall of other non-inflamed bowel in the abdomen for reference.

Describe reference test:

Information collected on each patient from the electronic medical record included the final clinical diagnosis, surgical notes, surgical pathology, and if the patient was seen by obstetrics, surgery, or emergency physicians during their workup.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

Of the 212 MRI scans, 8 were excluded due to equivocal MRI or lack of follow-up.

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

100%

Outcome measure-2

Specificity

99.5%

Outcome measure-3

PPV

93.3%

Outcome measure-4

NPV

100%

Outcome measure-5

Inconclusive results

6 MRIs: equivocal findings

28 MRIs: appendix was not visualized

Author’s conclusion

Our large study cohort of pregnant patients confirms MRI to be of high diagnostic value in the workup of acute appendicitis with 100% NPV and sensitivity and 99.5% specificity. Furthermore, an alternative diagnosis for abdominal pain in this patient population can be made in nearly half of MRI exams which are deemed negative for appendicitis.

Burns, 2017

Type of study: retrospective cohort study

Setting: St Paul’s Hospital & University of British Columbia

Country: Canada

Conflicts of interest: not reported

Inclusion criteria: MRI scans performed at our institution, between 2006 and 2012, for the evaluation of suspected appendicitis in pregnant women.

Exclusion criteria: -

N=63

52 patients underwent abdominal US prior to MRI.

Prevalence: 20.6%

Mean age: 31 (range 19-41)

Other important characteristics: GA 22 (range 5-41) weeks, Nine (12.9%) patients presented during the first trimester of pregnancy, 38 (54.2%) patients presented during the second trimester, and 24 (34.3%) patients presented during the third trimester.

Describe index test:

MRI and/or US

All MRI scans were performed on a General Electric Signa HD 1.5T system (GE Healthcare, Milwaukee, WI). An 8-channel body coil was used. No oral or intravenous contrast was administered.

When US was performed in our department prior to MRI, a variety of different US machines were used.

Cut-off point(s):

The MRI was considered diagnostic of appendicitis if the appendix was dilated (>7 mm), or if the appendix was normal in diameter (6-7 mm) with wall thickening or periappendiceal inflammatory changes.

The US was considered diagnostic for appendicitis if the appendix was conclusively identified as a blind-ending tubular structure, and was non-compressible with a diameter exceeding 7 mm.

Describe reference test:

The final diagnosis of acute appendicitis was made based upon histopathological examination following appendectomy. For cases in which operative management was not undertaken, appendicitis was excluded if the patient did not re-present with appendicitis during the same pregnancy.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

MRI 75%

US: 54.5%

Outcome measure-2

Specificity

MRI: 100%

US: 97.7%

Outcome measure-3

PPV

MRI: 100%

US: 85.7%

Outcome measure-4

NPV

MRI: 93.2%

US: 89.4%

Outcome measure-5

Inconclusive results

appendix was not visualized

MRI: 28 (44.4%)

US: 48 (88.9%)

Author’s conclusion

MRI is sensitive and highly specific for the diagnosis of appendicitis during pregnancy and should be considered as a first line imaging study for this clinical presentation.

Patel, 2017

Type of study: retrospective cohort study

Setting: Edmonton

Country: Canada

Conflicts of interest: None

Inclusion criteria: A database search was conducted on the citywide radiology Picture Archiving and Communication System (PACS) for all MRI examinations performed for suspected appendicitis in pregnant patients between January 1, 2008 and August 30, 2015. The search yielded 47 MRI examinations in 47 patients.

Exclusion criteria: -

N=42

Prevalence: 11.9%

Mean age ± SD: 25.5 ± 5.4 years

Other important characteristics:

13 (31%) patients were in the first trimester, 22 (52.4%) patients were in the second trimester, and 7 (16.6%) patients were in the third trimester.

Describe index test:

US and MRI

All MRI examinations were performed on a 1.5-T clinical system with a body matrix coil placed over the torso with the patient positioned supine on the MR table. The examination time was approximately 20–25 min. The standard MRI protocol for the evaluation of suspected appendicitis in the pregnant patient involved a non-contrast study.

All patients had undergone an US examination prior to MRI, and the US images and reports were also reviewed on PACS.

Describe reference test:

In patients that underwent surgical intervention for suspected appendicitis, the standard of reference was the pathological data or operative findings at the time of surgery. In patients that did not undergo surgery, the standard of reference included the clinical and laboratory data obtained from the medical charts at the time of hospital admission including any relevant clinical follow-up in the subsequent 6 months.

Time between the index test and reference test:

6 months

For how many participants were no complete outcome data available?

5 patients were excluded from the study as their MRI examination was performed at an outside hospital, for which institutional approval was not obtained.

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

88.1%

Outcome measure-2

Specificity

91.9%

Outcome measure-3

PPV

91.9%

Outcome measure-4

NPV

50%

Outcome measure-5

Inconclusive results

Author’s conclusion

MRI is an excellent modality for excluding acute appendicitis in pregnant patients presenting with right lower quadrant pain.

Konrad, 2015

Type of study: retrospective cohort study

Setting: We searched the radiology database of the largest women’s hospital in the region to identify all pregnant patients who underwent US and/or MRI imaging for clinically suspected appendicitis between January 2009 and January 2011

Country: USA

Conflicts of interest: None

Inclusion criteria: pregnant patients who underwent US and/or MRI imaging for clinically suspected appendicitis between January 2009 and January 2011

Exclusion criteria: lack of recorded clinical data

N=140

N MRI: 114

N US: 117

Prevalence: 11%

Mean age ± SD:

Other important characteristics: Average GA of all patients was 19 weeks.

Describe index test:

US and/or MRI

8 patients in whom MRI was performed for confirmation of US findings, 83 patients in whom US was attempted but the appendix was not visualized, and 23 patients who underwent MRI as the primary imaging modality. All data are based upon the original prospective US and MRI interpretation.

Patients were scanned in the supine position from the kidneys through the pelvis using a 1.5 T SIEMENS Symphony magnet (Siemens; Munich, Germany) with a surface array coil.

Cut-off point(s):

The appendix was considered normal if the anteroposterior diameter measured <6 mm, if there was a lack of surrounding edema, or if a blind-ending, tubular structure filled with air or feces was identified. Patients were considered to have appendicitis if the appendix was incompressible, if the anteroposterior diameter was >6 mm, if there was periappendiceal edema (hypoechogenicity on US or bright T2 signal on MRI) or if an abscess was identified.

Describe reference test:

Surgical pathology in patients who underwent surgery. The electronic medical record was used to assess clinical outcomes in patients who did not undergo surgery.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

3 patients were excluded from the study due to the lack of recorded clinical data.

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

MRI: 100%

US: 18%

Outcome measure-2

Specificity

MRI: 98%

US: 99%

Outcome measure-3

PPV

MRI: 89%

US: 66%

Outcome measure-4

NPV

MRI: 100%

US: 92%

Outcome measure-5

Inconclusive results

MRI: non-visualization: 23 (20%)

US: non-visualization: 109 (93.2%)

Author’s conclusion

Given the low likelihood of visualization of the appendix at US, the excellent accuracy of MRI and the ability of MRI to identify alternate diagnoses, we suggest that at certain institutions MRI may be considered a first-line imaging modality for pregnant patients of any GA with suspected appendicitis

Burke, 2015

Type of study: retrospective cohort study

Setting: University of North Carolina at Chapel Hill, Duke University Medical Center, Hospital of the University of Pennsylvania, University of Southern California, University of California at San Diego, Northwestern University, University of Chicago, University of Nebraska, McGill University Health Center, and Georgetown University

Country: USA

Conflicts of interest: One author was a consultant and one author is a consultant for Siemens.

Inclusion criteria: pregnant women who underwent MRI evaluation of abdominal or pelvic pain and who had clinical suspicion of acute appendicitis between June 1, 2009, and July 31, 2014

Exclusion criteria: Patients with MRI findings of acute appendicitis without surgical follow up were excluded from review.

N=709

Prevalence: 9.3%

Mean age ± SD: 27.8 ± 6.2 years

Other important characteristics: Gestational age ranged from 1 to 39 weeks, with a mean ± SD of 17 ± 8.5 weeks.

Describe index test:

MRI

All MRIs were performed on 1.5T magnets without the use of intravenous gadolinium-based contrast agents. Imaging protocols varied by institution but generally consisted of fat-suppressed HASTE imaging (half-Fourier acquisition single-shot turbo spin echo) and fat-suppressed T1-weighted imaging in all 3 planes (coronal, sagittal, and axial).

Cut-off point(s):

All MRI examinations were reviewed individually by an abdominal radiologist for the visualization of the appendix and imaging findings of appendicitis (appendiceal dilation, appendicolith, free fluid, and fat-stranding).

MRI criteria for the diagnosis of acute appendicitis included a dilated appendix that measured ≥7 mm with signs of periappendiceal inflammation.

Patients with MRI findings that were equivocal for acute appendicitis were considered to be positive for acute appendicitis.

Describe reference test:

Clinical records were reviewed, with pathologic confirmation of appendicitis when applicable.

In centers where ultrasound scanning was performed as the first line of imaging, ultrasound results were also recorded (positive, negative, or nondiagnostic/nonvisualization of the appendix) (192/709; 24.5%). Most ultrasound scans (174/192; 90.6%) were read as nondiagnostic or nonvisualization of the appendix.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

Five patients with MRI findings of acute appendicitis were excluded from data analysis because of the absence of pathologic confirmation.

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

86.8%

Outcome measure-2

Specificity

99.2%

Outcome measure-3

PPV

92.4%

Outcome measure-4

NPV

99.7%

Outcome measure-5

Inconclusive results

non-visualization: 207 (29.2%)

Author’s conclusion

MRI is useful and reproducible in the diagnosis of suspected acute appendicitis during pregnancy.

Shetty, 2010

Type of study: retrospective cohort study

Setting: Woman’s Hospital of Texas

Country: USA

Conflicts of interest: not reported

Inclusion criteria: all pregnant women at the Woman’s Hospital of Texas who presented with acute right lower quadrant pain in whom there was a strong clinical suspicion of acute appendicitis on the basis of presence of fever, elevated white cell count, and physical examination, and who subsequently underwent abdominal CT scan evaluation. The period of study was from August 2002 to August 2007.

Exclusion criteria: -

N=39

US: 12

CT: 4

US+CT: 23

Prevalence: 12.8%

Mean age: 31 (range 18-43)

Other important characteristics: Sixteen patients were in their third trimester, 10 in their second, and 1 was in her first trimester.

Describe index test:

CT and/or US

The CT hardware used during April 2006 was a single-detector row scanner (CT FXI; GE Healthcare, Waukesha, WI); beginning May 2006 through the end of the study period, a 16-detector row scanner (GE LightSpeed, 16VFX, GE Healthcare, Waukesha, WI) was used.

All patients received 2 cc/kg of intravenously administered iopamidol (Bracco Diagnostics; 300 mg of iodine per mL) at the rate of 3 cc/sec. Gastrografin (diatrizoate meglumine and diatrizoate sodium solution USP; Bracco Diagnostics) was used as an oral contrast agent in all patients.

Describe reference test:

Patients’ charts were reviewed to document the following: patient’s age, gestational age at the time of diagnosis, time from onset of symptoms to performance of the CT scan, surgical consultation, operative notes, and pathologic diagnosis in those who underwent surgical intervention. In those patients who did not have surgery the clinical course was documented from the discharge summary.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

US: 46.1%

CT: 100%

Outcome measure-2

Specificity

US: 95.4%

CT: NR

Outcome measure-3

PPV

US: NR

CT: NR

Outcome measure-4

NPV

US: NR

CT: NR

Outcome measure-5

Inconclusive results

US: NR

CT: NR

Author’s conclusion

CT provides an accurate diagnosis in patients suspected to have acute appendicitis and is of value in avoiding false negative exploratory laparotomy with its consequent risk of maternal and fetal mortality and morbidity. Although sonography is the preferred initial imaging modality because of its lack of ionizing radiation, CT is more accurate in providing a timely diagnosis and its use is justified to reduce maternal morbidity and mortality in patients with appendicitis.

Amitai, 2016

Type of study: retrospective cohort study

Setting: emergency department (ED) of a tertiary medical center.

Country: Israel

Conflicts of interest: not reported

Inclusion criteria:

Exclusion criteria:

N=49

Prevalence: 10%

Age range: 19–42 years

Other important characteristics: GA 6-37 weeks. Three (6%) had twin pregnancies

Describe index test:

US+MRI

When a pregnant woman with abdominal pain was admitted to the emergency department she was examined by a gynecologist or a surgeon. If appendicitis was suspected, the woman was referred for ultrasound examination. If the ultrasound did not trace a tubular structure consistent with acute appendicitis, the surgeon coordinated an MRI study with the on-call staff radiologist in order to avoid a futile operation. If the ultrasound examination was suspicious for acute appendicitis the patient was referred for MRI to confirm the diagnosis.

All scans were performed using a 1.5T whole-body MR scanner (GE Excite, General Electric, Milwaukee, WI, USA) equipped with high performance gradients, using manufacturer-supplied 8-channel cardiac coils.

A modified MRI protocol adapted to pregnancy was formulated. The imaging protocol included the following non-contrast scans: T2 SSFSE coronal, SSFP coronal, SSFP axial, T2 SSFSE sagittal, T2 FS axial, T2 axial, T1 SPGR axial, and T1 SPGR coronal.

A potential pitfall in the diagnosis of a dilated appendix was the presence of a dilated ovarian vein, a common finding in pregnancy, which is adjacent to the appendix in a pregnant woman. Therefore, a magnetic resonance venography (MRV) (time-of-flight acquisition) sequence was added after 24 studies for better differentiation between dilated ovarian veins and the appendix. However, it did not prove beneficial in any of the 25 remaining scans. Oral ingestion of 1.5 liters of mannitol 5% became part of the modified protocol 1 hour before the MRI, as we realized that it facilitated the detection of the cecum, the ileocecal valve and the appendix.

Cut-off point(s):

Positive MRI findings for appendicitis included: total diameter exceeding 7 mm, appendicular wall thickness > 2 mm, high mural T2 signal secondary to mural edema, and peri-appendicular fat stranding.

Describe reference test:

The imaging diagnoses were compared with operative findings as the primary reference standard or with clinical follow-up as the secondary reference standard.

Time between the index test and reference test:

For how many participants were no complete outcome data available?

Outcome measures and effect size (include 95%CI and p-value if available):

Outcome measure-1

Sensitivity

Outcome measure-2

Specificity

Outcome measure-3

PPV

83.3% (95% CI 35.9 to 99.6)

Outcome measure-4

NPV

100% (95% CI 91.9 to 100)

Outcome measure-5

Inconclusive results

1 (2%)

In 11 women (22%) the appendix was not identified on MRI and no other signs of inflammation were seen; these studies were interpreted as negative.

Author’s conclusion

Creation of an around-the-clock imaging service using abdominal MRI with the establishment of a workflow chart using a dedicated MR protocol is feasible. It provides a safe way to rule out appendicitis and to avoid futile surgery in pregnant women.

CI: confidence interval; CT: computed tomography; GA: gestational age; MRI: magnetic resonance imaging; MRV: magnetic resonance venography; NPV: negative predictive value; NR: not reported; PPV: positive predictive value; SD: standard deviation; SR: systematic review; US: ultrasound
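
To illustrate how an interval such as the one reported for the PPV can be obtained, the following minimal sketch computes an exact (Clopper-Pearson) binomial confidence interval. The counts used (5 true positives among 6 positive MRI readings) are an assumption inferred from the reported 83.3% and are not stated in the table; the choice of the Clopper-Pearson method is likewise an assumption about how the interval was derived.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(f, iterations=100):
    """Bisection on (0, 1) for a function f that increases through zero."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion x/n."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _root(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _root(
        lambda p: alpha / 2 - binom_cdf(x, n, p)
    )
    return lower, upper


if __name__ == "__main__":
    # Hypothetical counts (assumption): 5 confirmed cases among 6 positive MRIs.
    true_positives, test_positives = 5, 6
    low, high = clopper_pearson(true_positives, test_positives)
    print(f"PPV = {true_positives / test_positives:.1%}, "
          f"95% CI {low:.1%} to {high:.1%}")
```

Under these assumed counts the sketch gives an interval of roughly 35.9% to 99.6%. The width of that interval reflects how few positive readings the estimate rests on.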

Accountability

Review date and validity

Last reviewed: 01-07-2019

Module	Responsible party(ies)	Year of authorization	Next review of guideline currency	Frequency of currency review	Who monitors currency	Relevant factors for changes to the recommendation
Diagnostics in adults	NVvH and NVvR	2019	2024	Once every 5 years	NVvH and NVvR	-

The working group has not been kept in place to assess whether this guideline is still current. No later than 2024, the board of the Nederlandse Vereniging voor Heelkunde will determine whether the modules of this guideline are still current. A maintenance plan has been described at module level. When drafting the guideline, the working group estimated for each module the maximum period within which reassessment must take place, and formulated any points of attention relevant to a future revision (update). The validity of the guideline lapses earlier if new developments give reason to start a revision process.

The Nederlandse Vereniging voor Heelkunde is the responsible party for this guideline and bears primary responsibility for assessing whether it is still current. The other scientific societies participating in this guideline, and users of the guideline, share this responsibility and inform the responsible party about relevant developments within their field.

Initiative and authorization

Initiative:

Nederlandse Vereniging voor Heelkunde

Authorized by:

Nederlandse Vereniging van Spoedeisende Hulp Artsen
Nederlandse Vereniging voor Heelkunde
Nederlandse Vereniging voor Kindergeneeskunde
Nederlandse Vereniging voor Medische Microbiologie
Nederlandse Vereniging voor Obstetrie en Gynaecologie
Nederlandse Vereniging voor Pathologie
Nederlandse Vereniging voor Radiologie
Patiëntenfederatie Nederland

General information

The development of the guideline was supported by the Kennisinstituut van de Federatie Medisch Specialisten (https://www.demedischspecialist.nl/kennisinstituut) and was funded by the Kwaliteitsgelden Medisch Specialisten (SKMS). The funder had no influence whatsoever on the content of the guideline.

Aim and target group

Aim

This guideline is intended to establish an evidence-based policy for the care of patients with acute appendicitis in secondary care.

Target group

This guideline is written for all members of the professional groups involved in the care of patients with acute appendicitis, in both children and adults. These include surgeons, pediatric surgeons, radiologists, pediatricians, gynecologists and emergency physicians. A secondary target group consists of primary care providers involved in the care of patients with acute appendicitis, including general practitioners, nurse specialists and physician assistants.

Composition of the working group

For the revision of the guideline, a multidisciplinary working group was established in 2017, consisting of representatives of all relevant specialties involved in the care of patients with acute appendicitis.

Working group:

Dr. C.C. van Rossem, gastrointestinal surgeon, Maasstad Ziekenhuis, on behalf of NVvH (chair)
Drs. A.L. van den Boom, fellow in gastrointestinal surgery, UMCG, on behalf of NVvH
Drs. W.J. Bom, research physician in surgery, Amsterdam UMC, location AMC, on behalf of NVvH
Drs. M.E. Bos, resident in emergency medicine, VUmc region, location Westfriesgasthuis, on behalf of NVSHA
Dr. A.A.W. van Geloven, gastrointestinal surgeon, Tergooi, on behalf of NVvH
Dr. R.R. Gorter, fellow in pediatric surgery, Amsterdam UMC, on behalf of NVvH
Dr. B.C. Jacod, gynecologist-perinatologist, OLVG, on behalf of NVOG
Drs. M. Knaapen, research physician in pediatric surgery, Amsterdam UMC, on behalf of NVvH
R. Lammers, MSc, policy adviser, Patiëntenfederatie Nederland
Drs. A.H.J. van Meurs, general pediatrician, HagaZiekenhuis, on behalf of NVK
Dr. J. Nederend, radiologist, Catharina Ziekenhuis Eindhoven, on behalf of NVvR
Dr. J.B.C.M. Puylaert, radiologist, Haaglanden Medisch Centrum, on behalf of NVvR

Composition of the sounding board group:

Dr. A.K. van der Bij, medical microbiologist, Diakonessenhuis, on behalf of NVMM
Dr. R. Bakx, pediatric surgeon, Amsterdam UMC, on behalf of NVvH

With support from:

Dr. S.N. Hofstede, adviser, Kennisinstituut van de Federatie Medisch Specialisten
L. Boerboom MSc, literature specialist, Kennisinstituut van de Federatie Medisch Specialisten
D.M.J. Tennekes, executive secretary, Kennisinstituut van de Federatie Medisch Specialisten

Declarations of interest

The KNMG code for preventing improper influence through conflicts of interest was followed. All working group members declared in writing whether, in the past three years, they had had direct financial interests (employment with a commercial company, personal financial interests, research funding) or indirect interests (personal relationships, reputation management, knowledge valorization). An overview of the interests of the working group members, and the judgment on how to deal with any interests, can be found in the table below. The signed declarations of interest can be requested from the secretariat of the Kennisinstituut van de Federatie Medisch Specialisten.

Working group member	Position	Ancillary positions	Reported interests	Action taken
Van Rossem	Gastrointestinal surgeon, Maasstad Ziekenhuis	None	None	No action
Van Geloven	Gastrointestinal surgeon, Tergooi	None	None	No action
Gorter	Fellow in pediatric surgery, Amsterdam UMC	Researcher in pediatric surgery, VUmc & AMC	Project leader of the APAC study on non-operative treatment of appendicitis in children. ZonMw file number: 843002708	No action
Van Meurs	General pediatrician, Juliana Kinderziekenhuis (HagaZiekenhuis)	Teaching medical students at LUMC	None	No action
Jacod	Gynecologist, Radboud UMC	Secretary of the working group Samenwerking Obstetrie-Anesthesiologie, NVOG-NVA, unpaid	None	No action
Puylaert	Radiologist, HMC	None	None	No action
Nederend	Radiologist, Catharina Ziekenhuis Eindhoven	Screening radiologist for the population screening program, paid; Secretary of the Sectie Abdominale Radiologie, NVvR, unpaid	Research into the value of MRI in PIPAC treatment, partly funded (unrestricted grant) by Bracco Imaging Europe B.V.	No action
Bos	Resident (AIOS) in emergency medicine, VUmc region, location Westfriesgasthuis	General member of the NVSHA congress committee, unpaid	None	No action
Van den Boom	Fellow in gastrointestinal surgery	None	Principal investigator of the APPIC trial (short versus long antibiotic treatment after appendectomy for complex appendicitis), funded by ZonMw (Goed Gebruik Geneesmiddelen)	No action
Bom	Research physician in surgery, AMC	None	I am paid from the EPOCH study, funded by ZonMw. This is an RCT on the prevention of wound infections. It is in no way related to the appendicitis guideline. I therefore have no interests in externally funded research.	No action
Knaapen	Research physician in pediatric surgery, Amsterdam UMC	None	Coordinating investigator of the APAC study on non-operative treatment of appendicitis in children. ZonMw file number: 843002708	No action
Lammers	Policy adviser, Patiëntenfederatie	None	None	No action
Hofstede	Adviser, Kennisinstituut van de Federatie Medisch Specialisten	None	None	No action
Van Enst	Senior adviser, Kennisinstituut van de Federatie Medisch Specialisten	Member of the GRADE working group/Dutch GRADE Network	None	No action

Sounding board group member	Position	Ancillary positions	Reported interests	Action taken
Van der Bij	Medical microbiologist, Diakonessenhuis Utrecht/MSBD	Chair of the NVMM quality control committee, unpaid	None	No action
Bakx	Pediatric surgeon, Kinderchirurgisch centrum Amsterdam	Chair of the NVvH guidelines committee, unpaid; board member of Stichting spoedeisende hulp bij kinderen, unpaid; APLS instructor, unpaid	Principal investigator of the APAC study on non-operative treatment of appendicitis in children. ZonMw file number: 843002708	No action

Patient perspective input

Attention was paid to the patient perspective by including a representative of a patient organization in the working group. The draft guideline was also submitted to the Patiëntenfederatie for comment.

Method of development

Evidence based

Implementation

In the various phases of guideline development, the implementation of the guideline (module) and the practical feasibility of the recommendations were taken into account. Explicit attention was paid to factors that could promote or hinder the introduction of the guideline in practice. The implementation plan can be found under the related products. The working group also developed internal quality indicators to monitor and strengthen the application of the guideline in practice (see Indicator development).

Working method

AGREE

This guideline was drawn up in accordance with the requirements stated in the report Medisch Specialistische Richtlijnen 2.0 of the Adviescommissie Richtlijnen of the Raad Kwaliteit. This report is based on the AGREE II instrument (Appraisal of Guidelines for Research & Evaluation II; Brouwers, 2010), an instrument with broad international acceptance. For a step-by-step description of how an evidence-based guideline is developed, see the step-by-step plan Ontwikkeling van Medisch Specialistische Richtlijnen of the Kennisinstituut van de Federatie Medisch Specialisten.

Bottleneck analysis

During the preparatory phase, the chair of the working group and the adviser took stock of the bottlenecks. The working group assessed the recommendations from the previous guideline (NVvH, 2010) for the need for revision. Stakeholders were also invited to a bottleneck meeting (invitational conference). Because of the low number of registrations (three: IGZ, NVA and the Patiëntenfederatie), the meeting was canceled. Stakeholders were asked instead to respond to the framework in writing. Bottlenecks were submitted in writing by NVKC, NVSHA, NVvH, NVZ and V&VN. A report of this is included under the related products. The working group then drew up a long list of bottlenecks and prioritized them on the basis of: (1) clinical relevance, (2) the availability of (new) high-quality evidence, and (3) the expected impact on quality of care, patient safety and (macro) costs.

Clinical questions and outcome measures

Based on the outcomes of the bottleneck analysis, the chair and the adviser drafted the clinical questions. These were discussed with the working group, after which the working group established the final clinical questions. The working group then determined, for each clinical question, which outcome measures are relevant to the patient, considering both desirable and undesirable effects. The working group rated these outcome measures according to their relative importance for decision-making about recommendations, as crucial (critical for decision-making), important (but not crucial) or unimportant. The working group also defined, at least for the crucial outcome measures, which differences it considered clinically (patient) relevant.

Strategy for searching and selecting literature

First, an exploratory search was carried out for existing foreign guidelines, systematic reviews (Medline (OVID)) and literature on patient preferences (patient perspective; Medline (OVID)). Next, for the individual clinical questions, published scientific studies were searched for in (various) electronic databases using specific search terms. In addition, studies were sought through the reference lists of the selected articles. Initially, the search focused on studies with the highest level of evidence. The working group members selected the articles found by the search on the basis of predefined selection criteria. The selected articles were used to answer the clinical question. The databases searched, the search strategy and the selection criteria applied can be found in the module for the clinical question concerned. The search strategies for the exploratory search and the patient perspective are included under the related products.

Quality assessment of individual studies

Individual studies were assessed systematically, on the basis of predefined methodological quality criteria, in order to estimate the risk of biased study results (risk of bias). These assessments can be found in the Risk-of-Bias (RoB) tables. The RoB instruments used are validated instruments recommended by the Cochrane Collaboration: AMSTAR for systematic reviews; Cochrane for randomized controlled trials; ACROBAT-NRS for observational studies; QUADAS II for diagnostic studies.

Summarizing the literature

The relevant research data from all selected articles were presented clearly in evidence tables. The main findings from the literature were described in the summary of the literature. Where there were enough studies and sufficient similarity (homogeneity) between them, the data were also summarized quantitatively (meta-analysis) using Review Manager 5.

Assessing the strength of the scientific evidence

A) For intervention questions (questions about therapy or screening)

The strength of the scientific evidence was determined according to the GRADE method. GRADE stands for ‘Grading Recommendations Assessment, Development and Evaluation’ (see http://www.gradeworkinggroup.org/).

GRADE distinguishes four levels of quality of the scientific evidence: high, moderate, low and very low. These levels refer to the degree of certainty about the literature conclusion (Schünemann, 2013).

GRADE	Definition
High	There is high certainty that the true effect of treatment lies close to the estimated effect of treatment as stated in the literature conclusion; it is very unlikely that the literature conclusion will change when results of new large-scale research are added to the literature analysis.
Moderate*	There is moderate certainty that the true effect of treatment lies close to the estimated effect of treatment as stated in the literature conclusion; it is possible that the conclusion will change when results of new large-scale research are added to the literature analysis.
Low	There is low certainty that the true effect of treatment lies close to the estimated effect of treatment as stated in the literature conclusion; there is a real chance that the conclusion will change when results of new large-scale research are added to the literature analysis.
Very low	There is very low certainty that the true effect of treatment lies close to the estimated effect of treatment as stated in the literature conclusion; the literature conclusion is very uncertain.

*In 2017 the Dutch GRADE Network determined that the preferred Dutch wording for the second-highest grade is ‘redelijk’ rather than ‘matig’.

B) For questions about diagnostic tests, harm or adverse effects, etiology and prognosis

The strength of the scientific evidence was likewise determined according to the GRADE method: GRADE-diagnostics for diagnostic questions (Schünemann, 2008), and a generic GRADE method for questions about harm or adverse effects, etiology and prognosis. The generic GRADE method applied the basic principles of the GRADE methodology: naming and prioritizing the clinically (patient) relevant outcome measures, a systematic review per outcome measure, and an assessment of the strength of evidence based on the five GRADE criteria (starting point high; downgrading for risk of bias, inconsistency, indirectness, imprecision and publication bias).
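
The downgrading logic of the generic GRADE method (start at high, then lower the rating for each of the five criteria) can be sketched as follows. This is only an illustration: the function name, the English level labels and the representation of each judgment as a number of levels to subtract are assumptions made for this sketch, and in practice each downgrade is a reasoned judgment by the working group rather than a mechanical calculation.

```python
# Certainty levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

# The five GRADE criteria for downgrading.
CRITERIA = (
    "risk of bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication bias",
)


def grade_certainty(downgrades):
    """Return the certainty level after downgrading from 'high'.

    `downgrades` maps a criterion name to the number of levels to
    subtract for it; criteria that are not listed count as 0.
    """
    unknown = set(downgrades) - set(CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    if any(steps < 0 for steps in downgrades.values()):
        raise ValueError("downgrades must be non-negative")
    total_steps = sum(downgrades.values())
    # The rating cannot fall below 'very low'.
    index = max(0, LEVELS.index("high") - total_steps)
    return LEVELS[index]


if __name__ == "__main__":
    # Hypothetical judgment: one level down for risk of bias, one for imprecision.
    print(grade_certainty({"risk of bias": 1, "imprecision": 1}))
```

With the hypothetical judgment in the example, the rating drops two levels from high to low.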

Formulating the conclusions

For each relevant outcome measure, the scientific evidence was summarized in one or more literature conclusions, with the level of evidence determined according to the GRADE methodology. The working group members drew up the balance for each intervention (overall conclusion). In doing so, the favorable and unfavorable effects for the patient were weighed. The overall strength of evidence is determined by the lowest strength of evidence found for any of the crucial outcome measures. For complex decision-making in which, besides the conclusions from the systematic literature analysis, many additional arguments (considerations) play a role, no overall conclusion was drawn. In that case, the favorable and unfavorable effects of the interventions were weighed together with all additional arguments under the heading ‘Considerations’.
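
The rule that the overall strength of evidence equals the lowest rating among the crucial outcome measures can be illustrated with a short sketch. The outcome names, importance labels and ratings below are hypothetical and serve only to show the rule; they are not taken from this guideline's conclusions.

```python
# Certainty levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def overall_certainty(outcomes):
    """Return the lowest certainty level among the crucial outcome measures.

    `outcomes` maps an outcome name to a tuple (importance, certainty),
    where importance is 'crucial', 'important' or 'unimportant'.
    """
    crucial = [
        certainty
        for importance, certainty in outcomes.values()
        if importance == "crucial"
    ]
    if not crucial:
        raise ValueError("at least one crucial outcome measure is required")
    # Outcomes that are not crucial do not influence the overall rating.
    return min(crucial, key=LEVELS.index)


if __name__ == "__main__":
    # Hypothetical ratings, for illustration only.
    example = {
        "sensitivity": ("crucial", "moderate"),
        "specificity": ("crucial", "low"),
        "inconclusive results": ("important", "very low"),
    }
    print(overall_certainty(example))
```

In this hypothetical example the overall rating is low: the very low rating belongs to an outcome that is important but not crucial, so it does not determine the result.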

Considerations (from evidence to recommendation)

To arrive at a recommendation, aspects other than (the quality of) the scientific evidence are also important and are taken into account, such as the expertise of the working group members, patient values and preferences, costs, availability of facilities and organizational matters. These aspects, insofar as they are not part of the literature summary, are stated and assessed (weighed) under the heading ‘Considerations’.

Formulating recommendations

The recommendations answer the clinical question and are based on the available scientific evidence, the most important considerations, and a weighing of the favorable and unfavorable effects of the relevant interventions. The strength of the scientific evidence and the weight the working group assigns to the considerations together determine the strength of the recommendation. In accordance with the GRADE methodology, a low strength of evidence for conclusions in the systematic literature analysis does not a priori rule out a strong recommendation, and weak recommendations are also possible with a high strength of evidence. The strength of the recommendation is always determined by weighing all relevant arguments together.

Preconditions (organization of care)

In the bottleneck analysis and during guideline development, explicit account was taken of the organization of care: all aspects that are preconditions for providing care (such as coordination, communication, (financial) resources, staffing and infrastructure). Preconditions relevant to answering a specific clinical question form part of the considerations for that question.

Indicator development

Alongside the development of the draft guideline, internal quality indicators were developed to monitor and strengthen the application of the guideline in practice. More information about the method of indicator development can be requested from the Kennisinstituut van de Federatie Medisch Specialisten (secretariaat@kennisinstituut.nl).

Knowledge gaps

During the development of this guideline, a systematic search was carried out for research whose results contribute to answering the clinical questions. For each clinical question, the working group checked whether (additional) scientific research is desirable to be able to answer it. An overview of the topics for which (additional) scientific research is considered important is described as a recommendation in the Knowledge gaps (under related products).

Comment and authorization phase

The draft guideline was submitted for comment to the (scientific) societies and (patient) organizations involved. The comments were collected and discussed with the working group. In response to the comments, the draft guideline was amended and definitively adopted by the working group. The final guideline was submitted to the participating (scientific) societies and (patient) organizations for authorization and was authorized or approved by them.

Literature

Brouwers MC, Kho ME, Browman GP, et al. AGREE Next Steps Consortium. AGREE II: advancing guideline development, reporting and evaluation in health care. CMAJ. 2010;182(18):E839-42. doi: 10.1503/cmaj.090449. Epub 2010 Jul 5. Review. PubMed PMID: 20603348.

Medisch Specialistische Richtlijnen 2.0 (2012). Adviescommissie Richtlijnen van de Raad Kwaliteit. Link: https://richtlijnendatabase.nl/over_deze_site/richtlijnontwikkeling.html

Schünemann H, Brożek J, Guyatt G, et al. GRADE handbook for grading quality of evidence and strength of recommendations. Updated October 2013. The GRADE Working Group, 2013. Available from http://gdt.guidelinedevelopment.org/central_prod/_design/client/handbook/handbook.html.

Schünemann HJ, Oxman AD, Brozek J, et al. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ. 2008;336(7653):1106-10. doi: 10.1136/bmj.39500.677199.AE. Erratum in: BMJ. 2008;336(7654). doi: 10.1136/bmj.a139. PubMed PMID: 18483053.

Wessels M, Hielkema L, van der Weijden T. How to identify existing literature on patients' knowledge, views, and values: the development of a validated search filter. J Med Libr Assoc. 2016 Oct;104(4):320-324. PubMed PMID: 27822157; PubMed Central PMCID: PMC5079497.

Search justification

Search strategies are available on request. Please contact the Richtlijnendatabase.
