Additional diagnostics
Key question
Which stepwise diagnostic approach can be recommended for neonatal cholestasis, including biliary atresia, on the basis of the literature or consensus? This key question comprises the following sub-questions:
What is the diagnostic accuracy of
- laboratory data
- imaging
- other diagnostics for establishing the diagnosis of biliary atresia
CLINICAL QUESTION
What is the diagnostic accuracy of each of the above (1/2/3) in infants with suspected cholestasis?
DIAGNOSTIC ACCURACY CHARACTERISTICS AND THEIR IMPACT ON PATIENT OUTCOMES
| Test result | Patient outcome | Importance* |
|---|---|---|
| True positive | Rapid definitive diagnosis allowing timely treatment and therefore a more favourable prognosis (mortality/morbidity) | 9 |
| True negative | Avoidance of an unnecessary invasive procedure such as a liver biopsy. Reassurance and possibly an alternative diagnosis | 7 |
| False positive | Exposure to more additional investigations than necessary or to an unnecessary invasive procedure. Performing a liver biopsy may, however, yield a correct diagnosis | 7 |
| False negative | Delayed diagnosis. A negative test result cannot be convincingly interpreted as the absence of biliary atresia | 9 |
* GRADE recommends classifying patient-important outcomes on a 9-point scale: 7-9: critical for decision making; 4-6: important but not critical for decision making; and 1-3: of lower importance to patients.
Recommendation
- Measure gamma-glutamyl transferase (GGT) in every infant aged 3 weeks or older who is referred to hospital because of persistent jaundice or acholic stools.
- If GGT is elevated (>150 IU/L), contact a paediatric hepatology centre immediately to discuss the timing and content of the further diagnostic plan.
For the healthcare provider in a secondary-care hospital:
- Contact a paediatric hepatology centre immediately if an infant with persistent jaundice or acholic stools has direct hyperbilirubinaemia.
- Do not use ultrasonography to establish or exclude biliary atresia.
- Do not perform MRCP in young infants with persistent jaundice or acholic stools to diagnose or exclude biliary atresia.
- Consider a HIDA scan when there is doubt about the discolouration of the stools and cholestasis persists. If excretion into the intestine is seen, biliary atresia is very unlikely.
- Do not perform ERCP in young infants with persistent jaundice or acholic stools to diagnose or exclude biliary atresia. Consider ERCP when the liver biopsy is inconclusive.
- Perform a percutaneous liver biopsy in cholestatic infants who have been admitted to a university paediatric hepatology centre because of a strong suspicion of biliary atresia, and have it assessed (or co-assessed) by a histopathologist with expertise in neonatal cholestasis.
- Refer a child with direct hyperbilirubinaemia to a paediatrician specialising in metabolic diseases and to a clinical geneticist if biliary atresia has been excluded and the diagnostics already performed have not led to a diagnosis.
Considerations
Blood testing is the first step in the diagnostic work-up of the cause of persistent jaundice or acholic stools. Measuring both total and direct serum bilirubin is recommended. When total bilirubin is at least 50 µmol/l and the direct bilirubin fraction is >0.20, we speak of neonatal cholestasis. Other important blood tests are the full blood count, CRP, transaminases, gamma-GT, albumin and INR. These make it possible to estimate
(1) the likelihood of biliary atresia (elevated GGT), and
(2) the risk of a vitamin K-dependent coagulation disorder (elevated INR).
The literature reports GGT cut-off values of 184 to 300 IU/L (that is, 2.5 to 4 times the upper limit of normal, ULN) for distinguishing biliary atresia from other cholestatic disorders. The proportion of children with biliary atresia in these retrospective studies was high, which results in high cut-off values. The working group advises that, in the case of direct hyperbilirubinaemia and a GGT of 150 IU/L or higher (>2x ULN), a paediatric hepatology centre be contacted immediately because of a high prior probability of biliary atresia. If the INR is ≥1.4, vitamin K should be administered intravenously.
Depending on the laboratory facilities available in a secondary-care hospital and the expected time before a result can be obtained, it is advised to coordinate the logistics of the additional blood tests in consultation with the paediatric hepatology centre.
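The thresholds in this section can be written out as a small worked sketch. It only restates the numbers given above (total bilirubin of at least 50 µmol/l, direct fraction >0.20, GGT of 150 IU/L or higher, INR ≥1.4). The `BloodResults` container, the function names and the flag names are hypothetical, and the sketch assumes that "direct hyperbilirubinaemia" can be equated with meeting the cholestasis definition.

```python
from dataclasses import dataclass

# Thresholds quoted from the text above.
MIN_TOTAL_BILIRUBIN_UMOL_L = 50.0  # total bilirubin of at least 50 µmol/l
MIN_DIRECT_FRACTION = 0.20         # direct fraction must be greater than 0.20
GGT_THRESHOLD_IU_L = 150.0         # GGT of 150 IU/L or higher
INR_THRESHOLD = 1.4                # INR of 1.4 or higher


@dataclass
class BloodResults:
    """Hypothetical container for one infant's results (illustration only)."""
    total_bilirubin_umol_l: float
    direct_bilirubin_umol_l: float
    ggt_iu_l: float
    inr: float


def has_neonatal_cholestasis(r: BloodResults) -> bool:
    """Apply the definition: total >= 50 µmol/l and direct fraction > 0.20."""
    if r.total_bilirubin_umol_l < MIN_TOTAL_BILIRUBIN_UMOL_L:
        return False
    fraction = r.direct_bilirubin_umol_l / r.total_bilirubin_umol_l
    return fraction > MIN_DIRECT_FRACTION


def triage_flags(r: BloodResults) -> dict:
    """Return the three signals discussed in the text as booleans."""
    cholestasis = has_neonatal_cholestasis(r)
    return {
        "neonatal_cholestasis": cholestasis,
        # Assumption: direct hyperbilirubinaemia == cholestasis definition met.
        "ggt_raises_suspicion_of_ba": cholestasis and r.ggt_iu_l >= GGT_THRESHOLD_IU_L,
        "inr_calls_for_iv_vitamin_k": r.inr >= INR_THRESHOLD,
    }


example = triage_flags(BloodResults(120.0, 48.0, 310.0, 1.5))
print(example)
```

The GGT flag is deliberately not named as a referral decision: the recommendations ask for contact with a paediatric hepatology centre for any direct hyperbilirubinaemia, and the GGT value only adds to the prior probability of biliary atresia.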
b. Imaging
b.1. Ultrasonography
Ultrasonography is regarded as the imaging screening test of choice because of its low cost, the absence of radiation, and the fact that the infant generally needs no sedation. In infants with proven neonatal cholestasis (hyperbilirubinaemia with a direct bilirubin fraction >20%), several ultrasonographic features can increase the likelihood of biliary atresia (BA).
Originally, gallbladder abnormalities were the main indicators of BA: absence of a gallbladder, a small gallbladder, an abnormal gallbladder shape and wall, and absence of gallbladder contraction. Later, other ultrasonographic features with a better predictive value were added. The triangular cord sign, a triangular or tubular echogenic structure in the vicinity of the portal vein, has a high positive likelihood ratio (LR) of 18, so its presence further increases the likelihood of BA. Absence of hepatic subcapsular flow, on the other hand, makes BA very unlikely (negative LR 0.05). Hepatic artery enlargement is the least informative of all ultrasonographic features. It is plausible that the presence of several ultrasonographic features raises the positive LR further, but we did not identify such studies.
Table | Ultrasonography of the liver
| Publications | Prevalence of biliary atresia | Ultrasonographic features | TP | FP | FN | TN |
|---|---|---|---|---|---|---|
| (MA of 23 studies) | 41% (pooled data) | Not specified | 32 | 4 | 9 | 55 |
| Zhou 2016 (MA of 20 studies) | 50% (assumed) | Triangular cord sign | 37 | 2 | 13 | 48 |
| Zhou 2016 (MA of 19 studies) | 50% (assumed) | Gallbladder abnormalities | 42 | 4 | 8 | 46 |
| Zhou 2016 (MA of 5 studies) | 50% (assumed) | Combination of triangular cord sign and gallbladder abnormalities | 48 | 6 | 2 | 44 |
| Zhou 2016 (MA of 5 studies) | 50% (assumed) | Hepatic artery enlargement | 40 | 12 | 10 | 38 |
| Sun 2020 (MA of 9 studies) | 44% (pooled data) | Hepatic subcapsular flow | 42 | 4 | 2 | 52 |
TP, FP, FN and TN are numbers per 100 patients.
Patients: infants with suspected biliary atresia
Role of the test: triage (only patients with one or more ultrasonographic features of biliary atresia continue along the diagnostic pathway for biliary atresia; see figure 1 in module 4)
Setting: university hospital
Index test: ultrasonographic assessment of the triangular cord sign, hepatic subcapsular flow and gallbladder abnormalities
Reference standard: histological features of obstructive cholangiopathy, abnormal operative cholangiogram, macroscopic assessment during surgery, or a combination
Low GRADE | An experienced paediatric ultrasonographer may be able to diagnose or exclude biliary atresia using ultrasonography, but the evidence is very uncertain. Sources: Wang 2018, Zhou 2016, Sun 2020
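The likelihood ratios quoted in the text follow directly from the per-100-patient counts in the table. The sketch below recomputes them for the triangular cord sign and for hepatic subcapsular flow; the function name is hypothetical and the counts are copied from the table rows for Zhou 2016 and Sun 2020.

```python
def accuracy_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity and likelihood ratios from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # LR+ = sensitivity / (1 - specificity); infinite when there are no FPs.
        "lr_positive": sensitivity / false_positive_rate if fp else float("inf"),
        # LR- = (1 - sensitivity) / specificity
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Counts per 100 patients, taken from the table above.
triangular_cord_sign = accuracy_from_counts(tp=37, fp=2, fn=13, tn=48)
hepatic_subcapsular_flow = accuracy_from_counts(tp=42, fp=4, fn=2, tn=52)

print(round(triangular_cord_sign["lr_positive"], 1))
print(round(hepatic_subcapsular_flow["lr_negative"], 2))
```

With these counts the positive LR of the triangular cord sign comes out at about 18.5 and the negative LR of hepatic subcapsular flow at about 0.05, in line with the values given in the text.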
CONSIDERATIONS – FROM EVIDENCE TO RECOMMENDATION
Ultrasonography of the liver is useful in infants referred because of persistent jaundice or acholic stools. The diagnostic yield of the examination depends on the level of care. In a general hospital where no specialised paediatric sonographer is available, ultrasonography can be used to diagnose bile duct malformations (such as a choledochal cyst) or stones. The blood tests discussed in the preceding paragraph, however, determine whether the infant is referred to a paediatric hepatology centre. Detecting the subtle ultrasonographic features that may point to biliary atresia, such as the triangular cord sign and subcapsular flow, requires special expertise. Even when this expertise is available, ultrasonography is not sufficiently discriminating to establish or exclude BA with adequate certainty.
b.2. MRCP
MR cholangiopancreatography (MRCP) is a generally accepted method for imaging the extrahepatic bile ducts and the confluence in children. In BA the extrahepatic bile ducts may be partially or completely invisible. Healthy infants up to 3 months of age, however, have a small bile duct diameter, so it is not always possible to distinguish a normal physiological situation from biliary atresia. The considerable chance of a false-positive diagnosis in the first months of life makes MRCP less suitable for demonstrating BA. In addition, the children need sedation or immobilisation to reduce the risk of motion artefacts.
Table | MRCP as a screening test
| Publications | Prevalence of biliary atresia | MRI features | TP | FP | FN | TN |
|---|---|---|---|---|---|---|
| Wang 2018 (MA of 5 studies) | 48% (pooled data) | Not specified | 46 | 22 | 2 | 30 |
TP, FP, FN and TN are numbers per 100 patients.
Patients: infants referred from secondary paediatric care to a paediatric hepatology centre because of suspected biliary atresia
Role of the test: screening (only patients with partially or completely invisible bile ducts continue along the diagnostic pathway for biliary atresia; see figure 1 in module 4)
Setting: university hospital
Index test: MRCP
Reference standard: histological features of obstructive cholangiopathy, abnormal operative cholangiogram, macroscopic assessment during surgery, or a combination
Very low GRADE | The evidence of a meta-analysis of five paediatric studies suggests that MRCP in young infants with persistent jaundice or acholic stools is not a suitable screening test for biliary atresia. Sources: Wang 2018
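The weight of the false-positive results can be made concrete from the single table row for Wang 2018. The sketch below derives specificity and predictive values from the per-100-patient counts; the function name is hypothetical, and the predictive values hold only at the pooled prevalence of 48% reported in the table.

```python
def predictive_values(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Specificity and predictive values from a 2x2 table."""
    return {
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),  # share of positive scans that are truly BA
        "npv": tn / (tn + fn),  # share of negative scans that are truly non-BA
        "false_share_of_positives": fp / (tp + fp),
    }


# Counts per 100 patients for MRCP (Wang 2018), taken from the table above.
mrcp = predictive_values(tp=46, fp=22, fn=2, tn=30)

for name, value in mrcp.items():
    print(f"{name}: {value:.2f}")
```

At this prevalence roughly one in three positive MRCP results is a false positive (22 of 68), while a negative result is wrong in 2 of 32 cases.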
CONSIDERATIONS – FROM EVIDENCE TO RECOMMENDATION
Performing MRCP in infants younger than 3 months with suspected biliary atresia leads to a relatively high percentage of false-positive results. In addition, preparing for MRCP is laborious because the infant needs sedation and/or immobilisation. In the working group's opinion, these drawbacks mean that the use of MRCP as a screening test is not justified.
b.4. ERCP
Methodology
We selected the studies on infants with cholestasis that contained sufficient data to reconstruct 2x2 tables. An ERCP result was considered positive when the intrahepatic bile ducts could not be visualised. In some studies, infants with a choledochal malformation or gallstones were also included in the analyses. We omitted these patients from the table, because choledochal abnormalities and gallstones can be reliably demonstrated by non-invasive imaging (ultrasonography or MRCP) and because they affect the diagnostic accuracy for biliary atresia. In the Netherlands we would not expose this category of children to ERCP.
In all included studies there were cases in which cannulation of the papilla of Vater failed, so that the bile ducts could not be filled with contrast. It is methodologically incorrect to exclude these failures from the analysis. We therefore created an intermediate row in the 2x2 table (x+y, table ERCP 2x2 table meets the real world). In the analyses (shown in table ERCP as a screening test), cells x and y were classified as a positive test result, since a child with an equivocal result will proceed along the diagnostic pathway to an operative cholangiogram.
Table | ERCP 2x2 table meets the real world
| | Biliary atresia present | Biliary atresia absent |
|---|---|---|
| Intrahepatic bile ducts not visualised | TP | FP |
| ERCP with inconclusive result | x | y |
| Normal intrahepatic bile ducts | FN | TN |
Table | ERCP as a screening test
| Publication | Prevalence of biliary atresia | Patients with choledochal malformation or stones excluded | TP | FP | x | y | FN | TN |
|---|---|---|---|---|---|---|---|---|
| Linuma (2000) | 63% | 4 | 63 | 0 | 13 | 0 | 0 | 24 |
| (x and y classified as positive) | | | 76 | 0 | | | 0 | 28 |
| Keil (2009) | 63% | 14 | 54 | 2 | 11 | 1 | 0 | 33 |
| (x and y classified as positive) | | | 65 | 3 | | | 0 | 33 |
| Ngem (2018) | 62% | 0 | 62 | 7 | 5 | 6 | 0 | 26 |
| (x and y classified as positive) | | | 67 | 13 | | | 0 | 26 |
| Saito (2014) | 77% | 13 | 71 | 0 | 6 | 3 | 0 | 20 |
| (x and y classified as positive) | | | 77 | 3 | | | 0 | 20 |
| Shanmugan (2009) | 52% | 0 | 46 | 6 | 6 | 0 | 0 | 42 |
| (x and y classified as positive) | | | 52 | 6 | | | 0 | 42 |
| Shteyer (2012) | 65% | 4 | 57 | 4 | 9 | 4 | 0 | 35 |
| (x and y classified as positive) | | | 66 | 8 | | | 0 | 35 |
TP, FP, x, y, FN and TN are numbers per 100 patients.
Patients: cholestatic infants admitted to a university paediatric hepatology centre because of suspected biliary atresia
Role of the test: screening (only patients in whom the intrahepatic bile ducts cannot be visualised will undergo an operative cholangiogram)
Setting: university hospital
Index test: ERCP
Reference standard: abnormal operative cholangiogram, macroscopic assessment during surgery, or a combination
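The handling of failed or inconclusive ERCP procedures described under Methodology amounts to folding the intermediate row (x, y) into the positive row before computing accuracy. A minimal sketch of that step is given below, using the counts reported for Keil (2009) in the table; the type and function names are hypothetical.

```python
from typing import NamedTuple


class ErcpCounts(NamedTuple):
    """Extended table: x and y are inconclusive ERCPs with and without BA."""
    tp: int
    fp: int
    x: int
    y: int
    fn: int
    tn: int


class TwoByTwo(NamedTuple):
    tp: int
    fp: int
    fn: int
    tn: int


def classify_inconclusive_as_positive(c: ErcpCounts) -> TwoByTwo:
    """Fold the inconclusive row into the positive row, as described in the text."""
    return TwoByTwo(tp=c.tp + c.x, fp=c.fp + c.y, fn=c.fn, tn=c.tn)


# Counts for Keil (2009), copied from the table above.
keil = ErcpCounts(tp=54, fp=2, x=11, y=1, fn=0, tn=33)
collapsed = classify_inconclusive_as_positive(keil)
sensitivity = collapsed.tp / (collapsed.tp + collapsed.fn)

print(collapsed)
print(sensitivity)
```

Counting inconclusive procedures as positive keeps every infant in the analysis and mirrors clinical practice, where an equivocal ERCP is followed by an operative cholangiogram. The price is a higher number of false positives (here 3 instead of 2).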
CONSIDERATIONS – FROM EVIDENCE TO RECOMMENDATION
ERCP is performed under general anaesthesia. The endoscope is advanced into the duodenum, after which a small catheter is placed in the bile ducts through the papilla of Vater. Contrast medium is injected into the bile ducts through the catheter, and the anatomy can be visualised by fluoroscopy. In young children the failure rate, defined as the inability to fill the bile ducts with contrast, is approximately 11%, which is higher than in adults.
When the intra- and extrahepatic bile ducts can be visualised on ERCP, biliary atresia is excluded with 100% certainty. The chance of a false-positive result, that is, the chance that an infant is wrongly exposed to an operative cholangiogram, varies from 0 to 16%.
Paediatric ERCP is not a universally accepted screening test for biliary atresia because of the invasiveness of the examination, the risk of complications and the rarity of the procedure. In Europe and Israel, ERCP is rarely or never used to diagnose or exclude biliary atresia (Koot, 2020).
ERCP can still play a role as a second diagnostic test when the liver biopsy is inconclusive (Koot 2020 and, for example, Feldman AG, Sokol RJ. Semin Pediatr Surg. 2020), for instance instead of repeating a liver biopsy after an equivocal result (Hadzic, 2010). In experienced hands, it was found to avoid laparoscopy in 42% of cases.
c.1. Liver biopsy
The liver biopsy is regarded as an essential part of the diagnostic pathway before an infant with cholestasis can be exposed to an operative cholangiogram. The liver tissue is ideally obtained by percutaneous, ultrasound-guided needle biopsy under general anaesthesia.
Interpreting the histological features is challenging. There is overlap between the features of BA and those of other cholestatic diseases. Histological features that occur significantly more often in BA include ductular reaction, bile plugs, portal fibrosis and canalicular cholestasis. These signs of obstructive cholangiopathy gradually become more pronounced in the weeks after birth. Ductopenia and giant cell transformation, by contrast, occur significantly more often in the non-BA group.
Table | Liver biopsy as a screening test
| Publications | Prevalence of biliary atresia | Histopathological features | TP | FP | FN | TN |
|---|---|---|---|---|---|---|
| (MA of 11 studies) | 50% (pooled data) | | 49 | 4 | 1 | 46 |
| Russo 2016 | 60% | Marked bile ductular proliferation, bile duct plugs, portal fibrosis and canalicular cholestasis | 53 | 3 | 7 | 37 |
TP, FP, FN and TN are numbers per 100 patients.
Patients: cholestatic infants admitted to a university paediatric hepatology centre because of suspected biliary atresia
Role of the test: screening (only patients with histological features of biliary atresia continue along the diagnostic pathway for biliary atresia; see figure 1 in module 4)
Setting: paediatric hepatology centre
Index test: percutaneous liver biopsy
Reference standard: abnormal operative cholangiogram, macroscopic assessment during surgery, or a combination
Moderate GRADE | The evidence of one meta-analysis and one observational study (which was not included in the meta-analysis) suggests that the presence of typical histopathologic features increases the likelihood of biliary atresia. Sources: Wang 2018, Russo 2016
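How much a positive biopsy raises the probability of biliary atresia depends on the probability before the test. The sketch below converts the meta-analysis counts from the table (49, 4, 1, 46 per 100 patients) into a positive likelihood ratio and applies it to two pre-test probabilities. The 50% value is the pooled prevalence from the table; the 20% value is a purely hypothetical lower prior chosen for illustration, and the function name is likewise an assumption.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: post-test odds = pre-test odds x LR."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Meta-analysis row of the table above (counts per 100 patients).
tp, fp, fn, tn = 49, 4, 1, 46
sensitivity = tp / (tp + fn)
false_positive_rate = fp / (fp + tn)
lr_positive = sensitivity / false_positive_rate

at_pooled_prevalence = post_test_probability(0.50, lr_positive)  # prevalence from the table
at_hypothetical_prior = post_test_probability(0.20, lr_positive)  # hypothetical 20% prior

print(round(lr_positive, 2))
print(round(at_pooled_prevalence, 3), round(at_hypothetical_prior, 3))
```

At the pooled prevalence the post-test probability equals 49/53, which is the positive predictive value read directly from the table. With the lower hypothetical prior the same positive biopsy leaves more residual uncertainty.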
Performing a liver biopsy in infants with a strong suspicion of biliary atresia leads to a small percentage of false-negative or false-positive results. In the case of a false-negative result it may be necessary to go through the diagnostic pathway again, including a second liver biopsy. In the case of a false-positive result the patient will wrongly undergo an operative cholangiogram. Although the preoperative liver biopsy has a high accuracy among the diagnostic tests, it is not the method by which a definitive diagnosis of BA is made either. That requires an intraoperative cholangiogram. When the liver biopsy is inconclusive, the cholestatic child should continue to be followed over time. It may be necessary to go through the diagnostic pathway described in module 4 again, which may also mean that a second liver biopsy is indicated in due course.
Genetic and metabolic diagnostics
Introduction
Neonatal cholestasis can be a complex diagnostic challenge; possible causes include rare metabolic and hereditary liver diseases that can even be fatal. Early diagnosis and targeted treatment (pharmacological, sometimes even transplantation) can be life-saving and can prevent irreversible damage. For some disorders the prognosis is very poor, and an early diagnosis allows palliative care and decision-making to be organised in good time and as well as possible. The differential diagnosis of liver dysfunction at a young age is long, which may require a broad range of investigations. The clinical symptoms of metabolic and genetic disorders are often very diverse, and only a minority of these disorders present as neonatal cholestasis. The working group considers that it would go too far to discuss all possible metabolic and genetic causes of neonatal cholestasis in detail in this module. At the same time, the working group considers it valuable to provide the metabolic and genetic follow-up diagnostics (usually arranged via the Metabolic Diseases department in a tertiary centre) in the form of a supplement to this guideline.
A literature search was performed for the most efficient diagnostic protocol in terms of yield, time to result, treatability of the disorder, costs and patient-friendliness. Most emphasis was placed on the diagnostic accuracy of the tests, in order to formulate a diagnostic work-up based on the available evidence and expertise.
Benefits and harms of the intervention and the quality of the evidence
No studies met the inclusion criteria for this search question. The recommendation is supported by expert opinion and patient preference/experience.
The interested reader is referred to the Supplement Metabolic and Genetic Diagnostics.
Considerations
Although the evidence in the literature is limited (no hits for our PICO), it is undisputed that metabolic diseases can cause neonatal cholestasis. Their frequency in this group is unknown, but it is certain that these are rare disorders.
Because most metabolic diseases underlying neonatal cholestasis are treatable, and because timely diagnosis and intervention can prevent irreversible organ damage or lifelong (intellectual and physical) disability, timely metabolic or genetic diagnostics are recommended, in particular once it has first been evaluated whether the neonatal cholestasis is caused by biliary atresia. The working group assumes that the metabolic and genetic diagnostics are performed in a tertiary centre.
Evidence base
Background
Neonatal cholestasis is a symptom of a number of relatively rare but serious liver diseases in young infants. As a first screening step to determine whether neonatal cholestasis is present, blood testing to measure total and direct serum bilirubin is advised in infants with acholic stools before the age of 3 weeks and in all infants with jaundice at the age of 3 weeks. Direct hyperbilirubinaemia is referred to as neonatal cholestasis.
One of the causes of neonatal cholestasis is biliary atresia (BA). BA leads to chronic liver failure at a young age. Early recognition of BA is crucial because timely surgical treatment can greatly improve the prognosis. The earlier the Kasai hepatoportoenterostomy can be performed, the better the native-liver survival of children with BA. In other forms of neonatal cholestasis, too, starting treatment as early as possible benefits the prognosis, for example by preventing brain damage and severe vitamin K-dependent bleeding. The working group examined which laboratory tests can make BA more likely.
Conclusions
Moderate GRADE | The evidence of eight paediatric studies suggests that an elevated GGT in infants with persistent jaundice or acholic stools increases the likelihood of biliary atresia. Sources: Liu 2019, Shneider 2017, Tang 2007, Chen 2016, Wongsawasdi 2008, El-Guindi 2014, Lee 2015, Rafeey 2016
Low GRADE | The evidence of 3 paediatric studies suggests that measuring direct bilirubin in isolation in infants with persistent jaundice or acholic stools is a good screening test to differentiate direct (pathological) from indirect hyperbilirubinaemia, but it is not a good screening test for biliary atresia. Sources: Liu 2019, Poddar 2009, Lee 2015
Summary of the literature
The databases Medline (via OVID) and Embase (via Embase.com) were searched with relevant search terms from January 2000 until December 9th, 2020. The detailed search strategy is depicted under the tab Methods. The systematic literature search resulted in 4076 hits for the whole module. Studies were selected based on the following criteria: relevance to PICO, systematic review (with meta-analysis), or observational study. Initially, 55 studies were selected based on title and abstract screening. After reading the full text, 49 studies were excluded for sub question a, two (2) studies were excluded for sub question b, and twelve (12) studies were excluded for sub question c (see the table with reasons for exclusion under the tab Methods). Six studies were included for sub question a, two studies for sub question b, and no studies were included for sub question c.
Results
Description of studies
a. Blood biochemistry
Wang et al. (2018) performed a systematic review with the aim of analysing the accuracy of different methods for diagnosing BA in patients with infantile cholestasis. They searched PubMed, EMBASE and the Web of Science databases for articles published up to July 2017. The inclusion criteria for the identified articles were as follows: (1) diagnostic test accuracy (DTA) studies evaluating sensitivity and specificity of at least one of B-US, MRCP, acholic stool, serum liver function test, hepatobiliary scintigraphy and percutaneous liver biopsy, (2) articles published as full texts in English and (3) studies with sufficient information for analysis. The exclusion criteria for the identified articles were as follows: (1) letters, reviews, case reports, conference abstracts, editorials, expert opinion reviews and abstracts, (2) sensitivity or specificity data that were incorrect or insufficient for analysis, or that were evaluated by more than one researcher without a consensus, (3) screening studies with a large population without cholestasis and (4) studies with overlapping cases and data.
If the cases of two or more studies overlapped, they gave priority to the study that evaluated more diagnostic methods or, if the diagnostic methods were the same, to the study that included more patients. Screening was performed in duplicate, independently, by two researchers at all stages. Disagreements in study selection between the two reviewers were resolved through discussion and consensus. To evaluate the quality of the included studies, Wang et al. used version 2 of the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool, applied by two researchers. All disagreements were discussed, and consensus was reached. In total, 38 articles were included. In 21 articles the final diagnosis of BA explicitly included intraoperative cholangiography (IC), alone or in combination with histology and/or surgery; one article did not use IC; and 8 articles did not state how the definitive diagnosis of BA was reached. Of the 38 articles, 25 performed the diagnostic test while the reference test results were unknown, 10 knew the reference test results in advance, and 3 did not report this.
For the analysis of serum liver function tests, seven studies could be included; it was not described how the diagnosis of BA was established in these specific studies. Notably, it was not reported which specific laboratory tests were included in the studies (AST, ALT, GGT, ALP or others). Five studies were prospective, one was retrospective, and the design of one study is unknown. Studies were performed in 5 different countries (2 Iran, 2 Egypt, 1 India, 1 Thailand, 1 Korea). A total of 493 patients were included, of whom 231 were diagnosed with BA.
Dong et al. (2018) retrospectively evaluated different methods to differentiate BA from non-BA related disorders. They performed their study during the period of May 2007 to June 2011 in China. Inclusion criteria of the study were (1) existence of jaundice in infancy (including the neonatal period) without remission; (2) pale or light yellow faeces; (3) hepatomegaly or hepatic texture change; (4) direct hyperbilirubinaemia. No exclusion criteria were described. Follow-up was more than a year. The subjects were diagnosed clinically as BA if: (1) the jaundice appeared in infancy (including the neonatal period) without remission; (2) the colour of the faeces was pale for longer than 15 days; (3) the liver was enlarged by more than 3 cm, or a texture change was seen. The subjects were diagnosed as BA if the serum GGT was ≥300 IU/L and as non-BA cholestasis if the serum GGT was <300 IU/L. This cut-off level was predefined. A total of 396 patients were included in the study (178 BA, 218 non-BA).
Harpavat et al. (2016) presented a case-control study on direct bilirubin (DB or CB) measurements in newborns with and without biliary atresia. They aimed to investigate whether DB or CB measurements in the newborn period could be sensitive and specific for BA. Eligible subjects in the BA group were cared for at Texas Children’s Hospital (TCH) and born between January 1, 2011, and December 31, 2014. Eligible infants either presented to TCH initially or were referred to TCH for further care. For those diagnosed in time for the Kasai operation, the BA diagnosis was made by intraoperative cholangiogram and pathological assessment of the bile duct remnant. For those identified too late for the Kasai operation, the BA diagnosis was inferred from the initial needle liver biopsy and subsequent analysis of liver and bile duct tissue removed at the time of liver transplantation/autopsy. A total of 9163 patients were included; the prevalence of BA was 0.06%. Laboratory results were known for 35 out of 61 patients. The cut-off level of direct bilirubin was the upper limit of normal used by the laboratory: 0.3 mg/dl (5.1 µmol/L).
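Because the guideline text reports bilirubin in µmol/L while several included studies use mg/dl, a short conversion sketch may help when comparing cut-offs. It assumes a molar mass for bilirubin of 584.66 g/mol, which is not stated in the source; the function name is hypothetical.

```python
# Assumption: molar mass of bilirubin (C33H36N4O6) is 584.66 g/mol.
BILIRUBIN_MOLAR_MASS_G_PER_MOL = 584.66


def bilirubin_mg_dl_to_umol_l(mg_dl: float) -> float:
    """Convert a bilirubin concentration from mg/dl to µmol/L."""
    mg_per_litre = mg_dl * 10                       # 1 dl = 0.1 L
    mmol_per_litre = mg_per_litre / BILIRUBIN_MOLAR_MASS_G_PER_MOL
    return mmol_per_litre * 1000                    # mmol/L -> µmol/L


harpavat_cutoff = bilirubin_mg_dl_to_umol_l(0.3)
print(round(harpavat_cutoff, 1))
```

The result for 0.3 mg/dl rounds to the 5.1 µmol/L quoted for the Harpavat cut-off; the same function gives roughly 34 µmol/L for a threshold of 2 mg/dl.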
Liu et al. (2019) aimed to design and validate a non-invasive diagnostic prediction criterion combining graphical, biochemical, and clinical examination for the early discrimination of BA in infants with neonatal cholestasis. They examined records of 2 consecutive cohorts of infants with neonatal cholestasis recruited from the Pediatric Hepatology Department, Xian Children’s Hospital, China, between January 2011 and June 2016. Cholestasis was diagnosed on the basis of a direct bilirubin (DBIL) of more than 20% of the total bilirubin (TBIL), an increased serum bile acid (SBA) concentration, and other biochemical indicators and clinical symptoms. The diagnosis of BA was confirmed by laparotomy with or without intraoperative cholangiogram (IOC). Cases without BA at IOC and those with other causes of cholestasis were included in the non-BA group. Excluded were patients 1) with Gilbert syndrome, 2) born prematurely, 3) with sepsis, or 4) receiving total parenteral nutrition. In total, 482 patients were included, 166 of them with BA. Based on the ROC curve, the optimal cut-off values of the individual parameters were determined.
Shneider et al (2017) performed a longitudinal prospective study in the USA and Canada between April 2004 and February 2014. The objective of this study was to determine the predictive value for BA of typical testing performed in the evaluation of cholestatic infants prior to the decision for invasive testing (e.g., liver biopsy, cholangiography, exploratory laparotomy). A secondary goal was to develop a diagnostic algorithm to help guide the clinician’s decision-making for invasive testing.
Inclusion criteria were: 1) age under 180 days at presentation to a ChiLDReN centre; and 2) serum direct bilirubin >20% of total bilirubin (TB) and ≥ 2mg/dl. Exclusion criteria were: 1) acute liver failure; 2) previous hepatobiliary surgery; 3) bacterial or fungal sepsis; 4) hypoxia, shock, or ischemic hepatopathy; 5) malignancy; 6) primary haemolytic disease; 7) drug or total parenteral nutrition-associated cholestasis; 8) extracorporeal membrane oxygenation (ECMO)-associated cholestasis; or 9) birth weight. Participants were included only if they had laboratory studies indicating direct hyperbilirubinemia that were performed at the time of “presentation” to the ChiLDReN clinical site. Inclusion in the BA cohort (Group 1) for this analysis required either the performance of a biliary drainage procedure for BA or exploratory surgery with the finding of an atretic extrahepatic bile duct by either inspection or attempted cholangiography. Inclusion in the non-BA cohort (Group 2) required the identification of a specific alternative aetiology for their cholestasis or cholangiography that excluded BA. For an infant with the clinical diagnosis of idiopathic neonatal hepatitis (INH) or idiopathic cholestasis (IC) to be included in this analysis, resolution of cholestasis was required as defined by a subsequent TB 120 days of age (without hepatic portoenterostomy). INH was defined as neonatal cholestasis in which histologic evidence of giant cell hepatitis was present on liver biopsy and for whom no other aetiology was confirmed. IC was defined as neonatal cholestasis that resolved in an infant who did not undergo liver biopsy or did not have giant cell hepatitis on a liver biopsy, and for whom no other aetiology was confirmed. 
Diagnoses in the 259 Non-BA infants who met study criteria (Group 2) included IC (n = 72), INH (n = 61), alpha-1 antitrypsin deficiency (n = 31), Alagille syndrome (n = 28), panhypopituitarism (n = 12), cytomegalovirus infection (n = 10), bile duct paucity (n = 10), progressive familial intrahepatic cholestasis (n = 8), cystic fibrosis (n = 6), mitochondrial disease (n = 6), bile acid synthesis defect (n = 5), and other (n = 8; 1 each for hemophagocytic lymphohistiocytosis, hereditary spherocytosis, neonatal ascites, Caroli’s disease, perinatal sclerosing cholangitis, porphyria, hyperinsulinism, and duplicate gall bladder). A total of 875 children were included, 401 were diagnosed with BA. Using logistic regression analysis, Shneider et al. (2017) developed a predictive model for biliary atresia (BA) using data available at initial presentation in 401 cholestatic infants with BA and 259 cholestatic infants without BA. They acknowledge that, despite the relatively good accuracy of the prediction model, the high precision required for differentiating BA from Non-BA was not achieved.
Tang et al. (2007) performed a retrospective study in which they analysed the diagnostic value of GGT level, GGT/AST ratio, and GGT/ALT ratio in 93 BA and 65 neonatal hepatitis (NH) patients. NH is another form of liver disease at neonatal age. The BA patients in this study were therefore not compared with healthy controls but with disease controls, namely NH patients. Charts with a diagnosis of BA or NH from 1986 to 2005 at Kaohsiung Chang Gung Memorial Hospital in Taiwan were reviewed. BA and NH were confirmed with serum bilirubin level, complete liver function profile, liver histology, and intraoperative findings. Cases with aetiologies other than BA and NH, such as sclerosing cholangitis, progressive familial intrahepatic cholestasis, alpha-1-antitrypsin deficiency, and Alagille syndrome, were excluded from the study. A cut-off value of 300 U/L was used for GGT, possibly based on the study of Liu, but this was not clearly described.
b. Imaging
b.1. Ultrasonography
A total of eight studies were selected for full-text analysis based on title and abstract. Studies already included in one of the three systematic reviews were excluded. Eventually only the three reviews were included, because the other five studies were either older than the included systematic reviews or did not report accuracy for BA (Fitzpatrick 2010).
Sun (2020) focused on the evaluation of the diagnostic accuracy of Hepatic Subcapsular Flow (HSF) with ultrasound. Literature search was conducted in PubMed, EMBASE, Chinese National Knowledge Infrastructure (CNKI), and Technology of Chongqing (VIP) up to January 1, 2019. Inclusion criteria were: (1) evaluation of the diagnostic potential of HSF for BA, (2) case-control design with a control group of patients with non-BA disease, and (3) sufficient data to calculate the diagnostic parameters. Exclusion criteria were as follows: (1) duplicate publications; (2) letters, editorials, and case reports or reviews; and (3) studies lacking complete data. Two reviewers independently extracted the necessary data. In addition, Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) was applied to assess the quality of the included studies. The initial literature search identified a total of 463 published records, 232 were excluded as duplicate publications. Then, 231 articles were left behind for the next assessment. Using prudent judgment, 180 articles were excluded as they were reviews and letters or were not related to the theme of BA or HSF. Thus, 51 articles were available for further examination. By reviewing the full text of the remaining articles, 42 articles had insufficient data or no relevance to the diagnosis and were excluded. Finally, nine studies related to HSF in the early detection of BA were included, four prospective and five retrospective.
A total of 772 patients were included, 363 patients were diagnosed with BA.
Wang (2018) performed the systematic review that was also used for the laboratory results. A total of 23 studies were included in the analysis: 17 prospective, 5 retrospective and one of unknown design. The studies used different ultrasound parameters, but all were pooled together. A total of 1801 patients were included; 759 patients were diagnosed with BA.
Zhou (2016) performed a meta-analytic review which aimed to systematically review and summarize published studies on the diagnostic performance of the triangular cord sign and gallbladder abnormalities on ultrasound as well as the absence of CBD, enlargement of the HA, appearance of hepatic subcapsular flow (HSF) to determine whether children with jaundice also have BA.
They searched the MEDLINE and Web of Science databases and included studies performed between January 1990 and May 2015 that were published in English. Studies were included if two criteria were met. First, the data needed to include 2 × 2 contingency data on the diagnostic accuracy of ultrasound in identifying biliary atresia in at least 10 patients with and 10 patients without disease. Second, the study needed to use surgery or biopsy as the reference standard for biliary atresia, and surgery, clinical follow-up, or some combination of the three as the reference standard for the exclusion of biliary atresia. They used the QUADAS-2 tool to assess the quality of studies.
Editorials, letters to the editor, review articles, case reports, and animal experiment studies were excluded. The reference lists of the included studies were manually searched to identify other potentially eligible articles. Two reviewers independently evaluated all relevant studies against the eligibility criteria. Disagreements in study selection between the two reviewers were resolved through discussion and consensus. If no consensus could be reached, a third reviewer was consulted. Based on title and abstract, 59 of the 419 identified studies were selected. Of these, 24 did not meet the inclusion criteria, six were case reports, four were not published in English and two were letters to the editor. Finally, 23 articles were selected for data extraction. Of these, 19 studies analysed gallbladder abnormalities, 20 the triangular cord sign and five hepatic artery enlargement. Eleven studies were reported as prospective, 11 as retrospective, and one did not specify the type of study. A total of 2136 patients were included; 864 patients had been diagnosed with BA.
b.2. Magnetic resonance cholangiopancreatography (MRCP)
A total of 14 studies were selected for full text analysis based on title and abstract. Eventually 13 studies were excluded, because they were older than the included systematic review of Wang (2018), did not report the accuracy of the test for detecting BA, population did not match, or the design was case-controlled.
Wang (2018) performed a systematic review as described above. Five studies could be included for the analysis of MRCP. In one of them, neither cholangiography nor operation was used as the gold standard. Four of the five studies had a prospective and one a retrospective design. A total of 381 patients were included, of whom 183 were diagnosed with BA.
b.3. Hepatobiliary iminodiacetic acid (HIDA) scan
A total of 12 studies were selected for full text analysis based on title and abstract. Eventually 10 studies were excluded, because they were older than the included systematic review of Wang (2018), described only cases or they included another population than intended.
Wang (2018) performed a systematic review as described above. For the analysis of hepatobiliary scintigraphy eighteen studies could be included. Of these eighteen studies 9 had a prospective study design, 8 a retrospective design and of one study this was not clearly described. A total of 1533 patients were included, of whom 535 were diagnosed with BA.
Tsuda (2019) aimed to evaluate the usefulness of 99mTc-PMT scintigraphy for suspected BA. They retrospectively analyzed data of patients between August 2001 and October 2018. All patients were diagnosed as BA or non-BA with preoperative cholangiogram or follow-up as the gold standard. The preparation included cessation of breastfeeding for at least 4 h before the test. Imaging was carried out on a single-head gamma camera using a low-energy high-resolution (LEHR) collimator or low-medium energy general purpose (LMEGP) collimator with the patient in a supine position. The dose of 99mTc-PMT was injected intravenously based on age or body weight. Anterior-view hepatic phase dynamic images were obtained for 60 min, followed by anterior static images at least once during 2–7 h. If no radiotracer was detected in the bowel by then, a delayed scan was taken at 24 h. Normally, most of the tracer is accumulated in the liver within 5 min after injection and appears in the proximal small bowel within 30 min. Cases were regarded as positive for BA if the scans showed good liver uptake with no intestinal excretion up to 24 h. Two nuclear medicine physicians visually analyzed the images. If the two readers’ interpretations were conflicting, a third nuclear medicine physician reviewed the study. In total 52 patients were included, of whom 35 were diagnosed with BA.
b.4. Endoscopic retrograde cholangiopancreatography (ERCP)
A total of 8 studies were selected for full text analysis based on title and abstract. Eventually 2 studies were excluded.
Iinuma (2000) assessed the utility of ERCP in diagnosis of cholestatic liver disease in infants. They included children who were admitted to the hospital with suspicion of obstructive jaundice between April 1982 and December 1996. All ERCPs were performed by one endoscopist in the same way. Reference test was exploratory laparotomy. In the period 65 patients met inclusion criteria, in 50 patients ERCP was performed. The remaining 15 patients underwent exploratory laparotomy directly. Of these infants no diagnosis is described in the article. In 46 of the remaining 50 patients diagnosis of cholestasis had not been conclusive based on other diagnostic modalities. The other four were diagnosed with congenital biliary dilatation. Prevalence of BA in the study population was 35/50 (70%).
Keil (2010) aimed to determine the safety and diagnostic efficiency of ERCP in the diagnosis of cholestatic liver disease in neonates and infants. They retrospectively analyzed the ERCP examinations performed in neonates and infants aged 1 year or younger with cholestatic jaundice between December 1998 and March 2008. In total 104 children were referred. Before ERCP all children underwent a diagnostic work-up according to hospital protocol. Children with infectious, metabolic, and endocrine disorders were excluded. ERCP was only performed in patients with suspicion of extrahepatic biliary obstruction based on absence of positive findings on prior diagnostic procedures. All procedures were performed by one or two endoscopist(s) with the same protocol. BA was strongly suspected when only the pancreatic duct could be filled. Contrast filling of the common bile duct, cystic duct, and gallbladder only, without the common hepatic duct, was defined as type II BA. If neither the CB nor the papilla could be filled, or papilla cannulation could not be performed, the result was defined as inconclusive. In the whole population BA prevalence was 51/104 (54%), in children after technically successful ERCP 49/95 (52%).
Negm (2018) included all infants with neonatal cholestasis between 2000 and 2014 who presented to the endoscopy unit in Hannover with suspicion of BA. The aim of the study was to evaluate the role of ERCP in the diagnosis of BA. All included infants had neonatal cholestasis of unknown origin that could not be clarified using ultrasound or hepatobiliary scintigraphy. The endoscopic procedure was performed with the same protocol; the only difference was the use of air in the first 140 infants and CO2 thereafter. Contrasting of the pancreatic duct only, or of the cystic duct with the gallbladder and/or a rudimentary common bile duct, was considered positive for BA. If BA was excluded by ERCP, a liver biopsy was performed. If ERCP was positive for suspected BA, explorative laparotomy was performed, with the intraoperative diagnosis as the gold standard. Of the 251 patients, BA was confirmed in 155 (61.7%).
Saito (2014) retrospectively analyzed data of pediatric patients undergoing ERCP between 1980 and 2011 to clarify the results of pediatric ERCP. The protocol was the same for all procedures. The visualization of the extrahepatic bile duct, intrahepatic bile duct, pancreatic duct and pancreaticobiliary maljunction was evaluated. They included 235 ERCP procedures in 220 patients; 8 procedures had a therapeutic aim. This cohort included all children up to 18 years of age. A sub-analysis was made for the neonatal cholestasis cohort. Ninety-two diagnostic ERCPs in 90 infants were performed for this analysis; the diagnosis of BA was confirmed in 61 children (67%).
Shanmugam (2009) reviewed data of infants younger than 100 days undergoing ERCP between 1997 and 2007 at King’s College Hospital London. ERCP was indicated when histopathologic features consistent with large bile duct obstruction were accompanied by pigmented stools; when acholic stools were present but such histologic features were not; or when investigations suggested that BA and another condition causing prolonged neonatal jaundice might coexist. ERCP was performed by protocol.
Exploratory laparotomy and intraoperative cholangiography were performed when the intrahepatic biliary tree could not be demonstrated on ERCP or when the operator had been unable to cannulate the ampulla. In total, 224 patients with BA were identified among 3300 patients who were analyzed for liver disease (6%). ERCP was performed in 48 patients, of whom 25 had confirmed BA (52%).
Shteyer (2012) aimed to determine the use, complication rate and success rate of ERCP in a cohort of infants who underwent ERCP as part of the work-up for neonatal cholestasis in one institution in Israel between January 2000 and 2010. During this period only two endoscopists performed ERCP, using the same protocol for the diagnostic analysis prior to ERCP as well as during ERCP. Diagnosis of abnormal ERCP was made according to previous publications. Failed ERCP was defined as the inability to cannulate the ampulla. ERCP results were interpreted after ultrasound (group divided into no suspicion vs. suspected BA). Analyses were done in 27 patients, 15 of whom had BA (56%). ERCP failed in 5 patients. Of them, two were diagnosed with BA, one with paucity of bile ducts and two with stones. In one patient the ducts did not fill normally; the final diagnosis was neonatal hepatitis.
c.1. Liver biopsy
Seven studies were selected for full-text analysis based on title and abstract. Four of these articles were finally excluded. Mandelia (2017) had a study design not compatible with our aims, Chaudhry (2019) did not describe the diagnostic accuracy of the biopsy, and Shreef (2016) only described the accuracy of biopsy in combination with laparoscopic-guided cholecystocholangiography. Finally, the report of Rastogi et al. was older than that of Wang et al. (2018).
In the systematic review described above, Wang (2018) also analysed the diagnostic accuracy of liver biopsy. For this analysis they included eleven studies: 7 prospective, 3 retrospective and one of unknown design. A total of 315 out of 646 patients were diagnosed with BA.
Ahmed (2021) performed a retrospective cohort study in a tertiary hospital in Saudi Arabia between 2007 and 2019, in which they aimed to evaluate the performance of liver biopsy in infantile cholestasis. Cholestasis was defined clinically as the presence of jaundice, acholic stools, with or without itching, and biochemically as a direct bilirubin exceeding 20 μmol/l. All infants who underwent percutaneous liver biopsy (LB) as part of the diagnostic work-up for cholestasis during the study period were enrolled. All liver specimens were obtained through ultrasound-guided needle biopsy under general anesthesia with an 18-gauge Menghini needle. The biopsy was not performed by the same person in all patients. All liver specimens were fixed in 10% buffered formaldehyde and paraffin-embedded. One pathologist reviewed all biopsies. The biopsy materials were screened for adequacy of size and number of portal tracts. The pathologist evaluated each liver specimen for the presence of the following: lobular disarray, giant cell transformation, hepatocyte swelling, bile duct proliferation (mild-moderate or severe), bile duct plugs, bile duct paucity, bile duct injury, periductular fibrosis (onion skinning), peri-sinusoidal fibrosis, portal fibrosis (grade I–II/grade III–IV), canalicular cholestasis and fatty infiltration (micro- or macro-steatosis). Immunohistochemistry for bile salt export protein (BSEP) to diagnose progressive familial intrahepatic cholestasis type 2 (PFIC2), and for multidrug resistance protein 3 (MDR3) to diagnose PFIC3, was not possible. Diagnosis of BA was made when intraoperative cholangiography (IOC) failed to show a patent biliary tree or, when IOC was not possible, when an atretic extrahepatic biliary tree was demonstrated intraoperatively at Kasai surgery or liver transplant. Of the 522 patients with infantile cholestasis, 166 underwent a biopsy and 122 were included in the analysis.
In the analysis of the diagnostic accuracy, they used only the biopsies which were performed in high suspicion cases (n=46), 22 patients were finally diagnosed with BA.
Russo (2016) performed a prospective longitudinal study that included a substudy aiming to assess the accuracy of the biopsy diagnosis of BA. Eligible subjects were 180 days old or less at the time a liver biopsy was performed and presented at one of 15 participating clinical centers with cholestasis, defined as a serum direct bilirubin greater than or equal to 2 mg/dL and greater than 20% of total bilirubin. Infants who had received parenteral nutrition, who were very low birth weight (<1500 g), who had acute liver failure or cholestasis associated with shock or sepsis, or who had undergone previous HPE or other hepatobiliary surgery were not eligible for enrollment. All enrollments occurred between June 1, 2004, and November 2, 2014. The clinical diagnosis of BA at the enrolling site was based on an intraoperative cholangiogram and/or examination of the excised biliary remnants. The diagnosis of all non-BA cases was established on clinical, laboratory, or genetic grounds with adequate follow-up to confirm the absence of BA. To assess the accuracy of the biopsy diagnosis of BA, needle liver biopsies obtained at the participating centers at the time of enrollment in PROBE were reviewed centrally by the ChiLDReN pathologists. Wedge biopsies were excluded from the correlation of histology with clinical diagnosis, as it was believed that a wedge biopsy would bias the pathologists towards a diagnosis of BA. The histopathologic features of 227 needle biopsies obtained at enrollment were evaluated and compared with the clinical diagnosis as determined at each of the participating institutions. There were 136 BA and 91 non-BA cases.
c.2. Metabolite diagnostics inborn errors of metabolism
c.3. Cholestasis gene panel
RESULTS OF STUDIES
The selected studies analyzed different laboratory results in BA patients, mostly versus non-BA patients, and showed significant differences, which were only present for ALT, direct bilirubin and GGT. We used these parameters for diagnostic accuracy as discussed below.
In Wang et al. pooled data could not be used for our outcome, because the different studies included in this systematic review all used other parameters, with different cut off points. Therefore, if available, the results of the underlying studies were used in this section.
Direct bilirubin
Diagnostic accuracy of conjugated bilirubin was reported in two studies (Harpavat et al, Liu et al).
Sensitivity was respectively 100 and 90.9%, with a negative predictive value of 100 and 93.2%. Both studies showed a lower specificity (98.2 and 65.9%) and a lower positive predictive value (17 and 58.4%), respectively. It should be noted that Harpavat et al. described a case-control study of BA patients compared to a population cohort, in contrast to Liu et al, whose study included a preselected group of children with cholestasis. This likely explains the difference in prevalence.
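The effect of the study population on the predictive values can be made concrete with a short calculation. The following is a minimal sketch, not part of the original evidence review, that derives the four accuracy measures from 2 × 2 counts. The counts are those reported for direct bilirubin in the summary table for Harpavat et al. and Liu et al.; the class name `TwoByTwo` and its methods are hypothetical helpers introduced only for illustration, and rounding may differ slightly from the published percentages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """2 x 2 diagnostic table (illustrative helper, not from the source)."""
    tp: int  # true positives: BA, test positive
    fn: int  # false negatives: BA, test negative
    tn: int  # true negatives: no BA, test negative
    fp: int  # false positives: no BA, test positive

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


# Counts read from the summary table for direct bilirubin.
harpavat = TwoByTwo(tp=35, fn=0, tn=8936, fp=166)   # population cohort
liu = TwoByTwo(tp=151, fn=15, tn=208, fp=108)       # preselected cholestatic infants

for name, table in (("Harpavat", harpavat), ("Liu", liu)):
    print(
        f"{name}: sensitivity {table.sensitivity():.1%}, "
        f"specificity {table.specificity():.1%}, "
        f"PPV {table.ppv():.1%}, NPV {table.npv():.1%}"
    )
```

Although specificity is much higher in the population cohort, its PPV is far lower, because true positives are rare when almost all tested infants do not have BA.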
In the systematic review by Wang et al., only the studies by Poddar et al. and Lee et al. reported diagnostic accuracy for this laboratory result which ranged between 74-97%, 85-94%, 26-61% and 41-62% for sensitivity, NPV, specificity and PPV, respectively.
ALT
Only Liu et al reported accuracy of ALT. With a cut off level of 67.5 the sensitivity was 86.1% with a negative predictive value of 85.4%. Specificity was 42.5% with a positive predictive value of 44%.
In Wang et al. only Deghani et al reported these parameters; sensitivity of 68%, NPV of 77%, specificity of 43% and PPV of 33%.
GGT
GGT was the most frequently reported significant marker, in four studies (Dong et al., Liu et al., Tang et al. and Shneider et al.). Cut-off levels in these studies ranged from 184 to 300 IU/L. Sensitivity ranged between 40 and 84.5%, while PPV ranged between 56 and 90.7% in the first three studies. Shneider et al. showed that GGT alone was not sensitive (10%) and had a low negative predictive value (22.7%), but its accuracy was higher in combination with acholic stool (85.5% and 77.3%, respectively).
Included studies in Wang et al. showed accuracy of GGT in Wongsawaadi et al (2008), El Guindi et al (2013), El Guindi et al (2014), Lee et al (2015) and Rafeey et al (2016).
Sensitivity ranged between 66 and 89% and the NPV between 74 and 86%. The specificity ranged between 67 and 97% and the PPV between 50 and 80%. It should be noted that each of these studies had a very small population with a high relative prevalence of BA because of preselection.
Due to large heterogeneity of the study populations, no meta-analysis was performed.
b. Imaging
b.1. Ultrasonography
Three systematic reviews analyzed this parameter. Sun specifically analyzed subcapsular flow on ultrasound. In the nine included studies sensitivity ranged between 73 and 100% and specificity between 71 and 100%. Overall sensitivity was 95% (95% confidence interval (CI) 88-98%) and specificity 92% (95% CI 85-96%).
In Wang, 23 studies described ultrasound parameters, and their outcomes were analyzed in combined fashion. Reported sensitivity varied between 27 and 100% and specificity between 71 and 100%. Overall sensitivity was 77% (95% CI 74-80%) and specificity 93% (95% CI 91-94%).
Zhou also included 23 studies and analyzed these studies per parameter. Pooled data for triangular cord sign (n=20 studies) showed sensitivity of 74% (95% CI, 61-84%) and specificity of 97% (95% CI, 95–99%). Sub analysis showed for location of bifurcation ( n=10 studies) sensitivity and specificity of 80% (58-92%) and 99% (96-100%), respectively. Measurements of triangular cord on echogenic anterior wall of the right portal vein (n=6 studies) showed 63% (55-71%) and 95% (89-98%) for sensitivity and specificity, respectively. The same analysis for Thickness ≥ 4 mm resulted in 60% (46–72%) and 97% (94–99%).
Hepatic artery enlargement was analyzed in 5 studies; overall sensitivity was 79% (95% CI, 71–86%) and specificity 75% (95% CI, 60–86%).
Gall bladder abnormalities were reported in 19 studies, with an overall sensitivity of 85% (95% CI, 76–91%) and specificity of 92% (95% CI, 81–97%). Sub analysis showed for absent gall bladder (n=10 studies) a sensitivity of 28% (19-40%) and a specificity of 99% (93-100%). For the other parameters, namely absence or length < 1.5 cm (n=6 studies), gallbladder wall abnormalities (n=5 studies) and no contraction (n=5 studies), sensitivity was 79% (66-88%), 83% (70-91%) and 89% (91-93%), respectively, and specificity 87% (65-96%), 94% (91-96%) and 79% (55-92%), respectively. Five studies described the combination of triangular cord sign and gall bladder abnormalities, resulting in a sensitivity of 95% (70-99%) and a specificity of 89% (79-94%).
Due to large heterogeneity of the study populations, no meta-analysis was performed.
b.2. MRCP
Only Wang analyzed the accuracy of MRCP in cholestatic neonates. In this review 5 studies were included. Sensitivity and specificity ranged from 85 to 100% and from 36 to 96%, respectively. Overall sensitivity and specificity were 96% and 58%, respectively. Overall NPV and PPV were 94% and 68%.
b.3. HIDA
Studies used different scintigraphy techniques, different premedication and various time points after tracer administration for obtaining the dynamic scan. In most studies, absence of excretion within 24 hours was taken as indicative of BA; the studies were therefore comparable. Reported sensitivity was 84-100% in Wang (overall 96%) and 100% in Tsuda. Specificity, however, ranged between 43 and 93% in Wang (overall 73%) and was 82% in Tsuda.
b.4. ERCP
In Iinuma (2000), ERCP was completed in 43 of 50 patients (86%). Of the seven patients in whom ERCP could not be completed, 6 were diagnosed with BA. In 29 of the 43 patients, complete visualization of the biliary tree was not achieved, or only the pancreatic duct was visualized. In all of these patients BA was confirmed by exploratory laparotomy. In most of them only the pancreatic duct was visible on ERCP (n=24/29). In one patient the short distal end of the common bile duct and the pancreatic duct were visualized.
In Keil (2010), 95 of 104 ERCPs were completed successfully (91.3%). Eight of the nine children with failed ERCP were later diagnosed with BA. Among the remaining 95 children, ERCP identified 51 as BA patients. During laparotomy, BA was confirmed in 49 of these 51 patients.
Negm (2018) included 251 patients, 224 ERCPs were successful. ERCP suggested BA in 159 cases, 154 underwent laparotomy. Of one infant no data is presented, 4 children underwent no laparotomy. Intraoperatively 140 infants were confirmed with BA. In 52 cases ERCP was normal and in 13 ERCP showed another diagnosis. No BA was confirmed in one of these 65 children. For this analysis they excluded data of the failed ERCP in 27 infants. In this group 13/27 underwent a Kasai procedure.
Saito (2014) performed 92 ERCPs in 90 patients, of which 84 succeeded. In this group, BA was confirmed in 56 patients; the other 28 were diagnosed with choledochal cyst (n=12), hepatitis (n=7), paucity (n=3), unknown origin (n=4) and normal ERCP (n=2). Of the 8 patients with failed ERCP, 5 were finally diagnosed with BA.
Shanmugam (2009) performed 48 ERCPs, 45 were successful. In 20 patients the complete biliary tree was visualized, 25 were suspected of BA. Of these patients 22 were confirmed with intra operative cholangiography. In the three unsuccessful ERCPs BA was confirmed in all.
Shteyer (2012) performed 27 ERCPs; 15 patients were finally diagnosed with BA. ERCP was conclusive for BA in 15 patients, and no false negatives were reported. Among the 12 patients without BA, 5 ERCP procedures failed and in one infant the ducts did not fill normally.
Due to large heterogeneity of the study populations, no meta-analysis was performed.
c.1. Liver biopsy
Wang included 11 studies in this analysis, with comparable results. Sensitivity ranged between 94 and 100%, while specificity ranged between 84 and 100%. Overall sensitivity was 96% and specificity 73%. Ahmed was a small study which showed a sensitivity of 86% and a specificity of 67%. Russo showed, in a prospective design, a sensitivity of 89% and a specificity of 85%. The overall accuracy was 90% in their cohort.
SUMMARY OF DIAGNOSTIC ACCURACY CHARACTERISTICS PER STUDY
Direct bilirubin
Study | Cut-off level (µmol/l) | Prevalence of BA | Sensitivity (95% CI) | NPV (95% CI) | Specificity (95% CI) | PPV (95% CI) |
Harpavat et al. (2016) | 26.5 | 35/9163 | 35/35 (100%, 90-100%) | 8936/8936 (100%, 99.9-100%) | 8936/9102 (98.2%, 97.8-98.2%) | 35/201 (17%, 13-23%) |
Liu et al. (2019) | 90.3 | 166/482 | 151/166 (90.9%, 86-94%) | 208/223 (93%, 89-96%) | 208/316 (65.9%, 60.4-70.8%) | 151/259 (58.4%, 52.2-64.1%) |
Wang et al. (2018) | | | | | | |
Poddar et al. (2009) | 68 | 35/101 | 34/35 (97%, 85-99%) | 17/18 (94%, 74-99%) | 17/66 (26%, 17-37%) | 34/83 (41%, 31-52%) |
Lee et al. (2015) | 440 | 46/100 | 34/46 (74%, 60-84%) | 33/39 (85%, 70-93%) | 33/54 (61%, 48-73%) | 34/55 (62%, 49-73%) |
ALT
Study | Cut-off level (U/L) | Prevalence of BA | Sensitivity (95% CI) | NPV (95% CI) | Specificity (95% CI) | PPV (95% CI) |
Liu et al. (2019) | 67.5 | 166/482 | 143/166 (86.1%, 80.1-90.6%) | 134/157 (85.4%, 79.0-90.0%) | 134/316 (42.5%, 37.1-47.9%) | 143/224 (44%, 57.3-69.9%) |
Wang et al. (2018) | | | | | | |
Deghani et al. (2006) | 5 times less than normal mean | 19/65 | 13/19 (68%, 46-85%) | 20/26 (77%, 58-89%) | 20/46 (43%, 30-58%) | 13/39 (33%, 21-49%) |
GGT
Study | Cut-off level (U/L) | Prevalence of BA | Sensitivity (95% CI) | NPV (95% CI) | Specificity (95% CI) | PPV (95% CI) |
Liu et al. (2019) | 184; 180 | 166/482 (=34%) | 143/166 (86.1%, 80.1-90.6%); 87% | 216/238 (90.7%, 86.4-93.8%); 91% | 216/316 (68.3%, 63.0-73.2%); 67% | 143/244 (58.5%, 52.3-64.6%); 58% |
Shneider + acholic stool | 204; 204 | 401/875; 343/660 (=52%) | 343/401 (85.5%, 81.8-88.6%); 88% | 198/256 (77.3%, 71.8-82.1%); 79% | 198/259 (76.4%, 70.9-81.2%); 48% | 343/401 (85.5%, 81.8-88.6%); 65% |
Tang | 300 | 93/158; 68/122 (=56%) | 27/68 (40%, 29-52%); 40% | 53/94 (56%, 46-66%); 56% | 53/54 (98%, 90-100%); 98% | 27/28 (96%, 92-99%); 96% |
Chen 2016 | 303 | 1338/1469 (=91%) | 83% | 32% | 82% | 98% |
Wang et al. | | | | | | |
Wongsawadi (2008) | 500 | 31/61 | 19/29 (66%, 47-80%) | 28/38 (74%, 58-85%) | 28/29 (97%, 83-99%) | 10/20 (50%, 30-70%) |
El Guindi (2014) | 286 | 30/60 (=50%) | 23/30 (77%, 59-88%) | 24/31 (77%, 60-89%) | 24/30 (80%, 63-91%) | 23/29 (79%, 62-90%) |
Lee (2015) | 200 | 46/96 | 40/46 (87%, 74-94%) | 38/44 (86%, 73-94%) | 38/50 (76%, 63-86%) | 40/52 (77%, 64-86%) |
Rafeey (2016) | 218.5 | 18/30 | 16/18 (89%, 67-97%) | 16/20 (80%, 58-92%) | 8/12 (67%, 39-86%) | 8/10 (80%, 49-94%) |
Ultrasound
Study | Prevalence of BA | Sensitivity (95% CI) | Specificity (95% CI) | NPV | PPV | AUC (95% CI) |
Triangular cord sign | | | | | | |
Wang (2018) | 759/1801 (42%) | 77% (95% CI 74-80%), I2 88.6% | 93% (95% CI 91-94%), I2 76.9% | | | 94% |
Zhou (2016), overall (20/23 studies, n=?) | 864/2136 (40%) | 74% (95% CI 61-84%), I2 = 93.31 | 97% (95% CI 95–99%), I2 = 77.21% | | | 97% (95-98%) |
Zhou (2016), location bifurcation (10/23 studies) | | 80% (58-92%) | 99% (96-100%) | | | 98% (97-99%) |
Zhou (2016), echogenic anterior wall of the right portal vein (6/23 studies) | | 63% (55-71%) | 95% (89-98%) | | | 74% (70-77%) |
Zhou (2016), thickness ≥ 4 mm (7/23 studies) | | 60% (46–72%) | 97% (94–99%) | | | 94% (92–96%) |
Gall bladder abnormalities | | | | | | |
Zhou (2016), overall (19/23 studies, n=?) | | 85% (95% CI 76–91%), I2 = 94.89 | 92% (95% CI 81–97%), I2 = 90.86% | | | 94% (95% CI 91–95%) |
Zhou (2016), absence of gall bladder (10/23 studies) | | 28% (19-40%) | 99% (93-100%) | | | 64% (60-68%) |
Zhou (2016), absence or length < 1.5 cm (6/23 studies) | | 79% (66-88%) | 87% (65-96%) | | | 86% (83-89%) |
Zhou (2016), wall abnormalities (5/23 studies) | | 83% (70-91%) | 94% (91-96%) | | | 96% (94-97%) |
Zhou (2016), no contraction (5/23 studies) | | 89% (91-93%) | 79% (55-92%) | | | 90% (87-92%) |
Triangular cord and gall bladder abnormalities | | | | | | |
Zhou (2016) (5/23 studies, n=?) | 864/2136 (40%) | 95% (70-99%) | 89% (79-94%) | | | 94% (92-96%) |
Hepatic artery enlargement | | | | | | |
Zhou (2016) (5/23 studies, n=?) | 864/2136 (40%) | 79% (95% CI 71–86%) | 75% (95% CI 60–86%) | | | 83% (95% CI 80–86%) |
Subcapsular flow | | | | | | |
Sun | 363/772 (47%) | 89% (95% CI 83-94%), I2 = 80.16 | 93% (95% CI 90-95%) 0.30, I2 = 80.51 | | | |
MRCP
HIDA-scan
Study | Prevalence of BA | Sensitivity | Specificity | NPV | PPV | AUC/Accuracy |
Wang (2018) | 535/1533 (35%) | 96% (94–97%), I2 52.4% | 73% (70–76%), I2 87.4% | 97.2% | 64.5% | AUC 0.9300; LR+ 3.26 (95% CI 2.38–4.48); LR– 0.09 (95% CI 0.05–0.16) |
Tsuda (2019) | 35/52 (67%) | 35/35 (100%, 95% CI 90-100%) | 14/17 (82%, 95% CI 59-94%) | 14/14 (100%, 95% CI 78-100%) | 35/38 (92%, 95% CI 79-97%) | Accuracy 49/52 (94%) |
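The 95% confidence intervals around proportions from small studies, such as those of Tsuda (2019), can be checked with a few lines of code. The sketch below uses the Wilson score interval. That the reported intervals were calculated with this method is an assumption: it is not stated in the evidence review, although the Tsuda intervals in the table are consistent with it after rounding. The function name `wilson_interval` is a hypothetical helper.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 for roughly 95%).

    Assumption: the source does not state which interval method was used.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Proportions from the Tsuda (2019) row of the table.
for label, k, n in (
    ("sensitivity", 35, 35),
    ("specificity", 14, 17),
    ("NPV", 14, 14),
    ("PPV", 35, 38),
):
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.0%} (95% CI {low:.0%}-{high:.0%})")
```

Unlike the simple normal approximation, this interval stays within 0-100% and remains informative when the observed proportion is 100%, which is why a sensitivity of 35/35 still carries a lower bound of about 90%.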
b.4. ERCP
See Table 9.
Liver biopsy
Study | Prevalence of BA | Sensitivity | Specificity | NPV | PPV | AUC/Accuracy |
Wang (2018) | 315/646 (49%) | 96% (95% CI 94–97%), I2 52.4% | 73% (95% CI 70–76%), I2 87.4% | 97.2% | 64.5% | 0.9300 |
Ahmed (2021) | 22/122 (18%) | 19/22 (86%, 95% CI 67-95%) | 16/24 (67%, 95% CI 47-82%) | 16/19 (84%, 95% CI 62-94%) | 19/27 (70%, 95% CI 52-84%) | Overall accuracy 76% |
Russo (2016) | 136/227 (60%) | 121/136 (89%, 95% CI 83-93%) | 77/91 (85%, 95% CI 76-91%) | 77/92 (84%, 95% CI 75-90%) | 121/135 (89%, 95% CI 83-94%) | Overall accuracy 90.1% (95% CI 85.2-94.9%) |
Metabolic and genetic diagnostics
Description of studies
Not applicable.
Results
Not applicable.
Level of evidence of the literature
All laboratory tests evaluated in the included studies are tests to screen for biliary atresia. Because it is important that biliary atresia is not missed, sensitivity (with a low false-negative rate) is a critical outcome. Because all studies were diagnostic accuracy studies, the GRADE level of evidence started as high. This level was downgraded for all outcome measures for risk of bias (highly preselected populations, no randomized studies, cut-off values not prespecified).
Liu et al. was downgraded for imprecision and indirectness. Downgrading was not necessary for the study by Harpavat et al., because this was a population-based study with narrow confidence intervals.
For ALT and GGT, the level of evidence was downgraded for inconsistency because of heterogeneity (all studies used different cut-off points), which makes pooling impossible. Furthermore, all studies reported a high prevalence, but, as mentioned before, this results from the inclusion of children with cholestasis rather than from screening of every live-born neonate. Based on GRADE, the level of evidence for these parameters is very low.
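The effect of prevalence on predictive values can be made concrete with Bayes' rule. The following is a minimal Python sketch, not part of the guideline's methods: the function name `predictive_values` is hypothetical, the sensitivity, specificity and study prevalence are taken from the GGT > 300 IU/L counts reported for Dong et al. (2018) in the evidence table, and the 1% prevalence is an arbitrary illustrative value for a less preselected population.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    Assumes sensitivity and specificity are fixed properties of the test
    and do not change with the population (a simplifying assumption).
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Counts as reported for Dong et al. (2018), GGT > 300 IU/L.
sens = 115 / 156
spec = 128 / 191
study_prevalence = 156 / 347  # preselected cholestatic infants

ppv_study, npv_study = predictive_values(sens, spec, study_prevalence)

# Hypothetical lower prevalence (illustrative value, not from the guideline).
ppv_low, npv_low = predictive_values(sens, spec, 0.01)

print(f"Study prevalence: PPV={ppv_study:.3f}, NPV={npv_study:.3f}")
print(f"Prevalence 1%:    PPV={ppv_low:.3f}, NPV={npv_low:.3f}")
```

At the study prevalence the function reproduces the reported PPV (115/178) and NPV (128/169). With the same test characteristics at a lower prevalence the PPV falls and the NPV rises, which is why predictive values from preselected cohorts should not be transferred directly to a screening setting.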
The evidence of eight pediatric studies shows that an elevated GGT in infants with persistent jaundice or acholic stool increases the likelihood of biliary atresia, but a GGT below the threshold cannot reliably exclude BA.
The evidence of 3 pediatric studies shows that serum direct bilirubin of infants with persistent jaundice or acholic stools cannot reliably differentiate between BA and non-BA.
Imaging
Ultrasound
Three systematic reviews were included. The study of Wang pooled all data together, but with high heterogeneity. The other two reviews analysed the data per specific ultrasound parameter. For this reason the evidence was downgraded for imprecision. All studies addressed the research question; no publication bias or indirectness was present.
MRCP
All five studies included in Wang were diagnostic studies, four with a prospective and one with a retrospective design. Based on the reported QUADAS assessment, the risk of bias in these studies was low, except for the index test in Yang (the gold standard of cholangiography or operation was not used). Heterogeneity between the included studies was present, especially for specificity. Since confidence intervals were wide, the evidence was downgraded by one level for imprecision. No downgrading for inconsistency was applied, because of the different population sizes and MRCP parameters used. The studies aimed to answer the same clinical question as this report. No publication bias was present.
HIDA
In the QUADAS quality assessment reported by Wang, most studies scored a low risk of bias; a few did not for the index test. No indirectness, imprecision or publication bias was present. In Wang, the lowest specificity was reported in small studies.
ERCP
The evidence table and the discussion follow below (see Tables 8 and 9).
Liver biopsy
Russo (2016) was a prospective longitudinal study of good quality. Wang et al. was a systematic review that included 11 studies. According to the QUADAS assessment reported in this paper, the risk of bias was low except for the index test in 4/11 studies. Population selection was unclear in 6/11 studies. All things considered, a medium risk of bias was present. The confidence intervals overlapped. Neither inconsistency nor publication bias was present.
Metabolic and genetic diagnostics
Not applicable.
Search and select
A systematic review of the literature was performed to answer the following question: What is the diagnostic accuracy of laboratory tests, imaging, and metabolic and genetic testing in the evaluation of children with neonatal cholestasis?
P (Patients) = Neonate, infant
I1 (Index test 1) = General biochemical testing (serum bilirubin, direct bilirubin).
I2 (Index test 2) = Imaging diagnostics (medical ultrasound, diagnostic sonography, ultrasonography, endoscopic retrograde cholangiopancreatography (ERCP), cholescintigraphy, hepatobiliary scintigraphy, HIDA scan (PIPIDA scan, DISIDA scan, BrIDA scan), magnetic resonance cholangiopancreatography (MRCP)).
I3 (Index test 3) = Non-imaging diagnostics (metabolic, genetic, liver biopsy). Amino acids (P, U). Organic acids (U). Acylcarnitines (DBS, P). Porphyrins (U, P, RBC). VLCFA (P). Sialotransferrins (S). Purines (U). Oligosaccharides (U). Bile acids (S, U). Sterols (P). Copper (S, U). Ceruloplasmin (S). LS-Enzymes (DBS). Alpha-fetoprotein (S). SAH & SAM (P). Sulfatides (U). Galactose-1-P (RBC). Manganese (B). Liver biopsy: respiratory chain complexes / OXPHOS analysis, ATP production, mitochondrial morphology. (S = serum, P = plasma, U = urine, B = blood, DBS = dried blood spot, RBC = red blood cells). Genetics: gene, genetic, familial, inherited, heritable, molecular diagnostics, next(-)generation sequencing (NGS), (whole) exome sequencing (WES), (whole) genome sequencing (WGS), (targeted) gene panel, mitochondrial DNA (mutations), mitochondrial depletion, mtDNA, copy number variation (CNV), deletion, duplication, monogenic, mutation, chromosomal, (SNP-)array, DNA, MLPA.
Relevant outcome measures
The guideline development group considered sensitivity and negative predictive value as critical outcome measures for decision making; and specificity and positive predictive value as important outcome measures for decision making.
A priori, the guideline development group did not define the outcome measures listed above but used the definitions used in the studies.
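As an illustration of how these outcome measures relate to one another, the sketch below derives them from a single 2x2 table. It is a minimal Python example under stated assumptions, not the working group's analysis: the class name `TwoByTwo` and its attributes are hypothetical, and the counts are those reported for Tsuda (2019) in the HIDA table (35 true positives, 3 false positives, 0 false negatives, 14 true negatives).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of index-test results against the reference standard."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    @property
    def sensitivity(self):
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self):
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self):
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self):
        return self.tn / (self.tn + self.fn)

    @property
    def accuracy(self):
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def lr_positive(self):
        # Undefined when specificity is 1; reported here as infinity.
        if self.fp == 0:
            return float("inf")
        return self.sensitivity / (1 - self.specificity)

    @property
    def lr_negative(self):
        # Undefined when specificity is 0; reported here as infinity.
        if self.tn == 0:
            return float("inf")
        return (1 - self.sensitivity) / self.specificity


# Counts as reported for Tsuda (2019); assumes no empty row or column.
tsuda = TwoByTwo(tp=35, fp=3, fn=0, tn=14)

print(f"Sensitivity {tsuda.sensitivity:.0%}, specificity {tsuda.specificity:.0%}")
print(f"PPV {tsuda.ppv:.0%}, NPV {tsuda.npv:.0%}, accuracy {tsuda.accuracy:.0%}")
```

Sensitivity and NPV both depend on the false-negative count, which is why they were treated as the critical measures when the aim is not to miss biliary atresia.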
References
- Ahmed ABM, Fagih MA, Bashir MS, Al-Hussaini AA. Role of percutaneous liver biopsy in infantile cholestasis: cohort from Arabs. BMC Gastroenterol. 2021 Mar 12;21:118.
- Dong C, Zhu HY, Chen YC, Luo XP, Huang ZH. Clinical Assessment of Differential Diagnostic Methods in Infants with Cholestasis due to Biliary Atresia or Non-Biliary Atresia. Curr Med Sci. 2018 Feb;38(1):137–43.
- Ferreira CR, Cassiman D, Blau N. Clinical and biochemical footprints of inherited metabolic diseases. II. Metabolic liver diseases. Mol Genet Metab. 2019 Jun;127(2):117-121. doi: 10.1016/j.ymgme.2019.04.002. Epub 2019 Apr 12. PMID: 31005404.
- Hadzic N, Harrison PM. Selective rather than routine approach to endoscopic retrograde cholangio-pancreatography in diagnosis of biliary atresia. J Hepatol. 2010 May;52(5):777. doi: 10.1016/j.jhep.2010.01.019. Epub 2010 Feb 18. PMID: 20347498.
- Harpavat S, Ramraj R, Finegold MJ, Brandt ML, Hertel PM, Fallon SC, Shepherd RW, Shneider BL. Newborn Direct or Conjugated Bilirubin Measurements As a Potential Screen for Biliary Atresia. J Pediatr Gastroenterol Nutr. 2016 Jun;62(6):799–803.
- Iinuma Y, Narisawa R, Iwafuchi M, Uchiyama M, Naito M, Yagi M, Kanada S, Otaki M, Yamazaki S, Honma T, Motoyama H, Baba Y. The role of endoscopic retrograde cholangiopancreatography in infants with cholestasis. J Pediatr Surg. 2000 Apr;35(4):545–9.
- Keil R, Snajdauf J, Rygl M, Pycha K, Kotalová R, Drábek J, Stovícek J, Procke M. Diagnostic efficacy of ERCP in cholestatic infants and neonates--a retrospective study on a large series. Endoscopy. 2010 Feb;42(2):121–6.
- Koot BGP, Kelly DA, Hadzic N, Gonzales E, Hierro L, Davenport M, Keil R, Fockens P, Baumann U. Endoscopic Retrograde Cholangiopancreatography in Infants: Availability Under Threat: A Survey on Availability, Need, and Clinical Practice in Europe and Israel. J Pediatr Gastroenterol Nutr. 2020 Aug;71(2):e54-e58. doi: 10.1097/MPG.0000000000002752. PMID: 32304552.
- Lee JJY, Wasserman WW, Hoffmann GF, van Karnebeek CDM, Blau N. Knowledge base and mini-expert platform for the diagnosis of inborn errors of metabolism. Genet Med. 2018 Jan;20(1):151-158. doi: 10.1038/gim.2017.108. Epub 2017 Jul 20. PMID: 28726811; PMCID: PMC5763153.
- Liu X, Peng X, Huang Y, Shu C, Liu P, Xie W, Dang S. Design and validation of a noninvasive diagnostic criteria for biliary atresia in infants based on the STROBE compliant. Medicine (Baltimore). 2019 Feb;98(6):e13837.
- Negm AA, Petersen C, Markowski A, Luettig B, Ringe KI, Lankisch TO, Manns MP, Ure B, Schneider AS. The Role of Endoscopic Retrograde Cholangiopancreatography in the Diagnosis of Biliary Atresia: 14 Years’ Experience. Eur J Pediatr Surg. 2018 Jun;28(3):261–7.
- Russo P, Magee JC, Anders RA, Bove KE, Chung C, Cummings OW, Finegold MJ, Finn LS, Kim GE, Lovell MA, Magid MS, Melin-Aldana H, Ranganathan S, Shehata BM, Wang LL, White FV, Chen Z, Spino C, Childhood Liver Disease Research Network (ChiLDReN). Key Histopathologic Features of Liver Biopsies That Distinguish Biliary Atresia From Other Causes of Infantile Cholestasis and Their Correlation With Outcome: A Multicenter Study. Am J Surg Pathol. 2016 Dec;40(12):1601–15.
- Saito T, Terui K, Mitsunaga T, Nakata M, Kuriyama Y, Higashimoto Y, Kouchi K, Onuma N, Takahashi H, Yoshida H. Role of pediatric endoscopic retrograde cholangiopancreatography in an era stressing less-invasive imaging modalities. J Pediatr Gastroenterol Nutr. 2014 Aug;59(2):204–9.
- Shanmugam NP, Harrison PM, Devlin J, Peddu P, Knisely AS, Davenport M, Hadzić N. Selective use of endoscopic retrograde cholangiopancreatography in the diagnosis of biliary atresia in infants younger than 100 days. J Pediatr Gastroenterol Nutr. 2009 Oct;49(4):435–41.
- Shneider BL, Moore J, Kerkar N, Magee JC, Ye W, Karpen SJ, Kamath BM, Molleston JP, Bezerra JA, Murray KF, Loomes KM, Whitington PF, Rosenthal P, Squires RH, Guthery SL, Arnon R, Schwarz KB, Turmelle YP, Sherker AH, Sokol RJ, Childhood Liver Disease Research Network. Initial assessment of the infant with neonatal cholestasis-Is this biliary atresia? PLoS One. 2017;12(5):e0176275.
- Shteyer E, Wengrower D, Benuri-Silbiger I, Gozal D, Wilschanski M, Goldin E. Endoscopic retrograde cholangiopancreatography in neonatal cholestasis. J Pediatr Gastroenterol Nutr. 2012 Aug;55(2):142–5.
- Sun C, Wu B, Pan J, Chen L, Zhi W, Tang R, Zhao D, Guo W, Wang J, Huang S. Hepatic Subcapsular Flow as a Significant Diagnostic Marker for Biliary Atresia: A Meta-Analysis. Dis Markers. 2020;2020:5262565.
- Tang KS, Huang LT, Huang YH, Lai CY, Wu CH, Wang SM, Hwang KP, Huang FC, Tiao MM. Gamma-glutamyl transferase in the diagnosis of biliary atresia. Acta Paediatr Taiwan. 2007;48(4):196–200.
- Tsuda N, Shiraishi S, Sakamoto F, Ogasawara K, Tomiguchi S, Yamashita Y. Tc-99m PMT scintigraphy in the diagnosis of pediatric biliary atresia. Jpn J Radiol. 2019 Dec;37(12):841–9.
- Wang L, Yang Y, Chen Y, Zhan J. Early differential diagnosis methods of biliary atresia: a meta-analysis. Pediatr Surg Int. 2018 Apr;34(4):363–80.
- Vademecum Metabolicum Zschocke & Hoffman http://www.vademetab.org/
- Zhou LY, Chen SL, Chen HD, Huang Y, Qiu YX, Zhong W, Xie XY. Percutaneous US-guided Cholecystocholangiography with Microbubbles for Assessment of Infants with US Findings Equivocal for Biliary Atresia and Gallbladder Longer than 1.5 cm: A Pilot Study. Radiology. 2018 Mar;286(3):1033–9.
Evidence tables
Study reference |
Study characteristics |
Patient characteristics 2 |
Intervention (I) |
Comparison / control (C) 3
|
Follow-up |
Outcome measures and effect size 4 |
Comments |
|||||||
Study reference |
Patient selection
|
Index test |
Reference standard |
Flow and timing |
Comments with respect to applicability |
|
||||||||
What is the diagnostic accuracy of laboratory results (general) in the evaluation of neonatal cholestasis (resulting in better outcome) in children with neonatal cholestasis. |
||||||||||||||
C. Dong et al (2018) |
Type of study Retrospective cohort
May 2007 to June 2011
Setting China
Funding and conflicts of interest: Funding not mentioned.
No potential conflicts of interest to disclose.
|
Inclusion criteria (1) existence of jaundice in infancy (including the neonatal period) without remission; (2) pale or light yellow feces; (3) hepatomegaly or hepatic texture change; (4) conjugated hyperbilirubinemia.
Exclusion criteria Not described
N total at baseline: 396
Important prognostic factors:
Gender Total Girls 152/396 (38%) Boys 244/396 (62%)
BA Girls 70/178 (39%) Boys 108/178 (61%)
Age (mean ± SD) BA group 58±30 days Non-BA 61±24 days
|
BA (n= 178)
Index test GGT > 300 IU/L
Reference test Cholangiography or histopathologic examination (main pathological changes are bile duct proliferation, bile plugs, portal or perilobular fibrosis and edema, with preservation of the basic hepatic lobular architecture)
|
Non-BA ( n=218) |
More than one year |
Serum GGT > 300 IU/l (n= 347)
Sensitivity 115/156 (74%, 95% CI 66-80%)
Specificity 128/191 (67%, 95% CI 60-73%)
PPV 115/178 (65%, 95% CI 57-71%)
NPV 128/169 (76%, 95% CI 69-82%)
|
|
|||||||
Harpavat et al (2016) |
Type of study Retrospective
January 2011 and December 2014
Setting Texas, USA
Funding and conflicts of interest: No conflict of interest
No funding. |
Inclusion Texas Children’s Hospital (TCH), diagnosed with BA, and born between January 1, 2011, and December 31, 2014
Eligible subjects in the non-BA group (n = 9102) were all infants born between June 1, 2009, and July 30, 2011, at Ben Taub General Hospital (BTGH).
Exclusion 269 subjects were excluded because they only had total bilirubin (TB) concentrations measured (n=193), had DB concentrations measured after newborns are typically discharged (60 hours of life (HoL)) (n=6), were transferred before DB concentrations were measured (n=33), died before DB concentrations were measured (n=34), or were discharged early before any bilirubin concentrations were measured (n=3). 1 BA patient!
N total at baseline: 9137
Important prognostic factors:
Gender BA Girls 26/35 Boys 9/35
Non-BA Girls 4490/9102 (49%) Boys 4612/9102 (51%)
Gestational age BA Preterm 6/61 Term 55/61
Non-BA Preterm 681/9102 Term
Other causes vs healthy controls in BA group not described.
|
BA group (n=61) 26/61 data not known.
Index test DB or CB (n=35)
Reference test BA diagnosis was made by intraoperative cholangiogram and pathological assessment of the bile duct remnant. For those identified too late for the Kasai operation, the BA diagnosis was inferred by the initial needle liver biopsy and subsequent analysis of liver and bile duct tissue removed at the time of liver transplantation/autopsy. |
Non-BA (n=9102)
|
? |
Direct/conjugated bilirubin (> U/L of reference test)
Sensitivity 35/35 100% (95% CI 87.7-100)
Specificity 8936/9102 98.2% (95% CI 97.9-98.4).
PPV 35/201 (17%, 95% CI 13-23%)
NPV 8936/8936 100% (95% CI 99.9-100%)
|
|
|||||||
Liu et al (2019) |
Type of study Retrospective cohort
January 2011 and June 2016
Setting
Xian Children’s Hospital, China
Funding and conflicts of interest: No financial assistance No conflict of interest |
Inclusion Infants with neonatal cholestasis
Exclusion Gilbert syndrome, born prematurely, sepsis, or those receiving total parenteral nutrition.
N total at baseline: 482
Important prognostic factors:
Sex Female Non-BA 104/316 (32.9%) BA 101/166 (60.8%)
Male Non-BA 212/316 (67.1%) BA 65/166 (39.2%)
Age, days Non-BA 56.5 (46.0–75.0) BA 59.5 (44.0–84.5) P .830
Gestational age, weeks Total 39.1±2.2 Non-BA 38.9±2.4 BA 39.5±1.3 P .003
|
BA N=166
|
Non-BA N= 316 |
|
Direct bilirubin, μmol/L
BA 121.0 (105.8–144.1) Non-BA 71.6 (41.6–104.2) p<.001
DBIL > 90.3 Sensitivity 151/166 (90.9%, 95% CI 86-94%)
Specificity 208/316 (65.9%, 95% CI 60.4-70.8%)
PPV 151/259 (58.4%, 52.2-64.1%)
NPV 208/223 (93.2%, 89-96%)
ALT, U/L Non-BA 82.0 (43.0–180.0)
BA 120.0 (85.0–210.0) p <.001
>67.5 Sensitivity 143/166 (86.1%, 95% CI 80.1-90.6%)
Specificity 134/316 (42.5%, 95% CI 37.1-47.9%)
PPV 143/324 (44.0%, 95% CI 57.3-69.9%)
NPV 134/157 (85.4%, 95% CI 79.0-90.0%)
GGT, U/L Mean (range) BA 442.0 (236.5–780.0) Non-BA 126.0 (85.0–227.5) P < 0.01
GGT IU/L > 184 Sensitivity 144/166 (86.5%, 95% CI 86.4-93.8%)
Specificity 216/316 (68.3%, 95% CI 63.0-73.2%)
PPV 143/244 (58.5%, 52.3-64.6%)
NPV 216/238 (90.7%, 86.4-93.8%)
|
ALP Non-BA 570.2±280.8 BA 519.2±197.2 P .038
ALP > 719.5 AUC 0.529 (0.476 - 0.582) Sensitivity 86.6% Specificity 22.9% PPV 36.9% NPV 76.6%
Total bilirubin
BA 180.0 (155.6–223.9)
Non-BA 112.8 (72.9–155.4) P<.001
(>147.1) Sensitivity 138/166 (83.0%) Specificity 229/316 (72.4%) PPV 138/225 (61.2%) NPV 229/257 (89.1%) AUC (95% CI) 0.798 (0.759 - 0.838)
AST, U/L BA 191.0 (139.0–281.0)
Non-BA 119.0 (64.0–231.5) p <.001
>121.5 Sensitivity 139/166 (83.6%) Specificity 162/316 (51.3%) PPV 139/293 (47.3%) NPV 162/189 (85.7%) AUC 0.679 (0.633 - 0.726)
|
|||||||
Shneider et al (2017) |
Type of study Multicenter prospective longitudinal study
Setting USA and Canada
April 2004 till February 2014
Funding and conflicts of interest: Funding provided by the following National Institute of Diabetes and Digestive and Kidney Disease The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
The authors have declared that no competing interests exist |
Inclusion: 1) age ≤180 days at presentation to a ChiLDReN center; and 2) serum direct or conjugated bilirubin >20% of total bilirubin (TB) and ≥2 mg/dl.
Exclusion 1)acute liver failure; 2) previous hepatobiliary surgery; 3) bacterial or fungal sepsis; 4) hypoxia, shock, or ischemic hepatopathy; 5) malignancy; 6) primary hemolytic disease; 7) drug or total parenteral nutrition-associated cholestasis; 8) extracorporeal membrane oxygenation (ECMO)-associated cholestasis; or 9) birth weight
N total at baseline: 875 Group 1: BA patients (n=401) Group 2: non-BA (n=259) Group 3: BA with missing data (n=102; 58 excluded for lack of laboratory data at presentation and 44 for lack of operative demonstration) Group 4: not classified (n=113)
Important prognostic factors:
Sex Group 1: M: 191 (47.6%) F: 210 (52.4%)
Group 2 M: 164 (63.3%) F: 95 (36.7%) P < 0.001
Age in days (mean) for onset disease BA: 12.8 (18.5) Non- BA: 18.7 (22.1) P < 0.01
Age at First Evaluation (Days) BA: 63.5 (30.9) Non-BA 60 (33.3) P 0.176
|
BA patients N=93
|
Neonatal hepatitis N = 65 |
|
GGT GGTP (u/L) BA N = 379 711.9 (537.5) Non-BA N = 238 299 (380.5) P < 0.001
GGT > 204 Sensitivity 40/401 (10%, 95% CI 7.4-13.3%)
Specificity 106/259 (40.9%, 95% CI 35.1-47%)
PPV 40/193 (20.7%, 95% CI 15.6-27.0%)
NPV 106/467 (22.7%, 95% CI 19.1-26.7%)
GGT + acholic stool
Sensitivity 343/401 (85.5%, 95% CI 81.8-88.6%)
Specificity 198/259 (76.4%, 95% CI 71.8-82.1%)
PPV 343/404 (84.4%, 95% CI 81.8-88.6%)
NPV 198/256 (77.3%, 95% CI 71.8-82.1%)
|
AST BA (397) 232.1 (206.4) Non-BA (254) 84.2 (347.7)
ALT BA (n=400) 154.7 (124.3) Non-BA (n=255) 190.7 (232.5) P 0.230
Conjugated bilirubin (mg/dl) BA (n=215) 4.3 (1.6) Non-BA (n=121) 4.6 (2.6) P 0.871
Total bile acid BA 160.9 (115.0–199.6) Non-BA 121.0 (73.0–165.0) p <.001
Total bile acid > 108.8 AUC 0.662 (0.606 - 0.719)
Sensitivity 140/165 (85.0%) Specificity 142/316 (44.8%) PPV 140/317 (34.8%) NPV 142/164 (89.6%)
PT (sec) BA 114.9±22.9 Non-BA 102.6±29.0 P <.001
PT > 400 AUC 0.562 (0.508 - 0.617) Sensitivity 62/165 (37.8%) Specificity 231/316 (73.2%) PPV 62/147 (42.8%) NPV 231/334 (69.0%)
Albumin BA 40.4±4.7 Non-BA 38.7±4.9 P <.001
Albumin >40.5 AUC 0.592 (0.539 - 0.646) Sensitivity 83/165 (50.3%) Specificity 201/316 (63.7%) PPV 83/198 (41.8%) NPV 201/283 (71.2%)
|
|||||||
Tang et al (2007) |
Type of study Retrospective cohort study
1986 to 2005
Setting Taiwan
Funding and conflicts of interest: Not described |
Inclusion Infantile cholestasis: Charts were reviewed with the diagnosis of BA and NH
confirmed with serum bilirubin level, complete liver function profile, liver histology, and intraoperative findings
Exclusion Etiologies other than BA and NH.
N total at baseline: 158
Important prognostic factors:
Sex Girls 67 (42.4%) Boys 91 (57.6%)
Age BA 58.3 ± 20.3 days (18-125 days) NH 57.4 ± 26.4 days (15-138 days) |
BA patients N=93
|
Neonatal hepatitis N = 65 |
|
GGT BA 353.3 ± 334.4 IU/L NH 114.8 ± 86 IU/L (P < 0.001)
GGT level > 300 IU/L
Sensitivity 27/68 (40%, 95% CI 29-52%)
Specificity 53/54 (98%, 95% CI 90-100%)
PPV 27/28 (96%, 95% CI 92-99%)
NPV 53/94 (56%, 95% CI 46-66%)
|
GGT/AST > 2 Sensitivity 55/68 (81%)
Specificity 39/54 (72%)
PPV 55/70 (79%)
NPV 39/52 (75%)
GGT/AST > 2
Sensitivity 54/65(83%)
Specificity 31/50 (62%)
PPV 54/73 (74%)
NPV 31/42 (74%)
|
|||||||
C. Dong et al (2018) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? Unclear, exclusion not described.
|
Were the index test results interpreted without knowledge of the results of the reference standard? No, retrospective design.
If a threshold was used, was it pre-specified? Yes (GGT > 300 IU/l)
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? No
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? No; liver biopsy or cholangiogram
Were all patients included in the analysis? Yes, but GGT analysis was not performed in all patients
|
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
|
||||||||
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: Medium
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: High |
CONCLUSION Could the patient flow have introduced bias?
RISK: High |
|
|
||||||||
Harpavat et al (2016) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? No
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes
If a threshold was used, was it pre-specified? No
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? Unclear
Did all patients receive a reference standard? No; only BA patients.
Did patients receive the same reference standard? Yes (if reference standard was used)
Were all patients included in the analysis? No; only if lab results available (35/61)
|
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
|
||||||||
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: High
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: High |
CONCLUSION Could the patient flow have introduced bias?
RISK: High |
|
|
|||||||||
Liu et al (2019) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? No
If a threshold was used, was it pre-specified? No
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? Unclear
Did all patients receive a reference standard? Unclear
Did patients receive the same reference standard? Yes (if reference standard was used)
Were all patients included in the analysis? Unclear for each individual item
|
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
|
||||||||
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: High
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: High |
CONCLUSION Could the patient flow have introduced bias?
RISK: High |
|
|
|||||||||
Shneider et al (2017) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes
If a threshold was used, was it pre-specified? No
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? Yes
|
Was there an appropriate interval between index test(s) and reference standard? Yes
Did all patients receive a reference standard? No
Did patients receive the same reference standard? Yes (if reference standard was used)
Were all patients included in the analysis? Only the true BA and non-BA patients
|
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
|
||||||||
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: Low |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: Low
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Low |
CONCLUSION Could the patient flow have introduced bias?
RISK: Medium |
|
|
||||||||
Tang et al (2007) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? No (diagnoses already known)
If a threshold was used, was it pre-specified? Unclear, most probably established retrospectively. Method unclear.
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? Unclear
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? Unclear
Were all patients included in the analysis? No; total patients 122/158 for GGT analysis.
|
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
|
||||||||
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: High
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: High |
CONCLUSION Could the patient flow have introduced bias?
RISK: High |
|
|
|||||||||
Table of quality assessment for systematic reviews of diagnostic studies module Aanvullende diagnostiek
Based on AMSTAR checklist (Shea et al.; 2007, BMC Med Res Methodol 7: 10; doi:10.1186/1471-2288-7-10) and PRISMA checklist (Moher et al 2009, PLoS Med 6: e1000097; doi:10.1371/journal.pmed.1000097)
Research question:
Study
First author, year |
Appropriate and clearly focused question?1
Yes/no/unclear |
Comprehensive and systematic literature search?2
Yes/no/unclear |
Description of included and excluded studies?3
Yes/no/unclear |
Description of relevant characteristics of included studies?4
Yes/no/unclear |
Assessment of scientific quality of included studies?5
Yes/no/unclear |
Enough similarities between studies to make combining them reasonable?6
Yes/no/unclear |
Potential risk of publication bias taken into account?7
Yes/no/unclear |
Potential conflicts of interest reported?8
Yes/no/unclear |
Wang 2018 |
Yes |
Yes: MEDLine not searched |
No: final excluded studies not referenced; reasons are described |
Yes |
Yes: QUADAS 2 present |
Clinical: all BA patients, age +
Significance: Heterogeneities + |
No |
No |
Sun |
NO; Medline not used |
YES (Table 2) |
|
|||||
Zhou |
Yes |
Yes |
No: final excluded studies not referenced; reasons are described |
Yes |
YES (Table 2) |
Yes |
Yes |
No |
Evidence table HIDA
Study reference |
Study characteristics |
Patient characteristics 2 |
Intervention (I) |
Comparison / control (C) 3
|
Follow-up |
Outcome measures and effect size 4 |
Comments |
|
What is the diagnostic accuracy of HIDA in the evaluation of neonatal cholestasis (resulting in better outcome) in children with neonatal cholestasis. |
||||||||
Tsuda 2019 |
Type of study Retrospective cohort
Period August 2001 and October 2018
Setting Japan
Funding and conflicts of interest: study was not funded by any institution.
authors declare that they have no conflict of interest.
|
Inclusion criteria Patients suspected for BA.
Exclusion criteria Not described
N total at baseline: 52
Prevalence 35/52 (67%)
Important prognostic factors:
Sex Boys 24/52 (46%) Girls 28/52 (54%)
Age (range) 17-145 days
|
BA (n= 35)
Index test 99mTc-PMT scintigraphy
Reference test Preoperative cholangiogram or follow-up |
Non-BA(n=17) |
|
Sensitivity 35/35 (100%, 95% CI 90-100%)
Specificity 14/17 (82%, 95% CI 59-94%)
PPV 35/38 (92%, 95% CI 79-97%)
NPV 14/14 (100%, 95% CI 78-100%)
Overall accuracy 49/52 (94%) |
|
Study reference |
Study characteristics |
Patient characteristics
|
Index test (test of interest) |
Reference test
|
Follow-up |
Outcome measures and effect size |
Comments |
What is the diagnostic accuracy of HIDA in the evaluation of neonatal cholestasis (resulting in better outcome) in children with neonatal cholestasis. |
|||||||
Wang 2018
|
SR [and meta-analysis]
Literature search up to July 2017
A: Spivak, 1985, USA, retrospective; B: Tolia, 1986, USA, design unknown; C: Cox, 1987, USA, prospective; D: Park, 1997, Korea, prospective; E: Lee, 2000, China, prospective; F: Tan, 2000, Singapore, prospective; G: Reyom, 2005, Korea, prospective; H: Deghani, 2006, Iran, prospective; J: Poddar, 2009, India, prospective; K: Rouzrokh, 2009, Iran, retrospective; L: Yang, 2009, China, retrospective; N: Jensen, 2012, USA, retrospective; O: Kwatra, 2013, USA, retrospective; P: Guan, 2015, China, retrospective; Q: Jancelewicz, 2015, Canada, retrospective; R: Brittain, 2016, Denmark, retrospective
Source of funding and conflicts of interest: Funding This study was funded by the National Science Foundation of China (Grant Number 81570471) and the Tianjin Health Bureau special grant (Grant Number 14KG129).
Conflict of interest the authors declare that they have no conflict of interest.
|
Inclusion criteria SR: (1) diagnostic test accuracy (DTA) studies evaluating sensitivity and specificity of HIDA (2) articles were published in full texts in English and (3) studies with sufficient information for analysis.
Exclusion criteria SR: (1) letters, reviews, case reports, conference abstracts, editorials, expert opinion reviews and abstracts, (2) data of sensitivity, specificity is incorrect or insufficient for analysis or evaluated by more than one researcher without a consensus, (3) screening studies with a large population without cholestasis and (4) studies with overlapping cases and data.
Important patient characteristics:
Number patients 1533
Prevalence 535/1533
Gender A Boys 18/27 (67%) Girls 9/27 (33%) B Boys 15/28 (54%) Girls 13/28 (46%) C Boys 19/33 (58%) Girls 14/33 (42%)
Boys 90/152 (59%) Girls 62/152 (41%)
Girls 11/23 (48%)
Girls 34/65 (52%)
Girls 34/65 (52%)
Boys 30/69(43%) Girls 39/69 (57%)
Original paper not available
Boys 75/128 (59%) Girls 53/128 (41%) Boys 121/197 (61%) Girls 76/197 (39%) Q Boys 126/212 (59%) Girls 86/212 (41%)
R Boys 27/47 (57%) Girls 20/47(43%)
N, mean age A: 59 B: Not described C: 6.7 ± 3.2 (range 1-12 weeks) D: 12-120 E: age range 14-97 d; mean age ± SD 55 ± 18 d F: 2-12 weeks G: 69 (24-139) H: 62 ± 17 I: 88.6 J: 2.8 ± 1.7 months K: 39 L: 62 ± 14 (31-121) M: 29 N: 63 O: 48 (median) P: 63.7 Q: median age at presentation 28 days (IQR 2-56) R: 48 (4-160) days
|
Describe index tests and cut-off point(s):
A. Fasting x 3 hrs; 1.0 mCi 99m Tc-DISIDA* IV; Gamma camera (Picker international); images obtained at various intervals up to 24 hrs or until bowel activity was visualized; time to earliest detection of excretion.
B. 5 microCi 99mTc-PIPIDA** IV; Gamma camera (Picker international); images obtained at 0, 5, 10, 15, 20, 30 and 60 min, and at 2, 4, 6 and 24 hours; repeated after 1 week of phenobarbital.
C. Fasting 4-6 hrs; 0.1 mCi 99mTc-disofenin (Hepatolite)*** IV; images obtained at several moments in the first hour, after 2-4 hours and after 24 hours; repeated after 1 week of phenobarbital.
D. Fasting > 4 hrs; 1.0 mCi 99mTc-DISIDA** after a 3- to 5-day course of phenobarbital pretreatment (5 mg/kg/d). Images were obtained at 3, 5, 10, 15, 30, 45, 60, 120, and 240 minutes. If necessary, delayed images were obtained up to 24 hrs. No excretion of tracer in 24 hours was indicative of BA, whereas excretion of tracer in 24 hours was indicative of either neonatal hepatitis or other causes of cholestasis.
E. Fasting > 4 hrs; 1.0 mCi 99mTc-disofenin*** after a 5- to 7-day course of phenobarbital pretreatment (5 mg/kg/d). Images were obtained at 5, 15, 30 and 45 min, and at 1, 2, 3, 4, 5, 6, 7 and 8 hrs, or until intestinal radioactivity was identified. If there was no bowel radioactivity after 8 hrs, an additional 1 mCi 99mTc-disofenin was given IV, with imaging 24 hrs after the first injection. Interpretation: biliary atresia if bowel radioactivity remained absent 24 h after injection and if efficient liver uptake was relatively preserved.
F. 1.0 mCi 99mTc-DISIDA* IV after a 5-day course of phenobarbital (5 mg/kg/d). Images were obtained at 5, 10, 15, 30, 45, 60 and 120 min. Delayed images at 6 h and 24 h were also obtained following an additional 1.0 mCi dose of 99mTc-DISIDA at 6-8 h in 11 nondraining scans. No excretion in 24 h was taken as indicative of BA.
G. 3-hrs fasting; 0.25 mCi 99mTc-DISIDA*; phenobarbital (5 mg/kg/d orally) x 2–5 days in advance; imaging obtained 1, 5, 15, and 30 min and 1, 2, 4, and 6 hours after the injection. Delayed scans were obtained 24 hours later if there was no intestinal radioactivity after 6 hours.
H. Not reported.
I. Original paper not available
J. 99mTc-mebrofenin*****. Serial static images for up to 6 hours and a delayed image at 24 h. When no intestinal tracer was seen after 24 h, the procedure was repeated after UDCA 20 mg/kg for 2-3 days.
K. Original paper not available
L. 6 h fasting; 5 mCi 99mTc-EHIDA**** IV; images were obtained at 5-, 10-, 15-, 20-, 30-, and 60-min intervals, and at 6 and 24 h.
M. Original paper not available
N. Not reported.
O. 1 mCi 99mTc-mebrofenin*****; pretreatment with phenobarbital (5 mg/kg/d x 5 d); images obtained at 2 h, 4 h, 6 h and 8 h until biliary excretion or up to a maximum of 24 h
P. Fasting x 4-6 h. 99mTc-EHIDA****; images obtained for 50 min (1 frame/min) and then at 6 and 24 h.
Q. Insufficient information
R. Fasting x 2 hrs; 99mTc-mebrofenin*****; pre-treatment with phenobarbital 5 mg/kg/d and UDCA 100 mg/kg/d; scintigraphy (1 min per frame) x 1 hr; when no intestinal radioactivity was seen, additional images at 3, 6 and 24 hours |
Describe reference test and cut-off point(s):
A. laparotomy or liver biopsy
B. percutaneous liver biopsy
C. liver biopsy
D. liver biopsy or laparotomy
E. laparotomy
F. Laparotomy and open liver biopsy
G. combination of surgery, various imaging modalities, and clinical follow-up
H. Cases suspicious for BA underwent laparotomy and intraoperative cholangiography.
I. Original paper not available
J. Preoperative cholangiogram and liver biopsy
K. Original paper not available
L. histopathologic examination of the specimens obtained when the Kasai procedure was performed.
M. Original paper not available
N. Not reported.
O. clinical course, intraoperative cholangiogram or laparotomy or liver biopsy.
P. laparoscopic cholangiography, surgical pathology, or clinical follow-up
Q. preoperative cholangiography or liver biopsy
R. Cholangiography and liver biopsy
Prevalence (%) [based on reference test at specified cut-off point] A: 7/28 (25%) B: 10/32 (31%) C: 9/33 (27%) D: 25/73 (34%) E: 49/152 (32%) F: 12/60 (20%) G: 4/23 (17%) H: 19/65 (29%) I: 31/61 (51%) J: 60/101 (59%) K: 18/42 (43%) L: 34/69 (49%) M: 29/84 (35%) N: 19/68 (28%) O: 43/186 (23%) P: 107/197 (54%) Q: 45/212 (21%) R: 14/47 (30%)
For how many participants were no complete outcome data available?
Not described |
Not described. |
Outcome measures and effect size (include 95%CI and p-value if available):
A: Sensitivity 7/7 (100%, 95% CI 65-100%) Specificity 9/21 (43%, 95% CI 25-63%) PPV 7/19 (37%, 95% CI 19-59%) NPV 9/9 (100%, 95% CI 70-100%)
B: Sensitivity 10/10 (100%, 95% CI 72-100%) Specificity 12/22 (55%, 95% CI 35-73%) PPV 10/20 (50%, 95% CI 30-70%) NPV 12/12 (100%, 95% CI 76-100%)
C: Sensitivity 9/9 (100%, 95% CI 71-100%) Specificity 16/24 (67%, 95% CI 47-82%) PPV 9/17 (53%, 95% CI 31-74%) NPV 16/16 (100%, 95% CI 81-100%)
D: Sensitivity 24/25 (96%, 95% CI 80-99%) Specificity 16/46 (35%, 95% CI 23-49%) PPV 24/54 (44%, 95% CI 32-58%) NPV 16/17 (94%, 95% CI 73-99%)
E: Sensitivity 49/49 (100%, 95% CI 93-100%) Specificity 89/103 (86%, 95% CI 78-92%) PPV 49/63 (78%, 95% CI 66-86%) NPV 89/89 (100%, 95% CI 96-100%)
F: Sensitivity 11/12 (92%, 95% CI 65-99%) Specificity 20/26 (77%, 95% CI 58-89%) PPV 11/17 (65%, 95% CI 41-83%) NPV 20/21 (95%, 95% CI 77-99%)
G: Sensitivity 4/4 (100%, 95% CI 51-100%) Specificity 11/17 (65%, 95% CI 41-83%) PPV 4/10 (40%, 95% CI 17-69%) NPV 11/11 (100%, 95% CI 74-100%)
H Sensitivity 16/19 (84%, 95% CI 62-94%) Specificity 22/46 (48%, 95% CI 34-62%) PPV 16/40 (40%, 95% CI 26-55%) NPV 22/25 (88%, 95% CI 70-96%)
I Sensitivity 29/29 (100%, 95% CI 88-100%) Specificity 23/25 (92%, 95% CI 75-98%) PPV 29/31 (94%, 95% CI 79-98%) NPV 23/23 (100%, 95% CI 86-100%)
J Sensitivity 13/13 (100%, 95 % CI 77-100%) Specificity 18/21 (86%, 95 % CI 65-95%) PPV 13/16 (81%, 95 % CI 57-93%) NPV 18/18 (100% 95 % CI 82-100%)
K Sensitivity 18/18 (100%, 95% CI 82-100%) Specificity 21/24 (88%, 95% CI 69-96%) PPV 18/21 (86%, 95% CI 65-95%) NPV 21/21 (100%, 95% CI 86-100%)
L Sensitivity 30/34 (88%, 95% CI 73-95%) Specificity 16/35 (45%, 95% CI 30-62%) PPV 30/49 (61%, 95% CI 47-74%) NPV 16/20 (80%, 95% CI 58-92%)
M Sensitivity 29/29 (100%, 95 % CI 88-100%) Specificity 41/55 (75%, 95 % CI 62-84%) PPV 29/43 (67%, 95 % CI 53-80%) NPV 41/41 (100%, 95 % CI 91-100%)
N Sensitivity 18/19 (95%, 95 % CI 75-99%) Specificity 28/49 (57%, 95 % CI 43-70%) PPV 18/39 (46%, 95 % CI 32-61%) NPV 28/29 (97%, 95 % CI 93-99%)
O Sensitivity 43/43 (100%, 95 % CI 92-100%) Specificity 133/143 (93%, 95 % CI 88- 96%) PPV 43/53 (81%, 95 % CI 69-89%) NPV 133/133 (100%, 95 % CI 97-100%)
P Sensitivity 97/107 (91%, 95% CI 84-95%) Specificity 71/90 (79%, 95% CI 69-86%) PPV 97/116 (84%, 95% CI 76-89%) NPV 71/81 (88%, 95% CI 79-93%)
Q Sensitivity 41/41 (100%, 95% CI 91-100%) Specificity 120/131 (91%, 95% CI 86-95%) PPV 41/82 (50%, 95% CI 39-61%) NPV 120/120 (100%, 95% CI 97-100%)
R Sensitivity 14/14 (100%, 95% CI 78-100%) Specificity 21/33 (64%, 95% CI 47-78%) PPV 14/26 (54%, 95% CI 35-71%) NPV 21/21 (100%, 95% CI 85-100%)
Overall
Sensitivity 96% (95% CI 94–97%), specificity 73% (95% CI 70–76%), LR+ 3.26 (95% CI 2.38–4.48), LR− 0.09 (95% CI 0.05–0.16), AUC 0.9300, Q 0.8651, PPV 64.5%, NPV 97.2%.
|
|
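The per-study figures in the table above all follow from the four cells of a 2x2 table (index test versus reference test). The sketch below shows that arithmetic for one study. The function names `wilson_interval` and `accuracy_measures` are hypothetical and are not taken from Wang 2018. The use of a Wilson score interval is an assumption: the review does not state its method here, although the reported intervals appear consistent with it.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Two-sided Wilson score interval for a proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


def accuracy_measures(tp, fp, fn, tn):
    """Point estimates for a 2x2 table; assumes no row or column total is zero."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # LR+ is unbounded when there are no false positives.
        "lr_pos": sens / (1 - spec) if spec < 1 else float("inf"),
        # LR- is unbounded when there are no true negatives.
        "lr_neg": (1 - sens) / spec if spec > 0 else float("inf"),
    }


# Counts for study E of the HIDA table: sensitivity 49/49, specificity 89/103.
study_e = accuracy_measures(tp=49, fp=14, fn=0, tn=89)
spec_low, spec_high = wilson_interval(89, 103)
```

For study E, `spec_low` and `spec_high` round to 78% and 92%, matching the specificity interval reported in the table. A negative likelihood ratio of 0 for a single study reflects the absence of false negatives in that sample, not a guarantee that the test cannot miss biliary atresia.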
Evidence table ultrasonography
Study reference |
Study characteristics |
Patient characteristics
|
Index test (test of interest) |
Reference test
|
Follow-up |
Outcome measures and effect size |
Comments |
Sun 2020
|
Type of study Meta-analysis
Databases PubMed, EMBASE, Chinese National Knowledge Infrastructure (CNKI), and Technology of Chongqing (VIP)
Literature search till January 1, 2019, and May 2015
Quality assessment QUADAS tool, 2 independent reviewers
Funding and conflict of interest Not reported
Setting and Country: China
|
Inclusion criteria: (1) evaluation of the diagnostic potential of hepatic subcapsular flow (HSF) for BA; (2) case-control design with a control group of patients with non-BA disease; (3) sufficient data to calculate the diagnostic parameters.
Exclusion criteria (1) duplicate publications; (2) letters, editorials, and case reports or reviews; and (3) studies lacking complete data.
Setting A: Zhang et al 2013, China. Retrospective design B: Ju et al 2015, China. Retrospective design C: Lee et al 2009, Korea. Retrospective design D: El-Guindi et al 2013, Egypt. Prospective design E: El-Guindi et al 2014, Egypt. Prospective design F: Lee et al 2009, Korea. Retrospective design G: Li 2017, China. Prospective design H: Kim et al 2017, Korea. Retrospective design I: Duan et al 2013, China. Retrospective design
Important patient characteristics:
N, mean age: Not described in Sun 2020
Sex: Not described in Sun 2020
|
Describe index and comparator tests* and cut-off point(s):
A HSF present B: HSF present C: HSF present D: HSF present E: HSF present F: HSF present G: HSF present H: HSF present I: HSF present
No cut off point: absent or present
|
Describe reference test and cut-off point(s):
A: ? B: ? C: BA surgical cholangiography, non- BA surgical cholangiography, follow up D: BA laparotomy, non-BA E: BA laparotomy, non-BA biopsy, IOC, follow-up F: BA surgical cholangiography, non- BA surgical cholangiography, follow up G: ? H:? I:?
Prevalence BA A: 20/50 (40%) B: 32/62 (52%) C: 29/64 (45%) D: 27/54 (50%) E: 30/60 (50%) F: 29/48 (60%) G: 30/60 (50%) H: 101/161 (63%) I: 65/213 (31%)
For how many participants were no complete outcome data available?
Not described for any included study
Reasons for incomplete outcome data described?
Not described for any included study
|
Endpoint of follow-up:
Not described for any included study
|
Outcome measures and effect size (include 95%CI and p-value if available):
A: Sensitivity 19/20 (95%, 95% CI 76-99%) Specificity 28/30 (93%, 95% CI 79-98%) PPV 19/21 (90%, 95% CI 71-97%) NPV 28/29 (97%, 95% CI 83-99%)
B: Sensitivity 30/32 (94%, 95% CI 80-98%) Specificity 27/30 (90%, 95% CI 74-97%) PPV 30/33 (91%, 95% CI 76-97%) NPV 27/29 (93%, 95% CI 78-98%)
C: Sensitivity 29/29 (100%, 95% CI 88-100%) Specificity 30/35 (86%, 95% CI 71-94%) PPV 29/34 (85%, 95% CI 70-94%) NPV 30/30 (100%, 95% CI 89-100%)
D: Sensitivity 26/27 (96%, 95% CI 82-99%) Specificity 26/27 (96%, 95% CI 82-99%) PPV 26/27 (96%, 95% CI 82-99%) NPV 26/27 (96%, 95% CI 82-99%)
E: Sensitivity 29/30 (97%, 95% CI 83-99%) Specificity 29/30 (97%, 95% CI 83-99%) PPV 29/30 (97%, 95% CI 83-99%) NPV 29/30 (97%, 95% CI 83-99%)
F: Sensitivity 29/29 (100%, 95% CI 88-100%) Specificity 19/19 (100%, 95% CI 83-100%) PPV 29/29 (100%, 95% CI 88-100%) NPV 19/19 (100%, 95% CI 83-100%)
G: Sensitivity 28/30 (93%, 95% CI 79-98%) Specificity 27/30 (90%, 95% CI 74-97%) PPV 28/31 (90%, 95% CI 75-97%) NPV 27/29 (93%, 95% CI 78-98%)
H: Sensitivity 89/101 (88%, 95% CI 80-93%) Specificity 42/60 (71%, 95% CI 57-81%) PPV 89/107 (83%, 95% CI 75-89%) NPV 42/54 (77%, 95% CI 65-87%)
I: Sensitivity 47/65 (73%, 95% CI 60-82%) Specificity 136/148 (92%, 95% CI 86-95%) PPV 47/59 (80%, 95% CI 68-88%) NPV 136/154 (88%, 95% CI 82-92%)
Pooled characteristics (bivariate analysis): Sensitivity 95% (95% CI 88-98%), I2 = 80.16 [67.84 - 92.47]; Specificity 92% (95% CI 85-96%), I2 = 80.51 [68.48 - 92.55]; positive likelihood ratio 11.6 (95% CI 6.3-21.5); negative likelihood ratio 0.06 (95% CI 0.02-0.14); AUC 0.98 (95% CI 0.96-0.99)
Pooled data Only study with reference test Sensitivity 89% (95% CI 83-94%) Specificity 93% (95% CI 90-95%) 0.30
|
Study quality (ROB): method used and results per individual study.
Place of the index test in the clinical pathway: replacement, triage, add-on
Choice of cut-off point: influences test characteristics (sens, spec); important in relation to the clinical question (e.g. if a disease is to be ruled out, sensitivity is the critical outcome measure and more important than specificity: high sensitivity comes at the expense of low specificity and high rates of false positives, and usually those testing positive are subjected to further diagnostic tests for final diagnosis)
Facultative:
Brief description of author’s conclusion
Personal remarks on study quality, conclusions, and other issues (potentially) relevant to the research question
Sensitivity analyses (excluding small studies; excluding low quality studies; excluding case-control type of studies; relevant subgroup-analyses); mention only analyses which are of potential importance to the research question.
Heterogeneity: clinical and statistical heterogeneity; clinical: enough similarities in patient characteristics, diagnostic tests (strategy) to allow pooling? For pooled data: assessment of statistical heterogeneity and, more importantly, assessment of the reasons for heterogeneity (if present)? Note: sensitivity and specificity depend on the situation in which the test is being used and the thresholds that have been set, and sensitivity and specificity are correlated; therefore, the use of heterogeneity statistics (p-values; I2) is problematic, and rather than testing whether heterogeneity is present, the reasons for heterogeneity should be examined.
|
Wang 2018
|
SR [and meta-analysis]
Literature search up to July 2017
A: Cox et al 1987, USA, Prospective design B: Park, 1997, Korea, Prospective design C: Tan, 2000, Singapore, Prospective design D: Farrant, 2001, UK, Prospective design E: Sun, 2001, China, Prospective design F: Azuma, 2003, Japan, design? G: Lee, 2003, Korea, Prospective design H: Visrutaratna, 2003, Thailand, Prospective design I: Reyeom, 2005, Korea, Prospective design J: Deghani, 2006, Iran, Prospective design K: Humphrey, 2007, UK, Prospective design L: Kim, 2007, Korea, Prospective design M: Takamizawa, 2007, Japan, Prospective design N: Lee, 2009, Korea, Prospective design O: Poddar, 2009, India, Prospective design P: Rouzrokh, 2009, Iran, Retrospective design Q: Yang, 2009, China, Retrospective design R: Aziz, 2011, USA, Prospective design S: El-Guindi et al (2013), Egypt T: Jiang, 2013, China, Prospective design U: El-Guindi, 2014, Egypt, Prospective design V: Jancelewicz, 2015, Canada, Retrospective design W: Lee, 2015, Korea, Retrospective design
Setting and Country: China
|
Inclusion criteria SR: (1) diagnostic test accuracy (DTA) studies evaluating sensitivity and specificity of at least one of B-US (2) articles were published in full texts in English and (3) studies with sufficient information for analysis.
Exclusion criteria SR: (1) letters, reviews, case reports, conference abstracts, editorials, expert opinion reviews and abstracts, (2) data of sensitivity, specificity is incorrect or insufficient for analysis or evaluated by more than one researcher without a consensus, (3) screening studies with a large population without cholestasis and (4) studies with overlapping cases and data.
Important patient characteristics: US n= 1135 patients
N, mean age A: B: 12-120, C: 2-12 wk D: < 12 wk E: 16-360 d F 62 G: 5-210 d H: 19-139 d I: 69 (24-169) J: 62 ± 17 K: 7-143 d L: 13-138 d M: 4-144d N: BA mean 51+/- 24 d Non-BA 48+/- 32d O 2.8±1.7 m P 39 days Q 62±14 (31-121) R: < 5 m
S 68.52 T 2.9 mon U BA: 63.0 ± 12.7 d; NBA: 71.5 ± 20.0 d V ? W 55
Sex A B C D E F G Male: 53/86 (61.6%) Female 33/86 (38.4%) H I J K Male 48/90 (53%) Female 42/90 (47%) L Male 38/68 (56%) Female 30/68 (44%) M N Male 43/64 (67%) Female 21/64 (33%) O P Q R S T U Male 31/60 (52%) Female 29/60 (48%) V W
|
Describe index and comparator tests* and cut-off point(s):
A B: C: Triangular sign: bifurcation D: Gallbladder: Absent gallbladder, abnormal wall or shape E: Triangular sign Bifurcation Gallbladder: Absent gallbladder, abnormal wall, or shape F: G: Triangular sign Greater than or equal to 4 mm, EARPV H: Triangular sign Greater than or equal to 3 mm, bifurcation
I: J K Triangular sign bifurcation L Triangular sign Greater than or equal to 4 mm, EARPV M Triangular sign G Greater than or equal to 3 mm, porta hepatis N Triangular sign Greater than or equal to 4 mm, EARPV O P Q R R1 Triangular sign Greater than or equal to 4 mm, EARPV S Triangular sign: Greater than or equal to 4 mm, EARPV T U Triangular sign EARPV V W
|
Describe reference test and cut-off point(s):
A B: C: Ba surgery, non-BA follow up
D: ? E: IOC to all F: G: BA surgery, non-BA biopsy, cholescintigraphy, follow-up H: BA surgery, cholangiography, liver biopsy non-BA IOC, cholescintigraphy, liver biopsy I: J K BA surgery, non-BA follow up, biopsy L BA surgical cholangiography, non- BA surgical cholangiography, follow up M BA surgery, non-BA biopsy, IOC N BA surgery, non-BA follow-up, biopsy O P Q R BA biopsy, IOC, surgery non-BA surgery, IOC S BA IOC, Non-BA biopsy, IOC, follow-up T U BA laparotomy, non-BA biopsy, IOC, follow-up V W
Prevalence (%) [based on reference test at specified cut-off point] A: 9/33 (27%) B: 25/73 (34%) C: 12/60 (20%) D: 38/158 (24%) E: 151/182 (83%) F: 23/30 (77%) G: 20/86 (23%) H: 23/46 (50%) I: 4/23 (17%) J: 19/65 (29%) K: 30/90 (33%) L: 38/68 (56%) M: 48/85 (53%) N: 29/64 (45%) O: 60/101 (64%) P: 18/42 (43%) Q: 34/69 (49%) R: 15/35 (43%) S: 27/54 (50%) T: 23/51 (45%) U: 30/60 (50%) V: 45/212 (21%) W: 46/100 (46%)
For how many participants were no complete outcome data available?
Not described |
Endpoint of follow-up: A B: C: D: E: F: G: H: I: .....
..... |
Outcome measures and effect size (include 95%CI and p-value if available):
A: Sensitivity 6/9 (67%, 95% CI 35-88%) Specificity 20/24 (83%, 95% CI 64-93%) PPV 6/10 (60%, 95% CI 31-83%) NPV 20/23 (87%, 95% CI 68-95%)
B: Sensitivity 21/25 (84%, 65-94%) Specificity 48/48 (100%, 95% CI 93-100%) PPV 21/21 (100%, 95% CI 85-100%) NPV 48/52 (92%, 95% CI 82-97%)
C: Sensitivity 10/12 (83%, 55-95%) Specificity 48/48 (100%, 93-100%) PPV 10/10 (100%, 72-100%) NPV 48/50 (96%, 87-99%)
D: Sensitivity 33/36 (92%, 78-97) Specificity 118/122 (97%, 92-99%) PPV 33/37(89%, 75-96%) NPV 118/121 (98%, 93-99%)
E: Sensitivity 54/151 (36%, 29-44%) Specificity 26/31 (84%, 67-93%) PPV 54/60 (90%, 80-95%) NPV 26/122 (21%, 15-29%)
F: Sensitivity 19/23 (82%, 95% CI 63-93%) Specificity 5/7 (71%, 95% CI 36-92%) PPV 19/21 (90%, 95% CI 71-97%) NPV 5/9 (55%, 95% CI 27-81%)
G: Sensitivity 16/20 (80%, 58-92%) Specificity 65/66 (98%, 92-100%) PPV 16/17 (94%, 73-99%) NPV 65/69 (94%, 86-98%)
H: Sensitivity 22/23 (96%, 79-99%) Specificity 17/23 (74%, 54-87%) PPV 22/ 28 (78%, 60-90%) NPV 17/18 (94%, 74-99%)
I: Sensitivity 3/4 (75%, 95% CI 30-95%) Specificity 17/19 (89%, 95% CI 69-97%) PPV 3/5 (60%, 95% CI 23-88%) NPV 17/18 (94%, 95% CI 74-99%)
J Sensitivity 10/19 (53%, 95% CI 32-73%) Specificity 35/46 (76%, 95% CI 62-86%) PPV 10/21 (48%, 95% CI 28-68%) NPV 35/44 (80%, 95% CI 66-89%)
K Sensitivity 22/30 (72%, 56-86%) Specificity 60/60 (100%, 94-100%) PPV 22/22 (100%, 85-100%) NPV 60/68 (88%, 78-94%)
L Sensitivity 22/38(58%, 42-72%) Specificity 28/30 (93%, 79-98%) PPV 22/24 (92%, 74-98%) NPV 28/44 (63%, 49-76%)
M Sensitivity 40/48 (83%, 70-91%) Specificity 36/37 (97%, 86-100%) PPV 40/42 (95%, 84-99%) NPV 36/43 (84%, 70-92%)
N Sensitivity 18/29 (62%, 44-77%) Specificity 35/35 (100%, 90-100%) PPV 18/18 (100%, 82-100%) NPV 35/46 (76%, 62-86%)
O Sensitivity 46/65 (71%, 59-80%) Specificity 30/36 (83%, 95% CI 68-92%) PPV 46/52 (88%, 95% CI 77-95%) NPV 30/49 (61%, 95% CI 47-74%)
P Sensitivity 13/18 (72%, 95% CI 49-88%) Specificity 22/24 (91%, 95% CI 74-98%) PPV 13/15 (86%, 95% CI 62-96%) NPV 22/27 (81%, 95% CI 63-92%)
Q Sensitivity 17/34 (50%, 95% CI 34-66%) Specificity 29/35 (83%, 95% CI 67-92%) PPV 17/23 (74%, 95% CI 54-87%) NPV 29/46 (63%, 95% CI 49-75%)
R Sensitivity 9/15 (60%, 35-80%) Specificity 19/20 (95%, 76-99%) PPV 9/10 (90%, 60-98%) NPV 19/25(76%, 57-89%)
S Sensitivity 16/27 (59%, 41-75%) Specificity 24/27 (89%, 71-96%) PPV 16/19 (84%, 62-94%) NPV 24/35 (69%, 52-81%)
T Sensitivity 21/23 (91%, 95% CI 73-98%) Specificity 26/28 (93%, 95% CI 77-98%) PPV 21/23 (91%, 95% CI 73-98%) NPV 26/28 (93%, 95% CI 77-98%)
U Sensitivity 19/30 (63%, 95% CI 46-78%) Specificity 26/30 (87%, 95% CI 70-95%) PPV 19/23 (83%, 95% CI 83-93%) NPV 26/37 (70%, 95% CI 54-83%)
V Sensitivity 14/45 (31%, 95% CI 20-46%) Specificity 166/167 (99%, 95% CI 97-100%) PPV 14/15 (93%, 95% CI 70-99%) NPV 166/197 (84%, 95% CI 79-89%)
W Sensitivity 46/46 (100%, 95% CI 92-100%) Specificity 51/54 (94%, 95% CI 85-98%) PPV 46/49 (94%, 95% CI 83-98%) NPV 51/51 (100%, 95% CI 93-100%)
Pooled: Sensitivity 77% (95% CI 74-80%), I2 88.6%; Specificity 93% (95% CI 91-94%), I2 76.9%
|
Studies published between 1998 and 2014. The definition of a positive triangular sign differs per study; inclusion criteria differ. |
Zhou et al (2016) |
Type of study Systematic review, meta-analysis
Databases PubMed, Web of Science databases
Literature search between January 1990 and May 2015
Quality assessment QUADAS tool, 2 independent reviewers
Funding and conflict of interest Not reported
|
Inclusion criteria defining a positive ultrasound imaging result; surgery, biopsy, or both used as a reference standard for biliary atresia; surgery, biopsy, clinical follow-up, or some combination thereof used as a reference standard for the exclusion of biliary atresia; sufficient data to extract the number of true-positive, true-negative, false-positive, and false-negative results; and sufficient data reported for at least 10 cases.
Exclusion criteria Editorials, letters to the editor, review articles, case reports, and animal experiment studies
Setting A Ikeda et al. (1998), Japan B Lee et al. (2000), Taiwan C Tan-Kendrick et al. (2000), Singapore D Kotb et al. (2001), Egypt E Farrant et al. (2001), UK F Park et al. (2001), Korea G Kanegawa et al. (2003), Japan H Lee et al. (2003), Korea I Tan-Kendrick et al. (2003), Singapore J Visrutaratna et al. (2003), Thailand K Humphrey et al. (2007), UK L Kim et al. (2007), Korea M Takamizawa et al. (2007), Japan N Lee et al. (2009), Korea O Donia et al. (2010), Egypt P Imanieh et al. (2010), Iran Q Mittal et al. (2011), India R Aziz et al. (2011), USA S Sun et al. (2011), China T El-Guindi et al. (2013), Egypt U El-Guindi et al. (2014), Egypt V Hanquinet et al. (2015), Switzerland W Zhou et al. (2015), China
Important patient characteristics: US n= 1135 patients
Number of patients A 72 B 152 C 60 D 65 E 158 F 87 G 55 H 86 I 217 J 46 K 90 L 68 M 85 N 64 O 50 P 58 Q 99 R 35 S 182 T 54 U 60 V 20 W 273
N, mean age A ? B 14-97 days C 2-12 wk D 32-161 d E < 12 wk F 16-150 d G 8-144 d H 5-210 d I 2-12 wk J 19-139 d K 7-143 d L 13-138 d M 4-144d N BA mean 51+/- 24 d non-BA 48+/- 32d O < 12 months P 30-120 d Q 13-89 d R < 5 m S 16-360 d T BA: 61.8 ± 15.1 d; NBA: 75.6 ± 25.1 d U BA: 63.0 ± 12.7 d; NBA: 71.5 ± 20.0 d V BA: 55.5 ± 33.9 d; NBA: 48.8 ± 25.9 d W BA: 68.4 ± 20.4 d; NBA: 64.5 ± 25.0 d
Sex: A ? B Male 90/152 (59.2%) Female 62/152 (40.8%) C ? D ? E ? F ? G ? H Male 53/86 (61.6%) Female 33/86 (38.4%) I ? J ? K Male 48/90 (53%) Female 42/90 (47%) L Male 38/68 (56%) Female 30/68 (44%) M ? N Male 43/64 (67%) Female 21/64 (33%) O Male 23/50 (46%) Female 27/50 (54%) P Male 33/58 (57%) Female 25/58 (43%) Q Male 68/99 (69%) Female 31/99 (31%) R ? S ? T Male 38/54 (72%) Female 38/54 U Male 31/60 (52%) Female 29/60 (48%) V Male 10/20 (50%) Female 10/20 (50%) W Male 183/273 (67%) Female 90/273 (33%)
|
Index test 1: positive triangular cord (TC) sign, thickness and location; 2: criteria for an abnormal gallbladder
A1 ?, ? A2 Absent or not contractile gallbladder
B1 ?, ? B2 Absent gallbladder, length < 1.5 cm
C1 ?, bifurcation C2 Absent gallbladder, length < 1.5 cm
D1 ?, bifurcation D2 Absent gallbladder, length < 1.5 cm
E1 ?, ? E2 Absent gallbladder, abnormal wall or shape
F1 ≥ 2.5 mm, bifurcation F2 ?
G1 ≥ 3 mm, bifurcation G2 Absent gallbladder, length < 1.5 cm, detected without lumen, not contractile
H1 ≥ 4 mm, EARPV H2 ?
I1 ?, bifurcation I2 Ghost triad
J1 ≥ 3 mm, bifurcation J2 Absent, length < 1.5 cm, detected without lumen
K1 ?, bifurcation K2 Absent, length < 1.9 cm, abnormal wall, abnormal shape
L1 ≥ 4 mm, EARPV L2 ?
M1 ≥ 3 mm, porta hepatis M2 Absent gallbladder, length < 1.5 cm, not contractile
N1 ≥ 4 mm, EARPV N2 Length < 1.5 cm
O1 ?, bifurcation O2 Absent gallbladder
P1 ?, porta hepatis P2 ?
Q1 ≥ 4 mm, vicinity of the portal vein Q2 Absent gallbladder, length < 1.9 mm, abnormal wall or shape, not contractile
R1 ≥ 4 mm, EARPV R2 Absent gallbladder, ≤ 1.5 cm
S1 ?, bifurcation S2 Absent gallbladder, abnormal wall or shape
T1 ≥ 4 mm, EARPV T2 Length < 2.05 cm, absent gallbladder, not contractile, rudimentary
U1 ?, EARPV U2 Length < 2.05 cm, absent gallbladder, not contractile, rudimentary
V1 ?, bifurcation V2 Absent gallbladder
W1 ≥ 2 mm, distal to the right portal vein W2 Absent, length < 1.5 cm, abnormal wall and shape, classification
|
Reference test A BA surgery, non-BA ? B IOC, liver biopsy, follow-up C BA surgery, non-BA follow-up D BA surgery, non-BA biopsy, laparotomy, IOC E ? F BA surgery, biopsy, non-BA follow-up or cholangiography of biopsy G BA surgery, non-BA follow-up H BA surgery, non-BA biopsy, cholescintigraphy, follow-up I BA surgery, non-BA laparotomy, follow-up J BA surgery, cholangiography, liver biopsy, non-BA IOC, cholescintigraphy, liver biopsy K BA surgery, non-BA follow-up, biopsy L BA surgical cholangiography, non-BA surgical cholangiography, follow-up M BA surgery, non-BA biopsy, IOC N BA surgery, non-BA follow-up, biopsy O Liver biopsy for all P BA biopsy, non-biopsy Q BA surgery, IOC, non-BA biopsy, follow-up R BA biopsy, IOC, surgery, non-BA surgery, IOC S IOC to all T BA IOC, non-BA biopsy, IOC, follow-up U BA laparotomy, non-BA biopsy, IOC, follow-up V BA IOC, biopsy, non-BA IOC, biopsy W BA surgery, IOC, biopsy, non-BA IOC, follow-up
Prevalence of BA A 34/72 (47%) B 49/152 (32%) C 12/ 60 (20%) D 25/65 (38%) E 37/158 (23%) F 30/87 (35%) G 29/55(53%) H 20/86 (23%) I 31/217 (14%) J 23/46(50%) K 30/90 (33%) L 38/68 (56%) M 48/85 (56%) N 29/64 (45%) O 27/50 (54%) P 10/58 (17%) Q 30/99 (30%) R 15/35 (43%) S 151/182 (83%) T 27/54 (50%) U 30/60 (50%) V 10/20 (50%) W 129/273 (47%)
Incomplete data ?? |
|
A1 Not reported A2 Sensitivity 30/34 (88%, 73-95%) Specificity 38/38 (100%, 91-100%) PPV 30/30 (100%, 89-100%) NPV 38/42 (91%, 78-96%)
B1 Not reported B2 Sensitivity 40/49 (82%, 68-91%) Specificity 65/103 (63%, 53-72%) PPV 40/78 (51%, 40-62%) NPV 65/74 (87%, 78-93%)
C1 Sensitivity 10/12 (83%, 55-95%) Specificity 48/48 (100%, 93-100%) PPV 10/10 (100%, 72-100%) NPV 48/50 (96%, 87-99%) C2 Sensitivity 12/12 (100%, 76-100%) Specificity 48/48 (100%, 93-100%) PPV 12/12 (100%, 76-100%) NPV 48/48 (100%, 93-100%)
D1 Sensitivity 25/25 (100%, 87-100%) Specificity 40/40 (100%, 91-100%) PPV 25/25 (100%, 87-100%) NPV 40/40 (100%, 91-100%) D2 Sensitivity 23/25 (92%, 75-98%) Specificity 30/40 (75%, 60-86%) PPV 23/33 (70%, 53-83%) NPV 30/32 (95%, 80-98%)
E1 Not reported E2 Sensitivity 33/36 (92%, 78-97%) Specificity 118/122 (97%, 92-99%) PPV 33/37 (89%, 75-96%) NPV 118/121 (98%, 93-99%)
F1 Sensitivity 25/ 30 (83%, 66-93%) Specificity 55/57 (96%, 88-99%) PPV 25/26 (96%, 81-99%) NPV 55/61 (90%, 80-95%) F2 Not reported
G1 Sensitivity 27/29 (93%, 78-98%) Specificity 25/26 (96%, 81-99%) PPV 27/ 28 (96%, 82-99%) NPV 25/27 (93%, 77-98%) G2 Sensitivity 25/29 (86%, 69-95%) Specificity 19/26 (73%, 54-86%) PPV 25/32 (78%, 61-89%) NPV 19/23 (83%, 63-93%)
H1 Sensitivity 16/20 (80%, 58-92%) Specificity 65/66 (98%, 92-100%) PPV 16/17 (94%, 73-99%) NPV 65/69 (94%, 86-98%) H2 Not reported
I1 Not reported I2 Sensitivity 29/30 (97%, 83-99%) Specificity 187/187 (100%, 98-100%) PPV 29/29 (100%, 88-100%) NPV 187/188 (99%, 97-100%)
J1 Sensitivity 22/23 (96%, 79-99%) Specificity 17/23 (74%, 54-87%) PPV 22/28 (78%, 60-90%) NPV 17/18 (94%, 74-99%) J2 Sensitivity 22/23 (96%, 79-99%) Specificity 16/23 (70%, 49-84%) PPV 22/29 (76%, 58-88%) NPV 16/17 (94%, 73-99%)
K1 Sensitivity 22/30 (72%, 56-86%) Specificity 60/60 (100%, 94-100%) PPV 22/22 (100%, 85-100%) NPV 60/68 (88%, 78-94%) K2 Sensitivity 21/30 (70%, 52-83%) Specificity 60/60 (100%, 94-100%) PPV 21/21 (100%, 85-100%) NPV 60/69 (87%, 77-93%)
L1 Sensitivity 22/38 (58%, 42-72%) Specificity 28/30 (93%, 79-98%) PPV 22/24 (92%, 74-98%) NPV 28/44 (63%, 49-76%) L2 Not reported
M1 Sensitivity 40/48 (83%, 70-91%) Specificity 36/37 (97%, 86-100%) PPV 40/42 (95%, 84-99%) NPV 36/43 (84%, 70-92%) M2 Sensitivity 42/48 (87%, 75-94%) Specificity 27/37 (73%, 57-85%) PPV 42/52 (81%, 68-89%) NPV 27/33 (81%, 66-91%)
N1 Sensitivity 18/29 (62%, 44-77%) Specificity 35/35 (100%, 90-100%) PPV 18/18 (100%, 82-100%) NPV 35/46 (76%, 62-86%) N2 Sensitivity 19/29 (66%, 47-80%) Specificity 26/35 (74%, 58-86%) PPV 19/28 (68%, 49-92%) NPV 26/36 (72%, 56-84%)
O1 Sensitivity 5/27 (19%, 8-37%) Specificity 23/23 (100%, 86-100%) PPV 5/5 (100%, 57-100%) NPV23/45 (51%, 37-65%) O2 Sensitivity 17/27 (64%, 44-78%) Specificity 19/23 (83%, 63-93%) PPV17/21 (81%, 60-92%) NPV 19/29 (66%, 47-80%)
P1 Sensitivity 7/10 (70%, 40-89%) Specificity 46/48 (96%, 86-99%) PPV 7/9 (78%, 45-94%) NPV 46/49 (94%, 83-98%) P2 Not reported
Q1 Sensitivity 7/30 (23%, 12-41%) Specificity 67/69 (97%, 90-99%) PPV 7/9 (78%, 45-94%) NPV 67/90 (74%, 65-82%) Q2 Sensitivity 25/30 (83%, 66-93%) Specificity 58/69 (84%, 74-91%) PPV 25/36 (69%, 53-82%) NPV 58/63 (92%, 83-97%)
R1 Sensitivity 9/15 (60%, 35-80%) Specificity 19/20 (95%, 76-99%) PPV 9/10 (90%, 60-98%) NPV 19/25 (76%, 57-89%) R2 Sensitivity 15/15 (100%, 80-100%) Specificity 20/20 (100%, 84-100%) PPV 15/15 (100%, 80-100%) NPV 20/20 (100%, 84-100%)
S1 Sensitivity 41/151 (27%, 21-35%) Specificity 30/31 (97%, 84-99%) PPV 41/42 (97%, 88-100%) NPV 30/140 (21%, 15-29%) S2 Sensitivity 54/151 (36%, 29-44%) Specificity 26/31 (84%, 67-93%) PPV 54/60 (90%, 80-95%) NPV 26/122 (21%, 15-29%)
T1 Sensitivity 16/27 (59%, 41-75%) Specificity 24/27 (89%, 71-96%) PPV 16/19 (84%, 62-94%) NPV 24/35 (69%, 52-81%) T2 Sensitivity 25/27 (93%, 77-98%) Specificity 13/27 (48%, 31-66%) PPV 25/39 (64%, 48-77%) NPV 13/15 (87%, 62-96%)
U1 Sensitivity 19/30 (63%, 95% CI 46-78%) Specificity 26/30 (87%, 95% CI 70-95%) PPV 19/23 (83% (70, 95% CI 83-93%) NPV 26/37 (70%, 95% CI 54-83%) Sensitivity 16/25 (63%, 45-80%) Specificity 22/25 (88%, 70-96%) PPV 16/19 (84%, 62-94%) NPV 22/31 (71%, 53-84%) U2 Sensitivity 19/25 (76%, 57-89%) Specificity 19/25 (76%, 57-89%) PPV 6/25 (24%, 12-43%) NPV 6/25 (24%, 12-43%)
V1 Sensitivity 7/10 (70%, 40-89%) Specificity 9/10 (90%, 60-98%) PPV 7/8 (87.5%, 53-98%) NPV 9/12 (75%, 47-91%) V2 Sensitivity 4/10 (40%, 17-69%) Specificity 10/10 (100%, 72-100%) PPV 4/4 (100%, 51-100%) NPV 10/16 (62.5%, 39-82%)
W1 Sensitivity 118/128 (92%, 86-96%) Specificity 136/145 (94%, 89-97%) PPV 118/127 (93%, 87-96%) NPV 136/146 (92%, 88-96%) W2 Sensitivity 112/128 (88%, 81-92%) Specificity 130/145 (90%, 84-94%) PPV 112/126 (89%, 82-93%) NPV 130/147 (88%, 82-93%)
Pooled data Triangular cord sign (n=20 studies) Sensitivity 74% (95% CI, 61-84%) Specificity 97% (95% CI, 95-99%) AUC 97% (95% CI, 95-98%) Heterogeneity for sensitivity (I² = 93.31%, p < 0.001); heterogeneity for specificity (I² = 77.21%, p < 0.001).
Hepatic artery enlargement (n=5 studies) Sensitivity 79% (95% CI, 71-86%) Specificity 75% (95% CI, 60-86%) AUC 83% (95% CI, 80-86%)
Gallbladder abnormalities (n=19 studies) Sensitivity 85% (95% CI, 76-91%) Specificity 92% (95% CI, 81-97%) AUC 94% (95% CI, 91-95%) Heterogeneity for sensitivity (I² = 94.89%, p < 0.001); heterogeneity for specificity (I² = 90.86%, p < 0.001).
Triangular cord sign + gallbladder abnormalities (n=5 studies) Sensitivity 95% (95% CI, 70-99%) Specificity 89% (95% CI, 79-94%) AUC 94% (95% CI, 92-96%)
|
|
Evidence table liver biopsy
Study reference |
Study characteristics |
Patient characteristics 2 |
Intervention (I) |
Comparison / control (C) 3
|
Follow-up |
Outcome measures and effect size 4 |
Comments |
|
What is the diagnostic accuracy of liver biopsy in the evaluation of neonatal cholestasis (resulting in better outcome) in children with neonatal cholestasis? |
||||||||
Ahmed 2021 |
Type of study Retrospective cohort
From 2007 until 2019
Setting Saudi Arabia
Funding and conflicts of interest: The authors received no funding for this study.
The authors declare that they have none to declare.
|
Inclusion criteria Children with cholestasis defined clinically as presence of jaundice, acholic stools, and / or itching, and biochemically when conjugated bilirubin exceeds 20 μmol/l. If there was no jaundice, we required simultaneous elevation of serum GGT and total bile acids (TBA) > 10 (Normal, 0–10 μmol/L).
Exclusion criteria (1) inadequate LB specimen (when the liver tissue contained less than 5 portal spaces); (2) missing clinical information; (3) wedge biopsies obtained during IOC and Kasai portoenterostomy.
N total at baseline: 522 with infantile cholestasis; 166 biopsies performed, 122 included; 46 biopsies with high suspicion for BA
Prevalence 22/122 (18%)
Important prognostic factors:
Sex Boys 77/122 (63%) Girls 45/122 (37%)
Gestational age (preterm) BA 3 (10%) Non-BA 10 (10%) p = 0.831
Age BA 78 days (SD ± 38). Non-BA 150.75 days (SD ± 184.86) p = 0.051
|
BA (n= 22)
Index test Liver biopsy
Reference test Intraoperative cholangiogram or Kasai
|
Non-BA(n=100) |
6-12 months |
Sensitivity 19/22 (86%, 95% CI 67-95%)
Specificity 16/24 (67%, 95% CI 47-82%),
PPV 19/27 (70%, 95% CI 52-84%),
NPV 16/19 (84%, 95% CI 62-94%),
NPV 16/20 (80%, 95% CI 58-92%)
overall accuracy 76% |
See below: histopathological features BA vs non-BA |
|
Russo 2016 |
Type of study Prospective longitudinal multicenter study
Period June 1, 2004 to November 2, 2014
Setting Canada
Funding and conflicts of interest: C Spino is employed by the University of Michigan, an entity that has received grant funds from Childhood Liver Disease Research Network (ChiLDReN). J Magee has received grant funding from NIDDK and Novartis. Z Chen provided data analysis for Saber. H Melin-Aldana received travel expenses from the NIDDK. F White received travel expenses through her PI’s NIDDK grant for this study (Dr Yumi Turmelle, Washington University School of Medicine). P Russo received grant funds. L Finn received travel reimbursement from NIDDK. B Shehata has received travel expenses from NI. GE Kim has received travel expenses from the NIDDK. The other authors have no conflicts of interest to declare. |
Inclusion criteria - children with cholestasis 180 days old or less at the time of a liver biopsy at one of 15 participating clinical centers - cholestasis defined as a serum direct or conjugated bilirubin greater than or equal to 2 mg/dL and greater than 20% of total bilirubin.
Exclusion criteria parenteral nutrition, children who were very low birth weight (<1500 g), who had acute liver failure or cholestasis associated with shock or sepsis, or who had undergone previous HPE or other hepatobiliary surgery.
N total at baseline: 227 biopsies
Prevalence 136/227 (60%)
Important prognostic factors:
Sex BA Boys 25/28 (89%) Girls 3/28 (11%)
Neonatal hepatitis Boys 14/15 (93%) Girls 1/15 (7%)
Age BA 74.5 Non-BA 150.75 (± 184.86) 0.051
|
BA (n= 28)
Index test Liver biopsy
Reference test preoperatively cholangiogram and/or examination of the excised biliary remnants.
|
|
|
Sensitivity 121/136 (89%, 95% CI 83-93%),
Specificity 77/91 (85%, 95% CI 76-91%)
PPV 121/135 (89%, 95% CI 83-94%)
NPV 77/92 (84%, 95% CI 75-90%)
|
|
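Every accuracy measure in the evidence tables above follows from a 2x2 table of index test against reference standard. As a minimal sketch, the Python below recomputes sensitivity, specificity, PPV and NPV from the Russo 2016 counts as reported in the table (121 true positives, 15 false negatives, 14 false positives, 77 true negatives). The Wilson score interval is an assumption: the tables do not state which confidence interval method was used, and the function names are illustrative only.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion.

    Assumed method: the source tables do not name the interval they report.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def diagnostic_accuracy(tp, fn, fp, tn):
    """Return {measure: (estimate, lower, upper)} from 2x2 counts."""
    pairs = {
        "sensitivity": (tp, tp + fn),  # diseased patients with a positive test
        "specificity": (tn, tn + fp),  # non-diseased patients with a negative test
        "ppv": (tp, tp + fp),          # positive tests that are correct
        "npv": (tn, tn + fn),          # negative tests that are correct
    }
    return {name: (k / n, *wilson_interval(k, n)) for name, (k, n) in pairs.items()}


# Counts taken from the Russo 2016 row of the liver biopsy evidence table.
russo = diagnostic_accuracy(tp=121, fn=15, fp=14, tn=77)
for name, (est, lo, hi) in russo.items():
    print(f"{name}: {est:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

PPV and NPV depend on the prevalence in the study sample (60% in this cohort), so they should not be carried over to settings with a different pre-test probability.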
Risk of bias table ERCP
Study reference |
Patient selection
|
Index test |
Reference standard |
Flow and timing |
Comments with respect to applicability |
Iinuma (2000) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? No
|
Were the index test results interpreted without knowledge of the results of the reference standard? No (retrospective design)
If a threshold was used, was it pre-specified? Not used
|
Is the reference standard likely to correctly classify the target condition? Yes: exploratory laparotomy after ERCP
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? No
Did all patients receive a reference standard? No, only if incomplete visualization.
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: High
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Medium |
CONCLUSION Could the patient flow have introduced bias?
RISK: Medium |
|
|
Keil (2009) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? No (retrospective design)
If a threshold was used, was it pre-specified? Not used
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? Not described
Did all patients receive a reference standard? No
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: Medium |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: High
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Low |
CONCLUSION Could the patient flow have introduced bias?
RISK: High |
|
|
Keil (2009) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? No (retrospective design)
If a threshold was used, was it pre-specified? Not used
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? Not described
Did all patients receive a reference standard? No
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: Medium |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: High
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Medium |
CONCLUSION Could the patient flow have introduced bias?
RISK: High |
|
|
Negm (2018) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? Not described
|
Were the index test results interpreted without knowledge of the results of the reference standard? No (retrospective design)
If a threshold was used, was it pre-specified? Not used
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? Not described
Did all patients receive a reference standard? No
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? No; some data missed or no operation performed
|
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: High
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Medium |
CONCLUSION Could the patient flow have introduced bias?
RISK: High |
|
|
Saito (2014) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? Not described
|
Were the index test results interpreted without knowledge of the results of the reference standard? No (retrospective design)
If a threshold was used, was it pre-specified? Not used
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? Not described
Did all patients receive a reference standard? No
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: High
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Medium |
CONCLUSION Could the patient flow have introduced bias?
RISK: Medium |
|
|
Shanmugam (2009) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? Not described
|
Were the index test results interpreted without knowledge of the results of the reference standard? No (retrospective design)
If a threshold was used, was it pre-specified? Not used
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? Not described
Did all patients receive a reference standard? No
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? No; some data missed, or no operation performed
|
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: High
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Medium |
CONCLUSION Could the patient flow have introduced bias?
RISK: High |
|
|
Shteyer (2012) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? Not described
|
Were the index test results interpreted without knowledge of the results of the reference standard? No (retrospective design)
If a threshold was used, was it pre-specified? Not used
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? Not described
Did all patients receive a reference standard? No
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: High
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Medium |
CONCLUSION Could the patient flow have introduced bias?
RISK: Medium |
|
Risk of bias table liver biopsy
Study reference |
Patient selection
|
Index test |
Reference standard |
Flow and timing |
Comments with respect to applicability |
Ahmad 2021 |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? No (retrospective design)
If a threshold was used, was it pre-specified? Not used
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? No
|
Was there an appropriate interval between index test(s) and reference standard? No
Did all patients receive a reference standard? No, only high suspicion.
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: High
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Medium |
CONCLUSION Could the patient flow have introduced bias?
RISK: Medium |
|
|
Russo 2016 |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes (within 1 month before)
If a threshold was used, was it pre-specified? Not used
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? Yes
|
Was there an appropriate interval between index test(s) and reference standard? Yes (within 1 month before)
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: Medium
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Low |
CONCLUSION Could the patient flow have introduced bias?
RISK: Medium |
|
Risk of bias table HIDA
Patient selection
|
Index test |
Reference standard |
Flow and timing |
Comments with respect to applicability |
|
Tsuda (2019) |
Was a consecutive or random sample of patients enrolled? No
Was a case-control design avoided? No
Did the study avoid inappropriate exclusions? Not known.
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes; independently viewed in diagnostic process.
If a threshold was used, was it pre-specified? Not used
|
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index test? Not known; retrospective design, but independently scored HIDA.
|
Was there an appropriate interval between index test(s) and reference standard? Not known.
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? No; biopsy or follow-up.
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: Low
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Medium |
CONCLUSION Could the patient flow have introduced bias?
RISK: Medium |
|
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: High |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: Medium
|
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: Low |
CONCLUSION Could the patient flow have introduced bias?
RISK: Medium |
|
Accountability
Authorization date and validity
Last reviewed: 22-05-2024
Last authorized: 22-05-2024
Planned reassessment: 22-05-2029
General information
The development/revision of this guideline module was supported by the Kennisinstituut van de Federatie Medisch Specialisten (www.demedischspecialist.nl/kennisinstituut) and was funded by the Kwaliteitsgelden Medisch Specialisten (SKMS).
The funder had no influence whatsoever on the content of the guideline module.
Working group composition
To develop the guideline, a multidisciplinary working group was established in 2020, consisting of representatives of all relevant specialties (see Composition of the working group) involved in the care of patients undergoing diagnostic evaluation for biliary atresia.
Composition of the working group
- Prof. dr. H.J. (Henkjan) Verkade, pediatric gastroenterologist, Universitair Medisch Centrum Groningen (Beatrix Kinderziekenhuis), Groningen, NVK (chair)
- Dr. P.F. (Patrick) van Rheenen, pediatric gastroenterologist, Universitair Medisch Centrum Groningen (Beatrix Kinderziekenhuis), Groningen, NVK (vice-chair)
- Drs. J.M. (Jessica) Pruisen, fellow in pediatric gastroenterology, Universitair Medisch Centrum Groningen (Beatrix Kinderziekenhuis), Groningen, NVK
- Prof. dr. C.D.M. (Clara) van Karnebeek, pediatrician and geneticist in metabolic diseases, Amsterdam UMC, NVK
- Dr. T.W. (Tjalling) de Vries, general pediatrician (non-practicing), at the time working at Medisch Centrum Leeuwarden, NVK
- Prof. dr. J.B.F. (Jan) Hulscher, surgeon, Universitair Medisch Centrum Groningen, Groningen, NVvH
- Drs. C. (Carlijn) Frantzen, clinical geneticist, Universitair Medisch Centrum Groningen (Beatrix Kinderziekenhuis), Groningen, VKGN
- Drs. C.A. (Lineke) Dogger, public health physician (arts Maatschappij en Gezondheid), youth health care physician working at NSPOH as trainer/adviser, AJN
Composition of the advisory group
- J. (Janine) Pingen, St. Kind en Ziekenhuis
- Drs. J.A. (José) Willemse, director, Nederlandse Leverpatiënten Vereniging
- Dr. V.M. (Victorien) Wolters, pediatric gastroenterologist, Universitair Medisch Centrum Utrecht (Wilhelmina Kinderziekenhuis), Utrecht, NVK
With support from:
- Dr. T. (Tim) Christen, adviser, Kennisinstituut van de Federatie Medisch Specialisten
- Dr. J. (Janneke) Hoogervorst – Schilp, senior adviser, Kennisinstituut van de Federatie Medisch Specialisten
- Dr. M. (Mattias) Göthlin, adviser, Kennisinstituut van de Federatie Medisch Specialisten
- Drs. S. (Sjoukje) van der Werf, medical information specialist, UMCG
Declarations of interest
The Code for the prevention of improper influence due to conflicts of interest (Code ter voorkoming van oneigenlijke beïnvloeding door belangenverstrengeling) was followed. All working group members declared in writing whether, in the past three years, they had direct financial interests (employment with a commercial company, personal financial interests, research funding) or indirect interests (personal relationships, reputation management). During the development or revision of a module, changes in interests are reported to the chair. The declaration of interests is reconfirmed during the comment phase.
An overview of the interests of working group members and the assessment of how any interests were handled can be found in the table below. The signed declarations of interest can be requested from the secretariat of the Kennisinstituut van de Federatie Medisch Specialisten.
Working group member |
Position |
Additional positions |
Reported interests |
Action taken |
Working group |
||||
Prof. dr. H.J. (Henkjan) Verkade |
Professor of pediatrics / pediatric gastroenterologist
|
Consultancy for Ausnutria, Albireo AB, Mirum, Friesland Campina, Vivet, Intercept, GMP-Orphan and Shire (each on an ad interim basis) |
N/A |
No action; the additional activities mentioned concern a different condition and an older age group
|
Dr. P.F. (Patrick) van Rheenen |
Pediatric gastroenterologist |
None |
N/A |
No action
|
Drs. J.M. (Jessica) Pruisen |
Pediatric gastroenterologist |
None |
None |
No action |
Prof. dr. C.D.M. (Clara) van Karnebeek |
Head of the department of pediatric metabolic diseases |
Program director, United for Metabolic Diseases (0.05 FTE, paid by St Metakids) |
N/A |
No action |
Dr. T.W. (Tjalling) de Vries |
General pediatrician |
Editorial board member of Praktische Pediatrie, for which remuneration is received. Co-author of books, for which royalties are received |
None |
No action; the additional activities mentioned concern a different subject
|
Prof. dr. J.B.F. (Jan) Hulscher |
Pediatric surgeon UMCG, Professor of Pediatric Surgery |
Chair, Nederlandse Vereniging voor Kinderchirurgie; general board member, Nederlandse Vereniging |
I am closely involved in the development and roll-out of the stool colour card for screening for neonatal cholestasis. This card was initially co-funded by the rare
|
No action. Received no payment from Procter & Gamble (Pampers) for the development and roll-out of the stool colour card
|
Drs. C. (Carlijn) Frantzen |
Clinical geneticist |
None |
None |
No action |
Drs. C.A. (Lineke) Dogger |
Public health physician (arts Maatschappij en Gezondheid) |
None |
None |
No action |
Advisory group |
||||
J. (Janine) Pingen |
Junior project manager and policy officer (until 1-12-2020) |
None |
None |
No action |
Ms. R. (Rowy) Uitzinger |
Junior project manager and policy officer (from 1-12-2020) |
None |
None |
No action |
Drs. J.A. (José) Willemse |
Director, Nederlandse Leverpatiënten Vereniging
|
Member of various NASH expert committees (unpaid)
|
Participation in the guideline will at most generate positive publicity among the association's members. |
No action |
Dr. V.M. (Victorien) Wolters |
Pediatric gastroenterologist, Universitair Medisch Centrum Utrecht (Wilhelmina Kinderziekenhuis), Utrecht, NVK |
None |
None |
No action |
Input from the patient perspective
Attention was paid to the patient perspective by inviting patient representatives to the invitational conference and by inviting delegates of patient associations to the advisory group. The report of this was discussed in the working group. The input obtained was taken into account when formulating the clinical questions, choosing the outcome measures and drafting the considerations. The draft guideline was also submitted to patient associations for comment, and any comments received were reviewed and processed.
Wkkgz & qualitative estimate of possible substantial financial consequences
Qualitative estimate of possible financial consequences under the Wkkgz
In accordance with the Wet kwaliteit, klachten en geschillen zorg (Wkkgz), a qualitative estimate was made of whether the recommendations might lead to substantial financial consequences. In this assessment, guideline modules were tested on several domains (see the flowchart on the Richtlijnendatabase).
The qualitative estimate shows that there are probably no substantial financial consequences; see the table below.
Module |
Outcome of estimate |
Explanation |
Module Additional diagnostics |
No financial consequences |
The assessment shows that the recommendation(s) are not broadly applicable (<5,000 patients) and are therefore not expected to have substantial financial consequences for collective expenditure. |
Methods
AGREE
This guideline module was drawn up in accordance with the requirements stated in the report Medisch Specialistische Richtlijnen 2.0 of the advisory committee on guidelines of the Raad Kwaliteit. This report is based on the AGREE II instrument (Appraisal of Guidelines for Research & Evaluation II; Brouwers, 2010).
Bottleneck analysis and clinical questions
During the preparatory phase, the working group inventoried the bottlenecks in the care of patients with suspected biliary atresia. Bottlenecks were also put forward by the following parties:
- Nederlandse Vereniging voor Kindergeneeskunde (NVK)
- Nederlandse Vereniging voor Heelkunde (NVVH)
- Vereniging Klinische Genetica Nederland (VKGN)
- NHG (Nederlands Huisartsen Genootschap)
- Jeugdartsen Nederland (AJN)
- Stichting Kind en Ziekenhuis
- Nederlandse Leverpatiënten Vereniging
- IGJ (Inspectie Gezondheidszorg en Jeugd)
- NFU (Nederlandse Federatie van Universitair Medische Centra)
- NVZ (Nederlandse Vereniging van Ziekenhuizen)
- Patiëntenfederatie Nederland
- STZ (Samenwerkende Topklinische opleidingsZiekenhuizen)
- V&VN (Verpleegkundigen & Verzorgenden Nederland)
- NAPA (Nederlandse Associatie Physician Assistants)
- ZiNL (Zorginstituut Nederland)
- ZKN (Zelfstandige Klinieken Nederland)
- ZN (Zorgverzekeraars Nederland)
These parties contributed via a written invitational conference. A report of this is included under related products.
Based on the outcomes of the bottleneck analysis, the working group drafted and then finalized the clinical questions.
Outcome measures
After formulating the search question belonging to the clinical question, the working group inventoried which outcome measures are relevant to the patient, considering both desired and undesired effects. A maximum of eight outcome measures was used. The working group rated these outcome measures according to their relative importance for decision-making about recommendations, as critical (crucial for decision-making), important (but not critical) and unimportant. The working group also defined, at least for the critical outcome measures, which differences they considered clinically (patient) relevant.
Method of literature summary
A detailed description of the strategy for searching and selecting literature can be found under ‘Zoeken en selecteren’ (Search and select) under Onderbouwing (Evidence base). Where possible, data from different studies were pooled in a random-effects model. The assessment of the strength of the scientific evidence is explained below. The risk-of-bias instruments used are validated instruments recommended by the Cochrane Collaboration: AMSTAR for systematic reviews; Cochrane for randomized controlled trials; ACROBAT-NRS for observational studies; QUADAS II for diagnostic studies.
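The module states only that data were pooled in a random-effects model where possible; it does not specify the estimator. As one possible illustration, the Python sketch below pools study-level proportions (for example sensitivities) on the logit scale with the DerSimonian-Laird estimator and reports I². This is a univariate technique chosen for brevity and is an assumption, not the model behind the pooled estimates in this module; diagnostic meta-analyses often use bivariate models instead. The counts and the function name are hypothetical.

```python
import math


def pool_random_effects(events, totals, z=1.96):
    """DerSimonian-Laird random-effects pooling of proportions on the logit scale.

    Returns (pooled, lower, upper, i2_percent). Illustrative sketch only.
    """
    y, v = [], []
    for e, n in zip(events, totals):
        a, b = e + 0.5, n - e + 0.5      # 0.5 continuity correction
        y.append(math.log(a / b))        # logit of the corrected proportion
        v.append(1 / a + 1 / b)          # approximate within-study variance
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0          # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0      # heterogeneity in percent
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - z * se), inv_logit(mu + z * se), i2


# Hypothetical true-positive counts and numbers of diseased patients in four studies.
pooled, lower, upper, i2 = pool_random_effects([18, 40, 9, 27], [29, 48, 15, 29])
print(f"pooled sensitivity {pooled:.1%} (95% CI {lower:.1%} to {upper:.1%}), I2 = {i2:.0f}%")
```

A high I², such as the values above 90% reported for some pooled ultrasound estimates, indicates that most of the observed variation reflects differences between studies rather than chance, which is one reason to interpret a pooled estimate cautiously.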
Assessing the strength of the scientific evidence
For intervention questions (questions about therapy or screening)
The strength of the scientific evidence was determined according to the GRADE method. GRADE stands for ‘Grading of Recommendations Assessment, Development and Evaluation’ (see http://www.gradeworkinggroup.org/). The basic principles of the GRADE methodology are: identifying and prioritizing the clinically (patient) relevant outcome measures, a systematic review per outcome measure, and an assessment of the certainty of evidence per outcome measure based on the eight GRADE domains (domains for downgrading: risk of bias, inconsistency, indirectness, imprecision, and publication bias; domains for upgrading: dose-response relationship, large effect, and residual plausible confounding).
GRADE distinguishes four grades for the quality of the scientific evidence: high, moderate, low and very low. These grades refer to the degree of certainty about the literature conclusion, in particular the degree of certainty that the literature conclusion adequately supports the recommendation (Schünemann, 2013; Hultcrantz, 2017).
GRADE |
Definition |
High |
|
Moderate |
|
Low |
|
Very low |
|
When assessing (grading) the strength of the scientific evidence in guidelines according to the GRADE methodology, thresholds for clinical decision-making play an important role (Hultcrantz, 2017). These are the thresholds that, if crossed, would lead to a change in the recommendation. To determine the thresholds for clinical decision-making, all relevant outcome measures and considerations must be weighed. The thresholds for clinical decision-making are therefore not directly comparable with the minimal clinically important difference (MCID). In particular, in situations where an intervention has no important drawbacks and the costs are relatively low, the threshold for clinical decision-making regarding the effectiveness of the intervention may lie at a lower value (closer to the null effect) than the MCID (Hultcrantz, 2017).
For questions about diagnostic tests, harms or side effects, etiology and prognosis
The strength of the scientific evidence was likewise determined according to the GRADE method: GRADE-diagnostics for diagnostic questions (Schünemann, 2008), and a generic GRADE method for questions about harms or side effects, etiology and prognosis. In the generic GRADE method used, the basic principles of the GRADE methodology were applied: identifying and prioritizing the clinically (patient) relevant outcome measures, a systematic review per outcome measure, and an assessment of the certainty of evidence based on the five GRADE criteria (starting point high; downgrading for risk of bias, inconsistency, indirectness, imprecision, and publication bias).
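The bookkeeping behind the generic approach (start at high, then downgrade per criterion) can be sketched as follows. This is an illustration only: the function name `rate_certainty`, the dictionary input, and the restriction to 0, 1 or 2 levels per criterion are assumptions made for this example. In practice, grading is a judgement made by the working group, not a mechanical calculation.

```python
# The four GRADE levels, ordered from lowest to highest certainty.
LEVELS = ["very low", "low", "moderate", "high"]

# The five downgrading criteria named in the text.
DOWNGRADE_DOMAINS = {
    "risk of bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication bias",
}


def rate_certainty(downgrades):
    """Return the GRADE level after applying downgrades from a 'high' start.

    `downgrades` maps a criterion name to the number of levels deducted
    (assumed here to be 0, 1 or 2 per criterion).
    """
    for domain, levels in downgrades.items():
        if domain not in DOWNGRADE_DOMAINS:
            raise ValueError(f"unknown GRADE criterion: {domain}")
        if levels not in (0, 1, 2):
            raise ValueError(f"downgrade for {domain} must be 0, 1 or 2")
    # Start at the top level and never drop below 'very low'.
    index = len(LEVELS) - 1 - sum(downgrades.values())
    return LEVELS[max(index, 0)]
```

For example, `rate_certainty({"risk of bias": 1, "imprecision": 1})` yields `"low"`: two levels below the starting point.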
Considerations (from evidence to recommendation)
To arrive at a recommendation, other aspects besides (the quality of) the scientific evidence are important and are taken into account, such as additional arguments from, for example, biomechanics or physiology, patient values and preferences, costs (resource use), acceptability, feasibility and implementation. These aspects are systematically listed and assessed (weighed) under the heading ‘Considerations’ and may be (partly) based on expert opinion. A structured format was used, based on the evidence-to-decision framework of the international GRADE Working Group (Alonso-Coello, 2016a; Alonso-Coello, 2016b). This evidence-to-decision framework is an integral part of the GRADE methodology.
Formulating recommendations
The recommendations answer the key question and are based on the available scientific evidence, the most important considerations, and a weighing of the favorable and unfavorable effects of the relevant interventions. The strength of the scientific evidence and the weight the working group assigns to the considerations together determine the strength of the recommendation. In accordance with the GRADE methodology, a low certainty of evidence in the conclusions of the systematic literature analysis does not a priori rule out a strong recommendation, and weak recommendations are also possible when the certainty of evidence is high (Agoritsas, 2017; Neumann, 2016). The strength of the recommendation is always determined by weighing all relevant arguments together. For each recommendation, the working group has recorded how it arrived at the direction and strength of the recommendation.
The GRADE methodology distinguishes between strong and weak (or conditional) recommendations. The strength of a recommendation refers to the degree of certainty that the benefits of the intervention outweigh the harms (or vice versa), considered across the whole spectrum of patients for whom the recommendation is intended. The strength of a recommendation has clear implications for patients, clinicians and policymakers (see the table below). A recommendation is not a dictate: even a strong recommendation based on high-quality evidence (GRADE rating HIGH) will not always apply under all possible circumstances and to every individual patient.
Implications of strong and weak recommendations for different guideline users |
|
Strong recommendation |
Weak (conditional) recommendation |
For patients |
Most patients would choose the recommended intervention or approach, and only a small number would not. |
A considerable proportion of patients would choose the recommended intervention or approach, but many patients would not. |
For clinicians |
Most patients should receive the recommended intervention or approach. |
There are several suitable interventions or approaches. The patient should be supported in choosing the intervention or approach that best fits his or her values and preferences. |
For policymakers |
The recommended intervention or approach can be regarded as standard policy. |
Policymaking requires extensive discussion involving many stakeholders. Local differences in policy are more likely. |
Organization of care
In the bottleneck analysis and during the development of the guideline module, explicit attention was paid to the organization of care: all aspects that are preconditions for providing care (such as coordination, communication, (financial) resources, staffing and infrastructure). Preconditions relevant to answering this specific key question are mentioned under the considerations. More general, overarching or additional aspects of the organization of care are addressed in the module ‘Organization of care’.
Comment and authorization phase
The draft guideline module was submitted to the involved (scientific) societies and (patient) organizations for comment. The comments were collected and discussed with the working group. In response to the comments, the draft guideline module was revised and finalized by the working group. The final guideline module was submitted to the participating (scientific) societies and (patient) organizations for authorization and was authorized or approved by them.
References
Agoritsas T, Merglen A, Heen AF, Kristiansen A, Neumann I, Brito JP, Brignardello-Petersen R, Alexander PE, Rind DM, Vandvik PO, Guyatt GH. UpToDate adherence to GRADE criteria for strong recommendations: an analytical survey. BMJ Open. 2017 Nov 16;7(11):e018593. doi: 10.1136/bmjopen-2017-018593. PubMed PMID: 29150475; PubMed Central PMCID: PMC5701989.
Alonso-Coello P, Schünemann HJ, Moberg J, Brignardello-Petersen R, Akl EA, Davoli M, Treweek S, Mustafa RA, Rada G, Rosenbaum S, Morelli A, Guyatt GH, Oxman AD; GRADE Working Group. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 1: Introduction. BMJ. 2016 Jun 28;353:i2016. doi: 10.1136/bmj.i2016. PubMed PMID: 27353417.
Alonso-Coello P, Oxman AD, Moberg J, Brignardello-Petersen R, Akl EA, Davoli M, Treweek S, Mustafa RA, Vandvik PO, Meerpohl J, Guyatt GH, Schünemann HJ; GRADE Working Group. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 2: Clinical practice guidelines. BMJ. 2016 Jun 30;353:i2089. doi: 10.1136/bmj.i2089. PubMed PMID: 27365494.
Brouwers MC, Kho ME, Browman GP, Burgers JS, Cluzeau F, Feder G, Fervers B, Graham ID, Grimshaw J, Hanna SE, Littlejohns P, Makarski J, Zitzelsberger L; AGREE Next Steps Consortium. AGREE II: advancing guideline development, reporting and evaluation in health care. CMAJ. 2010 Dec 14;182(18):E839-42. doi: 10.1503/cmaj.090449. Epub 2010 Jul 5. Review. PubMed PMID: 20603348; PubMed Central PMCID: PMC3001530.
Hultcrantz M, Rind D, Akl EA, Treweek S, Mustafa RA, Iorio A, Alper BS, Meerpohl JJ, Murad MH, Ansari MT, Katikireddi SV, Östlund P, Tranæus S, Christensen R, Gartlehner G, Brozek J, Izcovich A, Schünemann H, Guyatt G. The GRADE Working Group clarifies the construct of certainty of evidence. J Clin Epidemiol. 2017 Jul;87:4-13. doi: 10.1016/j.jclinepi.2017.05.006. Epub 2017 May 18. PubMed PMID: 28529184; PubMed Central PMCID: PMC6542664.
Medisch Specialistische Richtlijnen 2.0 (2012). Adviescommissie Richtlijnen van de Raad Kwaliteit. http://richtlijnendatabase.nl/over_deze_site/over_richtlijnontwikkeling.html
Neumann I, Santesso N, Akl EA, Rind DM, Vandvik PO, Alonso-Coello P, Agoritsas T, Mustafa RA, Alexander PE, Schünemann H, Guyatt GH. A guide for health professionals to interpret and use recommendations in guidelines developed with the GRADE approach. J Clin Epidemiol. 2016 Apr;72:45-55. doi: 10.1016/j.jclinepi.2015.11.017. Epub 2016 Jan 6. Review. PubMed PMID: 26772609.
Schünemann H, Brożek J, Guyatt G, et al. GRADE handbook for grading quality of evidence and strength of recommendations. Updated October 2013. The GRADE Working Group, 2013. Available from http://gdt.guidelinedevelopment.org/central_prod/_design/client/handbook/handbook.html.