Diagnosis of DDH
Clinical question
What is the optimal imaging technique to diagnose and classify DDH?
The clinical question comprises the following sub-questions:
- What is the optimal diagnostic imaging method for DDH: ultrasonography (according to the Graf method) or radiography?
- What is the optimal diagnostic imaging method for the classification of DDH: the Graf method or other ultrasonographic methods?
Recommendation
Recommendation 1
Perform ultrasonography of the hip in children younger than one year with suspected DDH, until ultrasonographic examination according to the Graf method is no longer feasible.
After that, radiography is a suitable alternative.
Recommendation 2
Use Graf ultrasonography for the diagnostic evaluation of the hip in children younger than one year with suspected DDH.
Considerations
Benefits and harms of the intervention and the quality of the evidence
There is no convincing evidence that, when DDH is suspected, ultrasonography according to Graf is better or worse than radiography in children younger than one year. Many articles state that radiographs are unreliable below the age of three months, and some even use ultrasonography as the gold standard, but these claims are intuitive rather than evidence-based. However, once the ossification centre of the femoral head has become too large, the triradiate (Y) cartilage can no longer be visualised, so that the Graf method is no longer possible (the alpha angle can no longer be determined). This turning point lies roughly at the age of nine to twelve months.
The Graf method is generally accepted in the Netherlands and there is ample expertise. There is little to no evidence that other ultrasonographic methods are better than the Graf method. Many comparative articles lack a gold standard, or even use the Graf method as the gold standard. Moreover, some ultrasonographic methods depend on specific hardware and software that are not generally available and are therefore rarely used (including 3D ultrasonography). Such studies were therefore excluded in the literature search (see the exclusion table).
A number of studies suggest that the Graf method is the most widely used method, but that a complementary method may be relevant for the Graf type IIa group. However, it is not clear what should then be done with the results of multiple methods.
Values and preferences of patients (and, where applicable, their caregivers)
The aim of the diagnostic method (ultrasonography) is to diagnose as reliably as (or more reliably than) with radiographs.
The advantage of ultrasonography is that there is no harmful ionising radiation and that, unlike radiographs, it can depict the soft tissues of the hip joint.
The radiation-exposure argument becomes more important if DDH is indeed diagnosed, because several follow-up examinations will then follow. An ultrasound examination takes at most five minutes longer than a radiograph, but in practice this is almost never a problem for parents.
Costs (resource use)
Ultrasonography is more expensive than a radiograph: €91 versus €45 (costs and fee, NZa tariffs, 2014). This difference is defensible in view of the reduction in radiation exposure.
Acceptability, feasibility and implementation
As described earlier, the acceptability of ultrasonography is high, both among parents and among radiologists and orthopaedic surgeons in the Netherlands. All radiology residents (AIOS) learn ultrasonography as the first-line modality when DDH is suspected in children younger than one year. Ultrasound equipment for diagnosing DDH is available in every hospital.
Rationale of the recommendation: weighing of arguments for and against the interventions
Recommendation 1
In the comparison of ultrasonography according to the Graf method (intervention) with radiographs, the disadvantage of radiation exposure from radiographs weighs heavily, in line with the ALARA (As Low As Reasonably Achievable) principle in radiology. After all, the younger the child, the greater the potential harmful effect (induction of future malignancy). Because of the limited ossification, ultrasonography is better able to depict the hip joint the younger the child is, in contrast to radiographs.
Acceptance of ultrasonography (according to Graf) in the Netherlands is high, both among parents and among children. Experience with ultrasonography in the Netherlands is also extensive.
Recommendation 2
The Graf method is generally accepted in the Netherlands and there is ample expertise. There is little to no evidence that other ultrasonographic methods are better than the Graf method.
The aim is to reduce practice variation; the working group therefore considers that no complementary methods are needed in addition to the Graf method.
Evidence base
Background
The current situation in the Netherlands is that imaging for DDH is performed on indication; there is no routine screening of every newborn. The imaging algorithm, however, varies widely:
- Often still the old approach: ultrasonography first and, if abnormal, then a radiograph.
- Radiography only, including the follow-up examinations.
- Ultrasonography only, including the follow-up examinations. A radiograph only after the age of one year.
The bottleneck in practice is that there is large practice variation, fragmented expertise and unnecessary radiation exposure for young children.
Conclusies
Very low GRADE |
It is unclear whether the diagnostic accuracy of radiography differs from that of ultrasonography for the diagnosis of DDH in children younger than one year.
It is unclear whether there is good correlation between radiography and ultrasonography for the diagnosis of DDH in children younger than one year, especially for the more severe cases.
Sources: (Atalar, 2013; Morin, 1999; van Moppes, 1988) |
- GRADE |
It was not possible to draw conclusions or grade the level of evidence for the diagnostic accuracy of ultrasonography with the static Graf-method versus any dynamic method for the classification of DDH in children younger than one year, due to the lack of a gold standard or reference method. Sources: (Kosar, 2009; Alamdaran, 2016; Finnbogason, 2008) |
- GRADE |
It was not possible to draw conclusions or grade the level of evidence for the diagnostic accuracy of ultrasonography with the Graf-method versus femoral head coverage for the classification of DDH in children younger than one year, due to the lack of a gold standard or reference method.
Sources: (Falliner, 2006; Czubak, 1998; Gunay, 2009) |
Very low GRADE |
It is unclear whether the diagnostic accuracy of ultrasonography differs between the Graf-method and pubo-femoral distance for the classification of DDH in children younger than one year.
Sources: (Teixeira, 2015) |
Summary of the literature
Description of studies and results
1. Ultrasonography versus radiography
Atalar (2013) conducted a retrospective cohort study to determine the sensitivity and specificity of plain radiography for the diagnosis of DDH, with ultrasonography as the reference standard. All infants were referred for DDH evaluation with ultrasonography and had undergone hip radiography before referral. Four patients who were older than 5 months of age had undergone radiography in the referral centre. Ultrasonography was performed within 10 days after the radiograph. Ultrasonography images were evaluated by two experts according to the Graf criteria, and the radiographs were evaluated by two other experts according to the Tönnis criteria. In total, 44 infants (86 hips) were evaluated; 35 were girls, and the mean age was 22 weeks (range 4 to 50 weeks).
With ultrasonography as the standard for the diagnosis of DDH, radiography had a sensitivity of 61% and a specificity of 87%. All hips classified as Graf D and Graf IIIa/b on ultrasonography were also identified on the radiographs; for types IIb and IIc, sensitivity and specificity were lower.
Morin (1999) studied 150 hips from 75 infants using both radiography (Tönnis criteria) and ultrasonography (Graf). The average age was 5 months (range 1 to 12 months). Five infants were referred with unstable dislocated hips, 14 infants were treated with the Pavlik abduction device and the remaining 55 hips were stable hips but were considered at-risk.
Sonographic measurements were compared with the radiographically determined acetabular index, with the aim of reporting on intra- and interobserver reliability. Two paediatric orthopaedic surgeons and one radiologist participated.
The acetabular index correlated better with the d/D ratio than with the α-angle. Hips with a d/D ratio above 56% all had normal acetabular indexes and those below 40% were all abnormal; between these limits there was a ‘grey zone’ with poor correlation. For the α-angle this grey zone was wide, between 35° and 78°.
In the study of Van Moppes (1988), 500 ultrasound examinations of 1,000 hips were performed in infants between 1 and 14 months of age. For 210 examinations the ultrasound results could be compared with radiography. For the purpose of this comparison, Graf type IIa hips that were normal for age were grouped with Graf type I as being within normal physiological limits; types IIb, IIc/d, III and IV were considered abnormal. Radiographs were assigned to three categories according to the Tönnis criteria: ‘normal’, ‘abnormal’, and ‘doubtful’. Despite the lack of a gold standard, sensitivity and specificity were calculated with radiographs as the reference standard (see Table 1); both values were high. Of the 24 hips (in 22 patients) that were doubtful on radiological examination, 14 were normal on ultrasound and 10 were abnormal.
Table 1 Comparison of radiography versus ultrasonography per hip
| | R abnormal | R normal | Total | Sensitivity | Specificity |
| US abnormal | 95 | 7 | 102 | | |
| US normal | 8 | 266 | 274 | | |
| Total | 103 | 273 | 376 | 92.2% | 97.4% |
R=radiography; US=ultrasonography
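The sensitivity and specificity reported in Table 1 follow directly from the four cell counts when radiography is taken as the reference standard. A minimal Python sketch of that arithmetic is given below; the function and variable names are illustrative and do not come from the study.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) for a 2x2 diagnostic table."""
    sensitivity = tp / (tp + fn)  # abnormal on the reference that the test also flags
    specificity = tn / (tn + fp)  # normal on the reference that the test also clears
    return sensitivity, specificity


# Per-hip counts from Table 1, radiography (R) as reference standard.
tp = 95    # US abnormal, R abnormal
fp = 7     # US abnormal, R normal
fn = 8     # US normal, R abnormal
tn = 266   # US normal, R normal

sens, spec = sensitivity_specificity(tp, fp, fn, tn)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

With these counts the sketch reproduces the 92.2% and 97.4% shown in the table.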
Level of evidence of the literature
The level of evidence for all accuracy and correlation outcomes is derived from diagnostic studies and therefore starts high. It was downgraded by two levels because of study limitations (bias due to patient selection and absence of a gold standard) and by one level because of the low number of included patients (imprecision), resulting in a very low level of evidence.
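The downgrading steps described above can be written out as a small calculation. The sketch below assumes the usual four-step GRADE scale and a floor at "very low"; the helper name `downgrade` is illustrative only.

```python
# Four GRADE certainty levels, ordered from lowest to highest.
GRADE_LEVELS = ["very low", "low", "moderate", "high"]


def downgrade(start, levels_down):
    """Lower a GRADE rating by a number of levels, never below 'very low'."""
    index = GRADE_LEVELS.index(start) - levels_down
    return GRADE_LEVELS[max(index, 0)]


# Diagnostic accuracy studies start at 'high': two levels down for study
# limitations plus one level down for imprecision.
final_level = downgrade("high", 2 + 1)
print(final_level)
```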
2. Graf-method versus other ultrasonography method
2a. Graf-method (static) versus dynamic method
Harcke method
Kosar (2009) studied 6,800 hips in 3,400 infants referred for clinical suspicion of DDH, for risk factors for DDH, or for follow-up examination of known DDH. The infants were examined with ultrasonography using the method of Graf and/or the dynamic method of Harcke to differentiate normal from abnormal findings in the diagnosis of DDH. Of the infants, 2,142 (63%) were female; the age range was 3 days to 5 months and the mean age was 7 weeks and 3 days. Dynamic examination was not performed if the hip was classified as Type IIb (acetabular dysplasia) or worse according to Graf’s classification. Hips classified as Graf type I and stable on dynamic examination were not called back for repeat ultrasonography. Hips classified as Graf type IIa and stable on dynamic examination were followed up at one-month intervals until the infants were 3 to 4 months old or the hips were classified as type I morphology. Hips classified as Graf I or IIa but unstable on dynamic examination were also followed up at one-month intervals.
Among the 5,540 hips that were grouped as Type I according to Graf’s classification, 91.5% (n = 5,068 hips) were stable in dynamic examination while 8.5% (n = 472 hips, age range 15 to 75 days) were unstable.
In the group classified as Type IIa (n = 682 hips), the rates of stability and instability on dynamic examination were 92.37% (n = 630 hips) and 7.63% (n = 52 hips), respectively.
In total, 524 unstable hips (initially classified as Graf I or Graf IIa) in 300 infants were followed by ultrasonography at one-month intervals without treatment. Of these 524 hips, 516 (98.5%) became stable within one or two months. In eight hips (all with unilateral instability), stabilization had not occurred after 3 to 5 months; these eight hips had all initially been classified as Graf Type I. The 682 hips diagnosed as Type IIa had follow-up examinations, and in 18 of them (18/682, 2.64%) no stabilization was observed. These hips were treated under the diagnosis of acetabular dysplasia (Type IIb).
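The follow-up figures for Kosar (2009) can be checked against one another with a few lines of arithmetic. The sketch below only uses counts reported in the text; the variable names are illustrative.

```python
# Hips that were unstable on dynamic examination, by initial Graf type.
unstable_graf_1 = 472
unstable_graf_2a = 52
unstable_total = unstable_graf_1 + unstable_graf_2a

# Unstable hips that became stable during follow-up without treatment.
stabilised = 516
not_stabilised = unstable_total - stabilised
share_stabilised = stabilised / unstable_total

# Type IIa hips that were eventually treated as acetabular dysplasia.
type_2a_total = 682
type_2a_treated = 18
share_type_2a_treated = type_2a_treated / type_2a_total

print(unstable_total, not_stabilised)
print(f"{share_stabilised:.1%} stabilised, {share_type_2a_treated:.2%} of type IIa treated")
```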
Other dynamic methods
In the cross-sectional study of Alamdaran (2016), all hips were examined with three techniques: static (Graf), dynamic, and a single-view combined static and dynamic technique. The dynamic techniques were not well described. In total, 300 infants (600 hips) were included after referral for ultrasonography; indications were abnormal findings on physical or imaging examination, a family history of DDH, breech presentation at delivery, oligohydramnios, and neuromuscular conditions. Of the infants, 52% were girls, and the age range was 9 days to 83 weeks: 11% were less than four weeks old, 74% were between 1 and 3 months of age, and 15% were more than three months old.
Hips were stratified into four classes based on the α angle (in degrees) and the shape of the acetabulum, using a summarized Graf classification (normal: α > 60; immature: 50 ≤ α < 60; mild dysplasia: 43 ≤ α < 50; severe dysplasia: α < 43).
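The four α-angle classes used in this study can be expressed as a simple lookup. The sketch below is illustrative only: the function name is hypothetical, it ignores the acetabular shape that the study also took into account, and it assumes that an angle of exactly 60 degrees counts as normal, which the ranges as reported leave undefined.

```python
def classify_alpha(alpha_deg):
    """Map an alpha angle (degrees) to the summarized class reported for Alamdaran (2016)."""
    if alpha_deg >= 60:   # assumption: exactly 60 is treated as normal
        return "normal"
    if alpha_deg >= 50:
        return "immature"
    if alpha_deg >= 43:
        return "mild dysplasia"
    return "severe dysplasia"


for angle in (65, 55, 45, 40):
    print(angle, classify_alpha(angle))
```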
In total, 5% of the hips were immature with the static technique, and almost all of these were unstable with the dynamic technique. Of the morphologically normal hips, 0.3% were unstable with the dynamic technique, and 9% of the hips that were unstable with the dynamic technique were morphologically normal. The single-view static and dynamic technique detected all cases of acetabular dysplasia, instability, and dislocation except two dislocations, which were detected on the dynamic transverse view. The single-view static and dynamic technique thus detected more than 99% of all cases.
Finnbogason (2008) compared anterior dynamic ultrasonography with acetabular morphology according to Graf’s method to assess hip instability. The study also included an orthopaedic clinical examination, which is not relevant for this literature review. In total, 536 newborn infants (64% girls; 1,072 hips) with clinical signs of hip instability, ambiguous findings at clinical hip examination, or positive risk factors for DDH were investigated at a mean age of 12 days (SD 5).
Of the hips, 821 (77%) were normal (Graf A, type Ia and Ib); 28 of these (3.4%) were unstable or dislocatable with dynamic ultrasonography. A further 220 hips (20%) were borderline/immature dysplastic (Graf B, type IIb), of which 73 (33%) were unstable or dislocatable with dynamic ultrasonography. The remaining 31 hips (3%) were pathologic and needed treatment (Graf C, type IIc and worse); 30 of these (97%) were unstable or dislocatable with dynamic ultrasonography.
Of all the hips considered unstable or dislocatable with dynamic ultrasonography, 21% were normal (Graf type I). Of the hips that were stable on dynamic ultrasound, one (0.1%) was dysplastic according to the Graf method.
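The two summary percentages in the last paragraph can be derived from the per-group counts given for Finnbogason (2008). The sketch below shows that derivation; the dictionary labels are illustrative, and it assumes that the one stable dysplastic hip is the single pathologic hip that was not unstable.

```python
# (total hips, hips unstable or dislocatable on dynamic ultrasonography)
groups = {
    "normal": (821, 28),
    "borderline": (220, 73),
    "pathologic": (31, 30),
}

total_hips = sum(total for total, _ in groups.values())
unstable_hips = sum(unstable for _, unstable in groups.values())
stable_hips = total_hips - unstable_hips

# Share of dynamically unstable hips that were morphologically normal.
share_unstable_but_normal = groups["normal"][1] / unstable_hips

# Pathologic hips that were nevertheless stable on dynamic ultrasonography
# (assumption: this is the "one dysplastic hip" mentioned in the text).
stable_but_pathologic = groups["pathologic"][0] - groups["pathologic"][1]
share_stable_but_pathologic = stable_but_pathologic / stable_hips

print(total_hips, unstable_hips, stable_hips)
print(f"{share_unstable_but_normal:.0%}, {share_stable_but_pathologic:.1%}")
```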
Level of evidence of the literature
No level of evidence was determined, because no diagnostic outcome measures could be defined owing to the lack of a gold standard or reference method. The studies provide a description of a comparison between the methods, so no GRADE conclusion can be drawn. Differences were observed between the Graf method (types IIa/b/c) and the different dynamic methods.
2b. Graf-method versus Femoral Head Coverage (FHC)
In a prospective study, Falliner (2006) examined 232 consecutive neonates (464 hips) with ultrasonography using the methods of Graf and Terjesen. The neonates were less than four days old and 47% were female. The correlation between the two methods for the femoral head coverage (FHC) and the α angles was investigated.
For the Graf method, hips were considered either normal (types I and IIa) or pathological (IIc, D, IIIa). For the Terjesen method, hips with FHC < 44% (female) or < 47% (male) were classified as pathological.
According to Graf’s method, 1.3% of hips were pathological, compared with 4.1% according to Terjesen. The proportion of overall agreement was 96.8%. Negative and positive agreement were 98.3% and 40%, respectively (κ = 0.39), which the authors considered fair agreement. These statistics are based on only six pathological cases according to the Graf method and 19 according to the Terjesen method. Spearman’s correlation coefficient between FHC and α angles was 0.552 (95% CI 0.48 to 0.62).
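The agreement statistics for Falliner (2006) can be illustrated with a 2x2 table. The study's cross-tabulation is not given in this summary, so the cell counts below are an assumption: they are the only whole-number counts consistent with 464 hips, six pathological hips by Graf, 19 by Terjesen, and 96.8% overall agreement. The function name is illustrative.

```python
def agreement_stats(a, b, c, d):
    """a: pathological by both, b: Graf only, c: Terjesen only, d: normal by both."""
    n = a + b + c + d
    overall = (a + d) / n
    positive = 2 * a / (2 * a + b + c)   # agreement on pathological hips
    negative = 2 * d / (2 * d + b + c)   # agreement on normal hips
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, positive, negative, kappa


# Assumed cell counts, inferred from the reported margins (not from the paper).
overall, positive, negative, kappa = agreement_stats(a=5, b=1, c=14, d=444)
print(f"overall = {overall:.1%}, positive = {positive:.0%}, "
      f"negative = {negative:.1%}, kappa = {kappa:.2f}")
```

Under these assumed counts, agreement on pathological hips comes out at 40% and agreement on normal hips at 98.3%, with a kappa of 0.39.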
Czubak (1998) investigated 657 infants with a mean age of 23 days (range 5 to 42 days); 50% were girls. These 657 infants were drawn from 826 consecutive infants. Subsequently, the infants were examined at least twice in every 6-week period. The α angle in the Graf method was compared to the FHC with the Terjesen method.
With the Graf method, 3.9% of hips were dislocated or subluxated, versus 2.9% with the Terjesen method. In total, 29% of the hips were type IIa with the Graf method and 14% had possible dysplasia according to the Terjesen method (FHC 40% to 49%).
A correlation coefficient of r = 0.57 was found between the alpha angle and FHC.
The results from the follow-up measurements were not considered for this literature review.
Gunay (2009) conducted a retrospective analysis of ultrasonography data from 1,037 infants (2,074 hips). Infants were included as part of a DDH screening programme. The mean age was 2.3 months (range 1 to 10 months) and 59.4% were female. This study investigated the extent to which ultrasonographic measurements of FHC correspond to the categories of hip maturity defined by Graf’s angle α.
The percentages of FHC were found to be positively correlated with the α angles (r=0.668).
For further analysis, two groups were defined: 1) α angle of 60 degrees or greater (mature hips) and 2) α angle of less than 60 degrees (immature or pathological hips). The authors calculated corresponding threshold values for FHC: hips with femoral head coverage of ≥ 51% were all mature. When this threshold of 51% femoral head coverage was evaluated as an indicator of hip maturity, sensitivity was 82.6% and specificity was 100%.
The lower threshold value was 39%, indicating pathological development (α angle of less than 50 degrees). For hips with pathological development (α angle ≤ 49 degrees; Graf types IIc, D, IIIa, IIIb, or IV), specificity was 100% and sensitivity was 79.2%.
The FHC of 22% of the hips in this study fell between these two threshold values (40% to 50%). Of these hips, 73% were Graf type Ia and 26% were Graf type IIa or IIb.
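The two FHC thresholds reported for Gunay (2009) split hips into three bands. The sketch below only restates those reported cut-offs; the function name and the label "indeterminate" for the middle band are illustrative, and placing non-integer values between 39% and 40% or between 50% and 51% in the middle band is an assumption.

```python
def classify_fhc(fhc_percent):
    """Band a femoral head coverage value using the thresholds reported for Gunay (2009)."""
    if fhc_percent >= 51:
        return "mature"          # upper threshold: all such hips had alpha >= 60 degrees
    if fhc_percent <= 39:
        return "pathological"    # lower threshold: alpha below 50 degrees
    return "indeterminate"       # between the thresholds (roughly 40% to 50%)


for fhc in (55, 45, 35):
    print(fhc, classify_fhc(fhc))
```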
Level of evidence of the literature
No level of evidence is estimated because no diagnostic outcome measures can be defined due to the lack of a gold standard or reference method. This is a description of a comparison between two methods. This means no GRADE conclusion can be drafted.
2c. Graf method and pubo-femoral distance
Teixeira (2015) conducted a retrospective analysis to compare pubo-femoral distance (PFD) with the Graf method to screen for DDH in an at-risk population with 116 neonates, 232 hips. All neonates underwent ultrasonography in the fourth week after birth.
The study defined two groups according to the recommendation for treatment: non-dysplastic hips (ND; Graf I/IIa; 211 hips; 69 females/37 males) and dysplastic hips (DH; Graf IIb/IIc/III/D/IV; 21 hips; 8 females/3 males).
With the Graf method as the reference method, the sensitivity, specificity, and accuracy (from the receiver operating characteristic (ROC) curve) of PFD were 94.4% (95% CI 77 to 99), 93.4% (95% CI 89 to 96), and 97.2% in the neutral position (cut-off = 4.6 mm), and 94.4% (95% CI 77 to 99), 89.0% (95% CI 84 to 93), and 95.5% with the hip flexed (cut-off = 4.9 mm).
Level of evidence of the literature
The level of evidence for the accuracy outcomes is derived from a diagnostic study and therefore starts high. It was downgraded by two levels because of study limitations (bias due to patient selection and absence of a gold standard) and by two levels because of the low number of included patients in a single study (imprecision), resulting in a very low level of evidence.
Search and selection
A systematic review of the literature was performed to answer the following questions:
PICO 1. What is the diagnostic accuracy of ultrasonography (according to the Graf method) in comparison with radiography for the diagnosis of DDH in children younger than one year?
P (patients): patients with probable DDH, younger than one year;
I (intervention): ultrasonography (according to Graf method);
C (control): conventional radiography;
O (outcome): diagnostic accuracy measures and correlation.
PICO 2. What is the diagnostic accuracy of ultrasonography according to the Graf method in comparison to other ultrasonography methods for the diagnosis of DDH in children younger than one year?
P (patients): patients with probable DDH, younger than one year;
I (intervention): ultrasonography (according to Graf method);
C (control): other ultrasonography methods (amongst others Harcke, dynamic ultrasound, anterior dynamic ultrasound, pubo-femoral distance);
O (outcome): diagnostic accuracy measures and correlation.
Relevant outcome measures
A priori, the guideline development group did not define the outcome measures listed above but used the definitions used in the studies.
Search and select (Methods)
The databases Medline (via OVID) and Embase (via Elsevier) were searched on 1 April 2019 with relevant search terms, for publications from 1980 onwards. The detailed search strategy is depicted under the tab Methods. The systematic literature search resulted in 468 hits. Studies were selected based on the following criteria: original studies; diagnostic tests used in accordance with the PICO; when a comparison was made, no more than one month between the two tests; reporting of useful diagnostic accuracy measures or DDH outcomes; and inclusion of children under one year old. For PICO 1, 13 studies were initially selected based on title and abstract screening. After reading the full text, ten studies were excluded (see the table with reasons for exclusion under the tab Methods), and three studies were included. For PICO 2, 21 studies were initially selected based on title and abstract screening. After reading the full text, 14 studies were excluded (see the table with reasons for exclusion under the tab Methods), and seven were included.
Results
Three studies were included in the analysis of the literature for the first PICO and seven studies for the second PICO. Important study characteristics and results are summarized in the evidence tables. The assessment of the risk of bias is summarized in the risk of bias tables.
References
- Alamdaran, S. A., Kazemi, S., Parsa, A., Moghadam, M. H., Feyzi, A., & Mardani, R. (2016). Assessment of diagnostic value of single view dynamic technique in diagnosis of developmental dysplasia of hip: a comparison with static and dynamic ultrasond techniques. Archives of Bone and Joint Surgery, 4(4), 371.
- Atalar, H., Dogruel, H., Selek, H., Tasbas, B. A., Bicimoglu, A., & Gunay, C. (2013). A comparison of ultrasonography and radiography in the management of infants with suspected developmental dysplasia of the hip. Acta Orthop Belg, 79(5), 524-9.
- Czubak, J., Kotwicki, T., Piontek, T., & Skrzypek, H. (1998). Ultrasound measurements of the newborn hip comparison of two methods in 657 newborns. Acta orthopaedica Scandinavica, 69(1), 21-24.
- Falliner, A., Schwinzer, D., Hahne, H. J., Hedderich, J., & Hassenpflug, J. (2006). Comparing ultrasound measurements of neonatal hips using the methods of Graf and Terjesen. The Journal of bone and joint surgery. British volume, 88(1), 104-106.
- Finnbogason, T., Jorulf, H., Söderman, E., & Rehnberg, L. (2008). Anterior dynamic ultrasound and Graf's examination in neonatal hip instability. Acta Radiologica, 49(2), 204-211.
- Gunay, C., Atalar, H., Dogruel, H., Yavuz, O. Y., Uras, I., & Saylı, U. (2009). Correlation of femoral head coverage and Graf α angle in infants being screened for developmental dysplasia of the hip. International orthopaedics, 33(3), 761-764.
- Kosar, P., Ergun, E., Ünlübay, D., & Kosar, U. (2009). Comparison of morphologic and dynamic US methods in examination of the newborn hip. Diagnostic and Interventional Radiology, 15(4), 284.
- Morin, C., Zouaoui, S., Delvalle-Fayada, A., Delforge, P. M., & Leclet, H. (1999). Ultrasound assessment of the acetabulum in the infant hip. Acta orthopaedica Belgica, 65(3), 261-265.
- Teixeira, S. R., Dalto, V. F., Maranho, D. A., Zoghbi-Neto, O. S., Volpon, J. B., & Nogueira-Barbosa, M. H. (2015). Comparison between Graf method and pubo-femoral distance in neutral and flexion positions to diagnose developmental dysplasia of the hip. European journal of radiology, 84(2), 301-306.
- Van Moppes, F. I., & De Jong, R. O. (1988). Ultrasound diagnosis of congenital hip dislocation and dysplasia. Journal of medical imaging, 2(1), 1-9.
Evidence tables
Evidence table for diagnostic test accuracy studies
Research question: What is the diagnostic accuracy of ultrasonography (according to the Graf method) in comparison with radiography for the diagnosis of DDH in children younger than one year?
Study reference |
Study characteristics |
Patient characteristics
|
Index test (test of interest) |
Reference test
|
Follow-up |
Outcome measures and effect size |
Comments |
Atalar, 2013 |
Type of study: cross sectional study
Setting and country: Orthopaedic center, Turkey
Funding and conflicts of interest: No benefits or funds were received in support of this study. The authors report no conflict of interests. |
Inclusion criteria: -infants with referral for DDH evaluation and had undergone hip radiography before referral
Exclusion criteria: NA
N=44 infants, 86 hips
Prevalence: DDH on radiography: n=24 (55%); DDH on ultrasonography: n=26 (59%)
Age: 4 weeks to 50 weeks (mean age 21.7 weeks)
Sex: 35 (80%) female |
Describe index test: radiography
Normal or not normal according to Tönnis criteria. |
Describe reference test: Ultrasonography
Graf type 1 hips were categorized as normal, and types 2b, 2c, D, 3 and 4 were categorized as having DDH. Because type 2a hips can later show either normal or abnormal development, type 2a hips were not included in the study
|
Time between the index test and reference test: maximum of 10 days
For how many participants were no complete outcome data available? In 2 infants, Graf type 2a hips were present on one side and these two hips were not included in the study. |
Outcome measures and effect size (include 95%CI and p-value if available):
Sensitivity radiography: 61% Specificity radiography: 87%
Radiography and ultrasonography were found to be significantly correlated in terms of classification of developmental dysplasia of the hip presence or absence (p < 0.0001, Fisher’s exact test).
|
|
Morin, 1999 |
Type of study: cross sectional study
Setting and country: Hospital, France
Funding and conflicts of interest: not reported |
Inclusion criteria: -referral clinically instable or already Pavlik treatment or considered at-risk due to decreased abduction, clicks around the hip or positive family history
Exclusion criteria: NA
N=75 patients, 150 hips
Prevalence: NA
Mean age (range): 5 months (1-12 months)
Sex: 83% female |
Describe index test: Ultrasonography (Graf)
Comparator test: Radiography (Tönnis) |
Describe reference test: NA
|
Time between the index test and reference test: none
For how many participants were no complete outcome data available? N=0
Reasons for incomplete outcome data described? NA |
Outcome measures and effect size (include 95%CI and p-value if available):
Correlation (between d/D ratio and acetabular indexes): > 56%, all had normal acetabular indexes; < 40%, all were abnormal. Correlation (between α-angle and acetabular indexes): grey zone with a poor correlation, between 35° and 78° |
Very little information on type of correlation measures |
Van Moppes, 1988 |
Type of study: cross sectional study
Setting and country: Hospital, The Netherlands
Funding and conflicts of interest: Not reported |
Inclusion criteria: consecutive newborns
Exclusion criteria: NA
N=500 ultrasounds, 1000 hips in 290 patients, of which 210 were examined with radiography
Prevalence: Graf classification based on sonography: Graf I: 78%; Graf IIa: 8.4%; Graf IIb: 8.0%; Graf IIc/d: 3.5%; Graf IV: 1.8%
Age: 1 day – 14 months
Sex: 67% female |
Describe index test: Radiography (Tönnis)
Cut-off point(s): Tönnis and Brunken criteria |
Describe reference test: Ultrasonography (Graf)
Cut-off point(s): Graf type I and IIa was considered normal
Type ≥ IIb was considered abnormal
|
Time between the index test and reference test:
For how many participants were no complete outcome data available? N=0 (0%)
Reasons for incomplete outcome data described? NA? |
Outcome measures and effect size (include 95%CI and p-value if available):
Sensitivity radiography: 92.2% Specificity radiography: 97.4%
|
|
DDH=Developmental Dysplasia of the Hip; NA=Not applicable
Research question: What is the diagnostic accuracy of ultrasonography according to the Graf method in comparison to other ultrasonography methods for the diagnosis of DDH in children younger than one year?
Study reference |
Study characteristics |
Patient characteristics
|
Index test (test of interest) |
Reference test
|
Follow-up |
Outcome measures and effect size |
Comments |
Kosar, 2009 |
Type of study: cross sectional study
Setting and country: Hospital, Turkey
Funding and conflicts of interest: not reported |
Inclusion criteria: Patients that were referred for ultrasonography (US)for clinical suspicion of DDH, for having risk factors for DDH, or for follow-up US examination for infants with known DDH
Exclusion criteria: Follow-up cases and those who were referred without clinical information
N=3400 patients, 6800 hips
Prevalence: according to Graf’s classification Type 1 (normal hip): 5,540 (81.47%) Type 2a (physiologic immaturity): 682 (10%) Type 2b (acetabular dysplasia): 166 (2.44%) Type 2c (critical zone): 72 (1.05%) Type 3 (mildly dislocated): 197 (2.89%) Type 4 (dislocated): 143 (2.10%)
Mean age: 7 weeks and 3 days (range; 3 days-5 months)
Sex: 2,142 (63%) female |
Describe index test: Ultrasonography: dynamic method of Harcke
Cut-off point(s): Abnormal or normal based on transverse flexion and coronal flexion views
Comparator test: Ultrasonography: Graf method
Cut-off point(s): Graf Type I and type IIa is considered normal
|
Describe reference test: NA |
Time between the index test and reference test: none
For how many participants were no complete outcome data available? Unclear
|
Outcome measures and effect size (include 95%CI and p-value if available):
Agreement outcome measure between Graf and dynamic examination: Graf Type I (n = 5,540 hips): 91.5% (n = 5,068 hips) were stable and 8.5% (n = 472 hips) were unstable.
Graf Type IIa (n = 682 hips): 92.37% (n = 630 hips) were stable and 7.63% (n = 52 hips) were unstable.
|
|
Alamdaran, 2016 |
Type of study: cross sectional study
Setting and country: Hospital, Iran
Funding and conflicts of interest: not reported |
Inclusion criteria: -hip ultrasound indications*, referred to the radiology department -consent from parents * abnormal or equivocal findings on physical or imaging examination of the hip, any family history of DDH, breech presentation at delivery, oligohydramnios, and neuromuscular conditions, especially of the foot.
Exclusion criteria: age over 2 years and recognized DDH
N=300 patients (600 hips)
Prevalence: based on static (Graf) technique: Normal: 90% Immature: 5% Mild dysplastic: 3% Dysplastic: 2%
Age: Range 9 days to 83 weeks. 35 infants (11%) <4 weeks old, 223 (74%) were 1-3 months age and 32 (15%) were > 3 months old.
Sex: 155 (52%) female |
Describe index test: Graf ultrasonography
Cut-off point(s): Graf classification (normal: α > 60; immature: 50 ≤ α < 60; mild dysplasia: 43 ≤ α < 50; severe dysplasia: α < 43).
Comparator test: Dynamic methods: dynamic technique in other views (coronal flexion view in posterior lip plan, transverse/ flexion view and transverse/neutral view).
-dynamic -single view static and dynamic technique
|
Describe reference test: NA
|
Time between the index test and reference test: none
For how many participants were no complete outcome data available? N=0 (0%)
Reasons for incomplete outcome data described? NA |
Outcome measures and effect size (include 95%CI and p-value if available):
Agreement outcome measure between Graf and dynamic examination:
Static technique: 5% of the hips were immature. Almost all of them were unstable with the dynamic technique.
0.3% of morphologically normal hips were unstable with the dynamic technique, and 9% of the hips that were unstable with the dynamic technique were normal.
The single view static and dynamic technique detected all cases with acetabular dysplasia, instability, and dislocation, except two dislocations; these were detected by the dynamic transverse view.
|
Little information on exact numbers |
Finnbogason, 2008 |
Type of study: cross sectional study
Setting and country: Maternity unit hospital, Sweden
Funding and conflicts of interest: not reported |
Inclusion criteria: -Infants with risk factors for DDH such as breech delivery, foot or neck deformity, or a family history of DDH, and those with a clinically unstable hip or ambiguous findings on clinical examination -referral for orthopaedic consultation at an age of 10 to 14 days.
Exclusion criteria: NA
N=536 infants or 1072 hips
Prevalence: Graf type Ia and Ib: 77% Graf type IIB: 20% Graf type IIc: 3%
Mean age ± SD: 12.2 days (SD 4.8 days).
Sex: 64% (342) female |
Describe index test: Static (Graf) method, acetabular morphology
Comparator test: Anterior dynamic ultrasonography
|
Describe reference test: NA
|
Time between the index test and reference test: on same day, but in 14 cases there was an interval exceeding 7 days.
For how many participants were no complete outcome data available? N=2 (<1%) → excluded from analyses
Reasons for incomplete outcome data described? No |
Outcome measures and effect size (include 95%CI and p-value if available):
Agreement outcome measure between Graf and dynamic examination:
Normal: 821 (77%) hips, of which 28 (3.4%) were unstable or dislocatable. Borderline/immature dysplastic: 220 (20%) hips, of which 73 (33%) were unstable or dislocatable. Pathological and needing treatment: 31 (3%) hips, of which 30 (97%) were unstable or dislocatable.
Of all the hips considered unstable or dislocatable with dynamic ultrasonography, 21% were normal (Graf type I).
Of the hips that were stable on dynamic ultrasound, one (0.1%) was dysplastic according to the Graf method.
|
|
Falliner, 2006 |
Type of study: prospective cross sectional study
Setting and country: Hospital, Germany
Funding and conflicts of interest: No conflicts of interest |
Inclusion criteria: Consecutive neonates
Exclusion criteria: NA
N=232, 464 hips
Prevalence: 6 (1.3%) pathological hips according to Graf and 4.1% according to Terjesen
Mean age ± SD: < 4 days
Sex: 47% female
|
Describe index test: Graf method
Cut-off point(s): Type I and IIa=normal Type IIc, D, IIIa=pathological
Comparator test: Terjesen method
Cut-off point(s):
|
Describe reference test: NA |
Time between the index test and reference test: none (same image)
For how many participants were no complete outcome data available? N=0 (0%)
Reasons for incomplete outcome data described? NA |
Outcome measures and effect size (include 95%CI and p-value if available):
Correlation: femoral head coverage (FHC) and the α angles 0.552 (95% CI 0.48 to 0.62).
Proportion of overall agreement: 96.8%. Positive agreement: 98.3%. Negative agreement: 40%. κ-statistic = 0.39 |
|
Czubak, 1998 |
Type of study: cross sectional study
Setting and country: Hospital, Poland
Funding and conflicts of interest: not reported |
Inclusion criteria: -parents were advised to come for examination of the hips
Exclusion criteria: NA
N=657
Prevalence: DDH according to Graf: 3.9% DDH according to Terjesen: 2.9%
Mean age: 23 (range 5-42) days
Sex: 331 (50%) female |
Describe index test: Graf (alpha-angle)
Cut-off point(s): IIc and more=DDH
Comparator test: Terjesen (FHC) Normal=>50% Possible dysplasia:40-49% Subluxation:10-39% Dislocation:<10%
Cut-off point(s):
|
Describe reference test: NA
|
Time between the index test and reference test: none (same image)
For how many participants were no complete outcome data available? N=2 (0.3%)
Reasons for incomplete outcome data described? no |
Outcome measures and effect size (include 95%CI and p-value if available):
Correlation coefficient alpha-angle and FHC: r = 0.57
|
Follow-up data was not relevant |
Gunay, 2009 |
Type of study: cross sectional study, analysed retrospectively
Setting and country: Hospital, Turkey
Funding and conflicts of interest: not reported |
Inclusion criteria: all infants screened for DDH
Exclusion criteria: NA
N=1,037 infants, 2,074 hips
Prevalence:
Mean age: 2.3 months (range, 1–10 months)
Sex: 59.4% female |
Describe index test: Graf method (alpha-angle)
Cut-off point(s): Pathological hip: α angle ≤ 49 degrees
Comparator test: FHC %
Cut-off point(s): Two groups: =>60% is mature hip
Calculation threshold: ≥ 51% = mature hip Lower threshold: 39%=pathological hip |
Describe reference test: NA
|
Time between the index test and reference test: none (same image)
For how many participants were no complete outcome data available? N (%)
Reasons for incomplete outcome data described?
|
Outcome measures and effect size (include 95%CI and p-value if available):
Correlation coefficient alpha-angle and FHC: r = 0.668, p=0.001
Sensitivity and specificity: FHC threshold value for mature hip: 51%; with the alpha-angle as reference, sensitivity was 82.6% and specificity was 100%.
FHC threshold value for pathological hip: 39%; with alpha-angle <50 degrees as reference, sensitivity was 79.2% and specificity was 100%.
|
|
Teixeira, 2015 |
Type of study: cross sectional study, retrospectively analysed
Setting and country: Hospital, Brazil
Funding and conflicts of interest: there were no conflicts of interest |
Inclusion criteria: neonates at risk for DDH, ultrasonography in the fourth week after birth.
Exclusion criteria: -known chromosomal abnormalities, neuromuscular disorders, or both. -19 participants were excluded because their exams were not available in the picture archiving and communication system (PACS).
N= 116 patients, 232 hips
Prevalence: 21 hips (9%) with Graf IIC, III, D, and IV.
Mean age ± SD:
Sex: 66% female |
Describe index test: Graf
Cut-off point(s): non-dysplastic = Graf I/IIa dysplastic=IIb/IIc/III/D/IV
Comparator test: PFD
Cut-off point(s):
|
Describe reference test: NA
|
Time between the index test and reference test: none (same image)
For how many participants were no complete outcome data available? N (%)
Reasons for incomplete outcome data described? |
Outcome measures and effect size (include 95%CI and p-value if available):
PFD, with the Graf method as reference method. Neutral position (cut-off = 4.6 mm): sensitivity 94.4% (95% CI 77 to 99), specificity 93.4% (95% CI 89 to 96), receiver operating characteristic (ROC) curve 97.2%. Hip flexed (cut-off = 4.9 mm): sensitivity 94.4% (95% CI 77 to 99), specificity 89.0% (95% CI 84 to 93), ROC curve 95.5%.
|
|
DDH=Developmental Dysplasia of the Hip; FHC=Femoral Head Coverage; NA=Not applicable; PFD=pubo-femoral distance
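The agreement measures reported for Falliner, 2006 (overall agreement, positive and negative agreement, κ) can be illustrated with a minimal Python sketch. The 2x2 counts below are not reported in the table; they are an assumption, back-calculated so that they are consistent with the reported 464 hips, 6 pathological hips according to Graf, 4.1% pathological according to Terjesen and 96.8% overall agreement. Which category the study calls "positive" is also an assumption; here the 98.3% corresponds to agreement on normal hips.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification of hips by two ultrasound methods (A and B)."""
    both_pathological: int    # pathological by A and by B
    only_a_pathological: int  # pathological by A, normal by B
    only_b_pathological: int  # normal by A, pathological by B
    both_normal: int          # normal by A and by B

    @property
    def total(self) -> int:
        return (self.both_pathological + self.only_a_pathological
                + self.only_b_pathological + self.both_normal)


def overall_agreement(t: TwoByTwo) -> float:
    """Proportion of hips on which both methods give the same result."""
    return (t.both_pathological + t.both_normal) / t.total


def specific_agreement(n_both: int, n_discordant: int) -> float:
    """Agreement specific to one category: 2a / (2a + discordant pairs)."""
    return 2 * n_both / (2 * n_both + n_discordant)


def cohen_kappa(t: TwoByTwo) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = t.total
    a_path = t.both_pathological + t.only_a_pathological
    b_path = t.both_pathological + t.only_b_pathological
    observed = overall_agreement(t)
    expected = (a_path * b_path + (n - a_path) * (n - b_path)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts (A = Graf, B = Terjesen), back-calculated from the
# reported summary figures; they are NOT taken from the study itself.
falliner = TwoByTwo(both_pathological=5, only_a_pathological=1,
                    only_b_pathological=14, both_normal=444)
discordant = falliner.only_a_pathological + falliner.only_b_pathological

overall_pct = round(100 * overall_agreement(falliner), 1)
normal_pct = round(100 * specific_agreement(falliner.both_normal, discordant), 1)
pathological_pct = round(
    100 * specific_agreement(falliner.both_pathological, discordant), 1)
kappa = round(cohen_kappa(falliner), 2)

print(overall_pct, normal_pct, pathological_pct, kappa)
```

Under these assumed counts the sketch gives 96.8% overall agreement, 98.3% and 40.0% category-specific agreement and κ ≈ 0.39. It also shows why a high overall agreement can coexist with a modest κ when one category (normal hips) dominates.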
Risk of bias assessment diagnostic accuracy studies (QUADAS II, 2011)
Research question: PICO 1. What is the diagnostic accuracy of ultrasonography (according to the Graf-method) in comparison with radiography for the diagnosis of DDH in children younger than one year?
Study reference |
Patient selection
|
Index test |
Reference standard |
Flow and timing |
Comments with respect to applicability |
Atalar, 2013 |
Was a consecutive or random sample of patients enrolled? Unclear
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? No, all referred patients had undergone radiographs, high prevalence of DDH
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes
If a threshold was used, was it pre-specified? NA
|
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index test? Yes
|
Was there an appropriate interval between index test(s) and reference standard? Yes, max 10 days
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Unclear |
Are there concerns that the included patients do not match the review question? Yes, see patient selection
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No |
CONCLUSION: Could the selection of patients have introduced bias?
RISK: HIGH |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: LOW |
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: HIGH |
CONCLUSION: Could the patient flow have introduced bias?
RISK: LOW |
|
|
Morin, 1999 |
Was a consecutive or random sample of patients enrolled? Unclear
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? Unclear (no reference was set, two methods were compared for correlation)
If a threshold was used, was it pre-specified? NA
|
Is the reference standard likely to correctly classify the target condition? No (no reference was set, two methods were compared for correlation)
Were the reference standard results interpreted without knowledge of the results of the index test? Unclear (no reference was set; two methods were compared for correlation) |
Was there an appropriate interval between index test(s) and reference standard? Yes
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No |
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: LOW |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: UNCLEAR |
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: HIGH |
CONCLUSION: Could the patient flow have introduced bias?
RISK: LOW |
|
Van Moppes, 1988 |
Was a consecutive or random sample of patients enrolled? Unclear
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? No; there may be selection bias: only a selection of patients received both tests, and the reason for this is unclear
|
Were the index test results interpreted without knowledge of the results of the reference standard? Unclear
If a threshold was used, was it pre-specified? NA
|
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index test? Unclear
|
Was there an appropriate interval between index test(s) and reference standard? Unclear
Did all patients receive a reference standard? No, see patient selection
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? Yes
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No |
|
CONCLUSION: Could the selection of patients have introduced bias?
RISK: HIGH |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: UNCLEAR |
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: HIGH |
CONCLUSION: Could the patient flow have introduced bias?
RISK: LOW |
|
Kosar, 2009 |
Was a consecutive or random sample of patients enrolled? Yes
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes
If a threshold was used, was it pre-specified? Yes
|
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index test? Yes
|
Was there an appropriate interval between index test(s) and reference standard? Yes
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No |
CONCLUSION: Could the selection of patients have introduced bias?
RISK: LOW |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: LOW |
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: HIGH |
CONCLUSION: Could the patient flow have introduced bias?
RISK: LOW |
|
|
Alamdaran, 2016 |
Was a consecutive or random sample of patients enrolled? Yes
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes
If a threshold was used, was it pre-specified? Yes
|
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index test? Yes
|
Was there an appropriate interval between index test(s) and reference standard? Yes
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No |
CONCLUSION: Could the selection of patients have introduced bias?
RISK: LOW |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: LOW |
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: HIGH |
CONCLUSION: Could the patient flow have introduced bias?
RISK: LOW |
|
|
Finnbogason, 2008 |
Was a consecutive or random sample of patients enrolled? Yes
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes
If a threshold was used, was it pre-specified? Yes
|
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index test? Yes
|
Was there an appropriate interval between index test(s) and reference standard? Yes
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No |
CONCLUSION: Could the selection of patients have introduced bias?
RISK: LOW |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: LOW |
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: HIGH |
CONCLUSION: Could the patient flow have introduced bias?
RISK: LOW |
|
|
Falliner, 2006 |
Was a consecutive or random sample of patients enrolled? Yes
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes
If a threshold was used, was it pre-specified? Yes
|
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index test? Yes
|
Was there an appropriate interval between index test(s) and reference standard? Yes
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No |
CONCLUSION: Could the selection of patients have introduced bias?
RISK: LOW |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: LOW |
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: HIGH |
CONCLUSION: Could the patient flow have introduced bias?
RISK: LOW |
|
|
Teixeira, 2015 |
Was a consecutive or random sample of patients enrolled? Yes
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes
If a threshold was used, was it pre-specified? Yes
|
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index test? Yes
|
Was there an appropriate interval between index test(s) and reference standard? Yes
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No |
CONCLUSION: Could the selection of patients have introduced bias?
RISK: LOW |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: LOW |
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: HIGH |
CONCLUSION: Could the patient flow have introduced bias?
RISK: LOW |
|
|
Czubak, 1998 |
Was a consecutive or random sample of patients enrolled? Yes
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes
If a threshold was used, was it pre-specified? Yes
|
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index test? Yes
|
Was there an appropriate interval between index test(s) and reference standard? Yes
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No |
CONCLUSION: Could the selection of patients have introduced bias?
RISK: LOW |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: LOW |
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: HIGH |
CONCLUSION: Could the patient flow have introduced bias?
RISK: LOW |
|
|
Gunay, 2009 |
Was a consecutive or random sample of patients enrolled? Yes
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? Yes
|
Were the index test results interpreted without knowledge of the results of the reference standard? Yes
If a threshold was used, was it pre-specified? Yes
|
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index test? Yes
|
Was there an appropriate interval between index test(s) and reference standard? Yes
Did all patients receive a reference standard? Yes
Did patients receive the same reference standard? Yes
Were all patients included in the analysis? Yes |
Are there concerns that the included patients do not match the review question? No
Are there concerns that the index test, its conduct, or interpretation differ from the review question? No
Are there concerns that the target condition as defined by the reference standard does not match the review question? No |
CONCLUSION: Could the selection of patients have introduced bias?
RISK: LOW |
CONCLUSION: Could the conduct or interpretation of the index test have introduced bias?
RISK: LOW |
CONCLUSION: Could the reference standard, its conduct, or its interpretation have introduced bias?
RISK: HIGH |
CONCLUSION: Could the patient flow have introduced bias?
RISK: LOW |
|
Table: Exclusion after reading the full article
Author and year |
Reasons for exclusion |
PICO 1 |
|
Harcke, 1984 |
No usable outcome data |
Pillai, 2011 |
Comparison between US at 4 weeks, 4-8 weeks and >10 weeks and radiography (AI) at 6 months |
Geertsema, 2018 |
Comparison between US at 12 weeks and radiography at 5 months |
Boal, 1985 |
Not usable: no information on radiography |
Terjesen, 1996 |
Does not meet PICO: radiography only in children in whom US was abnormal or inconclusive |
Joseph, 1996 |
Discrepancies between different methods were explored. No diagnostic measures, highly descriptive |
Tudor, 2007 |
Does not meet PICO: US screening; only the abnormalities were also checked radiographically |
Morin, 1985 |
Uses FHC, not Graf |
Graf, 1984a |
Article not found |
Graf, 1984b |
Article not found |
Author and year |
Reasons for exclusion |
PICO 2 |
|
Treguier, 2013 |
No relevant method |
Quader, 2018 |
No relevant method; 3D not relevant |
Zonoobi, 2018 |
No relevant method; 3D not relevant |
Mabee, 2016 |
No relevant method; 3D not relevant |
Jaremko, 2014 |
No relevant method; 3D not relevant |
Padilla-Raygoza, 2017 |
No relevant method |
Hareendranathan, 2016 |
No relevant method/comparison |
Soboleski, 1993 |
No comparison of two methods |
Rosendahl, 1992 |
No relevant method; Barlow not relevant |
Fan, 2019 |
No separate results presented for the different methods |
Reikeras, 2019 |
No comparison between Graf and another US method |
Peterlein, 2010 |
No clinical comparison, reproducibility only |
Suzuki, 1991 |
Comparison is not usable, too few data |
Yavuz, 2014 |
Both Graf; study only in Graf IIa patients, not usable |
Accountability
Authorisation date and validity
Last reviewed: 07-12-2020
Last authorised: 07-12-2020
Planned reassessment: 01-01-2026
The working group has not been kept in place to assess whether this guideline is still up to date. In 2025 at the latest, the board of the Nederlandse Orthopaedische Vereniging will determine whether the modules of this guideline are still current. A maintenance plan has been described at module level. When drafting the guideline, the working group estimated for each module the maximum period within which reassessment must take place and, where applicable, formulated points of attention that are relevant for a future revision (update). The validity of the guideline lapses earlier if new developments give reason to start a revision process.
The Nederlandse Orthopaedische Vereniging is the coordinating body (regiehouder) for this guideline and bears primary responsibility for assessing whether the guideline is up to date. The other scientific societies participating in this guideline, and the users of the guideline, share this responsibility and inform the coordinating body about relevant developments within their field.
Module |
Coordinating body/bodies |
Year of authorisation |
Next assessment of whether the guideline is up to date |
Frequency of assessment |
Who monitors whether the guideline is up to date |
Relevant factors for changes to the recommendation |
Diagnostics of DDH |
NOV |
2020 |
2025 |
Once every five years |
NOV |
- |
General information
The guideline development was supported by the Kennisinstituut van de Federatie Medisch Specialisten and was funded by the Stichting Kwaliteitsgelden Medisch Specialisten (SKMS).
The funder had no influence whatsoever on the content of the guideline.
Aim and target audience
Aim
The aim of the guideline is to achieve more uniform and unambiguous diagnostics and treatment for children under one year of age with DDH or suspected DDH.
Target audience
This guideline is written for all members of the professional groups involved in the care of patients with (an increased risk of) DDH.
Composition of the working group
To develop the guideline, a multidisciplinary working group was established in 2018, consisting of representatives of all relevant specialties involved in the care of children up to the age of one year in whom the hip joint has not developed properly.
The working group members and sounding board group members were mandated by their professional associations to participate. The working group is responsible for the full text of this guideline. The sounding board group members were able to provide written input for the guideline on the framework and during the commentary phase.
Working group
- Dr. M.M.E.H. Witbreuk, orthopaedic surgeon, working at OLVG in Amsterdam and Amsterdam UMC, NOV (chair)
- Dr. C.J.A. van Bergen, orthopaedic surgeon, working at the Amphia hospital in Breda, NOV
- Dr. B.J. Burger, orthopaedic surgeon, working at the Noordwest Ziekenhuisgroep in Alkmaar, NOV
- Dr. M.M.H.P. Foreman-van Dongelen, staff physician/hip sonographer, working at Diagnostiek voor U in Eindhoven, AJN
- Dr. Y.M. den Hartog, orthopaedic surgeon, working at Medisch Spectrum Twente in Enschede, NOV
- Drs. J.H. van Linge, orthopaedic surgeon, working at the Reinier de Graaf hospital in Delft, NOV
- R.M. Pereboom, patient representative, Vereniging Afwijkende Heupontwikkeling
- Prof. Dr. S.G.F. Robben, radiologist, working at Maastricht UMC+ in Maastricht, NVvR
- Dr. M.A. Witlox, orthopaedic surgeon, working at Maastricht UMC+ in Maastricht, NOV
- Dr. P.B. de Witte, orthopaedic surgeon, working at the Leids Universitair Medisch Centrum in Leiden, NOV
Sounding board group
- M.J. Becht, orthopaedic cast technician, working at the Wilhelmina Kinderziekenhuis in Utrecht, VGN
- Ing. S. Oostveen, orthopaedic technologist, working at Centrum Orthopedie in Rotterdam, NBOT
- Dr. S.A. (Sandra) Prins, paediatrician, working at an Amsterdam UMC location in Amsterdam, NVK
- L. de Vries, scientific staff member at NHG and general practitioner (non-practising), NHG
With support from
- Dr. F. Willeboordse, adviser, Kennisinstituut van de Federatie van Medisch Specialisten
- Drs. B.L. de Geest, junior adviser, Kennisinstituut van de Federatie van Medisch Specialisten
Declarations of interest
The KNMG code for the prevention of improper influence through conflicts of interest was followed. All working group members declared in writing whether, in the past three years, they had direct financial interests (employment at a commercial company, personal financial interests, research funding) or indirect interests (personal relationships, reputation management, knowledge valorisation). An overview of the interests of the working group members and the judgement on how any interests were handled can be found in the table below. The signed declarations of interest can be requested from the secretariat of the Kennisinstituut van de Federatie Medisch Specialisten.
Working group member |
Position |
Additional positions |
Reported interests |
Action taken |
Witbreuk |
Orthopaedic surgeon working at OLVG and at AUMC |
Chair of the paediatric orthopaedics working group until December 2018, unpaid. Chair of the training committee, unpaid. Member of the educational committee of EPOS (European Pediatric Orthopaedic society), unpaid
Board member of ANNA fonds: unpaid |
none |
NONE |
Bergen, van |
Orthopaedic surgeon, Amphia |
None that cause a conflict of interest |
none |
NONE |
Burger |
Orthopaedic surgeon. Orthopaedics trainer. Member of Medisch Specialisten Noord West. Chair of the Wetenschapscommissie Noord West Academie. Board member of the Nederlandse Orthopaedische Vereniging |
Board member of ANNA fonds: unpaid
Board member of CORAL (centre of orthopaedic research Alkmaar): unpaid |
none |
NONE |
Foreman-van Drongelen |
Youth healthcare physician (Jeugdarts KNMG), youth healthcare 0-4. Staff physician/hip sonographer (according to Graf) at Diagnostiek voor U, a primary care diagnostic centre. |
Owner of Weloverwogen Expertise
Licensed trainer, Tijdsurfen |
none |
NONE |
den Hartog |
Orthopaedic surgeon, Medisch Spectrum Twente |
none |
none |
NONE |
van Linge |
Orthopaedic surgeon, full-time |
FMS committee "balansdruk kwaliteit", paid from "lege gelden". The project has recently ended. Lecturer in paediatric orthopaedics at the general practitioner institute of Erasmus MC, paid, 4 hours per year |
none |
NONE |
Pereboom |
Secretary of the Vereniging Afwijkende Heupontwikkeling (unpaid) |
none |
none |
NONE |
Robben |
Paediatric radiologist, MUMC |
None |
none |
NONE |
Witlox |
Orthopaedic surgeon, MUMC |
Board of WKO (former chair), unpaid.
Board of Maastrichtse hockey club, unpaid. Treasurer of the parents' association of OBS Maastricht, unpaid |
none |
NONE |
De Witte |
Orthopaedic surgeon, Sophia-Erasmus MC. Orthopaedic surgeon, LUMC, from 2020 |
Researcher/epidemiologist |
none |
NONE |
Sounding board group |
|
|
|
|
Becht |
Guest lecturer (paediatric anatomy), Erasmus MC orthopaedic cast technician training programme (paid). Member of the orthopaedic cast technician training committee, CZO (College Zorg Opleidingen) (unpaid) |
None |
none |
NONE |
Oostveen |
Treasurer of the NBOT association, half a day per week
Orthopaedic (shoe) technologist at Centrum Orthopedie Rotterdam, 40 hours |
The association role is unpaid. The orthopaedic (shoe) technologist role is paid: fitting body-worn aids such as shoes, knee braces, corsets, wrist splints, etc. |
none |
NONE |
Prins |
Paediatrician-neonatologist |
None |
none |
NONE |
Vries, de |
Scientific staff member, department of guideline development and science, NHG, 0.6 FTE |
External member of the Richtlijn Advies commissie (RAC); no direct payment, attendance fees go to NHG |
none |
NONE |
Inbreng patiëntenperspectief
Er werd aandacht besteed aan het patiëntenperspectief door een afgevaardigde van de Vereniging Afwijkende Heupontwikkeling (VAH) in de werkgroep op te nemen. Het rapport hiervan (zie aanverwante producten) is besproken in de werkgroep en de belangrijkste knelpunten zijn verwerkt in de richtlijn. De conceptrichtlijn is tevens voor commentaar voorgelegd aan de VAH en de Patiëntenfederatie Nederland.
Development method
Evidence-based
Implementation
Throughout the phases of guideline development, the implementation of the guideline (module) and the practical feasibility of the recommendations were taken into account. Particular attention was paid to factors that could promote or hinder the introduction of the guideline or guideline module in practice. The implementation plan can be found under the related products.
Working method
AGREE
This guideline was developed in accordance with the requirements set out in the report Medisch Specialistische Richtlijnen 2.0 of the Adviescommissie Richtlijnen of the Raad Kwaliteit. That report is based on the AGREE II instrument (Appraisal of Guidelines for Research & Evaluation II; Brouwers, 2010), which is widely accepted internationally. For a step-by-step description of how an evidence-based guideline is developed, see the step-by-step plan Ontwikkeling van Medisch Specialistische Richtlijnen of the Kennisinstituut van de Federatie Medisch Specialisten.
Bottleneck analysis
During the preparatory phase, the chair of the working group and the advisor made an inventory of the bottlenecks. Representatives of the NBOT, VAH, NVFK/KNGF, AJN and VRA also put forward bottlenecks through a written response to the framework and a written bottleneck analysis. These were discussed by the full working group.
Clinical questions and outcome measures
Based on the results of the bottleneck analysis, the chair and the advisor drafted the clinical questions. These were discussed with the working group, after which the working group established the final clinical questions. The working group then determined, for each clinical question, which outcome measures are relevant to the patient, considering both desirable and undesirable effects. The working group rated these outcome measures by their relative importance for decision-making on recommendations as crucial (critical for decision-making), important (but not crucial), or unimportant. In addition, at least for the crucial outcome measures, the working group defined which differences it considered clinically relevant.
Strategy for searching and selecting literature
For each clinical question, published scientific studies were searched for in (several) electronic databases using specific search terms. Additional studies were sought by checking the reference lists of the selected articles. The initial search targeted studies with the highest level of evidence. The working group members selected the articles retrieved by the search on the basis of predefined selection criteria. The selected articles were used to answer the clinical question. The databases searched, the search strategy, and the selection criteria applied can be found in the module for the clinical question concerned. The search strategies for the exploratory search and for the patient perspective are included under the related products.
Quality assessment of individual studies
Individual studies were assessed systematically against predefined methodological quality criteria in order to estimate the risk of biased study results (risk of bias). These assessments can be found in the Risk of Bias (RoB) tables. The RoB instruments used are validated instruments recommended by the Cochrane Collaboration: AMSTAR for systematic reviews; the Cochrane tool for randomized controlled trials; Newcastle-Ottawa for observational studies; QUADAS-2 for diagnostic studies.
Summarizing the literature
The literature summary (with the associated search criteria) was written in English to facilitate international exchange of knowledge. The relevant research data from all selected articles were presented in evidence tables. The main findings from the literature were described in the literature summary. Where there were enough studies and these were sufficiently similar (homogeneous), the data were also summarized quantitatively (meta-analysis) using Review Manager 5.
Assessing the strength of the scientific evidence
A) For intervention questions (questions about therapy or screening)
The strength of the scientific evidence was determined using the GRADE method. GRADE stands for 'Grading of Recommendations Assessment, Development and Evaluation' (see http://www.gradeworkinggroup.org/).
GRADE distinguishes four levels of quality of scientific evidence: high, moderate, low, and very low. These levels refer to the degree of certainty about the literature conclusion (Schünemann, 2013).
GRADE |
Definition |
High |
|
Moderate* |
|
Low |
|
Very low |
|
*In 2017 the Dutch GRADE Network determined that the preferred Dutch wording for the second-highest grade is 'redelijk' (rendered here as 'moderate') rather than 'matig'.
B) For questions about diagnostic tests, harm or adverse effects, etiology, and prognosis
The strength of the scientific evidence was likewise determined using the GRADE method: GRADE-diagnostics for diagnostic questions (Schünemann, 2008) and a generic GRADE method for questions about harm or adverse effects, etiology, and prognosis. The generic GRADE method applied the basic principles of the GRADE methodology: naming and prioritizing the clinically (patient) relevant outcome measures, a systematic review per outcome measure, and an assessment of the strength of evidence based on the five GRADE criteria (starting point high; downgrading for risk of bias, inconsistency, indirectness, imprecision, and publication bias).
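The downgrading logic described above can be illustrated with a minimal Python sketch. It is not the working group's tooling. The names `Certainty`, `DOMAINS`, and `rate_certainty` are hypothetical. The convention of subtracting one level for a serious concern and two for a very serious concern is an assumption taken from general GRADE practice; the text above states only the starting point and the five domains.

```python
from enum import IntEnum
from typing import Mapping


class Certainty(IntEnum):
    """The four GRADE levels, ordered from lowest to highest."""
    VERY_LOW = 0
    LOW = 1
    MODERATE = 2
    HIGH = 3


# The five downgrading domains named in the text.
DOMAINS = (
    "risk_of_bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication_bias",
)


def rate_certainty(downgrades: Mapping[str, int]) -> Certainty:
    """Start at HIGH and subtract the levels given per domain.

    `downgrades` maps a domain name to 0, 1 or 2 levels
    (assumed convention: 1 = serious, 2 = very serious concern).
    The result never drops below VERY_LOW.
    """
    for domain, levels in downgrades.items():
        if domain not in DOMAINS:
            raise ValueError(f"unknown GRADE domain: {domain}")
        if levels not in (0, 1, 2):
            raise ValueError(f"downgrade must be 0, 1 or 2, got {levels}")
    total = sum(downgrades.values())
    return Certainty(max(int(Certainty.HIGH) - total, int(Certainty.VERY_LOW)))
```

For example, `rate_certainty({"risk_of_bias": 1, "imprecision": 1})` yields `Certainty.LOW`: two single-level downgrades from the starting point of high.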
Formulating the conclusions
For each relevant outcome measure, the scientific evidence was summarized in one or more literature conclusions, with the level of evidence determined according to the GRADE methodology. The working group members drew up the balance for each intervention (overall conclusion), weighing the favorable and unfavorable effects for the patient. The overall strength of evidence is determined by the lowest strength of evidence found for any of the crucial outcome measures. In complex decision-making, where many additional arguments (considerations) play a role alongside the conclusions of the systematic literature analysis, no overall conclusion was drawn. In that case, the favorable and unfavorable effects of the interventions were weighed together with all additional arguments under the heading 'Overwegingen' (Considerations).
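The rule that the overall strength of evidence equals the lowest strength found among the crucial outcome measures can be sketched as follows. This is an illustrative example only. The function name, the tuple layout, and the sample outcomes are hypothetical and do not come from the guideline.

```python
# GRADE levels ordered from lowest to highest.
ORDER = ["very low", "low", "moderate", "high"]


def overall_certainty(outcomes):
    """Return the lowest GRADE level among the crucial outcomes.

    `outcomes` is a list of (name, importance, certainty) tuples, where
    importance is "crucial", "important" or "unimportant" and certainty
    is one of the strings in ORDER.
    """
    crucial = [cert for _, importance, cert in outcomes if importance == "crucial"]
    if not crucial:
        raise ValueError("no crucial outcome measures were rated")
    # The position in ORDER serves as the rank, so min() picks the weakest level.
    return min(crucial, key=ORDER.index)


# Hypothetical ratings, for illustration only.
example = [
    ("sensitivity", "crucial", "moderate"),
    ("specificity", "crucial", "low"),
    ("cost", "important", "very low"),
]
print(overall_certainty(example))  # low
```

The 'important' outcome rated very low does not pull the result down, because only crucial outcome measures count toward the overall rating.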
Considerations (from evidence to recommendation)
In arriving at a recommendation, aspects other than (the quality of) the scientific evidence are also important and are taken into account, such as the expertise of the working group members, patient values and preferences, costs, availability of facilities, and organizational matters. Insofar as they are not part of the literature summary, these aspects are stated and assessed (weighed) under the heading 'Overwegingen' (Considerations).
Formulating recommendations
The recommendations answer the clinical question and are based on the available scientific evidence, the most important considerations, and a weighing of the favorable and unfavorable effects of the relevant interventions. The strength of the scientific evidence and the weight the working group assigns to the considerations together determine the strength of the recommendation. In line with the GRADE methodology, a low strength of evidence for conclusions in the systematic literature analysis does not rule out a strong recommendation a priori, and weak recommendations are also possible when the strength of evidence is high. The strength of the recommendation is always determined by weighing all relevant arguments together.
Preconditions (organization of care)
In the bottleneck analysis and during the development of the guideline, explicit account was taken of the organization of care: all aspects that are preconditions for delivering care (such as coordination, communication, (financial) resources, staffing, and infrastructure). Preconditions that are relevant to answering a specific clinical question form part of the considerations for that question. More general, overarching, or additional aspects of the organization of care are addressed in the module on general principles.
Indicator development
No internal quality indicators were developed for this guideline, because the working group judged that they would not be relevant and would only increase the registration burden.
Knowledge gaps
During the development of this guideline, a systematic search was made for research whose results contribute to answering the clinical questions. For each clinical question, the working group checked whether (additional) scientific research is needed to answer it. An overview of the topics for which (additional) scientific research is considered important is given as a recommendation in the Knowledge gaps document (under related products).
Comment and authorization phase
The draft guideline was submitted to the (scientific) associations and (patient) organizations involved for comment. The comments were collected and discussed with the working group. In response to the comments, the draft guideline was revised and finalized by the working group. The final guideline was submitted to the relevant (scientific) associations and (patient) organizations for authorization and was authorized or approved by them.
References
Brouwers, M. C., Kho, M. E., Browman, G. P., Burgers, J. S., Cluzeau, F., Feder, G.,... & Littlejohns, P. (2010). AGREE II: advancing guideline development, reporting and evaluation in health care. Canadian Medical Association Journal, 182(18), E839-E842.
Medisch Specialistische Richtlijnen 2.0 (2012). Adviescommissie Richtlijnen van de Raad Kwaliteit. https://richtlijnendatabase.nl/over_deze_site.html
Ontwikkeling van Medisch Specialistische Richtlijnen: stappenplan. Kennisinstituut van Medisch Specialisten.
Schünemann, H., Brożek, J., Guyatt, G., et al. (2013). GRADE handbook for grading quality of evidence and strength of recommendations. Updated October 2013. The GRADE Working Group. Available from http://gdt.guidelinedevelopment.org/central_prod/_design/client/handbook/handbook.html.
Schünemann, H. J., Oxman, A. D., Brozek, J., Glasziou, P., Jaeschke, R., Vist, G. E.,... & Bossuyt, P. (2008). Rating Quality of Evidence and Strength of Recommendations: GRADE: Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ: British Medical Journal, 336(7653), 1106.
Wessels, M., Hielkema, L., & van der Weijden, T. (2016). How to identify existing literature on patients' knowledge, views, and values: the development of a validated search filter. Journal of the Medical Library Association: JMLA, 104.
Search justification
The search strategies are available on request. Please contact the Richtlijnendatabase.