Monitoring physical fitness
Clinical question
Which test(s) can be used to monitor physical fitness during an oncological treatment trajectory?
Recommendations
1. As a nurse practitioner, use history taking, ideally supplemented with a questionnaire (for example the DASI), as a low-threshold way to monitor the physical fitness of patients with cancer.
2. As an exercise specialist, preferably use the ‘steep ramp test’ to monitor cardiorespiratory fitness and to guide training. Use the indirectly measured one-repetition maximum of a muscle group relevant to the patient to monitor muscle strength and to guide training.
- Consider the ‘incremental shuttle walk test’ as an alternative to the ‘steep ramp test’ for monitoring cardiorespiratory fitness, for example when the patient cannot properly perform an (adapted) cycling protocol. The 6-minute walk test can be used to monitor cardiorespiratory fitness in patients with a low exercise capacity.
- Consider handgrip strength or the ‘30-second chair stand test’ as an alternative to the indirectly measured one-repetition maximum for monitoring muscle strength when these cover the muscle groups or chains relevant to the patient.
3. Repeat the initially chosen test(s) once every four weeks to monitor physical fitness, to guide further training, and to allow results to be compared over time.
4. Deviate from the initial test when the patient can no longer perform it properly during or after the treatment trajectory, or when the measurement instrument has reached a ceiling effect.
5. Do not routinely use a cardiopulmonary exercise test (CPET, an exercise test with respiratory gas analysis) as a monitoring instrument for cardiorespiratory fitness, but consider it when a diagnostic question about the nature of an exercise limitation arises during a monitoring trajectory.
Consider a treadmill protocol or arm ergometry as an alternative when a CPET with a cycling protocol is not feasible or possible.
Considerations
Benefits and harms and the quality of the evidence
Monitoring physical fitness is important for tracking progression or decline before, during, or after an oncological treatment trajectory, and for evaluating whether the intended goal is being achieved (for example with a physical training program). Monitoring involves taking structured, repeated measurements over a period, for example during treatment or during a training program. The results of the repeated measurements show whether the intended goal is being reached and whether physical fitness is progressing or declining.
In this guideline, physical fitness is defined as cardiorespiratory fitness and muscle strength, although several other constructs are related to or influence the physical fitness of the patient with cancer (for example physical functioning, nutritional status, frailty, and activity level). For measurement instruments for such constructs, see the KNGF guideline ‘Oncologie’ (KNGF, 2022) and the VRA guideline ‘Medisch specialistische revalidatie bij oncologie’ (VRA, 2014).
For estimating or deriving cardiorespiratory fitness, the working group was interested in the measurement properties of the Physician-based Assessment and Counseling for Exercise questionnaire (PACE), the FitMáx questionnaire, the Veterans-Specific Activity Questionnaire (VSAQ), and the Duke Activity Status Index (DASI). The working group was also interested in the measurement properties of the ‘steep ramp test’, the ‘30-second chair stand test’, the ‘five times sit-to-stand test’, the stair climbing test, the ‘timed up-and-go test’, the 6-minute walk test, the ‘incremental shuttle walk test’, the indirectly measured one-repetition maximum, and handgrip strength as physical tests for measuring cardiorespiratory fitness and/or muscle function or muscle strength. The findings of the literature analysis are summarized in the table below on reliability (including measurement error) and responsiveness per measurement instrument of interest. Short descriptions of the questionnaires and tests are appended to this guideline module.
Table. Summary of the literature analysis regarding reliability (including measurement error) and responsiveness per measurement instrument of interest
Instrument | Reliability: GRADE | Reliability: result | Measurement error: GRADE | Measurement error: result | Responsiveness: GRADE | Responsiveness: result |
Questionnaires | | | | | | |
Physician-based Assessment and Counseling for Exercise questionnaire | ○○○○ | – | ○○○○ | – | ○○○○ | – |
FitMáx questionnaire | ○○○○ | – | ○○○○ | – | ○○○○ | – |
Veterans-Specific Activity Questionnaire | ○○○○ | – | ○○○○ | – | ○○○○ | – |
Duke Activity Status Index | ○○○○ | – | ○○○○ | – | ○○○○ | – |
Physical tests | | | | | | |
30-Second Chair Stand Test | ⬤⬤⬤○ | ICC = 0.97 (Aabo, 2021); ICC = 0.89 (Blackwood, 2021); ICC = 0.92 (Van Hinte, 2020) | ⬤⬤⬤○ | SEM = 1.0, MDC95 = 2.6 (Aabo, 2021); MDC95 = 3 (Blackwood, 2021); SEM = 1.07, MDC95 = 2.96 (Van Hinte, 2020) | ○○○○ | – |
Five Times Sit-to-Stand Test | ⬤○○○ | ICC = 0.86 (Blackwood, 2021) | ○○○○ | MDC95 = 3.19 (Blackwood, 2021) | ○○○○ | – |
Stair climbing test | ○○○○ | – | ○○○○ | – | ○○○○ | – |
Timed Up-and-Go test | ⬤⬤○○ | ICC = 0.95 (Blackwood, 2020); ICC = 0.98 (Van Hinte, 2020) | ⬤⬤○○ | SEM = 0.90, MDC95 = 2.494 (Blackwood, 2021); SEM = 0.55, MDC95 = 1.54 (Van Hinte, 2020) | ○○○○ | – |
6-Minute Walk Test | ⬤⬤⬤○ | ICC = 0.96 (Eden, 2018); ICC = 0.93 (Schmidt, 2013); ICC = 0.98 (Sebio-Garcia, 2017); ICC = 0.97 (Van Hinte, 2020) | ⬤⬤⬤○ | CV = 3% (Schmidt, 2013); mean difference = 19.5, 95% LoA = -32.3 to 70.38 (Sebio-Garcia, 2017); SEM = 20.45, MDC95 = 56.67 (Van Hinte, 2020) | ○○○○ | – |
Incremental Shuttle Walk Test | ○○○○ | – | ⬤⬤○○ | Mean difference = 2 [95% CI -6 to 8] / mean difference = -11, 95% LoA = -81.7 to 58.0 (Booth, 2001); mean difference = 16, 95% LoA = -80 to 112 / mean difference = 5, 95% LoA = -78 to 88 (Wilcock, 2018) | ○○○○ | – |
Steep Ramp Test | ⬤○○○ | ICC = 0.996 (De Backer, 2007) | ○○○○ | – | ⬤⬤⬤○ | Correlation of change scores on the steep ramp test with change scores on the cardiopulmonary exercise test: r = 0.52 (Weemaes, 2021); for an increase of >6% on the steep ramp test: AUC = 0.74 (Weemaes, 2021) |
Indirectly measured one-repetition maximum | ○○○○ | – | ○○○○ | – | ○○○○ | – |
Handgrip strength | ⬤⬤○○ | Jamar: r = 0.966 / Biodex: r = 0.765 (Trutschnigg, 2008); Jamar, left hand: ICC = 0.88 / right hand: ICC = 0.96 (Van Hinte, 2020) | ⬤⬤○○ | Jamar (lbs): %CV = 6.30 (Trutschnigg, 2008); Jamar (kg), left hand: mean difference (SD) = 0.22 (6.67), 95% LoA = -12.86 to 13.30, SEM = 4.67, MDC95 = 12.96 (Van Hinte, 2020); Jamar (kg), right hand: mean difference (SD) = 0.52 (4.23), 95% LoA = -7.76 to 8.80, SEM = 2.98, MDC95 = 8.26 (Van Hinte, 2020); Biodex (Nm): %CV = 16.70 (Trutschnigg, 2008) | ○○○○ | – |
See the literature analysis for the validity results per measurement instrument.
○○○○: no data / no GRADE; ⬤○○○: very low certainty (GRADE); ⬤⬤○○: low certainty (GRADE); ⬤⬤⬤○: moderate certainty (GRADE); ⬤⬤⬤⬤: high certainty (GRADE).
1RM: one-repetition maximum; 6MWT: six-minute walk test; 95% LoA: 95% limits of agreement; AUC: area under the curve; CI: confidence interval; CPET: cardiopulmonary exercise test; CV: coefficient of variation; ICC: intra-class correlation coefficient; lbs: pounds; MDC95: 95% minimal detectable change; Nm: newton meter; SD: standard deviation; SEM: standard error of measurement.
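The measurement-error statistics in the table are linked by standard formulas, which the following minimal Python sketch illustrates. It assumes the conventional definitions MDC95 = 1.96 × √2 × SEM and 95% LoA = mean difference ± 1.96 × SD of the differences; the function names are illustrative and are not part of the guideline. Small deviations from the reported values are expected because the published SEM and SD values are rounded.

```python
import math

Z_95 = 1.96  # z-value for a two-sided 95% interval

def mdc95_from_sem(sem: float) -> float:
    """Minimal detectable change (95%) from the standard error of measurement.

    The factor sqrt(2) reflects that a change score combines the error
    of two measurements (test and retest).
    """
    return Z_95 * math.sqrt(2) * sem

def limits_of_agreement(mean_diff: float, sd_diff: float) -> tuple[float, float]:
    """Bland-Altman 95% limits of agreement for test-retest differences."""
    half_width = Z_95 * sd_diff
    return (mean_diff - half_width, mean_diff + half_width)

if __name__ == "__main__":
    # 6-minute walk test, SEM = 20.45 m (Van Hinte, 2020); reported MDC95 = 56.67 m
    print("6MWT MDC95:", round(mdc95_from_sem(20.45), 2))
    # Handgrip strength, left hand (Van Hinte, 2020); reported LoA = -12.86 to 13.30 kg
    lower, upper = limits_of_agreement(0.22, 6.67)
    print("Handgrip LoA:", round(lower, 2), "to", round(upper, 2))
```

The computed values agree with the tabulated ones to within rounding, which makes the sketch useful as a plausibility check when a study reports only the SEM or only the SD of the differences.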
Measurement instruments must be valid, reliable, responsive, and (relatively) easy to perform and interpret. Because monitoring physical fitness can require frequent measurements, a practical, short test is important in practice.
The literature analysis found too little data on questionnaires. Questionnaires can be used to make an initial estimate of physical fitness (see the module ‘screening’) rather than to monitor it, although their practical advantages make them attractive for use in practice. Given the limitations of questionnaires and the limited data available, physical tests therefore remain the designated means of monitoring physical fitness. These must of course be performed safely and under the supervision of adequately trained care providers (see the module ‘screening & assessment’). Physical tests can also easily be repeated in a standardized way, provided the right materials and space are available. These tests usually have well-described test configurations and protocols. Ideally, the tests can be performed in different settings, such as the hospital, primary care, and perhaps even at home. It would also be beneficial to be able to use the same tests across the lines of care.
To interpret results over time, it is desirable to use the same test(s) each time for monitoring physical fitness. The initially chosen test may no longer be feasible or possible during or after the treatment trajectory. In that case, try to select an alternative test protocol or test that fits the patient’s situation (for example a walking protocol for the same test, or a walking test instead of a cycling test). A patient’s physical fitness can change during a physical training program. Tests initially chosen for monitoring can therefore show a ceiling effect during the trajectory. In that case, consider selecting a different follow-up test that fits the patient’s level of physical fitness.
To interpret test results over time, the reliability (including measurement error) of the instrument must be adequate. Ideally, the instrument also has good responsiveness, so that one can be (more) certain that the observed change occurred in the intended construct.
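As an illustration of why measurement error matters when interpreting repeated results, the sketch below compares the change between two measurements with the instrument’s MDC95. It is a simplified decision aid under stated assumptions, not a clinical rule from this guideline: the function name, the labels, and the patient values in the usage example are hypothetical, and the MDC95 values are the ones reported by Van Hinte (2020) in the table.

```python
def classify_change(baseline: float, follow_up: float, mdc95: float,
                    higher_is_better: bool = True) -> str:
    """Label a change between two measurements relative to the MDC95.

    A change no larger than the MDC95 cannot be distinguished from
    measurement error with 95% confidence.
    """
    change = follow_up - baseline
    if abs(change) <= mdc95:
        return "within measurement error"
    improved = (change > 0) == higher_is_better
    return "improvement" if improved else "decline"

if __name__ == "__main__":
    # 6-minute walk distance in meters, MDC95 = 56.67 m (Van Hinte, 2020)
    print(classify_change(430, 470, mdc95=56.67))
    print(classify_change(430, 495, mdc95=56.67))
    # Timed up-and-go in seconds, MDC95 = 1.54 s; a shorter time is better
    print(classify_change(9.0, 7.0, mdc95=1.54, higher_is_better=False))
```

A change that exceeds the MDC95 is only shown to be real, not necessarily clinically relevant; that judgment would additionally require a minimal clinically important change, which the included studies did not report.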
Cardiorespiratory fitness
Cardiorespiratory fitness, expressed as the maximal oxygen uptake per kilogram of body weight (VO2peak), reflects a person’s endurance. Regarding questionnaires, no data were available at the time of the systematic literature search on their responsiveness for measuring changes in VO2peak in patients with cancer. Since then, a study from oncology rehabilitation has examined the extent to which the FitMáx questionnaire, the VSAQ, and the DASI can reflect a change in VO2peak measured with a cardiopulmonary exercise test (CPET) (Weemaes, 2023). The results showed that the responsiveness of the FitMáx questionnaire is limited (ICC 0.43) and that the responsiveness of the VSAQ (ICC 0.19) and the DASI (ICC 0.18) is poor for changes in VO2peak (Weemaes, 2023).
Regarding physical tests, the ‘6-minute walk test’ and the ‘steep ramp test’ have a high degree of reliability in an oncological population (6-minute walk test: ICC = 0.96 [Eden, 2018], ICC = 0.93 [Schmidt, 2013], ICC = 0.98 [Sebio-Garcia, 2017], ICC = 0.97 [Van Hinte, 2020]; ‘steep ramp test’: ICC = 0.99 [De Backer, 2007]). The 6-minute walk test can be administered in different configurations, depending on the available space. The outcome is the number of meters walked in six minutes. Note that the 6-minute walk test may show a ceiling effect unless the patient has a (very) low level of physical fitness.
The ‘steep ramp test’ was the only test for which data on responsiveness were found. Change scores on the ‘steep ramp test’ were correlated with change scores on a CPET (r = 0.51) (Weemaes, 2021). The ‘steep ramp test’ is a quick and practical maximal test on a cycle ergometer that takes little time to administer (approx. 6 minutes), but it is not suitable for very frail patients. The load on the ergometer increases rapidly, by 25 W every 10 seconds. The highest load achieved (Wpeak) is the primary outcome measure. Supervision by an (oncology) physiotherapist is important for proper test execution.
No data were found on the test-retest reliability of the ‘incremental shuttle walk test’ in the oncological population. However, 95% of the differences between repeated measurements lay between -81 and 58 meters (Booth, 2001), between -80 and 112 meters, or between -78 and 88 meters (Wilcock, 2018), depending on the test session used and on whether the retest took place on the same or a different day. The ‘incremental shuttle walk test’ is a maximal walking test that can measure a broad range of physical fitness, which makes it a suitable instrument for both frail and fit patients. The test requires an audio file, cones, and an exercise room or long corridor (where the patient cannot be disturbed during the test). The time between the audio signals is shortened every minute, so that walking speed increases from 1.8 km/h to 8.5 km/h (Singh protocol) or to 10.3 km/h (Bradley protocol).
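The load profile of the ‘steep ramp test’ described above (25 W every 10 seconds) and the comparison of two Wpeak results can be sketched as follows. This is a minimal illustration under assumptions: the starting load of 25 W is a placeholder chosen for the example and is not specified in this module, and the function names are hypothetical. The 6% threshold refers to the increase for which Weemaes (2021) reported an AUC of 0.74.

```python
def steep_ramp_load(elapsed_s: float, start_load_w: float = 25.0,
                    step_w: float = 25.0, step_s: float = 10.0) -> float:
    """Ergometer load (W) after `elapsed_s` seconds of the ramp phase.

    ASSUMPTION: start_load_w is a placeholder; use the starting load
    prescribed by the protocol applied in your own setting.
    """
    if elapsed_s < 0:
        raise ValueError("elapsed_s must be non-negative")
    completed_steps = int(elapsed_s // step_s)
    return start_load_w + step_w * completed_steps

def percent_change(previous_wpeak: float, current_wpeak: float) -> float:
    """Relative change in Wpeak between two tests, in percent."""
    return (current_wpeak - previous_wpeak) / previous_wpeak * 100.0

if __name__ == "__main__":
    # Load reached when the patient stops after 95 seconds of ramping
    print(steep_ramp_load(95))
    # Hypothetical Wpeak values from two tests four weeks apart
    change = percent_change(300, 325)
    print(round(change, 1), "% change; exceeds 6%:", change > 6)
```

In practice the ergometer software applies the ramp; the sketch is mainly useful for checking a recorded Wpeak against the test duration and for expressing change between monitoring moments.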
Muscle strength
The one-repetition maximum (1RM) is the gold standard for determining the muscle strength of a specific muscle group; the absolute score must be normalized for body weight before it can be compared with normative values. Determining the 1RM directly is, however, very impractical. It seems plausible that an indirect measurement of the 1RM of the muscle group relevant to the patient is an important measurement for monitoring, although no test-retest reliability data are available for it in the oncological population. Estimating the 1RM from an indirect measurement, for example with the Oddvar Holten diagram, seems a good option in theory. The indirectly measured 1RM can also be an important consideration in specific subgroups. One example is patients with bone metastases, who are at risk of fracture under high loads. In addition, the indirectly measured 1RM can be performed in relative isolation on different (large) muscle groups. Alternative physical tests, such as handgrip strength and the ‘30-second chair stand test’, are generally less flexible for measuring different muscle groups or chains.
Although handgrip strength shows a high degree of test-retest reliability (ICC = 0.88 and 0.96 [Van Hinte, 2020], r = 0.97 and r = 0.76 [Trutschnigg, 2008]), this test mainly measures the strength of the forearm musculature. No information is available on the responsiveness of handgrip strength measurements. A randomized training study after cancer treatment showed no change in this outcome measure as a result of training (Kampshoff, 2015).
For the ‘30-second chair stand test’, the measurement error can be called acceptable (MDC95 = 2.6 [Aabo, 2021], MDC95 = 3 [Blackwood, 2021], MDC95 = 2.96 [Van Hinte, 2020]) and the test-retest reliability is high (ICC = 0.89 to 0.97), but no information is available on its responsiveness in an oncological population. This test mainly measures the muscle function of the lower extremities by asking the patient to stand up from a chair. Note that the ‘30-second chair stand test’ may show a ceiling effect in fit(ter) patients. The forms of physical training included in the analysis showed no effect on this outcome measure (van Waart, 2015; Kampshoff, 2015). The indirect 1RM therefore offers more freedom to measure specific muscle groups that are relevant to the patient. These tests can be administered by an (oncology) physiotherapist, who is in principle trained to administer them.
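How an indirect 1RM estimate might be derived and normalized for body weight can be sketched as below. Note the assumptions: the repetition-to-percentage anchor points are the values commonly quoted for the Holten curve and are not taken from this guideline, so they should be checked against the diagram actually used in practice; the linear interpolation between anchor points and all function names are illustrative choices as well.

```python
# ASSUMPTION: anchor points (repetitions -> fraction of 1RM) as commonly
# quoted for the Holten curve; verify against the diagram used locally.
HOLTEN_CURVE = {1: 1.00, 2: 0.95, 4: 0.90, 7: 0.85, 11: 0.80,
                16: 0.75, 22: 0.70, 25: 0.65, 30: 0.60}

def fraction_of_1rm(repetitions: int, curve: dict = HOLTEN_CURVE) -> float:
    """Fraction of the 1RM that corresponds to a maximal number of repetitions.

    Values between anchor points are linearly interpolated (an illustrative
    simplification of reading the diagram by eye).
    """
    points = sorted(curve.items())
    if not points[0][0] <= repetitions <= points[-1][0]:
        raise ValueError("repetitions outside the range covered by the curve")
    for (r0, f0), (r1, f1) in zip(points, points[1:]):
        if r0 <= repetitions <= r1:
            weight = (repetitions - r0) / (r1 - r0)
            return f0 + weight * (f1 - f0)
    return points[0][1]  # only reached for a single-point curve

def estimate_1rm(load_kg: float, repetitions: int, curve: dict = HOLTEN_CURVE) -> float:
    """Estimated 1RM (kg) from a submaximal load lifted to fatigue."""
    return load_kg / fraction_of_1rm(repetitions, curve)

def relative_strength(one_rm_kg: float, body_weight_kg: float) -> float:
    """1RM normalized for body weight (kg lifted per kg body weight)."""
    return one_rm_kg / body_weight_kg

if __name__ == "__main__":
    # Hypothetical patient: 40 kg leg press for 11 repetitions, body weight 80 kg
    one_rm = estimate_1rm(40, 11)
    print(round(one_rm, 1), round(relative_strength(one_rm, 80), 2))
```

Because the estimate uses a submaximal load, it avoids the peak loading of a direct 1RM attempt, which is the reason the text above mentions it for patients with bone metastases.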
During monitoring of the physical fitness of a patient with cancer, the intended results may not be achieved. In some cases the reason remains unclear. In these cases, consider performing a cardiopulmonary exercise test (CPET) for diagnostic purposes during the monitoring trajectory to determine the nature of the exercise limitation, but do not use the CPET routinely as a monitoring instrument.
There may be subgroups for which specific, individual considerations are needed. These include patients who need complex rehabilitation; see the guideline ‘Medisch specialistische revalidatie bij oncologie’ (VRA, 2014). Frail older adults can also be a subgroup in which individual considerations lead to the use of specific instruments. To measure frailty in older adults, the G8, SPPB, TUG, or CFS, among others, can be used. For all physical tests it is important to take the patient’s situation and physical limitations into account when choosing the measurement instrument and/or test protocol. If a patient cannot perform a cycling test, for example because of the oncological treatment or other circumstances, consider an alternative such as a walking test (or a walking protocol for the same test), and for muscle strength likewise repeatedly measure the muscle group(s) relevant to the patient.
Optimal test frequency for guiding training
The ‘steep ramp test’ and the indirect 1RM test lend themselves very well to translation into an individually tailored training dose, particularly when supported by the Borg score. In practice these tests are often performed every 3 or 4 weeks, depending on the chemotherapy cycle, so that the training can be adjusted each time to the test values achieved and a clear overview of the course of training emerges (Kampshoff, 2015; van Waart, 2015).
Values and preferences of patients (and, where applicable, their caregivers)
For a reliable test result, and for the patient’s comfort, the purpose of the test must be explained clearly beforehand, even if the patient has performed the test before. This emerged from interviews held with patients during the development of this module. Using instruments to monitor physical fitness gives the patient a picture of progress or decline and makes it possible to assess whether the intended result (for example of a physical training program) is being achieved. The interviews also showed that the safety of the physical tests is very important to patients. Therefore check for any absolute contraindications before administering a physical test. Also attend to safety while the tests are being administered (for example risk of falling, becoming unwell, shifting parts, loading in case of bone pathology). Ideally, the tests used during screening correspond to the tests used for monitoring. However, the (un)known measurement properties of tests and the patient’s individual considerations may lead to deviation from the recommended monitoring instrument. Nevertheless, try to use the same test(s) as much as possible for monitoring so that results can be compared, including when measuring and monitoring across lines of care (for example with measurements in primary and secondary care).
Costs (resource use)
Some instruments require additional materials, for example a cycle ergometer, chair, cones, timer, measuring tape, stairs, or audio file. Costs may differ depending on who administers the test. Some tests can also be so demanding that, for safety, BLS-trained staff must be present when the test is administered. Capacity for performing a cardiopulmonary exercise test (CPET) is limited. This test therefore cannot be used routinely as a monitoring instrument.
Acceptability, feasibility, and implementation
Performing physical tests requires space and materials, depending on the type of test. It is likely that these tests will be performed by (oncology) physiotherapists. They are trained in administering physical tests and will generally have the right materials and space. However, specific components of some tests may not (yet) be available everywhere, such as the audio file for the ‘incremental shuttle walk test’. Performing physical tests such as the ‘steep ramp test’, the ‘incremental shuttle walk test’, the ‘6-minute walk test’, the indirect one-repetition maximum, a handgrip strength measurement, and the ‘30-second chair stand test’ is nevertheless considered feasible and implementable. Some physical tests can be performed in different configurations or with different test protocols, which may cause some practice variation. Ideally, a configuration and test protocol is selected that is valid for the population with cancer and can be performed repeatedly in a standardized way. It is up to the person administering the physical test to select the most suitable configuration and/or test protocol in practice, appropriate to the setting and the patient. Furthermore, the cardiopulmonary exercise test (CPET) is not an unusual test in practice when there is an indication for it. Therefore, neither implementation problems nor an increase in use are expected from the possible diagnostic use of the CPET during the monitoring trajectory.
A high frequency of measurements can be very burdensome, depending on the type of test and the patient’s condition. Very high loads at a high frequency are not considered acceptable for the patient. The frequency of measurements within training programs can be standardized, but, in view of the burden, consider not measuring earlier than when a change can be expected in relation to the training stimulus given or the course of the (treatment) trajectory. When monitoring takes place outside training programs, suitable intervals for measurements that are relevant to the individual care trajectory can be agreed with the patient. The frequency of measurements can differ per patient, for example because of different cancer types, treatment trajectories, or position in the care trajectory (e.g. during oncological treatment or during aftercare and follow-up). Measurements can also coincide with the patient’s contact moments with the nurse practitioner or (oncology) physiotherapist.
Rationale for the recommendation: weighing of arguments for and against the interventions
For longitudinal monitoring of physical fitness (i.e. cardiorespiratory fitness and muscle strength), repeated testing with the same measurement instrument is important so that results can be compared over time. Besides validity and reliability, responsiveness is an important measurement property of the test for this purpose. When tests or test protocols no longer fit the patient’s situation and cannot be performed, an alternative test or test protocol can be chosen. Consider, for example, the objections to a cycling test in a patient with testicular carcinoma.
Evidence base
Background
Monitoring physical fitness is important. It allows progress or decline in physical fitness to be tracked. It is also important during personalized training programs before, during, or after oncological treatment, where it can be used to assess whether the intended goals are (or can be) achieved. However, it is not clear which measurement instruments are suitable for monitoring the physical fitness of patients with cancer in practice.
Conclusions
One Repetition Maximum
VERY LOW GRADE |
We are unsure about the leg press one-repetition maximum’s construct validity to measure muscle strength in a population with cancer.
Source: Tsuji, 2020 |
NO GRADE |
The level of evidence regarding the outcomes criterion validity (muscle strength and cardiorespiratory fitness), reliability, measurement error, and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties. |
Handgrip dynamometry
VERY LOW GRADE |
We are unsure about the handgrip dynamometer’s construct validity to measure muscle strength in a population with cancer.
Source: Rogers, 2017; Tsuji, 2022 |
LOW GRADE |
There is a low confidence in the reported reliability of handgrip dynamometry in a population with cancer.
Source: Trutschnigg, 2008; Van Hinte, 2020 |
LOW GRADE |
There is a low confidence in the reported measurement errors of handgrip dynamometry in a population with cancer. The measurement error is difficult to interpret from the literature because a minimal clinically important change in the population of interest was not reported.
Source: Trutschnigg, 2008; Van Hinte, 2020 |
NO GRADE |
The level of evidence regarding the outcomes criterion validity (muscle strength and cardiorespiratory fitness) and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties. |
Thirty-Second Chair Stand Test
VERY LOW GRADE |
We are unsure about the thirty-second chair-stand test’s construct validity to measure muscle strength or cardiorespiratory fitness in a population with cancer.
Source: Blackwood, 2021 |
MODERATE GRADE |
There is a moderate confidence in the reported reliability of the thirty-second chair stand test in a population with cancer.
Source: Aabo, 2021; Blackwood, 2021; Eden, 2018; Van Hinte, 2020 |
MODERATE GRADE |
There is moderate certainty about the reported measurement errors of the thirty-second chair stand test in a population with cancer. The measurement error is difficult to interpret from the literature because a minimal clinically important change in the population of interest was not reported.
Source: Aabo, 2021; Blackwood, 2021; Van Hinte, 2020 |
NO GRADE |
The level of evidence regarding the outcomes criterion validity (for both muscle strength and cardiorespiratory fitness) and inter-rater reliability was not GRADEd. None of the included studies reported data about these measurement properties. |
Five-Time Sit to Stand Test
VERY LOW GRADE |
We are unsure about the five-time sit to stand test’s construct validity to measure muscle strength in a population with cancer.
Source: Blackwood, 2021 |
VERY LOW GRADE |
We are unsure about the reported reliability of the five-time sit to stand test in a population with cancer.
Source: Blackwood, 2021 |
VERY LOW GRADE |
We are unsure about the five-time sit to stand test’s measurement error in a population with cancer. The measurement error is difficult to interpret from the literature because a minimal clinically important change in the population of interest was not reported.
Source: Blackwood, 2021 |
NO GRADE |
The level of evidence regarding the outcomes criterion validity (for both muscle strength and cardiorespiratory fitness) and responsiveness were not GRADEd. None of the included studies reported data about these measurement properties. |
Stair Climbing Test
LOW GRADE |
The stair climbing test may have good criterion validity for measuring cardiorespiratory fitness in a population with cancer.
Source: Koegelenberg, 2020 |
NO GRADE |
The level of evidence regarding the outcomes criterion validity (for muscle strength), reliability, measurement error, and responsiveness were not GRADEd. None of the included studies reported data about these measurement properties. |
Timed Up and Go test
VERY LOW GRADE |
We are unsure about the timed up and go test’s construct validity to measure muscle strength in a population with cancer.
Source: Blackwood, 2021 |
LOW GRADE |
There is a low confidence in the reported reliability of the timed up and go test in a population with cancer.
Source: Blackwood, 2020; Van Hinte, 2020 |
LOW GRADE |
There is a low confidence in the reported measurement errors of the timed up and go test in a population with cancer. The measurement error is difficult to interpret from the literature because a minimal clinically important change in the population of interest was not reported.
Source: Blackwood, 2020; Van Hinte, 2020 |
NO GRADE |
The level of evidence regarding the outcomes criterion validity (for both muscle strength and cardiorespiratory fitness) and responsiveness were not GRADEd. None of the included studies reported data about these measurement properties. |
Six-Minute Walking Test
VERY LOW GRADE |
We are unsure about the six-minute walking test’s construct validity to measure muscle strength in a population with cancer.
Source: Eden, 2018 |
MODERATE GRADE |
There is a moderate confidence in the reported results for the criterion validity of the six-minute walking test using the walking distance to measure cardiorespiratory fitness in a population with cancer.
Source: Granger, 2015; Schmidt, 2013 |
LOW GRADE |
There is a low confidence in the reported results for the criterion validity of the six-minute walking test using the predicted VO2peak to measure cardiorespiratory fitness in a population with cancer.
Source: Tsuji, 2022 |
VERY LOW GRADE |
We are unsure about the six-minute walking test’s construct validity when using the walking distance to measure cardiorespiratory fitness in a population with cancer.
Source: Granger, 2015 |
LOW GRADE |
There is a low confidence in the reported results for the construct validity of the six-minute walking test using the predicted VO2peak to measure cardiorespiratory fitness in a population with cancer.
Source: Schumacher, 2018 |
MODERATE GRADE |
There is a moderate confidence in the reported reliability of the six-minute walking test in a population with cancer.
Source: Eden, 2018; Schmidt, 2013; Sebio-Garcia, 2017; Van Hinte, 2020 |
MODERATE GRADE |
There is moderate certainty about the reported measurement errors of the six-minute walking test using the walking distance in a population with cancer. The measurement error is difficult to interpret from the literature because a minimal clinically important change in the population of interest was not reported.
Source: Schmidt, 2013; Sebio-Garcia, 2017; Van Hinte, 2020 |
NO GRADE |
The level of evidence regarding the outcome inter-rater reliability was not GRADEd. None of the included studies reported data about this measurement property. |
Incremental Shuttle Walking Test
VERY LOW GRADE |
We are unsure about the incremental shuttle walking test’s construct validity to measure muscle strength in a population with cancer.
Source: England, 2014 |
HIGH GRADE |
We are very confident about the reported results for the incremental shuttle walking test’s criterion validity measuring cardiorespiratory fitness in a population with cancer.
Source: Granger, 2015; Win, 2006 |
LOW GRADE |
There is a low confidence in the reported measurement errors of the incremental shuttle walking test using the walking distance in a population with cancer. The measurement error is difficult to interpret from the literature because a minimal clinically important change in the population of interest was not reported.
Source: Booth, 2001; Wilcock, 2018 |
NO GRADE |
The level of evidence regarding the outcomes criterion validity (for muscle strength), reliability, and responsiveness were not GRADEd. None of the included studies reported data about these measurement properties. |
Steep Ramp Test
HIGH GRADE |
There is a high confidence in the reported criterion validity of the steep ramp test using Wmax to measure cardiorespiratory fitness in a population with cancer.
Source: De Backer, 2007; Weemaes, 2021 |
MODERATE GRADE |
There is a moderate confidence in the reported criterion validity of the steep ramp test predicting the VO2peak to measure cardiorespiratory fitness in a population with cancer.
Source: Stuiver, 2017 |
VERY LOW GRADE |
We are unsure about the reported reliability of the steep ramp test in a population with cancer.
Source: De Backer, 2007 |
MODERATE GRADE |
There is moderate confidence in the reported responsiveness of the steep ramp test in a population with cancer.
Source: Weemaes, 2021 |
NO GRADE |
The level of evidence regarding the outcomes criterion validity (for muscle strength) and measurement error was not GRADEd. None of the included studies reported data about these measurement properties. |
Physician-based Assessment and Counseling for Exercise questionnaire
NO GRADE |
The level of evidence regarding the outcomes validity, reliability, measurement error, and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties. |
Fitmáx questionnaire
NO GRADE |
The level of evidence regarding the outcomes validity, reliability, measurement error, and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties. |
Veterans-Specific Activity Questionnaire
NO GRADE |
The level of evidence regarding the outcomes validity, reliability, measurement error, and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties. |
Duke Activity Status Index
LOW GRADE |
There is a low confidence in the Duke Activity Status Index’s criterion validity to measure the cardiorespiratory fitness in a population with cancer.
Source: Li, 2018 |
NO GRADE |
The level of evidence regarding the outcomes criterion validity (for muscle strength), reliability, measurement error, and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties. |
Summary of literature
Description of studies
Aabo (2021) examined the reliability and measurement error of the 30-second chair-stand test in a convenience sample of male outpatients with prostate cancer (n=60). Participants were excluded when they had opioid-demanding skeletal pain. All participants were on androgen deprivation therapy (median time since start: 1 year and 23 days, range: 3 months and 1 week to 8 years and 11 months) and were already familiar with the 30-second chair-stand test. The mean age was 70.8 years (SD: 6.6, range: 51-86). Comorbidities in the sample were cardiovascular disease (n=16), diabetes (n=5), hypertension (n=29), dyslipidemia (n=16), and osteoporosis (n=3). Two or more comorbidities were present in 25 men. Additional therapies were provided in the form of chemotherapy or novel endocrine therapy to ten men with castration-resistant prostate cancer. Metastatic disease was present in 38 men. The original manual was followed for conducting the 30-second chair-stand test, using a chair with a 44cm seat height. Participants were instructed to sit in the middle of the chair with their backs straight and feet placed approximately shoulder-width apart, a little posterior to the knees. One foot was placed slightly in front of the other in order to maintain balance. Arms were crossed over the chest and the number of sit-to-stand repetitions within 30 seconds was recorded. If a participant was not able to stand up with crossed arms, he was allowed to use the armrests. The rater demonstrated the test once. The participant was given a practice trial before the actual test trial was performed. Two raters conducted the tests and calibrated the test procedures prior to the test trials. A manual was used to standardize the information and instructions given to participants. Encouragement and talking were not allowed during the test trials. Test trials were performed twice under identical conditions on the same day by the same rater. The two test trials were separated by one hour of rest, during which the participants were instructed not to be physically active. Intra-rater reliability and measurement error were reported.
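As an illustration of how measurement error from a test-retest design such as this one is often summarised, the sketch below derives a standard error of measurement (SEM) and a smallest detectable change (SDC) from a between-subject standard deviation and an intraclass correlation coefficient (ICC). This is a minimal sketch under stated assumptions: the formulas are the commonly used ones, the function names are hypothetical, and the input values are invented for illustration and are not taken from Aabo (2021) or any other included study.

```python
import math


def sem_from_icc(sd_between: float, icc: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd_between * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem: float, z: float = 1.96) -> float:
    """Smallest individual change exceeding measurement error: z * sqrt(2) * SEM."""
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs (assumption): SD of 3.0 repetitions and an ICC of 0.91.
sem = sem_from_icc(sd_between=3.0, icc=0.91)
sdc = smallest_detectable_change(sem)
print(f"SEM = {sem:.2f} repetitions, SDC = {sdc:.2f} repetitions")
```

A change in an individual patient smaller than the SDC cannot be distinguished from measurement error, which is why a reported measurement error is hard to interpret without a minimal clinically important change to compare it with.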
Blackwood (2020) examined the reliability and measurement error of the Timed Up and Go test in 50 older breast (n=26), prostate (n=17), lung (n=5), and colorectal (n=1) cancer survivors. Patients were included when they were 65 years or older, were English speaking, had a medically confirmed diagnosis of breast, lung, prostate, or colorectal cancer, had completed treatment for the primary cancer at least 3 months before testing, and were able to get up from a chair and walk 50 feet with or without an assistive device. The mean age was 74.2 years (SD: 6.74), with 58% of the sample being female. Mean functional comorbidity index was 2.72 (range 0-7), with 17 participants having a fall history. The mean time since last active cancer treatment was 11.71 years (SD: 9.05). Participants received chemotherapy only (n=1), radiotherapy only (n=6), surgery only (n=19), chemotherapy and radiotherapy (n=5), chemotherapy and surgery (n=5), radiotherapy and surgery (n=5), chemotherapy and radiotherapy and surgery (n=5), or chemotherapy and surgery and hormone therapy (n=1). Treatment type was not reported for three participants. Participants were instructed to stand up from the chair (46cm seat height), walk 3 meters to a line on the floor, turn around, walk back, and sit on the chair again. The time needed to complete the task was recorded with a stopwatch. Two trials were performed and the average of those two trials was recorded. A two-week interval was used between test sessions.
Blackwood (2021) examined the reliability and measurement error of the 30-second Chair-Stand Test and Five-Time Sit To Stand test in older breast (n=28), prostate (n=15), lung (n=3), or colorectal (n=1) cancer survivors. Construct validity of the 30-second Chair-Stand Test, Timed Up and Go test, and Five-Time Sit To Stand test was examined by correlating these measures. Participants were included when they were 65 years or older, were English speaking, had a medically confirmed diagnosis of breast, lung, prostate, or colorectal cancer, had completed treatment for the primary cancer at least 3 months before testing, and were able to get up from a chair and walk 50 feet with or without an assistive device. Participants were excluded when they reported a cancer recurrence or metastasis, had a history of a chronic neurologic condition, had more than one cancer diagnosis, had an acute illness, or had an unstable medical condition. Sixty participants were recruited; however, only forty-seven participants had complete data for all measures. Reasons for exclusion were loss to follow-up, missing data at one or both time points, or inability to complete a test without upper extremity support. The mean age of the resulting 47 participants was 73.7 years (SD: 6.38, range: 65-89), with 66% being female. Participants received radiotherapy only (n=6), surgery only (n=14), hormonal treatment only (n=1), chemotherapy and radiotherapy (n=4), chemotherapy and surgery (n=5), radiotherapy and surgery (n=6), chemotherapy and radiotherapy and surgery (n=7), or chemotherapy and surgery and hormone therapy (n=1). Cancer stage at diagnosis was stage 0 (n=2), stage 1 (n=22), stage 2 (n=8), stage 3 (n=3), or unknown (n=12). For the 30-second chair-stand test, participants were instructed to be seated with arms crossed over their chest. When the test started, participants stood up and sat back down repeatedly. The number of repetitions within 30 seconds was recorded. Tests were performed by one rater. Participants performed one trial per session, with a two-week interval between sessions.
Booth (2001) examined the measurement error of the Shuttle Walking Test in patients with advanced cancer receiving palliative care. Patients had a WHO performance status of 1 or 2 and were recruited from the day center and hospice. Patients who could walk without assistance from another person were eligible for the study. Patients with cognitive impairment, short-term memory loss, or walking difficulties due to pain were excluded from the study. Initially, 71 patients were recruited, but 30 declined because of impracticality (n=16), feeling too frail (n=8), being limited by pain or giddiness (n=5), or being tired of saying ‘no’ (n=1). Thereafter, 9 patients dropped out because they deteriorated and were unable to walk for a second trial (n=8) or declined further involvement (n=1). Of the patients who completed all three tests, 10 were excluded from analysis because of experiencing pain (n=4), feeling giddy (n=1), lack of mobility (n=1), complications between trials (n=3), or a very different baseline shortness of breath between trials (n=1). Median age of the resulting 22 analyzed patients was 70 years (range: 31-84), and 7 were male. Patients had a lung carcinoma (n=3), metastatic breast carcinoma (n=5), or other types of cancer (n=14). The Shuttle Walking Test was performed on three different days. The first trial was a familiarization trial. Median interval between the second and third trial was 7 days (range: 1-35). Walking distance in meters was recorded per trial.
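The participant flow described for Booth (2001) can be checked with simple arithmetic. The sketch below is only an illustration: the counts are those reported in the description above, while the dictionary labels and the helper name `remaining_after` are hypothetical.

```python
# Counts as reported in the description of Booth (2001); labels are shorthand.
recruited = 71
declined = {"impracticality": 16, "too frail": 8, "pain or giddiness": 5, "other stated reason": 1}
dropped_out = {"deterioration": 8, "declined further involvement": 1}
excluded_from_analysis = {
    "pain": 4,
    "giddiness": 1,
    "lack of mobility": 1,
    "complications between trials": 3,
    "differing baseline breathlessness": 1,
}


def remaining_after(start: int, *stages: dict) -> int:
    """Subtract the participants lost at each stage from the starting count."""
    remaining = start
    for stage in stages:
        remaining -= sum(stage.values())
    return remaining


analyzed = remaining_after(recruited, declined, dropped_out, excluded_from_analysis)
print(analyzed)  # 71 - 30 - 9 - 10
```

The result matches the 22 analyzed patients reported in the study description.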
De Backer (2007) examined the reliability and criterion validity of the steep ramp test in patients with breast cancer (n=20), ovarian cancer (n=5), colorectal cancer (n=4), testicular cancer (n=2), non-Hodgkin’s lymphoma (n=5), and Hodgkin’s lymphoma (n=1) treated curatively. Patients were excluded when they were not able to sit or lie down, had cognitive disorders or severe emotional instability, or had other serious illnesses limiting physical performance. Thirty-seven patients (n=10 male, n=27 female) were included, with a mean age of 48 years (SD: 11, range: 24-71). Patients received chemotherapy (n=37), chemotherapy and radiotherapy (n=1), chemotherapy and surgery (n=11), or chemotherapy and radiotherapy and surgery (n=20). A VO2max test was performed on a cycle ergometer using an oximeter connected to a computer for breath-by-breath analysis. A ramp protocol was used in which the patient was expected to reach their maximum load within 10 minutes. The ramp protocol was applied after 4 minutes of unloaded cycling. Patients were instructed and encouraged to cycle until exhaustion with a pedal frequency between 70 and 80rpm. After the ramp protocol, the patients cycled unloaded until VO2 values were back to baseline. The VO2max test was repeated after 18 weeks of training. The Steep Ramp Test started with cycling at 25W for 30 seconds. Thereafter, the load was increased every 20 seconds by 25W until patients were exhausted. Patients were instructed to cycle with a frequency between 70-80rpm and the test was terminated when the pedal frequency fell below 60rpm. The Steep Ramp Test was performed before training, at week 5, week 9, week 13, and week 18. The test-retest reliability of the Steep Ramp Test was assessed in a subset of the sample (n=23).
Eden (2018) examined the reliability of the Six Minute Walking Test and the 30-Seconds Chair-Stand test in patients with head and neck cancer. Construct validity was assessed by correlating the six-minute walking test to the 30-second chair stand test. Patients were included when they had surgery in the past three months for head and neck cancer or were currently undergoing radiotherapy, chemotherapy, or chemoradiotherapy for their head and neck cancer. Patients furthermore had to be between 18-85 years old, be community-dwelling during the study period, and have adequate proficiency in English. Patients were excluded when they needed an interpreter, were unable or unwilling to return for the second trial within one week, or had comorbidities limiting the safe completion of trials. Sixty-six eligible patients consented, of which 1 patient was lost to follow-up and 23 patients were not included due to a diagnosis of skin or thyroid cancer. The remaining 42 participants (81% male) had a mean age of 63.1 years (SD: 9.3, range: 32.5-76.8) and a mean BMI of 26.9 (SD: 5.4). Tumor diagnosis was oral cavity (n=34), oropharynx or hypopharynx (n=3), nasal cavity or paranasal sinus (n=3), or parotid (n=2) cancer. Tumor stage was either T1 (n=14), T2 (n=10), T3 (n=9), or T4 (n=9). Therapy received was surgery for 40 patients (n=8 with radiation, n=6 with radiation and chemotherapy), radiotherapy only for one patient, and radiotherapy and chemotherapy only for one patient. Trials used for the reliability testing were performed on separate days (mean time between days: 4.3 days, range: 1-12). On the first day, two trials of the Six Minute Walking Test were performed in a 22-meter-long hallway, with one hour of rest in between. The first trial was a familiarization trial, whereas the second trial was used for reliability testing. The 30-Seconds Chair-Stand test was performed once on day one. Patients kept their arms crossed over their chest. On the second testing day, one trial for both the Six Minute Walking Test and 30-Seconds Chair-Stand test was performed.
England (2012) examined the construct validity of the Incremental Shuttle Walking Test by correlating the test to measures of inspiratory and peripheral muscle strength. Patients (n=41, of which 19 females) with incurable thoracic cancer and an ECOG-status of 0-2 were included. Patients were excluded when a palliative intervention to improve exercise capacity was appropriate, when they had ischaemic heart disease, had pain affecting their ability to walk, or if they had received chemotherapy or radiotherapy within the last four weeks. The dose of drugs potentially affecting the ability to exercise needed to be stable for at least a week. The mean age was 64 years (SD: 8) with a diagnosis of non-small cell lung cancer (n=26), mesothelioma (n=11), or small cell lung cancer (n=4). ECOG-status was either 0 (n=16), 1 (n=21), or 2 (n=4). Cancer was either local (n=21) or advanced (n=20), where patients had metastases in the lymph nodes (n=12), lung (n=6), bones (n=4), liver (n=2), and adrenal glands (n=1). Thirty-seven patients had previously received palliative treatments in the form of chemotherapy (n=26), radiotherapy (n=10), or radical surgery (n=1). Smoking status was ‘never’ (n=7), ‘current’ (n=7), or ‘ex-smoker’ (n=27). Comorbidities observed in the sample were COPD (n=5), osteoarthritis (n=3), hypertension (n=4), diabetes (n=4), and a recent pulmonary embolism (n=1). Patients performed a sniff nasal inspiratory pressure test for measuring the inspiratory muscle strength and the leg extensor power test for peripheral muscle strength. Five minutes of rest were taken between assessments. Patients listened to pre-recorded instructions for the Incremental Shuttle Walking Test prior to performing a familiarization trial. After the trial, patients were seated until their heart rate and breathlessness returned to normal and rested for an hour thereafter. A second trial was then performed.
Granger (2015) assessed the criterion validity of the Six-Minute Walking Test and Incremental Shuttle Walking Test. Construct validity was assessed by correlating the Six-Minute Walking Test to the Incremental Shuttle Walking Test. Participants were included when they had non-small cell lung cancer (histologically confirmed), received or were scheduled to receive treatment in the past six months, and had an ECOG-status of 0-2. Patients were excluded when they were unable to give consent, had a cognitive disorder, had extensive visceral or skeletal metastases, had stage IV disease after treatment, had co-morbidities that prevented exercise, had insufficient proficiency in English, or had contraindications for performing a CPET (as recommended by the American Thoracic Society). Included patients (n=20) had a mean age of 66.1 years (SD: 6.5), of which 8 were male (40%). Mean Body Mass Index was 28.5 (SD: 4.0). In the sample, 7 patients (35%) had COPD and smoking status was ‘never’ (n=2, 10.5%), ‘current’ (n=2, 10.5%), ‘ex-smoker’ (n=15, 78.9%), or unknown (n=1). The lung cancer stage was either stage I (n=12), stage II (n=5), or stage III (n=3). Adenocarcinomas (n=15), squamous cell carcinomas (n=2), and other types (n=3) were observed in the sample. Time of performing the study tests was pre-treatment (n=3), post-surgery only (n=12), or post-surgery and chemotherapy (n=5). ECOG-status was either 0 (n=11), 1 (n=8), or unknown (n=1). A CPET was performed on a cycle ergometer using an incremental protocol. The workload increased every minute until symptomatic limitations. Breathlessness and heart rate data were used to determine whether the test was maximal.
The Six-Minute Walking Test was performed in a 30-meter-long corridor according to the American Thoracic Society’s and European Respiratory Society’s recommendations. The Incremental Shuttle Walking Test was performed around a 10-meter shuttle course while the walking pace increased every minute. Walking pace started at 0.5 m/sec and the test terminated when the participant was no longer able to maintain the pace. Two tests for both the Six-Minute Walking Test and the Incremental Shuttle Walking Test were performed to account for learning effects and the highest result was used in the analyses. Assessments were performed on each single visit. Patients returned for a second test visit within seven days. Patients were stable and did not receive medical or exercise interventions in between visits. At least 15 minutes of rest were taken between tests on a single visit, while the test order was randomized.
Koegelenberg (2007) examined the criterion validity of the Stair Climb Test by correlating it with a cycle ergometry CPET. Patients were included when they were 18 years or older, were scheduled for a lung resection, were under optimal medical therapy, had an FEV1 under 80% of the predicted FEV1, were physically able to perform cycle ergometry and stair climbing, and could provide written consent. Patients were excluded when a lung volume reduction surgery was planned, when they had serious cardiopulmonary disease, or when they were unable to exercise due to musculoskeletal or lower limb pathology. Patients (n=44, of which 13 females) had a mean age of 47.6 years (SD: 12.5). Indication for surgery was either bronchiectasis (n=27), non-small cell bronchial carcinoma (n=13), aspergilloma (n=2), or other (n=2). COPD was present in 24 patients. Cycle ergometry was performed on the same day as the Stair Climb Test with 2 hours of rest in between. Cycle ergometry was performed following the American Thoracic Society guidelines. A 2-minute warm-up period at 0W was followed by the test start at 20W, using increases of 20W every minute. When a clear plateau for VO2max was not observed, the observed VO2peak was used. For the Stair Climb Test, patients needed to climb a stair as fast and as high as possible. The test was terminated when patients stopped for more than three seconds or reached the height of 20 meters. Use of the handrail was allowed to maintain balance, but not to pull themselves up.
Li (2018) examined the criterion validity of the Duke Activity Status Index in 43 patients with cancer. Patients were included when scheduled for a major cancer surgery, when referred to the CPET service during pre-operative work-up, and who had a concurrent Duke Activity Status Index questionnaire administered. Patients were excluded when unable to perform the test due to pain, had neurological deficits, or had severe cognitive deficits. Cases with missing data were removed from analysis. The sample contained 25 males and had a median age of 63 years (IQR: 18). Fifteen participants had received chemotherapy within the last 6 months. Median Body Mass Index in the sample was 25.92 (IQR: 8.07). The CPET was performed following the American Thoracic Society/American College of Chest Physicians’ guidelines. Three minutes of unloaded cycling was used before a ramp protocol was initiated with 20W increases every minute at 60-70RPM until peak exercise. Thereafter, patients pedalled unloaded for 5 minutes as a recovery period. Tests were terminated at peak exercise due to fatigue, dyspnoea, chest pain, leg pain, signals of myocardial ischemia, hypotension, or arrhythmia at the patient’s or clinician’s discretion. The Duke Activity Status Index was used as a self-administered questionnaire, part of routine pre-operative assessments. The questionnaire was immediately administered before CPET testing.
Rogers (2017) examined the construct validity of handgrip dynamometry by correlating the dynamometry to a One-Repetition Maximum barbell bench press. Patients were included when they were a female breast cancer survivor between 1-15 years after diagnosis, were free from cancer, had ≥1 lymph nodes removed, had no medical conditions or medications that contraindicated an exercise program, had a BMI ≤50, had no future plans for surgery during the study period, had no bilateral lymph node removal, did not perform weight lifting in the past year, and had a stable body weight while not trying to lose weight. The sample contained 295 participants with a mean age of 55.9 years (SD: 8.8) and a mean Body Mass Index of 29.2 (SD: 6.1). A large proportion (86%) of the sample was post-menopausal. Disease stage at diagnosis was predominantly stage I (45%) or stage III (31%), while the mean time since diagnosis was 60.8 (SD: 39.3) months. About half of the sample had lymphedema (48%) and breast cancer on their dominant side (50%). Therapies received in the sample were chemotherapy (n=224) and/or radiotherapy (n=229). Hand grip strength was assessed with the participants seated with their elbows at 90 degrees flexion. Three maximal contractions were performed with each hand, alternating between hands, with a 1-minute rest in between. A warm-up session for the One-Repetition Maximum barbell bench press was performed with 4 to 6 repetitions using a 2.6kg weight. Weight was progressively increased based on the participant’s self-reported rating of difficulty until the participant indicated a maximal effort, was unable to lift with proper biomechanics, was unwilling or unable to attempt to lift more, or reported a problem that required termination of the test.
Schmidt (2013) examined the criterion validity, reliability and measurement error of the Six-Minute Walking Test. Patients were included when they had a histologically confirmed cancer (or recurrence), had ongoing or recently finished chemotherapy, radiotherapy, and/or hormone therapy, were between 18-75 years old, and had an ECOG status of 0-2. Patients were excluded when they had brain or bone metastases, had a hemoglobin concentration less than 8g/dl, had chronic infection, had significant respiratory or cardiac disease, had any medical condition that limited participation, or had orthopedic or neurologic conditions that influenced successful completion of exercise tests. Fifty patients were recruited (n=36 females) with a mean age of 57.4 (SD: 10.2) years. Mean BMI was 25.3 (SD: 4.2) and median time since diagnosis was 12 months. The most common cancer sites were breast (38%), colorectal (26%), and prostate (10%). Disease status was either advanced (n=14) or there was a curative option (n=36). The Six-Minute Walking Test was performed according to the American Thoracic Society’s guideline using a 30-meter course in a corridor. Patients were instructed to walk back and forth and stopped when six minutes had elapsed. The test was repeated by a subsample (n=30) within 2 to 7 days after the first test, at the same time of day. CPET was performed on a cycle ergometer. The test started at 0W and the load was increased by 25W every 3 minutes until exhaustion. Criteria for exhaustion were symptom limitations or the inability to keep pedaling at 60RPM.
Schumacher (2018) examined the construct validity of the Six-Minute Walking Test in cancer survivors. Patients with a wide range of cancers were included. Patients were excluded when they had a history of congestive heart failure, had a myocardial infarction, had a chronic lung disease (including asthma), had significant ambulatory problems, or had a history of hemoptysis, fainting, or epilepsy. The sample consisted of 187 patients (115 females) with a mean age of 61 years (SD: 13). During the study, 31% of the participants were undergoing (chemo)radiation. Participants completed a treadmill protocol of twenty-one one-minute stages. Both speed and inclination could increase with each stage. Patients could terminate the test at any time but were encouraged to perform maximally. Tests were terminated when patients indicated that they reached their maximum performance or when they grabbed onto the handrails. Gas analyses were not used. Instead, a prediction model was used to estimate the VO2peak from the treadmill test. Patients furthermore completed a single Six-Minute Walking Test. The treadmill test and the Six-Minute Walking Test were performed one week apart in randomized order.
Sebio-Garcia (2017) examined the reliability and measurement error of the Six-Minute Walking Test in patients with cancer awaiting surgery. Patients were included when they were undergoing major oncological surgery, had a high risk for post-operative complications, and had severe deconditioning. Patients were excluded when they were unable to complete tests and questionnaires in a prehabilitation program, were scheduled for surgery within 3 weeks, or had severe musculoskeletal, neurological, or cognitive impairments. The sample consisted of 170 patients (114 males) with a mean age of 71.1 (SD 14.9) years and a mean Body Mass Index of 26.5 (SD: 4.8). ASA score was either I (n=69), II (n=84), III (n=13), or IV (n=4). Planned type of surgery was colorectal (n=65), upper gastrointestinal (n=36), pancreatic (n=8), urologic (n=44), gynecologic (n=7), cytoreductive (n=3), or other (n=7). Most patients did not receive neoadjuvant treatment (n=112), while others received chemotherapy (n=37), radiotherapy (n=17), or chemoradiotherapy (n=17). Only 2.9% of the sample scored a 0 on the Charlson Comorbidity Index; the others ranged from 1 to 5, with the highest proportion scoring 1 (44.9%). The Six-Minute Walking Test was performed following the American Thoracic Society/European Respiratory Society’s guideline using a 30-meter course in a corridor. Patients received 30 minutes of rest before starting the second trial, which was assessed by the same rater.
Stuiver (2017) assessed the criterion validity of the Steep Ramp Test in cancer survivors. The study used patient data recorded in two randomized controlled trials. Data of 283 patients were used (68 males), with a mean age of 53 years (SD: 11). Types of cancer in the sample were breast cancer (n=162), colon cancer (n=49), lymphomas (n=56), ovarian cancer (n=8), cervix cancer (n=4), and testis cancer (n=4). Patients received chemotherapy (n=283), radiotherapy (n=123), surgery (n=227), stem cell transplantation (n=32), immunotherapy (n=51), and/or hormone therapy (n=114). For validation, the baseline CPET and the participant’s first Steep Ramp Test were used, provided that the Steep Ramp Test took place within 30 days after the CPET and no exercise training was started in this interval (median interval: 8 days, IQR: 6-10). Both tests were performed on a cycle ergometer. The CPET used an incremental protocol adjusted to each participant, aiming for a maximal performance within 8-12 minutes. Patients cycled between 60-80RPM and received encouragement. The test was terminated when the patient was exhausted or unable to maintain 60-80RPM. The Steep Ramp Test had a four-minute warm-up period at 10W. The starting workload was 25W followed by an increase of 25W every 10 seconds. The Steep Ramp Test was terminated when the pedal frequency dropped below 60RPM.
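The Steep Ramp Test loading scheme described for Stuiver (2017), a starting workload of 25W increased by 25W every 10 seconds, can be written as a small function. This is a minimal sketch rather than the study's implementation: it assumes the load rises in discrete steps at the end of each 10-second interval (an ergometer may ramp continuously), that time is counted from the end of the warm-up, and the function name is hypothetical.

```python
def steep_ramp_workload(
    seconds_since_start: float,
    start_watts: int = 25,
    step_watts: int = 25,
    step_seconds: int = 10,
) -> int:
    """Workload in watts at a given time after the warm-up has ended.

    Assumption: the load is raised in discrete steps once every
    `step_seconds`; the warm-up phase is not modelled here.
    """
    if seconds_since_start < 0:
        raise ValueError("seconds_since_start must be non-negative")
    completed_steps = int(seconds_since_start // step_seconds)
    return start_watts + completed_steps * step_watts


# Workload at the start, after the first increment, and after one minute.
for t in (0, 10, 60):
    print(t, steep_ramp_workload(t))
```

Under these assumptions, a patient who keeps cycling for one minute after the warm-up reaches 175W, which illustrates how quickly the protocol leads to exhaustion compared with a CPET aimed at 8 to 12 minutes.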
Trutschnigg (2008) examined the reliability and measurement error of handgrip dynamometry. Patients aged between 18 and 36 years old with recently diagnosed (up to 6 months) advanced non-small cell lung cancer or gastrointestinal cancer were included. Patients (n=70, 27 females) had a mean age of 61.5 years (SD: 13.2) and a mean Body Mass Index of 24.4 (SD: 4.9). The Jamar dynamometer and the Biodex System 3 (with handgrip attachment) were used for measurements of handgrip strength. Participants performed one or two familiarization trials on both instruments. For both instruments, three consecutive maximal contractions (lasting 3 seconds) were performed for the dominant hand with a 1-minute break between instruments. Participants were tested twice and the mean of three trials was used as the outcome. Testing position of the participants was standardized. Patients had their elbow flexed at 90 degrees and the wrist at 0 degrees for the Biodex. Patients had their elbow flexed at 90 degrees and their arm rested on the armrest. The non-dominant arm rested neutrally and both feet were firmly on the ground at shoulder’s width.
Tsuji (2022) examined the criterion validity of the Six-Minute Walking Test. The construct validity of the One-Repetition Maximum (leg press) was also assessed, as was the construct validity of hand grip strength (along with other predictors) for predicting the One-Repetition Maximum. Patients were included when they were females aged between 20 to 59 years at diagnosis, were diagnosed with stage I-IIa invasive breast cancer, were within 2-13 months after surgery, had received no chemotherapy in addition to hormone therapy or radiotherapy, and did not exercise more than 30 minutes at moderate intensity on two weekdays. Patients were excluded when exercise was considered too risky, when they had a smoking history in the past year, had a Body Mass Index of ≥30, were judged unfit for participation for other reasons, or were administered beta-adrenergic blocking agents. The sample (n=50) had a mean age of 48 years (SD: 6) and a mean Body Mass Index of 21.0 (SD: 2.1). Breast cancer stage was either stage I (n=36, 72%) or stage IIa (n=14, 28%). The tumor was estrogen receptor positive (n=49), progesterone receptor positive (n=48), and/or HER2 positive (n=1). Forty-seven patients received hormone therapy and twenty-three patients received radiotherapy. Mean time since surgery was 11 months (SD: 22). A CPET was performed on a cycle ergometer using an incremental multistage protocol. The test commenced at 29.4W and the load was increased every minute by 14.7W. Patients performed the test until exhaustion, defined as the pedal frequency dropping below 55RPM for the third time. The Six-Minute Walking Test was performed according to the American Thoracic Society on a 30-meter course in a corridor. Participants were instructed to walk as far as possible at their own pace for 6 minutes. The One-Repetition Maximum was performed on a leg press. Patients performed a 10-repetition warm-up at 20-30 kilograms and rested for 2-3 minutes thereafter. Weight was increased (10-20%) in single trials, with 1-5 minutes of rest in between, until the One-Repetition Maximum was achieved. Grip strength was measured twice for both the left and right hand, with measures alternating between hands. The test was performed while standing up straight with the arms in neutral position.
Van Hinte (2020) examined the reliability and measurement error of handgrip dynamometry, the 30-second Chair-Stand Test, the Timed Up and Go test, and the Six-Minute Walking Test in survivors of head and neck cancer. Patients were included when they were survivors of head and neck cancer, had completed medical treatment, were 18 or over, and were able to walk unaided. Patients were excluded when they were not able to speak or understand Dutch, were receiving palliative care, or were at risk when performing physical measurements. Fifty patients were recruited (22 females) with a mean age of 68.6 years (SD: 9.9) and a median Body Mass Index of 25.0 (IQR: 23.5-26.7). Smoking status was ‘current’ (n=4), ‘never’ (n=7), or ‘ex-smoker’ (n=39). Patients had a median of 3.0 years (IQR: 1.0-5.25) since cancer treatment. Tumor location was the oral cavity (n=28), nasopharynx (n=1), oropharynx (n=2), larynx (n=12), or other (n=7). Patients received surgery (n=19), surgery and radiotherapy (n=18), radiotherapy (n=4), surgery with radiotherapy and chemotherapy (n=7), or radiotherapy and chemotherapy (n=2). Twenty-eight patients received a neck dissection, performed unilaterally (n=22) or bilaterally (n=6). The interval between the two test trials was at least one hour and at most two hours. A single test trial for the 30-second Chair-Stand Test and the Six-Minute Walking Test each consisted of one measurement. The handgrip strength and Timed Up and Go test were performed three times in each test trial, and the best score was used. Test and re-test assessments were performed by the same rater.
Weemaes (2021) assessed the criterion validity and responsiveness of the Steep Ramp Test in cancer survivors. Patients were included when they had completed active medical treatment, were experiencing physical and psychosocial problems (as identified by a sports physician, psychologist, and occupational therapist), had completed a CPET and Steep Ramp Test before participation in the rehabilitation programme, and gave consent to use their usual care data. Patients were excluded when they were unable to cycle until exhaustion during the tests. The sample consisted of 106 patients (28 males) with a mean age of 56.6 years (SD: 11.0) and a mean Body Mass Index of 27.5 (SD: 4.8). Cancer type in the sample was breast cancer (n=51), colorectal cancer (n=9), lung cancer (n=7), lymphoma (n=6), prostate cancer (n=4), or other types (n=29). Metastases were present in part of the sample: lymphatic (n=17), hepatic (n=5), skeletal (n=4), other (n=3). Patients had received surgery (n=80), chemotherapy (n=62), radiotherapy (n=55), hormone therapy (n=32), immunotherapy (n=11), and/or stem cell transplantation (n=4). Participants performed a CPET and a Steep Ramp Test (2 to 7 days apart) before the start of an exercise program and after 10 weeks of training. For the CPET, patients started with an unloaded 3-minute warm-up period. The increase in workload was adjusted to the patient, aiming at reaching maximal effort within 8 to 12 minutes. Patients needed to keep pedaling at a minimum of 60RPM until exhaustion. The test was terminated when the patient stopped or dropped below 60RPM. The Steep Ramp Test was performed on a cycle ergometer and patients started with a 3-minute warm-up at 25W. Thereafter, the load increased by 25W every 10 seconds. The test was terminated when the pedaling frequency dropped below 60RPM or when voluntary exhaustion was reached with clinical signs of intense effort.
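The Steep Ramp Test loading scheme described for Weemaes (2021) can be written down as a small function. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes that the first 25W increment is applied at the moment the warm-up ends, which the study description does not specify.

```python
WARMUP_SECONDS = 180  # 3-minute warm-up (from the study description)
WARMUP_WATTS = 25     # warm-up load in watts (from the study description)
STEP_WATTS = 25       # size of each increment (from the study description)
STEP_SECONDS = 10     # time between increments (from the study description)


def srt_workload(elapsed_seconds: float) -> int:
    """Return the load in watts at a given time since the start of the warm-up.

    Assumption (not stated in the source): the first increment is applied
    exactly when the warm-up ends, so the load is 50 W from 180 s onwards.
    """
    if elapsed_seconds < 0:
        raise ValueError("elapsed_seconds must be non-negative")
    if elapsed_seconds < WARMUP_SECONDS:
        return WARMUP_WATTS
    completed_steps = int((elapsed_seconds - WARMUP_SECONDS) // STEP_SECONDS)
    return WARMUP_WATTS + STEP_WATTS * (completed_steps + 1)


if __name__ == "__main__":
    for t in (0, 179, 180, 190, 240):
        print(t, "s ->", srt_workload(t), "W")
```

Under this assumption a patient who stops 60 seconds after the warm-up would be cycling at 200W. If the first increment were instead applied 10 seconds after the warm-up, every value after 180 seconds would be 25W lower.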
Willcock (2018) examined the measurement error of the Incremental Shuttle Walking Test in patients with thoracic cancer. Patients were included when they had incurable thoracic cancer and an ECOG status of 0-2 (reporting limitations in undertaking daily activities). Patients were excluded when they received chemotherapy or radiotherapy within the prior four weeks or when their symptoms were related to a palliative intervention. The sample (n=41) had a mean age of 64 years (SD: 8) and a median ECOG status of 1 (range: 0-2). Cancer types observed in the sample were non-small cell lung cancer (n=26), small cell lung cancer (n=4), and mesothelioma (n=11). The Incremental Shuttle Walking Test was performed twice at the same time of day on consecutive days. A 10-meter course was used and walking speed was externally paced. Patients were instructed to walk and not to run. Each minute the walking speed increased until patients reached their symptom-limited maximum (by breathlessness and leg fatigue). Patients were advised to wear comfortable shoes and take their usual medication. On the first day of testing, a familiarization trial and a test trial were performed, with one hour of rest in between. The second test day again consisted of two trials.
Win (2006) assessed the criterion validity of the Incremental Shuttle Walking Test in patients with operable lung cancer. Consecutive patients with potentially resectable lung cancer were recruited. Patients were excluded when they had unstable angina, had a myocardial infarction in the prior 6 weeks, or had disorders that physically influenced exercise performance. Patients (n=125) had a mean age of 68.8 years (SD: 7.7, range: 42-85). Thirty-three percent of the sample had an FEV1 under 1.5 liters (for lobectomy) and under 2.0 liters (for pneumonectomy). For the Incremental Shuttle Walking Test the participants walked back and forth on a 10-meter course. Walking speed was externally paced by a cassette playing signals. The test was terminated when the patient could no longer keep up with the required speed or became too breathless to continue. The CPET was performed on a treadmill using the Standardized Exponential Exercise Protocol. Every minute the workload increased by 15% through an increase in speed or inclination, for a maximum of 20 minutes (including baseline and recovery). The test proceeded until patients were sign or symptom limited. The Incremental Shuttle Walking Test and treadmill CPET were performed on the same day with at least four hours in between. Patients were fully familiarized with both tests before performing the tests.
Results
Validity of instruments for muscle strength
See Table Validity results regarding the tests of interest intending to measure muscle strength for a summary of the results found for the validity of the One-Repetition Maximum, handgrip dynamometry, Thirty-Second Chair-Stand Test, Five-Times Sit to Stand test, Timed-Up and Go test, Six-Minute Walking Test, and the Incremental Shuttle Walk Test to assess muscle strength.
No information was found for the Stair Climb Test, Steep Ramp Test, Physician-based Assessment and Counseling for Exercise questionnaire, Fitmáx, Veteran-Specific Activity Questionnaire, and Duke Activity Status Index, in the population of interest.
One repetition maximum (1RM)
Tsuji (2020) reported data about the construct validity (hypotheses testing) of the 1RM on a leg press.
Handgrip Dynamometry
Information about the construct validity (hypotheses testing) of handgrip dynamometry to assess muscle strength was reported by Rogers (2017) and Tsuji (2022). Rogers (2017) correlated several different measures of handgrip strength to the performance on a 1RM barbell bench press. Tsuji (2022) used a prediction equation including hand grip strength, among others, as predictors for the 1RM on a leg press.
Thirty-second Chair-Stand Test (30sCST)
Blackwood (2021) correlated the 30sCST with the Timed Up and Go test for construct validity (hypotheses testing).
Five-Times Sit-To-Stand test (5TSTS)
Blackwood (2021) correlated the 5TSTS with the Timed Up and Go test for construct validity (hypotheses testing).
Timed Up and Go test (TUG)
Blackwood (2021) correlated both the 30sCST and the 5TSTS with the TUG test for construct validity (hypotheses testing).
Six-Minute Walking Test (6MWT)
Eden (2018) correlated the walking distance from the 6MWT to the number of repetitions from the 30sCST for construct validity (hypotheses testing).
Incremental shuttle walking test (ISWT)
England (2014) correlated the ISWT with inspiratory muscle strength and peripheral muscle strength (leg extension power) for construct validity (hypotheses testing).
Table Validity results regarding the tests of interest intending to measure muscle strength
Instrument |
Measurement property |
Test parameter |
Author (year) |
Result |
Risk of bias assessment* |
Individual outcome assessment |
One repetition maximum (1RM) |
Validity (hypotheses testing) |
Kilogram |
Tsuji (2020) |
1RM leg press correlation with Chair Stand Test (seconds required to stand up 10 times [combined strength for lower limbs]): r = -0.381 [95%CI -0.594 to -0.111, p=0.006] |
Inadequate |
– |
Handgrip dynamometry |
Validity (hypotheses testing) |
Kilograms |
Rogers (2017) |
Average isometric handgrip dynamometry correlation with 1RM barbell bench press (kilogram): Mean dominant hand: 24.1 (SD: 6.4), r = 0.359, p<0.001
Max dominant hand: 25.6 (SD: 6.5), r = 0.363, p<0.001
Mean both hands: 23.5 (SD: 5.8), r = 0.399, p<0.001
Max both hands: 26.6 (SD: 6.3), r = 0.369, p<0.001
Mean hand of non-breast cancer side hand: 23.8 (SD: 6.0), r = 0.350, p<0.001
Max hand non-breast cancer side: 25.5 (SD: 6.6), r = 0.295, p<0.001 |
Inadequate |
– |
|
|
Predicted kilograms (for 1RM) |
Tsuji (2022) |
Correlation of predicted kilograms (from hand grip strength and other predictors) with observed 1RM legpress:
Equation 1 (Leg press 1RM = 542.295 – 1.065 x Age(years) – 3.595 x Height(centimeters) – 2.672 x Weight(kilograms) + 3.179 x Grip strength(kilograms) – 2.700 x Chair stand test(seconds)): r = 0.754 [95%CI 0.597-0.851], p<0.001, mean difference = -0.05 (SD: 20.9), 95%LoA = -40.92 to 41.02 |
Inadequate |
? |
Thirty-second Chair-Stand Test (30sCST) |
Validity (hypotheses testing) |
Repetitions |
Blackwood (2021) |
Correlation with the Timed Up and Go test (seconds): r = -0.69 |
Doubtful |
? |
Five-times sit to stand test (5TSTS) |
Validity (hypotheses testing) |
Seconds |
Blackwood (2021) |
Correlation with the Timed Up and Go test: r = 0.53 |
Doubtful |
? |
Stair Climbing Test (SCT) |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Timed Up and Go (TUG) Test |
Validity (hypotheses testing) |
Seconds |
Blackwood (2021) |
Correlation with 5TSTS test (seconds): r = 0.53
Correlation with 30sCST (repetitions): r = -0.69 |
Doubtful |
? |
Six-Minute walking test (6MWT) |
Validity (hypotheses testing) |
Meters |
Eden (2018) |
Correlation with the 30-second Chair Stand test (30sCST, repetitions): r = 0.407 |
Adequate |
? |
Incremental Shuttle Walking Test (ISWT) |
Validity (hypotheses testing) |
Meters |
England (2014) |
Correlation with the Sniff Nasal Inspiratory Pressure (inspiratory muscle strength): r = 0.42
Correlation with the leg extension power (peripheral muscle power): r = 0.39 |
Doubtful |
? |
Steep Ramp Test (SRT) |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Physician-based Assessment and Counseling for Exercise (PACE) Questionnaire |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Fitmáx Questionnaire |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Veterans-Specific Activity Questionnaire (VSAQ) |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Duke Activity Status Index (DASI) |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
*Based on the 4-point COSMIN risk of bias tool using the lowest score counts method. **Includes measurement error derived from intra-rater (or unclear) test-retest designs 5TSTS: 5 times sit to stand, 6MWT: six minute walk test, 30sCST: 30 second chair stand test, 10mWT: 10 meter walk test, CI: Confidence Interval, CPET: cardiopulmonary exercise testing, CST: chair stand test, ICC: Intraclass correlation coefficient, CV: Coefficient of variation, DASI: Duke Activity Status Index, ESWT: endurance shuttle walk test, ISWT: incremental shuttle walk test, LOA: limits of agreement, m: meter, MDC95: minimal detectable change at 95%, PACE: Physician-based Assessment and Counseling for Exercise, SCT: stair climbing test, SD: standard deviation, SEM: standard error of measurement, SRT: steep ramp test, TUG: timed up and go test, VSAQ: Veterans-Specific Activity questionnaire, Wmax: maximum work capacity, VO2peak: peak oxygen uptake, VO2max: maximum oxygen uptake |
Validity of instruments for cardiorespiratory fitness
See Table Validity results regarding the tests of interest intending to measure cardiorespiratory fitness for a summary of the results found for the Thirty-Second Chair-Stand Test, Stair Climb Test, Six-Minute Walking Test, Incremental Shuttle Walk Test, Steep Ramp Test, and the Duke Activity Status Index.
No information was found for the One-Repetition Maximum, handgrip dynamometry, Five-Times Sit to Stand test, Timed-Up and Go test, Physician-based Assessment and Counseling for Exercise questionnaire, Fitmáx questionnaire, and the Veteran-Specific Activity Questionnaire in the population of interest.
Thirty-second Chair-stand test (30sCST)
Blackwood (2021) and Eden (2018) examined the construct validity of the 30sCST by correlating the test to the Timed-up and Go test and Six-Minute Walking Test, respectively.
Stair climbing test (SCT)
Koegelenberg (2020) examined the criterion validity by correlating the average speed of ascent in the SCT to the VO2peak as measured by a CPET.
Six-minute walking test (6MWT)
Granger (2015), Schmidt (2013), and Tsuji (2022) examined the criterion validity by correlating the walking distance to the VO2peak as assessed by CPET. Tsuji (2022) also used the 6MWT walking distance as one of several predictors in a prediction equation to estimate the VO2peak. For construct validity, Eden (2018) correlated the walking distance to the Thirty-Second Chair Stand Test and Granger (2015) correlated it to the Incremental Shuttle Walking Test. Schumacher (2018) entered walking distance as a predictor in several prediction models and correlated the predicted VO2peak with a reference VO2peak that was itself estimated from a treadmill walking test using a prediction equation.
Incremental shuttle walking test (ISWT)
Granger (2015) and Win (2006) assessed the criterion validity of the ISWT by correlating the test with the VO2peak obtained from a CPET. Granger (2015) furthermore assessed the construct validity (hypotheses testing) by correlating the ISWT with the Six-Minute Walking Test.
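The confidence intervals reported alongside these correlation coefficients can be approximated with the Fisher z-transformation. The sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, it is not known whether the study authors used this method, and the sample size of 125 for Win (2006) is taken from the study description on the assumption that all patients were analyzed.

```python
import math
from typing import Tuple


def pearson_ci(r: float, n: int, z_crit: float = 1.959964) -> Tuple[float, float]:
    """Approximate two-sided confidence interval for a Pearson correlation.

    Uses the Fisher z-transformation: z = atanh(r), with standard error
    1 / sqrt(n - 3). The default z_crit corresponds to a 95% interval.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


if __name__ == "__main__":
    # Win (2006): r = 0.67 between ISWT distance and VO2peak, assumed n = 125
    lower, upper = pearson_ci(0.67, 125)
    print(round(lower, 2), round(upper, 2))
```

For r = 0.67 and n = 125 this approximation gives an interval of roughly 0.56 to 0.76, which agrees with the interval reported for Win (2006).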
Steep ramp test (SRT)
De Backer (2007) and Weemaes (2021) correlated the Wmax from the Steep Ramp Test to the VO2max or VO2peak from a CPET, respectively. Stuiver (2017) used prediction equations to estimate the VO2peak obtained from a CPET, using the workload of the last SRT stage plus 2.5W (i.e. maximal short exercise capacity) as a predictor in these equations.
Duke activity status index (DASI)
Li (2018) used several prediction models to estimate the VO2 obtained by CPET using the DASI score or specific DASI questions as predictors.
Table Validity results regarding the tests of interest intending to measure cardiorespiratory fitness
Instrument |
Measurement property |
Test parameter |
Author (year) |
Result |
Risk of bias assessment* |
Individual outcome assessment |
One repetition maximum (1RM) |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Handgrip dynamometry |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Thirty-second Chair-Stand Test (30sCST) |
Validity (hypotheses testing) |
Repetitions |
Eden (2018) |
Correlation with the 6MWT (meters): r = 0.407 |
Doubtful |
? |
Five-times sit to stand test (5TSTS) |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Stair Climbing Test (SCT) |
Validity (criterion validity) |
Average speed of ascent |
Koegelenberg (2020) |
Correlation with VO2peak from CPET: r = 0.77 |
Very good |
+ |
Timed Up and Go (TUG) Test |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Six-Minute walking test (6MWT) |
Validity (criterion validity) |
Meters |
Granger (2015) |
Correlation with VO2peak from CPET: r = 0.24 [95%CI -0.12-0.69] |
Very good |
– |
|
|
|
Schmidt (2013) |
Correlation with VO2peak from CPET: r = 0.67 |
Very good |
– |
|
|
|
Tsuji (2022) |
Correlation with VO2peak from CPET: r =0.326 [95%CI 0.048-0.551, p=0.021] |
Very good |
– |
|
|
Predicted VO2peak |
Tsuji (2022) |
Correlation with VO2peak from CPET: VO2peak = 35.13 + 0.028 x 6MWT distance(meters) – 0.101 x Weight(kilograms) – 0.101 x Age(years): r=0.463 [95%CI 0.206-0.653, p<0.001]
Bland-Altman plot: Mean difference= -0.15 95%LoA = -6.48 to 6.19 mL/kg/min |
Very good |
– |
|
Validity (hypotheses testing) |
Meters |
Granger (2015) |
Correlation with Incremental Shuttle Walking Test (ISWT): r = 0.80 [95%CI 0.64-0.93] |
Inadequate |
? |
|
|
Predicted VO2peak |
Schumacher (2018) |
Correlation with VO2peak from a prediction model using a treadmill walking test: Equation 1 (VO2peak = 0.03 x 6MWT distance(meters) + 3.98): r = 0.81
Equation 2 (VO2peak = 0.02 x 6MWT distance(meters) – 0.191 x Age(years) – 0.06 x Weight(kilograms) + 0.09 x Height(centimeters) + 0.26 x (Rate Pressure Product x 10^-3) + 0.10): r = 0.76
Equation 3 (VO2peak = 0.02 x 6MWT distance(meters) – 0.14 x Age(years) – 0.07 x Weight(kilograms) + 0.03 x Height(centimeters) + 0.23 x (Rate Pressure Product x 10^-3) + 0.10 x Forced Expiratory Volume in 1 second(liters) – 1.19 x Forced Vital Capacity(liters) + 7.77): r = 0.59
Equation 4 (VO2peak = 4.948 + 0.023 x 6MWT distance(meters)): r = 0.76 |
Doubtful |
+ (Equation 1) + (Equation 2) – (Equation 3) + (Equation 4) |
Incremental Shuttle Walking Test (ISWT) |
Validity (criterion validity) |
Meters |
Granger (2015) |
Correlation with VO2peak from CPET: r = 0.61 [95%CI 0.20-0.88] |
Very good |
– |
|
|
|
Win (2006) |
Correlation with VO2peak from CPET using a treadmill protocol: r = 0.67 [95%CI 0.56-0.76, p<0.001] |
Very good |
– |
|
Validity (hypotheses testing) |
Meters |
Granger (2015) |
Correlation with the 6MWT (meters): r = 0.80 [95%CI 0.64-0.93] |
Doubtful |
? |
Steep Ramp Test (SRT) |
Validity (criterion validity) |
Wmax |
De Backer (2007) |
Correlation with VO2max from CPET: r = 0.82 [95%CI 0.67-0.90] |
Very good |
+ |
|
|
|
Weemaes (2021) |
Correlation with VO2peak from CPET: r = 0.86 [95%CI 0.80-0.90] |
Very good |
+ |
|
|
Predicted VO2peak |
Stuiver (2017) |
Correlation with VO2peak from CPET: Equation 1 from literature (Predicted VO2peak = 356.7 + 6.7 x Maximum Short Exercise Capacity): ICC = 0.61 [95%CI 0.41-0.74] 95%LoA = ±705 ml/min
Equation 2 extension (model includes MSEC, Weight, Sex): r2 = 0.58 (r = 0.76), ICC = 0.73 [95%CI 0.67-0.78] 95%LoA = ±608 ml/min |
Very good |
– (Equation 1)
+ (Equation 2) |
Physician-based Assessment and Counseling for Exercise (PACE) Questionnaire |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Fitmáx Questionnaire |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Veterans-Specific Activity Questionnaire (VSAQ) |
No studies were included that reported the validity of this instrument in the population of interest |
|||||
Duke Activity Status index (DASI) |
Validity (criterion validity) |
DASI predicted VO2 |
Li (2018) |
Correlation with VO2 from CPET: Initial equation (DASI predicted VO2 in mL/kg/min = 0.43 x DASI(score) + 9.6): adjusted r2 = 0.20, Mean difference (sample) = 8.0, 95%LoA (sample) = -3.4 to 19.5, Mean difference (recent chemotherapy) = 8.5, 95%LoA (recent chemotherapy) = -3.5 to 20.5
Revised equation 1 (DASI predicted VO2 in mL/kg/min = 0.115 x DASI(score) + 13.3): adjusted r2 = 0.20
Revised equation 2 (DASI predicted VO2 in mL/kg/min = 0.11 x DASI(score) + 1.93 x Male + 12.4): adjusted r2 = 0.25
Revised raw data equation (DASI predicted VO2 in mL/kg/min = 10.294 + 2.45 x Male + 0.203 x Question5 + 1.417 x Question6 + 0.574 x Question10): adjusted r2 = 0.37 |
Very good |
– |
*Based on the 4-point COSMIN risk of bias tool using the lowest score counts method. **Includes measurement error derived from intra-rater (or unclear) test-retest designs 5TSTS: 5 times sit to stand, 6MWT: six minute walk test, 30sCST: 30 second chair stand test, 10mWT: 10 meter walk test, CI: Confidence Interval, CPET: cardiopulmonary exercise testing, CST: chair stand test, ICC: Intraclass correlation coefficient, CV: Coefficient of variation, DASI: Duke Activity Status Index, ESWT: endurance shuttle walk test, ISWT: incremental shuttle walk test, LOA: limits of agreement, m: meter, MDC95: minimal detectable change at 95%, PACE: Physician-based Assessment and Counseling for Exercise, SCT: stair climbing test, SD: standard deviation, SEM: standard error of measurement, SRT: steep ramp test, TUG: timed up and go test, VSAQ: Veterans-Specific Activity questionnaire, Wmax: maximum work capacity, VO2peak: peak oxygen uptake, VO2max: maximum oxygen uptake |
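The single-predictor equations in the table can be applied directly to a test result. The sketch below implements Equation 1 and Equation 4 from the Schumacher (2018) row and the initial DASI equation from the Li (2018) row, exactly as printed in the table. The function names and the example input values are hypothetical, and the output unit for the two 6MWT equations is assumed to be mL/kg/min because the table does not state it.

```python
def vo2peak_from_6mwt_eq1(distance_m: float) -> float:
    """Equation 1 in the Schumacher (2018) row: 0.03 x distance + 3.98."""
    return 0.03 * distance_m + 3.98


def vo2peak_from_6mwt_eq4(distance_m: float) -> float:
    """Equation 4 in the Schumacher (2018) row: 4.948 + 0.023 x distance."""
    return 4.948 + 0.023 * distance_m


def vo2_from_dasi_initial(dasi_score: float) -> float:
    """Initial equation in the Li (2018) row: 0.43 x DASI score + 9.6."""
    return 0.43 * dasi_score + 9.6


if __name__ == "__main__":
    distance = 500.0  # hypothetical 6MWT distance in meters
    score = 30.0      # hypothetical DASI score
    print(round(vo2peak_from_6mwt_eq1(distance), 2))
    print(round(vo2peak_from_6mwt_eq4(distance), 2))
    print(round(vo2_from_dasi_initial(score), 2))
```

The two 6MWT equations give different estimates for the same walking distance, and the table reports wide limits of agreement for the initial DASI equation, so such estimates are better read as rough indications than as substitutes for a measured VO2peak.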
Reliability
Table Reliability results regarding the instruments of interest summarizes the extracted results for reliability of the instruments of interest. No data regarding the reliability of the one-repetition maximum, stair climbing test, incremental shuttle walking test, Physician-based Assessment and Counseling for Exercise questionnaire, Fitmáx questionnaire, Veterans-Specific Activity Questionnaire, or Duke Activity Status Index were found.
Handgrip dynamometry
Two studies assessed the within-day test-retest reliability of handgrip dynamometers (Trutschnigg, 2008; Van Hinte, 2020). One study reported reliability results for both the left and right hand separately (Van Hinte, 2020).
Thirty-second chair-stand test
Two studies assessed the within-day intra-rater test-retest reliability (Aabo, 2021; Van Hinte, 2020) and found ICCs of 0.97 and 0.92, respectively. Two other studies assessed the between-day test-retest reliability (Blackwood, 2021; Eden, 2018), resulting in relatively similar ICCs of 0.89 and 0.948, respectively.
Five-times sit to stand test
Only one study reported reliability results. The intra-rater between-day test-retest reliability had an ICC of 0.86 (Blackwood, 2021).
Timed up and go test
Two studies reported reliability results. One study reported the between-day reliability, showing an ICC of 0.95 (Blackwood, 2020). The other study reported an ICC of 0.98 for within-day reliability (Van Hinte, 2020).
Six-minute walking test
Between-day reliability was assessed in two studies (Eden, 2018; Schmidt, 2013), reporting ICCs of 0.96 and 0.93, respectively. Two other studies investigated the within-day reliability and found ICCs of 0.98 and 0.97 (Sebio-Garcia, 2017; Van Hinte, 2020).
Steep ramp test
One study (De Backer, 2007) assessed the reliability of the steep ramp test (Wmax). An ICC of 0.996 was reported.
Table Reliability results regarding the instruments of interest
Instrument |
Test parameter |
Author (year) |
Rater |
Retest |
Result |
Risk of bias assessment* |
Individual outcome assessment |
One repetition maximum (1RM) |
No studies were included that reported the reliability of this instrument in the population of interest |
||||||
Handgrip dynamometry |
Lbs or Nm |
Trutschnigg (2008) |
Unclear |
Within-day |
Jamar handgrip (Lbs): r = 0.966
Biodex handgrip (Nm): r = 0.765 |
Doubtful |
? |
|
Kilogram |
Van Hinte (2020) |
Intra-rater |
Within-day |
Jamar, left hand: ICC(3,1) = 0.88 [95%CI 0.80-0.93]
Jamar, right hand: ICC(3,1) = 0.96 [95%CI 0.93-0.98] |
Adequate |
+ |
Thirty-second Chair-Stand Test (30sCST) |
Repetitions |
Aabo (2021) |
Intra-rater |
Within-day |
ICC(2,1) = 0.97 [95%CI 0.94-0.98] |
Adequate |
+ |
|
|
Blackwood (2021) |
Intra-rater |
Between-day |
ICC(2,1) = 0.89 [95%CI 0.80-0.94] |
Doubtful |
+ |
|
|
Eden (2018) |
Unclear |
Between-day |
ICC(3,1) = 0.948 [95%CI 0.648-0.983] |
Doubtful |
+ |
|
|
Van Hinte (2020) |
Intra-rater |
Within-day |
ICC(3,1) = 0.92 [95%CI 0.85-0.95] |
Doubtful |
+ |
Five-times sit to stand test (5TSTS) |
Seconds |
Blackwood (2021) |
Intra-rater |
Between-day |
ICC(2,1) = 0.86 [95%CI 0.75-0.92] |
Doubtful |
+ |
Stair Climbing Test (SCT) |
No studies were included that reported the reliability of this instrument in the population of interest |
||||||
Timed Up and Go (TUG) Test |
Seconds |
Blackwood (2020) |
Intra-rater |
Between-day |
ICC = 0.95 [95%CI 0.91-0.97] |
Doubtful |
+ |
|
|
Van Hinte (2020) |
Intra-rater |
Within-day |
ICC(3,1) = 0.98 [95%CI 0.96-0.99] |
Adequate |
+ |
Six-Minute walking test (6MWT) |
Meters |
Eden (2018) |
Unclear |
Between-day |
ICC = 0.960 [95%CI 0.910-0.981] |
Adequate |
+ |
|
|
Schmidt (2013) |
Unclear |
Between-day |
ICC(2,1) = 0.93 [95%CI 0.86-0.97, p<0.001] |
Doubtful |
+ |
|
|
Sebio-Garcia (2017) |
Intra-rater |
Within-day |
ICC = 0.98 [95%CI 0.92-0.99] CV = 4% |
Doubtful |
+ |
|
|
Van Hinte (2020) |
Intra-rater |
Within-day |
ICC(3,1) = 0.97 [95%CI 0.95-0.98] |
Doubtful |
+ |
Incremental Shuttle Walking Test (ISWT) |
No studies were included that reported the reliability of this instrument in the population of interest |
||||||
Steep Ramp Test (SRT) |
Wmax |
De Backer (2007) |
Unclear |
Unclear |
ICC = 0.996 [95%CI 0.989-0.998] |
Doubtful |
+ |
Physician-based Assessment and Counseling for Exercise (PACE) Questionnaire |
No studies were included that reported the reliability of this instrument in the population of interest |
||||||
Fitmáx Questionnaire |
No studies were included that reported the reliability of this instrument in the population of interest |
||||||
Veterans-Specific Activity Questionnaire |
No studies were included that reported the reliability of this instrument in the population of interest |
||||||
Duke Activity Status index (DASI) |
No studies were included that reported the reliability of this instrument in the population of interest |
||||||
*Based on the 4-point COSMIN risk of bias tool using the lowest score counts method. 5TSTS: 5 times sit to stand, 6MWT: six minute walk test, 30sCST: 30 second chair stand test, 10mWT: 10 meter walk test, CI: Confidence Interval, CPET: cardiopulmonary exercise testing, CST: chair stand test, ICC: Intraclass correlation coefficient, CV: Coefficient of variation, DASI: Duke Activity Status Index, ESWT: endurance shuttle walk test, ISWT: incremental shuttle walk test, LOA: limits of agreement, m: meter, MDC95: minimal detectable change at 95%, PACE: Physician-based Assessment and Counseling for Exercise, SCT: stair climbing test, SD: standard deviation, SEM: standard error of measurement, SRT: steep ramp test, TUG: timed up and go test, VSAQ: Veterans-Specific Activity questionnaire, Wmax: maximum work capacity, VO2peak: peak oxygen uptake, VO2max: maximum oxygen uptake |
Measurement error
See Table Measurement error results regarding the tests of interest for a summary of the results regarding measurement error found for handgrip dynamometry, Thirty-Second Chair-Stand Test, Five-Times Sit to Stand test, Timed-Up and Go test, Six-Minute Walking Test, and the Incremental Shuttle Walk Test.
No information was found for the measurement error of the One-Repetition Maximum, Stair Climb Test, Steep Ramp Test, Physician-based Assessment and Counseling for Exercise questionnaire, Fitmáx questionnaire, Veteran-Specific Activity Questionnaire, and the Duke Activity Status Index in the population of interest.
Handgrip dynamometry
Trutschnigg (2008) and Van Hinte (2020) reported data about the measurement error of handgrip dynamometry. The Jamar dynamometer (Trutschnigg, 2008; Van Hinte, 2020) and the Biodex System 3 (Trutschnigg, 2008) were used.
Thirty-Second Chair-Stand Test (30sCST)
Aabo (2021), Blackwood (2021), and Van Hinte (2020) provided information about the measurement error. Minimal Detectable Differences ranged from 2.6 to 3 repetitions.
Five-Times Sit to Stand test (5TSTS)
Blackwood (2021) calculated a Minimal Detectable Difference of 3.19 seconds.
Timed-Up and Go test (TUG)
Blackwood (2020) and Van Hinte (2020) calculated the measurement error of the TUG test (Minimal Detectable Differences of 2.494 and 1.54 seconds, respectively).
Six-Minute Walking Test (6MWT)
Schmidt (2013), Sebio-Garcia (2017), and Van Hinte (2020) provided information about the measurement error. Van Hinte (2020) reported a Minimal Detectable Difference of 56.67 meters.
Incremental Shuttle Walk Test (ISWT)
Booth (2001) and Willcock (2018) reported data about the measurement error of the ISWT in Bland-Altman plots.
Table Measurement error results regarding the tests of interest
Instrument |
Measurement property |
Test parameter |
Author (year) |
Result |
Risk of bias assessment* |
Individual outcome assessment |
One-Repetition Maximum |
No studies were included that reported the measurement error of this instrument in the population of interest |
|||||
Handgrip Dynamometry |
Measurement error |
Lbs or Nm |
Trutschnigg (2008) |
Jamar (Lbs): %CV = 6.30, Mean difference = -0.03
Biodex (Nm): %CV = 16.70, Mean difference = -2.47 |
Doubtful |
? |
|
|
Kilogram |
Van Hinte (2020) |
Jamar (kilogram): Left hand: Mean difference (SD) = 0.22 (6.67) 95%LoA = -12.86 to 13.30 SEM = 4.67 MDC95 = 12.96
Right hand: Mean difference (SD) = 0.52 (4.23) 95%LoA = -7.76 to 8.80 SEM = 2.98 MDC95 = 8.26 |
Adequate |
? |
Thirty-Second Chair Stand Test |
Measurement error |
Repetitions |
Aabo (2021) |
SEM = 1.0 MDC95 = 2.6 |
Adequate |
? |
|
|
|
Blackwood (2021) |
MDC95 = 3 |
Doubtful |
? |
|
|
|
Van Hinte (2020) |
Mean Difference = -0.48 (SD:1.47) 95%LoA = -3.31 to 2.35 SEM = 1.07 MDC95=2.96 |
Adequate |
? |
Five-Times Sit-To-Stand test |
Measurement error |
Seconds |
Blackwood (2021) |
MDC95 = 3.19 |
Doubtful |
? |
Stair Climbing Test |
No studies were included that reported measurement error of this instrument in the population of interest |
|||||
Timed Up and Go test |
Measurement error |
Seconds |
Blackwood (2020) |
Mean (SD)= 10.18 (4.04), SEM = 0.90 MDC95 = 2.494 |
Doubtful |
? |
|
|
|
Van Hinte (2020) |
Mean Difference = 0.05 (SD: 0.79) 95%LoA = -1.50 to 1.60 SEM = 0.55 MDC95 = 1.54 |
Adequate |
? |
Six-Minute Walking Test |
Measurement error |
Meters |
Schmidt (2013) |
Mean difference = 16.6 (SD: 29.9) 95%LoA = -43.1 to 76.4 CV = 3% |
Doubtful |
? |
|
|
|
Sebio-Garcia (2017) |
Mean difference = 19.5 95%LoA = -32.3 to 70.38 |
Doubtful |
? |
|
|
|
Van Hinte (2020) |
Mean difference = -9.50 (SD: 27.59) 95%LoA = -63.57 to 44.57 SEM = 20.45 MDC95 = 56.67 |
Adequate |
? |
Incremental Shuttle Walking Test |
Measurement error |
Meters |
Booth (2001) |
Approximated from a reported figure: Mean difference = 2 [95%CI -6 to 8]
Calculated from reported data in a table, test 2 vs. 3 (n=22): Mean difference: -11 95%LoA: -81.7 to 58.0 |
Doubtful |
? |
|
|
|
Wilcock (2018) |
Between-day: Mean difference = 16 95%LoA = -80 to 112
Within-day: Mean difference = 5 95%LoA: -78 to 88 |
Doubtful |
? |
Steep Ramp Test |
No studies were included that reported the measurement error of this instrument in the population of interest |
|||||
Physician-based Assessment and Counseling for Exercise questionnaire |
No studies were included that reported the measurement error of this instrument in the population of interest |
|||||
Fitmáx questionnaire |
No studies were included that reported the measurement error of this instrument in the population of interest |
|||||
Veterans Specific Activity Questionnaire |
No studies were included that reported the measurement error of this instrument in the population of interest |
|||||
Duke Activity Status Index |
No studies were included that reported the measurement error of this instrument in the population of interest |
|||||
*Based on the 4-point COSMIN risk of bias tool using the lowest score counts method. CI: Confidence Interval, CV: Coefficient of variation, LoA: limits of agreement, m: meter, MDC95: minimal detectable change at 95%, SD: standard deviation, SEM: standard error of measurement |
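The 95% limits of agreement and MDC95 values in the table follow from the mean difference, the standard deviation of the differences, and the SEM. The sketch below uses the conventional formulas (mean difference ± 1.96 x SD, and MDC95 = 1.96 x √2 x SEM). It is an assumption that the included studies used exactly these formulas, and the function names are hypothetical.

```python
import math
from typing import Tuple

Z_95 = 1.96  # conventional multiplier for a two-sided 95% interval


def limits_of_agreement(mean_diff: float, sd_diff: float) -> Tuple[float, float]:
    """Bland-Altman 95% limits of agreement: mean difference +/- 1.96 x SD."""
    half_width = Z_95 * sd_diff
    return mean_diff - half_width, mean_diff + half_width


def mdc95(sem: float) -> float:
    """Minimal detectable change at 95% confidence: 1.96 x sqrt(2) x SEM."""
    return Z_95 * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Timed Up and Go test, Van Hinte (2020): mean difference 0.05, SD 0.79
    lower, upper = limits_of_agreement(0.05, 0.79)
    print(round(lower, 2), round(upper, 2))
    # Handgrip dynamometry, right hand, Van Hinte (2020): SEM 2.98
    print(round(mdc95(2.98), 2))
```

With the Van Hinte (2020) inputs, these formulas reproduce the reported limits of agreement for the Timed Up and Go test (-1.50 to 1.60) and the reported MDC95 for the right hand (8.26). For some other rows the result differs in the second decimal, which is most likely due to rounding of the published inputs.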
Responsiveness
Only one study reported data about the responsiveness of an instrument. Weemaes (2021) assessed the responsiveness of the Steep Ramp Test. Patients received a 10-week exercise intervention. The before-after change scores of the steep ramp test (Wmax) were correlated to the change scores on the CPET (r=0.51). Using a cutoff of ≥6% increase, the steep ramp test had an area under the curve (AUC) of 0.74 (95%CI: 0.60-0.87), a sensitivity of 70.7%, a specificity of 66.7%, a positive predictive value of 82.9%, and a negative predictive value of 50.5% (Weemaes, 2021).
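Sensitivity, specificity, and the predictive values all derive from a two-by-two table that cross-classifies the steep ramp test result (change at or above the cutoff, yes or no) against the reference result on the CPET. The sketch below shows the arithmetic. The counts are hypothetical, chosen only to give values of a similar order of magnitude, and are not taken from Weemaes (2021). The function name is also hypothetical.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute diagnostic accuracy measures from a two-by-two table.

    tp: improved on the reference test and at/above the cutoff on the index test
    fp: not improved on the reference test but at/above the cutoff
    fn: improved on the reference test but below the cutoff
    tn: not improved on the reference test and below the cutoff
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("each row and column total must be greater than zero")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not study data)
    metrics = classification_metrics(tp=29, fp=6, fn=12, tn=12)
    for name, value in metrics.items():
        print(name, round(value, 3))
```

The predictive values depend on the proportion of patients who truly improved, so they would shift in a population with a different response rate to training even if sensitivity and specificity stayed the same.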
Level of evidence of the literature
See Table Summary of GRADE assessments for each instrument per measurement property for a summary of the GRADE assessment of each measurement instrument per measurement property of interest.
Table Summary of GRADE assessments for each instrument per measurement property
Instrument |
Validity (muscle strength) |
Validity (oxygen intake) |
Reliability |
Measurement error |
Responsiveness |
One Repetition Maximum |
VERY LOW (construct validity) |
NO GRADE |
NO GRADE |
NO GRADE |
NO GRADE |
Handgrip dynamometry |
VERY LOW (construct validity for kilograms)
VERY LOW (construct validity for predicted 1RM kilograms) |
NO GRADE |
LOW |
LOW |
NO GRADE |
Thirty-Second Chair Stand Test |
VERY LOW (construct validity) |
VERY LOW (construct validity) |
MODERATE |
MODERATE |
NO GRADE |
Five-Time Sit to Stand Test |
VERY LOW (construct validity) |
NO GRADE |
VERY LOW |
VERY LOW |
NO GRADE |
Stair Climbing Test |
NO GRADE |
LOW |
NO GRADE |
NO GRADE |
NO GRADE |
Timed Up and Go test |
VERY LOW (construct validity) |
NO GRADE |
LOW |
LOW |
NO GRADE |
Six-Minute Walking Test |
VERY LOW (construct validity) |
MODERATE (criterion validity for walking distance)
LOW (criterion validity for predicted VO2peak) |
MODERATE |
MODERATE |
NO GRADE |
Incremental Shuttle Walking Test |
VERY LOW (construct validity) |
HIGH |
NO GRADE |
LOW |
NO GRADE |
Steep Ramp Test |
NO GRADE |
HIGH (criterion validity for Wmax)
MODERATE (criterion validity for predicted VO2peak) |
VERY LOW |
NO GRADE |
MODERATE |
Physician-based Assessment and Counceling for Exercise questionnaire |
NO GRADE |
NO GRADE |
NO GRADE |
NO GRADE |
NO GRADE |
Fitmáx questionnaire |
NO GRADE |
NO GRADE |
NO GRADE |
NO GRADE |
NO GRADE |
Veterans-Specific Activity Questionnaire |
NO GRADE |
NO GRADE |
NO GRADE |
NO GRADE |
NO GRADE |
Duke Activity Status Index |
NO GRADE |
LOW |
NO GRADE |
NO GRADE |
NO GRADE |
One Repetition Maximum
The level of evidence regarding the outcome measure construct validity (muscle strength) was downgraded by 4 levels because of study limitations (3 levels for risk of bias: only one study which was judged inadequate); number of included patients (1 level for imprecision: n=50 in sample); publication bias was not assessed.
The level of evidence regarding the outcome measures validity (cardiorespiratory fitness), measurement error, reliability, and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties.
Handgrip dynamometry
The level of evidence regarding the outcome measure construct validity (kilograms for muscle strength) was downgraded by 3 levels because of study limitations (3 levels for risk of bias: only one study judged to have inadequate quality); publication bias was not assessed.
The level of evidence regarding the outcome measure construct validity (predicted kilograms for muscle strength) was downgraded by 4 levels because of study limitations (3 levels for risk of bias); conflicting results (inconsistency); applicability (bias due to indirectness); number of included patients (1 level for imprecision: n=50 in body of evidence); publication bias was not assessed.
The level of evidence regarding the outcome measure reliability was downgraded by 2 levels because of study limitations (1 level for risk of bias: one study judged as doubtful and one as adequate); number of included patients (1 level for imprecision: between 50 and 100 patients in the total body of evidence); publication bias was not assessed.
The level of evidence regarding the outcome measure measurement error was downgraded by 2 levels because of study limitations (1 level for risk of bias: only one study was judged as adequate); conflicting results (not downgraded for inconsistency based on mean differences); number of included patients (1 level for imprecision: sample size in the body of evidence was between 50 and 150); publication bias was not assessed.
The level of evidence regarding the outcome measures validity (cardiorespiratory fitness) and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties.
Thirty-Second Chair Stand Test
The level of evidence regarding the outcome measure construct validity (muscle strength) was downgraded by 4 levels because of study limitations (2 levels for risk of bias: only 1 study which was judged as doubtful); number of included patients (2 levels for imprecision: less than 50 participants in sample); publication bias was not assessed.
The level of evidence regarding the outcome measure construct validity (cardiorespiratory fitness) was downgraded by 4 levels because of study limitations (2 levels for risk of bias: only one study which was judged as doubtful); number of included patients (2 levels for imprecision: less than n=50 in the total sample); publication bias was not assessed.
The level of evidence regarding the outcome measure reliability was downgraded by 1 level because of study limitations (1 level for risk of bias: three studies judged as doubtful and one as adequate); publication bias was not assessed.
The level of evidence regarding the outcome measure measurement error was downgraded by 1 level because of study limitations (1 level for risk of bias: one study judged as adequate, multiple as doubtful); publication bias was not assessed.
The level of evidence regarding the outcome responsiveness was not GRADEd. None of the included studies reported data about this measurement property.
Five-Time Sit to Stand Test
The level of evidence regarding the outcome measure construct validity was downgraded by 4 levels because of study limitations (2 levels for risk of bias: only 1 study which was judged as doubtful); number of included patients (2 levels for imprecision: less than 50 participants in sample); publication bias was not assessed.
The level of evidence regarding the outcome measure reliability was downgraded by 4 levels because of study limitations (2 levels for risk of bias: only one study which was judged as doubtful); number of included patients (2 levels for imprecision: less than n=50 in the body of evidence); publication bias was not assessed.
The level of evidence regarding the outcome measure measurement error was downgraded by 4 levels because of study limitations (2 levels for risk of bias: only 1 study which was judged as doubtful); number of included patients (2 levels for imprecision: less than 50 participants in sample); publication bias was not assessed.
The level of evidence regarding the outcomes validity (cardiorespiratory fitness) and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties.
Stair Climbing Test
The level of evidence regarding the outcome measure construct validity could not be GRADEd, since none of the included studies reported data about its validity to measure muscle strength.
The level of evidence regarding the outcome measure criterion validity (cardiorespiratory fitness) was downgraded by 2 levels because of the number of included patients (2 levels for imprecision: less than n=50 in the total sample); publication bias was not assessed.
The level of evidence regarding the outcomes reliability, measurement error, and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties.
Timed Up and Go test
The level of evidence regarding the outcome measure construct validity was downgraded by 4 levels because of study limitations (2 levels for risk of bias: only 1 study which was judged as doubtful); number of included patients (2 levels for imprecision: less than 50 participants in sample); publication bias was not assessed.
The level of evidence regarding the outcome measure was not GRADEd. None of the included studies reported data about this measurement property.
The level of evidence regarding the outcome measure reliability was downgraded by 2 levels because of study limitations (1 level for risk of bias: one study judged as doubtful and one as adequate); conflicting results (inconsistency); applicability (bias due to indirectness); number of included patients (1 level for imprecision: between 50 and 100 patients in the total body of evidence); publication bias was not assessed.
The level of evidence regarding the outcome measure measurement error was downgraded by 2 levels because of study limitations (1 level for risk of bias: one study judged as adequate and one as doubtful); conflicting results (not downgraded for inconsistency: the observed difference may be related to differences in measurement protocols); number of included patients (1 level for imprecision: body of evidence contained n=100); publication bias was not assessed.
The level of evidence regarding the outcomes validity (cardiorespiratory fitness) and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties.
Six-Minute Walking Test
The level of evidence regarding the outcome measure construct validity (muscle strength) was downgraded by 3 levels because of study limitations (1 level for risk of bias: only one study judged as adequate); number of included patients (2 levels for imprecision: less than n=50 in the sample); publication bias was not assessed.
The level of evidence regarding the outcome measure criterion validity (walking distance for cardiorespiratory fitness) was downgraded by 1 level because of conflicting results (1 level for inconsistency: heterogeneous results might not (fully) be explained by differences between studies); publication bias was not assessed.
The level of evidence regarding the outcome measure construct validity (walking distance for cardiorespiratory fitness) was downgraded by 4 levels because of study limitations (3 levels for risk of bias: only one study, which was judged as inadequate); number of included patients (2 levels for imprecision: less than n=50 in the total body of evidence); publication bias was not assessed.
The level of evidence regarding the outcome measure construct validity (predicted VO2peak for cardiorespiratory fitness) was downgraded by 2 levels because of study limitations (2 levels for risk of bias: only one study, which was judged as doubtful); conflicting results (not downgraded for inconsistency: different models with different predictors will probably result in different results); publication bias was not assessed.
The level of evidence regarding the outcome measure criterion validity (predicted VO2peak for cardiorespiratory fitness) was downgraded by 2 levels because of study limitations (2 levels for risk of bias: the largest (ca. 3 times) study was of doubtful quality); conflicting results (not downgraded for inconsistency: different models with different predictors will probably result in different results); publication bias was not assessed.
The level of evidence regarding the outcome measure reliability was downgraded by 1 level because of study limitations (1 level for risk of bias: multiple studies judged as doubtful); publication bias was not assessed.
The level of evidence regarding the outcome measure measurement error was downgraded by 1 level because of study limitations (1 level for risk of bias: multiple studies were judged as doubtful); conflicting results (not downgraded for inconsistency: differences in measurement protocols may explain the differences, although the limits of agreement were deemed relatively comparable); publication bias was not assessed.
The level of evidence regarding the outcome responsiveness was not GRADEd. None of the included studies reported data about this measurement property.
Incremental Shuttle Walking Test
The level of evidence regarding the outcome measure construct validity (muscle strength) was downgraded by 4 levels because of study limitations (2 levels for risk of bias: only one study, judged as doubtful); number of included patients (2 levels for imprecision: less than n=50 in the sample); publication bias was not assessed.
The level of evidence regarding the outcome measure criterion validity (cardiorespiratory fitness) was not downgraded. No evidence of study limitations (risk of bias), conflicting results (inconsistency), applicability (bias due to indirectness), or limited number of included patients (imprecision) were found. Publication bias was not assessed.
The level of evidence regarding the outcome measure construct validity (cardiorespiratory fitness) was downgraded by 4 levels because of study limitations (2 levels for risk of bias: only one study, which was judged as doubtful); conflicting results (inconsistency); applicability (bias due to indirectness); number of included patients (2 levels for imprecision: less than n=50 in the total body of evidence); publication bias was not assessed.
The level of evidence regarding the outcome measure measurement error was downgraded by 2 levels because of study limitations (1 level for risk of bias: multiple studies judged as doubtful); conflicting results (not downgraded for inconsistency: some heterogeneity may be explained by different sample characteristics [e.g. tumor location]); applicability (bias due to indirectness); number of included patients (1 level for imprecision: between 50 and 100 patients in the sample); publication bias was not assessed.
The level of evidence regarding the outcomes reliability and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties.
Steep Ramp Test
The level of evidence regarding the outcome measure criterion validity (Wmax for cardiorespiratory fitness) was not downgraded. No evidence of study limitations (risk of bias), conflicting results (inconsistency), applicability (bias due to indirectness), or limited number of included patients (imprecision) were found. Publication bias was not assessed.
The level of evidence regarding the outcome measure criterion validity (predicted VO2peak for cardiorespiratory fitness) was downgraded by 1 level because of the number of included patients (1 level for imprecision: between 50 and 100 patients were included); publication bias was not assessed.
The level of evidence regarding the outcome measure reliability was downgraded by 4 levels because of study limitations (2 levels for risk of bias: only one study which was judged as doubtful); number of included patients (2 levels for imprecision: less than n=50 in the body of evidence); publication bias was not assessed.
The level of evidence regarding the outcome measure responsiveness (cardiorespiratory fitness) was downgraded by 1 level because of the number of included patients (1 level for imprecision: between 50 and 100 patients in the total body of evidence); publication bias was not assessed.
The level of evidence regarding the outcomes validity (muscle strength) and measurement error was not GRADEd. None of the included studies reported data about these measurement properties.
Physician-based Assessment and Counseling for Exercise questionnaire
The level of evidence regarding the outcomes validity, reliability, measurement error, and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties.
Fitmáx questionnaire
The level of evidence regarding the outcomes validity, reliability, measurement error, and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties.
Veterans-Specific Activity Questionnaire
The level of evidence regarding the outcomes validity, reliability, measurement error, and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties.
Duke Activity Status Index
The level of evidence regarding the outcomes validity (for muscle strength), reliability, measurement error, and responsiveness was not GRADEd. None of the included studies reported data about these measurement properties.
The level of evidence regarding the outcome measure criterion validity (cardiorespiratory fitness) was downgraded by 2 levels because of the number of included patients (2 levels for imprecision: less than n=50 included in the total body of evidence); publication bias was not assessed.
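The downgrading described per instrument above can be sketched as simple arithmetic on an ordered scale. The Python sketch below assumes that every body of evidence starts at HIGH and that the result cannot fall below VERY LOW, which is consistent with the grades in the summary table (for example, 1 level down gives MODERATE and 4 levels down gives VERY LOW). The function name is a hypothetical illustration, not part of the GRADE or COSMIN tooling.

```python
GRADE_LEVELS = ["HIGH", "MODERATE", "LOW", "VERY LOW"]


def grade_after_downgrading(levels_down, start="HIGH"):
    """Return the GRADE level after downgrading `levels_down` steps.

    Assumption: the scale has a floor, so downgrading by more steps
    than remain on the scale still yields VERY LOW.
    """
    if levels_down < 0:
        raise ValueError("levels_down must be zero or positive")
    index = GRADE_LEVELS.index(start) + levels_down
    return GRADE_LEVELS[min(index, len(GRADE_LEVELS) - 1)]


# Examples taken from the text: Steep Ramp Test, predicted VO2peak (1 level),
# Duke Activity Status Index, criterion validity (2 levels),
# One Repetition Maximum, construct validity (4 levels).
for label, steps in [("SRT predicted VO2peak", 1), ("DASI", 2), ("1RM", 4)]:
    print(label, "->", grade_after_downgrading(steps))
```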
Search and select
A systematic review of the literature was performed to answer the following question:
What are the validity, reliability, measurement error, and responsiveness of measurement instruments (as defined in the PICO) to monitor the physical fitness (i.e. cardiorespiratory fitness and muscle strength) of patients diagnosed with cancer?
P: | Patients diagnosed with cancer |
I: | Use of Six-Minute Walking Test / Incremental Shuttle Walking Test / Stair Climbing Test / Steep Ramp Test / Thirty-Second Chair Stand Test / Five-Time Sit to Stand Test / Timed Up and Go test / Physician-based Assessment and Counseling for Exercise questionnaire / Fitmáx questionnaire / Veterans-Specific Activity Questionnaire / Duke Activity Status Index / One Repetition Maximum / handgrip dynamometry to measure (or predict) oxygen intake (VO2max or VO2peak) or muscle strength. |
C: | Comparison between instruments, if available |
O: | Validity, reliability, measurement error, and responsiveness |
Relevant outcome measures
The guideline development group considered validity, reliability, measurement error, and responsiveness as critical measurement properties for decision making.
The working group defined the measurement properties following the Consensus-based Standards for the selection of health Measurement Instruments (COSMIN) taxonomy (Mokkink, 2010).
The working group used the updated criteria for good measurement properties reported in Mokkink (2018), based on Terwee (2007) and Prinsen (2018):
For criterion validity
- Positive (+): The correlation with the gold standard was ≥0.70 or the Area Under the Curve ≥0.70
- Unclear (?): Not all information for a positive judgement (+) of the measurement property was reported
- Negative (-): The correlation with the gold standard was <0.70 or the Area Under the Curve was <0.70
For construct validity (hypotheses testing)
- Positive (+): The result is in accordance with the hypothesis
- Unclear (?): No hypothesis was defined
- Negative (-): The result is not in accordance with the hypothesis
For reliability
- Positive (+): The ICC or weighted Kappa was ≥0.70
- Unclear (?): The ICC or weighted Kappa was not reported
- Negative (-): The ICC or weighted Kappa was <0.70
For measurement error
- Positive (+): The Smallest Detectable Change or Limits of Agreement was < Minimal Important Change
- Unclear (?): The Minimal Important Change was not defined
- Negative (-): The Smallest Detectable Change or Limits of Agreement was > Minimal Important Change
For responsiveness
- Positive (+): The result is in accordance with the hypothesis or the Area Under the Curve was ≥0.70
- Unclear (?): No hypothesis was defined
- Negative (-): The result is not in accordance with the hypothesis or the Area Under the Curve was <0.70
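The rating criteria listed above can be expressed as small decision functions. The Python sketch below is one possible reading of those criteria; the function names are hypothetical, and treating a Smallest Detectable Change that exactly equals the Minimal Important Change as unclear is an assumption, since the criteria do not define that case.

```python
THRESHOLD = 0.70  # cutoff for correlation, AUC, ICC and weighted Kappa


def rate_criterion_validity(correlation=None, auc=None):
    """'+' if the correlation or the AUC is >= 0.70, '?' if neither is reported."""
    values = [v for v in (correlation, auc) if v is not None]
    if not values:
        return "?"
    return "+" if any(v >= THRESHOLD for v in values) else "-"


def rate_reliability(icc_or_kappa=None):
    """'+' if the ICC or weighted Kappa is >= 0.70, '?' if not reported."""
    if icc_or_kappa is None:
        return "?"
    return "+" if icc_or_kappa >= THRESHOLD else "-"


def rate_measurement_error(sdc_or_loa=None, mic=None):
    """'+' if SDC or LoA < MIC, '-' if > MIC, '?' if the MIC is not defined."""
    if sdc_or_loa is None or mic is None:
        return "?"
    if sdc_or_loa < mic:
        return "+"
    if sdc_or_loa > mic:
        return "-"
    return "?"  # assumption: equality is not covered by the criteria


def rate_responsiveness(hypothesis_confirmed=None, auc=None):
    """'+' if the hypothesis is confirmed or the AUC is >= 0.70."""
    if hypothesis_confirmed is None and auc is None:
        return "?"
    if hypothesis_confirmed or (auc is not None and auc >= THRESHOLD):
        return "+"
    return "-"


# The AUC of 0.74 reported for the steep ramp test (Weemaes, 2021).
print(rate_responsiveness(auc=0.74))
```

Under this reading, an AUC of 0.74 meets the responsiveness threshold, whereas a correlation of 0.51 taken on its own would not meet the 0.70 cutoff used for criterion validity.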
Search and select (Methods)
The databases Medline (via OVID) and Embase (via Embase.com) were searched with relevant search terms until the 2nd of May 2022. The detailed search strategy is depicted under the tab Methods. The systematic literature search resulted in 688 hits. Studies were selected based on the following criteria: participants were diagnosed with any kind of cancer, one of the measurement instruments of interest was used, and data about one of the measurement properties of interest were reported. For criterion validity, studies or study data were excluded when it was unclear (i.e. not reported and not deducible) which parameter of a test (e.g. VO2peak, distance walked, seconds, etc.) was used to examine the measurement properties of interest. For construct validity (i.e. hypotheses testing), studies or study data were excluded that correlated an instrument of interest with any instrument not of interest when it was not specified which construct the concurrent instrument intended to measure.
Forty-six studies were initially selected based on title and abstract screening, of which five were systematic reviews. Potentially relevant systematic reviews were screened for studies they had included which our search strategy had missed. After reading the full text, 25 studies were excluded (see the table with reasons for exclusion under the tab Methods), and 21 studies were included (of which 3 were identified through the screened systematic reviews: England, 2012; Koegelenberg, 2007; Win, 2006).
Results
Twenty-one studies were included in the analysis of the literature. Important study characteristics and results are summarized in the evidence tables. The assessment of the risk of bias is summarized in the risk of bias tables and was carried out using the COSMIN-tool (Mokkink, 2018).
References
- Aabo MR, Ragle AM, Østergren PB, Vinther A. Reliability of graded cycling test with talk test and 30-s chair-stand test in men with prostate cancer on androgen deprivation therapy. Support Care Cancer. 2021 Aug;29(8):4249-4256. doi: 10.1007/s00520-020-05918-8. Epub 2021 Jan 7. PMID: 33411043.
- Blackwood J, Rybicki K, Huang M. Mobility Measures in Older Cancer Survivors: An Examination of Reliability, Validity, and Minimal Detectable Change. Rehabilitation Oncology. 2021 Apr 1;39(2):74-80.
- Blackwood J, Rybicki K. Physical function measurement in older long-term cancer survivors. Journal of Frailty, Sarcopenia and Falls. 2021 Sep;6(3):139.
- Booth S, Adams L. The shuttle walking test: a reproducible method for evaluating the impact of shortness of breath on functional capacity in patients with advanced cancer. Thorax. 2001 Feb;56(2):146-50. doi: 10.1136/thorax.56.2.146. PMID: 11209105; PMCID: PMC1745995.
- De Backer IC, Schep G, Hoogeveen A, Vreugdenhil G, Kester AD, van Breda E. Exercise testing and training in a cancer rehabilitation program: the advantage of the steep ramp test. Arch Phys Med Rehabil. 2007 May;88(5):610-6. doi: 10.1016/j.apmr.2007.02.013. PMID: 17466730.
- Eden MM, Tompkins J, Verheijde JL. Reliability and a correlational analysis of the 6MWT, ten-meter walk test, thirty second sit to stand, and the linear analog scale of function in patients with head and neck cancer. Physiother Theory Pract. 2018 Mar;34(3):202-211. doi: 10.1080/09593985.2017.1390803. Epub 2017 Oct 25. PMID: 29068767.
- England R, Maddocks M, Manderson C, Wilcock A. Factors influencing exercise performance in thoracic cancer. Respir Med. 2012 Feb;106(2):294-9. doi: 10.1016/j.rmed.2011.11.002. Epub 2011 Nov 21. PMID: 22104542.
- Granger CL, Denehy L, Parry SM, Martin J, Dimitriadis T, Sorohan M, Irving L. Which field walking test should be used to assess functional exercise capacity in lung cancer? An observational study. BMC Pulm Med. 2015 Aug 12;15:89. doi: 10.1186/s12890-015-0075-2. PMID: 26264470; PMCID: PMC4534028.
- Kampshoff CS, Chinapaw MJ, Brug J, Twisk JW, Schep G, Nijziel MR, van Mechelen W, Buffart LM. Randomized controlled trial of the effects of high intensity and low-to-moderate intensity exercise on physical fitness and fatigue in cancer survivors: results of the Resistance and Endurance exercise After ChemoTherapy (REACT) study. BMC Med. 2015 Oct 29;13:275. doi: 10.1186/s12916-015-0513-2. PMID: 26515383; PMCID: PMC4625937.
- Koegelenberg CF, Diacon AH, Irani S, Bolliger CT. Stair climbing in the functional assessment of lung resection candidates. Respiration. 2008;75(4):374-9. doi: 10.1159/000116873. Epub 2008 Feb 13. PMID: 18272936.
- Li MH, Bolshinsky V, Ismail H, Ho KM, Heriot A, Riedel B. Comparison of Duke Activity Status Index with cardiopulmonary exercise testing in cancer patients. J Anesth. 2018 Aug;32(4):576-584. doi: 10.1007/s00540-018-2516-6. Epub 2018 May 29. PMID: 29845328.
- Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, Bouter LM, de Vet HC. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol. 2010 Jul;63(7):737-45. doi: 10.1016/j.jclinepi.2010.02.006.
- Mokkink LB, de Vet HCW, Prinsen CAC, Patrick DL, Alonso J, Bouter LM, Terwee CB. COSMIN Risk of Bias checklist for systematic reviews of Patient-Reported Outcome Measures. Qual Life Res. 2018 May;27(5):1171-1179. doi: 10.1007/s11136-017-1765-4. Epub 2017 Dec 19. PMID: 29260445; PMCID: PMC5891552.
- Prinsen CA, Vohra S, Rose MR, Boers M, Tugwell P, Clarke M, et al. How to select outcome measurement instruments for outcomes included in a "Core Outcome Set" a practical guideline. Trials. 2016;17(1):449.
- Rogers BH, Brown JC, Gater DR, Schmitz KH. Association Between Maximal Bench Press Strength and Isometric Handgrip Strength Among Breast Cancer Survivors. Arch Phys Med Rehabil. 2017 Feb;98(2):264-269. doi: 10.1016/j.apmr.2016.07.017. Epub 2016 Aug 16. PMID: 27543047; PMCID: PMC5276727.
- Schmidt K, Vogt L, Thiel C, Jäger E, Banzer W. Validity of the six-minute walk test in cancer patients. Int J Sports Med. 2013 Jul;34(7):631-6. doi: 10.1055/s-0032-1323746. Epub 2013 Feb 26. PMID: 23444095.
- Schumacher AN, Shackelford DYK, Brown JM, Hayward R. Validation of the 6-min Walk Test for Predicting Peak V̇O2 in Cancer Survivors. Med Sci Sports Exerc. 2019 Feb;51(2):271-277. doi: 10.1249/MSS.0000000000001790. PMID: 30239495.
- Sebio-Garcia R, Dana F, Gimeno-Santos E, López-Baamonde M, Ubré M, Montané-Muntané M, Risco R, Messagi-Sartor M, Roca J, Martínez-Palli G. Repeatability and learning effect in the 6MWT in preoperative cancer patients undergoing a prehabilitation program. Support Care Cancer. 2022 Jun;30(6):5107-5114. doi: 10.1007/s00520-022-06934-6. Epub 2022 Feb 28. PMID: 35229179.
- Stuiver MM, Kampshoff CS, Persoon S, Groen W, van Mechelen W, Chinapaw MJM, Brug J, Nollet F, Kersten MJ, Schep G, Buffart LM. Validation and Refinement of Prediction Models to Estimate Exercise Capacity in Cancer Survivors Using the Steep Ramp Test. Arch Phys Med Rehabil. 2017 Nov;98(11):2167-2173. doi: 10.1016/j.apmr.2017.02.013. Epub 2017 Mar 18. PMID: 28322759.
- Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, Dekker J, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007;60(1):34-42.
- Trutschnigg B, Kilgour RD, Reinglas J, Rosenthall L, Hornby L, Morais JA, Vigano A. Precision and reliability of strength (Jamar vs. Biodex handgrip) and body composition (dual-energy X-ray absorptiometry vs. bioimpedance analysis) measurements in advanced cancer patients. Appl Physiol Nutr Metab. 2008 Dec;33(6):1232-9. doi: 10.1139/H08-122. PMID: 19088782.
- Tsuji K, Matsuoka YJ, Kuchiba A, Suto A, Ochi E. Accuracy of exercise-based tests for estimating cardiorespiratory fitness and muscle strength in early-stage breast cancer survivors in Japan. Support Care Cancer. 2022 May;30(5):3857-3863. doi: 10.1007/s00520-022-06811-2. Epub 2022 Jan 17. PMID: 35037120.
- van Hinte G, Leijendekkers RA, Te Molder B, Jansen L, Bol C, Merkx MAW, Takes R, Nijhuis-van der Sanden MWG, Speksnijder CM. Reproducibility of measurements on physical performance in head and neck cancer survivors; measurements on maximum mouth opening, shoulder and neck function, upper and lower body strength, level of physical mobility, and walking ability. PLoS One. 2020 Sep 3;15(9):e0233271. doi: 10.1371/journal.pone.0233271. PMID: 32881858; PMCID: PMC7470389.
- Weemaes ATR, Beelen M, Bongers BC, Weijenberg MP, Lenssen AF. Criterion Validity and Responsiveness of the Steep Ramp Test to Evaluate Aerobic Capacity in Survivors of Cancer Participating in a Supervised Exercise Rehabilitation Program. Arch Phys Med Rehabil. 2021 Nov;102(11):2150-2156. doi: 10.1016/j.apmr.2021.04.016. Epub 2021 May 21. PMID: 34023324.
- Weemaes ATR, Meijer R, Beelen M, van Hooff M, Weijenberg MP, Lenssen AF, van de Poll-Franse LV, Savelberg HHCM, Schep G. Monitoring aerobic capacity in cancer survivors using self-reported questionnaires: criterion validity and responsiveness. J Patient Rep Outcomes. 2023 Jul 19;7(1):73.
- Wilcock A, Koon S, Manderson C, Taylor V, Maddocks M. Within and between day repeatability of the incremental shuttle walking test in patients with thoracic cancer. Respir Med. 2018 Jul;140:39-41. doi: 10.1016/j.rmed.2018.05.018. Epub 2018 May 21. PMID: 29957278.
- Win T, Jackson A, Groves AM, Sharples LD, Charman SC, Laroche CM. Comparison of shuttle walk with measured peak oxygen consumption in patients with operable lung cancer. Thorax. 2006 Jan;61(1):57-60. doi: 10.1136/thx.2005.043547. Epub 2005 Oct 21. PMID: 16244091; PMCID: PMC2080711.
Accountability
Authorization date and validity
Last reviewed : 06-01-2025
Last authorized : 06-01-2025
Planned reassessment : 06-01-2028
General information
The development/revision of this guideline module was supported by the Kennisinstituut van de Federatie Medisch Specialisten (www.demedischspecialist.nl/kennisinstituut) and was funded by the Kwaliteitsgelden Medisch Specialisten (SKMS). The funder had no influence whatsoever on the content of the guideline module.
Composition of the working group
A multidisciplinary working group was set up to develop the guideline module, consisting of representatives of all relevant specialties (see the Composition of the working group) that are involved in the care of patients with cancer and that have an interface with the care for physical fitness.
Working group member | On behalf of |
Dr. G. (Goof) Schep (chair) † | VSG |
Drs. R.J.A. (Rhijn) Visser (chair) | VSG |
Dr. M.E. (Marieke) van Vessem (chair) from 1-7-’24 | VSG |
Dr. J.V. (Hans) van Thienen | NIV/NVMO |
Dr. D.C.P. (David) Cobben | NVRO |
Dr. L.R. (Lieneke) van Veelen | NVRO |
Dr. J.K. (Jonna) van Vulpen | NVRO |
Drs. M.C. (Marlieke) van Kooten | KNGF/NVFL |
Drs. M. (Michelle) Verseveld | KNGF/NVFL |
Prof. dr. J.M. (Joost) Klaase | NVvH |
M.M.A. (Merel) Brouwer | V&VN |
Drs. M.M.J. (Manon) van de Valk | V&VN |
Dr. L.M. (Laurien) Buffart | In a personal capacity |
F.H.M. (Manon) Crijns-Prophitius | BVN |
R. (Remco) van der Molen Kuipers | NFK |
Dr. B.C. (Bart) Bongers | VvBN |
Dr. A. (Arnold) Romeijnders (until 1-12-2022) | In a personal capacity |
Drs. J.A.W. (Judith) de Bruijn-Reijnen | VRA |
With support from
Dr. J. (Joppe) Tra, senior advisor, Kennisinstituut van de Federatie Medisch Specialisten
Drs. M. (Michiel) Oerbekke, advisor, Kennisinstituut van de Federatie Medisch Specialisten
Dr. N. (Nadine) Zielonke, advisor, Kennisinstituut van de Federatie Medisch Specialisten
Drs. T. (Toon) Lamberts, senior advisor, Kennisinstituut van de Federatie Medisch Specialisten
Drs. N. (Nicole) Thomaes, intern, Kennisinstituut van de Federatie Medisch Specialisten
Declarations of interest
The Code ter voorkoming van oneigenlijke beïnvloeding door belangenverstrengeling (the code for preventing improper influence due to conflicts of interest) was followed. All working group members declared in writing whether, in the past three years, they had had direct financial interests (employment at a commercial company, personal financial interests, research funding) or indirect interests (personal relationships, reputation management). During the development or revision of a module, changes in interests are reported to the chair. The declaration of interest is reconfirmed during the comment phase.
An overview of the interests of the working group members and the judgement on how to deal with any interests can be found in the table below. The signed declarations of interest can be requested from the secretariat of the Kennisinstituut van de Federatie Medisch Specialisten.
Name |
Main position |
Ancillary activities |
Personal financial interests |
Personal relationships |
Externally funded research |
Intellectual interests and reputation |
Other interests |
Actions |
Arnold Romeijnders |
Retired general practitioner (33 years) and medical director of the care group PoZoB (20 years, paid). Previously worked for 11 years (1991-2022, paid) at the Nederlands Huisartsen Genootschap, department of guideline development and science |
- |
None |
No |
Not applicable |
Not applicable |
Not applicable |
None |
Bart Bongers |
- Assistant professor, medical physiologist at Maastricht University: paid |
- Training in exercise and training physiology at ExerScience: paid |
Not applicable |
Not applicable |
Yes, the 5 most recent are specified below |
Not applicable |
Not applicable |
None |
David Cobben |
Member on behalf of NVRO |
Research in the field of 'frailty' in lung cancer patients in Liverpool |
Not applicable |
Not applicable |
Not applicable |
Not applicable |
Not applicable |
None |
Goof Schep (chair) |
Sports physician, Maxima medisch centrum, 0.9 FTE |
* Member of the scientific committee, Vereniging voor Sportgeneeskunde, unpaid |
At present I have no financial interest. |
None |
None |
See above. This concerns the FitMáx(c) questionnaire, which is logical because it ties in with my expertise (measuring physical fitness and translating this into what it means) and with a bottleneck in current oncological care (physical fitness is hardly or not at all made visible and is not monitored). |
None |
Was not allowed to be involved in the modules on screening & assessment and monitoring |
Hans van Thienen |
Internist-oncologist, NKI-AvL |
* Content lead / vice-chair of the Medisch Inhoudelijke Standpunten (MIS) group of DRCG (unpaid) |
None |
No |
* Pfizer - Neoadjuvant axitinib and avelumab in renal cell carcinoma - Project leader |
None |
None |
None |
Jonna van Vulpen |
AIOS Radiotherapie |
Medisch-wetenschappelijk onderzoek in het veld van fysieke fitheid/training bij oncologische patiënten. |
Niet van toepassing |
Niet van toepassing |
Niet van toepassing |
Niet van toepassing |
Niet van toepassing |
|
Joost Klaase |
Gemandateerd namens de NVvH |
Betrokken bij Standpunt Prehabilitatie als lid van de Werkgroep Prehabilitatie van de NVvH |
Geen |
Geen |
Bij het ontwikkelen van Standpunt Prehabilitatie is het Kennis Instituut van de FMS betrokken, dit wordt gefinancierd middels een SKMS subsidie. Daarnaast is voor de Werkgroep Prehabilitatie arts-onderzoeker Charissa Sabajoo aangesteld (aanstelling UMCG), die gefinanceerd wordt met sponor gelden van 1. J&J, 2. Vifor Pharma, 3. Noaber Foundation, 4. PPP Allowance. |
Als projectleider van focusproject HPB prehabilitatiepoli binnen Groningen Leefstijl Interventie Model) ben ik boegbeeld van prehabilitatie binnen het UMCG |
geen |
Geen |
Laurien Buffart |
Universitair hoofddocent, afdeling Medical BioSciences, Radboudumc |
Geen |
Geen |
Nee |
* NWO-Vidi - Fysieke trainig bij uitgezaaide darmkanker (Aerobic fitness of muscle mass training to improve colorectal cander outcome) - Projectleider - World Cancer Research Fund (WCRF): Replacing sedentary behaviour with standing, physical activity or sleep after treatment for localized renal and colorectal cancer: associations with changes in adiposity, fatigue and quality of life, and underlying biological mechanisms (co-applicant) |
Niet van toepassing |
Nee |
Geen |
Lieneke van Veelen |
radiotherapeut-oncoloog (betaald) bij het Zuid West Radiotherapeutisch Instituut |
SCEN-arts (betaald) |
Geen |
Nee |
Geen |
Geen |
Nee |
Geen |
Manon Crijns |
Patientenparticipatie vanuit Borstkanker Vereniging Nederland & Teamleider Belangenbehartiging NFK |
Werkzaam bij NFK Patient advocate / vrijwilliger BVN |
Niet van toepassing |
Niet van toepassing |
Niet van toepassing |
Niet van toepassing |
Niet van toepassing |
Geen |
Manon van de Valk |
Verpleegkundig Specialist AGZ |
Geen |
Geen |
Nee |
Nee |
Nee |
Nee |
Geen |
Marlieke van Kooten |
Praktijkeigenaar Actief Fysiotherapie Rotterdam Oncologie-oedeemfysiotherapeut |
Lid wetenschapscommissie van de Nederlandse Vereniging voor Fysiotherapie binnen de Lymfologie & Oncologie (deels betaald) |
Geen |
Nee |
Nee |
Geen |
Geen |
Geen |
Marieke van Vessem |
Sportarts, Maxima Medisch Centrum (0.4 FTE), betaald. |
Werkgroep Exercise is Medicine Vereniging voor Sportgeneeskunde (sinds 2020): algemeen lid, post-COVID project team. Onbetaald |
Geen |
De FitMax score lijst is in het Maxima Medisch Centrum ontwikkeld. Deze wordt genoemd in de richtlijn. |
Pilot Fit bij Borstkanker |
Geen |
Geen |
Geen |
Merel Brouwer |
Verpleegkundig specialist gastro-enterologische oncologie bij Jeroen Bosch ziekenhuis (36u contract - betaald waarvan tot aug 2023 9u ouderschapsverlof - onbetaald) |
Plaatsvervangend lid College Specialismen Verpleegkunde prakijkopleider (vacatiegelden) |
x |
Zwager is fysiotherapeut |
x |
x |
x |
Geen |
Michelle Verseveld |
* Bestuurslid Nederlandse Vereniging voor fysiotherapie bij Lymfologie en Oncologie, portefeuilehouder Wetenschap Oedeem en Oncologie. 8 uur per week |
NVFL: vrijwillge functie |
Niet van toepassing |
Niet van toepassing |
Niet van toepassing |
Niet van toepassing |
Nee, niet bekend |
Geen |
Remco van der Molen Kuipers |
Insumares BV, intrim advies |
Niet van toepassing |
Ik ben DGA (100 %) van Insumares BV en werk momenteel aan een intrim project op het gebied van ICT dienstverlening bij SLTNICT Solutiions BV |
Neen |
Niet van toepassing |
Ik ben als patiënt advocat verbonden aan Inspire2Live |
Neen |
Geen |
Rhijn Visser (vz.) |
* Sportarts en Medisch Manger Afdeling revalidatie Elkerliek Ziekenhuis te Helmond., 36 uur in loondienst |
* Voorzitter Raad van Toezicht SGS (Stichting Gorinchemse Sportaccomodaties), 4 uur per maand, betaald |
Het Elkerliek ziekenhuis biedt Oncologische nazorg aan. Gezien het feit dat ik in loondienst ben, heeft dit geen effect op mijn salariëring. Verder geen belangen |
Geen |
Geen |
Geen |
Geen |
Geen |
Patient perspective input
The patient perspective was addressed by inviting the Nederlandse Federatie van Kankerpatiënten (NFK) to the two invitational conferences and to the working group. The report of these (see appendix) was discussed in the working group. The input obtained was taken into account when drafting the clinical questions, choosing the outcome measures, and drafting the considerations. The draft guideline was also submitted to the NFK for comment. Any comments received were reviewed and incorporated where possible.
Wkkgz & qualitative assessment of possible substantial financial consequences
Qualitative assessment of possible financial consequences under the Wkkgz. In accordance with the Dutch Healthcare Quality, Complaints and Disputes Act (Wet kwaliteit, klachten en geschillen zorg, Wkkgz), a qualitative assessment was carried out for this guideline to determine whether the recommendations might lead to substantial financial consequences. In this assessment, guideline modules were tested on several domains (see the flowchart on the Richtlijnendatabase).
The qualitative assessment shows that there are probably no substantial financial consequences; see the table below.
Module | Outcome of assessment | Explanation |
Monitoring physical fitness | no financial consequences | Outcome 3. Although the guideline concerns a large group of patients (> 40,000), it is not expected that substantial investments will be needed, that a considerable increase in the number of FTEs will be necessary, or that structurally higher qualifications will be required. |
Methods
AGREE
This guideline module was drawn up in accordance with the requirements stated in the report Medisch Specialistische Richtlijnen 2.0 of the advisory committee on guidelines (adviescommissie Richtlijnen) of the Raad Kwaliteit. This report is based on the AGREE II instrument (Appraisal of Guidelines for Research & Evaluation II; Brouwers, 2010).
Bottleneck analysis and clinical questions
During the preparatory phase, the working group made a written inventory of the bottlenecks in the care of patients with cancer with regard to physical fitness. Based on the outcomes of this bottleneck analysis, the working group drafted the clinical questions and then finalised them.
Outcome measures
After formulating the search question belonging to the clinical question, the working group listed which outcome measures are relevant to the patient, considering both desirable and undesirable effects. The working group rated these outcome measures by their relative importance for decision-making about recommendations as critical (decisive for decision-making), important (but not critical), or unimportant. The working group also defined, at least for the critical outcome measures, which differences it considered clinically (patient) relevant.
Method for the literature summary
A detailed description of the strategy for searching and selecting literature can be found under 'Zoeken en selecteren' (Search and select) in the Onderbouwing (Evidence base) section. Where possible, data from different studies were pooled in a random-effects model using Review Manager 5.4. The assessment of the certainty of the scientific evidence is explained below.
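To illustrate what pooling in a random-effects model involves, the sketch below uses the DerSimonian-Laird estimator with inverse-variance weights. This is an assumption made for illustration: the module only states that Review Manager 5.4 was used and does not describe the estimator or its settings. The function name and the input values are hypothetical and are not data from this guideline.

```python
import math


def pool_random_effects(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    effects   -- per-study effect estimates (e.g. mean differences)
    variances -- per-study sampling variances (squared standard errors)
    Returns (pooled effect, standard error, between-study variance tau^2).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # DerSimonian-Laird moment estimator of tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    pooled, se, tau2 = pool_random_effects([0.0, 1.0], [0.01, 0.01])
    print(f"pooled={pooled:.3f} se={se:.3f} tau2={tau2:.3f}")
```

When the studies agree closely, tau^2 is estimated as zero and the result coincides with a fixed-effect analysis. With heterogeneous studies the weights become more equal and the standard error widens.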
Assessing the certainty of the scientific evidence
The certainty of the scientific evidence was determined using the GRADE method. GRADE stands for 'Grading of Recommendations Assessment, Development and Evaluation' (see http://www.gradeworkinggroup.org/). The basic principles of the GRADE method are: naming and prioritising the clinically (patient) relevant outcome measures, a systematic review per outcome measure, and an assessment of the certainty of evidence per outcome measure based on the eight GRADE domains (domains for downgrading: risk of bias, inconsistency, indirectness, imprecision, and publication bias; domains for upgrading: dose-response relationship, large effect, and plausible residual confounding).
GRADE distinguishes four levels for the quality of the scientific evidence: high, moderate, low, and very low. These levels refer to the degree of certainty about the literature conclusion, in particular the degree of certainty that the literature conclusion adequately supports the recommendation (Schünemann, 2013; Hultcrantz, 2017).
GRADE | Definition |
High | |
Moderate | |
Low | |
Very low | |
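As a rough illustration of how the eight domains combine into one of the four levels, the sketch below starts from an initial level and moves down or up one step per domain judgement. It rests on assumptions: the starting levels (high for randomised trials, low for observational studies) follow common GRADE practice but are not stated in this module, and the function and domain names are hypothetical. In practice the rating is a structured judgement, not a mechanical sum.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# Domain names taken from the description of the GRADE method above.
DOWNGRADE_DOMAINS = {
    "risk_of_bias", "inconsistency", "indirectness",
    "imprecision", "publication_bias",
}
UPGRADE_DOMAINS = {"dose_response", "large_effect", "residual_confounding"}


def grade_certainty(randomized, downgrades=None, upgrades=None):
    """Return a GRADE level from per-domain steps (hypothetical helper).

    randomized -- True for randomised trials, False for observational studies
    downgrades -- dict mapping a downgrading domain to the number of levels
    upgrades   -- dict mapping an upgrading domain to the number of levels
    """
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    unknown = (set(downgrades) - DOWNGRADE_DOMAINS) | (set(upgrades) - UPGRADE_DOMAINS)
    if unknown:
        raise ValueError(f"unknown domains: {sorted(unknown)}")

    # Assumed starting point: index 3 ("high") for RCTs, 1 ("low") otherwise.
    start = 3 if randomized else 1
    score = start - sum(downgrades.values()) + sum(upgrades.values())

    # Clamp to the four available levels.
    return LEVELS[max(0, min(len(LEVELS) - 1, score))]


if __name__ == "__main__":
    print(grade_certainty(True, {"risk_of_bias": 1, "imprecision": 1}))
```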
When assessing (grading) the certainty of the scientific evidence in guidelines according to the GRADE method, thresholds for clinical decision-making play an important role (Hultcrantz, 2017). These are the thresholds that, when crossed, would lead to a change in the recommendation. To determine the thresholds for clinical decision-making, all relevant outcome measures and considerations must be weighed. The thresholds for clinical decision-making are therefore not directly comparable with the Minimal Clinically Important Difference (MCID). Particularly in situations where an intervention has no important disadvantages and the costs are relatively low, the threshold for clinical decision-making regarding the effectiveness of the intervention may lie at a lower value (closer to the null effect) than the MCID (Hultcrantz, 2017).
Considerations (from evidence to recommendation)
To arrive at a recommendation, other aspects besides (the quality of) the scientific evidence are important and are weighed as well, such as additional arguments from, for example, biomechanics or physiology, values and preferences of patients, costs (resource use), acceptability, feasibility, and implementation. These aspects are systematically listed and assessed (weighed) under the heading 'Overwegingen' (Considerations) and may be based (in part) on expert opinion. A structured format was used, based on the evidence-to-decision framework of the international GRADE Working Group (Alonso-Coello, 2016a; Alonso-Coello, 2016b). This evidence-to-decision framework is an integral part of the GRADE method.
Formulating recommendations
The recommendations answer the clinical question and are based on the available scientific evidence, the most important considerations, and a weighing of the favourable and unfavourable effects of the relevant interventions. The certainty of the scientific evidence and the weight the working group assigns to the considerations together determine the strength of the recommendation. In line with the GRADE method, low certainty of the conclusions in the systematic literature analysis does not a priori rule out a strong recommendation, and weak recommendations are also possible when certainty is high (Agoritsas, 2017; Neumann, 2016). The strength of the recommendation is always determined by weighing all relevant arguments together. For each recommendation, the working group has recorded how it arrived at the direction and strength of the recommendation.
The GRADE method distinguishes between strong and weak (or conditional) recommendations. The strength of a recommendation refers to the degree of certainty that the benefits of the intervention outweigh the harms (or vice versa), considered across the whole spectrum of patients for whom the recommendation is intended. The strength of a recommendation has clear implications for patients, clinicians, and policymakers (see the table below). A recommendation is not a dictate: even a strong recommendation based on high-quality evidence (GRADE rating HIGH) will not always apply under all possible circumstances and to every individual patient.
Implications of strong and weak recommendations for different guideline users
 | Strong recommendation | Weak (conditional) recommendation |
For patients | Most patients would choose the recommended intervention or approach and only a small number would not. | A considerable proportion of patients would choose the recommended intervention or approach, but many patients would not. |
For clinicians | Most patients should receive the recommended intervention or approach. | There are several suitable interventions or approaches. The patient should be supported in choosing the intervention or approach that best fits his or her values and preferences. |
For policymakers | The recommended intervention or approach can be regarded as standard policy. | Policy-making requires extensive discussion involving many stakeholders. There is a greater chance of local policy differences. |
Organisation of care
In the bottleneck analysis and during the development of the guideline module, explicit attention was paid to the organisation of care: all aspects that are preconditions for delivering care (such as coordination, communication, (financial) resources, staffing, and infrastructure). Preconditions that are relevant to answering this specific clinical question are mentioned in the considerations. More general, overarching, or additional aspects of the organisation of care are addressed in the module Organisatie van zorg (Organisation of care).
Comment and authorisation phase
The draft guideline module was submitted for comment to the (scientific) associations and (patient) organisations involved. The comments were collected and discussed with the working group. In response to the comments, the draft guideline module was revised and finalised by the working group. The final guideline module was submitted for authorisation to the participating (scientific) associations and (patient) organisations and was authorised or approved by them.
References
Agoritsas T, Merglen A, Heen AF, Kristiansen A, Neumann I, Brito JP, Brignardello-Petersen R, Alexander PE, Rind DM, Vandvik PO, Guyatt GH. UpToDate adherence to GRADE criteria for strong recommendations: an analytical survey. BMJ Open. 2017 Nov 16;7(11):e018593. doi: 10.1136/bmjopen-2017-018593. PubMed PMID: 29150475; PubMed Central PMCID: PMC5701989.
Alonso-Coello P, Schünemann HJ, Moberg J, Brignardello-Petersen R, Akl EA, Davoli M, Treweek S, Mustafa RA, Rada G, Rosenbaum S, Morelli A, Guyatt GH, Oxman AD; GRADE Working Group. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 1: Introduction. BMJ. 2016 Jun 28;353:i2016. doi: 10.1136/bmj.i2016. PubMed PMID: 27353417.
Alonso-Coello P, Oxman AD, Moberg J, Brignardello-Petersen R, Akl EA, Davoli M, Treweek S, Mustafa RA, Vandvik PO, Meerpohl J, Guyatt GH, Schünemann HJ; GRADE Working Group. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 2: Clinical practice guidelines. BMJ. 2016 Jun 30;353:i2089. doi: 10.1136/bmj.i2089. PubMed PMID: 27365494.
Brouwers MC, Kho ME, Browman GP, Burgers JS, Cluzeau F, Feder G, Fervers B, Graham ID, Grimshaw J, Hanna SE, Littlejohns P, Makarski J, Zitzelsberger L; AGREE Next Steps Consortium. AGREE II: advancing guideline development, reporting and evaluation in health care. CMAJ. 2010 Dec 14;182(18):E839-42. doi: 10.1503/cmaj.090449. Epub 2010 Jul 5. Review. PubMed PMID: 20603348; PubMed Central PMCID: PMC3001530.
Hultcrantz M, Rind D, Akl EA, Treweek S, Mustafa RA, Iorio A, Alper BS, Meerpohl JJ, Murad MH, Ansari MT, Katikireddi SV, Östlund P, Tranæus S, Christensen R, Gartlehner G, Brozek J, Izcovich A, Schünemann H, Guyatt G. The GRADE Working Group clarifies the construct of certainty of evidence. J Clin Epidemiol. 2017 Jul;87:4-13. doi: 10.1016/j.jclinepi.2017.05.006. Epub 2017 May 18. PubMed PMID: 28529184; PubMed Central PMCID: PMC6542664.
Medisch Specialistische Richtlijnen 2.0 (2012). Adviescommissie Richtlijnen van de Raad Kwaliteit. http://richtlijnendatabase.nl/over_deze_site/over_richtlijnontwikkeling.html
Neumann I, Santesso N, Akl EA, Rind DM, Vandvik PO, Alonso-Coello P, Agoritsas T, Mustafa RA, Alexander PE, Schünemann H, Guyatt GH. A guide for health professionals to interpret and use recommendations in guidelines developed with the GRADE approach. J Clin Epidemiol. 2016 Apr;72:45-55. doi: 10.1016/j.jclinepi.2015.11.017. Epub 2016 Jan 6. Review. PubMed PMID: 26772609.
Schünemann H, Brożek J, Guyatt G, et al. GRADE handbook for grading quality of evidence and strength of recommendations. Updated October 2013. The GRADE Working Group, 2013. Available from http://gdt.guidelinedevelopment.org/central_prod/_design/client/handbook/handbook.html.