Invasive cardiac output (CO) assessment with the thermodilution (TD) or the Fick method during right heart catheterization (RHC) is a well-established examination applied to patients with various cardiac and pulmonary diseases [1, 2].

The TD method enables CO calculation by using the modified Stewart–Hamilton principle. A thermistor at the distal end of the Swan-Ganz catheter measures the change in blood temperature in the pulmonary artery after saline of a defined volume and temperature has been injected through the proximal port of the catheter [3, 4]. Although the method is prone to errors that affect the injectate along its path, as in higher-grade tricuspid valve insufficiency or severe heart failure, the TD method is generally regarded as the guideline-recommended practical “reference standard” for CO assessment and is only discouraged in patients with intracardiac shunts [1, 5].

The direct Fick method is also recommended by the guidelines and is based on the principle that CO is equal to oxygen consumption (\(\dot {\mathrm{V}}\)O2), measured as the difference between inspired and expired oxygen content, divided by the difference in arterial and mixed venous oxygen concentration [1, 6]. For the direct measurement of \(\dot {\mathrm{V}}\)O2, tightly fitted facemasks are used that monitor the oxygen and carbon dioxide levels in the patient’s breath. Notably, measured \(\dot {\mathrm{V}}\)O2 levels vary considerably, and therefore repeat or continuous measurements are advocated [7]. Because the accurate measurement of \(\dot {\mathrm{V}}\)O2 in the catheter laboratory is demanding and time-consuming, the indirect Fick method (iFM) is preferred in clinical practice, because here \(\dot {\mathrm{V}}\)O2 is estimated by approximation formulas rather than measured [2].

There are multiple formulas for \(\dot {\mathrm{V}}\)O2 estimation, with the equations proposed by LaFarge and Miettinen [8], Dehmer et al. [9], and Bergstra et al. [10] being the most frequently cited in the literature [11,12,13].

The correlation between TD and iFM-derived CO is much weaker than between TD and direct Fick-derived CO. In multiple studies, a CO discrepancy greater than 20% between both methods was reported, which is regarded as a clinically relevant threshold to indicate unsatisfactory agreement [14]. Deviations > 20% were described in up to 30–40% of cases [11, 15, 16]. It is unclear whether these discrepancies are promoted or amplified by compromised right ventricular function or other morphological constraints.

Besides these better-established formulas, Dr. Ingo Krakau introduced a \(\dot {\mathrm{V}}\)O2 approximation in his cardiac catheterization textbook, first published in 1999 [17]. This formula is used as the default setting for CO estimation by the iFM in many catheter laboratories, although its accuracy and performance in the clinical setting have not been validated to date. The present study aimed to compare CO, calculated as proposed by I. Krakau (iFMKrakau), with TD-derived CO and three established approximation formulas (i.e., iFMLaFarge, iFMDehmer, iFMBergstra). We further analyzed whether discrepancies between TD- and iFMKrakau-derived CO correlate with morphological or functional pathologies assessed in cardiac magnetic resonance imaging (CMR).

Methods

Study design and patient selection

In this retrospective, observational, single-center study, we evaluated the hemodynamic interrelations in patients undergoing RHC and CMR at the University Hospital of Würzburg. Patients were identified using the Data Warehouse of the University Hospital, a proprietary digital storage solution that connects and harmonizes all electronically stored patient data, including discharge letters, diagnosis coding schemes, laboratory values, and procedures such as echocardiography, CMR, cardiac catheterization, and others [18].

The study was conducted in compliance with the Declaration of Helsinki. Approval of the ethical committee was waived as the Data Warehouse runs on standard operating procedures that are controlled and approved by the institution’s data protection officer.

In total, 293 consecutive patients with information on RHC and CMR were identified between January 2016 and January 2022 via the Data Warehouse. Patients were excluded from the analyses if one of the following conditions applied: no valid concomitant CO measurement of either TD method or one iFM method (N = 69); invalid or insufficient CMR data (e.g., early termination for claustrophobia) or a time distance greater than 2 weeks between RHC and CMR (N = 26); shunt volume > 10% or significant congenital heart defects (N = 10). Accordingly, the current study refers to 188 patients. Based on electronic patient records, information was collected on cardiovascular diseases (CVD) and risk factors (CVRF), non-cardiovascular conditions, medication (with emphasis on the treatment of CVD and CVRF), electrocardiogram, echocardiography, coronary angiography, RHC, CMR, and laboratory parameters measured at the Central Lab of the University Hospital on the day of or the day before RHC.

Transthoracic echocardiography

Transthoracic echocardiography was performed according to practice guidelines [19] as part of the clinical routine, either during the index hospitalization or beforehand at an outpatient visit. In five patients, transesophageal rather than transthoracic echocardiography was available. The median time difference between echocardiography and RHC was 1 day (quartiles 0, 5 days).

Cardiac magnetic resonance imaging

The CMR was performed on either a 1.5- or a 3.0‑T Achieva D scanner (Philips Healthcare, Best, The Netherlands) according to the Society for Cardiovascular Magnetic Resonance standard [20]. The median time difference between CMR and RHC was 3 days (quartiles 1, 6 days).

Data analysis was performed on the dedicated workstation IntelliSpacePortal (Philips Healthcare, Best, The Netherlands). Ventricular volumes were derived from a short-axis CINE stack covering the ventricles from the apex to the valvular plane. The endomyocardial border was traced manually for the end-systolic and end-diastolic phases, with the papillary muscle considered part of the intracavitary volume. The stroke volume (SV) was calculated as the difference between the end-diastolic (EDV) and the end-systolic (ESV) volume for both the left and the right heart. Ejection fraction (EF) was calculated as SV divided by EDV, multiplied by 100.
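The volume-based definitions above can be illustrated with a minimal Python sketch. The function names and the example volumes are hypothetical and are not taken from the study data.

```python
def stroke_volume(edv_ml: float, esv_ml: float) -> float:
    """Stroke volume (mL): end-diastolic minus end-systolic volume."""
    return edv_ml - esv_ml


def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Ejection fraction (%): SV divided by EDV, multiplied by 100."""
    if edv_ml <= 0:
        raise ValueError("EDV must be positive")
    return stroke_volume(edv_ml, esv_ml) / edv_ml * 100.0


if __name__ == "__main__":
    # Hypothetical ventricular volumes in mL (illustration only)
    edv, esv = 150.0, 60.0
    print(f"SV = {stroke_volume(edv, esv):.0f} mL")
    print(f"EF = {ejection_fraction(edv, esv):.0f} %")
```

The same two functions apply to the left and the right ventricle, since both were evaluated from the same short-axis stack.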

Right heart catheterization and formulas used

Depending on the indication, RHC was performed according to standard recommendations [21], either alone or in combination with coronary angiography, using the Vigilance II™ monitor (Edwards Lifesciences, Irvine, CA, USA) or the Schwarzer Cardiotek Evolution system (Schwarzer Cardiotek GmbH, Heilbronn, Germany). Hemoglobin and oxygen saturation of mixed venous blood (SVO2) were measured with the ABL80 FLEX CO-OX blood gas analyzer (Radiometer Medical ApS, Brønshøj, Denmark). Arterial oxygen saturation (SaO2) was either measured invasively if coronary angiography was additionally performed or derived from finger pulse oximetry. For body surface area (BSA), the formula from Dubois and Dubois was applied [21]. Data from RHC (hemodynamics and pressure tracings) were double-checked and entered manually by two cardiologists (TR and GG).
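As an illustration of the BSA step, the sketch below uses the commonly cited form of the Du Bois and Du Bois equation (weight in kg, height in cm) and derives the cardiac index (CI) as CO divided by BSA. The coefficients are not printed in the text above and are stated here as an assumption; the function names and example values are hypothetical.

```python
def bsa_du_bois(weight_kg: float, height_cm: float) -> float:
    """Body surface area (m^2), commonly cited Du Bois form (assumed here):
    BSA = 0.007184 * weight^0.425 * height^0.725."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def cardiac_index(co_l_min: float, bsa_m2: float) -> float:
    """Cardiac index (L/min/m^2): cardiac output divided by BSA."""
    return co_l_min / bsa_m2


if __name__ == "__main__":
    # Hypothetical patient: 70 kg, 170 cm, CO of 5.0 L/min
    bsa = bsa_du_bois(70.0, 170.0)
    print(f"BSA = {bsa:.2f} m^2")
    print(f"CI  = {cardiac_index(5.0, bsa):.2f} L/min/m^2")
```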

Thermodilution approach.

The TD was performed utilizing one of two True Size Thermodilution Swan-Ganz catheters (models 141F7 and 151F7; Edwards Lifesciences, Irvine, CA, USA). As an in-house standard, CO measurements were performed at least three times for patients with sinus rhythm. They were repeated until three similar measurements were obtained for patients with atrial fibrillation or with a discrepancy of > 10% from the mean [22, 23]. Only the mean of the TD measurements was available for the present analysis.

Indirect Fick approaches.

The CO was calculated as the ratio between estimated oxygen consumption (\(\dot {\mathrm{V}}\)O2) and the difference between arterial and mixed venous oxygen content (avDO2), with avDO2 given by Eq. 1 [17]:

$$\mathrm{av}\mathrm{DO}_{2}=\text{hemoglobin}\,(\mathrm{g}/\mathrm{dL})*1.34\mathrm{\,ml}/\mathrm{g}*(\mathrm{a}\mathrm{S}_{\mathrm{O}2}-\mathrm{v}\mathrm{S}_{\mathrm{O}2}). \tag{1}$$
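A minimal sketch of the indirect Fick calculation is given below. It assumes that saturations are entered as fractions (0 to 1) and that the oxygen content difference from Eq. 1, which is in mL per dL, is multiplied by 10 to obtain mL per L so that CO results in L/min; this unit handling is an assumption, as it is not spelled out in the text. The \(\dot {\mathrm{V}}\)O2 value is passed in as an input rather than computed from any of the estimation formulas, and all example numbers are hypothetical.

```python
def av_o2_difference(hb_g_dl: float, sa_o2: float, sv_o2: float) -> float:
    """Arteriovenous O2 content difference (mL O2 per dL blood), as in Eq. 1.
    Saturations are fractions between 0 and 1 (assumption)."""
    return hb_g_dl * 1.34 * (sa_o2 - sv_o2)


def fick_cardiac_output(vo2_ml_min: float, hb_g_dl: float,
                        sa_o2: float, sv_o2: float) -> float:
    """Cardiac output (L/min) = VO2 / avDO2, with avDO2 converted
    from mL/dL to mL/L by the factor 10 (assumed unit conversion)."""
    avdo2_ml_l = av_o2_difference(hb_g_dl, sa_o2, sv_o2) * 10.0
    if avdo2_ml_l <= 0:
        raise ValueError("arterial saturation must exceed venous saturation")
    return vo2_ml_min / avdo2_ml_l


if __name__ == "__main__":
    # Hypothetical inputs: estimated VO2 250 mL/min, Hb 14 g/dL,
    # SaO2 97 %, SvO2 70 %
    co = fick_cardiac_output(250.0, 14.0, 0.97, 0.70)
    print(f"CO = {co:.2f} L/min")
```

Because CO scales linearly with the \(\dot {\mathrm{V}}\)O2 input, any relative error in the estimated \(\dot {\mathrm{V}}\)O2 carries over to the calculated CO by the same percentage.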

The four different formulas used for \(\dot {\mathrm{V}}\)O2 estimation are listed in Table 1.

Table 1 Equations used for \(\dot {\mathrm{V}}\)O2 estimation

Statistical analysis

Data are described by count (percent), mean (standard deviation), or median (quartiles), as appropriate. Group comparisons were carried out for nominal and ordinal parameters using exact Fisher’s or chi-square tests and for metric parameters using Mann–Whitney U tests or the Kruskal–Wallis test. The TD-CO served as the reference standard and was compared with each method using the Wilcoxon signed-rank test and Pearson’s correlation coefficient (r). Bland–Altman plots illustrated the agreement between TD-derived CO and CO derived by different iFM methods [24].
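For readers less familiar with Bland–Altman analysis, the following sketch computes the mean difference (bias) and the 95% limits of agreement as bias ± 1.96 times the SD of the paired differences. The analyses in this study were run in SPSS; the Python code, the function name, and the toy values are purely illustrative assumptions.

```python
from statistics import mean, stdev


def bland_altman(reference, comparator):
    """Return (bias, lower limit, upper limit) for paired measurements.
    Bias is the mean of reference minus comparator; limits of agreement
    are bias -/+ 1.96 * sample SD of the differences."""
    if len(reference) != len(comparator):
        raise ValueError("paired measurements must have equal length")
    diffs = [r - c for r, c in zip(reference, comparator)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


if __name__ == "__main__":
    # Toy CO values in L/min (hypothetical, not study data)
    td_co = [5.0, 6.0, 4.0, 7.0]
    ifm_co = [4.0, 5.5, 4.5, 5.0]
    bias, lower, upper = bland_altman(td_co, ifm_co)
    print(f"bias = {bias:.2f} L/min, limits {lower:.2f} to {upper:.2f} L/min")
```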

The CO derived by iFMKrakau was divided by TD-derived CO to obtain the percentage of patients deviating more than 20% from the reference standard. Patients’ characteristics were then compared in the three resulting groups with ratios < 0.8 versus 0.8 to 1.2 and > 1.2. The respective ratios were also computed for iFMLaFarge, iFMDehmer, and iFMBergstra. To quantify comparability, percentage errors were calculated as 1.96 times the standard deviation (SD) of the difference between TD-CO and iFM-CO divided by their means and expressed as percent (Eq. 2):

$$\text{percentage error}\,(\%)=\frac{1.96*\text{SD of }(\text{TD-CO}-\text{iFM-CO})}{(\text{mean TD-CO}+\text{mean iFM-CO})*0.5}*100. \tag{2}$$

Percentage error estimates were also calculated between iFMKrakau-CO and other iFM-CO. As recommended by Critchley and Critchley, estimates of the percentage error < 30% were regarded as acceptable [14].
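The percentage error described above can be sketched as follows: 1.96 times the SD of the paired differences, divided by the average of the two methods' mean CO, expressed as percent. The function name and the toy values are hypothetical; the 30% threshold is the one cited in the text [14].

```python
from statistics import mean, stdev


def percentage_error(td_co, ifm_co):
    """Percentage error (%) between two paired CO methods:
    1.96 * SD(differences) / mean CO of both methods * 100."""
    if len(td_co) != len(ifm_co):
        raise ValueError("paired measurements must have equal length")
    diffs = [t - i for t, i in zip(td_co, ifm_co)]
    mean_co = (mean(td_co) + mean(ifm_co)) * 0.5
    return 1.96 * stdev(diffs) / mean_co * 100.0


if __name__ == "__main__":
    # Toy CO values in L/min (hypothetical, not study data)
    td_co = [5.0, 6.0, 4.0, 7.0]
    ifm_co = [4.0, 5.5, 4.5, 5.0]
    pe = percentage_error(td_co, ifm_co)
    verdict = "acceptable" if pe < 30.0 else "not acceptable"
    print(f"percentage error = {pe:.1f} % ({verdict})")
```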

Predictors of a deviation > 20% from the reference TD-derived CO (ratio iFMKrakau-CO/TD-CO < 0.8 OR ratio iFMKrakau-CO/TD-CO > 1.2) were determined in univariable logistic regression analysis. Variables used to compute the index, or their derivatives (e.g., TD-CO or SaO2), were excluded from the analysis. Significant univariable predictors (with p < 0.05) were included in a multivariable logistic regression analysis using the forward and backward selection methods. As the results were similar, only the backward selection method is shown.

A significant group difference was assumed for all test procedures at a (two-sided) value of p < 0.05. All statistical analyses were performed using IBM SPSS Statistics for Windows Version 28.

Results

Study population and baseline characteristics

The mean age of the 188 patients was 63 (±14.4) years, and 30% were women. Their characteristics are shown in Table 2. In 10% of the cases, the indication for RHC was diagnosis or suspicion of pulmonary hypertension of non-cardiac origin (N = 18/188). Most RHC procedures were performed for a cardiac reason: 78 patients had grade III valvular heart disease (VHD; N = 28/78, 36% with accompanying HFrEF); 35 patients were suspected of having higher-grade VHD, which could be invasively excluded (N = 14/35; 40% with HFrEF); 54 patients had heart failure without higher-grade VHD (N = 11/54 had heart failure with preserved ejection fraction [HFpEF] of different origins); three patients were suspected of having constrictive pericarditis.

Table 2 Baseline characteristics

For group comparisons, patients were divided according to the iFMKrakau-CO/TD-CO index into three groups: iFMKrakau underestimated CO (index < 0.8) compared to TD in 74 (39%) patients, whereas overestimations (index > 1.2) occurred in 11 patients (5.9%). For the remaining 103 (55%) patients, the index was between 0.8 and 1.2. This group was classified as the group with “no significant deviation” (Table 2).
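The grouping rule can be sketched as below. The thresholds (< 0.8, 0.8 to 1.2, > 1.2) come from the text; the function name, the group labels as strings, and the toy ratios are hypothetical.

```python
from collections import Counter


def classify_ratio(ifm_co: float, td_co: float) -> str:
    """Classify a patient by the iFM-CO/TD-CO index."""
    ratio = ifm_co / td_co
    if ratio < 0.8:
        return "underestimation"
    if ratio > 1.2:
        return "overestimation"
    return "no significant deviation"


if __name__ == "__main__":
    # Toy (iFM-CO, TD-CO) pairs in L/min (hypothetical, not study data)
    pairs = [(3.5, 5.0), (4.8, 5.0), (6.5, 5.0), (5.0, 5.0)]
    counts = Counter(classify_ratio(ifm, td) for ifm, td in pairs)
    for group, n in counts.items():
        print(f"{group}: {n}/{len(pairs)}")
```

Values exactly at 0.8 or 1.2 fall into the group with no significant deviation under this rule.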

Patients for whom iFMKrakau overestimated CO, i.e., ratio > 1.2, had lower TD-derived CI than patients with no significant deviation (Table 2).

By contrast, patients for whom iFMKrakau underestimated CO, i.e., ratio < 0.8, were smaller, had a higher intake of loop diuretics, and more often had severe aortic valve stenosis than patients with no significant deviation (Table 2). In RHC, their pulmonary artery pressure, TD-CO, and TD-CI levels were higher than in patients with no significant deviation (all p < 0.05).

CMR and discrepancies between iFMKrakau-CO and TD-CO

Reduced LV and RV stroke indexes were the only variables that showed a trend toward a deviation of > 20% between iFMKrakau-CO and TD-CO (p = 0.066 and p = 0.052 in patients with no deviation vs. a ratio > 1.2, respectively). All other variables were not significantly different between groups (Table 2).

Correlations between different CO methods

Table 3 summarizes the differences and correlations between TD-derived CO, cardiac index (CI), and oxygen consumption (\(\dot {\mathrm{V}}\)O2, mL/min) compared to different iFM methods. Associations between CO derived by TD and iFM methods were visualized in a scatter plot, including best-fit lines for linear regression models for each equation (Fig. 1).

Table 3 Comparison of thermodilution-derived cardiac output, cardiac index, and oxygen consumption with different indirect Fick methods
Fig. 1 Cardiac output (CO) determined by thermodilution (TD) compared to various indirect Fick methods with best-fit lines and simple linear regression

The TD-CO was significantly higher than all iFM-derived CO (Table 3; all p < 0.05). The correlation coefficients between TD-CO and iFM-CO were similar in magnitude for all iFM (r = 0.75–0.76; p < 0.001; Table 3). Associations were similar, albeit numerically smaller, when CI or \(\dot {\mathrm{V}}\)O2 instead of CO values were compared (Table 3).

When iFMKrakau-CO was compared with the other iFM-CO levels, significant differences were seen for all methods (all p ≤ 0.001), except for iFMDehmer (p = 0.19). Associations were similar when iFMKrakau-CI and iFMKrakau-\(\dot {\mathrm{V}}\)O2 were compared with the other iFM measures (CI: all p ≤ 0.001, except for iFMDehmer, p = 0.12; \(\dot {\mathrm{V}}\)O2: all p ≤ 0.001, except for iFMDehmer, p = 0.15).

Level of agreement between different CO methods

Table 4 shows the percentage of deviations > 20% from TD-derived CO measurements for each iFM calculated. As shown, iFMLaFarge yielded the highest proportion of total deviations (64%), mainly via a high number of patients with a ratio < 0.8 (60%). By contrast, iFMBergstra yielded the lowest proportion (28% with deviations > 20%), while iFMKrakau and iFMDehmer performed in between (deviations > 20% in 45% and 43%, respectively). Concerning interrater reliability, iFMDehmer had the highest Cohen’s kappa coefficient among the calculated iFM estimates compared to iFMKrakau.

Table 4 Discrepancies greater than 20% between different iFM-CO methods and TD-CO

The level of agreement between TD and different iFM methods was further depicted in Bland–Altman plots (Fig. 2). The lowest mean difference (0.14 L/min), but numerically also the broadest limits of agreement (+2.51 to −2.23), were seen for iFMBergstra, while iFMLaFarge had the largest mean difference (1.24 L/min) and numerically the narrowest limits of agreement (3.45 to −0.98; Fig. 2). Mean differences between TD-CO and iFMKrakau- and iFMDehmer-derived CO were 0.77 L/min and 0.76 L/min, with similar limits of agreement (iFMKrakau 3.01 to −1.47; iFMDehmer 3.05 to −1.51; Fig. 2). Percentage errors between TD-CO and different iFM were similar, but all greater than 30% (iFMKrakau/iFMLaFarge/iFMDehmer/iFMBergstra 44%/45%/44%/43%), indicative of poor comparability.

Fig. 2 Bland–Altman plots comparing the cardiac output (CO) measured by thermodilution (TD; reference standard) and four indirect Fick (iFM-CO) methods

When iFMKrakau was compared with the other iFM, the lowest mean difference (0.01 L/min) was seen for the comparison with iFMDehmer (limits of agreement −0.65 to 0.63 L/min), but the comparisons with iFMLaFarge (mean difference 0.46 L/min; limits of agreement +1.15 to −0.22) and iFMBergstra (mean difference −0.63 L/min; limits of agreement −1.2 to −0.05) also showed closer agreement than the comparisons with the TD method. Percentage errors between iFMKrakau and other iFM were all less than 30% (iFMLaFarge/iFMDehmer/iFMBergstra 15%/13%/11%).

Determinants of a significant discrepancy

Table 5 lists the baseline characteristics predicting a deviation of > 20% between iFMKrakau-derived and TD-derived CO in univariable logistic regression (age and sex forced into the analysis). Significant predictors were included in a multivariable model (age and sex forced into the analysis), in which male sex was associated with a reduced risk, while high-grade aortic valve stenosis and higher mean pulmonary artery pressure were associated with an increased risk of a significant deviation (Table 5).

Table 5 Determinants of a discrepancy of > 20% between thermodilution and iFMKrakau-derived CO

Discussion

In this retrospective cohort study, we investigated the agreement between TD-derived CO and CO derived from four different equations for \(\dot {\mathrm{V}}\)O2 estimation, including a non-validated formula proposed by Krakau [17]. We further determined independent predictors of a significant deviation between TD-derived CO and CO estimated according to Krakau.

The principal findings were:

  1. None of the formulas tested showed good agreement with the reference standard, but the Krakau formula correlated highly with the Dehmer equation in different method-comparison analyses.

  2. None of the variables assessed in routine CMR were predictive of an increased risk for a discrepancy of > 20% between iFMKrakau and TD-derived CO, but female sex, high-grade aortic valve stenosis, and higher pulmonary pressure were independent predictors in multivariable logistic regression.

Right heart catheterization is indicated whenever relevant diagnostic information or therapeutic consequences are expected from the results [25]. In the last few years, the number of RHC procedures has been continuously decreasing, as the procedure was primarily recommended for evaluation of pulmonary hypertension, advanced heart failure, and specific conditions such as congenital heart diseases, pericarditis, and restrictive cardiomyopathies [1, 26]. However, RHC may also provide important prognostic and diagnostic information in valvular heart disease (VHD) and (non-advanced) HFrEF, especially in those with discordant clinical findings [25]. Further, RHC is required for basic cardiology training; thus, regular examinations are mandatory in training centers to meet educational requirements. One of the critical measures in RHC is the accurate assessment of CO.

The direct Fick and the TD method are guideline-advocated methods for measuring CO. As the direct Fick method is not suitable for routine application in the cardiac catheter laboratory for various reasons, TD is regarded as the clinical gold standard despite the multiple caveats associated with this method [27].

In clinical practice, the iFM may be preferred over the gold standards for feasibility reasons. However, there are multiple \(\dot {\mathrm{V}}\)O2 estimation formulas with significant differences between them [11, 13, 28, 29]. Although these variations are widely published, even renowned journals do not require authors to cite the exact equations used for CO estimation [15, 16]. The largest study comparing TD-derived CO with iFM-derived CO analyzed 12,232 patients from the Veterans cohort and 3197 patients from the Vanderbilt cohort and showed that TD-derived CO was more effective in predicting mortality than iFM-derived CO, but did not specify the formulas used for \(\dot {\mathrm{V}}\)O2 estimation [15]. Fares et al. studied a cohort of patients with pulmonary hypertension and found a difference of > 20% between TD-CO and iFM-CO in 36% of patients but did not reference the formula(s) used for \(\dot {\mathrm{V}}\)O2 estimation [16]. The Krakau formula [17], by contrast, is mentioned as a standard for \(\dot {\mathrm{V}}\)O2 estimation in various cardiology textbooks [17, 30, 31] and publications [32, 33], although it has never been validated and no information on its derivation cohort has been published to date. We showed that iFMKrakau-derived CO did not agree with the gold standard TD-CO, but the associations were not worse than with the scientifically better evaluated approximation formulas proposed by LaFarge and Miettinen [8], Dehmer et al. [9], and Bergstra et al. [10]. In fact, the Krakau formula agreed reasonably well in all method-comparison analyses (percentage errors, median levels, mean differences, levels of agreement) with the formula proposed by Dehmer et al. [9]. The study by Dehmer et al. was also the only one that did not include children in the derivation cohort (N = 164; [9]). Here, the mean age of the adult patients was 50 years (age range: 21–75 years), and cardiac output was measured with either the TD (N = 89) or the dye dilution (N = 75) method. The cohorts from which the other formulas originate are not comparable with the standard adult patients receiving RHC today [11].

LaFarge and Miettinen investigated a predominantly pediatric cohort (76% aged 3–16 years) to derive the formula, using age, BSA, and heart rate as determinants of measured \(\dot {\mathrm{V}}\)O2 levels [8]. In most of the studies described, including the current one, this formula tended to underestimate \(\dot {\mathrm{V}}\)O2 and consequently the CO levels, and regularly yielded the highest number of patients with significant discrepancies from TD-CO [11, 13, 29].

The formula by Bergstra et al. [10], which originates from a cohort of 250 patients with a mean age of 34.6 ± 22 years (range: 1–84 years), in which \(\dot {\mathrm{V}}\)O2 levels were estimated by the indicator-dilution method, regularly overestimates \(\dot {\mathrm{V}}\)O2 and CO levels [11, 13, 28]. Patients were generally poorly characterized in all the aforementioned derivation cohorts. Several characteristics may affect the results of iFM-CO or TD-CO assessment and thus alter the congruity between both methods.

Higher-grade tricuspid regurgitation (TR) and heart failure are alleged to be critical confounders of TD-CO assessment and hence may explain a higher degree of discrepancy between TD- and iFM-derived CO [5]. In our study, neither reduced ejection fraction nor high-grade TR were predictive of a CO discrepancy > 20%.

There are conflicting data on whether the TD procedure is methodologically suitable in the presence of higher-grade TR, which is why some authors in this situation endorse the Fick (or iFM) method [5, 34, 35]. As there may be backflow of a certain amount of the injected volume, a lower amount of the predefined cold volume reaches the tip of the catheter in time, which subsequently may lead to a reduction or “underestimation” of CO by the TD method in high-grade TR [34, 35]. Other clinical or experimental studies could not confirm a significant discrepancy between TD-derived and Fick-derived CO [36,37,38].

We hypothesized that reduced right ventricular (RV) function may have confounded previously described associations of high-grade TR and reliability of CO measurement by the TD method. But neither echocardiographic (tricuspid annular plane systolic excursion, RV end-diastolic diameter) nor CMR-derived parameters of RV function and morphology (RV ejection fraction, RV end-diastolic volume, RV-CO, RV stroke index) were significantly associated with a higher number of discrepancies between iFM-derived and TD-derived CO in our study.

In our cohort, female sex, higher pulmonary artery pressure, and high-grade aortic valve stenosis were independent predictors of an increased risk for discrepancies. While female sex and pulmonary hypertension are well-known predictors of relevant differences between the iFM and the TD method, the associations for aortic valve stenosis are less clear [11, 16].

The lower agreement between TD-derived and iFM-derived CO in patients with higher-grade aortic valve stenosis (AS) was already shown in another study, in which the correlation coefficient between both methods was only r = 0.56 [39]. Correlations were similar in our study when patients with high-grade AS were selected (according to different iFM methods: r = 0.52–0.59), and weaker than in the total cohort (according to different iFM methods: r = 0.74–0.76).

The reason for the weaker correlation in AS is unclear, but it demonstrates once again that there is no “one-size-fits-all” \(\dot {\mathrm{V}}\)O2 approximation model that can be used for every patient. Since the characteristics of patients receiving RHC today differ significantly from those in the period 1970–1999, when the respective \(\dot {\mathrm{V}}\)O2 approximation formulas were published [8,9,10, 17], the results are not surprising; they underline that the best methods currently available for CO measurement are the established ones advocated by the guidelines [1].

Limitations

Our single-center study has some limitations, such as the retrospective design, the nonstandardized mode of data collection, and the exclusion of one third of the patients identified due to missing values. Additionally, as availability of CMR was an inclusion criterion for this study, the cohort was younger and healthier than the average patient undergoing RHC. Thus, the study did not include patients with non-CMR-compatible implantable devices, severe dyspnea when lying flat, or severe renal dysfunction. However, proper characterization of the cardiac chambers in CMR was one of the primary objectives of this study, and the results of comparative analyses between different iFM-CO and TD-CO were similar to already published data. Further, we did not measure CO according to the direct Fick method. Since TD is regarded as the clinical gold standard and direct \(\dot {\mathrm{V}}\)O2 measurements are not performed routinely in most catheter laboratories anyway, results still reflect real-world conditions.

Conclusion

In conclusion, the as-yet unvalidated \(\dot {\mathrm{V}}\)O2 estimation formula proposed by Krakau performed better than the formula introduced by LaFarge and Miettinen, worse than the formula of Bergstra et al., and comparably to the formula proposed by Dehmer et al., when iFM-derived CO was calculated and the number of significant discrepancies from TD-derived CO was compared. However, none of the formulas showed good agreement with TD-derived CO. Newer derivation cohorts and formulas are needed for the estimation of \(\dot {\mathrm{V}}\)O2 that are more comparable with today’s patients undergoing RHC.