Whole-body magnetic resonance imaging for detection of malignant disease: A head-to-head comparison with whole-body 18F-FDG-PET/CT as the reference standard

Introduction: The combined use of 18F-FDG-PET and computed tomography (CT) is an integrated part of diagnosing and staging patients with suspected malignancy or other pathologies. Magnetic resonance imaging (MRI) has proven its use as a non-ionizing imaging alternative for identifying potential malignancy in specific organs. Objective: To investigate the clinical value of whole-body MRI (WB-MRI) in the detection of suspicious lesions, we compared 1.5T MRI findings with those obtained from a whole-body 18F-FDG-PET/CT scan, the latter serving as the reference standard. Finally, the findings were compared with histology if available. Materials and methods: Twenty-five patients (9 women and 16 men, mean age ± SD: 64.5 ± 11.8 years; range: 34-85 years) with suspected malignancy or other pathologies were enrolled. All patients were scanned using both modalities. Imaging included the head, torso and extremities. Images were scored blinded by experienced readers: two radiologists and two nuclear medicine physicians. Statistical tests included weighted kappa for measuring interobserver reliability and the Wilcoxon signed-rank test for detecting differences between paired observations. Results and discussion: Interobserver reliability between each pair of specialists was fair to strong (weighted kappa). Statistically significant differences between the findings of the two modalities were found in the colon (p=0.016), the soft tissues of the extremities (p=0.002) and the skeleton of the extremities (p=0.008). Twelve patients had histology available. WB-MRI and whole-body 18F-FDG-PET/CT each found 10 of these cases (sensitivity: 83.3%, 95% CI: 55.2%-95.3%). Conclusion: The diagnostic value of WB-MRI equaled that of whole-body 18F-FDG-PET/CT. The MRI approach could therefore be considered in patients unsuitable for 18F-FDG imaging, e.g. younger patients, pregnant patients or patients with dysregulated diabetes.


Introduction
First-line examinations when malignant disease or other pathologies (e.g. inflammation or infectious disease) are suspected include clinical evaluation, biochemical tests, X-ray, ultrasound and computed tomography (CT). Fluorine-18 fluorodeoxyglucose positron emission tomography combined with low-dose non-contrast-enhanced CT (18F-FDG-PET/CT) may be added to the examination repertoire to stage and characterize potential pathological lesions further. The results are used for planning the subsequent treatment, as the modality is highly sensitive for clarification of staging, prognosis and response to therapy [1][2][3][4]. Furthermore, the 18F-FDG-PET/CT modality is useful in staging malignancies of the breast, colon and rectum, head and neck, lung, esophagus and thyroid, as well as lymphoma and melanoma, and in detecting an unknown primary tumor [5].
Inherent limitations of the 18F-FDG-PET/CT hybrid modality may cause diagnostic challenges, such as misregistration artifacts when merging CT and PET images [6]. Furthermore, 18F-FDG uptake is not specific for malignant lesions versus other pathologies, and the results may be hampered by physiologic variations in tracer accumulation. Conversely, false-negative results may be seen in low-grade tumors, as they exhibit slow metabolic rates [7].
Non-ionizing whole-body magnetic resonance imaging (WB-MRI) could be an alternative or supplement to the nuclear-based imaging approach [8][9][10][11][12]. Studies have shown that information on tumor burden and tumor spread can be obtained from MRI without the use of a contrast agent [13,14]. Hence, MRI is useful in identifying malignancies or other pathologies in specific organs, but its ability to screen the entire body, as 18F-FDG-PET/CT does, remains to be elucidated. We hypothesized that WB-MRI could be a radiation-friendly screening alternative to 18F-FDG-PET/CT for identifying potential suspicious lesions in organs, soft tissues, lymph nodes and/or the skeleton. In this context, MRI suspicious lesions are defined as focal areas of high signal on the whole-body diffusion-weighted (WB-DWI) images (b=900 s/mm²) with correspondingly low signal on the apparent diffusion coefficient (ADC) map.
The purpose of the study was to investigate the diagnostic value of WB-MRI with 18F-FDG-PET/CT as the reference method. In a subgroup of patients with an inconclusive CT scan, histology served as the reference method for both modalities (MRI and 18F-FDG-PET/CT).

Patient group
All included patients were referred for an 18F-FDG-PET/CT scan due to inconclusive findings on a preliminary diagnostic CT scan suspicious for malignancy or other pathologies (e.g. inflammation or infection). Patients with prior malignant disease were excluded from the study. Patient inclusion was carried out during a one-year period (August 2014 to August 2015). Twenty-five patients were included in the study (9 women and 16 men, mean age ± standard deviation: 64.5 ± 11.8 years; range: 34-85 years). None of the patients received anti-diabetic drugs, as these may influence the sensitivity of 18F-FDG-PET/CT, especially in the abdomen. The study was approved by the local ethical committee (case no. 1-10-72-381-13) and all enrolled patients gave their written consent.

MRI techniques
All MRI scans were carried out on a Siemens Avanto 1.5T MRI scanner (software release VB17a, Siemens Healthcare GmbH, Erlangen, Germany). The patients were placed in the supine position and entered the bore head first with their arms placed by their sides. MRI signal detection was done with dedicated receive coils that covered the entire scan area from head to toes. The imaging protocols were divided into two areas of interest. One aimed specifically at the whole spine (bones, intervertebral discs, spinal cord and cerebrospinal fluid) using sagittal slice planes, while the other targeted the whole body (organs, soft tissues and lymph nodes) using axial slice planes. No contrast agent was administered for the MRI examinations. Apparent diffusion coefficient (ADC) maps were calculated (b=50 s/mm² and b=900 s/mm²) to evaluate possible T2 shine-through effects. Diffusion images were re-sliced to the coronal slice plane and contrast-inverted to resemble the nuclear-based scans.
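The ADC calculation from two b-values can be illustrated with a minimal sketch. It assumes the standard mono-exponential diffusion model, S(b) = S0 · exp(-b · ADC), which gives a closed-form ADC for two b-values; the source does not state how the scanner software computed the maps, and the function name and voxel signals below are hypothetical.

```python
import math


def adc_two_point(s_low, s_high, b_low=50.0, b_high=900.0):
    """Return the ADC (mm^2/s) from signals at two b-values (s/mm^2).

    Assumes the mono-exponential model S(b) = S0 * exp(-b * ADC).
    """
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signal intensities must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must exceed b_low")
    return math.log(s_low / s_high) / (b_high - b_low)


# Hypothetical voxels (arbitrary units): both are equally bright at b=900.
adc_restricted = adc_two_point(s_low=1000.0, s_high=450.0)
adc_shine_through = adc_two_point(s_low=2000.0, s_high=450.0)

# The second voxel is bright at b=900 only because its baseline signal is
# high; its larger ADC is the pattern referred to as T2 shine-through.
print(f"restricted:    {adc_restricted:.2e} mm^2/s")
print(f"shine-through: {adc_shine_through:.2e} mm^2/s")
```

Two voxels with identical b=900 signal can thus have different ADC values, which is why the high-b images were read together with the ADC map.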

18F-FDG-PET/CT techniques
All 18F-FDG-PET/CT scans were carried out on a Siemens Biograph mCT 4R-64 slice PET/CT scanner (Siemens Healthcare GmbH, Erlangen, Germany). The patients were placed in the supine position with their arms placed above their heads. Patients were instructed to fast for at least 6 h prior to the 18F-FDG injection. A dose of 18F-FDG (4 MBq/kg) was administered intravenously into a cubital vein, after which the scans were obtained 60 ± 5 min post injection (30 min resting in bed and 30 min resting while sitting prior to scanning). The 18F-FDG-PET/CT scans were performed by initially obtaining a whole-body low-dose CT scan (slice width 2.0 mm, collimation 64×0.6 mm, pitch 1.2 and rotation time 0.5 s, using dose modulation for both mAs (quality reference mAs 50, effective mAs 120) and voltage (ref. 120 kV; allowed range 100-140 kV)). This was followed by a whole-body 18F-FDG-PET scan to ensure that the same anatomical area as the MRI examination was imaged (21 cm axial field of view, typically 12-15 bed positions depending on patient height, with acquisition times of 2-3 min/bed position depending on patient BMI; a typical scan being 3 min/bed position over the head, torso and abdomen, with a reduction to 2 min/bed position over the lower extremities).
Postprocessing of the CT data was performed using two axial iterative reconstructions: one for attenuation correction of the 18F-FDG-PET data (2 mm slice thickness, extended field of view (780 mm), B30f medium smooth kernel) and the other for fusion with the PET images to localize and characterize the uptake in the PET images (2 mm slice thickness, 500 mm field of view, SAFIRE strength 3, I30f medium smooth kernel).
The 18F-FDG-PET images were iteratively reconstructed using both attenuation and scatter correction. Ultra-high definition PET (Siemens True X) with time-of-flight and point spread function corrections was applied (2 iterations/21 subsets, matrix size = 400, zoom 1, 2 mm Gaussian filter and 2 mm slice thickness).

Data evaluation
Two experienced MRI radiologists and two experienced nuclear medicine physicians evaluated the MRI and 18F-FDG-PET/CT images using a standardized form. MRI images were evaluated using a standard clinical PACS system (Agfa IMPAX version 6.5, Agfa Healthcare NV, Mortsel, Belgium), and 18F-FDG-PET/CT images were evaluated using a dedicated workstation (Siemens syngo.viaWorkplace MI, Siemens Healthcare GmbH, Erlangen, Germany).
The observers were instructed to count the number of suspicious lesions/18F-FDG avid foci found in the lymph nodes (collum, axilla, mediastinum, lungs, abdomen, pelvis and inguinal), the organ systems (cerebrum, extra cerebral, collum, lungs and pleura, mediastinum, hepar, vesica fellea, spleen, renes, adrenals, pancreas, genitalia interna and vesica urinariae), the gastrointestinal tract (oesophagus, ventriculus, intestinum, colon and mesenterium), the soft tissues (collum, thorax, abdomen, pelvis and extremities) and the skeleton (cranium and faciem, columna, thorax, pelvis and extremities). MRI suspicious lesions were identified on the WB-DWI images (b=900 s/mm²) with ADC map correlation and compared to the morphological MRI images with respect to anatomy, localization and size, and to the preexisting CT scan.

18F-FDG-PET/CT lesions (FDG avid foci) were identified visually as having an 18F-FDG uptake appreciably higher than the surrounding background, to keep the sensitivity as high as possible. For both modalities, all findings were noted on a 4-point scale. A 0 was assigned when no lesion was detected, a 1 when one lesion was detected, a 2 when two lesions were detected and a 3 whenever three or more lesions were detected. The 4-point scale was chosen for several reasons. It was important to distinguish between no lesion (0) and the presence of lesions in individual organs (1-3). Thus, the most important step on the scale is from 0 to 1, while 2 and 3 were used to give the four reviewers (radiologists and nuclear medicine physicians) the opportunity to indicate the severity of lesions in a given organ.
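The scoring rule above amounts to capping the lesion count at three. A minimal sketch follows; the function names are hypothetical and are not taken from the study's standardized form.

```python
def lesion_score(n_lesions: int) -> int:
    """Map a lesion count in one anatomical region to the 4-point scale.

    0 = no lesion, 1 = one lesion, 2 = two lesions, 3 = three or more.
    """
    if n_lesions < 0:
        raise ValueError("lesion count cannot be negative")
    return min(n_lesions, 3)


def any_lesion(score: int) -> bool:
    """The step described as most important: no lesion (0) versus any (1-3)."""
    return score >= 1


# Hypothetical lesion counts for one region in five patients.
counts = [0, 1, 2, 3, 7]
scores = [lesion_score(c) for c in counts]
print(scores)
print([any_lesion(s) for s in scores])
```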

Histological findings
The combined results from the 18F-FDG-PET/CT scans and the clinical examinations led to biopsies at specific locations in some of the patients. Due to logistics, not all suspicious lesions identified by 18F-FDG-PET/CT and/or WB-MRI were supported by biopsies. If histology was obtained, this information served as the ground truth when comparing the two modalities. These results were compared at patient level.
Differences between the paired observations (WB-MRI vs. 18F-FDG-PET/CT) were assessed using two-sided Wilcoxon signed-rank tests, and p values ≤ 0.05 were considered statistically significant. For this purpose, we used a worst-case consensus approach in which, for each unique anatomical region, the score of the reviewer who noted the highest number of suspicious lesions/18F-FDG avid foci was used for the statistical analysis.
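The worst-case consensus and the paired comparison can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis (which was run in Analyse-it): the scores are hypothetical, the function names are invented, and the sketch stops at the signed-rank sums, leaving the p-value to a statistical package.

```python
def worst_case_consensus(rater_a, rater_b):
    """Per-patient maximum of two raters' scores for one anatomical region."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same patients")
    return [max(a, b) for a, b in zip(rater_a, rater_b)]


def signed_rank_sums(x, y):
    """Return (W_plus, W_minus) for paired samples x and y.

    Zero differences are dropped and tied absolute differences share
    their average rank, as in the usual Wilcoxon signed-rank procedure.
    """
    diffs = sorted((a - b for a, b in zip(x, y) if a != b), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical 0-3 scores for one region in six patients.
mri = worst_case_consensus([0, 1, 0, 0, 2, 0], [0, 0, 0, 1, 3, 0])
pet = worst_case_consensus([1, 1, 0, 2, 3, 1], [0, 2, 0, 3, 3, 0])
print(mri, pet, signed_rank_sums(mri, pet))
```

Taking the maximum favours sensitivity: a region counts as positive for a modality as soon as either of its two readers flagged it.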
The time between the two examinations, the total scan time used for scanning (scan preparation and time spent in each scanner), the time used by each observer for evaluation of the image data and the total radiation dose were also noted. All observers were blinded to the findings of the other raters.
To detect a paired difference of one point in the classified findings (0 to 3) between the two examinations, assuming a standard deviation of 1.6, a sample size of 23 was needed (power: 80%, significance level: 5%). To account for dropouts, we therefore included a total of 25 patients in the study. All statistical analyses were performed in Microsoft Excel 2016 with the statistical add-in tool pack Analyse-it (Version 4.65.3, Analyse-it Software Ltd, Leeds, England).
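The sample size statement can be checked with a short calculation. The source does not say which formula was used; the sketch below assumes a common normal-approximation formula for paired differences, with and without a small-sample correction term of z²/2. With the stated inputs, the corrected version comes out at the reported 23, which is consistent with the text but does not confirm the authors' method.

```python
import math
from statistics import NormalDist


def paired_sample_size(delta, sd, alpha=0.05, power=0.80):
    """Sample size for detecting a mean paired difference `delta`.

    Returns (n_normal, n_corrected): the plain normal approximation and
    the same value with a z^2/2 small-sample correction, both rounded up.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    n_normal = ((z_alpha + z_beta) * sd / delta) ** 2
    n_corrected = n_normal + z_alpha ** 2 / 2
    return math.ceil(n_normal), math.ceil(n_corrected)


# Inputs stated in the text: difference of 1 point, SD 1.6, 80% power, 5% level.
print(paired_sample_size(delta=1.0, sd=1.6))
```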

Results
Interobserver reliability analysis between the two MRI radiologists yielded weighted kappa values of 0.26-0.90, while the nuclear medicine physicians yielded values of 0.26-0.88, indicating fair to strong reliability in both modalities. The Wilcoxon signed-rank tests found statistically significant differences in the colon (p=0.016), the soft tissue of the extremities (p=0.002) and the skeleton of the extremities (p=0.008). All statistical data can be found in Table 1.
Histological tests were performed in 12 of the 25 patients. Table 2 shows the histological findings of primary cancer and the corresponding findings using the two modalities. Discrepancies between modalities were found in two patients (Patient #3 and #13), both of whom had prostate cancer. One of them was identified by 18F-FDG-PET/CT and not by WB-MRI, while the other was identified by only one of the MRI radiologists and by neither of the nuclear medicine physicians. Finally, both modalities missed a renal cell carcinoma (Patient #14). Counting a modality as positive whenever at least one of its two readers (MRI radiologists or nuclear medicine physicians, respectively) reported a lesion led to a sensitivity of 83.3% (95% CI: 55.2-95.3) for both MRI and PET/CT.
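The sensitivity and its confidence interval follow from 10 detected cases out of 12 with histology. The source does not name the interval method; the sketch below uses the Wilson score interval, which is an assumption on our part, chosen because it reproduces the reported bounds.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


detected, with_histology = 10, 12  # counts stated in the text
low, high = wilson_interval(detected, with_histology)
print(f"sensitivity {detected / with_histology:.1%}, "
      f"95% CI {low:.1%} to {high:.1%}")
```

The interval is wide because only 12 patients had histology, so the equal point estimates for the two modalities should be read with that uncertainty in mind.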
The average time between the WB-MRI and 18F-FDG-PET/CT examinations was 3.8 ± 3.1 days (range: 0-12 days). The mean preparation times for the WB-MRI and 18F-FDG-PET/CT examinations were 12.6 ± 3.1 min and 10 ± 0.0 min, while the total scan times were 75.5 ± 9.6 min and 31.6 ± 2.0 min, respectively. The mean time spent on image evaluation was 119 ± 78 min for the MRI and 9 ± 4 min for the 18F-FDG-PET/CT evaluation.

Table 1 Statistical calculations for hot spot lesion detection using whole-body MRI and 18F-FDG-PET/CT. Weighted kappa for interobserver reliability between two radiologists (MRI) and two nuclear medicine physicians. Two-sided Wilcoxon signed-rank tests were used to detect differences between paired observations (MRI vs. 18F-FDG-PET/CT).

A positive sign (+) means that the suspicious lesions/18F-FDG avid foci were found by both raters; a negative sign (-) means that the malignancy was missed by both raters. Grey markings are patients with positive histological findings; 'No malignancy' indicates that neither of the modalities found any suspicious lesions/18F-FDG avid foci; Gender: = female; = male; * In this patient, one WB-MRI rater found the lesion while the other missed it.
The total radiation dose for the 18F-FDG-PET/CT examinations was 11.7 ± 2.3 mSv (mean ± standard deviation; 6.1 mSv from the low-dose CT and 5.6 mSv from the 18F-FDG injection).
For radiation-friendly screening purposes, the clinical value of whole-body MRI (WB-MRI) in the detection of suspicious lesions is similar to avid foci detection in whole-body 18F-FDG-PET/CT. The WB-MRI technique is slow and data analysis cumbersome compared to the applied nuclear imaging technique. However, the MRI approach could be feasible in vulnerable patient subgroups such as young patients, pregnant patients or patients with dysregulated diabetes.

Discussion
Fair to strong interobserver reliability (weighted kappa values of 0.26 to 0.88-0.90) was found between each pair of specialists. In this context, kappa values cannot be used directly to discriminate between random and systematic differences. Therefore, any disagreement caused by chance cannot be separated from any consistent pattern in the sampled data [19]. Larger disagreements (e.g. scale values 0 vs 3) lead to larger decreases in weighted kappa than slight disagreements (e.g. scale values 1 vs 2 or 2 vs 3). To this end, as much information as possible was derived from the data by analyzing them on the most detailed scale available, meaning the 4-point scale (as opposed to dichotomizing the scale into two categories, 0 vs 1-3). Thus, the kappa values indicate that raters of both modalities do not always agree and that the interobserver variability is approximately the same.
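How weighted kappa penalizes large disagreements more than small ones can be shown with a minimal sketch. It assumes linear disagreement weights, |i - j| / (k - 1); the source does not say whether linear or quadratic weights were used, and the ratings below are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_categories=4):
    """Cohen's weighted kappa with linear disagreement weights.

    Scores are integers 0 .. n_categories - 1 (here the 0-3 lesion scale).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of cases")
    n, k = len(rater_a), n_categories
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    observed_disagreement = expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1 - observed_disagreement / expected_disagreement


# Hypothetical ratings: one reader fixed, the other differing in one case.
reference = [0, 1, 2, 3, 0, 1, 2, 3]
slight = [0, 1, 3, 3, 0, 1, 2, 3]  # one case scored 3 instead of 2
large = [3, 1, 2, 3, 0, 1, 2, 3]   # one case scored 3 instead of 0

kappa_slight = weighted_kappa(reference, slight)
kappa_large = weighted_kappa(reference, large)
print(round(kappa_slight, 3), round(kappa_large, 3))
```

In this example a single one-step disagreement lowers kappa less than a single three-step disagreement, which is the behaviour described in the paragraph above.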
Our main finding was that there was no statistically significant difference between WB-MRI and 18F-FDG-PET/CT with regard to the detection of suspicious lesions and 18F-FDG avid foci in 32 out of 35 unique anatomical regions, including the lymph nodes, the organ systems, the gastrointestinal tract, the soft tissues of the trunk and the skeleton. Figure 1 shows an example of the two modalities with converging findings in a patient with malignant disease in the liver and the lungs. These findings seem to indicate that WB-MRI can be used as an alternative non-ionizing screening tool, which is supported by the findings of Stecco et al. [20], who underlined the usefulness of WB-MRI for cancer screening, staging and follow-up.
A statistically significant difference between the two modalities was, among others, found in the colon (p = 0.016). Figure 2 shows an example of a patient with an 18F-FDG avid focus in the colon while the MRI examination found no noticeable pathology. The nuclear medicine physicians noted that their findings may be physiologic in origin and therefore possibly benign. In comparison, the suspicious lesion found on the DWI images (b=900 s/mm²) had no ADC correlation, and the MRI radiologists therefore noted the finding as benign. In this particular case, the suspicious lesion was most likely caused by a T2 shine-through effect and was therefore not marked as suspicious on the MRI scan. Nevertheless, as no biopsy was carried out for the specific lesion, it can only be stated that the two modalities differ in this particular case, not which modality was actually right.
Statistically significant differences between the two modalities were also found in the soft tissue of the extremities (p = 0.002). A total of ten patients had 18F-FDG avid foci, while only one patient had a single suspicious lesion identified using WB-MRI. The 18F-FDG-PET/CT findings included tissue in the proximity of the humerus, the knees, the feet and the skin of the extremities. The discrepancy between the findings may have several origins. Firstly, the arms of the patients were placed at the outer rim of the magnetic field of view, causing a clearly visible reduction in image quality (lower signal-to-noise ratio and contrast). Secondly, images of the legs were acquired using MRI coils that are not optimized for imaging the limbs, as e.g. dedicated knee or foot coils are. Thus, in several cases, the area with positive findings by the nuclear technique was either poorly imaged (e.g. noisy images or lack of contrast) or even missing from the MRI data due to its physical placement at the rim of the signal-sensitive parts of the MRI receive coils. To acquire adequate data from these areas with MRI, the total scan time would have to be extended. Even though it seems unlikely, the 18F-FDG-PET/CT may have yielded false-positive lesions. Hence, as there are no biopsy reports from these areas, it can only be stated that the two modalities differ. Figure 3 shows an example of the issue.
Finally, statistically significant differences were seen between lesions in the skeleton of the extremities (p = 0.008). In this case, the 18F-FDG-PET/CT technique found eight patients with 18F-FDG avid foci in the skeleton, while MRI found one with a suspicious lesion. The differences can again, in part, be explained by the poor image quality of the extremities using MRI.
Based on our findings, we do not recommend that future WB-MRI protocols for screening purposes include the extremities, as the images are of non-diagnostic quality. Furthermore, removing the extremities from the imaging protocols will reduce the total MRI scan time substantially. The same may hold true at higher magnetic field strengths. The next step, if any suspicious lesion is identified, is to carry out dedicated MRI protocols targeted at the suspected pathology.
Comparing the findings of the two modalities with the histological tests, three cases exhibited discrepancies, as evident from Table 2. Histological evidence indicated that two of these patients had prostate cancer (Patient #3 and #13). One of them was identified by 18F-FDG-PET/CT and not by WB-MRI, while the other was identified by only one of the MRI radiologists and by neither of the nuclear medicine physicians. It is well known that many prostate cancers do not present with increased 18F-FDG uptake, and 18F-FDG-PET/CT is therefore not part of the prostate cancer staging routine. With regard to MRI, the setup of the MRI screening protocol is limited by several factors. The combination of a higher b-value (1,500-2,000 s/mm²) and the addition of a T2-weighted sequence may have provided more sensitive tumor detection [21]. It should be noted that using a contrast agent may not have added to the detection of the cancer, as there is an overlap in enhancement patterns with prostatitis and benign prostatic hyperplasia nodules [22]. Finally, optimizing the detection of malignant disease in the prostate also calls for a dedicated coil (e.g. an endorectal coil), whereas we applied a larger body coil as part of the WB-MRI screening setup.
Both modalities missed a renal cell carcinoma (Patient #14). The patient had a large cystic tumor on the lower pole of the left kidney. 18F-FDG activity was present in the periphery of the tumor but was interpreted as physiological FDG excretion in the urine. In general, physiological 18F-FDG in the urine hampers the assessment of urinary system abnormalities. On MRI, it is well known that renal cell carcinoma is not well seen and that cystic changes are often confounded by T2 shine-through effects. In this context, Galia et al. [23] pointed out that the detection of renal cell carcinoma can be challenging also in non-cystic lesions, since it takes up 18F-FDG poorly and may show no restricted diffusion pattern. In their work, they reported a case of renal cell carcinoma missed by PET and whole-body DWI but detected on post-contrast MRI images. Thus, contrast-enhanced MRI may have contributed to the detection of the tumor, while diffusion-weighted imaging still needs to be further evaluated [24].
The two scan techniques yielded similar preparation times of approximately 10-13 min. However, the total scan time varied between the two modalities. MRI is a rather time-consuming modality compared to other imaging techniques. The average active MRI scan time of 75 min is more than twice as long as the approximately 32 min of the nuclear modality and is also about the maximum time a patient can tolerate inside the MRI scanner bore. In this perspective, 18F-FDG-PET/CT is much more patient-friendly, as the entire scan session can be completed within approximately half an hour.
The time the observers spent on image evaluation varied vastly between the two modalities. The nuclear medicine physicians spent on average less than 9 min on evaluation, while the MRI radiologists spent on average 75 and 162 min, respectively. The large discrepancy can in part be explained by the way the images were evaluated. The nuclear medicine physicians only registered the time it took to systematically look through the images and detect the number and anatomical localization of 18F-FDG avid foci. Thus, the noted time does not represent a full clinical evaluation. The radiologists, on the other hand, registered the gross time they spent, including the clinical evaluation of each patient. Regardless of how the evaluation times were noted, the difference clearly shows that WB-MRI is more time-consuming, not only in acquisition but also in image evaluation.
The radiation exposure from an 18F-FDG-PET/CT examination was on average 11.7 mSv. This value is based on the low-dose CT scan of the 18F-FDG-PET/CT examination, as a full-dose diagnostic CT scan had already been obtained at the referring hospital as part of the initial diagnostic work-up. MRI does not expose the patient to any ionizing radiation, which is a major advantage of this modality and the argument for MRI's superiority in radiation-sensitive patients, e.g. children and pregnant women. As the majority of patients suspected of malignant disease are older, this issue is of minor importance. Like 18F-FDG-PET/CT, the WB-MRI protocol does not require the use of a contrast agent, making both modalities suitable for patients with reduced renal function and avoiding the risk of nephrogenic systemic fibrosis [25].
The study has clear limitations: only about half of the patients had a malignant disease, and not all suspicious lesions could be verified by biopsy.

Conclusion
A WB-MRI examination using dedicated DWI sequences at a field strength of 1.5T detects suspicious lesions at a rate similar to 18F-FDG-PET/CT. However, the MRI technique is slow and data analysis cumbersome compared to the applied nuclear imaging technique. Nevertheless, for determining the presence of malignancy or other pathology, WB-MRI could be an alternative non-ionizing modality to standard whole-body 18F-FDG-PET/CT in vulnerable patient subgroups such as young patients, pregnant patients or patients with dysregulated diabetes.

Figure 1 Example of a patient with coincidence of suspicious lesions/18F-FDG avid foci in the liver and lungs using 18F-FDG-PET/CT and diffusion-weighted MRI. (a) 3D whole-body 18F-FDG-PET/CT scan, (b) 3D Diffusion Weighted Whole-body Imaging with Background Body Signal Suppression (DWIBS), (c) axial 18F-FDG-PET slice of the liver, (d) axial DWI slice with a b-value of 900 s/mm², (e) merged 18F-FDG-PET/CT and (f) axial ADC map.

Figure 2 Example of a patient with 18F-FDG avid foci found in the colon while the MRI examination found no noticeable pathology. Green arrows point to the specific anatomical area of interest of the sigmoid in the lower left side of the abdomen. (a and c) Axial and coronal slice planes from the 18F-FDG-PET/CT. (b) Axial image from the whole-body MRI diffusion sequence (b=900 s/mm²). (d) Corresponding ADC map. The diffusion-weighted MRI images indicated that the suspicious lesion was a T2 shine-through effect and therefore not malignant.

Figure 3 Example of a patient with an 18F-FDG avid focus in the soft tissue of the right arm while the MRI examination found no noticeable pathology. (a) Axial 18F-FDG-PET/CT image. The green arrow points to the area of an 18F-FDG avid focus. (b) Axial diffusion-weighted MRI (b = 900 s/mm²) of the same image slice, but without any hyperintensity. The reduction in MRI signal intensity was caused by the location of the anatomy at the outer rim of the field of view.
A CI = confidence intervals; B N/A = not applicable; P values ≤ 0.05 were considered to be statistically significant.Bold p-values indicate statistically significant differences.