Evaluating the plan quality of a general head-and-neck knowledge-based planning model versus separate unilateral/bilateral models

Open Access. Published: November 16, 2022. DOI: https://doi.org/10.1016/j.meddos.2022.10.002

      ABSTRACT

      The implementation of knowledge-based planning (KBP) continues to grow in radiotherapy clinics. KBP guides radiation treatment design by generating clinically acceptable plans in a timely and resource-efficient manner. The role of multiple KBP models tailored for variations within a disease site remains undefined, in part because of the substantial effort and number of training cases required to create a high-quality KBP model. In this study, our aim was to explore whether site-specific KBP models lead to clinically meaningful differences in plan quality for head-and-neck (HN) patients when compared to a general model. One KBP model was created from prior volumetric-modulated arc therapy (VMAT) cases that treated unilateral HN lymph nodes, while another model was created from VMAT cases that treated bilateral HN nodes. Thirty cases from each model (60 cases total) were randomly selected to create a third, general model. These models were applied to 60 HN test cases (30 unilateral and 30 bilateral) to generate 180 VMAT plans in Eclipse. Clinically relevant dose metrics were compared between models. Paired-sample t-tests were used for statistical analysis, with the threshold for statistical significance set a priori at 0.007 to account for multiple hypothesis testing and limit type I error. For unilateral test cases, the unilateral model-generated plans had significantly lower spinal cord maximum doses (12.1 Gy vs 19.3 Gy, p < 0.001) and oral cavity mean doses (20.8 Gy vs 23.0 Gy, p < 0.001) compared with the bilateral model-generated plans. The unilateral and general models generated comparable plans for unilateral HN test cases. For bilateral test cases, the bilateral model-generated plans had significantly lower brainstem maximum doses (10.8 Gy vs 12.2 Gy, p < 0.001) and parotid mean doses (24.0 Gy vs 25.5 Gy, p < 0.001) when compared to the unilateral model-generated plans. Right parotid mean doses were lower for bilateral model plans compared to general model plans (23.8 Gy vs 24.4 Gy). The general model created plans with significantly lower brainstem maximum doses (10.3 Gy vs 10.8 Gy) and oral cavity mean doses (35.3 Gy vs 36.7 Gy) when compared with bilateral model-generated plans. The general model outperformed the bilateral model in several dose metrics, but the differences were not deemed clinically significant. For both case sets, the unilateral and general model-generated plans had higher monitor units when compared to the bilateral model-generated plans, likely due to more stringent constraint settings. All other dose metrics were comparable. This study demonstrates that a balanced general HN model created using carefully curated treatment plans can produce high-quality plans comparable to those of dedicated unilateral and bilateral models.

      Introduction

      Cancer continues to be the second-leading cause of death in the United States. Treatment commonly includes chemotherapy, immunotherapy, surgery and radiation therapy, or a combination of therapies. According to the Agency for Healthcare Research and Quality, radiation therapy is offered to nearly 75% of patients diagnosed with head and neck (HN) cancer, which makes HN cancer a common disease site treated in radiotherapy clinics. The development of effective radiation treatment plans for patients with HN cancer is critical. Radiation treatment planning is a difficult process that requires meticulous care to provide maximum patient benefit with minimal side effects. Many factors are considered when treatment planning for HN cancers, including the size of target(s), location of target(s), proximity to organs at risk (OARs), and prescribed dose. Due to the complexity of HN planning, the quality of manual treatment plans may vary greatly. A few factors that can have a negative impact on plan quality include insufficient dosimetry staffing, shortened deadlines, and inexperienced planners.
      With the increasing incidence of cancer, the Medical Dosimetry Workforce Study predicts that there will be a shortage of medical dosimetrists from 2021 to 2035 (Mills M. Executive summary - medical dosimetry workforce study. 2021). This shortage will lead to increased workloads for medical dosimetrists across the country. Radiotherapy clinics are typically fast-paced environments, and staffing shortages and increased workloads combined with short planning deadlines can lead to rushed or suboptimal treatment plans. Additionally, plan quality may suffer because inexperienced planners lack the planning expertise to navigate complex optimizations within the planning software to manually arrive at the optimal tradeoffs between target coverage and the sparing of multiple OARs. Suboptimal treatment plans may compromise tumor control and increase the rates of normal organ toxicities (Peters L.J., et al. Critical impact of radiotherapy protocol compliance and quality in the treatment of advanced head and neck cancer: Results from TROG 02.02).
      Knowledge-based planning (KBP) has been developed to address the barriers of planner inexperience and time constraints to more consistently and efficiently generate high-quality treatment plans. Various forms of KBP have been proposed over the last decade, including geometry-based patient referencing (Wu B., et al. Data-driven approach to generating achievable dose–volume histogram objectives in intensity-modulated radiotherapy planning; Wu B., et al. Using overlap volume histogram and IMRT plan data to guide and automate VMAT planning: A head-and-neck case study. 2013. 40:021714; Zhang J., et al. Knowledge-based statistical inference method for plan quality quantification; Zhang J., et al. Knowledge-Based Tradeoff Hyperplanes for Head and Neck Treatment Planning; Chanyavanich V., et al. Knowledge-based IMRT treatment planning for prostate cancer), model-driven DVH prediction (Zhu X., et al. A planning quality evaluation tool for prostate adaptive IMRT based on machine learning. 2011. 38:719-726; Appenzoller L.M., et al. Predicting dose-volume histograms for organs-at-risk in IMRT planning; Yuan L., et al. Quantitative analysis of the factors which affect the interpatient organ-at-risk dose sparing variation in IMRT plans. 2012. 39:6868-6878; Zhang J., et al. Modeling of multiple planning target volumes for head and neck treatments in knowledge-based treatment planning), three-dimensional dose prediction (Nguyen D., et al. 3D radiotherapy dose prediction on head and neck cancer patients with a hierarchically densely connected U-net deep learning architecture; Jensen P.J., et al. A novel machine learning model for dose prediction in prostate volumetric modulated arc therapy using output initialization and optimization priorities), direct beam fluence prediction (Li X., et al. Automatic IMRT planning via static field fluence prediction (AIP-SFFP): A deep learning algorithm for real-time prostate treatment planning; Wang W., et al. Fluence map prediction using deep learning models – direct plan generation for pancreas stereotactic body radiation therapy; Ma L., et al. Deep learning-based inverse mapping for fluence map prediction), and reinforcement learning approaches that simulate planner-TPS interactions (Shen C., Chen L., Jia X. A hierarchical deep reinforcement learning framework for intelligent automatic treatment planning of prostate cancer intensity modulated radiation therapy; Zhang J., et al. An interpretable planning bot for pancreas stereotactic body radiation therapy).
      One implementation, commonly known as RapidPlan (Varian Medical Systems, CA), predicts dose-volume histograms (DVHs) for OARs for a specific treatment site, which can guide the dose constraint settings to increase consistency and reduce variability among plans. RapidPlan is increasingly used in radiotherapy clinics worldwide and has been shown to reduce optimization time independent of a planner's experience (Kubo K., et al. Dosimetric comparison of RapidPlan and manually optimized plans in volumetric modulated arc therapy for prostate cancer). A study conducted by Chang et al. found that planning efficiency is significantly improved with the utilization of KBP-generated plans compared to manual planning for nasopharyngeal cancers (Chang A.T.Y., et al. Comparison of planning quality and efficiency between conventional and knowledge-based algorithms in nasopharyngeal cancer patients using intensity modulated radiation therapy). It has also been shown that KBP produces the best results when the case to which the model is applied has geometric characteristics similar to those of the cases used to create the model (Tol J.P., et al. Evaluation of a knowledge-based planning solution for head and neck cancer). An intensity-modulated radiation therapy liver study found that a specific KBP model can improve conformity to the target and reduce OAR dose compared to a generic model (Yu G., et al. Knowledge-based IMRT planning for individual liver cancer patients using a novel specific model). However, site-specific models applied to VMAT HN cases have not been studied. According to Yu et al. (Yu G., et al. Knowledge-based IMRT planning for individual liver cancer patients using a novel specific model), RapidPlan results depend on a variety of factors, but the most important are the region of interest and the plan quality of the training cases. For this study, HN cancers are broadly grouped into 2 categories based on unilateral or bilateral nodal treatment. The aim of this study was to evaluate the necessity of creating separate unilateral and bilateral models when treating HN cancers. We utilized three HN models consisting of high-quality, clinically approved plans: one specifically for unilateral cases, one specifically for bilateral cases, and a general model that was a combination of the unilateral and bilateral models.

      Methods and Materials

      Model creation

      Clinically approved VMAT HN plans were collected from a single institution and added to the appropriate site-specific HN KBP model, unilateral or bilateral. While the plans used to train each model varied in the prescription dose levels, each plan was normalized so that the prescription dose covered 95% of the highest-dose planning target volume (PTV). The unilateral and bilateral HN models used in this study were trained using 98 and 175 prior HN plans, respectively. Each of these plans achieved the institutional clinical goals, and together they represent a broad sample of high-quality HN treatment plans. Nasopharynx cases were excluded from the models.
      Optimization objectives are an essential component of RapidPlan models. Custom optimization objectives and priorities were entered for all the relevant OARs. The target objectives and priorities used for optimization with each model are shown in Tables 1 and 2. Most OARs included in each model have an associated mean objective and line objective, and most OARs have strategically placed upper objectives at specific doses to limit the maximum doses. All upper objectives in each model were set at a fixed volume, with the dose generated by the KBP model. The mean objectives were predicted by the KBP model. Each line objective was chosen to prefer target coverage except for the brainstem, brainstem planning risk volume (PRV), spinal cord, and spinal cord PRV for each model. Each OAR had a fixed priority determined by expert dosimetrists. To improve dose conformity and fall-off, a manual normal tissue objective (NTO) was used. An expert dosimetrist determined the values for the NTO. The models were trained in order to generate DVH estimations.
      Table 1. Unilateral and general model target objectives
      ID                     Vol (%)   Dose (%)   Priority
      PTV_high
        Upper                0         105        200
        Upper                3         104        125
        Lower                99.5      103.2      200
        Lower                99.9      103.1      200
        Lower                100       103        200
        Lower                100       103        200
      PTV_intermediate
        Lower                99.5      103.2      200
        Lower                99.9      103.1      200
        Lower                100       103        200
        Lower                100       103        200
      PTV_low
        Lower                99.5      103.2      200
        Lower                99.9      103.1      200
        Lower                100       103        200
        Lower                100       103        200
      zPTV_intermediate
        Upper                0         105        200
        Upper                3         104        125
        Lower                99.5      103.2      200
        Lower                99.9      103.1      200
        Lower                100       103        200
        Lower                100       103        200
      zPTV_intermediate
        Upper                0         105        200
        Upper                3         104        125
        Lower                99.5      103.2      200
        Lower                99.9      103.1      200
        Lower                100       103        200
        Lower                100       103        200
      Table 2. Bilateral model target objectives
      ID                     Vol (%)   Dose (%)   Priority
      PTV_high
        Upper                0         105        200
        Upper                0         105        125
        Upper                3         104        200
        Lower                100       103        200
        Lower                100       103        200
      PTV_intermediate
        Lower                100       103        200
        Lower                100       103        200
      PTV_Low
        Lower                100       103        200
        Lower                100       103        200
      zPTV_intermediate
        Upper                0         105        200
        Upper                0         105        125
        Upper                3         104        200
        Lower                100       103        200
        Lower                100       103        200
      zPTV_lower
        Upper                0         105        200
        Upper                0         105        125
        Upper                3         104        200
        Lower                100       103        200
        Lower                100       103        200
      The unilateral and bilateral models went through a filtering process in which each training case was evaluated based on plan quality metrics. Suboptimal plans, identified based on outlier statistics and model training results, were removed or re-planned using the guidance of the newly created model to improve clinical evaluation criteria. A re-optimized plan was then added back into the model and the original plan was removed. Both models were clinically validated for up to 4 PTV levels and determined safe for clinical use at this institution.
      To create the general model, 30 cases were randomly selected from the unilateral model and 30 cases from the bilateral model. All structures were matched in exactly the same manner as in their original model. The unilateral plan template, which includes target objectives/priorities, OAR objectives/priorities, and NTO, was used for the general model. The posterior avoid structure was added to the objective template with the same priorities as in the bilateral model.
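      The balanced random draw described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the plan identifiers, the `build_general_cohort` helper, and the fixed random seed are all hypothetical.

```python
import random


def build_general_cohort(unilateral_ids, bilateral_ids, n_per_model=30, seed=0):
    """Randomly draw the same number of plans from each site-specific cohort."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the draw is repeatable
    chosen_unilateral = rng.sample(unilateral_ids, n_per_model)
    chosen_bilateral = rng.sample(bilateral_ids, n_per_model)
    return chosen_unilateral + chosen_bilateral


# Hypothetical identifiers standing in for the 98 unilateral and 175 bilateral plans.
unilateral_ids = [f"UNI_{i:03d}" for i in range(98)]
bilateral_ids = [f"BI_{i:03d}" for i in range(175)]

general_cohort = build_general_cohort(unilateral_ids, bilateral_ids)
print(len(general_cohort))
```

      Sampling without replacement from each cohort keeps the general model balanced between the two nodal treatment types, which is the property the text relies on.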

      Optimization structures

      We attempted to minimize the number of additional optimization structures, but ultimately several were used for each model. Optimization PTVs were used to ensure target coverage and improve dose uniformity. Optimization PTVs were cropped away from higher dose levels by 2 to 5 millimeters (mm) depending on the proximity and difference in prescription levels. If the dose separation was between 3 Gy and 7 Gy, the optimization PTVs were cropped away 2 to 3 mm, and if the separation was greater than 7 Gy, optimization PTVs were cropped 4 to 5 mm. Each optimization PTV was cropped 4 mm from the skin surface. Additionally, a 5 mm inner ring structure of each dose level was created, cropped similarly to the optimization PTVs. For the bilateral and general models only, a posterior avoidance structure was used to reduce dose to the posterior neck. The posterior avoidance structure was created by combining the spinal cord PRV and the brainstem PRV structures and then expanding them in all directions by 5 mm. Next, the posterior avoidance structure was expanded 10 centimeters posteriorly and cropped outside of the body. Lastly, the posterior avoidance structure was cropped 5 mm away from each PTV level.
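      The cropping rule for optimization PTVs can be written as a small lookup. The sketch below is an illustrative encoding of the margins stated in the text; the function name is hypothetical, and the choice to return `None` for separations below 3 Gy is an assumption, since the text does not state a margin for that case.

```python
def crop_margin_range_mm(dose_separation_gy):
    """Return the (min, max) crop margin in mm for a dose separation in Gy."""
    if dose_separation_gy > 7:
        return (4, 5)  # separation greater than 7 Gy: crop 4 to 5 mm
    if dose_separation_gy >= 3:
        return (2, 3)  # separation between 3 Gy and 7 Gy: crop 2 to 3 mm
    return None  # assumption: the text gives no rule below 3 Gy


SKIN_CROP_MM = 4  # each optimization PTV is cropped 4 mm from the skin surface

for separation in (5, 10):
    print(separation, crop_margin_range_mm(separation))
```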

      Patient selection

      Model testing was performed using 30 unilateral and 30 bilateral HN test cases. Each test case was unique to its respective model and was not involved in the model creation process.

      VMAT treatment planning

      The Varian Eclipse treatment planning system V15.6 was used for this study: photon optimizer was used for VMAT optimization; Anisotropic Analytical Algorithm was used for the dose calculation on a 2.5 mm grid. All unilateral test cases were planned using 2 or more partial arcs restricted to the targeted side to minimize the path length through healthy tissue to the targets while full arcs were used for the bilateral cases. Plans were optimized using convergence mode and restarted at level MR3 following the intermediate dose calculation. Convergence mode increases the number of iterations within each MR level prior to advancing to the next MR level. Within the optimizer, the appropriate KBP model was selected, dose levels were defined for target structures, and all relevant OARs were matched to the model. Any OAR that fully overlapped with one or more of the PTVs was unmatched prior to optimization. From this point, automatic optimization mode and automatic intermediate dose were set to ON and optimization objectives were not changed from those generated by the model. After optimization and the final dose calculation, each plan was normalized such that 100% of the prescription dose covered 95% of the highest prescription volume. If a fourth prescription dose level was required, the optimization PTV for the lowest dose was attached twice. Due to limitations of the RapidPlan software, only a total of 3 PTV levels are available in the Model Configuration workspace.
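      The final normalization step (100% of the prescription dose covering 95% of the highest-dose PTV) amounts to rescaling the plan so that the PTV D95 equals the prescription. A minimal sketch follows, assuming equally sized voxels and hypothetical dose values; it is not how Eclipse performs normalization internally, and the function names are illustrative.

```python
import math


def dose_covering_fraction(voxel_doses_gy, fraction=0.95):
    """Highest dose that at least `fraction` of equally sized voxels receive."""
    ordered = sorted(voxel_doses_gy, reverse=True)
    # Small tolerance guards against float round-off in fraction * n.
    n_covered = max(1, math.ceil(fraction * len(ordered) - 1e-9))
    return ordered[n_covered - 1]


def normalization_factor(voxel_doses_gy, prescription_gy, fraction=0.95):
    """Scale factor that makes the prescription dose cover `fraction` of the PTV."""
    return prescription_gy / dose_covering_fraction(voxel_doses_gy, fraction)


# Hypothetical PTV voxel doses (Gy): 19 voxels at 68 Gy and one cold voxel.
ptv_doses = [68.0] * 19 + [60.0]
factor = normalization_factor(ptv_doses, prescription_gy=70.0)
scaled = [d * factor for d in ptv_doses]
print(round(factor, 4), round(dose_covering_fraction(scaled), 2))
```

      Because the whole dose distribution is multiplied by the same factor, OAR doses scale along with target doses, which is why normalization is applied before the dose metrics are compared.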

      Plan comparison metrics

      Key DVH statistics, total plan monitor units (MU), and global hot spot (Dmax) were collected. A paired-sample t-test was used for statistical analysis and the differences were considered significant when p < 0.007, with Bonferroni correction for multi-hypothesis testing.
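      The statistical procedure can be sketched with the standard library alone. The example assumes a family-wise alpha of 0.05 divided across seven compared metrics, which reproduces the 0.007 threshold but is not stated explicitly in the text; the dose values are hypothetical, and the p-value is obtained by simple numerical integration of the Student t density rather than by any particular statistics package.

```python
import math
from statistics import mean, stdev


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def t_pdf(x, df):
    """Probability density of the Student t distribution."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def paired_t_test(a, b):
    """Paired-sample t statistic and two-sided p-value."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, two_sided_p(t_stat, n - 1)


# Hypothetical spinal cord maximum doses (Gy) for the same five test cases.
model_a = [12.0, 11.5, 13.2, 12.8, 11.9]
model_b = [19.0, 18.7, 20.1, 19.5, 18.8]

threshold = bonferroni_threshold(0.05, 7)  # assumed alpha and number of metrics
t_stat, p_value = paired_t_test(model_a, model_b)
print(round(threshold, 3), p_value < threshold)
```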

      Results

      A summary of the regression and chi-squared training results for the unilateral, bilateral, and general HN models is found in Tables 3, 4, and 5, respectively. R² is a regression statistic that measures the goodness of fit for each structure; it can range from 0 to 1, where a value of 1 is a perfect fit. The lowest R² for the unilateral model was 0.310 for the spinal cord and the highest was 0.973 for the left parotid. For the bilateral model, the lowest and the highest R² were 0.251 and 0.966 for the brainstem and right eye, respectively. The general model's lowest and highest R² values were for the spinal cord (0.355) and the brachial plexus (0.958), respectively. Chi-squared (χ²) indicates the difference between the observed and expected frequencies or, in this case, the difference between the OARs in the model and their in-model predictions. A χ² value of 1 would indicate a perfect fit. The lowest and highest χ² values for the unilateral model were for the right optic nerve (1.004) and the optic chiasm (1.167), respectively. The bilateral model's lowest and highest χ² values were for the left eye (1.000) and the pharynx (1.113). The lowest χ² for the general model was 1.034 for the left cochlea and the highest was 1.335 for the brachial plexus. Structures with insufficient in-field matches were excluded from this analysis.
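      For readers unfamiliar with the goodness-of-fit statistic, the textbook coefficient of determination can be computed as shown below. How RapidPlan derives the R² values it reports is not described in the text, so this is only the generic definition, with hypothetical observed and predicted values. Note that this generic form can fall below 0 for predictions worse than the mean, whereas a fitted regression stays within 0 to 1.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total


# Hypothetical values: a perfect fit, and a prediction equal to the mean.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
print(perfect, mean_only)
```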
      Table 3. Summary of the unilateral HN model training results. There were 98 plans included in the unilateral HN model.
      Structure                   R²       χ²
      Mandible                    0.808    1.016
      L&R brachial plexus         0.888    1.041
      Brainstem                   0.715    1.062
      Brainstem PRV03             0.806    1.094
      Oral cavity                 0.871    1.042
      L cochlea                   0.909    1.052
      R cochlea                   0.911    1.036
      Esophagus                   0.918    1.053
      L eye                       0.906    1.076
      R eye                       0.722    1.012
      L submandibular gland       0.911    1.038
      R submandibular gland       0.869    1.068
      Larynx                      0.835    1.056
      L lens                      0.810    1.027
      R lens                      0.587    1.030
      Lips                        0.858    1.100
      Optic chiasm                0.734    1.167
      L optic nerve               0.494    1.006
      R optic nerve               0.597    1.004
      L parotid                   0.973    1.044
      R parotid                   0.945    1.055
      Pharynx                     0.777    1.068
      Spinal cord                 0.310    1.053
      Spinal cord PRV05           0.319    1.039
      Thyroid                     0.842    1.012
      Table 4. Summary of the bilateral HN model training results. There were 175 plans included in the bilateral HN model.
      Structure                   R²       χ²
      Mandible                    0.645    1.046
      L brachial plexus           0.818    1.105
      R brachial plexus           0.800    1.083
      Brainstem                   0.251    1.014
      Brainstem PRV03             0.551    1.038
      Oral cavity                 0.644    1.049
      L cochlea                   0.818    1.013
      R cochlea                   0.771    1.025
      Esophagus                   0.550    1.079
      L eye                       0.920    1.000
      R eye                       0.966    1.044
      L submandibular gland       0.612    1.074
      R submandibular gland       0.364    1.101
      Larynx                      0.778    1.053
      L lens                      0.792    1.006
      R lens                      0.950    1.011
      Lips                        0.870    1.068
      Optic chiasm*               0.000    0.000
      L optic nerve*              0.000    0.000
      R optic nerve*              0.000    0.000
      L parotid                   0.808    1.044
      R parotid                   0.770    1.040
      Pharynx                     0.563    1.113
      Spinal cord                 0.312    1.066
      Spinal cord PRV05           0.333    1.056
      Thyroid                     0.527    1.047
      Posterior avoid structure   0.255    1.061
      * Not enough in-field matches
      Table 5. Summary of the general HN model training results. There were 60 plans included in the general HN model.
      Structure                   R²       χ²
      Mandible                    0.823    1.088
      L&R brachial plexus         0.958    1.335
      Brainstem                   0.721    1.077
      Brainstem PRV03             0.711    1.062
      Oral cavity                 0.764    1.064
      L cochlea                   0.860    1.034
      R cochlea                   0.632    1.037
      Esophagus                   0.637    1.095
      L eye                       0.956    1.098
      R eye*                      0.000    0.000
      L submandibular gland       0.938    1.102
      R submandibular gland       0.923    1.046
      Larynx                      0.830    1.110
      L lens*                     0.000    0.000
      R lens*                     0.000    0.000
      Lips                        0.632    1.036
      Optic chiasm*               0.000    0.000
      L optic nerve*              0.000    0.000
      R optic nerve*              0.000    0.000
      L parotid                   0.941    1.054
      R parotid                   0.955    1.091
      Pharynx                     0.862    1.284
      Spinal cord                 0.355    1.059
      Spinal cord PRV05           0.419    1.054
      Thyroid                     0.902    1.298
      Posterior avoid structure   0.742    1.313
      * Not enough in-field matches
      The dosimetric results for MU, Dmax, and OAR sparing for the unilateral and bilateral test cases are displayed in Tables 6 and 7. A paired-sample t-test with a threshold for statistical significance set a priori at 0.007 was used for statistical analysis. All plan metrics listed are within the normal clinical acceptance range and sufficient coverage of the PTV(s) was achieved. Table 6 summarizes the dosimetric endpoints for the 30 unilateral test cases when planned with the unilateral, bilateral, and general KBP models. The unilateral model-generated plans had a significantly lower spinal cord maximum dose (12.1 Gy vs 19.3 Gy, p < 0.001) and oral cavity mean dose (20.8 Gy vs 23.0 Gy, p < 0.001) than the bilateral model-generated plans. There were no significant differences between the unilateral and general models. Table 7 shows results from the 30 bilateral test cases when planned with the unilateral, bilateral, and general KBP models. These results show a significantly lower brainstem maximum dose (10.8 Gy vs 12.2 Gy, p < 0.001) and parotid mean dose (24.0 Gy vs 25.5 Gy, p < 0.001) for the bilateral model-generated plans compared to the unilateral model-generated plans. The general model created plans with significantly lower brainstem maximum doses (10.3 Gy vs 10.8 Gy) and oral cavity mean doses (35.3 Gy vs 36.7 Gy) compared to the bilateral model, while the bilateral model outperformed the general model for the right parotid mean dose (23.8 Gy vs 24.4 Gy). The unilateral and general models created plans with higher MUs for both case sets (p < 0.001) when compared to the bilateral model-generated plans.
      Table 6. Dosimetric comparison results for the 30 unilateral test cases. The differences were considered significant when p < 0.007.
      Unilateral test cases     Unilateral model   Bilateral model   General model   p-value (unilateral vs general)   p-value (unilateral vs bilateral)
      MU                        717.8              630.0             731.6           0.014                             <0.001*
      Dmax (%)                  110.4              109.9             110.2           0.51                              0.21
      Brainstem Dmax (Gy)       8.9                9.1               8.7             0.64                              0.59
      Spinal cord Dmax (Gy)     12.1               19.3              11.2            0.013                             <0.001*
      Oral cavity Dmean (Gy)    20.8               23.0              23.1            1.00                              <0.001*
      L parotid Dmean (Gy)      13.2               11.9              13.1            0.47                              0.17
      R parotid Dmean (Gy)      19.4               18.6              19.2            0.37                              0.25
      * Denotes a significant result
      Table 7. Dosimetric comparison results for the 30 bilateral test cases. The differences were considered significant when p < 0.007.
      Bilateral test cases      Unilateral model   Bilateral model   General model   p-value (bilateral vs general)   p-value (unilateral vs bilateral)
      MU                        925.3              854.3             946.5           <0.001*                          <0.001*
      Dmax (%)                  110.0              110.5             110.5           0.81                             0.21
      Brainstem Dmax (Gy)       12.2               10.8              10.3            <0.001*                          <0.001*
      Spinal cord Dmax (Gy)     15.8               15.1              14.8            0.11                             0.11
      Oral cavity Dmean (Gy)    36.7               36.7              35.3            <0.001*                          0.99
      L parotid Dmean (Gy)      26.2               24.2              24.4            0.38                             <0.001*
      R parotid Dmean (Gy)      24.8               23.8              24.4            <0.001*                          <0.001*
      * Denotes a significant result
      Fig. 1 shows a representative unilateral test case. The dose distributions from the unilateral, bilateral, and general models are shown in the transversal, sagittal, and coronal planes. In the sagittal views of Fig. 1, it can be seen that the unilateral and general models reduce the low-dose wash in the region of the oral cavity compared to the bilateral model. Fig. 2 shows the transversal, sagittal, and coronal dose distributions for the unilateral, bilateral, and general model-generated plans applied to a bilateral test case. The transversal view demonstrates less low dose in the region of the oral cavity using the bilateral model; however, this reduction in dose was not statistically significant. In the sagittal view, there is less low dose in the posterior neck region when using the bilateral model compared to the other two models. Dose to the posterior neck region was not analyzed in this study.
      Fig. 1. Example dose distribution for a unilateral test case. The top row was planned using the unilateral model, the middle row using the bilateral model, and the bottom row using the general model. From left to right, the images are transversal, sagittal, and coronal. The structures outlined are PTVs that were used for planning, delineated by a radiation oncologist.
      Fig. 2. Example dose distribution for a bilateral test case. The top row was planned using the unilateral model, the middle row using the bilateral model, and the bottom row using the general model. From left to right, the images are transversal, sagittal, and coronal. The structures outlined are PTVs that were used for planning, delineated by a radiation oncologist.

      Discussion

      The field of radiation oncology is rapidly evolving. Every evolution comes with new tools and ideas that can help enhance patient treatment outcomes. RapidPlan is a tool that has been shown to assist in the generation of clinically acceptable treatment plans for a number of different disease sites based on a patient's specific anatomy. Prior studies have demonstrated that RapidPlan provides increased consistency and reduces variability among treatment plans compared to manual planning methods (Chang A.T.Y., et al. Comparison of planning quality and efficiency between conventional and knowledge-based algorithms in nasopharyngeal cancer patients using intensity modulated radiation therapy). In a study conducted by Cornell et al. (Cornell M., et al. Noninferiority study of automated knowledge-based planning versus human-driven optimization across multiple disease sites) investigating the non-inferiority of RapidPlan-generated plans, it was determined that HN displayed the strongest evidence in favor of automated planning. Physicians selected fully automated HN plans instead of manual plans two-thirds of the time because of lower doses to certain OARs. As KBP models, such as RapidPlan, become more prominent in radiotherapy clinics, it is crucial to gain knowledge of KBP model training methodologies in order to further improve plan quality and ultimately patient outcomes.
      The main contribution of this paper is showing that a well-tuned general HN model can perform as well as two separate HN models on their respective tasks. Therefore, it is not necessary to create two separate models for HN treatment planning. Given the time commitment of creating and validating a high-quality model, our results allow for a more efficient model creation workflow and help avoid confusion or accidentally selecting the incorrect model during the treatment planning process. Another benefit of creating one general HN model is that, when the model is updated, new cases can be added to and validated on a single model. A limitation of the study design is that all plans used for model training were created at a single institution and therefore reflect the current planning practices of only one institution. There are often tradeoffs in prioritizing specific aspects of planning, and not all clinics will have the same planning standards and objectives.
      A potential limitation is the heterogeneity in dose and fractionation schedules among the plans comprising the KBP models; the same is true for the test cases included in the study. The results of this study are fractionation-independent because the OAR doses that were compared vary proportionally to the total dose delivered. Similarly, the unilateral and bilateral models have slightly different optimization objectives to reflect the different planning approaches for each site. The results of the unilateral and bilateral models in this study are independent of the optimization objectives because each set of objectives helps the respective model achieve the clinical standards at the specific institution.
      Another factor worth considering when building KBP models is outliers. It has been previously shown that outlier cases used in the training dataset can be detrimental to the model generalization if the number of outliers reaches certain thresholds.
      • Delaney A.R.
      • et al.
      Effect of dosimetric outliers on the performance of a commercial knowledge-based planning solution.
      RapidPlan has built-in mechanisms to identify outlier cases based on extracted features of individual patients in comparison with the whole training cohort. In this work, each model had multiple outliers for most structures. OAR outliers were automatically identified by the RapidPlan software for a number of reasons, such as total volume, out-of-field volume, overlap with target volumes, and/or principal component scores falling outside the estimated thresholds. The total volume of each structure can vary with physician-specific contouring guidelines. For example, including the deep lobes of the parotid, or contouring the oral cavity to include the mandible and/or base-of-tongue region, alters the total volume. Because tumor volumes vary, structures can lie completely out of the field, as is frequently seen with optic structures. The esophagus is often marked as an outlier depending on where the structure contour stops. For instance, stopping the contour at the gastroesophageal junction, as opposed to 2 centimeters below the most inferior slice of the target, can have a large impact on both the out-of-field volume and the total volume. A combination of the geometric expected dose and the plan's DVH can cause the estimates to fall outside the expected principal component score range. Most of the outliers in the models were not addressed because they were flagged for factors not deemed critical to model creation and development. The actual DVH curve for each structure was compared with the DVH estimation region, and if a structure's DVH was significantly above its prediction, that structure was either removed from the model or the case was re-planned. Each re-plan aimed to improve specific OARs, with an overall focus on increasing plan quality. Re-planning cases within the training set was found to improve the quality of the models. Because the predictions are based on what is learned from the training set, it is important to include high-quality plans in the model while ensuring that all plans are similar.
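      The DVH check described above can be illustrated with a minimal sketch. It is not RapidPlan's algorithm. It assumes the achieved DVH and the upper edge of the estimation region have been exported as volume percentages on the same dose bins. The function names, the 5-percentage-point threshold, and the sample values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def mean_excess_above_band(actual_vol_pct: Sequence[float],
                           upper_bound_pct: Sequence[float]) -> float:
    """Mean amount (in volume percentage points) by which the achieved DVH
    exceeds the upper edge of the estimated band, averaged over dose bins."""
    if len(actual_vol_pct) != len(upper_bound_pct):
        raise ValueError("DVH curves must be sampled on the same dose bins")
    if not actual_vol_pct:
        raise ValueError("DVH curves must not be empty")
    # Only bins where the achieved curve lies above the band contribute.
    excess = [max(a - u, 0.0) for a, u in zip(actual_vol_pct, upper_bound_pct)]
    return sum(excess) / len(excess)


def needs_review(actual_vol_pct: Sequence[float],
                 upper_bound_pct: Sequence[float],
                 threshold_pct: float = 5.0) -> bool:
    """Flag a structure for removal or re-planning (hypothetical threshold)."""
    return mean_excess_above_band(actual_vol_pct, upper_bound_pct) > threshold_pct


# Made-up example curves, not data from this study.
upper_band = [100.0, 80.0, 50.0, 20.0, 5.0]
parotid_ok = [100.0, 75.0, 45.0, 18.0, 2.0]
parotid_high = [100.0, 95.0, 70.0, 35.0, 10.0]

print(needs_review(parotid_ok, upper_band))
print(needs_review(parotid_high, upper_band))
```

      In practice the decision to remove a structure or re-plan a case was a manual, clinical judgment. A numeric screen like this one would only help shortlist structures for that review.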
      For both case sets, the unilateral and general models generated plans with higher monitor units (MUs). This is expected given their more stringent optimization objectives: placing higher priorities on certain structures increases modulation and, with it, MUs. Each model had slightly different priorities on target structures and OARs. Unilateral HN gross tumor volumes often do not involve the contralateral side of the body, so the unilateral model objective template places higher priorities on certain OARs such as the side-specific parotid and submandibular glands. In most unilateral HN cases, the beam does not enter directly through the spinal cord or brainstem because all plans were treated using half-circular arcs on the affected side. The only dose these structures receive is exit dose, so the model places a higher priority on limiting dose spill in these areas. In contrast, the bilateral model, which uses full arcs, has a slightly lower priority on the spinal cord and brainstem because beams enter and exit directly through these structures. While spinal cord dose was acceptable with each model, it is important to consider the long-term effects of radiation therapy: radiation to the neck and spinal cord regions can increase the risk of transverse myelitis.
      • Brook I.
      Late side effects of radiation treatment for head and neck cancer.
      Further, an unfortunate subset of patients will develop recurrent disease, and for these patients minimizing dose to healthy tissues in the initial course will broaden the range of irradiation options available.

      Conclusions

      This study demonstrates that a general HN model can perform similarly to separate unilateral and bilateral HN models, so developing site-specific models is not necessary. Using a single general HN model is feasible and reduces the workload of creating and validating models.

      Conflict of Interest

      The author declares no conflicts of interest.

      Acknowledgment

      Supported by Winship Cancer Institute #IRG-21-137-07-IRG from the American Cancer Society.

      References

      1. Radiotherapy Treatments for Head and Neck Cancer: Update. 2014; Available from: https://effectivehealthcare.ahrq.gov/products/head-neck-cancer-update/research-protocol#:~:text=Radiation%20therapy%20is%20the%20mainstay,significant%20long%20term%20side%20effects.

      2. Mills, M., Executive summary - medical dosimetry workforce study. 2021.

        • Peters L.J.
        • et al.
        Critical impact of radiotherapy protocol compliance and quality in the treatment of advanced head and neck cancer: Results from TROG 02.02.
        J Clin Oncol. 2010; 28: 2996-3001
        • Wu B.
        • et al.
        Data-driven approach to generating achievable dose–volume histogram objectives in intensity-modulated radiotherapy planning.
        Int J Radiat Oncol Biol Phys. 2011; 79: 1241-1247
      3. Wu, B., et al. Using overlap volume histogram and IMRT plan data to guide and automate VMAT planning: A head-and-neck case study. 2013. 40:021714.

        • Zhang J.
        • et al.
        Knowledge-based statistical inference method for plan quality quantification.
        Technol Cancer Res Treat. 2019; 18: 1533033819857758
        • Zhang J.
        • et al.
        Knowledge-Based Tradeoff Hyperplanes for Head and Neck Treatment Planning.
        Int J Radiat Oncol Biol Phys. 2020; 106: 1095-1103
        • Chanyavanich V.
        • et al.
        Knowledge-based IMRT treatment planning for prostate cancer.
        Med Phys. 2011; 38: 2515-2522
      4. Zhu, X., et al. A planning quality evaluation tool for prostate adaptive IMRT based on machine learning. 2011. 38:719-726.

        • Appenzoller L.M.
        • et al.
        Predicting dose-volume histograms for organs-at-risk in IMRT planning.
        Med Phys. 2012; 39: 7446-7461
      5. Yuan, L., et al. Quantitative analysis of the factors which affect the interpatient organ-at-risk dose sparing variation in IMRT plans. 2012. 39:6868-6878.

        • Zhang J.
        • et al.
        Modeling of multiple planning target volumes for head and neck treatments in knowledge-based treatment planning.
        Med Phys. 2019; 46: 3812-3822
        • Nguyen D.
        • et al.
        3D radiotherapy dose prediction on head and neck cancer patients with a hierarchically densely connected U-net deep learning architecture.
        Phys Med Biol. 2019; 64: 065020
        • Jensen P.J.
        • et al.
        A novel machine learning model for dose prediction in prostate volumetric modulated arc therapy using output initialization and optimization priorities.
        Front Artif Intell. 2021; 4
        • Li X.
        • et al.
        Automatic IMRT planning via static field fluence prediction (AIP-SFFP): A deep learning algorithm for real-time prostate treatment planning.
        Phys Med Biol. 2020; 65: 175014
        • Wang W.
        • et al.
        Fluence map prediction using deep learning models – direct plan generation for pancreas stereotactic body radiation therapy.
        Front Artif Intell. 2020; 3
        • Ma L.
        • et al.
        Deep learning-based inverse mapping for fluence map prediction.
        Phys Med Biol. 2020; 65: 235035
        • Shen C.
        • Chen L.
        • Jia X.
        A hierarchical deep reinforcement learning framework for intelligent automatic treatment planning of prostate cancer intensity modulated radiation therapy.
        Phys Med Biol. 2021; 66: 134002
        • Zhang J.
        • et al.
        An interpretable planning bot for pancreas stereotactic body radiation therapy.
        Int J Radiat Oncol Biol Phys. 2021; 109: 1076-1085
        • Kubo K.
        • et al.
        Dosimetric comparison of RapidPlan and manually optimized plans in volumetric modulated arc therapy for prostate cancer.
        Phys Med. 2017; 44: 199-204
        • Chang A.T.Y.
        • et al.
        Comparison of planning quality and efficiency between conventional and knowledge-based algorithms in nasopharyngeal cancer patients using intensity modulated radiation therapy.
        Int J Radiat Oncol Biol Phys. 2016; 95: 981-990
        • Tol J.P.
        • et al.
        Evaluation of a knowledge-based planning solution for head and neck cancer.
        Int J Radiat Oncol Biol Phys. 2015; 91: 612-620
        • Yu G.
        • et al.
        Knowledge-based IMRT planning for individual liver cancer patients using a novel specific model.
        Radiat Oncol. 2018; 13: 52
        • Cornell M.
        • et al.
        Noninferiority study of automated knowledge-based planning versus human-driven optimization across multiple disease sites.
        Int J Radiat Oncol Biol Phys. 2020; 106: 430-439
        • Delaney A.R.
        • et al.
        Effect of dosimetric outliers on the performance of a commercial knowledge-based planning solution.
        Int J Radiat Oncol Biol Phys. 2016; 94: 469-477
        • Brook I.
        Late side effects of radiation treatment for head and neck cancer.
        Radiat Oncol J. 2020; 38: 84-92