Reprint requests: Paul van Amstel, MD, Emma Children’s Hospital, Amsterdam UMC, University of Amsterdam & Vrije Universiteit Amsterdam, Department of Pediatric Surgery, Meibergdreef 9, 1105AZ Amsterdam, The Netherlands.
Department of Pediatric Surgery, Emma Children’s Hospital, Amsterdam University Medical Centre, University of Amsterdam & Vrije Universiteit Amsterdam, The Netherlands
Department of Pediatric Surgery, Emma Children’s Hospital, Amsterdam University Medical Centre, University of Amsterdam & Vrije Universiteit Amsterdam, The Netherlands; Department of Surgery, Northwest Hospital, Alkmaar, The Netherlands
Several clinical prediction rules have been developed for preoperative differentiation between simple and complex appendicitis in children, as potential treatment strategies differ. This study aimed to externally validate applicable clinical prediction rules that could be used to differentiate between simple and complex appendicitis in children.
Methods
Potential clinical prediction rules were identified by a scoping review of the literature. Clinical prediction rules applicable in our daily practice were subsequently externally validated in a multicenter historical cohort consisting of 1 tertiary center and 1 large teaching hospital. All children (<18 years old) with histopathologically confirmed acute appendicitis between 2013 and 2020 were included. Test results of clinical prediction rules were compared to the gold standard of either simple or complex appendicitis consisting of predefined perioperative and histopathological criteria. Areas under the receiver operating characteristic curves were determined for the selected clinical prediction rules. Areas under the receiver operating characteristic curve >0.7 were considered acceptable and potentially useful.
Results
In total, 31 clinical prediction rules were identified, of which 12 could be evaluated in our cohort consisting of 550 children. The main reason to exclude clinical prediction rules was the use of variables that were not routinely measured in our cohort. In our cohort, 208/550 (38%) were diagnosed with complex appendicitis according to the gold standard. Clinical prediction rules with areas under the receiver operating characteristic curve >0.7 were: Gorter (0.81), Bogaard (0.79), Bröker (0.79), Graham (0.77), Hansson (0.76), BADCF (0.76), and Eddama (0.75).
Conclusion
In this study, clinical prediction rules consisting of a combination of clinical and objective variables had the highest discriminative ability. External validation showed that 7 clinical prediction rules were potentially useful. Integration of these clinical prediction rules in daily practice is proposed to guide decision making regarding treatment strategies.
Introduction
Historically, appendectomy has been the standard of care for all children presenting with acute appendicitis. Current international guidelines, however, recommend emergency appendectomy (or a drainage procedure in case of appendiceal abscess) as soon as feasible after diagnosis for complex appendicitis, versus urgent appendectomy within 24 hours after diagnosis for simple appendicitis.
As potential treatment strategies differ according to appendicitis severity, timely and accurate diagnosis of both simple and complex appendicitis has become increasingly important.
To improve differentiation between both types of appendicitis, several clinical, biochemical, and radiological variables that have been found to be insufficient for differentiation as a standalone modality have been combined into clinical prediction rules (CPRs).
Because 2 distinct types of appendicitis have only recently been distinguished, few CPRs have been specifically developed to differentiate between simple and complex appendicitis.
The studies in which these CPRs were developed showed promising results, but an overview of available CPRs and especially their external validation are lacking.
It is hypothesized that CPRs are useful to accurately differentiate simple from complex appendicitis and that, after external validation, CPRs with an acceptable diagnostic accuracy would be identified.
Therefore, the aim of this study was to perform a scoping review of the literature to identify CPRs that differentiate between simple and complex appendicitis and to select CPRs that are applicable in routine daily practice in The Netherlands. Second, we aimed to externally validate the selected CPRs in a multicenter historical cohort of pediatric patients with acute appendicitis.
Methods
Identification of the CPRs: Scoping review of the literature
A comprehensive search was performed in the PubMed and Embase databases in collaboration with our experienced medical librarian (R.V.) to identify available CPRs. Databases were searched from inception up to July 21, 2021. Terms used as index terms or free text words were: “appendicitis,” “predict∗,” “score∗,” and “decision”. A detailed description of the literature search can be found in Supplementary Appendix S1. Studies were screened for title and abstract and subsequently assessed for full text by 2 independent reviewers (P.A. and S.T.). Disagreements were resolved by consensus. All CPRs that were tested in previous studies for their potential in differentiating between simple and complex appendicitis were eligible for inclusion. Both CPRs developed in a pediatric population and an adult population were included. CPRs containing variables that are not routinely measured in the participating hospitals (eg, rectal tenderness and biomarkers such as matrix metalloproteinase) and CPRs that used computed tomography (CT) scan results were excluded.
Multicenter historical cohort
In this multicenter historical cohort study, patient records of all children (<18 years old) with histopathologically proven acute appendicitis (simple and complex) treated at the Amsterdam University Medical Centre (UMC) and Northwest Hospital Alkmaar between January 1, 2013 and December 31, 2019 were reviewed. Eligible patients were identified using International Classification of Diseases (ICD) codes for acute appendicitis and acute abdomen. Patients who were transferred to either of the participating hospitals after appendectomy elsewhere and those with a finding of a noninflamed appendix during surgery or at histopathological examination were excluded. Furthermore, patients who were initially treated nonoperatively and those who underwent appendectomy for an indication other than acute appendicitis were excluded.
The study protocol was reviewed by both the Medical Ethics Review Committee of the Amsterdam UMC, location AMC, and the Medical Ethics Review Committee of the Northwest Hospital. It was confirmed that the Dutch Medical Research Involving Human Subjects Act does not apply and therefore the need for complete ethical review was waived. Patients who objected to the use of their data were excluded. The results of the study were reported according to the STARD 2015 guidelines.
Patient files were reviewed, and data were extracted and stored in an online database (Castor EDC). Two authors independently extracted data (P.A., S.T.), and conflicts were resolved by consensus. Variables were collected to calculate scores of all identified CPRs for each included patient. These collected variables were baseline characteristics (age at presentation, sex, comorbidities), clinical variables (duration of abdominal pain [days], location of pain in right iliac fossa, migration of pain, anorexia, vomiting, intensity of pain [numeric rating scale], progression of pain, constant pain, hopping/coughing/percussion tenderness, tenderness inside/outside right iliac fossa, rebound tenderness, guarding, rigidity, temperature [degrees Celsius], heart rate, micturition difficulties), biochemical variables (leukocytes [×10⁹/L], C-reactive protein [CRP], leukocyte differential count), radiological variables (ultrasonography and magnetic resonance imaging results), perioperative variables (including perioperative diagnosis [simple or complex appendicitis]), and histopathological variables (variables that differentiate between simple and complex appendicitis).
CPR scores were calculated for all patients and compared to our gold standard consisting of a combination of perioperative and histopathological criteria as proposed by Bhangu et al.
Simple appendicitis was therefore defined as the macroscopic appearance of an increased diameter of the appendix, without signs of necrosis or perforation, and without the presence of an abscess or mass. Complex appendicitis was defined as appendicitis with signs of necrosis (gangrenous appendicitis) or perforation, or appendicitis with an abscess or mass. Perforation was defined as a (macroscopic or microscopic) visible hole in the appendix or the presence of a free fecalith in the abdominal cavity.
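The definitions above can be read as a simple decision rule. The following is a minimal sketch, assuming the patient already has confirmed acute appendicitis; the class and field names are hypothetical and are not taken from the study database.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """Perioperative/histopathological findings for one patient (hypothetical field names)."""
    necrosis: bool = False         # signs of necrosis (gangrenous appendicitis)
    visible_hole: bool = False     # macroscopic or microscopic hole in the appendix
    free_fecalith: bool = False    # free fecalith in the abdominal cavity
    abscess_or_mass: bool = False  # appendiceal abscess or mass


def is_perforated(f: Findings) -> bool:
    # Perforation: a visible hole OR a free fecalith in the abdominal cavity.
    return f.visible_hole or f.free_fecalith


def classify(f: Findings) -> str:
    # Complex: necrosis, perforation, or abscess/mass; otherwise simple.
    if f.necrosis or is_perforated(f) or f.abscess_or_mass:
        return "complex"
    return "simple"


print(classify(Findings()))                    # no complicating findings
print(classify(Findings(free_fecalith=True)))  # free fecalith counts as perforation
```

Any single complicating finding is enough to classify a case as complex, which mirrors the "or" structure of the definition.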
Data analysis
Descriptive analysis of the data was performed. Continuous variables were presented as mean with standard deviation or median with interquartile ranges (IQR) according to their distribution. Scores of all CPRs were calculated for each patient and compared to our gold standard, consisting of a combination of perioperative and histopathological criteria.
Complex appendicitis was considered a positive test result in the index test (CPRs) and the gold standard, whereas simple appendicitis was regarded as a negative test result. Subsequently, receiver operating characteristic (ROC) curves were drawn and area under the curve (AUC) values, the primary outcome of this study, were calculated and displayed as proportions with 95% confidence intervals (CI). AUC values ≥0.7 were considered acceptable and potentially useful. If a CPR could be calculated for <50% of our cohort, the CPR was excluded from analysis. If provided, cut-off values as presented in the original manuscript were used to calculate the sensitivity, specificity, predictive values (positive and negative), and likelihood ratios (positive and negative) for these CPRs. These secondary outcomes were displayed as percentages with 95% CI or proportions with 95% CI.
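To illustrate the primary outcome, the sketch below computes an empirical AUC as the proportion of (complex, simple) patient pairs in which the complex case has the higher CPR score, with ties counted as one half. This pairwise formulation is equivalent to the area under the empirical ROC curve. It is not the SPSS procedure used in the study, it does not compute confidence intervals, and the scores shown are invented for illustration.

```python
def auc(scores_complex, scores_simple):
    """Empirical AUC: P(score of complex case > score of simple case), ties count 0.5."""
    if not scores_complex or not scores_simple:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for c in scores_complex:
        for s in scores_simple:
            if c > s:
                wins += 1.0
            elif c == s:
                wins += 0.5
    return wins / (len(scores_complex) * len(scores_simple))


def is_potentially_useful(auc_value, threshold=0.7):
    # Threshold as stated in the data analysis: AUC >= 0.7 is considered acceptable.
    return auc_value >= threshold


# Hypothetical CPR scores, not study data.
complex_scores = [5, 7, 8, 9]
simple_scores = [2, 3, 5, 6]
value = auc(complex_scores, simple_scores)
print(value, is_potentially_useful(value))
```

An AUC of 0.5 corresponds to a score that ranks complex and simple cases no better than chance, and 1.0 to perfect separation.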
The total cohort was subsequently divided into patients who presented at the Amsterdam University Medical Centers (tertiary referral hospital) and those who presented at the Northwest Hospital Alkmaar (large teaching hospital). Subgroup analyses were performed, and the outcomes as described for the primary analysis were presented. All statistics were performed using IBM SPSS Statistics version 26 (IBM SPSS 26.0, Armonk, NY).
Results
Literature review: Identification of the CPRs
The literature search yielded 6,218 articles, of which 2,477 were found in PubMed and 3,741 in Embase. After removal of duplicates, 4,105 articles were screened for title and abstract. Subsequently, 97 articles were assessed for full text, of which 59 were excluded due to various reasons (Figure 1). The 38 included studies reported on 31 different CPRs that were tested for their ability to differentiate between simple and complex appendicitis. Of these 31 identified CPRs, 14 were excluded. Reasons for exclusion of CPRs were the use of variables that are not routinely measured in our emergency department (n = 8) and use of CT scan results (n = 6). Variables not routinely measured in our emergency department were rectal tenderness, classification of rebound tenderness into light, medium, and strong, and laboratory markers such as matrix metalloproteinase and erythrocyte sedimentation rate.
The 17 CPRs that were included in this study were: the Alvarado Score, BADCF score, Heidelberg Appendicitis score, Ohmann score, Pediatric Appendicitis Score (PAS), Tzanakis score, and the CPRs developed by Atema, Bogaard, Bonadio, Bröker, Eddama, Feng, Gorter, Graham, Hansson, Kim, and Obinwa.
During the study period, 630 children were identified using ICD codes for acute appendicitis and acute abdomen. Of these, 80 were excluded for the following reasons: patients were treated nonoperatively (n = 39), a noninflamed appendix was found during surgery or at histopathological examination (n = 22), patients were transferred for treatment of complications after appendectomy elsewhere (n = 14), patients underwent appendectomy for an indication other than acute appendicitis (n = 3), and patients objected to the use of their data (n = 2). In total, 550 patients with a median age of 11 years (IQR 9–14) were included. Of these patients, 342 (62%) were diagnosed with simple and 208 (38%) with complex appendicitis according to our gold standard. General characteristics of the study cohort are shown in Table II.
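The cohort flow reported above can be checked with a few lines of arithmetic. The counts are taken directly from the text; the variable names are illustrative only.

```python
identified = 630

# Exclusion counts as reported in the text.
exclusions = {
    "treated nonoperatively": 39,
    "noninflamed appendix": 22,
    "transferred after appendectomy elsewhere": 14,
    "appendectomy for another indication": 3,
    "objected to use of data": 2,
}

excluded = sum(exclusions.values())
included = identified - excluded

simple_cases, complex_cases = 342, 208
pct_simple = round(100 * simple_cases / included)
pct_complex = round(100 * complex_cases / included)

print(excluded, included, pct_simple, pct_complex)
```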
Table III shows the number of patients for which CPR scores could be calculated and ROC curves were drawn. The CPRs of Atema, Bonadio, Feng, the Alvarado Score, and PAS could be calculated for fewer than 50% of patients in our cohort, mainly due to missing data, and were therefore excluded from analysis (Table III). The presence of a fecalith on imaging and absolute neutrophil count were the most frequently missing variables for the scores of Atema and Bonadio, respectively, and left shift was most frequently missing for both the Alvarado Score and PAS. The CPR of Feng was designed for children up to 5 years old and could therefore be calculated for fewer than 50% of patients in this cohort.
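The 50% rule can be sketched as follows, assuming each patient record is a dictionary in which a missing variable is stored as `None`. The record layout, variable names, and values are hypothetical and serve only to show how a single frequently missing variable (such as left shift) can exclude a CPR.

```python
def calculable_fraction(patients, required_variables):
    """Fraction of patients for whom every variable required by a CPR is available."""
    if not patients:
        raise ValueError("cohort is empty")
    complete = sum(
        all(p.get(v) is not None for v in required_variables) for p in patients
    )
    return complete / len(patients)


def include_in_analysis(patients, required_variables, min_fraction=0.5):
    # A CPR that can be calculated for <50% of the cohort is excluded.
    return calculable_fraction(patients, required_variables) >= min_fraction


# Hypothetical records: left shift is missing for 3 of 4 patients.
cohort = [
    {"crp": 40, "left_shift": None},
    {"crp": 12, "left_shift": True},
    {"crp": 85, "left_shift": None},
    {"crp": 30, "left_shift": None},
]
print(include_in_analysis(cohort, ["crp"]))                # CRP available for all
print(include_in_analysis(cohort, ["crp", "left_shift"]))  # calculable for 25% only
```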
Table III. Number of patients for which CPR scores could be calculated
The ROC curves of the CPRs are shown in the Supplementary Appendix S2, and the corresponding AUC values are presented in Table IV. In our cohort, CPRs with an AUC value of >0.7 were the CPRs of Gorter (0.81), Bogaard (0.79), Bröker (0.79), Graham (0.77), Eddama (0.76), BADCF score (0.76), and Hansson (0.75). Subgroup analysis of the performance of CPRs for each hospital individually showed that the CPRs of Bogaard (AUC value: 0.90), Bröker (AUC value: 0.86), Eddama (AUC value: 0.81), Gorter (AUC value: 0.88), BADCF score (AUC value: 0.77), Hansson (AUC value: 0.76), and Kim (AUC value: 0.76) had an acceptable discriminative power for patients who presented at the Amsterdam UMC (tertiary referral center) (Supplementary Appendix S4). In the cohort of patients who presented at the Northwest Hospital (large peripheral teaching hospital), the CPRs of Bogaard (0.76), Bröker (0.76), Eddama (0.76), Gorter (0.78), Graham (0.78), Hansson (0.78), and BADCF score (0.75) had an AUC value >0.7 (Supplementary Appendix S4).
Table IV. Diagnostic accuracy of CPRs (total cohort)
The diagnostic accuracy of the CPRs, according to the cut-off values as provided by the original articles, is displayed in Table IV, and the STARD flow diagrams can be found in the Supplementary Appendix S5. For 7 out of the 12 identified CPRs, a cut-off value was provided by the original article. The negative predictive values of the BADCF score and the CPRs of Bogaard, Gorter, Graham, and Kim with their original cut-off values are ≥80%, implying that ≥80% of patients with a test result of simple appendicitis were also diagnosed with simple appendicitis according to the gold standard. Focusing on ruling out complex appendicitis, in our cohort the BADCF score and the CPRs of Graham and Kim had the best negative likelihood ratios (0.2, 95% CI 0.2–0.4, 0.1–0.4 and 0–0.5, respectively).
The CPR of Gorter has the highest positive likelihood ratio (3.4, 95% CI 2.7–4.3), although still approximately one-third of patients with complex appendicitis according to the CPR were diagnosed with simple appendicitis according to the gold standard.
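The secondary outcomes follow from a 2×2 table in which complex appendicitis is the positive result. The sketch below uses the standard definitions; the counts are hypothetical and are not taken from Table IV, and confidence intervals are not computed.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table; 'positive' means complex appendicitis.

    Raises ZeroDivisionError if a denominator is zero (e.g. specificity of exactly 1).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                   # positive predictive value
        "npv": tn / (tn + fn),                   # negative predictive value
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


# Hypothetical counts for illustration only.
result = diagnostic_accuracy(tp=80, fp=20, fn=20, tn=80)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

A negative likelihood ratio close to 0 supports ruling out complex appendicitis after a negative test, whereas a high positive likelihood ratio supports ruling it in after a positive test.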
Discussion
This study identified 31 CPRs through a scoping review of the literature, of which 17 were included for external validation. Owing to missing data, only 12 of these CPRs were included in our final analysis. External validation of these CPRs in our multicenter historical cohort showed that 7 (Gorter, Bogaard, Bröker, Graham, Eddama, BADCF, and Hansson) had an AUC value ≥0.7 and thus were considered as potentially useful in our daily practice.
Accurate preoperative differentiation between simple and complex appendicitis has become increasingly important, as potential treatment strategies for both types of appendicitis differ. International guidelines on the management of acute appendicitis recommend emergency appendectomy (as soon as feasible after diagnosis) for complex appendicitis, because this strategy is associated with better outcomes.
Initial nonoperative treatment of simple appendicitis is found to be safe and effective and therefore has become a potential alternative for appendectomy. When appendectomy is preferred for simple appendicitis, it can safely be delayed up to 24 hours after diagnosis, since several studies have shown that a short time delay from emergency department to surgery did not lead to increased appendiceal perforation rates or worse outcomes.
Thus far, the ability of several variables to differentiate between simple and complex appendicitis has been tested and some clinical variables, such as duration of abdominal pain, have been found to be predictive for complex appendicitis.
However, most clinical variables derived from history and physical examination are subject to interpretation bias, which reduces their reproducibility.
Therefore, the potential discriminative ability of (objective) biomarkers has also been investigated, demonstrating high diagnostic accuracy of white blood cell count, granulocyte count, and especially CRP for prediction of complex appendicitis.
It is hypothesized that the discriminative ability of biomarkers could be improved in combination with other objective predictors, such as imaging modalities. Radiological variables highly specific for appendiceal perforation are signs of an abscess, intraluminal fecalith, and especially loss of the submucosal layer of the appendix.
Therefore, it has been suggested that combining these objective variables into CPRs would optimize differentiation between simple and complex appendicitis. This was confirmed in the present study, as the CPRs that combine objective variables such as vital parameters and laboratory results with radiological variables (Gorter, Bogaard, BADCF) had the highest AUC values. Especially objective variables such as body temperature, CRP, and free fluid on imaging are included in the majority of these CPRs and are therefore found to be highly discriminative. In contrast, CPRs mainly consisting of physical examination variables (Heidelberg Appendicitis Score and Ohmann) were the least useful for differentiation between simple and complex appendicitis.
This study identified 7 CPRs, with a wide variety of clinical, biochemical, and radiological variables. These CPRs could be used for clinical decision-making about timing of appendectomy (emergency appendectomy versus urgent appendectomy within 24 hours after diagnosis) or for selecting patients who are eligible for nonoperative treatment. Selecting a CPR for use in daily clinical practice depends on 2 main factors. First, it depends on whether the integrated variables in the CPRs are already routinely assessed in daily practice and, second, on the purpose of integrating CPRs in clinical practice. Of note, some CPRs are more useful for selecting patients for nonoperative treatment, whereas others are more accurate in identifying patients with complex appendicitis in need of emergency treatment. For the former purpose, a CPR with a high sensitivity should be chosen to ensure that no patients with complex appendicitis are selected for nonoperative treatment. For the latter purpose, however, a CPR with a high specificity should be selected to ensure prompt treatment for those patients with complex appendicitis. In our daily practice, the CPR of Gorter is used in a randomized controlled trial for selection of patients who are eligible for nonoperative treatment.
This CPR was chosen because of the high sensitivity and negative predictive value that were found in a previous validation study, and because it consists of 5 variables that are routinely assessed in our daily clinical practice.
Difference in performance of the CPRs in the original studies compared to the present study can first be explained by the fact that most of the CPRs tested in our study were developed and validated in the same cohort of patients and therefore lack external validation. The performance of a CPR is generally lower in cohorts of patients other than the one in which it was developed. This could be one of the main reasons that the AUC values of the selected CPRs were lower in the present study compared to the original studies. Only the CPR of Gorter was externally validated after development, yielding a sensitivity of 86%, specificity of 91%, positive predictive value of 67%, negative predictive value of 97%, positive likelihood ratio of 10, and negative likelihood ratio of 0.15.
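As a consistency check, the likelihood ratios reported for the external validation of the Gorter CPR can be recovered from the reported sensitivity and specificity using the standard definitions. Because the inputs are themselves rounded, the result only approximates the published values.

```latex
\[
LR^{+} = \frac{\text{sensitivity}}{1-\text{specificity}} = \frac{0.86}{1-0.91} \approx 9.6,
\qquad
LR^{-} = \frac{1-\text{sensitivity}}{\text{specificity}} = \frac{1-0.86}{0.91} \approx 0.15
\]
```

These values agree with the reported positive likelihood ratio of 10 and negative likelihood ratio of 0.15, allowing for rounding.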
The present study found a slightly lower diagnostic accuracy, which could be explained by differences in population. The original CPR was validated in a peripheral teaching hospital, whereas our cohort consisted of patients from both a tertiary referral center and a peripheral teaching hospital. For the development of future CPRs, external validation is recommended. Second, differences could be explained by the fact that the goals for which the CPRs were developed vary. Some CPRs were specifically developed for the differentiation between simple and complex appendicitis, whereas others were originally developed for the diagnosis of acute appendicitis (without differentiation of type) and were tested for their ability to differentiate between simple and complex appendicitis in previous studies.
Third, the present study found a higher discriminative ability of CPRs in the tertiary center subgroup compared to the peripheral teaching hospital, which could be explained by different characteristics of the subgroups. In general, children treated at the tertiary center were younger, and a higher proportion of patients were diagnosed with complex appendicitis compared to the patients treated at the peripheral teaching hospital.
The results of our study should be interpreted in the light of some limitations. Data regarding the variables of the CPRs and outcomes were retrospectively collected, possibly leading to information bias. Due to missing data, 5 of 17 identified CPRs could not be calculated for at least 50% of patients in our cohort and were therefore excluded from analysis. Moreover, 14 out of 31 CPRs were excluded because they contained variables that were not routinely collected in our daily practice. Therefore, these CPRs could not be tested in our cohort of patients, although they could potentially be useful in another population. Furthermore, for the CPRs that used ultrasound variables, ultrasound reports were not reassessed by a radiologist. Moreover, ultrasound performance and conclusions are prone to interobserver variability, which is also dependent on the skills of the radiologist. Additionally, the patients included in our study were identified using ICD codes for acute appendicitis and acute abdomen, which might have led to selection bias in cases of wrong ICD code classification. Moreover, as acute appendicitis is a disease with different behaviors in different pediatric age groups (eg, children younger than 5 years old versus teenagers), it could be interesting to investigate the applicability of the CPRs in different age groups. Lastly, generalizability of the results of this study could be questioned, as those CPRs that are applicable in our daily practice were selected. However, nowadays, global guidelines on the diagnostic work-up and treatment of acute appendicitis are followed by many countries; therefore, management of patients is becoming increasingly comparable.
Furthermore, in this study, 7 CPRs consisting of different clinical, biochemical, and radiological variables have been identified as potentially useful. Therefore, surgeons from different countries can select the CPR that consists of variables that are routinely assessed in their daily practice, thereby keeping in mind the purpose for which they are planning to use the CPR. In the present study, the CPRs with the best discriminative ability were those developed in a population consisting of patients treated in The Netherlands.
These CPRs were therefore tested in a patient cohort that was relatively similar to the population in which the CPRs were developed, which could have contributed to the high diagnostic accuracy. Thus, apart from selecting one of the CPRs identified in the present study, it might also be beneficial for clinicians to develop a CPR in their own population.
Strengths of this study include the predefined definitions for simple and complex appendicitis based on the perioperative and histopathological classification as proposed by Bhangu et al.
This resulted in an objective and reproducible classification of appendicitis severity according to the gold standard. Furthermore, all data were collected by one author, and, subsequently, the entire database was checked by another author. Classification of simple and complex appendicitis according to the gold standard was also independently performed by 2 authors. Additionally, our cohort consisted of patients treated in a tertiary referral center and a large peripheral teaching hospital, which improved the generalizability of our results. This is also reflected by the distribution of simple versus complex appendicitis in this study, which was comparable to the distribution that was found in a recent nationwide audit in The Netherlands.
In conclusion, external validation of 12 CPRs in a multicenter retrospective cohort showed that 7 CPRs had an AUC value >0.7 and were therefore potentially useful for the differentiation between simple and complex appendicitis in our population. CPRs consisting of a combination of clinical, biochemical, and radiological variables were found to have the highest discriminative ability. We propose the integration of these 7 CPRs in daily clinical practice to guide decision making regarding treatment strategies for children with acute appendicitis.
Funding/Support
Outside the submitted work Dr R.R. Gorter and Dr R. Bakx received a (governmental) ZonMw grant for research in the field of complex appendicitis in the pediatric population (grant number: 80-85009-98-2007).
Conflict of interest/Disclosure
The authors have no conflicts of interest or financial ties to disclose for the submitted work.