Cirugía Española: Efficiency of the Bethesda System for Thyroid Cytopathology
2018;96(6):363–368
CIRUGÍA ESPAÑOLA
www.elsevier.es/cirugia
Original article
Article history:
Received 29 December 2017
Accepted 19 February 2018
Available online 29 June 2018

Keywords:
Thyroid cancer
Bethesda
Fine-needle aspiration
Cytology

Introduction: Fine-needle aspiration biopsies are a key tool for preoperative assessment of thyroid nodules, and the Bethesda system is the preferred method to report cytological analysis. The purpose of this study is to assess the efficiency of the Bethesda system to identify the malignancy risk of thyroid nodules.
Methods: Patients who underwent thyroid surgery between June 2010 and June 2017 were included. Samples were classified into six categories according to rates of malignancy associated with each diagnostic category. In order to investigate the correlation between categories, a statistical analysis compared the categories with pathology reports. Diagnostic indicators were calculated as a screening test (categories IV, V, VI as true-positive) and as a method to identify malignancy (V, VI as true-positive).
Results: In a series of 522 patients, we found 184 (35.2%) malignant tumors, papillary
carcinoma being the most prevalent with 155 cases (84.2%). Malignancy rates for diagnostic
categories were: I, 0%; II, 1.5%; III, 6.4%; IV, 31%; V, 86.5%; VI, 100%. A robust correlation was
identified between categories on statistical analysis. For the “screening test” analysis,
sensitivity was 98.9%, specificity 84.4%, positive predictive value 69.6%, negative predictive
value 99.5%, and diagnostic accuracy 88.2%. Analysing the accuracy to detect malignancy,
values were: sensitivity 98.6%, specificity 97.6%, positive predictive value 93.5%, negative
predictive value 99.5%, diagnostic accuracy 97.9%.
Conclusion: The Bethesda system is a clear and reliable approach to report thyroid cytology
and therefore is an effective tool to identify malignancy risk and guide clinical management.
© 2018 AEC. Published by Elsevier España, S.L.U. All rights reserved.
Please cite this article as: Mora-Guzmán I, Muñoz de Nova JL, Marín-Campos C, Jiménez-Heffernan JA, Cuesta Pérez JJ, Lahera Vargas M,
et al. Rendimiento del sistema Bethesda en el diagnóstico citopatológico del nódulo tiroideo. Cir Esp. 2018;96:363–368.
* Corresponding author.
E-mail address: [email protected] (J.L. Muñoz de Nova).
2173-5077/© 2018 AEC. Published by Elsevier España, S.L.U. All rights reserved.
Abstract (Spanish version, translated)

Keywords:
Thyroid cancer
Bethesda
Fine-needle aspiration
Cytology

Introduction: Fine-needle aspiration is a key component of the preoperative evaluation of thyroid nodules, and the Bethesda system is the most widely accepted method for categorizing the cytological analysis. The aim of the study is to evaluate the validity of the Bethesda system for diagnosing malignancy in thyroid nodular disease.
Methods: Consecutive patients who underwent thyroid surgery between June 2010 and June 2017 were included. The preoperative aspiration was analyzed according to the Bethesda system, and this result was correlated with the definitive histology for each biopsied nodule. Diagnostic test parameters were calculated as a screening test (true positive: categories IV, V, VI) and as a method to identify malignancy (true positive: categories V, VI).
Results: A total of 522 patients were included, of whom 184 (35.2%) had a carcinoma on definitive histology; papillary carcinoma was the most frequent (84.2%). The malignancy rates in the biopsied nodule for each Bethesda category were: I, 0%; II, 1.5%; III, 6.4%; IV, 31%; V, 86.5%; and VI, 100%. In the analysis as a screening test, sensitivity was 98.9%, specificity 84.4%, positive predictive value 69.6%, negative predictive value 99.5%, and overall diagnostic accuracy 88.2%. In the analysis to detect malignancy, the parameters were: sensitivity 98.6%, specificity 97.6%, positive predictive value 93.5%, negative predictive value 99.5%, and overall diagnostic accuracy 97.9%.
Conclusions: The Bethesda system is a simple and reproducible method for the cytological categorization of thyroid nodules, a useful tool for management, and effective for identifying the risk of malignancy.
© 2018 AEC. Published by Elsevier España, S.L.U. All rights reserved.
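The two analyses summarized in the abstract can be reproduced from raw counts. The following is a minimal sketch, not the authors' SPSS workflow: the class and variable names are hypothetical, and the counts are those listed in Table 3 of the article (analysis I treats categories IV, V and VI as positive; analysis II treats categories V and VI as positive; category II is negative in both).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 counts of cytology result against final histology (hypothetical helper)."""
    tp: int  # cytology positive, histology malignant
    fn: int  # cytology negative, histology malignant
    fp: int  # cytology positive, histology benign
    tn: int  # cytology negative, histology benign

    def metrics(self) -> dict[str, float]:
        """Return the standard test parameters as percentages, one decimal."""
        total = self.tp + self.fn + self.fp + self.tn
        raw = {
            "sensitivity": self.tp / (self.tp + self.fn),
            "specificity": self.tn / (self.tn + self.fp),
            "ppv": self.tp / (self.tp + self.fp),
            "npv": self.tn / (self.tn + self.fn),
            "accuracy": (self.tp + self.tn) / total,
        }
        return {name: round(100 * value, 1) for name, value in raw.items()}


# Counts taken from Table 3 of the article.
analysis_1 = Confusion(tp=87, fn=1, fp=38, tn=205)  # II vs IV+V+VI
analysis_2 = Confusion(tp=72, fn=1, fp=5, tn=205)   # II vs V+VI

if __name__ == "__main__":
    for label, table in (("Analysis I", analysis_1), ("Analysis II", analysis_2)):
        print(label, table.metrics())
```

Moving category IV out of the positive group removes 38 − 5 = 33 false positives at the cost of 15 true positives, which is why specificity and PPV rise sharply in analysis II while sensitivity and NPV barely change.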
[…] follicular neoplasm/suspicion of follicular neoplasm; (V) suspected malignancy and (VI) malignant. Repeat needle aspiration was indicated only for diagnostic categories I and III, and for benign aspirations with a high degree of clinical-radiological suspicion. After needle aspiration, surgery was indicated in patients with categories IV, V and VI; in patients with persistent category I after repeated aspiration who presented a high degree of clinical-radiological suspicion; in patients with persistent category III after repeat aspiration, or after the initial aspiration if there was a high degree of suspicion; and in patients with category II who presented symptoms attributable to thyroid nodularity, hyperfunction, progressive growth of the thyroid nodule (TN), or any TN >4 cm. The surgical technique used in each case was based on the individual characteristics of the patient, the Bethesda system (BS) category and the location of the TN. In general, hemithyroidectomy was performed in the presence of unilateral nodules or millimetric contralateral nodules in categories I–IV. In the presence of symptomatic bilateral multinodular goiter, Graves' disease or categories V–VI, total thyroidectomy was selected.

Patient follow-up data and the final histological correlation were available only for patients managed surgically. If a patient had several FNA samples from different TN, the results of each aspiration and the corresponding histological results were analyzed separately. Each evaluated TN was reviewed thoroughly, correlating the ultrasound description that guided the FNA (size and location) with the findings of the surgical specimen to confirm that the biopsied TN matched its definitive pathology diagnosis. Regarding the study design, we analyzed a prospectively maintained database that collected the diagnostic-therapeutic data of all patients, in particular demographic data, size and ultrasound location of the TN, BS diagnostic category (in cases of multiple aspirations in the same patient, only the highest BS risk category was included), operative data and pathology data. This study was approved by the Clinical Research Ethics Committee at our hospital.

Fig. 1 – Diagram of patients included in the study: 631 patients operated on from June 2010 to June 2017; excluded: 67 with FNA at another hospital, 25 with no preoperative FNA and 17 with FNA not in accordance with the Bethesda system; 522 patients included.

Statistical Analysis

The statistical analysis was carried out using SPSS® 23.0 for Windows (SPSS Inc., Chicago, Illinois, USA). The results were expressed as percentages for categorical variables, and as mean and standard deviation for continuous variables, using the median and interquartile range for variables with asymmetrical distribution.

The correlation between the different diagnostic categories was assessed by comparing them with the final histological result, for which a log-linear model (likelihood ratio) and a chi-squared model were applied, using symmetrical measures of association. The malignancy data used were calculated by assigning to each biopsied nodule its corresponding final histological diagnosis. Differences were considered statistically significant at two-sided P values <.05. The phi correlation was used as a measure of the degree of association between categorical variables; its values range between +1 and −1: −1 indicates a strong negative association, +1 indicates a strong positive association and 0 indicates absence of association.

The diagnostic test parameters calculated were sensitivity, specificity, predictive values (positive predictive value [PPV], negative predictive value [NPV]) and diagnostic accuracy to detect malignancy by means of two analyses. In the analysis as a screening test (analysis I), the FNA results were considered an indication for surgery for suspected malignancy (BS category II vs categories IV, V, VI). In this analysis, the terms “positive” and “negative” denoted the presence or absence of a surgical indication. Categories I and III were excluded from this analysis because they may involve the repetition of FNA. A second analysis (analysis II) measured the ability of the test to detect malignancy in highly suspicious aspirations (categories V and VI) compared to patients with benign aspiration (category II).

Results

In the study period, 631 patients were treated. Excluded from the study were 67 patients with FNA performed at another hospital, 25 patients without preoperative FNA, and 17 patients in whom the FNA was not reported in accordance with the BS (Fig. 1). Of the 522 patients included, 433 (83%) were women, and the mean age was 51.8±16 years. The median TN size evaluated preoperatively was 2.5 cm (1.6–4). The most frequent cytology among the operated patients was category II (49%), with similar percentages of patients operated on with categories III, IV, V and VI (14.9, 13.6, 7.1 and 11.5%, respectively). In 316 cases (60.5%), total thyroidectomy was performed; the remainder had hemithyroidectomies with isthmectomies (39.5%). Dissection of the central compartment was associated in 66 cases (12.6%). Regarding histological results, 184 malignancies (35.2%) were identified; papillary carcinoma was the most frequent tumor with 155 cases (84.2%), 42 of which were incidental microcarcinomas (27.1% of the total papillary carcinomas and 8% of the total number of patients treated surgically). The remaining
neoplasms identified were 19 follicular carcinomas (10.3%), eight medullary carcinomas (4.3%), an anaplastic carcinoma (0.5%) and a thyroid lymphoma (0.5%).

As for the percentages of malignancy in the different BS categories, after excluding incidental microcarcinomas, malignancy rates for categories II, III, IV, V and VI were 4.6%, 11.5%, 33.8%, 86.5% and 100%, respectively (Table 1). In category I, the rate of malignancy was 35.3%, but in no case was this due to the preoperatively biopsied nodule, while in the overall series 86.4% of patients presented the tumor on the nodule that had been biopsied preoperatively. Thus, the rates of malignancy attributable to the biopsied nodule for categories II, III, IV, V and VI were 1.5%, 6.4%, 31%, 86.5% and 100%, respectively. By analyzing the differences between the percentages of malignancy in each of the different categories, we found a strong correlation in practically all of the comparisons (Table 2). Statistically significant differences were absent only between categories I and II (P=1.000) and between categories I and III (P=.581).

Regarding the performance of the BS, when we analyzed its utility as a screening test (analysis I: category II vs IV+V+VI), we found a sensitivity for detecting malignancy of 98.9%, with a specificity of 84.4%, a PPV of 69.6%, an NPV of 99.5% and an overall diagnostic accuracy of 88.2% (Table 3). In highly suspicious aspirations (analysis II: category II vs V+VI), the overall accuracy of the test increased to 97.9% (sensitivity 98.6%, specificity 97.6%, PPV 93.5% and NPV 99.5%).

Discussion

The most commonly used method for the description and categorization of thyroid FNA samples is the BS [8,10]. It is based on six categories, for each of which there is an estimated risk of thyroid cancer [12]. Our study seeks to review this risk, comparing cytology findings with the only possible gold standard: the definitive histological study of patients who have undergone thyroid surgery. The malignancy rates on which the statistical study has been based do not take incidental microcarcinomas into account, since most of them will have an indolent clinical course. In addition, we have considered only the existence of malignancy on the biopsied nodule, since we sought to define the capacity of the cytology study to identify the malignancy, not that of the ultrasound selection of the nodule to be biopsied. Despite this, we would like to point out that only 20 patients (3.8%) had a tumor >1 cm in one of the non-biopsied nodules. For identifying malignancy, our analysis shows the existence of a strong correlation
for each of the BS categories, with differences between almost all of them (Table 2). This could justify maintaining the six categories, as occurred in the recent revision of the BS [13]. The percentages of malignancy observed for each category were, in general, within the limits described [9,10].

Only 3.3% of the patients who underwent surgery had category I, a figure well below reports of other authors (7–26%) [8,14]. We attribute this figure to the fact that all FNA were guided by ultrasound, avoiding other types of aspiration, while a cytologist evaluated the quality of the material obtained in situ. The percentage of malignancy for the nodule biopsied within this category was 0%, which represents an ideal percentage given that this category reflects the importance of obtaining satisfactory material for cytological analysis.

The malignancy rate associated with category III was 6.4%, similar to the originally proposed rate [7]. Although subsequent studies presented rates of up to 48% for this category, this can be attributed to the selection of surgical patients and the inclusion of incidental neoplasms in the analysis [15]. To assess this wide variation of malignancy within category III, it is recommended to calculate the quotient between category III and category VI patients [16], whose ideal value should be between 1 and 3. Values above 3 would indicate an overuse of category III, while values lower than 1 would be due to a low use of this category, with the consequent risk of loss of sensitivity for detecting malignancy. In our series, this quotient was 1.3, which brings us closer to the most efficient part of the recommended range.

Regarding the diagnostic test parameters of the BS when assessed as a screening test, that is, with needle aspirations indicating a surgical intervention (Table 3), we observed a sensitivity of 98.9% and an NPV of 99.5%. These data are similar to the Bongiovanni et al. [8] study in terms of sensitivity, although our NPV greatly improves the average indicated in this study (99.5 vs 47%). This last datum is of special importance, since the main objective of preoperative FNA is to rule out the existence of malignancy in order to reduce the number of unnecessary surgeries for this reason. In the second part of the analysis of the diagnostic test parameters of the BS, we have considered their capacity to ensure the existence of malignancy (categories V and VI), obtaining in this case a specificity of 97.6% and a PPV of 93.5%, with an overall accuracy of 97.9%. These data allow the BS to be defined as a very reliable tool when it comes to confirming the existence of malignancy.

Table 3 – Diagnostic Test Parameters of the Bethesda System.

Parameter                  Analysis I (%)a     Analysis II (%)b
Sensitivity                98.9 (87/88)        98.6 (72/73)
Specificity                84.4 (205/243)      97.6 (205/210)
PPV in DC VI               100 (43/43)         100 (43/43)
PPV in DC V                85.3 (29/34)        85.3 (29/34)
PPV in DC IV               31.3 (15/48)        –
PPV                        69.6 (87/125)       93.5 (72/77)
NPV                        99.5 (205/206)      99.5 (205/206)
Rate of false negatives    1.1 (1/88)          1.4 (1/73)
Rate of false positives    15.6 (38/243)       2.4 (5/210)
Diagnostic accuracy        88.2 (292/331)      97.9 (277/283)

a Considers cases in DC IV+V+VI true positives and cases in DC II true negatives.
b Considers cases in DC V+VI true positives and cases in DC II true negatives.

The limitations of the present study include those inherent to cytological analyses. Pathologists should be alert to the possibility of more errors in the analysis of cystic lesions, multinodular goiter or overlapping lesions with similar cytomorphological characteristics, such as the presence of reactive follicular cells or lesions with Hürthle cells. These findings mainly appear in the context of categories I and III, precisely those involved in the only comparisons between BS categories that did not show significant differences. However, we believe that the small number of patients that made up category I limits the ability to detect differences.

Conflict of Interest

The authors have no conflict of interests to declare.

References

1. Burman KD, Wartofsky L. Clinical practice. Thyroid nodules. N Engl J Med. 2015;373:2347–56.
2. Guth S, Theune U, Aberle J, Galach A, Bamberger CM. Very high prevalence of thyroid nodules detected by high frequency (13 MHz) ultrasound examination. Eur J Clin Invest. 2009;39:699–706.
3. Davies L, Morris LG, Haymart M, Chen AY, Goldenberg D, Morris J, et al. American Association of Clinical Endocrinologists and American College of Endocrinology disease state clinical review: the increasing incidence of thyroid cancer. Endocr Pract. 2015;21:686–96.
4. Vaccarella S, Franceschi S, Bray F, Wild CP, Plummer M, Dal Maso L. Worldwide thyroid-cancer epidemic? The increasing impact of overdiagnosis. N Engl J Med. 2016;375:614–7.
5. Siegel R, Ma J, Zou Z, Jemal A. Cancer statistics, 2014. CA Cancer J Clin. 2014;64:9–29.
6. Ross DS. Predicting thyroid malignancy. J Clin Endocrinol Metab. 2006;91:4253–5.
7. Ali SZ, Cibas ES. The Bethesda system for reporting thyroid cytopathology. Definitions criteria and explanatory notes. New York: Springer; 2010.
8. Bongiovanni M, Spitale A, Faquin WC, Mazzucchelli L, Baloch ZW. The Bethesda system for reporting thyroid cytopathology: a meta-analysis. Acta Cytol. 2012;56:333–9.
9. Cibas ES, Baloch ZW, Fellegara G, Livolsi VA, Raab SS, Rosai J, et al. A prospective assessment defining the limitations of thyroid nodule pathologic evaluation. Ann Intern Med. 2013;159:325–32.
10. Haugen BR, Alexander EK, Bible KC, Doherty GM, Mandel SJ, Nikiforov YE, et al. 2015 American Thyroid Association Management Guidelines for Adult Patients with Thyroid Nodules and Differentiated Thyroid Cancer: The American Thyroid Association Guidelines Task Force on Thyroid Nodules and Differentiated Thyroid Cancer. Thyroid. 2016;26:1–133.
11. Cooper DS, Doherty GM, Haugen BR, Kloos RT, Lee SL, Mandel SJ, et al. Revised American Thyroid Association management guidelines for patients with thyroid nodules and differentiated thyroid cancer. Thyroid. 2009;19:1167–214.
12. Ali SZ. Thyroid cytopathology: Bethesda and beyond. Acta Cytol. 2011;55:4–12.
13. Cibas ES, Ali SZ. The 2017 Bethesda system for reporting thyroid cytopathology. Thyroid. 2017;27:1341–6.
14. Theoharis CG, Schofield KM, Hammers L, Udelsman R, Chhieng DC. The Bethesda thyroid fine-needle aspiration classification system: year 1 at an academic institution. Thyroid. 2009;19:1215–23.
15. Iskandar ME, Bonomo G, Avadhani V, Persky M, Lucido D, Wang B, et al. Evidence for overestimation of the prevalence of malignancy in indeterminate thyroid nodules classified as Bethesda category III. Surgery. 2015;157:510–7.
16. Krane JF, Vanderlaan PA, Faquin WC, Renshaw AA. The atypia of undetermined significance/follicular lesion of undetermined significance malignant ratio: a proposed performance measure for reporting in the Bethesda system for thyroid cytopathology. Cancer Cytopathol. 2012;120:111–6.
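
Worked example. Two derived quantities discussed in the article can be checked by hand: the phi coefficient used as the measure of association, and the quotient between category III and category VI cases. The sketch below is an illustration under stated assumptions, not the authors' SPSS procedure. The function names are hypothetical; the 2x2 counts are the analysis II counts from Table 3, and the article does not report a phi value for that particular table, so the result is shown only to illustrate the −1 to +1 scale.

```python
import math


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for the 2x2 table [[a, b], [c, d]]; ranges from -1 to +1."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denom


def category_quotient(category_iii: float, category_vi: float) -> float:
    """Quotient of category III to category VI cases (counts or percentages)."""
    if category_vi == 0:
        raise ValueError("quotient is undefined without category VI cases")
    return category_iii / category_vi


# Rows: cytology V+VI vs II. Columns: histology malignant vs benign.
# Counts are the analysis II counts from Table 3 (assumed layout).
phi_analysis_2 = phi_coefficient(72, 5, 1, 205)

# Percentages of operated patients in categories III and VI (Results section).
quotient = category_quotient(14.9, 11.5)

if __name__ == "__main__":
    print(f"phi (analysis II table): {phi_analysis_2:.3f}")
    print(f"category III / VI quotient: {quotient:.1f}")
```

A phi close to +1 corresponds to the strong positive association described in the Statistical Analysis section, and the quotient rounds to the 1.3 reported in the Discussion, inside the recommended range of 1 to 3.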