Intro

Paper 1

Potential Celiac Patients (PCD) bear the Celiac Disease (CD) genetic predisposition and show a significant
production of anti-human transglutaminase antibodies, but no morphological changes in the small bowel
mucosa. A minority of patients (17%) showed clinical symptoms and needed a gluten-free diet at the time of
diagnosis, while the majority progressed over several years (up to a decade) without any clinical problem
or progression of the small intestinal mucosal damage, even when they continued to consume
gluten in their diet. We recently developed a traditional multivariate approach, a discriminant analysis
model, to predict the natural history on the basis of the information at enrolment (time 0). Still, the
traditional multivariate model requires stringent assumptions that may not be met in the clinical
setting. Starting from a follow-up dataset available for PCD, we propose the application of Machine
Learning (ML) methodologies to extend the analysis of the available clinical data and to detect the most
influential features predicting the outcome. These features, collected at the time of diagnosis, should be
able to distinguish patients who will develop duodenal atrophy from those who will remain potential. Four ML
methods were adopted to select features predictive of the outcome; the feature selection procedure
reduced the overall number of features from 85 to 19. ML methodologies
(Random Forests, Extremely Randomized Trees, Boosted Trees, and Logistic Regression) were adopted,
obtaining high values of accuracy: all report an accuracy above 75%. The specificity score was also always
above 75%, with two of the considered methods over 98%, while the best sensitivity
was 60%. The best model, optimized Boosted Trees, was able to classify PCD starting from the
selected 19 features with an accuracy of 0.80, sensitivity of 0.58 and specificity of 0.84. With this
work, we are able to identify, using ML, the PCD patients who are more likely to develop overt CD. ML
techniques appear to be an innovative approach to predicting the outcome of PCD, since they provide a
step forward in the direction of precision medicine, which aims to customize healthcare, medical therapies,
decisions, and practices, tailoring the clinical management of PCD children.

Abbreviations: PCD, potential celiac patients; CD, celiac disease; ML, machine learning; AUC, area under
the curve; ROC, receiver operating characteristics.

Scientific Reports (2021) 11:5683 | https://doi.org/10.1038/s41598-021-84951-x |
www.nature.com/scientificreports/
1 Department of Mathematics and Applications “Renato Caccioppoli”, University of Naples “Federico II”,
Via Cintia, Monte S. Angelo, 80126 Naples, Italy. 2 Department of Translational Medical Sciences,
University of Naples “Federico II”, Naples, Italy. 3 European Laboratory for the Investigation of Food
Induced Diseases (ELFID), University of Naples “Federico II”, Naples, Italy.

Potential celiac patients are characterized by genetic predisposition to celiac disease
(CD) and the presence of CD-specific antibodies (anti-human tissue transglutaminase and
anti-endomysium antibodies) in the serum, but no
morphological changes in the small bowel mucosa [1–7]. Only a small percentage of them showed
significant clinical symptoms (and were started on a gluten-free diet at the time of diagnosis), while the
majority progressed over several years (up to a decade) without any clinical problem or progression of
the small intestinal mucosal damage, even though they continued a gluten-containing diet. On long-term
follow-up, one third of them progressed to a clear pattern of CD mucosal damage. The real issue was to attempt
to predict, at enrolment, who was more likely to progress to villous atrophy, in order to prevent the
clinical and histological damage related to the disease. In a previous paper, we developed a traditional
multivariate approach to predict, on the basis of the information at enrolment (time 0), the subjects more
likely to develop the full-blown disease. Overall, a discriminant analysis model correctly
classified, at entry, 80% of the children who would not develop a flat mucosa over follow-up, whereas
approximately 69% of those who did develop a flat mucosa were correctly classified by the starting
parameters [1]. As discussed by Wasserstein et al. [8], drawing conclusions based solely on linear
models can give unhelpful information when clinical data are used. Some of the well-known
limitations of linear models are: assumptions about the distribution of the variables that are not
controlled; non-independence of the variables selected in the model; models that, being
hypothesis driven, may not respect the uncertainty about the biological significance of the selected
variables; and relatively small sample sizes, leading to very large confidence intervals on follow-up. In this
second phase, we adopted a machine learning approach to validate an innovative method to predict the
outcome. ML techniques have been proposed to support clinical decisions in studies where multiple features
can affect outcomes. Recently, several seminal papers have invited the community to use
such methods [9–19]. Obermeyer, Rajkomar et al. [9–11] reviewed Artificial Intelligence methods currently
used in medicine, while the impossibility of using large amounts of data without automated code was
discussed by Schwalbe and Wahl [12]. The description of “The All of Us Research Program” [13]
and the recent review on deep learning by Piccialli et al. [14] also focused on these issues. The editorial
office of The Lancet Respiratory Medicine [15] gave some guidelines for ML, as did other authors [16–18]. Beam
et al. [19] focused on guidelines for reproducibility of results. What these studies noted was that
ML is a powerful set of tools that helps extract significant features for the prediction of
outcomes. Nevertheless, because of its wide applicability, considerable caution in the
interpretation of models is required to produce an innovative approach to clinical data. Common
pitfalls and a roadmap for the application of ML methods in the medical domain have been reviewed in
depth [20–23]. Many studies have adopted ML techniques effectively in various clinical frameworks for the
prediction of outcomes. The main domains where ML and other data analysis techniques are widely used are
cancer research and rheumatology, as pointed out respectively in the reviews by Hinkson et al. [24] and
Radstake et al. [25]; in these domains, images are also used to enrich the available data set. Images are
also used in the detection of CD with ML [26, 27]. The most important feature of ML techniques, and perhaps
also their main drawback, is that their application is model-free, data-driven, and intrinsically
non-linear. ML takes advantage of all the available data and uses the different known features in the learning
process: for example, fields with categorical values can be converted into separate numeric fields so
that they are treated separately, without the need for ordering. Our data set presents a temporal pattern
due to the follow-up, with increasing sparsity of the data as the follow-up lengthens. For this
reason, an ML strategy that considers the features collected at enrolment yields a confidence interval
for the final prediction that does not widen, despite the decreasing sample size as the number
of follow-ups increases. This provides a robust methodology compared to the usual statistical parameter
estimates. In this study, we used ML for feature selection and for classification in a new condition, namely
CD and its multifactorial pathogenetic elements. Feature selection indicates the best
predictive items in the dataset, while the classification result is given via a threshold: the model returns
1 (high risk) if its output for a given input exceeds 50%, and 0 otherwise. The aim of this work was, starting
from a follow-up dataset available for PCD, to apply Machine Learning (ML) to select the most influential
features and to introduce predictive models to distinguish patients who developed duodenal atrophy from
those who remained potential on a gluten-containing diet.
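
The conversion of categorical fields, the 50% threshold rule and the accuracy, sensitivity and specificity scores mentioned above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the example field values, the toy probabilities and the toy labels are all assumptions made for illustration.

```python
def one_hot(values):
    """Turn one categorical field into separate 0/1 fields, one per category.

    No ordering between categories is implied by the encoding.
    """
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def classify(probabilities, threshold=0.5):
    """Return 1 (high risk) when the model output exceeds the threshold, else 0."""
    return [1 if p > threshold else 0 for p in probabilities]


def scores(y_true, y_pred):
    """Accuracy, sensitivity and specificity from true and predicted 0/1 labels.

    Assumes both classes occur in y_true, so no denominator is zero.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # share of true progressors detected
        "specificity": tn / (tn + fp),  # share of non-progressors recognised
    }


# Hypothetical toy data, not taken from the study.
encoded, categories = one_hot(["diarrhea", "none", "diarrhea"])
model_output = [0.9, 0.4, 0.6, 0.2, 0.5]
outcome = [1, 1, 0, 0, 0]
predicted = classify(model_output)
result = scores(outcome, predicted)
```

With a strict "exceeds 50%" rule, an output of exactly 0.5 is assigned to class 0. On imbalanced data a model can reach high accuracy and specificity with modest sensitivity, which is why the three scores are reported separately.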

Paper 2

Celiac disease (CD) is an immune-mediated disease characterized by small bowel enteropathy triggered
by dietary exposure to gluten. The classic clinical presentation of CD includes signs of malabsorption and
gastrointestinal symptoms such as abdominal pain, bloating and diarrhea [1]. However, many patients
experience predominantly non-specific extra-intestinal symptomatology [2, 3], or are asymptomatic [4].
The diverse manifestations of CD can make it challenging for primary care providers (PCPs) to identify
and diagnose CD in the general population [5]. Indeed, a majority of adult patients with CD today are
likely undiagnosed [4, 5, 6, 7]. Patients who eventually receive a diagnosis experience a mean delay
of eleven years from symptom onset to diagnosis, with more than half reporting a delay of five or more
years until the diagnosis is established [5]. CD has an estimated global prevalence of over 1% [8], and a
rising incidence in recent years [5, 9, 10]. Diagnostic delays and underdiagnosis of CD therefore
represent an important healthcare problem. Screening for CD is typically performed by highly sensitive
and widely available serum tests for CD autoimmunity (CDA): the most accurate and commonly used
being the assay for antibodies to tissue transglutaminase (tTG-IgA) [11]. Seropositivity is suggestive of
underlying CD, and such patients typically undergo endoscopic evaluation to establish or rule out the
diagnosis of CD via biopsy of intestinal mucosa [12]. High seropositivity (tTG-IgA > 10X the upper limit
of normal, ULN) is associated
with a high (> 95%) positive predictive value (PPV) for villous atrophy and, in the proper clinical setting,
is considered sufficient for diagnosis without a biopsy in children and possibly in adults [6, 13, 14, 15].
Clinical guidelines do not provide clear and consistent definitions on when to screen for CDA. General
population screening is not currently recommended [16], although screening high-risk patients may be
warranted [17, 18]. What defines high-risk groups for CD varies somewhat between reports, although
the focus has typically been on a family history of CD and medical comorbidities with
established associations with CD [19]. Beyond these factors, laboratory abnormalities such as
iron-deficiency anemia are common among patients with CD [20]. While PCPs may be aware of specific risk
factors, signs and symptoms of undiagnosed CD, more subtle combinations of clinical features may be
missed. Machine learning (ML) algorithms have the potential to use existing data within a patient’s
electronic medical record (EMR) to provide risk assessments to providers [21]. ML models have been
developed to alert intensive care unit physicians to patients at risk of circulatory failure [22], to identify
clinically significant portal hypertension in non-alcoholic steatohepatitis patients from pathology reports
[23], to predict incident hypertension [24] and hypertension outcomes [25], to identify patients with
undiagnosed psoriatic arthritis [26], to predict dementia onset [27], future Parkinson’s disease diagnosis
[28], and to flag patients at risk of advanced colorectal cancer [29]. One previous study that attempted
to develop ML models to identify patients with incident CD using a variety of modeling methods found
that the models were not consistently better than chance [30]. Another study showed positive results,
but the study size was small and the models relied on symptoms extracted from unstructured clinical
documents and diagnostic codes [31]. Neither study included objective laboratory test results as
predictive input features, which may have hindered performance and limited generalizability. The goal
of the current study was to develop and assess a prescreening EMR-based tool to classify adult patients
by risk of having unidentified CD autoimmunity using ML methods. Input features included commonly
available laboratory results, age and biological sex. Incident cases were identified from a large
retrospective community-based dataset using results from tTG-IgA testing. Highly seropositive cases
(tTG-IgA > 10X ULN) with probable underlying CD were used for model training and evaluation against
cohorts of controls with no evidence of disease. Performance was additionally assessed in a test set
consisting of a cohort of seropositive cases (tTG-IgA > 2X ULN) who may require endoscopic evaluation
for CD and a cohort of controls. In both test sets, discriminative ability, as measured by the estimated
area under the ROC curve (AUC), was assessed at multiple time points before the first documented evidence
of CD autoimmunity.
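
The cohort definitions by multiples of the ULN and the AUC used as the performance measure can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the study's pipeline: the function names, the category labels and the example values are hypothetical, and the AUC is computed here by direct pairwise comparison of case and control scores.

```python
def serostatus(ttg_iga, uln):
    """Label a tTG-IgA result by its multiple of the upper limit of normal (ULN).

    The labels are hypothetical; the cut-offs (> 10X and > 2X ULN) follow the text.
    """
    multiple = ttg_iga / uln
    if multiple > 10:
        return "highly seropositive"
    if multiple > 2:
        return "seropositive"
    return "below 2X ULN"


def auc(case_scores, control_scores):
    """Area under the ROC curve as the probability that a randomly chosen case
    receives a higher model score than a randomly chosen control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical model scores, not taken from the study.
example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.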

Paper 3

medRxiv preprint, doi: https://doi.org/10.1101/2021.01.20.21250194; this version posted January 26, 2021
(not certified by peer review). The copyright holder for this preprint is the author/funder, who has
granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed
without permission.

Several histopathological features and serologic tests are used to diagnose celiac disease (CD).
Biopsy-based assessment of the proximal small intestine is the current gold standard for confirmation of CD
[1, 2, 3], with severity assessment using the Marsh–Oberhüber classification (modified Marsh score) being
common in clinical practice [4, 5]. Serologic tests are frequently used to identify individuals who require
a biopsy; these include anti-gliadin IgA and IgG (AGA IgA and AGA IgG), anti-reticulin IgA (ARA),
anti-endomysium IgA (EMA), and anti-tissue transglutaminase IgA (TTG) antibodies [1]. Although guidelines
exist for CD diagnosis, its severity assessment is challenging. It is prone to interobserver
variability, non-specificity of discerning histopathological features, and a lack of specific
histopathological markers associated with either CD severity classes or comorbid conditions. The
Marsh–Oberhüber classification is a commonly used method for CD severity assessment in clinical
practice based on histopathological features involving villus and crypt injury and intraepithelial
inflammation [4, 5]. Unfortunately, high interobserver variability is found when this system is applied to
clinical samples, and agreement is generally only fair (κ = 0.29-0.35) [6, 7]. Alternate purely
quantitative methods of assessing mucosal damage by direct
measurement of the villous height to crypt depth ratio (Vh:Cd) and intraepithelial lymphocyte (IEL)
counts have been proposed [8]. Using these methods, the inter-observer agreement is moderate (κ =
0.55) [7]; however, this approach is also highly dependent on the use of correctly oriented specimens and
is not used in clinical practice. In patients with CD, there is also evidence of comorbid endocrinopathies
such as type 1 diabetes mellitus (T1DM) and hypothyroidism, which are substantially more common than in
the general population. Further, in comparison to those with isolated T1DM, patients with undiagnosed CD
and T1DM also have a higher prevalence of retinopathy (58% vs. 25%) and nephropathy (42% vs. 4%)
[9, 10, 11]. We chose a deep learning-based methodology for image analytics to develop a novel predictive
model for CD severity and the association of disease features with comorbid conditions. Such models, which
use Convolutional Neural Networks (CNNs), have shown potential benefit for disease characterization using
histological specimens in a broad range of diseases [12, 13, 14, 15], including CD [16]. They are
optimized to discern fine features from regions of interest within the biopsy images. One example
is the deep residual network (ResNet), which has outperformed earlier deep learning models and
achieved superior performance on image recognition benchmarks [17, 18, 19]. We also deployed a deep
learning method [20] to provide insight into the model's decision-making and to discern specific histologic
features for each class. Using immunohistochemistry, we confirmed the features identified by the model as
specific for severe CD (Marsh IIIc) and concurrent endocrine comorbidities, which supported the
utility of deep learning CNN models for discerning disease-specific features.
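
The interobserver agreement values quoted above (κ) are kappa statistics. As a reminder of how such a value arises, the following Python sketch computes Cohen's kappa for two raters. It is an illustration only: the cited studies may have used a different kappa variant, and the function name and the example Marsh grades are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters grading the same samples.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement).
    Assumes the raters do not both use one single identical category throughout,
    otherwise the denominator is zero.
    """
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of each rater's marginal frequency per category.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical Marsh grades assigned by two pathologists to six biopsies.
pathologist_1 = ["I", "I", "IIIa", "IIIa", "IIIc", "IIIc"]
pathologist_2 = ["I", "IIIa", "IIIa", "IIIa", "IIIc", "I"]
kappa = cohen_kappa(pathologist_1, pathologist_2)
```

In this toy example the raters agree on four of six biopsies (0.67 observed agreement), but one third of agreement is expected by chance, so kappa is 0.5.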

Paper 4

M. E. Tabacchi et al.: Fuzzy-Based Clinical Decision Support System for Coeliac Disease. VOLUME 10, 2022,
p. 102223. This work is licensed under a Creative Commons Attribution 4.0 License. For more information,
see https://creativecommons.org/licenses/by/4.0/. The associate editor coordinating the review of this
manuscript and approving it for publication was Yizhang Jiang.

Coeliac disease (CD) is a rapidly expanding disorder, both in terms of prevalence in the world and in
terms of a more significant number of diagnosed patients; it is an autoimmune disease that can
occur at all stages of life. Advances in understanding the pathogenetic and genetic factors that
influence risk have led to the development and refinement of diagnostic tools. It is a chronic
disease of the small intestine characterized by an abnormal immune response; the latter is due to
exposure to gluten present in the diet in genetically predisposed subjects. The “environmental”
factor triggering coeliac disease is represented by gluten, a protein complex contained in some
cereals (wheat, barley, rye) [1]. Coeliac disease is an autoimmune disorder induced by dietary
gluten in genetically predisposed subjects. CD has a prevalence of ∼1% in many populations around the
world, and the breadth of established clinical presentations continues to increase, making the
disorder of significant relevance in the medical field [2], [3]. CD has been analysed in its many
aspects, and although its pathogenesis and pathophysiology remain unknown, it is assumed that the
disease is strictly connected to genetic interactions and to environmental and immunological factors.
Like other underdiagnosed disorders, CD is depicted as an iceberg of which the most considerable
part is submerged [4]. There are no particular manifestations in the silent form of coeliac disease,
and for this reason, it is difficult to diagnose. The latent form instead characterizes those subjects
who, despite having a predisposition to coeliac disease (positivity for anti-gliadin (AGA) and
anti-endomysial (EMA) antibodies), currently have a normal intestinal mucosa that does not
present atrophy of the villi. However, atrophy will appear after some time, and therefore periodic
monitoring is necessary. With greater awareness of the disease, an increasing number of patients
are diagnosed and therefore, as with many other autoimmune disorders, the real incidence in the
population seems to have increased [5]. In diagnosing coeliac disease, serology is usually the first
step in diagnosing or ruling out the disease in symptomatic patients or for screening. The biopsy is
essential for the definitive diagnosis of the pathology. The serological markers of coeliac disease
are: IgA against tTG, endomysial antibodies (IgA), IgG against DGD, IgA versus deamidated gliadin
peptide, and IgG versus tTG. A small number of coeliac disease patients have had negative serological test
results. Therefore, biopsies should be performed if there is a high clinical suspicion of coeliac
disease, regardless of these findings. For asymptomatic patients, especially children, who have
slight increases in serological markers of the disease, biopsy analysis can be delayed. Duodenojejunal
gastroscopy with intestinal biopsy is undoubtedly helpful to confirm the diagnosis
and to ascertain its degree; however, being an invasive examination, it is desirable, especially in
children, that it is carried out only when necessary. The European Society for Paediatric Gastroenterology,
Hepatology and Nutrition (ESPGHAN) guidelines [6] report the possibility of avoiding biopsy to
diagnose coeliac disease in genetically susceptible children with high titres of tTG_IgA antibodies.
Unfortunately, this is only possible in a few patients and is influenced by the lack of standardization
of anti tTG_IgA kits. In order to satisfy the heterogeneity of data and their complexity, a Decision
Support System (DSS) has to be considered for their manipulation. DSSs continue to be increasingly
requested in the clinical setting, and there are still many open problems even in the field of
interoperability. Due to the multidisciplinary nature of a DSS, certain precautions to meet the
needs of interoperability must be taken into consideration in the design phase and during the
construction, use and maintenance of the DSS. Some of these considerations were addressed by
Sutton et al. in [7] as DSSs evolve in complexity (Artificial Intelligence), interoperability
(multidisciplinarity) and data sources (Cloud, open data, ...). Thus, decision support systems
become a relevant part of the tools that use artificial intelligence (AI). They have the task of solving
open questions to deepen the understanding of the correlations between the data representing
the events. The DSS, supported by multidisciplinary approaches, revisits how data are treated and
analysed and generates virtually and globally cognitive pathways that highlight unconventional
solutions and considerations. This approach improves how data are analysed and understood,
their knowledge and correlation. DSSs can be knowledge-based, data-driven, or lacking a priori knowledge.
The strategy of the former is based on rules that are not necessarily deterministic; it recovers data
from information systems (i.e., databases) or in real time from biometric systems and
evaluates the rules involved. Finally, it produces an output event (alarm, screening, diagnostic
pathway, ...). Non-knowledge-based DSSs are data-driven, and the output events result from
modelling applications based on machine learning, with no specific medical knowledge needing to be taken
into account to set up the model. Such models without knowledge, adopted for the creation of
the DSS, are currently being studied in the scientific community; they are incredibly complex to
implement, and they leave no room for understanding the results, whether they are correct or
incorrect, even when they have a high degree of sensitivity and specificity. The inherent
imprecision of medical data, as well as the fact that a patient enters the diagnostic pathways from
different medical sources (as an outpatient, an inpatient, referred by a physician, after blood
tests and other unrelated diagnostic tools have been administered), makes standard classification
methods less easy to adapt to the CDSS backend. While well-known classifiers such as SVM and
NN produce precise results on binary classification problems, the diagnosis of a coeliac patient
requires a number of steps, and a CDSS should offer prioritisation advice on each of these steps,
regardless of the completion of the whole diagnostic pathway. Fuzzy classifiers allow for taking into
account this inherent dynamicity and imprecision. In this work, we present a fuzzy-based
Clinical Diagnostic Support System developed within the ITAMA project (henceforth
ITAMACDSS). ITAMA (ICT Tools for the diagnosis of Autoimmune diseases in the Mediterranean
Area, [8]) is a cross-border project between Italy and Malta funded by the European Regional
Development Fund within the INTERREG V-A Italia - Malta Cooperation Programme, in which
the common territorial challenge is to improve the quality of life and well-being of the population
affected by autoimmune diseases, containing the costs of health systems through a strategic
commissioning demand towards the world of research. In Sicily and Malta, autoimmune diseases
present a high incidence, probably due to the high consumption of starchy foods. In ITAMA, a
mass screening was carried out on more than 20,000 Maltese children. The screening was based
on a Medical History Questionnaire (MHQ) and a Point-of-Care Test (PoCT). Children who tested
positive based on the result of the MHQ, of the PoCT, or both, were invited for
further investigation, and in particular for the anti-Actin IgA test, to verify the possibility of avoiding
biopsy in a large number of patients. The implemented CDSS is based on a fuzzy classifier using
neural networks. The system was developed and tested using a Virtual Database and a Real
Database acquired during the ITAMA project. Since the objective was to minimize the length
and impact of the diagnostic pathway for a correct diagnosis of celiac disease while maximizing
its effectiveness, it was necessary to validate the CDSS with numerous pilot tests, both on
simulated data generated in collaboration with medical experts and on real data acquired in real
contexts (hospitals). The validation using ‘virtual patients’ (i.e. artificially generated data based
on current knowledge of the disease and its symptoms/related conditions) was a preparatory step
in order to reach a starting point of the system before completion of the data collection on real
patients, and is reported for comparison purposes. Despite the large amount of data made available
by acquisition in the clinical field, the developed Framework intends to propose an
architecture that can also be used in those sectors in which the data are not numerous and
therefore cannot take advantage of methods based on deep learning (DL). Our Framework, whose
intelligence is based on Fuzzy rules and can therefore be analysed and verified, acquires
anamnestic data validated by doctors or specific health personnel, and provides both suggestions on
personalized diagnostic paths and interpretations of the data to lead to a faster diagnosis of celiac
disease. This paper is organized as follows. Section II discusses existing literature. In Section
III the databases on which the CDSS Fuzzy classifier has been trained are presented: a virtual one
(III.A), generated using knowledge from the diagnostic state of the art in coeliac disease, and a real
one (III.B), obtained through data collected by the ITAMA project during a mass screening. Section IV
details the data cleaning and extraction procedure, the proposed approach for designing a
fuzzy-based CDSS for coeliac disease, its architectural implementation and the communication protocol
with the ITAMA DB. In Section V results from the classification procedure are
presented and discussed, and hints at the complete implementation of the CDSS system are
given. Discussion on the results as well as further considerations are given in Section IV.

Paper 5

Coeliac disease is a common autoimmune enteropathy with a prevalence of 1.4% globally.1 Coeliac
disease has a strong genetic component with 80% of heritability explained by the human leukocyte
antigen (HLA) DQ locus.2 Most people with coeliac disease possess one or more of DQ2.5, DQ2.2 or DQ8
genotypes, with DQ7.5 also implicated.3 However these alleles are also seen in up to 40% of non-coeliac
individuals in European and South-East Asian populations.4 Genome wide association studies have
implicated over 40 additional single nucleotide polymorphisms5 in coeliac risk.6 Genetic testing in
clinical practice has, to date used genotyping of Class II HLA-DQ, and stratified genotypes into simple risk
categories.7 HLA-DQ typing is currently recommended in European paediatric guidelines where
diagnosis can be made without biopsy in symptomatic children with tissue transglutaminase antibodies
10-fold the upper limit normal, endomysial antibodies and positive genetic testing.8,9 Diagnostic
uncertainty can occur from false positives and false negative serology and/or histology caused by
insufficient gluten intake or selective immunoglobulin A deficiency. Diagnosis in young children is further
complicated by reluctance to perform invasive endoscopy, which often requires general anaesthesia. In
most North American centres and for the European setting with serological results less than 10 times
upper limit normal, duodenal biopsy is still required in the paediatric age group. HLA-DQ typing to aid
diagnosis, while useful,10 does not include all known coeliac genetic risk. Furthermore, current HLA-DQ
testing methods can be expensive, hard to interpret, and are not universally available. Genetic risk
scores (GRS) combine multiple risk single nucleotide polymorphism into a single continuous
variable,11,12 are often very discriminative in strongly HLA linked disease,13,14 and can be genotyped
at lower cost than HLA-DQ typing.15,16 Romanos et al showed discrimination of coeliac disease with a
combined HLA typing and non-HLA GRS approach using 57 single nucleotide polymorphisms.11 Abraham
et al showed further improved discrimination with a machine learning based approach requiring
genome wide single nucleotide polymorphism array data.14 Publicly available large data sets of coeliac
cases and controls now allow improved description of risk associated with individual loci and more
accurate GRS. Additionally, the major risk loci at HLA-DQ are now imputable at high resolution due to
the availability of large HLA reference panels. Despite the ability of a GRS to discriminate coeliac disease,
the potential clinical benefit of the GRS as a diagnostic test has not yet been fully explored, and there is
significant work still to be done to best integrate full genetic risk information into a coeliac diagnostic
pathway. We hypothesised that better descriptions of HLA-DQ risk and interactions, in combination with
non-HLA information would allow us to generate a more discriminative GRS with a small number of
single nucleotide polymorphisms. We therefore generated a GRS for coeliac disease in a large reference
cohort, independently validated it, and developed a panel of single nucleotide polymorphisms that can
be easily genotyped at low cost in a clinical setting. 2 | METHODS We identified single nucleotide
polymorphisms associated with coeliac disease from three genome-wide association studies5,17,18 and
used data from HLA imputation of UK Biobank to identify single nucleotide polymorphisms in strong
linkage-disequilibrium with HLA-DQ haplotypes (DQ2.5, DQ2.2, DQ8 and DQ7.5). We used data from a
large case-control study to generate odds ratios for genotype pairs of these alleles at HLA-DQ. We
combined variants into a coeliac disease GRS using a log-additive model quantifying genetic risk as a
single numeric variable. We validated the discriminative power of the GRS in a large population-based
study and further in a cohort recruited from a paediatric gastroenterology clinic specialising in coeliac
disease. 2.1 | Research Subjects: Case-control The case-control cohort included 12,041 coeliac
disease cases and 12 228 controls [5]. Cases consisted of those from a combination of European studies
with variable diagnostic criteria as previously described, and controls were from the UK-based 1958 Birth
Cohort and National Blood Service cohorts. Principal components analysis showed all samples were of
White European ancestries. Samples were genotyped using the Illumina ImmunoChip (139 553 variants)
and additional single nucleotide polymorphisms were imputed using the 1000 Genomes reference panel
(80.1m variants).

2.2 | Research Subjects: UK Biobank

We used single nucleotide polymorphism
array genotyping data from a subset of 379 767 samples from the UK Biobank study identified as of
White European ancestries by principal components analysis and K-means clustering [19]. We identified
coeliac disease cases using hospital admission codes and/or self-reported coeliac disease. We
assessed the ability of the GRS to discriminate coeliac disease separately in those matching both
criteria (n = 1237) or at least one of these criteria (n = 2901; Figure S1).

2.3 | Research Subjects: Stollery Children's Hospital Coeliac Clinic

We genotyped single nucleotide polymorphisms representing 14 HLA-DQ genotypes and 38 non-HLA loci in
128 patients referred to a tertiary paediatric clinic in Alberta,
Canada, for consideration of coeliac disease diagnosis. Patients provided informed
consent and study ethics were approved by the Human Research Ethics Board of the University of
Alberta (PRO00053479). A total of 110 patients, recruited between 2015 and 2017, with adequate
sampling and successful genotyping were included in the analysis. Diagnosis of coeliac disease was made
by endoscopy (n = 46) or according to a modification of European serological diagnostic guidelines (n =
64) [20]. In this Canadian clinic, patients consenting to serological diagnosis, which is not currently
standard of care in North America, were required to have tissue transglutaminase antibody levels
greater than 28 times the upper limit of normal and confirmatory HLA-DQ typing. High levels of tissue
transglutaminase antibody have been shown to be extremely specific as a marker of coeliac disease and
as such form part of current diagnostic guidelines in paediatric cohorts [21,22]. As endomysial antibody
testing is not available in this clinic, a repeat positive tissue transglutaminase antibody result above 10
times the upper limit of normal was required as a second serological confirmatory test. Patients undergoing
duodenal biopsy were confirmed to have coeliac disease by a Marsh score of 2 or greater. Control subjects
(n = 40) were paediatric general gastroenterology patients who were negative for coeliac disease by
both intestinal biopsy and tissue transglutaminase serology. Cohort characteristics are
summarised in Table S4.
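
The log-additive model described in the Methods above can be illustrated with a minimal sketch. It is not the authors' implementation: the SNP names, the odds ratios, and the "X" placeholder for any other haplotype are hypothetical values chosen only to show how log odds ratios for an HLA-DQ genotype pair and for non-HLA risk alleles might be summed into one numeric score.

```python
import math

# Hypothetical per-allele odds ratios for non-HLA SNPs (illustrative values, not from the paper).
NON_HLA_ODDS_RATIOS = {"snp_a": 1.20, "snp_b": 0.85, "snp_c": 1.35}

# Hypothetical odds ratios for HLA-DQ genotype pairs ("X" stands for any other haplotype).
# Keys are stored in sorted order so that lookups do not depend on haplotype order.
HLA_DQ_PAIR_ODDS_RATIOS = {
    ("DQ2.5", "DQ2.5"): 10.0,
    ("DQ2.5", "DQ8"): 6.0,
    ("DQ8", "X"): 1.5,
    ("X", "X"): 0.1,
}


def genetic_risk_score(hla_pair, risk_allele_counts):
    """Log-additive score: sum of log odds ratios over HLA-DQ and non-HLA variants."""
    # One log odds ratio for the HLA-DQ genotype pair as a whole.
    score = math.log(HLA_DQ_PAIR_ODDS_RATIOS[tuple(sorted(hla_pair))])
    # Each non-HLA SNP contributes log(OR) once per risk allele carried (0, 1 or 2).
    for snp, count in risk_allele_counts.items():
        score += count * math.log(NON_HLA_ODDS_RATIOS[snp])
    return score


high = genetic_risk_score(("DQ8", "DQ2.5"), {"snp_a": 2, "snp_b": 0, "snp_c": 1})
low = genetic_risk_score(("X", "X"), {"snp_a": 0, "snp_b": 2, "snp_c": 0})
print(f"higher-risk profile: {high:.3f}, lower-risk profile: {low:.3f}")
```

Working on the log scale means that multiplying odds ratios becomes simple addition, which is why the result can be treated as a single numeric variable.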

Paper 6

Biomolecules 2023, 13, 1707. https://doi.org/10.3390/biom13121707

Coeliac disease (CeD) is a chronic inflammatory disorder of the small intestine that develops when a
genetically susceptible individual is exposed to gluten proteins found in certain cereal grains including
wheat, rye and barley. Gluten is the collective term for a number of proteins, including gliadin and
glutenin in wheat, hordein in barley, and secalin in rye [1,2]. In the gastrointestinal tract, gliadin is
hydrolysed into gliadin peptide fragments, which are deamidated by the autoantigen tissue
transglutaminase, tTG, producing highly immunogenic deamidated gliadin peptides. These deamidated
gliadin peptides, just like antigens from invading pathogens, are endocytosed and presented on the
major histocompatibility complex (MHC; also known as HLA) on the surface of the intestinal epithelial
cells. The subsequent interaction between MHC-bound gliadin peptide antigens and CD4+ T cells is
thought to drive the inflammation, which damages
the small intestine, causing atrophy (shortening and broadening) of the villi, leading to an array of
symptoms, including malabsorption and diarrhea [3]. Each T cell has a T cell receptor (TCR) on its surface
which determines antigen recognition; a set of TCRs is known as a T cell receptor repertoire and is
necessarily diverse to allow for the recognition of a wide range of antigens. This diversity is generated
through somatic TCR rearrangement, as illustrated in Supplementary Figure S1, with the most variable
region known as the complementarity determining region 3 (CDR3). T cell receptors have two chains,
according to which T cells can be categorised. The α and β chains of αβ T cells are encoded by the TRA
and TRB genes, respectively, while the γ and δ chains of γδ T cells are encoded by the TRG and TRD
genes [4]. The diversity of CDR3
sequences in a TCR repertoire can be assessed by using bulk or single-cell sequencing, and the starting
material can be either DNA or RNA. The utility and advantages of these methods are reviewed in [5]. αβ
T cells can also be categorised further as helper (CD4+) or cytotoxic (CD8+) T cells based on the
expression of the co-receptor, either CD4 or CD8, in the TCR complex [6]. αβ T cells can recognise
intracellularly processed antigens bound to MHC complexes found on the cell membrane, forming a
complex with class II MHC molecules (MHC II) (in the case of CD4+ T cells) or class I MHC molecules
(MHC I) (in the case of CD8+ T cells) [7]. In contrast to αβ T cells, γδ T cells do not require interaction with an
MHC complex for antigen binding [8]. However, the precise mechanism for antigen recognition in γδ T
cells is not yet fully understood. The genetic susceptibility of developing CeD is most strongly associated
with the HLA haplotypes, with the majority (around 80 to 95%) of CeD patients possessing HLA-DQ2
(DQA1*05/DQB1*02) and a minority possessing DQ8 (DQA1*03/DQB1*0302) [9]. These HLA molecules
are capable of presenting an array of gliadin peptides which subsequently interact with gluten-reactive
CD4+ T cells; see Figure 1. Among these T cell epitopes of the gluten protein (consisting of α-, γ-, and ω-
gliadin and glutenin components), some appear to be more important than others in CeD development.
For instance, T cells specific for αI and αII epitopes of α-gliadin are found in almost all CeD patients,
while the epitopes of γ-gliadin are less often recognised [10]. Therefore, based on these CeD-specific
cognate interactions, a synthetic antigen presenting an MHC-tetramer complex loaded with gluten
epitopes can be designed to selectively label the gluten reactive CD4+ T cells from a T cell repertoire,
using a recombinant MHC tetramer assay [11]. However, this approach requires a large blood sample
and is labour intensive [12]. DNA sequencing can then be performed on these gluten reactive CD4+ T
cells to determine their αβ-TCR sequences, and these gluten-specific TCR sequences may provide
diagnostic potential for CeD. Despite the diversity of the TCR repertoire, the observed degree of sharing
between individuals is higher than expected at random [13]. One explanation for this is immune
function: individuals responding to the same immune stimulus are more likely to share
TCRs [14]. CeD patients, for example, may share TCRs capable of responding to gluten. However, it
is also possible that different TCRs bind the same gluten epitope due to shared genetic motifs or shared
amino acid properties [15]; therefore, multiple gluten-specific TCR sequences need to be considered.
While the biological mechanisms of CeD have been widely studied, diagnosis remains a challenge. In
many countries, blood testing for both anti-tTG and anti-EMA antibodies is undertaken, followed by
endoscopic duodenal biopsy, with patients being required to eat a significant amount of gluten for 2–6
weeks prior to biopsy (NICE, ESPGHAN, [16]). Many patients struggle to tolerate a gluten challenge due to
the severity of their gluten-induced symptoms, meaning that antibody levels may be close to or below
the upper limit of normal and biopsy changes may be subtle and non-diagnostic, resulting in either
inconclusive tests and patients requiring further investigation, or incorrect diagnosis as non-coeliac [17].
This results in low sensitivity, with many cases of CeD remaining undiagnosed. While histopathological
examination of a duodenal biopsy remains the “gold standard test”, studies indicate that concordance
between pathologists for the diagnosis of CeD is only around 80%, mainly due to the subjective
interpretation of subtle biopsy changes [18–20]. There is a clear need for a novel method of diagnosis
that does not require the consumption of gluten and is therefore more sensitive and more objective.
A test which could also be performed on a blood sample, rather than
requiring a duodenal biopsy, would provide further improvements on the current “gold standard”.
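
The notion of TCR sharing between individuals, discussed above, can be made concrete with a small sketch. It assumes each repertoire is available as a list of CDR3 amino acid sequences and uses the Jaccard index as one possible overlap measure; both the measure and the sequences (which are made up for illustration) are assumptions, not the method of the paper.

```python
def jaccard_overlap(repertoire_a, repertoire_b):
    """Fraction of distinct CDR3 sequences that two repertoires have in common."""
    a, b = set(repertoire_a), set(repertoire_b)
    if not a and not b:
        return 0.0  # two empty repertoires share nothing
    return len(a & b) / len(a | b)


# Made-up CDR3 amino acid sequences, for illustration only.
patient_1 = ["CASSLGQGNEQFF", "CASSPGTGYEQYF", "CASSQDRGAYEQYF"]
patient_2 = ["CASSPGTGYEQYF", "CASSQDRGAYEQYF", "CASSLAGGTDTQYF"]

# Sequences present in both repertoires ("public" sequences in this toy example).
shared = sorted(set(patient_1) & set(patient_2))
print(shared, jaccard_overlap(patient_1, patient_2))
```

Exact matching is the simplest case; because different TCRs may bind the same epitope, a practical analysis would likely also need to group similar sequences rather than rely on identity alone.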

Paper 7

Under-nutrition is the underlying cause of approximately 45% of the 5 million annual deaths of children
under 5 years of age in low- and middle-income countries (LMICs) [1], making it a major cause of
mortality in this population. Linear growth failure (or stunting) is a major complication of
under-nutrition and is associated with irreversible physical and cognitive deficits, with profound
developmental implications [32]. A common cause of stunting in LMICs is environmental enteropathy (EE), for which there are no
universally accepted, clear diagnostic algorithms or non-invasive biomarkers for accurate diagnosis [32],
making this a critical priority [28]. EE has been described to be caused by chronic exposure to
enteropathogens which results in a vicious cycle of constant mucosal inflammation, villous blunting, and
a damaged epithelium [32]. These deficiencies contribute to a markedly reduced nutrient absorption
and thus under-nutrition and stunting [32]. Interestingly, celiac disease (CD), a common cause of
stunting in the United States with an estimated prevalence of 1%, is an autoimmune disorder caused by
gluten sensitivity [15] and shares many histological features with EE (such as increased inflammatory
cells and villous blunting) [32]. This resemblance makes it a major challenge to differentiate clinical
biopsy images of these similar but distinct diseases. Therefore, there is major clinical interest in
developing new, innovative methods to automate and enhance the detection of morphological features of EE
versus CD, and to differentiate between diseased and healthy small intestinal tissue [4]. An overview of
the methodology used is shown in Fig. 1. In this paper, we propose a convolutional neural network
(CNN)-based model for classification of biopsy images. In recent years, deep learning architectures have
received great attention after achieving state-of-the-art results in a wide variety of fundamental tasks
such as classification [13,18–20,24,29,35] or in other medical domains [12,36]. CNNs in particular have
proven to be very effective in medical image processing. CNNs preserve local image relations while
reducing dimensionality, and for this reason are the most popular machine learning algorithm in image
recognition and visual learning
tasks [16]. CNNs have been widely used for classification and segmentation in various types of medical
applications such as histopathological images of breast tissues, lung images, MRI images, medical X-Ray
images, etc. [11,24]. Researchers have reported strong results on duodenal biopsy classification using
CNNs [3], but those models are robust only to a single type of image stain or color distribution. Many
researchers apply a stain normalization technique as part of the image pre-processing stage to both the
training and validation datasets [27]. In this paper, varying levels of color balancing were applied during
image pre-processing in order to account for multiple stain variations. The rest of this paper is organized
as follows: In Sect. 2, we describe the different data sets used in this work, as well as the required
pre-processing steps. The architecture of the model is explained in Sect. 4. Empirical results are
elaborated in Sect. 5. Finally, Sect. 6 concludes the paper and outlines future directions.
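
The idea of applying "varying levels of color balancing" during pre-processing can be sketched as follows. The text above does not say which balancing method was used, so this example swaps in the simple gray-world assumption and a hypothetical `strength` parameter that blends between the original and the fully balanced image; the three-pixel image is likewise invented for illustration.

```python
def gray_world_balance(pixels, strength=1.0):
    """Blend an RGB image towards its gray-world colour-balanced version.

    pixels: list of (r, g, b) tuples with values in 0..255.
    strength: 0.0 leaves the image unchanged, 1.0 applies full balancing.
    """
    if not pixels:
        return []
    n = len(pixels)
    # Mean intensity of each colour channel and of the image overall.
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3.0
    # Per-channel gain that would equalise the channel means (guard against a zero mean).
    gains = [gray / m if m > 0 else 1.0 for m in means]
    balanced = []
    for p in pixels:
        out = []
        for c in range(3):
            full = p[c] * gains[c]
            mixed = (1.0 - strength) * p[c] + strength * full
            out.append(min(255, max(0, round(mixed))))
        balanced.append(tuple(out))
    return balanced


# A tiny hypothetical "image" with a strong pink cast.
image = [(200, 120, 180), (220, 140, 200), (180, 100, 160)]
for level in (0.0, 0.5, 1.0):
    print(level, gray_world_balance(image, level))
```

Generating several balanced copies of each training image at different strengths is one way a classifier might be exposed to a range of stain appearances.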

Paper 8
Celiac disease is a chronic disease that occurs in individuals carrying the HLA-DQ2/DQ8 haplotype
[1]. The first description of celiac disease was given by Gee in 1888 [2]. After ingesting gluten, an
affected individual experiences painful symptoms that appear almost immediately. The clinical features
include abdominal bloating, anaemia, fatigue, abdominal pain, vomiting, skin rashes, diarrhea, failure to
thrive, delayed growth, abnormal vitamin levels, anxiety, and depression [3]. The manifestations differ
from patient to patient and depend on the amount of gluten ingested, measured in parts per million (ppm).
During the world war, when wheat became scarce in certain countries, some celiac symptoms declined
[4,5]. This led to the introduction of gluten-free products to manage celiac disease [6,7]. The European
Society for Paediatric Gastroenterology, Hepatology, and Nutrition (ESPGHAN) guidelines [6] recommend the
tTG-IgA test for celiac diagnosis, followed by a biopsy for confirmation if the result is ambiguous.
Gut damage is commonly seen on biopsy in celiac patients. An individual is
also considered celiac if the HLA-DQ2/DQ8 genes are present together with gastrointestinal symptoms.
The concept of fuzzy logic was introduced by L. A. Zadeh in 1965 to deal with intermediate values and
non-linear problems [8]. A fuzzy system is a type of inference system that takes crisp data as input
and evaluates it against the rules stored in the fuzzy rule base [9]. Mamdani and Sugeno fuzzy models
can be used to build a fuzzy knowledge base with different types of membership function. Triangular and
trapezoidal membership functions are the ones most commonly used in forming fuzzy rules. Subsequently,
the process of defuzzification produces a single crisp output value, which serves as the prediction in
health applications [10]. Fuzzy logic has been widely used to recognise diseases such as dental disease,
cholera, liver disease, viral infections, lung diseases, and kidney diseases effectively and accurately.
Alongside this, a great deal of research has been done on celiac disease using clinical testing
procedures in different countries. Researchers have made considerable progress in diagnosing celiac
disease with clinical examinations such as tissue transglutaminase (tTG), endomysial antibody (EMA),
biopsy, and genetic testing [11-17]. This study therefore aimed to design a fuzzy logic inference system
to predict celiac disease based on individual symptoms in North-Indian patients. The proposed system is
intended to give individuals and physicians a useful indication of celiac disease within a few seconds,
without any painful testing procedure.
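
The pipeline described above (crisp inputs, triangular and trapezoidal membership functions, rule evaluation, defuzzification) can be sketched in a few lines. This is a minimal Mamdani-style illustration under stated assumptions, not the system built in the study: the two symptom inputs, their 0 to 10 scales, the three rules, and the 0 to 100 risk scale are all hypothetical.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: rises a->b, equals 1 on [b, c], falls c->d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Hypothetical fuzzy sets for a symptom severity rated 0..10.
def low(x): return trapezoidal(x, -1, 0, 2, 5)
def moderate(x): return triangular(x, 2, 5, 8)
def high(x): return trapezoidal(x, 5, 8, 10, 11)

# Hypothetical fuzzy sets for the output "risk" on a 0..100 scale.
def risk_low(y): return trapezoidal(y, -1, 0, 20, 50)
def risk_medium(y): return triangular(y, 20, 50, 80)
def risk_high(y): return trapezoidal(y, 50, 80, 100, 101)


def predict_risk(bloating, fatigue):
    """Mamdani inference (min = AND, max = OR) with centroid defuzzification."""
    fire_high = min(high(bloating), high(fatigue))            # both symptoms high
    fire_medium = max(moderate(bloating), moderate(fatigue))  # either symptom moderate
    fire_low = min(low(bloating), low(fatigue))                # both symptoms low
    numerator = denominator = 0.0
    for y in range(0, 101):
        # Clip each output set at its rule strength, then aggregate with max.
        mu = max(min(fire_low, risk_low(y)),
                 min(fire_medium, risk_medium(y)),
                 min(fire_high, risk_high(y)))
        numerator += y * mu
        denominator += mu
    # No rule fired: this toy rule base has no answer for the input.
    return numerator / denominator if denominator > 0 else None


for symptoms in [(9, 9), (5, 5), (1, 1)]:
    print(symptoms, predict_risk(*symptoms))
```

A real rule base would cover every combination of inputs; the `None` return here simply exposes the gaps left by using only three rules.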

Paper 9

Diagnostics 2023, 13, 2780. https://doi.org/10.3390/diagnostics13172780

Celiac disease (CD) is among the most frequent chronic digestive diseases worldwide but is severely
underdiagnosed [1]. It is an autoimmune, systemic, malabsorptive disease that primarily affects the
small bowel, leading to a crypt hyperplastic, atrophic injury of the mucosa, in response to dietary intake
of gluten. This occurs in genetically susceptible individuals and affects approximately 1% of the
population worldwide [2]. Currently, adult CD diagnosis relies on the combination of serological testing
and upper gastrointestinal endoscopy with small bowel biopsies, in order to detect the atrophic
gluten-induced mucosal damage [3]. In pediatrics, CD diagnosis can be made based on serology only [4], and
this has fueled a growing interest in a biopsy-avoiding diagnostic strategy in adults also [5,6]. There are
several anticipated benefits of a no-biopsy diagnosis in CD: lowering costs (by eliminating the need for
biopsy sampling, biopsy processing, and histopathology analysis), avoiding wear and tear on
the working channel of the scope, reducing the procedural time and exposure to sedation and its
associated side effects, and not least avoiding the pitfalls of histology reported in CD diagnosis [7,8]. In
this setting, several studies have looked at the association between villous atrophy (VA) proven by
histological analysis and specific
changes in the small bowel mucosa, the so-called endoscopic markers of VA [9] (see Table 1).

Table 1. Endoscopic markers of VA.
Atrophy of mucosa with prominent submucosal vascular pattern
Mucosal fissures or grooves, with mosaic or “cracked-mud” appearance
Nodularity of the mucosa
Scalloping of Kerckring folds
Reduction or loss of folds

In fact, the detection of VA on endoscopic examinations
completed for CD-unrelated indications is considered an opportunity for the detection of clinically
unsuspected CD [10]. Considering the large number of endoscopic examinations worldwide, along with
the availability of open-access endoscopy in some services, recognition of VA markers becomes of
paramount importance in potentially improving the diagnostic rate of CD. Some authors have shown
that a significant proportion of patients with CD had a previous recent endoscopic examination that
might have missed the subtle changes in the small bowel mucosa and failed to provide an early
diagnosis [11]. A potential role for computer-aided detection of mucosal changes is foreseen in this
setting, as already validated for other pathologies [12]. There is already solid evidence regarding the
quantitative processing of capsule endoscopy images for CD diagnosis [13], but videocapsule
examination is far less commonly used than upper gastrointestinal endoscopy. Endoscopy is used both
in suspicion of CD and for confirmation of diagnosis [3], but the major window of opportunity is
represented by examinations conducted for non-CD related reasons, where the detection of VA by
computer-aided diagnosis could significantly improve CD case findings (see Figure 1).

Figure 1. Unmasking clinically unsuspected CD by opportunistic detection of VA on endoscopy.

Artificial intelligence (AI) has undoubtedly revolutionized
practice in several medical fields, including gastroenterology. Computer-aided image analysis has
already been explored for endoscopic procedures, ultrasound images, and histology slides in both
luminal and hepato-bilio-pancreatic pathologies. In the field of endoscopy, there are abundant data on
AI applications for colonoscopy, a procedure well recognized to be operator-dependent, for polyp
detection and diagnosis, and also bowel cleansing, and in this setting, AI techniques have proven to
improve quality indicators of examinations [14]. In addition to the use in colorectal cancer screening
programs, in the lower gastrointestinal tract, there are also AI applications for inflammatory bowel
disease, both Crohn’s disease and ulcerative colitis. Concerning the upper gastrointestinal tract, there
are validated algorithms for Barrett’s esophagus [15], chronic atrophic gastritis [16], esophageal and
gastric cancer [17], and for the small bowel, there is AI assistance for the detection of bleeding on
capsule endoscopy software. Not least, there are significant data on AI for liver disease (hepatocellular
carcinoma and liver fibrosis) and for pancreatic pathology, both benign (acute pancreatitis) and malignant
(pancreatic cancer) [17]. The benefit of using AI in medicine is not only about improving diagnosis and
detection but also saving time and providing faster and wider access to healthcare services by optimizing
resource use. The most common AI classification algorithms were evaluated in order to retain the most
accurate ones for further studies. As there is no “golden bullet” solution for medical image
classification, and as the performance of deep learning algorithms is highly sensitive to the training
image database, no a priori assumption was made about the best classifier. AI algorithms have already
been validated for detecting CD in capsule
endoscopy images and also for automated CD diagnosis on biopsy slides [13,18]. We aimed to assess if
computer processing of duodenal images captured during endoscopy would be accurate in detecting
mucosal changes associated with VA in CD patients.

Paper 10

Healthcare 2022, 10, 1550. https://doi.org/10.3390/healthcare10081550

Celiac disease is a frequent type of immune-mediated inflammatory disease of the small intestine. This
gluten-sensitive enteropathy is caused by higher sensitivity of the gut and immune system to dietary
gluten and to gluten-related proteins [1]. The pathogenesis of celiac disease depends on genetic
factors and mucosal immune response. This immune disorder occurs in genetically predisposed patients
after induction by an environmental factor, which is gluten in the diet, found in cereals. More than 99%
of the patients have HLA DR3-DQ2 and/or DR4-DQ8 [2–4], but other non-HLA locus genes may also
be involved in the disease pathogenesis, such as TNFAIP3 (A20), REL, NKG2D, MICA, CTLA4, MMP3, and MIF,
among others [5–15]. Celiac disease is associated with several autoimmune disorders, such as type 1
diabetes mellitus and autoimmune thyroid disease [16,17]. The mucosal immune response also
participates in the disease pathogenesis. An inflammatory reaction develops in response to gliadin
fractions, and as a result there is inflammation of the lamina propria and epithelium, with disruption of the
epithelial layer and villous atrophy. Both the innate and adaptive immune responses are activated in
celiac disease, including gliadin reactive T cells, autoantibodies, intraepithelial lymphocytes,
macrophages, monocytes, and dendritic cells. A detailed description of the pathogenesis of celiac
disease is shown in Table 1.

Table 1. Pathogenesis of Celiac Disease.
Factors | No. | Pathophysiology | References
Dietary gluten | 1 | Gluten of wheat, rye, and barley. Gliadins and glutenins are rich in proline, which makes them resistant to proteolysis by gastric and pancreatic enzymes. Various long gliadin peptides activate the immune system (“33mer”). Undigested peptides may also affect intestinal microbiota. | [18–21]
Genetics | 1 | Genetic predisposition: HLA-DQ2 and HLA-DQ8 contribute to 20%–40% of the genetic risk. They are class II MHC expressed by antigen-presenting cells (APCs). | [22–24]
Genetics | 2 | Forty-two non-HLA regions have been associated with celiac disease. It is estimated that they account for 15% of the genetic risk: IL18R1, IL18RAP, STAT4, CD28, CTLA4, ICOS, CCR4, CCR1, CCR2, CCR3, CD3E, IL1R1, IL12A, IL2, IL21, TNFAIP3, ELMO1, PRKCQ, SOCS1, ICOSLG, and IRAK1. These genes belong to cytokine-cytokine receptor activation, JAK-STAT pathway, T-cell receptor signaling, intestinal immune network for IgA production, NF-κB signaling, and cell adhesion molecules. Of note, many of these genes belong to the immune checkpoint and immune-oncology pathway. | [22,23,25–28]
Immune | 1 | Generation of gluten-specific T-cell responses: presence of gluten-specific CD4-positive T lymphocytes, antibodies against gliadin and the enzyme TG2, and pro-inflammatory cytokines. | [29,30]
Immune | 2 | Generation of autoantibodies: activation and differentiation into plasma cells of gluten-specific and TG2-specific B lymphocytes, generation of autoantibodies that are both circulating and deposited in the mucosa. These autoantibodies are responsible for the increased permeability of the epithelial barrier. | [31–33]
Immune | 3 | Cytokines in the intestinal mucosal immune system: IFN gamma and IL-21 are produced by gluten-specific CD4-positive T lymphocytes. Secretion of IL-15, IL-18, and inhibition of FOXP3-positive regulatory T lymphocytes (Tregs). | [34,35]
Immune | 4 | Intraepithelial lymphocytes (IELs): increased in celiac disease and their amount correlates with mucosal atrophy. IELs display cytotoxic transformation and induce apoptosis of intestinal epithelial cells through FAS-L, perforin, granzyme B, and NKG2D. NKG2D interacts with MICA on epithelial cells. | [36–42]
Immune | 5 | Innate immune activation: dysregulation of the production of IL-15 and activation of the innate immune response, including the induction of epithelial stress. | [43,44]
Environmental | 1 | Microorganisms: intestinal dysbiosis (unbalanced intestinal microbiota) and increased prevalence of specific microbial virulence genes isolated from celiac disease patients. | [45–50]
Environmental | 2 | Others, such as smoking | [51]
The pathogenesis of celiac disease is multifactorial and includes dietary gluten and genetic, immune, and environmental factors.

Celiac
disease has an estimated prevalence of 1% in the general population based on serologic studies,
although in many cases the disease is asymptomatic [52,53]. The most relevant clinical manifestations are
due to malabsorption and include diarrhea, weight loss, anemia, and other metabolic disturbances. Of
note is that celiac disease can have diverse extraintestinal presentations such as delayed puberty,
hepatitis, iron-deficiency anemia, arthralgia and arthritis, peripheral neuropathy, epilepsy and seizures,
cerebellar ataxia, and dermatitis herpetiformis (among others) [1,29]. The diagnosis is made by a
combination of clinical signs and symptoms, serology testing, and small intestine biopsy. Additional
diagnostic tools include HLA typing, quantification of inflammatory cells in the small intestine biopsy
(such as increased CD3-positive lymphocytes in the villus tips or the quantification of intra-epithelial
lymphocytes (IELs)), detection of TG2-targeted celiac IgA isotype autoantibodies in the intestinal
mucosa, and detection of gluten-specific T cells in the circulation by ELISPOT [29]. Celiac disease has
associated conditions including selective IgA deficiency, autoimmune disease, gastrointestinal disease
(reflux disease, eosinophilic esophagitis, inflammatory bowel disease, microscopic colitis, liver disease,
and pancreatitis), menstrual and reproductive issues, idiopathic pulmonary hemosiderosis, and
cardiovascular and kidney diseases [1]. Celiac disease is associated with several autoimmune diseases
including diabetes mellitus type 1 [54–57], autoimmune thyroid disease (hypothyroidism) [58,59], and
atopic dermatitis [60,61]. Other manifestations related to serological autoantibodies include
neurological disorders (peripheral neuropathy and ataxia) [62] and
neurodegeneration via apoptosis [63]. Patients with untreated celiac disease are at increased risk of
lymphoma and gastrointestinal cancer [1]. Refractory celiac disease type II may be
associated with enteropathy-associated T-cell lymphoma (EATL) [64–69]. Due to the clinical relevance of
this disease, a better understanding of the pathogenesis is needed, and using non-linear analysis may
provide a different approach. This research was a proof-of-concept exercise to determine whether
artificial intelligence analysis was a feasible approach to model celiac disease using an autoimmune
discovery gene panel.