Plants 13 00636
Article
The American Cherimoya Genome Reveals Insights into the
Intra-Specific Divergence, the Evolution of Magnoliales, and a
Putative Gene Cluster for Acetogenin Biosynthesis
Tang Li 1,† , Jinfang Zheng 1,† , Orestis Nousias 1 , Yuchen Yan 1 , Lyndel W. Meinhardt 2 , Ricardo Goenaga 3 ,
Dapeng Zhang 2, * and Yanbin Yin 1, *
1 Nebraska Food for Health Center, Department of Food Science and Technology, University of Nebraska,
Lincoln, NE 68588, USA; [email protected] (T.L.); [email protected] (J.Z.);
[email protected] (O.N.); [email protected] (Y.Y.)
2 Sustainable Perennial Crops Laboratory, United States Department of Agriculture, Agriculture Research
Service, Beltsville, MD 20705, USA; [email protected]
3 Tropical Agriculture Research Station, United States Department of Agriculture, Agriculture Research Service,
Mayaguez 00680, Puerto Rico; [email protected]
* Correspondence: [email protected] (D.Z.); [email protected] (Y.Y.); Tel.: +1-301-504-7477 (D.Z.);
+1-402-472-4303 (Y.Y.)
† These authors contributed equally to this work.
Abstract: Annona cherimola (cherimoya) is a species renowned for its delectable fruit and
medicinal properties. In this study, we developed a chromosome-level genome assembly for the
cherimoya ‘Booth’ cultivar from the United States. The genome assembly has a size of 794 Mb with
an N50 = 97.59 Mb. The seven longest scaffolds account for 87.6% of the total genome length, which
corresponds to the seven pseudo-chromosomes. A total of 45,272 protein-coding genes (≥30 aa)
were predicted with 92.9% gene content completeness. No recent whole genome duplications were
identified by an intra-genome collinearity analysis. Phylogenetic analysis supports that eudicots and
magnoliids are more closely related to each other than to monocots. Moreover, the Magnoliales was
found to be more closely related to the Laurales than the Piperales. Genome comparison revealed
that the ‘Booth’ cultivar has 200 Mb fewer repeats than the Spanish cultivar ‘Fino de Jete’, despite
their highly similar (>99%) genome sequence identity and collinearity. These two cultivars
diverged during the early Pleistocene (1.93 Mya), which suggests a different origin and domestication
of the cherimoya. Terpene/terpenoid metabolism functions were found to be enriched in Magnoliales,
while TNL (Toll/Interleukin-1-NBS-LRR) disease resistance genes have been lost in Magnoliales during
evolution. We have also identified a gene cluster that is potentially responsible for the biosynthesis
of acetogenins, a class of natural products found exclusively in Annonaceae. The cherimoya genome
provides an invaluable resource for supporting characterization, conservation, and utilization of
Annona genetic resources.
Keywords: Annona; Annonaceae; Magnoliales; whole genome duplication; Neotropical region; tropical
Americas; domestication; diversity
Citation: Li, T.; Zheng, J.; Nousias, O.; Yan, Y.; Meinhardt, L.W.; Goenaga, R.; Zhang, D.; Yin, Y.
The American Cherimoya Genome Reveals Insights into the Intra-Specific Divergence, the Evolution of
Magnoliales, and a Putative Gene Cluster for Acetogenin Biosynthesis. Plants 2024, 13, 636.
https://doi.org/10.3390/plants13050636
Academic Editor: Khalid Meksem
Received: 18 January 2024
Revised: 16 February 2024
Accepted: 21 February 2024
Published: 26 February 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Cherimoya (Annona cherimola) is a popular fruit widely cultivated in tropical and
subtropical regions [1,2]. A. cherimola (2n = 2x = 14) belongs to the Annona genus in
the Annonaceae family, which is the largest family in the Magnoliales order [3]. This family
contains around 108 genera and 2400 known species. In addition to A. cherimola,
there are several other Annona species with edible fruits, including Annona muricata, Annona
squamosa, Annona reticulata, Annona purpurea and the interspecific hybrid Atemoya
(A. cherimola × A. squamosa) [1]. Cherimoya fruits are commonly eaten fresh. The fruit is
very sweet, with a custard-like texture. The aromatic flavor is a mix of pineapple, mango,
and strawberry [4]. Cherimoya can also be used to make a range of food products, including
ice creams, milkshakes, jellies, yogurt, juice, and wine [5]. The fruit is a good source of
vitamin C and vitamin B, fiber, flavonoids, and potassium [6,7].
Besides its utilization as a fruit, cherimoya has long been used as traditional medicine
in different American civilizations. It provides various health benefits such as promoting
digestion, preventing high blood pressure and supporting immunity [8,9]. This fruit has
attracted special attention in recent years due to its high concentration of acetogenins—a
class of lipophilic polyketide natural products uniquely found in Annonaceae species [10–12].
Research has attributed several bioactivities to acetogenins, including cytotoxic,
anti-inflammatory and anti-tumor activities [13–15]. The chemical characteristics, bioactivity
and progress toward the therapeutic use of acetogenins from cherimoya were recently
reviewed [16].
Cherimoya is indigenous to the tropical Americas [17,18]. However, the precise
center of origin of cherimoya is still controversial. The early chroniclers and scientists
proposed that the native home of cherimoya is in the inter-Andean valleys, including
northern Peru and southern Ecuador [5,17]. This hypothesis was supported by linguistic
data, botanical remains and ceramics of Chimu-Inca and Moche cultures in northern Peru,
proving the presence of cherimoya in Peru as early as 2500 B.C. [19]. The existence of natural
populations of cherimoya in the mountain valleys in southern Ecuador and northern Peru
provides additional evidence to support this hypothesis [17,20]. However, more recent
studies based on microsatellite markers revealed a higher allelic diversity in Mesoamerican
cherimoya germplasm compared to the diversity found in South America [18,21]. These
results suggested the Mesoamerican origin of cherimoya and a pre-Columbian movement
of plant material, probably seeds, to South America, which resulted in a secondary center
of diversity in the Andean region [18,21]. From tropical America, the cherimoya was
introduced to continental Spain between the 16th and 18th centuries. From there, this fruit
species was further introduced to Italy, Portugal, and northern Africa [22]. After that, it
was grown in most tropical regions in Asia. Today, cherimoya is cultivated in tropical and
subtropical regions throughout the world, including the Americas, Asia, Africa, Australia,
and the Mediterranean region. Spain is the largest cherimoya-producing country, followed
by Peru, Chile, Ecuador, and Mexico [21].
The Magnoliales, together with the Canellales, Laurales, and Piperales, form the Magnoliid
complex [23]. The phylogenetic relationships among the clades have undergone
intensive investigations in recent years, but the results have been conflicting. A recent
study [24] analyzed four possible tree topologies by using different sampling datasets and
favored the topology that monocots and magnoliids are closer to each other than to eudicots.
Using single-copy orthologous gene trees, three other studies [25–27] reported that eudicots
and monocots form a clade that is a sister to magnoliids. Other recent studies also showed
that magnoliids and eudicots are closer to each other than they are to monocots [28–30].
It is expected that additional magnoliid genomes will help resolve the previous conflicting
conclusions regarding the phylogenetic relationship of the three major angiosperm
clades. Furthermore, new magnoliid genomes would improve our understanding of the
phylogenetic position and divergence time of A. cherimola from other magnoliids.
The economic importance of cherimoya and significance of Annona species in plant
evolution warrant greatly expanded efforts in developing genomic tools for supporting
the conservation of Annona genetic diversity, crop genetic improvement and utilization of
Annona genetic resources in the food and pharmaceutical industries. However, progress
in developing genomic resources for Annona lags behind what has been achieved for
temperate fruit trees and annual crops. While the first sequenced genome in the Annona
genus, A. muricata, was published in 2021, only raw DNA reads are available in GenBank,
and its genome assembly is still unavailable. More recently, the first A. cherimola genome
was developed from a Spanish cultivar, “Fino de Jete” [31].
In this study, we present the chromosome-level genome assembly of the American
cherimoya cultivar “Booth”. Our objectives were to: (1) explore the intra-specific genomic
diversity in the A. cherimola species; (2) analyze possible major duplication events in
cherimoya; (3) assess phylogenetic relationships among monocots, eudicots, and magnoliids;
(4) study disease-related genes and pathways; and (5) identify potential acetogenin
biosynthetic gene clusters in the cherimoya genome.
MAKER pipeline [45]. In the homology-based approach, the proteins from 12 species,
Aristolochia fimbriata (Afi), Arabidopsis thaliana (Ath), Amborella trichopoda (Atr), Cinnamomum
kanehirae (Cka), Chimonanthus salicifolius (Csa), Liriodendron chinense (Lch), Litsea cubeba
(Lcu), Magnolia officinalis (Mof), Nelumbo nucifera (Nnu), Oryza sativa (Osa), Persea americana
(Pam), Piper nigrum (Pni), and all proteins in the Swiss-Prot database were used as protein
evidence. Three RNA-seq datasets for atemoya (SRR6031481, SRR6031482 and SRR5908896)
were assembled using Trinity [46], and the resulting transcripts served as the
transcriptome evidence. The protein and transcriptome evidence was used in MAKER to
guide the ab initio gene predictions. In the ab initio method, SNAP [47] and Augustus [48]
were trained after each round of MAKER and the training results were integrated for the
next round. MAKER was run for three iterations to ensure gene prediction accuracy.
The final gene models were selected from the third-round output by keeping proteins of
at least 30 amino acids (≥30 aa) and then checked for annotation completeness using BUSCO
(-l viridiplantae, -m proteins) [49].
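The length filter applied to the final gene models can be illustrated with a short sketch. This is not the authors' script: the function names, the in-memory FASTA text, and the handling of a trailing stop symbol (*) are assumptions made for illustration, and the threshold follows the ≥30 aa convention used in Table 1.

```python
from typing import Dict, Iterator, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(text: str, min_aa: int = 30) -> Dict[str, str]:
    """Keep proteins with at least min_aa residues (hypothetical helper).

    A trailing '*' (stop symbol) is not counted as a residue; this is an
    assumption, not something stated in the paper.
    """
    kept = {}
    for header, seq in read_fasta(text):
        if len(seq.rstrip("*")) >= min_aa:
            kept[header.split()[0]] = seq
    return kept
```

Raising min_aa to 50 or 100 corresponds to the stricter gene counts given in the footnote of Table 1.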
3. Results
3.1. The Chromosome-Level Genome Assembly of Cherimoya “Booth”
The A. cherimola “Booth” genome was sequenced and assembled by Dovetail Genomics.
First, a total of 13.6 Gb (21.8×) of PacBio HiFi reads and 70.1 Gb (112.2×) of Illumina
reads (Omni-C library) were generated. The genome size was estimated to be approximately
625 Mb based on Omni-C Illumina reads, and the heterozygosity rate was 1.14%
(Figure S2) according to GenomeScope (http://qb.cshl.edu/genomescope/, accessed on
17 January 2024). Then, the “Booth” genome was assembled into 1658 contigs using HiFi
reads. The contigs were further scaffolded into a chromosome-level assembly with the Omni-C
reads using Dovetail HiRise™ scaffolding software
(https://github.com/DovetailGenomics/HiRise_July2015_GR, accessed on 17 January 2024).
In total, 1377 scaffolds (794 Mb) were in the final assembly with a scaffold
N50 = 97.59 Mb (Table 1). A total of 284 gaps remained
and the percentage of Ns was only 0.00359% in the assembly. Seven linkage groups were
identified from the link density histogram (Figure 1A) corresponding to seven pseudo-chromosomes
in the A. cherimola genome. This is consistent with the seven chromosomes in
its sister species A. muricata [70] and agrees with the flow cytometry result [71]. The seven
chromosomes also have telomere repeats in at least one of their two ends and centromere
repeats (Table S1). The seven chromosomes account for 87.6% of the total genome length;
the chromosomes were ordered by their sizes, from the largest (~128 Mb) to the smallest
(~73 Mb), and named from AC1 to AC7 (Figure 1B). Regarding the core gene completeness
of the genome assembly, 98.43% (251/255) of complete core genes were found in the genome
assembly in BUSCO (Benchmarking Universal Single-Copy Orthologs) analysis (Table 1).
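The two summary statistics quoted above, sequencing depth and scaffold N50, follow from simple definitions, sketched below. The function names are hypothetical and the code is an illustration rather than the software used for the assembly. With the ~625 Mb genome size estimate, the depth calculation reproduces the reported 21.8× and 112.2×.

```python
def sequencing_depth(total_bases: float, genome_size: float) -> float:
    """Average depth = sequenced bases divided by the estimated genome size."""
    return total_bases / genome_size


def n50(lengths):
    """Return the N50 of a list of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Depth implied by the read totals and the ~625 Mb genome size estimate.
hifi_depth = sequencing_depth(13.6e9, 625e6)
omnic_depth = sequencing_depth(70.1e9, 625e6)
```

Because seven scaffolds hold 87.6% of the 794 Mb assembly, the N50 necessarily falls on one of the chromosome-scale scaffolds, which is why it is so much larger than a typical contig-level N50.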
Table 1. General statistics of assembly and annotations for cherimoya and soursop genomes.

                                  A. cherimola “Booth”   A. cherimola “Fino de Jete” *   A. muricata *
Assembly
Total length (bp)                 794,023,491            1,137,394,475                   656,813,740
Number of scaffolds               1377                   2052                            755
Number of chromosomes             7                      7                               7
Longest scaffold (bp)             128,576,476            212,253,197                     122,620,176
Scaffold N50 (bp)                 97,591,913             170,859,109                     93,205,713
GC content (%)                    35.25                  34.69                           40.07
Complete BUSCOs (%)               98.43                  93.0                            -
Annotation
Repeat sequences (%)              68.23                  64.96                           54.87
Number of protein-coding genes    45,272 (≥30 aa) #      41,413 (≥50 aa)                 23,375 (≥100 aa)
Number of genes with annotation   32,377                 -                               22,769
Complete BUSCOs (%)               92.9                   90.9                            92.14

* Statistics data were taken from the soursop genome paper [70] and the cherimoya “Fino de Jete” genome
paper [31]. # 34,890 (≥100 aa), 43,926 (≥50 aa). BUSCO database: viridiplantae.
Figure 1. The chromosome-level genome assembly of cherimoya “Booth”. (A) Dovetail Genomics’
Hi-C linkage density heatmap for A. cherimola genome showing seven chromosomes. The darker
color indicates a higher frequency of interaction. (B) The circos plot of the seven chromosomes. From
outside to inside, we show (a) seven chromosomes arranged by size, (b) gene density (the number of
genes found in each 200 kb sequence window, high and low gene densities are indicated in orange
and blue, respectively), (c) GC content, (d) LTR Gypsy density, (e) LTR Copia density, (f) LINE-1
element density and (g) syntenic blocks between chromosomes.
Notably, repeat sequences accounted for 68.23% of the genome, in which long terminal
repeat (LTR) retrotransposons were the most abundant (25.49%) (Table S2). The percentage
of total repeats in A. cherimola was higher than in its closely related species A. muricata
(54.87%) [70], while the percentage of LTR retrotransposons in A. cherimola was lower
than that in A. muricata (41.28%). The percentage of repeat sequences in A. cherimola was
similar to that of two species in the Magnoliales order, 66.48% in M. biondii [72] and 61.64%
in L. chinense [73], but lower than the 81.44% in M. officinalis [27]. Among the LTR
retrotransposons, the number of Gypsy elements was larger than the number of Copia elements;
however, the total length of Copia elements was longer. In addition, the unevenly distributed
LTR retrotransposons in the A. cherimola genome tended to accumulate in sequence regions
that have lower gene density, while LINE-1 elements showed the opposite trend (Figure 1B).
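Gene density in Figure 1B is reported as the number of genes per 200 kb window. A minimal sketch of such a windowed count is given below; it assumes 0-based gene start coordinates and assigns each gene to the window containing its start, which may differ from how the circos track was actually computed. The function name is hypothetical.

```python
from collections import Counter


def gene_density(gene_starts, chrom_length, window=200_000):
    """Count genes per fixed-size window along one chromosome.

    gene_starts: 0-based start coordinates of genes (assumed convention).
    Each gene is assigned to the window that contains its start position.
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = Counter(start // window for start in gene_starts)
    return [counts.get(i, 0) for i in range(n_windows)]
```

Applying the same function to the start coordinates of LTR or LINE-1 elements would give repeat density tracks on the same window grid, which is what makes the inverse relationship with gene density visible.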
In the A. cherimola “Booth” genome, 45,272 protein-coding genes (≥30 aa) were predicted,
with 92.9% complete BUSCOs (protein mode) as the measure of gene annotation completeness.
The number of predicted genes in A. cherimola dropped to 34,890 when filtered at ≥100 aa
sequence length; even so, the gene number in A. cherimola was much higher than that in A.
muricata (23,375 genes) (Table 1). A total of 32,377 genes (71.52% of 45,272 genes) were
assigned GO, KEGG pathway, or Pfam annotations. In addition, a total of
495 transfer RNA (tRNA) and 61 ribosomal RNA (rRNA) genes were predicted in the seven
chromosomes, while 1448 tRNA and 11,401 rRNA genes were predicted in the remaining
1370 scaffolds. This huge number of rRNA genes was unusual, and most of them were
5S ribosomal RNAs. Examination of the genomic locations of these rRNA genes found
that only 61 genes were located on the seven chromosomes and the rest were on unplaced
scaffolds (Table S3). This indicates a possibility of mis-annotation of rRNA genes, although
it is not uncommon for plants to contain a large copy number of rRNA genes [74].
Figure 2. Duplication analysis of the A. cherimola genome. (A) The Ks distributions for all gene pairs
in syntenic blocks computed by MCScanX. The three-letter codes for species are the 1st letter of
genus + the 1st two letters of species (see Figure 3 for species name). The shades indicate the peaks.
(B) The Ks distributions for the whole paranome of four species in Magnoliales order. (C) Dot plots of
paralogs indicated self-syntenic blocks between chromosomes in the A. cherimola “Booth” genome.
Dots are colored according to the Ks values.
Figure 3. Phylogenetic relationships among the three major angiosperm clades and significantly
expanded/contracted gene families along clades. (A) The phylogenetic tree with estimated divergence
times for 20 species. A total of 18 species are selected from magnoliids (10 species), eudicots (4),
monocots (4) and compared with Amborella and Ginkgo as outgroups. The bootstrap values equal
to 100 are shown as gray circles on each node. The numbers of gene families significantly (p ≤ 0.05)
expanded and contracted on each node are labeled in red (+) and blue (−), respectively. (B) The GO
enrichment for 94 significantly expanded gene families in A. cherimola. The rich factor is the ratio
of genes in significantly expanded gene families annotated with this GO function to all genes in
the background (gene families shared by three major clades) annotated with this GO function. The
terpene/terpenoids synthesis and metabolism functions are shown in red.
However, when plotting all the paralogs in the genomes (i.e., whole paranome),
we observed a flat Ks~1.35 peak in A. cherimola (Figure 2B, see below). For comparison with
other magnoliid genomes, we have also plotted the whole paranome Ks of L. chinense, M.
officinalis, and M. biondii of the Magnoliales order, which have a Ks peak at 0.7–0.8. This peak
at 0.7–0.8 corresponds to a recent WGD event and is consistent with results determined in
previous studies for the three species [27,72,73]. A. cherimola does not share this
WGD event, as it does not have this Ks peak at 0.7–0.8.
The inter-chromosome syntenic block collinearity dot plot also confirms the lack of a
recent WGD event, as the inter-chromosome syntenic blocks are all short and fragmented
(Figures 1B and 2C). Instead of WGD, these short blocks might be a result of segmental
duplications, which are smaller-scale duplications than WGD.
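The reasoning above rests on whether a paranome Ks distribution shows a peak in a given interval. A minimal sketch of how peaks could be located in a list of Ks values is shown below. It uses a plain histogram with an assumed bin width of 0.1 and a simple local-maximum rule; the method actually used for Figure 2 is not described in this section and is likely more refined (for example, density smoothing or mixture modeling). Function names and parameters are hypothetical.

```python
def ks_histogram(ks_values, bin_width=0.1, ks_max=3.0):
    """Bin Ks values in [0, ks_max) into fixed-width bins and return counts."""
    n_bins = int(round(ks_max / bin_width))
    counts = [0] * n_bins
    for ks in ks_values:
        if 0 <= ks < ks_max:
            counts[min(int(ks / bin_width), n_bins - 1)] += 1
    return counts


def local_peaks(counts, bin_width=0.1, min_count=1):
    """Return centers of bins whose count exceeds both neighboring bins."""
    peaks = []
    for i in range(1, len(counts) - 1):
        if (counts[i] >= min_count
                and counts[i] > counts[i - 1]
                and counts[i] > counts[i + 1]):
            peaks.append(round((i + 0.5) * bin_width, 3))
    return peaks


def has_peak_between(peaks, low, high):
    """True if any detected peak center lies in the closed interval [low, high]."""
    return any(low <= p <= high for p in peaks)
```

Under this sketch, a genome sharing the Magnoliales WGD would return True for has_peak_between(peaks, 0.7, 0.8), whereas a distribution with only a broad peak near 1.35 would return False.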
3.4. Comparative Genomics Identifies Intra-Specific Genomic Variations within Annona cherimola
To determine the genomic variations within A. cherimola, we compared
the “Booth” genome with the “Fino de Jete” genome. A. cherimola cultivar
“Fino de Jete” is grown in Spain and its genome was published in February 2023 [31]. The
Spanish “Fino de Jete” chromosome-level genome assembly has a size of 1.13 Gb, which
is 343 Mb larger than the American “Booth” genome assembly (Table 1). The “Fino de
Jete” genome was reported to contain 743.28 Mb repeats [31], while our “Booth” contains
539.92 Mb repeats. Therefore, the “Fino de Jete” genome has 201.54 Mb more repeats than
the “Booth” genome, accounting for 58.9% of the size difference of the two genomes. To
compare the two assemblies at the genomic DNA level, we used MUMmer [77] to generate
a whole genome alignment (Figure 4A). Despite the large difference in genome size, the
two genomes share >99% nucleotide sequence identity and very similar chromosomal
collinearity, except for some major structural variations in Chr4 and Chr6. All chromosomes
except for Chr4 are longer in “Fino de Jete” than in “Booth”. The protein-coding gene
contents of the two genomes were also compared with MMSeqs2 [78], which revealed that
41,080 (99.2%) “Fino de Jete” genes and 34,827 (76.2%) “Booth” genes are shared between
the two genomes.
Figure 4. Genomic variations detected between two cherimoya genomes (“Fino de Jete” and “Booth”).
(A) The dotplot of whole genome alignment of “Fino de Jete” and “Booth” genomes computed
by MUMmer. Only the seven chromosomes of the two genome assemblies were aligned. The
“Booth” chromosomes (y-axis) are numbered according to the Chr length, while the “Fino de Jete”
chromosomes (x-axis) are not. Each dot in the plot is a fragment alignment. The color represents
the average nucleotide sequence identity of all fragment alignments of each Chr. The Chrs
are drawn in proportion to their lengths. (B) Phylogenetic relationship within the Magnoliales
order. The bootstrap values equal to 100 are shown. The phylogenetic tree was constructed using
1330 single-copy orthogroups.
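The per-chromosome average identity used for coloring in Figure 4A can be sketched as follows. The input format, a list of (chromosome, aligned length, percent identity) tuples, and the function name are assumptions for illustration. Whether the published average is weighted by alignment length is not stated, so both variants are provided.

```python
from collections import defaultdict


def mean_identity_by_chrom(fragments, weighted=True):
    """Average percent identity of fragment alignments per chromosome.

    fragments: iterable of (chrom, aligned_length, percent_identity).
    weighted=True weights each fragment by its aligned length (assumption);
    weighted=False gives every fragment equal weight.
    """
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for chrom, length, identity in fragments:
        weight = length if weighted else 1
        numerator[chrom] += weight * identity
        denominator[chrom] += weight
    return {chrom: numerator[chrom] / denominator[chrom] for chrom in numerator}
```

Length weighting prevents many short, lower-identity fragments from dominating the average of a chromosome that is otherwise covered by long, near-identical alignments.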
To study when the two cultivars diverged, we built a species phylogeny using
1330 single-copy orthogroups that are shared by five genomes of the Magnoliales order
and Arabidopsis. The species phylogeny was used to date the divergence times, using
three fossil calibrations from the TimeTree database [56]: (i) the divergence time of Ath
and Lch was ~163 million years ago (Mya), (ii) the divergence time of Lch and Ach was
between 95 and 113 Mya, and (iii) the divergence time of Lch and Mof was between 28 and
50 Mya. These results are consistent with what is shown in Figure 3A. The two cherimoya
cultivars, “Booth” and “Fino de Jete”, were estimated to have diverged from each other
about 1.93 Mya (Figure 4B). These two cultivars were brought to Spain and the United
States only a few hundred years ago. The divergence time estimate therefore suggests
that they have different origins and were domesticated separately before they were
exported from the Neotropical region.
Figure 5. Orthologous gene clusters (OGCs) shared by and unique to different groups of genomes.
(A) The Venn diagram shows that 10,396 OGCs are shared among three major clades of angiosperms.
(B) The Venn diagram shows that 276 OGCs are shared among three major orders within the magnoliid
clade. (C) The top 20 GO functions enriched in the 2863 unique OGCs that are only present in the
Magnoliales order. The terpene/terpenoids synthesis and metabolism functions are shown in red. See
Figure 3B for legends.
To study which GO functions are enriched in genes unique to the magnoliid clade,
binomial tests were performed for a GO enrichment analysis (see Section 2). Specifically,
the 11,698 OGCs unique to the magnoliid clade were used as the foreground, while the
10,396 core OGCs (Figure 5A) were used as the background. Similarly, GO enrichment analysis
was also performed to identify enriched functions in the Magnoliales order; 2863 OGCs
unique to the Magnoliales order were used as the foreground, and the 10,396 core OGCs were
used as the background. For both the Magnoliales order and the magnoliid clade, the enriched
functions were related to terpene synthesis or terpenoid metabolism
(Figures 5C and S4). Terpenes are natural products found in plants and
are responsible for their fragrance, taste, and pigment [79]. Terpenoids contribute to
plants’ defense against biotic and abiotic stresses, serve as signal molecules to attract insects
for pollination, and have substantial pharmacological bioactivity [80]. The biosynthesis
of terpenoids has been well studied in many magnoliid species, including M. biondii [72],
L. cubeba [29], A. fimbriata [24] and C. kanehirae [28].
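The foreground/background design described above can be illustrated with a one-sided binomial test for a single GO term. This is a sketch under stated assumptions, not the authors' implementation (the procedure is given in Section 2, which is not reproduced here): the background fraction annotated with the term is taken as the null success probability, and the p-value is the probability of seeing at least the observed number of annotated foreground items. The rich factor follows the definition in the Figure 3B caption. All function names are hypothetical.

```python
from math import exp, lgamma, log


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed in log space for stability."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1.0 - p))
        total += exp(log_pmf)
    return min(total, 1.0)


def go_enrichment_p(fg_hits, fg_total, bg_hits, bg_total):
    """One-sided enrichment p-value for one GO term (assumed test design)."""
    return binomial_upper_tail(fg_hits, fg_total, bg_hits / bg_total)


def rich_factor(fg_hits, bg_hits):
    """Foreground items with the GO term divided by background items with it."""
    return fg_hits / bg_hits
```

In practice, p-values from many GO terms would also need a multiple-testing correction before terms are reported as enriched; that step is omitted here.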
synthesis of ACGs, plantiSMASH [69] was run on the A. cherimola “Booth” genome. A total
of 24 biosynthetic gene clusters (BGCs, Table S4) were predicted, including 1 polyketide
BGC. This polyketide BGC is inferred to be the most likely candidate gene cluster for the
synthesis of ACGs in A. cherimola for the following reasons.
[Figure 6 legend: species ring (Gbi, Lcu, Ach, Osa, Ath); bootstrap scale 0.8–1; colored branches
(TNL, TN, CNL, NL, CN, NBS, OTHER).]
Figure 6. Phylogenetic tree of NBS domains from five representative plant genomes. The R gene
family is colored in branches, and species is colored in the ring outside the tree. See Table 2
caption for the R gene family name explanation. The species’ three-letter codes are the 1st letter
of genus + the 1st two letters of species. All the full species names can be found in Figure 3.
Two red solid lines separate the whole tree into the TNL and CNL major clades. The TNL clade
(TNL/TN) contains two sub-clades, TNL-Ath and TNL-Gbi. The CNL clade (all except for
TNL/TN) can be divided into seven sub-clades based on the tree topology and presence/absence
of the 5 species (pink dashed line). Some sub-clades contain NBS domains from all 5 species,
while others contain fewer or single species.
Table 2. Distribution of NLR genes in 20 genomes.
Figure 7. The putative Acetogenin (ACG) biosynthetic gene cluster. (A) Potential BGC1 (Ach20829-
Ach20837) was predicted to synthesize polyketide by plantiSMASH. The domains of four signature
genes are colored differently. (B) Overview of annonaceous ACG structure. ACG is composed of
three parts: Lactone ring (red), Tetrahydrofuran rings (blue) and long-chain fatty acids (purple).
The number of tetrahydrofuran rings varies from 1 to 3 (1 ≤ m ≤ 3). The Ach genes are colored
differently corresponding to functions involved in three biosynthetic parts of ACG. Gray genes
have functions not clearly relevant to ACG synthesis.
This ACG-BGC contains nine genes (Ach20829-Ach20837, total 97 kb, Figure 7B)
according to plantiSMASH, encoding proteins with Pfam domains annotated for polyketide
synthesis. Ach20829 contains K-box and SRF-TF domains, and its best A. thaliana hit
(AT4G11880) is a MADS-box protein, which might control the expression of the ACG-
BGC. Ach20830, Ach20831, Ach20833, and Ach20834 are four proteins that contain an N-
terminal FAE1_CUT1_RppA domain and a C-terminal ACP_syn_III_C domain. Their best
A. thaliana hits (AT1G04220 and AT2G26640) are members of the 3-ketoacyl-CoA synthase
family for the synthesis of VLCFAs (very long-chain fatty acids). Importantly, according
to the plantiSMASH signature gene search result of this BGC, Ach20830, Ach20831, and
Ach20834 are all chalcone synthases and contain a Chal_sti_synt_C domain, which may
be responsible for adding the lactone ring in ACGs. Ach20832 contains two domains
(Copine and zf-C3HC4_3), and its best hit (AT3G01650) is an E3 ubiquitin–protein ligase.
Ach20835 has its best A. thaliana hit (AT3G22990) annotated as a SWI/SNF chromatin-
remodeling complex (CRC) component LFR (leaf- and flower-related) protein. Ach20836
has an Epimerase domain, and its best A. thaliana hit (AT5G28840) is a GDP-mannose
3,5-epimerase; plantiSMASH also annotates it as a signature gene of this BGC. The last
gene in the BGC, Ach20837, has a PDDEXK_6 domain, formerly known as DUF506, and its
best A. thaliana hit (AT3G22970) was found to inhibit root hair elongation [86].
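The gene-by-gene annotation above can be summarized in a small lookup structure, which makes the domain counts easy to check. This is a minimal sketch: the dictionary layout and the helper `genes_with_domain` are illustrative assumptions, and the best hits of the four 3-ketoacyl-CoA synthase-like genes are left unset because the text gives two hits (AT1G04220 and AT2G26640) for the group without a per-gene assignment.

```python
# Annotations transcribed from the paragraph above; the layout is illustrative only.
KCS = {"FAE1_CUT1_RppA", "ACP_syn_III_C"}  # domains shared by the four KCS-like genes

ACG_BGC = {
    "Ach20829": {"domains": {"K-box", "SRF-TF"}, "best_hit": "AT4G11880"},
    "Ach20830": {"domains": KCS | {"Chal_sti_synt_C"}, "best_hit": None},
    "Ach20831": {"domains": KCS | {"Chal_sti_synt_C"}, "best_hit": None},
    "Ach20832": {"domains": {"Copine", "zf-C3HC4_3"}, "best_hit": "AT3G01650"},
    "Ach20833": {"domains": set(KCS), "best_hit": None},
    "Ach20834": {"domains": KCS | {"Chal_sti_synt_C"}, "best_hit": None},
    "Ach20835": {"domains": set(), "best_hit": "AT3G22990"},  # no Pfam domain named in the text
    "Ach20836": {"domains": {"Epimerase"}, "best_hit": "AT5G28840"},
    "Ach20837": {"domains": {"PDDEXK_6"}, "best_hit": "AT3G22970"},
}


def genes_with_domain(cluster, domain):
    """Return the sorted gene IDs whose annotation includes the given domain."""
    return sorted(gene for gene, ann in cluster.items() if domain in ann["domains"])


print(genes_with_domain(ACG_BGC, "FAE1_CUT1_RppA"))
print(genes_with_domain(ACG_BGC, "Chal_sti_synt_C"))
```

The two queries return four and three genes, respectively, matching the counts given in the text.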
4. Discussion
Cherimoya is a commercially important crop known for its delicious fruits and valu-
able bioactive compounds. It is thought to have originated from the Andes and Central
America. Over a dozen cherimoya cultivars are described in the literature [87]. In this study,
we selected a Californian cultivar “Booth” [88] for genome sequencing. The first cherimoya
genome was published in 2023 from a Spanish cherimoya cultivar “Fino de Jete” [31]. Both
“Booth” and “Fino de Jete” genomes were assembled into a chromosome-level assembly at a
similar level of genome quality (Table 1), which allows a whole genome alignment analysis
(Figures 4A and S5). It is striking that the two genomes have a significant difference in
genome size (343 Mb longer in “Fino de Jete”, Table 1), which is at least partially due to the
longer repeat regions in the “Fino de Jete” genome. Despite the genome size difference, the
two genomes show quite similar chromosome co-linearity and high nucleotide sequence
identity (Figure 4A). Major chromosomal rearrangements are observed between “Booth”
Chr6 and “Fino de Jete” Chr2. Interestingly, “Booth” Chr6 has telomere repeats identified
at both ends (Table S1), suggesting that this major structural inversion may reflect
a real difference between the two closely related genomes. Most “Fino de Jete” genes
are also present in “Booth” but not the other way around. Searching the 10,875 “Booth”
genome-specific genes against the “Fino de Jete” genome using BLAT [89] found that 70%
of them are present in the “Fino de Jete” genome and 48% of them encode short proteins
(<100 aa) (Table S5). Almost 80% of these “Booth” genome-specific genes do not have
Arabidopsis homologs, and when they do, most match Arabidopsis proteins encoded or
located in the chloroplast. Clearly, the gene models predicted in both genomes are not
perfect and will need future improvement.
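The short-protein proportion quoted above (<100 aa) is the kind of figure that can be computed directly from a protein FASTA file. The following is a minimal sketch rather than the pipeline used in this study; the function names, the toy sequences, and the gene IDs are hypothetical.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (sequence_id, sequence) pairs from a FASTA-formatted text handle."""
    seq_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            seq_id, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def short_protein_fraction(handle, max_len=100):
    """Fraction of proteins strictly shorter than max_len residues (trailing '*' ignored)."""
    lengths = [len(seq.rstrip("*")) for _, seq in read_fasta(handle)]
    return sum(n < max_len for n in lengths) / len(lengths) if lengths else 0.0


# Toy input with hypothetical gene IDs: 60 aa, 200 aa (two lines), and 99 aa plus a stop symbol.
toy = ">geneA\n" + "M" * 60 + "\n>geneB\n" + "M" * 150 + "\n" + "K" * 50 + "\n>geneC\n" + "M" * 99 + "*\n"
print(short_protein_fraction(StringIO(toy)))
```

In this toy input two of the three proteins fall below the 100 aa threshold.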
A high heterozygosity rate is observed in both the “Booth” (1.14%) and “Fino de Jete”
(1.05%) genomes, suggesting that high-quality phased haploid genomes will be needed in
the future. A newer version of Hifiasm (v0.19.5-r593) introduced a purge function that can
assemble haploid genomes. We tried this version using the 13.6 Gb of HiFi reads and
obtained two phased genomes of 749.5 Mb and 700.5 Mb, which are close to the draft genome
size (794 Mb) reported in this paper. Additional sequencing will be needed to generate
high-quality phased haploid genomes. With the current draft assembly, the
98.43% complete BUSCOs of the “Booth” genome indicate that the draft genome has good
coverage of the protein-coding genes and is appropriate for the comparative
genomics analyses in this paper.
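As a quick check on how close the two phased assemblies are to the draft, their sizes can be expressed as fractions of the draft size. The sketch below uses only the sizes quoted above; the labels `hap1` and `hap2` are hypothetical names for the two phased assemblies.

```python
# Sizes in Mb as quoted in the text; the labels hap1/hap2 are hypothetical.
DRAFT_MB = 794.0
PHASED_MB = {"hap1": 749.5, "hap2": 700.5}


def fraction_of_draft(size_mb, draft_mb=DRAFT_MB):
    """Return a phased assembly size as a fraction of the draft assembly size."""
    return size_mb / draft_mb


for label, size in PHASED_MB.items():
    print(f"{label}: {size} Mb = {fraction_of_draft(size):.1%} of the draft assembly")
```

The two phased assemblies correspond to roughly 94% and 88% of the draft assembly size.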
A genome divergence time estimation based on a sequence alignment of 1330 single-
copy genes shows that the two cultivars “Booth” and “Fino de Jete” diverged from each
other 1.93 Mya. This result suggests that the original plants of the two cultivars were
of different origins, and possibly with different domestication histories, too, when they
were brought to North America and Europe from the Neotropics a few hundred years
ago. The divergence of these populations started in the early Pleistocene (2.58–0.773 Ma),
which is consistent with the divergence time of many other plants in the Neotropical
region [90,91]. The observed intraspecific genetic differentiation is also compatible with a
recent finding [92], which reported two different haplotypes in the germplasm collected
from three Central American countries (Honduras, Guatemala and Costa Rica). However,
based on microsatellite analysis of cherimoya germplasm from tropical Americas, recent
studies [18,21] also proposed that cherimoya originated in the highlands of Mesoamerica,
and humans brought cherimoya from Mesoamerica to present-day Peru through long-
distance sea-trade routes across the Pacific Ocean. Although the exact origin of the two
cultivars (“Fino de Jete” and “Booth”) is unknown, our result shows that the hypothesis
of the Mesoamerica-to-Andes dispersal of cherimoya can be tested using comparative
genomics between landraces from Mesoamerica and the Andes. In addition, morphological
characteristics between these two cultivars should also be compared in future studies.
The speciation of A. cherimola occurred around 84 Mya (Figure 3A), which is consistent
with the estimated divergence time for A. muricata [72]. Based on 852 orthogroups for
20 species, the magnoliid clade has been shown to be closer to eudicots than monocots
(Figure 3A), which has been reported in some studies [28,30,72]. Given that more species
from the magnoliid clade have been included than in previous papers, our phylogenetic tree
should be more accurate in terms of revealing the phylogenetic relationships among
monocots, eudicots and magnoliids.
Focusing on the orthologous gene clusters (OGCs) unique to the Magnoliales order,
functions related to fruit flavors and plant stress responses, e.g., terpene synthesis or
terpenoid metabolism, are significantly enriched (Figure 5). This also agrees with the
finding that terpene and terpenoid metabolic processes were significantly expanded in
the A. cherimola genome (Figure 3B). Interestingly, the TNL genes are completely absent
in genomes of the Magnoliales order and the Piperales order (Table 2), and the entire
magnoliid clade has no or very few TNL genes, while the CNL sequence diversity is higher
in magnoliids than in dicots and monocots (Figure 6).
Another major finding of this study is the candidate biosynthetic gene cluster (BGC)
for THF acetogenin (ACG), which is the hallmark of the Annonaceae family with demonstrated
anti-tumor activities. This was made possible by the A. cherimola “Booth” genome,
because the BGC genome mining tool plantiSMASH requires an assembled genome as
input, which may explain why the ACG BGC has not been identified before. Although
experimental characterization is needed to confirm that this BGC is indeed responsible for ACG
synthesis, our sequence analysis of the nine member genes in this BGC strongly suggests it
is the most likely candidate (Figure 7). First, plantiSMASH found only one polyketide BGC
among the 24 predicted BGCs, and ACGs are lipophilic polyketide natural products. Second,
four of the nine genes of the BGC are members of the 3-ketoacyl-CoA synthase family
for the synthesis of VLCFAs (very long-chain fatty acids) according to their best A. thaliana
hits and Pfam domains. Third, three of the 3-ketoacyl-CoA synthase genes might encode
proteins with a Chal_sti_synt_C domain, which may be responsible for adding the lactone
ring in ACGs. This BGC also contains other genes that may be important for regulating
ACG synthesis, such as the MADS-box protein and the SWI/SNF chromatin-remodeling
complex (CRC) component LFR protein. Future experimental validation will be necessary
to verify the role of this BGC in acetogenin biosynthesis.
5. Conclusions
In summary, the cherimoya “Booth” genome, the second publicly available genome
of the Annonaceae family, will be a valuable resource for studying the genetic diversity of
Annonaceae, the evolution of magnoliids and flowering plants, the discovery of acetogenin
biosynthetic genes and the origin, domestication and dispersal of cherimoya. It also
provides novel genomic resources to support crop germplasm evaluation and new breeding
strategies to improve the production of this economically important fruit crop.
Supplementary Materials: The following supporting information can be downloaded at:
https://www.mdpi.com/article/10.3390/plants13050636/s1. Figure S1: The young plant of cherimoya cultivar
‘Booth’; Figure S2: The k-mer frequency plot for A. cherimola genome; Figure S3: The GO enrichment
for 51 significantly contracted gene families in A. cherimola; Figure S4: The top 20 GO enrichment of
11,698 unique OGCs in magnoliids clade; Figure S5: Syntenic block plot of genes in “Fino de Jete” and
“Booth” genomes; Table S1: Telomere analysis result; Table S2: Repeat annotation result; Table S3:
rRNA gene prediction result; Table S4: Biosynthetic gene cluster prediction result; Table S5: BLAT
search result of “Booth” genome-specific genes in the “Fino de Jete” genome.
Author Contributions: Y.Y. (Yanbin Yin), D.Z. and L.W.M. conceived and designed the project. D.Z.,
R.G. and L.W.M. collected the plant materials and generated the sequencing data. T.L., O.N., Y.Y.
(Yuchen Yan) and J.Z. performed all the data analysis under the supervision of Y.Y. (Yanbin Yin).
T.L., J.Z. and Y.Y. (Yanbin Yin) drafted the manuscript. All authors contributed and approved the final
manuscript. All authors have read and agreed to the published version of the manuscript.
Funding: This work was primarily supported by the United States Department of Agriculture
(USDA)/Agricultural Research Service (ARS) award [58-8042-9-089], and partially by the National
Science Foundation (NSF) CAREER award [DBI-1933521], National Institutes of Health (NIH) R01
award [R01GM140370], and a start-up grant from UNL [2019-YIN] granted to Y.Y. Any mention of trade names
or commercial products in this publication is solely for the purpose of providing specific information
and does not imply recommendation or endorsement by the U.S. Department of Agriculture. USDA
is an equal opportunity provider and employer.
Data Availability Statement: The A. cherimola “Booth” genome and gene annotation have been
deposited in GenBank with a BioProject ID PRJNA954757 and are also made available at
https://bcb.unl.edu/Ach/ (accessed on 17 January 2024).
Acknowledgments: We would like to thank all our lab members for the helpful discussions. This
work was partially completed utilizing the Holland Computing Center of the University of Nebraska,
which receives support from the Nebraska Research Initiative.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Pinto, A.D.Q.; Cordeiro, M.C.R.; De Andrade, S.R.M.; Ferreira, F.R.; Filgueiras, H.D.C.; Alves, R.E.; Kinpara, D.I. Annona Species;
University of Southampton, International Centre for Underutilised Crops: Southampton, UK, 2005; p. 284.
2. Leal, F.; Paull, R.E. The genus Annona: Botanical characteristics, horticultural requirements and uses. Crop Sci. 2023, 63, 1030–1049.
[CrossRef]
3. The Angiosperm Phylogeny Group. An update of the Angiosperm Phylogeny Group classification for the orders and families of
flowering plants: APG III. Bot. J. Linn. Soc. 2009, 161, 105–121. [CrossRef]
4. Pino, J.A. Annona Fruits. In Handbook of Fruit and Vegetable Flavors; Wiley: Hoboken, NJ, USA, 2010; pp. 229–246. [CrossRef]
5. National Research Council. Lost Crops of the Incas: Little-Known Plants of the Andes with Promise for Worldwide Cultivation; National
Academies Press: Washington, DC, USA, 1989. [CrossRef]
6. Amoo, I.A.; Emenike, A.E.; Akpambang, V.O.E. Compositional Evaluation of Annona cherimoya (Custard Apple) Fruit. Trends
Appl. Sci. Res. 2008, 3, 216–220. [CrossRef]
7. Mannino, G.; Gentile, C.; Porcu, A.; Agliassa, C.; Caradonna, F.; Bertea, C.M. Chemical Profile and Biological Activity of
Cherimoya (Annona cherimola Mill.) and Atemoya (Annona atemoya) Leaves. Molecules 2020, 25, 2612. [CrossRef] [PubMed]
8. Albuquerque, T.G.; Santos, F.; Sanches-Silva, A.; Beatriz Oliveira, M.; Bento, A.C.; Costa, H.S. Nutritional and phytochemical
composition of Annona cherimola Mill. fruits and by-products: Potential health benefits. Food Chem. 2016, 193, 187–195. [CrossRef]
[PubMed]
9. Quílez, A.M.; Fernández-Arche, M.A.; García-Giménez, M.D.; De la Puerta, R. Potential therapeutic applications of the genus
Annona: Local and traditional uses and pharmacology. J. Ethnopharmacol. 2018, 225, 244–270. [CrossRef] [PubMed]
10. Cortes, D.; Myint, S.H.; Dupont, B.; Davoust, D. Bioactive acetogenins from seeds of Annona cherimolia. Phytochemistry 1993, 32,
1475–1482. [CrossRef]
11. Neske, A.; Ruiz Hidalgo, J.; Cabedo, N.; Cortes, D. Acetogenins from Annonaceae family. Their potential biological applications.
Phytochemistry 2020, 174, 112332. [CrossRef]
12. Perrone, A.; Yousefi, S.; Salami, A.; Papini, A.; Martinelli, F. Botanical, genetic, phytochemical and pharmaceutical aspects of
Annona cherimola Mill. Sci. Hortic. 2022, 296, 110896. [CrossRef]
13. Nakanishi, Y.; Chang, F.-R.; Liaw, C.-C.; Wu, Y.-C.; Bastow, K.F.; Lee, K.-H. Acetogenins as Selective Inhibitors of the Human
Ovarian 1A9 Tumor Cell Line. J. Med. Chem. 2003, 46, 3185–3188. [CrossRef]
14. Yuan, S.-S.F.; Chang, H.-L.; Chen, H.-W.; Yeh, Y.-T.; Kao, Y.-H.; Lin, K.-H.; Wu, Y.-C.; Su, J.-H. Annonacin, a mono-tetrahydrofuran
acetogenin, arrests cancer cells at the G1 phase and causes cytotoxicity in a Bax- and caspase-3-related pathway. Life Sci. 2003, 72,
2853–2861. [CrossRef] [PubMed]
15. Younes, M.; Ammoury, C.; Haykal, T.; Nasr, L.; Sarkis, R.; Rizk, S. The selective anti-proliferative and pro-apoptotic effect of A.
cherimola on MDA-MB-231 breast cancer cell line. BMC Complement. Med. Ther. 2020, 20, 343. [CrossRef] [PubMed]
16. Durán, A.G.; Gutiérrez, M.T.; Mejías, F.J.R.; Molinillo, J.M.G.; Macías, F.A. An Overview of the Chemical Characteristics, Bioactivity
and Achievements Regarding the Therapeutic Usage of Acetogenins from Annona cherimola Mill. Molecules 2021, 26, 2926. [CrossRef]
[PubMed]
17. Popenoe, W. The Native Home of the Cherimoya. J. Hered. 1921, 12, 331–336. [CrossRef]
18. Larranaga, N.; Albertazzi, F.J.; Fontecha, G.; Palmieri, M.; Rainer, H.; van Zonneveld, M.; Hormaza, J.I. A Mesoamerican origin of
cherimoya (Annona cherimola Mill.): Implications for the conservation of plant genetic resources. Mol. Ecol. 2017, 26, 4116–4130.
[CrossRef]
19. Bonavia, D.; Ochoa, C.M.; Óscar Tovar, S.; Palomino, R.C. Archaeological Evidence of Cherimoya (Annona cherimolia Mill.) and
Guanabana (Annona muricata L.) in Ancient Peru. Econ. Bot. 2004, 58, 509–522. [CrossRef]
20. Scheldeman, X. Distribution and Potential of Cherimoya (Annona cherimola Mill.) and Highland Papayas (Vasconcellea Spp.) in
Ecuador. Ph.D. Thesis, Faculty of Agricultural and Applied Biological Sciences, Department Plant Production, Laboratory of
Tropical and Subtropical Agronomy and Ethnobotany, Ghent, Belgium, 2002.
21. Larranaga, N.; Albertazzi, F.J.; Hormaza, J.I. Phylogenetics of Annona cherimola (Annonaceae) and some of its closest relatives.
J. Syst. Evol. 2019, 57, 211–221. [CrossRef]
22. Morton, J.F. Fruits of Warm Climates; Miami, FL, USA, 1987. Available online:
https://www.echopointbooks.com/agriculture/fruits-of-warm-climates (accessed on 17 January 2024).
23. Soltis, D.; Soltis, P.; Endress, P.; Chase, M.W.; Manchester, S.; Judd, W.; Majure, L.; Mavrodiev, E. Phylogeny and Evolution of the
Angiosperms; University of Chicago Press: Chicago, IL, USA, 2018.
24. Qin, L.; Hu, Y.; Wang, J.; Wang, X.; Zhao, R.; Shan, H.; Li, K.; Xu, P.; Wu, H.; Yan, X.; et al. Insights into angiosperm evolution,
floral development and chemical biosynthesis from the Aristolochia fimbriata genome. Nat. Plants 2021, 7, 1239–1253. [CrossRef]
25. Hu, L.; Xu, Z.; Wang, M.; Fan, R.; Yuan, D.; Wu, B.; Wu, H.; Qin, X.; Yan, L.; Tan, L.; et al. The chromosome-scale reference genome
of black pepper provides insight into piperine biosynthesis. Nat. Commun. 2019, 10, 4702. [CrossRef]
26. Rendón-Anaya, M.; Ibarra-Laclette, E.; Méndez-Bravo, A.; Lan, T.; Zheng, C.; Carretero-Paulet, L.; Perez-Torres, C.A.; Chacón-
López, A.; Hernandez-Guzmán, G.; Chang, T.-H.; et al. The avocado genome informs deep angiosperm phylogeny, highlights
introgressive hybridization, and reveals pathogen-influenced gene space adaptation. Proc. Natl. Acad. Sci. USA 2019, 116,
17081–17089. [CrossRef]
27. Yin, Y.; Peng, F.; Zhou, L.; Yin, X.; Chen, J.; Zhong, H.; Hou, F.; Xie, X.; Wang, L.; Shi, X.; et al. The chromosome-scale genome of
Magnolia officinalis provides insight into the evolutionary position of magnoliids. iScience 2021, 24, 102997. [CrossRef]
28. Chaw, S.-M.; Liu, Y.-C.; Wu, Y.-W.; Wang, H.-Y.; Lin, C.-Y.I.; Wu, C.-S.; Ke, H.-M.; Chang, L.-Y.; Hsu, C.-Y.; Yang, H.-T.; et al. Stout
camphor tree genome fills gaps in understanding of flowering plant genome evolution. Nat. Plants 2019, 5, 63–73. [CrossRef]
[PubMed]
29. Chen, Y.-C.; Li, Z.; Zhao, Y.-X.; Gao, M.; Wang, J.-Y.; Liu, K.-W.; Wang, X.; Wu, L.-W.; Jiao, Y.-L.; Xu, Z.-L.; et al. The Litsea genome
and the evolution of the laurel family. Nat. Commun. 2020, 11, 1675. [CrossRef] [PubMed]
30. Lv, Q.; Qiu, J.; Liu, J.; Li, Z.; Zhang, W.; Wang, Q.; Fang, J.; Pan, J.; Chen, Z.; Cheng, W.; et al. The Chimonanthus salicifolius genome
provides insight into magnoliid evolution and flavonoid biosynthesis. Plant J. 2020, 103, 1910–1923. [CrossRef] [PubMed]
31. Talavera, A.; Fernandez-Pozo, N.; Matas, A.J.; Hormaza, J.I.; Bombarely, A. Genomics in neglected and underutilized fruit crops:
A chromosome-scale genome sequence of cherimoya (Annona cherimola). Plants People Planet 2023, 5, 408–423. [CrossRef]
32. Grossberger, D. The California Cherimoya Industry. Acta Hortic. 1999, 497, 119–142. [CrossRef]
33. Zheng, J.; Meinhardt, L.W.; Goenaga, R.; Zhang, D.; Yin, Y. The chromosome-level genome of dragon fruit reveals whole-genome
duplication and chromosomal co-localization of betacyanin biosynthetic genes. Hortic. Res. 2021, 8, 63. [CrossRef]
34. Zheng, J.; Meinhardt, L.W.; Goenaga, R.; Matsumoto, T.; Zhang, D.; Yin, Y. The chromosome-level rambutan genome reveals a
significant role of segmental duplication in the expansion of resistance genes. Hortic. Res. 2022, 9, uhac014. [CrossRef]
35. Putnam, N.H.; O’Connell, B.L.; Stites, J.C.; Rice, B.J.; Blanchette, M.; Calef, R.; Troll, C.J.; Fields, A.; Hartley, P.D.; Sugnet, C.W.; et al.
Chromosome-scale shotgun assembly using an in vitro method for long-range linkage. Genome Res. 2016, 26, 342–350. [CrossRef]
36. Smit, A.F.A.; Hubley, R. RepeatModeler Open-1.0; ScienceOpen, Inc.: Lexington, MA, USA, 2010.
37. Ou, S.; Jiang, N. LTR_retriever: A Highly Accurate and Sensitive Program for Identification of Long Terminal Repeat Retrotrans-
posons. Plant Physiol. 2018, 176, 1410–1422. [CrossRef]
38. Crescente, J.M.; Zavallo, D.; Helguera, M.; Vanzetti, L.S. MITE Tracker: An accurate approach to identify miniature inverted-repeat
transposable elements in large genomes. BMC Bioinform. 2018, 19, 348. [CrossRef]
39. Smit, A.F.A.; Hubley, R.; Green, P. RepeatMasker Open-4.0. 2013–2015. Available online: http://www.repeatmasker.org
(accessed on 17 January 2024).
40. Abrusán, G.; Grundmann, N.; DeMester, L.; Makalowski, W. TEclass—A tool for automated classification of unknown eukaryotic
transposable elements. Bioinformatics 2009, 25, 1329–1330. [CrossRef]
41. Chan, P.P.; Lowe, T.M. tRNAscan-SE: Searching for tRNA Genes in Genomic Sequences. In Gene Prediction: Methods and Protocols;
Methods Molecular Biology; Springer: Berlin/Heidelberg, Germany, 2019; Volume 1962, pp. 1–14. [CrossRef]
42. Seemann, T. Barrnap: BAsic Rapid Ribosomal RNA Predictor. 2013. Available online: https://github.com/tseemann/barrnap
(accessed on 17 January 2024).
43. Nawrocki, E.P.; Eddy, S.R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 2013, 29, 2933–2935. [CrossRef]
44. Kalvari, I.; Argasinska, J.; Quinones-Olvera, N.; Nawrocki, E.P.; Rivas, E.; Eddy, S.R.; Bateman, A.; Finn, R.D.; Petrov, A.I. Rfam
13.0: Shifting to a genome-centric resource for non-coding RNA families. Nucleic Acids Res. 2018, 46, D335–D342. [CrossRef]
[PubMed]
45. Cantarel, B.L.; Korf, I.; Robb, S.M.C.; Parra, G.; Ross, E.; Moore, B.; Holt, C.; Sánchez Alvarado, A.; Yandell, M. MAKER: An
easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 2008, 18, 188–196. [CrossRef]
46. Grabherr, M.G.; Haas, B.J.; Yassour, M.; Levin, J.Z.; Thompson, D.A.; Amit, I.; Adiconis, X.; Fan, L.; Raychowdhury, R.;
Zeng, Q.; et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 2011, 29,
644–652. [CrossRef]
47. Korf, I. Gene finding in novel genomes. BMC Bioinform. 2004, 5, 59. [CrossRef]
48. Stanke, M.; Keller, O.; Gunduz, I.; Hayes, A.; Waack, S.; Morgenstern, B. AUGUSTUS: Ab initio prediction of alternative transcripts.
Nucleic Acids Res. 2006, 34, W435–W439. [CrossRef]
49. Simão, F.A.; Waterhouse, R.M.; Ioannidis, P.; Kriventseva, E.V.; Zdobnov, E.M. BUSCO: Assessing genome assembly and
annotation completeness with single-copy orthologs. Bioinformatics 2015, 31, 3210–3212. [CrossRef] [PubMed]
50. Emms, D.M.; Kelly, S. OrthoFinder: Solving fundamental biases in whole genome comparisons dramatically improves orthogroup
inference accuracy. Genome Biol. 2015, 16, 157. [CrossRef] [PubMed]
51. Hauser, M.; Steinegger, M.; Söding, J. MMseqs software suite for fast and deep clustering and searching of large protein sequence
sets. Bioinformatics 2016, 32, 1323–1330. [CrossRef]
52. Edgar, R.C. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32,
1792–1797. [CrossRef]
53. Talavera, G.; Castresana, J. Improvement of Phylogenies after Removing Divergent and Ambiguously Aligned Blocks from
Protein Sequence Alignments. Syst. Biol. 2007, 56, 564–577. [CrossRef]
54. Nguyen, L.-T.; Schmidt, H.A.; von Haeseler, A.; Minh, B.Q. IQ-TREE: A fast and effective stochastic algorithm for estimating
maximum-likelihood phylogenies. Mol. Biol. Evol. 2015, 32, 268–274. [CrossRef]
55. Sanderson, M.J. r8s: Inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock.
Bioinformatics 2003, 19, 301–302. [CrossRef]
56. Kumar, S.; Stecher, G.; Suleski, M.; Hedges, S.B. TimeTree: A Resource for Timelines, Timetrees, and Divergence Times. Mol. Biol.
Evol. 2017, 34, 1812–1819. [CrossRef] [PubMed]
57. Mendes, F.K.; Vanderpool, D.; Fulton, B.; Hahn, M.W. CAFE 5 models variation in evolutionary rates among gene families.
Bioinformatics 2020, 36, 5516–5518. [CrossRef] [PubMed]
58. Huerta-Cepas, J.; Forslund, K.; Coelho, L.P.; Szklarczyk, D.; Jensen, L.J.; von Mering, C.; Bork, P. Fast Genome-Wide Functional
Annotation through Orthology Assignment by eggNOG-Mapper. Mol. Biol. Evol. 2017, 34, 2115–2122. [CrossRef] [PubMed]
59. Klopfenstein, D.V.; Zhang, L.; Pedersen, B.S.; Ramírez, F.; Warwick Vesztrocy, A.; Naldi, A.; Mungall, C.J.; Yunes, J.M.;
Botvinnik, O.; Weigel, M.; et al. GOATOOLS: A Python library for Gene Ontology analyses. Sci. Rep. 2018, 8, 10872. [CrossRef]
[PubMed]
60. Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.;
Bright, J.; et al. Author Correction: SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods 2020, 17, 352.
[CrossRef]
61. Wang, Y.; Tang, H.; Debarry, J.D.; Tan, X.; Li, J.; Wang, X.; Lee, T.-H.; Jin, H.; Marler, B.; Guo, H.; et al. MCScanX: A toolkit for
detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012, 40, e49. [CrossRef] [PubMed]
62. Sun, P.; Jiao, B.; Yang, Y.; Shan, L.; Li, T.; Li, X.; Xi, Z.; Wang, X.; Liu, J. WGDI: A user-friendly toolkit for evolutionary analyses of
whole-genome duplications and ancestral karyotypes. Mol. Plant 2022, 15, 1841–1851. [CrossRef]
63. Tang, H.; Bowers, J.E.; Wang, X.; Ming, R.; Alam, M.; Paterson, A.H. Synteny and Collinearity in Plant Genomes. Science 2008, 320,
486–488. [CrossRef]
64. Zwaenepoel, A.; Van de Peer, Y. wgd-simple command line tools for the analysis of ancient whole-genome duplications.
Bioinformatics 2019, 35, 2153–2155. [CrossRef]
65. Li, P.; Quan, X.; Jia, G.; Xiao, J.; Cloutier, S.; You, F.M. RGAugury: A pipeline for genome-wide prediction of resistance gene
analogs (RGAs) in plants. BMC Genom. 2016, 17, 852. [CrossRef] [PubMed]
66. Katoh, K.; Standley, D.M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability.
Mol. Biol. Evol. 2013, 30, 772–780. [CrossRef] [PubMed]
67. Price, M.N.; Dehal, P.S.; Arkin, A.P. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS ONE
2010, 5, e9490. [CrossRef]
68. Letunic, I.; Bork, P. Interactive Tree of Life (iTOL) v5: An online tool for phylogenetic tree display and annotation. Nucleic Acids
Res. 2021, 49, W293–W296. [CrossRef]
69. Kautsar, S.A.; Suarez Duran, H.G.; Blin, K.; Osbourn, A.; Medema, M.H. plantiSMASH: Automated identification, annotation and
expression analysis of plant biosynthetic gene clusters. Nucleic Acids Res. 2017, 45, W55–W63. [CrossRef]
70. Strijk, J.; Hinsinger, D.; Roeder, M.; Chatrou, L.; Couvreur, T.; Erkens, R.; Sauquet, H.; Pirie, M.; Thomas, D.; Cao, K. Chromosome-
level reference genome of the Soursop (Annona muricata), a new resource for Magnoliid research and tropical pomology. Mol. Ecol.
Resour. 2021, 21, 1608–1619. [CrossRef]
71. Martin, C.; Viruel, M.A.; Lora, J.; Hormaza, J.I. Polyploidy in Fruit Tree Crops of the Genus Annona (Annonaceae). Front. Plant Sci.
2019, 10, 99. [CrossRef] [PubMed]
72. Dong, S.; Liu, M.; Liu, Y.; Chen, F.; Yang, T.; Chen, L.; Zhang, X.; Guo, X.; Fang, D.; Li, L.; et al. The genome of Magnolia biondii
Pamp. provides insights into the evolution of Magnoliales and biosynthesis of terpenoids. Hortic. Res. 2021, 8, 38. [CrossRef]
[PubMed]
73. Chen, J.; Hao, Z.; Guang, X.; Zhao, C.; Wang, P.; Xue, L.; Zhu, Q.; Yang, L.; Sheng, Y.; Zhou, Y.; et al. Liriodendron genome sheds
light on angiosperm phylogeny and species-pair differentiation. Nat. Plants 2019, 5, 18–25. [CrossRef] [PubMed]
74. Long, E.O.; Dawid, I.B. Repeated Genes in Eukaryotes. Annu. Rev. Biochem. 1980, 49, 727–764. [CrossRef] [PubMed]
75. Moriyama, Y.; Koshiba-Takeuchi, K. Significance of whole-genome duplications on the emergence of evolutionary novelties. Brief.
Funct. Genom. 2018, 17, 329–338. [CrossRef] [PubMed]
76. Amborella Genome Project. The Amborella genome and the evolution of flowering plants. Science 2013, 342, 1241089. [CrossRef]
77. Marçais, G.; Delcher, A.L.; Phillippy, A.M.; Coston, R.; Salzberg, S.L.; Zimin, A. MUMmer4: A fast and versatile genome alignment
system. PLoS Comput. Biol. 2018, 14, e1005944. [CrossRef]
78. Steinegger, M.; Söding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat.
Biotechnol. 2017, 35, 1026–1028. [CrossRef]
79. Cox-Georgian, D.; Ramadoss, N.; Dona, C.; Basu, C. Therapeutic and Medicinal Uses of Terpenes. In Medicinal Plants; Springer:
Berlin/Heidelberg, Germany, 2019; pp. 333–359. [CrossRef]
80. Singh, B.; Sharma, R.A. Plant terpenes: Defense responses, phylogenetic analysis, regulation and clinical applications. 3 Biotech
2015, 5, 129–151. [CrossRef]
81. Jones, J.D.G.; Dangl, J.L. The plant immune system. Nature 2006, 444, 323–329. [CrossRef]
82. Tarr, D.E.K.; Alexander, H.M. TIR-NBS-LRR genes are rare in monocots: Evidence from diverse monocot orders. BMC Res. Notes
2009, 2, 197. [CrossRef] [PubMed]
83. Jacob, F.; Vernaldi, S.; Maekawa, T. Evolution and Conservation of Plant NLR Functions. Front. Immunol. 2013, 4, 297. [CrossRef]
[PubMed]
84. Ngou, B.P.M.; Ding, P.; Jones, J.D.G. Thirty years of resistance: Zig-zag through the plant immune system. Plant Cell 2022, 34,
1447–1478. [CrossRef] [PubMed]
85. Alali, F.Q.; Liu, X.-X.; McLaughlin, J.L. Annonaceous Acetogenins: Recent Progress. J. Nat. Prod. 1999, 62, 504–540. [CrossRef]
[PubMed]
86. Ying, S.; Scheible, W.-R. A novel calmodulin-interacting Domain of Unknown Function 506 protein represses root hair elongation
in Arabidopsis. Plant Cell Environ. 2022, 45, 1796–1812. [CrossRef]
87. Datiles, M.J.; Acevedo-Rodríguez, P. Annona cherimola (cherimoya). In CABI Compendium; 18/12/2014. Available online:
https://doi.org/10.1079/cabicompendium.5806 (accessed on 17 January 2024).
88. Kahn, T.L.; Adams, C.J.; Arpaia, M.L. Paternal and maternal effects on fruit and seed characteristics in cherimoya (Annona
cherimola Mill.). Sci. Hortic. 1994, 59, 11–25. [CrossRef]
89. Camacho, C.; Coulouris, G.; Avagyan, V.; Ma, N.; Papadopoulos, J.; Bealer, K.; Madden, T. BLAST+: Architecture and
applications. BMC Bioinform. 2009, 10, 421. [CrossRef]
90. Dick, C.W.; Pennington, R.T. History and Geography of Neotropical Tree Diversity. Annu. Rev. Ecol. Evol. Syst. 2019, 50, 279–301.
[CrossRef]
91. Baker, P.A.; Fritz, S.C.; Battisti, D.S.; Dick, C.W.; Vargas, O.M.; Asner, G.P.; Martin, R.E.; Wheatley, A.; Prates, I. Beyond Refugia:
New Insights on Quaternary Climate Variation and the Evolution of Biotic Diversity in Tropical South America. In Neotropical
Diversification: Patterns and Processes; Springer: Berlin/Heidelberg, Germany, 2020; pp. 51–70. [CrossRef]
92. Larranaga, N.; Fontecha, G.; Albertazzi, F.J.; Palmieri, M.; Hormaza, J.I. Amplification of Cherimoya (Annona cherimola Mill.)
with Chloroplast-Specific Markers: Geographical Implications on Diversity and Dispersion Studies. Horticulturae 2022, 8, 807.
[CrossRef]