2021
34 pages
Despite recent biomedical breakthroughs and the growing momentum of large genomic studies, the Middle Eastern population, home to over 400 million people, is under-represented in human genome variation databases. Here we describe insights from phase 1 of the Qatar Genome Program, which whole-genome sequenced 6,045 individuals from Qatar. We identified more than 88 million variants, of which 24 million are novel and 23 million are singletons. Consistent with the high consanguinity and founder effects in the region, we found that several rare deleterious variants were more common in the Qatari population, while others seem to provide protection against diseases and have shaped the genetic architecture of adaptive phenotypes. Insights into the genetic structure of the Qatari population revealed five non-admixed subgroups. Based on sequence data, we also report the heritability and genetic marker associations for 45 clinical traits. These results highlight the value of our data as a resourc...
Nature Communications, 2021
Clinical laboratory tests play a pivotal role in medical decision making, but little is known about their genetic variability between populations. We report a genome-wide association study with 45 clinically relevant traits from the population of Qatar using a whole genome sequencing approach in a discovery set of 6218 individuals and replication in 7768 subjects. Trait heritability is more similar between Qatari and European populations (r = 0.81) than with Africans (r = 0.44). We identify 281 distinct variant-trait-associations at genome wide significance that replicate known associations. Allele frequencies for replicated loci show higher correlations with European (r = 0.94) than with African (r = 0.85) or Japanese (r = 0.80) populations. We find differences in linkage disequilibrium patterns and in effect sizes of the replicated loci compared to previous reports. We also report 17 novel and Qatari-predominate signals providing insights into the biological pathways regulating th...
The Qatari population, located at the Arabian migration crossroads of Africa and Eurasia, comprises Bedouin, Persian and African genetic subgroups. By deep exome sequencing of only 7 Qataris, including individuals in each subgroup, we identified 2,750 nonsynonymous SNPs predicted to be deleterious, many of which are linked to human health or lie in genes linked to human health. Many of these SNPs had a significantly elevated deleterious allele frequency in Qataris compared to other populations worldwide. Despite the small sample size, SNP allele frequency was highly correlated with that of a larger Qatari sample. Together, the data demonstrate that exome sequencing of only a small number of individuals can reveal genetic variations with potential health consequences in understudied populations.
2021
Genetic variation in populations of Middle Eastern origin remains highly underrepresented in most comprehensive genomic databases. This underrepresentation hampers the functional annotation of the human genome and challenges accurate clinical variant interpretation. To highlight the importance of capturing genetic variation in the Middle East, we aggregated whole exome and genome sequencing data from 2116 individuals in the Middle East and established the Middle East Variation (MEV) database. Of the high-impact coding (missense and loss of function) variants in this database, 53% were absent from the most comprehensive Genome Aggregation Database (gnomAD), thus representing a unique Middle Eastern variation dataset which might directly impact clinical variant interpretation. We highlight 39 variants with minor allele frequency >1% in the MEV database that were previously reported as rare disease variants in ClinVar and the Human Gene Mutation Database (HGMD). Furthermore, the MEV...
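The kind of check described in this abstract (variants above 1% minor allele frequency in a regional database that were previously reported as rare disease variants) can be illustrated with a minimal Python sketch. The `Variant` class, its field names, the function name, and the toy records are all hypothetical and are not taken from the MEV database, ClinVar, or HGMD; only the 1% threshold comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical record layout, for illustration only.
    variant_id: str
    population_maf: float        # minor allele frequency in the regional database
    reported_rare_disease: bool  # previously reported as a rare disease variant


def flag_common_reported_variants(variants, maf_threshold=0.01):
    """Return variants reported as rare-disease variants whose regional
    minor allele frequency exceeds the threshold (1% by default)."""
    return [
        v for v in variants
        if v.reported_rare_disease and v.population_maf > maf_threshold
    ]


# Toy data (invented values, not real variants).
toy = [
    Variant("var_A", 0.032, True),   # common and reported: flagged
    Variant("var_B", 0.004, True),   # reported but rare: not flagged
    Variant("var_C", 0.150, False),  # common but not reported: not flagged
]

flagged = flag_common_reported_variants(toy)
print([v.variant_id for v in flagged])  # prints ['var_A']
```

A variant flagged this way is only a candidate for re-evaluation; the sketch says nothing about how the authors actually assessed clinical significance.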
Global cardiology science & practice, 2014
Genetic disorders are not equally distributed across the Arab region. While a number of disorders have a wide geographical presence encompassing 10 or more Arab countries, almost half of these disorders occur in a single Arab country or population. Nearly one-third of the genetic disorders in Arabs result from congenital malformations and chromosomal abnormalities, which are also responsible for a significant proportion of neonatal and perinatal deaths in Arab populations. Strikingly, about two-thirds of these diseases in Arab patients follow an autosomal recessive mode of inheritance. High fertility rates, together with the high rate of consanguineous marriage generally observed in Arab populations, tend to increase the rates of genetic and congenital abnormalities. Many of the nearly 500 genes studied in Arab people revealed striking spectra of heterogeneity, with many novel and rare mutations causing large arrays of clinical outcomes. In this review we provided an overvie...
BMC Genomics, 2015
Background: The populations of the Arabian Peninsula remain the least represented in public genetic databases, both in terms of single nucleotide variants and of larger genomic mutations. We present the first high-resolution copy number variation (CNV) map for a Gulf Arab population, using a hybrid approach that integrates array genotyping intensity data and next-generation sequencing reads to call CNVs in the Qatari population. Methods: CNVs were detected in 97 unrelated Qatari individuals by running two calling algorithms on each of two primary datasets: high-resolution genotyping (Illumina Omni 2.5M) and high depth whole-genome sequencing (Illumina PE 100bp). The four call-sets were integrated to identify high confidence CNV regions, which were subsequently annotated for putative functional effect and compared to public databases of CNVs in other populations. The availability of genome sequence was leveraged to identify tagging SNPs in high LD with common deletions in this population, enabling their imputation from genotyping experiments in the future. Results: Genotyping intensities and genome sequencing data from 97 Qataris were analyzed with four different algorithms and integrated to discover 16,660 high confidence CNV regions (CNVRs) in the total population, affecting ~28 Mb in the median Qatari genome. Up to 40 % of all CNVs affected genes, including novel CNVs affecting Mendelian disease genes, segregating at different frequencies in the 3 major Qatari subpopulations, including those with Bedouin, Persian/South Asian, and African ancestry. Consistent with high consanguinity levels in the Bedouin subpopulation, we found an increased burden for homozygous deletions in this group. 
In comparison to known CNVs in the comprehensive Database of Genomic Variants, we found that 5 % of all CNVRs in Qataris were completely novel, with an enrichment of CNVs affecting several known chromosomal disorder loci and genes known to regulate sugar metabolism and type 2 diabetes in the Qatari cohort. Finally, we leveraged the availability of genome sequence to find suitable tagging SNPs for common deletions in this population. We combine four independently generated datasets from 97 individuals to study CNVs for the first time at high-resolution in a Gulf Arab population.
Scientific Reports
Consanguineous populations of the Arabian Peninsula have been underrepresented in global efforts that catalogue human exome variability. We sequenced 291 whole exomes of unrelated, healthy native Arab individuals from Kuwait to a median coverage of 45X and characterised 170,508 single-nucleotide variants (SNVs), of which 21.7% were 'personal'. Up to 12% of the SNVs were novel and 36% were population-specific. Half of the SNVs were rare and 54% were missense variants. The study complemented the Greater Middle East Variome by reporting many additional Arabian exome variants. The study corroborated Kuwaiti population genetic substructures previously derived using genome-wide genotype data and illustrated the genetic relatedness among Kuwaiti population subgroups and Middle Eastern, European and Ashkenazi Jewish populations. The study mapped 112 rare and frequent functional variants relating to pharmacogenomics and disorders (recessive and common) to the phenotypic characteristics of the Arab population. Comparative allele frequency data and carrier distributions of known Arab mutations for 23 disorders seen among Arabs, of putative OMIM-listed causal mutations for 12 disorders observed among Arabs but not yet characterized for genetic basis in Arabs, and of 17 additional putative mutations for disorders characterized for genetic basis in Arab populations are presented for testing in future Arab studies. Characterising the patterns of genetic variation within and among human populations is crucial to understand human evolutionary history and the genetic basis of disorders 1. Many global genome-wide genotyping and whole-genome sequencing studies (such as the Human Genome Diversity Project 1,2, the 1000 Genomes Project (1KGP) 3,4 and the UK10K project 5) have been undertaken to catalogue genetic variation.
Coding exonic regions, though estimated to encompass only approximately 1-2% of the genome, harbour the most functional variation and contain almost 85% of the known disease-causing pathogenic variants 6,7; therefore, several global whole-exome sequencing studies have also been undertaken 8-10. Such large-scale global projects have revealed that human populations harbour a large amount of rare variation which exhibits little homology between diverged populations 3,9-17. Mendelian and rare genetic disorders are often associated with rare coding variants. Likewise, common markers associated with complex disorders can also vary in frequency across populations 18. Considering that population-specific differences in allele frequencies are of clinical importance, it is fundamental to catalogue them in diverse ethnic populations 19. The Arabian Peninsula holds a strategic place in the early human migration routes out of Africa 20-22. The Peninsula was instrumental in shaping the genetic map of current global populations because the first Eurasian populations were established here 23. The ancestry of indigenous Arabs can largely be traced back to ancient lineages of the Arabian Peninsula 23,24. The Arab population is heterogeneous but well-structured 3,24-26. For example, the Kuwaiti population comprises three genetic subgroups, namely KWP (largely of West Asian ancestry
Nature genetics, 2016
The Greater Middle East (GME) has been a central hub of human migration and population admixture. The tradition of consanguinity, variably practiced in the Persian Gulf region, North Africa, and Central Asia, has resulted in an elevated burden of recessive disease. Here we generated a whole-exome GME variome from 1,111 unrelated subjects. We detected substantial diversity and admixture in continental and subregional populations, corresponding to several ancient founder populations with little evidence of bottlenecks. Measured consanguinity rates were an order of magnitude above those in other sampled populations, and the GME population exhibited an increased burden of runs of homozygosity (ROHs) but showed no evidence for reduced burden of deleterious variation due to classically theorized 'genetic purging'. Applying this database to unsolved recessive conditions in the GME population reduced the number of potential disease-causing variants by four- to sevenfold. These resul...
Scientific Reports, 2019
Whole Genome Sequencing (WGS) provides an in-depth description of genome variation. In the era of large-scale population genome projects, the assembly of ethnic-specific genomes combined with mapping human reference genomes of underrepresented populations has improved the understanding of human diversity and disease associations. In this study, for the first time, whole genome sequences of two nationals of the United Arab Emirates (UAE) at >27X coverage are reported. The two Emirati individuals were predominantly of Central/South Asian ancestry. An in-house customized pipeline using BWA and Picard, followed by the GATK tools, was used to map the raw data from the whole genome sequences of both individuals. A total of 3,994,521 variants (3,350,574 Single Nucleotide Polymorphisms (SNPs) and 643,947 indels) were identified for the first individual, the UAE S001 sample. A similar number of variants, 4,031,580 (3,373,501 SNPs and 658,079 indels), were identified for UAE S002. Variants that are...
Nucleic Acids Research, 2006
The Arabs comprise a genetically heterogeneous group that resulted from the admixture of different populations throughout history. They share many common characteristics responsible for a considerable proportion of perinatal and neonatal mortalities. To this end, the Centre for Arab Genomic Studies (CAGS) launched a pilot project to construct the 'Catalogue of Transmission Genetics in Arabs' (CTGA) database for genetic disorders in Arabs. Information in CTGA is drawn from published research and mined hospital records. The database offers web-based basic and advanced search approaches. In either case, the final search result is a detailed HTML record that includes text-, URL- and graphic-based fields. At present, CTGA hosts entries for 692 phenotypes and 235 related genes described in Arab individuals. Of these, 213 phenotypic descriptions and 22 related genes were observed in the Arab population of the United Arab Emirates (UAE). These results emphasize the role of CTGA as an essential tool to promote scientific research on genetic disorders in the region. The priority of CTGA is to provide timely information on the occurrence of genetic disorders in Arab individuals. It is anticipated that data from Arab countries other than the UAE will be exhaustively searched and incorporated in CTGA (http://www.cags.org.ae).
Frontiers in Genetics, 2020
Whole genome sequences (WGS) of four nationals of the United Arab Emirates (UAE) at an average coverage of 33X have been completed and described. The selection of suitable subpopulation representatives was informed by a preceding comprehensive population structure analysis. Representatives were chosen based on their central location within the subpopulation on a principal component analysis (PCA) and the degree to which they were admixed. Novel genomic variations among the different subgroups of the UAE population are reported here. Specifically, the WGS analysis identified 4,161,067-4,798,806 variants in the four individual samples, of which approximately 80% were single nucleotide polymorphisms (SNPs) and 20% were insertions or deletions (indels). On average, 2.75% of the variants were novel according to dbSNP (build 151). This is the first report of structural variants (SVs) from WGS data from UAE nationals. There were 15,677-20,339 called SVs, of which around 13.5% were novel. The four samples shared 1,399,178 variants, and each had distinct variants as follows: 1,085,524 (for the individual denoted as UAE S011), 1,228,559 (UAE S012), 791,072 (UAE S013), and 906,818 (UAE S014). These results show a previously unappreciated population diversity in the region. The synergy of WGS and genotype array data was demonstrated through variant annotation of the former using 2.3 million allele frequencies for the local population derived from the latter technology platform. This novel approach of combining the breadth and depth of array and WGS technologies has guided the choice of population genetic representatives and provides complementary, regionalized allele frequency annotation to new genomes comprising millions of loci.
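The shared and per-sample distinct variant counts quoted in this abstract amount to a set comparison across samples. A minimal Python sketch of that bookkeeping is given below. It assumes each sample's variants are available as a set of hashable keys, and it reads "distinct" as "present in exactly one sample"; both are assumptions, as are the function name and the toy sample IDs and keys. The abstract does not state how the authors defined or computed these counts.

```python
from collections import Counter


def shared_and_private(samples):
    """samples maps a sample ID to a set of variant keys.

    Returns (shared, private): the variants present in every sample,
    and, per sample, the variants present in that sample only.
    """
    # Count in how many samples each variant key appears.
    counts = Counter(v for variants in samples.values() for v in variants)
    n_samples = len(samples)
    shared = {v for v, c in counts.items() if c == n_samples}
    private = {
        sample_id: {v for v in variants if counts[v] == 1}
        for sample_id, variants in samples.items()
    }
    return shared, private


# Toy data: variant keys are placeholder strings, not real coordinates.
toy_samples = {
    "sample_1": {"a", "b", "c"},
    "sample_2": {"a", "b", "d"},
    "sample_3": {"a", "e"},
}

shared, private = shared_and_private(toy_samples)
print(sorted(shared))                                 # prints ['a']
print({k: sorted(v) for k, v in private.items()})
```

In the toy data, variant "b" occurs in two of three samples, so it is counted as neither shared nor private. With real call sets, the variant key would typically need a normalized representation (for example chromosome, position, reference and alternate alleles) so that the same variant compares equal across samples.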
American Journal of Human Genetics, 2010