{"@attributes":{"version":"2.0"},"channel":{"title":"PyVideo.org - genomes","link":"https:\/\/pyvideo.org\/","description":{},"lastBuildDate":"Fri, 11 Mar 2011 00:00:00 +0000","item":{"title":"Rapid Python used on Big Data to Discover Human Genetic Variation","link":"https:\/\/pyvideo.org\/pycon-us-2011\/pycon-2011--rapid-python-used-on-big-data-to-disc.html","description":"<h3>Description<\/h3><p>Rapid Python used on Big Data to Discover Human Genetic Variation<\/p>\n<p>Presented by Deniz Kural<\/p>\n<p>Advances in genome sequencing have enabled large-scale projects such as\nthe 1000 Genomes Project to sequence genomes across diverse populations\naround the world, resulting in very large data sets. I use Python for\nrapid development of algorithms for processing &amp; analyzing genomes and\ndiscovering thousands of new variants, including &quot;Mobile Elements&quot; that\ncopy &amp; paste themselves across the genome.<\/p>\n<p>Abstract<\/p>\n<p>Recent advances in high-throughput sequencing now enable accurate\nsequencing of human genomes at low cost &amp; high speed. This technology is\nnow used to initiate projects involving large-scale sequencing of many\ngenomes. The 1000 Genomes project aims to sequence 2500 genomes across\n27 world populations, and has initially completed its Pilot phase. The\naim of the project is to discover &amp; characterize novel variants. These\nvariants enable association studies that investigate the link between\ngenomic variation &amp; phenotypes, including disease.<\/p>\n<p>&quot;Structural Variants&quot; are a\nheterogeneous class of larger variants, such as inversions, duplications,\ndeletions, and various kinds of insertions.<\/p>\n<p>I use Python for rapid development of algorithms to process, analyze,\nand annotate very large data sets. In particular, I focus on Mobile\nElements, pieces of DNA that copy &amp; paste themselves across the genome. 
These\nelements constitute roughly half of the genome, whereas protein-coding\ngenes account for roughly 1.5% of the genome.<\/p>\n<p>I will discuss distributed computing, genomics, and big data within the\ncontext of Python.<\/p>\n","pubDate":"Fri, 11 Mar 2011 00:00:00 +0000","guid":"tag:pyvideo.org,2011-03-11:\/pycon-us-2011\/pycon-2011--rapid-python-used-on-big-data-to-disc.html","category":["PyCon US 2011","bigdata","casestudy","dna","genomes","pycon","pycon2011"]}}}