inproceedings by Pierre Larmande
articles by Pierre Larmande
BACKGROUND: In recent years, a large amount of "-omics" data has been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses, and searching for and assembling these data is time-consuming. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers.

A library of 29,482 T-DNA enhancer trap lines has been generated in rice cv. Nipponbare. The regions flanking the T-DNA left border from the first 12,707 primary transformants were systematically isolated by adapter anchor PCR and sequenced. A survey of the 7480 genomic sequences larger than 30 bp (average length 250 bp), representing 56.4% of the total readable sequences and matching the rice bacterial artificial chromosome/phage artificial chromosome (BAC/PAC) sequences assembled in pseudomolecules, allowed the assignment of 6645 (88.8%) T-DNA insertion sites to at least one position in the rice genome of cv. Nipponbare. T-DNA insertions appear to be rather randomly distributed over the 12 rice chromosomes, with a slightly higher insertion frequency in chromosomes 1, 2, 3 and 6. The distribution of 723 independent T-DNA insertions along the chromosome 1 pseudomolecule did not differ significantly from that of the predicted coding sequences, exhibiting a lower insertion density around the centromere region and a higher density in the subtelomeric regions, where the gene density is higher. Further establishment of density graphs of T-DNA inserts along the recently released 12 rice pseudomolecules confirmed this non-uniform chromosome distribution. T-DNA appeared less prone to hot spots and cold spots of integration when compared with those revealed by a concurrent assignment of the Tos17 retrotransposon flanking sequences deposited in the National Center for Biotechnology Information (NCBI). T-DNA inserts rarely integrated into repetitive sequences. Based on the predicted gene annotation of chromosome 1, preferential insertion within the first 250 bp from the putative ATG start codon has been observed.
Using 4 kb of sequences surrounding the insertion points, 62% of the sequences showed significant similarity to genes encoding known proteins (E-value < 1.00e-05). To illustrate the in silico reverse genetic approach, identification of 83 T-DNA insertions within genes coding for transcription factors (TF) is presented. Based both on the estimated number of members of several large TF gene families (e.g. Myb, WRKY, HD-ZIP, Zinc-finger) and on the frequency of insertions in chromosome 1 predicted genes, we could extrapolate that 7-10% of the rice gene complement is already tagged by T-DNA insertion in the 6116 independent transformant population. This large resource is of high significance for studies unravelling gene function in rice and cereals, notably through in silico reverse genetics.

To organize data resulting from the phenotypic characterization of a library of 30,000 T-DNA enhancer trap (ET) insertion lines of rice (Oryza sativa L. cv. Nipponbare), we developed the Oryza Tag Line (OTL) database. The OTL structure facilitates forward genetic searches for specific phenotypes, putatively resulting from gene disruption, and/or for GUSA or GFP reporter gene expression patterns, reflecting ET-mediated endogenous gene detection. In the latest version, OTL gathers the detailed morpho-physiological alterations observed during field evaluation and specific screens in a first set of 13,928 lines. Detection of GUS or GFP activity in specific organs/tissues in a subset of the library is also provided. OTL can be searched by trait ontology category, organ and/or developmental stage, keywords, expression of the reporter gene in a specific organ/tissue, or line identification number. OTL now contains the description of 9721 mutant phenotypic traits observed in 2636 lines and 1234 GUS or GFP expression patterns. Each insertion line is documented through generic passport data including production records, seed stocks and FST information. Of the 13,928 lines, 8004 and 6101 are characterized by at least one T-DNA and one Tos17 FST, respectively, which OTL links to the rice genome browser OryGenesDB.