{"title":"Pierre-Edouard Guerin","subtitle":"Bioinformatician","link":[{"@attributes":{"rel":"self","type":"application\/atom+xml","href":"https:\/\/guerinpe.com\/atom.xml"}},{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com"}}],"generator":"Zola","updated":"2025-10-07T00:00:00+00:00","id":"https:\/\/guerinpe.com\/atom.xml","entry":[{"title":"Research Tax Credit","published":"2025-10-07T00:00:00+00:00","updated":"2025-10-07T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/research-tax-credit\/"}},"id":"https:\/\/guerinpe.com\/articles\/research-tax-credit\/","summary":"<p>The <strong>Research Activities Tax Credit<\/strong> is a tax credit that incentivizes private companies to increase their Research and Development (R&amp;D). Within my company, I have been tasked with writing the <strong>Research Tax Credit (CIR) justification report<\/strong> for France. Here is the method for writing such a report.<\/p>"},{"title":"Turing Complete: From Logical Gates to CPU Architecture","published":"2025-09-17T00:00:00+00:00","updated":"2025-09-17T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/turing-complete-from-logical-gates-to-cpu-architecture\/"}},"id":"https:\/\/guerinpe.com\/articles\/turing-complete-from-logical-gates-to-cpu-architecture\/","summary":"<p>In 2021, LevelHead published <a href=\"https:\/\/turingcomplete.game\">Turing Complete<\/a>, a game about computer science. My friend <strong>Christophe Georgescu<\/strong> recommended it to me. Unfortunately, I took his advice and now I cannot stop playing it! The game challenges you to design an entire computer from scratch. 
You start with basic logic gates, then move on to components, memory, CPU architecture, and finally assembly programming. The game is neat and presents all these concepts in a playful and intuitive way.<\/p>"},{"title":"Portfolio","published":"2025-08-12T00:00:00+00:00","updated":"2025-08-12T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/portfolio\/"}},"id":"https:\/\/guerinpe.com\/portfolio\/","content":"<div class=\"all_proj\">\n<div role=\"button\" class=\"project\" onclick=\"location.href='https:\/\/github.com\/Grelot\/genbar2'\">\n<img align=\"center\" width=\"48rem\" height=\"48rem\" src=\"cnrs.png\">\n<div class=\"title\"> Genbar 2 <\/div>\n<div class=\"description\"> I wrote software to identify genetic boundaries between populations from individual spatial coordinates and genetic variants data.<\/div>\n<div class=\"skills\"> (C, C++, htslib)<\/div>\n<hr width=\"31%\"> \n<div class=\"duration\"> May 2019 - Feb 2020 <\/div>\n<\/div>\n<div role=\"button\" class=\"project\" onclick=\"location.href='https:\/\/gitlab.mbb.univ-montp2.fr\/edna'\">\n<img align=\"center\" width=\"128rem\" height=\"48rem\" src=\"monaco.png\">\n<div class=\"title\">Metabarcoding<\/div>\n<div class=\"description\">I wrote several workflows to process metabarcoding environmental DNA data from the MONACO MARINE WORLDWIDE EXPEDITION.<\/div>\n<div class=\"skills\"> (obitools, vsearch, swarm, cutadapt, bash, python, singularity, snakemake)<\/div>\n<hr width=\"31%\"> \n<div class=\"duration\"> Jan 2019 - Jul 2021<\/div>\n<\/div>\n<div role=\"button\" class=\"project\" onclick=\"location.href='https:\/\/github.com\/lmathon\/eDNA--benchmark_pipelines'\">\n<img align=\"center\" width=\"114rem\" height=\"48rem\" src=\"spygen.png\">\n<div class=\"title\">Benchmarking of metabarcoding workflows<\/div>\n<div class=\"description\">We 
tested combinations of software tools to improve the performance of metabarcoding data processing.<\/div>\n<div class=\"skills\"> (obitools, vsearch, qiime, python, singularity, Univa Grid Engine)<\/div>\n<hr width=\"31%\"> \n<div class=\"duration\"> Mar 2018 - Jul 2021<\/div>\n<\/div>\n<div role=\"button\" class=\"project\" onclick=\"location.href='https:\/\/gitlab.mbb.univ-montp2.fr\/reservebenefit\/snakemake_stacks2'\">\n<img align=\"center\" width=\"48rem\" height=\"48rem\" src=\"reservebenefit.png\">\n<div class=\"title\">Population genomics<\/div>\n<div class=\"description\">I genotyped 1200 individuals belonging to 3 fish species. I worked with restriction enzyme-based data such as RAD-seq.<\/div>\n<div class=\"skills\"> (illumina paired-end, STACKS, vcftools, bedtools, bwa, python, snakemake, singularity, Univa Grid Engine, bash)<\/div>\n<hr width=\"31%\"> \n<div class=\"duration\"> Jun 2017 - Oct 2019 <\/div>\n<\/div>\n<div role=\"button\" class=\"project\" onclick=\"location.href='https:\/\/github.com\/Grelot\/aker--beetGenomeEnvironmentAssociation'\">\n<img align=\"center\" width=\"48rem\" height=\"48rem\" src=\"florimond.png\">\n<div class=\"title\">Beets genome metrics<\/div>\n<div class=\"description\">I calculated metrics (nucleotide diversity, Tajima's D) on the beet genome from 14,409 random single nucleotide polymorphisms (SNPs) among 299 accessions of cultivated beets.<\/div>\n<div class=\"skills\"> (R, python, genpop)<\/div>\n<hr width=\"31%\"> \n<div class=\"duration\"> May 2017 - May 2018 <\/div>\n<\/div>\n<div role=\"button\" class=\"project\" onclick=\"location.href='https:\/\/github.com\/Grelot\/global_fish_genetic_diversity'\">\n<img align=\"center\" width=\"224rem\" height=\"48rem\" src=\"ephe.png\">\n<div class=\"title\">First Global Map of Fish Genetic Diversity<\/div>\n<div class=\"description\"> I built a database containing over 50,000 DNA sequences representing 3,815 species of marine fish and 1,611 species of freshwater fish. 
I estimated the average genetic diversity at different geographical scales.<\/div>\n<div class=\"skills\"> (julia, python, R, singularity, MUSCLE, UGENE, geonames, BOLD, shiny)<\/div>\n<hr width=\"31%\"> \n<div class=\"duration\"> May 2017 - Feb 2020 <\/div>\n<\/div>\n<div role=\"button\" class=\"project\" onclick=\"location.href='https:\/\/gitlab.mbb.univ-montp2.fr\/seaconnect'\">\n<img align=\"center\" width=\"136rem\" height=\"48rem\" src=\"total.png\">\n<div class=\"title\">Landscape genomics<\/div>\n<div class=\"description\">I processed low-coverage RAD-seq data from 1800 individuals belonging to 2 fish species collected from all over the Mediterranean sea.<\/div>\n<div class=\"skills\"> (dDocent, freebayes, vcftools, samtools, trimmomatic, bash, python, singularity, snakemake)<\/div>\n<hr width=\"31%\"> \n<div class=\"duration\"> Feb 2017 - Mar 2021 <\/div>\n<\/div>\n<div role=\"button\" class=\"project\" onclick=\"location.href='https:\/\/gitlab.mbb.univ-montp2.fr\/reservebenefit\/genomic_resources_for_med_fishes'\">\n<img align=\"center\" width=\"48rem\" height=\"48rem\" src=\"reservebenefit.png\">\n<div class=\"title\">Genome assembly<\/div>\n<div class=\"description\">I carried out the sequencing and assembly of the nuclear and mitochondrial genomes of 3 new fish species.<\/div>\n<div class=\"skills\"> (illumina paired-end, mate-pair, 10X genomics chromium, Abyss, Platanus, QUAST, SLURM, bash)<\/div>\n<hr width=\"31%\"> \n<div class=\"duration\"> Jan 2017 - Nov 2019 <\/div>\n<\/div>\n<div role=\"button\" class=\"project\" onclick=\"location.href='https:\/\/github.com\/Grelot\/diabetesGenetics--COAT'\">\n<img align=\"center\" width=\"136rem\" height=\"32rem\" src=\"inserm.png\">\n<div class=\"title\">Report low-quality regions of coding sequences from genome sequencing data<\/div>\n<div class=\"description\">I developed software with a graphical interface that detects human genomic variations with low coverage. 
<\/div>\n<div class=\"skills\"> (Illumina paired-end, samtools, bedtools, variation annotation, python, mySQL, qt4)<\/div>\n<hr width=\"31%\"> \n<div class=\"duration\"> Jan 2016 - Jul 2016 <\/div>\n<\/div>\n<div role=\"button\" class=\"project\" onclick=\"location.href='https:\/\/www.dsimb.inserm.fr\/orion\/'\">\n<img align=\"center\" width=\"136rem\" height=\"32rem\" src=\"inserm.png\">\n<div class=\"title\">Optimization of a method of fold recognition in protein structure<\/div>\n<div class=\"description\">I added a new algorithm to predict the 3D structure of proteins at atomic resolution <\/div>\n<div class=\"skills\"> (PDB, pymol, C, C++, python, R, html, css )<\/div>\n<hr width=\"31%\"> \n<div class=\"duration\"> Feb 2015 - Jun 2015 <\/div>\n<\/div>\n<\/div>\n"},{"title":"Activity","published":"2025-06-25T00:00:00+00:00","updated":"2025-06-25T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/activity\/"}},"id":"https:\/\/guerinpe.com\/activity\/","content":"<h2 id=\"bioinformatics\">Bioinformatics?<\/h2>\n<p>Using computers and data to answer biological questions. The bioinformatician collects, analyses, and interprets the large amounts of data generated by life science research or clinical studies through computing.<\/p>\n<p>It is multidisciplinary, bringing together biologists, computer scientists, mathematicians, statisticians, and physicists. 
By turning vast raw data into useful information, this approach helps researchers better understand complex systems and diseases, make diagnoses, and develop new medicines.<\/p>\n<h2 id=\"missions\">Missions<\/h2>\n<p>At Florimond Desprez Group, the missions of the bioinformatician are the following:<\/p>\n<h3 id=\"implement-bioinformatics-solution\">Implement bioinformatics solutions<\/h3>\n<ul>\n<li>Conduct analyses using different kinds of omics data on our different crops.<\/li>\n<li>Develop and optimise tools, workflows and software to support plant and molecular breeders in the\nanalysis and interpretation of their data.<\/li>\n<li>Design and maintain databases, data storage systems, and data management processes to ensure the integrity and accessibility of biological data.<\/li>\n<\/ul>\n<h3 id=\"bring-expertise\">Bring expertise<\/h3>\n<ul>\n<li>Communicate research findings, insights, and recommendations to stakeholders.<\/li>\n<li>Participate in external and internal projects related to your field of expertise.<\/li>\n<li>Keep yourself up-to-date with findings from academia and other sources through literature\nsearch and participation in conferences.<\/li>\n<\/ul>\n<h3 id=\"operational-excellence\">Operational Excellence<\/h3>\n<ul>\n<li>Deliver timely results to meet project deadlines.<\/li>\n<li>Provide reporting on project contributions.<\/li>\n<li>Develop robust and documented software components to ensure traceability and reproducibility of your analyses.<\/li>\n<\/ul>\n"},{"title":"Publications","published":"2025-06-25T00:00:00+00:00","updated":"2025-06-25T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/publications\/"}},"id":"https:\/\/guerinpe.com\/publications\/","content":"<ul>\n<li>Publications on <a href=\"https:\/\/orcid.org\/0000-0001-7909-3729\">ORCiD<\/a><\/li>\n<li>Publications on <a 
href=\"https:\/\/www.researchgate.net\/profile\/Pierre-Edouard-Guerin\">ResearchGate<\/a><\/li>\n<li>Publications on <a href=\"https:\/\/scholar.google.com\/citations?hl=en&amp;user=hj1ClrsAAAAJ\">Google Scholar<\/a><\/li>\n<\/ul>\n<h2 id=\"scientific-articles\">Scientific articles<\/h2>\n<blockquote>\n<p><strong>The distribution of coastal fish eDNA sequences in the Anthropocene<\/strong><\/p>\n<p><em>Laetitia Mathon, Virginie Marques, St\u00e9phanie Manel, Camille Albouy, Marco Andrello, Emilie Boulanger, Julie Deter, R\u00e9gis Hocd\u00e9, Fabien Leprieur, Tom B Letessier, Nicolas Loiseau, Eva Maire, Alice Valentini, Laurent Vigliola, Florian Baletaud, Sandra Bessudo, Tony Dejean, Nadia Faure, Pierre\u2010edouard Guerin, Meret Jucker, Jean\u2010baptiste Juhel, Kadarusman, Andrea Polanco F, Laurent Pouyaud, Dario Schw\u00f6rer, Kirsten F Thompson, Marc Troussellier, Hagi Yulia Sugeha, Laure Velez, Xiaowei Zhang, Wenjun Zhong, Lo\u00efc Pellissier, David Mouillot<\/em><\/p>\n<p>Global Ecology and Biogeography. 2023 August 08. DOI: <a href=\"https:\/\/doi.org\/10.1111\/geb.13698\">10.1111\/geb.13698<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Contrasting influence of seascape, space and marine reserves on genomic variation in multiple species<\/strong><\/p>\n<p><em>Laura Benestan, Nicolas Loiseau, Pierre\u2010edouard Gu\u00e9rin, Angel P\u00e9rez\u2010Ruzafa, Aitor Forcada, Esther Arcas, Philippe Lenfant, Sandra Mallol, Raquel Go\u00f1i, Laure Velez, David Mouillot, Oscar Puebla, St\u00e9phanie Manel<\/em><\/p>\n<p>Ecography. 2023 January 01. 
DOI: <a href=\"https:\/\/doi.org\/10.1111\/ecog.06127\">10.1111\/ecog.06127<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Evaluating bioinformatics pipelines for population\u2010level inference using environmental DNA<\/strong><\/p>\n<p><em>Bastien Mac\u00e9, R\u00e9gis Hocd\u00e9, Virginie Marques, Pierre\u2010Edouard Guerin, Alice Valentini, V\u00e9ronique Arnal, Lo\u00efc Pellissier, St\u00e9phanie Manel<\/em><\/p>\n<p>Environmental DNA. 2022 February 08. DOI: <a href=\"https:\/\/doi.org\/10.1002\/edn3.269\">10.1002\/edn3.269<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Cross-ocean patterns and processes in fish biodiversity on coral reefs through the lens of eDNA metabarcoding<\/strong><\/p>\n<p><em>Laetitia Mathon, Virginie Marques, David Mouillot, Camille Albouy, Marco Andrello, Florian Baletaud, Giomar H Borrero-P\u00e9rez, Tony Dejean, Graham J Edgar, Jonathan Grondin, Pierre-Edouard Guerin, R\u00e9gis Hocd\u00e9, Jean-Baptiste Juhel, Kadarusman, Eva Maire, Gael Mariani, Matthew Mclean, Andrea Polanco F, Laurent Pouyaud, Rick D Stuart-Smith, Hagi Yulia Sugeha, Alice Valentini, Laurent Vigliola, Indra B Vimono, Lo\u00efc Pellissier, St\u00e9phanie Manel<\/em><\/p>\n<p>Proceedings of the Royal Society B. 2022 April 20. DOI: <a href=\"https:\/\/doi.org\/10.1098\/rspb.2022.0162\">10.1098\/rspb.2022.0162<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Climate differently influences the genomic patterns of two sympatric marine fish species<\/strong><\/p>\n<p><em>Emilie Boulanger, Laura Benestan, Pierre\u2010Edouard Guerin, Alicia Dalongeville, David Mouillot, St\u00e9phanie Manel<\/em><\/p>\n<p>Journal of Animal Ecology. 2021 October 30. 
DOI: <a href=\"https:\/\/doi.org\/10.1111\/1365-2656.13623\">10.1111\/1365-2656.13623<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Use of environmental DNA in assessment of fish functional and phylogenetic diversity<\/strong><\/p>\n<p><em>Virginie Marques, Paul Castagne, Andr\u00e9a Polanco Fernandez, Giomar Helena Borrero-Perez, Regis Hocde,  Pierre-Edouard Guerin, Jean-Baptiste Juhel, Laure Velez, Nicolas Loiseau, Tom Bech Letessier, Sandra Bessudo, Alice Valentini, Tony Dejean, David Mouillot, Lo\u00efc Pellissier, S\u00e9bastien Vill\u00e9ger<\/em><\/p>\n<p>Conservation Biology. 2021 July 18. DOI: <a href=\"https:\/\/doi.org\/10.1111\/cobi.13802\">10.1111\/cobi.13802<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Restricted dispersal in a sea of gene flow<\/strong><\/p>\n<p><em>Laura Benestan, Nicolas Loiseau, Pierre-Edouard Guerin, Katarina Fietz, Elena Trofimenko, Siren R\u00fchs, Willi Rath, Arne Biastoch, Angel Perez-Ruzafa, Pilar Baixauli, Aitor Forcada, Philippe  Lenfant, Sandra Mallol, Rachel Goni, Laure Velez, Marc H\u00f6ppner, Stuart Kininmonth, David Mouillot, Oscar Puebla, Stephanie Manel<\/em><\/p>\n<p>Proceedings of the Royal Society B. 2021 May 18. DOI: <a href=\"https:\/\/doi.org\/10.1098\/rspb.2021.0458\">10.1098\/rspb.2021.0458<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Benchmarking bioinformatic tools for fast and accurate eDNA metabarcoding species identification<\/strong><\/p>\n<p><em>Laetitia Mathon, Alice Valentini, Pierre-Edouard Guerin, Eric Normandeau, Cyril Noel, Cl\u00e9ment Lionnet, Emilie Boulanger, Wilfried Thuiller, Louis Bernatchez, David Mouillot, Tony Dejean, Stephanie Manel<\/em><\/p>\n<p>Molecular Ecology Resources. 2021 May 17. 
DOI: <a href=\"https:\/\/doi.org\/10.1111\/1755-0998.13430\">10.1111\/1755-0998.13430<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Blind assessment of vertebrate taxonomic diversity across spatial scales by clustering environmental DNA metabarcoding sequences<\/strong><\/p>\n<p><em>Virginie Marques, Pierre\u2010Edouard Guerin, Mathieu Rocle, Alice Valentini, Stephanie Manel, David Mouillot, Tony Dejean<\/em><\/p>\n<p>Ecography. 2020 Aug 04. DOI: <a href=\"https:\/\/doi.org\/10.1111\/ecog.05049\">10.1111\/ecog.05049<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>New genomic resources for three exploited Mediterranean fishes<\/strong><\/p>\n<p><em>Katharina Fietz, Elena Trofimenko, Pierre-Edouard Guerin, Veronique Arnal, Montserrat Torres-Oliva, Stephane Lobreaux, Angel Perez-Ruzafa, Stephanie Manel, Oscar Puebla<\/em><\/p>\n<p>Genomics. 2020 July 03. DOI: <a href=\"https:\/\/doi.org\/10.1016\/j.ygeno.2020.06.041\">10.1016\/j.ygeno.2020.06.041<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Global determinants of freshwater and marine fish genetic diversity<\/strong><\/p>\n<p><em>Stephanie Manel, Pierre-Edouard Guerin, David Mouillot, Simon Blanchet, Laure Velez, Camille Albouy &amp; Loic Pellissier<\/em><\/p>\n<p>Nature communications. 2020 Feb 10. DOI: <a href=\"https:\/\/doi.org\/10.1038\/s41467-020-14409-7\">10.1038\/s41467-020-14409-7<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Predicting genotype environmental range from genome\u2013environment associations<\/strong><\/p>\n<p><em>Stephanie Manel, Marco Andrello, Karine Henry, Daphne Verdelet, Aude Darracq, Pierre\u2010Edouard Guerin, Bruno Desprez, Pierre Devaux<\/em><\/p>\n<p>Molecular Ecology. 2018 May 17. 
DOI: <a href=\"https:\/\/doi.org\/10.1111\/mec.14723\">10.1111\/mec.14723<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>ORION: a web server for protein fold recognition and structure prediction using evolutionary hybrid profiles<\/strong><\/p>\n<p><em>Yassine Ghouzam, Guillaume Postic, Pierre-Edouard Guerin, Alexandre G. de Brevern &amp; Jean-Christophe Gelly<\/em><\/p>\n<p>Scientific Reports. 2016 Jun 20. DOI: <a href=\"https:\/\/doi.org\/10.1038\/srep28268\">10.1038\/srep28268<\/a><\/p>\n<\/blockquote>\n<h1 id=\"software\">Software<\/h1>\n<h2 id=\"florimond-desprez-group-2021\">Florimond Desprez Group (2021-)<\/h2>\n<ul>\n<li>My software developed for the company is <strong>confidential<\/strong>. Please contact me for more information.<\/li>\n<\/ul>\n<h2 id=\"public-research-2015-2021\">Public research (2015-2021)<\/h2>\n<ul>\n<li><strong><a href=\"https:\/\/shiny.cefe.cnrs.fr\/wfgd\/\">WFGD<\/a><\/strong> (main contributor): interactive world map of fish genetic diversity.<\/li>\n<li><strong><a href=\"https:\/\/github.com\/Grelot\/anvage\">ANVAGE<\/a><\/strong> (main contributor): ANnotation Variants GEnome is a Python toolkit for routine operations such as detecting synonymous genetic variants from VCF, GFF3 and FASTA genome files.<\/li>\n<li><strong><a href=\"https:\/\/github.com\/Grelot\/rgeogendiv\">Rgeogendiv<\/a><\/strong> (main contributor): R package for downloading, preparing and aligning georeferenced DNA sequences from GenBank to calculate genetic diversity at different geographical scales.<\/li>\n<li><strong><a href=\"https:\/\/gitlab.mbb.univ-montp2.fr\/edna\">Workflow to process environmental DNA sequencing data<\/a><\/strong> (main contributor): this workflow is open-source and was co-developed by the CEFE (lead stakeholder) and the company SPYGEN (data and tests), including interactions with IFREMER, ETH Zurich and the marine explorations of Monaco.<\/li>\n<li><strong><a 
href=\"https:\/\/gitlab.mbb.univ-montp2.fr\/reservebenefit\/snakemake_stacks2\">Workflow to genotype reduced genome sequencing data<\/a><\/strong> (main contributor): this workflow processed over 3000 fish genomes in the context of the European project RESERVEBENEFIT in collaboration with Helmholtz-Zentrum f\u00fcr Ozeanforschung Kiel and Instituto Espa\u00f1ol de Oceanograf\u00eda.<\/li>\n<li><strong><a href=\"https:\/\/github.com\/Grelot\/genbar2\">Genbar2<\/a><\/strong> (main contributor): identify genetic boundaries between populations using individual spatial coordinates and genetic variants.<\/li>\n<li><strong><a href=\"https:\/\/pypi.org\/project\/demort\/\">DEMORT<\/a><\/strong> (main contributor): a DEmultiplexing MOnitoring Report Tool<\/li>\n<li><strong><a href=\"https:\/\/sourceforge.net\/projects\/exam-exome-analysis-and-mining\/\">EXAM<\/a><\/strong> (contributor): a whole-exome sequencing analysis tool and its graphical interface<\/li>\n<li><strong><a href=\"https:\/\/www.dsimb.inserm.fr\/orion\/\">ORION<\/a><\/strong> (contributor): a sensitive method for protein template detection<\/li>\n<\/ul>\n"},{"title":"Curriculum Vitae","published":"2025-06-25T00:00:00+00:00","updated":"2025-06-25T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/resume\/"}},"id":"https:\/\/guerinpe.com\/resume\/","content":"<h2 id=\"summary\">Summary<\/h2>\n<p>I am a bioinformatician who designs tools to help breeders improve the genetics of industrial crops. 
I am interested in any technology related to the processing, visualization or mining of <a href=\"\/activity\/\">biodata<\/a>.<\/p>\n<h2 id=\"contact\">Contact<\/h2>\n<ul>\n<li><a href=\"mailto:pierre-edouard.guerin@groupefd.com\">E-mail contact<\/a><\/li>\n<li><a href=\"https:\/\/www.linkedin.com\/in\/pierre-edouard-guerin\/\">Linkedin profile<\/a><\/li>\n<li><a href=\"https:\/\/www.kaggle.com\/pierreedouardguerin\">Kaggle portfolio<\/a><\/li>\n<\/ul>\n<h2 id=\"experience\">Experience<\/h2>\n<div class=\"timeline-entry\">\n  <div class=\"timeline-date\">July 2021 to present<\/div>\n  <div class=\"timeline-item\">\n    <h3>Bioinformatician<\/h3>\n    <h5>Florimond Desprez Group Research Unit<\/h5>\n    <h6>Cappelle-en-P\u00e9v\u00e8le, France<\/h6>\n    <p>\n      Developed and optimised software, workflows, and databases to support plant breeders in the analysis and interpretation of omics data.\n    <\/p>\n      <mark>Python<\/mark>\n      <mark>SQL<\/mark>\n      <mark>R<\/mark>\n      <mark>C++<\/mark>\n      <mark>Docker<\/mark>\n      <mark>Nextflow<\/mark>\n      <mark>Azure<\/mark>\n      <mark>Databricks<\/mark>\n      <mark>Elastic Stack<\/mark>\n  <\/div>\n  <div class=\"timeline-date\">January 2017 to July 2021<\/div>\n  <div class=\"timeline-item\">\n    <h3>Bioinformatician<\/h3>\n    <h5>CNRS Research Unit 5175, Center of Functional Ecology and Evolution<\/h5>\n    <h6>Montpellier, France<\/h6>\n    <p>\n      Processed DNA sequence data for population genetics studies of various fish species, enabling marine biologists to assess their conservation status.\n    <\/p>\n      <mark>Python<\/mark>\n      <mark>R<\/mark>\n      <mark>C++<\/mark>\n      <mark>Singularity<\/mark>\n      <mark>Snakemake<\/mark>\n      <mark>Jupyter<\/mark>\n      <mark>SGE<\/mark>\n      <mark>SLURM<\/mark>\n  <\/div>\n  <div class=\"timeline-date\">January to August 2016<\/div>\n  <div class=\"timeline-item\">\n    <h3>Bioinformatics Internship<\/h3>\n    <h5>INSERM Unit 
S598, Genetics of Diabetes<\/h5>\n    <h6>Paris, France<\/h6>\n    <p>\n      Developed a module and user interface for a medical software to detect and annotate rare variants in human genome sequences related to diabetes.\n    <\/p>\n      <mark>SQL<\/mark>\n      <mark>Python<\/mark>\n      <mark>Qt<\/mark>\n  <\/div>\n  <div class=\"timeline-date\">January to June 2015<\/div>\n  <div class=\"timeline-item\">\n    <h3>Bioinformatics Internship<\/h3>\n    <h5>INSERM Unit S1134, Macromolecular Biology<\/h5>\n    <h6>Paris, France<\/h6>\n    <p>\n      Optimised an algorithm to predict protein 3D structure at atomic resolution.\n    <\/p>\n      <mark>C<\/mark>\n      <mark>Python<\/mark>\n      <mark>PyMOL<\/mark>\n  <\/div>\n<\/div>\n<h2 id=\"training\">Training<\/h2>\n<ul>\n<li>2025: Generative AI, <a href=\"https:\/\/www.interactivity.nl\/\">Interactivity<\/a><\/li>\n<li>2024: Elastic Stack, <a href=\"https:\/\/www.ambient-it.net\/\">AMBIENT-IT<\/a><\/li>\n<li>2023: Databricks, <a href=\"https:\/\/www.databricks.com\/learn\/training\/home\">Databricks Academy<\/a><\/li>\n<li>2020: Landscape Genetics, <a href=\"https:\/\/sites.google.com\/site\/landscapegeneticscourse\/\">IDGC<\/a><\/li>\n<li>2019: ReproHackathon, <a href=\"https:\/\/github.com\/IFB-ElixirFr\/ReproHackathon\">CIRAD<\/a><\/li>\n<li>2018: Advanced Statistics for Data Sciences, <a href=\"https:\/\/www.ephe.psl.eu\/formations-conferences\">EPHE<\/a><\/li>\n<li>2017: High-Performance Computing, <a href=\"https:\/\/isem-evolution.fr\/plateau\/plateau-montpellier-bioinformatique-et-biodiversite\/\">MBB<\/a><\/li>\n<\/ul>\n<h2 id=\"education\">Education<\/h2>\n<ul>\n<li><strong>2016: M.Sc. 
in Bioinformatics<\/strong>, <a href=\"https:\/\/u-paris.fr\/\">Paris University<\/a>, France<\/li>\n<li><strong>2014: Licence in Bioinformatics<\/strong>, <a href=\"https:\/\/u-paris.fr\/\">Paris University<\/a>, France<\/li>\n<\/ul>\n"},{"title":"How to Manage a Project?","published":"2025-06-18T00:00:00+00:00","updated":"2025-06-18T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/how-to-manage-project\/"}},"id":"https:\/\/guerinpe.com\/articles\/how-to-manage-project\/","summary":"<p>In any company, every task is part of a project. I am responsible for managing multiple projects each year. I have to present deliverables to stakeholders, meet deadlines, allocate man-days and coordinate everyone\u2019s actions. This is meticulous work that requires a strong methodology.<\/p>"},{"title":"Gantt Chart Excel for Project Management","published":"2025-05-12T00:00:00+00:00","updated":"2025-05-12T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/gantt-chart-excel-for-project-management\/"}},"id":"https:\/\/guerinpe.com\/articles\/gantt-chart-excel-for-project-management\/","summary":"<p>A Gantt chart visualizes the scheduling of tasks over the project timeline. 
My colleague <strong>Daphne Verdelet<\/strong> shared with me the Excel template her team at <a href=\"https:\/\/www.inrae.fr\/\">INRAE<\/a> uses to keep track of their tasks over the year.<\/p>"},{"title":"Chado: the GMOD Database Schema","published":"2025-01-18T00:00:00+00:00","updated":"2025-01-18T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/chado-the-gmod-database-schema\/"}},"id":"https:\/\/guerinpe.com\/articles\/chado-the-gmod-database-schema\/","summary":"<p><strong>Chado<\/strong> is a relational database schema that underlies many GMOD installations. It is capable of representing many of the general classes of data frequently encountered in modern biology such as sequence, sequence comparisons, phenotypes, genotypes, ontologies, publications, and phylogeny. It has been designed to handle complex representations of biological knowledge and is the most sophisticated relational schema currently available in molecular biology.<\/p>"},{"title":"Error Messages with a CLI","published":"2024-11-28T00:00:00+00:00","updated":"2024-11-28T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/error-messages\/"}},"id":"https:\/\/guerinpe.com\/articles\/error-messages\/","summary":"<p>I am an anxious person. So <strong>error messages<\/strong> always make my heart beat faster. Fortunately, following the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Pareto_principle\">Pareto Principle<\/a>, 80% of error messages are mild while 20% are the really tough ones. The point is to solve the first kind as quickly and effortlessly as possible. 
To do so, let users solve the issue themselves with clear messages and hints (for errors related to input files or parameters). A clear presentation of the context and a precise location of the error in the code will save the developer a lot of useless and tedious work. The time saved on easy errors, just by having better messages, can then be reallocated to the second kind of errors, the troublemakers.<\/p>"},{"title":"Generative AI: Integrate openAI API with Python","published":"2024-05-27T00:00:00+00:00","updated":"2024-05-27T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/generative-ai\/"}},"id":"https:\/\/guerinpe.com\/articles\/generative-ai\/","summary":"<p>I was fortunate to attend the course by <strong>Sven Warris<\/strong> on software tools for integrating genAI into your own work and applications. The course is aimed at data scientists and bioinformaticians.<\/p>"},{"title":"Elastic Stack","published":"2024-04-23T00:00:00+00:00","updated":"2024-04-23T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/elastic-stack\/"}},"id":"https:\/\/guerinpe.com\/articles\/elastic-stack\/","summary":"<p>strange not working<\/p>"},{"title":"BioGPT: Generative Pre-trained Transformer for Biomedical Text Mining","published":"2024-01-18T00:00:00+00:00","updated":"2024-01-18T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          
"},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/biogpt-generative-pre-trained-transformer-for-biomedical-text-mining\/"}},"id":"https:\/\/guerinpe.com\/articles\/biogpt-generative-pre-trained-transformer-for-biomedical-text-mining\/","content":"<p>Following the recent breakthrough in Natural Language Processing models, everyone is now talking about Transformers and ChatGPT. I wondered whether some models had been trained for use in biology or bioinformatics. I discovered a research project called <strong>BioGPT<\/strong> developed by Microsoft. This article both describes BioGPT and summarises the recent history of Transformers and NLP.<\/p>\n<h2 id=\"bibliography-is-hard-work-and-is-getting-worse\">Bibliography is Hard Work (and is getting worse)<\/h2>\n<p>The <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/\">PubMed database<\/a> contains more than 36 million publications. This number grows exponentially every year.<\/p>\n<p>There is no universal glossary, <em>e.g.<\/em> a drug can have multiple synonyms depending on the field, the time frame, the country, etc. So a keyword search may be incomplete and miss some of the related articles. In addition, articles are sorted by field or keywords, which makes any cross-disciplinary research tedious.<\/p>\n<h2 id=\"natural-language-processing\">Natural Language Processing<\/h2>\n<p>NLP comprises methods that give a computer the ability to understand and produce human language. 
<strong>BioGPT<\/strong> is a model trained on the content of all available articles in PubMed.<\/p>\n<p><strong>BioGPT<\/strong> can:<\/p>\n<ul>\n<li>Answer questions using PubMed materials<\/li>\n<li>Find relations between drugs using PubMed materials<\/li>\n<\/ul>\n<h3 id=\"history-of-nlp\">History of NLP<\/h3>\n<ul>\n<li>In the early 20th century, <strong>Ferdinand de Saussure<\/strong> establishes human language as a logical structure.<\/li>\n<li>In 1952, <strong>Alan Hodgkin<\/strong> and <strong>Andrew Huxley<\/strong> describe how neurons generate and propagate electrical signals (Nobel Prize in Physiology or Medicine, 1963).<\/li>\n<\/ul>\n<p>These events inspire the idea of <strong>a machine able to speak and understand human language<\/strong>.<\/p>\n<p><strong>Alan Turing<\/strong> proposed that if a machine can chat with you and you cannot tell it apart from a human, then the machine can be considered capable of thinking.<\/p>\n<h3 id=\"boolean-request\">Boolean request<\/h3>\n<p>Boolean logic yields a <code>TRUE<\/code> or <code>FALSE<\/code> answer. The documents returned are those matching an <em>exact term<\/em>. To go further, you can combine terms with the operators <code>OR<\/code>, <code>AND<\/code>, <code>NEAR<\/code>.<\/p>\n<p>Example: in a search engine, you type \"puppy\" AND \"kitten\" to find any document that contains both \"puppy\" and \"kitten\".<\/p>\n<h3 id=\"terms-frequency\">Terms frequency<\/h3>\n<div class=\"encart_inside_article\">\n<p><strong>Term frequency<\/strong>: how often a word appears in a document.<\/p>\n<p><strong>Inverse Document frequency<\/strong>: how rare a word is across multiple documents.<\/p>\n<\/div>\n<p>Combining term frequency with inverse document frequency identifies important words that are both frequent in a document and rare across the dataset. 
It also ignores common words like \u201cthe\u201d or \u201cis\u201d.<\/p>\n<p>Example: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Tag_cloud\">Tag cloud<\/a>.<\/p>\n<h3 id=\"co-occurence-matrix\">Co-occurrence matrix<\/h3>\n<p>A <strong>co-occurrence matrix<\/strong> is a table counting the number of times two words appear together (here, in the same sentence). The aim is to find words that are significantly associated. A set of words can then be reduced to a set of vectors of associated words, so that vectors are processed instead of individual words.<\/p>\n<p>Example:<\/p>\n<blockquote>\n<p>Apples are green and red.<\/p>\n<p>Red apples are sweet.<\/p>\n<p>Green oranges are sour.<\/p>\n<\/blockquote>\n<table><thead><tr><th>-<\/th><th><code>apples<\/code><\/th><th><code>green<\/code><\/th><th><code>red<\/code><\/th><th><code>sweet<\/code><\/th><th><code>oranges<\/code><\/th><th><code>sour<\/code><\/th><\/tr><\/thead><tbody>\n<tr><td><code>apples<\/code><\/td><td>2<\/td><td>1<\/td><td>2<\/td><td>1<\/td><td>0<\/td><td>0<\/td><\/tr>\n<tr><td><code>green<\/code><\/td><td>1<\/td><td>2<\/td><td>1<\/td><td>0<\/td><td>1<\/td><td>1<\/td><\/tr>\n<tr><td><code>red<\/code><\/td><td>2<\/td><td>1<\/td><td>2<\/td><td>1<\/td><td>0<\/td><td>0<\/td><\/tr>\n<tr><td><code>sweet<\/code><\/td><td>1<\/td><td>0<\/td><td>1<\/td><td>1<\/td><td>0<\/td><td>0<\/td><\/tr>\n<tr><td><code>oranges<\/code><\/td><td>0<\/td><td>1<\/td><td>0<\/td><td>0<\/td><td>1<\/td><td>1<\/td><\/tr>\n<tr><td><code>sour<\/code><\/td><td>0<\/td><td>1<\/td><td>0<\/td><td>0<\/td><td>1<\/td><td>1<\/td><\/tr>\n<\/tbody><\/table>\n<p>In this text, <code>red<\/code> and <code>apples<\/code> are associated while <code>sour<\/code> and <code>apples<\/code> are not.<\/p>\n<h3 id=\"word2vec-word-to-vector\">Word2vec: Word to Vector<\/h3>\n<p>A word is represented as a vector. The computer learns from the document without supervision (~no human involved). 
The words surrounding a given word form its <em>context<\/em> or <em>bag of words<\/em>, and the model learns to predict the word given its context. Together, all the predictions produce a vector space that captures <em>semantic relationships<\/em>.<\/p>\n<p>Example:<\/p>\n<p>The computer learns from a document that the words <code>queen<\/code> and <code>king<\/code> are linked, and that the relationship between <code>woman<\/code> and <code>man<\/code> is aligned with the relationship between <code>queen<\/code> and <code>king<\/code>. By adding and subtracting the vectors, the computer can deduce the association between <code>queen<\/code> and <code>woman<\/code>, and between <code>king<\/code> and <code>man<\/code>.<\/p>\n<p><img src=\"https:\/\/guerinpe.com\/articles\/biogpt-generative-pre-trained-transformer-for-biomedical-text-mining\/.\/sementic_relationship.jpg\" alt=\"semantic relationship\" \/><\/p>\n<h2 id=\"neural-network\">Neural Network<\/h2>\n<p>Inspired by the human brain, a neural network is a set of interconnected neurons organised into layers (input, hidden and output). Each connection between neurons has a weight. The network is trained on documents to adjust the weights. Weights are self-corrected to minimise the difference between the prediction and the expected result.<\/p>\n<p>Example: Word2vec is based on a neural network of three layers.<\/p>\n<p><img src=\"https:\/\/guerinpe.com\/articles\/biogpt-generative-pre-trained-transformer-for-biomedical-text-mining\/.\/neural_network.jpg\" alt=\"neural network\" \/><\/p>\n<h2 id=\"perceptron\">Perceptron<\/h2>\n<p>Designed in the late 1950s by Frank Rosenblatt with the idea of creating a robot in the image of a human, the <strong>perceptron<\/strong> is a neural network of two layers: input and output.<\/p>\n<p><strong>\ud83d\udca1 Note:<\/strong> Without the hidden layer, a neural network is a <strong>logistic regression model<\/strong> (Berkson et al. 
1944) with a Boolean output.<\/p>\n<p><img src=\"https:\/\/guerinpe.com\/articles\/biogpt-generative-pre-trained-transformer-for-biomedical-text-mining\/.\/perceptron.jpg\" alt=\"perceptron\" \/><\/p>\n<h2 id=\"transformer\">Transformer<\/h2>\n<p>In a simplified view, the <strong>transformer<\/strong> is a neural network with four layers. It adds a new intermediate layer called <em>attention<\/em> (Vaswani et al. 2017). The encoder layer transforms the input document into <em>vectors of words<\/em>. The decoder layer produces the predicted output words from these vectors. <em>Attention<\/em> assigns weights that link the encoder and decoder layers. The transformer is the state-of-the-art method in natural language processing. <strong>BERT<\/strong> and <strong>GPT<\/strong> are based on the transformer.<\/p>\n<p>Example: <a href=\"https:\/\/www.deepl.com\/en\/translator\">DeepL<\/a> for document translation.<\/p>\n<h3 id=\"bidirectionnel-encoder-representations-from-transformers-bert\">Bidirectional Encoder Representations from Transformers (BERT)<\/h3>\n<p>Transformer architecture developed by Google in 2018. In France, Martin et al. developed CamemBERT, a French-language version, in 2019. The context is processed in both directions, from left to right and from right to left. Pre-trained: the weights of the neural network are already set by training on a huge dataset, so the model can be fine-tuned for a specific use.<\/p>\n<p>Example: <a href=\"https:\/\/www.google.com\/\">Google Search<\/a> when you type a full sentence.<\/p>\n<pre><code>Can google find my missing sock?\n<\/code><\/pre>\n<p>People also ask<\/p>\n<pre><code>How do I find my lost sock?\nWhen socks disappear where do they go?\nWhy can&#x27;t I find my socks?\n<\/code><\/pre>\n<h3 id=\"generative-pretrained-transformer-gpt\">Generative Pretrained Transformer (GPT)<\/h3>\n<p>Transformer architecture developed by OpenAI (first version in 2018, GPT-3 in 2020). Contrary to BERT, the document is processed with unidirectional attention. In addition, the pre-training is different between GPT and BERT. 
While BERT is trained to predict a missing word given a context, GPT is trained to generate complete, comprehensible sentences given a context.<\/p>\n<p>Example: <a href=\"https:\/\/chatgpt.com\">ChatGPT<\/a><\/p>\n<h2 id=\"biogpt\">BioGPT<\/h2>\n<ul>\n<li><strong>BioGPT<\/strong> is a <strong>GPT<\/strong> model trained on PubMed publications.<\/li>\n<li>Microsoft released the source code of BioGPT: <a href=\"https:\/\/github.com\/microsoft\/BioGPT\">GitHub source code of BioGPT<\/a><\/li>\n<li><strong><a href=\"https:\/\/huggingface.co\/docs\/transformers\/en\/model_doc\/biogpt\">\ud83e\udd17 Hugging Face<\/a><\/strong> hosts an online library of natural language processing models, including transformers such as BioGPT.<\/li>\n<\/ul>\n<h2 id=\"conclusion\">Conclusion<\/h2>\n<p>Transformer models are not trained on human words but on <em>tokens<\/em>. Tokens are the chunks of information extracted from the original document. Once encoded, they are not human readable: they are a purely <em>black box<\/em> machine abstraction. Computer language comprehension relies on maths, not on real text comprehension. On the other hand, we could say human language comprehension relies on neurotransmitter concentration. The point is that the computer and the human are able to understand each other.<\/p>\n<p>BioGPT is based on a biomedical dataset, so it is not relevant for other fields yet. For instance, if I ask BioGPT for information about the sugar beet, it will only give information related to human nutrition. BioGPT is specialized in biomedical data and nothing else. Can we imagine a similar AI based on crop science literature in the future? 
The hardest part will be collecting and formatting relevant crop science text material.<\/p>\n<h2 id=\"references\">References<\/h2>\n<blockquote>\n<p><strong>BioGPT: generative pre-trained transformer for biomedical text generation and mining<\/strong><\/p>\n<p><em>Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon, Tie-Yan Liu<\/em><\/p>\n<p>Briefings in Bioinformatics, November 2022. DOI: <a href=\"https:\/\/doi.org\/10.1093\/bib\/bbac409\">10.1093\/bib\/bbac409<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Hugging Face<\/strong><\/p>\n<p><em>Jain, S.M.<\/em><\/p>\n<p>Introduction to Transformers for NLP. Apress, Berkeley, CA, October 2022. DOI: <a href=\"https:\/\/doi.org\/10.1007\/978-1-4842-8844-3_4\">10.1007\/978-1-4842-8844-3_4<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>CamemBERT: a Tasty French Language Model<\/strong><\/p>\n<p><em>Louis Martin, Benjamin Muller, Pedro Javier Ortiz Su\u00e1rez, Yoann Dupont, Laurent Romary, \u00c9ric Villemonte de la Clergerie, Djam\u00e9 Seddah, Beno\u00eet Sagot<\/em><\/p>\n<p>Submitted in November 2019. DOI: <a href=\"https:\/\/doi.org\/10.48550\/arXiv.1911.03894\">10.48550\/arXiv.1911.03894<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Attention is All you Need<\/strong><\/p>\n<p><em>Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, Illia Polosukhin<\/em><\/p>\n<p>Advances in Neural Information Processing Systems, December 2017. DOI: <a href=\"https:\/\/doi.org\/10.48550\/arXiv.1706.03762\">10.48550\/arXiv.1706.03762<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Word2Vec<\/strong><\/p>\n<p><em>Kenneth Ward Church<\/em><\/p>\n<p>Natural Language Engineering, January 2017. DOI: <a href=\"https:\/\/doi.org\/10.1017\/S1351324916000334\">10.1017\/S1351324916000334<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>The Perceptron: A Model for Brain Functioning<\/strong><\/p>\n<p><em>H.D. 
Block<\/em><\/p>\n<p>Reviews of Modern Physics, January 1962. DOI: <a href=\"https:\/\/doi.org\/10.1103\/RevModPhys.34.123\">10.1103\/RevModPhys.34.123<\/a><\/p>\n<\/blockquote>\n<blockquote>\n<p><strong>Propagation of electrical signals along giant nerve fibres<\/strong><\/p>\n<p><em>Alan Lloyd Hodgkin and Andrew Fielding Huxley<\/em><\/p>\n<p>Proceedings of the Royal Society B, October 1952. DOI: <a href=\"https:\/\/doi.org\/10.1098\/rspb.1952.0054\">10.1098\/rspb.1952.0054<\/a><\/p>\n<\/blockquote>\n"},{"title":"What is a Bad Programmer?","published":"2023-12-23T00:00:00+00:00","updated":"2023-12-23T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/what-is-a-bad-programmer\/"}},"id":"https:\/\/guerinpe.com\/articles\/what-is-a-bad-programmer\/","summary":"<p><strong>Coding<\/strong> refers to the act of writing instructions in a programming language. In that sense, the coder is also a programmer. The program must serve a clear purpose: automating a routine task, performing repetitive or complex calculations, managing data, and so on. 
In bioinformatics, programs are designed to transform raw biological data into formats that can be used by other specialists for further analysis.<\/p>"},{"title":"Project Report","published":"2023-09-30T00:00:00+00:00","updated":"2023-09-30T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/project-report\/"}},"id":"https:\/\/guerinpe.com\/articles\/project-report\/","summary":"<p>intro<\/p>"},{"title":"Object Relational Mapper","published":"2023-05-12T00:00:00+00:00","updated":"2023-05-12T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/object-relational-mapper\/"}},"id":"https:\/\/guerinpe.com\/articles\/object-relational-mapper\/","summary":"<p>In bioinformatics, we are familiar with Python and SQL. Python is a general coding language used to problem solve and create programs. SQL is a specific language used to store and retrieve specific information from databases. Then you have SQLAlchemy which is the bridge between both languages. 
This presents a method to use Python to create databases and facilitates the communication between Python programs and the databases.<\/p>\n<p><strong>Object Relational Mapper (ORM)<\/strong> tool translates Python classes to tables on relational databases and automatically converts function calls to SQL statements.<\/p>"},{"title":"RNA-Seq","published":"2022-08-22T00:00:00+00:00","updated":"2022-08-22T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/rna-seq\/"}},"id":"https:\/\/guerinpe.com\/articles\/rna-seq\/","summary":"<p>One of the approaches to study host\u2013pathogen interactions at the molecular level is <strong>RNA sequencing (RNA-Seq)<\/strong>. This technology provides access to gene expression profiles in various conditions such as viral infection or environmental stress. I had the opportunity to process RNA-Seq data from sugar beet plants to investigate transcriptional responses to virus yellows disease, a condition caused by a complex of aphid-transmissible viruses. 
Here, I describe transcriptomics and the methods used for RNA-Seq data analysis.<\/p>"},{"title":"Phasing","published":"2022-07-18T00:00:00+00:00","updated":"2022-07-18T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/phasing\/"}},"id":"https:\/\/guerinpe.com\/articles\/phasing\/","summary":"<p>Given a genotype, <strong>phasing<\/strong> is the attribution of each allele to the corresponding parent.<\/p>"},{"title":"Multi Dimensional Scaling","published":"2022-04-03T00:00:00+00:00","updated":"2022-04-03T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/multi-dimensional-scaling\/"}},"id":"https:\/\/guerinpe.com\/articles\/multi-dimensional-scaling\/","summary":"<p>Multi Dimensional Scaling (MDS) is very similar to <a href=\"\/articles\/principal-component-analysis\/\">Principal Component Analysis (PCA)<\/a>, except that instead of converting correlations into a 2D graph, it converts distances among the samples into a 2D graph.<\/p>"},{"title":"Analytical Skill","published":"2022-02-15T00:00:00+00:00","updated":"2022-02-15T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/analytical-skill\/"}},"id":"https:\/\/guerinpe.com\/articles\/analytical-skill\/","summary":"<p>When I was in junior high school, I had the opportunity to take History and Geography classes from <strong>Nicolas Ben Fredj<\/strong>. He taught us the method (\"the way\" in Greek) to compose and articulate our thoughts and knowledge. 
I recently watched a <a href=\"https:\/\/www.youtube.com\/watch?v=IxPyIlTKZnI\">MOOC he recorded for Lyc\u00e9e fran\u00e7ais international Louis-Massignon<\/a>. The ability to think analytically and to express ideas is useful. In science, whether in academia or industry, we need to compare different solutions or hypotheses objectively and communicate our analysis to our peers in order to make the best decisions.<\/p>"},{"title":"DNA chip","published":"2021-12-26T00:00:00+00:00","updated":"2021-12-26T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/dna-chip\/"}},"id":"https:\/\/guerinpe.com\/articles\/dna-chip\/","summary":"<p>A <strong>DNA chip<\/strong> or DNA microarray is a collection of microscopic DNA spots, commonly representing single genes, arrayed on a solid surface by covalent attachment to chemically suitable matrices.<\/p>"},{"title":"Making a Good Resume","published":"2021-08-12T00:00:00+00:00","updated":"2021-08-12T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/making-a-good-resume\/"}},"id":"https:\/\/guerinpe.com\/articles\/making-a-good-resume\/","summary":"<p>The resume is the first thing the recruiter will read about you. I revised my resume multiple times already and continue to sharpen it. 
Here I describe the structure and basic tips for making a competitive resume.<\/p>"},{"title":"Nextflow: A Tool for Reproducible and Scalable Bioinformatics Workflows","published":"2021-07-03T00:00:00+00:00","updated":"2021-07-03T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/nextflow-reproducible-scalable-workflow\/"}},"id":"https:\/\/guerinpe.com\/articles\/nextflow-reproducible-scalable-workflow\/","summary":"<p>In my work, I use <strong>Nextflow<\/strong> to design and implement bioinformatics pipelines that automate complex data analyses.<\/p>"},{"title":"The Entity Relationship Model","published":"2021-01-23T00:00:00+00:00","updated":"2021-01-23T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/entity-relationship-model\/"}},"id":"https:\/\/guerinpe.com\/articles\/entity-relationship-model\/","summary":"<p>The <strong>Entity Relationship Model<\/strong> is a model for designing a database. Data is represented as <strong>entities<\/strong> with attributes. Entities are linked by <strong>relations<\/strong>.<\/p>"},{"title":"A Global Ocean Atlas of Eukaryotic Genes","published":"2020-09-19T00:00:00+00:00","updated":"2020-09-19T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/a-global-ocean-atlas-of-eukaryotic-genes\/"}},"id":"https:\/\/guerinpe.com\/articles\/a-global-ocean-atlas-of-eukaryotic-genes\/","summary":"<p>Single-celled microeukaryotes and small multicellular zooplankton account for most of the planktonic biomass in the world\u2019s ocean. 
Seawater samples were collected from the global ocean during the Tara Oceans expedition to generate a global ocean reference catalog of genes from the sampled RNA of planktonic eukaryotes and to explore the expression patterns of communities at different microscopic scales with respect to biogeography and environmental conditions.<\/p>"},{"title":"Genome Assembly","published":"2020-07-03T00:00:00+00:00","updated":"2020-07-03T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/genome-assembly\/"}},"id":"https:\/\/guerinpe.com\/articles\/genome-assembly\/","summary":"<p>I successfully assembled the draft genomes of three Mediterranean fish. Their genome sequences have been published recently in this <a href=\"https:\/\/doi.org\/10.1016\/j.ygeno.2020.06.041\">paper<\/a>. Here, I share the lessons I learned from assembling a genome using both long and short sequencing reads.<\/p>"},{"title":"OBITools: Processing DNA Metabarcoding Data","published":"2020-06-27T00:00:00+00:00","updated":"2020-06-27T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/obitools-processing-dna-metabarcoding-data\/"}},"id":"https:\/\/guerinpe.com\/articles\/obitools-processing-dna-metabarcoding-data\/","summary":"<p>The <a href=\"https:\/\/doi.org\/10.1111\/1755-0998.12428\">OBITools<\/a> software is a set of tools specifically designed for processing Next Generation Sequencing data in a DNA metabarcoding context, taking into account taxonomic information.&hellip;\n<\/p>\n"},{"title":"Style and Writing in the 21st Century","published":"2020-04-02T00:00:00+00:00","updated":"2020-04-02T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          
"},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/style-and-writing-in-the-21-century\/"}},"id":"https:\/\/guerinpe.com\/articles\/style-and-writing-in-the-21-century\/","summary":"<p>I watched Steven Pinker\u2019s talk on YouTube about writing style in the 21st century, and it was brilliant. Does writing well matter in an age of instant communication? According to Pinker, yes it does, but we must minimize the flaws of the post-modern style. Writing has never been natural for humans. Some difficulties are as old as writting itself, some are new, and some have only recently been revealed thanks to cognitive science and linguistics. Here, I have summarized the advices from Pinker on how to write in a better style.<\/p>"},{"title":"Global Determinants of Fish Genetic Diversity","published":"2020-03-04T00:00:00+00:00","updated":"2020-03-04T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/global-determinants-of-fish-genetic-diversity\/"}},"id":"https:\/\/guerinpe.com\/articles\/global-determinants-of-fish-genetic-diversity\/","summary":"<p>At the beginning of 2020, our team* has published the first global map of the genetic diversity of marine and freshwater fishes. This is an important instrument for the preservation of species. This first map is published in the journal <a href=\"https:\/\/doi.org\/10.1038\/s41467-020-14409-7\">Nature Communications<\/a>. As I have done all the bioinformatics analysis, I thought I could present this work from my point of view as a computer scientist. 
Indeed, this work required the collaboration of a wide range of professions: ecologists, oceanographers, statisticians and geneticists.<\/p>"},{"title":"Phenomics","published":"2019-11-25T00:00:00+00:00","updated":"2019-11-25T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/phenomics\/"}},"id":"https:\/\/guerinpe.com\/articles\/phenomics\/","summary":"<p>As part of the <a href=\"https:\/\/www.madics.fr\/\">MaDICS working group<\/a> on data science and plant phenotyping, I had the opportunity to visit the Montpellier Plant Phenotyping Platform. During this workshop, I tested <strong>Phenomenal<\/strong>, a pipeline designed to reconstruct the 3D architecture of maize plants grown in greenhouse conditions. I discovered a new omics called \"Phenomics\" and how high-throughput bioimaging is used to generate phenome information at large scale.<\/p>"},{"title":"Landscape Genomics","published":"2019-11-14T00:00:00+00:00","updated":"2019-11-14T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/landscape-genomics\/"}},"id":"https:\/\/guerinpe.com\/articles\/landscape-genomics\/","summary":"<p>Environmental conservation issues have created an urgent need to better understand and describe species and populations on Earth. Recently, progress in sequencing technologies made it possible to refine this understanding through genomics. Understanding and describing populations of living organisms in a given environment by exploiting sequencing data is the ultimate goal of landscape genomics. 
This article is an introduction to this field.<\/p>"},{"title":"Metabarcoding","published":"2019-03-09T00:00:00+00:00","updated":"2019-03-09T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/metabarcoding\/"}},"id":"https:\/\/guerinpe.com\/articles\/metabarcoding\/","summary":"<p>One of the most promising genetic techniques for improving biodiversity assessments is the <em>metabarcoding<\/em> of environmental DNA (eDNA). I reviewed the state of the art of available methods and developed several workflows to process and manage metabarcoding data from the Monaco Marine Scientific Exploration.<\/p>"},{"title":"Step-by-Step Guide to Creating R packages","published":"2019-02-27T00:00:00+00:00","updated":"2019-02-27T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/step-by-step-guide-to-creating-r-package\/"}},"id":"https:\/\/guerinpe.com\/articles\/step-by-step-guide-to-creating-r-package\/","summary":"<p>In R, the fundamental unit of shareable code is the package. 
A package bundles together code, data, documentation, and tests, and is easy to share with others.&hellip;\n<\/p>\n"},{"title":"Pascal's Triangle","published":"2019-01-15T00:00:00+00:00","updated":"2019-01-15T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/pascal-triangle\/"}},"id":"https:\/\/guerinpe.com\/articles\/pascal-triangle\/","summary":"<p>Draft<\/p>"},{"title":"Team Working with Git","published":"2018-11-26T00:00:00+00:00","updated":"2018-11-26T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/team-working-with-git\/"}},"id":"https:\/\/guerinpe.com\/articles\/team-working-with-git\/","summary":"<p>I've been using <code>Git<\/code> for years, alone or in collaboration with my team. Over time, I developed habits that I present in this article.<\/p>"},{"title":"Virtual Environments, Containers and Scientific Reproducibility","published":"2018-01-19T00:00:00+00:00","updated":"2018-01-19T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/virtual-environment-reproducibility\/"}},"id":"https:\/\/guerinpe.com\/articles\/virtual-environment-reproducibility\/","summary":"<p>Containers provide an easy-to-use, secure, and reproducible environment for scientists to transport their studies between computational resources. 
As more communities use Docker or Singularity, we are entering an age of high reproducibility, or at least replicability, in which the use of containers becomes mandatory for any study.<\/p>"},{"title":"SGE User Commands","published":"2017-09-17T00:00:00+00:00","updated":"2017-09-17T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/sge-user-commands\/"}},"id":"https:\/\/guerinpe.com\/articles\/sge-user-commands\/","summary":"<p>Using a cluster environment is similar to using Linux environments for your job submission. The difference is that you need to specify the required resources beforehand. The cluster is controlled by SGE (Sun Grid Engine), software that organizes the queues and resources. This sort of scheduling system is necessary when limited computational resources are shared by many. Here I show how to use Sun Grid Engine for job submission, monitoring and troubleshooting.<\/p>"},{"title":"Disk Management on Linux","published":"2017-07-27T00:00:00+00:00","updated":"2017-07-27T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/disk-management-on-linux\/"}},"id":"https:\/\/guerinpe.com\/articles\/disk-management-on-linux\/","summary":"<p>In this article, I summarise some bash commands to check or manage partitions on a Linux system. 
The commands would check what partitions there are on each disk and other details like the total size, used up space and file system etc.<\/p>"},{"title":"Linux System Administration Basics","published":"2017-04-24T00:00:00+00:00","updated":"2017-04-24T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/linux-administration-system-basics\/"}},"id":"https:\/\/guerinpe.com\/articles\/linux-administration-system-basics\/","summary":"<p>Since Linux is a multi-user operating system, several people may be logged in and actively working on a given machine at the same time. Security-wise, it is never a good idea to allow users to share the credentials of the same account. In fact, best practices dictate the use of as many user accounts as people needing access to the machine.<\/p>"},{"title":"Principal Component Analysis","published":"2017-01-02T00:00:00+00:00","updated":"2017-01-02T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/principal-component-analysis\/"}},"id":"https:\/\/guerinpe.com\/articles\/principal-component-analysis\/","summary":"<p>blabla PCA<\/p>"},{"title":"Online Reputation","published":"2016-09-03T00:00:00+00:00","updated":"2016-09-03T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/online-reputation\/"}},"id":"https:\/\/guerinpe.com\/articles\/online-reputation\/","summary":"<p>When I was on an internship, the director, <strong>Alexandre De Brevern<\/strong>, told me that I was invisible online. 
It is not reassuring because people cannot verify who you are, what you have done or whether you are trustworthy. Indeed, having a professional presence on the internet is important, both for yourself and for representing your institution, laboratory or company.<\/p>"},{"title":"Genomics","published":"2016-05-23T00:00:00+00:00","updated":"2016-05-23T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/genomics\/"}},"id":"https:\/\/guerinpe.com\/articles\/genomics\/","summary":"<p><strong>Genomics<\/strong> is the study of the whole DNA sequences of an organism. <strong>Next-generation Sequencing (NGS)<\/strong>, particularly Illumina sequencing, has transformed DNA sequencing by allowing millions of fragments to be sequenced in parallel, dramatically increasing speed and reducing cost. High-throughput DNA sequencing has become essential for applications such as whole-genome sequencing, targeted resequencing, and variant discovery.<\/p>"},{"title":"A little introduction to BASH (Born Again SHell)","published":"2016-03-23T00:00:00+00:00","updated":"2016-03-23T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/a-little-introduction-to-bash\/"}},"id":"https:\/\/guerinpe.com\/articles\/a-little-introduction-to-bash\/","summary":"<p>Here are some hints about shell script programming based on examples. 
I provide here some short scripts that will hopefully help you understand the very basics.<\/p>"},{"title":"State of the Art","published":"2016-03-11T00:00:00+00:00","updated":"2016-03-11T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/state-of-the-art\/"}},"id":"https:\/\/guerinpe.com\/articles\/state-of-the-art\/","summary":"<p>The <strong>state of the art<\/strong> describes the current knowledge in a specific field through the analysis of a corpus of scientific publications. It serves as a basis for formulating the research question and developing related hypotheses.<\/p>"},{"title":"The Ethymologicon of Science","published":"2016-02-29T00:00:00+00:00","updated":"2016-02-29T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/ethymologicon-of-science\/"}},"id":"https:\/\/guerinpe.com\/articles\/ethymologicon-of-science\/","summary":"<p>Scientific terms are largely of Greco-Latin origin. This collection gathers the roots of scientific terms to better understand specialized nomenclature based on Greek and Latin.<\/p>"},{"title":"Journal Impact Factor","published":"2016-01-20T00:00:00+00:00","updated":"2016-01-20T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/journal-impact-factor\/"}},"id":"https:\/\/guerinpe.com\/articles\/journal-impact-factor\/","summary":"<p>During the Second World War, the USA and the UK create research centers. Their aim is to industrialize scientific discovery to increase the pace of innovation for military applications. 
From that point on, the figure of the lonesome wizard in his ivory tower gives way to the salaryman working within a team, a laboratory and an institute. The Research Unit was born.<\/p>\n<p>In this modern structure, science evolves into a collective and standardized activity. The laboratory is no longer a cabinet of wonders or a private workshop, like those of Lavoisier or Newton. Instead, it becomes a production unit organized around teams, stakeholders, deliverables and measured outcomes.<\/p>\n<p>Institutions like the Massachusetts Institute of Technology (MIT) or the Los Alamos National Laboratory, created for the Manhattan Project, symbolize this transformation. In France, the Centre National de la Recherche Scientifique (CNRS) is created in 1939.<\/p>\n<p>In this modern era, the successful scientist is, above all, a productive one. But what does productivity mean in science? It means peer-reviewed publications in scientific journals. The more you publish, the more visible and fundable you become. In 1942, the sociologist <strong>Logan Wilson<\/strong> coins the mantra <em>Publish or Perish<\/em>, capturing the pressure on academics to continually produce papers. 
Later, in 1962, <strong>Eugene Garfield<\/strong> introduces a metric to evaluate journals and, indirectly, their authors: the <strong>Impact Factor<\/strong>.<\/p>"},{"title":"Internship Report","published":"2015-05-13T00:00:00+00:00","updated":"2015-05-13T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/internship-report\/"}},"id":"https:\/\/guerinpe.com\/articles\/internship-report\/","summary":"<p>During my internship at <a href=\"https:\/\/www.dsimb.inserm.fr\/\">INSERM Unit S1134, Macromolecular Biology<\/a>, my supervisors <strong>Yassine Ghouzam<\/strong> and <strong>Jean-Christophe Gelly<\/strong> explained to me the structure of a master internship report. Here is the structure to follow and what to include in each section.<\/p>"},{"title":"ROC Curve","published":"2015-04-15T00:00:00+00:00","updated":"2015-04-15T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/roc-curve\/"}},"id":"https:\/\/guerinpe.com\/articles\/roc-curve\/","summary":"<p>intro<\/p>"},{"title":"Sensitivity and Specificity","published":"2015-04-06T00:00:00+00:00","updated":"2015-04-06T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/sensitivity-and-specificity\/"}},"id":"https:\/\/guerinpe.com\/articles\/sensitivity-and-specificity\/","summary":"<p>How do we know if a test or classifier is reliable? In both medicine and machine learning, classifiers are used to assign a class to input data. 
Therefore, <em>performance metrics<\/em> are necessary to assess the reliability of the resulting classification.&hellip;\n<\/p>\n"},{"title":"Scientific Article","published":"2015-03-20T00:00:00+00:00","updated":"2015-03-20T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/scientific-article\/"}},"id":"https:\/\/guerinpe.com\/articles\/scientific-article\/","summary":"<p>A <strong>scientific article<\/strong> is a report that disseminates scientific findings to other researchers. Every scientific article has a title, summary, introduction, materials and methods, results and discussion.<\/p>"},{"title":"What Playing in Rock Bands Taught Me About Collaboration","published":"2015-02-01T00:00:00+00:00","updated":"2015-02-01T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/what-playing-in-rock-bands-taught-me-about-collaboration\/"}},"id":"https:\/\/guerinpe.com\/articles\/what-playing-in-rock-bands-taught-me-about-collaboration\/","summary":"<p>Few people know it, but music is part of mathematics. Musicians often develop a way of thinking that is close to the logic used in science or engineering. Examples of scientists who are also musicians, and vice versa, are numerous: <strong>Albert Einstein<\/strong>, physicist and violinist; <strong>Bruce Dickinson<\/strong>, lead singer of <em>Iron Maiden<\/em> and airline pilot; <strong>Brian May<\/strong>, lead guitarist of <em>Queen<\/em> and astrophysicist; <strong>Bryan Holland<\/strong>, guitarist of <em>The Offspring<\/em>, who holds a Ph.D. 
in Molecular Biology.<\/p>\n<p>Last but not least, <strong>Steve Jobs<\/strong> showed how collaborative skills learned in a rock band can be applied to teams that develop high-tech products.<\/p>"},{"title":"BLAST: Basic Local Alignment Search Tool","published":"2015-01-14T00:00:00+00:00","updated":"2015-01-14T00:00:00+00:00","author":{"name":"\n            \n              Unknown\n            \n          "},"link":{"@attributes":{"rel":"alternate","type":"text\/html","href":"https:\/\/guerinpe.com\/articles\/basic-local-alignment-search-tool\/"}},"id":"https:\/\/guerinpe.com\/articles\/basic-local-alignment-search-tool\/","summary":"<p>Published in 1990, the paper entitled <strong>Basic Local Alignment Search Tool<\/strong> was the most highly cited publication of its time. The development of BLAST bridged Biology and Computer Science, giving birth to the new field of <strong>Bioinformatics<\/strong>.<\/p>"}]}