g-tries

Pedro Ribeiro; Fernando Silva

G-tries: an efficient data structure for discovering network motifs

2010, Proceedings of the 2010 ACM Symposium on …

Abstract

In this paper we propose a novel specialized data structure, which we call a g-trie, designed to deal with collections of subgraphs. The main conceptual idea is akin to a prefix tree: we take advantage of common topology by constructing a multiway tree in which the descendants of a node share a common substructure. We give algorithms to construct a g-trie, to list all stored subgraphs, and to find occurrences of the stored subgraphs in another graph. We evaluate the implementation of this structure and its associated algorithms on a set of representative benchmark biological networks in order to find network motifs. To assess the efficiency of our algorithms, we compare their performance with that of other known network motif algorithms implemented on the same platform. Our results show that g-tries are a feasible, adequate and very efficient data structure for network motif discovery, clearly outperforming previous algorithms and data structures.
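The prefix-tree idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes undirected graphs given as adjacency matrices with a fixed vertex order, and it skips the canonical labeling that a real g-trie would need so that isomorphic graphs map to the same path. The names `GTrieNode`, `insert`, `list_graphs` and `count_nodes` are hypothetical.

```python
class GTrieNode:
    """One tree node; a root-to-node path describes a graph vertex by vertex."""

    def __init__(self):
        # Key: tuple of 0/1 flags, the edges from the new vertex to all
        # earlier vertices. Value: the child node for that extension.
        self.children = {}
        self.is_graph = False  # True if a stored graph ends at this node


def insert(root, adj):
    """Insert an undirected graph given as an adjacency matrix (list of lists).

    Assumption: the vertex order of `adj` is used as-is; no canonical form
    is computed, so the caller must order vertices consistently.
    """
    node = root
    for k in range(len(adj)):
        key = tuple(adj[k][:k])  # connections of vertex k to vertices 0..k-1
        node = node.children.setdefault(key, GTrieNode())
    node.is_graph = True


def list_graphs(node, prefix=()):
    """Yield every stored graph as a tuple of per-vertex connection tuples."""
    if node.is_graph:
        yield prefix
    for key, child in sorted(node.children.items()):
        yield from list_graphs(child, prefix + (key,))


def count_nodes(node):
    """Count tree nodes, including the root."""
    return 1 + sum(count_nodes(child) for child in node.children.values())


root = GTrieNode()
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # edges 0-1 and 1-2
insert(root, triangle)
insert(root, path)
```

Both graphs start with the same two vertices joined by an edge, so they share the first two levels of the tree and differ only in how the third vertex attaches. Stored separately they would need six nodes below the root; here they need four.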

Rahul Nikam bt18d011

Journal of Data Mining in Genomics & Proteomics, 2016

A network motif is a pattern of interconnections that occurs in a complex network significantly more often than in similar randomized networks. Finding network motifs therefore rests on the ability to compute subgraph frequencies. To discover network motifs, one first computes a subgraph census on the original network, which calculates the frequency of all subgraphs of a certain type. One then computes the frequency of a set of subgraphs on similar randomized networks. The bottleneck of the entire motif discovery process is computing these subgraph frequencies, and this is the core computational problem. The proposed work presents the Suffix-Graph, a data structure that stores graphs efficiently, designs an algorithm that retrieves subgraphs efficiently in order to detect network motifs, and applies it to transcriptional interactions in Escherichia coli.
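The census-then-compare procedure described above can be sketched as follows. This is a brute-force illustration for 3-node subgraphs in an undirected graph, not the Suffix-Graph or g-trie method. The z-score is one common way to quantify "significantly higher" and is an assumption here, as are the function names `three_node_census` and `z_score`. Generating the randomized networks is left out, so their counts are passed in directly.

```python
from itertools import combinations
from statistics import mean, pstdev


def three_node_census(edges):
    """Count connected 3-node subgraphs of an undirected graph.

    Brute force over all vertex triples, so only suitable for small graphs.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    counts = {"path": 0, "triangle": 0}
    for a, b, c in combinations(sorted(adj), 3):
        n_edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if n_edges == 3:
            counts["triangle"] += 1
        elif n_edges == 2:  # two edges on three vertices always form a path
            counts["path"] += 1
    return counts


def z_score(observed, random_counts):
    """Standardize an observed frequency against counts from randomized networks.

    Returns None when the randomized counts have zero spread (score undefined).
    """
    sd = pstdev(random_counts)
    if sd == 0:
        return None
    return (observed - mean(random_counts)) / sd


# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
census = three_node_census([(0, 1), (1, 2), (0, 2), (2, 3)])
```

In a full pipeline the census would be run once on the original network and once on each randomized network, and `z_score` would then be applied per subgraph type. Those repeated censuses are the bottleneck the text refers to.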
