2010, Proceedings of the 2010 ACM Symposium on …
In this paper we propose a novel specialized data structure that we call g-trie, designed to deal with collections of subgraphs. The main conceptual idea is akin to a prefix tree in the sense that we take advantage of common topology by constructing a multiway tree where the descendants of a node share a common substructure. We give algorithms to construct a g-trie, to list all stored subgraphs, and to find occurrences in another graph of the subgraphs stored in the g-trie. We evaluate the implementation of this structure and its associated algorithms on a set of representative benchmark biological networks in order to find network motifs. To assess the efficiency of our algorithms we compare their performance with other known network motif algorithms also implemented in the same common platform. Our results show that g-tries are indeed a feasible, adequate and very efficient data structure for network motif discovery, clearly outperforming previous algorithms and data structures.
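The prefix-tree idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `GTrieNode`, `insert` and `list_graphs` are hypothetical, graphs are assumed to be undirected and given as adjacency matrices with a fixed vertex order, and the canonical labelling needed to maximise sharing is left out. Each tree level adds one vertex and records its links to the earlier vertices, so graphs with a common substructure share a path from the root.

```python
class GTrieNode:
    """One tree node = one added vertex (hypothetical sketch, not the paper's code)."""

    def __init__(self):
        self.children = {}     # key: links of the new vertex to the earlier vertices
        self.is_graph = False  # True if a stored graph ends at this node


def insert(root, adj):
    """Store an undirected graph given as an adjacency matrix (list of 0/1 rows)."""
    node = root
    for i in range(len(adj)):
        key = tuple(adj[i][:i])  # connections of vertex i to vertices 0..i-1
        node = node.children.setdefault(key, GTrieNode())
    node.is_graph = True


def list_graphs(node, prefix=()):
    """Yield every stored graph as a tuple of per-vertex connection tuples."""
    if node.is_graph:
        yield prefix
    for key, child in node.children.items():
        yield from list_graphs(child, prefix + (key,))


root = GTrieNode()
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
insert(root, triangle)
insert(root, path)
stored = sorted(list_graphs(root))
```

Here the triangle and the three-node path share the first two levels of the tree (a single edge between vertices 0 and 1) and only branch when the third vertex is added.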
Journal of Data Mining in Genomics & Proteomics, 2016
A network motif is a pattern of interconnections that occurs in a complex network significantly more often than in similar randomized networks. The basic premise of finding network motifs lies in the ability to compute the frequency of subgraphs. To discover network motifs, one has to compute a subgraph census on the original network, which calculates the frequency of all subgraphs of a certain type. The frequency of the same set of subgraphs must then be computed on similar randomized networks. The bottleneck of the entire motif discovery process is therefore the computation of subgraph frequencies, and this is the core computational problem. The proposed work presents the Suffix-Graph, a data structure that stores graphs efficiently, and designs an algorithm that retrieves subgraphs efficiently in order to detect network motifs, applying it to transcriptional interactions in Escherichia coli.
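The census-then-randomize procedure described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: it counts a single subgraph type (triangles) in an undirected graph, uses degree-preserving edge swaps as the "similar randomized network", and the function names are hypothetical. It does not use the Suffix-Graph structure proposed in the paper.

```python
import random


def count_triangles(edges):
    """Census for one subgraph type: triangles in a simple undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Every triangle is found once from each of its three edges.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def randomize(edges, swaps, rng):
    """Degree-preserving rewiring: replace a-b, c-d by a-d, c-b when valid."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def census_vs_random(edges, samples, swaps, seed=0):
    """Return the real triangle count and the counts in randomized copies."""
    rng = random.Random(seed)
    real = count_triangles(edges)
    randomized = [count_triangles(randomize(edges, swaps, rng)) for _ in range(samples)]
    return real, randomized
```

A subgraph type would then be reported as a motif candidate when its real count is clearly above the distribution of randomized counts; the exact significance test is left open here.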
Frontiers of Computer Science in China, 2009
Although several algorithms for searching subgraphs in motif detection have been presented in the literature, no effort has so far been made to characterize their performance. This paper presents a methodology to evaluate the performance of three algorithms: edge sampling algorithm (ESA), enumerate subgraphs (ESU) and randomly enumerate subgraphs (RAND-ESU). A series of experiments is performed to test sampling speed and sampling quality. The results show that RAND-ESU is more efficient and has a lower computational cost than the other algorithms for large-size motif detection, while ESU has its own advantage in small-size motif detection.
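For readers unfamiliar with ESU, the following is a minimal sketch of the enumeration scheme as it is commonly described: each connected k-node vertex set is grown from its smallest vertex, and a set is only extended with vertices that are larger than the root and are "exclusive" neighbours of the vertex just added. The graph representation (a dict of neighbour sets with comparable vertex labels) and the function name are assumptions for illustration, not taken from the paper.

```python
from typing import Dict, FrozenSet, Iterator, Set

Graph = Dict[int, Set[int]]  # undirected graph: vertex -> set of neighbours


def esu(graph: Graph, k: int) -> Iterator[FrozenSet[int]]:
    """Yield each connected induced k-vertex set of the graph exactly once."""

    def extend(sub: Set[int], ext: Set[int], root: int) -> Iterator[FrozenSet[int]]:
        if len(sub) == k:
            yield frozenset(sub)
            return
        ext = set(ext)
        while ext:
            w = ext.pop()
            # Vertices already in the subgraph or adjacent to it.
            closed = sub | {n for v in sub for n in graph[v]}
            # Exclusive neighbours of w that are larger than the root.
            new_ext = ext | {u for u in graph[w] if u > root and u not in closed}
            yield from extend(sub | {w}, new_ext, root)

    for v in graph:
        yield from extend({v}, {u for u in graph[v] if u > v}, v)


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
found = sorted(sorted(s) for s in esu(g, 3))
```

RAND-ESU, as usually described, follows the same recursion but descends into each branch only with some probability, which turns the exact enumeration into a sampling procedure.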
BMC …, 2009
Background: Complex networks are studied across many fields of science and are particularly important to understand biological processes. Motifs in networks are small connected sub-graphs that occur at significantly higher frequencies than in random networks. They have recently gathered much attention as a useful concept to uncover structural design principles of complex networks. Existing algorithms for finding network motifs are extremely costly in CPU time and memory consumption and have practical restrictions on the size of motifs.
e-Science, 2009. e-Science'09. …, 2009
Complex networks from domains like Biology or Sociology are present in many e-Science data sets. Dealing with networks can often form a workflow bottleneck as several related algorithms are computationally hard. One example is detecting characteristic patterns or "network motifs", a problem involving subgraph mining and graph isomorphism. This paper provides a review and runtime comparison of current motif detection algorithms in the field. We present the strategies and the corresponding algorithms in pseudo-code yielding a framework for comparison. We categorize the algorithms outlining the main differences and advantages of each strategy. We finally implement all strategies in a common platform to allow a fair and objective efficiency comparison using a set of benchmark networks. We hope to inform the choice of strategy and critically discuss future improvements in motif detection.
Briefings in Bioinformatics, 2012
Network motifs are statistically overrepresented sub-structures (sub-graphs) in a network, and have been recognized as 'the simple building blocks of complex networks'. Study of biological network motifs may reveal answers to many important biological questions. The main difficulty in detecting larger network motifs in biological networks lies in the facts that the number of possible sub-graphs increases exponentially with the network or motif size (node counts, in general), and that no known polynomial-time algorithm exists for deciding if two graphs are topologically equivalent. This article discusses the biological significance of network motifs, the motivation behind solving the motif-finding problem, and strategies to solve the various aspects of this problem. A simple classification scheme is designed to analyze the strengths and weaknesses of several existing algorithms. Experimental results derived from a few comparative studies in the literature are discussed, with conclusions that lead to future research directions.
Exact search: MAVisto [28, 42], Grochow and Kellis [21], NeMoFinder [43], Kavosh [20]
Sampling: MFinder [39, 44], MODA [27], FANMOD [40, 45]
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2014
Network motif algorithms have been a topic of research mainly since the seminal 2002 paper by Milo et al. [1], which provided motifs as a way to uncover the basic building blocks of most networks. Motifs have been mainly applied in Bioinformatics, regarding gene regulation networks. Motif detection is based on induced subgraph counting. This paper proposes an algorithm to count subgraphs of size k + 2 based on the set of induced subgraphs of size k. The general technique was applied to detect motifs of size 3, 4 and 5 in directed graphs. Such algorithms have time complexity O(a(G)m), O(m^2) and O(nm^2), respectively, where a(G) is the arboricity of G(V, E). The computational experiments on public datasets show that the proposed technique was one order of magnitude faster than Kavosh and FANMOD. When compared to NetMODE, acc-Motif had a slightly improved performance.
Lecture Notes in Computer Science, 2007
The study of biological networks and network motifs can yield significant new insights into systems biology. Previous methods of discovering network motifs (network-centric subgraph enumeration and sampling) have been limited to motifs of 6 to 8 nodes, revealing only the smallest network components. New methods are necessary to identify larger network sub-structures and functional motifs.
2018
Network motifs provide a way to uncover the basic building blocks of most complex networks. This task usually demands heavy computer processing, especially for motifs with 5 or more vertices. This paper presents an extended methodology with the following features: (i) search for motifs of up to 6 vertices, (ii) multithread processing, and (iii) a new enumeration algorithm with lower complexity. The algorithm to compute motifs solves isomorphism in $O(1)$ using a hash table. Concurrent threads evaluate distinct graphs. The enumeration algorithm has smaller computational complexity. The experiments show better performance than other methods available in the literature, allowing bioinformatics researchers to efficiently identify motifs of size 3, 4, 5, and 6.
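One way a hash table can resolve isomorphism in constant time for small subgraphs is sketched below. This is an assumption about the general technique, not the paper's actual scheme: every possible directed adjacency pattern on k vertices is encoded as a bitmask, and a table built once by brute force maps each bitmask to a canonical representative of its isomorphism class. The names `build_table` and `encode` are hypothetical.

```python
from itertools import permutations


def build_table(k):
    """Precompute: adjacency bitmask of a k-node digraph -> canonical bitmask."""
    pairs = [(i, j) for i in range(k) for j in range(k) if i != j]
    bit = {p: n for n, p in enumerate(pairs)}  # one bit per possible arc
    table = {}
    for mask in range(1 << len(pairs)):
        arcs = [p for p in pairs if (mask >> bit[p]) & 1]
        # The smallest bitmask over all relabelings identifies the class.
        table[mask] = min(
            sum(1 << bit[(perm[i], perm[j])] for i, j in arcs)
            for perm in permutations(range(k))
        )
    return table, bit


def encode(arcs, bit):
    """Bitmask of a subgraph whose vertices are already relabelled 0..k-1."""
    return sum(1 << bit[a] for a in set(arcs))


table, bit = build_table(3)
chain_a = table[encode([(0, 1), (1, 2)], bit)]  # 0 -> 1 -> 2
chain_b = table[encode([(2, 0), (0, 1)], bit)]  # 2 -> 0 -> 1, same shape
out_star = table[encode([(0, 1), (0, 2)], bit)]  # 0 -> 1 and 0 -> 2
```

After the one-off precomputation, classifying a subgraph is a single dictionary lookup. The table has 2^(k(k-1)) entries, so this brute-force construction is only practical for very small k.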
PeerJ
Network motifs play an important role in the structural analysis of biological networks. Identification of such network motifs leads to many important applications such as understanding the modularity and the large-scale structure of biological networks, classification of networks into super-families, and protein function annotation. However, identification of large network motifs is a challenging task as it involves the graph isomorphism problem. Although this problem has been studied extensively in the literature using different computational approaches, there is still a lot of scope for improvement. Motivated by the challenges involved in this field, an efficient and scalable network motif finding algorithm using a dynamic expansion tree is proposed. The novelty of the proposed algorithm is that it avoids computationally expensive graph isomorphism tests and overcomes the space limitation of the static expansion tree (SET), which enables it to find large motifs. In this algor...
Engineering Computations, 2020
Purpose: The problem of motif discovery has become a significant challenge in the era of big data where there are hundreds of genomes requiring annotations. The importance of motifs has led many researchers to develop different tools and algorithms for finding them. The purpose of this paper is to propose a new algorithm to increase the speed and accuracy of the motif discovering process, which is the main drawback of motif discovery algorithms.
Design/methodology/approach: All motifs are sorted in a tree-based indexing structure where each motif is created from a combination of nucleotides: ‘A’, ‘C’, ‘T’ and ‘G’. The full motif can be discovered by extending the search around 4-mer nucleotides in both directions, left and right. Resultant motifs would be identical or degenerated with various lengths.
Findings: The developed implementation discovers conserved string motifs in DNA without having prior information about the motifs. Even for a large data set that contains millions of nucleotid...
2018
Network motifs play an important role in the structural analysis of biological networks. Identification of such network motifs leads to many important applications, such as understanding the modularity and the large-scale structure of biological networks, classification of networks into super-families, etc. However, identification of network motifs is challenging as it involves graph isomorphism, which is a computationally hard problem. Though this problem has been studied extensively in the literature using different computational approaches, we are far from encouraging results. Motivated by the challenges involved in this field, we have proposed an efficient and scalable Motif discovery algorithm using a Dynamic Expansion Tree (MDET). In this algorithm, embeddings corresponding to a child node of the expansion tree are obtained from the embeddings of the parent node, either by adding a vertex with time complexity O(n) or by adding an edge with time complexity O(1) without involving any isomorphic ch...
Briefings in bioinformatics, 2014
Network motif detection is the search for statistically overrepresented subgraphs present in a larger target network. They are thought to represent key structure and control mechanisms. Although the problem is exponential in nature, several algorithms and tools have been developed for efficiently detecting network motifs. This work analyzes 11 network motif detection tools and algorithms. Detailed comparisons and insightful directions for using these tools and algorithms are discussed. Key aspects of network motif detection are investigated. Network motif types and common network motifs as well as their biological functions are discussed. Applications of network motifs are also presented. Finally, the challenges, future improvements and future research directions for network motif detection are also discussed.
2010 IEEE 30th International Conference on Distributed Computing Systems Workshops, 2010
Counting network motifs has an important role in studying a wide range of complex networks. However, when the network size is large, as in the case of Internet topology and WWW graphs, counting the number of motifs becomes prohibitive. Devising efficient motif counting algorithms thus becomes an important goal.
Finding motifs in biological, social, technological, and other types of networks has become a widespread method to gain more knowledge about these networks' structure and function. However, this task is very computationally demanding, because it is closely tied to graph isomorphism, a problem in NP that is not yet known to be in P or to be NP-complete. Accordingly, this research endeavors to decrease the need to call the NAUTY isomorphism detection method, which is the most time-consuming step in many existing algorithms. The work provides an extremely fast motif detection algorithm called QuateXelero, which has a Quaternary Tree data structure at its heart. The proposed algorithm is based on the well-known ESU (FANMOD) motif detection algorithm. The results of experiments on some standard model networks confirm the overall superiority of the proposed algorithm, QuateXelero, compared with two of the fastest existing algorithms, G-Tries and Kavosh. QuateXelero is especially fast in constructing the central data structure of the algorithm from scratch based on the input network.
Journal of Parallel and Distributed Computing, 2011
Many natural structures can be naturally represented by complex networks. Discovering network motifs, which are overrepresented patterns of inter-connections, is a computationally hard task related to graph isomorphism. Sequential methods are hindered by an exponential execution time growth when we increase the size of motifs and networks. In this article we study the opportunities for parallelism in existing methods and propose new parallel strategies that adapt and extend one of the most efficient serial methods known from the Fanmod tool. We propose both a master-worker strategy and one with distributed control, in which we employ a randomized receiver initiated methodology capable of providing dynamic load balancing during the whole computation process. Our strategies are capable of dealing both with exact and approximate network motifs discovery. We implement and apply our algorithms to a set of representative networks and examine their scalability up to 128 processor cores. We obtain almost linear speedups, showcasing the efficiency of our proposed approach and are able to reach motif sizes that were not previously achievable using conventional serial algorithms.
Physical Review X
Many real world networks contain a statistically surprising number of certain subgraphs, called network motifs. In the prevalent approach to motif analysis, network motifs are detected by comparing subgraph frequencies in the original network with a statistical null model. In this paper we propose an alternative approach to motif analysis where network motifs are defined to be connectivity patterns that occur in a subgraph cover that represents the network using minimal total information. A subgraph cover is defined to be a set of subgraphs such that every edge of the graph is contained in at least one of the subgraphs in the cover. Some recently introduced random graph models that can incorporate significant densities of motifs have natural formulations in terms of subgraph covers and the presented approach can be used to match networks with such models. To prove the practical value of our approach we also present a heuristic for the resulting NP-hard optimization problem and give results for several real world networks.
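The definition of a subgraph cover given in this abstract can be stated as a short check. The sketch below assumes an undirected graph and represents each subgraph simply by its edge list; the function name is hypothetical, and the information-minimising optimisation that the paper actually addresses is not attempted.

```python
def is_subgraph_cover(graph_edges, subgraphs):
    """True if each subgraph uses only graph edges and every graph edge is covered."""
    edges = {frozenset(e) for e in graph_edges}
    covered = {frozenset(e) for sub in subgraphs for e in sub}
    return covered <= edges and edges <= covered


# A square 0-1-2-3 with the diagonal 0-2, covered by its two triangles.
square = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
tri_a = [(0, 1), (1, 2), (0, 2)]
tri_b = [(0, 2), (2, 3), (3, 0)]
covers = is_subgraph_cover(square, [tri_a, tri_b])
```

Note that the diagonal 0-2 belongs to both triangles, which the definition allows: an edge must be contained in at least one subgraph, not exactly one.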
Networks are now the most popular way to describe interactions between biological objects. Studying network motifs is of particular interest in systems biology because these building blocks constitute functional units. We propose a tool to compute and statistically study the total number of occurrences of a given connected sub-graph, called a topological motif, in a network. This tool relies on two very efficient algorithms to enumerate and/or count all the occurrences of a given topological motif in a given graph. Moreover, it implements approximate p-value computation in several probabilistic graph models, extending some previous statistical results. The method is available through an R package named NeMo.
Advances in Science, Technology and Engineering Systems Journal, 2018
Network motif analysis has several applications in many different fields such as biological study and social network modeling, yet motif detection tools are still limited by the intensive computation. Currently, there are two categories of network motif detection methods: network-centric and motif-centric approaches. While most network-centric algorithms excel in enumerating all potential motifs of a given size, the runtime is infeasible for larger motif sizes. Researchers who are interested in larger motifs and have established a set of potential motif patterns could utilize motif-centric tools to check whether such patterns are truly network motifs by mapping them to the target network and counting their frequency. In this paper, we present NemoMap (Network Motif Mapping algorithm), which is an improvement of the motif-centric algorithms GK (by Grochow and Kellis) and MODA (Motif Detection Algorithm). Experimental results on three different protein-protein interaction networks show that NemoMap is more efficient in mapping complex motif patterns, while GK and MODA are much faster in analyzing simpler patterns with fewer edges. We also compare the performance of NemoMap and ParaMODA (introduced previously to improve MODA), and the result shows that NemoMap yields better runtime due to the implementation of Grochow-Kellis' symmetry-breaking technique and the better node selection process.
Data Mining and Knowledge Discovery, 2017
Network motif discovery is the problem of finding subgraphs of a network that occur more frequently than expected, according to some reasonable null hypothesis. Such subgraphs may indicate small scale interaction features in genomic inter-
Responsible editor: Srinivasan Parthasarathy.