2016, International Journal of Computer Science
The problem of optimally removing a set of vertices from a graph to minimize the size of the largest resultant component is known to be NP-complete. Prior work has provided near optimal heuristics with a high time complexity that function on up to hundreds of nodes and less optimal but faster techniques that function on up to thousands of nodes. In this work, we analyze how to perform vertex partitioning on massive graphs of tens of millions of nodes. We use a previously known and very simple heuristic technique: iteratively removing the node of largest degree and all of its edges. This approach has an apparent quadratic complexity since, upon removal of a node and adjoining set of edges, the node degree calculations must be updated prior to choosing the next node. However, we describe a linear time complexity solution using an array whose indices map to node degree and whose values are hash tables indicating the presence or absence of a node at that degree value. This approach also...
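As a concrete illustration of the bucket structure described above, here is a minimal Python sketch, assuming the graph is given as a dict-of-sets adjacency list that the routine may modify in place; the function name and parameters are illustrative, not the authors' code. Python sets stand in for the per-degree hash tables, and since degrees only ever decrease, the maximum-degree pointer only moves downward, which is the amortized linear-time argument: each edge causes at most one bucket move.

```python
def iterative_max_degree_removal(adj, num_removals=None):
    """Repeatedly remove the highest-degree vertex using degree buckets.

    adj: dict mapping each vertex to a set of neighbours (undirected graph);
    it is modified in place.  buckets[d] holds the vertices of current
    degree d.  Returns the vertices in removal order.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    max_deg = max(degree.values(), default=0)
    buckets = [set() for _ in range(max_deg + 1)]
    for v, d in degree.items():
        buckets[d].add(v)

    removed = []
    current_max = max_deg
    while num_removals is None or len(removed) < num_removals:
        # Slide the max-degree pointer down past empty buckets.
        # Degrees only decrease, so the pointer never has to move back up.
        while current_max > 0 and not buckets[current_max]:
            current_max -= 1
        if current_max == 0:
            break  # only isolated vertices remain
        v = buckets[current_max].pop()
        removed.append(v)
        # Removing v lowers each neighbour's degree by one, so each
        # neighbour moves down exactly one bucket: O(deg(v)) work.
        for u in adj[v]:
            d = degree[u]
            buckets[d].discard(u)
            degree[u] = d - 1
            buckets[d - 1].add(u)
            adj[u].discard(v)
        adj[v] = set()
        degree[v] = 0
    return removed
```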
Information Processing Letters, 1992
Discrete Applied Mathematics, 2007
It is known that if every vertex v in a graph G has degree d(v) ≥ a(v) + b(v) + 1 (where a and b are arbitrarily given nonnegative integer-valued functions) then G has a nontrivial vertex partition (A, B) in which every vertex v of A has at least a(v) neighbours within A and every vertex v of B has at least b(v) neighbours within B. This result has been strengthened: it suffices to assume d(v) ≥ a+b (a, b ≥ 1) or just d(v) ≥ a+b−1 (a, b ≥ 2) if G contains no cycles shorter than 4 or 5, respectively.
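Written out compactly (a restatement, assuming the partition condition garbled above is the usual part-wise degree condition), the first result says:

```latex
% If d(v) >= a(v) + b(v) + 1 for every vertex v of G, then V(G) splits into
% nonempty parts A and B such that every v in A keeps at least a(v)
% neighbours inside A and every v in B keeps at least b(v) neighbours
% inside B (d_{G[A]}, d_{G[B]} are degrees in the induced subgraphs).
\forall v \in V(G):\; d(v) \ge a(v) + b(v) + 1
\;\Longrightarrow\;
\exists\,(A,B):\quad d_{G[A]}(v) \ge a(v)\ \forall v \in A,
\qquad d_{G[B]}(v) \ge b(v)\ \forall v \in B .
```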
Mathematical Programming, 1994
Let G = (N, E) be an edge-weighted undirected graph. The graph partitioning problem is the problem of partitioning the node set N into k disjoint subsets of specified sizes so as to minimize the total weight of the edges connecting nodes in distinct subsets of the partition. We present a numerical study on the use of eigenvalue-based techniques to find upper and lower bounds for this problem. Results for bisecting graphs with up to several thousand nodes are given, and for small graphs some trisection results are presented. We show that the techniques are very robust and consistently produce upper and lower bounds having a relative gap of typically a few percentage points.
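As a rough illustration of this family of bounds (not the authors' formulation), the simplest eigenvalue bound pairs the Laplacian-based lower bound cut ≥ (n/4)·λ₂ for an equal bisection with the upper bound obtained from a Fiedler-vector split. The sketch below assumes a dense symmetric weight matrix with an even number of nodes and uses NumPy; the function name is made up.

```python
import numpy as np

def bisection_bounds(W):
    """Spectral lower bound and Fiedler-vector upper bound for graph bisection.

    W: symmetric (n x n) nonnegative weight matrix, zero diagonal, n even.
    Lower bound: every equal bisection has cut weight >= (n/4) * lambda_2(L),
    where L = diag(W 1) - W is the weighted Laplacian.
    Upper bound: weight of the cut given by splitting at the median of the
    Fiedler vector (the eigenvector for lambda_2).
    """
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W
    vals, vecs = np.linalg.eigh(L)           # eigenvalues in ascending order
    lam2, fiedler = vals[1], vecs[:, 1]
    lower = n / 4.0 * lam2

    # Heuristic bisection: the n/2 smallest Fiedler entries form one side.
    order = np.argsort(fiedler)
    side = np.zeros(n, dtype=bool)
    side[order[: n // 2]] = True
    upper = W[np.ix_(side, ~side)].sum()
    return lower, upper
```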
2020
Big graphs are part of the "Not Only SQL" (NoSQL) database movement, which focuses on the relationships between data rather than the values themselves. The data is stored in vertices, while the edges model the interactions or relationships between these data. They offer flexibility in handling data that is strongly connected. The analysis of a big graph generally involves exploring all of its vertices, so this operation is costly in time and resources because big graphs are generally composed of millions of vertices connected through billions of edges. Consequently, graph algorithms are expensive relative to the size of the big graph, and are therefore ineffective for data exploration. Partitioning the graph thus stands out as an efficient and less expensive alternative for exploring a big graph. This technique consists of partitioning the graph into a set of k sub-graphs in order to reduce the complexity of the queries. Nevertheless, it pre...
Proceedings of the 2012 international conference on Management of Data - SIGMOD '12, 2012
Searching and mining large graphs today is critical to a variety of application domains, ranging from community detection in social networks, to de novo genome sequence assembly. Scalable processing of large graphs requires careful partitioning and distribution of graphs across clusters. In this paper, we investigate the problem of managing large-scale graphs in clusters and study access characteristics of local graph queries such as breadth-first search, random walk, and SPARQL queries, which are popular in real applications. These queries exhibit strong access locality, and therefore require specific data partitioning strategies. In this work, we propose a Self Evolving Distributed Graph Management Environment (Sedge), to minimize inter-machine communication during graph query processing in multiple machines. In order to improve query response time and throughput, Sedge introduces a two-level partition management architecture with complementary primary partitions and dynamic secondary partitions. These two kinds of partitions are able to adapt in real time to changes in query workload. Sedge also includes a set of workload analyzing algorithms whose time complexity is linear or sublinear in the graph size. Empirical results show that it significantly improves distributed graph processing on today's commodity clusters.
IEEE Transactions on Computers, 1999
New heuristic algorithms are proposed for the Graph Partitioning problem. A greedy construction scheme with an appropriate tie-breaking rule (MIN-MAX-GREEDY) produces initial assignments very quickly. For some classes of graphs, independent repetitions of MIN-MAX-GREEDY are sufficient to reproduce solutions found by more complex techniques. When the method is not competitive, the initial assignments are used as starting points for a prohibition-based scheme, where the prohibition is chosen in a randomized and reactive way, with a bias towards more successful choices in the previous part of the run. The relationship between prohibition-based diversification (Tabu Search) and the variable-depth Kernighan-Lin algorithm is discussed. Detailed experimental results are presented on benchmark suites used in the previous literature, consisting of graphs derived from parametric models (random graphs, geometric graphs, etc.) and of "real-world" graphs of large size. On the first series of graphs, better performance is obtained for equivalent or smaller computing times, while on the large "real-world" instances significantly better results than those of multi-level algorithms are obtained, but at a much larger computational cost.
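The sketch below shows one plausible reading of such a greedy construction with a min-max style tie-breaking rule for bisection: the currently smaller part absorbs the unassigned vertex with the most neighbours already inside it, breaking ties by the fewest neighbours in the other part. It is an assumption-laden illustration, not the MIN-MAX-GREEDY procedure itself.

```python
import random

def greedy_bisection(adj, seed=None):
    """Greedy bisection sketch: grow two parts from two random seed vertices.

    adj: dict vertex -> set of neighbours (undirected graph).
    Returns the two parts and the number of cut edges.
    """
    rng = random.Random(seed)
    vertices = list(adj)
    a, b = rng.sample(vertices, 2)
    parts = [{a}, {b}]
    unassigned = set(vertices) - {a, b}

    while unassigned:
        p = 0 if len(parts[0]) <= len(parts[1]) else 1   # smaller part grows
        q = 1 - p
        # Max edges into the growing part; min edges into the other part.
        best = max(
            unassigned,
            key=lambda v: (len(adj[v] & parts[p]), -len(adj[v] & parts[q])),
        )
        parts[p].add(best)
        unassigned.discard(best)

    cut = sum(1 for v in parts[0] for u in adj[v] if u in parts[1])
    return parts, cut
```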
Data & Knowledge Engineering, 2012
Graphs are used for modeling a large spectrum of data, from the web, to social connections between individuals, to concept maps and ontologies. As the number and complexity of graph-based applications increase, rendering these graphs more compact, easier to understand, and easier to navigate becomes a crucial task. One approach to graph simplification is to partition the graph into smaller parts, so that instead of the whole graph, only the partitions and their inter-connections need to be considered. Common approaches to graph partitioning involve identifying sets of edges (or edge-cuts) or vertices (or vertex-cuts) whose removal partitions the graph into the target number of disconnected components. While edge-cuts result in partitions that are vertex disjoint, in vertex-cuts the data vertices can serve as bridges between the resulting data partitions; consequently, vertex-cut based approaches are especially suitable when the vertices on the vertex-cut will be replicated on all relevant partitions. A significant challenge in vertex-cut based partitioning, however, is ensuring the balance of the resulting partitions while simultaneously minimizing the number of vertices that are cut (and thus replicated). In this paper, we propose the SBV-Cut algorithm, which identifies a set of balance vertices that can be used to effectively and efficiently bisect a directed graph. The graph can then be further partitioned by a recursive application of structurally-balanced cuts to obtain a hierarchical partitioning of the graph. Experiments show that SBV-Cut provides better vertex-cut based expansion and modularity scores than its competitors and works several orders of magnitude more efficiently than constraint-minimization based approaches.
2008
The realization of efficient parallel graph partitioners requires the parallelization of the multi-level framework which is commonly used to improve the quality and speed of sequential partitioners. The two most critical issues in this framework are the coarsening phase, and the local refinement step performed in the uncoarsening phase. These two phases are difficult to parallelize, because the direct parallel transposition of the matching algorithms used for coarsening, and of the inherently sequential Fiduccia-Mattheyses type algorithms traditionally used for local optimization, requires much communication and synchronization, which hinders scalability. This paper describes new parallel algorithms which tackle these two issues: a simplified probabilistic matching algorithm, and a parallel banded diffusive algorithm, both of which are implemented in the PT-Scotch parallel graph partitioning software. Experimental results illustrate the efficiency and the scalability of these methods.
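As a sequential stand-in for the matching step (the paper's algorithm is probabilistic and parallel; this sketch only illustrates the role matching plays in coarsening, with made-up names):

```python
import random

def random_matching(adj, seed=None):
    """One randomized matching pass, as used to drive multilevel coarsening.

    adj: dict vertex -> set of neighbours.  Vertices are visited in random
    order and matched to a random still-unmatched neighbour; each matched
    pair would then be merged into a single coarse vertex.
    Returns a dict mapping each matched vertex to its mate.
    """
    rng = random.Random(seed)
    mate = {}
    order = list(adj)
    rng.shuffle(order)
    for v in order:
        if v in mate:
            continue
        candidates = [u for u in adj[v] if u not in mate]
        if candidates:
            u = rng.choice(candidates)
            mate[v], mate[u] = u, v
    return mate
```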
In the family of clustering problems, we are given a set of objects (vertices of the graph), together with some observed pairwise similarities (edges). The goal is to identify clusters of similar objects by slightly modifying the graph to obtain a cluster graph (disjoint union of cliques).
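For illustration, the editing cost of a candidate cluster graph can be evaluated as below (a toy helper with made-up names, assuming an unweighted graph given as a set of vertex pairs): edits are the edges that must be added inside clusters plus the edges that must be deleted between clusters.

```python
from itertools import combinations

def cluster_editing_cost(edges, clusters):
    """Edge edits needed to turn the graph into the given cluster graph.

    edges: set of frozenset({u, v}) pairs; clusters: a partition of the
    vertices, e.g. [{0, 1}, {2, 3}].
    """
    label = {v: i for i, c in enumerate(clusters) for v in c}
    additions = sum(
        1
        for c in clusters
        for u, v in combinations(sorted(c), 2)
        if frozenset((u, v)) not in edges
    )
    deletions = sum(1 for e in edges if len({label[v] for v in e}) == 2)
    return additions + deletions
```

For example, on the path 0-1-2-3 (edges {0,1}, {1,2}, {2,3}) with clusters {0,1} and {2,3}, the cost is 1: delete the edge between 1 and 2.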
Proceedings of the 1995 ACM/IEEE …, 1995
We consider the graph parameter boolean-width, related to the number of different unions of neighborhoods across a cut of a graph. Boolean-width is similar to rank-width, which is related to the number of GF[2]-sums (1+1=0) of neighborhoods instead of the Boolean sums (1+1=1) used for boolean-width. It compares well to the other four well-known width parameters tree-width, branch-width, clique-width, and rank-width: for many graph classes boolean-width is bounded whereas tree-width and branch-width are unbounded; for some graph classes boolean-width has been shown to be exponentially smaller than any of the other four; for arbitrary graphs, boolean-width is never larger than branch-width (except for extreme values of zero and one), nor tree-width plus one, nor clique-width, and has been shown to be at most the square of rank-width. Boolean-width has been shown to be a very natural parameter to consider when solving Maximum Independent Set and Minimum Dominating Set using a divide-and-conquer approach. In this paper we investigate which graph problems have the same behaviour, and extend them to a large class of NP-hard vertex subset and vertex partitioning problems by giving algorithms that are FPT when parameterized by either boolean-width, rank-width or clique-width, with runtime single exponential in either parameter if given the pertinent optimal decomposition.
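As a toy illustration of the quantity behind boolean-width, the brute-force helper below counts the distinct unions of neighbourhoods across a single cut; the boolean-width of a decomposition is the logarithm (base 2) of the maximum such count over its cuts. Names and the adjacency representation are assumptions, and the enumeration is exponential, so this is only for very small graphs.

```python
from itertools import combinations

def cut_bool(adj, side_a):
    """Number of distinct unions of neighbourhoods across the cut (A, V \\ A).

    adj: dict vertex -> set of neighbours; side_a: one side of the cut.
    For every subset X of side_a, we take the union of N(v) restricted to
    the other side, and count how many distinct sets arise.
    """
    side_a = set(side_a)
    side_b = set(adj) - side_a
    unions = set()
    for r in range(len(side_a) + 1):
        for subset in combinations(sorted(side_a), r):
            nbhd = set()
            for v in subset:
                nbhd |= adj[v] & side_b   # only the part of N(v) across the cut
            unions.add(frozenset(nbhd))
    return len(unions)
```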
Annals of Operations Research, 1995
Let G = (N, E) be a given undirected graph. We present several new techniques for partitioning the node set N into k disjoint subsets of specified sizes. These techniques involve eigenvalue bounds and tools from continuous optimization. Comparisons with examples taken from the literature show these techniques to be very successful.
HAL (Le Centre pour la Communication Scientifique Directe), 1994
2014
The Graph Partitioning Problem (GPP) has several practical applications in many areas, such as the design of VLSI (very-large-scale integration) circuits, the solution of numerical methods for simulation problems that involve factorization of sparse matrices, and the partitioning of finite element meshes for parallel programming applications, among others. The GPP is NP-hard, and computing optimal solutions is infeasible when the number of vertices of the graph is very large. There has been increased use of heuristic and metaheuristic algorithms to solve the GPP, obtaining good results where optimal results are not attainable by practical means. This article proposes an efficient parallel solution to the GPP based on the implementation of existing heuristics in a computational cluster. The proposed solution improves the execution time and, by introducing some random features into the original heuristics, improves the quality of the created partitions.
Graph algorithms and applications I, 2002
We present practical algorithms for constructing partitions of graphs into a fixed number of vertex-disjoint subgraphs that satisfy particular degree constraints. We use this in particular to find k-cuts of graphs of maximum degree ∆ that cut at least a (k−1)/k · (1 + 1/(2∆+k−1)) fraction of the edges, improving previously known bounds. The partitions also apply to constraint networks, for which we give a tight analysis of natural local search heuristics for the maximum constraint satisfaction problem.
Journal of Graph Algorithms and Applications, 1997
We present practical algorithms for constructing partitions of graphs into a fixed number of vertex-disjoint subgraphs that satisfy particular degree constraints. We use this in particular to find k-cuts of graphs of maximum degree ∆ that cut at least a (k−1)/k · (1 + 1/(2∆+k−1)) fraction of the edges, improving previously known bounds. The partitions also apply to constraint networks, for which we give a tight analysis of natural local search heuristics for the maximum constraint satisfaction problem. These partitions also imply efficient approximations for several problems on weighted bounded-degree graphs. In particular, we improve the best performance ratio for the weighted independent set problem to 3/(∆+2), and obtain an efficient algorithm for coloring 3-colorable graphs with at most (3∆+2)/4 colors.
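The classical local-search argument behind such guarantees is easy to sketch: if every vertex sits in the part containing the fewest of its own neighbours, then at most a 1/k fraction of each vertex's edges is uncut, so at least a (k−1)/k fraction of all edges is cut. The code below implements only this weaker, folklore bound, not the sharper degree-constrained partition of the paper; the interface is illustrative and assumes orderable vertex labels.

```python
def local_search_kcut(adj, k, assignment=None):
    """Local search for Max k-Cut: move each vertex to its least-crowded part.

    adj: dict vertex -> set of neighbours; k: number of parts.
    At a local optimum every vertex has at most deg(v)/k neighbours in its
    own part, hence at least a (k-1)/k fraction of the edges is cut.
    """
    if assignment is None:
        assignment = {v: i % k for i, v in enumerate(adj)}
    improved = True
    while improved:
        improved = False
        for v in adj:
            counts = [0] * k
            for u in adj[v]:
                counts[assignment[u]] += 1
            best = min(range(k), key=lambda p: counts[p])
            if counts[best] < counts[assignment[v]]:
                assignment[v] = best          # strictly fewer uncut edges
                improved = True
    cut_edges = sum(
        1 for v in adj for u in adj[v] if u > v and assignment[u] != assignment[v]
    )
    return assignment, cut_edges
```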
SIAM Journal on Discrete Mathematics, 2015
We introduce a graph-theoretic vertex dissolution model that applies to a number of redistribution scenarios such as gerrymandering in political districting or work balancing in an online situation. The central aspect of our model is the deletion of certain vertices and the redistribution of their load to neighboring vertices in a completely balanced way.
Information Processing Letters, 1992
Bui, T.N. and C. Jones, Finding good approximate vertex and edge partitions is NP-hard, Information Processing Letters 42 (1992) 153-159. In this paper we show that for n-vertex graphs with maximum degree 3, and for any fixed ε > 0, it is NP-hard to find α-edge separators and α-vertex separators of size no more than OPT + n^(1−ε), where OPT is the size of the optimal solution. For general graphs we show that it is NP-hard to find an α-edge separator of size no more than OPT + n^(2−ε). We also show that an O(f(n))-approximation algorithm for finding α-vertex separators of maximum degree 3 graphs can be used to find an O(f(n²))-approximation algorithm for finding α-edge separators of general graphs. Since it is NP-hard to find optimal α-edge separators for general graphs, this means that it is NP-hard to find optimal α-vertex separators even when restricted to maximum degree 3 graphs.
We survey recent trends in practical algorithms for balanced graph partitioning, point to applications and discuss future research directions.
Proceedings of the VLDB Endowment, 2019
We propose Distributed Neighbor Expansion (Distributed NE), a parallel and distributed graph partitioning method that can scale to trillion-edge graphs while providing high partitioning quality. Distributed NE is based on a new heuristic, called parallel expansion, where each partition is constructed in parallel by greedily expanding its edge set from a single vertex in such a way that the increase of the vertex cuts becomes locally minimal. We theoretically prove an upper bound on the partitioning quality of the proposed method. The empirical evaluation with various graphs shows that the proposed method produces higher-quality partitions than the state-of-the-art distributed graph partitioning algorithms. The performance evaluation shows that the space efficiency of the proposed method is an order of magnitude better than that of existing algorithms, while its time efficiency remains comparable. As a result, Distributed NE can partition a trillion-edge graph using only 256 machines with...
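A single-machine sketch of the expansion idea is given below (illustrative only; the actual Distributed NE works on distributed edge sets with a provable quality bound, and all names and the boundary rule here are assumptions). Each partition greedily absorbs, from its boundary, the vertex with the fewest still-unassigned edges, since a small remaining degree roughly means a small increase in cut (replicated) vertices.

```python
import random

def neighbor_expansion(adj, k, seed=None):
    """Sequential sketch of neighbour-expansion edge partitioning.

    adj: dict vertex -> set of neighbours; k: number of edge partitions.
    Returns k lists of (u, v) edges; every edge is assigned exactly once.
    """
    rng = random.Random(seed)
    unassigned = {v: set(nbrs) for v, nbrs in adj.items()}  # edges not yet placed
    total_edges = sum(len(n) for n in unassigned.values()) // 2
    quota = total_edges // k + 1
    parts = []
    for _ in range(k):
        edges, core, boundary = [], set(), set()
        while len(edges) < quota and any(unassigned.values()):
            if not boundary:
                # Start (or restart) the expansion from a random live vertex.
                cand = rng.choice([v for v, n in unassigned.items() if n])
            else:
                # Expand at the boundary vertex with the smallest remaining degree.
                cand = min(boundary, key=lambda v: len(unassigned[v]))
            boundary.discard(cand)
            core.add(cand)
            for u in list(unassigned[cand]):
                edges.append((cand, u))
                unassigned[cand].discard(u)
                unassigned[u].discard(cand)
                if u not in core:
                    boundary.add(u)
            boundary = {v for v in boundary if unassigned[v]}
        parts.append(edges)
    return parts
```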