Papers by Henning Meyerhenke

Lecture Notes in Computer Science, 2010
In this paper we study the prevalent problem of graph partitioning by analyzing the diffusion-based partitioning heuristic BUBBLE-FOS/C, a key component of the practically successful partitioner DIBAP. Our analysis reveals that BUBBLE-FOS/C, which yields well-shaped partitions in experiments, computes a relaxed solution to an edge-cut-minimizing binary quadratic program (BQP). It therefore provides the first substantial theoretical insight (beyond intuition) into why BUBBLE-FOS/C (and thus indirectly DIBAP) yields good experimental results. Moreover, we show that in bisections computed by BUBBLE-FOS/C, at least one of the two parts is connected. Using arguments based on random-walk techniques, we prove that in vertex-transitive graphs both parts must in fact be connected. All these results may help to eventually bridge the gap between practical and theoretical graph partitioning.
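The BQP view of bisection can be made concrete with the standard quadratic identity for the edge cut (a textbook formulation; the paper's exact program and its relaxation may differ):

```latex
% Standard edge-cut identity (textbook formulation, not necessarily the
% paper's exact BQP). Encode a bisection by x in {-1,+1}^n and let
% L = D - A be the graph Laplacian:
\operatorname{cut}(x) \;=\; \frac{1}{4} \sum_{\{u,v\} \in E} (x_u - x_v)^2
                      \;=\; \frac{1}{4}\, x^\top L x .
% Relaxing the binary constraint to x in R^n (e.g. with \|x\|^2 = n)
% yields a continuous problem of the kind a diffusive process can
% approximate, which is the sense in which BUBBLE-FOS/C computes a
% relaxed solution.
```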
2009 15th International Conference on Parallel and Distributed Systems, 2009
Load balancing is an important requirement for the efficient execution of parallel numerical simulations. In particular, when the simulation domain changes over time, the mapping of computational tasks to processors needs to be modified accordingly. State-of-the-art libraries for this problem are based on graph repartitioning. They have a number of drawbacks, including the metric they optimize and the difficulty of parallelizing the popular repartitioning heuristic Kernighan-Lin (KL).
Shape optimizing load balancing for MPI-parallel adaptive numerical simulations
Contemporary Mathematics, 2013
Generating Random Hyperbolic Graphs in Subquadratic Time
Lecture Notes in Computer Science, 2015
Recent trends in graph partitioning for scientific computing
We survey recent trends in practical graph partitioning for scientific computing applications.
Skript zur Vorlesung Algorithmen für synchrone Rechnernetze (lecture notes for the course "Algorithms for Synchronous Networks")
Graphenalgorithmen und lineare Algebra Hand in Hand (graph algorithms and linear algebra hand in hand)
Disturbed diffusive processes for solving partitioning problems on graphs
Constructing higher-order Voronoi diagrams in parallel

2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2012
Collaboration networks arise when we map the connections between scientists that are formed through joint publications. These networks thus display the social structure of academia and also allow conclusions about the structure of scientific knowledge. Using the computer science publication database DBLP, we compile relations between authors and publications as graphs and proceed to examine and quantify collaborative relations with graph-based methods. We review standard properties of the network and rank authors and publications by centrality. Additionally, we detect communities with modularity-based clustering and compare the resulting clusters to a ground truth based on conferences and thus topical similarity. In a second part, we are the first to combine DBLP network data with data from the Dagstuhl Seminars: we investigate whether seminars of this kind, as social and academic events designed to connect researchers, leave a visible track in the structure of the collaboration network. Our results suggest that such single events are not influential enough to change the network structure significantly. However, the network structure seems to influence a participant's decision to accept or decline an invitation.

Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015 - ASONAM '15, 2015
Sparsification reduces the size of networks while preserving structural and statistical properties of interest. Various sparsifying algorithms have been proposed in different contexts. We contribute the first systematic conceptual and experimental comparison of edge sparsification methods on a diverse set of network properties. We show that they can be understood as methods for rating edges by importance and then filtering globally by these scores. In addition, we propose a new sparsification method (Local Degree) which preserves edges leading to local hub nodes. All methods are evaluated on a set of 100 Facebook social networks with respect to network properties including diameter, connected components, community structure, and multiple node centrality measures. Experiments with our implementations of the sparsification methods (using the open-source network analysis tool suite NetworKit) show that many network properties can be preserved down to about 20% of the original set of edges. Furthermore, the experimental results allow us to differentiate the behavior of different methods and show which method is suitable with respect to which property. Our Local Degree method is fast enough for large-scale networks and performs well across a wider range of properties than previously proposed methods.
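The rate-then-filter pattern described in the abstract can be sketched in a few lines. The graph representation (a dict of adjacency sets) and the scoring rule below (ranking each edge by the larger endpoint degree, loosely echoing the "edges leading to local hub nodes" idea) are illustrative assumptions, not the paper's exact algorithm:

```python
def sparsify(graph, keep_ratio=0.2):
    """Keep the top keep_ratio fraction of edges by an importance score.
    Toy sketch of the rate-then-filter-globally scheme; the scoring rule
    (larger endpoint degree) is an illustrative stand-in, not the paper's
    Local Degree method."""
    # Step 1: rate every undirected edge by a score.
    deg = {v: len(nbrs) for v, nbrs in graph.items()}
    edges = {tuple(sorted((u, v))) for u, nbrs in graph.items() for v in nbrs}
    scored = sorted(edges, key=lambda e: max(deg[e[0]], deg[e[1]]), reverse=True)
    # Step 2: filter globally by the scores, keeping the best-rated edges.
    kept = scored[:max(1, int(len(scored) * keep_ratio))]
    sparse = {v: set() for v in graph}
    for u, v in kept:
        sparse[u].add(v)
        sparse[v].add(u)
    return sparse
```

Different sparsification methods then differ only in Step 1, which is what makes a systematic comparison under one framework possible.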
k-way Hypergraph Partitioning via n-Level Recursive Bisection
2016 Proceedings of the Eighteenth Workshop on Algorithm Engineering and Experiments (ALENEX), 2015

Lecture Notes in Computer Science, 2015
Betweenness is a well-known centrality measure that ranks the nodes of a network according to their participation in shortest paths. Since an exact computation is prohibitive in large networks, several approximation algorithms have been proposed. Besides that, recent years have seen the publication of dynamic algorithms for efficient recomputation of betweenness in evolving networks. In previous work we proposed the first semi-dynamic algorithms that recompute an approximation of betweenness in connected graphs after batches of edge insertions. In this paper we propose the first fully-dynamic approximation algorithms (for weighted and unweighted graphs that need not be connected) with a provable guarantee on the maximum approximation error. The transfer to fully-dynamic and disconnected graphs raises additional algorithmic problems that may be of independent interest. In particular, we propose a new upper bound on the vertex diameter for weighted undirected graphs. For both weighted and unweighted graphs, we also propose the first fully-dynamic algorithms that keep track of this upper bound. In addition, we extend our former algorithm for semi-dynamic BFS to batches of both edge insertions and deletions. Using approximation, our algorithms are the first to make in-memory computation of betweenness in fully-dynamic networks with millions of edges feasible. Our experiments show that they can achieve substantial speedups compared to recomputation, up to several orders of magnitude.
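The core idea behind such sampling-based approximations can be illustrated in miniature: sample node pairs, find one shortest path per pair, and credit its interior nodes. This is a toy estimator for unweighted static graphs, not the authors' algorithm (which samples more carefully and comes with an error guarantee):

```python
import random
from collections import deque

def bfs_path(graph, s, t):
    """Return one shortest s-t path via BFS, or None if t is unreachable."""
    parent = {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        if u == t:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for w in graph[u]:
            if w not in parent:
                parent[w] = u
                q.append(w)
    return None

def approx_betweenness(graph, samples=500, seed=1):
    """Estimate betweenness by sampling random node pairs and crediting
    the interior nodes of one shortest path per pair. Toy sketch of the
    path-sampling idea; no error guarantee is claimed here."""
    rng = random.Random(seed)
    nodes = list(graph)
    score = {v: 0.0 for v in nodes}
    for _ in range(samples):
        s, t = rng.sample(nodes, 2)
        path = bfs_path(graph, s, t)
        if path:
            for v in path[1:-1]:  # interior nodes only
                score[v] += 1.0 / samples
    return score
```

On a path graph the middle nodes accumulate credit while the endpoints, which lie on no shortest path's interior, stay at zero, matching the intuition behind the centrality measure.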

Lecture Notes in Computer Science, 2015
Drawing large graphs appropriately is an important step for the visual analysis of data from real-world networks. Here we present a novel multilevel algorithm to compute a graph layout with respect to a recently proposed metric that combines layout stress and entropy. As opposed to previous work, we do not solve the linear systems of the maxent-stress metric with a typical numerical solver. Instead we use a simple local iterative scheme within a multilevel approach. To accelerate local optimization, we approximate long-range forces and use shared-memory parallelism. Our experiments validate the high potential of our approach, which is particularly appealing for dynamic graphs. In comparison to the previously best maxent-stress optimizer, which is sequential, our parallel implementation is on average 30 times faster already for static graphs (and still faster if executed on one thread) while producing a comparable solution quality.
2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010
For the management of a virtual P2P supercomputer one is interested in subgroups of processors that can communicate with each other efficiently. The task of finding these subgroups can be formulated as a graph clustering problem, where clusters are vertex subsets that are densely connected within themselves, but sparsely connected to each other. Due to resource constraints, clustering using global knowledge (i.e., knowing (nearly) the whole input graph) might not be permissible in a P2P scenario, e.g., because collecting the data is not possible or would consume a high amount of resources. That is why we present a distributed heuristic using only limited local knowledge for clustering static and dynamic graphs.
2013 42nd International Conference on Parallel Processing, 2013
The amount of graph-structured data has recently experienced an enormous growth in many applications. To transform such data into useful information, high-performance analytics algorithms and software tools are necessary. One common graph analytics kernel is community detection (or graph clustering). Despite extensive research on heuristic solvers for this task, only a few parallel codes exist, although parallelism will be necessary to scale to the data volume of real-world applications.

2015 IEEE International Parallel and Distributed Processing Symposium, 2015
Processing large complex networks like social networks or web graphs has recently attracted considerable interest. In order to do this in parallel, we need to partition them into pieces of about equal size. Unfortunately, previous parallel graph partitioners originally developed for more regular mesh-like networks do not work well for these networks. This paper addresses this problem by parallelizing and adapting the label propagation technique originally developed for graph clustering. By introducing size constraints, label propagation becomes applicable for both the coarsening and the refinement phase of multilevel graph partitioning. We obtain very high quality by applying a highly parallel evolutionary algorithm to the coarsened graph. The resulting system is both more scalable and achieves higher quality than state-of-the-art systems like ParMetis or PT-Scotch. For large complex networks the performance differences are very large. For example, our algorithm can partition a web graph with 3.3 billion edges in less than sixteen seconds using 512 cores of a high-performance cluster while producing a high-quality partition; none of the competing systems can handle this graph on our system.
More importantly, one of our new mapping algorithms always yields the best results in terms of the quality measure maximum congestion when the application graphs are complex networks. In the case of meshes as application graphs, this mapping algorithm always leads in terms of both maximum congestion and maximum dilation, another common quality measure.

IEEE Transactions on Parallel and Distributed Systems, 2015
The amount of graph-structured data has recently experienced an enormous growth in many applications. To transform such data into useful information, fast analytics algorithms and software tools are necessary. One common graph analytics kernel is disjoint community detection (or graph clustering). Despite extensive research on heuristic solvers for this task, only a few parallel codes exist, although parallelism will be necessary to scale to the data volume of real-world applications. We address the deficit in computing capability with a flexible and extensible community detection framework with shared-memory parallelism. Within this framework we design and implement efficient parallel community detection heuristics: a parallel label propagation scheme; the first large-scale parallelization of the well-known Louvain method, as well as an extension of the method adding refinement; and an ensemble scheme combining the above. In extensive experiments driven by the algorithm engineering paradigm, we identify the most successful parameters and combinations of these algorithms. We also compare our implementations with state-of-the-art competitors. The processing rate of our fastest algorithm often reaches 50M edges/second. We recommend the parallel Louvain method and our variant with refinement as both qualitatively strong and fast. Our methods are suitable for massive data sets with billions of edges.
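The label propagation scheme mentioned in the abstract can be sketched as follows. This is a minimal sequential version with deterministic tie-breaking (keep the current label when it is among the most frequent, otherwise take the largest top label); the published algorithms are parallel and use randomized choices:

```python
from collections import Counter

def label_propagation(graph, max_rounds=100):
    """Sequential label propagation: each node repeatedly adopts the label
    most frequent among its neighbors until labels stabilize. Toy sketch;
    deterministic tie-breaking replaces the randomized choices used by
    parallel implementations."""
    labels = {v: v for v in graph}  # every node starts in its own community
    for _ in range(max_rounds):
        changed = False
        for v in graph:
            if not graph[v]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[u] for u in graph[v])
            top = max(counts.values())
            if counts.get(labels[v], 0) < top:
                # Current label lost the plurality: adopt a top label.
                labels[v] = max(l for l in counts if counts[l] == top)
                changed = True
        if not changed:
            break
    return labels
```

Each pass touches every edge a constant number of times, which is what makes the scheme attractive as a near-linear-time building block for both clustering and, with size constraints added, multilevel partitioning.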