2018, Proceedings of the ACM on Computer Graphics and Interactive Techniques
34 pages
In this paper, we question the premise that graphics hardware uses a post-transform cache to avoid redundant vertex shader invocations. A large body of existing work on optimizing indexed triangle sets for rendering speed is based upon this widely-accepted assumption. We conclusively show that this assumption does not hold up on modern graphics hardware. We design and conduct experiments that demonstrate the behavior of current hardware of all major vendors to be inconsistent with the presence of a common post-transform cache. Our results strongly suggest that modern hardware rather relies on a batch-based approach, most likely for reasons of scalability. A more thorough investigation based on these initial experiments allows us to partially uncover the actual strategies implemented on graphics processors today. We reevaluate existing mesh optimization algorithms in light of these new findings and present a new mesh optimization algorithm designed from the ground up to target archit...
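The difference between the two models contrasted in this abstract can be illustrated with a small simulation. This is a minimal sketch, not the authors' experimental setup: the function names, the FIFO replacement policy, the cache size of 16, and the idea that a batch is simply a fixed number of consecutive indices are all assumptions made for illustration. Real hardware batching rules are likely more involved.

```python
from collections import deque


def fifo_cache_invocations(indices, cache_size=16):
    """Count vertex shader invocations under a hypothetical FIFO post-transform cache."""
    cache = deque(maxlen=cache_size)
    invocations = 0
    for idx in indices:
        if idx not in cache:
            invocations += 1
            cache.append(idx)  # the oldest entry is evicted automatically when full
    return invocations


def batch_invocations(indices, batch_size=32):
    """Count invocations when reuse is only possible inside fixed-size index batches.

    Assumption: a batch is a run of `batch_size` consecutive indices, and each
    distinct vertex inside a batch is shaded exactly once.
    """
    invocations = 0
    for start in range(0, len(indices), batch_size):
        invocations += len(set(indices[start:start + batch_size]))
    return invocations


# Two triangles sharing an edge: (0, 1, 2) and (2, 1, 3)
indices = [0, 1, 2, 2, 1, 3]
print(fifo_cache_invocations(indices))           # reuse across the whole stream
print(batch_invocations(indices, batch_size=3))  # one triangle per batch, no reuse
print(batch_invocations(indices, batch_size=6))  # both triangles in one batch
```

In this toy model, a vertex that is shared by triangles in different batches is shaded again, which is why an index order tuned for a cache does not automatically suit a batch-based design.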
Proceedings of the ACM on Computer Graphics and Interactive Techniques, 2018
Due to its flexibility, compute mode is becoming more and more attractive as a way to implement many of the algorithms that are part of a state-of-the-art rendering pipeline. A key problem commonly encountered in graphics applications is streaming vertex and geometry processing. In a typical triangle mesh, the same vertex is referenced six times on average. To avoid redundant computation during rendering, a post-transform cache is traditionally employed to reuse vertex processing results. However, such a vertex cache generally cannot be implemented efficiently in software and does not scale well as parallelism increases. We explore alternative strategies for reusing per-vertex results on-the-fly during massively-parallel software geometry processing. Given an input stream divided into batches, we analyze the effectiveness of sorting, hashing, and intra-thread-group communication for identifying and exploiting local reuse potential. We design and present four vertex reuse strategies tailored...
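The hashing idea mentioned in this abstract can be sketched sequentially as follows. This is only an illustration of per-batch deduplication on the CPU, not one of the paper's four GPU strategies; the function name `deduplicate_batch` and the use of a Python dictionary as the hash table are assumptions.

```python
def deduplicate_batch(batch_indices):
    """Return (unique_vertices, local_indices) for one batch of vertex indices.

    unique_vertices lists each distinct vertex once, in first-seen order;
    local_indices rewrites the batch so that it points into that list.
    """
    slot_of = {}          # vertex index -> slot in the batch-local vertex list
    unique_vertices = []
    local_indices = []
    for idx in batch_indices:
        slot = slot_of.get(idx)
        if slot is None:
            slot = len(unique_vertices)
            slot_of[idx] = slot
            unique_vertices.append(idx)
        local_indices.append(slot)
    return unique_vertices, local_indices


# A batch of two triangles that share vertices 3 and 9
unique, local = deduplicate_batch([7, 3, 9, 9, 3, 4])
print(unique)  # vertices that need to be processed once each
print(local)   # batch-local indices used to assemble the triangles
```

Only the entries of `unique` would need vertex processing; the triangles are then assembled from `local`. A parallel version has to resolve concurrent insertions into the table, which this sketch does not attempt.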
IEEE Transactions on Visualization and Computer Graphics, 2000
One important bottleneck when visualizing large data sets is the data transfer between processor and memory. Cache-aware (CA) and cache-oblivious (CO) algorithms take into consideration the memory hierarchy to design cache-efficient algorithms. CO approaches have the advantage of adapting to unknown and varying memory hierarchies. Recent CA and CO algorithms developed for 3D mesh layouts significantly improve on the performance of previous approaches, but they lack theoretical performance guarantees. We present in this paper an $O(N \log N)$ algorithm to compute a CO layout for unstructured but well-shaped meshes. We prove that a coherent traversal of an N-size mesh in dimension d induces fewer than $N/B + O(N/M^{1/d})$ cache misses, where B and M are the block size and the cache size, respectively. Experiments show that our layout computation is faster and consumes significantly less memory than the best known CO algorithm. Performance is comparable to this algorithm for classical visualization algorithm access patterns, or better when the BSP tree produced while computing the layout is used as an acceleration data structure adjusted to the layout. We also show that cache-oblivious approaches lead to significant performance increases on recent GPU architectures.
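To make the roles of the block size B and the cache size M concrete, the following sketch counts block fetches for a given access order. It is a toy model under stated assumptions: the cache holds whole blocks, uses LRU replacement, and elements are laid out consecutively in memory. None of this is taken from the paper, and the function name `count_block_misses` is hypothetical.

```python
from collections import OrderedDict
import random


def count_block_misses(access_order, block_size, cache_blocks):
    """Count block fetches for a sequence of element accesses.

    Assumptions: element i lives in block i // block_size, and the cache
    keeps at most `cache_blocks` blocks with LRU replacement
    (so M = cache_blocks * block_size elements).
    """
    cache = OrderedDict()
    misses = 0
    for element in access_order:
        block = element // block_size
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return misses


n, b = 1024, 8
coherent = list(range(n))
shuffled = coherent[:]
random.Random(0).shuffle(shuffled)

print(count_block_misses(coherent, b, cache_blocks=4))  # equals n // b
print(count_block_misses(shuffled, b, cache_blocks=4))  # far more fetches
```

A perfectly coherent traversal touches each block once, which corresponds to the N/B term; an incoherent order over the same data pays for many additional fetches.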
ACM Transactions on …, 2005
We present a novel method for computing cache-oblivious layouts of large meshes that improve the performance of interactive visualization and geometric processing algorithms. Given that the mesh is accessed in a reasonably coherent manner, we assume no particular data access patterns or cache parameters of the memory hierarchy involved in the computation. Furthermore, our formulation extends directly to computing layouts of multi-resolution and bounding volume hierarchies of large meshes. We develop a simple and practical cache-oblivious metric for estimating cache misses. Computing a coherent mesh layout is reduced to a combinatorial optimization problem. We designed and implemented an out-of-core multilevel minimization algorithm and tested its performance on unstructured meshes composed of tens to hundreds of millions of triangles. Our layouts can significantly reduce the number of cache misses. We have observed 2-20 times speedups in view-dependent rendering, collision detection, and isocontour extraction without any modification of the algorithms or runtime applications.
2009
Abstract: One important bottleneck when visualizing large data sets is the data transfer between processor and memory. Cache-aware (CA) and cache-oblivious (CO) algorithms take into consideration the memory hierarchy to design cache-efficient algorithms. CO approaches have the ...
Journal of Information Science and Engineering, 2006
Traditional iterative contraction-based polygonal mesh simplification (PMS) algorithms usually require enormous amounts of main memory when processing large meshes. On the other hand, fast out-of-core algorithms based on the grid re-sampling scheme usually produce low-quality output. In this paper, we propose a novel cache-based approach to large polygonal mesh simplification. The new approach introduces the
Computer-aided Design, 2000
Triangle strips are a widely used hardware-supported data-structure to compactly represent and efficiently render polygonal meshes. In this paper we survey the efficient generation of triangle strips as well as their variants. We present efficient algorithms for partitioning polygonal meshes into triangle strips. Triangle strips have traditionally used a buffer size of two vertices. In this paper we also study the impact of larger buffer sizes and various queuing disciplines on the effectiveness of triangle strips. View-dependent simplification has emerged as a powerful tool for graphics acceleration in visualization of complex environments. However, in a view-dependent framework the triangle mesh connectivity changes at every frame making it difficult to use triangle strips. In this paper we present a novel data-structure, Skip Strip, that efficiently maintains triangle strips during such view-dependent changes. A Skip Strip stores the vertex hierarchy nodes in a skip-list-like manner with path compression. We anticipate that Skip Strips will provide a road-map to combine rendering acceleration techniques for static datasets, typical of retained-mode graphics applications, with those for dynamic datasets found in immediate-mode applications.
Current computer architectures employ caching to improve the performance of a wide variety of applications. One of the main characteristics of such cache schemes is the use of block fetching whenever an uncached data element is accessed. To maximize the benefit of the block fetching mechanism, we present novel cache-aware and cache-oblivious layouts of surface and volume meshes that improve the performance of interactive visualization and geometric processing algorithms. Based on a general I/O model, we derive new cache-aware and cache-oblivious metrics that have high correlations with the number of cache misses when accessing a mesh. In addition to guiding the layout process, our metrics can be used to quantify the quality of a layout, e.g. for comparing different layouts of the same mesh and for determining whether a given layout is amenable to significant improvement. We show that layouts of unstructured meshes optimized for our metrics result in improvements over conventional layouts in the performance of visualization applications such as isosurface extraction and view-dependent rendering. Moreover, we improve upon recent cache-oblivious mesh layouts in terms of performance, applicability, and accuracy.
1996
Almost all scientific visualization involving surfaces is currently done via triangles. The speed at which such triangulated surfaces can be displayed is crucial to interactive visualization and is bounded by the rate at which triangulated data can be sent to the graphics subsystem for rendering. Partitioning polygonal models into triangle strips can significantly reduce rendering times over transmitting each triangle individually.
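The saving from triangle strips comes from the fact that each new vertex after the first two defines one more triangle. The following minimal sketch expands a strip into individual triangles; the function name `strip_to_triangles` is hypothetical, and the winding flip on every second triangle follows the common graphics API convention rather than anything stated in the abstract.

```python
def strip_to_triangles(strip):
    """Expand a triangle strip (list of vertex indices) into separate triangles."""
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        if a == b or b == c or a == c:
            continue  # skip degenerate triangles (sometimes used to join strips)
        # Flip every second triangle so that all triangles keep the same winding
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


strip = [0, 1, 2, 3, 4]
triangles = strip_to_triangles(strip)
print(triangles)
print(len(strip), "indices sent as a strip vs", 3 * len(triangles), "as separate triangles")
```

A strip of n triangles needs n + 2 indices instead of 3n, which is where the reduction in transmitted data comes from.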
Proceedings of the 21st International Meshing Roundtable, 2013
Mesh simplification and mesh compression are important processes in computer graphics and scientific computing, because they produce a mesh that takes up less memory than the original mesh. Current simplification and compression algorithms do not take advantage of both the central processing unit (CPU) and the graphics processing unit (GPU). We propose three simplification algorithms, one of which runs on the CPU and two of which run on the GPU. We combine these algorithms into two CPU-GPU algorithms for mesh simplification. Our CPU-GPU algorithms are the naïve marking algorithm and the inverse reduction algorithm. Experimental results show that when the algorithms take advantage of both the CPU and the GPU, the running time for simplification decreases compared to performing all of the computation on the CPU. The marking algorithm provides higher simplification rates than the inverse reduction algorithm, whereas the inverse reduction algorithm has a lower running time than the marking algorithm.