[Pass]add AscendSyncInsert pass #74

fuhouyu-hw · 2025-11-06T10:52:50Z

No description provided.

* [Enhancement] Add VectorizeLoop function and update imports for compatibility
* [CI][Test] Improve test cases for vectorization and fix typos in parser comments
* lint fix
* Fix incorrect module reference for VectorizeLoop transformation
* Refactor vectorize_loop transformation by removing unused extent mutation logic
* [Enhancement] Add support for FP8 data types and global barriers in CUDA codegen
* Fix formatting in CUDA FP8 header file for consistency
* Refactor CI workflow to use 'tilelang_ci' virtual environment and update CUDA type printing for better clarity
* Update submodule 'tvm' to latest commit for improved functionality
* Refactor execution backend references from 'dl_pack' to 'dlpack' for consistency and clarity; add apply_simplify function to simplify PrimFunc or IRModule.
* Refactor CUDA code for improved readability; clean up formatting and remove unnecessary whitespace in multiple files.
* Refactor import statement in test_tilelang_kernel_dequantize_gemm.py to use 'tilelang.language' for consistency
* Add CUDA requirements to FP8 test cases and update references for clarity
* Add a blank line for improved readability in test_tilelang_kernel_fp8_gemm_mma.py
* Fix data type in reference result calculation for consistency in test_tilelang_kernel_gemm_mma_intrinsic.py
* Add CUDA requirements and FP8 test cases for matmul and gemv simulations
* Remove debug print statements and use tilelang's testing assertion for result validation in test_tilelang_kernel_gemm_mma_intrinsic.py
* Remove outdated comment regarding FP8 tests in test_tilelang_kernel_gemv_simt.py
* Add BF16 support to matrix multiplication and introduce corresponding test cases
* Add a blank line for improved readability in BF16 GEMM test
* Update acknowledgements in README to include supervision by Zhi Yang at Peking University
* enhance acknowledgement
* Replace tutorial on memory layout optimization with new tutorial on writing high-performance kernels with thread primitives
* Update subproject commit for TVM dependency
* Update subproject commit for TVM dependency
* Add int4_t type and functions for packing char values in CUDA common header
* Add plot_layout example and implement GetForwardVars method in layout classes
* Refactor code for improved readability by adjusting line breaks and formatting in layout and test files
* Fix formatting by removing unnecessary line break in layout.h
* Refactor make_int4 function for improved readability by adjusting parameter formatting
* Add legend to plot_layout for improved clarity of thread and local IDs
* Remove unnecessary dependencies from requirements files for cleaner setup
* Remove flash_mha.py and add .gitkeep to deepseek_mla directory
* Add build requirements and update installation scripts for improved setup
* Introduce carver
* Refactor imports and improve code formatting for consistency
* Add unit tests for carver recommendation hints
* lint fix
* Enhance ElementwiseTemplate and BaseTemplate with detailed docstrings for improved code documentation and clarity
* Refactor import statements and clean up whitespace in template files for improved readability
* Add README.md for Carver framework with usage examples and architecture support
* Refactor import statement in matmul_analysis.py for consistency
* Refactor TileDict and TensorCorePolicy methods for improved clarity and functionality
* Add tests for general matrix multiplication emit configurations
* Refactor formatting in test_tilelang_carver_generate_hints.py for improved readability
* Add FlashAttentionTemplate and related functionality for hint recommendations
* Refactor whitespace in FlashAttentionTemplate and test_tilelang_carver_recommend_hints for improved readability
* Update README.md to include FlashAttentionTemplate in the carver section

add AscendSyncInsert pass

45d303d

xwhzz merged commit 14515f8 into tile-ai:ascendc_pto Nov 7, 2025

ArmandAlbert mentioned this pull request Nov 11, 2025

Poor performance of operators in the model #50

Open

fuhouyu-hw changed the title ~~add AscendSyncInsert pass~~ [PASS]add AscendSyncInsert pass Dec 18, 2025

fuhouyu-hw changed the title ~~[PASS]add AscendSyncInsert pass~~ [Pass]add AscendSyncInsert pass Dec 18, 2025
