Spatial-information Guided Adaptive Context-aware Network for Efficient RGB-D Semantic Segmentation
Tags: Paper and LLMs, Semantic Segmentation
- Pricing Type: Free
GitHub Link
The GitHub link is https://github.com/mvme-hbut/sgacnet
Introduction
Title: GitHub Repository Update for SGACNet (Spatial-information Guided Adaptive Context-aware Network).
Summary: This GitHub repository branch is up to date with the main branch of the CyunXiong/SGACNet repository, which hosts the development of SGACNet, a network designed for efficient RGB-D semantic segmentation that incorporates spatial information guidance and adaptive context awareness.
Efficient RGB-D semantic segmentation, which plays a vital role in analyzing and recognizing environmental information, has received considerable attention for mobile robots.
Content
We provide the weights for our selected ESANet-R34-NBt1D (with ResNet34 NBt1D backbones) on NYUv2, SUN RGB-D, and Cityscapes. Download and extract the models to ./trained_models.

Please navigate to the cloned directory. Note that we are using Python 3.7+, Torch 1.3.1, and torchvision 0.4.2.

The ImageNet pretrained weights for our selected backbones on the above datasets can be downloaded and stored in <dir>/trained_models/imagenet.

Note that some parameters are different for Cityscapes. Evaluation on SUN RGB-D is similar to NYUv2.

Yang Zhang, Chenyun Xiong, Junjie Liu, Xuhui Ye, and Guodong Sun. Spatial-information Guided Adaptive Context-aware Network for Efficient RGBD Semantic Segmentation[J]. IEEE Sensors Journal, 2023.
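
The setup notes above (Python 3.7+, weights extracted to ./trained_models) can be sanity-checked with a short script before running any evaluation. The following is only a minimal sketch, not part of the repository: the helper names python_ok and find_weight_files are hypothetical, and the checkpoint suffixes .pth and .pt are an assumption about how the weight files are named.

```python
import sys
from pathlib import Path

# Minimum interpreter version stated in the setup notes.
MIN_PYTHON = (3, 7)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def find_weight_files(models_dir, suffixes=(".pth", ".pt")):
    """List checkpoint files found anywhere below models_dir.

    The suffixes are an assumption; adjust them to match the files
    actually contained in the downloaded archives.
    """
    root = Path(models_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes
    )


if __name__ == "__main__":
    print("Python >= 3.7:", python_ok())
    weights = find_weight_files("./trained_models")
    if not weights:
        print("No weight files found under ./trained_models")
    for path in weights:
        print(path)
```

Run from the cloned directory, the script prints whether the interpreter is recent enough and lists any checkpoint files it finds, including those under trained_models/imagenet.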

Related
Based on the modeling method, we present FocusFlow, a framework consisting of 1) a mixed loss function that combines a classic photometric loss function with our proposed Conditional Point Control Loss (CPCL) function for diverse point-wise supervision; 2) a conditioned controlling model which replaces the conventional feature encoder with our proposed Condition Control Encoder (CCE).

LongLLaMA is a large language model designed to handle very long text contexts, up to 256,000 tokens. It's based on OpenLLaMA and uses a technique called Focused Transformer (FoT) for training. The repository provides a smaller 3B version of LongLLaMA for free use. It can also be used as a replacement for LLaMA models with shorter contexts.







