Problem with the attention implementation #2

@ddz16

Description

Hello, is the hierarchical attention you mentioned referring to band attention (as shown in the figure below), except that the window size grows exponentially with the number of layers? If so, shouldn't the body of the for loop in this function in model.py be changed to `window_mask[:, i, i:i+self.bl] = 1`?

def construct_window_mask(self):
    # `device` is defined elsewhere in model.py
    window_mask = torch.zeros((1, self.bl, self.bl + 2 * (self.bl // 2)))
    for i in range(self.bl):
        window_mask[:, :, i:i+self.bl] = 1
    return window_mask.to(device)
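To see the difference between the two loop bodies, here is a minimal pure-Python sketch (no torch dependency) that builds the `(bl, bl + 2*(bl//2))` mask both ways. The `fix` flag toggles the row index suggested above; `band_mask` is a hypothetical helper name, not from the repository.

```python
def band_mask(bl, fix=False):
    """Build a (bl, bl + 2*(bl//2)) 0/1 mask the way construct_window_mask
    does. fix=False mirrors the current code (row index `:`); fix=True
    mirrors the suggested change (row index `i`)."""
    width = bl + 2 * (bl // 2)
    mask = [[0] * width for _ in range(bl)]
    for i in range(bl):
        rows = [i] if fix else range(bl)  # the only difference between the two versions
        for r in rows:
            for c in range(i, i + bl):
                mask[r][c] = 1
    return mask

# With bl = 3 the current loop sets every row identically, so the whole
# mask ends up all ones and no banding survives. The suggested fix gives
# row i a sliding window over columns i .. i+bl-1, i.e. a band of width bl.
```

For example, `band_mask(3, fix=True)` yields rows `[1,1,1,0,0]`, `[0,1,1,1,0]`, `[0,0,1,1,1]`, while `band_mask(3, fix=False)` yields all ones, which would let every position attend everywhere inside the block.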

[figure: illustration of band attention]
