Skip to content

Fix get_label_mask docstrings and get_assistant_mask return type#85

Open
lonexreb wants to merge 1 commit intoNVlabs:mainfrom
lonexreb:fix/get-label-mask-docstrings-and-return-type
Open

Fix get_label_mask docstrings and get_assistant_mask return type#85
lonexreb wants to merge 1 commit intoNVlabs:mainfrom
lonexreb:fix/get-label-mask-docstrings-and-return-type

Conversation

@lonexreb
Copy link
Copy Markdown
Contributor

@lonexreb lonexreb commented May 4, 2026

Problem

Three small accuracy issues in src/alpamayo_r1/utils/get_label_mask.py:

  1. Truncated docstring in fill_masks_between_special_tokens — the entry for labels_mask cuts off mid-sentence:

    labels_mask (B, seq_len): tensor of labels mask, indicating the positions that should
    

    Readers can't tell what the function actually does to the mask, and the in-place mutation isn't documented.

  2. Wrong return-type annotation on get_assistant_mask. The function declares -> torch.Tensor, but the body returns masks.tolist() (a list[bool]) when tokens is a list[int]:

    if isinstance(tokens, torch.Tensor):
        return torch.from_numpy(masks)
    else:
        return masks.tolist()

    This trips up type-checkers and is misleading to callers passing a list.

  3. Dangling reference block in get_assistant_mask — the docstring ends with:

    Reference:
        Adapted from:
    

    No URL or citation follows. Either fill it in or drop it; this PR drops it.

Fix

Pure docstring + annotation hygiene. No behavioral change.

  • Complete the truncated labels_mask description and note in-place mutation + inclusive end-token semantics (matches the # NOTE: we should include the end token in the mask comment that already exists in the body).
  • Update get_assistant_mask annotation to torch.Tensor | list[bool] and reflect this in the docstring.
  • Remove the empty Reference: Adapted from: block.

Verification

No code paths changed. Mypy / pyright now agree with the actual runtime types.

Three small docstring/typing nits in src/alpamayo_r1/utils/get_label_mask.py:

1. fill_masks_between_special_tokens: the labels_mask docstring entry
   was truncated mid-sentence ("...indicating the positions that should").
   Complete the sentence and document that the tensor is mutated
   in-place and that the matched span includes the end token (matching
   the existing `# NOTE: we should include the end token in the mask`
   comment in the body).

2. get_assistant_mask: the function returns either a torch.Tensor or a
   list[bool] depending on the input type (line `return masks.tolist()`
   when tokens is a list[int]), but the annotation declared only
   torch.Tensor. Update the annotation to torch.Tensor | list[bool] and
   reflect this in the docstring.

3. get_assistant_mask: drop the dangling "Reference: Adapted from:"
   block that has no content. If a future reference is identified it
   can be added back in a separate commit.

Pure docstring/annotation hygiene -- no behavioral change.

Signed-off-by: lonexreb <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant