This issue tracks the overall status of the NeMo Automodel integration.
During the bring-up of NeMo RL, we duplicated some logic from NeMo Automodel in order to open-source quickly. It's now time to converge and deduplicate so that we rely on a single source of truth for everything related to DTensor and Automodel.
Here is a rough breakdown of the stages:
- Stage 1: support NeMo Automodel APIs in a separate policy (named something like DTensorPolicyWorkerV2)
- Stage 2: upstream parallelize plans and changes specific to NeMo RL (e.g., checkpointing, CP, tied-embedding, seq-packing)
- Stage 3: test NeMo Automodel vs. DTensorPolicy to ensure parity
- ---- mark DTensorPolicyWorker for deprecation -----
- Stage 4: sunset DTensorPolicyWorker and promote DTensorPolicyWorkerV2 in its place
- Stage 5: use backported 2.7.0 DCP checkpointing in NeMo RL
- Stage 6: integrate cut cross entropy/liger kernels from Automodel
CC: @akoumpa
related: #224