@Copilot Copilot AI commented Sep 3, 2025

This PR fixes the NotImplementedError raised when finetuning from .pth (frozen/scripted) models with the PyTorch backend.

Problem

Users encountered a NotImplementedError when trying to finetune from frozen models:

dp --pt train input.json -t dpa2.pth --use-pretrain-script

The error occurred because the get_finetune_rules() function unconditionally loaded finetune models with torch.load() and weights_only=True, which fails for .pth files: those are created with torch.jit.save() and must be loaded with torch.jit.load().

Solution

Updated get_finetune_rules() function in deepmd/pt/utils/finetune.py:

  • Added file extension detection to use appropriate loading method
  • .pt files: torch.load() with weights_only=True (existing behavior)
  • .pth files: torch.jit.load() and extract model params via get_model_def_script()
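The extension-based dispatch can be sketched as follows. This is a simplified illustration, not the actual deepmd source; the real get_finetune_rules() performs the torch.load()/torch.jit.load() calls shown in the comments.

```python
from pathlib import Path


def select_finetune_loader(model_file: str) -> str:
    """Pick the loading strategy from the checkpoint's file extension.

    Simplified sketch of the dispatch described above; the real code
    calls torch.load(model_file, weights_only=True) for ".pt" and
    torch.jit.load(model_file) for ".pth".
    """
    suffix = Path(model_file).suffix
    if suffix == ".pt":
        # Checkpoint file: torch.load(model_file, weights_only=True)
        return "torch.load"
    if suffix == ".pth":
        # Frozen/scripted model: torch.jit.load(model_file), then
        # extract the model params via get_model_def_script()
        return "torch.jit.load"
    raise ValueError(f"Unsupported model file extension: {suffix}")
```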

Updated training logic in deepmd/pt/train/training.py:

  • Added proper .pth support in model resuming/loading logic
  • Used strict=False when loading state dict from .pth files to handle different key structures
  • Gracefully handle missing optimizer state and step info in frozen models
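The training-side handling above can be illustrated with a hypothetical helper (not the actual training.py code): keys missing from either side are ignored, mirroring what strict=False does in PyTorch's load_state_dict(), and metadata absent from frozen models falls back to fresh defaults.

```python
def load_partial_state(model_state: dict, checkpoint: dict) -> dict:
    """Mimic strict=False loading and graceful metadata fallback.

    Hypothetical sketch of the behavior described above: frozen (.pth)
    checkpoints may use different key structures and carry no optimizer
    state or step counter.
    """
    # Copy only parameters whose names the model actually has,
    # silently skipping extra or missing keys (strict=False semantics).
    merged = dict(model_state)
    for key, value in checkpoint.get("model", {}).items():
        if key in merged:
            merged[key] = value
    # Frozen models carry no optimizer state or step info:
    # fall back to a fresh optimizer and step 0.
    optimizer_state = checkpoint.get("optimizer")  # None -> reinitialize
    start_step = checkpoint.get("step", 0)
    return {"model": merged, "optimizer": optimizer_state, "step": start_step}
```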

The implementation follows the existing pattern used in the change_bias() function, ensuring consistency across the codebase.

Testing

  • Added comprehensive test cases covering both .pt and .pth finetune workflows
  • Verified backward compatibility with existing .pt finetune functionality
  • Tested error handling for invalid file extensions
  • Manual CLI testing confirms end-to-end workflow works correctly

Users can now successfully finetune from both checkpoint (.pt) and frozen (.pth) models:

# Works with checkpoint files (existing functionality)
dp --pt train input.json --finetune model.pt --use-pretrain-script

# Now works with frozen models (new functionality)  
dp --pt train input.json --finetune model.pth --use-pretrain-script

Fixes #4262.



@Copilot Copilot AI changed the title [WIP] Encounting NotImplementedError when finetuning a Single-Task .pth model feat(pt): Add support for finetuning from .pth (frozen) models Sep 3, 2025