System Info
Related package versions:
transformers==4.51.3
accelerate==1.3.0
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Here is a minimal reproducible example:
```python
from accelerate import Accelerator
accelerator = Accelerator()
print("(L3) AcceleratorState has distributed_type:", hasattr(accelerator.state, 'distributed_type'))
print("(L4) distributed_type value:", getattr(accelerator.state, 'distributed_type', 'NOT_FOUND'))
from transformers import TrainingArguments
training_args = TrainingArguments()
print("(L7) AcceleratorState has distributed_type:", hasattr(accelerator.state, 'distributed_type'))
print("(L8) distributed_type value:", getattr(accelerator.state, 'distributed_type', 'NOT_FOUND'))
```
On a multi-GPU machine this prints:
```
(L3) AcceleratorState has distributed_type: True
(L4) distributed_type value: DistributedType.MULTI_GPU
...
(L7) AcceleratorState has distributed_type: False
(L8) distributed_type value: NOT_FOUND
```
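The reset is also visible on the shared-state dict that backs the singleton. Below is a small diagnostic sketch; it relies on accelerate's internal `AcceleratorState._shared_state` attribute, which is an implementation detail and may change between versions:

```python
from accelerate import Accelerator
from accelerate.state import AcceleratorState
from transformers import TrainingArguments

accelerator = Accelerator()
# The borg-style singleton stores all of its attributes in this dict.
print(bool(AcceleratorState._shared_state))  # True: state is populated

TrainingArguments()
print(AcceleratorState._shared_state)  # {}: the state was silently cleared
```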
Silently resetting the accelerator state like this can be frustrating for users who have already configured an Accelerator in an outer scope. I’m wondering how this could be resolved, and I’d be glad to help work on a fix.
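In the meantime, two workarounds appear to avoid the problem. If I read `training_args.py` correctly, the `accelerator_config` argument accepts a `use_configured_state` key that tells TrainingArguments to reuse a previously defined state instead of resetting it. A sketch based on my reading of the code, which I have only tried against the versions listed above:

```python
from accelerate import Accelerator
from transformers import TrainingArguments

accelerator = Accelerator()

# Reuse the AcceleratorState configured above instead of resetting it.
training_args = TrainingArguments(
    accelerator_config={"use_configured_state": True},
)
assert hasattr(accelerator.state, "distributed_type")
```

Alternatively, constructing TrainingArguments before the Accelerator sidesteps the issue, since the state is (re)initialized when the Accelerator is created afterwards.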
Expected behavior
In my view, initializing a dictionary parameter in TrainingArguments should not alter the global state of the accelerator, so the snippet above should print True in both cases.
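As a possible direction for a fix, the reset inside TrainingArguments could be guarded so that it never clobbers a state the user has already populated. A hypothetical sketch, not the actual transformers code (`_shared_state` and `_reset_state` are internal accelerate APIs):

```python
import warnings

from accelerate.state import AcceleratorState, PartialState

def _maybe_reset_accelerator_state():
    # _shared_state is the module-level dict backing the borg singleton;
    # a non-empty dict means the user already created an Accelerator or
    # PartialState, so keep it instead of silently wiping it.
    if AcceleratorState._shared_state or PartialState._shared_state:
        warnings.warn(
            "An AcceleratorState was configured before TrainingArguments; "
            "reusing it instead of resetting it."
        )
        return
    AcceleratorState._reset_state(reset_partial_state=True)
```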