huggingface / trl Public

generated from fastai/nbdev_template

Pull requests: huggingface/trl

75 Open 1,933 Closed

Pull requests list

Context Parallelism benchmark guide

#4075 opened Sep 12, 2025 by sergiopaniego

5 tasks

Add config_init_kwargs option in GRPOConfig

#4069 opened Sep 12, 2025 by hokuyama0106

2 of 5 tasks

🗑️ Remove deprecated AlignPropTrainer, DDPOTrainer and IterativeSFTTrainer

#4068 opened Sep 12, 2025 by qgallouedec

5 tasks

Add VLM support to RLOO trainer

#4067 opened Sep 11, 2025 by behroozazarkhalili

🧹 Remove max_batch_tokens, num_blocks and block_size from generation kwargs

#4065 opened Sep 11, 2025 by qgallouedec

[GRPO]: Sample from a Replay Buffer To Substitute Groups with 0 std.

#4060 opened Sep 10, 2025 by pramodith • Draft

4 of 5 tasks

[vllm] ensure MASTER_ADDR/MASTER_PORT are set safely

#4057 opened Sep 10, 2025 by kashif

feat: Add NPU and XPU support for activation offloading

#4056 opened Sep 10, 2025 by zilongzheng

2 of 5 tasks

✨ Add logging for training completion and model saving in training scripts

#4048 opened Sep 9, 2025 by qgallouedec

[Draft] Add configurable dataset column logging to GRPOTrainer W&B tables

#4045 opened Sep 9, 2025 by davanstrien • Draft

✂️ [GRPO VLM] Update split sizes to generalize

#4032 opened Sep 8, 2025 by zucchini-nlp

Enable XPU for vllm client

#4031 opened Sep 8, 2025 by jiqing-feng

vllm sleep mode support

#4028 opened Sep 8, 2025 by ved1beta

2 of 5 tasks

Fix #3982: Fix DPO Trainer support for Gemma 3 vision models

#4022 opened Sep 6, 2025 by akshay-babbar

Fix: undefined current_gradient_accumulation_steps

#4014 opened Sep 5, 2025 by ysjprojects

2 of 5 tasks

Fix: ignore precompute_ref_log_probs when use_liger_loss=True

#4008 opened Sep 4, 2025 by ginkyenglee

5 tasks

Improve typing of SFT trainer

#4007 opened Sep 4, 2025 by cyyever

⚖️ Align SFT and DPO for model creation and deprecate DPOConfig.padding_value in favour of pad_token_id

#4006 opened Sep 4, 2025 by qgallouedec

5 tasks

Remove attention mask when position ids is returned

#3997 opened Sep 2, 2025 by qgallouedec • Draft

[GFPO]: implement GFPO in GRPOTrainer

#3989 opened Sep 1, 2025 by Peter-Chou

3 of 5 tasks

Enable saving and loading precomputed reference log probabilities in …

#3986 opened Sep 1, 2025 by ginkyenglee

3 tasks

fix bug when using dataset streaming by accelerate

#3950 opened Aug 25, 2025 by kaixuanliu

🐳 Docker update

#3931 opened Aug 20, 2025 by qgallouedec

[SFTTrainer]: Check for assistant mask up to max_length

#3930 opened Aug 20, 2025 by pramodith

3 of 5 tasks

[DRAFT] Refactor DPO

#3906 opened Aug 15, 2025 by qgallouedec • Draft

5 tasks
