Transformers documentation

Trainer

The Trainer class provides a feature-complete training API for PyTorch. It supports distributed training on multiple GPUs or TPUs and mixed-precision training on NVIDIA and AMD GPUs through torch.amp. Trainer goes hand-in-hand with the TrainingArguments class, which offers a wide range of options to customize how a model is trained. Together, these two classes provide a complete training API.

Seq2SeqTrainer and Seq2SeqTrainingArguments inherit from Trainer and TrainingArguments, respectively, and are adapted for training sequence-to-sequence models on tasks such as summarization or translation.

The Trainer class is optimized for 🤗 Transformers models and can behave unexpectedly when used with other models. When using it with your own model, make sure that:

your model always returns tuples or subclasses of ModelOutput

your model can compute the loss if a labels argument is provided, and that the loss is returned as the first element of the tuple (if your model returns tuples)

your model can accept multiple label arguments (use label_names in TrainingArguments to indicate their names to the Trainer), but none of them should be named "label"
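The requirements above can be sketched with a small custom PyTorch module. TinyClassifier, its layer sizes, and the inputs argument name are hypothetical and only illustrate the output contract: a tuple is always returned, and the loss comes first whenever labels is given.

```python
from typing import Optional

import torch
from torch import nn


class TinyClassifier(nn.Module):
    """Hypothetical custom model that follows the output contract above."""

    def __init__(self, num_features: int = 4, num_classes: int = 2):
        super().__init__()
        self.linear = nn.Linear(num_features, num_classes)
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, inputs: torch.Tensor, labels: Optional[torch.Tensor] = None):
        logits = self.linear(inputs)
        if labels is None:
            # No labels: still return a tuple, just without a loss.
            return (logits,)
        # Labels provided: the loss must be the first element of the tuple.
        loss = self.loss_fn(logits, labels)
        return (loss, logits)


model = TinyClassifier()
batch = torch.randn(3, 4)             # 3 examples, 4 features each
labels = torch.tensor([0, 1, 0])

with_labels = model(batch, labels=labels)
without_labels = model(batch)
```

The label argument here is named labels, not "label". A model that takes several label arguments would list their names in label_names on TrainingArguments, and a dataset used with this sketch would need to yield items keyed by inputs and labels.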

Trainer

class transformers.Trainer

add_callback

autocast_smart_context_manager

call_model_init

compute_loss

compute_loss_context_manager

create_accelerator_and_postprocess

create_model_card

create_optimizer

create_optimizer_and_scheduler

create_scheduler

evaluate

evaluation_loop

floating_point_ops

get_batch_samples

get_cp_size

get_decay_parameter_names

get_eval_dataloader

get_learning_rates

get_num_trainable_parameters

get_optimizer_cls_and_kwargs

get_optimizer_group

get_sp_size

get_test_dataloader

get_total_train_batch_size

get_tp_size

get_train_dataloader

hyperparameter_search

init_hf_repo

is_local_process_zero

is_world_process_zero

log

log_metrics

metrics_format

num_examples

pop_callback

predict

prediction_step

push_to_hub

remove_callback

save_metrics

save_model

save_state

set_initial_training_values

store_flos

train

training_step

Seq2SeqTrainer

class transformers.Seq2SeqTrainer

evaluate

predict

TrainingArguments

class transformers.TrainingArguments

get_process_log_level

get_warmup_steps

main_process_first

set_dataloader

set_evaluate

set_logging

set_lr_scheduler

set_optimizer

set_push_to_hub

set_save

set_testing

set_training

to_dict

to_json_string

to_sanitized_dict

Seq2SeqTrainingArguments

class transformers.Seq2SeqTrainingArguments

to_dict