PretrainedConfig class
The PretrainedConfig class is the base class for all configuration classes in the Transformers library. It provides a unified interface for handling configuration parameters common to all models, as well as methods for loading, saving, and updating configurations.
Let's analyse the class in detail:
Initialisation
The PretrainedConfig class is initialized with arbitrary keyword arguments (**kwargs). It defines several common parameters, such as output_hidden_states, output_attentions, return_dict, is_encoder_decoder, and is_decoder, which are used by various models.
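The keyword-argument pattern described above can be sketched with a small stand-in class. ToyConfig is a hypothetical name, not the library's class, and the default values shown are assumptions made for this sketch; the point is only that known parameters are read with a default and everything else is kept as an attribute.

```python
class ToyConfig:
    """Hypothetical stand-in for a config base class; not the Transformers code."""

    def __init__(self, **kwargs):
        # Common parameters are popped from kwargs, each with an assumed default.
        self.return_dict = kwargs.pop("return_dict", True)
        self.output_hidden_states = kwargs.pop("output_hidden_states", False)
        self.output_attentions = kwargs.pop("output_attentions", False)
        self.is_encoder_decoder = kwargs.pop("is_encoder_decoder", False)
        self.is_decoder = kwargs.pop("is_decoder", False)

        # Any remaining keyword argument is stored as a plain attribute.
        for key, value in kwargs.items():
            setattr(self, key, value)


config = ToyConfig(output_hidden_states=True, my_custom_flag=3)
print(config.output_hidden_states, config.return_dict, config.my_custom_flag)
```

Because unknown keys are kept rather than rejected, a model-specific subclass can add its own parameters without changing the base class.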
Class Attributes
model_type: An identifier for the model type, serialised into the JSON file and used to recreate the correct object in AutoConfig.
is_composition: A boolean indicating whether the config class is composed of multiple sub-configs.
keys_to_ignore_at_inference: A list of keys to ignore when looking at dictionary outputs of the model during inference.
attribute_map: A dictionary that maps model-specific attribute names to standardized attribute names.
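To make the role of attribute_map more concrete, here is a minimal sketch of attribute aliasing. ToyConfig and ToyGPTConfig are hypothetical classes, and the names n_embd and n_layer are illustrative; the library's own implementation may differ. In this sketch the standardized name is the dictionary key and the model-specific name is the value.

```python
class ToyConfig:
    """Hypothetical base class showing attribute aliasing; not the library code."""

    model_type = ""
    attribute_map = {}

    def __setattr__(self, key, value):
        # Writes to an alias are redirected to the underlying attribute.
        key = type(self).attribute_map.get(key, key)
        super().__setattr__(key, value)

    def __getattr__(self, key):
        # Called only when normal lookup fails, which is the case for alias names.
        target = type(self).attribute_map.get(key)
        if target is None:
            raise AttributeError(key)
        return getattr(self, target)


class ToyGPTConfig(ToyConfig):
    model_type = "toy-gpt"
    # alias name -> name actually stored on the instance
    attribute_map = {"hidden_size": "n_embd", "num_hidden_layers": "n_layer"}

    def __init__(self, n_embd=768, n_layer=12):
        self.n_embd = n_embd
        self.n_layer = n_layer


config = ToyGPTConfig(n_embd=64)
config.num_hidden_layers = 2  # stored as n_layer
print(config.hidden_size, config.n_layer)
```

Code written against the standardized names can then read any config, regardless of how a particular model names its fields.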
Common Attributes
The class defines common attributes such as vocab_size, hidden_size, num_attention_heads, and num_hidden_layers, which are present in all subclasses.
Methods
from_pretrained: A class method that instantiates a PretrainedConfig (or a derived class) from a pretrained model configuration.
It takes pretrained_model_name_or_path as input, which can be a model identifier, a path to a directory containing the configuration file, or a URL to a saved configuration JSON file.
It supports additional parameters such as cache_dir, force_download, and revision to control how the configuration files are downloaded and cached.
save_pretrained: A method that saves the configuration object to a directory so that it can be re-loaded using the from_pretrained method.
It takes save_directory as input and saves the configuration JSON file in that directory.
It also supports pushing the configuration to the Hugging Face Model Hub using the push_to_hub parameter.
to_dict: A method that serializes the configuration instance to a Python dictionary.
to_json_string: A method that serializes the configuration instance to a JSON string.
to_json_file: A method that saves the configuration instance to a JSON file.
update: A method that updates the attributes of the configuration instance with attributes from a dictionary.
update_from_string: A method that updates the attributes of the configuration instance from a string representation.
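The save, load, and update methods listed above can be illustrated with a simplified stand-in. ToyConfig is hypothetical and handles only local directories; the file name config.json, the "key=value,key=value" string format, and the type-conversion rules are assumptions of this sketch, not a statement of what the library does internally.

```python
import json
import os
import tempfile


class ToyConfig:
    """Hypothetical stand-in; mirrors the method names described above."""

    CONFIG_NAME = "config.json"  # assumed file name for this sketch

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

    def to_dict(self):
        return dict(vars(self))

    def save_pretrained(self, save_directory):
        os.makedirs(save_directory, exist_ok=True)
        path = os.path.join(save_directory, self.CONFIG_NAME)
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.to_dict(), f, indent=2, sort_keys=True)

    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path):
        # Only local directories are handled here; nothing is downloaded.
        path = os.path.join(pretrained_model_name_or_path, cls.CONFIG_NAME)
        with open(path, encoding="utf-8") as f:
            return cls(**json.load(f))

    def update(self, config_dict):
        for key, value in config_dict.items():
            setattr(self, key, value)

    def update_from_string(self, update_str):
        # Assumed form: "key1=value1,key2=value2"; keys must already exist.
        for pair in update_str.split(","):
            key, value = pair.split("=", 1)
            if not hasattr(self, key):
                raise ValueError(f"key {key} isn't in the original config")
            old = getattr(self, key)
            # bool is checked before int because bool is a subclass of int.
            if isinstance(old, bool):
                new = value.lower() in ("true", "1", "y", "yes")
            elif isinstance(old, int):
                new = int(value)
            elif isinstance(old, float):
                new = float(value)
            else:
                new = value
            setattr(self, key, new)


config = ToyConfig(hidden_size=64, dropout=0.1, is_decoder=False)
config.update({"hidden_size": 128})
config.update_from_string("dropout=0.2,is_decoder=true")

with tempfile.TemporaryDirectory() as tmp:
    config.save_pretrained(tmp)
    saved_files = os.listdir(tmp)
    reloaded = ToyConfig.from_pretrained(tmp)

print(saved_files, reloaded.to_dict())
```

Converting each new value to the type of the existing attribute keeps a string-based update from silently turning a number into text.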
Auto Class Registration
The register_for_auto_class method allows registering the configuration class with a given auto class (e.g., AutoConfig). This is useful for custom configurations to be automatically discoverable by the AutoConfig class.
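The general idea of registering a custom configuration so that a factory class can find it can be sketched as follows. Everything here is hypothetical: ToyAutoConfig, its for_model method, and the registry dictionary are inventions for illustration, and the real register_for_auto_class works differently inside the library. Only the method name and the model_type attribute come from the description above.

```python
class ToyAutoConfig:
    """Hypothetical factory that looks up config classes by model_type."""

    _registry = {}

    @classmethod
    def for_model(cls, model_type, **kwargs):
        if model_type not in cls._registry:
            raise ValueError(f"Unknown model_type: {model_type!r}")
        return cls._registry[model_type](**kwargs)


# Auto classes that a config may register with (assumed for this sketch).
AUTO_CLASSES = {"ToyAutoConfig": ToyAutoConfig}


class ToyConfig:
    model_type = ""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

    @classmethod
    def register_for_auto_class(cls, auto_class="ToyAutoConfig"):
        if auto_class not in AUTO_CLASSES:
            raise ValueError(f"{auto_class} is not a valid auto class.")
        # Record this class under its model_type so the factory can find it.
        AUTO_CLASSES[auto_class]._registry[cls.model_type] = cls


class MyCustomConfig(ToyConfig):
    model_type = "my-custom-model"


MyCustomConfig.register_for_auto_class("ToyAutoConfig")
config = ToyAutoConfig.for_model("my-custom-model", hidden_size=32)
print(type(config).__name__, config.hidden_size)
```

The model_type string is what ties a saved configuration back to the class that should load it, which is why each subclass is expected to define its own value.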
Serialization and Deserialization
The to_dict, to_json_string, and to_json_file methods provide functionality to serialize the configuration instance to different formats. The from_dict and from_json_file methods allow instantiating a PretrainedConfig from a dictionary or a JSON file, respectively.
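A round trip through these formats might look like the sketch below. ToyConfig is a hypothetical stand-in that reuses the method names from the text; the JSON formatting options (indentation, sorted keys) are assumptions of the sketch.

```python
import json
import os
import tempfile


class ToyConfig:
    """Hypothetical stand-in showing dict and JSON round trips."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

    def to_dict(self):
        return dict(vars(self))

    def to_json_string(self):
        return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"

    def to_json_file(self, json_file_path):
        with open(json_file_path, "w", encoding="utf-8") as writer:
            writer.write(self.to_json_string())

    @classmethod
    def from_dict(cls, config_dict):
        return cls(**config_dict)

    @classmethod
    def from_json_file(cls, json_file):
        with open(json_file, encoding="utf-8") as reader:
            return cls.from_dict(json.load(reader))


original = ToyConfig(vocab_size=1000, hidden_size=64, num_attention_heads=4)

# Dictionary round trip.
via_dict = ToyConfig.from_dict(original.to_dict())

# File round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    original.to_json_file(path)
    via_file = ToyConfig.from_json_file(path)

print(original.to_json_string())
```

Both routes should reproduce the same attribute values, which is the property that makes a saved configuration safe to share and reload.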
The PretrainedConfig class serves as a foundation for all configuration classes in the Transformers library.
It provides a standardised way to handle configuration parameters, load and save configurations, and interact with pretrained models.
Subclasses of PretrainedConfig can extend or override the base class methods and attributes to define model-specific configurations. This allows for a consistent and unified approach to working with configurations across different models in the library.
The class also supports integration with the Hugging Face Model Hub, enabling easy sharing and loading of pretrained configurations from the hub.
Overall, the PretrainedConfig class is a crucial component in the Transformers library, facilitating the management and organization of model configurations in a standardized and efficient manner.