Ligare Configuration System

Aaron Holmes edited this page Oct 28, 2024 · 4 revisions

Ligare.programming supports a "pluggable," type-safe system for loading TOML files into Python objects.

Getting Started

Ligare.programming's configuration system lives in Ligare.programming.config. Its exported types are:

  • AbstractConfig
  • TConfig
  • ConfigBuilder
  • load_config

With those exports in mind, the config system has four major components to be aware of:

  • AbstractConfig implementations
  • Pydantic integration
  • The config builder class ConfigBuilder(Generic[TConfig])
  • The load_config(...) function

AbstractConfig Implementations and Pydantic

The AbstractConfig abstract class is the base class used to refer to "pluggable" config types. All pluggable config types inherit AbstractConfig, and are used in ConfigBuilder(Generic[TConfig]) to create a new type from a set of these types, hereafter referred to as AbstractConfig subclasses.

Config types also inherit the BaseModel type from Pydantic to gain type safety. Pydantic validates the names, types, and values used to hydrate an AbstractConfig subclass, so data of the wrong type is rejected when the object is created.

Lastly, an AbstractConfig subclass's name must end with Config, or may be just Config.

An example AbstractConfig subclass might look something like this.

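from pydantic import BaseModel
from typing_extensions import override

from Ligare.programming.config import AbstractConfig

# LoggingConfig is defined in the next example
class Config(BaseModel, AbstractConfig):
    a_value: str
    logging: LoggingConfig = LoggingConfig()

    # more on post_load later ...
    @override
    def post_load(self) -> None:
        return super().post_load()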

Note that an AbstractConfig subclass may contain members whose types are themselves config models, such as LoggingConfig, which looks like this.

class LoggingConfig(BaseModel):
    log_level: str = "INFO"

Loading a TOML File as a Config Type

load_config(
    config_type: type[TConfig],
    toml_file_path: str | Path,
    config_overrides: AnyDict | None = None
) -> TConfig

load_config(...) can be used with any Pydantic type, including the type generated by ConfigBuilder(Generic[TConfig]). 🖝 This means ConfigBuilder(Generic[TConfig]) is only needed to combine AbstractConfig subclasses dynamically at runtime. load_config(...) takes an AbstractConfig subclass or a generated config type, a path to a TOML file, and, optionally, any values that should override values contained in the TOML file.

load_config(...) is also responsible for calling post_load() on the AbstractConfig subclass it is given, which it does immediately after instantiating and hydrating the type.

If we consider an example like:

class Config(BaseModel, AbstractConfig):
    a_value: str
    logging: LoggingConfig = LoggingConfig()

    database: DatabaseConfig

    @override
    def post_load(self) -> None:
        return super().post_load()

We can expect TOML like this to hydrate the class correctly:

a_value = "abc123"

[logging]
log_level = "DEBUG"

[database]
connection_str = "sqlite:///:memory:"

To do this, load_config(...) is called as such:

config = load_config(Config, "/path/to/the/file.toml")

These values can then be accessed through the object's attributes:

print(config.a_value) # abc123
print(config.logging.log_level) # DEBUG
print(config.database.connection_str) # sqlite:///:memory:

Using the Config Builder

ConfigBuilder(Generic[TConfig]) is a builder class.

The examples above are just Pydantic classes and they can be used as such. The true configuration system lies in how ConfigBuilder(Generic[TConfig]) takes one or more AbstractConfig subclasses and creates a single new type with the same type-safe semantics. The purpose of the builder is to create dynamic config types at runtime.

There are three methods available to add AbstractConfig subclasses to the builder:

  • with_root_config(type[TConfig])
  • with_configs(list[type[AbstractConfig]] | None)
  • with_config(type[AbstractConfig])

The build() method returns a new type[TConfig], where the final type contains all of the AbstractConfig subclasses configured with the builder. It operates thusly:

  1. If only a root AbstractConfig subclass is supplied, the returned type is the same root AbstractConfig subclass.
  2. If no AbstractConfig subclasses are supplied, or if any AbstractConfig subclass has an invalid name, the builder stops.
  3. If a root config is not supplied, the first config supplied becomes the root config.
  4. The root AbstractConfig subclass gains an attribute for every other config supplied, whose name is the lower-cased type name of those configs, without the trailing "config".
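The naming rule in step 4 can be sketched as follows. attribute_name_for is a hypothetical helper, not part of Ligare.programming.config, and the case-sensitive check for the Config suffix is an assumption based on the naming requirement described earlier.

```python
def attribute_name_for(config_type: type) -> str:
    """Derive the attribute name a non-root config might receive on the root config.

    Hypothetical helper for illustration only.
    """
    name = config_type.__name__
    if not name.endswith("Config"):
        raise ValueError(f"'{name}' is not a valid config name; it must end with 'Config'")
    # Strip the trailing "Config" and lower-case what remains.
    return name[: -len("Config")].lower()


# Plain placeholder classes; only their names matter here.
class DatabaseConfig: ...
class LoggingConfig: ...
class Settings: ...


print(attribute_name_for(DatabaseConfig))  # database
print(attribute_name_for(LoggingConfig))   # logging
```

Under this sketch, a class named Settings would be rejected because its name does not end with Config.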

Using the earlier examples, these usages will all return the same value:

config_type = ConfigBuilder[Config]()\
    .with_root_config(Config)\
    .with_config(LoggingConfig)\
    .build()

# .with_config(LoggingConfig) isn't needed because Config already contains this attribute
config_type = ConfigBuilder[Config]()\
    .with_root_config(Config)\
    .build()

And when involving an additional AbstractConfig subclass:

class DatabaseConfig(BaseModel):
    connection_str: str

config_type = ConfigBuilder[Config]()\
    .with_root_config(Config)\
    .with_config(DatabaseConfig)\
    .build()

# just [DatabaseConfig] works too
config_type = ConfigBuilder[Config]()\
    .with_root_config(Config)\
    .with_configs([LoggingConfig, DatabaseConfig])\
    .build()

config_type = ConfigBuilder[Config]()\
    .with_config(Config)\
    .with_config(DatabaseConfig)\
    .build()

config_type = ConfigBuilder[Config]()\
    .with_configs([Config, DatabaseConfig])\
    .build()
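To give a feel for what build() produces, here is a loose standard-library analogue that uses dataclasses instead of Pydantic. It is not how ConfigBuilder is implemented; build_config_type and the GeneratedConfig name are hypothetical, and the sketch only mirrors rules 1 and 4 above.

```python
from dataclasses import dataclass, fields, make_dataclass


@dataclass
class Config:
    a_value: str


@dataclass
class DatabaseConfig:
    connection_str: str


def build_config_type(root: type, others: list[type]) -> type:
    """Hypothetical stand-in for build(), using dataclasses rather than Pydantic."""
    # Rule 1: with only a root config, the root type itself is returned.
    if not others:
        return root
    # Rule 4: each other config becomes an attribute on a new type derived from
    # the root, named after the lower-cased type name without the trailing "Config".
    extra_fields = [(o.__name__.removesuffix("Config").lower(), o) for o in others]
    return make_dataclass("GeneratedConfig", extra_fields, bases=(root,))


config_type = build_config_type(Config, [DatabaseConfig])
config = config_type(a_value="abc123", database=DatabaseConfig("sqlite:///:memory:"))

print([f.name for f in fields(config_type)])  # ['a_value', 'database']
print(config.database.connection_str)         # sqlite:///:memory:
```

The generated type keeps the root's own fields and gains one attribute per additional config, which is why the builder examples above that include DatabaseConfig all describe the same shape.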

Advanced Usage

Varying Attribute Types

An application may need a configuration type whose structure varies depending on certain factors, such as a value within the config instance's own data.

An example of this might be changes in what configuration options are available if a database connection string is for SQLite, or for PostgreSQL. You can find an example of this here.

We accomplish this by creating a class inheriting BaseModel and AbstractConfig, in which we control the expected values of properties of that class inside the class's __init__ method.

In our example, we know we need to vary the expected values of database connection parameters. SQLite contains no extra parameters, while PostgreSQL contains sslmode and options. We create a base class for both database configuration types, which we use as the type of the property on the config class. Because this is a Pydantic class and not an AbstractConfig, we only inherit BaseModel.

from pydantic import BaseModel

# the base type for the SQLite and PostgreSQL config classes
class DatabaseConnectArgsConfig(BaseModel): ...

# the database-specific config classes
class PostgreSQLDatabaseConnectArgsConfig(DatabaseConnectArgsConfig):
    sslmode: str = ""
    options: str = ""
class SQLiteDatabaseConnectArgsConfig(DatabaseConnectArgsConfig): ...

# the config class used to hook into Ligare's config system
class DatabaseConfig(BaseModel, AbstractConfig):
    # the type of the property is the base class of the database-specific classes
    connect_args: DatabaseConnectArgsConfig

The code above has a subtle bug. Pydantic validates the data used to hydrate a class, and by default it ignores any value that does not correspond to a declared field. DatabaseConfig only knows the base type DatabaseConnectArgsConfig, which declares no fields, so an instance hydrated through connect_args would contain none of the database-specific values such as sslmode or options. Those values need to survive on the base type instance, because the config class holding the property does not know the specific database config type but still needs the property values. We can remedy this with ConfigDict and the extra parameter.

from pydantic import BaseModel, ConfigDict

class DatabaseConnectArgsConfig(BaseModel):
    # Allow properties of subclasses to populate instances of this base type.
    model_config = ConfigDict(extra="allow")

class PostgreSQLDatabaseConnectArgsConfig(DatabaseConnectArgsConfig):
    # Ignore anything that isn't a property defined on this type.
    model_config = ConfigDict(extra="ignore")
    sslmode: str = ""
    options: str = ""
class SQLiteDatabaseConnectArgsConfig(DatabaseConnectArgsConfig):
    # Ignore anything that isn't a property defined on this type.
    model_config = ConfigDict(extra="ignore")

class DatabaseConfig(BaseModel, AbstractConfig):
    connect_args: DatabaseConnectArgsConfig

Now that we have the database configuration types, we can complete the implementation of DatabaseConfig, which ties everything together.

from typing import Any
from typing_extensions import override

class DatabaseConfig(BaseModel, AbstractConfig):
    def __init__(self, **data: Any):
        # Pydantic will populate `DatabaseConfig`'s properties from `data`
        super().__init__(**data)

        # This class checks the value of `connection_string` to determine
        # what type `connect_args` should be. `connection_string` is populated
        # by Pydantic when `super().__init__(**data)` is called.
        model_data = self.connect_args.model_dump() if self.connect_args else {}

        # Both `SQLite...Config` and `PostgreSQL...Config` are instantiated
        # with all data from `DatabaseConfig`'s `connect_args` property, which
        # is then reassigned with the database-specific config type instance.
        if self.connection_string.startswith("sqlite://"):
            self.connect_args = SQLiteDatabaseConnectArgsConfig(**model_data)
        elif self.connection_string.startswith("postgresql://"):
            self.connect_args = PostgreSQLDatabaseConnectArgsConfig(**model_data)

    connection_string: str = "sqlite:///:memory:"
    # `None` is allowed because using `sqlite` doesn't require `connect_args`
    # in the TOML file.
    connect_args: DatabaseConnectArgsConfig | None = None

    @override
    def post_load(self): ...

When DatabaseConfig is instantiated, you will see the following:

>>> DatabaseConfig(connect_args={"sslmode": "foo", "options": "bar"}, connection_string="sqlite://")
DatabaseConfig(connection_string='sqlite://', connect_args=SQLiteDatabaseConnectArgsConfig())

>>> DatabaseConfig(connect_args={"sslmode": "foo", "options": "bar"}, connection_string="postgresql://")
DatabaseConfig(connection_string='postgresql://', connect_args=PostgreSQLDatabaseConnectArgsConfig(sslmode='foo', options='bar'))

>>> DatabaseConfig(connect_args={"sslmode": "foo", "options": "bar"}, connection_string="mssql+pyodbc://")
DatabaseConfig(connection_string='mssql+pyodbc://', connect_args=DatabaseConnectArgsConfig(sslmode='foo', options='bar'))

This, on its own, will load a TOML file that looks like this.

connection_string = "postgresql://"
[connect_args]
sslmode = "require"
options = "-c timezone=utc"

This is fine if all you need is database configuration. However, in the context of an application, we probably want many other types of configuration data. To achieve that, DatabaseConfig can be used as an attribute of another AbstractConfig subclass, like this.

class Config(BaseModel, AbstractConfig):
    database: DatabaseConfig

Now, we can segment database configuration information in the TOML file.

[database]
connection_string = "postgresql://"
[database.connect_args]
sslmode = "require"
options = "-c timezone=utc"

Operating on Data After Hydration

All subclasses of AbstractConfig must implement post_load(self) -> None, though the method may do nothing. The use of this method allows AbstractConfig subclasses to execute anything necessary immediately after the type is hydrated with data from a TOML file. One such instance is found here, where the class updates application environment variables for Flask's internals. This way, Flask can continue using its own configuration system while Ligare's configuration system can be used elsewhere.
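A minimal sketch of that pattern is shown below. AbstractConfigStandIn is a stand-in for AbstractConfig so the example runs without Ligare installed, and FlaskSettingsConfig, its fields, and the environment variable names are hypothetical; they are not taken from Ligare's Flask integration.

```python
import os
from abc import ABC, abstractmethod


class AbstractConfigStandIn(ABC):
    """Stand-in for Ligare's AbstractConfig (assumption: post_load is the only required method)."""

    @abstractmethod
    def post_load(self) -> None: ...


class FlaskSettingsConfig(AbstractConfigStandIn):
    """Hypothetical config type that mirrors its values into environment variables."""

    def __init__(self, app_name: str, env: str) -> None:
        self.app_name = app_name
        self.env = env

    def post_load(self) -> None:
        # Publish hydrated values so code that reads os.environ sees them.
        os.environ["EXAMPLE_APP_NAME"] = self.app_name
        os.environ["EXAMPLE_APP_ENV"] = self.env


config = FlaskSettingsConfig(app_name="my_app", env="development")
config.post_load()  # load_config(...) would call this right after hydration

print(os.environ["EXAMPLE_APP_NAME"])  # my_app
print(os.environ["EXAMPLE_APP_ENV"])   # development
```

Keeping this work in post_load() rather than in __init__ means it runs once the whole config object has been hydrated.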