Core Components

Benchmark

Bases: BenchmarkEntity

The Benchmark class represents a benchmark in the LunaBench system.

This class is responsible for managing benchmark-related operations, including creating and deleting benchmarks. It provides methods for interacting with the benchmark data and executing benchmark runs.

`create(name: str) -> Benchmark` `staticmethod`

Create a new benchmark with the given name.

The name of a benchmark must be unique. The returned Benchmark object can be used to interact with and configure the new benchmark.

Parameters:

name (str) –

The name of the new benchmark.

Returns:

Benchmark –

The newly created Benchmark object.

`open(name: str) -> Benchmark` `staticmethod`

Load a benchmark if it exists, otherwise create a new one.

Parameters:

name (str) –

The name of the benchmark.

Returns:

Benchmark –

The loaded or newly created Benchmark object.

`load(name: str) -> Benchmark` `staticmethod`

Load a benchmark from the database by its name.

Parameters:

name (str) –

The name of the benchmark to load.

Returns:

Benchmark –

The loaded Benchmark object.

`load_all() -> list[Benchmark]` `staticmethod`

Load all benchmarks from the database.

Loading all benchmarks from the database can be a slow operation and should be used sparingly.

Returns:

list[Benchmark] –

A list of Benchmark objects representing all benchmarks in the database. If no benchmarks are found, an empty list is returned.

`set_modelset(modelset: str | ModelSet) -> None`

Set the modelset for the benchmark.

Changing the modelset can affect the results of the benchmark. Therefore, it is recommended not to change the modelset after the benchmark has been created. If a change is necessary, the results of the benchmark should be deleted and the benchmark itself should be re-run.

Parameters:

modelset (str | ModelSet) –

Set the modelset for the benchmark to this modelset. It can be the name of the modelset or the modelset itself.

`remove_modelset() -> None`

Remove the modelset from the benchmark.

This method removes the modelset from the benchmark. If the modelset is not set, this method does nothing. After removing the modelset, the results of the benchmark may be invalid.

`get_feature(name: str) -> FeatureEntity`

Get a feature by its name from a benchmark.

If the feature is not present, an error will be raised.

Parameters:

name (str) –

The name of the feature to be retrieved.

Raises:

DataNotExistError –

Raised if no feature with the given name could be retrieved.

`add_feature(name: str, feature: BaseFeature) -> FeatureEntity`

Add a feature to the benchmark with a given name.

This method adds a feature to the benchmark. The name must be unique within the benchmark. When the benchmark is rerun, the feature will be used to calculate the metrics for each algorithm result.

The feature must also be defined in the registry; otherwise, an error is raised. See the documentation on the registry for how to define it.

Parameters:

name (str) –

Name of the feature to add.
feature (BaseFeature) –

The feature to add.

Returns:

FeatureEntity –

The added feature.

`remove_feature(feature: str | FeatureEntity) -> None`

Remove a feature from the benchmark.

Parameters:

feature (str | FeatureEntity) –

The name of the feature to remove or the feature object itself. Make sure to use the FeatureUserModel object and not only an IFeature object. This is important because the feature name is used to identify the feature.
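The get/add/remove methods all identify an entry by its name. A minimal sketch of that pattern follows. `NamedRegistry` and `Entity` are hypothetical stand-ins, not LunaBench classes; raising `ValueError` for a duplicate name and silently ignoring the removal of an unknown name are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


class DataNotExistError(Exception):
    """Stand-in for the error named in the reference above."""


@dataclass
class Entity:
    """Hypothetical stand-in for an entity such as FeatureEntity."""

    name: str
    payload: object


class NamedRegistry:
    """Keeps entities under unique names, as a benchmark does for features."""

    def __init__(self) -> None:
        self._items: dict[str, Entity] = {}

    def add(self, name: str, payload: object) -> Entity:
        # Names must be unique within the registry (assumed error type).
        if name in self._items:
            raise ValueError(f"{name!r} is already in use")
        entity = Entity(name, payload)
        self._items[name] = entity
        return entity

    def get(self, name: str) -> Entity:
        # An unknown name is an error rather than a None result.
        if name not in self._items:
            raise DataNotExistError(name)
        return self._items[name]

    def remove(self, item: str | Entity) -> None:
        # Accept either the name or the entity; the name is the key either way.
        name = item if isinstance(item, str) else item.name
        self._items.pop(name, None)


registry = NamedRegistry()
added = registry.add("num_vars", payload=object())
registry.remove(added)  # equivalent to registry.remove("num_vars")
```

Because the name is the lookup key, an object that does not carry a name cannot be used for removal, which is the point of the note on the `feature` parameter.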

`get_metric(name: str) -> MetricEntity`

Get a metric by its name from a benchmark.

If the metric is not present, an error will be raised.

Parameters:

name (str) –

The name of the metric to be retrieved.

Raises:

DataNotExistError –

Raised if no metric with the given name could be retrieved.

`add_metric(name: str, metric: BaseMetric) -> MetricEntity`

Add a metric to the benchmark with a given name.

This method adds a metric to the benchmark. The name must be unique within the benchmark. When the benchmark is rerun, the metric will be calculated for each algorithm result.

The metric must also be defined in the registry; otherwise, an error is raised. See the documentation on the registry for how to define it.

Parameters:

name (str) –

The name of the metric to add.
metric (BaseMetric) –

An instance of the metric to add.

Returns:

MetricEntity –

The added metric.

`remove_metric(metric: str | MetricEntity) -> None`

Remove a metric from the benchmark.

Parameters:

metric (str | MetricEntity) –

The name of the metric to remove or the metric object itself. Make sure to use the MetricUserModel object and not only an IMetric object. This is important because the metric name is used to identify the metric.

`get_algorithm(name: str) -> AlgorithmEntity`

Get an algorithm by its name from a benchmark.

If the algorithm is not present, an error will be raised.

Parameters:

name (str) –

The name of the algorithm to be retrieved.

Raises:

DataNotExistError –

Raised if no algorithm with the given name could be retrieved.

`add_algorithm(name: str, algorithm: IAlgorithm[Any] | BaseAlgorithmSync | BaseAlgorithmAsync[Any]) -> AlgorithmEntity`

Add an algorithm to the benchmark with a given name.

This method adds an algorithm to the benchmark. The name must be unique within the benchmark. When the benchmark is rerun, the results for this algorithm will be calculated.

The algorithm must also be defined in the registry; otherwise, an error is raised. See the documentation on the registry for how to define it.

Parameters:

name (str) –

The name of the algorithm to add.
algorithm (IAlgorithm[Any] | BaseAlgorithmSync | BaseAlgorithmAsync[Any]) –

An instance of the algorithm to add.

Returns:

AlgorithmEntity –

The added algorithm.

`remove_algorithm(algorithm: str | AlgorithmEntity) -> None`

Remove an algorithm from the benchmark.

Parameters:

algorithm (str | AlgorithmEntity) –

The name of the algorithm to remove or the algorithm object itself. Make sure to use the AlgorithmUserModel object and not only an IAlgorithm object. This is important because the algorithm name is used to identify the algorithm.

`get_plot(name: str) -> PlotEntity`

Get a plot by its name from a benchmark.

If the plot is not present, an error will be raised.

Parameters:

name (str) –

The name of the plot to be retrieved.

Raises:

DataNotExistError –

Raised if no plot with the given name could be retrieved.

`add_plot(name: str, plot: BasePlot) -> PlotEntity`

Add a plot to the benchmark with a given name.

This method adds a plot to the benchmark. The name must be unique within the benchmark. When the benchmark is rerun, the results for this plot will be calculated.

The plot must also be defined in the registry; otherwise, an error is raised. See the documentation on the registry for how to define it.

Parameters:

name (str) –

The name of the plot to add.
plot (BasePlot) –

The plot to add.

Returns:

PlotEntity –

The added plot.

`remove_plot(plot: str | PlotEntity) -> None`

Remove a plot from the benchmark.

Parameters:

plot (str | PlotEntity) –

The name of the plot to remove or the plot object itself. Make sure to use the Plot object and not only an IPlot object. This is important because the plot name is used to identify the plot.

`run_features() -> None`

Calculate all configured features for all models of this benchmark.

Parameters:

benchmark_run_features –

`run_algorithms() -> None`

Run all configured algorithms for all models of this benchmark.

`run_plots() -> None`

Execute all plots registered in the benchmark.

Iterates through all plots in the benchmark, validates each plot against the benchmark data, and executes the plot generation. Each plot is validated before execution to ensure required data (metrics, features, etc.) is available. Plot execution is sequential and follows the order defined in the benchmark configuration.

Raises:

RuntimeError –

If plot validation or execution fails. The RuntimeError wraps the underlying error, which may be PlotRunError (for validation failures) or UnknownLunaBenchError (for unexpected execution errors). Only raised in FAIL_ON_ERROR mode; in CONTINUE_ON_ERROR mode, errors are logged as warnings instead.

Notes

In FAIL_ON_ERROR mode, the method stops at the first validation or execution error. In CONTINUE_ON_ERROR mode, errors are logged and execution continues with remaining plots.
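The difference between the two error modes can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the `ErrorMode` enum, the `run_plots_sketch` function, and the representation of a plot as a plain callable (with validation folded into the call) are all hypothetical.

```python
import logging
from enum import Enum
from typing import Callable

logger = logging.getLogger("run_plots_sketch")


class ErrorMode(Enum):
    """Hypothetical enum mirroring the two modes named above."""

    FAIL_ON_ERROR = "fail"
    CONTINUE_ON_ERROR = "continue"


def run_plots_sketch(
    plots: dict[str, Callable[[], None]], mode: ErrorMode
) -> list[str]:
    """Run plots sequentially in insertion order; return the names that finished."""
    completed = []
    for name, plot in plots.items():
        try:
            plot()  # stands in for validation followed by plot generation
        except Exception as exc:
            if mode is ErrorMode.FAIL_ON_ERROR:
                # Stop at the first error and wrap the underlying cause.
                raise RuntimeError(f"plot {name!r} failed") from exc
            # Otherwise log a warning and move on to the next plot.
            logger.warning("plot %r failed: %s", name, exc)
            continue
        completed.append(name)
    return completed


def ok() -> None:
    pass


def broken() -> None:
    raise KeyError("missing metric")


plots = {"a": ok, "b": broken, "c": ok}
completed = run_plots_sketch(plots, ErrorMode.CONTINUE_ON_ERROR)
```

In this sketch the original exception stays reachable through `__cause__` of the `RuntimeError`, which matches the description of the wrapped underlying error.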

`add_dependencies() -> None`

Add any required dependencies for the benchmark execution.

`run() -> None`

Execute the benchmark.

`results_to_dataframe(*, inlcude_solution: bool = False) -> pd.DataFrame`

Return all benchmark results as a single DataFrame.

Builds individual DataFrames for each feature (see .features_to_dataframe()), algorithm (see .algorithms_to_dataframe()), and metric entity (see .metrics_to_dataframe()), then merges them. Features merge on model; metrics merge on (algorithm, model). Feature values are repeated across algorithms for the same model since features are model-level.

Returns:

DataFrame –

A DataFrame with columns algorithm, model, plus one column per result field of each feature and metric.
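The merge keys described above can be illustrated without pandas, using plain lists of dictionaries as rows. The column names `num_vars` and `runtime` and all values are made up for the example, and keeping every (algorithm, model) row even when a feature or metric is missing is an assumption about the join type.

```python
# Feature results: one row per model.
features = [
    {"model": "m1", "num_vars": 10},
    {"model": "m2", "num_vars": 25},
]

# Algorithm results: one row per (algorithm, model) combination.
runs = [
    {"algorithm": "a1", "model": "m1"},
    {"algorithm": "a2", "model": "m1"},
    {"algorithm": "a1", "model": "m2"},
]

# Metric results: one row per (algorithm, model) combination.
metrics = [
    {"algorithm": "a1", "model": "m1", "runtime": 0.5},
    {"algorithm": "a2", "model": "m1", "runtime": 0.8},
    {"algorithm": "a1", "model": "m2", "runtime": 1.2},
]

# Index each table by its merge key.
features_by_model = {row["model"]: row for row in features}
metrics_by_key = {(row["algorithm"], row["model"]): row for row in metrics}

merged = []
for run in runs:
    row = dict(run)
    # Features are model-level, so they join on model only.
    row.update(features_by_model.get(run["model"], {}))
    # Metrics join on the (algorithm, model) pair.
    row.update(metrics_by_key.get((run["algorithm"], run["model"]), {}))
    merged.append(row)
```

The rows for `a1` and `a2` on model `m1` carry the same `num_vars` value, which is the repetition of feature values across algorithms mentioned above.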

`features_to_dataframe(feature_entity: FeatureEntity) -> pd.DataFrame`

Return results for a single feature entity as a DataFrame with one row per model.

`all_features_to_dataframe() -> pd.DataFrame`

Return all feature results merged into a single DataFrame on model.

`metrics_to_dataframe(metric_entity: MetricEntity) -> pd.DataFrame`

Return results for a single metric entity as a DataFrame with one row per (algorithm, model).

`all_metrics_to_dataframe() -> pd.DataFrame`

Return all metric results merged into a single DataFrame on (algorithm, model).

`algorithms_to_dataframe(exclude: set[str] | None = None) -> pd.DataFrame`

Return all (algorithm, model) combinations as a DataFrame.

`list_feature_classes() -> list[type[BaseFeature]]`

Return the feature classes registered on this benchmark.

`list_metrics_classes() -> list[type[BaseMetric]]`

Return the metric classes registered on this benchmark.

`list_plots_classes() -> list[type[BasePlot]]`

Return the plot classes registered on this benchmark.

`list_algorithms() -> list[tuple[type[BaseAlgorithmSync | BaseAlgorithmAsync[Any]], dict[str, Any]]]`

Return the algorithm classes registered on this benchmark, each as a (class, dict[str, Any]) tuple.

ModelSet

Bases: ModelSetEntity

Set of models.

Represents a collection of models with operations for creating, loading, adding, removing, and deleting models.

Attributes:

id (int) –

The unique identifier for the model set.
name (str) –

The name of the model set.
models (list[ModelMetadata]) –

A list of ModelMetadata objects representing the models in this set.

`create(modelset_name: str) -> ModelSet` `staticmethod`

Create a new model set with the given name.

Creates a new model set using the provided dataset name and a model set creation use case.

Parameters:

modelset_name (str) –

The name of the new model set.
modelset_create ((ModelSetCreateUc, injected)) –

The use case for creating model sets, by default provided by dependency injection.

Returns:

ModelSet –

An instance of ModelSet representing the successfully created model set.

`load(name: str) -> ModelSet` `staticmethod`

Load a model set by its name.

Retrieves a model set from the database using its unique name.

Parameters:

name (str) –

The unique name of the model set to load.
modelset_load ((ModelSetLoadUc, injected)) –

The use case for loading model sets, by default provided by dependency injection.

Returns:

ModelSet –

The loaded model set.

`load_all() -> list[ModelSet]` `staticmethod`

Load all model sets from the database.

Retrieves all model sets stored in the database.

Returns:

list[ModelSet] –

A list of all model sets.

`load_all_models() -> list[ModelMetadata]` `staticmethod`

Load all models from the database.

Retrieves all models stored in the database, regardless of which model set they belong to.

Parameters:

model_all ((ModelAllUc, injected)) –

The use case for retrieving all models, by default provided by dependency injection.

Returns:

list[ModelMetadata] –

A list of ModelMetadata objects representing all models in the database.

`add(model: Model) -> None`

Add a model to this model set.

Adds the specified model to this model set and updates the model set's state.

Parameters:

model (Model) –

The model to add to this model set.
modelset_add ((ModelSetAddUc, injected)) –

The use case for adding models to a model set, by default provided by dependency injection.

`remove_model(model: Model) -> None`

Remove a model from this model set.

Removes the specified model from this model set and updates the model set's state.

Parameters:

model (Model) –

The model to remove from this model set.
modelset_remove ((ModelSetRemoveUc, injected)) –

The use case for removing models from a model set, by default provided by dependency injection.

`delete() -> None`

Delete this model set from the database.

Permanently removes this model set from the database.

Parameters:

modelset_delete_uc ((ModelSetDeleteUc, injected)) –

The use case for deleting model sets, by default provided by dependency injection.
