Core Components
Benchmark
Bases: BenchmarkEntity
The Benchmark class represents a benchmark in the LunaBench system.
This class is responsible for managing benchmark-related operations, including creating and deleting benchmarks. It provides methods for interacting with the benchmark data and executing benchmark runs.
create(name: str) -> Benchmark
staticmethod
open(name: str) -> Benchmark
staticmethod
load(name: str) -> Benchmark
staticmethod
load_all() -> list[Benchmark]
staticmethod
set_modelset(modelset: str | ModelSet) -> None
Set the modelset for the benchmark.
This method sets the modelset for the benchmark. Changing the modelset can affect the results of the benchmark. Therefore, it is recommended not to change the modelset after the benchmark has been created. If a change is necessary, the results of the benchmark should be deleted and the benchmark itself should be re-run.
Parameters:
-
modelset(str | ModelSet) –The modelset to set, given either by its name or as a ModelSet object.
remove_modelset() -> None
Remove the modelset from the benchmark.
This method removes the modelset from the benchmark. If the modelset is not set, this method does nothing. After removing the modelset, the results of the benchmark may be invalid.
get_feature(name: str) -> FeatureEntity
Get a feature by its name from a benchmark.
If the feature is not present, an error will be raised.
Parameters:
-
name(str) –The name of the feature to be retrieved.
Raises:
-
DataNotExistError–Raised if no feature with the given name could be retrieved.
add_feature(name: str, feature: BaseFeature) -> FeatureEntity
Add a feature to the benchmark with a given name.
This method adds a feature to the benchmark. The name must be unique within the benchmark. When the benchmark is rerun, the feature will be used to calculate the metrics for each algorithm result.
Also, the feature must be defined in the registry. If it is not, an error is raised. To fix this, check the documentation on how to register it.
Parameters:
-
name(str) –Name of the feature to add.
-
feature(BaseFeature) –The feature to add.
Returns:
-
FeatureEntity–The added feature.
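Because get_feature raises when the name is unknown and add_feature requires a unique name, a "get or add" helper is a natural pattern. The following is a minimal sketch under stated assumptions: get_or_add_feature, StubBenchmark and StubMissingError are hypothetical names that are not part of LunaBench, and the error class is passed in as an argument because the import path of DataNotExistError is not given here. The stub only exists so the example runs on its own.

```python
from typing import Any


def get_or_add_feature(
    benchmark: Any, name: str, feature: Any, missing_error: type[Exception]
) -> Any:
    """Return the feature stored under `name`, adding `feature` first if absent."""
    try:
        return benchmark.get_feature(name)
    except missing_error:
        # Documented behaviour: a missing name raises instead of returning None.
        return benchmark.add_feature(name, feature)


# --- Stand-ins so the sketch runs without LunaBench installed (hypothetical) ---
class StubMissingError(Exception):
    """Plays the role of DataNotExistError."""


class StubBenchmark:
    """In-memory object exposing only the two methods used above."""

    def __init__(self) -> None:
        self._features: dict[str, Any] = {}

    def get_feature(self, name: str) -> Any:
        if name not in self._features:
            raise StubMissingError(name)
        return self._features[name]

    def add_feature(self, name: str, feature: Any) -> Any:
        self._features[name] = feature
        return feature


bench = StubBenchmark()
first = get_or_add_feature(bench, "size", "feature-v1", StubMissingError)
second = get_or_add_feature(bench, "size", "feature-v2", StubMissingError)
print(first, second)  # feature-v1 feature-v1
```

The second call returns the already stored feature, so the name stays unique within the benchmark. The same shape should apply to metrics, algorithms and plots, assuming their get_* and add_* methods behave as described in their entries.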
remove_feature(feature: str | FeatureEntity) -> None
Remove a feature from the benchmark.
Parameters:
-
feature(str | FeatureEntity) –The name of the feature to remove or the feature object itself. Make sure to use the FeatureUserModel object and not only an IFeature object. This is important because the feature name is used to identify the feature.
get_metric(name: str) -> MetricEntity
Get a metric by its name from a benchmark.
If the metric is not present, an error will be raised.
Parameters:
-
name(str) –The name of the metric to be retrieved.
Raises:
-
DataNotExistError–Raised if no metric with the given name could be retrieved.
add_metric(name: str, metric: BaseMetric) -> MetricEntity
Add a metric to the benchmark with a given name.
This method adds a metric to the benchmark. The name must be unique within the benchmark. When the benchmark is rerun, the metric will be calculated for each algorithm result.
Also, the metric must be defined in the registry. If it is not, an error is raised. To fix this, check the documentation on how to register it.
Parameters:
-
name(str) –The name of the metric to add.
-
metric(BaseMetric) –An instance of the metric to add.
Returns:
-
MetricEntity–The added metric.
remove_metric(metric: str | MetricEntity) -> None
Remove a metric from the benchmark.
Parameters:
-
metric(str | MetricEntity) –The name of the metric to remove or the metric object itself. Make sure to use the MetricUserModel object and not only an IMetric object. This is important because the metric name is used to identify the metric.
get_algorithm(name: str) -> AlgorithmEntity
Get an algorithm by its name from a benchmark.
If the algorithm is not present, an error will be raised.
Parameters:
-
name(str) –The name of the algorithm to be retrieved.
Raises:
-
DataNotExistError–Raised if no algorithm with the given name could be retrieved.
add_algorithm(name: str, algorithm: IAlgorithm[Any] | BaseAlgorithmSync | BaseAlgorithmAsync[Any]) -> AlgorithmEntity
Add an algorithm to the benchmark with a given name.
This method adds an algorithm to the benchmark. The name must be unique within the benchmark. When the benchmark is rerun, the results for this algorithm will be calculated.
Also, the algorithm must be defined in the registry. If it is not, an error is raised. To fix this, check the documentation on how to register it.
Parameters:
-
name(str) –The name of the algorithm to add.
-
algorithm(IAlgorithm[Any] | BaseAlgorithmSync | BaseAlgorithmAsync[Any]) –An instance of the algorithm to add.
Returns:
-
AlgorithmEntity–The added algorithm.
remove_algorithm(algorithm: str | AlgorithmEntity) -> None
Remove an algorithm from the benchmark.
Parameters:
-
algorithm(str | AlgorithmEntity) –The name of the algorithm to remove or the algorithm object itself. Make sure to use the AlgorithmUserModel object and not only an IAlgorithm object. This is important because the algorithm name is used to identify the algorithm.
get_plot(name: str) -> PlotEntity
Get a plot by its name from a benchmark.
If the plot is not present, an error will be raised.
Parameters:
-
name(str) –The name of the plot to be retrieved.
Raises:
-
DataNotExistError–Raised if no plot with the given name could be retrieved.
add_plot(name: str, plot: BasePlot) -> PlotEntity
Add a plot to the benchmark with a given name.
This method adds a plot to the benchmark. The name must be unique within the benchmark. When the benchmark is rerun, the results for this plot will be calculated.
Also, the plot must be defined in the registry. If it is not, an error is raised. To fix this, check the documentation on how to register it.
Parameters:
-
name(str) –The name of the plot to add.
-
plot(BasePlot) –An instance of the plot to add.
Returns:
-
PlotEntity–The added plot.
remove_plot(plot: str | PlotEntity) -> None
Remove a plot from the benchmark.
Parameters:
-
plot(str | PlotEntity) –The name of the plot to remove or the plot object itself. Make sure to use the Plot object and not only an IPlot object. This is important because the plot name is used to identify the plot.
run_features() -> None
Calculate all configured features for all models of this benchmark.
Parameters:
-
benchmark_run_features–
run_algorithms() -> None
Run all configured algorithms for all models of this benchmark.
run_plots() -> None
Execute all plots registered in the benchmark.
Iterates through all plots in the benchmark, validates each plot against the benchmark data, and executes the plot generation. Each plot is validated before execution to ensure required data (metrics, features, etc.) is available. Plot execution is sequential and follows the order defined in the benchmark configuration.
Raises:
-
RuntimeError–If plot validation or execution fails. The RuntimeError wraps the underlying error, which may be PlotRunError (for validation failures) or UnknownLunaBenchError (for unexpected execution errors). Only raised in FAIL_ON_ERROR mode; in CONTINUE_ON_ERROR mode, errors are logged as warnings instead.
Notes
In FAIL_ON_ERROR mode, the method stops at the first validation or execution error. In CONTINUE_ON_ERROR mode, errors are logged and execution continues with remaining plots.
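The difference between the two error modes can be illustrated with a small, self-contained sketch. It is not the LunaBench implementation: ErrorMode, run_plots_sketch and the plain callables standing in for plots are assumptions made for illustration, and validation is folded into the callable itself.

```python
import logging
from enum import Enum
from typing import Callable

logger = logging.getLogger("plot_sketch")


class ErrorMode(Enum):
    """Hypothetical enum mirroring the two modes named above."""

    FAIL_ON_ERROR = "fail_on_error"
    CONTINUE_ON_ERROR = "continue_on_error"


def run_plots_sketch(plots: dict[str, Callable[[], None]], mode: ErrorMode) -> list[str]:
    """Run plots sequentially in insertion order and return the names that succeeded."""
    completed: list[str] = []
    for name, plot in plots.items():
        try:
            plot()
        except Exception as exc:
            if mode is ErrorMode.FAIL_ON_ERROR:
                # Stop at the first error and wrap the underlying cause.
                raise RuntimeError(f"plot {name!r} failed") from exc
            # CONTINUE_ON_ERROR: log a warning and move on to the next plot.
            logger.warning("plot %r failed: %s", name, exc)
            continue
        completed.append(name)
    return completed


def ok() -> None:
    pass


def broken() -> None:
    raise ValueError("required metric is missing")


plots = {"a": ok, "b": broken, "c": ok}
completed = run_plots_sketch(plots, ErrorMode.CONTINUE_ON_ERROR)
print(completed)  # ['a', 'c']
```

Calling run_plots_sketch(plots, ErrorMode.FAIL_ON_ERROR) with the same input raises RuntimeError at plot "b", and plot "c" is never executed.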
add_dependencies() -> None
Add any required dependencies for the benchmark execution.
run() -> None
Execute the benchmark.
results_to_dataframe(*, inlcude_solution: bool = False) -> pd.DataFrame
Return all benchmark results as a single DataFrame.
Builds individual DataFrames for each feature (see .features_to_dataframe()), algorithm (see .algorithms_to_dataframe()), and metric entity (see .metrics_to_dataframe()), then merges them. Features merge on model, metrics merge on (algorithm, model). Feature values are repeated across algorithms for the same model since features are model-level.
Returns:
-
DataFrame–A DataFrame with columns algorithm, model, plus one column per result field of each feature and metric.
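The merge keys described above can be reproduced with plain pandas. This is a sketch with made-up data, not the library's code: the column names num_vars and runtime are hypothetical result fields, and how="left" is an assumption about the join type.

```python
import pandas as pd

# One row per (algorithm, model) combination.
algorithms = pd.DataFrame(
    {"algorithm": ["a1", "a1", "a2", "a2"], "model": ["m1", "m2", "m1", "m2"]}
)

# Features are model-level: one row per model.
features = pd.DataFrame({"model": ["m1", "m2"], "num_vars": [10, 20]})

# Metrics are computed per (algorithm, model).
metrics = pd.DataFrame(
    {
        "algorithm": ["a1", "a1", "a2", "a2"],
        "model": ["m1", "m2", "m1", "m2"],
        "runtime": [0.1, 0.2, 0.3, 0.4],
    }
)

results = algorithms.merge(features, on="model", how="left").merge(
    metrics, on=["algorithm", "model"], how="left"
)
print(results)
```

Each model's num_vars value appears once per algorithm, which is the repetition of feature values the description refers to.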
features_to_dataframe(feature_entity: FeatureEntity) -> pd.DataFrame
Return results for a single feature entity as a DataFrame with one row per model.
all_features_to_dataframe() -> pd.DataFrame
Return all feature results merged into a single DataFrame on model.
metrics_to_dataframe(metric_entity: MetricEntity) -> pd.DataFrame
Return results for a single metric entity as a DataFrame with one row per (algorithm, model).
all_metrics_to_dataframe() -> pd.DataFrame
Return all metric results merged into a single DataFrame on (algorithm, model).
algorithms_to_dataframe(exclude: set[str] | None = None) -> pd.DataFrame
Return all (algorithm, model) combinations as a DataFrame.
list_feature_classes() -> list[type[BaseFeature]]
Return the feature classes registered on this benchmark.
list_metrics_classes() -> list[type[BaseMetric]]
Return the metric classes registered on this benchmark.
list_plots_classes() -> list[type[BasePlot]]
Return the plot classes registered on this benchmark.
list_algorithms() -> list[tuple[type[BaseAlgorithmSync | BaseAlgorithmAsync[Any]], dict[str, Any]]]
Return the algorithm classes registered on this benchmark.
ModelSet
Bases: ModelSetEntity
Set of models.
Represents a collection of models with operations for creating, loading, adding, removing, and deleting models.
Attributes:
-
id(int) –The unique identifier for the model set.
-
name(str) –The name of the model set.
-
models(list[ModelMetadata]) –A list of ModelMetadata objects representing the models in this set.
create(modelset_name: str) -> ModelSet
staticmethod
Create a new model set with the given dataset name.
Creates a new model set using the provided dataset name and a model set creation use case.
Parameters:
-
modelset_name(str) –The name of the dataset.
-
modelset_create((ModelSetCreateUc, injected)) –The use case for creating model sets, by default provided by dependency injection.
Returns:
-
ModelSet–An instance of ModelSet representing the successfully created model set.
load(name: str) -> ModelSet
staticmethod
Load a model set by its name.
Retrieves a model set from the database using its unique name.
Parameters:
-
name(str) –The unique name of the model set to load.
-
modelset_load((ModelSetLoadUc, injected)) –The use case for loading model sets, by default provided by dependency injection.
Returns:
-
ModelSet–The loaded model set.
load_all() -> list[ModelSet]
staticmethod
load_all_models() -> list[ModelMetadata]
staticmethod
Load all models from the database.
Retrieves all models stored in the database, regardless of which model set they belong to.
Parameters:
-
model_all((ModelAllUc, injected)) –The use case for retrieving all models, by default provided by dependency injection.
Returns:
-
list[ModelMetadata]–A list of ModelMetadata objects representing all models in the database.
add(model: Model) -> None
Add a model to this model set.
Adds the specified model to this model set and updates the model set's state.
Parameters:
-
model(Model) –The model to add to this model set.
-
modelset_add((ModelSetAddUc, injected)) –The use case for adding models to a model set, by default provided by dependency injection.
remove_model(model: Model) -> None
Remove a model from this model set.
Removes the specified model from this model set and updates the model set's state.
Parameters:
-
model(Model) –The model to remove from this model set.
-
modelset_remove((ModelSetRemoveUc, injected)) –The use case for removing models from a model set, by default provided by dependency injection.
delete() -> None
Delete this model set from the database.
Permanently removes this model set from the database.
Parameters:
-
modelset_delete_uc((ModelSetDeleteUc, injected)) –The use case for deleting model sets, by default provided by dependency injection.
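Several ModelSet methods list a parameter marked "injected" that does not appear in the public signature. The sketch below shows one generic way such a pattern can work, so that callers pass only the model. It is an illustration only: the inject decorator, CONTAINER, AddModelUc and ToyModelSet are hypothetical and say nothing about how LunaBench actually wires its use cases.

```python
import inspect
from functools import wraps
from typing import Any, Callable, Optional

# Hypothetical container: parameter name -> factory that builds the use case.
CONTAINER: dict[str, Callable[[], Any]] = {}


def inject(func: Callable[..., Any]) -> Callable[..., Any]:
    """Supply container-provided arguments for parameters the caller left out."""
    signature = inspect.signature(func)

    @wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        supplied = signature.bind_partial(*args, **kwargs).arguments
        for name in signature.parameters:
            if name not in supplied and name in CONTAINER:
                kwargs[name] = CONTAINER[name]()
        return func(*args, **kwargs)

    return wrapper


class AddModelUc:
    """Toy use case: appends the model to the given list."""

    def __call__(self, models: list[str], model: str) -> None:
        models.append(model)


CONTAINER["modelset_add"] = AddModelUc


class ToyModelSet:
    def __init__(self, name: str) -> None:
        self.name = name
        self.models: list[str] = []

    @inject
    def add(self, model: str, modelset_add: Optional[AddModelUc] = None) -> None:
        if modelset_add is None:
            raise RuntimeError("no use case available for 'modelset_add'")
        modelset_add(self.models, model)


toy = ToyModelSet("demo")
toy.add("model-a")  # the caller passes only the model
print(toy.models)  # ['model-a']
```

An explicitly passed use case takes precedence over the container in this sketch, which is convenient for tests that substitute a fake.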