Metrics

Overview

Metrics evaluate the quality and characteristics of algorithm solutions produced during a benchmark run. Each metric runs once per (algorithm, model) pair, receiving the Solution object together with any previously computed feature results. Metrics are the primary mechanism for quantifying solver performance in Luna Bench.

All metrics subclass BaseMetric (from luna_bench.custom) and implement a single method:

Python

def run(self, solution: Solution, feature_results: FeatureResultContainer) -> MetricResult: ...

Metrics are registered with the @metric decorator and declare their feature dependencies by passing the feature class(es) straight to that decorator.

Built-in Metrics

Luna Bench ships with six built-in metrics covering the most common evaluation scenarios.

Runtime

Module: luna_bench.metrics.runtime

Captures the total wall-clock runtime of the algorithm in seconds.

Returns: RuntimeResult(runtime_seconds: float)

FeasibilityRatio

Module: luna_bench.metrics.feasibility_ratio

Computes the fraction of feasible solutions in the sample set. A value of 1.0 indicates that every sampled solution satisfies all constraints.

Returns: FeasibilityRatioResult(feasibility_ratio: float)
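
As a rough illustration of the computation, the sketch below counts feasible samples in a plain list of booleans. The feasibility_ratio helper and its list-of-flags input are hypothetical stand-ins chosen to keep the example self-contained; the built-in metric works on a Solution object and may treat edge cases differently.

```python
def feasibility_ratio(feasible_flags: list[bool]) -> float:
    """Return the fraction of samples flagged as feasible (hypothetical helper)."""
    if not feasible_flags:
        # Assumption: an empty sample set is reported as 0.0 rather than raising.
        return 0.0
    # True counts as 1 and False as 0, so the sum is the number of feasible samples.
    return sum(feasible_flags) / len(feasible_flags)


# Three of the four samples satisfy all constraints.
ratio = feasibility_ratio([True, True, False, True])
```

Here ratio is 0.75; a list containing only True values would give 1.0, matching the interpretation above.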

ApproximationRatio

Module: luna_bench.metrics.approximation_ratio

Computes the ratio of the solution quality to that of the known optimal solution.

Requires: OptSolFeature

Returns: ApproximationRatioResult(approximation_ratio: float)

Parameters:

Name	Type	Default	Description
`abt_diff`	`float`	`1e-3`	Absolute tolerance used in ratio comparison.

BestSolutionFound

Module: luna_bench.metrics.best_solution_found

Records the best objective function value found by the algorithm.

Requires: OptSolFeature

Returns: BestSolutionFoundResult(best_solution_found: float)

Parameters:

Name	Type	Default	Description
`abs_tol`	`float`	`1e-3`	Absolute tolerance for treating a value as zero (avoids divide-by-zero).

TimeToSolution

Module: luna_bench.metrics.time_to_solution

Estimates how long the solver needs to find the optimal solution with high probability, based on the fraction of samples that reached the optimum.

Requires: OptSolFeature

Returns: TimeToSolutionResult(time_to_solution: float, probability_optimal: float, num_optimal_found: int, num_samples: int)

Parameters:

Name	Type	Default	Description
`target_probability`	`float`	`0.99`	Target probability of finding the optimal solution.
`abs_tol`	`float`	`1e-6`	Absolute tolerance for comparing objective values.
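
One common way to turn a per-run success probability into such an estimate is the repetition formula TTS = t * ln(1 - p_target) / ln(1 - p_success), where p_success is the fraction of samples that reached the optimum. The sketch below applies that textbook formula. It is not taken from luna_bench.metrics.time_to_solution, which may differ: the helper name estimate_time_to_solution, the meaning of runtime_seconds (time of one run), and the handling of the two edge cases are all assumptions.

```python
import math


def estimate_time_to_solution(
    runtime_seconds: float,
    num_optimal_found: int,
    num_samples: int,
    target_probability: float = 0.99,
) -> float:
    """Estimate the time needed to hit the optimum with the target probability."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    probability_optimal = num_optimal_found / num_samples
    if probability_optimal == 0.0:
        # Assumption: the optimum was never sampled, so no finite estimate exists.
        return math.inf
    if probability_optimal >= 1.0:
        # Assumption: every sample was optimal, so a single run suffices.
        return runtime_seconds
    # Repetitions R such that 1 - (1 - p)^R reaches the target probability.
    repetitions = math.log(1.0 - target_probability) / math.log(1.0 - probability_optimal)
    return runtime_seconds * repetitions


# 10 of 100 samples reached the optimum in a run that took 2 seconds.
tts = estimate_time_to_solution(runtime_seconds=2.0, num_optimal_found=10, num_samples=100)
```

With a 10% success rate, roughly 44 repetitions are needed to reach 99% confidence, so tts comes out at about 87 seconds.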

FractionOfOverallBestSolution

Module: luna_bench.metrics.fraction_of_overall_best_solution

Expresses the solution quality as a fraction of the best solution found across all algorithms in the benchmark, enabling cross-solver comparison.

Requires: OptSolFeature

Returns: FractionOfOverallBestSolutionResult(fraction_of_overall_best_solution: float)

Parameters:

Name	Type	Default	Description
`abs_tol`	`float`	`1e-6`	Absolute tolerance for treating two values as equal.

Feature Dependencies

Metrics can depend on features that must be computed before the metric runs. Declare those dependencies by passing the feature class (or a list of classes) to the @metric decorator:

Python

from luna_bench.custom import BaseMetric, metric
from luna_bench.features import OptSolFeature

@metric(OptSolFeature)
class MyMetric(BaseMetric[MyMetricResult]):
    ...

When a metric declares dependencies, Luna Bench guarantees those features have been computed and their results are available in the FeatureResultContainer passed to run().

Accessing feature results

FeatureResultContainer offers a few ways to retrieve computed feature data:

Python

# The single result for a feature class
result = feature_results.first(OptSolFeature)

# A specific result by feature name
result = feature_results.get(OptSolFeature, "optimal-solution")

# Every result for a feature class, keyed by name
all_results = feature_results.get_all(OptSolFeature)

Each method has a *_with_config variant (first_with_config, get_with_config, get_all_with_config) that also returns the feature instance that produced the result.

Writing Custom Metrics

You can write a custom metric as a class or as a plain function. The class form gives you a typed result and configuration parameters; the function form is a quick way to return a single value. The two paths differ: the function form takes a single step (Write the Function), while the class form takes two (Define a Result Type, then Implement the Metric).

Write the Function

Decorate a function that takes (solution, feature_results) and returns a value. A float or int is auto-wrapped in a MetricResult, so there is no separate result type to define. Declare feature dependencies by passing their classes to @metric, just as in the class form:

Python

from luna_bench.custom import FeatureResultContainer, metric
from luna_bench.features import OptSolFeature
from luna_quantum import Solution

@metric(OptSolFeature)
def my_metric(solution: Solution, feature_results: FeatureResultContainer) -> float:
    opt_sol = feature_results.first(OptSolFeature)
    return solution.expectation_value() / opt_sol.best_sol  # auto-wrapped in a MetricResult

Define a Result Type

Every metric returns an instance of a MetricResult subclass. Define the fields that your metric produces:

Python

from luna_bench.custom import MetricResult

class MyMetricResult(MetricResult):
    score: float

Implement the Metric

Subclass BaseMetric, implement run, and register with @metric. If the metric depends on features, pass them to the decorator:

Python

from luna_bench.custom import BaseMetric, FeatureResultContainer, metric
from luna_bench.features import OptSolFeature
from luna_quantum import Solution

@metric(OptSolFeature)
class MyMetric(BaseMetric[MyMetricResult]):
    threshold: float = 0.5

    def run(self, solution: Solution, feature_results: FeatureResultContainer) -> MyMetricResult:
        opt_sol = feature_results.first(OptSolFeature)
        score = solution.expectation_value() / opt_sol.best_sol
        return MyMetricResult(score=score)

Configuration Parameters

Metrics are Pydantic models, so configuration parameters are class-level attributes with default values. In the example above, threshold: float = 0.5 can be overridden when adding the metric to a benchmark.
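
The default-and-override pattern can be sketched without any dependencies. In the example below a standard-library dataclass stands in for the Pydantic model so that the snippet runs on its own; ToyMetric and its passes method are hypothetical names, and the way a benchmark actually passes overrides to a metric is not shown here.

```python
from dataclasses import dataclass


@dataclass
class ToyMetric:
    # Class-level attribute with a default, mirroring `threshold: float = 0.5`.
    threshold: float = 0.5

    def passes(self, score: float) -> bool:
        """Report whether a score reaches the configured threshold."""
        return score >= self.threshold


default_metric = ToyMetric()              # keeps the default threshold of 0.5
strict_metric = ToyMetric(threshold=0.9)  # overrides the default at construction
```

A score of 0.7 passes default_metric but not strict_metric, which is the kind of per-benchmark tuning a configuration parameter is meant for.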

Adding Metrics to a Benchmark

Once a metric is defined and registered, include it in your benchmark configuration. The framework resolves feature dependencies automatically: any features you passed to @metric are scheduled to run before the metric.
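
To make the ordering concrete, here is a toy sketch of dependency-first scheduling. It is not Luna Bench's implementation: toy_metric, REGISTRY, FEATURES, and run_all are hypothetical names, features are identified by plain strings instead of classes, and the numbers are made up for illustration.

```python
from typing import Callable

# Hypothetical registry: metric name -> (feature names it depends on, metric function).
REGISTRY: dict[str, tuple[tuple[str, ...], Callable[[dict[str, float]], float]]] = {}

# Hypothetical feature computations, keyed by feature name.
FEATURES: dict[str, Callable[[], float]] = {"opt_sol": lambda: 10.0}


def toy_metric(*feature_names: str):
    """Toy decorator that records a metric together with its feature dependencies."""
    def register(func):
        REGISTRY[func.__name__] = (feature_names, func)
        return func
    return register


@toy_metric("opt_sol")
def gap_to_optimum(feature_results: dict[str, float]) -> float:
    # The declared dependency is already computed when the metric runs.
    return 12.0 - feature_results["opt_sol"]


def run_all() -> tuple[list[str], dict[str, float]]:
    """Compute every declared feature first, then run the metrics."""
    order: list[str] = []
    feature_results: dict[str, float] = {}
    for deps, _ in REGISTRY.values():
        for name in deps:
            if name not in feature_results:  # compute each feature only once
                feature_results[name] = FEATURES[name]()
                order.append(f"feature:{name}")
    metric_results: dict[str, float] = {}
    for metric_name, (_, func) in REGISTRY.items():
        metric_results[metric_name] = func(feature_results)
        order.append(f"metric:{metric_name}")
    return order, metric_results


order, results = run_all()
```

The recorded order lists the feature before the metric that needs it, which is the guarantee described above.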

Refer to the Getting Started guide for details on how to wire metrics into a full benchmark run.