Architecture
Luna Bench follows a layered architecture that separates the public API from internal implementation details. This page describes the key architectural decisions, the registration system, the pipeline execution model, and the persistence layer.
Package Structure
The codebase is organized into two distinct layers: a public API layer that users interact with directly, and an internal layer that contains implementation details.
graph TB
subgraph "Public API Layer"
core["Benchmark, ModelSet\n(top-level)"]
builtins["algorithms/ features/ metrics/ plots/\nBuilt-in components"]
custom["custom/\nDecorators, base classes, result types"]
entities["entities/\nData transfer objects"]
end
subgraph "Internal Layer (_internal/)"
usecases["usecases/\nBusiness logic orchestration"]
mappers["mappers/\nEntity-to-domain and domain-to-entity conversion"]
registries["registries/\nThread-safe component registries"]
dao["dao/\nData access objects (Peewee ORM)"]
domain_models["domain_models/\nInternal domain representations"]
wrappers["wrappers/\nAdapter wrappers for external libraries"]
end
core --> usecases
builtins --> custom
custom --> registries
usecases --> mappers
usecases --> dao
usecases --> registries
mappers --> domain_models
dao --> domain_models
Public vs. Internal
Everything under _internal/ is considered an implementation detail. Users should only depend on the public surface: the top-level Benchmark/ModelSet, the built-in component packages (algorithms, features, metrics, plots), custom, and entities. The internal layer may change between versions without notice.
Public API Layer
| Package | Purpose |
|---|---|
| Top level (benchmark.py, model_set.py) | High-level entry points: Benchmark and ModelSet |
| algorithms/, features/, metrics/, plots/ | Built-in components, imported directly (e.g. from luna_bench.metrics import Runtime) |
| custom/ | Decorators (@feature, @algorithm, @metric, @plot), base classes (BaseFeature, BaseMetric, ...), and result types for building custom components |
| entities/ | Data transfer objects used to pass results between pipeline stages |
Internal Layer
| Package | Purpose |
|---|---|
| usecases/ | Orchestrates business logic for each pipeline stage and for benchmark/model set management |
| mappers/ | Converts between public entities and internal domain models |
| registries/ | Thread-safe registries that store registered component classes |
| dao/ | Data access objects backed by Peewee ORM for SQLite persistence |
| domain_models/ | Internal representations used by usecases and DAOs |
| wrappers/ | Adapters around external libraries and services |
Registration System
Luna Bench uses a decorator-based registration system backed by metaclasses. When you define a custom component, you decorate it with one of the provided decorators, and it is automatically registered in a thread-safe registry.
sequenceDiagram
participant User as User Code
participant Dec as @feature / @algorithm / ...
participant Meta as RegisteredClassMeta / MetricClassMeta
participant Reg as Registry
User->>Dec: Decorate class with @feature
Dec->>Meta: Metaclass __init_subclass__ triggers
Meta->>Reg: Register class in thread-safe registry
Note over Reg: Class is now discoverable<br/>via RegistryInfo
User->>Reg: RegistryInfo.log_registered_features()
Reg-->>User: Returns registered feature IDs
Decorators
The four registration decorators are:
- @feature: Registers a BaseFeature subclass
- @algorithm: Registers a BaseAlgorithmSync or BaseAlgorithmAsync subclass
- @metric: Registers a BaseMetric subclass
- @plot: Registers a BasePlot subclass
Metaclasses
Two metaclasses power the registration:
- RegisteredClassMeta: Used by features, algorithms, and plots. Intercepts class creation to insert the class into the appropriate registry.
- MetricClassMeta: A specialized variant for metrics that handles additional metric-specific metadata.
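The idea of registering a class as a side effect of creating it can be sketched in a few lines. This is a minimal illustration, not the implementation of RegisteredClassMeta: the names `Registry`, `RegisteringMeta`, `BaseFeatureSketch`, and `FEATURES` are hypothetical, the class name is assumed to serve as the component ID, and the decorator layer is left out so that only the metaclass step is shown.

```python
import threading


class Registry:
    """Hypothetical thread-safe mapping from component ID to class."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._items: dict[str, type] = {}

    def register(self, component_id: str, cls: type) -> None:
        # The lock makes the duplicate check and the insert one atomic step.
        with self._lock:
            if component_id in self._items:
                raise ValueError(f"duplicate component ID: {component_id}")
            self._items[component_id] = cls

    def ids(self) -> list[str]:
        with self._lock:
            return sorted(self._items)


FEATURES = Registry()  # hypothetical registry instance for features


class RegisteringMeta(type):
    """Hypothetical metaclass that registers every concrete subclass."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # The base class itself has no parent built by this metaclass,
        # so it is skipped; only its subclasses are registered.
        if any(isinstance(base, RegisteringMeta) for base in bases):
            FEATURES.register(name, cls)


class BaseFeatureSketch(metaclass=RegisteringMeta):
    def run(self, model):
        raise NotImplementedError


class VariableCount(BaseFeatureSketch):
    def run(self, model):
        return len(model["variables"])


registered_ids = FEATURES.ids()
```

Defining `VariableCount` is enough to make it discoverable; no explicit call to the registry appears in user code.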
Inspecting the registries
RegistryInfo (in luna_bench.custom) reports what is currently registered. Each method logs the IDs and returns them as a list:
| Method | Returns |
|---|---|
| RegistryInfo.log_registered_features() | registered feature IDs |
| RegistryInfo.log_registered_sync() | registered synchronous algorithm IDs |
| RegistryInfo.log_registered_algorithms_async() | registered asynchronous algorithm IDs |
| RegistryInfo.log_registered_metrics() | registered metric IDs |
| RegistryInfo.log_registered_plots() | registered plot IDs |
| RegistryInfo.log_registry_contents() | logs every registry at once |
Thread Safety
All registry operations are thread-safe. You can register components from multiple threads without risk of data races or lost registrations.
Pipeline Execution
The benchmark runs a four-stage pipeline in which each stage feeds its results into the next. A full run always executes the stages in this order: Features, Algorithms, Metrics, Plots.
flowchart TD
Start([bench.run]) --> FeaturesStage
subgraph FeaturesStage ["Stage 1: Features"]
direction LR
F1[For each model in ModelSet]
F2[Run all registered features]
F3[Store FeatureResult per model]
F1 --> F2 --> F3
end
FeaturesStage --> AlgorithmsStage
subgraph AlgorithmsStage ["Stage 2: Algorithms"]
direction LR
A1["Cartesian product:\nalgorithm x model"]
A2[Run algorithm on model]
A3[Store AlgorithmResult per pair]
A1 --> A2 --> A3
end
AlgorithmsStage --> MetricsStage
subgraph MetricsStage ["Stage 3: Metrics"]
direction LR
M1["For each (algorithm, model) pair"]
M2[Evaluate all registered metrics]
M3[Store MetricResult per triple]
M1 --> M2 --> M3
end
MetricsStage --> PlotsStage
subgraph PlotsStage ["Stage 4: Plots"]
direction LR
P1[Collect all feature and metric data]
P2[Run all registered plots]
P3[Generate and store visualizations]
P1 --> P2 --> P3
end
PlotsStage --> End([Pipeline complete])
Stage Details
Stage 1: Features. Feature extractors run once per model in the model set. They compute properties such as variable count, constraint count, or problem structure. Feature results are persisted and available to later stages.
Stage 2: Algorithms. Every registered algorithm runs on every model (cartesian product). Synchronous algorithms (BaseAlgorithmSync) execute in-process. Asynchronous algorithms (BaseAlgorithmAsync) submit jobs to a background task queue and poll for results.
Stage 3: Metrics. For each (algorithm, model) result pair, all registered metrics are evaluated. Metrics compute values like approximation ratio, runtime, feasibility rate, or time-to-solution.
Stage 4: Plots. Plot generators receive the full set of feature and metric results. They produce visualizations (e.g., comparison charts, scaling plots) and store them alongside the benchmark data.
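The data flow between the four stages can be sketched with plain dictionaries and callables. This is an illustration of the shape of each stage's output under simplifying assumptions, not Luna Bench's execution code: `run_pipeline` and the toy models, features, algorithms, metrics, and plots below are all hypothetical, and persistence is omitted.

```python
from itertools import product


def run_pipeline(models, features, algorithms, metrics, plots):
    # Stage 1: one result per (feature, model).
    feature_results = {
        (f_name, m_name): feature(model)
        for m_name, model in models.items()
        for f_name, feature in features.items()
    }
    # Stage 2: cartesian product of algorithms and models.
    algorithm_results = {
        (a_name, m_name): algorithms[a_name](models[m_name])
        for a_name, m_name in product(algorithms, models)
    }
    # Stage 3: one result per (metric, algorithm, model) triple.
    metric_results = {
        (metric_name, a_name, m_name): metric(result)
        for (a_name, m_name), result in algorithm_results.items()
        for metric_name, metric in metrics.items()
    }
    # Stage 4: each plot sees all feature and metric results.
    plot_results = {
        p_name: plot(feature_results, metric_results)
        for p_name, plot in plots.items()
    }
    return feature_results, algorithm_results, metric_results, plot_results


# Toy inputs: a "model" is a list of numbers, an "algorithm" picks one of them.
models = {"m1": [3, 1, 2], "m2": [5, 4]}
features = {"size": len}
algorithms = {"min": min, "max": max}
metrics = {"is_even": lambda result: result % 2 == 0}
plots = {"summary": lambda f, m: f"{len(f)} feature values, {len(m)} metric values"}

feature_results, algorithm_results, metric_results, plot_results = run_pipeline(
    models, features, algorithms, metrics, plots
)
```

With two models, one feature, two algorithms, and one metric, the stages produce two feature results, four algorithm results, and four metric results.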
Selective Execution
You can run individual stages instead of the full pipeline. The Benchmark class exposes methods for running each stage independently, which is useful during development or when re-generating plots after adjusting configurations.
Dependency Injection
Luna Bench uses a container-based dependency injection system. Internal components receive their dependencies through containers rather than constructing them directly. This makes internal components easier to test in isolation and allows implementations to be swapped.
The framework defines the following containers:
| Container | Provides |
|---|---|
| UsecaseContainer | Business logic orchestrators for each pipeline stage |
| RegistryContainer | Access to all component registries |
| MapperContainer | Entity-to-domain and domain-to-entity mappers |
| DaoContainer | Data access objects for SQLite persistence |
| BackgroundTaskContainer | Job queue and worker management for async algorithms |
Dependencies are wired using the @inject decorator, which resolves dependencies from the appropriate container at call time.
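The call-time resolution described above can be illustrated with a hand-rolled container. This is a minimal sketch of the general pattern, not the framework's containers or its @inject decorator: the `Container` class, this `inject` function, and the two DAO classes are hypothetical stand-ins.

```python
import functools
import inspect


class Container:
    """Hypothetical container: maps dependency names to provider callables."""

    def __init__(self, **providers):
        self._providers = dict(providers)

    def __contains__(self, name):
        return name in self._providers

    def override(self, name, provider):
        self._providers[name] = provider

    def resolve(self, name):
        return self._providers[name]()


def inject(container):
    """Fill in any missing argument whose name the container can provide."""

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature.bind_partial(*args, **kwargs)
            for name in signature.parameters:
                # Resolve at call time, so later overrides take effect.
                if name not in bound.arguments and name in container:
                    kwargs[name] = container.resolve(name)
            return func(*args, **kwargs)

        return wrapper

    return decorator


class SqliteResultDao:
    def save(self, result):
        return f"sqlite:{result}"


class InMemoryResultDao:
    def save(self, result):
        return f"memory:{result}"


daos = Container(result_dao=SqliteResultDao)


@inject(daos)
def store_result(result, result_dao):
    return result_dao.save(result)


default_outcome = store_result("r1")
daos.override("result_dao", InMemoryResultDao)  # e.g. in a test
overridden_outcome = store_result("r1")
```

Because the function never constructs its own DAO, a test can swap in the in-memory variant without touching `store_result`.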
Internal Detail
The dependency injection system is part of the internal layer. Users do not need to interact with containers directly. They are documented here for contributors and for understanding the internal architecture.
Database
Luna Bench persists all benchmark configurations and results in SQLite databases managed through the Peewee ORM. Two separate database files are used:
| Database File | Purpose |
|---|---|
| luna_bench.db | Main database: benchmarks, model sets, models, components, and all results |
| luna_bench-jobs.db | Job queue database: tracks async algorithm job state and scheduling |
Schema Overview
The main database contains the following tables:
- Benchmark: Benchmark configuration and metadata
- ModelSet: Named collections of models
- Model: Individual optimization models belonging to a model set
- Algorithm: Registered algorithm instances and their configurations
- Feature: Registered feature instances
- Metric: Registered metric instances
- Plot: Registered plot instances
- Result tables: Store outputs from each pipeline stage (feature results, algorithm results, metric results, plot artifacts)
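To show how such tables might relate to one another, the sketch below builds a small subset of them in an in-memory database. It uses Python's built-in sqlite3 module rather than Peewee to stay dependency-free, and every table and column name in it is an assumption made for illustration; the actual schema is defined by the internal DAO layer and may differ.

```python
import sqlite3

# Hypothetical subset of the schema: a model belongs to a model set,
# and a feature result links one feature to one model.
SCHEMA = """
CREATE TABLE model_set (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE model (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    model_set_id INTEGER NOT NULL REFERENCES model_set(id)
);
CREATE TABLE feature (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE feature_result (
    feature_id INTEGER NOT NULL REFERENCES feature(id),
    model_id INTEGER NOT NULL REFERENCES model(id),
    value REAL NOT NULL,
    PRIMARY KEY (feature_id, model_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript(SCHEMA)

conn.execute("INSERT INTO model_set (id, name) VALUES (1, 'demo')")
conn.execute("INSERT INTO model (id, name, model_set_id) VALUES (1, 'm1', 1)")
conn.execute("INSERT INTO feature (id, name) VALUES (1, 'var_count')")
conn.execute("INSERT INTO feature_result (feature_id, model_id, value) VALUES (1, 1, 12.0)")

# Join a stored result back to the model set, model, and feature it belongs to.
rows = conn.execute(
    """
    SELECT ms.name, m.name, f.name, fr.value
    FROM feature_result AS fr
    JOIN model AS m ON m.id = fr.model_id
    JOIN model_set AS ms ON ms.id = m.model_set_id
    JOIN feature AS f ON f.id = fr.feature_id
    """
).fetchall()
```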
Automatic Management
Database creation and schema migrations are handled automatically. Users do not need to manage the database directly. The Benchmark.create() and ModelSet.create() factory methods handle all persistence setup.
Async Algorithm Support
For algorithms that call cloud services or quantum hardware, Luna Bench provides BaseAlgorithmAsync. Async algorithms submit work to a Huey-based job queue rather than blocking the main thread.
Key characteristics:
- Subclass BaseAlgorithmAsync instead of BaseAlgorithmSync to define an async algorithm.
- Jobs are submitted to a background task queue backed by Huey and the luna_bench-jobs.db SQLite database.
- A configurable pool of workers (default: 10) processes jobs concurrently.
- The pipeline automatically waits for all async jobs to complete before proceeding to the Metrics stage.
- Worker count can be adjusted based on API rate limits or hardware constraints.
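The submit-then-wait behavior listed above can be sketched with the standard library. This example swaps in `concurrent.futures.ThreadPoolExecutor` for the Huey queue so that it runs without extra dependencies; `call_remote_solver` and `run_async_stage` are hypothetical names, and the sleep stands in for the latency of an external service.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_remote_solver(model_name: str) -> str:
    """Stand-in for a slow external call (cloud solver, hardware API)."""
    time.sleep(0.01)
    return f"solution for {model_name}"


def run_async_stage(model_names, max_workers=10, poll_interval=0.005):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every job up front; at most max_workers run at once.
        jobs = {name: pool.submit(call_remote_solver, name) for name in model_names}
        # Poll until every job has finished before handing results
        # to the next stage.
        while not all(job.done() for job in jobs.values()):
            time.sleep(poll_interval)
        return {name: job.result() for name, job in jobs.items()}


results = run_async_stage([f"model-{i}" for i in range(25)])
```

Lowering `max_workers` is the knob to turn when the external service enforces rate limits.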
When to Use Async
Use BaseAlgorithmAsync when your algorithm calls an external service with non-trivial latency (cloud solvers, quantum hardware APIs). For in-process solvers like SCIP or Gurobi, use the synchronous BaseAlgorithmSync base class instead.