Architecture

Luna Bench follows a layered architecture that separates the public API from internal implementation details. This page describes the key architectural decisions, the registration system, the pipeline execution model, and the persistence layer.

Package Structure

The codebase is organized into two distinct layers: a public API layer that users interact with directly, and an internal layer that contains implementation details.

```mermaid
graph TB
    subgraph "Public API Layer"
        core["Benchmark, ModelSet\n(top-level)"]
        builtins["algorithms/ features/ metrics/ plots/\nBuilt-in components"]
        custom["custom/\nDecorators, base classes, result types"]
        entities["entities/\nData transfer objects"]
    end

    subgraph "Internal Layer (_internal/)"
        usecases["usecases/\nBusiness logic orchestration"]
        mappers["mappers/\nEntity-to-domain and domain-to-entity conversion"]
        registries["registries/\nThread-safe component registries"]
        dao["dao/\nData access objects (Peewee ORM)"]
        domain_models["domain_models/\nInternal domain representations"]
        wrappers["wrappers/\nAdapter wrappers for external libraries"]
    end

    core --> usecases
    builtins --> custom
    custom --> registries
    usecases --> mappers
    usecases --> dao
    usecases --> registries
    mappers --> domain_models
    dao --> domain_models
```

Public vs. Internal

Everything under _internal/ is considered an implementation detail. Users should only depend on the public surface: the top-level Benchmark/ModelSet, the built-in component packages (algorithms, features, metrics, plots), custom, and entities. The internal layer may change between versions without notice.

Public API Layer

| Package | Purpose |
| --- | --- |
| Top level (benchmark.py, model_set.py) | High-level entry points: Benchmark and ModelSet |
| algorithms/, features/, metrics/, plots/ | Built-in components, imported directly (e.g. from luna_bench.metrics import Runtime) |
| custom/ | Decorators (@feature, @algorithm, @metric, @plot), base classes (BaseFeature, BaseMetric, ...), and result types for building custom components |
| entities/ | Data transfer objects used to pass results between pipeline stages |

Internal Layer

| Package | Purpose |
| --- | --- |
| usecases/ | Orchestrates business logic for each pipeline stage and for benchmark/model set management |
| mappers/ | Converts between public entities and internal domain models |
| registries/ | Thread-safe registries that store registered component classes |
| dao/ | Data access objects backed by Peewee ORM for SQLite persistence |
| domain_models/ | Internal representations used by usecases and DAOs |
| wrappers/ | Adapters around external libraries and services |

Registration System

Luna Bench uses a decorator-based registration system backed by metaclasses. When you define a custom component, you decorate it with one of the provided decorators, and it is automatically registered in a thread-safe registry.

```mermaid
sequenceDiagram
    participant User as User Code
    participant Dec as @feature / @algorithm / ...
    participant Meta as RegisteredClassMeta / MetricClassMeta
    participant Reg as Registry

    User->>Dec: Decorate class with @feature
    Dec->>Meta: Metaclass __init_subclass__ triggers
    Meta->>Reg: Register class in thread-safe registry
    Note over Reg: Class is now discoverable<br/>via RegistryInfo
    User->>Reg: RegistryInfo.log_registered_features()
    Reg-->>User: Returns registered feature IDs
```

Decorators

The four registration decorators are:

  • @feature: Registers a BaseFeature subclass
  • @algorithm: Registers a BaseAlgorithmSync or BaseAlgorithmAsync subclass
  • @metric: Registers a BaseMetric subclass
  • @plot: Registers a BasePlot subclass

Metaclasses

Two metaclasses power the registration:

  • RegisteredClassMeta: Used by features, algorithms, and plots. Intercepts class creation to insert the class into the appropriate registry.
  • MetricClassMeta: A specialized variant for metrics that handles additional metric-specific metadata.
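The interplay of decorator, metaclass, and registry can be illustrated with a minimal, self-contained sketch. It does not use Luna Bench's actual classes: `Registry`, `RegisteringMeta`, `FEATURES`, and the stand-in `feature` and `BaseFeature` below are hypothetical, and the way the decorator hands the class to the metaclass (rebuilding it, as `six.add_metaclass` does) is only one possible design, not necessarily the one Luna Bench uses.

```python
import threading


class Registry:
    """Hypothetical thread-safe mapping from component ID to class."""

    def __init__(self):
        self._lock = threading.Lock()
        self._classes = {}

    def register(self, component_id, cls):
        with self._lock:
            if component_id in self._classes:
                raise ValueError(f"duplicate component ID: {component_id}")
            self._classes[component_id] = cls

    def ids(self):
        with self._lock:
            return sorted(self._classes)


FEATURES = Registry()  # assumption: one registry per component kind


class RegisteringMeta(type):
    """Hypothetical metaclass that registers each class it creates."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        FEATURES.register(name, cls)  # the class name doubles as its ID here


def feature(cls):
    """Stand-in decorator: rebuild the class through the metaclass."""
    namespace = {
        key: value
        for key, value in vars(cls).items()
        if key not in ("__dict__", "__weakref__")  # per-class slots, not copyable
    }
    return RegisteringMeta(cls.__name__, cls.__bases__, namespace)


class BaseFeature:
    """Stand-in base class; not decorated, so it is never registered."""

    def run(self, model):
        raise NotImplementedError


@feature
class VariableCount(BaseFeature):
    def run(self, model):
        return len(model["variables"])


registered = FEATURES.ids()
```

Because registration happens while the class statement is evaluated, importing a module that defines a decorated class is enough to make it discoverable. One limitation of the rebuild approach in this sketch is that methods using zero-argument `super()` would still refer to the original class.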

Inspecting the registries

RegistryInfo (in luna_bench.custom) reports what is currently registered. Each method logs the IDs and returns them as a list:

| Method | Returns |
| --- | --- |
| RegistryInfo.log_registered_features() | registered feature IDs |
| RegistryInfo.log_registered_sync() | registered synchronous algorithm IDs |
| RegistryInfo.log_registered_algorithms_async() | registered asynchronous algorithm IDs |
| RegistryInfo.log_registered_metrics() | registered metric IDs |
| RegistryInfo.log_registered_plots() | registered plot IDs |
| RegistryInfo.log_registry_contents() | logs every registry at once |

Thread Safety

All registry operations are thread-safe. You can register components from multiple threads without risk of data races or lost registrations.
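To see what "no lost registrations" means in practice, the following sketch has several threads race to register the same IDs. `ThreadSafeRegistry` is a hypothetical stand-in, not Luna Bench's registry; the point is that the check-then-insert step runs under a lock, so each ID is accepted exactly once no matter how the threads interleave.

```python
import threading


class ThreadSafeRegistry:
    """Hypothetical registry whose check-then-insert is atomic."""

    def __init__(self):
        self._lock = threading.Lock()
        self._classes = {}

    def register(self, component_id, cls):
        """Insert cls unless the ID is already taken; return True on success."""
        with self._lock:
            if component_id in self._classes:
                return False
            self._classes[component_id] = cls
            return True

    def __len__(self):
        with self._lock:
            return len(self._classes)


WORKERS = 8
IDS_PER_WORKER = 100

registry = ThreadSafeRegistry()
accepted_per_thread = [0] * WORKERS  # each thread writes only its own slot


def register_batch(worker):
    # Every worker tries the same IDs, so all of them compete for each one.
    for i in range(IDS_PER_WORKER):
        component_id = f"Feature{i}"
        if registry.register(component_id, type(component_id, (), {})):
            accepted_per_thread[worker] += 1


threads = [
    threading.Thread(target=register_batch, args=(worker,))
    for worker in range(WORKERS)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

total_accepted = sum(accepted_per_thread)
```

Without the lock, two threads could both pass the membership check for the same ID and both believe they registered it.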

Pipeline Execution

The benchmark runs a four-stage pipeline whose stages always execute in the same order: Features, Algorithms, Metrics, Plots. Each stage persists its results so that later stages can consume them.

```mermaid
flowchart TD
    Start([bench.run]) --> FeaturesStage

    subgraph FeaturesStage ["Stage 1: Features"]
        direction LR
        F1[For each model in ModelSet]
        F2[Run all registered features]
        F3[Store FeatureResult per model]
        F1 --> F2 --> F3
    end

    FeaturesStage --> AlgorithmsStage

    subgraph AlgorithmsStage ["Stage 2: Algorithms"]
        direction LR
        A1["Cartesian product:\nalgorithm x model"]
        A2[Run algorithm on model]
        A3[Store AlgorithmResult per pair]
        A1 --> A2 --> A3
    end

    AlgorithmsStage --> MetricsStage

    subgraph MetricsStage ["Stage 3: Metrics"]
        direction LR
        M1["For each (algorithm, model) pair"]
        M2[Evaluate all registered metrics]
        M3[Store MetricResult per triple]
        M1 --> M2 --> M3
    end

    MetricsStage --> PlotsStage

    subgraph PlotsStage ["Stage 4: Plots"]
        direction LR
        P1[Collect all feature and metric data]
        P2[Run all registered plots]
        P3[Generate and store visualizations]
        P1 --> P2 --> P3
    end

    PlotsStage --> End([Pipeline complete])
```

Stage Details

Stage 1: Features. Feature extractors run once per model in the model set. They compute properties such as variable count, constraint count, or problem structure. Feature results are persisted and available to later stages.

Stage 2: Algorithms. Every registered algorithm runs on every model (cartesian product). Synchronous algorithms (BaseAlgorithmSync) execute in-process. Asynchronous algorithms (BaseAlgorithmAsync) submit jobs to a background task queue and poll for results.

Stage 3: Metrics. For each (algorithm, model) result pair, all registered metrics are evaluated. Metrics compute values like approximation ratio, runtime, feasibility rate, or time-to-solution.

Stage 4: Plots. Plot generators receive the full set of feature and metric results. They produce visualizations (e.g., comparison charts, scaling plots) and store them alongside the benchmark data.
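The data flow between the four stages can be sketched with plain dictionaries. Nothing below is Luna Bench API: the "models" are just lists of numbers, the components are ordinary functions, and all names are hypothetical. The sketch only shows how the result keys grow from per-model, to per (algorithm, model) pair, to per (metric, algorithm, model) triple.

```python
from itertools import product

# Hypothetical stand-ins for a model set and the registered components.
models = {"small": [3, 1, 2], "large": [9, 7, 8, 6]}
features = {"size": len}
algorithms = {"sort": sorted, "identity": list}
metrics = {"is_sorted": lambda model, solution: solution == sorted(model)}

# Stage 1: every feature runs once per model.
feature_results = {
    (feature_name, model_name): feature(models[model_name])
    for feature_name, feature in features.items()
    for model_name in models
}

# Stage 2: every algorithm runs on every model (cartesian product).
algorithm_results = {
    (algo_name, model_name): algorithms[algo_name](models[model_name])
    for algo_name, model_name in product(algorithms, models)
}

# Stage 3: every metric is evaluated for every (algorithm, model) result.
metric_results = {
    (metric_name, algo_name, model_name): metric(models[model_name], solution)
    for (algo_name, model_name), solution in algorithm_results.items()
    for metric_name, metric in metrics.items()
}


# Stage 4: a "plot" sees all feature and metric results at once.
def text_report(feature_results, metric_results):
    lines = []
    for (metric_name, algo_name, model_name), value in sorted(metric_results.items()):
        size = feature_results[("size", model_name)]
        lines.append(f"{algo_name} on {model_name} (size={size}): {metric_name}={value}")
    return "\n".join(lines)


report = text_report(feature_results, metric_results)
```

With two algorithms and two models, Stage 2 produces four results, and a single metric turns those into four metric values.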

Selective Execution

You can run individual stages instead of the full pipeline. The Benchmark class exposes methods for running each stage independently, which is useful during development or when re-generating plots after adjusting configurations.

Dependency Injection

Luna Bench uses a container-based dependency injection system. Internal components receive their dependencies through containers rather than constructing them directly. This makes the system testable and allows swapping implementations.

The framework defines the following containers:

| Container | Provides |
| --- | --- |
| UsecaseContainer | Business logic orchestrators for each pipeline stage |
| RegistryContainer | Access to all component registries |
| MapperContainer | Entity-to-domain and domain-to-entity mappers |
| DaoContainer | Data access objects for SQLite persistence |
| BackgroundTaskContainer | Job queue and worker management for async algorithms |

Dependencies are wired using the @inject decorator, which resolves dependencies from the appropriate container at call time.
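The idea of resolving dependencies from a container at call time can be shown with a small hand-rolled sketch. This is not the injection library or the `@inject` implementation Luna Bench uses; `Container`, the stand-in `inject`, and the DAO classes are hypothetical and exist only to show why the pattern makes implementations swappable.

```python
import functools
import inspect


class Container:
    """Hypothetical container: maps a dependency name to a factory."""

    def __init__(self):
        self._providers = {}

    def provide(self, name, factory):
        self._providers[name] = factory

    def resolve(self, name):
        return self._providers[name]()

    def __contains__(self, name):
        return name in self._providers


container = Container()


def inject(func):
    """Stand-in decorator: fill missing arguments from the container per call."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind_partial(*args, **kwargs)
        for name in signature.parameters:
            # Only inject what the caller did not pass explicitly.
            if name not in bound.arguments and name in container:
                kwargs[name] = container.resolve(name)
        return func(*args, **kwargs)

    return wrapper


class StoredModelDao:
    def list_models(self):
        return ["model-a", "model-b"]


class FakeModelDao:
    def list_models(self):
        return ["fixture-model"]


@inject
def count_models(model_dao):
    return len(model_dao.list_models())


container.provide("model_dao", StoredModelDao)
production_count = count_models()

# Swapping the provider changes behavior without touching count_models.
container.provide("model_dao", FakeModelDao)
test_count = count_models()
```

Because resolution happens on each call rather than at decoration time, replacing a provider (for example in a test) takes effect immediately.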

Internal Detail

The dependency injection system is part of the internal layer, and users do not need to interact with containers directly. The containers are documented here for contributors and for readers who want to understand the internal architecture.

Database

Luna Bench persists all benchmark configurations and results in SQLite databases managed through the Peewee ORM. Two separate database files are used:

| Database File | Purpose |
| --- | --- |
| luna_bench.db | Main database: benchmarks, model sets, models, components, and all results |
| luna_bench-jobs.db | Job queue database: tracks async algorithm job state and scheduling |

Schema Overview

The main database contains the following tables:

  • Benchmark: Benchmark configuration and metadata
  • ModelSet: Named collections of models
  • Model: Individual optimization models belonging to a model set
  • Algorithm: Registered algorithm instances and their configurations
  • Feature: Registered feature instances
  • Metric: Registered metric instances
  • Plot: Registered plot instances
  • Result tables: Store outputs from each pipeline stage (feature results, algorithm results, metric results, plot artifacts)
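The relationships between a few of these tables can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module rather than Peewee, runs against an in-memory database, and every table and column name in it is an assumption made for illustration; the actual Luna Bench schema is not shown on this page and may differ.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical, simplified schema: a model belongs to a model set,
# and a feature result belongs to a model.
conn.executescript(
    """
    CREATE TABLE model_set (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE model (
        id           INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        model_set_id INTEGER NOT NULL REFERENCES model_set(id)
    );
    CREATE TABLE feature_result (
        id           INTEGER PRIMARY KEY,
        model_id     INTEGER NOT NULL REFERENCES model(id),
        feature_name TEXT NOT NULL,
        value        REAL,
        UNIQUE (model_id, feature_name)
    );
    """
)

set_id = conn.execute(
    "INSERT INTO model_set (name) VALUES (?)", ("demo",)
).lastrowid
model_id = conn.execute(
    "INSERT INTO model (name, model_set_id) VALUES (?, ?)", ("knapsack-10", set_id)
).lastrowid
conn.execute(
    "INSERT INTO feature_result (model_id, feature_name, value) VALUES (?, ?, ?)",
    (model_id, "variable_count", 10.0),
)
conn.commit()

# Join a stored feature result back to its model and model set.
rows = conn.execute(
    """
    SELECT ms.name, m.name, fr.feature_name, fr.value
    FROM feature_result AS fr
    JOIN model AS m ON m.id = fr.model_id
    JOIN model_set AS ms ON ms.id = m.model_set_id
    """
).fetchall()
```

The uniqueness constraint on (model_id, feature_name) reflects the statement above that features run once per model; it is a modeling choice of this sketch.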

Automatic Management

Database creation and schema migrations are handled automatically. Users do not need to manage the database directly. The Benchmark.create() and ModelSet.create() factory methods handle all persistence setup.

Async Algorithm Support

For algorithms that call cloud services or quantum hardware, Luna Bench provides BaseAlgorithmAsync. Async algorithms submit work to a Huey-based job queue rather than blocking the main thread.

Key characteristics:

  • Subclass BaseAlgorithmAsync instead of BaseAlgorithmSync to define an async algorithm.
  • Jobs are submitted to a background task queue backed by Huey and the luna_bench-jobs.db SQLite database.
  • A configurable pool of workers (default: 10) processes jobs concurrently.
  • The pipeline automatically waits for all async jobs to complete before proceeding to the Metrics stage.
  • Worker count can be adjusted based on API rate limits or hardware constraints.
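The submit-then-wait behavior described above can be sketched with the standard library. This example deliberately swaps Huey for `concurrent.futures.ThreadPoolExecutor`, so it shows the concurrency pattern only, not Luna Bench's job queue or its persistence in luna_bench-jobs.db. `call_remote_solver` and the algorithm and model names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

WORKER_COUNT = 10  # mirrors the default worker count stated above


def call_remote_solver(algorithm, model):
    """Hypothetical stand-in for a slow call to a cloud or quantum backend."""
    time.sleep(0.01)  # simulated network latency
    return {"algorithm": algorithm, "model": model, "objective": len(model)}


# One job per (algorithm, model) pair, as in the Algorithms stage.
pairs = [
    (algorithm, model)
    for algorithm in ("cloud-solver", "qpu-sampler")
    for model in ("m1", "m22", "m333")
]

with ThreadPoolExecutor(max_workers=WORKER_COUNT) as pool:
    futures = [pool.submit(call_remote_solver, a, m) for a, m in pairs]
    # Block until every job has finished before moving on to metrics.
    done, not_done = wait(futures)

# Results are collected in submission order, not completion order.
algorithm_results = [future.result() for future in futures]
```

Lowering `WORKER_COUNT` in this sketch limits how many calls are in flight at once, which is the same lever the worker count setting provides when an external API enforces rate limits.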

When to Use Async

Use BaseAlgorithmAsync when your algorithm calls an external service with non-trivial latency (cloud solvers, quantum hardware APIs). For in-process solvers like SCIP or Gurobi, use the synchronous BaseAlgorithmSync base class instead.