Credit Scoring Feature Selection API Reference

Data

Data model for the Credit Scoring Feature Selection use case.

CreditScoringFeatureSelectionData

Bases: UcData

Data for the Credit Scoring Feature Selection use case.

Selects the most informative and least redundant subset of features for credit scoring by balancing label correlation and inter-feature correlation.

correlations: tuple[np.ndarray, np.ndarray] cached property

Compute absolute feature-label and feature-feature correlations.

Returns:

  • tuple[ndarray, ndarray]

    corr_label : 1D array of shape (n_features,)
        Absolute correlation of each feature with the labels. Zero for constant features or constant labels.
    corr_feat : 2D array of shape (n_features, n_features)
        Absolute pairwise correlation between features. Diagonal is zero. Zero for constant features.
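The two returned arrays can be approximated with plain NumPy. The following is a minimal sketch under stated assumptions, not the library's implementation: the function name `absolute_correlations` is hypothetical, and Pearson correlation is assumed, since the reference does not name the correlation measure.

```python
import numpy as np


def absolute_correlations(design_matrix, labels):
    """Hypothetical helper: return (corr_label, corr_feat) as described above."""
    X = np.asarray(design_matrix, dtype=float)
    y = np.asarray(labels, dtype=float)
    n_features = X.shape[1]

    # A column (or the label vector) is constant when its max equals its min.
    varying = X.max(axis=0) > X.min(axis=0)
    label_varies = y.max() > y.min()

    # Center the data so that dot products become (unnormalized) covariances.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    x_norm = np.linalg.norm(Xc, axis=0)

    # Feature-label correlation; entries stay zero for constant inputs.
    corr_label = np.zeros(n_features)
    if label_varies:
        denom = x_norm[varying] * np.linalg.norm(yc)
        corr_label[varying] = np.abs(Xc[:, varying].T @ yc) / denom

    # Feature-feature correlation; zero for constant features and on the diagonal.
    corr_feat = np.zeros((n_features, n_features))
    pair = varying[:, None] & varying[None, :]
    corr_feat[pair] = np.abs(Xc.T @ Xc)[pair] / np.outer(x_norm, x_norm)[pair]
    np.fill_diagonal(corr_feat, 0.0)
    return corr_label, corr_feat


# Assumed toy data: feature 1 is a multiple of feature 0, feature 2 is constant.
X = [[1, 2, 5], [2, 4, 5], [3, 6, 5], [4, 8, 5]]
y = [0, 0, 1, 1]
corr_label, corr_feat = absolute_correlations(X, y)
```

Constant columns are detected by comparing the column maximum and minimum rather than by testing the centered norm against zero, which avoids spurious nonzero values caused by floating-point rounding in the mean.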

plot(*, ax: Axes | None = None) -> Axes

Plot feature-feature correlation and feature-label correlation.

This visualization consists of two linked views:

  1. Feature-Feature Correlation Matrix: shows the redundancy structure between features.

  2. Feature-Label Correlation: shows the predictive strength of each feature.

The trade-off controlled by alpha becomes visually interpretable: high alpha emphasizes label correlation, low alpha emphasizes diversity.

Parameters:

  • ax (Axes | None, default: None ) –

    Optional axes for the correlation heatmap. If None, a new figure with two subplots is created.

Returns:

  • Axes

    Axes of the correlation heatmap.

to_string() -> str

Return a string describing the data.

from_values(design_matrix: np.ndarray, labels: list[int], alpha: float = 0.5) -> CreditScoringFeatureSelectionData staticmethod

Create a Credit Scoring Feature Selection data instance.

Parameters:

  • design_matrix (ndarray) –

    Feature matrix (n_samples x n_features).

  • labels (list[int]) –

    Binary labels:
    - 0 = non-default (loan repaid)
    - 1 = default (loan not repaid)

  • alpha (float, default: 0.5 ) –

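    Trade-off between relevance and redundancy. Higher values weight label correlation more heavily; lower values penalize inter-feature correlation more heavily.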

Returns:

generate_random(n_samples: int = 20, n_features: int = 5, alpha: float = 0.5, seed: int | None = None) -> CreditScoringFeatureSelectionData staticmethod

Generate a random Credit Scoring Feature Selection instance.

Parameters:

  • n_samples (int, default: 20 ) –

    Number of data samples.

  • n_features (int, default: 5 ) –

    Number of features.

  • alpha (float, default: 0.5 ) –

    Trade-off between relevance and redundancy.

  • seed (int | None, default: None ) –

    Random seed for reproducibility.

Returns:

Formulation

Formulation for the Credit Scoring Feature Selection use case.

CreditScoringFeatureSelectionFormulation

Bases: UcFormulation[CreditScoringFeatureSelectionData, CreditScoringFeatureSelectionSolution]

Formulation for Credit Scoring Feature Selection.

Selects the most informative and least redundant subset of features by balancing label correlation (influence) against inter-feature correlation (redundancy).

Mathematical Formulation

Decision Variables:
    x[i] in {0, 1}: 1 if feature i is selected, 0 otherwise

Labels:
    Binary target variable representing the credit outcome:
    0 = non-default (successful repayment)
    1 = default (credit failure)

Preprocessing:
    corr_label[i] = |correlation(feature_i, labels)|
    corr_feat[i,j] = |correlation(feature_i, feature_j)|

Objective:
    maximize alpha * sum_i corr_label[i] * x[i] - (1 - alpha) * sum_{i != j} corr_feat[i,j] * x[i] * x[j]

Constraints:
    None (unconstrained)
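To make the objective concrete, the sketch below evaluates it for a given binary vector and finds the maximizer by exhaustive search over all 2^n selections. It is an illustration only: the names `objective` and `brute_force` and the toy correlation values are assumptions, and exhaustive search stands in for whatever solver is actually used with the formulated model.

```python
from itertools import product


def objective(x, corr_label, corr_feat, alpha=0.5):
    """Evaluate the objective above for a binary selection vector x."""
    n = len(x)
    relevance = sum(corr_label[i] * x[i] for i in range(n))
    # The sum runs over ordered pairs i != j, so each pair is counted twice.
    redundancy = sum(
        corr_feat[i][j] * x[i] * x[j]
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return alpha * relevance - (1 - alpha) * redundancy


def brute_force(corr_label, corr_feat, alpha=0.5):
    """Return the best selection by enumeration (small n only)."""
    n = len(corr_label)
    return max(
        product((0, 1), repeat=n),
        key=lambda x: objective(x, corr_label, corr_feat, alpha),
    )


# Assumed toy values: features 0 and 1 are both relevant but nearly redundant.
corr_label = [0.9, 0.8, 0.3]
corr_feat = [
    [0.0, 0.95, 0.05],
    [0.95, 0.0, 0.10],
    [0.05, 0.10, 0.0],
]
best = brute_force(corr_label, corr_feat, alpha=0.5)  # → (1, 0, 1)
```

In this example, selecting features 0 and 1 together scores -0.1 because their mutual correlation of 0.95 is penalized twice, whereas features 0 and 2 together score 0.55.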

to_string(data: CreditScoringFeatureSelectionData) -> str staticmethod

Return a string describing the formulation.

Parameters:

Returns:

  • str

    String representation of the formulation.

formulate(data: CreditScoringFeatureSelectionData) -> Model staticmethod

Formulate the Credit Scoring Feature Selection problem as an optimization model.

Encodes feature selection as an unconstrained binary optimization problem that maximizes label influence while minimizing inter-feature redundancy.

Parameters:

Returns:

  • Model

    A Luna optimization model representing the feature selection problem.

interpret(solution: Solution, data: CreditScoringFeatureSelectionData) -> CreditScoringFeatureSelectionSolution staticmethod

Extract solution from solver result.

Reconstructs the selected feature subset and computes influence and independence scores based on label and inter-feature correlations.

Parameters:

Returns:

  • CreditScoringFeatureSelectionSolution

    A structured solution object with:
    - selected_features: indices of selected features
    - influence_score: sum of label correlations for selected features
    - independence_score: sum of inter-feature correlations among selected features
    - is_valid: whether at least one feature was selected

Raises:

Solution

Solution model for the Credit Scoring Feature Selection use case.

CreditScoringFeatureSelectionSolution

Bases: UcSolution

Solution for the Credit Scoring Feature Selection use case.

Attributes:

  • name (Literal['credit_scoring_feature_selection']) –

    Identifier for this solution type.

  • selected_features (NumPyArray) –

    Indices of selected features.

  • influence_score (float) –

    Sum of absolute label correlations for selected features.

  • independence_score (float) –

    Sum of absolute inter-feature correlations for selected features.

  • is_valid (bool) –

    Always True (unconstrained problem).
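The two scores can be reproduced from a binary selection vector and the correlation arrays. This is a hedged sketch rather than the library's code: `selection_scores` is a hypothetical name, and counting each unordered feature pair once in the independence score is an assumption, since the reference does not say whether pairs are counted once or twice.

```python
import numpy as np


def selection_scores(x, corr_label, corr_feat):
    """Hypothetical helper: return (selected_features, influence, independence)."""
    corr_label = np.asarray(corr_label, dtype=float)
    corr_feat = np.asarray(corr_feat, dtype=float)

    # Indices of the features whose binary variable is 1.
    selected = np.flatnonzero(np.asarray(x))

    # Influence: sum of absolute label correlations of the selected features.
    influence = float(corr_label[selected].sum())

    # Independence: sum of correlations among selected features.
    # Assumption: each unordered pair is counted once (upper triangle only).
    sub = corr_feat[np.ix_(selected, selected)]
    independence = float(np.triu(sub, k=1).sum())
    return selected, influence, independence


# Assumed toy values for three features, with features 0 and 2 selected.
corr_label = [0.9, 0.8, 0.3]
corr_feat = [
    [0.0, 0.95, 0.05],
    [0.95, 0.0, 0.10],
    [0.05, 0.10, 0.0],
]
selected, influence, independence = selection_scores([1, 0, 1], corr_label, corr_feat)
```

Despite its name, a lower independence_score under this definition means the selected features are less correlated with one another.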

plot(data: CreditScoringFeatureSelectionData | None = None, *, ax: Axes | None = None) -> Axes

Plot the feature selection solution in a 2D trade-off space.

This visualization shows each feature as a point in a 2D space defined by:

  • X-axis: Redundancy (mean absolute correlation to all other features)
  • Y-axis: Influence (absolute correlation with the target labels)

Selected features are highlighted, allowing direct interpretation of the optimization objective: features in the upper-left region are most desirable (high predictive power, low redundancy).

Parameters:

  • data (CreditScoringFeatureSelectionData | None, default: None ) –

    Original problem data used to compute feature correlations. If None, the plot will attempt to reconstruct required values from the solution context only (may be limited).

  • ax (Axes | None, default: None ) –

    Matplotlib axes to draw on. Creates a new figure if None.

Returns:

  • Axes

    The axes containing the trade-off scatter plot.
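The coordinates used in the trade-off plot follow directly from the correlation arrays. The sketch below computes them with NumPy. The helper name `tradeoff_coordinates` is an assumption, and the code relies on the documented property that the diagonal of corr_feat is zero.

```python
import numpy as np


def tradeoff_coordinates(corr_label, corr_feat):
    """Hypothetical helper: return (redundancy, influence), one value per feature."""
    corr_label = np.asarray(corr_label, dtype=float)
    corr_feat = np.asarray(corr_feat, dtype=float)
    n_features = corr_feat.shape[0]

    # X-axis: mean absolute correlation to all *other* features. Because the
    # diagonal is zero, the row sum already excludes the feature itself.
    redundancy = corr_feat.sum(axis=1) / max(n_features - 1, 1)

    # Y-axis: absolute correlation with the target labels.
    influence = corr_label
    return redundancy, influence


# Assumed toy values for three features.
corr_label = [0.9, 0.8, 0.3]
corr_feat = [
    [0.0, 0.95, 0.05],
    [0.95, 0.0, 0.10],
    [0.05, 0.10, 0.0],
]
redundancy, influence = tradeoff_coordinates(corr_label, corr_feat)
```

Passing these arrays to a scatter plot, with the selected indices drawn in a different color, gives a view like the one described above.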

to_string() -> str

Return a string describing the solution.

Returns:

  • str

    String representation of the solution.

Instance

Instance model for the Credit Scoring Feature Selection use case.

CreditScoringFeatureSelectionInstance

Bases: UcInstance[CreditScoringFeatureSelectionData, CreditScoringFeatureSelectionFormulation, CreditScoringFeatureSelectionSolution]

Instance combining data and formulation for the Credit Scoring Feature Selection use case.

Collection

Collection of Credit Scoring Feature Selection instances.

CreditScoringFeatureSelectionCollection

Bases: UcInstanceCollection[CreditScoringFeatureSelectionInstance]

Collection of Credit Scoring Feature Selection instances.

from_random(min_size: int, max_size: int, num_instances: int = 1, *, n_samples: int = 20, alpha: float = 0.5, seed: int | None = None) -> CreditScoringFeatureSelectionCollection classmethod

Generate random Credit Scoring Feature Selection instances.

Parameters:

  • min_size (int) –

    Minimum number of features.

  • max_size (int) –

    Maximum number of features.

  • num_instances (int, default: 1 ) –

    Number of instances to generate per size.

  • n_samples (int, default: 20 ) –

    Number of samples per instance.

  • alpha (float, default: 0.5 ) –

    Trade-off between relevance and redundancy.

  • seed (int | None, default: None ) –

    Random seed for reproducibility.

Returns: