Credit Scoring Feature Selection API Reference
Data
Data model for the Credit Scoring Feature Selection use case.
CreditScoringFeatureSelectionData
Bases: UcData
Data for the Credit Scoring Feature Selection use case.
Selects the most informative and least redundant subset of features for credit scoring by balancing label correlation and inter-feature correlation.
correlations: tuple[np.ndarray, np.ndarray]
cached
property
Compute absolute feature-label and pairwise feature-feature correlations.
Returns:
-
tuple[ndarray, ndarray]–A pair (corr_label, corr_feat):
- corr_label: 1D array of shape (n_features,). Absolute correlation of each feature with the labels. Zero for constant features or constant labels.
- corr_feat: 2D array of shape (n_features, n_features). Absolute pairwise correlation between features. Diagonal is zero. Zero for constant features.
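The return contract above can be illustrated with plain NumPy. This is a minimal sketch, assuming Pearson correlation; the helper name `absolute_correlations` and the tolerance `tol` are hypothetical and not part of the documented API, and the library's own implementation may differ.

```python
import numpy as np


def absolute_correlations(design_matrix, labels, tol=1e-12):
    """Return (corr_label, corr_feat) as absolute Pearson correlations.

    Hypothetical helper for illustration only; not the library's code.
    """
    X = np.asarray(design_matrix, dtype=float)
    y = np.asarray(labels, dtype=float)
    n_features = X.shape[1]

    # Center the columns and the labels, then take their Euclidean norms.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    x_norm = np.sqrt((Xc ** 2).sum(axis=0))
    y_norm = np.sqrt((yc ** 2).sum())
    varying = x_norm > tol  # False for constant features

    # Feature-label correlation: stays zero for constant features or labels.
    corr_label = np.zeros(n_features)
    if y_norm > tol:
        corr_label[varying] = np.abs(Xc[:, varying].T @ yc) / (x_norm[varying] * y_norm)

    # Feature-feature correlation: stays zero where either feature is constant.
    corr_feat = np.zeros((n_features, n_features))
    pair_ok = varying[:, None] & varying[None, :]
    gram = Xc.T @ Xc
    denom = np.outer(x_norm, x_norm)
    corr_feat[pair_ok] = np.abs(gram[pair_ok]) / denom[pair_ok]
    corr_feat = np.clip(corr_feat, 0.0, 1.0)  # guard against rounding above 1
    np.fill_diagonal(corr_feat, 0.0)          # the diagonal is defined as zero
    return corr_label, corr_feat


# Feature 1 is twice feature 0; feature 2 is constant.
X = np.array([[1, 2, 5], [2, 4, 5], [3, 6, 5], [4, 8, 5]])
corr_label, corr_feat = absolute_correlations(X, [0, 0, 1, 1])
print(corr_label)
print(corr_feat)
```

In this toy input the constant third column yields zeros in both outputs, which mirrors the "zero for constant features" rule stated above.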
plot(*, ax: Axes | None = None) -> Axes
Plot feature-feature correlation and feature-label correlation.
This visualization consists of two linked views:
- Feature-Feature Correlation Matrix: shows the redundancy structure between features.
- Feature-Label Correlation: shows the predictive strength of each feature.
The trade-off controlled by alpha becomes visually interpretable: high alpha emphasizes label correlation, low alpha emphasizes diversity.
Parameters:
-
ax(Axes | None, default:None) –Optional axis for correlation heatmap. If None, a new figure with two subplots is created.
Returns:
-
Axes–Axes of the correlation heatmap.
to_string() -> str
Return a string describing the data.
from_values(design_matrix: np.ndarray, labels: list[int], alpha: float = 0.5) -> CreditScoringFeatureSelectionData
staticmethod
Create a Credit Scoring Feature Selection data instance.
Parameters:
-
design_matrix(ndarray) –Feature matrix (n_samples x n_features).
-
labels(list[int]) –Binary labels:
- 0 = non-default (loan repaid)
- 1 = default (loan not repaid)
-
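alpha(float, default:0.5) –Trade-off between relevance and redundancy. Higher values weight label correlation more heavily; lower values penalize inter-feature correlation more heavily.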
Returns:
-
CreditScoringFeatureSelectionData–The constructed data instance.
generate_random(n_samples: int = 20, n_features: int = 5, alpha: float = 0.5, seed: int | None = None) -> CreditScoringFeatureSelectionData
staticmethod
Generate a random Credit Scoring Feature Selection instance.
Parameters:
-
n_samples(int, default:20) –Number of data samples.
-
n_features(int, default:5) –Number of features.
-
alpha(float, default:0.5) –Balance parameter between relevance and redundancy.
-
seed(int | None, default:None) –Random seed for reproducibility.
Returns:
-
CreditScoringFeatureSelectionData–A randomly generated data instance.
Formulation
Formulation for the Credit Scoring Feature Selection use case.
CreditScoringFeatureSelectionFormulation
Bases: UcFormulation[CreditScoringFeatureSelectionData, CreditScoringFeatureSelectionSolution]
Formulation for Credit Scoring Feature Selection.
Selects the most informative and least redundant subset of features by balancing label correlation (influence) against inter-feature correlation (redundancy).
Mathematical Formulation
Decision Variables:
- x[i] in {0, 1}: 1 if feature i is selected, 0 otherwise
Labels: binary target variable representing the credit outcome:
- 0 = non-default (successful repayment)
- 1 = default (credit failure)
Preprocessing:
- corr_label[i] = |correlation(feature_i, labels)|
- corr_feat[i,j] = |correlation(feature_i, feature_j)|
Objective:
maximize alpha * sum_i corr_label[i] * x[i] - (1 - alpha) * sum_{i != j} corr_feat[i,j] * x[i] * x[j]
Constraints: None (unconstrained)
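To make the objective concrete, the following sketch evaluates it for a given binary vector and finds the best subset by exhaustive search on a tiny example. The function names `objective` and `brute_force_select` are hypothetical, the correlation values are made up for illustration, and the sum over i != j is read literally as a sum over ordered pairs (so each unordered pair counts twice); that reading is an assumption.

```python
import itertools

import numpy as np


def objective(x, corr_label, corr_feat, alpha):
    """Evaluate the objective for a binary selection vector x."""
    x = np.asarray(x, dtype=float)
    relevance = alpha * (corr_label @ x)
    # corr_feat has a zero diagonal, so x @ corr_feat @ x sums over ordered pairs i != j.
    redundancy = (1.0 - alpha) * (x @ corr_feat @ x)
    return float(relevance - redundancy)


def brute_force_select(corr_label, corr_feat, alpha):
    """Enumerate all 2**n selections; only practical for very small n."""
    n = len(corr_label)
    best = max(
        itertools.product([0, 1], repeat=n),
        key=lambda x: objective(x, corr_label, corr_feat, alpha),
    )
    return np.array(best), objective(best, corr_label, corr_feat, alpha)


# Illustrative values: features 0 and 1 are both relevant but nearly redundant.
corr_label = np.array([0.9, 0.8, 0.2])
corr_feat = np.array([
    [0.00, 0.95, 0.05],
    [0.95, 0.00, 0.10],
    [0.05, 0.10, 0.00],
])

x_best, value = brute_force_select(corr_label, corr_feat, alpha=0.5)
print(x_best, value)
```

With these values the search keeps feature 0, drops its near-duplicate feature 1, and adds the weakly relevant but nearly independent feature 2.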
to_string(data: CreditScoringFeatureSelectionData) -> str
staticmethod
Return a string describing the formulation.
Parameters:
-
data(CreditScoringFeatureSelectionData) –The problem data.
Returns:
-
str–String representation of the formulation.
formulate(data: CreditScoringFeatureSelectionData) -> Model
staticmethod
Formulate the Credit Scoring Feature Selection problem as an optimization model.
Encodes feature selection as an unconstrained binary optimization problem that maximizes label influence while minimizing inter-feature redundancy.
Parameters:
-
data(CreditScoringFeatureSelectionData) –The problem data containing the design matrix, labels, and alpha.
Returns:
-
Model–A Luna optimization model representing the feature selection problem.
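One common way to encode an unconstrained binary problem of this kind is a QUBO matrix. The sketch below builds such a matrix with plain NumPy and checks that it reproduces the negated objective; it does not use the Luna `Model` API, and `build_qubo` is a hypothetical helper, so treat it as an illustration of the encoding rather than of what `formulate` does internally.

```python
import itertools

import numpy as np


def build_qubo(corr_label, corr_feat, alpha):
    """Return Q such that x @ Q @ x equals minus the objective for binary x.

    Hypothetical helper: minimizing x @ Q @ x is then equivalent to
    maximizing the feature selection objective.
    """
    corr_label = np.asarray(corr_label, dtype=float)
    corr_feat = np.asarray(corr_feat, dtype=float)
    # Off-diagonal entries carry the redundancy penalty.
    Q = (1.0 - alpha) * corr_feat
    # Diagonal entries carry the negated relevance reward (x_i * x_i == x_i).
    np.fill_diagonal(Q, -alpha * corr_label)
    return Q


# Illustrative values only.
corr_label = np.array([0.9, 0.8, 0.2])
corr_feat = np.array([
    [0.00, 0.95, 0.05],
    [0.95, 0.00, 0.10],
    [0.05, 0.10, 0.00],
])
alpha = 0.5
Q = build_qubo(corr_label, corr_feat, alpha)

# Check the equivalence on every binary vector.
for bits in itertools.product([0, 1], repeat=3):
    x = np.array(bits, dtype=float)
    direct = alpha * (corr_label @ x) - (1.0 - alpha) * (x @ corr_feat @ x)
    assert np.isclose(x @ Q @ x, -direct)
```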
interpret(solution: Solution, data: CreditScoringFeatureSelectionData) -> CreditScoringFeatureSelectionSolution
staticmethod
Extract solution from solver result.
Reconstructs the selected feature subset and computes influence and independence scores based on label and inter-feature correlations.
Parameters:
-
solution(Solution) –The solution containing variable assignments.
-
data(CreditScoringFeatureSelectionData) –The original problem data.
Returns:
-
CreditScoringFeatureSelectionSolution–A structured solution object with:
- selected_features: indices of selected features
- influence_score: sum of label correlations for selected features
- independence_score: sum of inter-feature correlations among selected features
- is_valid: whether at least one feature was selected
Raises:
-
NoSolutionFoundError–If the solver did not find a solution.
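The two scores described above can be recomputed by hand from a binary assignment and the correlation arrays. This is a minimal sketch, assuming the inter-feature sum runs over ordered pairs as in the objective; `score_selection` is a hypothetical helper and the correlation values are illustrative, not the library's code or data.

```python
import numpy as np


def score_selection(x, corr_label, corr_feat):
    """Return (selected_features, influence_score, independence_score)."""
    # Indices of the variables assigned 1.
    selected = np.flatnonzero(np.asarray(x))
    influence = float(corr_label[selected].sum())
    # Sub-matrix restricted to selected features; its diagonal is zero,
    # so the sum covers ordered pairs i != j (an assumption).
    independence = float(corr_feat[np.ix_(selected, selected)].sum())
    return selected, influence, independence


corr_label = np.array([0.9, 0.8, 0.2])
corr_feat = np.array([
    [0.00, 0.95, 0.05],
    [0.95, 0.00, 0.10],
    [0.05, 0.10, 0.00],
])

selected, influence, independence = score_selection([1, 0, 1], corr_label, corr_feat)
print(selected, influence, independence)
```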
Solution
Solution model for the Credit Scoring Feature Selection use case.
CreditScoringFeatureSelectionSolution
Bases: UcSolution
Solution for the Credit Scoring Feature Selection use case.
Attributes:
-
name(Literal['credit_scoring_feature_selection']) –Identifier for this solution type.
-
selected_features(NumPyArray) –Indices of selected features.
-
influence_score(float) –Sum of absolute label correlations for selected features.
-
independence_score(float) –Sum of absolute inter-feature correlations for selected features. Lower values indicate a less redundant selection.
-
is_valid(bool) –Always True (unconstrained problem).
plot(data: CreditScoringFeatureSelectionData | None = None, *, ax: Axes | None = None) -> Axes
Plot the feature selection solution in a 2D trade-off space.
This visualization shows each feature as a point in a 2D space defined by:
- X-axis: Redundancy (mean absolute correlation to all other features)
- Y-axis: Influence (absolute correlation with the target labels)
Selected features are highlighted, allowing direct interpretation of the optimization objective: features in the upper-left region are most desirable (high predictive power, low redundancy).
Parameters:
-
data(CreditScoringFeatureSelectionData | None, default:None) –Original problem data used to compute feature correlations. If None, the plot attempts to reconstruct the required values from the solution alone, which may limit what can be shown.
-
ax(Axes | None, default:None) –Matplotlib axes to draw on. Creates a new figure if None.
Returns:
-
Axes–The axes containing the trade-off scatter plot.
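The coordinates of the trade-off space can be derived directly from the correlation arrays. The sketch below computes them with NumPy only; `tradeoff_coordinates` is a hypothetical helper with illustrative inputs, and the resulting arrays could be passed to any scatter-plot routine.

```python
import numpy as np


def tradeoff_coordinates(corr_label, corr_feat):
    """Return (redundancy, influence), one value per feature."""
    corr_label = np.asarray(corr_label, dtype=float)
    corr_feat = np.asarray(corr_feat, dtype=float)
    n_features = corr_feat.shape[0]
    # Mean absolute correlation to all OTHER features: the diagonal is zero,
    # so the row sum divided by (n_features - 1) excludes the feature itself.
    redundancy = corr_feat.sum(axis=1) / max(n_features - 1, 1)
    influence = corr_label  # absolute correlation with the labels
    return redundancy, influence


corr_label = np.array([0.9, 0.8, 0.2])
corr_feat = np.array([
    [0.00, 0.95, 0.05],
    [0.95, 0.00, 0.10],
    [0.05, 0.10, 0.00],
])

redundancy, influence = tradeoff_coordinates(corr_label, corr_feat)
for i, (r, s) in enumerate(zip(redundancy, influence)):
    print(f"feature {i}: redundancy={r:.3f}, influence={s:.3f}")
```

Points with small redundancy and large influence fall in the upper-left region described above.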
to_string() -> str
Return a string describing the solution.
Instance
Instance model for the Credit Scoring Feature Selection use case.
CreditScoringFeatureSelectionInstance
Bases: UcInstance[CreditScoringFeatureSelectionData, CreditScoringFeatureSelectionFormulation, CreditScoringFeatureSelectionSolution]
Instance combining data and formulation for Credit Scoring Feature Selection.
Collection
Collection of Credit Scoring Feature Selection instances.
CreditScoringFeatureSelectionCollection
Bases: UcInstanceCollection[CreditScoringFeatureSelectionInstance]
Collection of Credit Scoring Feature Selection instances.
from_random(min_size: int, max_size: int, num_instances: int = 1, *, n_samples: int = 20, alpha: float = 0.5, seed: int | None = None) -> CreditScoringFeatureSelectionCollection
classmethod
Generate random Credit Scoring Feature Selection instances.
Parameters:
-
min_size(int) –Minimum number of features.
-
max_size(int) –Maximum number of features.
-
num_instances(int, default:1) –Number of instances per size.
-
n_samples(int, default:20) –Number of samples per instance.
-
alpha(float, default:0.5) –Balance parameter between relevance and redundancy.
-
seed(int | None, default:None) –Random seed for reproducibility.
Returns:
-
CreditScoringFeatureSelectionCollection–Collection containing generated instances.