Market Graph Clustering Example
Market Graph Clustering applies k-medoids clustering to stocks based on their return correlations. It constructs a distance metric from correlation matrices and identifies clusters of similarly behaving assets.
import getpass
import os
import numpy as np
from dotenv import load_dotenv
from luna_quantum.algorithms import SCIP
from luna_usecases.market_graph_clustering import (
MarketGraphClusteringCollection,
MarketGraphClusteringData,
MarketGraphClusteringFormulation,
MarketGraphClusteringInstance,
)
load_dotenv()
if "LUNA_API_KEY" not in os.environ:
os.environ["LUNA_API_KEY"] = getpass.getpass("Enter your Luna API key: ")
Create Data
Define returns for 5 stocks over 5 periods and cluster into k=2 groups.
returns_matrix = np.array(
[
[0.02, -0.01, 0.03, 0.01, -0.02],
[0.03, -0.02, 0.04, 0.00, -0.01],
[-0.01, 0.02, -0.03, 0.01, 0.02],
[-0.02, 0.03, -0.02, 0.02, 0.01],
[0.01, 0.01, 0.02, -0.01, 0.03],
]
)
data = MarketGraphClusteringData(
returns_matrix=returns_matrix,
k=2,
stock_names=["AAPL", "MSFT", "JPM", "GS", "TSLA"],
)
print(data.to_string())
Market Graph Clustering Data:
Stocks: 5
Observations: 5
Clusters (k): 2
Stock names: ['AAPL', 'MSFT', 'JPM', 'GS', 'TSLA']
Plot Data
Visualize stock return correlations.
Create Formulation
Cluster stocks into groups based on return similarity.
Market Graph Clustering Formulation:
Stocks: 5
Clusters: 2
Indices:
i = stock index (i = 0, ..., 4)
j = medoid index (j = 0, ..., 4)
Decision Variables:
z[i] in {0,1} for i = 0, ..., 4
z[i] = 1 if stock i is a medoid
y[i,j] in {0,1} for i = 0, ..., 4, j = 0, ..., 4
y[i,j] = 1 if stock i is assigned to medoid j
Total: 5 + 25 = 30 binary variables
Objective:
minimize sum_{i,j} d[i,j] * y[i,j] where d[i,j] = sqrt(0.5 * (1 - corr[i,j]))
Constraints:
1. Exactly k medoids (1 constraint):
sum_i z[i] == 2
2. Each stock assigned to exactly one medoid (5 constraints):
sum_j y[i,j] == 1 for all i = 0, ..., 4
3. Assign only to medoids (25 constraints):
y[i,j] <= z[j] for all i = 0, ..., 4, j = 0, ..., 4
4. Medoid self-assignment (5 constraints):
y[j,j] >= z[j] for all j = 0, ..., 4
Create Instance
Combine data and formulation into a solvable instance.
instance = MarketGraphClusteringInstance(data=data, formulation=formulation)
print(instance.to_string())
Data:Market Graph Clustering Data:
Stocks: 5
Observations: 5
Clusters (k): 2
Stock names: ['AAPL', 'MSFT', 'JPM', 'GS', 'TSLA']
Formulation:Market Graph Clustering Formulation:
Stocks: 5
Clusters: 2
Indices:
i = stock index (i = 0, ..., 4)
j = medoid index (j = 0, ..., 4)
Decision Variables:
z[i] in {0,1} for i = 0, ..., 4
z[i] = 1 if stock i is a medoid
y[i,j] in {0,1} for i = 0, ..., 4, j = 0, ..., 4
y[i,j] = 1 if stock i is assigned to medoid j
Total: 5 + 25 = 30 binary variables
Objective:
minimize sum_{i,j} d[i,j] * y[i,j] where d[i,j] = sqrt(0.5 * (1 - corr[i,j]))
Constraints:
1. Exactly k medoids (1 constraint):
sum_i z[i] == 2
2. Each stock assigned to exactly one medoid (5 constraints):
sum_j y[i,j] == 1 for all i = 0, ..., 4
3. Assign only to medoids (25 constraints):
y[i,j] <= z[j] for all i = 0, ..., 4, j = 0, ..., 4
4. Medoid self-assignment (5 constraints):
y[j,j] >= z[j] for all j = 0, ..., 4
Formulate Model
Translate the instance into a mathematical optimization model.
Solve and Interpret
Solve the model with SCIP and interpret the raw result into a use-case-specific solution.
scip = SCIP()
job = scip.run(model)
sol = job.result()
uc_solution = instance.interpret(sol)
print(uc_solution.to_string())
/Users/maximilianjanetschek/PycharmProjects/luna-usecases/.venv/lib/python3.13/site-packages/rich/live.py:260:
UserWarning: install "ipywidgets" for Jupyter support
warnings.warn('install "ipywidgets" for Jupyter support')
Market Graph Clustering Solution:
Medoids: ['MSFT', 'JPM']
Total objective: 1.1309240850540656
Valid: True
Assignments: {'AAPL': 'MSFT', 'MSFT': 'MSFT', 'JPM': 'JPM', 'GS': 'JPM', 'TSLA': 'MSFT'}
Plot Solution
Visualize the optimal solution.
<Axes: title={'center': 'Market Graph Clustering — objective: 1.1309'}, xlabel='Mean Return', ylabel='Volatility'>
Collections
Generate a benchmark collection of random instances for batch processing.
collection = MarketGraphClusteringCollection.from_random(
min_size=4, max_size=8, num_instances=1, n_observations=20, k=2, seed=42
)
model = collection.instances[0].formulate()
print(model)
Model: market_graph_clustering<market_graph_clustering>
Minimize
0.2940636559455016 * y_Stock_0_Stock_1 + 0.707696268931348 * y_Stock_0_Stock_2
+ 0.7370537325724711 * y_Stock_0_Stock_3
+ 0.2940636559455016 * y_Stock_1_Stock_0
+ 0.6448315529333203 * y_Stock_1_Stock_2
+ 0.6649585803375584 * y_Stock_1_Stock_3
+ 0.707696268931348 * y_Stock_2_Stock_0
+ 0.6448315529333203 * y_Stock_2_Stock_1
+ 0.15037365389106994 * y_Stock_2_Stock_3
+ 0.7370537325724711 * y_Stock_3_Stock_0
+ 0.6649585803375584 * y_Stock_3_Stock_1
+ 0.15037365389107013 * y_Stock_3_Stock_2
Subject To
exactly_k_medoids: z_Stock_0 + z_Stock_1 + z_Stock_2 + z_Stock_3 == 2
assign_one_Stock_0: y_Stock_0_Stock_0 + y_Stock_0_Stock_1 + y_Stock_0_Stock_2
+ y_Stock_0_Stock_3 == 1
assign_one_Stock_1: y_Stock_1_Stock_0 + y_Stock_1_Stock_1 + y_Stock_1_Stock_2
+ y_Stock_1_Stock_3 == 1
assign_one_Stock_2: y_Stock_2_Stock_0 + y_Stock_2_Stock_1 + y_Stock_2_Stock_2
+ y_Stock_2_Stock_3 == 1
assign_one_Stock_3: y_Stock_3_Stock_0 + y_Stock_3_Stock_1 + y_Stock_3_Stock_2
+ y_Stock_3_Stock_3 == 1
only_medoid_Stock_0_Stock_0: -z_Stock_0 + y_Stock_0_Stock_0 <= 0
only_medoid_Stock_0_Stock_1: y_Stock_0_Stock_1 - z_Stock_1 <= 0
only_medoid_Stock_0_Stock_2: y_Stock_0_Stock_2 - z_Stock_2 <= 0
only_medoid_Stock_0_Stock_3: y_Stock_0_Stock_3 - z_Stock_3 <= 0
only_medoid_Stock_1_Stock_0: -z_Stock_0 + y_Stock_1_Stock_0 <= 0
only_medoid_Stock_1_Stock_1: -z_Stock_1 + y_Stock_1_Stock_1 <= 0
only_medoid_Stock_1_Stock_2: y_Stock_1_Stock_2 - z_Stock_2 <= 0
only_medoid_Stock_1_Stock_3: y_Stock_1_Stock_3 - z_Stock_3 <= 0
only_medoid_Stock_2_Stock_0: -z_Stock_0 + y_Stock_2_Stock_0 <= 0
only_medoid_Stock_2_Stock_1: -z_Stock_1 + y_Stock_2_Stock_1 <= 0
only_medoid_Stock_2_Stock_2: -z_Stock_2 + y_Stock_2_Stock_2 <= 0
only_medoid_Stock_2_Stock_3: y_Stock_2_Stock_3 - z_Stock_3 <= 0
only_medoid_Stock_3_Stock_0: -z_Stock_0 + y_Stock_3_Stock_0 <= 0
only_medoid_Stock_3_Stock_1: -z_Stock_1 + y_Stock_3_Stock_1 <= 0
only_medoid_Stock_3_Stock_2: -z_Stock_2 + y_Stock_3_Stock_2 <= 0
only_medoid_Stock_3_Stock_3: -z_Stock_3 + y_Stock_3_Stock_3 <= 0
self_assign_Stock_0: -z_Stock_0 + y_Stock_0_Stock_0 >= 0
self_assign_Stock_1: -z_Stock_1 + y_Stock_1_Stock_1 >= 0
self_assign_Stock_2: -z_Stock_2 + y_Stock_2_Stock_2 >= 0
self_assign_Stock_3: -z_Stock_3 + y_Stock_3_Stock_3 >= 0
Binary
z_Stock_0 y_Stock_0_Stock_0 y_Stock_0_Stock_1 y_Stock_0_Stock_2
y_Stock_0_Stock_3 z_Stock_1 y_Stock_1_Stock_0 y_Stock_1_Stock_1
y_Stock_1_Stock_2 y_Stock_1_Stock_3 z_Stock_2 y_Stock_2_Stock_0
y_Stock_2_Stock_1 y_Stock_2_Stock_2 y_Stock_2_Stock_3 z_Stock_3
y_Stock_3_Stock_0 y_Stock_3_Stock_1 y_Stock_3_Stock_2 y_Stock_3_Stock_3