
slow runtimes for small alphas in GroupLasso & inf weights in weighted GroupLasso #229

Closed
sehoff opened this issue Apr 15, 2022 · 2 comments


sehoff commented Apr 15, 2022

Maybe my issue is related to #177. The data is also from a finance context, and I am sorry if, because of my lack of knowledge, the behavior in my example is expected! In that case I would be happy to hear why :)

Here is the issue:

I observe quite enormous slowdowns for small alpha values. At first I thought this only happens past the largest alpha that keeps all groups in the model, let's call it alpha_min. However, especially in the second stage, i.e., when the groups are weighted, the slowdown also occurs before reaching alpha_min. I should add that the ratio between the largest and the smallest weight can be quite high in my applications (though not in this example).
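For context on these thresholds: the alpha_max values hardcoded in the script below are, I assume, the smallest alphas at which all groups are zero. Under the usual parametrization of the weighted group lasso objective, (1 / (2 * n_samples)) * ||y - Xw||^2 + alpha * sum_g weights_g * ||w_g||_2, such a value could be computed along these lines (a sketch only; the helper name is mine and the exact scaling should be checked against celer's documentation):

import numpy as np
from numpy.linalg import norm

def group_alpha_max(X, y, grps, weights=None):
    # Smallest alpha for which the all-zero solution is optimal, assuming the
    # (1 / (2 * n_samples)) * ||y - Xw||^2 + alpha * sum_g weights_g * ||w_g||_2
    # objective. Groups with an infinite weight contribute nothing to the max.
    n_samples = X.shape[0]
    if weights is None:
        weights = np.ones(len(grps))
    return max(
        norm(X[:, g].T @ y) / (n_samples * w)
        for g, w in zip(grps, weights)
        if np.isfinite(w)
    )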

A second slowdown shows up when comparing two setups: leaving np.inf entries in the weights and running the optimization as-is, versus dropping those weights, their groups, and the corresponding columns of the feature matrix beforehand. The latter is much faster than the former (a general version of this workaround is sketched after Case 3.2 below).

Here is the data

Here is my code:

import numpy as np
from celer import GroupLasso
import time

X = np.load("...design_matrix.npy")
y = np.load("...target.npy")
groups = np.load("...groups.npy")
weights = np.load("...weights.npy")
grps = [list(np.where(groups == i)[0]) for i in range(1, 33)]
alpha_ratio = 1e-3
num_iter = 10

# Case 1: slower runtime for (very) small alphas
alpha_max = 0.003471727067743962
grid = np.geomspace(alpha_max * alpha_ratio, alpha_max, num_iter)[::-1]
for a in grid:
    clf = GroupLasso(alpha=a, fit_intercept=False, groups=grps, warm_start=True)
    t0 = time.time()
    clf.fit(X, y)
    t1 = time.time()
    print(f"Finished tuning with {np.round(a, 5)}. Took {np.round(t1 - t0, 2)} seconds!")

# Case 2: slower runtime for (very) small alphas with weights
alpha_max_w = 0.0001897719130007628
grid_w = np.geomspace(alpha_max_w * alpha_ratio, alpha_max_w, num_iter)[::-1]

for a in grid_w:
    clf = GroupLasso(alpha=a, fit_intercept=False, weights=weights, groups=grps, warm_start=True)
    t0 = time.time()
    clf.fit(X, y)
    t1 = time.time()
    print(f"Finished tuning with {np.round(a, 5)}. Took {np.round(t1 - t0, 2)} seconds!")

# Case 3.1: (very) slow runtime when including a weight that is np.inf
weights[-1] = np.inf
for a in grid_w:
    clf = GroupLasso(alpha=a, fit_intercept=False, weights=weights, groups=grps, warm_start=True)
    t0 = time.time()
    clf.fit(X, y)
    t1 = time.time()
    print(f"Finished tuning with {np.round(a, 5)}. Took {np.round(t1 - t0, 2)} seconds!")

# Case 3.2: remove np.inf from weights and drop the corresponding group and columns of X
# --> much faster than Case 3.1
weights = weights[:-1]
grps = grps[:-1]
X_new = X[:, :-5]
for a in grid_w:
    clf = GroupLasso(alpha=a, fit_intercept=False, weights=weights, groups=grps, warm_start=True)
    t0 = time.time()
    clf.fit(X_new, y)
    t1 = time.time()
    print(f"Finished tuning with {np.round(a, 5)}. Took {np.round(t1 - t0, 2)} seconds!")
@mathurinm (Owner) commented

The second issue (the np.inf weights) is fixed by #232.

@mathurinm (Owner) commented

The first issue (slow runtimes for small alphas) is solved by the Gram solver implementation in scikit-learn-contrib/skglm#4.
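For intuition on why this helps: a Gram-based solver precomputes X^T X / n and X^T y / n once, so each subsequent coordinate-descent epoch works on p x p quantities instead of repeatedly touching X, which pays off exactly when many epochs are needed (as for very small alphas). A minimal block proximal coordinate descent built on that idea might look as follows; this is only an illustration of the technique under the same objective scaling as in the sketch above, not skglm's actual implementation:

import numpy as np
from numpy.linalg import norm

def group_lasso_gram_cd(X, y, grps, alpha, weights, n_epochs=100):
    # Illustrative Gram-based block coordinate descent for
    # (1 / (2 * n_samples)) * ||y - Xw||^2 + alpha * sum_g weights_g * ||w_g||_2.
    n_samples, n_features = X.shape
    G = X.T @ X / n_samples      # Gram matrix, computed once
    Xty = X.T @ y / n_samples    # computed once as well
    w = np.zeros(n_features)
    # Per-group Lipschitz constants of the smooth part.
    lips = [np.linalg.eigvalsh(G[np.ix_(g, g)]).max() for g in grps]
    for _ in range(n_epochs):
        for g, pen, L in zip(grps, weights, lips):
            if not np.isfinite(pen):
                continue  # infinite penalty: the block stays at zero
            grad_g = G[g] @ w - Xty[g]   # block gradient, no pass over X needed
            z = w[g] - grad_g / L        # gradient step on the block
            # Block soft-thresholding (proximal step of the group penalty).
            thr = alpha * pen / L
            nz = norm(z)
            w[g] = 0.0 if nz <= thr else (1 - thr / nz) * z
    return w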
