mutar.GroupLasso

class mutar.GroupLasso(alpha=0.1, fit_intercept=True, normalize=False, max_iter=2000, tol=0.0001, positive=False, warm_start=False)

GroupLasso estimator with L1/L2 mixed-norm as regularizer.
The optimization objective for Group Lasso is:
(1 / (2 * n_samples)) * ||Y - XW||^2_Fro + alpha * ||W||_21
Where:

||W||_21 = sum_i sqrt{sum_j w_ij^2}

i.e. the sum of the l2 norms of the rows of W.
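For reference, here is a minimal numpy sketch of how this objective could be evaluated, assuming X has shape (n_tasks, n_samples, n_features), Y has shape (n_tasks, n_samples) and W has shape (n_features, n_tasks), as documented for this estimator; the helper name group_lasso_objective is illustrative and not part of mutar:

import numpy as np

def group_lasso_objective(X, Y, W, alpha):
    """Evaluate the documented Group Lasso objective for a coefficient matrix W.

    Assumes X has shape (n_tasks, n_samples, n_features), Y has shape
    (n_tasks, n_samples) and W has shape (n_features, n_tasks).
    """
    n_tasks, n_samples, _ = X.shape
    # Per-task predictions: X[t] @ W[:, t] for each task t.
    residuals = Y - np.einsum("tsf,ft->ts", X, W)
    # Data fit term, mirroring (1 / (2 * n_samples)) * ||Y - XW||^2_Fro.
    data_fit = np.sum(residuals ** 2) / (2 * n_samples)
    # ||W||_21: sum over rows (features) of the l2 norm across tasks.
    mixed_norm = np.sum(np.sqrt(np.sum(W ** 2, axis=1)))
    return data_fit + alpha * mixed_norm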
Parameters

alpha : float, optional
    Constant that multiplies the L1/L2 term. Defaults to 0.1.

fit_intercept : boolean
    Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (e.g. the data is expected to be already centered).

normalize : boolean
    This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm.

max_iter : int, optional
    The maximum number of iterations.

tol : float, optional
    The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

positive : boolean, optional (default False)
    If True, coefficients are constrained to be non-negative.

warm_start : bool, optional
    When set to True, reuse the solution of the previous call to fit as initialization; otherwise, just erase the previous solution (see the sketch after this list).
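As a rough illustration of warm_start (the grid of alpha values below is arbitrary and only uses methods documented on this page), each call to fit reuses the previous coefficients as its starting point:

import numpy as np
from mutar import GroupLasso

X = np.array([[[3, 1], [1, 0], [1, 0]],
              [[0, 2], [2, 3], [2, 3]]], dtype=float)
y = X.sum(axis=2) + 2

model = GroupLasso(alpha=1.0, warm_start=True)
for alpha in [1.0, 0.5, 0.1]:
    # With warm_start=True, fit starts from the previous solution,
    # which can speed up convergence along a regularization path.
    model.set_params(alpha=alpha)
    model.fit(X, y)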
Examples
>>> from mutar import GroupLasso
>>> import numpy as np
>>> X = np.array([[[3, 1], [1, 0], [1, 0]], [[0, 2], [2, 3], [2, 3]]], dtype=float)
>>> y = X.sum(axis=2) + 2
>>> grouplasso = GroupLasso().fit(X, y)
>>> print(grouplasso.coef_shared_)
[[1.42045049 1.42045049]
 [0.         0.        ]]
>>> print(grouplasso.coef_specific_)
[[0. 0.]
 [0. 0.]]
Attributes

coef_ : array, shape (n_features, n_tasks)
    Parameter vector (W in the cost function formula).

intercept_ : array, shape (n_tasks,)
    Independent term in the decision function.

n_iter_ : int
    Number of iterations run by the coordinate descent solver to reach the specified tolerance.
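Continuing the fitted grouplasso from the Examples above (not a standalone snippet), a quick sketch of how these attributes could be inspected; the concrete shapes follow from that example having 2 tasks and 2 features:

print(grouplasso.coef_.shape)       # (n_features, n_tasks), here (2, 2)
print(grouplasso.intercept_.shape)  # (n_tasks,), here (2,)
print(grouplasso.n_iter_)           # iterations used by the coordinate descent solver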
__init__(self, alpha=0.1, fit_intercept=True, normalize=False, max_iter=2000, tol=0.0001, positive=False, warm_start=False)

    Initialize self. See help(type(self)) for accurate signature.
Methods

__init__(self[, alpha, fit_intercept, …])    Initialize self.
fit(self, X, y)                              Fit the model to the data (X, y).
get_params(self[, deep])                     Get parameters for this estimator.
predict(self, X)                             Predict target given unseen data samples.
score(self, X, y[, sample_weight])           Returns the coefficient of determination R^2 of the prediction.
set_params(self, **params)                   Set the parameters of this estimator.
get_params(self, deep=True)

    Get parameters for this estimator.

Parameters

deep : boolean, optional
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params : mapping of string to any
    Parameter names mapped to their values.
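A short sketch of inspecting the hyperparameters; the dict shown in the comment is indicative of the documented constructor parameters, not verbatim output:

from mutar import GroupLasso

grouplasso = GroupLasso(alpha=0.5)
params = grouplasso.get_params()
# params maps parameter names to values, e.g.
# {'alpha': 0.5, 'fit_intercept': True, 'normalize': False, 'max_iter': 2000,
#  'tol': 0.0001, 'positive': False, 'warm_start': False}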
predict(self, X)

    Predict target given unseen data samples.

Parameters

X : array-like, shape (n_tasks, n_samples, n_features)
    The input samples.

Returns

y : ndarray, shape (n_tasks, n_samples)
    The predicted targets.
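A short sketch reusing the data from the Examples section above:

import numpy as np
from mutar import GroupLasso

X = np.array([[[3, 1], [1, 0], [1, 0]],
              [[0, 2], [2, 3], [2, 3]]], dtype=float)  # (n_tasks, n_samples, n_features)
y = X.sum(axis=2) + 2                                   # (n_tasks, n_samples)

grouplasso = GroupLasso().fit(X, y)
y_pred = grouplasso.predict(X)
# y_pred has shape (n_tasks, n_samples): one row of predictions per task.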
score(self, X, y, sample_weight=None)

    Returns the coefficient of determination R^2 of the prediction.

Computes a score for each regression task. The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.

Parameters

X : array-like, shape (n_tasks, n_samples, n_features)
    Test samples.

y : array-like, shape (n_tasks, n_samples)
    True values for y.

sample_weight : array-like, shape (n_tasks, n_samples), optional
    Sample weights.

Returns

array-like, shape (n_tasks,)
    R^2 of self.predict(X) w.r.t. y for each task.
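The per-task definition can be checked by hand; a sketch, assuming no sample weights and mirroring the u and v terms above:

import numpy as np
from mutar import GroupLasso

X = np.array([[[3, 1], [1, 0], [1, 0]],
              [[0, 2], [2, 3], [2, 3]]], dtype=float)
y = X.sum(axis=2) + 2

grouplasso = GroupLasso().fit(X, y)
r2 = grouplasso.score(X, y)          # array of shape (n_tasks,), one R^2 per task

# Manual per-task R^2 = 1 - u/v.
y_pred = grouplasso.predict(X)
u = ((y - y_pred) ** 2).sum(axis=1)                         # residual sum of squares
v = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)  # total sum of squares
manual_r2 = 1 - u / v                                       # should be close to r2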
set_params(self, **params)

    Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Returns

self
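A minimal sketch; GroupLasso itself has no nested components, so plain parameter names are used (the values below are arbitrary):

from mutar import GroupLasso

grouplasso = GroupLasso()
# set_params returns the estimator itself, so the call can be chained with fit.
grouplasso.set_params(alpha=0.05, max_iter=5000)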