mutar.DirtyModel

class mutar.DirtyModel(alpha=1.0, beta=1.0, fit_intercept=True, normalize=False, max_iter=2000, tol=0.0001, positive=False, warm_start=False)

DirtyModel estimator with L1 and L1/L2 mixed-norm as regularizers.

The optimization objective for Dirty models is:

(1 / (2 * n_samples)) * ||Y - X(W_1 + W_2)||^2_Fro + alpha * ||W_1||_21
+ beta * ||W_2||_1

Where:

||W||_21 = sum_i sqrt{sum_j w_ij^2}

i.e. the sum of the L2 norms of the rows of W.

and:

||W||_1 = sum_i sum_j |w_ij|
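The objective above can be evaluated directly with NumPy. The following is a minimal sketch, not the library's implementation: the helper names (`l21_norm`, `l1_norm`, `dirty_objective`) are hypothetical, and it assumes the array shapes documented on this page, namely X of shape (n_tasks, n_samples, n_features), Y of shape (n_tasks, n_samples), and W_1, W_2 of shape (n_features, n_tasks), with task k predicted as X[k] @ W[:, k].

```python
import numpy as np


def l21_norm(W):
    """Sum over the rows of W of the Euclidean norm of each row."""
    return np.sqrt((W ** 2).sum(axis=1)).sum()


def l1_norm(W):
    """Sum of the absolute values of all entries of W."""
    return np.abs(W).sum()


def dirty_objective(X, Y, W1, W2, alpha, beta):
    """Evaluate the Dirty model objective (assumed shapes, see lead-in)."""
    n_samples = X.shape[1]
    W = W1 + W2
    # Prediction for task k is X[k] @ W[:, k]; einsum does this for all tasks.
    pred = np.einsum("ksf,fk->ks", X, W)
    data_fit = ((Y - pred) ** 2).sum() / (2 * n_samples)
    return data_fit + alpha * l21_norm(W1) + beta * l1_norm(W2)


# Tiny hand-checkable example: 2 tasks, 1 sample per task, 2 features.
X_demo = np.array([[[1.0, 0.0]], [[0.0, 1.0]]])
Y_demo = np.array([[4.0], [5.0]])
W1_demo = np.array([[3.0, 4.0], [0.0, 0.0]])   # one non-zero row, L2 norm 5
W2_demo = np.array([[1.0, -2.0], [0.0, 3.0]])  # absolute values sum to 6
value = dirty_objective(X_demo, Y_demo, W1_demo, W2_demo, alpha=1.0, beta=0.5)
```

In this example the residuals are 0 and 2, so the data-fit term is 4 / 2 = 2, and the penalties contribute 1.0 * 5 and 0.5 * 6, giving a total of 10.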

Parameters

alpha (float, optional): Constant that multiplies the L1/L2 term. Defaults to 1.0.
beta (float, optional): Constant that multiplies the L1 term. Defaults to 1.0.
fit_intercept (boolean, optional, default True): Whether to calculate the intercept for this model. If set to False, no intercept is used in calculations (i.e. the data is expected to be already centered).
normalize (boolean, optional, default False): This parameter is ignored when fit_intercept is set to False. If True, the regressors X are normalized before regression by subtracting the mean and dividing by the L2 norm.
max_iter (int, optional, default 2000): The maximum number of iterations.
tol (float, optional, default 0.0001): The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.
positive (boolean, optional, default False): If True, coefficients are constrained to be non-negative.
warm_start (bool, optional, default False): When set to True, reuse the solution of the previous call to fit as initialization; otherwise, erase the previous solution.
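To make the `normalize` description concrete, here is a small NumPy sketch of "subtracting the mean and dividing by the L2 norm" applied to the columns of a single task's design matrix. The function name `center_and_normalize` is hypothetical, and whether the estimator applies this per task in exactly this way is an assumption; the sketch only illustrates the operation described above.

```python
import numpy as np


def center_and_normalize(X):
    """Center each column of a 2-D design matrix and scale it to unit L2 norm."""
    X = np.asarray(X, dtype=float)
    X_mean = X.mean(axis=0)
    X_centered = X - X_mean
    scale = np.linalg.norm(X_centered, axis=0)
    # Assumption: constant columns (zero norm after centering) are left unscaled.
    scale[scale == 0.0] = 1.0
    return X_centered / scale, X_mean, scale


# Design matrix of one task: 3 samples, 2 features.
X_task = np.array([[3.0, 1.0], [2.0, 0.0], [1.0, 0.0]])
X_norm, X_mean, X_scale = center_and_normalize(X_task)
```

After this transformation every column of `X_norm` has zero mean and unit Euclidean norm.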

Examples

>>> from mutar import DirtyModel
>>> import numpy as np
>>> X = np.array([[[3, 1], [2, 0], [1, 0]],
...               [[0, 2], [-1, 3], [1, -2]]], dtype=float)
>>> coef = np.array([[1., 1.], [0., -1]])
>>> y = np.array([x.dot(c) for x, c in zip(X, coef.T)])
>>> y += 0.1
>>> dirty = DirtyModel(alpha=0.15, beta=0.12).fit(X, y)
>>> print(dirty.coef_shared_)
[[ 0.4652447  0.3465437]
 [ 0.        -0.       ]]
>>> print(dirty.coef_specific_)
[[ 0.35453532  0.        ]
 [ 0.         -1.20766296]]

Attributes

coef_ (array, shape (n_features, n_tasks)): Parameter vector (W in the cost function formula).
intercept_ (array, shape (n_tasks,)): Independent term in the decision function.
n_iter_ (int): Number of iterations run by the coordinate descent solver to reach the specified tolerance.

__init__(self, alpha=1.0, beta=1.0, fit_intercept=True, normalize=False, max_iter=2000, tol=0.0001, positive=False, warm_start=False): Initialize self. See help(type(self)) for accurate signature.

Methods

`__init__`(self[, alpha, beta, fit_intercept, …])	Initialize self.
`fit`(self, X, y)
`get_params`(self[, deep])	Get parameters for this estimator.
`predict`(self, X)	Predict target given unseen data samples.
`score`(self, X, y[, sample_weight])	Returns the coefficient of determination R^2 of the prediction.
`set_params`(self, **params)	Set the parameters of this estimator.

get_params(self, deep=True)

Get parameters for this estimator.

Parameters

deep (boolean, optional): If True, return the parameters for this estimator and for contained subobjects that are estimators.

Returns

params (mapping of string to any): Parameter names mapped to their values.

predict(self, X)

Predict target given unseen data samples.

Parameters

X (array-like, shape (n_tasks, n_samples, n_features)): The input samples to predict on.

Returns

y (ndarray, shape (n_tasks, n_samples)): The predicted targets.
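Given the documented shapes of `coef_` (n_features, n_tasks) and `intercept_` (n_tasks,), a per-task linear prediction can be sketched as below. This is an illustration under the assumption that task k is predicted as X[k] @ coef_[:, k] + intercept_[k]; the function `predict_sketch` is hypothetical and is not the library's code. The data reuses the arrays from the Examples section, where y was built from the same coefficients plus 0.1.

```python
import numpy as np


def predict_sketch(X, coef, intercept):
    """Per-task linear prediction: X[k] @ coef[:, k] + intercept[k]."""
    X = np.asarray(X, dtype=float)
    # "ksf,fk->ks": for each task k and sample s, sum over features f.
    return np.einsum("ksf,fk->ks", X, coef) + intercept[:, None]


X = np.array([[[3, 1], [2, 0], [1, 0]],
              [[0, 2], [-1, 3], [1, -2]]], dtype=float)
coef = np.array([[1.0, 1.0], [0.0, -1.0]])   # shape (n_features, n_tasks)
intercept = np.array([0.1, 0.1])             # shape (n_tasks,)
y_pred = predict_sketch(X, coef, intercept)  # shape (n_tasks, n_samples)
```

With these inputs `y_pred` matches the target y constructed in the Examples section, since both apply the same coefficients and offset.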

score(self, X, y, sample_weight=None)

Returns the coefficient of determination R^2 of the prediction.

Computes a score for each regression task. The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(), both computed per task. The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.

Parameters

X (array-like, shape (n_tasks, n_samples, n_features)): Test samples.
y (array-like, shape (n_tasks, n_samples)): True values for y.
sample_weight (array-like, shape (n_tasks, n_samples), optional): Sample weights.

Returns

array-like, shape (n_tasks,): R^2 of self.predict(X) with respect to y, for each task.
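The R^2 definition above translates directly into NumPy. The sketch below computes one unweighted score per task from arrays of shape (n_tasks, n_samples); the function name `r2_per_task` is hypothetical, and `sample_weight` is deliberately left out.

```python
import numpy as np


def r2_per_task(y_true, y_pred):
    """Unweighted R^2 = 1 - u/v, computed separately for each task (row)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # u: residual sum of squares per task.
    u = ((y_true - y_pred) ** 2).sum(axis=1)
    # v: total sum of squares per task, around that task's own mean.
    v = ((y_true - y_true.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    return 1.0 - u / v


y_true = np.array([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])
y_pred = np.array([[1.0, 2.0, 3.0],   # perfect prediction
                   [2.0, 2.0, 2.0],   # constant prediction of the mean
                   [3.0, 2.0, 1.0]])  # reversed prediction
scores = r2_per_task(y_true, y_pred)
```

The three tasks illustrate the cases named above: a perfect fit scores 1.0, predicting the mean scores 0.0, and a prediction worse than the mean gives a negative score (here -3.0).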

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns

self
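The `<component>__<parameter>` naming convention can be illustrated with a small standalone helper. This is only a sketch of how such a name decomposes, not the estimator's code; `split_param_name` and the component name `regressor` are hypothetical.

```python
def split_param_name(name):
    """Split 'component__parameter' into (component, parameter).

    A name without the double underscore refers to the estimator itself,
    so the component is returned as None.
    """
    component, delim, sub_name = name.partition("__")
    if delim:
        return component, sub_name
    return None, name


simple = split_param_name("alpha")             # parameter of the estimator itself
nested = split_param_name("regressor__alpha")  # parameter of a nested component
deeper = split_param_name("a__b__c")           # only the first "__" is split
```

Splitting only at the first double underscore leaves the remainder (`b__c`) to be resolved by the nested component in turn.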