mutar.DirtyModel
class mutar.DirtyModel(alpha=1.0, beta=1.0, fit_intercept=True, normalize=False, max_iter=2000, tol=0.0001, positive=False, warm_start=False)
DirtyModel estimator with L1 and L1/L2 mixed-norm as regularizers.
The optimization objective for Dirty models is:
(1 / (2 * n_samples)) * ||Y - X(W_1 + W_2)||^2_Fro + alpha * ||W_1||_21 + beta * ||W_2||_1
Where:
||W||_21 = sum_i sqrt{sum_j w_ij^2}
i.e. the sum of the norms of the rows.
and:
||W||_1 = sum_i sum_j |w_ij|
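The objective above can be evaluated directly with NumPy. The sketch below is illustrative and is not mutar's internal code: the function name `dirty_objective` is hypothetical, and it assumes that X has shape (n_tasks, n_samples, n_features), Y has shape (n_tasks, n_samples), W_1 and W_2 have shape (n_features, n_tasks), and that the product X(W_1 + W_2) is taken task by task.

```python
import numpy as np


def dirty_objective(X, Y, W1, W2, alpha=1.0, beta=1.0):
    """Evaluate the Dirty model objective (hypothetical helper, not part of mutar).

    Assumed shapes: X (n_tasks, n_samples, n_features), Y (n_tasks, n_samples),
    W1 and W2 (n_features, n_tasks).
    """
    n_samples = X.shape[1]
    W = W1 + W2
    # Assumption: task k is predicted from its own design matrix X[k]
    # and the k-th column of W.
    pred = np.einsum("ksf,fk->ks", X, W)
    data_fit = np.sum((Y - pred) ** 2) / (2 * n_samples)
    # ||W_1||_21: sum over rows (features) of the Euclidean norm of each row.
    l21 = np.sum(np.sqrt(np.sum(W1 ** 2, axis=1)))
    # ||W_2||_1: sum of absolute values of all entries.
    l1 = np.sum(np.abs(W2))
    return data_fit + alpha * l21 + beta * l1
```

A row of W_1 is either entirely zero or shared across tasks, which is why the L1/L2 penalty is applied row-wise, while the entry-wise L1 penalty on W_2 allows task-specific coefficients.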
- Parameters
- alpha : float, optional
Constant that multiplies the L1/L2 term. Defaults to 1.0.
- beta : float, optional
Constant that multiplies the L1 term. Defaults to 1.0.
- fit_intercept : boolean
Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (e.g. data is expected to be already centered).
- normalize : boolean
This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm.
- max_iter : int, optional
The maximum number of iterations.
- tol : float, optional
The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.
- positive : boolean, optional (default False)
If True, coefficients are constrained to be non-negative.
- warm_start : bool, optional
When set to True, reuse the solution of the previous call to fit as initialization; otherwise, erase the previous solution.
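The normalize option is described as subtracting the mean and dividing by the l2-norm. A minimal sketch of that operation for a single task's design matrix is given here. The helper name `center_and_normalize` is hypothetical, and whether mutar applies the operation per task in exactly this way is an assumption.

```python
import numpy as np


def center_and_normalize(X_task):
    """Center each column, then scale it to unit l2-norm (hypothetical helper).

    X_task is assumed to be one task's design matrix, shape (n_samples, n_features).
    """
    X_task = np.asarray(X_task, dtype=float)
    centered = X_task - X_task.mean(axis=0)
    norms = np.sqrt((centered ** 2).sum(axis=0))
    # Constant columns have zero norm after centering; leave them at zero
    # instead of dividing by zero.
    norms[norms == 0.0] = 1.0
    return centered / norms
```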
Examples
>>> from mutar import DirtyModel
>>> import numpy as np
>>> X = np.array([[[3, 1], [2, 0], [1, 0]], [[0, 2], [-1, 3], [1, -2]]], dtype=float)
>>> coef = np.array([[1., 1.], [0., -1]])
>>> y = np.array([x.dot(c) for x, c in zip(X, coef.T)])
>>> y += 0.1
>>> dirty = DirtyModel(alpha=0.15, beta=0.12).fit(X, y)
>>> print(dirty.coef_shared_)
[[ 0.4652447  0.3465437]
 [ 0.        -0.       ]]
>>> print(dirty.coef_specific_)
[[ 0.35453532  0.        ]
 [ 0.         -1.20766296]]
- Attributes
- coef_ : array, shape (n_features, n_tasks)
Parameter matrix (W_1 + W_2 in the cost function formula).
- intercept_ : array, shape (n_tasks,)
Independent term in the decision function.
- n_iter_ : int
Number of iterations run by the coordinate descent solver to reach the specified tolerance.
__init__(self, alpha=1.0, beta=1.0, fit_intercept=True, normalize=False, max_iter=2000, tol=0.0001, positive=False, warm_start=False)
Initialize self. See help(type(self)) for accurate signature.
Methods
- __init__(self[, alpha, beta, fit_intercept, …]): Initialize self.
- fit(self, X, y)
- get_params(self[, deep]): Get parameters for this estimator.
- predict(self, X): Predict target given unseen data samples.
- score(self, X, y[, sample_weight]): Returns the coefficient of determination R^2 of the prediction.
- set_params(self, **params): Set the parameters of this estimator.
get_params(self, deep=True)
Get parameters for this estimator.
- Parameters
- deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
- params : mapping of string to any
Parameter names mapped to their values.
predict(self, X)
Predict target given unseen data samples.
- Parameters
- X : array-like, shape (n_tasks, n_samples, n_features)
The input samples.
- Returns
- y : ndarray, shape (n_tasks, n_samples)
Returns the predicted targets.
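Given the documented shapes of coef_, intercept_, X and y, a prediction can be sketched as one linear map per task. This is an illustration under the assumption that task k uses column k of coef_ and entry k of intercept_. The function name `predict_per_task` is hypothetical and is not the library's implementation.

```python
import numpy as np


def predict_per_task(X, coef, intercept):
    """Per-task linear prediction (hypothetical helper, not part of mutar).

    X: (n_tasks, n_samples, n_features), coef: (n_features, n_tasks),
    intercept: (n_tasks,). Returns an array of shape (n_tasks, n_samples).
    """
    X = np.asarray(X, dtype=float)
    coef = np.asarray(coef, dtype=float)
    intercept = np.asarray(intercept, dtype=float)
    # Assumption: task k is predicted as X[k] @ coef[:, k] + intercept[k].
    return np.einsum("ksf,fk->ks", X, coef) + intercept[:, None]
```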
score(self, X, y, sample_weight=None)
Returns the coefficient of determination R^2 of the prediction.
Computes a score for each regression task. The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
- Parameters
- X : array-like, shape (n_tasks, n_samples, n_features)
Test samples.
- y : array-like, shape (n_tasks, n_samples)
True values for y.
- sample_weight : array-like, shape (n_tasks, n_samples), optional
Sample weights.
- Returns
- array-like, shape (n_tasks,)
R^2 of self.predict(X) with respect to y, for each task.
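The per-task definition of R^2 given above can be written out in a few lines of NumPy. This is a minimal unweighted sketch: the name `r2_per_task` is hypothetical, sample_weight is not handled, and every task is assumed to have non-constant targets so that v is nonzero.

```python
import numpy as np


def r2_per_task(y_true, y_pred):
    """Unweighted R^2 for each task (hypothetical helper, not part of mutar).

    y_true and y_pred are assumed to have shape (n_tasks, n_samples).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # u: residual sum of squares, computed separately for each task.
    u = ((y_true - y_pred) ** 2).sum(axis=1)
    # v: total sum of squares around each task's own mean.
    v = ((y_true - y_true.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    return 1.0 - u / v
```

A perfect prediction gives 1.0 for that task, predicting the task mean gives 0.0, and a prediction worse than the mean gives a negative value.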
set_params(self, **params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Returns
- self