PICASSO: Penalized Generalized Linear Model Solver - Unleash the Power of Non-convex Penalty
Pycasso provides a Python interface to the PICASSO C++ solver for fitting sparse generalized linear models with L1 (Lasso), MCP, and SCAD penalties via pathwise coordinate optimization.
- Families: Gaussian (linear), Binomial (logistic), Poisson, Sqrt-Lasso
- Penalties: L1 (Lasso), MCP, SCAD
- Standardization: Automatic design matrix standardization with proper coefficient rescaling
- Gaussian solvers: Naive update and covariance update
- Early stopping:
dfmaxparameter to stop when too many coefficients become nonzero - Intercept: Optional intercept term with correct initialization for all families
- Linux or macOS
- Python 3
- NumPy, SciPy
Install from PyPI:
pip install pycasso
Install from source:
git clone https://github.com/jasonge27/picasso.git cd picasso mkdir build && cd build && cmake .. && make cd ../python-package pip install .
import numpy as np
import pycasso
# Generate example data
n, d = 200, 50
X = np.random.randn(n, d)
beta_true = np.zeros(d)
beta_true[:3] = [1, -0.5, 0.3]
y = X @ beta_true + np.random.randn(n) * 0.5
# Fit sparse linear regression with Lasso
s = pycasso.Solver(X, y, family="gaussian", penalty="l1")
s.train()
result = s.coef()
print(result['beta']) # coefficient matrix (nlambda x d)
print(result['intercept']) # intercept for each lambda
# Predict
y_pred = s.predict(X[:5, :])
# Logistic regression with MCP penalty
y_bin = (np.random.rand(n) < 0.5).astype(float)
s2 = pycasso.Solver(X, y_bin, family="binomial", penalty="mcp")
s2.train()
# Early stopping with dfmax
s3 = pycasso.Solver(X, y, family="gaussian", dfmax=10)
s3.train()Jason Ge, Xingguo Li, Haoming Jiang, Han Liu, Tong Zhang, Mengdi Wang, and Tuo Zhao. "Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python." Journal of Machine Learning Research, 20(44):1-5, 2019.
GPL-3.0