
Scaling issues in l-bfgs for LogisticRegression #15556

@amueller

Description


So it looks like l-bfgs is very sensitive to scaling of the data, which can lead to convergence issues.
I feel like we might be able to fix this by changing the framing of the optimization?

example:

import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import scale

data = fetch_openml(data_id=1590, as_frame=True)
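# Default solver is lbfgs with max_iter=100, which warns here on the unscaled one-hot features.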
cross_val_score(LogisticRegression(), pd.get_dummies(data.data), data.target)

This gives convergence warnings; after scaling it doesn't. I have seen this in many places. While people should scale their data, I don't think a warning about the number of iterations is a good thing to show to the user. If we can fix this, I think we should.
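For reference, a minimal sketch of the scaled variant (same data_id=1590, with the one-hot encoded features passed through scale), which runs without the warning:

import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import scale

data = fetch_openml(data_id=1590, as_frame=True)
# Standardize the one-hot encoded features; lbfgs now converges within
# the default max_iter=100 and no ConvergenceWarning is raised.
X_scaled = scale(pd.get_dummies(data.data))
cross_val_score(LogisticRegression(), X_scaled, data.target)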

Using the bank campaign data, I got coefficients that were quite different when I increased the number of iterations (I got convergence warnings with the default of 100). If I scaled the data, that issue went away.
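Roughly, the comparison looks like this (a sketch only: data_id=1461, the OpenML bank-marketing dataset, is assumed here for the bank campaign data, and the larger max_iter value is arbitrary):

import numpy as np
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import scale

bank = fetch_openml(data_id=1461, as_frame=True)  # assumed bank campaign data
X = pd.get_dummies(bank.data)
y = bank.target

# Unscaled: warns at the default max_iter=100, and the coefficients still
# move noticeably when the iteration budget is increased.
coef_default = LogisticRegression(max_iter=100).fit(X, y).coef_
coef_long = LogisticRegression(max_iter=5000).fit(X, y).coef_
print("max |change|, unscaled:", np.abs(coef_default - coef_long).max())

# Scaled: converges within the default budget and the coefficients agree.
Xs = scale(X)
coef_default_s = LogisticRegression(max_iter=100).fit(Xs, y).coef_
coef_long_s = LogisticRegression(max_iter=5000).fit(Xs, y).coef_
print("max |change|, scaled:", np.abs(coef_default_s - coef_long_s).max())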
