[MRG] Add decision_function api to LocalOutlierFactor #8707

refluster · 2017-04-05T14:49:03Z

sklearn.neighbors.LocalOutlierFactor can judge whether each sample is an outlier and return the result as a binary label, but it does not expose the degree of outlyingness of each sample, i.e. whether a point sits close to the inliers or is an obvious outlier. This pull request adds a decision_function() API that returns that degree as the Local Outlier Factor.

Reference Issue

What does this implement/fix? Explain your changes.

Adds a decision_function API to sklearn.neighbors.LocalOutlierFactor.

Any other comments?

GaelVaroquaux · 2017-04-05T14:51:09Z

This should go in the decision_function method, rather than the score method: decision_function is the unthresholded version of predict.

GaelVaroquaux · 2017-04-05T14:51:24Z

PS: thanks, this seems very useful!

refluster · 2017-04-05T14:54:10Z

Thank you so much for the comment! I'll fix it.

refluster · 2017-04-06T07:52:28Z

Fix is done

albertcthomas · 2017-04-06T13:22:51Z

There was a discussion on this in the LOF PR. It was decided to keep decision_function on new samples as a private method. You can access the decision function values on the training data by considering the opposite of the negative_outlier_factor_. cc @ngoix @amueller
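As a rough illustration of the point above, the training-set scores can be read from the fitted estimator by negating negative_outlier_factor_. This is a minimal sketch, not code from this PR: the synthetic data, the random seed, the planted outlier, and n_neighbors=20 are all assumptions made for the example.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Assumed toy data: 100 Gaussian points plus one planted outlier in the last row.
rng = np.random.RandomState(42)
X = np.vstack([rng.normal(size=(100, 2)), [[10.0, 10.0]]])

clf = LocalOutlierFactor(n_neighbors=20)
labels = clf.fit_predict(X)  # +1 for inliers, -1 for outliers

# The fitted attribute stores the opposite of the LOF, so negate it to get
# scores that are close to 1 for inliers and larger for outliers.
lof = -clf.negative_outlier_factor_

print("label of planted outlier:", labels[-1])
print("LOF of planted outlier:", lof[-1])
```

These values only describe the samples passed to fit; they say nothing about unseen data.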

GaelVaroquaux · 2017-04-06T13:24:49Z

There was a discussion on this in the LOF PR. It was decided to keep decision_function on new samples as a private method.

Why?

ngoix · 2017-04-06T13:47:55Z

Both the decision function and predict methods are private and accessible through the _decision_function() and _predict() methods. This was decided because these two methods in a sense extend the use of LOF prediction to "new data". LOF originally performs both training and testing on the same dataset, i.e. it only performs fit_predict(X).
The main problem with making these methods public is that fit_predict(X) would then differ from fit(X).predict(X).
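To see why the two calls can disagree, here is a small pure-Python sketch of LOF. It is not scikit-learn's implementation: the helper names (k_nearest, reach_mean, fit_lof, score_as_new), the five-point dataset, and k=2 are hypothetical choices for illustration. On the training pass each point is excluded from its own neighborhood; when the same point is re-scored as if it were new, nothing is excluded, so it finds itself at distance 0.

```python
import math

def k_nearest(point, data, k, skip=None):
    """Return the k nearest (distance, index) pairs of `point` within `data`.

    `skip` optionally names one index to leave out (the point itself).
    """
    pairs = [(math.dist(point, q), j) for j, q in enumerate(data) if j != skip]
    return sorted(pairs)[:k]

def reach_mean(neighbors, k_dist):
    """Mean reachability distance: max(d(p, o), k-distance(o)) over the neighbors."""
    return sum(max(d, k_dist[j]) for d, j in neighbors) / len(neighbors)

def fit_lof(data, k):
    """LOF on the training set; each point is excluded from its own neighborhood."""
    neigh = [k_nearest(p, data, k, skip=i) for i, p in enumerate(data)]
    k_dist = [n[-1][0] for n in neigh]  # distance to the k-th neighbor
    lrd = [1.0 / reach_mean(n, k_dist) for n in neigh]
    lof = [sum(lrd[j] for _, j in n) / k / lrd[i] for i, n in enumerate(neigh)]
    return k_dist, lrd, lof

def score_as_new(point, data, k, k_dist, lrd):
    """LOF of `point` treated as unseen data: no index is skipped, so a
    training point shows up as its own nearest neighbor."""
    n = k_nearest(point, data, k)
    lrd_p = 1.0 / reach_mean(n, k_dist)
    return sum(lrd[j] for _, j in n) / k / lrd_p

# Assumed toy data: a unit square plus one isolated point.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
k = 2
k_dist, lrd, lof_train = fit_lof(data, k)
lof_as_new = [score_as_new(p, data, k, k_dist, lrd) for p in data]

for p, a, b in zip(data, lof_train, lof_as_new):
    print(p, round(a, 3), round(b, 3))
```

In this toy setup the isolated point (5, 5) scores roughly 6.0 on the training pass but roughly 3.5 when re-scored as new, because counting itself as a neighbor makes it look less isolated.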

agramfort · 2017-04-06T22:44:01Z

I agree with @ngoix.

fit_predict(X) being different from fit(X).predict(X) is bad.

jnothman · 2017-04-06T23:19:50Z

We had similar issues with t-SNE. These algorithms are not designed to be inductive and should be documented as such. Whether we then allow them to predict on new data with a documented caveat is a matter of how much that is good practice.


refluster · 2017-04-07T04:29:13Z

Thanks for the comments! I understand, and I am sorry that I did not take these matters into account. The decision_function should not be merged, so I will close this pull request.
Thank you.

albertcthomas · 2017-04-07T09:47:37Z

Should we make it explicit somewhere (in the code or documentation) that predict is private because fit_predict(X) would be different from fit(X).predict(X), and that decision_function was made private because predict was made private?

I think it could be good for people investigating the code (as I did when looking at this PR) to know why.

refluster · 2017-04-07T15:05:30Z

Thanks @albertcthomas for the advice. A comment explaining why the method is kept private has been added to the code.

refluster · 2017-04-08T08:52:07Z

The reason that decision_function is not public has now been added. Please let me know if the explanation is wrong.

ngoix · 2017-04-21T09:49:45Z

sklearn/neighbors/lof.py

        Also, the samples in X are not considered in the neighborhood of any
-        point.
+        point. To avoid training and testing on the different dataset, this
+        method is kept private as well as _predict().


I would say: "This method is kept private as it is an extension of LOF to the train-test setting, which is not described in the original paper."

agramfort · 2017-10-25

+1 @ngoix maybe take over so we can move on.

ngoix · 2017-06-06T16:12:30Z

These two lines have been included in PR #9015. This PR can be closed.

refluster added 6 commits April 5, 2017 16:02

add score api to local outlier factor

2da99e2

check X

057c105

api comment fix

5f71a5f

rename api to predict_proba

663fb55

pass make test

481c381

fix deleted null line

e1463ab

rename the added api to decision_function

5194968

refluster changed the title ~~[MRG] Add score api to LocalOutlierFactor~~ [MRG] Add decision_function api to LocalOutlierFactor Apr 5, 2017

add test code for the new api

dbe0153

refluster closed this Apr 7, 2017

refluster added 2 commits April 7, 2017 23:42

reset the mods

0ec0ea2

make note the reason for keeping desicion_function private

ae5999c

refluster reopened this Apr 7, 2017

Merge branch 'master' into add_score_api_to_lof

c05516a

ngoix suggested changes Jun 6, 2017


ngoix mentioned this pull request Jun 6, 2017

[MRG+2] Outlier detection algorithms API consistency #9015

Merged

jnothman closed this in #9015 Feb 5, 2018

agramfort Oct 25, 2017


6 participants
