-
-
Notifications
You must be signed in to change notification settings - Fork 26.5k
[MRG] Add decision_function api to LocalOutlierFactor #8707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This should go in the decision_function method, rather than the score
method: the decision_function is the unthresholded version of fit.
|
|
PS: thanks, this seems very useful!
|
|
Thank you so much for the comment! I'll fix it. |
|
Fix is done |
|
There was a discussion on this in the LOF PR. It was decided to keep decision_function on new samples as a private method.
Why?
|
|
Both decision function and predict methods are private and accessible through |
|
i agree with @ngoix fit_predict(X) different than fit(X).predict(X) is bad. |
|
we had similar issues with tsne. these algorithms are not designed to be
inductive and should be documented as such. whether we then allow them to
predict on new data with documented caveat is a matter of how much that is
good practice
…On 7 Apr 2017 8:44 am, "Alexandre Gramfort" ***@***.***> wrote:
i agree with @ngoix <https://github.com/ngoix>
fit_predict(X) different than fit(X).predict(X) is bad.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#8707 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz6wjXaPYAD67Iva8cCrfXBXOiIqwPks5rtWqygaJpZM4M0Wvp>
.
|
|
Thanks for the comments! I understand it and am sorry that I could not care such matters. The decision_function should not be merged so I would close this pull request. |
|
Should we make it explicit somewhere (in the code or documentation) that I think it could be good for people investigating the code (as I did when looking at this PR) to know why. |
|
Thanks @albertcthomas for the advice. Comment regarding to keeping the method private was inserted to the code. |
|
The reason that |
| Also, the samples in X are not considered in the neighborhood of any | ||
| point. | ||
| point. To avoid training and testing on the different dataset, this | ||
| method is kept private as well as _predict(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say: " This method is kept private as it is an extension of LOF to the train-test setting, which is not described in the original paper."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 @ngoix maybe take over so we can move on.
|
These two lines have been included in PR #9015. This PR can be closed. |
sklearn.neighbors.LocalOutlierFactor has an ability to judge the outliers and return as bool, but we cannot get the level of outlier for each data, indicates inlier closed or obvious outlier. So in this pull request, decision_function() api is added to provide the outlier level as Local Outlier Factor.
Reference Issue
What does this implement/fix? Explain your changes.
add decision_function api into sklearn.neighbors.LocalOutlierFactor
Any other comments?