[MRG] ENH Add get_feature_names for OneHotEncoder #6441
yenchenlin wants to merge 1 commit into scikit-learn:master from
sklearn/preprocessing/data.py
Outdated
    feature_names = []
    for (i, n_value) in enumerate(self.n_values_):
        for j in xrange(n_value):
            feature_names.append(input_features[i])
There was a problem hiding this comment.
I think you want something like "{}={}".format(name, value)
There was a problem hiding this comment.
Sorry can you elaborate more?
"{}={}".format(name, value)
What is name and value here?
There was a problem hiding this comment.
.format(input_features[i], j) rather
There was a problem hiding this comment.
Oh, you mean adding j to each entry in feature_names to make it clearer? Then the output would become something like
['x0 0', 'x0 1', 'x1 0', 'x1 1', 'x1 2', 'x2 0', 'x2 1', 'x2 2', 'x2 3']
Am I wrong?
There was a problem hiding this comment.
Yeah I agree that the following output is better:
['x0=0', 'x0=1', 'x1=0', 'x1=1', 'x1=2', 'x2=0', 'x2=1', 'x2=2', 'x2=3']
There was a problem hiding this comment.
I've updated the code.
Please have a look.
Thanks!
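For reference, the naming scheme agreed on above can be sketched as a standalone function. This is only a minimal illustration, not the code in this PR: `one_hot_feature_names` is a hypothetical helper, and its `n_values` argument stands in for the encoder's fitted `n_values_` attribute.

```python
def one_hot_feature_names(n_values, input_features=None):
    """Hypothetical helper: build 'name=value' labels for one-hot columns.

    n_values: number of categories per input feature (assumed to mirror
    the encoder's fitted n_values_).
    input_features: optional names for the input features; defaults to
    'x0', 'x1', ... as in the diff under review.
    """
    if input_features is None:
        input_features = ['x%d' % i for i in range(len(n_values))]

    feature_names = []
    for name, n_value in zip(input_features, n_values):
        # One output column per possible category value of this feature.
        for value in range(n_value):
            feature_names.append("{}={}".format(name, value))
    return feature_names


print(one_hot_feature_names([2, 3, 4]))
```

With `n_values` of `[2, 3, 4]` this produces the `'x0=0'` ... `'x2=3'` list quoted in the comment above.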
f4b4d5b to acb09d2
sklearn/preprocessing/data.py
Outdated
    else:
        if len(input_features) != len(self.n_values_):
            raise ValueError("Number of input_features must equal to "
                             "n_feature. it has to be of shape "
There was a problem hiding this comment.
Oh it should be n_features,
like n_features in this line:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/preprocessing/data.py#L1710
I've updated the code.
Thanks!
acb09d2 to 7aa1754
sklearn/preprocessing/data.py
Outdated
        input_features = ['x%d' % i for i in range(len(self.n_values_))]
    else:
        if len(input_features) != len(self.n_values_):
            raise ValueError("Number of input_features must equal to "
There was a problem hiding this comment.
This is clunky still. How about "Length of input_features is {0} but it must equal number of features when fitted: {1}."?
There was a problem hiding this comment.
Yeah, and showing len(self.n_values_) in the error message may be more informative too.
Code updated.
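As a rough sketch of the check with the wording suggested above (not the exact code that was pushed): `check_input_features` is a hypothetical helper, and `n_features` stands in for `len(self.n_values_)`.

```python
def check_input_features(input_features, n_features):
    """Hypothetical helper: validate the length of input_features.

    n_features is assumed to be the number of features seen when the
    encoder was fitted, i.e. len(self.n_values_) in the diff above.
    """
    if len(input_features) != n_features:
        raise ValueError(
            "Length of input_features is {0} but it must equal "
            "number of features when fitted: {1}.".format(
                len(input_features), n_features))
    return list(input_features)


try:
    check_input_features(['age'], 3)
except ValueError as exc:
    print(exc)
```

Reporting both lengths lets the caller see which side of the comparison is off without having to inspect the fitted encoder.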
61e1331 to ac47fab
ac47fab to 15a4a75
Hello @jnothman ,

LGTM

but it may be subject to an embargo :p

Oh yeah ...

LGTM as well.

Actually, I take back my +1. This should probably wait for #5270 and the

oh, i forgot about that... Withholding my +1.

I think this should wait for the refactoring of OneHotEncoder for accepting strings in #7327

This has been added in #10198 in the meantime. So closing this, but @yenchenlin thanks for working on it anyway!

This is a PR for #6425.
I've added get_feature_names to OneHotEncoder.
Can @jnothman please have a look at this?