Skip to content
This repository was archived by the owner on Feb 28, 2024. It is now read-only.

Conversation

@Hvass-Labs
Copy link
Contributor

This is a much improved version of PR #590 which was also discussed in various issues including #570 and #583.

It might be a good idea if I update the doc-strings of all optimizers to briefly describe that func can be wrapped using this decorator so as to use named arguments. Do you agree?

Thanks to @betatim for suggesting the use of a decorator which made this version very elegant.

I have not worked very much with decorators and this is a fairly advanced usage scenario, so I think it is better to direct the interested user to do an internet search if they want an explanation on how advanced decorators work in Python. See my comments in the code.

However, I do hope you appreciate the otherwise detailed level of commenting I have added, which is what I have been discussing in my other issues and PRs. It really makes it so much easier for others to understand the intention of the code in case there a bugs, or in case they want to modify the code in the future.

It is extremely important to try and polish and document the code when it is first added to the library. If I had not polished and documented this code properly and left it for others to do, then they would have had to guess at the intended semantics, and they would have spent maybe 10x as long as it has now taken me to polish and document this code - and they would have been pissed off that I hadn't done it properly in the first place.

I admit openly that this version is superior to what any of us suggested individually, and I suppose this shows how great the design and implementation can be under collaboration - but it requires for each person to properly polish and document his additions to the code. That is what I understand to be real collaboration, and not cutting corners which creates a much larger burden for future developers.

You can test this addition with this code:

from skopt.space import Real
from skopt.optimizer import forest_minimize
from skopt.utils import wrap_named_args

dim1 = Real(name='foo', low=0.0, high=1.0)
dim2 = Real(name='bar', low=0.0, high=1.0)
dim3 = Real(name='baz', low=0.0, high=1.0)

dimensions = [dim1, dim2, dim3]

@wrap_named_args(dimensions=dimensions)
def my_objective_function(foo, bar, baz):
    # print("foo", foo)
    # print("bar", bar)
    # print("baz", baz)
    return foo ** 2 + bar ** 4 + baz ** 8

# Test that the wrapped objective function is now callable
# using a list x instead of named arguments.
print("Test 1:", my_objective_function(x=[0.1, 1.0, 0.5]))
print("Test 2:", my_objective_function([0.1, 1.0, 0.5]))

result = forest_minimize(my_objective_function, dimensions=dimensions,
                         n_calls=20, base_estimator="ET", random_state=4)

print("Best objective value:", result.fun)
print("Best parameters:", result.x)

@pep8speaks
Copy link

pep8speaks commented Jan 1, 2018

Hello @Hvass-Labs! Thanks for updating the PR.

Line 32:1: W293 blank line contains whitespace

Line 15:1: W293 blank line contains whitespace

Line 36:1: W293 blank line contains whitespace

Line 36:1: W293 blank line contains whitespace

Line 47:1: W293 blank line contains whitespace

Line 181:1: W293 blank line contains whitespace

Line 561:32: W291 trailing whitespace
Line 588:80: E501 line too long (90 > 79 characters)
Line 609:1: W293 blank line contains whitespace
Line 616:1: W293 blank line contains whitespace
Line 619:32: W291 trailing whitespace

Comment last updated on January 06, 2018 at 09:57 Hours UTC

@Hvass-Labs
Copy link
Contributor Author

I wonder if a better name for @wrap_named_args would perhaps be @use_named_args? Any other suggestions?

@glouppe
Copy link
Member

glouppe commented Jan 2, 2018

Thanks! This looks good at first sight. Can you add tests though before we start the review process?

@glouppe glouppe changed the title Added support for named arguments with objective function (NEW version) [WIP] Added support for named arguments with objective function (NEW version) Jan 2, 2018
arg_dict = {dim.name: value for dim, value in zip(dimensions, x)}

# Call the wrapped objective function with the named arguments.
objective_value = func(**arg_dict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not convinced the comments above (line 591, 592 and 595) bring anything helpful except repeating the semantics of the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are wrong. These comments explain in English what is actually fairly unusual Python programming techniques that certainly won't be understood very well by people who are not Python experts. Once again, this code is not written for you!

skopt/utils.py Outdated

def wrap_named_args(dimensions):
"""
Wrapper for an objective function that uses named arguments to make it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapper -> Decorator? That's the standard name for that sort of things in Python.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are deliberately nitpicking here and although I'm not a Python expert I doubt that what you're saying is strictly true. We are importing and using something called wraps from functools and they repeatedly talk about wrapping functions in the documentation, and they even call it wrapper in their code examples. Yes, these are also called function decorators in Python when used with the @ symbol. But this code may be used by people who are not Python experts so they may not know what a decorator is, but they most likely know what a function-wrapper is.

I'll tell you what. I'll call it a Wrapper / Decorator so everyone is happy :-) Besides I think use_named_args is a better name as well.

https://docs.python.org/2/library/functools.html

function-decorator. To get explanations of how this works in Python,
please do an internet search for e.g.:
Python function decorator with additional arguments.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just remove that, or make that comment shorter. This is standard practice in Python.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an advanced usage of Python decorators. It is poorly understood even by experienced Python programmers as I could see on StackOverflow - I myself have never used this technique before and I had to look it up. It took me quite a while to find out how to do it, and it is not obvious where to find information on this. There is no good doc-page for this in the Python manuals. So to save people a lot of time, I give a good suggestion for a google search-string if they want to understand what is going on in this code.

Please stop being so goddamn stingy with comments in the code!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please stop being so goddamn stingy with comments in the code!

I must admit, I laughed reading that.

Please read my comment again: I asked to either remove or shorten your comment. Telling people to google if they dont understand Python does not really help.

So let's be constructive. Instead I suggest to point the readers to the page you read yourself to implement this pattern. This can be shortly expressed in 1 or 2 lines at most, and is more helpful to everyone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a bunch of pages and none of them were really good. I picked up a little info here and a little there. If you google for the 'wrong' keywords you find even worse pages. If there was a single good webpage explaining this I would of course have linked to it. I added these keywords because they really helped me. Again, this is not a common Python feature that is being used and even experienced developers on StackOverflow were confused about it.

Again you are deliberately nitpicking what you want changed in this PR.

You can bulldoze all these comments a month from now and I will never notice, but I suggest you instead have some empathy for who is going to be reading the code in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @glouppe. It would be better to state which pages exactly the user should visit where the helpful information is; Otherwise, every user who wants to understand the code will do the search from scratch, possibly ending on wrong pages.

After some search, these two links seem to explain what is going on:

Explanation to @wraps:
https://docs.python.org/2/library/functools.html#functools.wraps

Explanation to decorator with parameters:
https://stackoverflow.com/questions/5929107/decorators-with-parameters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I concede and have rewritten these doc-strings. However, I didn't like the links you found as I found them confusing :-) I managed to find a 7 minute youtube video that explains it quite well, so I added that link. I hope everyone is happy with this.

@Hvass-Labs
Copy link
Contributor Author

Before I start, let me once again say that I think the idea of your project is absolutely awesome! But your code is often really, really bad! I wish you could take this as a big compliment that you are on to something very important with this library, but you need to improve certain aspects, and that starts with accepting that your code "style" is actually really awful!

As I wrote in #579 (comment) I expected you would demand that I remove all the comments from my code. That is not going to happen! If I add code to your project, then it will be beautifully commented and explained like this, otherwise I will retract the PRs. I will not have my name associated with crappy code! I get job offers from all over the world - even from places you probably wouldn't believe! A part of the reason is that I am so good at explaining what I do and write code so well. Whether you choose to remove all the comments afterwards is your business, but you would be making a huge mistake. Let me explain why.

There is one main thing that both you and @betatim misunderstand: This code is not written for you! It is written for the person who in 5 years has a bug or wants to modify the code. None of us will be around to help, or we will certainly have forgotten what exactly the code is supposed to do if it hasn't been documented really well. My code can be understood very easily by even first-year students, in part because the code itself is very clean and polished, but especially because it is so well documented in English. You don't need to have a deep understanding of skopt, matplotlib, or any other API, because the English comments describe very well what is supposed to happen. Yes, people could look up what all those API calls are doing, but I am saving them a ton of time by describing it in English, so they can very quickly understand what the intention is with the code and how everything fits together.

Your code on the other hand is extremely hard to understand - both because it is often very poorly programmed (it really is!), but the problem is then amplified greatly by the lack of comments, so the intention of the code isn't even clear. This makes it one gigantic puzzle to fix bugs or extend your code. Your code is confusing even to yourself, as I can see from the discussions here when bugs come up, semantics are unclear, etc.

I already showed @betatim some work I did 12 years ago on advanced C++ called ArrayOps. It was beautifully designed, coded and documented. Even though I had not really programmed C++ in 12 years - I was still able to grasp my old work from just browsing the manual and the well-documented code for 10 minutes or so the other day. Same goes for my SwarmOps project, which is also quite old now - and by the way is a much simpler way of doing hyper-parameter optimization than skopt, which is why I am so excited about skopt!

Now I am of course becoming well-known for my TensorFlow tutorials. They do not get the most views, but I will bet you that those tutorials have taught more people how to program TensorFlow than any other tutorials. And the reason people really like them, is in part because I am so disciplined and careful in polishing and explaining everything. You don't have to lookup API calls, or guess at semantics or anything like that, because it is all beautifully explained and commented in English.

With these examples it should be obvious that at least when I was younger I was a far better software engineer than you are now. And I was then, and am still, far better than you at explaining complicated things and documenting my code so it can be understood by mostly everyone. So I really don't understand why you don't want to learn from my way of doing things, or at least welcome it in the code that I have added to your project, instead of stubbornly insisting that your way is better.

http://www.hvass-labs.org/projects/arrayops/files/ArrayOps2.pdf
https://github.com/Hvass-Labs/swarmops/blob/master/swarmops/DE.py
https://github.com/Hvass-Labs/TensorFlow-Tutorials

@glouppe
Copy link
Member

glouppe commented Jan 3, 2018

Instead of keeping repeating how bad we are at writing code and how we dare to publish undocumented lines of Python, can we stick to the review process?

  • I asked for tests, which is a necessary thing if you wish to contribute to this project.
  • I expressed minor concerns regarding the unnecessary verbosity of some of the comments you added. There is no need to take this personally nor to attack us every time we express our opinion.

@Hvass-Labs
Copy link
Contributor Author

I am trying to convince you that in 5 years nobody will understand your code if you don't put more effort into it. Not even you will understand it. And yes I find it offensive when you want me to write as crappy code as you do, when I have clearly put effort into making it beautiful and easy to understand for everyone.

Now I have added tests. And I am certain that you are unhappy with the comments in the code. But I glanced at some of your other tests and they are completely incomprehensible to anybody except yourself. And in 5 years you won't even know what they're doing.

I really wish I knew what to say to make you realize this. But I don't think I can, so let's just finish this without killing each other :-)

return "Real(low={}, high={}, prior='{}', transform='{}')".format(
self.low, self.high, self.prior, self.transform_)
return "Real(low={}, high={}, prior='{}', transform='{}', name='{}')".format(
self.low, self.high, self.prior, self.transform_, self.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also do that for dimensions of other types, for consistency? (Integer, etc).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disclaimer: this is not a nitpick.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, since names are None by default, I am not sure we actually want to print name=None in the default case. What do others think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now there's a good review request! :-)

It is already done in PR #579 which also fixes other things with Dimension-names. I just needed a single one for testing in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you backport from #579 please? (That's why we prefer self-contained atomic PRs that target a specific goal, to avoid these situations...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not an expert at git so I thought this would merge seamlessly as the change is identical for the two PRs. But this code was actually not required in this PR, I only needed it for some print-testing. So I just removed it from this PR.

@glouppe
Copy link
Member

glouppe commented Jan 3, 2018

Alright, thanks. These changes are good to go for me. I'll let a second opinion pitch in for merge.

@glouppe glouppe changed the title [WIP] Added support for named arguments with objective function (NEW version) [MRG+1] Added support for named arguments with objective function (NEW version) Jan 3, 2018
@Hvass-Labs
Copy link
Contributor Author

Very good, thank you!

@Hvass-Labs
Copy link
Contributor Author

@betatim is familiar with this and I give him partial credit for the solution as he came up with the idea of using a decorator (although not the actual implementation). So perhaps he could test if he is happy with it as well?

@Hvass-Labs
Copy link
Contributor Author

I think it would be helpful to add a short comment to the doc-strings of all optimizers so people know they can use this.

This is the doc-string from forest_minimize but I think they are all identical:

* `func` [callable]:
    Function to minimize. Should take a array of parameters and
    return the function values.

After:

* `func` [callable]:
    Function to minimize. Should take an array named `x` of parameters
and return the objective value as a scalar of type `float`.

    If you have a search-space where all dimensions have names,
    then you can use `skopt.utils.use_named_args` as a decorator
    on your objective function, in order to call it directly
    with the named arguments.

I don't think you support multi-objective optimization, right? So func only returns scalar values, and not arrays, right?

Should I add this doc-string to all optimizers?

"""

# Ensure all dimensions are correctly typed.
if not all(isinstance(dim, Dimension) for dim in dimensions):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to print the value of dim which is not an instance of Dimension - could help solve bugs faster for future users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

skopt/utils.py Outdated

# Ensure all dimensions have names.
if any(dim.name is None for dim in dimensions):
raise ValueError("All dimensions must have names.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe same as above - print which dimension does not have a name, should help identify a wrong dimension faster

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


@wraps(func)
def wrapper(x):
# Ensure the number of dimensions match
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a docstring explaining the type of x and what it is. It seems that Pycharm for instance can infer expected type of variable from docstring, providing extra variable checks and warnings, which sometimes save me bugs. Possibly could also simplify comments as per discussion below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

# Ensure the objective function can be called with a single
# argument that is an unnamed numpy array.
res = func(np.array(default_parameters))
assert (isinstance(res, float))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add checks whether with unnamed dimensions / non - Dimension objects an exception rises. I imagine that in future we would want to extend your code to support also implicit dimension definitions with names, which could look like:

(0.1, 1.0, 'log-uniform', 'foo')  # equivalent to Real(0.1, 1.0, prior='log-uniform', name='foo')
# currently we can only use
# (0.1, 1.0, 'log-uniform') 
# and then the dimension is unnamed

which might temper with these exceptions, and could lead to subtle bugs. While such definition of dimension seems to be somewhat ugly at first glance, it is appealing in that it leads to a more compact code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(in case you need further info on various dimension definitions, see here or here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm ... I have tested myself whether the exceptions are thrown and the new detailed error-messages are correct. I'm not sure a unit-test adds much "safety" to this? And I don't quite understand what subtle bugs might arise in the future?

Furthermore, regarding your short-hand notation I think it is actually quite problematic, both in terms of reading your code that uses skopt, such as this:

space  = [(1, 5),                           # max_depth
          (10**-5, 10**0, "log-uniform"),   # learning_rate
          (1, n_features),                  # max_features
          (2, 100),                         # min_samples_split
          (1, 100)]                         # min_samples_leaf

In my opinion, it would be much more clear if you wrote:

space  = [Int(1, 5, name='max_depth'),
          Int(10**-5, 10**0, prior="log-uniform", name='learning_rate'),
          Int(1, n_features, name='max_features'),
          Int(2, 100, name='min_samples_split'), 
          Int(1, 100, name='min_samples_leaf')] 

I can also see in your code that the short-hand notation causes problems other places and it requires numerous weird tests, etc.

You will need even more strange tests if you allow naming in the short-hand notation, because now you don't know whether the 3rd arg is the name of the dimension, or if it defines e.g. the prior-mapping. You can fix that with more weird tests, such as whether the 3rd string arg is one of the allowed prior-mappings, otherwise it is assumed to be the name. But what if the user has made a typo in the prior-mapping? ... Suddenly the API and code is very confusing and bloated. It's not going to be pretty! :-)

In my opinion, it is better to aim at making the code clear even if it means it becomes significantly more verbose. Having compactness of the code as a major goal is only really justified in performance-critical applications - in which case the code can be so hard to read, that extensive commenting is then needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I also expect the short-hand definition of dimension with names to be somewhat complex :)
Short-hand seems to go nicely for quick prototyping with spaces in BayesSearchCV, but there actually one does not need named spaces, so yeah not sure whether we would need it in the end. Lets see what future users say.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should try as much as possible to gently persuade people to not use the shorthand notation (anymore), for example by only allowing you to do simple stuff through it. Along the lines of explicit is better than implicit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think gentle persuasion is my thing so I'll leave that to you! :-)

Seriously though, I think it would be a wise decision to aim for crystal clear semantics of the API rather than having a million clever little features. I can see both in your code and your github threads that you are also confused about your code and its semantics, and I'm glad that you have now realized the confusion is 10x for newcomers :-)

@iaroslav-ai
Copy link
Member

I guess @betatim is busy, so I will take a look.

@iaroslav-ai
Copy link
Member

I think this is a really cool feature, thanks @Hvass-Labs ! As for multi-objective optimization, I will take a look whether #369 interferes

@iaroslav-ai
Copy link
Member

Ah see starting from here

assumed to return two values, the first being the objective value and

Some functions can return two values.

Maybe just state ...and return the objective value.

@iaroslav-ai
Copy link
Member

iaroslav-ai commented Jan 5, 2018

Overall the PR looks good to me! Some of my suggestions can be added in different PR.

@Hvass-Labs
Copy link
Contributor Author

@iaroslav-ai I'm glad you like the feature and thanks for the constructive comments! I'll start working on them now. If you have the time then please review PR #579. I really would like these two PRs to be completed soon so I can finish my tutorial on skopt and TensorFlow. And I think you guys would like to get rid of me as well :-)

@Hvass-Labs
Copy link
Contributor Author

I have pushed the fixes @iaroslav-ai requested so I think everything is ready to be merged now.

Thanks to everyone for contributing to this!

@iaroslav-ai
Copy link
Member

LGTM!

@iaroslav-ai
Copy link
Member

Thank you @Hvass-Labs for this PR!

I think I speak for everyone when I say that we all appreciate your contribution and efforts in this PR. I am also looking forward to your tutorial on tensorflow + skopt, so let us know when that is made! I will take a look at another PR (soon).

🥂 🎉

@iaroslav-ai iaroslav-ai merged commit 6b455b9 into scikit-optimize:master Jan 6, 2018
@iaroslav-ai iaroslav-ai changed the title [MRG+1] Added support for named arguments with objective function (NEW version) [MRG+2] Added support for named arguments with objective function (NEW version) Jan 6, 2018
@Hvass-Labs
Copy link
Contributor Author

@iaroslav-ai Thanks for the nice words and I'm glad that I was able to contribute a little to your fantastic project!

@Hvass-Labs Hvass-Labs deleted the named_args2 branch January 8, 2018 08:08
@pavelkomarov
Copy link
Contributor

pavelkomarov commented Jan 8, 2018

This still does not address my issue (#583), because the dimensions parameter must still be a list. A dimensions list where elements are dimension objects with name subfields is fine, but even better (because it agrees more naturally with BayesSearchCV) is to let dimensions be a dictionary mapping parameter names to Real or other dimensions.

Fundamentally, we should standardize the way spaces are defined. Rather than adding this decorator, which now requires the space to be in yet a different form in order to leverage the named-params functionality and adds code to utils.py, it seems to me we should just use dictionaries. I find all these alternatives confusing. @betatim, can you explain why decorators are a more elegant solution? Can you convince me this additional complexity is justified?

@Hvass-Labs, is there a way to define dictionary-form spaces so all works naturally with TensorFlow? If I recall correctly, your problem from #570 was you couldn't use BayesSearchCV because your data wouldn't fit in memory (same problem I had), not because defining the space as a dictionary was cumbersome.

As I said elsewhere, I have actually already created a version of base_minimize that can do what I need; I just have to carefully ensure it doesn't break any of the existing functionality. Unless one of you convinces me I am wrong, I will submit a well-documented pull request of my own (because even if I think Hvass-Labs could tone down the rudeness, he's right that the existing codebase is confusing and not well-enough commented).

@Hvass-Labs
Copy link
Contributor Author

@pavelkomarov I tried following your arguments in #583 and here, but it would be easier to understand if you provided a small code snippet of what you want to do.

But I do agree that the semantics of defining dimensions and search-spaces could perhaps be a bit more clear. But the problem is that we need this for different things. Personally I would have been happy if dimensions always had to be named so the objective function would be called directly with named arguments, instead of using this decorator. I also do not like the scipy API and think skopt could have been far more elegant if it was not built around that. But I don't think there's much chance of making such a huge change to skopt, that's why I just added this decorator-hack, which works well enough for me.

Here's an example of what you would do now to define the search-space dimensions:

dim1 = Real(name='foo', low=0.0, high=1.0)
dim2 = Real(name=None, low=2.0, high=3.0)
dim3 = Real(name='baz', low=4.0, high=5.0)

dimensions = [dim1, dim2, dim3]

And you can then define your objective function as:

@use_named_args(dimensions=dimensions)
def func(foo, bar, baz):
    pass

Once again, I agree that it would have been more elegant to omit the decorator and have the framework automatically call func with named args. I had made another PR #590 which would instead call e.g. gp_minimize(func, ..., use_arg_names=True) so there was no need for the decorator, but @betatim was not too happy with adding another arg to all the optimizers.

Now the framework supports both the old scipy-style of calling func and using named args by just adding the decorator. I think it is a reasonable solution.

Why do you want to use a dict to define the dimensions? Would it look something like this?

dimensions =
{
    'foo': (0.0, 1.0),
    'bar': (2.0, 3.0),
    'baz': (4.0, 5.0)
}

I prefer the more proper object-oriented style above. See also the discussion in #592 (comment) why I prefer that way.

@betatim
Copy link
Member

betatim commented Jan 9, 2018

In the world of scikit-learn and other machine-learning libs having a dictionary based definition is indeed natural and what people expect. Which is why BayesSearchCV let's you do that.

There are many applications for scikit-optmize beyond tuning hyper-parameters for ML models, with their preferred way of defining search spaces.

With the new decorator you can do the following, which I think is a good compromise between convenience for scikit-learn style parameter setting and the general purpose nature of *_minimize():

@use_named_args(dimensions)
def func(**kwargs):
  my_sklearn_clf.set_params(**kwargs)
  ..do stuff and measure performance..

If you name all the dimensions in your search space.

@betatim
Copy link
Member

betatim commented Jan 9, 2018

we should standardize the way spaces are defined.

I don't think that is possible. If you come from scikit-learn you prefer dictionary based definitions, if you come from scipy you prefer a different way, if you come from hyperopt you prefer yet another way, and the list goes on. Given that you can arrive in scikit-optimize from these different places I think the best we can do is offer users a familiar interface for the part of skopt that is "for them". This makes it easy and familiar to get started.

If you then want to dig deeper, really tweak things, or have other power user/special needs scikit-optimize provides the Optimizer class which has all the "brains" in it plus the tools that are used to build all the different "interfaces" between scikit-learn, hyperopt, scipy, etc and Optimizer.

Think of scikit-optimize as Ikea furniture. You can put it together according to the spec from the shop, but you can also heavily modify it or put it together in new ways.

The fact that @pavelkomarov managed to quickly put together a copy of gp_minimize that worked for out of core learning is a plus point. It doesn't have to be part of the core library (where there are a lot of compromises that need making).

@pavelkomarov
Copy link
Contributor

"quickly" was a couple of days, so I'd still lobby to tweak base_minimize so it can support the dictionary form natively rather than requiring people make a list of dimensions with names = keys from their dictionary if they're coming from an sklearn frame of mind. But it's more useful for this to be concrete in a pull request.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants