Added support for named arguments to objective function. #590

Hvass-Labs · 2017-12-28T08:21:14Z

This is a solution for issue #570 about named arguments for the objective function.

@betatim proposed using the Python inspect.signature() function - first time I heard about it :-)

But I think this a simpler solution that also has backwards compatibility with the current calling style of func(x) where x is a list. The user just needs to give all dimensions proper names (otherwise an exception is raised), and then set use_arg_names=True when calling e.g. gp_minimize(). The use_arg_names is then passed all the way down to base_minimize() which wraps the objective-function so it will be called with the named arguments instead of a list x as usual.

I found the function base_minimize() very confusing. I added a few TODO-comments which is a hope that you will spend an hour cleaning and commenting this function before you merge the code. It is unlikely that anyone else can / will do it for you.

The code below tests the new feature. Try changing use_arg_names so it is called with both the old and new methods. You can also try changing the order of the arguments to named_fitness() and it should still work. Also try changing a name e.g. from activation to activation2 and it should crash.

import numpy as np
from math import exp

from skopt import gp_minimize
from skopt.space import Real, Categorical, Integer

dim_learning_rate = Real(name='learning_rate', low=1e-6, high=1e-2, prior='log-uniform')
dim_num_dense_layers = Integer(name='num_dense_layers', low=1, high=5)
dim_num_dense_nodes = Integer(name='num_dense_nodes', low=5, high=512)
dim_activation = Categorical(name='activation', categories=['relu', 'sigmoid'])

dimensions = [dim_learning_rate,
              dim_num_dense_layers,
              dim_num_dense_nodes,
              dim_activation]

default_parameters = [1e-4, 1, 64, 'relu']

# TODO: Try changing the order of the args, which should work.
# TODO: Also try renaming an arg, which should crash.
def named_fitness(learning_rate, num_dense_layers, num_dense_nodes, activation):
    """New type of fitness function with named arguments."""

    fitness = ((exp(learning_rate) - 1.0) * 1000) ** 2 + \
              (num_dense_layers) ** 2 + \
              (num_dense_nodes/100) ** 2

    fitness *= 1.0 + 0.1 * np.random.rand()

    if activation == 'sigmoid':
        fitness += 10

    return fitness


def model_fitness(x):
    """Typical fitness function with list-argument."""
    learning_rate, num_dense_layers, num_dense_nodes, activation = x

    fitness = named_fitness(learning_rate=learning_rate,
                            num_dense_layers=num_dense_layers,
                            num_dense_nodes=num_dense_nodes,
                            activation=activation)

    return fitness


print(model_fitness(x=default_parameters))

# TODO: Try changing this!
use_arg_names = False

func = named_fitness if use_arg_names else model_fitness

search_result = gp_minimize(func=func,
                            dimensions=dimensions,
                            use_arg_names=use_arg_names,
                            n_calls=40,
                            x0=default_parameters)

print(search_result.x)
print(search_result.fun)

for fitness, x in sorted(zip(search_result.func_vals, search_result.x_iters)):
    print(fitness, x)

pep8speaks · 2017-12-28T08:21:17Z

Hello @Hvass-Labs! Thanks for submitting the PR.

In the file skopt/optimizer/base.py, following are the PEP8 issues :

Line 54:80: E501 line too long (89 > 79 characters)
Line 63:1: W293 blank line contains whitespace
Line 111:63: W291 trailing whitespace

In the file skopt/optimizer/forest.py, following are the PEP8 issues :

Line 10:1: E303 too many blank lines (3)

betatim · 2017-12-28T22:19:43Z

Thanks for this. Let's see if we can get some feedback from others on introducing a new argument. Personally I quite like the idea of having a adaptor that takes care of the boiler plate conversion. Mostly because there are already a lot of arguments which leads to long doc strings, and long function signatures.

I found the function base_minimize() very confusing. I added a few TODO-comments which is a hope that you will spend an hour cleaning and commenting this function before you merge the code. It is unlikely that anyone else can / will do it for you.

As a general rule a PR is edited by the contributor, including adding tests, making sure existing tests pass, documentation, code, etc. Frequently others will make suggestions or ask for changes which the person who created the PR then implements. Very rarely people do collaborate on a PR but this means a third person has to review it so it can be merged. Keeping the roles separated works very well for us.

Your comments on base_minimize are welcome, so thank you for taking the time to write them down. Source code is not the place to store them however, issues is the place to do that or if relevant to this PR create a comment on that line to discuss how best to address something.

This means I won't +1 this PR until you remove the TODO comments or convert them.

betatim · 2017-12-28T23:38:46Z

skopt/optimizer/base.py

@@ -1,10 +1,11 @@
+# TODO: This doc-string is way too short and quite meaningless.


We are working on this code base together and expect to work through issues collaboratively. If you've taken the time to understand this section, you probably understand there are tradeoffs. We can work through them. As it stands, these TODO comments feel like directed angst rather than a commitment to contribute to reducing confusion.

It is an indication that your code is frustrating to work on for others and the TODOs point to specific areas that might be improved. I asked myself last night if my old code was also undocumented and unpolished. I looked at ArrayOps which I did almost 12 years ago and look at how beautifully it was designed, implemented and documented. Even though it has been so long and I haven't worked on it since, and it is very advanced C++ programming, I was still able to grasp it again by simply browsing my docs and commented code. Take a look at the manual http://www.hvass-labs.org/projects/arrayops/files/ArrayOps2.pdf Then take a look at e.g. plots.py in your project which was particularly bad, or even the files we are discussing here. ArrayOps had almost no users (I think it was used in teaching advanced C++ but not really used in production). Your library has lots of users already. Why not make the code beautiful so others can contribute more easily? Right now there's a huge barrier to making contributions because your code is often impossible to understand.

betatim · 2017-12-28T23:46:29Z

skopt/optimizer/base.py

 from ..utils import eval_callbacks


+class _WrapFuncNamedArgs:


Instead of creating (and having to maintain) our own function wrapper we could use functools.wraps like so:

In [3]: def adaptor(f): ...: @wraps(f) ...: def wrapper(*args): ...: print('modify args', args) ...: return f(*args) ...: return wrapper ...: In [4]: def my_objective(x, y, z): ...: """doc here""" ...: return x + y + z ...: In [5]: adapted_obj = adaptor(my_objective) In [6]: adapted_obj? Signature: adapted_obj(x, y, z) Docstring: doc here File: ~/Downloads/<ipython-input-4-84edd263a788> Type: function

it takes care of forwarding doc strings, and signatures, etc.

Thanks for the tip, I'll take a look.

betatim · 2017-12-28T23:52:13Z

skopt/optimizer/base.py

-        n_calls -= len(y0)
+    # TODO: I tried to clean this up a little, but it is VERY confusing.
+    # Pass user suggested initialisation points to the optimizer.
+    if x0 is not None:


x0 should already be initialised as an empty list by the time we get here, so we should keep the old test (if x0).

I like the comment but splitting up the if clause makes it harder to understand

As I recall there was an if x0: statement here. That is not Python, that is C-style programming! And it can actually fail in Python if the object has overrided some dunder-method (I can't remember which one but I think there's a PyCon talk about this ...) In Python you should make it explicit what you are testing against e.g. by writing if x0 is not None:

I think I'll re-submit the wrapper function as discussed below and I'll remove all these changes to the other files and just add the wrapper-function with as few changes as possible. It would be nice if you would go over some of these files and polish them.

betatim · 2017-12-28T23:53:22Z

skopt/optimizer/base.py

-    # Bayesian optimization loop
+    # Bayesian optimization loop.
    for n in range(n_calls):
+        # Get next point in the search-space.


The next three comments restate what the line of code says so I would remove them

The code is easy to understand for you because you look at it every day. For newcomers the comments help and they don't do any harm to experts. But I'm not going to argue with you. I am just informing you that it would be easier for others to help you, if your code was better documented.

betatim · 2017-12-28T23:59:09Z

I would provide the wrapper in skopt.utils and let the user pass in the wrapped objective instead of adding a new argument to influence the behaviour. This has two advantages:

it does not increase the number of arguments. There is a feeling (and corresponding discussion) that there are already too many, and how could we reduce them
it provides the same functionality to power users who use Optimizer directly.

betatim · 2017-12-29T00:10:50Z

One more thought on having named arguments.

Without any changes to scikit-optimize having named arguments (or a nice way to unpack them) already works (as long as the dimensions are defined in the same order):

# note the double parentheses 
def named_fitness((learning_rate, num_dense_layers, num_dense_nodes, activation)):
  pass

The use case in #583 and when I've wanted the arguments to be named was when I had to pass them on to something else (like a scikit-learn estimator) where you want a dict to pass to set_params()

This makes me think we should remind people of argument unpacking in python and instead add support for objectives that receive a "single argument" like so: objective(**params) where we pass in a dictionary.

Hvass-Labs · 2017-12-29T08:03:13Z

@betatim Thanks again for the quick response! We don't agree on a lot of issues especially regarding commenting of code and who should do what on a project, but you have a strong sense of making clean API design and I appreciate that! Why don't you just lead the project? It is already my impression that you are the main developer on this project? You should just take charge.

I like that you keep reducing the ideas for API design until it is as simple as possible. That is how good design is done. It is important that you scrutinize the contributions from other team members as well. I don't think you would ever have approved plots.py if you had scrutinized it like you are doing with my PRs.

I am also learning something from your ideas and feedback, including Python features I wasn't aware of, so thanks for that!

Your final idea on this issue was this:

# note the double parentheses 
def named_fitness((learning_rate, num_dense_layers, num_dense_nodes, activation)):
    pass

I hadn't thought about unpacking x already in the function-declaration. But it is semantically similar to what I previously did myself:

def named_fitness(x):
learning_rate, num_dense_layers, num_dense_nodes, activation = x

However, this has a few problems:

It is a hack of the Python API and as you note, you have to write double-parentheses in the function declaration which is confusing.
One advantage of using Python's named arguments is that you have added security against making mistakes by re-ordering the arguments. This is probably an even more important reason than saving a line of code.

I thought about making a function-wrapper e.g. in skopt.utils. I think the main reason I decided against it was because named args is a feature I would personally use all the time, so I thought it would be better to include it directly in the API.

I now see that my wrapper-function is actually incorrect because dimensions may not be a list of Dimension-objects. Maybe it would indeed be better to make the wrapper-function external so we also avoid having to deal with all these special cases of the dimensions argument. I'll take a look and re-submit later. I'm working on plots.py now, which is taking forever because it is so hard to understand the code.

glouppe · 2017-12-29T14:06:01Z

Hi,

Thanks for your contribution. Can you remove your inline comments so that we can focus the review on changes that are concerned with this proposal? (i.e., adding support for named arguments)

We acknowledged that there might things that are difficult for newcomers to understand (as is with any complex bit of research code) and are open to discuss them, but in order to not waste everyone's time, it is probably more efficient to discuss them in standalone and targeted PRs.

Hvass-Labs · 2018-01-01T11:30:04Z

A much better version of this is made in PR #592. So I am closing this PR.

Added support for named arguments to objective function.

9a7717f

Hvass-Labs mentioned this pull request Dec 28, 2017

[MRG] Rewrote most of plots.py. Added 2 more plotting functions. Better support for named dimensions. #579

Merged

betatim reviewed Dec 28, 2017

View reviewed changes

betatim mentioned this pull request Dec 29, 2017

gp_minimize and company can not take points as dictionaries #583

Closed

Hvass-Labs mentioned this pull request Jan 1, 2018

[MRG+2] Added support for named arguments with objective function (NEW version) #592

Merged

Hvass-Labs closed this Jan 1, 2018

Hvass-Labs deleted the named_args branch January 8, 2018 08:31

		@@ -1,10 +1,11 @@
		# TODO: This doc-string is way too short and quite meaningless.

Added support for named arguments to objective function. #590

Added support for named arguments to objective function. #590

Uh oh!

Conversation

Hvass-Labs commented Dec 28, 2017

Uh oh!

pep8speaks commented Dec 28, 2017

Uh oh!

betatim commented Dec 28, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

betatim commented Dec 28, 2017

Uh oh!

betatim commented Dec 29, 2017

Uh oh!

Hvass-Labs commented Dec 29, 2017

Uh oh!

glouppe commented Dec 29, 2017

Uh oh!

Hvass-Labs commented Jan 1, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants