This repository was archived by the owner on Feb 28, 2024. It is now read-only.

Conversation

@nfcampos
Contributor

@nfcampos nfcampos commented Aug 12, 2016

added validate_sample argument to Space

  • function that takes in each sample and returns True if the sample is valid
  • added tests for validate_sample
  • catch RecursionError to raise ValueError about dimensions and validate_func being incompatible

this isn't ready for merging or anything, opening the PR just for discussion

@nfcampos nfcampos mentioned this pull request Aug 12, 2016
@betatim betatim changed the title first attempt at conditional Space [WIP] first attempt at conditional Space Aug 15, 2016
@betatim
Member

betatim commented Aug 15, 2016

Could you explain a little how to use this with an example or two?

This is what I had in mind with conditional spaces (taken from the sklearn docs):

param_grid = [
  {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
  {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
 ]
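For readers less familiar with the sklearn convention: a list of dicts describes the *union* of several independent grids, so kernel-specific parameters like gamma only appear in the sub-grid where they are meaningful. A small stdlib-only sketch of how the grid above expands (illustrative only, not scikit-optimize code):

```python
from itertools import product

param_grid = [
    {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
    {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]

def expand(grid):
    """Enumerate every parameter combination across all sub-grids."""
    for sub in grid:
        keys = sorted(sub)
        for values in product(*(sub[k] for k in keys)):
            yield dict(zip(keys, values))

# 4 linear combinations + 4 * 2 rbf combinations = 12 candidates in total
combos = list(expand(param_grid))
```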

This doesn't work in scikit-optimize yet because we specify the dimensions in a way inspired by scipy.optimize. In a very early version of Space we had support for the following:

param_grid = [
  ((1, 10, 100, 1000), ('linear',)),
  ((1, 10, 100, 1000), ('rbf',), Categorical(0.001, 0.0001)), # note the order
]

Already in this example you have to be careful about how you order your dimensions, so that you don't end up doing a lot of work in your objective function just to figure out which value is which. For this having named dimensions would help, because then we could pass them as named arguments.

How would one implement handling these varying size spaces in a GP? -> should all this be supported by optimizing over two spaces separately "under the hood"?

The more I think about it, the more I am thinking that this is going to take some trial & error before we get it right API-wise.

@nfcampos
Contributor Author

nfcampos commented Aug 15, 2016

I agree that getting this API right will not be obvious. Because of that, I thought we could start by just having a function that validates samples (returning True for valid samples).
While this does not let you define conditional dimensions, it does let you declare that a given value of dimension 0 is incompatible with a given value of dimension 1.

example:

def validate_sample(sample):
  # 'lbfgs' only supports the 'l2' penalty
  return sample[0] == 'liblinear' or sample[1] == 'l2'

space = Space([
  ('liblinear', 'lbfgs'),  # solver
  ('l1', 'l2'),  # penalty
], validate_sample=validate_sample)
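To make the RecursionError point from the commit message concrete, here is a rough sketch (not the PR's actual code) of how a space's rvs could use validate_sample: rejection-sample candidates and give up with a ValueError when the validator looks unsatisfiable. `rvs_validated` and `MAX_ATTEMPTS` are hypothetical names, and dimensions are treated as simple categorical tuples for brevity.

```python
import random

MAX_ATTEMPTS = 1000  # hypothetical cut-off standing in for the recursion limit

def rvs_validated(dimensions, validate_sample, rng=random):
    """Draw candidates and reject invalid ones; fail if nothing passes."""
    for _ in range(MAX_ATTEMPTS):
        sample = [rng.choice(dim) for dim in dimensions]
        if validate_sample(sample):
            return sample
    raise ValueError("dimensions and validate_sample appear incompatible")

def validate_sample(sample):
    # same validator as above, repeated so this snippet is self-contained
    return sample[0] == 'liblinear' or sample[1] == 'l2'

dimensions = [('liblinear', 'lbfgs'), ('l1', 'l2')]
samples = [rvs_validated(dimensions, validate_sample) for _ in range(100)]
```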

For this having named dimensions would help, because then we could pass them as named arguments.

That's exactly the benefit of using the DictSpace from the other PR. Do you think it'd be better if the dimensions themselves had names? What if you then used a mix of named and unnamed dimensions in the same space? That's why I placed the names at the level of the space.
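As a sketch of the named-dimension idea (all names here are hypothetical, not an existing API): a positional sample can be zipped with the space's names and handed to the objective as keyword arguments, so the objective never has to track positional order.

```python
names = ['solver', 'penalty']   # assumed to live on the space itself
sample = ['liblinear', 'l1']    # one positional draw from the space

def objective(solver, penalty):
    # dummy score; a real objective would fit and evaluate a model here
    return 0.0 if (solver, penalty) == ('liblinear', 'l1') else 1.0

score = objective(**dict(zip(names, sample)))
```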

should all this be supported by optimizing over two spaces separately "under the hood"?

Maybe, but then how do you weight the various spaces when sampling? Are they all equally weighted, or weighted by the number of distinct possibilities each one defines (the product of its bounds)?

@MechCoder MechCoder added this to the 0.2 milestone Sep 8, 2016
@codecov-io

Current coverage is 81.75% (diff: 77.77%)

Merging #199 into master will decrease coverage by 0.08%

@@             master       #199   diff @@
==========================================
  Files            18         18          
  Lines           892        899     +7   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            730        735     +5   
- Misses          162        164     +2   
  Partials          0          0          

Powered by Codecov. Last update 6655873...6f22379

@betatim
Member

betatim commented Sep 14, 2016

Brief thought: it feels more natural to me that the objective decides whether a sample is valid or not. It keeps everything "together". Am I the only one for whom that is intuitive? The objective function could return +inf, raise an exception, or similar to signal that this configuration is invalid. WDYT?
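A minimal sketch of this objective-side approach (the invalid combination and the scores are made up):

```python
import math

def objective(sample):
    """Reject invalid combinations early with +inf instead of fitting."""
    solver, penalty = sample
    if solver == 'lbfgs' and penalty == 'l1':  # invalid combination
        return math.inf                         # fail early, no model fit
    # ... expensive model fit and scoring would go here ...
    return 0.5                                  # dummy score
```

The optimiser still records the invalid point, but pays nothing for evaluating it.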

@nfcampos
Contributor Author

@betatim Yeah, that's a good point. I guess I went with this because I think of Space as something that is useful on its own, so sample validation seemed like something that belongs on the Space itself. Thinking purely in terms of how good the API for using *_minimize is (which I'm well aware is probably the right framing here), I do tend to agree that this can be done more naturally inside the objective function.

@MechCoder
Member

MechCoder commented Sep 14, 2016

The problem with that approach is that it allows unnecessary, expensive function evaluations (unless there is the strict assumption that the objective function fails early on these invalid samples). It would be better to choose candidate points only from those that are "valid".

I would favour the dict-space method, but again I am not sure how to optimize spaces of different sizes. :-/

@MechCoder
Member

optimizing over two spaces separately "under the hood"?

That in a vague way is equivalent to optimising 2 different gp_minimize functions with different spaces and simply choosing the best among (n_calls / 2) * 2 candidate points, no?

@betatim
Member

betatim commented Sep 16, 2016

optimizing over two spaces separately "under the hood"?

That in a vague way is equivalent to optimising 2 different gp_minimize functions with different spaces and simply choosing the best among (n_calls / 2) * 2 candidate points, no?

I think so. It would be merely syntactic sugar.
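A rough sketch of that equivalence, with plain random search standing in for gp_minimize, a naive equal split of the budget, and all dimensions treated as categorical (the names and the toy objective are made up):

```python
import random

def minimize_subspace(objective, dimensions, n_calls, rng):
    """Random search over one sub-space; returns (best_score, best_x)."""
    best_score, best_x = float('inf'), None
    for _ in range(n_calls):
        x = [rng.choice(dim) for dim in dimensions]
        score = objective(x)
        if score < best_score:
            best_score, best_x = score, x
    return best_score, best_x

def minimize_union(objective, subspaces, n_calls, seed=0):
    """Split the budget equally across sub-spaces and keep the overall best."""
    rng = random.Random(seed)
    share = n_calls // len(subspaces)
    return min((minimize_subspace(objective, dims, share, rng)
                for dims in subspaces), key=lambda r: r[0])

# Toy problem: the rbf sub-space always scores better than linear.
subspaces = [
    [(1, 10, 100, 1000), ('linear',)],
    [(1, 10, 100, 1000), ('rbf',), (0.001, 0.0001)],
]

def objective(x):
    return 1.0 if x[1] == 'linear' else x[-1]

best_score, best_x = minimize_union(objective, subspaces, n_calls=40)
```

The equal split here dodges the weighting question raised earlier; a real implementation would have to decide how to apportion the budget.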

@betatim betatim changed the title [WIP] first attempt at conditional Space [WIP] Support Spaces with invalid parameter combinations Sep 16, 2016
@betatim
Member

betatim commented Sep 16, 2016

I'm not worried about the extra calls to the objective. I had assumed that if you set up a problem where parts of the parameter space are "invalid", you'd be smart enough to fail early in the objective.

One advantage of generating "invalid" samples and having the objective tell us that they are invalid is that you would have them in the OptimizeResult and could visualise them afterwards etc.

@nfcampos
Contributor Author

nfcampos commented Jan 9, 2018

Closing this one as well, as I won’t be updating it.

@nfcampos nfcampos closed this Jan 9, 2018
@sytham

sytham commented Jun 21, 2018

It's a shame that this was closed without merging. I think calling the objective function on a sample that you can know in advance is invalid is a waste, even if you fail early in the objective. For example, if I set n_calls of the optimizer to 20, I want that to be 20 valid calls, not, say, 20 calls where 50% of the samples are invalid, leaving only 10 evaluation points. I personally also see no upside in having these invalid samples in OptimizeResult and visualizing them: they're invalid, so I'm not interested in them.

