[MRG+1] Implement greedy A-optimal acquisition function for pure exploration #432
Conversation
Codecov Report
@@ Coverage Diff @@
## master #432 +/- ##
==========================================
+ Coverage 86.43% 86.46% +0.02%
==========================================
Files 22 22
Lines 1563 1581 +18
==========================================
+ Hits 1351 1367 +16
- Misses 212 214 +2
Continue to review full report at Codecov.
Looks interesting! Could you elaborate a bit more on particular use cases for the function, e.g. give a bit more description of practical use cases? Also, could you provide some references to the literature where such a thing is used? That would be good so that people can take a look at it in a bit more detail. One idea I have in mind is to possibly use this instead of random initialization for the optimizers, so that the initial points generated are distributed "more evenly" across the search space.
The general setting is called active learning, in which you want to learn the target function with as few evaluations as possible. "A-optimality" was established in optimal design: the goal is to specify design points in advance which reduce the average variance of the parameter estimates. See [1] for a good treatment of the different optimality criteria as applied in Bayesian optimization. This reference could also be useful if we want to implement more criteria, like the mutual information. For initialization we could calculate a fixed set of points in advance.

[1] Krause, Andreas, Ajit Singh, and Carlos Guestrin. "Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies." Journal of Machine Learning Research 9 (2008): 235-284.
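To make the greedy idea concrete, here is a minimal sketch of one-step uncertainty-maximizing selection with a scikit-learn GP. The function name `pick_most_uncertain` is illustrative and not the PR's actual code:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def pick_most_uncertain(model, candidates, n_points=1):
    """Rank candidates by posterior std and return the top n.

    This is the one-step greedy criterion; a full greedy A-optimal
    loop would re-fit the model after each selected point.
    """
    _, std = model.predict(candidates, return_std=True)
    return candidates[np.argsort(std)[::-1][:n_points]]

# Usage: fit on a few known evaluations, then query where uncertainty is largest.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(5, 1))
y = np.sin(3.0 * X).ravel()
gp = GaussianProcessRegressor().fit(X, y)
candidates = np.linspace(-2.0, 2.0, 200).reshape(-1, 1)
next_points = pick_most_uncertain(gp, candidates, n_points=3)
```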
Naive question: how is this acquisition function different from evaluating the objective using a Sobol (or your favourite quasi-random) sequence? Is it because with a Sobol sequence you explore the space "evenly", whereas here you pick points that have large uncertainty? Is there a simple example where the two don't lead to "the same" thing? (A heteroscedastic objective?)
Hmm, I think you can achieve the same by setting `kappa` to a very large value.
@betatim I will play around with a few GPs to come up with an example where the behavior is different. In any case, the Sobol sequence is not adaptive, i.e. it will not change if the user provides an initial set of points for which the objective value is already known. @MechCoder Yes, indeed, I was doing exactly this as a workaround before deciding to implement the acquisition function. In my opinion it is cleaner this way, since the effect of the mean is completely removed.
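To illustrate the non-adaptivity point, a small sketch (assuming `scipy.stats.qmc` for the Sobol points; any quasi-random generator would do):

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor

# Non-adaptive: these eight Sobol points are fixed in advance and do not
# depend on any evaluations the user already has.
sobol_points = qmc.Sobol(d=1, scramble=False).random(8)

# Adaptive: given observations clustered near x = 0.15, the GP posterior
# std is largest far away from the cluster, so that is where we sample next.
X_known = np.array([[0.10], [0.15], [0.20]])
y_known = np.sin(10.0 * X_known).ravel()
gp = GaussianProcessRegressor().fit(X_known, y_known)
candidates = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
_, std = gp.predict(candidates, return_std=True)
next_point = candidates[np.argmax(std)]  # unlike Sobol, reacts to the data
```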
In that case, I would prefer having a special value for `kappa` that will set exploitation to zero (and will have no controversy in getting merged) instead of having yet another acquisition function.
Ok, that sounds like a good compromise. I can make the change next week since I'm going on vacation today.
I made the change by letting the user provide a special string `'Aopt'`. Somehow GitHub did not like that I rebased the commits and force-pushed. Any ideas on how to fix the pull request without recreating it?
Looks good to me. +1 for merge
skopt/acquisition.py (outdated diff):
> Controls how much of the variance in the predicted values should be
> taken into account. If set to be very high, then we are favouring
> exploration over exploitation and vice versa.
> If set to 'Aopt', the acquisition function will only use the variance
Sorry for being a prick, but is `Aopt` the best name?
I agree. Since we do not have any other acquisition functions approximating optimal designs, we could call it something like 'var', 'variance', 'var_only' or 'explore_only'. I am open to suggestions.
"variance" is fine with me.
skopt/acquisition.py (outdated diff):
> Controls how much of the variance in the predicted values should be
> taken into account. If set to be very high, then we are favouring
> exploration over exploitation and vice versa.
> If set to 'variance', the acquisition function will only use the variance
Sorry again, but should this be `'std'`?
Are you talking about the name of the acquisition function? Some might have weird associations with 'std' as an abbreviation 😅
After looking at the scikit-optimize documentation I would propose calling it `'uncertainty'`, which is the term used in the introduction to Bayesian optimization. Though it is technically true that we pick the points maximizing the standard deviation (and, equivalently, the variance), I would say it is more consistent to use 'uncertainty'. Thoughts?
So the confusion on my side is because `kappa` denotes the value by which the std is multiplied, and not the acquisition function itself. I would be fine with allowing `kappa="inf"` and/or `kappa=np.inf` with a note that says this switches off exploitation. WDYT?
No strong opinion from my side. Either way is fine for me.
I would be fine calling it `'inf'` and explaining it in the docstring.
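For illustration, one way such a special value could be handled inside an LCB-style acquisition. This is a sketch, not necessarily the exact skopt implementation; `model` is assumed to expose scikit-learn's `predict(X, return_std=True)` interface:

```python
import numpy as np

def gaussian_lcb(X, model, kappa=1.96):
    """Lower confidence bound with an exploration-only special case (sketch)."""
    mu, std = model.predict(X, return_std=True)
    if kappa == "inf" or np.isinf(kappa):
        # Exploitation switched off: rank points by predicted std alone,
        # avoiding the inf * 0 = nan issue a literal kappa=np.inf would cause.
        return -std
    return mu - kappa * std
```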
Since in LCB the variable `kappa` is used to describe how much weight is given to the standard deviation, `'inf'` is a more natural name for the limit of this weight.
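In formula terms, the thread is discussing

$$a_{\mathrm{LCB}}(x) = \mu(x) - \kappa\,\sigma(x),$$

so as $\kappa \to \infty$ the mean term becomes negligible and minimizing $a_{\mathrm{LCB}}$ over a candidate set reduces to maximizing $\sigma(x)$, i.e. pure exploration.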
Good to go for me when Travis is happy.
The Travis build canceled due to …
Thanks!
This acquisition function aims at reducing the overall uncertainty of our objective function approximation.
This is useful if you want to accurately gauge the effect of every hyperparameter on the objective function, typically to set proper ranges for the subsequent optimization or to remove a parameter completely.
The `gaussian_a_opt` function uses the standard deviation provided by the base estimator and samples first those points where it is maximal. Suggestions for improvement are welcome.
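Roughly, the function boils down to the following. This is a sketch based on the description above, not necessarily the merged code; `model` is assumed to follow scikit-learn's `predict(X, return_std=True)` convention:

```python
import numpy as np

def gaussian_a_opt(X, model):
    """Greedy A-optimal (pure exploration) acquisition, as a sketch.

    Returns the negative posterior standard deviation, so that the usual
    acquisition *minimization* selects the point of maximal uncertainty.
    """
    X = np.asarray(X)
    _, std = model.predict(X, return_std=True)
    return -std
```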