
Conversation

@holgern
Contributor

@holgern holgern commented Feb 6, 2020

See also #433

  • Add unit tests for IntegerEncoder
  • Add Sobol, Latin hypercube, Hammersly and Halton generators for initial sampling (see the usage sketch below)
  • Add an example and unit tests
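
A minimal usage sketch, assuming the sampler API that later shipped as skopt.sampler; this PR's work-in-progress package is named samples, so the exact import path is an assumption:

```python
# Hedged sketch: assumes the sampler classes as they later shipped in
# skopt.sampler; this PR's in-progress package is named `samples`, so the
# import path here is an assumption.
from skopt.sampler import Sobol, Lhs, Halton, Hammersly
from skopt.space import Space

space = Space([(-5.0, 10.0), (0.0, 15.0)])  # two continuous dimensions

for sampler in (Sobol(), Lhs(), Halton(), Hammersly()):
    points = sampler.generate(space.dimensions, n_samples=8)
    print(type(sampler).__name__, points[:2])  # first two of 8 initial points
```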

@pep8speaks

pep8speaks commented Feb 6, 2020

Hello @holgern! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 26:1: E402 module level import not at top of file
Line 28:1: E402 module level import not at top of file
Line 29:1: E402 module level import not at top of file
Line 30:1: E402 module level import not at top of file
Line 31:1: E402 module level import not at top of file
Line 32:1: E402 module level import not at top of file
Line 33:1: E402 module level import not at top of file
Line 34:1: E402 module level import not at top of file
Line 38:1: E302 expected 2 blank lines, found 1
Line 41:80: E501 line too long (82 > 79 characters)

Line 27:1: E402 module level import not at top of file
Line 29:1: E402 module level import not at top of file
Line 30:1: E402 module level import not at top of file
Line 31:1: E402 module level import not at top of file
Line 32:1: E402 module level import not at top of file
Line 33:1: E402 module level import not at top of file
Line 34:1: E402 module level import not at top of file
Line 35:1: E402 module level import not at top of file
Line 39:1: E302 expected 2 blank lines, found 1
Line 42:80: E501 line too long (82 > 79 characters)
Line 50:1: E305 expected 2 blank lines after class or function definition, found 1

Line 25:1: E402 module level import not at top of file
Line 27:1: E402 module level import not at top of file
Line 33:80: E501 line too long (93 > 79 characters)
Line 37:1: E402 module level import not at top of file
Line 39:1: E302 expected 2 blank lines, found 0
Line 42:1: E305 expected 2 blank lines after class or function definition, found 1
Line 42:1: E402 module level import not at top of file
Line 44:1: E302 expected 2 blank lines, found 1
Line 49:1: E305 expected 2 blank lines after class or function definition, found 1
Line 49:1: E402 module level import not at top of file
Line 50:1: E402 module level import not at top of file
Line 51:1: E402 module level import not at top of file
Line 53:1: E302 expected 2 blank lines, found 1
Line 53:80: E501 line too long (92 > 79 characters)
Line 71:9: E265 block comment should start with '# '
Line 116:1: E305 expected 2 blank lines after class or function definition, found 1
Line 116:1: E402 module level import not at top of file
Line 133:11: E225 missing whitespace around operator
Line 140:25: E201 whitespace after '('
Line 194:11: W292 no newline at end of file

Comment last updated at 2020-02-19 10:47:18 UTC

* example added
* New samples package
* Latin hypercube, Sobol, Hammersly and Halton samples were added
@holgern holgern changed the title [WIP] Add IntegerEncoder and improve transformer handling [WIP] Add initial sampling generation from latin hypercube, sobol, hammersly and halton Feb 7, 2020
@betatim
Member

betatim commented Feb 12, 2020

Are there benchmarks or comparisons somewhere that show the benefits of using these other samplers for the initial points?

@holgern
Contributor Author

holgern commented Feb 15, 2020

[figure: benchmark plot comparing initial point generators on hart6]

Benchmark on hart6 with 25 initial points and 50 calls in total; gp_minimize was used.
Halton shows the best results, followed by Hammersly, LHS with maximin, LHS, and Sobol.
All initial point generators are better than random samples.
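
A sketch of how such a run might look, assuming the initial_point_generator parameter as it later landed in gp_minimize and that hart6 lives in skopt.benchmarks:

```python
# Hedged sketch: `initial_point_generator` and `n_initial_points` as they
# later landed in gp_minimize; the hart6 import location is an assumption.
from skopt import gp_minimize
from skopt.benchmarks import hart6

space = [(0.0, 1.0)] * 6  # hart6 is defined on the unit hypercube

res = gp_minimize(hart6, space,
                  n_calls=50,                       # 50 evaluations in total
                  n_initial_points=25,              # 25 from the generator
                  initial_point_generator="halton",
                  random_state=0)
print(res.fun)
```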

I will remove the optimized LHS using the Enhanced Stochastic Evolutionary algorithm, as it does not perform very well and is slow.

* Optimized LHS using the Enhanced Stochastic Evolutionary algorithm is removed, as it is slow and does not perform well
* LHS default is changed to maximin
* Optimizer settings are simplified and init_point_gen_kwargs has been added
* Hammersly and Halton now have min_skip and max_skip parameters similar to Sobol (see the sketch below)
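
A hedged sketch of the min_skip/max_skip keywords named in the last commit message above; the exact defaults and semantics are assumptions:

```python
# Hedged sketch: min_skip/max_skip as named in the commit message; the
# defaults and exact semantics are assumptions.
from skopt.sampler import Halton
from skopt.space import Space

space = Space([(0.0, 1.0), (0.0, 1.0)])

# Drop a run-in of the sequence: the number of leading points to skip is
# drawn between min_skip and max_skip, which decorrelates repeated runs.
halton = Halton(min_skip=1, max_skip=100)
points = halton.generate(space.dimensions, n_samples=10)
```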
@holgern
Contributor Author

holgern commented Feb 16, 2020

[figure: Branin benchmark plot comparing initial point generators]
Branin example with 256 repeated runs

@betatim
Member

betatim commented Feb 18, 2020

Thanks for making the comparisons! Nice work.

One thing I'd consider while adding (or not) things to scikit-optimize is that it is better for users to offer fewer choices: a set of options that covers (say) 80% of what users want to do.

To cater to experts wanting to do expert things (for which they are by definition a minority), we have tried to make skopt a toolkit that people can use to build their own optimisation machine by plugging together the components that make up the Optimizer class in a different way.

So I'd always prioritise making things extendable or replaceable and keeping the amount of code and options built into scikit-optimize low. This means less maintenance effort, higher quality for what is here, and it encourages others to make packages with tools that can be used in skopt. Someone else maintaining code is a good thing because it removes burden from the skopt maintainers and keeps us honest in terms of offering a good API for extending things.


Another thing I wondered: for MC integration, Sobol and friends are great because they cover the space more uniformly than random sampling. An equally good thing (for MC integration) would be a precomputed grid of points, except that in MC integration you don't know how many points you will sample, so you can't make a precomputed/equally spaced grid. However, in skopt we decide up front how many initial points to try before switching to "Bayesian optimisation" mode. So maybe another interesting thing to add would be a sampler that doesn't pick points at random but instead distributes points uniformly on a grid across all dimensions.
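
A minimal sketch of the kind of grid sampler described above, using hypothetical helper names that are not part of skopt:

```python
import numpy as np

def grid_points(bounds, points_per_dim):
    """Hypothetical helper: uniformly spaced grid over a box.

    bounds is a list of (low, high) pairs, one per dimension.
    """
    axes = [np.linspace(low, high, points_per_dim) for low, high in bounds]
    mesh = np.meshgrid(*axes, indexing="ij")
    return np.stack([m.ravel() for m in mesh], axis=-1)

# 3 points per dimension over two dimensions -> 9 initial points.
points = grid_points([(-5.0, 10.0), (0.0, 15.0)], points_per_dim=3)
print(points.shape)  # (9, 2)
```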

* Inverse transformation is performed inside the InitialPointGenerators
* Adapt examples and docs
* Fix normalize in LHS generation
@holgern holgern merged commit bc37008 into scikit-optimize:master Feb 19, 2020
@holgern
Contributor Author

holgern commented Feb 20, 2020

Thanks for your reply. I accidentally merged this PR too early.
I continued my work on this issue in PR #851.

  • I removed the init_point_gen_kwargs parameter, so that it is easier to use.
  • I created cook_initial_point_generator for advanced usage (see the sketch below).
  • I added an equally spaced grid layout.
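
A hedged sketch of how cook_initial_point_generator might be used, assuming the helper as it later appeared in skopt.utils; the criterion keyword for LHS is an assumption:

```python
# Hedged sketch: cook_initial_point_generator as it later appeared in
# skopt.utils; the `criterion` keyword for LHS is an assumption.
from skopt import gp_minimize
from skopt.utils import cook_initial_point_generator

# Wrap a named generator together with its keyword arguments.
lhs_maximin = cook_initial_point_generator("lhs", criterion="maximin")

res = gp_minimize(lambda x: (x[0] - 1.0) ** 2, [(-2.0, 2.0)],
                  n_calls=15,
                  initial_point_generator=lhs_maximin,
                  random_state=0)
print(res.x)
```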

@holgern holgern changed the title [WIP] Add initial sampling generation from latin hypercube, sobol, hammersly and halton ENH Add initial sampling generation from latin hypercube, sobol, hammersly and halton Feb 22, 2020