.. _grid_search:

.. currentmodule:: sklearn.grid_search

==========================================
Grid Search: setting estimator parameters
==========================================

Grid Search is used to optimize the parameters of a model (e.g. ``C``,
``kernel`` and ``gamma`` for Support Vector Classifier, ``alpha`` for
Lasso, etc.) using an internal :ref:`cross_validation` scheme.

GridSearchCV
============

The main class for implementing hyperparameter grid search in
scikit-learn is :class:`GridSearchCV`. This class is passed
a base model instance (for example ``sklearn.svm.SVC()``) along with a
grid of potential hyper-parameter values specified with the ``param_grid``
attribute. For instance the following ``param_grid``::

  param_grid = [
    {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
    {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
   ]

specifies that two grids should be explored: one with a linear kernel and
C values in [1, 10, 100, 1000], and the second one with an RBF kernel,
and the cross-product of C values ranging in [1, 10, 100, 1000] and gamma
values in [0.001, 0.0001].

The :class:`GridSearchCV` instance implements the usual
estimator API: when "fitting" it on a dataset all the possible
combinations of hyperparameter values are evaluated and the best
combination is retained.
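
As a sketch of this workflow (assuming a recent scikit-learn release, where
these classes live in ``sklearn.model_selection`` rather than
``sklearn.grid_search``; the dataset choice here is arbitrary), fitting the
grid above might look like::

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV  # sklearn.grid_search in older releases
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    param_grid = [
        {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
        {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
    ]

    # GridSearchCV behaves like a regular estimator: fit() cross-validates
    # every combination in the grid and keeps the best one.
    search = GridSearchCV(SVC(), param_grid, cv=3)
    search.fit(X, y)

    print(search.best_params_)  # the retained combination
    print(search.best_score_)   # its mean cross-validated score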

An alternative scoring function can be specified via the ``scoring`` parameter to
:class:`GridSearchCV`.
See :ref:`score_func_objects` for more details.

.. topic:: Examples:

    - See :ref:`example_grid_search_digits.py` for an example of
      Grid Search computation on the digits dataset.

    - See :ref:`example_grid_search_text_feature_extraction.py` for an example
      of Grid Search coupling parameters from a text documents feature
      extractor (n-gram count vectorizer and TF-IDF transformer) with a
      classifier (here a linear SVM trained with SGD with either elastic
      net or L2 penalty) using a :class:`pipeline.Pipeline` instance.

.. note::

    Computations can be run in parallel if your OS supports it, by using
    the keyword ``n_jobs=-1``; see the function signature for more details.


Randomized Hyper-Parameter Optimization
=======================================

While using a grid of parameter settings is currently the most widely used
method for hyper-parameter optimization, other search methods have more
favourable properties.
:class:`RandomizedSearchCV` implements a randomized search over hyperparameters,
where each setting is sampled from a distribution over possible parameter values.
This has two main benefits over searching over a grid:

* A budget can be chosen independent of the number of parameters and possible values.

* Adding parameters that do not influence the performance does not decrease efficiency.

Specifying how parameters should be sampled is done using a dictionary, very
similar to specifying parameters for :class:`GridSearchCV`. Additionally,
a computation budget is specified using ``n_iter``, which is the number
of iterations (parameter samples) to be used.
For each parameter, either a distribution over possible values or a list of
discrete choices (which will be sampled uniformly) can be specified::

  [{'C': scipy.stats.expon(scale=100), 'gamma': scipy.stats.expon(scale=.1),
    'kernel': ['rbf'], 'class_weight': ['auto', None]}]

This example uses the ``scipy.stats`` module, which contains many useful
distributions for sampling hyperparameters, such as ``expon``, ``gamma``,
``uniform`` or ``randint``.
In principle, any object can be passed that provides a ``rvs`` (random
variate sample) method to sample a value. A call to the ``rvs`` function should
provide independent random samples from possible parameter values on
consecutive calls.
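
For illustration, a hypothetical log-uniform sampler (not part of
scikit-learn; the class name and seeding scheme here are invented for this
sketch) only needs to expose such an ``rvs`` method::

    import numpy as np

    class LogUniform(object):
        # Illustrative sampler: draws values uniformly on a log scale.
        # Any object with an ``rvs`` method returning a fresh sample on
        # each call can serve as a parameter distribution.
        def __init__(self, low, high, seed=None):
            self.low = np.log(low)
            self.high = np.log(high)
            self.rng = np.random.RandomState(seed)

        def rvs(self, *args, **kwargs):
            # Consecutive calls yield independent samples; extra arguments
            # (e.g. a random state passed by some scikit-learn versions)
            # are ignored in this sketch.
            return float(np.exp(self.rng.uniform(self.low, self.high)))

    dist = LogUniform(1e-3, 1e3, seed=0)
    samples = [dist.rvs() for _ in range(5)]  # five independent draws in [1e-3, 1e3]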

.. warning::

    The distributions in ``scipy.stats`` do not allow specifying a random
    state. Instead, they use the global numpy random state, which can be seeded
    via ``np.random.seed`` or set using ``np.random.set_state``.

For continuous parameters, such as ``C`` above, it is important to specify
a continuous distribution to take full advantage of the randomization. This way,
increasing ``n_iter`` will always lead to a finer search.
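
Putting the pieces together (again assuming a recent scikit-learn, where the
class lives in ``sklearn.model_selection``; the dataset, budget, and the
``'balanced'`` class weight, which replaces the older ``'auto'`` value, are
choices made for this sketch), a randomized search might be run as::

    import scipy.stats
    from sklearn.datasets import load_iris
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    # Continuous distributions for C and gamma, discrete choices for the rest.
    param_distributions = {
        'C': scipy.stats.expon(scale=100),
        'gamma': scipy.stats.expon(scale=.1),
        'kernel': ['rbf'],
        'class_weight': ['balanced', None],
    }

    # n_iter fixes the budget: exactly 10 settings are sampled and evaluated,
    # regardless of how many parameters or values there are.
    search = RandomizedSearchCV(SVC(), param_distributions, n_iter=10,
                                cv=3, random_state=0)
    search.fit(X, y)
    print(search.best_params_)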

.. topic:: Examples:

    * :ref:`example_randomized_search.py` compares the usage and efficiency
      of randomized search and grid search.

.. topic:: References:

    * Bergstra, J. and Bengio, Y.,
      Random search for hyper-parameter optimization,
      The Journal of Machine Learning Research (2012)


Alternatives to brute force grid search
=======================================
