Grid Search: Steps and Example
Grid search is a hyperparameter tuning technique used in machine learning to identify the best combination of
hyperparameters for a model. It systematically evaluates all possible combinations of hyperparameter values to
find the optimal set based on a scoring metric.
Steps in Grid Search
• Define the Hyperparameters and Their Ranges: Identify the hyperparameters to tune and their possible
values. For example:
- For a support vector machine (SVM), you might tune:
- C (regularization parameter): [0.1, 1, 10]
- kernel: ['linear', 'rbf']
- gamma (kernel coefficient): [0.001, 0.01, 0.1]
• Split the Dataset: Divide the dataset into training and validation sets (or use cross-validation).
• Construct the Grid: Create a grid of all possible hyperparameter combinations. For the above example:
- Total combinations: 3 (C values) × 2 (kernels) × 3 (gamma values) = 18.
• Train Models for Each Combination: For each combination of hyperparameters, train the model on the
training data and evaluate it on the validation data using a predefined scoring metric (e.g., accuracy, F1-
score).
• Select the Best Combination: Identify the combination that yields the best performance on the validation
set.
• Test the Best Model: Evaluate the selected model on the test set to estimate its performance.
Example: Grid Search for SVM
**Dataset**: Iris dataset (predicting flower species)
**Hyperparameters**:
- C: [0.1, 1, 10]
- kernel: ['linear', 'rbf']
- gamma: [0.001, 0.01, 0.1]
Process**:
Split the data into training and test sets.
Define the hyperparameter grid:
```python
param_grid = {
'C': [0.1, 1, 10],
'kernel': ['linear', 'rbf'],
'gamma': [0.001, 0.01, 0.1]
}
Use `GridSearchCV` (from `scikit-learn`) to automate the search:
python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)
Retrieve the best parameters:
python code:
print("Best parameters:", grid_search.best_params_)
Output might be:
Best parameters: {'C': 10, 'gamma': 0.01, 'kernel': 'rbf'}
Evaluate the model with the best parameters on the test set.
This approach ensures that the model is fine-tuned to achieve the best performance for the given task.