## Machine Learning Study Notes
### Introduction to Machine Learning
* **Definition:** Machine learning (ML) is a field of computer science that gives computer systems
the ability to learn from data without being explicitly programmed.
* **Key Concepts:**
    * **Data:** The raw material used to train ML models. Can be structured (e.g., tables) or
      unstructured (e.g., text, images).
    * **Algorithms:** The specific procedures used to learn patterns from data.
    * **Models:** The representation of what an algorithm learns. Used to make predictions.
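
The distinction between data, algorithm, and model can be made concrete with a minimal sketch. It assumes a single input feature and uses ordinary least squares as the algorithm; the data values and the names `fit_line` and `predict` are made up for illustration and are not part of these notes.

```python
# Data: structured (x, y) pairs. The values are invented for illustration.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def fit_line(points):
    """Algorithm: ordinary least squares for one input feature."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Model: the parameters the algorithm learned from the data.
slope, intercept = fit_line(data)


def predict(x):
    """Use the model to make a prediction for a new input."""
    return slope * x + intercept


print(slope, intercept, predict(5.0))
```

Here the list of pairs is the data, `fit_line` is the algorithm, and the pair `(slope, intercept)` is the model. Swapping in a different algorithm would produce a different model from the same data.
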
### Types of Machine Learning
* **Supervised Learning:**
    * Learns from labeled data (input features and corresponding target variable).
    * Goal: To predict the target variable given new input features.
    * Examples:
        * **Classification:** Predicting a categorical target (e.g., spam/not spam).
        * **Regression:** Predicting a continuous target (e.g., house price).
* **Unsupervised Learning:**
    * Learns from unlabeled data (input features only).
    * Goal: To discover hidden patterns or structures in the data.
    * Examples:
        * **Clustering:** Grouping similar data points together.
        * **Dimensionality Reduction:** Reducing the number of variables while preserving important
          information.
* **Reinforcement Learning:**
    * Learns through interaction with an environment.
    * An agent learns to make decisions to maximize a reward signal.
    * Examples:
        * Game playing (e.g., AlphaGo).
        * Robotics.
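
The difference between supervised and unsupervised learning can be sketched on one toy dataset. The example below is a rough illustration, not a reference implementation: the feature values, the labels, the starting centers, and the function names `nearest_neighbor_predict` and `kmeans_1d` are all assumptions chosen for the example. The supervised part uses a 1-nearest-neighbor classifier and the unsupervised part uses k-means on one-dimensional data. Reinforcement learning is left out to keep the sketch short.

```python
# Toy 1-D feature values (invented). Two visible groups: near 1 and near 10.
features = [1.0, 1.2, 0.8, 10.0, 10.3, 9.7]
# Labels are used only by the supervised part.
labels = ["low", "low", "low", "high", "high", "high"]


def nearest_neighbor_predict(train_x, train_y, x):
    """Supervised: return the label of the closest labeled training point."""
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[best]


def kmeans_1d(xs, centers, iterations=10):
    """Unsupervised: k-means on 1-D data, starting from the given centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: attach each point to its closest center.
        clusters = [[] for _ in centers]
        for x in xs:
            idx = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[idx].append(x)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers


# Supervised: labels guide the prediction for a new input.
print(nearest_neighbor_predict(features, labels, 2.0))

# Unsupervised: no labels, only structure found in the features.
print(kmeans_1d(features, centers=[0.0, 5.0]))
```

The classifier can name a group only because the labels were supplied. The clustering step finds the same two groups but has no names for them, only their centers.
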
### The Machine Learning Process
* **Data Collection:** Gathering relevant data.
* **Data Preprocessing:** Cleaning, transforming, and preparing data for the model.
* **Model Selection:** Choosing an appropriate algorithm based on the problem and data.
* **Training:** Fitting the chosen algorithm to the training data to produce a model.
* **Evaluation:** Assessing the model's performance on held-out data that was not used for training.
* **Deployment:** Putting the model into production to make predictions.
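
The steps above can be sketched end to end in a few lines. This is a simplified illustration under several assumptions: the data is simulated rather than collected, the model is a nearest-centroid classifier chosen only for brevity, the 80/20 split is an arbitrary choice, and every function name is hypothetical.

```python
import random


def collect_data(n=200, seed=0):
    """Data collection (simulated): one feature and a binary label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 1 is centered at 3.0 and class 0 at 0.0 (invented values).
        x = rng.gauss(3.0 if label else 0.0, 1.0)
        rows.append((x, label))
    return rows


def standardize(train, test):
    """Preprocessing: scale with statistics from the training split only."""
    xs = [x for x, _ in train]
    mean = sum(xs) / len(xs)
    std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5

    def scale(rows):
        return [((x - mean) / std, y) for x, y in rows]

    return scale(train), scale(test)


def train_centroids(rows):
    """Training: the model is the mean feature value of each class."""
    centroids = {}
    for label in {y for _, y in rows}:
        xs = [x for x, y in rows if y == label]
        centroids[label] = sum(xs) / len(xs)
    return centroids


def predict(model, x):
    """Prediction: pick the class whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


def accuracy(model, rows):
    """Evaluation: fraction of rows classified correctly."""
    return sum(predict(model, x) == y for x, y in rows) / len(rows)


rows = collect_data()
split = int(0.8 * len(rows))  # hold out the last 20% for evaluation
train, test = rows[:split], rows[split:]
train, test = standardize(train, test)
model = train_centroids(train)
test_accuracy = accuracy(model, test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

In a deployed system, `predict` together with the stored centroids and scaling statistics would be what serves new requests. Computing the scaling statistics from the training split alone keeps information from the held-out rows out of training.
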
### Key Considerations
* **Overfitting:** The model fits the training data too closely, including its noise, and so performs
poorly on unseen data.
* **Underfitting:** The model is too simple to capture the underlying patterns in the data, so it
performs poorly on both training and unseen data.
* **Bias-Variance Tradeoff:** Balancing error from overly simple assumptions (bias, which leads to
underfitting) against error from sensitivity to the particular training set (variance, which leads to
overfitting).
* **Ethical Considerations:** Identifying and mitigating unfair bias in data and model outputs, and
ensuring fairness in ML applications.
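
Overfitting and underfitting can be seen by comparing training error with held-out error as model flexibility changes. The sketch below assumes k-nearest-neighbor regression on noisy samples of a sine curve. The noise level, sample sizes, seed, and values of `k` are arbitrary choices for illustration, and the function names are hypothetical.

```python
import math
import random


def make_data(n, rng):
    """Noisy samples of sin(x) on [0, 6]; the noise level is arbitrary."""
    rows = []
    for _ in range(n):
        x = rng.uniform(0.0, 6.0)
        rows.append((x, math.sin(x) + rng.gauss(0.0, 0.3)))
    return rows


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, rows, k):
    """Mean squared error of the k-NN predictions on the given rows."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in rows) / len(rows)


rng = random.Random(0)
train = make_data(60, rng)
test = make_data(60, rng)

errors = {}
for k in (1, 7, len(train)):
    errors[k] = (mse(train, train, k), mse(train, test, k))
    print(f"k={k:2d}  train MSE={errors[k][0]:.3f}  test MSE={errors[k][1]:.3f}")
```

With `k=1` each training point is its own nearest neighbor, so training error is zero while held-out error is not: the model has memorized the noise (low bias, high variance). With `k` equal to the training set size, every prediction is the overall mean, so the curve is missed entirely (high bias, low variance). An intermediate `k` would typically give the lowest held-out error, though the exact numbers depend on the seed and noise level.
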