A machine-learning project implementing:
- Multiclass (softmax) logistic regression trained with gradient descent
- Multiclass logistic regression (K−1 parameterization) trained with Newton / IRLS
- Gaussian Naive Bayes (with variance smoothing) + sampling
- Bernoulli Naive Bayes (with Lidstone smoothing) + sampling
Benchmarked on:
- Iris (toy multiclass dataset)
- MNIST (OpenML) (handwritten digits)
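For a feel of the gradient-descent variant, here is a minimal NumPy sketch of one softmax logistic-regression update (an illustration only; the function names are hypothetical and this is not the repo's actual implementation):

```python
import numpy as np

def softmax(Z):
    # Subtract the row-wise max before exponentiating for numerical stability.
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def gd_step(W, X, Y, lr=0.1):
    """One gradient-descent step on the multiclass cross-entropy loss.

    W: (d, K) weights, X: (n, d) features, Y: (n, K) one-hot labels.
    """
    P = softmax(X @ W)                  # (n, K) predicted class probabilities
    grad = X.T @ (P - Y) / X.shape[0]   # mean gradient of the negative log-likelihood
    return W - lr * grad
```

Iterating `gd_step` from `W = 0` monotonically decreases the (convex) training loss for a small enough learning rate.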
Setup:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Optional dev dependencies (tests):

```bash
pip install -r requirements-dev.txt
```

All scripts are designed to run from the repo root.
Gradient Descent (no bias):

```bash
python scripts/iris_gd.py
```

Gradient Descent (with bias via intercept feature):

```bash
python scripts/iris_gd_bias.py
```

IRLS/Newton vs GD (no bias):

```bash
python scripts/iris_irls_vs_gd.py
```

IRLS/Newton vs GD (with bias via intercept feature):

```bash
python scripts/iris_irls_vs_gd_bias.py
```

Scikit-learn baseline comparisons:

```bash
python scripts/iris_sklearn_baseline.py
```

Gaussian NB baseline (`alpha=1e-7`):

```bash
python scripts/mnist_gnb_baseline.py
```

Gaussian NB smoothing sweep:

```bash
python scripts/mnist_gnb_sweep.py
```

Gaussian NB digit generation (uses `alpha_best=0.1` by default):

```bash
python scripts/mnist_gnb_generate.py
```

Bernoulli NB eval (`alpha=1e-8`):

```bash
python scripts/mnist_bnb_eval.py
```

Bernoulli NB digit generation:

```bash
python scripts/mnist_bnb_generate.py
```

Note on MNIST download: `fetch_openml` caches downloads under `openml_cache/` (ignored by git). The first MNIST run can take a while depending on network speed.
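The caching behaviour comes from scikit-learn's `data_home` parameter; a minimal sketch of how the MNIST scripts are assumed to load data (the `load_mnist` helper is hypothetical, not a function from this repo):

```python
from sklearn.datasets import fetch_openml

def load_mnist(cache_dir="openml_cache"):
    # fetch_openml downloads from OpenML on the first call and caches the
    # result under data_home; subsequent calls read from the local cache.
    mnist = fetch_openml("mnist_784", version=1, as_frame=False,
                         data_home=cache_dir)
    return mnist.data / 255.0, mnist.target.astype(int)
```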
- Logistic regression loss uses `scipy.special.log_softmax` to avoid overflow/underflow.
- Naive Bayes prediction is computed in log-space and normalized with `scipy.special.logsumexp`.
- Gaussian NB variance smoothing improves stability on high-dimensional MNIST features.
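A sketch combining the last two points, Gaussian NB prediction in log-space with variance smoothing (smoothing relative to the largest feature variance, similar in spirit to scikit-learn's `var_smoothing`, is an assumption of this sketch):

```python
import numpy as np
from scipy.special import logsumexp

def gnb_log_proba(X, means, variances, log_priors, alpha=1e-7):
    """Log posteriors for Gaussian NB, computed entirely in log-space.

    X: (n, d) features; means/variances: (K, d) per-class statistics;
    log_priors: (K,) log class priors.
    """
    # Variance smoothing: add a small fraction of the largest variance,
    # which keeps near-constant pixels from producing huge log-densities.
    var = variances + alpha * variances.max()
    # (n, K) log joint: log prior + sum over features of the Gaussian log pdf.
    lj = log_priors + np.stack([
        -0.5 * np.sum(np.log(2 * np.pi * var[k])
                      + (X - means[k]) ** 2 / var[k], axis=1)
        for k in range(means.shape[0])
    ], axis=1)
    # Normalize with logsumexp instead of exponentiating unnormalized scores.
    return lj - logsumexp(lj, axis=1, keepdims=True)
```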
Run sanity checks:

```bash
pytest -q
```

MIT (see LICENSE).