Concrete Strength Prediction using Machine Learning
Literature Review:
Author: [Link] Shaquadan
Used 90 samples from experimental analysis to determine whether adding silica affects
compressive strength.
Authors: Sushmitha, Akash, Jalok, Ravikumar, Arjun V
Used artificial neural networks (ANN) in MATLAB to predict compressive strength.
Used 1195 samples.
Authors: Liu Pengcheng, Wu Xianguo, Hongyu, Zhong
Reported that ANN, SVM, and decision tree models are more efficient because they reduce learning errors.
Authors: Mehdi Nikoo, Farshid, Łukasz
Used 173 samples of data.
Used ANN.
About Project:
Samples: 1030
Asking questions about the different features to build the tree.
Training the model so that it learns which decisions to make.
Predicting the strength for new data.
Evaluating the predictions by calculating metrics such as Mean Squared Error and Mean
Absolute Error.
Visualizing the decision tree to see which questions it asks.
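The steps above can be sketched in a few lines of scikit-learn. This is only an illustration under stated assumptions: the data is synthetic (the 1030 project samples are not reproduced here), the feature names cement and age and the formula that generates the strengths are made up, and DecisionTreeRegressor with max_depth=3 is one plausible choice of tree rather than the project's confirmed configuration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor, export_text

# Synthetic stand-in data (assumption): 1030 rows, two made-up features.
rng = np.random.default_rng(0)
n_samples = 1030
cement = rng.uniform(100, 550, n_samples)  # hypothetical cement amounts
age = rng.uniform(1, 365, n_samples)       # hypothetical ages in days
X = np.column_stack([cement, age])
# Invented relationship plus noise, only so the tree has something to learn.
y = 0.08 * cement + 6.0 * np.log(age) + rng.normal(0, 3, n_samples)

# Hold out 20% of the samples to play the role of "new data".
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Training builds the tree: each node is a threshold question on one feature.
tree = DecisionTreeRegressor(max_depth=3, random_state=42)
tree.fit(X_train, y_train)

# Predict strengths for the held-out samples and evaluate the guesses.
y_pred = tree.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print(f"MSE: {mse:.2f}  MAE: {mae:.2f}")

# Text view of the tree, showing the questions it asks at each split.
rules = export_text(tree, feature_names=["cement", "age"])
print(rules)
```

Keeping the tree shallow makes the printed questions short enough to read; a deeper tree would usually fit better but be harder to inspect.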
Applications:
1. Mean Squared Error (MSE):
- What it does:
It looks at how wrong our guesses are and squares those errors.
- Concrete Strength Example:
If we say a concrete strength is 10, but it's actually 8, the error is 2. Squaring it gives
4. MSE adds up all these squared errors and finds the average.
- What We Want:
A smaller MSE is better. It means our guesses are really close to the actual strengths.
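As a small worked example, MSE can be computed by hand in plain Python. The function name mse and the three strength values are invented for illustration; the first pair repeats the "guess 10, actual 8" case from the text.

```python
def mse(actual, predicted):
    """Average of the squared differences between guesses and actual values."""
    squared_errors = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


actual = [8.0, 12.0, 20.0]       # made-up actual strengths
predicted = [10.0, 11.0, 20.0]   # made-up guesses

# Errors are 2, -1, 0; squares are 4, 1, 0; the mean is 5/3 (about 1.67).
print(mse(actual, predicted))
```

Because the errors are squared, the result is in squared strength units, and one large miss raises the score much more than several small ones.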
2. Mean Absolute Error (MAE):
- What it does:
Similar to MSE but without squaring. It just looks at how far off our guesses are.
- Concrete Strength Example:
If we say a concrete strength is 10, but it's actually 8, the error is 2. MAE adds up all
these errors and finds the average.
- What We Want:
A smaller MAE is better. It means our guesses are generally close to the actual
strengths.
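The contrast with MSE is easiest to see on two sets of guesses that have the same MAE. The sketch below uses invented values and helper names (mae, mse): one set is off by 2 every time, the other is exact three times and off by 8 once.

```python
def mae(actual, predicted):
    """Average of the absolute differences (no squaring)."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Average of the squared differences, for comparison."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


actual = [8.0, 8.0, 8.0, 8.0]
steady = [10.0, 10.0, 10.0, 10.0]  # always 2 too high
spiky = [8.0, 8.0, 8.0, 16.0]      # right three times, 8 too high once

print(mae(actual, steady), mse(actual, steady))  # MAE 2.0, MSE 4.0
print(mae(actual, spiky), mse(actual, spiky))    # MAE 2.0, MSE 16.0
```

MAE treats both sets as equally good, while MSE penalizes the single large miss. Which behaviour is preferable depends on how costly large errors are.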
3. R-squared Score:
- What it does:
It tells us how well our guesses explain the real strengths. If R-squared is 1, it's like
saying, "Hey, our guesses perfectly explain what's happening."
- Concrete Strength Example:
If we say concrete strength is related to age and amount of cement, R-squared
tells us how well our guesses using these things explain the actual strengths.
- What We Want:
A higher R-squared is better. It means our guesses are good at explaining the
concrete strengths.
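One common way to compute R-squared is 1 - SS_res / SS_tot, where SS_res is the sum of squared errors of our guesses and SS_tot is the sum of squared differences between the actual values and their mean. A minimal sketch with invented strengths (the name r_squared is illustrative):

```python
def r_squared(actual, predicted):
    """1 - (squared error of the guesses) / (squared spread around the mean)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [8.0, 12.0, 20.0, 24.0]  # made-up strengths, mean is 16

print(r_squared(actual, actual))      # perfect guesses give 1.0
print(r_squared(actual, [16.0] * 4))  # always guessing the mean gives 0.0
```

A score of 0 therefore means the model does no better than always guessing the average strength, and guesses worse than that give a negative score.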
4. Explained Variance Score:
- What it does:
Similar to R-squared, it says how much of the strength differences we can explain
with our guesses.
- Concrete Strength Example:
If we can say, "80% of the differences in concrete strength are because of age and
cement amount," that's good.
- What We Want:
A higher explained variance score is better. It means our guesses capture a big part
of why strengths are different.
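The explained variance score is usually defined as 1 - Var(actual - predicted) / Var(actual). It differs from R-squared in one respect: a constant offset in the guesses does not lower it. The sketch below, with invented values and helper names, shows guesses that are always exactly 2 too high.

```python
def variance(values):
    """Population variance: mean squared deviation from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance(actual, predicted):
    residuals = [a - p for a, p in zip(actual, predicted)]
    return 1.0 - variance(residuals) / variance(actual)


def r_squared(actual, predicted):
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [8.0, 12.0, 20.0, 24.0]   # made-up strengths
biased = [a + 2.0 for a in actual]  # every guess is 2 too high

print(explained_variance(actual, biased))  # 1.0: the offset is ignored
print(r_squared(actual, biased))           # about 0.9: the offset is penalized
```

When the two scores differ noticeably, the guesses are probably shifted up or down on average.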
5. Median Absolute Error (MedAE):
- What it does:
It looks at the middle value of all the absolute errors (the sizes of the mistakes,
ignoring their direction). It's like saying, "What's the typical mistake we make?"
- Concrete Strength Example:
If our typical mistake is saying a strength is 2 units too high, MedAE is 2.
- What We Want:
A smaller MedAE is better. It means our typical guess is not far off from the actual
strength.
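Because it takes the middle value, MedAE is barely affected by a single very bad guess. A short sketch with invented numbers, using Python's statistics module (the function name median_absolute_error here is a local helper):

```python
from statistics import mean, median


def median_absolute_error(actual, predicted):
    """Middle value of the absolute errors."""
    return median(abs(p - a) for a, p in zip(actual, predicted))


actual = [8.0, 12.0, 20.0, 24.0, 30.0]      # made-up strengths
predicted = [10.0, 13.0, 18.0, 26.0, 60.0]  # last guess is badly off

abs_errors = [abs(p - a) for a, p in zip(actual, predicted)]  # 2, 1, 2, 2, 30
print(median_absolute_error(actual, predicted))  # 2.0
print(mean(abs_errors))                          # 7.4 (the MAE, for comparison)
```

The one miss of 30 pulls the MAE up to 7.4, while MedAE still reports the typical mistake of 2.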
Sample Code:
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
# Load the Iris dataset
iris = load_iris()
X = iris.data    # feature matrix
y = iris.target  # class labels
# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a decision tree classifier on the training split
dt_classifier = DecisionTreeClassifier()
dt_classifier.fit(X_train, y_train)
# Predict the classes of the held-out samples
y_pred = dt_classifier.predict(X_test)
# Evaluate the predictions
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print(f'Accuracy: {accuracy}')
print(f'Confusion Matrix:\n{conf_matrix}')
print(f'Classification Report:\n{class_report}')
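The sample above classifies Iris flowers and reports accuracy. Concrete strength is a continuous value, so a regression variant scored with the five metrics described earlier might look like the sketch below. It rests on assumptions: make_regression generates placeholder data with 1030 rows and an arbitrary 8 features, and DecisionTreeRegressor with max_depth=5 stands in for whatever tree settings the project uses.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import (
    explained_variance_score,
    mean_absolute_error,
    mean_squared_error,
    median_absolute_error,
    r2_score,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Placeholder data (assumption): replace with the real concrete samples.
X, y = make_regression(n_samples=1030, n_features=8, noise=10.0, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# A regression tree predicts a number instead of a class label.
dt_regressor = DecisionTreeRegressor(max_depth=5, random_state=42)
dt_regressor.fit(X_train, y_train)
y_pred = dt_regressor.predict(X_test)

# The five metrics, each called as metric(actual, predicted).
scores = {
    "MSE": mean_squared_error(y_test, y_pred),
    "MAE": mean_absolute_error(y_test, y_pred),
    "R-squared": r2_score(y_test, y_pred),
    "Explained variance": explained_variance_score(y_test, y_pred),
    "MedAE": median_absolute_error(y_test, y_pred),
}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

The numbers printed for the placeholder data say nothing about concrete; they only show how the metrics are called.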
Architecture: