
BREAST CANCER DETECTION USING ARTIFICIAL

INTELLIGENCE

Project work submitted in partial fulfillment of the


requirements for the award of degree of

Bachelor of Technology
in
Electronics and Communication Engineering
by

M J V PRAKASH (19131A04D8) P HARI PRAKASH (19131A04J1)

M VINAY KUMAR (19131A04D5) M VIDYA SAGAR (19131A04D9)

Under the guidance of

Prof. Dr. M. V. S. Sairam


Dean, Academics (UG)

Department of Electronics and Communication Engineering


GAYATRI VIDYA PARISHAD COLLEGE OF ENGINEERING (AUTONOMOUS)
(Affiliated to J.N.T University, Kakinada, A.P)
VISAKHAPATNAM- 530 048
April, 2023
ELECTRONICS AND COMMUNICATION ENGINEERING DEPARTMENT

CERTIFICATE
This is to certify that the project titled BREAST CANCER DETECTION USING
ARTIFICIAL INTELLIGENCE is a bonafide record of the work done by M J V
PRAKASH (19131A04D8), P HARI PRAKASH (19131A04J1), M VINAY
KUMAR (19131A04D5), M VIDYA SAGAR (19131A04D9) in partial fulfillment
of the requirements for the award of the degree of Bachelor of Technology in
Electronics and Communication Engineering of the Gayatri Vidya Parishad
College of Engineering (Autonomous) affiliated to Jawaharlal Nehru Technological
University, Kakinada during the year 2022-2023.

Under the guidance of: Head of the Department:

Prof. Dr. M. V. S. Sairam Dr. N. Deepika Rani


Dean, Academics (UG) Professor & HoD,
Dept. of ECE, Dept. of ECE,
GVPCE (A) GVPCE (A)

Project Viva-voce held on _____________________________

Signature of the Internal Examiner Signature of the External Examiner

DECLARATION
This is to certify that the project titled BREAST CANCER DETECTION USING
ARTIFICIAL INTELLIGENCE, a bonafide record of the work done by M J V
PRAKASH (19131A04D8), P HARI PRAKASH (19131A04J1), M VINAY KUMAR
(19131A04D5), M VIDYA SAGAR (19131A04D9) in partial fulfillment of the
requirement for the award of degree of B. Tech. in Electronics and Communication
Engineering to Gayatri Vidya Parishad College of Engineering (Autonomous), affiliated
to J.N.T. University, Kakinada comprises only our original work and due
acknowledgement has been made in the text to all other material used.

Date:

All Student names (Roll Nos) Signature

M J V PRAKASH (19131A04D8) :

P HARI PRAKASH (19131A04J1) :

M VINAY KUMAR (19131A04D5) :

M VIDYA SAGAR (19131A04D9) :

ACKNOWLEDGEMENT
We take this opportunity to thank one and all who have helped in making
this possible. We are grateful to Gayatri Vidya Parishad College of Engineering
(Autonomous), for giving us the opportunity to work on a project report as a part of
the curriculum.

Our sincere thanks to our guide, Prof. Dr. M. V. S. Sairam, Professor and Dean,
Academics (UG), Electronics and Communication Engineering Department, for his
continuous and valuable guidance.

With a great sense of pleasure and privilege, we extend our gratitude and
sincere thanks to, Dr. N. Deepika Rani, Professor and Head of the Electronics and
Communication Engineering Department, for her encouragement.

With a great sense of pleasure and privilege, we extend our gratitude and
sincere thanks to Dr. A. Bala Koteswara Rao, Principal, Gayatri Vidya Parishad
College of Engineering (Autonomous) for his continuous encouragement during
the course of study.

We would like to express our sincere thanks to all the faculty and staff of
the Department of Electronics and Communication Engineering for their advice and
cooperation.

We owe our thanks much more than our words can express to our families
who sacrificed in all respects with love and affection to complete this thesis work
satisfactorily.

Finally, we take this opportunity to thank all the people who helped us in
completion of thesis work, directly or indirectly, and for their timely
encouragement and faithful services.

ABSTRACT

Breast cancer is one of the major causes of death in women. Much research has been done
on the diagnosis and detection of breast cancer using various image processing techniques.
Nonetheless, the disease remains one of the deadliest. Since the cause of breast cancer
stays obscure, prevention is impossible; thus, early detection of breast tumors is the only
way to cure breast cancer. We propose a Convolutional Neural Network (CNN) algorithm
and a Support Vector Machine (SVM) for breast cancer detection.

Digital image processing techniques such as image pre-processing, image segmentation,


feature extraction and image classification are applied in this project on the digital
mammogram images to achieve early and automated detection of breast cancer.

First, the mammogram is pre-processed, which removes any noise in the image. Second,
segmentation techniques are applied, which dilate the tumor region and erode the
remaining parts of the breast. Along with these two image processing techniques, feature
extraction is carried out in Python. Finally, the extracted features are used to classify the
mammograms as benign or malignant. The image classification process is performed in
Python on approximately 6,000 images.

Keywords: CNN, SVM, mammogram images, feature extraction, image segmentation


and classification.

CONTENTS
Page No.

CERTIFICATE II

DECLARATION III

ACKNOWLEDGEMENT IV
ABSTRACT V

CONTENTS VI

LIST OF TABLES IX

LIST OF FIGURES X

CHAPTER 1 INTRODUCTION 1

1.1. PROJECT OBJECTIVE 2

1.2. PROJECT OUTLINE 2

CHAPTER 2 DETAILS ON BREAST CANCER 3

2.1. WHERE BREAST CANCER STARTS 3

2.2. TYPES OF BREAST CANCER 4

2.3. HOW BREAST CANCER SPREADS 4

2.4. HOW COMMON IS BREAST CANCER 5

2.5. BREAST CANCER SIGNS AND SYMPTOMS 6

2.6. MAMMOGRAMS 6

2.7. OTHER COMMON TESTS 8

CHAPTER 3 LITERATURE SURVEY 11

3.1. EXISTING WORK 11

CHAPTER 4 IMAGE PROCESSING AND ANALYSIS 12

4.1. IMAGE PROCESSING 12

4.2. IMAGE ANALYSIS 14

CHAPTER 5 SUPPORT VECTOR MACHINES 24

5.1. SVM IN MACHINE LEARNING TECHNIQUE 24

5.2. ADVANTAGES OF SVM 25

5.3. DISADVANTAGES OF SVM 26

5.4. PYTHON LIBRARIES IN SVM 26

CHAPTER 6 ARTIFICIAL NEURAL NETWORK 33


6.1. ANN TECHNIQUE IN DEEP LEARNING 33

6.2. ARCHITECTURE OF ANN 34

6.3. ADVANTAGES OF ANN 35

6.4. DISADVANTAGES OF ANN 36

CHAPTER 7 CONVOLUTIONAL NEURAL NETWORKS 37

7.1. CNN TECHNIQUE IN DEEP LEARNING 37

7.2. ARCHITECTURE OF CNN 38

7.3. ADVANTAGES OF CNN 42

7.4. DISADVANTAGES OF CNN 42

7.5. PYTHON LIBRARIES IN CNN 43

CHAPTER 8 PROPOSED WORK 46

8.1. FLOW CHART FOR SVM 46

8.2. FLOW CHART FOR CNN 53

CHAPTER 9 SIMULATION RESULTS 55

9.1. OUTPUT OF SVM 55

9.2. FEATURES FROM DATASET 56

9.3. HYBRID FEATURES OBTAINED FROM PCA 56

9.4. ACCURACY FOR SVM 56

9.5. OUTPUT OF CNN 57

9.6. PLOT ACCURACY VS VALIDATION LOSS 57

9.7. ACCURACY OF CNN 58

CONCLUSION 59

FUTURE SCOPE 60

REFERENCES 61
LIST OF TABLES
TABLE 4.1. Features 22

TABLE 8.1. Features from Dataset 50

TABLE 8.2. Hybrid Features 51


LIST OF FIGURES
Figure 2.1. Breast.......................................................................................................3

Figure 2.2. Mammograms..........................................................................................7

Figure 2.3. Breast MRI..............................................................................................8

Figure 2.4. Breast BIOPSY......................................................................................10

Figure 4.1. Edge Detection......................................................................................13

Figure 4.2. Skewness...............................................................................................16

Figure 4.3. Kurtosis..................................................................................................18

Figure 4.4. Classifiers..............................................................................................23

Figure 5.1. SVC Classification................................................................................25

Figure 5.2. Principal Component Analysis (PCA)...................................................29

Figure 5.3. Confusion Matrix...................................................................................31

Figure 6.1. Architecture of Artificial Neural Network............................................35

Figure 7.1. Architecture of Convolutional Neural Network (CNN)........................38

Figure 7.2. Convolutional Layer..............................................................................39

Figure 7.3. ReLU Layer...........................................................................................39

Figure 7.4. Max-Pooling Layer................................................................................40

Figure 7.5. Flattening Data......................................................................................41

Figure 7.6. Fully Connected Layer..........................................................................41

Figure 8.1. Flow chart of SVM................................................................................46

Figure 8.2. Edge Detection......................................................................................48

Figure 8.3. Hough Circles........................................................................................49

Figure 8.4. SVC Classification................................................................................52

Figure 8.5. Flow chart of CNN................................................................................53


CHAPTER 1
INTRODUCTION
Breast cancer is a major cause of death in women around the world. According to the WHO
(World Health Organization), breast cancer was the most commonly diagnosed cancer
worldwide in 2020, accounting for 2.3 million of roughly 10 million new cancer cases.
Breast cancer starts when cells in the breast begin to grow out of control. These
accumulations of cells are called tumors, and they can often be seen on an x-ray or felt as
a lump. Breast cancer can spread when the cancer cells get into the blood or lymph system
and are carried to other parts of the body. There are many different types of breast cancer
and common ones include ductal carcinoma in situ (DCIS) and invasive carcinoma. The
side effects of breast cancer and its treatment include fatigue, headaches, pain and
numbness (peripheral neuropathy), and bone loss (osteoporosis). There are two types of
tumors: benign tumors, which are non-cancerous, and malignant tumors, which are
cancerous. Benign breast
tumors are abnormal growths in the breast, but they do not spread outside. So, this means
that they are not life threatening. Different imaging tests are used for detecting breast
cancer. Some of them are mammograms, breast ultrasound and breast MRI. A
mammogram is nothing but an x-ray of breast and it is used to look for any changes in
the breast. A mammogram makes it easy to treat by finding and detecting breast cancer
early, when the tumor is small and even before a lump can be felt. Detection of breast
cancer in its early stages using image processing techniques includes four parts. In the first
part the digital images (mammograms) are pre-processed to remove any kind of noise. Then
in the second part the images undergo the segmentation process to enhance the tumor part.
After this, in the third part, the important features in the segmented images are extracted.
Finally, in the fourth part, with the help of the extracted features, the images are classified
into benign or malignant. Here ‘benign’ represents the breast with non-cancerous tumor
and ‘malignant’ represents breast with cancerous tumor.[1]

METHODOLOGY
1.1. PROJECT OBJECTIVE
The objective of the project is to detect early-stage tumors, minimizing human error, using
image processing techniques such as image pre-processing, image segmentation, feature
extraction and selection, and image classification.

First, the mammogram is pre-processed, which removes any noise in the image. Second,
segmentation techniques are applied, which dilate the tumor region and erode the
remaining parts of the breast. Along with these two image processing techniques, feature
extraction is carried out in Python. Finally, the extracted features are used to classify the
mammograms as either benign or malignant. The image classification process is
performed in Python on approximately 6,000 images.

1.2. PROJECT OUTLINE


This project report is presented over the eight remaining chapters.

 Chapter 2 describes the details on breast cancer.

 Chapter 3 is about Literature Survey.

 Chapter 4 is image processing and Analysis.

 Chapter 5 explains the concept of SVM technique in Machine Learning.

 Chapter 6 explains the concept of ANN technique in Deep Learning.

 Chapter 7 explains the concept of CNN technique in Deep Learning.

 Chapter 8 presents the proposed work which is used in the detection of breast

cancer using CNN and SVM.

 Chapter 9 presents the simulation results of the detection of breast cancer using

Python on various images.


CHAPTER 2
DETAILS ON BREAST CANCER
Breast cancer is a type of cancer that starts in the breast. Cancer starts when cells begin
to grow out of control. Breast cancer cells usually form a tumor that can often be seen
on an x-ray or felt as a lump. Breast cancer occurs almost entirely in women, but men
can get breast cancer, too.

It’s important to understand that most breast lumps are benign and not cancer. Non-
cancerous breast tumors are abnormal growths, but they do not spread outside of the
breast. They are not life threatening, but some types of benign breast lumps can increase
a woman's risk of getting breast cancer. Any breast lump or change needs to be checked
by a health care professional to determine if it is benign or malignant (cancer) and if it
might affect your future cancer risk.[2]

2.1. WHERE BREAST CANCER STARTS


Breast cancers can start from different parts of the breast.

 Most breast cancers begin in the ducts that carry milk to the nipple (ductal
cancers)
 Some start in the glands that make breast milk (lobular cancers)
 There are also other types of breast cancer that are less common like phyllodes
tumor and angiosarcoma
 A small number of cancers start in other tissues in the breast. These cancers are
called sarcomas and lymphomas and are not really thought of as breast cancers.
Figure 2.1. Breast

Although many types of breast cancer can cause a lump in the breast, not all do. Many
breast cancers are also found on screening mammograms, which can detect cancers at
an earlier stage, often before they can be felt, and before symptoms develop.

2.2. TYPES OF BREAST CANCER


There are many different types of breast cancer and common ones include ductal
carcinoma in situ (DCIS) and invasive carcinoma. Others, like phyllodes tumors and
angiosarcoma are less common.

Once a biopsy is done, breast cancer cells are tested for proteins called estrogen
receptors, progesterone receptors and HER2. The tumor cells are also closely looked at
in the lab to find out what grade it is. The specific proteins found and the tumor grade
can help decide treatment options.

2.3. HOW BREAST CANCER SPREADS


Breast cancer can spread when the cancer cells get into the blood or lymph system and
are carried to other parts of the body.

The lymph system is a network of lymph (or lymphatic) vessels found throughout the
body that connects lymph nodes (small bean-shaped collections of immune system cells).
The clear fluid inside the lymph vessels, called lymph, contains tissue byproducts and
waste material, as well as immune system cells. The lymph vessels carry lymph fluid
away from the breast. In the case of breast cancer, cancer cells can enter those lymph
vessels and start to grow in lymph nodes.
Most of the lymph vessels of the breast drain into:

 Lymph nodes under the arm (axillary nodes)


 Lymph nodes around the collar bone (supraclavicular [above the collar bone]
and infraclavicular [below the collar bone] lymph nodes)
 Lymph nodes inside the chest near the breast bone (internal mammary lymph
nodes).
If cancer cells have spread to your lymph nodes, there is a higher chance that the cells
could have travelled through the lymph system and spread (metastasized) to other parts of
your body. The more lymph nodes with breast cancer cells, the more likely it is that the
cancer may be found in other organs. Because of this, finding cancer in one or more
lymph nodes often affects your treatment plan. Usually, you will need surgery to remove
one or more lymph nodes to know whether the cancer has spread. Still, not all women
with cancer cells in their lymph nodes develop metastases, and some women with no
cancer cells in their lymph nodes develop metastases later.

2.4. HOW COMMON IS BREAST CANCER


Breast cancer is the most common cancer in American women, except for skin cancers.
The average risk of a woman in the United States developing breast cancer sometime in
her life is about 13%. This means there is a 1 in 8 chance she will develop breast cancer.

2.4.1. CURRENT YEAR ESTIMATES FOR BREAST CANCER

The American Cancer Society's estimates for breast cancer in the United States for 2023
are:

 About 297,790 new cases of invasive breast cancer will be diagnosed in women.
 About 55,720 new cases of ductal carcinoma in situ (DCIS) will be diagnosed.
 About 43,700 women will die from breast cancer.[3]
2.4.2. TRENDS IN BREAST CANCER

In recent years, incidence rates have increased by 0.5% per year. Breast cancer is the
second leading cause of cancer death in women. The chance that a woman will die from
breast cancer is about 1 in 39 (about 2.6%). Since 2007, breast cancer death rates have
been steady in women younger than 50, but have continued to decrease in older women.
From 2013 to 2018, the death rate went down by 1% per year. These decreases are
believed to be the result of finding breast cancer earlier through screening and increased
awareness, as well as better treatments. At this time there are more than 3.8 million breast
cancer survivors in the United States. This includes women still being treated and those
who have completed treatment.

2.5. BREAST CANCER SIGNS AND SYMPTOMS


Knowing how your breasts normally look and feel is an important part of breast health.
Although having regular screening tests for breast cancer is important, mammograms do
not find every breast cancer. This means it's also important for you to be aware of
changes in your breasts and to know the signs and symptoms of breast cancer.

The most common symptom of breast cancer is a new lump or mass. A painless, hard
mass that has irregular edges is more likely to be cancer, but breast cancers can be tender,
soft, or round. They can even be painful. For this reason, it's important to have any new
breast mass, lump, or breast change checked by an experienced health care professional.
Other possible symptoms of breast cancer include:

 Swelling of all or part of a breast (even if no lump is felt)


 Skin dimpling (sometimes looking like an orange peel)
 Breast or nipple pain
 Nipple retraction (turning inward)
 Nipple or breast skin that is red, dry, flaking or thickened
 Nipple discharge (other than breast milk)
 Swollen lymph nodes (Sometimes a breast cancer can spread to lymph nodes
under the arm or around the collar bone and cause a lump or swelling there,
even before the original tumor in the breast is large enough to be felt.)

Although any of these symptoms can be caused by things other than breast cancer, if you
have them, they should be reported to a health care professional so the cause can be
found.

Remember that knowing what to look for does not take the place of having regular
mammograms and other screening tests. Screening tests can help find breast cancer early,
before any symptoms appear. Finding breast cancer early gives you a better chance of
successful treatment.

2.6. MAMMOGRAMS
Mammograms are low-dose x-rays of the breast. Regular mammograms can help find
breast cancer at an early stage, when treatment is most successful. A mammogram can
often find breast changes that could be cancer years before physical symptoms develop.
Results from many decades of research clearly show that women who have regular
mammograms are more likely to have breast cancer found early, are less likely to need
aggressive treatment like surgery to remove the breast (mastectomy) and chemotherapy,
and are more likely to be cured.

Mammograms are not perfect. They miss some cancers. And sometimes a woman will
need more tests to find out if something found on a mammogram is or is not cancer.
There’s also a small possibility of being diagnosed with a cancer that never would have
caused any problems had it not been found during screening. (This is called
overdiagnosis.)

Figure 2.2. Mammograms

There are two types of mammograms. A screening mammogram is used to look for signs
of breast cancer in women who don’t have any breast symptoms or problems. X-ray
pictures of each breast are taken, typically from 2 different angles. Mammograms can
also be used to look at a woman’s breast if she has breast symptoms or if a change is seen
on a screening mammogram. When used in this way, they are called diagnostic
mammograms. They may include extra views (images) of the breast that aren’t part of
screening mammograms. Sometimes diagnostic mammograms are used to screen women
who were treated for breast cancer in the past.

In the past, mammograms were typically printed on large sheets of film. Today, digital
mammograms are much more common. Digital images are recorded and saved as files in
a computer.

2.7. OTHER COMMON TESTS


2.7.1. BREAST MRI

Breast MRI (magnetic resonance imaging) uses radio waves and strong magnets to make
detailed pictures of the inside of the breast. It is used:

 To help determine the extent of breast cancer: Breast MRI is sometimes used in
women who already have been diagnosed with breast cancer, to help measure the
size of the cancer, look for other tumors in the breast, and to check for tumors in
the opposite breast. But not every woman who has been diagnosed with breast
cancer needs a breast MRI.
 To screen for breast cancer: For certain women at high risk for breast cancer, a
screening MRI is recommended along with a yearly mammogram. MRI is not
recommended as a screening test by itself because it can miss some cancers that a
mammogram would find.
Figure 2.3. Breast MRI

Although MRI can find some cancers not seen on a mammogram, it’s also more likely
to find things that turn out not to be cancer (called a false positive). This can result in a
woman getting tests and/or biopsies that end up not being needed. This is why MRI is
not recommended as a screening test for women at average risk of breast cancer.

2.7.2. BREAST ULTRASOUND

Breast ultrasound uses sound waves to make a computer picture of the inside of the
breast. It can show certain breast changes, like fluid-filled cysts, that are harder to
identify on mammograms. It is useful in several situations:

 Ultrasound is useful for looking at some breast changes, such as lumps


(especially those that can be felt but not seen on a mammogram) or changes in
women with dense breast tissue. It also can be used to look at a suspicious area
that was seen on a mammogram.
 Ultrasound is useful because it can often tell the difference between fluid filled
cysts (which are very unlikely to be cancer) and solid masses (which might need
further testing to be sure they're not cancer).
 Ultrasound can also be used to help guide a biopsy needle into an area so that
cells can be taken out and tested for cancer. This can also be done in swollen
lymph nodes under the arm.
 Ultrasound is widely available, easy to have, and does not expose a person to
radiation. It also costs less than a lot of other options.

2.7.3. BREAST BIOPSY

When other tests show a possible breast cancer, a biopsy is often done. Needing a breast
biopsy doesn’t necessarily mean that there is cancer. Most biopsy results are not cancer,
but a biopsy is the only way to find out for sure. During a biopsy, a doctor will remove
small pieces from the suspicious area so they can be looked at in the lab to see if they
contain cancer cells.

There are different kinds of breast biopsies. Some are done using a hollow needle, and
some use an incision (cut in the skin). Each has pros and cons.
 In an FNA biopsy, a very thin, hollow needle attached to a syringe is used to
withdraw (aspirate) a small amount of tissue from a suspicious area. The needle
used for an FNA biopsy is thinner than the one used for blood tests.
 A core biopsy uses a larger needle to sample breast changes felt by the doctor or
seen on an ultrasound, mammogram, or MRI. This is often the preferred type of
biopsy if breast cancer is suspected.

In rare cases, surgery is needed to remove all or part of the lump for testing. This is called
a surgical or open biopsy. Most often, the surgeon removes the entire mass or abnormal
area as well as a surrounding margin of normal breast tissue.

Figure 2.4. Breast Biopsy

Regardless of the type of biopsy, the biopsy samples will be sent to a lab where a
specialized doctor called a pathologist will look at them. It typically will take at least a
few days for you to find out the results.[4]
CHAPTER 3
LITERATURE SURVEY
A survey of the associated work was made to study existing methods for the detection of
breast cancer using various image processing techniques. The related work on this subject
is described in this literature survey, which concentrates mainly on the various techniques
for the detection of breast cancer.

3.1. EXISTING WORK


R. Jeeva, S. Dhanasekar, A. Harshathunnisa, V. Eshwin and Amit Karn proposed
an SVM algorithm with an accuracy of 85%, using the DWT (Discrete Wavelet Transform)
for feature extraction, and compared it with a feed-forward, back-propagation ANN
algorithm, which achieved an accuracy of 70%.[5]

Siddhartha Gupta, Sudha R, Neha Sinha and Challa Babu applied a variety of
algorithms in their proposed work; the combination best suited for cancer detection was
K-Means, closing, dilation and the Canny edge detection algorithm.[6]

Prannoy Giri and K. Saravana Kumar mainly studied the multiple image
processing algorithms that can be used extensively for finding cancerous cells. The
techniques in computer-aided mammography include image pre-processing, image
segmentation, feature extraction, feature selection and classification. Further
developments are required to extract more features, to find patterns in tumors and to
understand them better. Texture analysis can be used to classify benign and malignant
masses by identifying the micro-calcifications in the mammography.[7]

Dina A. Ragab, Maha Sharkas, Stephen Marshall and Jinchang Ren worked
mainly with a DCNN (Deep Convolutional Neural Network) and an SVM, using a
region-based segmentation method, which gives an accuracy of 88% with the SVM and
73.6% with the DCNN.[8]
CHAPTER 4
IMAGE PROCESSING AND ANALYSIS

4.1. IMAGE PROCESSING


Image processing is a field of study that involves the analysis and manipulation of
images. It involves the use of various mathematical and algorithmic techniques to
improve, modify, or analyze an image in some way.

4.1.1. IMAGE PRE-PROCESSING

Gray scale conversion is a common technique used in image pre-processing to convert a


color image to a grayscale image. The grayscale image consists of shades of gray,
ranging from black (0) to white (255), and is used to simplify image processing by
reducing the amount of data required to represent the image. There are several methods
for performing grayscale conversion, but one of the most common is the luminosity method.
In this method, the RGB values of each pixel in the colour image are weighted based on
their perceived brightness, and then combined to create a single grayscale value for each
pixel. The formula for this method is:

Grayscale value = 0.299 * Red + 0.587 * Green + 0.114 * Blue

where Red, Green, and Blue are the RGB values of the pixel, and the weights are based
on the relative brightness of each colour.
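As a quick sketch, the luminosity formula can be applied with NumPy (the function name `to_grayscale` is illustrative, not taken from the project code):

```python
import numpy as np

def to_grayscale(rgb):
    """Luminosity-method grayscale conversion.

    rgb: float array of shape (H, W, 3) with channel values in 0..255.
    Each pixel becomes 0.299*R + 0.587*G + 0.114*B.
    """
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights  # result has shape (H, W)

# A 1x2 image: one pure-red pixel and one pure-white pixel
img = np.array([[[255.0, 0.0, 0.0], [255.0, 255.0, 255.0]]])
gray = to_grayscale(img)
print(gray)  # red -> ~76.245, white -> ~255.0
```

Because the three weights sum to 1, a pure-white pixel keeps its full intensity while pure red, green, and blue map to progressively brighter grays according to perceived brightness.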

4.1.2. IMAGE SEGMENTATION

Edge detection is another important technique in image processing that involves detecting
boundaries or edges in an image. Edges are defined as sudden changes in intensity or
colour within an image, and they can be used for a variety of applications, such as object
detection and segmentation.

There are several methods for performing edge detection, but one of the most commonly
used is the Canny edge detection algorithm. This algorithm involves several steps,
including smoothing the image to remove noise, calculating the gradient of the image to
find regions of rapid change in intensity, and applying thresholding to identify edges.[9]

The Canny algorithm is a multi-stage process that involves the following steps:

1. Smoothing: The image is convolved with a Gaussian filter to reduce noise and
blur the edges.

2. Gradient calculation: The gradient magnitude and direction are calculated for each
pixel in the image.

3. Non-maximum suppression: Each pixel's gradient magnitude is compared to that of
its neighbouring pixels along the gradient direction; if it is the local maximum, the
pixel is retained, otherwise it is suppressed.

4. Double thresholding: Two threshold values are applied to the gradient magnitude,
and pixels above the high threshold are considered as strong edges, while pixels
below the low threshold are considered as non-edges. Pixels between the two
thresholds are considered as weak edges.

5. Edge tracking by hysteresis: The weak edges are connected to the strong edges if
they are adjacent to each other, forming continuous edges.

The output of the Canny algorithm is a binary image that indicates the location of edges
in the original image.
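For illustration, three of the stages above (smoothing, gradient calculation, and double thresholding) can be sketched with NumPy alone; non-maximum suppression and edge tracking by hysteresis are omitted for brevity, and the threshold values and function name here are illustrative assumptions:

```python
import numpy as np

def edge_candidates(img, low=30.0, high=60.0):
    """Return (strong, weak) boolean edge masks for a grayscale image."""
    # Step 1 - smoothing: a 3x3 box blur stands in for the Gaussian filter
    h, w = img.shape
    padded = np.pad(img.astype(float), 1, mode="edge")
    smooth = sum(padded[i:i + h, j:j + w]
                 for i in range(3) for j in range(3)) / 9.0
    # Step 2 - gradient magnitude via central differences
    gy, gx = np.gradient(smooth)
    mag = np.hypot(gx, gy)
    # Step 4 - double thresholding: strong above `high`, weak in between
    strong = mag >= high
    weak = (mag >= low) & ~strong
    return strong, weak

img = np.zeros((8, 8))
img[:, 4:] = 255.0            # a vertical step edge down the middle
strong, weak = edge_candidates(img)
print(strong.any())           # the step produces strong-edge pixels: True
```

In practice a library implementation such as OpenCV's `cv2.Canny` performs all five stages, including the suppression and hysteresis steps skipped here.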

Figure 4.1. Edge Detection (input image and output image)


4.2. IMAGE ANALYSIS
4.2.1. FEATURE EXTRACTION

Feature extraction is a process of identifying and extracting important and relevant


information from raw data, such as images, audio signals, or text, that can be used as
input for machine learning algorithms or other applications. In image processing, feature
extraction involves extracting useful and distinctive features from images that can be
used to identify objects or patterns within the image.

The various Features extracted from the mammography images are Mean, Variance,
Entropy, Skewness, Kurtosis, Mean Symmetry, Mean Concave Points and Mean
smoothness.[10]

Mean:

The mean value is the ratio of the sum of pixel values and the total number of pixel
values. Mean value gives the contribution of individual pixel intensity for the entire
image.

To calculate the mean intensity of a 16-pixel image, we can use the formula:
Mean = (I1 + I2 + ... + I16) / N
where I1, I2, ..., I16 are the intensities of the 16 pixels, and N is the total number of
pixels (which is 16 in this case).

Example: Consider the following distribution of pixels. Calculate the average of all
the pixel values and replace the value of the centre pixel with the mean.
[10 20 30 40]
[50 60 70 80]
[90 100 110 120]
[130 140 150 160]
Mean= (10+20+30+40+50+60+70+80+90+100+110+120+130+140+150+160)/16=85
Variance:

The variance of pixel intensities in the image can be calculated using the sample-variance formula:

Variance = (1/(N-1)) * sum ((Ii - Mean) ^2), for i = 1 to N

where Ii is the intensity of the ith pixel, Mean is the mean intensity of the image, and N is
the total number of pixels.

Example: Consider the following distribution of pixels.

[10 20 30 40]
[50 60 70 80]
[90 100 110 120]
[130 140 150 160]
Using the pixel intensities and mean calculated earlier, we can substitute the values into
the formula to get:

variance = (1/15) * [(10-85) ^2 + (20-85) ^2 + ... + (160-85) ^2] = 34000 / 15

So, the variance of pixel intensities for the given image is 2266.667.

Standard Deviation:

Standard deviation is defined as the tendency of the values in a data set to deviate from
the average value. The standard deviation is the average amount of variability in your
data set.

Example: Consider the following distribution of pixels.

[10 20 30 40]
[50 60 70 80]
[90 100 110 120]
[130 140 150 160]
Using the pixel intensities and Variance calculated earlier, the square root of the Variance
gives the Standard Deviation.

Standard Deviation = sqrt (Variance)

So, the Standard Deviation of pixel intensities for the given image is 47.610.
Skewness:

Skewness measures how “lopsided” the distribution of pixels is. In terms of digital image
processing, Darker and glossier surfaces tend to be more positively skewed than lighter
and matte surfaces. Hence, we can use skewness in making judgments about image
surfaces. Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A
distribution, or data set, is symmetric if it looks the same to the left and right of the centre
point. The skewness for a normal distribution is zero, and any symmetric data should
have a skewness near zero. Negative values for the skewness indicate data that are
skewed left and positive values for the skewness indicate data that are skewed right. By
skewed left, we mean that the left tail is long relative to the right tail. Similarly, skewed
right means that the right tail is long relative to the left tail. It is shown in the below
figure.

Figure 4.2. Skewness

The skewness of pixel intensities in the image can be calculated using the formula:

skewness = (1/N) * sum ((Ii - mean) ^3 / sigma^3), for i = 1 to N

where Ii is the intensity of the ith pixel, mean is the mean intensity of the image, sigma is
the standard deviation of pixel intensities in the image, and N is the total number of
pixels.
Using the pixel intensities and mean calculated earlier, we can calculate the standard
deviation of pixel intensities using the formula:

sigma = sqrt((1/(N-1)) * sum ((Ii - mean) ^2)), for i = 1 to N

Substituting the values of pixel intensities and mean into the above formula gives
sigma = 47.610, the standard deviation computed earlier.

Using this value of sigma, we can substitute the values of pixel intensities, mean, and
sigma into the formula for skewness. Because this distribution is perfectly symmetric
about its mean, the positive and negative cubed terms cancel, so the skewness is 0.
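The statistics worked out above can be checked quickly in NumPy. Note that the sample (N-1) form of the variance is used here, since that is what reproduces the 2266.667 and 47.610 values quoted in this section.

```python
import numpy as np

# The 4x4 example image used in the worked examples above
img = np.arange(10, 161, 10, dtype=float).reshape(4, 4)
pixels = img.ravel()

mean = pixels.mean()            # 85.0
variance = pixels.var(ddof=1)   # sample variance: 34000/15 = 2266.667
std = pixels.std(ddof=1)        # sqrt(2266.667) = 47.610
# skewness: symmetric data, so the positive and negative cubes cancel
skewness = np.mean(((pixels - mean) / std) ** 3)
```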

Entropy:

In an image, entropy is defined as a measure of the degree of randomness in the
image. It is used in quantitative analysis and provides a better comparison
between image details. The entropy of an image can be computed by calculating,
at each pixel position (i, j), the entropy of the pixel values within a 2-dim region
centered at (i, j).

Example: Consider an image as given below. Observe the number of transitions from
0↔1

000000000 →0

011110111 →3

000111101 →3

000111111 →1

Sum of the transitions is equal to the entropy.

In this example the entropy = 0+3+3+1=7.
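The transition count in this example is easy to verify in code. The sketch below counts the 0↔1 transitions per row and sums them, following the simplified definition used above.

```python
def row_transitions(bits):
    # number of adjacent 0<->1 changes within one row
    return sum(a != b for a, b in zip(bits, bits[1:]))

rows = ["000000000", "011110111", "000111101", "000111111"]
per_row = [row_transitions(r) for r in rows]   # [0, 3, 3, 1]
entropy = sum(per_row)                         # 0 + 3 + 3 + 1 = 7
```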

Kurtosis:

Kurtosis is a measure of the combined weight of a distribution's tails relative to the centre
of the distribution. Noise can be hard to distinguish from image content, especially in
low-contrast textures, so to make a quantitative statement about how well an algorithm
copes with this we use a numeric quantity called kurtosis.
Kurtosis measures how heavily the tails of the distribution differ from the tails of
a normal distribution; in other words, it identifies whether the tails of the distribution
contain extreme values. In digital image processing, kurtosis values are interpreted in
combination with noise and resolution measurements: high kurtosis values go hand
in hand with low noise and low resolution. Images with moderate amounts of salt and
pepper noise are likely to have a high kurtosis value.

Figure 4.3. Kurtosis

An excess kurtosis is a metric that compares the kurtosis of a distribution against the
kurtosis of a normal distribution. The kurtosis of a normal distribution equals 3.
Therefore, the excess kurtosis is found using the formula below:

Excess kurtosis = Kurtosis -3

Example: consider the following distribution, whose mean, variance and standard
deviation are approximately 94, 25 and 5 respectively. The calculation involves the following steps:

 Subtract the mean of the distribution from each value of the pixel. This results the

following distribution.

6 6 6

6 -44 6

6 6 6
 Divide each value of the pixel by the standard deviation and raise the result to the
fourth power.

2.0736 2.0736 2.0736

2.0736 5996.9536 2.0736

2.0736 2.0736 2.0736

 Add all the above values and divide the result by the number of pixels (the product
of the number of rows and columns, here 3 * 3 = 9).

This gives the kurtosis of the image.

Kurtosis = ((2.0736*8) + 5996.9536)/9 = 668.17.

Mean Symmetry:

To calculate the mean symmetry of an image, you need to first define what you mean by
symmetry. One common definition of symmetry for a 2D image is mirror symmetry,
which means that the image can be divided into two halves that are mirror images of each
other.

To calculate the mean symmetry of an image using this definition, you can follow these
steps for a 4x4 matrix:

1. Define the mirror line: For a 4x4 matrix, there are two possible mirror lines - the
vertical line that divides the matrix into two 2x4 halves, and the horizontal line
that divides the matrix into two 4x2 halves. Choose one of these lines as the
mirror line for your calculation.

2. Reflect the image: Reflect the half of the image on one side of the mirror line
across the line to create a mirror image of that half. For example, if you choose
the vertical line as the mirror line, reflect the left half of the image across the line
to create a mirror image of the right half.

3. Calculate the difference: Calculate the absolute difference between the original
half of the image and its mirror image. For each pixel in the half, subtract the
corresponding pixel value in the mirror image from the original pixel value, take
the absolute value of the difference, and sum up all the differences.

4. Repeat for the other half: Repeat steps 2 and 3 for the other half of the image on
the other side of the mirror line.

5. Calculate the mean: Add up the total differences from both halves and divide by
the total number of pixels in the image to get the mean symmetry value.

Example: Calculation using a 4x4 matrix:

Original image:

1 2 3 4

5 6 7 8

9 8 7 6

5 4 3 2

Reflected image (mirrored about the vertical centre line):

4 3 2 1

8 7 6 5

6 7 8 9

2 3 4 5

Absolute difference:

3 1 1 3

3 1 1 3

3 1 1 3

3 1 1 3

Mean symmetry: (3 + 1 + 1 + 3) * 4 / 16 = 32 / 16 = 2

So, the mean symmetry value for this image using the vertical mirror line is 2.
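In NumPy, averaging the absolute difference between the image and its mirror over all 16 pixels gives the mean symmetry directly:

```python
import numpy as np

img = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 8, 7, 6],
                [5, 4, 3, 2]], dtype=float)

mirrored = img[:, ::-1]          # reflect about the vertical centre line
diff = np.abs(img - mirrored)    # every row works out to [3, 1, 1, 3]
mean_symmetry = diff.mean()      # 32 / 16 = 2.0
```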

Mean Smoothness:
To calculate the mean smoothness of an image, you need to first define what you mean
by smoothness. One common definition of smoothness for a 2D image is how rapidly the
pixel intensities change from one pixel to the next.

To calculate the mean smoothness of an image, you can follow these steps for a 4x4
matrix:

1. Calculate the gradient: Calculate the gradient of the image using a Sobel or Scharr
filter, which highlights edges and is commonly used for edge detection. The
gradient is a measure of how rapidly the pixel intensities change from one pixel to
the next in both the x and y directions.

2. Calculate the absolute gradient: Take the absolute value of the gradient at each
pixel to get the magnitude of the gradient.

3. Calculate the smoothness: Calculate the smoothness of the image as the average
of the magnitudes of the gradient at each pixel. This gives a measure of how
rapidly the pixel intensities change on average across the image.

4. Normalize: You can optionally normalize the smoothness value by dividing it by


the maximum possible magnitude of the gradient, which is the square root of 2
times the maximum pixel value in the image.

Example: calculation using a 4x4 matrix:

Original image:

1 2 3 4

5 6 7 8

9 8 7 6

5 4 3 2

Gradient:

-2 -2 -2 -2

-2 -2 -2 -2

-2 2 2 2
2 2 2 2

Absolute gradient:

2 2 2 2

2 2 2 2

2 2 2 2

2 2 2 2

Smoothness: (2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2) / 16 = 2

So, the mean smoothness value for this image is 2. If we normalize this value by the
square root of 2 times the maximum pixel value in the image (9), we get:

Normalized smoothness: 2 / (sqrt (2) * 9) = 0.157

So, the normalized mean smoothness value for this image is approximately 0.157.

TABLE 4.1. Features

4.2.2. IMAGE CLASSIFICATION

All these feature values are stored and passed to the classifier. A
classification model attempts to draw a conclusion from observed values: given
one or more inputs, it tries to predict the value of one or more
outcomes. Outcomes are labels that can be applied to a dataset. As more data are
entered, the prediction accuracy improves.
Classification is the process of predicting the class of given data points. Classes are
sometimes called targets/ labels or categories. Classification predictive modelling is
the task of approximating a mapping function (f) from input variables (X) to discrete
output variables (y).
Classification belongs to the category of supervised learning where the targets are
also provided with the input data. There are many applications in classification in
many domains such as in credit approval, medical diagnosis, target marketing etc.
Image classification analyses the numerical properties of various image features and
organizes the data into categories. In the subsequent testing phase, these feature-space
partitions are used to classify image features. The description of the training classes is an
extremely important component of the classification process.
The objective of image classification is to identify and portray, as a unique gray level
(or color), the features occurring in an image in terms of the objects these features
actually represent. Image classification is perhaps
the most important part of digital image analysis.

Figure 4.4. Classifiers

Here in the project, we used the RandomForestClassifier and SVC classifier.
SVC Classifier:
The SVC (Support Vector Classification) classifier is described in Section 5.1.1.
CHAPTER 5
SUPPORT VECTOR MACHINES

5.1. SVM TECHNIQUE IN MACHINE LEARNING


Support Vector Machines (SVM) is a popular algorithm in machine learning used for
classification, regression, and outlier detection. SVM is a supervised learning algorithm
that tries to find the best decision boundary that separates the data into different classes.

In SVM, the decision boundary is chosen in such a way that it maximizes the margin
between the two classes. The margin is the distance between the decision boundary and
the closest points of each class. SVM tries to find the decision boundary that has the
largest margin, as this is likely to generalize well to new, unseen data.

SVM has several advantages over other machine learning algorithms, such as high
accuracy, ability to handle large datasets, and good generalization performance.
However, SVM also has some limitations, such as the need for careful parameter tuning,
sensitivity to the choice of kernel function, and difficulty in handling multi-class
classification problems.

SVM is widely used in various applications, such as image classification, text


classification, and bioinformatics. By understanding the strengths and limitations of
SVM, you can choose the appropriate parameters and kernel functions to achieve the best
performance for your particular problem.

5.1.1. SVC CLASSIFIER

SVC (Support Vector Classification) is one of the implementations of SVM for


classification problems. The SVC classifier is used to find the optimal hyperplane that
can classify the data points into different classes.
The SVC classifier works by finding the support vectors, which are the data points
closest to the decision boundary. These support vectors are used to define the hyperplane
and maximize the margin between the classes.

One of the advantages of using SVC is that it can handle non-linear data by using kernel
functions. By transforming the data into a higher-dimensional space, the SVC classifier
can find a hyperplane that can separate the data into different classes. Some commonly
used kernel functions in SVC include the linear kernel, polynomial kernel, RBF kernel,
and sigmoid kernel.

SVC is a powerful algorithm that can be used for a wide range of classification problems.
By selecting the appropriate hyperparameters and kernel function, SVC can achieve high
accuracy and generalization performance.[11]

Figure 5.1. SVC Classification

5.2. ADVANTAGES OF SVM


1. Effective in high-dimensional spaces: SVMs perform well in high-dimensional
spaces, such as those commonly found in image or text data, where the number of
features can be very large. This is because SVMs only require a subset of the data
(the support vectors) to define the decision boundary, which helps to reduce the
impact of irrelevant or noisy features.
2. Effective with small sample sizes: SVMs can be effective even when the number
of training examples is small, which is particularly important in many real-world
applications where data is scarce or expensive to obtain.
3. Flexibility in choosing kernel functions: SVMs can be used with different kernel
functions, such as linear, polynomial, radial basis function (RBF), and sigmoid.
This flexibility allows SVMs to be used for a wide range of classification tasks,
including those that are not linearly separable.
4. Regularization: SVMs have a regularization parameter (C) that can be used to
control the trade-off between maximizing the margin and minimizing the
classification error. This can help to prevent overfitting and improve
generalization performance.

5.3. DISADVANTAGES OF SVM


1. Computationally intensive: SVMs can be computationally intensive, particularly
when dealing with large datasets or complex kernel functions. This can make
training and testing times relatively slow, which can be a practical issue in some
applications.
2. Sensitivity to kernel choice: The choice of kernel function can have a significant
impact on the performance of SVMs, and it may not always be clear which kernel
is best suited to a particular problem. Additionally, some kernel functions, such as
the RBF kernel, have hyperparameters that can be difficult to tune.
3. Sensitivity to parameter choice: SVMs have several hyperparameters that can be
tuned, such as the regularization parameter C and the kernel parameters. The
choice of these parameters can have a significant impact on the performance of
the model, and selecting the optimal values can be a challenging task.

5.4. PYTHON LIBRARIES IN SVM


5.4.1. NumPy

NumPy is a popular Python library for numerical and scientific computing. It provides an
array object that is faster and more efficient than Python's built-in lists, and also provides
a wide range of mathematical functions for working with these arrays.

NumPy provides a large number of mathematical functions, including linear algebra,


Fourier transforms, random number generation, and more. These functions are optimized
for performance and can handle large arrays of data efficiently.
NumPy is a popular choice for machine learning tasks, such as creating and manipulating
datasets, and training and testing machine learning models.

5.4.2. Pandas

Pandas is a popular open-source Python library that is used for data manipulation and
analysis. It provides highly efficient data structures and tools for working with structured
data such as tabular, time-series, and matrix data. Pandas is built on top of the NumPy
library and is used extensively in data science, machine learning, and finance.

It provides efficient data structures and tools for data manipulation, cleaning, and
analysis, making it an essential tool for anyone working with structured data in Python.

DataFrame: A two-dimensional labeled data structure with columns of potentially
different types. It is similar to a spreadsheet or a SQL table.

5.4.3. openCV

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and
machine learning software library. It includes a wide range of image and video
processing tools and algorithms, such as object detection, face recognition, feature
detection, and optical flow.

OpenCV has interfaces for many programming languages. In Python, the OpenCV library
can be used by installing the "opencv-python" package using pip. The library is widely
used in research, industry, and academia for a variety of computer vision and machine
learning tasks.

5.4.4. sklearn.preprocessing

It is a module in the popular Python library scikit-learn that provides a set of functions for
preprocessing and scaling data before it is used for modeling. The module includes a
variety of methods that can be used to preprocess the data, including feature scaling,
normalization, and transformation.

By applying these preprocessing methods, data can be made more suitable for machine
learning algorithms, potentially leading to more accurate and reliable models.
Normalization: The MinMaxScaler class can be used to normalize the features of a
dataset to a specified range (e.g., [0, 1]). This is useful for algorithms that are sensitive to
the scale of the input data, such as neural networks.

A Min-Max scaling is typically done via the following equation:

X_scaled = (X - X_min) / (X_max - X_min)
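A minimal NumPy sketch of this scaling, equivalent to what scikit-learn's MinMaxScaler does for a single feature column (the values here are illustrative, not project data):

```python
import numpy as np

# Illustrative feature column (assumed values)
X = np.array([[10.0], [60.0], [110.0], [160.0]])

X_min, X_max = X.min(axis=0), X.max(axis=0)
X_scaled = (X - X_min) / (X_max - X_min)   # maps min -> 0, max -> 1
```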

5.4.5. matplotlib.pyplot

Matplotlib is a widely used Python library for creating static,
animated, and interactive visualizations. Its pyplot module provides a range of functions
for creating plots, histograms, bar charts, scatter plots, and many other types of
visualizations.

%matplotlib inline is a command used in Jupyter Notebook to display Matplotlib graphs


and visualizations inline with the notebook. It allows you to view the plots and charts you
create using Matplotlib without having to open a separate window or file to view them.

pyplot provides a wide range of customization options for creating plots, such as setting
axis labels, adding legends, changing line styles, and controlling colors.

It integrates well with other scientific computing libraries, such as NumPy and SciPy,
making it easy to create plots from data stored in these libraries.

5.4.6. sklearn.decomposition

The sklearn.decomposition module is a sub-module of the popular scikit-learn library,
which provides a collection of efficient tools for data mining and data analysis. The
sklearn.decomposition module provides a range of unsupervised learning algorithms for
dimensionality reduction, feature extraction, and matrix factorization.

5.4.7. Principal Component Analysis (PCA)

It is a widely used statistical technique for reducing the dimensionality of high-


dimensional data. It is a type of unsupervised learning that transforms the original data
into a new set of variables called principal components that are linearly uncorrelated and
capture most of the variation in the original data.[12]

The basic steps involved in PCA are as follows:

1. Standardize the data: PCA assumes that the data is standardized (i.e., has zero
mean and unit variance) so that all variables are on the same scale.
2. Compute the covariance matrix: PCA calculates the covariance matrix of the
standardized data to identify the relationships between the variables.
3. Compute the eigenvectors and eigenvalues: PCA decomposes the covariance
matrix into its eigenvectors and eigenvalues. The eigenvectors represent the
directions of maximum variance in the data, and the eigenvalues represent the
magnitude of the variance in each eigenvector.
4. Choose the number of principal components: PCA selects the number of principal
components based on the proportion of the total variance explained by each
component.
5. Transform the data: PCA transforms the original data into a new set of variables
that are linear combinations of the original variables, called principal components.

Figure 5.2. Principal Component Analysis (PCA)
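The five steps above can be sketched directly in NumPy; the data here is a small correlated toy set assumed for illustration, not the project's feature matrix.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(100, 2) @ np.array([[3.0, 0.0], [1.0, 0.5]])  # correlated toy data

# 1. Standardize the data
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
# 2. Compute the covariance matrix
cov = np.cov(Xs, rowvar=False)
# 3. Eigenvectors and eigenvalues (eigh: the covariance matrix is symmetric)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
# 4. Choose the number of components from the explained-variance ratio
explained = eigvals[0] / eigvals.sum()
# 5. Transform the data onto the first principal component
X_pca = Xs @ eigvecs[:, :1]
```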

5.4.8. sklearn.model_selection

The train_test_split function from the sklearn.model_selection module is used to split a


dataset into training and testing sets. This function is commonly used in machine learning
to evaluate the performance of a model on unseen data.
The train_test_split function takes several arguments, including:

 arrays: the dataset to be split. This can be a single array or a tuple of arrays.
 test_size: the fraction of the dataset to be used for testing. This can be a float
between 0 and 1, or an integer specifying the number of samples to use for
testing.
 train_size: the fraction of the dataset to be used for training. This can be a float
between 0 and 1, or an integer specifying the number of samples to use for
training. If both train_size and test_size are specified, test_size will take priority.
 random_state: a seed value for the random number generator used to split the
dataset. This is optional, but can be useful for reproducibility.

5.4.9. sklearn.svm

The SVC class from the sklearn.svm module is used to train a support vector machine
(SVM) classification model. SVMs are a type of machine learning algorithm used for
classification, regression, and outlier detection. They are particularly useful when
working with high-dimensional datasets.

The SVC class in scikit-learn provides an implementation of SVMs for classification.


The basic idea behind SVMs is to find the hyperplane that maximally separates the
different classes in the input data. The SVM algorithm achieves this by finding the
optimal values for the support vectors and the margin.

5.4.10. sklearn.metrics

sklearn.metrics is a module in the scikit-learn machine learning library that provides a
wide range of functions for evaluating the performance of machine learning models. It
includes functions for classification, regression, and clustering tasks. Here are some of
the main functions provided by sklearn.metrics:

Classification Metrics:
 accuracy_score: computes the accuracy of classification predictions
 confusion_matrix: computes a confusion matrix from true and predicted labels
 precision_score: computes the precision of classification predictions
 recall_score: computes the recall of classification predictions
 f1_score: computes the F1 score of classification predictions
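The modules described in Sections 5.4.8 to 5.4.10 fit together as below. The data is a synthetic, linearly separable toy set assumed for illustration, not the project's mammography features.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix

# Synthetic, linearly separable toy data (assumption for illustration)
rng = np.random.RandomState(0)
X = rng.randn(200, 2)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hold out 25% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

clf = SVC(kernel="linear", C=1.0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

acc = accuracy_score(y_test, y_pred)    # close to 1.0 on separable data
cm = confusion_matrix(y_test, y_pred)   # rows: true class, cols: predicted
```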

5.4.11. Confusion Matrix

The confusion matrix is a matrix used to determine the performance of the classification
models for a given set of test data. It can only be determined if the true values for test
data are known. The matrix itself can be easily understood, but the related terminologies
may be confusing. Since it shows the errors in the model performance in the form of a
matrix, hence also known as an error matrix.

Figure 5.3. Confusion Matrix

The above table has the following cases:

 True Negative: The model predicted No, and the actual value was also No.
 True Positive: The model predicted Yes, and the actual value was also Yes.
 False Negative: The model predicted No, but the actual value was Yes. It is
also called a Type-II error.
 False Positive: The model predicted Yes, but the actual value was No. It is
also called a Type-I error.

Example: Calculation of Accuracy, Precision, Recall and F1 score using Confusion


Matrix.
Classification Accuracy: It is one of the important parameters for classification
problems. It defines how often the model predicts the correct output, and can be
calculated as the ratio of the number of correct predictions made by the
classifier to the total number of predictions made. The formula is given
below:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

For this example, Accuracy = 0.89

Precision: Out of all the instances the model predicted as positive, precision measures
how many were actually positive. It can be calculated using the below formula:

Precision = TP / (TP + FP)

For this example, Precision = 0.88

Recall: Out of all the actual positive instances, recall measures how many the model
predicted correctly. The recall should be as high as possible.

Recall = TP / (TP + FN)

For this example, Recall = 0.75

F-measure: If one model has low precision and high recall (or vice versa), it is difficult
to compare the two models, so we use the F-score, which evaluates the recall and
precision at the same time. The F-score is maximum when the recall is
equal to the precision. It can be calculated using the below formula:

F1 Score = 2 * (Precision * Recall) / (Precision + Recall)

For this example, F1 Score = 0.80
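The four metrics can be checked in code. The confusion-matrix counts below are hypothetical (the raw counts behind this example are not given in the report); they are chosen so the scores come out close to the values quoted above.

```python
# Hypothetical counts (assumption): chosen to roughly reproduce the
# example scores quoted above.
TP, FP, FN, TN = 66, 9, 22, 185

accuracy = (TP + TN) / (TP + TN + FP + FN)          # 251/282, about 0.89
precision = TP / (TP + FP)                          # 66/75 = 0.88
recall = TP / (TP + FN)                             # 66/88 = 0.75
f1 = 2 * precision * recall / (precision + recall)  # about 0.81
```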
CHAPTER 6
ARTIFICIAL NEURAL NETWORKS

6.1. ANN TECHNIQUE IN DEEP LEARNING


ANN stands for Artificial Neural Network, which is a class of machine learning models
inspired by the structure and function of the human brain. In deep learning, ANNs are
used as a fundamental building block for many advanced models, such as convolutional
neural networks (CNNs) and recurrent neural networks (RNNs).

The basic idea behind ANNs is to create a network of interconnected nodes, or "neurons,"
that can process and transmit information. Each neuron takes input from one or more
other neurons, performs a simple computation on that input, and then passes its output to
other neurons in the network. By adjusting the strength of the connections between
neurons, the network can learn to recognize patterns in data and make predictions based
on that data.

Artificial Neural Networks (ANNs) are a core technique in deep learning, which is a
subfield of machine learning focused on training models with multiple layers of neural
networks. ANNs in deep learning are used for a wide range of tasks, including image and
speech recognition, natural language processing, and predictive modeling.[13]

The technique of using ANNs in deep learning involves several key steps:

1. Data preparation: The first step is to prepare the data by preprocessing and
transforming it into a format suitable for training an ANN. This may involve tasks
such as normalization, feature scaling, and one-hot encoding.

2. Model architecture: Next, the architecture of the neural network is designed,


which involves selecting the number of layers, the number of neurons in each
layer, the activation functions, and the type of connections between the neurons.

3. Training: The neural network is trained on the prepared data using an


optimization algorithm to minimize the difference between the predicted output
and the actual output.
4. Validation: The trained model is then evaluated on a separate validation set to
ensure that it generalizes well to new, unseen data.

5. Testing: Finally, the model is tested on a separate test set to measure its
performance on new, unseen data.

In deep learning, ANNs are often designed with many layers, which is why they are also
called deep neural networks. These deep neural networks are able to learn increasingly
complex features and patterns in the data as they process it through multiple layers.

There are also several variations of ANNs used in deep learning, including Convolutional
Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Generative
Adversarial Networks (GANs). These variations are designed to handle specific types of
data or problems and have their own unique architectures and training algorithms.

In a deep neural network, there are typically many layers of neurons, each layer
processing the output of the previous layer. This allows the network to learn increasingly
complex features of the data as it progresses through the layers. For example, in a CNN
designed for image recognition, the early layers might learn to detect simple features like
edges and corners, while later layers might learn to recognize more complex shapes and
objects.

To train an ANN in deep learning, a large amount of labeled data is fed into the network,
and the weights of the connections between neurons are adjusted using an optimization
algorithm (such as stochastic gradient descent) to minimize the difference between the
network's predictions and the true labels. This process is repeated over many iterations
until the network can accurately classify new, unseen data.

6.2. ARCHITECTURE OF ANN


An Artificial Neural Network (ANN) typically consists of three types of layers: an input
layer, one or more hidden layers, and an output layer.
Figure 6.1. Architecture of Artificial Neural Network

The input layer receives input data, which could be a vector, an image, or any other
structured data format. The input layer neurons simply pass on the input data to the first
hidden layer neurons.

The hidden layers process the input data through a series of mathematical operations to
extract relevant features and patterns. Each neuron in a hidden layer receives input from
multiple neurons in the previous layer, performs a computation on that input, and passes
the result on to the next layer.

The output layer produces the final output of the network, which could be a class label, a
probability distribution over multiple classes, a regression value, or any other type of
output depending on the problem being solved.

Each neuron in a neural network has a set of weights and biases associated with it, which
are learned during the training process. These weights and biases control the strength and
direction of the connections between neurons, and determine the output of each neuron
based on its input.[14]
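These ideas can be made concrete with a tiny forward pass; the 2-2-1 network below and its weights are hand-picked for illustration only, not taken from the project:

```python
def relu(x):
    return max(0.0, x)

def dense(inputs, weights, biases, activation):
    """One layer: every neuron takes a weighted sum of all inputs
    plus its bias, then applies the activation function."""
    return [activation(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Hypothetical 2-input -> 2-hidden -> 1-output network.
hidden = dense([1.0, 2.0],
               weights=[[0.5, -0.5], [1.0, 1.0]],
               biases=[0.0, -1.0],
               activation=relu)
output = dense(hidden, weights=[[1.0, 1.0]], biases=[0.5],
               activation=lambda z: z)  # linear output neuron
```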

6.3. ADVANTAGES OF ANN


1. Flexibility: ANNs can be applied to a wide range of problems, including
classification, regression, pattern recognition, and image and speech processing.
2. Ability to learn complex relationships: ANNs are capable of learning complex
non-linear relationships between inputs and outputs, making them suitable for
modeling real-world problems where the relationships are often non-linear and
difficult to describe.
3. Ability to generalize: ANNs are able to generalize well to new, unseen data,
allowing them to be used for prediction and classification tasks.
4. Robustness: ANNs are able to tolerate noisy or missing data, making them
suitable for real-world applications where the data may not be perfect.
5. Scalability: ANNs can be scaled up to handle large datasets and complex models,
making them suitable for big data applications.

6.4. DISADVANTAGES OF ANN


1. Black-box nature: ANNs are often considered as "black-box" models, meaning
that it can be difficult to understand how the network is making its predictions or
decisions.
2. Computational complexity: ANNs can be computationally expensive, especially
when dealing with large datasets or complex models. This can make training and
testing the network time-consuming and resource-intensive.
3. Data requirements: ANNs require large amounts of labeled data to be trained
effectively. Obtaining such data can be expensive and time-consuming, especially
in certain domains.
4. Overfitting: ANNs can be prone to overfitting, where the model learns the noise
in the training data rather than the underlying patterns. This can lead to poor
generalization performance on new, unseen data.
5. Interpretability: ANNs are difficult to interpret, making it challenging to diagnose
and fix problems in the model or understand its behavior.
CHAPTER 7
CONVOLUTIONAL NEURAL NETWORKS

7.1. CNN TECHNIQUE IN DEEP LEARNING


Convolutional Neural Networks (CNNs) are a type of deep neural network that are
primarily used for analyzing visual imagery. They are widely used in computer vision
applications such as image classification, object detection, segmentation, and recognition.

The primary advantage of CNNs is their ability to automatically learn spatial hierarchies
of features from the raw input data, without the need for manual feature engineering. This
is achieved through the use of convolutional layers, which apply a set of learnable filters
to the input image to produce a set of feature maps. These feature maps capture important
spatial and structural information from the input image, and subsequent layers in the
network use this information to extract higher-level features and make predictions.

The basic structure of a CNN typically consists of multiple convolutional layers,


followed by a set of pooling layers, and then several fully connected layers at the end for
classification. Pooling layers are used to downsample the feature maps and reduce their
dimensionality, which helps to reduce the computational complexity of the network and
prevent overfitting.

CNNs have achieved state-of-the-art performance in a wide range of computer vision


tasks, including image classification on large-scale datasets such as ImageNet, object
detection, and semantic segmentation. They have also been successfully applied to other
domains such as speech recognition, natural language processing, and drug discovery.

One notable example of a CNN architecture is the famous VGGNet, which was
developed by the Visual Geometry Group at the University of Oxford. VGGNet consists
of 16 or 19 layers, depending on the variant, and has achieved excellent performance on
the ImageNet dataset. Other popular CNN architectures include AlexNet, ResNet,
Inception, and MobileNet.

In summary, CNNs are a powerful and widely used type of neural network for analyzing
visual imagery, and have achieved state-of-the-art performance on a range of computer
vision tasks. If you are working on a project related to computer vision, CNNs are
definitely worth exploring further.[14]

7.2. ARCHITECTURE OF CNN

Figure 7.1. Architecture of Convolutional Neural Network (CNN)

7.2.2. Convolutional Layer

A CNN (Convolutional Neural Network) layer is one of the building blocks of a deep
learning model designed for image or video recognition tasks. It is a specialized layer that
applies a mathematical operation called convolution to the input image, which extracts
specific features from the image.

The convolution operation involves sliding a small filter (also known as a kernel) across
the input image, and computing the dot product between the filter and the corresponding
pixels in the image. The result of this operation is a feature map that highlights specific
patterns or edges in the input image.
Figure 7.2. Convolutional Layer
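The sliding dot product just described can be sketched in plain Python; the 2x2 vertical-edge kernel and the tiny image below are hypothetical:

```python
def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation (what CNN layers compute):
    slide the kernel over the image and take the dot product at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# A vertical-edge kernel applied to a tiny image with a dark/bright boundary.
image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]
edges = conv2d(image, [[1, -1], [1, -1]])
```

The resulting feature map responds only where the intensity changes from left to right, i.e. along the vertical edge.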

7.2.3. ReLU Layer

ReLU (Rectified Linear Unit) is an activation function commonly used in neural


networks, including CNNs (Convolutional Neural Networks). It is a simple and effective
way to introduce non-linearity into the network and is computationally efficient.

The ReLU activation function is defined as:

f(x) = max(0, x)

Figure 7.3. ReLU Layer
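A one-line implementation makes the definition concrete: negative inputs are clipped to zero and positive inputs pass through unchanged.

```python
def relu(x):
    """Rectified Linear Unit: f(x) = max(0, x)."""
    return max(0.0, x)

# Applied element-wise to a few sample activations.
activations = [relu(x) for x in [-2.0, -0.5, 0.0, 3.0]]
```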


7.2.4. Max-Pooling Layer

Max pooling is a common operation used in convolutional neural networks (CNNs) to


reduce the dimensionality of feature maps generated by convolutional layers. It is a way
of summarizing the most important features of a feature map by taking the maximum
value of small regions of the map.

Max pooling works by dividing the feature map into non-overlapping rectangular regions,
called pooling windows or kernels. For each region, the maximum value is selected and
retained, while the other values are discarded. The result is a new feature map with a
reduced spatial dimension and a higher level of abstraction.

Figure 7.4. Max-Pooling Layer
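The operation can be sketched in plain Python for non-overlapping 2x2 windows; the feature map values below are hypothetical:

```python
def max_pool(feature_map, size=2):
    """Max pooling: keep only the largest value in each
    non-overlapping size-by-size window, discarding the rest."""
    return [[max(feature_map[i + a][j + b]
                 for a in range(size) for b in range(size))
             for j in range(0, len(feature_map[0]), size)]
            for i in range(0, len(feature_map), size)]

pooled = max_pool([[1, 3, 2, 4],
                   [5, 6, 1, 0],
                   [7, 2, 9, 1],
                   [0, 8, 3, 3]])
```

The 4x4 map shrinks to 2x2, halving each spatial dimension while keeping the strongest response in each region.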

7.2.5. Flattening data

Flattening data is a process in which a multidimensional array or tensor is reshaped into a


one-dimensional array, or vector. This is a common operation in deep learning,
particularly in the context of convolutional neural networks (CNNs), where it is often
used to transform the output of convolutional layers into a format that can be processed
by fully connected layers
Figure 7.5. Flattening Data
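For example, flattening two hypothetical 2x2 feature maps yields a single 8-element vector ready for a fully connected layer:

```python
def flatten(feature_maps):
    """Reshape a stack of 2-D feature maps into one flat vector
    so it can feed a fully connected layer."""
    return [value
            for fmap in feature_maps
            for row in fmap
            for value in row]

vector = flatten([[[1, 2], [3, 4]],    # feature map 1
                  [[5, 6], [7, 8]]])   # feature map 2
```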

7.2.6. Fully Connected Layer

In deep learning, a fully connected layer (also called a dense layer) is a type of neural
network layer in which each neuron is connected to every neuron in the previous layer,
and each connection is associated with a weight parameter. The output of a fully
connected layer is a linear transformation of the input, followed by an activation function.

One limitation of fully connected layers is that they can be computationally expensive,
particularly for large input sizes and large numbers of neurons. Additionally, they can be
prone to overfitting when the number of parameters is large relative to the amount of
training data.

Figure 7.6. Fully Connected Layer


7.3. ADVANTAGES OF CNN
1. Ability to learn spatial hierarchies of features: CNNs are capable of automatically
learning and extracting hierarchical representations of features from raw data,
such as images, speech signals, or text, without the need for manual feature
engineering. This allows them to capture both local and global patterns and
dependencies in the input data, which can lead to improved performance on a
variety of tasks.
2. Translation invariance: CNNs are designed to be invariant to translations in the
input data, meaning that they can detect the same features regardless of their
location in the input image. This property is useful for tasks such as object
recognition, where the position and orientation of objects can vary.
3. Parameter sharing: CNNs make use of weight sharing across different regions of
the input, which greatly reduces the number of parameters that need to be learned
compared to fully connected networks. This reduces the risk of overfitting and
improves generalization.
4. Parallel processing: CNNs can be highly parallelized and efficiently implemented
on modern hardware such as GPUs, which makes them suitable for large-scale
and real-time applications.

7.4. DISADVANTAGES OF CNN


1. Computationally expensive: CNNs can be computationally expensive to train,
particularly for large datasets and complex architectures. This can require
significant computational resources and time.
2. Overfitting: CNNs can be prone to overfitting, especially when the number of
parameters is large relative to the amount of training data. This can lead to poor
generalization performance on new data.
3. Sensitivity to hyperparameters: CNNs require careful tuning of hyperparameters
such as learning rate, batch size, and regularization strength, which can be time-
consuming and require expertise.
4. Limited interpretability: CNNs can be difficult to interpret and understand, due to
the complex interactions between the input data and the learned features. This can
make it challenging to diagnose errors or understand the reasons for the model's
decisions.[15]
7.5. PYTHON LIBRARIES IN CNN
7.5.1. TensorFlow

TensorFlow is a popular open-source software library developed by Google Brain team


for building and training machine learning models, particularly neural networks. It
provides a comprehensive suite of tools and APIs that enable developers to easily build
and deploy machine learning applications.

The core of TensorFlow is its data flow graph, which represents the mathematical
operations and transformations that are applied to the data in the model. The graph
consists of a series of nodes that represent operations and a set of edges that represent the
data flowing between those operations. This graph is designed to be highly flexible and
scalable, allowing for the efficient processing of large datasets and the distribution of
computations across multiple devices.

One of the key features of TensorFlow is its ability to automatically compute gradients
for any function defined in the graph, using the backpropagation algorithm. This enables
efficient training of neural networks and other models, as it allows the optimization
algorithms to iteratively adjust the model's parameters to minimize the error between the
predicted output and the actual output.

7.5.2. "get_dataset_partitions_tf"

The "get_dataset_partitions_tf" function returns three datasets: a training dataset
("train_ds"), a validation dataset ("val_ds"), and a test dataset ("test_ds").

Based on the name of the function, it's likely that this function is used to split a larger
dataset into smaller partitions for the purposes of training, validating, and testing a
machine learning model.

In general, it's common practice to split a dataset into these three partitions in order to
evaluate the performance of a machine learning model. The training dataset is used to
train the model, the validation dataset is used to tune the model's hyperparameters and
evaluate its performance during training, and the test dataset is used to evaluate the final
performance of the model after it has been trained.
Without more information about the "get_dataset_partitions_tf" function, it's difficult to
provide more specific details about what it does or how it partitions the dataset. However,
it's likely that the function uses TensorFlow APIs to load and preprocess the dataset, and
then splits it into the desired partitions using some form of random sampling or stratified
sampling.
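Since the function body is not shown in the report, the following is only a plausible reconstruction: an 80/10/10 split over a plain Python list (a tf.data version would follow the same pattern with `Dataset.take` and `Dataset.skip`); the split fractions, shuffling, and seed are assumptions:

```python
import random

def get_dataset_partitions(samples, train_frac=0.8, val_frac=0.1,
                           shuffle=True, seed=12):
    """Split a dataset into train/validation/test partitions,
    mirroring the take/skip pattern used with tf.data.Dataset."""
    data = list(samples)
    if shuffle:
        random.Random(seed).shuffle(data)  # fixed seed for reproducibility
    n_train = int(train_frac * len(data))
    n_val = int(val_frac * len(data))
    train_ds = data[:n_train]
    val_ds = data[n_train:n_train + n_val]
    test_ds = data[n_train + n_val:]       # remainder becomes the test set
    return train_ds, val_ds, test_ds

train_ds, val_ds, test_ds = get_dataset_partitions(range(100))
```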

7.5.3. KERAS

Keras is a popular open-source software library for building and training machine
learning models, particularly deep neural networks. It is built on top of TensorFlow and
provides a simple and intuitive interface for defining and training complex models.

Keras provides a high-level API that allows developers to easily build and configure
different types of neural networks, including convolutional neural networks (CNNs),
recurrent neural networks (RNNs), and more. It also supports a wide range of layers,
activations, loss functions, and optimization algorithms that can be combined in different
ways to create custom models.

One of the key advantages of Keras is its ease of use and flexibility. The API is designed
to be intuitive and user-friendly, with a simple and consistent syntax that makes it easy to
define and train complex models. Keras also supports a range of backends, including
TensorFlow, Microsoft Cognitive Toolkit, and Theano, allowing developers to choose the
best option for their needs.[16]

7.5.4. Adam Optimizer

Adam optimizer is a popular optimization algorithm for training machine learning


models, particularly deep neural networks. It is an extension of stochastic gradient
descent (SGD) that uses adaptive learning rates for each parameter, allowing it to
converge more quickly and effectively than traditional SGD.

The name "Adam" stands for "Adaptive Moment Estimation," which refers to the
algorithm's use of both first and second-order moments of the gradients to update the
model's parameters. Specifically, the algorithm maintains an exponential moving average
of the past gradients and squared gradients, and uses these estimates to compute adaptive
learning rates for each parameter.
The adaptive learning rates in Adam allow the optimizer to make larger updates to the
parameters when the gradients are small and smaller updates when the gradients are
large. This helps to prevent the optimizer from getting stuck in local minima and to
converge more quickly to the global minimum of the loss function.

Adam also includes several hyperparameters that can be tuned to improve its
performance on a specific problem, including the learning rate, the decay rate for the
moving averages, and the epsilon value used to prevent division by zero.

In practice, Adam is often the optimizer of choice for deep neural networks due to its
efficiency and effectiveness. However, it may not always be the best choice for every
problem, and other optimizers such as Adagrad, RMSProp, or SGD with momentum may
perform better depending on the specific characteristics of the problem and the model
being trained.[17]
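The update rule described above can be written out directly; the one-dimensional quadratic objective, learning rate, and step count below are illustrative only:

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=200):
    """Adam: keep exponential moving averages of the gradient (m)
    and squared gradient (v), bias-correct them, and take an
    adaptively scaled step."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
```

The iterate converges to the neighborhood of the minimum at x = 3, with the step size scaled down automatically as the gradient estimates settle.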

7.5.5. Training process of CNN model

In machine learning, training a model involves iteratively updating the model parameters
to minimize the error on the training data. The concepts of epochs, batch size, and steps
for epochs are related to this process and are explained below:

Epochs: An epoch is one complete iteration over the entire training dataset. During each
epoch, the model goes through the entire training dataset and updates its parameters
based on the average loss across all the data points. The number of epochs is typically a
hyperparameter that needs to be tuned to achieve the best performance on the validation set.

Batch size: In practice, it is not always feasible to feed the entire training dataset to the
model at once due to memory constraints. Therefore, the training dataset is divided into
smaller batches, and the model is trained on each batch in turn. The number of data points
in each batch is called the batch size. The batch size is typically a hyperparameter that
needs to be tuned to achieve the best performance on the validation set.

Steps per epoch: The number of steps per epoch is the number of batches that the model
processes before completing one epoch. For example, if the training dataset has 1000 data
points, and the batch size is 10, then there will be 100 batches in one epoch, and the
number of steps per epoch will be 100.
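The relationship can be captured in a one-line helper, which reproduces the example above:

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of batches the model processes in one full epoch;
    a final partial batch still counts as one step."""
    return math.ceil(n_samples / batch_size)

steps = steps_per_epoch(1000, 10)  # the example from the text
```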
CHAPTER 8
PROPOSED WORK

8.1. FLOW CHART FOR SVM


The detection of breast cancer involves five stages. The five stages of the workflow are
shown below.

Figure 8.1. Flow chart of SVM

 Stage 1: Load Image Dataset

 Stage 2: Image Preprocessing

 Stage 3: Image Segmentation

 Stage 4: Feature Extraction and Selection

 Stage 5: Image Classification

8.1.2. Load Image Dataset

The mammography image dataset was downloaded from the aiplanet website and
categorized into benign and malignant classes. After setting the paths to the benign and
malignant folders, the images are read using the Python library cv2.
8.1.3. Image Preprocessing

The main goal of pre-processing is to improve the image quality and make it ready for
further processing by removing or reducing the unrelated and surplus parts in the
background of the mammogram images. The noise and high-frequency components are
removed by filters. Some of the methods for image pre-processing are image
enhancement, image smoothing, noise removal, edge detection, etc.

This process involves the following steps:

 Input for this stage is a mammogram image, which is converted into a grayscale image.

 CLAHE (Contrast Limited Adaptive Histogram Equalization) is a type of image preprocessing technique that is commonly used to enhance the contrast of an image. The purpose of CLAHE is to improve the visual appearance of an image by increasing the contrast between its different regions.

 CLAHE is commonly used in various image processing applications, such as medical imaging, satellite imaging, and digital photography.

8.1.4. Image Segmentation

In digital image processing and computer vision, image segmentation is the process of
partitioning a digital image into multiple segments (sets of pixels, also known as image
objects). The goal of segmentation is to simplify and/or change the representation of an
image into something that is more meaningful and easier to analyze.

Edge detection is another important technique in image processing that involves


detecting boundaries or edges in an image. Edges are defined as sudden changes in
intensity or colour within an image, and they can be used for a variety of applications,
such as object detection and segmentation.

There are several methods for performing edge detection, but one of the most commonly
used is the Canny edge detection algorithm. This algorithm involves several steps,
including smoothing the image to remove noise, calculating the gradient of the image to
find regions of rapid change in intensity, and applying thresholding to identify edges.

The Canny algorithm is a multi-stage process that involves the following steps:

1. Smoothing: The image is convolved with a Gaussian filter to reduce noise and
blur the edges.

2. Gradient calculation: The gradient magnitude and direction are calculated for each
pixel in the image.

3. Non-maximum suppression: The gradient magnitude is compared to its


neighbouring pixels, and if it is the maximum, the pixel is retained, otherwise, it is
suppressed.

4. Double thresholding: Two threshold values are applied to the gradient magnitude,
and pixels above the high threshold are considered as strong edges, while pixels
below the low threshold are considered as non-edges. Pixels between the two
thresholds are considered as weak edges.

5. Edge tracking by hysteresis: The weak edges are connected to the strong edges if
they are adjacent to each other, forming continuous edges.

The output of the Canny algorithm is a binary image that indicates the location of edges
in the original image.
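Step 4 (double thresholding) can be sketched in isolation; the gradient magnitudes and threshold values below are hypothetical:

```python
def double_threshold(magnitudes, low, high):
    """Classify gradient magnitudes as strong edges, weak edges,
    or non-edges (step 4 of the Canny algorithm)."""
    labels = []
    for m in magnitudes:
        if m >= high:
            labels.append("strong")   # definitely an edge
        elif m >= low:
            labels.append("weak")     # kept only if linked to a strong edge
        else:
            labels.append("none")     # suppressed
    return labels

labels = double_threshold([5, 40, 90, 120], low=30, high=100)
```

Step 5 (hysteresis) would then promote each "weak" pixel to an edge only if it is adjacent to a "strong" one.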

INPUT IMAGE OUTPUT IMAGE

Figure 8.2. Edge Detection


The Hough Transform is a mathematical technique that is used for detecting shapes in
an image, such as lines, circles, and ellipses. It was first introduced by Paul Hough in
1962 and has since found widespread use in computer vision, image processing, and
pattern recognition.

The basic idea behind the Hough Transform is to transform the image space into a
parameter space, where each pixel in the image is represented as a point in the parameter
space. In the case of line detection, for example, the parameters are the slope and y-
intercept of the line. Each pixel in the image is then transformed into a curve in the
parameter space, which corresponds to a line in the image.

Figure 8.3. Hough Circles

8.1.5. Feature Extraction and Selection

Feature extraction is a part of the dimensionality reduction process, in which an
initial set of raw data is divided and reduced to more manageable groups, so that it is
easier to process. The most important characteristic of these large data sets is that
they have a large number of variables, which require a lot of computing resources to
process. Feature extraction helps to get the best features from those big data sets by
selecting and combining variables into features, thus effectively reducing the amount
of data. These features are easy to process, but are still able to describe the actual
data set with accuracy and originality.

The technique of extracting features is useful when you have a large data set and
need to reduce the number of resources without losing any important or relevant
information. Feature extraction helps to reduce the amount of redundant data in the
data set. In the end, the reduction of the data helps to build the model with less
machine effort and also increases the speed of the learning and generalization steps
in the machine learning process.

Feature extraction is a very important process for the overall system performance in
the classification of micro-calcifications. The features extracted are distinguished
according to the method of extraction and the image characteristics. The features
implemented here are texture features and statistical measures such as Mean,
Standard Deviation, Variance, Mean Smoothness, Mean Symmetry, Skewness,
Entropy and Kurtosis, which are explained in Chapter 3.

 After extraction of features, we have created a path to generate a CSV file and store the features in it.

 Then, read the CSV file and convert it to pandas Data Frame.

 Later on, apply MinMaxScaler, which is used to normalize the features of a dataset to a specified range (e.g., [0, 1]).
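The normalization performed by MinMaxScaler amounts to the following per-feature computation (default range [0, 1]); the sample values below are hypothetical:

```python
def min_max_scale(values):
    """Normalize one feature column to the [0, 1] range,
    as sklearn's MinMaxScaler does per feature."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10.0, 15.0, 20.0])
```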

TABLE 8.1. Features from Dataset

For feature selection we performed Principal Component Analysis (PCA), a popular
feature selection technique used in machine learning and data analysis. It is a
statistical technique that is used to reduce the dimensionality of a dataset by
identifying the most important features or variables.
The basic idea behind PCA is to transform a high-dimensional dataset into a lower-
dimensional space, while preserving as much of the original variance as possible.
This is achieved by identifying the principal components of the dataset, which are
linear combinations of the original features that capture the most variation in the
data.

The PCA algorithm works by first computing the covariance matrix of the dataset,
which represents the relationships between the different features. The eigenvectors
and eigenvalues of this matrix are then calculated, which represent the directions of
maximum variation in the data and the amount of variation along each of these
directions, respectively. The eigenvectors with the highest eigenvalues are the
principal components of the dataset.

PCA can be used for feature selection by selecting only the top k principal
components that capture the most variation in the data. This reduces the
dimensionality of the dataset while retaining the most important features. By
reducing the number of features, PCA can also help to reduce overfitting and
improve the performance of machine learning models.
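The procedure just described (center, covariance matrix, eigendecomposition, keep the top-k directions) can be sketched with NumPy; the small data matrix below is hypothetical:

```python
import numpy as np

def pca_reduce(X, k):
    """Project data onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]        # largest variance first
    components = eigvecs[:, order[:k]]
    return Xc @ components

X = np.array([[2.5, 2.4, 1.0],
              [0.5, 0.7, 0.2],
              [2.2, 2.9, 1.1],
              [1.9, 2.2, 0.9]])
reduced = pca_reduce(X, k=2)
```

Three correlated features are reduced to two hybrid features, with the first component carrying the most variance by construction.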

TABLE 8.2. Hybrid Features

8.1.6. Image Classification

SVC (Support Vector Classification) is one of the implementations of SVM for


classification problems. The SVC classifier is used to find the optimal hyperplane that
can classify the data points into different classes.

The SVC classifier works by finding the support vectors, which are the data points
closest to the decision boundary. These support vectors are used to define the hyperplane
and maximize the margin between the classes.

One of the advantages of using SVC is that it can handle non-linear data by using kernel
functions. By transforming the data into a higher-dimensional space, the SVC classifier
can find a hyperplane that can separate the data into different classes. Some commonly
used kernel functions in SVC include the linear kernel, polynomial kernel, RBF kernel,
and sigmoid kernel.

SVC is a powerful algorithm that can be used for a wide range of classification problems.
By selecting the appropriate hyperparameters and kernel function, SVC can achieve high
accuracy and generalization performance.
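Once trained, a linear-kernel SVC predicts from the sign of the decision function w·x + b; the weights, bias, feature values, and class names below are hand-picked for illustration, not learned from the project's data:

```python
def svc_predict(x, weights, bias):
    """Linear SVC decision rule: the sign of w.x + b picks the side
    of the separating hyperplane, i.e. the predicted class."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return "malignant" if score >= 0 else "benign"

# Hypothetical hyperplane over two normalized features.
label = svc_predict([0.8, 0.9], weights=[1.2, 1.0], bias=-1.5)
```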

Figure 8.4. SVC Classification


8.2. FLOW CHART FOR CNN

Figure 8.5. Flow chart of CNN

Steps involved for the detection of Breast Cancer Using CNN.

 Step 1: Initialize the batch size and number of epochs to 32 and 50 respectively. Batch size refers to the number of data points processed in one iteration of a neural network, while the number of epochs refers to the number of times the entire dataset is passed through the network during training.

 Step 2: Load the dataset of mammography images, which consists of both benign and malignant images.

 Step 3: Split the data into training, testing and validation sets.

 Step 4: Define the resize-and-rescale function and data augmentation. The purpose of image resizing and rescaling is to modify the size and scale of an image, which can improve the accuracy of the model by ensuring that all images are of a consistent size and scale. Data augmentation can help improve the accuracy of a model by exposing it to a wider range of variations in the data, and can also help prevent overfitting by increasing the diversity of the training data.

 Step 5: Define the CNN model, discussed in Chapter 7, comprising various layers such as the convolution layer, max-pooling, flatten layer and dense layer (fully connected layer).

 Step 6: Compile the model using Adam optimizer. The purpose of the Adam

optimizer is to efficiently update the weights of a neural network during training

in order to minimize the loss function.

 Step 7: Fit the model so that it is trained on a portion of the available data, and evaluate the model to check its accuracy.

 Step 8: Plot the training and Validation accuracy and then save the model.

CHAPTER 9
RESULTS
9.1. OUTPUT OF SVM

INPUT IMAGE CLAHE IMAGE

HOUGH CIRCLES EDGE DETECTION


9.2. FEATURES FROM DATASET

9.3. HYBRID FEATURES OBTAINED FROM PCA

9.4. ACCURACY FOR SVM


9.5. OUTPUT OF CNN

9.6. PLOT ACCURACY VS VALIDATION LOSS


9.7. ACCURACY OF CNN
CONCLUSION
Early detection of the tumor is a vital process that benefits the diagnosis of breast cancer.
This project achieves it by using four image processing techniques, namely image pre-processing,
image segmentation, feature extraction and selection, and classification. Using
both Convolutional Neural Networks and a Support Vector Machine, we performed
medical image processing for breast cancer detection. The Convolutional Neural Network
model achieved an accuracy of 98.8%, whereas the Support Vector Machine achieved an
accuracy of 93.1%.

The main disadvantage of the SVM model is that its performance may degrade when
dealing with a large number of input instances. However, the SVM still offers certain
advantages, such as its ability to handle nonlinear relationships between inputs and its
suitability for binary classification problems.

CNNs are better suited for image processing, owing to their ability to automatically learn
relevant features, their scalability to large datasets, and their superior accuracy.

Hence, the project helps in detecting the cancerous tumor before it spreads to other parts
of the body and increases the chances of successful diagnosis.
FUTURE SCOPE
The breast cancer detection CNN training model has an accuracy of 98%, which means
that the prediction for a test image is more accurate and less time-consuming. We
therefore want to build a website based on the CNN deep learning algorithm. In current
medical practice, the test images used for breast cancer detection can take a week to
yield a prediction of whether the patient's tumor is "benign" or "malignant"; with this
project, the prediction is done in a fraction of a second. Early detection of the tumor is
a vital process that benefits the diagnosis of breast cancer, and this can be achieved
through the breast cancer detection website.
REFERENCES
[1] DeSantis CE, Bray F, Ferlay J, Lortet-Tieulent J, Anderson BO, Jemal A. International
Variation in Female Breast Cancer Incidence and Mortality Rates. Cancer Epidemiol
Biomarkers Prev. 2015; 24(10): 1495-506.

[2] Henry NL, Shah PD, Haider I, Freer PE, Jagsi R, Sabel MS. Chapter 88: Cancer of the Breast.
In: Niederhuber JE, Armitage JO, Doroshow JH, Kastan MB, Tepper JE, eds. Abeloff’s Clinical
Oncology. 6th ed. Philadelphia, Pa: Elsevier; 2020.

[3] American Cancer Society. Cancer Facts and Figures 2023. Atlanta, Ga: American Cancer
Society; 2023.

[4] Jagsi R, King TA, Lehman C, Morrow M, Harris JR, Burstein HJ. Chapter 79: Malignant
Tumors of the Breast. In: DeVita VT, Lawrence TS, Lawrence TS, Rosenberg SA, eds. DeVita,
Hellman, and Rosenberg’s Cancer: Principles and Practice of Oncology. 11th ed. Philadelphia,
Pa: Lippincott Williams & Wilkins; 2019.

[5] [Link], [Link], [Link], [Link], Amit Karn “An Accurate Breast Cancer
Detection and Classification using Image Processing” Department of ECE, Sri Eshwar College
of Engineering, Coimbatore, India Volume 9, Issue 3, March 2021.

[6] Siddhartha Gupta School Of Electrical Engineering VIT Vellore, India “Breast Cancer
Detection Using Image Processing Techniques” Innovations in Power and Advanced
Computing Technologies (i-PACT) 2019.

[7] Prannoy Giri* and K Saravanakumar Department of Computer Science, Christ University,
India. “Breast Cancer Detection using Image Processing Techniques” ISSN: 0974-6471 June
2017, Vol. 10, No. (2): Pgs. 391-399 volume 9,2019.

[8] Dina A. Ragab, Maha Sharkas, Stephen Marshall and Jinchang Ren Electronics and
Communications Engineering Department, Arab Academy for Science, Technology, and
Maritime Transport (AASTMT), “Breast cancer detection using deep convolutional neural
networks and support vector machines” 2019.

[9] X. Liu and D. Wang, “Image and Texture Segmentation Using Local Histograms”, IEEE Trans.
Med. Img., vol.15, pp. 3066-3076, 2006.

[10] Monika Sharma, R. B. Dubey, Sujata, S. K. Gupta “Feature Extraction of Mammograms”,


International Journal of Advanced Computer Research (ISSN (print): 2249-7277 ISSN (online):
2277-7970) Volume-2 Number-3 Issue-5 September-2012.

[11] M. Yang, S. Cui, Y. Zhang, J. Zhang and X. Li, "Data and Image Classification of
Haematococcus pluvialis Based on SVM Algorithm," 2021 China Automation Congress (CAC),
Beijing, China, 2021, pp. 522-525, doi: 10.1109/CAC53003.2021.9727433.

[12] K. K. Shinde, S. S. Tharewal, K. S. Suryawanshi and C. N. Kayte, "Python Based Face


Recognition for Person Identification Using PCA and 2DPCA Techniques," 2020 International
Conference on Smart Innovations in Design, Environment, Management, Planning and
Computing (ICSIDEMPC), Aurangabad, India, pp. 171-175, doi:
10.1109/ICSIDEMPC49020.2020.9299649.

[13] M. C. Irmak, M. B. H. Taş, S. Turan and A. Haşiloğlu, "Comparative Breast Cancer Detection
with Artificial Neural Networks and Machine Learning Methods," 2021 29th Signal Processing
and Communications Applications Conference (SIU), Istanbul, Turkey, 2021, pp. 1-4, doi:
10.1109/SIU53274.2021.9477991.

[14] K. Mridha, "Early Prediction of Breast Cancer by using Artificial Neural Network and Machine
Learning Techniques," 2021 10th IEEE International Conference on Communication Systems
and Network Technologies (CSNT), Bhopal, India, 2021, pp. 582-587, doi:
10.1109/CSNT51715.2021.9509658.

[15] Vijayvargia, "MACHINE LEARNING WITH PYTHON: An Approach to Applied Machine


Learning," BPB Publications, 2018.

[16] K. Duvvuri, H. Kanisettypalli and S. Jayan, "Detection of Brain Tumor Using CNN and CNN-
SVM," 2022 3rd International Conference for Emerging Technology (INCET), Belgaum, India,
2022, pp. 1-7, doi: 10.1109/INCET54531.2022.9824725.

[17] G. P. Kumar, G. S. Priya, M. Dileep, B. E. Raju, A. R. Shaik and K. V. S. H. G. Sarman, "Image


Deconvolution using Deep Learning-based Adam Optimizer," 2022 6th International
Conference on Electronics, Communication and Aerospace Technology, Coimbatore, India,
2022, pp. 901-904, doi: 10.1109/ICECA55336.2022.10009073.
