
Exp No. 1-3 (MLC)

MACHINE LEARNING THEORY

Uploaded by

murudkarp11

74 - PRACHI MURUDKAR

EXPERIMENT NO. 1

Aim: To install Anaconda on our system.

Theory :

Steps to install Anaconda-

1. Download Anaconda
• Visit the Anaconda website.
• Choose the appropriate Python version (usually Python 3) and download the installer
for your system (32-bit or 64-bit).

2. Run the Installer


• Double-click the downloaded installer file.
• Read the license agreement and click "I Agree."
• Choose the installation type:
o Just Me: Installs Anaconda for the current user.
o All Users: Installs Anaconda for all users on the system.
• Select the installation location.
• Important: Leave the option to add Anaconda to the PATH environment variable
unchecked; adding it to PATH can cause conflicts with other Python installations.
• Click "Install."

3. Verify Installation
• Open a command prompt or Anaconda Prompt.
• Type conda --version and press Enter. If the installation was successful, you should
see the Anaconda version.

4. Launch Jupyter Notebook


• Windows: Open Anaconda Navigator from the Start Menu.
• MacOS/Linux: Open Anaconda Navigator from the Applications folder or by typing
anaconda-navigator in the terminal.
• Launch Jupyter Notebook: In Anaconda Navigator, find the Jupyter Notebook option
and click "Launch." This will open Jupyter Notebook in your default web browser.

5. Using Jupyter Notebook


• Create a new notebook: In the Jupyter Notebook interface, navigate to the directory
where you want to save your notebooks. Click the "New" button on the right and
select "Python 3" to create a new notebook.
• Start coding: You can now start writing and executing Python code in the Jupyter
Notebook.

Optional: Setting Up a Virtual Environment


1. Create a virtual environment: Open Anaconda Prompt or Terminal and run conda
create -n myenv python=3.8 (replace "myenv" with your desired environment name).
2. Activate the environment: Run conda activate myenv.
3. Install Jupyter in the environment: Run conda install jupyter.
4. Launch Jupyter Notebook in the Virtual Environment
• Activate your virtual environment: Run conda activate myenv.
• Launch Jupyter Notebook: Run jupyter notebook.

BASIC PYTHON PROGRAMS :

1. PRINT STATEMENT

python
print("helloworld")
Explanation:
• The print function is used to display the specified message or value to the
screen.
• In this example, the string "helloworld" is passed as an argument to the
print function, which outputs the text "helloworld".

2. Variable Assignment and Addition

python
a = 5
b = 7
print("a + b =", a + b)

Explanation:
• Variables a and b are assigned the values 5 and 7, respectively.
• The expression a + b adds the values of a and b.
• The print function outputs the result of the addition along with the string "a
+ b =".

3. Decision Making Program:


• Check Even or Odd Number

python
a = 100
if a % 2 == 0:
    print("Even number")
else:
    print("Odd number")

Explanation:
• Variable a is assigned the value 100.
• The if statement checks whether a is divisible by 2 using the
modulus operator %.
• If a % 2 equals 0, it means a is an even number, and the program
prints "Even number".
• Otherwise, the program prints "Odd number" using the else
clause.
4. Program Using Loops
• While Loop to Print Numbers

python
count = 10
while count < 15:
    count = count + 1
    print(count)

Explanation:
• Variable count is assigned the initial value of 10.
• The while loop continues to execute as long as the condition count
< 15 is true.
• Inside the loop, the value of count is incremented by 1 in each
iteration.
• The print function outputs the current value of count during each
iteration.
• The loop stops when count reaches 15, printing the numbers 11 through 15.

SCREENSHOT:

EXPERIMENT NO. 2

Aim: To explore Python libraries (NumPy, Pandas, Matplotlib).

Theory:

NumPy Library in Python

Introduction:

NumPy (Numerical Python) is a library for working with arrays and
mathematical operations in Python. It's a fundamental package for
scientific computing and is widely used in various fields.
Key Features-
o Multi-dimensional arrays: Efficient and flexible arrays for
representing complex data structures.
o Vectorized operations: Fast element-wise operations on
entire arrays.
o Matrix operations: Extensive set of matrix operations,
including multiplication and inversion.
o Random number generation: Functions for generating
random numbers, including uniform and normal distributions.

Advantages-
• Speed: Faster than Python lists for numerical computations.
• Memory efficiency: Uses less memory than Python lists.
• Vectorized operations: Easy to perform complex calculations on
entire arrays.

Common Functions-
o numpy.array(): Creates a NumPy array from a Python list.
o numpy.zeros(): Creates a NumPy array filled with zeros.
o numpy.sum(): Calculates the sum of all elements in a NumPy array.
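The functions above can be combined in a short sketch (the values here are
illustrative only):

```python
import numpy as np

# numpy.array(): create an array from a Python list
a = np.array([1, 2, 3, 4])

# numpy.zeros(): create an array of zeros with the same length
z = np.zeros(4)

# Vectorized operation: element-wise addition without an explicit loop
b = a + z

# numpy.sum(): total of all elements
total = np.sum(a)

print(b)
print(total)
```

Note how a + z operates on whole arrays at once; this is the vectorized
style that makes NumPy faster than looping over Python lists.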
OUTPUT:

Pandas Library in Python


Pandas is a library for data manipulation and analysis in Python. It provides
data structures and functions to efficiently handle and process large
datasets, making it a fundamental tool for data science and analytics. Pandas
allows us to analyze big data and make conclusions based on statistical
theories. Pandas can clean messy data sets, and make them readable and
relevant. Relevant data is very important in data science.

Key Features-
• DataFrames: Two-dimensional labeled data structures with columns
of potentially different types.
• Series: One-dimensional labeled array capable of holding any data type.
• Indexing and Selecting: Efficient indexing and selecting of data using
labels or conditional statements.
• Data Alignment: Automatic alignment of data during operations,
eliminating the need for manual data matching.
• GroupBy and Reshaping: Functions for grouping, aggregating,
and reshaping data.

Advantages-
• Efficient Data Handling: Optimized for performance, making it suitable
for large datasets.
• Flexible Data Structures: DataFrames and Series can handle various
data types and structures.
• Easy Data Manipulation: Intuitive API for data filtering, sorting,
and grouping.
• Integration with Other Libraries: Seamless integration with other
popular data science libraries, such as NumPy and Matplotlib.

Common Functions-
• pandas.read_csv(): Reads a CSV file into a DataFrame.
• pandas.DataFrame(): Creates a DataFrame from a dictionary or
other data structure.
• pandas.Series(): Creates a Series from a list or other data structure.
• df.head(): Displays the first few rows of a DataFrame.
• df.groupby(): Groups a DataFrame by one or more columns.
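A minimal sketch of these functions on made-up data (pandas.read_csv() is
omitted here only because it needs a CSV file on disk; the city/sales columns
are illustrative):

```python
import pandas as pd

# pandas.DataFrame(): build a DataFrame from a dictionary
df = pd.DataFrame({
    "city": ["Pune", "Mumbai", "Pune", "Mumbai"],
    "sales": [100, 200, 150, 250],
})

# pandas.Series(): a one-dimensional labeled array from a list
s = pd.Series([10, 20, 30])

# df.head(): display the first few rows
print(df.head(2))

# df.groupby(): group rows by a column and aggregate each group
totals = df.groupby("city")["sales"].sum()
print(totals)
```

groupby followed by an aggregation like sum() is the standard pandas pattern
for per-category statistics.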

Real-World Applications-

• Data Analysis: Data cleaning, filtering, and visualization.


• Data Science: Data preprocessing, feature engineering, and model
implementation.
• Business Intelligence: Data reporting, dashboard creation, and business
analytics.

OUTPUT:
Matplotlib Library in Python
Matplotlib is a popular data visualization library in Python that provides a
comprehensive set of tools for creating high-quality 2D and 3D plots, charts,
and graphs. It's widely used in various fields, including scientific research, data
analysis, and machine learning.

Key Features

Plotting: Creates a wide range of plots, including line plots, scatter plots, bar
charts, histograms, and more.
Customization: Offers extensive customization options for plot appearance,
including colors, fonts, labels, and titles.
Interactive Plots: Supports interactive plots with zooming, panning, and hover-
over text.
3D Plotting: Creates 3D plots and charts for visualizing complex data.
Integration: Seamlessly integrates with other popular Python libraries, such as
NumPy, Pandas, and Scikit-learn.

Advantages
o Easy to Use: Simple and intuitive API for creating plots and charts.
o High-Quality Output: Produces high-quality, publication-ready plots and
charts.
o Customizable: Offers extensive customization options for plot
appearance.
o Flexible: Supports a wide range of plot types and data formats.
o Large Community: Active community and extensive documentation
make it easy to find help and resources.

Common Functions

o matplotlib.pyplot.plot(): Creates a line plot.


o matplotlib.pyplot.scatter(): Creates a scatter plot.
o matplotlib.pyplot.bar(): Creates a bar chart.
o matplotlib.pyplot.hist(): Creates a histogram.
o matplotlib.pyplot.show(): Displays the plot.
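A minimal sketch combining a few of these functions. The Agg backend and the
plot.png filename are assumptions so the script can also run on a machine
without a display; interactively, plt.show() would be used instead of
plt.savefig():

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; lets the script run headless
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

plt.plot(x, y, label="y = x^2")  # matplotlib.pyplot.plot(): line plot
plt.scatter(x, y)                # matplotlib.pyplot.scatter(): same data as points
plt.xlabel("x")
plt.ylabel("y")
plt.title("Line and scatter plot")
plt.legend()
plt.savefig("plot.png")          # write the figure to a file
```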

Real-World Applications

Data Analysis: Visualizing data to identify trends, patterns, and correlations.


Scientific Research: Creating plots and charts to present research findings.

OUTPUT:

EXPERIMENT NO. 3

Aim: To perform data processing and data cleaning.

Theory:

Data Processing And Cleaning:


Data Processing is a crucial step in the data science workflow that involves
transforming raw data into a format that is suitable for analysis and modeling.
It's an essential step that ensures data quality, completeness, and
consistency, and is a critical component of data preparation.

Key Concepts

• Data Cleaning: Identifying and correcting errors, inconsistencies,
and inaccuracies in the data.
• Data Transformation: Converting data from one format to another to
make it more suitable for analysis.
• Data Reduction: Selecting a subset of the most relevant data to
reduce dimensionality and improve model performance.
• Data Integration: Combining data from multiple sources into a
single, unified view.

Data Processing Steps-


• Data Ingestion: Collecting and gathering data from various sources.
• Data Cleaning: Identifying and correcting errors, inconsistencies,
and inaccuracies in the data.
• Data Transformation: Converting data from one format to another to
make it more suitable for analysis.
• Data Reduction: Selecting a subset of the most relevant data to
reduce dimensionality and improve model performance.
• Data Integration: Combining data from multiple sources into a
single, unified view.
• Data Quality Check: Verifying the quality and integrity of the
processed data.
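The steps above can be sketched with pandas on a small in-memory table (the
names, ages, and scores are made up for illustration; in practice ingestion
would use something like pd.read_csv()):

```python
import pandas as pd
import numpy as np

# Ingestion: a small table standing in for data loaded from a file
raw = pd.DataFrame({
    "name": ["Asha", "Ravi", "Asha", "Meera"],
    "age": [25, np.nan, 25, 32],
    "score": ["85", "90", "85", "78"],  # numbers stored as text by mistake
})

# Cleaning: drop the duplicate row, fill the missing age with the mean
clean = raw.drop_duplicates()
clean = clean.assign(age=clean["age"].fillna(clean["age"].mean()))

# Transformation: convert the text column to a numeric type
clean = clean.assign(score=clean["score"].astype(int))

# Quality check: verify no missing values remain
assert clean.isna().sum().sum() == 0
print(clean)
```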
Data Processing Techniques

• Handling Missing Values: Replacing missing values with mean,
median, or imputed values.
• Data Normalization: Scaling numerical data to a common range to
prevent feature dominance.
• Data Standardization: Transforming data to have a mean of 0 and a
standard deviation of 1.
• Data Aggregation: Combining multiple data points into a single
value, such as sum or average.
• Data Encoding: Converting categorical data into numerical data using
techniques like one-hot encoding or label encoding.
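A short sketch of normalization, standardization, and one-hot encoding with
pandas (the height and color columns are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "height": [150.0, 160.0, 170.0, 180.0],
    "color": ["red", "blue", "red", "green"],
})

h = df["height"]

# Normalization: scale values into the [0, 1] range
df["height_norm"] = (h - h.min()) / (h.max() - h.min())

# Standardization: shift to mean 0 and scale to standard deviation 1
df["height_std"] = (h - h.mean()) / h.std()

# One-hot encoding: one binary column per category
encoded = pd.get_dummies(df["color"], prefix="color")

print(df[["height_norm", "height_std"]])
print(encoded)
```

Normalization keeps the relative spacing of values but bounds the range, while
standardization centers the data; one-hot encoding turns each category into
its own indicator column.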

OUTPUT:
