SAVITRIBAI PHULE PUNE UNIVERSITY
ADSUL’S TECHNICAL CAMPUS
DEPARTMENT OF COMPUTER ENGINEERING
FINAL YEAR OF COMPUTER ENGINEERING
LP-VI (BI LAB MANUAL)
INDEX
PAGE
SR.NO. NAME OF PRACTICAL
NO.
Import the legacy data from different sources such as (Excel,
SqlServer, Oracle etc.) and load in the target system. (You can
1 1 TO 4
download sample database such as Adventure works, North wind,
footmark etc.)
Perform the Extraction Transformation and Loading (ETL) process to
2 5 TO 21
construct the database in the SQL server.
Create the cube with suitable dimension and fact tables based on
3 22 TO 46
OLAP
Import the data warehouse data in Microsoft Excel and create the Pivot
4 47 TO 52
table and Pivot Chart
5 Perform the data classification using classification algorithm. 53 TO 55
PRACTICAL 1
Import the legacy data from different sources such as (Excel, SqlServer,
Oracle etc.) and load in the target system. (You can download sample
database such as Adventure works, North wind, footmark etc.)
Step 1: Open Power BI
Step 2: Click on Get data following list will be displayed select Excel
Step 3: Select required file and click on Open, Navigator screen appears
.
Step 4: Select file and click on edit
Step 5: Power query editor appears
Step 6: Again, go to Get Data and select OData feed
Step 7:
Paste url as http://services.odata.org/V3/Northwind/Northwind.svc/
Click on ok
Step 8: Select orders table
Sand click on edit
Note: If you just want to see preview you can just click on table name
without clicking on checkbox
Click on edit to view table
PRACTICAL 2 B
Perform the Extraction Transformation and Loading (ETL) process to
construct the database in the SQL server.
Software requirements: SQL SERVER 2012 FULL VERSION
(SQLServer2012SP1-FullSlipstream-ENU-x86)
Steps to install SQL SERVER 2012 FULL VERSION (SQLServer2012SP1-
FullSlipstream-ENU-x86) are given in my previous post.
Step 1:
Open SQL Server Management Studio to restore backup file
Step 2: Right click on Databases Restore Database
Step 3: Select Device click on icon towards end of device box
Step 4: Click on Add Select path of backup files
Step 5: Select both files at a time
Step 6 : Click ok and in select backup devices window Add both files of
AdventureWorks
Step 7: Open SQL Server Data Tools
Select File New Project Business Intelligence Integration Services
Project & give appropriate project name.
Environment consists of SQL Server Integration Services(SSIS)
Step 8: Right click on Connection Managers in solution explorer and click on
New Connection Manager.
Add SSIS connection manager window appears.
Step 9: Select OLEDB Connection Manager and Click on Add
Step 10: Configure OLE DB Connection Manager window appears Click on
New
Step 11: Select Server name(as per your machine) from drop down and
database name and click on Test connection.
If test connection succeeded click on OK.
Step 12: Click on OK
Connection is added to connection manager
Step 13: Drag and drop Data Flow Task in Control Flow tab
Step 14: Drag OLE DB Source from Other Sources and drop into Data Flow
tab
Step 15: Double click on OLE DB source OLE DB Source Editor appears
click on New to add connection manager.
Select [Sales].[Store] table from drop down ok
Step 16: Drag ole db destination in data flow tab and connect both
Step 17: Double click on OLE DB destination
Click on New to run the query to get [OLE DB Destination] in Name of the
table or the view.
Click on ok
Step 18: Click on start.
Step 19: Go to SQL Server Management Studio
In database tab Adventureworks Right click on [dbo].[OLE DB
Destination] Scrip Table as SELECT To New Query Editor Window
Step 20: Execute following query to get output.
USE [AdventureWorks2012]
GO
SELECT [BusinessEntityID]
,[Name]
,[SalesPersonID]
,[Demographics]
,[rowguid]
,[ModifiedDate]
FROM [dbo].[OLE DB Destination]
GO
PRACTICAL 3 b
Create the cube with suitable dimension and fact tables based on
OLAP
Step 1: Creating Data Warehouse
Let us execute our T-SQL Script to create data warehouse with fact tables,
dimensions and populate them with appropriate test values.
Download T-SQL script attached with this article for creation of Sales Data
Warehouse or download from this article “Create First Data Warehouse” and
run it in your SQL Server.
Downloading "Data_WareHouse SQLScript.zip" from the article
https://www.codeproject.com/Articles/652108/Create-First-Data-WareHou
se
After downloading extract file in folder.
Follow the given steps to run the query in SSMS (SQL Server Management
Studio).
1. Open SQL Server Management Studio 2012
2. Connect Database Engine
Password for sa : admin123 (as given during installation)
Click Connect.
3. Open New Query editor
4. Copy paste Scripts given below in various steps in new query editor
window one by one
5. To run the given SQL Script, press F5
6. It will create and populate “Sales_DW” database on your SQL Server
OR
1. Go to the extracted sql file and double click on it.
2. New Sql Query Editor will be opened containing Sales_DW Database.
3. Click on execute or press F5 by selecting query one by one or directly
click on Execute.
4. After completing execution save and close SQL Server Management
studio & Reopen to see Sales_DW in Databases Tab.
Step 2: Start SSDT environment and create New Data Source
Go to Sql Server Data Tools --> Right click and run as administrator
Click on File New Project
In Business Intelligence Analysis Services Multidimensional and Data
Mining models appropriate project name click OK
Right click on Data Sources in solution explorer New Data Source
Data Source Wizard appears
Click on New
Select Server Name select Use SQL Server Authentication Select or enter
a database name (Sales_DW)
Note : Password for sa : admin123 (as given during installation of SQL 2012
full version)
Click Next
Select Inherit Next
Click Finish
Sales_DW.ds gets created under Data Sources in Solution Explorer
Step 3: Creating New Data Source View
In Solution explorer right click on Data Source View Select New Data
Source View
Click Next
Click Next
Select FactProductSales(dbo) from Available objects and put in Includes
Objects by clicking on
Click on Add Related Tables
Click Next
Click Finish
Sales DW.dsv appears in Data Source Views in Solution Explorer.
Step 4: Creating new cube
Right click on Cubes New Cube
Select Use existing tables in Select Creation Method Next
In Select Measure Group Tables Select FactProductSales Click Next
In Select Measures check all measures Next
In Select New Dimensions Check all Dimensions Next
Click on Finish
Sales_DW.cube is created
Step 5: Dimension Modification
In dimension tab Double Click Dim Product.dim
Drag and Drop Product Name from Table in Data Source View and Add in
Attribute Pane at left side
Step 6: Creating Attribute Hierarchy in Date Dimension
Double click On Dim Date dimension -> Drag and Drop Fields from Table
shown in Data Source View to Attributes-> Drag and Drop attributes from
leftmost pane of attributes to middle pane of Hierarchy.
Drag fields in sequence from Attributes to Hierarchy window (Year, Quarter
Name, Month Name, Week of the Month, Full Date UK)
Step 7: Deploy Cube
Right click on Project name Properties
This window appaers
Do following changes and click on Apply & ok
Right click on project name Deploy
Deployment successful
To process cube right click on Sales_DW.cube Process
Click run
Browse the cube for analysis in solution explorer
PRACTICAL 4
Import the data warehouse data in Microsoft Excel and create the Pivot table
and Pivot Chart
(Ms Office Professional is used to make sure Power View is enabled for
visualization.)
Step 1: Open Excel 2013 (Professional)
Go to Data tab Get External Data From Other Sources From Data
Connection Wizard
Step 2: In Data Connection Wizard Select Microsoft SQL Server Click on
Next
Step 3: In connect to Database Server provide Server name( Microsoft SQL
Server Name)
Provide password for sa account as given during installation of SQL Server
2012 full version)
Password: admin123
Click on Next
Step 4: In Select Database and Table Select Sales_DW (already created in
SQL) check all dimensions and import relationships between selected
tables
Step 5: In save data connection files browse path and click on Finish
Step 6: In import data select Pivot Chart and click on OK
Step 7: In fields put SalesDateKey in filters, FullDateUK in axis and Sum of
ProductActualCost in values
Step 8: In Insert Tab go to Pivot Table
Step 9: Click on Choose Connection to select existing connection with
Sales_DW and click on open
Pivot table and Pivot chart is created
PRACTICAL 5
Perform the data classification using classification algorithm.
OR
Data Analysis using Time Series Analysis
Software required: R 3.5.1
Time series is a series of data points in which each data point is associated
with a timestamp. A simple example is the price of a stock in the stock
market at different points of time on a given day. Another example is the
amount of rainfall in a region at different months of the year. R language
uses many functions to create, manipulate and plot the time series data. The
data for the time series is stored in an R object called time-series object. It
is also a R data object like a vector or data frame.
The time series object is created by using the ts() function.
Syntax
The basic syntax for ts() function in time series analysis is −
timeseries.object.name <- ts(data, start, end, frequency)
Following is the description of the parameters used −
• data is a vector or matrix containing the values used in the time series.
• start specifies the start time for the first observation in time series.
• end specifies the end time for the last observation in time series.
• frequency specifies the number of observations per unit time.
Except the parameter "data" all other parameters are optional
Consider the annual rainfall details at a place starting from January 2012.
We create an R time series object for a period of 12 months and plot it.
Code to run in R
# Get the data points in form of a R vector.
rainfall <-
c(799,1174.8,865.1,1334.6,635.4,918.5,685.5,998.6,784.2,985,882.8,1071)
# Convert it to a time series object.
rainfall.timeseries <- ts(rainfall,start = c(2012,1),frequency = 12)
# Print the timeseries data.
print(rainfall.timeseries)
# Give the chart file a name.
png(file = "rainfall.png")
# Plot a graph of the time series.
plot(rainfall.timeseries)
# Save the file.
dev.off()
After this again plot to get chart
plot(rainfall.timeseries)
Output:
When we execute the above code, it produces the following result and chart
—
Jan Feb Mar Apr May Jun Jul Aug Sep 2012 799.0 1174.8
865.1 1334.6 635.4 918.5 685.5 998.6 784.2 Oct Nov Dec 2012
985.0 882.8 1071.0