Unit-1 Bi
Analytics [KDS-051]
UNIT1- BUSINESS INTELLIGENCE –
INTRODUCTION
Lecture details and topics
Lecture 1: Introduction - History and Evolution
Lecture 8: Changing the aggregation, both for a single viz and for the default aggregation
What is Business Intelligence (BI)?
• Business intelligence (BI) is a set of technologies that are used to solve specific business problems. BI tools are typically
designed to deliver a mix of operational embedded analytics, analytics platform capabilities, and rich data visualization
functionality.
• They include reporting tools, statistical analysis tools, database management systems, and data mining applications. BI is
usually implemented as a standalone technology in-house or by an outside consulting firm or vendor.
• A business intelligence platform is often a cloud-based solution designed to help an organization achieve and
sustain the goals of its digital transformation initiatives. It can be used to store, organize, and analyze data,
giving organizations access to the insights they need to meet their business goals.
• It can also be used by employees and stakeholders alike to gain insights into a company’s performance. A
BI platform is more than just an analytics tool.
• In other words, a BI platform can enable collaboration between all parties in the organization as well as with
external stakeholders such as customers, partners, suppliers and others.
What is the history and evolution of business intelligence (BI)?
• BI solutions have traditionally focused on acquiring a broad set of data and reporting it back to business analysts. However,
BI has gone through several major shifts over the last 20 years.
• As a result, we can now take advantage of modern BI tools, which are increasingly enabled by AI and
machine learning. The evolution of business intelligence can be divided into three eras.
** Traditional era of business intelligence: Organizations began combining data from
multiple systems into a single database. This process is known as extract, transform, and load (ETL).
** Self-service era of business intelligence: Tools enable data analysts to sort through large amounts of
data and find patterns quickly. They replace the rows and columns of traditional data
presentation tools with pictures and charts that represent the data visually.
** Augmented analytics era of business intelligence: The field is moving away from self-service analytics and
toward automation, which is what is called augmented analytics.
History and Evolution of Business
intelligence and Analytics
Business Analytics
Benefits
◦ …reduced costs, better risk management, faster decisions,
better productivity and enhanced bottom-line performance such
as profitability and customer satisfaction.
Challenges
◦ …lack of understanding of how to use analytics, competing
business priorities, insufficient analytical skills, difficulty in getting
good data and sharing information, and not understanding the
benefits versus perceived costs of analytics studies.
Business Intelligence: Effective and Timely decisions
The main purpose of business intelligence systems is to provide knowledge workers with tools and methodologies that allow them to
make effective and timely decisions.
• Effective decisions. The application of rigorous analytical methods allows decision makers to rely on information and knowledge
which are more dependable.
• Timely decisions. The ability to rapidly react to the actions of competitors and to new market conditions is a critical factor in the
success or even the survival of a company.
Data, information and knowledge
Data:
• For a retailer, data refer to primary entities such as customers, points of sale and items, while sales receipts represent the commercial
transactions.
• Data is unprocessed facts and figures without any added interpretation or analysis. "The price of crude oil is $80 per barrel."
Information:
• Information is the outcome of extraction and processing activities carried out on data, and it appears meaningful for those who
receive it in a specific domain.
• Information is data that has been interpreted so that it has meaning for the user. "The price of crude oil has risen from $70 to $80 per
barrel" gives meaning to the data and so is said to be information to someone who tracks oil prices.
Knowledge:
• Information is transformed into knowledge when it is used to make decisions and develop the corresponding actions.
• Knowledge is a combination of information, experience and insight that may benefit the individual or the organization. "When crude
oil prices go up by $10 per barrel, it's likely that petrol prices will rise by 2p per litre" is knowledge.
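The crude oil example can be traced in a few lines of code. This is only an illustrative sketch: the function name is hypothetical, and treating the "$10 per barrel leads to 2p per litre" rule as linear is an assumption made here, not something the notes state.

```python
# Data: raw, unprocessed observations (prices in USD per barrel).
previous_price = 70
current_price = 80

# Information: data interpreted so that it has meaning (the price change).
price_change = current_price - previous_price

# Knowledge: a rule learned from experience, applied to the information.
# Assumed linear rule based on the text: +$10 per barrel -> about +2p per litre.
PENCE_PER_LITRE_PER_DOLLAR = 2 / 10


def expected_petrol_rise(change_per_barrel):
    """Return the expected petrol price rise in pence per litre."""
    return change_per_barrel * PENCE_PER_LITRE_PER_DOLLAR


petrol_rise = expected_petrol_rise(price_change)
print(f"Crude oil rose by ${price_change} per barrel")
print(f"Petrol is likely to rise by about {petrol_rise:.1f}p per litre")
```

The two prices alone are data, the computed change is information, and the rule that turns the change into an expected petrol price is knowledge that can support a decision.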
Architectural Representation: From data to information to knowledge
The role of mathematical models
** A business intelligence system provides decision makers with information and knowledge extracted from data,
through the application of mathematical models and algorithms. In some instances, this activity may reduce to
calculations of totals and percentages, graphically represented by simple histograms, whereas more elaborate analyses
require the development of advanced optimization and learning models.
First, the objectives of the analysis are identified and the performance indicators that will be used to evaluate
alternative options are defined.
Mathematical models are then developed by exploiting the relationships among system control variables,
parameters and evaluation metrics.
Finally, what-if analyses are carried out to evaluate the effects on the performance determined by variations in
the control variables and changes in the parameters.
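The three steps above (define an indicator, build a model, run what-if analyses) can be sketched as follows. The linear demand model, the parameter values and all function names are hypothetical and chosen only for illustration. Price plays the role of the control variable, unit cost is a parameter, and profit is the performance indicator.

```python
def demand(price, base_demand, sensitivity):
    """Hypothetical linear demand model: units sold fall as price rises."""
    return max(0.0, base_demand - sensitivity * price)


def profit(price, unit_cost, base_demand=1000.0, sensitivity=40.0):
    """Performance indicator: profit for a given price (the control variable)."""
    return (price - unit_cost) * demand(price, base_demand, sensitivity)


def what_if(prices, unit_cost):
    """Evaluate the indicator for each candidate value of the control variable."""
    return {p: profit(p, unit_cost) for p in prices}


candidate_prices = [10.0, 15.0, 20.0]

# Scenario A: baseline parameter value (unit cost of 5).
baseline = what_if(candidate_prices, unit_cost=5.0)
# Scenario B: what if the unit cost parameter rises to 8?
higher_cost = what_if(candidate_prices, unit_cost=8.0)

for p in candidate_prices:
    print(f"price={p:5.1f}  profit@cost5={baseline[p]:8.1f}  profit@cost8={higher_cost[p]:8.1f}")
```

Comparing the two scenarios side by side shows how a change in one parameter shifts the indicator for every value of the control variable.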
Business Intelligence Architectures
The architecture of a business intelligence system includes three major components.
• Data sources. In a first stage, it is necessary to gather and integrate the data stored in the various primary and
secondary sources, which are heterogeneous in origin and type. The sources consist for the most part of data belonging
to operational systems, but may also include unstructured documents, such as emails and data received from external
providers. Generally speaking, a major effort is required to unify and integrate the different data sources.
• Data warehouses and data marts. Using extraction and transformation tools known as extract, transform, load (ETL),
the data originating from the different sources are stored in databases intended to support business intelligence
analyses. These databases are usually referred to as data warehouses and data marts.
• Business intelligence methodologies. Data are finally extracted and used to feed mathematical models and analysis
methodologies intended to support decision makers. In a business intelligence system, several decision support
applications may be implemented.
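The flow from data sources through ETL into a warehouse can be sketched with the Python standard library. Everything specific here is an assumption for illustration: the two sources, their field names, the sales table and the use of an in-memory SQLite database as a stand-in for a data warehouse.

```python
import csv
import io
import sqlite3

# Source 1 (hypothetical): a CSV export from an operational point-of-sale system.
POS_CSV = """store,item,amount
Rome,coffee,3.50
Rome,tea,2.00
Milan,coffee,4.00
"""

# Source 2 (hypothetical): records received from an external provider.
ONLINE_ORDERS = [
    {"channel": "web", "product": "Coffee", "total_cents": 500},
    {"channel": "web", "product": "Tea", "total_cents": 250},
]


def extract():
    """Extract: read raw records from both heterogeneous sources."""
    pos_rows = list(csv.DictReader(io.StringIO(POS_CSV)))
    return pos_rows, ONLINE_ORDERS


def transform(pos_rows, online_rows):
    """Transform: unify field names, casing and units into one schema."""
    unified = [(r["store"], r["item"].lower(), float(r["amount"])) for r in pos_rows]
    unified += [
        (r["channel"], r["product"].lower(), r["total_cents"] / 100)
        for r in online_rows
    ]
    return unified


def load(rows, conn):
    """Load: store the unified rows in a warehouse table."""
    conn.execute("CREATE TABLE sales (source TEXT, item TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory stand-in for a data warehouse
load(transform(*extract()), conn)

# A simple BI query over the integrated data: total sales per item.
totals = dict(conn.execute("SELECT item, SUM(amount) FROM sales GROUP BY item"))
print(totals)
```

The final query corresponds to the third component: once the data are integrated, analyses can run against one consistent table instead of each source separately.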
Data Mining
• The fourth level includes active business intelligence methodologies, whose purpose is the extraction of information and knowledge from data.
• These include mathematical models for pattern recognition, machine learning and data mining techniques. Unlike the tools described at the
previous level of the pyramid, the models of an active kind do not require decision makers to formulate any prior hypothesis to be later
verified. Their purpose is instead to expand the decision makers’ knowledge.
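As a small illustration of a model that needs no prior hypothesis, the sketch below groups customers by yearly spend with a basic one-dimensional k-means loop. The spend figures are made up, the function name is hypothetical, and k-means is used here only as one simple example of a pattern-recognition technique. The function assumes k is at least 2.

```python
def k_means_1d(values, k=2, iterations=10):
    """Cluster numbers into k groups (k >= 2) with a basic k-means loop."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids over the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical yearly spend per customer.
yearly_spend = [120, 135, 150, 900, 950, 1000, 140, 980]
centroids, clusters = k_means_1d(yearly_spend)
print(centroids)
print(clusters)
```

Nobody told the algorithm that there are "low spenders" and "high spenders"; the two groups emerge from the data, which is the sense in which such models expand the decision makers' knowledge.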
Optimization.
• By moving up one level in the pyramid we find optimization models that allow us to determine the best solution out of a set of alternative
actions, which is usually fairly extensive and sometimes even infinite.
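A minimal sketch of choosing the best solution out of a set of alternatives: pick the combination of marketing campaigns with the highest expected return that fits a budget. The campaigns, costs and returns are hypothetical, and exhaustive search is used only because the example is tiny. Large or infinite sets of alternatives need proper optimization methods instead.

```python
from itertools import combinations

# Hypothetical alternatives: (name, cost, expected return), same currency units.
CAMPAIGNS = [
    ("email", 20, 35),
    ("social", 50, 80),
    ("tv", 90, 120),
    ("print", 40, 45),
]


def best_plan(campaigns, budget):
    """Exhaustively search all subsets and keep the best feasible one."""
    best_names, best_return = (), 0
    for size in range(1, len(campaigns) + 1):
        for subset in combinations(campaigns, size):
            cost = sum(c[1] for c in subset)
            expected = sum(c[2] for c in subset)
            if cost <= budget and expected > best_return:
                best_names = tuple(c[0] for c in subset)
                best_return = expected
    return best_names, best_return


plan, value = best_plan(CAMPAIGNS, budget=110)
print(plan, value)
```

With four campaigns there are only 15 non-empty subsets to check, but the count doubles with every added campaign, which is why realistic problems rely on optimization models rather than enumeration.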
Decisions
• Finally, the top of the pyramid corresponds to the choice and the actual adoption of a specific decision, and in some way represents the natural
conclusion of the decision-making process. Even when business intelligence methodologies are available and successfully adopted, the choice
of a decision pertains to the decision makers, who may also take advantage of informal and unstructured information available to adapt and
modify the recommendations and the conclusions achieved through the use of mathematical models.
Cycle of a business intelligence analysis
Analysis.
• During the analysis phase, it is necessary to recognize and accurately spell out the problem at hand. Decision makers must then create a
mental representation of the phenomenon being analyzed, by identifying the critical factors that are perceived as the most relevant. The
availability of business intelligence methodologies may help already in this stage, by permitting decision makers to rapidly develop various
paths of investigation. For instance, the exploration of data cubes in a multidimensional analysis, according to different logical views, allows
decision makers to modify their hypotheses flexibly and rapidly, until they reach an interpretation scheme that they deem satisfactory. Thus,
the first phase in the business intelligence cycle leads decision makers to ask several questions and to obtain quick responses in an
interactive way.
Insight.
• The second phase allows decision makers to better and more deeply understand the problem at hand, often at a causal level. For instance, if
the analysis carried out in the first phase shows that a large number of customers are discontinuing an insurance policy upon yearly
expiration, in the second phase it will be necessary to identify the profile and characteristics shared by such customers. The information
obtained through the analysis phase is then transformed into knowledge during the insight phase. On the one hand, the extraction of
knowledge may occur due to the intuition of the decision makers and therefore be based on their experience and possibly on unstructured
information available to them. On the other hand, inductive learning models may also prove very useful during this stage of analysis,
particularly when applied to structured data.
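The insurance example can be sketched as a simple profiling step: compute the share of non-renewed policies for each value of a customer attribute. The records, attribute names and function name below are hypothetical and serve only to illustrate how structured data can reveal which characteristics churning customers share.

```python
from collections import defaultdict

# Hypothetical policy records: (age_band, sales_channel, renewed_at_expiry).
POLICIES = [
    ("18-30", "online", False),
    ("18-30", "online", False),
    ("18-30", "agent", True),
    ("31-50", "online", True),
    ("31-50", "agent", True),
    ("31-50", "agent", False),
    ("51+", "agent", True),
    ("51+", "agent", True),
]


def churn_rate_by(records, field_index):
    """Return the share of non-renewed policies for each value of one attribute."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for record in records:
        key = record[field_index]
        totals[key] += 1
        if not record[2]:
            churned[key] += 1
    return {key: churned[key] / totals[key] for key in totals}


by_age = churn_rate_by(POLICIES, 0)
by_channel = churn_rate_by(POLICIES, 1)
print(by_age)
print(by_channel)
```

In this made-up data set, younger customers and online buyers discontinue more often, which is the kind of shared profile the insight phase looks for.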
Decision.
• During the third phase, knowledge obtained as a result of the insight phase is converted into decisions and subsequently into actions. The
availability of business intelligence methodologies allows the analysis and insight phases to be executed more rapidly so that more effective
and timely decisions can be made that better suit the strategic priorities of a given organization. This leads to an overall reduction in the
execution time of the analysis–decision–action–revision cycle, and thus to a decision-making process of better quality.
Evaluation.
• Finally, the fourth phase of the business intelligence cycle involves performance measurement and evaluation. Extensive metrics should then
be devised that are not exclusively limited to the financial aspects but also take into account the major performance indicators defined for
the different company departments.
Enabling factors in business intelligence projects
Some factors are more critical than others to the success of a business intelligence project: technologies, analytics and human resources.
Technologies
• Hardware and software technologies are significant enabling factors that have facilitated the development of business intelligence systems within
enterprises and complex organizations. On the one hand, the computing capabilities of microprocessors have increased on average by 100% every 18
months during the last two decades, and prices have fallen. This trend has enabled the use of advanced algorithms which are required to employ
inductive learning methods and optimization models, keeping the processing times within a reasonable range. Moreover, it permits the adoption of state-
of-the-art graphical visualization techniques, featuring real-time animations. A further relevant enabling factor derives from the exponential increase in
the capacity of mass storage devices, again at decreasing costs, enabling any organization to store terabytes of data for business intelligence systems.
Network connectivity, in the form of extranets or intranets, has also played a primary role in the diffusion within organizations of information and
knowledge extracted from business intelligence systems. Finally, the easy integration of hardware and software purchased from different suppliers, or
developed internally by an organization, is a further relevant factor affecting the diffusion of data analysis tools.
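To get a feel for what "100% every 18 months during the last two decades" adds up to, the compound growth can be worked out directly. This is a sketch that takes the quoted rate at face value. The function name is hypothetical.

```python
def growth_factor(years, doubling_months=18):
    """Total growth when capability doubles (+100%) every doubling_months."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings


factor = growth_factor(20)  # "the last two decades"
print(f"{20 * 12 / 18:.1f} doublings, roughly a {factor:,.0f}-fold increase")
```

Twenty years contain a little over 13 doubling periods, so the cumulative increase is on the order of ten thousand times, which explains why algorithms that were once impractical now run in reasonable time.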
Analytics
• As stated above, mathematical models and analytical methodologies play a key role in information enhancement and knowledge extraction from the
data available inside most organizations. The mere visualization of the data according to timely and flexible logical views, as described in Chapter 3,
plays a relevant role in facilitating the decision-making process, but still represents a passive form of support. Therefore, it is necessary to apply more
advanced models of inductive learning and optimization in order to achieve active forms of support for the decision-making process.
Human resources
• The human assets of an organization are built up by the competencies of those who operate within its boundaries, whether as individuals or collectively.
The overall knowledge possessed and shared by these individuals constitutes the organizational culture. The ability of knowledge workers to acquire
information and then translate it into practical actions is one of the major assets of any organization, and has a major impact on the quality of the
decision-making process. If a given enterprise has implemented an advanced business intelligence system, there still remains much scope to emphasize
the personal skills of its knowledge workers, who are required to perform the analyses and to interpret the results, to work out creative solutions and to
devise effective action plans. All the available analytical tools being equal, a company employing human resources endowed with a greater mental
agility and willing to accept changes in the decision-making style will be at an advantage over its competitors.
Development of a business intelligence system
The development of a business intelligence system can be likened to a project, with a specific final objective,
expected development times and costs, and the usage and coordination of the resources needed to perform the planned activities.
Analysis:
• During the first phase, the needs of the organization relative to the development of a business intelligence system
should be carefully identified.
• This preliminary phase is generally conducted through a series of interviews of knowledge workers performing
different roles and activities within the organization. It is necessary to clearly describe the general objectives and
priorities of the project, as well as to set out the costs and benefits deriving from the development of the business
intelligence system.
Design:
• The second phase includes two sub-phases and is aimed at deriving a provisional plan of the overall architecture,
taking into account any development in the near future and the evolution of the system in the mid-term. First, it is
necessary to make an assessment of the existing information infrastructures. Moreover, the main decision-making
processes that are to be supported by the business intelligence system should be examined, in order to adequately
determine the information requirements. Later on, using classical project management methodologies, the project
plan will be laid down, identifying development phases, priorities, expected execution times and costs, together
with the required roles and resources.
Planning:
• The planning stage includes a sub-phase where the functions of the business intelligence system are defined and described
in greater detail. Subsequently, existing data as well as other data that might be retrieved externally are assessed. This
allows the information structures of the business intelligence architecture, which consist of a central data warehouse and
possibly some satellite data marts, to be designed. Simultaneously with the recognition of the available data, the
mathematical models to be adopted should be defined, ensuring the availability of the data required to feed each model and
verifying that the efficiency of the algorithms to be utilized will be adequate for the magnitude of the resulting problems.
Finally, it is appropriate to create a system prototype, at low cost and with limited capabilities, in order to uncover
beforehand any discrepancy between actual needs and project specifications.
Data visualization is one of the steps of the data science process, which
states that after data has been collected, processed and modeled, it must be
visualized for conclusions to be made.
Context of data visualization – Definition
Data visualization is the graphical representation of information and data. By
using visual elements like charts, graphs, and maps, data visualization tools
provide an accessible way to see and understand trends, outliers, and
patterns in data. Additionally, it provides an excellent way for employees or
business owners to present data to non-technical audiences without
confusion.
In the world of Big Data, data visualization tools and technologies are
essential to analyze massive amounts of information and make data-driven
decisions.
Advantages and Disadvantages of data visualization
Advantages:
● Easy sharing of information.
● Interactive exploration of opportunities.
● Visualization of patterns and relationships.
Disadvantages:
● Biased or inaccurate information.
● Correlation doesn’t always mean causation.
● Core messages can get lost in translation.
Why is data visualization important?
The importance of data visualization is simple: it helps people see, interact with, and
better understand data. Whether simple or complex, the right visualization
can bring everyone onto the same page, regardless of their level of expertise.
What is Tableau?
Tableau is a visual analytics platform that is transforming the way we use data to
solve problems, empowering people and organizations to make the most of
their data.
Tableau helps people and organizations be more data-driven.
Tableau disrupted business intelligence with intuitive, visual analytics for
everyone.
Architecture of Tableau
Features of Tableau
Data Visualization Principles
2. Compare - We need to be able to compare our data visualizations side by side. We can't hold the details of our data visualizations in
our memory - shift the burden of effort to our eyes.
3. Attend - The tool needs to make it easy for us to attend to the data that's really important. Our brains are easily led to pay
attention to either relevant or irrelevant details. Stephen demonstrated this convincingly with a video similar to Daniel Simons' classic
gorilla and ball-passing experiment.
4. Explore - Data visualization tools should let us just look. Not just to answer a specific question, but to explore data and discover
things. Directed and exploratory analysis are equally valid, but we need to be sure that our visualization tool makes both possible.
5. View Diversely - Different views of the same data provide different insights. It helps to be able to look at the same data from
different perspectives at the same time and see how they fit together.
6. Ask why - More than knowing "what's happening", we need to know "why it's happening". This is where actionable results come
from.
7. Be skeptical - We too rarely question the answers we get from our data because traditional tools have made data analysis so hard.
We accept the first answer we get simply because exploring any further is too hard. More powerful tools like Tableau give us the
luxury of asking more questions, as fast as we can think of them.
8. Respond - Simply answering questions for yourself has limited benefit. It's the ability to share our data that leads to global
enlightenment.
Why visualization came into the picture
Our brains are naturally inclined to process visual information more
efficiently than textual or numerical data. Visualization leverages this
strength by representing data visually, using charts, graphs, diagrams, and
other visual elements. This allows us to perceive patterns, trends, and
relationships that might not be immediately apparent in raw data.
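Even a plain-text bar chart shows the effect. The sketch below uses made-up monthly sales figures and a hypothetical helper function. It is not tied to any visualization tool.

```python
# Hypothetical monthly sales figures (units sold).
MONTHLY_SALES = {"Jan": 120, "Feb": 135, "Mar": 90, "Apr": 310, "May": 150}


def text_bar_chart(data, width=40):
    """Render each value as a bar whose length is proportional to the value."""
    largest = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<4}{bar} {value}")
    return lines


chart = text_bar_chart(MONTHLY_SALES)
print("\n".join(chart))
```

In the raw dictionary the April value is just another number, while in the chart its bar is visibly more than twice as long as any other, so the outlier is noticed at a glance.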
Goal of Data visualization
The visual representation of data is more scientific than artistic in our
modern world.
The main goal of data visualization is to communicate information effectively,
efficiently, elegantly, accurately and meaningfully.
It fulfills this objective only if it encodes the given input in such a manner that
our eyes can recognize it and our brain can comprehend it.
One of the main goals of data visualization is to support decision making
through appropriately designed graphical representations of
information.
Visualizing the Past
Different Data Visuals for Different Needs:
There are two common types of visual representations of data. Both are very important and both have
different requirements when it comes to designing great visualizations.
1. Presentation - Uses data visuals to communicate. This type of visual representation has two roles: a
presenter and an audience.
2. Visualization - This is a fairly new term and the idea is to use visuals to think. Here, the experience is
active and involves people trying to answer questions.
1700-1900: Visualization is Transformed:
William Playfair, a Scottish engineer, is widely regarded as the father of statistical presentation.
In 1786 Playfair published a book called the Commercial and Political Atlas, which used graphical
representations of data to describe England’s balance of trade.
One famous example comes from Dr. John Snow, a British physician who used statistical graphics to
deal with London’s cholera epidemic of 1854.
Snow plotted individual cases of cholera as dots on a map of London. These dots showed that the
majority of cases could be traced to a water pump on Broad Street. An investigation of outlying
cases showed they, too, had connections to the Broad Street pump. Once the handle was removed from
the contaminated pump, the cholera epidemic subsided. This shows how the power of
visualization can answer questions and, in this case, even work for the public good. Snow’s map
also works as an effective example of the Presentation style; Snow’s data was strong enough to
persuade city officials to remove the pump handle and quell the outbreak.
Tableau Products:
● Tableau Desktop
● Tableau Server
● Tableau Online
● Tableau Public
● Tableau Reader
● Tableau Mobile
Why use Tableau?
Tableau is a powerful and fast-growing visualization tool that is easy
to use. Unlike Excel and some other visualization tools, it does not require
complex formulas for common tasks. It provides features for cleaning, organizing,
and visualizing data, which makes it easier to create interactive visual analytics
in the form of dashboards. These dashboards make it easier for non-technical
analysts and end-users to turn data into understandable insights.
Values in Tableau
There are two types of values (fields) in Tableau:
Dimensions: Qualitative, categorical values that are used to group and segment data
are called dimensions in Tableau. They are usually discrete. Example: city name,
product name, country name.
Measures: Numeric, quantitative values that can be aggregated (for example summed or
averaged) are called measures in Tableau. They are usually continuous. Example: profit,
sales, discount, population.
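The relationship between the two can be sketched outside Tableau: a measure is aggregated within each value of a dimension, and the aggregation can be changed. The code below is plain Python rather than Tableau's own API, and the rows and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: "city" is a dimension, "sales" is a measure.
ROWS = [
    {"city": "Delhi", "sales": 200},
    {"city": "Delhi", "sales": 300},
    {"city": "Mumbai", "sales": 150},
    {"city": "Mumbai", "sales": 250},
    {"city": "Mumbai", "sales": 500},
]


def aggregate(rows, dimension, measure, func=sum):
    """Group rows by a dimension and aggregate a measure within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    return {key: func(values) for key, values in groups.items()}


total_sales = aggregate(ROWS, "city", "sales")          # sum per city
average_sales = aggregate(ROWS, "city", "sales", mean)  # switch to average
print(total_sales)
print(average_sales)
```

Swapping the aggregation function changes the story the same data tells: Mumbai leads clearly on total sales, while the averages of the two cities are much closer.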
Advantages of Tableau
● Quick calculation- Calculations in Tableau are performed by the backend, so they are relatively fast.
● Interactive dashboards– Tableau dashboards are very interactive and easy to build.
● No manual calculation- Calculations are handled by Tableau itself. No manual calculation is needed,
although in some specific cases calculated fields are used.
● A large amount of data- Tableau can handle a large amount of data. Different types of visualization
can be created from large data sets without significantly impacting the performance of the dashboards.
Disadvantages of Tableau
● High cost- Tableau is a paid visualization tool, which is one reason it is not more widely adopted.
● Static and single-value parameters- Tableau’s parameters are static, and only a single value can be selected
with a parameter at a time. Whenever the data changes, these parameters need to be updated
manually.
● Limited data preprocessing- Tableau is primarily a visualization tool. Tableau Desktop allows only very
basic preprocessing.
Google Data Studio v/s Tableau