Basics of Data Science

Short notes on data science.

Uploaded by

subhihafirdouz
Copyright
© All Rights Reserved

Data Analysis

Dr. Melba Rosalind J


What do you infer?
Data Analysis
● Science is knowledge which we understand so well that we can teach it to a computer.
Everything else is art.
- Donald Knuth’s legendary 1974 essay Computer Programming as an Art
● E.g., the art of songwriting
○ The creative spark is difficult to describe, much less write down, but it’s clearly essential to
writing good songs.
● Developing a useful framework involves characterizing the elements of a data analysis using
abstract language.
● In data analysis, that language is the language of mathematics and statistics.
Role of Data Analyst
● A data analyst must find a way to assemble all of the
tools and apply them to data to answer a relevant question,
that is, a question of interest to people.
What is Data Analysis?

“Data analysis is a highly iterative and non-linear process, better reflected by a
series of epicycles.”

https://youtu.be/llzbfsKLJyg?si=LkqGMpaK2c2Yy7sH
Epicycles of Analysis
★ An epicycle is a small circle whose center moves around the
circumference of a larger circle.
★ In data analysis, the iterative process that is applied to all steps of
the data analysis can be conceived of as an epicycle that is
repeated for each step along the circumference of the entire data
analysis process.
★ Algorithms are final data analysis products that have emerged
from the very non-linear work of developing and refining a data
analysis so that it can be “algorithmized”
Data Analysis

● Data analysis presumes the data have already been collected.

A study, by contrast, is broader. It includes:

● the development of a hypothesis or question,
● the design of the data collection process (or study protocol),
● the collection of the data, and
● the analysis and interpretation of the data.
5 core activities of data analysis:

1. Stating and refining the question
2. Exploring the data
3. Building formal statistical models
4. Interpreting the results
5. Communicating the results
“The epicycle of data analysis”

Each activity is carried out by iterating through a 3-step process:

1. Setting expectations,
2. Collecting information (data) and comparing the data to your expectations, and
3. If the data and your expectations don’t match, revising your expectations or
fixing the data so that your data and your expectations match.
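
The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `epicycle`, the restaurant-bill numbers, and the tolerance are hypothetical and are not part of the slides, and "revising your expectations" is reduced here to adopting the observed average.

```python
from statistics import mean


def epicycle(expectation, data, tolerance):
    """One pass of the 3-step epicycle on a single numeric summary.

    Returns (expectation_after_pass, matched).
    """
    # Step 2: collect information and compare it to the expectation.
    observed = mean(data)
    matched = abs(observed - expectation) <= tolerance
    if matched:
        return expectation, True
    # Step 3: data and expectation disagree, so revise the expectation.
    # (The alternative branch, fixing the data, is not modeled here.)
    return observed, False


# Hypothetical data: the cost of five restaurant meals.
bills = [28.0, 35.5, 31.0, 40.0, 33.5]

# Step 1: set an expectation before looking at the data.
expectation = 20.0

first_pass, first_match = epicycle(expectation, bills, tolerance=5.0)
second_pass, second_match = epicycle(first_pass, bills, tolerance=5.0)
print(first_match, second_match)
```

With these numbers the first pass does not match (the average bill is 33.6 against an expectation of 20), so the expectation is revised; the second pass then matches. In a real analysis the same loop is repeated for every core activity, not only for a single summary number.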
Data Analysis: defined in different terms
➔ Data analysis is the process of systematically inspecting,
cleaning, transforming, and modeling data to extract
meaningful insights, draw conclusions, and support
decision-making.
➔ It involves using statistical and logical techniques to
understand data, identify patterns, and inform decisions
across various fields such as business, healthcare, and
research.
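
A minimal sketch of the inspect, clean, transform, and model steps named above, using only the Python standard library. The data set (hours studied versus exam score) and all variable names are hypothetical, and a straight-line least-squares fit stands in for "modeling"; real analyses typically use richer tools and models.

```python
from statistics import mean

# Hypothetical raw records: (hours_studied, exam_score).
# None marks a missing value.
raw = [(1, 52), (2, None), (3, 61), (4, 68), (None, 70), (5, 71)]

# Inspect: count records with at least one missing field.
n_missing = sum(1 for hours, score in raw if hours is None or score is None)

# Clean: keep only complete records.
clean = [(h, s) for h, s in raw if h is not None and s is not None]

# Transform: express scores as fractions of 100.
xs = [float(h) for h, _ in clean]
ys = [s / 100 for _, s in clean]

# Model: ordinary least-squares line y = intercept + slope * x.
x_bar, y_bar = mean(xs), mean(ys)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

print(f"dropped {n_missing} incomplete records")
print(f"score fraction ~ {intercept:.3f} + {slope:.3f} * hours")
```

The positive slope is the "insight" in this toy case: in the cleaned sample, more hours go with higher scores. Whether that supports a decision depends on the question being asked, which is why stating and refining the question comes first.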
Fascinating Facts
● 90% of the world's data has been generated in just
the last few years, with around 2.5 quintillion
bytes created every day.
● Less than 0.5% of all data created is ever analyzed
and used—the vast majority remains untouched.
● 80%–90% of digital content is unstructured, such
as emails, social media posts, and videos, making
it challenging for businesses to analyze.
● It would take an average internet user 181 million
years to download all the information on the
internet.
● An average internet user generated about 1.7 MB
of data per second in 2023, totaling nearly 147 GB
per day.
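
The per-day figure in the last bullet follows directly from the per-second rate, and the "2.5 quintillion bytes" figure can be restated in exabytes. The short check below assumes decimal units (1 GB = 1,000 MB, 1 EB = 10^18 bytes); the constant names are illustrative.

```python
# Check: 1.7 MB per second, sustained for one day.
MB_PER_SECOND = 1.7
SECONDS_PER_DAY = 24 * 60 * 60          # 86,400 seconds

mb_per_day = MB_PER_SECOND * SECONDS_PER_DAY
gb_per_day = mb_per_day / 1000          # decimal gigabytes

# Restate 2.5 quintillion bytes (2.5 * 10**18) in exabytes.
BYTES_PER_DAY_WORLD = 2.5e18
exabytes_per_day = BYTES_PER_DAY_WORLD / 1e18

print(f"{gb_per_day:.2f} GB per user per day")
print(f"{exabytes_per_day:.1f} EB created worldwide per day")
```

The result, about 146.88 GB, rounds to the "nearly 147 GB per day" quoted above.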
Trivia

● Google uses about 1,000 computers to answer a single search query.
● 1 billion pieces of content are shared on Facebook every day.
● A stack of CD-ROMs equal to the world’s digital storage would reach 80,000 km
beyond the moon.
● There are nearly as many bits of digital data as there are stars in the universe.
● 80% of a data scientist’s work is cleaning and preparing data; only 20% is actual
analysis.
Fun & Surprising news

● AI text-generation models have been trained to write
new Harry Potter novels, showing the creative side of
data science.

● The City of Chicago used data analysis to predict
which restaurants were likely to violate sanitation
rules, enabling inspectors to find violators a week
earlier on average.

● AI-powered bees are being developed for crop
pollination and climate monitoring, blending data
science with environmental innovation.
References

15 Astonishing Tweetable Facts About Analytics - DataScienceCentral.com
