Data Analytics Assignment - Simplified Explanations
1. What is data analytics? Key components and Evolution of data analytics
Data analytics is the process of examining datasets to extract useful insights and trends. Key
components include data collection, processing, analysis, and visualization. It has evolved
from basic statistical analysis to advanced AI-driven techniques, enabling faster and more
accurate decision-making.
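The four components above can be sketched as a tiny pipeline. This is a minimal illustration only: the sales records, the function names (process, analyze, visualize), and the text-based chart are assumptions made up for the example, not part of any particular tool.

```python
from statistics import mean

# Collection: hypothetical raw sales records (values invented for illustration).
raw_records = [
    {"region": "north", "sales": "120"},
    {"region": "south", "sales": "80"},
    {"region": "north", "sales": "100"},
    {"region": "south", "sales": ""},  # record with a missing sales figure
]

def process(records):
    """Processing: drop records with no sales figure and convert text to numbers."""
    return [
        {"region": r["region"], "sales": float(r["sales"])}
        for r in records
        if r["sales"].strip()
    ]

def analyze(records):
    """Analysis: compute the average sales per region."""
    by_region = {}
    for r in records:
        by_region.setdefault(r["region"], []).append(r["sales"])
    return {region: mean(values) for region, values in by_region.items()}

def visualize(summary, scale=10):
    """Visualization: a plain-text bar chart, one '#' per `scale` units."""
    return [
        f"{region:<6} {'#' * int(value // scale)} {value:.1f}"
        for region, value in sorted(summary.items())
    ]

summary = analyze(process(raw_records))
for line in visualize(summary):
    print(line)
```

In practice each step would usually be handled by dedicated tools (databases, analysis libraries, dashboards), but the order of the steps stays the same.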
2. What are the differences among data analytics, data science, and data
engineering? Use cases for each.
Data analytics focuses on analyzing existing data to gain insights. Data science combines
analytics, programming, and machine learning to build predictive models. Data engineering
handles the design and maintenance of the systems that store and process large datasets.
Example use cases: customer trend analysis for data analytics, recommendation systems for
data science, and robust data pipelines for data engineering.
3. Why is Oracle Big Data required? Explain how Oracle Big Data works.
Oracle Big Data is required to handle large-scale data processing and analysis efficiently. It
integrates big data technologies like Hadoop and Spark to process structured and
unstructured data. It works by combining these tools with Oracle’s database and analytics
capabilities to deliver insights quickly.
4. What is data, information, knowledge, and experience? Explain various types
of data with examples.
Data are raw facts (e.g., individual temperature readings). Information is processed data
(e.g., the average temperature for a day). Knowledge is understanding gained from
information (e.g., recognizing patterns in temperature changes). Experience is expertise
developed from applying knowledge over time. Types of data include: structured (e.g., rows
in a database table), unstructured (e.g., videos), and semi-structured (e.g., JSON files).
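The temperature example and the three data types can be illustrated with a short sketch. The readings, sensor name, and note text are hypothetical values chosen for the example.

```python
import json
from statistics import mean

# Data: raw facts. Hypothetical temperature readings in degrees Celsius.
readings = [18.0, 19.5, 22.0, 24.5, 23.0]

# Information: processed data, here the average temperature.
average = mean(readings)

# Knowledge: a pattern recognized in the information, here a simple
# trend check comparing the last reading with the first.
trend = "warming" if readings[-1] > readings[0] else "cooling or stable"

# Structured data: fixed fields in a fixed order, like a row in a database table.
structured_row = ("sensor-1", "2024-01-01", 18.0)

# Semi-structured data: self-describing JSON whose fields can vary per record.
semi_structured = json.loads('{"sensor": "sensor-1", "tags": ["roof", "outdoor"]}')

# Unstructured data: no predefined schema, such as free text (or video bytes).
unstructured = "Technician note: sensor felt warm to the touch this afternoon."

print(average, trend, semi_structured["tags"])
```

Experience has no direct equivalent in code: it corresponds to a person knowing, from repeated use, which of these patterns actually matter.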
5. What are the most common data quality issues? How to address them?
Common data quality issues include missing values, duplicate records, inconsistent formats,
and outliers. Addressing them involves data cleaning techniques: imputation for missing
values, deduplication for duplicates, and standardization for formats. Outliers can be
detected with statistical or range checks and then corrected, capped, or removed.
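A minimal sketch of these cleaning steps, using only the Python standard library. The customer records, the function names, and the plausible age range of 0 to 120 are assumptions for illustration. Treating an outlier as a missing value and then imputing it is one possible choice, not the only one.

```python
from statistics import median

# Hypothetical customer records showing each issue named above.
records = [
    {"id": 1, "name": " Alice ", "age": 30},
    {"id": 2, "name": "BOB", "age": None},     # missing value
    {"id": 1, "name": " Alice ", "age": 30},   # duplicate record
    {"id": 3, "name": "carol", "age": 34},
    {"id": 4, "name": "Dave", "age": 240},     # outlier
]

def standardize(rows):
    """Standardization: trim whitespace and use one capitalization style."""
    return [{**r, "name": r["name"].strip().title()} for r in rows]

def deduplicate(rows):
    """Deduplication: keep the first record seen for each id."""
    seen, unique = set(), []
    for r in rows:
        if r["id"] not in seen:
            seen.add(r["id"])
            unique.append(r)
    return unique

def flag_outliers(rows, low=0, high=120):
    """Outliers: mark ages outside an assumed plausible range as missing."""
    def check(age):
        return age if age is None or low <= age <= high else None
    return [{**r, "age": check(r["age"])} for r in rows]

def impute(rows):
    """Imputation: fill missing ages with the median of the known ages."""
    known = [r["age"] for r in rows if r["age"] is not None]
    fill = median(known)
    return [{**r, "age": fill if r["age"] is None else r["age"]} for r in rows]

cleaned = impute(flag_outliers(deduplicate(standardize(records))))
for row in cleaned:
    print(row)
```

The order matters here: standardizing first makes duplicates easier to match, and flagging outliers before imputation keeps extreme values from distorting the fill value.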