1) what is pivot table? explain different areas of pivot table in simple?
A pivot table is a powerful data analysis tool used in spreadsheet software, like Microsoft Excel or
Google Sheets. It allows you to summarize and manipulate large sets of data, making it easier to
understand and extract insights from your information.
A pivot table consists of several key areas:
Rows Area: This is where you define the criteria by which you want to group and categorize your
data. Each unique value in this area represents a row in your final pivot table.
Columns Area: Similar to the Rows Area, this area lets you categorize your data by different columns.
The intersection of a row and a column is where your data will be summarized.
Values Area: In this section, you place the data you want to summarize or analyze. It could be
numbers like sales, quantities, or any other measurable values. The values in this area are typically
aggregated using functions like sum, average, count, etc.
Filters Area: This area allows you to apply filters to your data, narrowing down what is shown in your
pivot table. You can use this to focus on specific subsets of your data for analysis.
2) what are slicers? give a use case? in excel
slicers are visual controls that provide an interactive way to filter and analyze data within a pivot
table or pivot chart. They are particularly useful for quickly exploring and filtering data without
having to modify the pivot table settings directly. Slicers offer a user-friendly interface that allows
users to filter data by selecting values from a list, making data analysis more intuitive and efficient.
use case: use health insurance dataset which you have done in project.
3) Difference between count, count A and count IF? (Answered)
COUNT: The COUNT function simply counts the number of cells within a range that contain numbers
or numerical values.
It doesn't consider text, empty cells, or cells containing errors.
Syntax: = COUNT(range)
COUNTIF: The COUNTIF function counts the number of cells within a range that meet a specific
condition or criteria.
It can be used with both numerical and text data.
Syntax: =COUNTIF(range, criteria)
COUNTA: The COUNTA function counts the number of non-empty cells within a range.
It includes cells with text, numbers, errors, and any other non-blank content.
Syntax: =COUNTA(range)
Example Scenarios: Let's consider a dataset with the following values in a column:
A1: 10
A2: Apple
A3: 25
A4: Banana
A5: 15
A6: 30
COUNT:
=COUNT(A1:A6) would result in 4, as there are four numerical values (10, 25, 15, 30).
COUNTIF:
=COUNTIF(A1:A6, "<>Apple") would result in 3, as there are three values that are not "Apple."
COUNTA:
=COUNTA(A1:A6) would result in 6, as there are six non-empty cells.
4) What is the named range? (Answered)
A named range in Excel is a user-defined name that represents a specific cell, range of cells, formula,
constant value, or table. Instead of referring to cells using cell references like A1 or B3, you can assign
a meaningful name to a cell or range, making it easier to understand and manage your formulas,
functions, and data.
5) Given sample data, use the inbuilt Excel function to split the employee name into first name and
last name. (Answered)
Practice it in excel
6) Demostrate how to get descriptive statistics of the data. (Answered)
descriptive statistics for a set of data in Excel. Descriptive statistics help you understand the central
tendency, variability, and distribution of your data.
PRACTICE IT IN EXCEL
PYTHON
7) What is a lambda function? (Answered)
A lambda function is a small anonymous function. A lambda function can take any number of
arguments, but can only have one expression.
Syntax : lambda arguments: expression
square = lambda x: x * x
print(square(5)) # Output: 25
8) What is list comprehension? (Answered)
when you want to create a new list based on the values of an existing list.
syntax
new_list = [expression for element in iterable if condition]
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = [x for x in fruits if "a" in x]
print(newlist)
9) Convert the two lists given into a tuple of pairs of elements from both the lists given.(Answered)
list_1= [1, 2, 3, 4]
list_2 = [0, 5, 2, 1, 4]
required output: ((1, 0), (2, 5), (3, 2), (4, 1))
practice in python
10) Given two lists, and asked to get the common elements between them as an output. (Answered)
Practice in python
SQL
11) Display the details of the employees who got hired between Jan and March month. (Answered)
Practice in sql
12) Display the details of employee who has 10th highest salary? (Answered)
Practice in sql
13) What are the constraints of SQL? (Answered)
Constraints are used to limit the type of data that can go into a table. This ensures the accuracy and
reliability of the data in the table. If there is any violation between the constraint and the data
action, the action is aborted.
Constraints can be column level or table level. Column level constraints apply to a column, and table
level constraints apply to the whole table.
The following constraints are commonly used in SQL:
NOT NULL - Ensures that a column cannot have a NULL value
UNIQUE - Ensures that all values in a column are different
PRIMARY KEY - A combination of a NOT NULL and UNIQUE. Uniquely identifies each row in a
table
FOREIGN KEY - Prevents actions that would destroy links between tables
CHECK - Ensures that the values in a column satisfies a specific condition
DEFAULT - Sets a default value for a column if no value is specified
CREATE INDEX - Used to create and retrieve data from the database very quickly
14) What are the Primary key and foreign key? ( Answered)
Primary key: A primary key is a unique identifier for each record in a table. It uniquely identifies each
row and ensures that there are no duplicate values in the primary key column.
It can’t accept null or duplicates values.
foreign key: A foreign key is a field in one table that is linked to the primary key of another table. It
establishes a relationship between two tables, indicating that values in the foreign key column(s) of
one table must match values in the primary key column(s) of the related table.
Candidate Key: A candidate key is a column or set of columns that could potentially become the
primary key. It meets the requirements of being unique and able to identify records uniquely.
Unique Key: A unique key ensures that the values in the specified column(s) are unique across the
table.
Unlike a primary key, a table can have multiple unique keys, but only one primary key.
Alternative key: candidate keys that are not currently selected as the primary key of a table. These
are the secondary candidate keys that can be used to uniquely identify.
Super key: A super key is a set of one or more columns that can uniquely identify records. It might
include additional columns beyond what's required for a primary key.
15) What is Normalization?
Normalization is a process in database design that helps organize and structure data in a way that
reduces redundancy and improves efficiency. In simple terms, it's like tidying up your data to avoid
repeating the same information and to make it easier to work with.
We can split large table into smaller table
Tableau
16) What are measured and dimensions?
Dimensions:
Dimensions are categorical or qualitative attributes that provide context to your data. They are often
used for grouping and categorizing data points.
Dimensions represent things like categories, labels, names, dates, or geographic locations.
Examples: of dimensions include product names, customer names, dates, regions, and departments.
Measures:
Measures are quantitative or numerical values that can be aggregated and analyzed mathematically.
They represent the data you want to analyze and visualize.
Measures are used for calculations, summarizations, and performing mathematical operations.
Examples of measures include sales revenue, profit, quantity sold, temperature, and average score.
17) What is the difference between joining and blending?
Data Blending: Combining the data from two or more different sources is Data Blending. We can combine
data between two more variety of sources such as Oracle, Excel, SQL Server and others.
In Data Blending, each data source contains its own set of Dimensions and Measures.
Data Joining: Combining the data between two or more tables or sheets within the same Data Source is
Data Joining.
In Data Joining, all the combined tables or sheets contains common set of Dimensions and Measures.
S.
Data Joining Data Blending
No.
Used when the dataset is from a different data
1 Used when the data set is from the same source
source
Ability to use different types of join (left join, right
2 Uses only left join.
join, inner join and full outer join)
Data has to be maintained at the same level of Data can be available in different level of
3
granularity granularity.
Sends separate query to each dataset, aggregates
4 Joins data at a row-level
and then performs blending
18) What is a dashboard?
A dashboard is like a digital snapshot of important information. It shows key data and numbers using
easy-to-understand charts, graphs, and tables and understand important facts and figures quickly,
making it easier to make decisions.
19) what is story?
a "story" refers to a sequence or narrative created using data visualizations, charts, and insights to
effectively communicate a message or present a series of findings.
20) what is parameter?
A parameter is a flexible value you can change to explore different outcomes in your data analysis,
without altering the actually data.-