Snowflake Interview Questions

Snowflake is a cloud data warehouse platform that offers flexible data storage, processing, and analytics without the need for hardware management. It features unique capabilities like auto-scaling, zero copy cloning, time travel, and a separation of storage and compute. Snowflake supports various file formats, provides robust data security, and is designed for online analytical processing (OLAP) rather than transaction processing (OLTP).



1. What is Snowflake?
Snowflake is a cloud data warehouse platform built on top of a public cloud (AWS, Azure, or GCP) for storing and retrieving data. It enables data storage, processing, and analytics solutions that are faster, easier to use, and far more flexible than traditional offerings. It is a true software-as-a-service (SaaS) offering: there is no hardware or software for you to install, configure, or manage, and all ongoing maintenance, management, and tuning is handled by Snowflake.
2. What is unique about the Snowflake cloud data warehouse?
Auto-scaling, zero-copy cloning, time travel, multi-cluster virtual warehouses; all data is
compressed, encrypted, and stored in columnar format.
3. Why is Snowflake a 5-star cloud data warehouse?
1. True separation of storage and compute
2. Zero-copy cloning
3. Data sharing
4. Memory cache
5. Time travel
4. PaaS (platform as a service) vs SaaS (software as a service)?
Platform as a service (PaaS) provides hardware and software tools over the internet that
people use to develop applications. Examples: Windows Azure, Google App Engine.
Software as a service (SaaS): software available over the internet for which users pay a
monthly or yearly fee. Examples: Dropbox, Salesforce, GoToMeeting, Cisco WebEx.
Infrastructure as a service (IaaS): cloud-based, pay-as-you-go services such as storage,
networking, and virtualization. Examples: AWS, Azure, and GCP.
5. Snowflake query performance tuning?
1. Avoid scanning files while loading data into Snowflake with the COPY INTO command:
when transferring data from an on-premise system, stage it in cloud storage first and
then use COPY INTO to load it into Snowflake.
2. Choose a sensible virtual warehouse size: when loading large data files into Snowflake,
it is tempting to scale up to a bigger virtual warehouse to speed up the load, but
scaling up the warehouse has little performance benefit for a single large file.

The best approach is to split a large file (e.g. a 10 GB CSV) into many smaller files
(e.g. 100 files of 100 MB each) to make use of Snowflake's automatic parallel execution.

3. Select only required columns instead of all columns while querying.


4. Define a clustering key when you're working with huge volumes of data.
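The loading advice above can be sketched as follows; the stage, table, and file-format names are illustrative, and the PATTERN assumes the large file was pre-split into parts:

```sql
-- Define a reusable CSV file format (names are illustrative).
CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = 'CSV'
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  SKIP_HEADER = 1;

-- Load the pre-split files in parallel from a named stage.
COPY INTO sales
  FROM @my_stage/sales/
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
  PATTERN = '.*sales_part_[0-9]+\\.csv';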
6. Difference between the Snowflake database and other databases (e.g. Oracle)?

Feature: Oracle vs Snowflake
1. Workload: transactional or data warehouse vs pure data warehouse.
2. Deployment: on premise vs on a public cloud such as AWS, Azure, or GCP.
3. Cost: fixed charge per user vs flexible, usage-based charging.
4. High availability and disaster recovery: manual configuration vs automatic scaling up and down.
5. Installation: requires hardware and an OS vs none.
6. Upgrades and patches: upgrades may require downtime vs applied automatically and transparently.
7. Management overhead: significant vs almost zero.
8. Partitions: manually defined partitions and sub-partitions vs automatic (micro-partitions).
9. Indexes: manually created and maintained vs none.
10. Statistics: manually managed and not recommended for large volumes of data vs automatically captured, nothing to manage.

7. Is Snowflake OLAP or OLTP?

Snowflake is designed as an OLAP (online analytical processing) database system, not an OLTP
(online transaction processing) system. In OLTP, data is collected, stored, and processed from
real-time transactions; in OLAP, complex queries are used to evaluate aggregated historical
data from OLTP systems. Snowflake is not designed to handle frequent inserts and updates of
small amounts of data like a traditional transactional database.
8. Snowflake vs Redshift?
Snowflake charges separately for compute and storage, whereas Redshift combines both.
Snowflake's support for JSON-based functions and queries is better than Redshift's.
Snowflake has more automated maintenance than Redshift.
9. Where is metadata stored in Snowflake?
Snowflake automatically generates metadata for files in internal or external stages. It is
stored in virtual columns and can be queried with a SELECT statement.
10. What is a materialized view?
A materialized view is a pre-computed data set: the results of a query are stored like a table
so they can be read without re-running the query.
11. What is a view?
A view is a virtual relation that behaves like an actual relation in the database. It is used
for combining, segregating, and protecting data. A view's results are never stored anywhere.
12. Materialized view vs view?
A view is a virtual relation that behaves like an actual relation in the database, whereas a
materialized view provides indirect access to table data by storing the results of a query.
13. Snowflake account can be hosted on?
AWS, Microsoft AZURE and google GCP
14. When I host a Snowflake account on AWS, what will it use?
It will use AWS resources to store and process the data.
15. Your company has raw data stored in AWS S3 in the US East (N. Virginia) region. If you are
hosting your Snowflake account on AWS, which region do you choose when creating the Snowflake
account?
US East (N. Virginia)
16. What is the correct account name used to login through snowsql ?
Tia73803.us.east-1
17. When you create a database, which default schemas are created?
PUBLIC and INFORMATION_SCHEMA
18. Every query execution in Snowflake has its own?
Query ID
19. Database consists of?

Tables, views, schema, stages, file format, sequences and pipes


20. Can I compile a query before execution?
No
21. A large warehouse will have?
8 servers running in the background
22. Will executing `DESC TABLE EMPLOYEE` use a warehouse to get the result?
No; it is answered from metadata.
23. Can a virtual warehouse be assigned to only one user?
No
24. Multi-cluster warehouses enable you to scale compute resources to manage your user and
query concurrency needs as they change, such as during peak and off hours?
Yes.
25. If I am using a multi-cluster warehouse, will Snowflake automatically spin down unused
clusters once my load decreases?
Yes
26. When you enable maximized mode, what will happen?
Snowflake starts all the clusters so that maximum resources are available while the warehouse
is running
27. Your data load varies during your ETL or ELT process. Which cluster mode do you prefer to use?
Auto-scale mode
28. Which mode comes with SCALING POLICY?
Auto scale mode
29. Which scaling policy is chosen by default?
Standard
30. Which scaling policy prevents/minimizes queuing by favoring starting additional clusters
over conserving credits?
Standard
31. If you want to save cost and utilize your clusters to the maximum level before spinning up
new ones, which scaling policy will you use?
Economy
32. Snowflake supported file formats?
Snowflake supports structured and semi-structured file formats.
Structured file formats: CSV (delimited text)
Semi-structured file formats: JSON, XML, Parquet, ORC, and Avro
33. How to delete records in a Snowflake table?
DELETE FROM <table_name> WHERE <condition>;
34. How to delete duplicate records in a Snowflake table?
If entire rows are duplicated, a very simple approach that doesn't need any temporary tables
is to recreate the table from its distinct rows. It works nicely for small tables but might
not be the best approach for large tables.

If you have a primary key, keep one row per key value and remove the rest.
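A minimal sketch of both approaches; the table and column names (`employee`, `id`, `load_ts`) are illustrative:

```sql
-- Entire rows duplicated: rebuild the table from its distinct rows.
CREATE OR REPLACE TABLE employee AS
SELECT DISTINCT * FROM employee;

-- With a primary key (id): keep the latest row per key using QUALIFY.
INSERT OVERWRITE INTO employee
SELECT * FROM employee
QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY load_ts DESC) = 1;
```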

35. How to check the retention period?
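A sketch of checking the retention parameter (table name illustrative):

```sql
SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS' IN TABLE employee;
```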

36. How to change the retention period?
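A sketch of changing the retention period (object names illustrative):

```sql
-- Table level:
ALTER TABLE employee SET DATA_RETENTION_TIME_IN_DAYS = 30;

-- The same parameter can also be set at the schema, database, or account level:
ALTER DATABASE mydb SET DATA_RETENTION_TIME_IN_DAYS = 30;
```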

37. What is Snowflake's architecture called?

Multi-cluster shared data architecture
38. What is the problem with shared-disk architecture?
Scalability is limited, it is hard to maintain data consistency across the cluster, and
communication with the shared disk becomes a bottleneck.
39. What is the problem with shared-nothing architecture?
If a node needs to refer to data that lives on a different node, the data must be shuffled
between nodes. If a node fails, the data stored on its disk must be recovered and
redistributed among the remaining nodes. Such failures impact the performance of the whole
system.
40. In a shared-nothing architecture, each node is only responsible for?
Each node is responsible for storing its own data on its local disk.
41. Big problem with shared-nothing architecture?
Compute and storage resources are tightly coupled with each other.
42. At the hardware level, what problems do we have with shared-nothing architecture?
Homogeneous hardware and heterogeneous workload
Membership change
Software and hardware upgrades
43. What are the layers in snowflake?
Data storage or database layer
Virtual warehouse or processing layer
Cloud services or management layer
44. Which are the services provided by the cloud services layers in snowflake?
Authentication and access control

Metadata information storage


Query optimizations
Transactions
Security
45. How does Snowflake process queries?
Using the virtual warehouse layer
46. How to change/set the retention period?
You can set the retention period to up to 90 days (Enterprise Edition and higher; the maximum
on Standard Edition is 1 day).

47. How to disable time travel in snowflake for a table?
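Setting the retention period to zero disables Time Travel for the table (table name illustrative):

```sql
ALTER TABLE employee SET DATA_RETENTION_TIME_IN_DAYS = 0;
```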

48. Time travel SQL extensions?

The AT or BEFORE clause, which can be specified in a SELECT statement, with:
TIMESTAMP
OFFSET (time difference in seconds from the present time)
STATEMENT (a query ID)
49. How is data stored in Snowflake?

In an internally optimized, compressed, columnar format
50. Data storage layer can be accessed directly without web console?
No. The data objects stored by Snowflake are not directly visible nor accessible by customers;
they are only accessible through SQL query operations run using Snowflake.
51. How to disable cached results (the Snowflake query result cache)?

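The result cache can be turned off for the current session:

```sql
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
```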
52. What does the compute layer (virtual warehouse layer) do in Snowflake?
It manages query execution and DML operations within elastic clusters of virtual machines.
53. What does the service layer (cloud services layer) do in Snowflake?
It is a collection of services, often called the brain of the system. It manages the virtual
warehouses on which queries execute, as well as metadata, transactions, authentication and
access control, security, and query optimization.
54. What are the different ways to access the snowflake cloud data warehouses?
SnowSQL CLi
ODBC drivers
JDBC drivers
Web user interface
Python libraries
55. What are the data security features in snowflake?
Snowflake encrypts all customer data by default with end-to-end encryption (E2EE), using the
latest security standards, at no additional cost. Snowflake provides best-in-class key
management, which is entirely transparent to customers.
56. How does data compression work in Snowflake?

All data is compressed by default in Snowflake. Snowflake chooses the best compression
algorithm; it is not configurable by the user. Snowflake charges for storage after data
compression is applied.
57. What are the benefits of snowflake compression?
Storage costs lower than native cloud storage due to compression
No storage costs for on disk caches
Near zero storage overhead for data cloning or data sharing
58. Different types of caching in snowflake?
Query result cache, Virtual warehouse cache and metadata cache.
59. Snowflake cache layers?
1. Result cache: stores the results of all queries executed in the past 24 hours.
2. Local disk cache (virtual warehouse cache): whenever data is needed for a given query it is
retrieved from remote disk storage and cached in the SSD and memory of the virtual
warehouse. This data remains as long as the virtual warehouse is active.
3. Remote disk: holds the long-term storage. This level is responsible for data resilience,
which in the case of Amazon Web Services means 99.999999999% (eleven nines) durability,
even in the event of an entire data center failure.
60. What is snowflake cache?
A cache is a temporary storage area.
61. Different states of a Snowflake virtual warehouse, and the difference between cold, warm,
and hot virtual warehouses?
1. Run from a cold virtual warehouse: when all your virtual warehouses are suspended and you
run a query, Snowflake starts a new virtual warehouse (with no local disk cache) and
executes the query.
2. Run from a warm virtual warehouse: your virtual warehouse has been active for a while and
has processed a few queries. If you disable result caching and repeat a query, it makes
use of the local disk cache populated earlier, which is termed warm caching.
3. Run from a hot virtual warehouse: you repeat the query with result caching switched on.
The results are served entirely from the result cache, which is the most efficient
operation of the three.
62. Is there a cost associated with time travel in Snowflake?
Yes, for maintaining historical data during both the Time Travel and Fail-safe periods.
63. Time travel vs Fail-safe in Snowflake?
If you perform operations on a Snowflake table and want to get back an older version of it,
you can use the Time Travel feature; the retention period set for the table determines how
far back you can travel.

Once the table's retention period is over, the data moves into Fail-safe, a continuous backup
provided by Snowflake where it is kept for a further 7 days. No user operations are allowed on
Fail-safe data; only Snowflake can retrieve it for us, in case of disasters.
64. What is zero-copy cloning in Snowflake?
Cloning, also referred to as zero-copy cloning, creates a copy of a database, schema, or table
without duplicating the associated storage files on disk. Zero-copy cloning saves you storage cost.
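A minimal sketch (object names illustrative); the clone shares the source's storage until either side changes:

```sql
-- Clone a table.
CREATE TABLE employee_clone CLONE employee;

-- Clones also work at the schema and database level:
CREATE DATABASE dev_db CLONE prod_db;
```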

65. What is horizontal scaling vs vertical scaling in snowflake?


Horizontal Scaling to increase concurrency - When you have to support additional users, you can
use auto scaling and increase the number of virtual warehouses to support and fulfil more user
queries.
Vertical Scaling to reduce processing times - When you have huge workloads and if you want to
optimize it and make it run faster, you can consider choosing a bigger virtual warehouse size

66. In case I do some DDL operation on the cloned table, will that not affect my original table?
No.
67. If there are some changes to original tables data like insert/update of rows will it
automatically get reflected in the cloned data as well?
No
68. If the original table's data is deleted, will the cloned table still be able to see that data?
Yes. The cloned table still has the original data.
69. Since zero-copy cloning is a copy of the database, schema, or table, do we pay for storage
twice or only once?
There is no additional storage cost for the zero-copy clone, as all the data is shared from the
source version. The only additional cost is for any changed data blocks, which are local to
the cloned copy.
70. What type of tool is Snowflake, or is Snowflake an ETL tool?
Yes; Snowflake supports an ETL/ELT workflow that consists of three steps.
1. Extract data. Data files in multiple formats such as JSON, CSV, and XML are supported.
2. Load the data into an external or internal stage.
3. Copy the data into a Snowflake table using the COPY INTO command.
71. How is data stored in Snowflake?
Snowflake stores data in multiple micro-partitions, which are internally optimized and
compressed. The data is stored in columnar format.
72. What kind of database is Snowflake?
Snowflake is fundamentally a SQL (Structured Query Language) database. Data is stored in
columns, and it is compatible with other tools, including Excel and Tableau. As a SQL
database, Snowflake provides a query tool, supports multi-statement transactions, provides
role-based security, etc.
73. Snowflake editions?
1. Standard edition: provides full access to Snowflake's standard features.
2. Enterprise edition: provides all the features and services of Standard Edition, with
additional features designed specifically for the needs of large-scale enterprises and
organizations.
3. Business Critical edition: formerly known as Enterprise for Sensitive Data (ESD); provides
higher levels of data protection to support organizations with sensitive data.
4. Virtual Private Snowflake (VPS): provides the highest level of security for organizations
dealing with financial activities.

74. What is a Snowflake cluster?

The process of grouping records within micro-partitions is called clustering in Snowflake.
Snowflake automatically groups data into micro-partitions; you can also influence this
grouping by defining a clustering key on a table. The method by which Snowflake maintains
clustered data in a table is called re-clustering.
75. What is a schema?
Schemas and databases are used for organizing data stored in Snowflake. A schema is a logical
grouping of database objects such as tables and views.
76. What kind of SQL does Snowflake use?
It supports the most common standardized version of SQL, i.e., ANSI, for powerful relational
database querying.
77. Snowflake data types?
Numeric data types
String and binary data types
Logical data types
Date and time data types
Semi-structured data types
78. Semi-structured data types?
VARIANT, OBJECT, and ARRAY
VARIANT: to convert a value to the VARIANT data type, use TO_VARIANT(<expr>) or
<expr>::VARIANT.

OBJECT: a collection of key-value pairs, where each key is a non-empty string and each value
is of VARIANT type. Snowflake does not currently support explicitly-typed objects.
ARRAY: sparse arrays of arbitrary size, where the index is a non-negative integer and values
are of VARIANT type. Snowflake does not currently support fixed-size arrays.
79. Snowflake VARIANT data type?

VARIANT is the column data type used for ingesting semi-structured data. It can store values
of any other type, including OBJECT and ARRAY, up to a maximum size of 16 MB compressed.
A view is the recommended approach for making a VARIANT column accessible in a BI tool.
80. Snowflake stored procedures?
Stored procedures are similar to functions: once a stored procedure is created, we can execute
it many times. A stored procedure is created with the CREATE PROCEDURE command, is executed
with a CALL command, and returns a single value.
Benefits of stored procedures:
1. Procedural logic (branching and looping), which straight SQL does not support.
2. Error handling.
3. Dynamically creating a SQL statement and executing it.
4. One common use for stored procedures is to automate a task that requires multiple SQL
statements and is performed frequently.
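A minimal sketch using Snowflake's JavaScript stored-procedure API; the procedure name, table, and retention logic are illustrative:

```sql
CREATE OR REPLACE PROCEDURE cleanup_old_rows(TABLE_NAME STRING)
  RETURNS STRING
  LANGUAGE JAVASCRIPT
  AS
  $$
    // Build and run a dynamic SQL statement -- one common use of procedures.
    var sql = "DELETE FROM " + TABLE_NAME +
              " WHERE load_date < DATEADD(day, -90, CURRENT_DATE())";
    snowflake.execute({sqlText: sql});
    return "done";
  $$;

CALL cleanup_old_rows('EMPLOYEE');
```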
81. Stored procedure vs UDF (user-defined function)?
1. A stored procedure is called as an independent statement, rather than as part of a
statement; a function can be called as part of an expression.
2. A stored procedure may return a value, but it is not required to; a function is required
to return a value.
3. The value returned by a stored procedure cannot be used directly in SQL; the value
returned by a function can be used directly in SQL.
4. A single executable statement can call only one stored procedure, whereas a single SQL
statement can call multiple functions.
5. Snowflake provides a JavaScript API (in the form of JavaScript objects and methods). The
API enables stored procedures to execute database operations such as SELECT, UPDATE, and
CREATE.
82. What is dynamic data masking?
Dynamic data masking is a column-level security feature that uses masking policies to
selectively mask plain-text data in table and view columns at query time.
Snowflake supports applying masking policies to columns of tables, views, and external tables.
The dynamic data masking privileges are APPLY and OWNERSHIP.


83. Can you clone a view?

Snowflake will not allow you to clone a view directly; there is no direct clone for a view. If
you clone the entire database or schema that contains the view, the view is cloned along with it.
We can clone a database, schema, table, streams, and other schema objects such as stages, file
formats, sequences, and tasks.
84. Types of joins?
1. Inner join: returns the matched records from both tables.

2. Left outer join: returns all rows from the left table and the matched rows from the right
table.

3. Right outer join: returns all rows from the right table and the matched rows from the
left table.

4. Full outer join: returns all matching and non-matching records from both tables.

5. Cross join: combines each row in the first table with each row in the second table,
omitting the join condition.

6. Natural join: used when two tables contain columns that have the same name and the data in
those columns corresponds. A natural join can be combined with an outer join.
A natural join implicitly constructs
the ON clause: ON table2.city = table1.city AND table2.province = table1.province.
85. How to capture all the rejected records in snowflake?
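A sketch using the VALIDATE table function after a tolerant load; the stage and table names are illustrative:

```sql
-- Load with error tolerance so bad rows are skipped rather than failing the load.
COPY INTO employee FROM @my_stage/employee/
  FILE_FORMAT = (TYPE = 'CSV')
  ON_ERROR = 'CONTINUE';

-- Inspect the rows rejected by the most recent COPY into this table.
SELECT * FROM TABLE(VALIDATE(employee, JOB_ID => '_last'));
```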

86. Calculate maximum, minimum and average salary column in employee table?
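A straightforward aggregate query (table and column names illustrative):

```sql
SELECT MAX(salary) AS max_salary,
       MIN(salary) AS min_salary,
       AVG(salary) AS avg_salary
FROM employee;
```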

87. Calculate the averages of different groups?
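Group averages use GROUP BY (column names illustrative):

```sql
-- Average salary per department.
SELECT department, AVG(salary) AS avg_salary
FROM employee
GROUP BY department;
```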

88. How to find the second highest salary in employee table?
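Two common approaches (table and column names illustrative):

```sql
-- Using a subquery:
SELECT MAX(salary) AS second_highest
FROM employee
WHERE salary < (SELECT MAX(salary) FROM employee);

-- Using DENSE_RANK, which generalizes to the Nth highest:
SELECT DISTINCT salary FROM (
  SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
  FROM employee
)
WHERE rnk = 2;
```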



89. Difference between RANK, DENSE_RANK, and ROW_NUMBER in Snowflake?

RANK: if there are duplicate values in the ordering column, it gives them the same ranking
number but skips the next ranking number(s).

DENSE_RANK: if there are duplicate values, it gives them the same ranking number and does not
skip the next ranking number.

ROW_NUMBER: assigns a straight, sequential ranking number; even duplicate values get distinct,
consecutive numbers.
90. How to find the table structure?

We can find it in two ways: with a DESCRIBE command (first way), or by querying the
information schema (second way).
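Both ways sketched (table name illustrative):

```sql
-- First way: DESCRIBE.
DESC TABLE employee;

-- Second way: query the information schema.
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'EMPLOYEE';
```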

91. How to copy all the rows from one table to another table without insert statement?
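Two ways to copy rows without an INSERT statement (table names illustrative):

```sql
-- CREATE TABLE ... AS SELECT (CTAS) copies the rows into a new table.
CREATE OR REPLACE TABLE employee_backup AS
SELECT * FROM employee;

-- A zero-copy clone also reproduces all rows without INSERT.
CREATE TABLE employee_copy CLONE employee;
```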

92. How to convert rows to columns in snowflake?


Using PIVOT
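A sketch of PIVOT, assuming an illustrative table monthly_sales(emp_id, month, amount):

```sql
-- Turn the month values into columns, summing the amounts.
SELECT *
FROM monthly_sales
  PIVOT (SUM(amount) FOR month IN ('JAN', 'FEB', 'MAR'))
  AS p (emp_id, jan_sales, feb_sales, mar_sales);
```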

93. Convert columns into rows in snowflake?


Using UNPIVOT
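A sketch of UNPIVOT, assuming an illustrative table monthly_sales_flat(emp_id, jan_sales, feb_sales, mar_sales):

```sql
-- Turn the month columns back into (month, sales) rows.
SELECT *
FROM monthly_sales_flat
  UNPIVOT (sales FOR month IN (jan_sales, feb_sales, mar_sales));
```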

94. TRUNCATE vs DELETE operations?

DELETE:
1. DELETE is used to delete specific data.
2. DELETE can use a WHERE clause.
3. The changes can be rolled back.
4. Slower than TRUNCATE.
TRUNCATE:
1. TRUNCATE is used to delete the entire data of a table.
2. TRUNCATE can't use a WHERE clause.
3. The changes can't be rolled back.
4. Faster than DELETE.
95. How to find stage properties for the table?
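A sketch using DESCRIBE with the STAGE option (table name illustrative):

```sql
-- Shows the staging properties of the table's implicit stage.
DESC TABLE employee TYPE = STAGE;
```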

96. How to rename a table in Snowflake?
ALTER TABLE <old_name> RENAME TO <new_name>. After renaming, the table is accessible only
under its new name; the old name no longer exists.
97. How to add, drop, and modify a column in Snowflake?
Add column

Drop column

Modify column

98. Difference between OLAP and OLTP?

OLAP (online analytical processing)
1. OLAP is used for data analysis.
2. OLAP uses a data warehouse.
3. OLAP is mainly used for reading data.
4. Processing is a bit slower than OLTP.
OLTP (online transaction processing)
1. OLTP is used to manage large numbers of short online transactions.
2. OLTP uses a traditional DBMS.
3. OLTP manages insert, update, and delete transactions.
4. Processing is faster (milliseconds).
99. What is a factless fact table?
A factless fact table does not have any measures. There are two types:
1. Capturing an event. 2. Describing conditions.
100. How to apply clustering to a table?
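A sketch of defining a clustering key (table and column names illustrative):

```sql
-- On an existing table:
ALTER TABLE sales CLUSTER BY (sale_date, region);

-- Or at creation time:
CREATE TABLE sales (sale_date DATE, region STRING, amount NUMBER)
  CLUSTER BY (sale_date, region);
```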

101. What is information schema in snowflake?


The information schema is a data dictionary that provides metadata about objects such as
tables, views, and functions. It cannot be modified or dropped.

Login history details:
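A sketch using the Information Schema table function for recent logins:

```sql
SELECT *
FROM TABLE(information_schema.login_history())
ORDER BY event_timestamp DESC;
```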

102. What is FLATTEN?

FLATTEN is used to convert semi-structured data to a relational representation. It is a table
function that takes a VARIANT, OBJECT, or ARRAY and explodes it into rows.
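A sketch assuming an illustrative table contacts(id, raw VARIANT) where raw:phones is a JSON array:

```sql
-- Explode a JSON array stored in a VARIANT column into one row per element.
SELECT t.id, f.value::STRING AS phone
FROM contacts t,
     LATERAL FLATTEN(input => t.raw:phones) f;
```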
103. Database vs data warehouse?
A database is a collection of related data, whereas a data warehouse is an information system
that stores historical and cumulative data from multiple sources.
A database is designed to record data, whereas a data warehouse is designed to analyze data.
104. Types of continuous data pipeline loading in Snowflake?
Snowpipe, streams, and tasks.
105. Star schema vs snowflake schema?

Star Schema vs Snowflake Schema:
1. A star schema contains fact tables and dimension tables; a snowflake schema contains fact
tables, dimension tables, and sub-dimension tables.
2. A star schema is a top-down model; a snowflake schema is a bottom-up model.
3. A star schema uses more space; a snowflake schema uses less space.
4. A star schema takes less time for the execution of queries; a snowflake schema takes more
time than a star schema.
5. In a star schema normalization is not used; in a snowflake schema both normalization and
de-normalization are used.
6. Star schema design is simple; snowflake schema design is complex.
7. Query complexity of a star schema is low; query complexity of a snowflake schema is higher
than that of a star schema.
8. A star schema is simple to understand; a snowflake schema is more difficult to understand.
9. A star schema has high data redundancy; a snowflake schema has low data redundancy.

1. What are surrogate keys, or sequences in Snowflake?

Sequences are used to generate sequential or unique numbers across sessions and statements,
including concurrent statements. A sequence can be used for a primary key or any column that
requires a unique value.

Surrogate keys are used in dimension tables and fact tables, in both star and snowflake
schemas.
Example:
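A sketch of a sequence-backed surrogate key (object names illustrative):

```sql
CREATE OR REPLACE SEQUENCE emp_seq START = 1 INCREMENT = 1;

CREATE OR REPLACE TABLE dim_employee (
  emp_key  NUMBER DEFAULT emp_seq.NEXTVAL,  -- surrogate key
  emp_name STRING
);

INSERT INTO dim_employee (emp_name) VALUES ('Alice'), ('Bob');
```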

2. What is query pruning?

Precise pruning of micro-partitions at query run time: Snowflake uses micro-partition metadata
to skip partitions that cannot contain matching rows, including for columns containing
semi-structured data.

https://docs.snowflake.com/en/user-guide/tables-clustering-micropartitions.html#query-pruning
3. Which Snowflake objects can be shared?
Databases, tables, external tables, secure views, secure materialized views, and secure UDFs.
4. Types of stages in Snowflake?
1. User stage: referenced with @~. Each user has a stage allocated to them by default for
storing files. This stage is a convenient option if your files will be accessed by a
single user but need to be copied into multiple tables.
2. Table stage: referenced with @%. Each table has a stage allocated to it by default for
storing files. This stage is a convenient option if your files need to be accessed by
multiple users but copied into a single table.
3. Named stage: referenced with @. Named internal stages are database objects that provide
the most flexibility for data loading; because they are database objects, the
security/access rules that apply to all objects apply to them as well.
5. DDL (data definition language) commands in Snowflake?
Used to create, manipulate, and modify objects in Snowflake, such as users, virtual
warehouses, databases, schemas, tables, views, columns, functions, and stored procedures.
1. ALTER
2. COMMENT
3. CREATE
4. DESCRIBE
5. DROP
6. SHOW
7. USE
6. DML (data manipulation language) commands in SF?
General DML, data loading/unloading DML and file staging commands (for data
loading/unloading).
General DML:
1. INSERT
2. INSERT (multiple table)
3. MERGE
4. UPDATE
5. DELETE
6. TRUNCATE TABLE
Data loading/unloading DML
1. COPY INTO <table> (loading/import data)
2. COPY INTO <location> (unloading/exporting data)
3. VALIDATE (table function)
File staging commands (for data loading/unloading)
1. PUT
2. GET
3. LIST
4. REMOVE

7. Types of integration objects in Snowflake?

Integration objects provide an interface between Snowflake and third-party services.
1. API integration: stores information about an HTTPS proxy service such as Amazon API
Gateway, Azure API Management, or Google Cloud API Gateway.
Example:

2. Notification integration: an interface between Snowflake and third-party messaging

services (third-party cloud message queuing services, email services, etc.).
Example:

3. Security integration: provides an interface between Snowflake and third-party identity
services.

There are four types of security integrations in Snowflake:
1. External OAuth
2. Snowflake OAuth
3. SAML2
4. SCIM
4. Storage integration: a Snowflake object that stores an IAM (identity and access
management) entity for your external cloud storage.
Example:
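A sketch of a storage integration for S3; the integration name, role ARN, and bucket are illustrative:

```sql
CREATE OR REPLACE STORAGE INTEGRATION s3_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake_role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://my-bucket/data/');

-- Retrieve STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID
-- to share with the AWS team:
DESC INTEGRATION s3_int;
```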

Snowflake must share STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID with the AWS

team in order to establish the connection between Snowflake and AWS.

8. Types of security integrations in SF?


1. External OAuth: It allows a client to use a third-party authorization server to obtain the
access tokens needed to interact with Snowflake.
Mandatory parameters:
TYPE = EXTERNAL_OAUTH
ENABLED = TRUE | FALSE (TRUE allows the integration; FALSE suspends it for
maintenance).
EXTERNAL_OAUTH_TYPE = OKTA | AZURE | PING_FEDERATE | CUSTOM
EXTERNAL_OAUTH_ISSUER = (specifies the URL to define the OAuth 2.0 authentication
server).
EXTERNAL_OAUTH_TOKEN_USER_MAPPING_CLAIM = (map the access token to a SF
user record).
EXTERNAL_OAUTH_SNOWFLAKE_USER_MAPPING_ATTRIBUTE = ‘LOGIN_NAME |
EMAIL_ADDRESS’
Indicates which SF user record attribute should be used to map the access token to a SF
user record.
Example:

2. Snowflake OAuth: It enables clients that support OAuth to redirect users to an


authorization page and generate access tokens (and optionally, refresh tokens) for
access to Snowflake.
Mandatory parameters:
TYPE = OAUTH (specify the type of integration. Creates a security interface between SF
and client that support OAuth)
OAUTH_CLIENT = CUSTOM (specify the client type. OAUTH creates an OAuth interface
between SF and custom client)
OAUTH_CLIENT_TYPE = ‘CONFIDENTIAL’ | ‘PUBLIC’ (specify the type of client)
OAUTH_REDIRECT_URI = (Specify the client URL. After user is authenticated the web
browser is redirected to this URL)

Example:

SF team must share the information below to configure connection:


OAUTH_AUTHORIZATION_ENDPOINT
OAUTH_TOKEN_ENDPOINT
OAUTH_CLIENT_ID
OAUTH_CLIENT_SECRET
3. SAML2: It provides single sign-on (SSO) workflows by creating an interface between
Snowflake and a third-party Identity Provider (IdP).
Mandatory parameters:
TYPE = SAML2 (specify the type of integration. Creates a security interface between SF
and the identity provider)
ENABLED = TRUE | FALSE (TRUE-allow the integration and FALSE – Suspend the integration for
maintenance).
SAML2_ISSUER = (a string identifying the IdP)
SAML2_SSO_URL = (a string containing the IdP SSO URL, where the user is redirected by
SF with a SAML2 authentication request)
SAML2_PROVIDER = (a string describing the IdP)
SAML2_X509_CERT = (the Base64-encoded IdP signing certificate, on a single line without
the leading and trailing header lines)
4. SCIM: It allows the automated management of user identities and groups (i.e., roles) by
creating an interface between Snowflake and a third-party Identity Provider (IdP).
Mandatory parameters:
TYPE = SCIM (specify the type of integration; creates a security interface between SF and a client that supports SCIM)
SCIM_CLIENT = 'OKTA' | 'AZURE' | 'GENERIC' (specify the SCIM client)
RUN_AS_ROLE = 'OKTA_PROVISIONER' | 'AAD_PROVISIONER' | 'GENERIC_SCIM_PROVISIONER' (specify the SCIM role in SF that owns any users and roles imported from the identity provider into SF using SCIM)
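A sketch of an Okta SCIM setup, following the common pattern of creating a provisioner role that owns the imported users and roles; the grants shown are the typically required ones:

```sql
-- Role that will own users/roles provisioned via SCIM
CREATE ROLE IF NOT EXISTS okta_provisioner;
GRANT CREATE USER ON ACCOUNT TO ROLE okta_provisioner;
GRANT CREATE ROLE ON ACCOUNT TO ROLE okta_provisioner;
GRANT ROLE okta_provisioner TO ROLE accountadmin;

CREATE SECURITY INTEGRATION okta_scim
  TYPE = SCIM
  SCIM_CLIENT = 'OKTA'
  RUN_AS_ROLE = 'OKTA_PROVISIONER';
```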
9. Types of facts in data warehouse?
1. Additive: can be summed across all dimensions.
2. Semi-additive: can be summed across some dimensions but not others (e.g., account balances over time).
3. Non-additive: cannot be summed across any dimension (e.g., ratios and percentages).
10. Types of fact tables?
1. Transaction fact table
2. Periodic snapshot tables
3. Accumulating snapshot tables
11. Types of dimension tables?
1. Conformed dimension
2. Junk dimension
3. Role-playing dimension
4. Slowly changing dimension
5. Degenerate dimension
12. Types of data warehouse schema?
1. Star Schema
2. Snowflake schema
3. Galaxy schema: It is a collection of multiple fact tables sharing some common dimension
tables. It can be viewed as a collection of several star schemas and hence is also known as
a fact constellation schema.
13. Types of shares in snowflake?
1. Outbound shares (to consumers)
2. Inbound shares (from provider)
14. Does Snowflake charge you for queries running in the queue?
Snowflake charges 0.06 credits per 1,000 queries.
15. How many clustering keys can we use in Snowflake?
There is no hard limit, but it is advisable to keep clustering keys to at most 4 columns per table.
16. How many reader accounts can be created in Snowflake?
A maximum of 20 reader accounts can be created. If you need more, you must contact
Snowflake Support.
17. How many child tasks can be created in Snowflake?
A task can have a maximum of 100 child tasks. A simple tree of tasks is limited to a maximum
of 1,000 tasks in total (including the root task), whether resumed or suspended.
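A sketch of a small task tree; the warehouse, table, and task names are hypothetical:

```sql
CREATE TASK root_task
  WAREHOUSE = my_wh
  SCHEDULE = '60 MINUTE'
AS
  INSERT INTO stg_orders SELECT * FROM raw_orders;

-- Child task runs only after the root task completes
CREATE TASK child_task
  WAREHOUSE = my_wh
  AFTER root_task
AS
  INSERT INTO dim_orders SELECT * FROM stg_orders;

ALTER TASK child_task RESUME;  -- resume children before the root
ALTER TASK root_task RESUME;
```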
18. Types of stages in snowflake?
1. User stage (default)
2. Table stage
3. Named stage
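The three stage types can be referenced as follows; the file paths and object names are hypothetical:

```sql
-- User stage: one per user, referenced with @~
PUT file:///tmp/data.csv @~/staged/;

-- Table stage: one per table, referenced with @%table_name
PUT file:///tmp/data.csv @%my_table;

-- Named stage: an explicit schema-level object
CREATE STAGE my_stage;
PUT file:///tmp/data.csv @my_stage;
COPY INTO my_table FROM @my_stage;
```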
19. How are data cleaning and data cleansing done in Snowflake?
Typically with Python (e.g., Snowpark or external ETL scripts); simple cleansing rules can
also be expressed in SQL during or after loading.
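As an illustrative sketch, common cleansing rules expressed directly in Snowflake SQL; all table and column names below are hypothetical:

```sql
INSERT INTO customers_clean
SELECT
    TRIM(name)                    AS name,         -- strip stray whitespace
    NULLIF(TRIM(email), '')       AS email,        -- turn empty strings into NULL
    TRY_CAST(signup_date AS DATE) AS signup_date   -- invalid dates become NULL, not errors
FROM customers_raw
-- keep only the latest record per customer (deduplication)
QUALIFY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY loaded_at DESC) = 1;
```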
20. Which model are you following in Snowflake?
Data Vault (DV) modeling: it provides a method and approach for modeling an enterprise data
warehouse (EDW) that is agile, flexible, and scalable.
21. Snowflake date and time dimension?
SQL scripts can generate a date dimension in Snowflake using the built-in GENERATOR table
function together with the SEQ4 function.
This creates a date dimension or calendar table, also known as a date scaffold or date spine.
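A minimal sketch using GENERATOR and SEQ4; the start date, row count, and table name are illustrative:

```sql
-- Build a ~10-year calendar table
CREATE OR REPLACE TABLE dim_date AS
SELECT
    DATEADD(DAY, SEQ4(), '2020-01-01'::DATE) AS cal_date,
    YEAR(cal_date)    AS cal_year,
    QUARTER(cal_date) AS cal_quarter,
    MONTH(cal_date)   AS cal_month,
    DAYNAME(cal_date) AS day_name
FROM TABLE(GENERATOR(ROWCOUNT => 3653));
-- Note: SEQ4 is not guaranteed gap-free in general;
-- ROW_NUMBER() OVER (ORDER BY NULL) - 1 is a stricter alternative.
```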

22. What is a fact table?
A fact table contains foreign keys referencing the dimension tables, together with facts or
measurements (metrics such as quantity, amount, etc.). It does not contain hierarchies.
23. What is dimension table?
Dimension tables contain attributes that describe the measurements stored in fact tables.
Dimension tables contain hierarchies.
24. What is Data mining?
It is the process of analyzing data in different dimensions and summarizing it into useful
information. Users can query and retrieve the data from the database in their own format.
25. Data warehouse stages?
1. Offline operational system
2. Offline data warehouse
3. Real time data warehouse
4. Integrated data warehouse
26. Data mart?
A data mart helps business users make decisions by analyzing past trends and experiences.
It emphasizes easy access to relevant information.
27. What is dimensional modeling?
A technique used by data warehouse designers to build their data warehouse. The model
stores data in two types of tables: fact tables and dimension tables.
The fact table holds the facts or measurements of the business, and the dimension table
holds the context of those measurements.
28. Types of data models or dimensional modeling?
1. Conceptual model: defines what the data system contains; used to organize, scope, and
define business concepts and rules.
2. Logical model: defines how the data system should be implemented; used to develop a
technical map of rules and data structures.
3. Physical model: defines how the data system will be implemented for a specific use case.
29. Types of data warehousing?
1. Enterprise Data warehousing
2. Operational data store
3. Data mart
30. What approaches does the optimizer use to build an execution plan?
1. Rule based
2. Cost based
31. What is normalization?
Normalization splits data into additional tables, which helps reduce data redundancy.
32. Why partitioning?
Partitioning is done for various reasons, such as easier management and enhanced performance.
33. SCDs (slowly changing dimension)?
Slowly changing dimensions are dimensions in which data changes slowly, rather than
changing regularly on a time basis.
34. Primary key vs unique key?
A primary key does not accept NULL values, whereas a unique key accepts NULL values.
A table can have only one primary key, whereas it can have multiple unique keys.
A clustered index is created automatically when you define a primary key, whereas a unique
key generates a non-clustered index.
35. VARIANT data type size limit in snowflake or data size limitation?
There is a 16 MB (compressed) size limit on individual rows.
It is recommended to enable the STRIP_OUTER_ARRAY file format option in the COPY INTO
command to remove the outer array structure and load the records into separate table rows.
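For example, loading a JSON file whose top level is an array; the stage and table names are hypothetical:

```sql
-- Each element of the outer JSON array becomes its own row
COPY INTO raw_events
  FROM @my_stage/events.json
  FILE_FORMAT = (TYPE = 'JSON' STRIP_OUTER_ARRAY = TRUE);
```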
36. Types of views in snowflake?
1. Regular views: referenced in a query; they need no extra storage but refer to the
physical data of the underlying tables.
Views automatically reflect changes to the underlying tables' data. If the underlying
tables have schema changes (e.g., column names are changed), the view must be recreated;
otherwise, views are dynamically updated.
2. Materialized views: store a copy of the query result, which improves query performance
and efficiency. They require zero maintenance: no input is needed, and Snowflake always
keeps the data refreshed in sync with the base table.
If the base tables are updated frequently, maintaining the materialized view can consume
a lot of compute.
3. Secure views: designed for data privacy, i.e., to limit access to sensitive data that
should not be exposed to all users of the underlying tables.
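The three view types side by side; the table and view names are hypothetical:

```sql
-- Regular view: just a stored query over the base table
CREATE VIEW v_orders AS
  SELECT order_id, amount FROM orders;

-- Materialized view: precomputed results, maintained automatically by Snowflake
CREATE MATERIALIZED VIEW mv_daily_sales AS
  SELECT order_date, SUM(amount) AS total FROM orders GROUP BY order_date;

-- Secure view: hides the definition and blocks optimizations that could leak data
CREATE SECURE VIEW sv_orders AS
  SELECT order_id, amount FROM orders;
```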
37. Materialized view?
Common materialized view operations:
1. Track credit usage for materialized view maintenance
2. Start or stop (suspend/resume) a materialized view
3. Add a clustering key
4. Create a materialized view
5. Join materialized views in queries
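These operations can be sketched as follows; the view, table, and column names are hypothetical:

```sql
CREATE MATERIALIZED VIEW mv_sales AS
  SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

ALTER MATERIALIZED VIEW mv_sales SUSPEND;              -- stop maintenance
ALTER MATERIALIZED VIEW mv_sales RESUME;               -- start it again
ALTER MATERIALIZED VIEW mv_sales CLUSTER BY (region);  -- add a clustering key

-- Track the credits consumed by materialized view maintenance
SELECT * FROM TABLE(INFORMATION_SCHEMA.MATERIALIZED_VIEW_REFRESH_HISTORY());
```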