0% found this document useful (0 votes)
54 views2 pages

Snowpark Python Cheatsheet

This cheat sheet provides a comprehensive guide for using Snowpark with Python, covering connection setup, DataFrame operations, built-in functions, user-defined functions, stored procedures, SQL execution, data writing, and utilities. It emphasizes the advantages of using the DataFrames API over raw SQL for transformations and highlights the performance benefits of executing computations within Snowflake. Key code snippets are included for each section to facilitate quick reference.

Uploaded by

hari.kappera476
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
54 views2 pages

Snowpark Python Cheatsheet

This cheat sheet provides a comprehensive guide for using Snowpark with Python, covering connection setup, DataFrame operations, built-in functions, user-defined functions, stored procedures, SQL execution, data writing, and utilities. It emphasizes the advantages of using the DataFrames API over raw SQL for transformations and highlights the performance benefits of executing computations within Snowflake. Key code snippets are included for each section to facilitate quick reference.

Uploaded by

hari.kappera476
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 2

❄■ Snowpark Python Cheat Sheet

1. Connecting to Snowflake

• from [Link] import Session


• connection_parameters = { 'account': 'your_account', 'user': 'username', 'password': 'password',
'warehouse': 'COMPUTE_WH', 'database': 'MY_DB', 'schema': 'PUBLIC' }
• session = [Link](connection_parameters).create()

2. Working with DataFrames

• df = [Link]('employees')
• [Link]('id', 'name', 'salary').show()
• from [Link] import col
• [Link](col('salary') > 50000).show()
• df.order_by(col('salary').desc()).show()
• from [Link] import avg
• df.group_by('department').agg(avg('salary').alias('avg_salary')).show()
• [Link](df2, df['dept_id'] == df2['id'], 'inner').show()

3. Functions / Built-in Operations

• from [Link] import lit, upper, lower, concat


• [Link](upper(col('name')), concat(col('name'), lit('_2025'))).show()
• from [Link] import current_date, dateadd, datediff
• [Link](current_date(), dateadd('day', 7, col('hire_date'))).show()
• from [Link] import sum_, count, min_, max_
• df.group_by('department').agg(sum_('salary'), count('*')).show()

4. User-Defined Functions (UDFs)

• from [Link] import udf


• from [Link] import IntegerType
• def double_salary(salary): return salary * 2
• double_salary_udf = udf(double_salary, return_type=IntegerType(),
input_types=[IntegerType()])
• [Link](double_salary_udf(col('salary')).alias('double_salary')).show()

5. Stored Procedures

• from [Link] import Session


• from [Link] import StringType
• def sp_get_employee(session: Session, emp_id: int) -> str:
• df = [Link]('employees').filter(col('id') == emp_id)
• row = [Link]()[0]
• return f"Employee: {row['name']}"
• # Register in Snowflake: [Link](...)
6. SQL Execution

• [Link]("SELECT * FROM employees WHERE salary > 50000").show()


• [Link]("CREATE TABLE test_table (id INT, name STRING)").collect()

7. Writing Data

• [Link].save_as_table('new_employees_table', mode='overwrite')

8. Snapshots / Tasks / Streams

• df_stream = [Link]('employees_stream')
• df_stream.filter(col('salary') > 50000).write.save_as_table('high_salary_employees')

9. Common Utilities

• [Link](5) # Show first 5 rows


• [Link]() # Count rows
• rows = [Link]()
• for row in rows: print(row['name'], row['salary'])

Tips

• Use DataFrames API instead of raw SQL for transformations.


• Combine UDFs + DataFrames for advanced transformations.
• All computations run inside Snowflake → faster, secure.
• Use [Link]() when raw SQL is easier than DataFrame API.

You might also like