❄■ Snowpark Python Cheat Sheet
1. Connecting to Snowflake
• from [Link] import Session
• connection_parameters = { 'account': 'your_account', 'user': 'username', 'password': 'password',
'warehouse': 'COMPUTE_WH', 'database': 'MY_DB', 'schema': 'PUBLIC' }
• session = [Link](connection_parameters).create()
2. Working with DataFrames
• df = [Link]('employees')
• [Link]('id', 'name', 'salary').show()
• from [Link] import col
• [Link](col('salary') > 50000).show()
• df.order_by(col('salary').desc()).show()
• from [Link] import avg
• df.group_by('department').agg(avg('salary').alias('avg_salary')).show()
• [Link](df2, df['dept_id'] == df2['id'], 'inner').show()
3. Functions / Built-in Operations
• from [Link] import lit, upper, lower, concat
• [Link](upper(col('name')), concat(col('name'), lit('_2025'))).show()
• from [Link] import current_date, dateadd, datediff
• [Link](current_date(), dateadd('day', 7, col('hire_date'))).show()
• from [Link] import sum_, count, min_, max_
• df.group_by('department').agg(sum_('salary'), count('*')).show()
4. User-Defined Functions (UDFs)
• from [Link] import udf
• from [Link] import IntegerType
• def double_salary(salary): return salary * 2
• double_salary_udf = udf(double_salary, return_type=IntegerType(),
input_types=[IntegerType()])
• [Link](double_salary_udf(col('salary')).alias('double_salary')).show()
5. Stored Procedures
• from [Link] import Session
• from [Link] import StringType
• def sp_get_employee(session: Session, emp_id: int) -> str:
• df = [Link]('employees').filter(col('id') == emp_id)
• row = [Link]()[0]
• return f"Employee: {row['name']}"
• # Register in Snowflake: [Link](...)
6. SQL Execution
• [Link]("SELECT * FROM employees WHERE salary > 50000").show()
• [Link]("CREATE TABLE test_table (id INT, name STRING)").collect()
7. Writing Data
• [Link].save_as_table('new_employees_table', mode='overwrite')
8. Snapshots / Tasks / Streams
• df_stream = [Link]('employees_stream')
• df_stream.filter(col('salary') > 50000).write.save_as_table('high_salary_employees')
9. Common Utilities
• [Link](5) # Show first 5 rows
• [Link]() # Count rows
• rows = [Link]()
• for row in rows: print(row['name'], row['salary'])
Tips
• Use DataFrames API instead of raw SQL for transformations.
• Combine UDFs + DataFrames for advanced transformations.
• All computations run inside Snowflake → faster, secure.
• Use [Link]() when raw SQL is easier than DataFrame API.