Easy Snowflake-Python Connection using BAG Data with Snowpark
Geospatial Data Scientist
The Snowflake Marketplace BAG dataset is a free dataset that can be accessed with one simple query. Our previous articleof Snowflake showed how to use Tableau to connect to the dataset. This time, we show how to use Python and the Snowpark package to access the data from Snowflake, directly in a local Python Jupyter Notebook.
What is Snowpark?
Snowpark is part of Snowflake. At its core, Snowpark provides an API that developers can use to construct DataFrames that are executed on Snowflake's platform. It allows coding in languages other than SQL such as Scala, Java, and Python to take advantage of Snowflake's powerful platform, without having to leave Snowflake. This allows data application developers to perform complex transformations within Snowflake while benefiting from built-in, unlimited scalability, performance, governance, and security features.
Start using the DATA ANALYSIS
First, you need a (free) Snowflake account. You can sign up via this link. The default role is set to 'public'. All you need to do to process your queries is to create a warehouse for your account. If you go to the 'Worksheets' and click on '+ Worksheet', you can create a new warehouse with the following commands:
CREATE OR REPLACE WAREHOUSE MY_WH WITH WAREHOUSE_SIZE='X-SMALL'; GRANT USAGE ON WAREHOUSE MY_WH TO ROLE PUBLIC
With this, you create a new warehouse (the smallest version) and assign user rights to your public role. After this step, we are ready to install Jupyter Notebook. The instructions for this can be found here. The next step is to install the Snowpark package. You can easily do this with the pip command in the environment you are working in.
pip install snowflake-snowpark-python
Make sure you install the standard pandas package, because we want to turn our requested table into a pandas data frame.
pip install pandas
Once installed, we can start making our connection.
Import the library
To get started, we import the relevant Snowflake libraries. We do that with this code:
### Import Snowpark for Python from snowflake.snowpark import Session from snowflake.snowpark.functions import col import pandas as pd
Create a session object
After importing the Snowpark package, we create a session object. Here you can enter your accounts, credentials, role, and warehouse to connect to the BAG dataset within the Marketplace. You do that with this function.
Connecting to the bag dataset in the snowflake marketplace
Once a session object is created, we can connect to the BAG dataset. You can consult the full dataset or write a specific query as required (like these examples). Then you convert the consulted tables into Pandas data frames, on which you can perform ordinary data transformations.
After running the code, you should see a Pandas data frame, like the image below.
Congratulations, you Jupyter Notebook! You are now ready to use complex Python data analysis tools for your queried data.
Do you have any questions or comments about this topic? Contact us and we will be happy to help you with this.