Snowflake Data Clean Rooms: Developer APIs overview

Snowflake Data Clean Rooms provide developer APIs that let you build applications on top of a clean room. You can use these APIs to manage the lifecycle of a clean room, create and run secure analyses, and then share them.

A clean room is a cryptographically secure environment that protects the data inside it. Clean rooms only run specific analyses and algorithms enabled by the creator of the clean room. Additionally, a layer of protection is added through differential privacy techniques that only expose aggregated results externally. Clean rooms guarantee that no user or process can read or write data except for algorithms provided by the clean room creator.

Collaborators

Two parties generally collaborate during the lifecycle of a clean room:

  • A provider creates a clean room, adds the necessary data, sets up the policies, adds the relevant analyses, and then shares it with the consumer.

  • A consumer receives a clean room shared by a provider. Consumers can install the clean room, add their data, and run any supported analysis by passing appropriate arguments to the analysis templates in the clean room.

Analyses

Analyses are algorithms that run inside a clean room. The provider chooses some analyses and enables them for a specific clean room, and then the consumer can choose to run one or more of those analyses.

Prerequisites

Before you can use the developer APIs of a Snowflake Data Clean Room, an administrator must configure the clean room environment and add you as a user. For details, see Getting started with a Snowflake Data Clean Room.

You must use the SAMOOHA_APP_ROLE role to execute the developer APIs. Add the following to your Snowflake worksheet before executing the API:

USE ROLE samooha_app_role;
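If you send statements from application code rather than from a worksheet, the role requirement above can be enforced before anything is sent. The following is a minimal sketch, not part of the developer APIs: the helper name `with_app_role` is hypothetical, and it only prepares a list of statements. Executing them with a Snowflake client is left to your application.

```python
from typing import Iterable, List

# The role statement shown above; every developer API call must run under it.
APP_ROLE_STATEMENT = "USE ROLE samooha_app_role;"


def with_app_role(statements: Iterable[str]) -> List[str]:
    """Return the statements with the role switch first (hypothetical helper).

    Blank entries are dropped. If the first statement already switches to the
    role (compared case-insensitively), the list is returned without a duplicate.
    """
    cleaned = [s.strip() for s in statements if s.strip()]
    if cleaned and cleaned[0].lower() == APP_ROLE_STATEMENT.lower():
        return cleaned
    return [APP_ROLE_STATEMENT] + cleaned
```

Calling `with_app_role` on a list that holds a single `CALL` statement yields a two-item list whose first entry is the `USE ROLE` statement.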

Linking data

Providers and consumers can link data into a clean room so that no physical copy of the data is needed in the clean room environment. Because Snowflake Data Clean Rooms rely on the Snowflake Native App Framework, you can link any object that is supported by the Snowflake Native App Framework.

Before clean room users can link data into a clean room, the account administrator must register the data at the database, schema, or object level. If the administrator registers a database or schema, all of the objects in that database or schema are also registered. For information about how an administrator registers data, see Register data for Snowflake Data Clean Rooms.

After the account administrator registers data, clean room users can link an object into the clean room by executing the link_datasets command. For example, a provider could execute the following command to link a table MY_TABLE into the clean room dcr_cleanroom:

CALL samooha_by_snowflake_local_db.provider.link_datasets('dcr_cleanroom', 
   ['MY_DB.MY_SCHEMA.MY_TABLE']);

Note

External tables and Iceberg tables require additional steps before they can be linked in a clean room. For more information, see Snowflake Data Clean Rooms: External and Iceberg Tables.
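When the link_datasets call from this section is generated by an application, the clean room name and table names need to be quoted as SQL string literals. The sketch below builds the same statement as the provider example. The function names are hypothetical, and passing several tables in one call is an assumption based on the array argument in that example. The code only produces the statement text and does not connect to Snowflake.

```python
from typing import Sequence


def sql_string_literal(value: str) -> str:
    """Quote a value as a SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_link_datasets_call(cleanroom_name: str, tables: Sequence[str]) -> str:
    """Build the provider-side link_datasets statement (hypothetical helper).

    `tables` holds fully qualified names such as MY_DB.MY_SCHEMA.MY_TABLE.
    """
    if not tables:
        raise ValueError("at least one table name is required")
    table_list = ", ".join(sql_string_literal(t) for t in tables)
    return (
        "CALL samooha_by_snowflake_local_db.provider.link_datasets("
        f"{sql_string_literal(cleanroom_name)}, [{table_list}]);"
    )
```

For the clean room `dcr_cleanroom` and the single table `MY_DB.MY_SCHEMA.MY_TABLE`, the function returns the provider statement shown earlier in this section, on one line.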

Retrieving previous results from an analysis

When you use the developer APIs to run an analysis, Snowflake Data Clean Rooms executes a query in your account. You can retrieve results from an older analysis using the query ID of the query associated with the analysis.

Note

If the warehouse gets suspended in between running the API analysis and using this query to retrieve results, you might not be able to get the results.

To retrieve results from a previous analysis:

  1. Sign in to Snowsight.

  2. Select Monitoring » Query History.

  3. Use the filters to find the query associated with the analysis, and then copy the query ID.

  4. Open a worksheet and execute a query that retrieves the results based on the query ID of the query. For example, if the query ID is ABC123, then execute:

       SELECT * FROM TABLE(result_scan('ABC123'));
    
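RESULT_SCAN expects the query ID as a string, so the ID copied from Query History must be quoted. If the retrieval query is assembled in application code, a small helper can take care of that. This is a minimal sketch: the function name is hypothetical, and it only returns the query text without running it.

```python
def build_result_scan_query(query_id: str) -> str:
    """Build a query that fetches the results of an earlier query by its ID.

    The ID is passed to RESULT_SCAN as a string literal. Embedded single
    quotes are doubled so the literal stays well formed.
    """
    query_id = query_id.strip()
    if not query_id:
        raise ValueError("query_id must not be empty")
    escaped = query_id.replace("'", "''")
    return f"SELECT * FROM TABLE(RESULT_SCAN('{escaped}'));"
```

With the placeholder ID `ABC123` from step 4, the helper returns the same statement as in that step, with the ID quoted.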

API reference documentation

To obtain the descriptions and signatures of the developer APIs, see the following:

Extended examples

To help you understand how to use various features of the Developer APIs, you can refer to the following examples.

End-to-End: Provider Data Analysis

  • A provider can define join and other column policies over datasets that they have linked to the clean room and then add a predefined, secure data analysis template to the clean room.

  • A consumer can use the clean room shared by the provider and run exploratory analyses within it. The consumer must abide by the join and column policies set by the provider.

For more information, see End-to-End: Provider Data Analysis.

End-to-End: Overlap Analysis

  • A provider can link multiple datasets and add a predefined analysis template that carries out an overlap analysis over the datasets to the clean room.

  • The consumer can link more datasets and perform the overlap analysis jointly over all the available provider and consumer datasets in the clean room.

For more information, see End-to-End: Overlap Analysis.

Custom Analysis Templates

  • A provider can define and add a custom analysis template to a clean room, which allows consumers to run the custom analytics.

  • These custom analysis templates can be made generic by leveraging powerful SQL Jinja templates, and can also support Privacy Enhancing Technologies like differential privacy.

For more information, see Custom Analysis Templates.

Secure Python Based Templates

  • Providers can load custom Python code to be run inside custom analysis templates.

  • All Python code loaded into the clean room remains completely confidential and cannot be seen by the consumers using it.

For more information, see Secure Python Based Templates.

Machine Learning

  • Providers can define advanced machine learning models that users can run securely inside clean rooms.

  • Secure Python code that is not visible to consumers can be used to define complex ML models that can run in a fully secure environment inside the clean room.

For more information, see Machine Learning.

Secure Python UDTF-Based Templates

  • Providers can create secure Python UDTFs using a simple API and share them with consumers.

  • Consumers can use the Python UDTF using a simple template provided by the provider.

For more information, see Secure Python UDTF-Based Templates.

Registering developer API clean rooms into the web app

  • Providers can register clean rooms loaded with custom analyses and templates into the web app of a Snowflake Data Clean Room, which allows their collaborators to work with the clean room in a user interface.

  • Collaborators can interact with these complex, custom clean rooms entirely through the web app.

For more information, see Registering developer API clean rooms into the web app.

Secure Snowpark Procedures

  • Providers can define their own Snowpark procedures and share them securely with a consumer.

  • Consumers can call these Snowpark procedures using the usual run_analysis workflow.

For more information, see Secure Snowpark Procedures.
