Using internal tables for multistep workflows¶
概述¶
Many clean room use cases involve running a single SQL query against one or more tables in a clean room and displaying the results in the response. However, there are use cases where you might require creating an internal table that can be used within subsequent templates to support a multistep workflow. For example, a machine learning flow, where the model is trained once against a dataset and then run multiple times against varying input data, either singly or in batches.
Creating internal tables¶
You can create internal tables inside a clean room to store intermediary results, or as persistent storage for usage downstream (for example, to save training data that is used for multiple runs). See properties and guidance of internal tables below:
You can create internal tables by using a clean room template that executes CREATE TABLE, or by running a UDF/UDTF that uses Python to create a table.
Internal tables can be created in the
cleanroomschema, which is available by default. If a custom schema is preferred, the schema must be created first before creating the table.By default, internal tables are only accessible by approved templates in the clean room. If access needs to be provided outside of templates, then the CLEANROOM_PUBLIC_ROLE application role of the clean room needs to be granted corresponding privileges. For example, the following grant can be given:
GRANT SELECT ON TABLE CLEANROOM.MY_TABLE TO APPLICATION ROLE CLEANROOM_PUBLIC_ROLE;If you have proper access, you can list the internal tables in your collaboration. Internal tables can be found at
SFDCR_collaboration_name.cleanroom, and can be listed by running the following SQL code:SHOW TABLES IN SCHEMA SFDCR_collaboration_name.CLEANROOM;.Internal tables are deleted when the collaboration is removed. However, if an internal table is designed to have a shorter lifetime than the collaboration, consider deleting the table when it's no longer needed.
以下是一些创建内部表的示例:
JinjaSQL 模板可以创建内部表,这是通过某些类型的 激活 完成的。
This example template creates the table and returns the table name, so that the name can be passed in as a parameter to other templates.
UDF 可以创建内部表。这通常是通过在 Python 中执行 SQL 来完成的。