Snowflake Data Clean Rooms:外部数据连接器故障排除

备注

Snowflake Data Clean Rooms 目前不支持数据主体同意管理。客户有责任确保其已获得使用其 Clean Room 中关联的数据的所有必要权利和同意。客户在使用 Data Clean Room 时,还必须确保遵守所有适用的法律和法规,包括与第三方连接器相关的法律法规。

本主题介绍如何排除外部数据错误。它适用于 Amazon Web Services、Microsoft Azure 和 Google Cloud。

排除外部数据错误时需遵循的步骤

  1. 确保路径 URL/URI 正确。有关正确的 URL/URI,请参阅相关主题。

  2. 确保桶或 Blob 存储中至少有一个文件。

  3. 确保文件为 parquet 格式。

  4. 确保 parquet 文件不是空的。

  5. 确保未使用 Snappy 格式压缩 parquet 文件。

  6. 如果以上方法都不能解决问题,请使用以下脚本进行调试:

USE ROLE SAMOOHA_APP_ROLE;
USE DATABASE SAMOOHA_BY_SNOWFLAKE_LOCAL_DB;
USE SCHEMA PUBLIC;

/*
  Query the stage name from the connector configuration.
  Use AWS_CONNECTOR_ID for AWS, GCP_CONNECTOR_ID for GCP and
  AZURE_CONNECTOR_ID for Azure.

  For example, if you are connecting to AWS, enter:

  SELECT CONFIGURATION_ID, PARSE_JSON(CONFIGURATION) FROM SAMOOHA_BY_SNOWFLAKE_LOCAL_DB.PUBLIC.CONNECTOR_CONFIGURATION WHERE CONNECTOR_ID = 'AWS_CONNECTOR_ID';

/*
  Note that the rest of this script relies on the output of this query so you
  must save the output for use in the rest of the steps.

  Next, check the storage integration. Replace <CONFIGURATION_ID> from the output
  of the query.
*/

  DESC STORAGE INTEGRATION SAMOOHA_STORAGE_INT_<CONFIGURATION_ID>;

/*
  List files in the stage. Replace <STAGE_NAME> from the output of the query.
*/

  LIST @<STAGE_NAME>;

/*
  Check if you are able to query the files in the external stage. Replace
  <STAGE_NAME> from the output of the query.
*/

  SELECT * FROM @<STAGE_NAME> LIMIT 10;

/*
  Check if you are able to infer the schema from the files in the external
  stage. Replace <STAGE_NAME> from the output of the query.
*/

  SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
  WITHIN GROUP (ORDER BY order_id)
  FROM TABLE(
    INFER_SCHEMA(
      LOCATION=>'@<STAGE_NAME>',
      FILE_FORMAT=>'SAMOOHA_BY_SNOWFLAKE_LOCAL_DB.PUBLIC.PAR_FF'
    )
  );

/*
  Try to create a table from the external stage. Replace <STAGE_NAME> from
  the output of the query.
*/

  CALL SAMOOHA_BY_SNOWFLAKE_LOCAL_DB.LIBRARY.CREATE_TABLE_FROM_STAGE('<STAGE_NAME>', 'EXT_INT_TEMP_TABLE');

/*
  Check data in the table.
*/

  SELECT * FROM SAMOOHA_BY_SNOWFLAKE_LOCAL_DB.PUBLIC.EXT_INT_TEMP_TABLE LIMIT 10;
Copy
语言: 中文