Copying data from an Azure stage¶
Load data from your staged files into the target table.
Loading your data¶
Execute COPY INTO <table> to load your data into the target table.
Note
Loading data requires a warehouse. If you’re using a warehouse that is not configured to auto resume, execute ALTER WAREHOUSE to resume the warehouse. Note that starting the warehouse could take up to five minutes.
ALTER WAREHOUSE mywarehouse RESUME;
The following example loads data from files in the named my_azure_stage stage created in Creating an Azure stage. Using pattern matching, the statement only loads files whose names contain the string sales:
COPY INTO mytable FROM @my_azure_stage PATTERN='.*sales.*.csv';
Note that file format options are not specified because a named file format was included in the stage definition.
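For reference, a stage definition can embed a named file format so that later COPY statements can omit file format options. The following is an illustrative sketch (the account name, container, and SAS token are placeholders):

```sql
CREATE OR REPLACE STAGE my_azure_stage
  URL = 'azure://myaccount.blob.core.windows.net/mycontainer'
  CREDENTIALS = (AZURE_SAS_TOKEN = '<sas_token>')
  FILE_FORMAT = my_csv_format;
```

With the file format attached to the stage, `COPY INTO mytable FROM @my_azure_stage` inherits those parsing options automatically.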
The following example loads all files prefixed with data/files in your Azure container using the named my_csv_format file format created in Preparing to load data:
COPY INTO mytable
FROM 'azure://myaccount.blob.core.windows.net/mycontainer/data/files'
CREDENTIALS=(AZURE_SAS_TOKEN='?sv=2016-05-31&ss=b&srt=sco&sp=rwdl&se=2018-06-27T10:05:50Z&st=2017-06-27T02:05:50Z&spr=https,http&sig=abcDEFGHIjklmNOPqrsTUVwxyZ123456789%3D')
ENCRYPTION=(TYPE='AZURE_CSE' MASTER_KEY = 'aBcDeFGHI0jklMnoP0QrsTUVWXyz1234567891abcDEFG=')
FILE_FORMAT = (FORMAT_NAME = my_csv_format);
The following ad hoc example loads data from all files in the Azure container. The COPY command specifies file format options instead of referencing a named file format. This example loads CSV files with a pipe (|) field delimiter. The COPY command skips the first line in the data files:
COPY INTO mytable
FROM 'azure://myaccount.blob.core.windows.net/mycontainer/data/files'
CREDENTIALS=(AZURE_SAS_TOKEN='?sv=2016-05-31&ss=b&srt=sco&sp=rwdl&se=2018-06-27T10:05:50Z&st=2017-06-27T02:05:50Z&spr=https,http&sig=abcDEFGHIjklmNOPqrsTUVwxyZ123456789%3D')
ENCRYPTION=(TYPE='AZURE_CSE' MASTER_KEY = 'aBcDeFGHI0jklMnoP0QrsTUVWXyz1234567891abcDEFG=')
FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = '|' SKIP_HEADER = 1);
Validating your data¶
Before loading your data, you can validate that the data in the uploaded files will load correctly.
To validate data in an uploaded file, execute COPY INTO <table> in validation mode using the VALIDATION_MODE parameter. Instead of loading data, the statement returns any errors it encounters in the files. You can then fix the data in the files so that they load without error.
In addition, COPY INTO <table> provides the ON_ERROR copy option to specify an action to perform if errors are encountered in a file during loading.
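As a sketch, the first statement below checks the staged files for errors without loading anything, and the second performs a real load that skips any file containing errors (the stage and table names reuse the examples above):

```sql
-- Validate only: return all errors found in the staged files; no rows are loaded
COPY INTO mytable
  FROM @my_azure_stage
  VALIDATION_MODE = 'RETURN_ERRORS';

-- Load, but skip any file in which an error is encountered
COPY INTO mytable
  FROM @my_azure_stage
  ON_ERROR = 'SKIP_FILE';
```

Other ON_ERROR actions include CONTINUE (skip only the bad rows) and ABORT_STATEMENT (the default, which cancels the load on the first error).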
Note
While Azure storage accounts support the MD5 File Validation feature, Azure doesn’t calculate MD5 hash values for files larger than 100 MB. For more information, see MD5 hash calculation for large files (https://learn.microsoft.com/answers/questions/282572/md5-hash-calculation-for-large-files).
Monitoring data loads¶
Snowflake retains historical data for COPY INTO commands executed within the previous 14 days. The metadata can be used to monitor and manage the loading process, including deleting files after upload completes:
Monitor the status of each COPY INTO <table> command on the History page of the Classic Console.
Use the LOAD_HISTORY Information Schema view to retrieve the history of data loaded into tables using the COPY INTO command.
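For example, a query along the following lines retrieves recent load activity for a table (the table name MYTABLE is a placeholder):

```sql
-- Review files loaded into MYTABLE within the load history retention window
SELECT file_name, last_load_time, row_count, status
  FROM information_schema.load_history
  WHERE table_name = 'MYTABLE'
  ORDER BY last_load_time DESC;
```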
Copying files from one stage to another¶
Use the COPY FILES command to organize data into a single location by copying files from one named stage to another.
The following example copies all of the files from a source stage (src_stage) to a target stage (trg_stage):
COPY FILES
INTO @trg_stage
FROM @src_stage;
You can also specify a list of file names to copy, or copy files by using pattern matching. For information, see the COPY FILES examples.
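As a brief sketch of those two variants (the pattern and file names below are illustrative):

```sql
-- Copy only files whose names contain the string sales
COPY FILES
  INTO @trg_stage
  FROM @src_stage
  PATTERN = '.*sales.*';

-- Copy an explicit list of files (hypothetical file names)
COPY FILES
  INTO @trg_stage
  FROM @src_stage
  FILES = ('file1.csv', 'file2.csv');
```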