Summary of data loading features

This topic provides a quick reference to the features supported when using the COPY INTO <table> command to load data from files into Snowflake tables.

Data file details

The following table describes the general details for the files used to load data:

| Feature | Supported | Notes |
|---|---|---|
| Location of files | Local environment | Files are first copied (“staged”) to an internal (Snowflake) stage, then loaded into a table. |
| | Amazon S3 | Files can be loaded directly from any user-supplied bucket. |
| | Google Cloud Storage | Files can be loaded directly from any user-supplied bucket. |
| | Microsoft Azure cloud storage: Blob storage, Data Lake Storage Gen2, General-purpose v1, General-purpose v2 | Files can be loaded directly from any user-supplied container. |
| File formats | Delimited files (CSV, TSV, etc.) | Any valid delimiter is supported; the default is comma (i.e. CSV). |
| | Semi-structured formats: JSON, Avro, ORC, Parquet, XML | |
| | Unstructured formats | |
| File encoding | File format-specific | For delimited files (CSV, TSV, etc.), the default character set is UTF-8. To use any other character set, you must explicitly specify the encoding to use for loading. For the list of supported character sets, see Supported Character Sets for Delimited Files (in this topic). For semi-structured file formats (JSON, Avro, etc.), the only supported character set is UTF-8. |

Snowflake doesn’t support loading data from tar (tape archive) files.
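As a minimal sketch of specifying a non-default character set for a delimited file, the Python helper below assembles a COPY INTO statement that sets the ENCODING file format option. The table name `my_table` and the stage name `my_stage` are hypothetical placeholders, and the helper only builds the SQL text; executing it is left to whichever Snowflake client you use.

```python
def build_copy_statement(table: str, stage: str, encoding: str = "UTF8") -> str:
    """Return a COPY INTO statement for delimited files with an explicit encoding.

    `table` and `stage` are hypothetical names supplied by the caller and are
    assumed to be simple, trusted identifiers (no quoting or escaping is done).
    """
    return (
        f"COPY INTO {table} "
        f"FROM @{stage} "
        f"FILE_FORMAT = (TYPE = CSV ENCODING = '{encoding}')"
    )


# Example: files in the stage were written as ISO-8859-1 rather than UTF-8.
statement = build_copy_statement("my_table", "my_stage", encoding="ISO88591")
print(statement)
```

Leaving `encoding` at its default mirrors the behavior described in the table, where UTF-8 is assumed for delimited files unless another character set is named.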

Supported character sets for delimited files

The following table lists the encoding character sets supported for loading data from delimited files (CSV, TSV, etc.):

| Character Set | ENCODING Value | Supported Languages | Notes |
|---|---|---|---|
| Big5 | BIG5 | Traditional Chinese | |
| EUC-JP | EUCJP | Japanese | |
| EUC-KR | EUCKR | Korean | |
| GB18030 | GB18030 | Chinese | |
| IBM420 | IBM420 | Arabic | |
| IBM424 | IBM424 | Hebrew | |
| IBM949 | IBM949 | Korean | |
| ISO-2022-CN | ISO2022CN | Simplified Chinese | |
| ISO-2022-JP | ISO2022JP | Japanese | |
| ISO-2022-KR | ISO2022KR | Korean | |
| ISO-8859-1 | ISO88591 | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish | |
| ISO-8859-2 | ISO88592 | Czech, Hungarian, Polish, Romanian | |
| ISO-8859-5 | ISO88595 | Russian | |
| ISO-8859-6 | ISO88596 | Arabic | |
| ISO-8859-7 | ISO88597 | Greek | |
| ISO-8859-8 | ISO88598 | Hebrew | |
| ISO-8859-9 | ISO88599 | Turkish | |
| ISO-8859-15 | ISO885915 | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish | Identical to ISO-8859-1 except for 8 characters, including the Euro currency symbol. |
| KOI8-R | KOI8R | Russian | |
| Shift_JIS | SHIFTJIS | Japanese | |
| UTF-8 | UTF8 | All languages | For loading data from delimited files (CSV, TSV, etc.), UTF-8 is the default. For loading data from all other supported file formats (JSON, Avro, etc.), as well as unloading data, UTF-8 is the only supported character set. |
| UTF-16 | UTF16 | All languages | |
| UTF-16BE | UTF16BE | All languages | |
| UTF-16LE | UTF16LE | All languages | |
| UTF-32 | UTF32 | All languages | |
| UTF-32BE | UTF32BE | All languages | |
| UTF-32LE | UTF32LE | All languages | |
| windows-874 | WINDOWS874 | Thai | |
| windows-949 | WINDOWS949 | Korean | |
| windows-1250 | WINDOWS1250 | Czech, Hungarian, Polish, Romanian | |
| windows-1251 | WINDOWS1251 | Russian | |
| windows-1252 | WINDOWS1252 | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish | |
| windows-1253 | WINDOWS1253 | Greek | |
| windows-1254 | WINDOWS1254 | Turkish | |
| windows-1255 | WINDOWS1255 | Hebrew | |
| windows-1256 | WINDOWS1256 | Arabic | |
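In every row of the table, the ENCODING value is the character set name with hyphens and underscores removed and the letters upper-cased. The Python sketch below applies that observed pattern and checks the result against the values listed in the table. The function name `to_encoding_value` is hypothetical, and the rule is inferred from the table rather than taken from a documented guarantee.

```python
import re

# ENCODING values copied from the table of supported character sets.
SUPPORTED_ENCODINGS = frozenset({
    "BIG5", "EUCJP", "EUCKR", "GB18030", "IBM420", "IBM424", "IBM949",
    "ISO2022CN", "ISO2022JP", "ISO2022KR",
    "ISO88591", "ISO88592", "ISO88595", "ISO88596", "ISO88597",
    "ISO88598", "ISO88599", "ISO885915",
    "KOI8R", "SHIFTJIS",
    "UTF8", "UTF16", "UTF16BE", "UTF16LE", "UTF32", "UTF32BE", "UTF32LE",
    "WINDOWS874", "WINDOWS949", "WINDOWS1250", "WINDOWS1251", "WINDOWS1252",
    "WINDOWS1253", "WINDOWS1254", "WINDOWS1255", "WINDOWS1256",
})


def to_encoding_value(charset_name: str) -> str:
    """Map a character set name (e.g. 'ISO-8859-15') to its ENCODING value.

    Raises ValueError if the normalized name is not in the supported list.
    """
    candidate = re.sub(r"[-_]", "", charset_name.strip()).upper()
    if candidate not in SUPPORTED_ENCODINGS:
        raise ValueError(f"unsupported character set: {charset_name!r}")
    return candidate


print(to_encoding_value("Shift_JIS"))
print(to_encoding_value("windows-1252"))
```

Validating against the explicit set means that aliases absent from the table, such as `latin-1`, are rejected instead of being passed through to the load.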

Compression of staged files

The following table describes how Snowflake handles compression of data files for loading. The options differ depending on whether the files are staged uncompressed or already compressed:

| Feature | Supported | Notes |
|---|---|---|
| Uncompressed files | gzip | When staging uncompressed files in a Snowflake stage, the files are automatically compressed using gzip, unless compression is explicitly disabled. |
| Already-compressed files | gzip, bzip2, deflate, raw_deflate, Brotli, Zstandard | Snowflake can automatically detect each of these compression methods except Brotli, or you can explicitly specify the method that was used to compress the files. Because auto-detection isn’t supported for Brotli-compressed files, you must explicitly specify the compression method when staging or loading them. |

Snowflake doesn’t support uploading compressed tar (tape archive) files.
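To check locally which compression method a file uses before staging it, one option is to inspect its leading bytes. The Python sketch below recognizes gzip, bzip2, and Zstandard by their well-known signatures. Brotli streams carry no such signature, so the function returns `None` for them and the method has to be named explicitly. This is an illustrative pre-check with a hypothetical function name, not a description of how Snowflake's own auto-detection works.

```python
import bz2
import gzip
from typing import Optional

# Leading signature bytes for the formats that define one.
MAGIC_PREFIXES = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\x28\xb5\x2f\xfd": "Zstandard",
}


def sniff_compression(head: bytes) -> Optional[str]:
    """Guess the compression method from the first bytes of a file.

    Returns None when no known signature matches, which covers both
    uncompressed data and formats without a signature (such as Brotli).
    """
    for prefix, name in MAGIC_PREFIXES.items():
        if head.startswith(prefix):
            return name
    return None


sample = b"id,name\n1,alpha\n"
print(sniff_compression(gzip.compress(sample)))
print(sniff_compression(bz2.compress(sample)))
print(sniff_compression(sample))
```

In practice the `head` argument would be the first few bytes read from the file in binary mode; four bytes are enough for the signatures checked here.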

Encryption of staged files

The following table describes how Snowflake handles encryption of data files for loading. The options are different depending on whether the files are staged unencrypted or already-encrypted:

| Feature | Supported | Notes |
|---|---|---|
| Unencrypted files | 128-bit or 256-bit keys | All files stored on internal stages for data loading and unloading operations are automatically encrypted using AES-256 strong encryption on the server side. By default, Snowflake provides additional client-side encryption with a 128-bit key (with the option to configure a 256-bit key). |
| Already-encrypted files | User-supplied key | Files that are already encrypted can be loaded into Snowflake from external cloud storage; the key used to encrypt the files must be provided to Snowflake. |
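For files that you encrypt yourself before placing them in external cloud storage, you need a key that can later be supplied to Snowflake. The Python sketch below generates a random 128-bit or 256-bit key and encodes it as Base64 text. The function name `generate_key` is hypothetical. The Base64 form, and the assumption that user-supplied keys use the same two sizes listed in the table, should be checked against the requirements of the stage or cloud provider you use.

```python
import base64
import secrets


def generate_key(bits: int = 256) -> str:
    """Return a random key of the given size as Base64-encoded ASCII text.

    Only 128-bit and 256-bit keys are produced; the Base64 encoding is an
    assumption about how the key will be exchanged.
    """
    if bits not in (128, 256):
        raise ValueError("key size must be 128 or 256 bits")
    raw_key = secrets.token_bytes(bits // 8)  # cryptographically strong bytes
    return base64.b64encode(raw_key).decode("ascii")


key = generate_key(256)
print(len(base64.b64decode(key)), "bytes")
```

The `secrets` module is used rather than `random` because it draws from the operating system's cryptographically secure source. Treat the generated key as a secret and keep it out of scripts and logs.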