- Categories:
HASH¶
返回已签名的 64 位哈希值。请注意,HASH 从不返回 NULL,即使对于 NULL 输入也如此。
HASH 函数的可能用途包括:
- 将偏斜数据值转换为可能更随机分布或更均匀分布的值。
例如,您可以对一组高度偏斜的值进行哈希处理,并生成一组更可能随机分布或均匀分布的值。
-
Put data in buckets. Because hashing can convert skewed data values to closer-to-evenly distributed values, you can use hashing to help take skewed values and create approximately evenly-sized buckets.
If hashing alone is not sufficient to get the number of distinct buckets that you want, you can combine hashing with the ROUND or WIDTH_BUCKET functions.
Note
HASH is a proprietary function that accepts a variable number of input expressions of arbitrary types and returns a signed value. It is not a cryptographic hash function and should not be used as such.
加密哈希函数具有此函数所没有的一些属性,例如:
- 无法反转值的加密哈希以查找原始值。
- 给定一个值,无法找到具有相同加密哈希值的另一个值。
For cryptographic purposes, use the SHA families of functions (in String & binary functions).
- See also:
语法
实参
expr表达式可以是任何 Snowflake 数据类型的常规表达式。
*根据每条记录(包括带 NULL 值的记录)中的所有列,返回一个哈希处理的值。
When you pass a wildcard to the function, you can qualify the wildcard with the name or alias for the table. For example, to pass in all of the columns from the table named
mytable, specify the following:You can also use the ILIKE and EXCLUDE keywords for filtering:
-
ILIKE filters for column names that match the specified pattern. Only one pattern is allowed. For example:
-
EXCLUDE filters out column names that don’t match the specified column or columns. For example:
Qualifiers are valid when you use these keywords. The following example uses the ILIKE keyword to filter for all of the columns that match the pattern
col1%in the tablemytable:The ILIKE and EXCLUDE keywords can’t be combined in a single function call.
对于此函数,ILIKE 和 EXCLUDE 关键字仅在 SELECT 列表或 GROUP BY 子句中有效。
For more information about the ILIKE and EXCLUDE keywords, see the “Parameters” section in SELECT.
-
返回
返回已签名的 64 位值 NUMBER (19,0)。
HASH 从不返回 NULL,即使对于 NULL 输入也是如此。
使用说明
-
HASH is stable in the sense that it guarantees:
-
Any two values of type NUMBER that compare equally will hash to the same hash value, even if the respective types have different precision and/or scale.
-
Any two values of type FLOAT that can be converted to NUMBER(38, 0) without loss of precision will hash to the same value. For example, the following all return the same hash value:
HASH(10::NUMBER(38,0))HASH(10::NUMBER(5,3))HASH(10::FLOAT)
-
Any two values of type TIMESTAMP_TZ that compare equally will hash to the same hash value, even if the timestamps are from different time zones.
-
此保证也适用于 VARIANT 列中的 NUMBER、FLOAT 和 TIMESTAMP_TZ 值。
-
Note that this guarantee does not apply to other combinations of types, even if implicit conversions exist between the types. For example, with overwhelming probability, the following will not return the same hash values even though
10 = '10'after implicit conversion:HASH(10)HASH('10')
-
-
HASH(*)means to create a single hashed value based on all columns in the row. -
Do not use HASH to create unique keys. HASH has a finite resolution of 64 bits, and is guaranteed to return non-unique values if more than 2^64 values are entered (e.g. for a table with more than 2^64 rows). In practice, if the input is on the order of 2^32 rows (approximately 4 billion rows) or more, the function is reasonably likely to return at least one duplicate value.
排序规则详细信息
No impact.
- Two strings that are identical but have different collation specifications have the same hash value. In other words, only the string, not the collation specification, affects the hash value.
- Two strings that are different, but compare equal according to a collation, might have a different hash value. For example, two strings that are identical using punctuation-insensitive collation will normally have different hash values because only the string, not the collation specification, affects the hash value.
示例
The example below shows that even if the table contains multiple columns, HASH(*) returns a single value per row.