- Categories:
SOUNDEX¶
返回包含输入字符串的语音表示形式的字符串。
You can use this function to determine whether two strings (e.g. the family names Levine and Lavine, the words to
and too, etc.) have similar pronounciations in the English language.
This function uses the Soundex phonetic algorithm, which is described in Soundex System (https://www.archives.gov/research/census/soundex). Note, however, that Snowflake provides no special handling for surname prefixes (e.g. “Van”, “De”, “La”, etc.).
SOUNDEX('Pfister') returns P236. Because the first two letters (P and f) are adjacent and share the same
Soundex code number (1), the function ignores the Soundex code number for the second letter.
Some database systems (e.g. Teradata) use a variant that retains the Soundex code number for the second letter when the first and
second letters use the same number. For that variant, the string for Pfister is P123 (not P236). To use that variant,
call the SOUNDEX_P123 function instead.
- See also:
语法
实参
varchar_expr返回发音表示形式的字符串。字符串应使用拉丁文或 Unicode 字符集。
返回
返回的值为包含输入字符串的发音表示形式的 VARCHAR。换句话说,返回值是一个字符串(不是发音),表示输入字符串的发音(而不是拼写)。
请注意以下事项:
-
The returned value starts with a letter that represents the first letter in the string followed by 3 digits (e.g.
s400,c130).For more information about how the return value is calculated, see the Soundex phonetic algorithm (in Wikipedia).
-
As mentioned earlier, if you want to use the variant that retains the Soundex code number for the second letter when the first and second letters use the same number, call the SOUNDEX_P123 function instead.
使用说明
- Because the function returns only four characters (one letter and three digits), the output is primarily determined by the first few syllables of the input, rather than the entire string.
例如,以下语句比较三个字符串,并为每个字符串返回相同的 SOUNDEX 值,因为即使它们的拼写和含义完全不同,但它们以发音相似的音节开头:
示例
以下查询返回两个名称的 SOUNDEX 值,这两个名称拼写不同,但通常发音相似:
以下查询演示如何使用 SOUNDEX 在不同表中查找可能相关的行:
创建并加载表:
在不使用 SOUNDEX 的情况下查找相关记录:
使用 SOUNDEX 查找相关记录: