类别：: :doc:`/sql-reference/functions-string`（匹配/比较）

SUBSTR、SUBSTRING¶

返回 base_expr 中的字符串或二进制值的部分，从 start_expr 指定的字符/字节开始，并且可选择限制长度。

这些函数为同义函数。

另请参阅:: LEFT、RIGHT

语法¶

SUBSTR( <base_expr>, <start_expr> [ , <length_expr> ] )

SUBSTRING( <base_expr>, <start_expr> [ , <length_expr> ] )

Copy

实参¶

base_expr

计算结果为 VARCHAR 或 BINARY 值。

start_expr

计算结果为整数的表达式。它指定子字符串的起始偏移量。偏移量通过以下数量来测量：

UTF-8 字符数（如果输入为 VARCHAR 值）。
字节数（如果输入为 BINARY 值）。

起始位置从 1 而不是 0 开始。例如，SUBSTR('abc', 1, 1) 返回 a，而不是 b。

length_expr

计算结果为整数的表达式。它指定：

如果输入为 VARCHAR，返回的 UTF-8 字符数。
如果输入为 BINARY，返回的字节数。

指定一个大于或等于零的长度。如果长度为负数，该函数将返回空字符串。

返回¶

返回值的数据类型与 :samp:`{base_expr}`（VARCHAR 或 BINARY）的数据类型相同。

如果任何输入为 NULL，则返回 NULL。

使用说明¶

如果指定了 length_expr，则最多返回 length_expr 个字符/字节。如果未指定 length_expr，则返回字符串或二进制值末尾之前的所有字符。
start_expr 中的值从 1 开始：
- 如果指定 0，则将其视为 1。
- 如果指定了负值，则起始位置计算为从字符串或二进制值末尾开始的 start_expr 个字符/字节。如果位置超出字符串或二进制值的范围，则返回空值。

排序规则详细信息¶

排序规则适用于 VARCHAR 输入。如果第一个参数的输入数据类型为 BINARY，则排序规则不适用。
No impact. 尽管在语法上可以接受排序规则，但排序规则对处理没有影响。例如，某些语言的字母包含两个字符或三个字符（如匈牙利语中的“dzs”，捷克语中的“ch”），这些字母作为长度实参时仍会算作两个或三个字符（而不是一个字符）。
The collation of the result is the same as the collation of the input. 如果返回值作为嵌套函数调用的一部分传递给另一个函数，这可能很有用。

示例¶

以下示例使用 SUBSTR 函数。

基本示例¶

以下示例使用 SUBSTR 函数返回从第九个字符开始的字符串部分，并将返回值的长度限制为三个字符：

SELECT SUBSTR('testing 1 2 3', 9, 3);

Copy

+-------------------------------+
| SUBSTR('TESTING 1 2 3', 9, 3) |
|-------------------------------|
| 1 2                           |
+-------------------------------+

指定不同的开始值和长度值¶

以下示例显示为 start_expr 和 length_expr 指定不同值时，为相同 base_expr 返回的子字符串：

CREATE OR REPLACE TABLE test_substr (
    base_value VARCHAR,
    start_value INT,
    length_value INT)
  AS SELECT
    column1,
    column2,
    column3
  FROM
    VALUES
      ('mystring', -1, 3),
      ('mystring', -3, 3),
      ('mystring', -3, 7),
      ('mystring', -5, 3),
      ('mystring', -7, 3),
      ('mystring', 0, 3),
      ('mystring', 0, 7),
      ('mystring', 1, 3),
      ('mystring', 1, 7),
      ('mystring', 3, 3),
      ('mystring', 3, 7),
      ('mystring', 5, 3),
      ('mystring', 5, 7),
      ('mystring', 7, 3),
      ('mystring', NULL, 3),
      ('mystring', 3, NULL);

SELECT base_value,
       start_value,
       length_value,
       SUBSTR(base_value, start_value, length_value) AS substring
  FROM test_substr;

Copy

+------------+-------------+--------------+-----------+
| BASE_VALUE | START_VALUE | LENGTH_VALUE | SUBSTRING |
|------------+-------------+--------------+-----------|
| mystring   |          -1 |            3 | g         |
| mystring   |          -3 |            3 | ing       |
| mystring   |          -3 |            7 | ing       |
| mystring   |          -5 |            3 | tri       |
| mystring   |          -7 |            3 | yst       |
| mystring   |           0 |            3 | mys       |
| mystring   |           0 |            7 | mystrin   |
| mystring   |           1 |            3 | mys       |
| mystring   |           1 |            7 | mystrin   |
| mystring   |           3 |            3 | str       |
| mystring   |           3 |            7 | string    |
| mystring   |           5 |            3 | rin       |
| mystring   |           5 |            7 | ring      |
| mystring   |           7 |            3 | ng        |
| mystring   |        NULL |            3 | NULL      |
| mystring   |           3 |         NULL | NULL      |
+------------+-------------+--------------+-----------+

返回电子邮件地址、电话和日期字符串的子字符串¶

以下示例返回表中客户信息的子字符串。

创建表并插入数据：

CREATE OR REPLACE TABLE customer_contact_example (
    cust_id INT,
    cust_email VARCHAR,
    cust_phone VARCHAR,
    activation_date VARCHAR)
  AS SELECT
    column1,
    column2,
    column3,
    column4
  FROM
    VALUES
      (1, 'some_text@example.com', '800-555-0100', '20210320'),
      (2, 'some_other_text@example.org', '800-555-0101', '20240509'),
      (3, 'some_different_text@example.net', '800-555-0102', '20191017');

SELECT * from customer_contact_example;

Copy

+---------+---------------------------------+--------------+-----------------+
| CUST_ID | CUST_EMAIL                      | CUST_PHONE   | ACTIVATION_DATE |
|---------+---------------------------------+--------------+-----------------|
|       1 | some_text@example.com           | 800-555-0100 | 20210320        |
|       2 | some_other_text@example.org     | 800-555-0101 | 20240509        |
|       3 | some_different_text@example.net | 800-555-0102 | 20191017        |
+---------+---------------------------------+--------------+-----------------+

将 POSITION 函数与 SUBSTR 函数结合使用，以便从电子邮件地址中提取域名。此示例找到每个字符串中 @ 的位置，并通过加一从下一个字符开始：

SELECT cust_id,
       cust_email,
       SUBSTR(cust_email, POSITION('@' IN cust_email) + 1) AS domain
  FROM customer_contact_example;

Copy

+---------+---------------------------------+-------------+
| CUST_ID | CUST_EMAIL                      | DOMAIN      |
|---------+---------------------------------+-------------|
|       1 | some_text@example.com           | example.com |
|       2 | some_other_text@example.org     | example.org |
|       3 | some_different_text@example.net | example.net |
+---------+---------------------------------+-------------+

小技巧

您可以使用 POSITION 函数查找其他字符的位置，例如空字符 (' ') 或下划线 (_)。

在表的 cust_phone 列中，区号始终是前三个字符。从电话号码中提取区号：

SELECT cust_id,
       cust_phone,
       SUBSTR(cust_phone, 1, 3) AS area_code
  FROM customer_contact_example;

Copy

+---------+--------------+-----------+
| CUST_ID | CUST_PHONE   | AREA_CODE |
|---------+--------------+-----------|
|       1 | 800-555-0100 | 800       |
|       2 | 800-555-0101 | 800       |
|       3 | 800-555-0102 | 800       |
+---------+--------------+-----------+

从电话号码中移除区号：

SELECT cust_id,
       cust_phone,
       SUBSTR(cust_phone, 5) AS phone_without_area_code
  FROM customer_contact_example;

Copy

+---------+--------------+-------------------------+
| CUST_ID | CUST_PHONE   | PHONE_WITHOUT_AREA_CODE |
|---------+--------------+-------------------------|
|       1 | 800-555-0100 | 555-0100                |
|       2 | 800-555-0101 | 555-0101                |
|       3 | 800-555-0102 | 555-0102                |
+---------+--------------+-------------------------+

在表的 activation_date 列中，日期格式始终为 YYYYMMDD。从这些字符串中提取年、月和日：

SELECT cust_id,
       activation_date,
       SUBSTR(activation_date, 1, 4) AS year,
       SUBSTR(activation_date, 5, 2) AS month,
       SUBSTR(activation_date, 7, 2) AS day
  FROM customer_contact_example;

Copy

+---------+-----------------+------+-------+-----+
| CUST_ID | ACTIVATION_DATE | YEAR | MONTH | DAY |
|---------+-----------------+------+-------+-----|
|       1 | 20210320        | 2021 | 03    | 20  |
|       2 | 20240509        | 2024 | 05    | 09  |
|       3 | 20191017        | 2019 | 10    | 17  |
+---------+-----------------+------+-------+-----+