SQL functions: Passing in columns that have the upper, lower, or trim collation specifier

Attention

This behavior change is in the 2024_02 bundle.

For the current status of the bundle, refer to Bundle History.

The upper and lower collation specifiers convert strings to upper or lower case (respectively) before strings are compared. The trim collation specifier removes leading and trailing spaces before strings are compared.

In cases when you pass in columns with these specifiers to some of the SQL functions, the behavior changes in the following ways:

Before the change:

The LIKE function ignores the upper, lower, and trim specifiers, which results in a case-sensitive comparison.

In addition, the following functions do not support columns with the upper, lower, and trim specifiers:

If you pass a column with the upper, lower, or trim specifier to these functions, a compilation error occurs.

After the change:

The LIKE function respects the upper, lower, and trim specifiers, which results in a case-insensitive comparison.

As a result, queries that use the LIKE function might return additional rows (see the example below).

Note that the LIKE function does not support combinations with locale specifiers (for example, en-upper).

In addition, the following functions now support columns with the upper, lower, and trim specifiers.

  • Variants of the LIKE function:

    • LIKE ALL

    • LIKE ANY

    • ILIKE ANY

  • CONTAINS

  • ENDSWITH

  • POSITION

  • REPLACE

  • SPLIT

  • SPLIT_PART

  • STARTSWITH

Example of the effects of the change on the LIKE function

As noted above, if a column has the upper, lower, or trim specifier, queries with the LIKE function might return additional rows. For example, suppose that a table has a column with the lower specifier. Suppose that the text in the table differs in case.

CREATE OR REPLACE TABLE collated_like (
  col_a VARCHAR,
  col_b VARCHAR COLLATE 'lower'
);

INSERT INTO collated_like VALUES ('abc', 'abc'), ('ABC','ABC');
Copy

Before the behavior change, each of the following queries that use the LIKE function return one row with the value 'abc':

SELECT * FROM collated_like WHERE col_a LIKE '%b%';

SELECT * FROM collated_like WHERE col_a COLLATE 'lower' LIKE '%b%';

SELECT * FROM collated_like WHERE col_b LIKE '%b%';
Copy
+-------+-------+
| COL_A | COL_B |
|-------+-------|
| abc   | abc   |
+-------+-------+

After the behavior change, the query that does not use the lower specification column with the LIKE function still returns one row:

SELECT * FROM collated_like WHERE col_a LIKE '%b%';
Copy
+-------+-------+
| COL_A | COL_B |
|-------+-------|
| abc   | abc   |
+-------+-------+

However, the queries that use the COLLATE function to specify lower and queries that pass a column with the lower specification to the LIKE function return two rows:

SELECT * FROM collated_like WHERE col_a COLLATE 'lower' LIKE '%b%';

SELECT * FROM collated_like WHERE col_b LIKE '%b%';
Copy
+-------+-------+
| COL_A | COL_B |
|-------+-------|
| abc   | abc   |
| ABC   | ABC   |
+-------+-------+

This is roughly equivalent to ILIKE. To determine if you should expect changes to these queries, you can replace LIKE with ILIKE in these queries.

Preserving the behavior before the change

If your columns use the upper, lower, or trim specification and you want to preserve the behavior before the change, you can use the COLLATE function with an empty specification to indicate that the upper, lower, or trim specification associated with the column should not be used:

SELECT * FROM collated_like WHERE col_b COLLATE '' LIKE '%b%';
Copy

Note

If you are using this approach with the LIKE function, make sure that both the subject and pattern do not have a collation specification applied.

Ref: 1535

Language: English