[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

8.4 Operations Affected by Character Set Support

This section describes operations that take character set information into account now.


8.4.1 Result Strings

MySQL has many operators and functions that return a string. This section answers the question: What is the character set and collation of such a string?

For simple functions that take a string input and return a string result as output, the output's character set and collation are the same as the principal input's. For example, UPPER(X) returns a string whose character string and collation are the same as that of X. The same applies for: INSTR(), LCASE(), LOWER(), LTRIM(), MID(), REPEAT(), REPLACE(), REVERSE(), RIGHT(), RPAD(), RTRIM(), SOUNDEX(), SUBSTRING(), TRIM(), UCASE(), UPPER(). (Also note: the REPLACE() function, unlike all other functions, ignores the collation of the string input and performs a case-insensitive comparison every time.)

For operations that combine multiple string inputs and return a single string output, SQL-99's "aggregation rules" apply. They are:

For example, with CASE ... WHEN a THEN b WHEN b THEN c COLLATE X END, the resultant collation is X. The same applies for: CONCAT(), GREATEST(), IF(), LEAST(), CASE, UNION, ||, ELT().

For operations that convert to character data, the result string's character set and collation are in the connection/literals character set and have the connection/literals collation. This applies for: CHAR(), CAST(), CONV(), FORMAT(). HEX(), SPACE().


8.4.2 CONVERT()

CONVERT() provides a way to convert data between different character sets. The syntax is:

 
CONVERT(expr USING transcoding_name)

In MySQL, transcoding names are the same as the corresponding character set names.

Examples:

 
SELECT CONVERT(_latin1'Müller' USING utf8);
INSERT INTO utf8table (utf8column)
   SELECT CONVERT(latin1field USING utf8) FROM latin1table;

CONVERT(... USING ...) is implemented according to the SQL-99 specification.


8.4.3 CAST()

You may also use CAST() to convert a string to a different character set. The new format is:

 
CAST ( character_string AS character_data_type
    CHARACTER SET character_set_name )

Example:

 
SELECT CAST(_latin1'test' AS CHAR CHARACTER SET utf8);

You may not use a COLLATE clause inside a CAST(), but you may use it outside, that is, CAST(... COLLATE ...) is illegal but CAST(...) COLLATE ... is legal.

Example:

 
SELECT CAST(_latin1'test' AS CHAR CHARACTER SET utf8) COLLATE utf8_bin;

If you use CAST() without specifying CHARACTER SET, then the resulting character set and collation are the connection/literal character set and its default collation. If you use CAST() with CHARACTER SET X, then the resulting character set is X and the resulting collation is X's default collation.


8.4.4 SHOW CHARACTER SET

The SHOW CHARACTER SET command shows all available character sets. It takes an optional LIKE clause that indicates which character set names to match.

For example:

 
mysql> SHOW CHARACTER SET LIKE 'latin%';
+---------+-----------------------------+-------------------+--------+
| Charset | Description                 | Default collation | Maxlen |
+---------+-----------------------------+-------------------+--------+
| latin1  | ISO 8859-1 West European    | latin1_swedish_ci |      1 |
| latin2  | ISO 8859-2 Central European | latin2_general_ci |      1 |
| latin5  | ISO 8859-9 Turkish          | latin5_turkish_ci |      1 |
| latin7  | ISO 8859-13 Baltic          | latin7_general_ci |      1 |
+---------+-----------------------------+-------------------+--------+
4 rows in set (0.00 sec)

Notes about the preceding listing:


8.4.5 SHOW COLLATION

The output from SHOW COLLATION includes all available character sets. It takes an optional LIKE clause that indicates which collation names to match.

 
mysql> SHOW COLLATION LIKE 'latin1%';
+-------------------+---------+----+---------+----------+---------+
| Collation         | Charset | Id | Default | Compiled | Sortlen |
+-------------------+---------+----+---------+----------+---------+
| latin1_german1_ci | latin1  |  5 |         |          |       0 |
| latin1_swedish_ci | latin1  |  8 | Yes     | Yes      |       0 |
| latin1_danish_ci  | latin1  | 15 |         |          |       0 |
| latin1_german2_ci | latin1  | 31 |         | Yes      |       2 |
| latin1_bin        | latin1  | 47 |         | Yes      |       0 |
| latin1_general_ci | latin1  | 48 |         |          |       0 |
| latin1_general_cs | latin1  | 49 |         |          |       0 |
+-------------------+---------+----+---------+----------+---------+
7 rows in set (0.00 sec)

The Default column indicates whether a collation is the default for its character set. Compiled indicates whether or not the character set is compiled into the server. Sortlen is related to the amount of memory required to sort strings expressed in the character set.


8.4.6 SHOW CREATE DATABASE

The following query shows a CREATE DATABASE statement that will create the given database. The result includes all database options. DEFAULT CHARACTER SET and COLLATE are supported. All database options are stored in a text file that can be found in the database directory.

 
mysql> SHOW CREATE DATABASE a;
+----------+---------------------------------------------------------------------------------+
| Database | Create Database
|
+----------+---------------------------------------------------------------------------------+
| a        | CREATE DATABASE `a` /*!40100 DEFAULT CHARACTER SET macce
COLLATE macce_ci_ai */ |
+----------+---------------------------------------------------------------------------------+
1 row in set (0.00 sec)

8.4.7 SHOW FULL COLUMNS

The SHOW COLUMNS statement now displays the collations of a table's columns, when invoked as SHOW FULL COLUMNS. Columns with CHAR, VARCHAR, or TEXT datatypes have non-NULL collations. Numeric and other non-character types have NULL collations. For example:

 
mysql> SHOW FULL COLUMNS FROM a;
+-------+---------+-------------------+------+-----+---------+-------+
| Field | Type    | Collation         | Null | Key | Default | Extra |
+-------+---------+-------------------+------+-----+---------+-------+
| a     | char(1) | latin1_swedish_ci | YES  |     | NULL    |       |
| b     | int(11) | NULL              | YES  |     | NULL    |       |
+-------+---------+-------------------+------+-----+---------+-------+
2 rows in set (0.02 sec)

The character set is not part of the display.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]


Hosting by: Hurra Communications Ltd.
Generated: 2007-01-26 17:58:46