![]() |
|
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
MySQL keeps row data and index data in separate files. Many (almost all) other databases mix row and index data in the same file. We believe that the MySQL choice is better for a very wide range of modern systems.
Another way to store the row data is to keep the information for each column in a separate area (examples are SDBM and Focus). This will cause a performance hit for every query that accesses more than one column. Because this degenerates so quickly when more than one column is accessed, we believe that this model is not good for general purpose databases.
The more common case is that the index and data are stored together (as in Oracle/Sybase et al). In this case you will find the row information at the leaf page of the index. The good thing with this layout is that it, in many cases, depending on how well the index is cached, saves a disk read. The bad things with this layout are:
One of the most basic optimisation is to get your data (and indexes) to take as little space on the disk (and in memory) as possible. This can give huge improvements because disk reads are faster and normally less main memory will be used. Indexing also takes less resources if done on smaller columns.
MySQL supports a lot of different table types and row formats. Choosing the right table format may give you a big performance gain. See section MySQL Table Types.
You can get better performance on a table and minimise storage space using the techniques listed here:
MEDIUMINT
is often better than INT
.
NOT NULL
if possible. It makes everything
faster and you save one bit per column. Note that if you really need
NULL
in your application you should definitely use it. Just avoid
having it on all columns by default.
VARCHAR
,
TEXT
, or BLOB
columns), a fixed-size record format is
used. This is faster but unfortunately may waste some space.
See section MyISAM
table formats.
Indexes are used to find rows with a specific value of one column fast. Without an index MySQL has to start with the first record and then read through the whole table until it finds the relevant rows. The bigger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly get a position to seek to in the middle of the datafile without having to look at all the data. If a table has 1000 rows, this is at least 100 times faster than reading sequentially. Note that if you need to access almost all 1000 rows it is faster to read sequentially because we then avoid disk seeks.
All MySQL indexes (PRIMARY
, UNIQUE
, and
INDEX
) are stored in B-trees. Strings are automatically prefix-
and end-space compressed. See section CREATE INDEX
.
Indexes are used to:
WHERE
clause.
MAX()
or MIN()
value for a specific indexed
column. This is optimised by a preprocessor that checks if you are
using WHERE
key_part_# = constant on all key parts < N. In this case
MySQL will do a single key lookup and replace the MIN()
expression with a constant. If all expressions are replaced with
constants, the query will return at once:
SELECT MIN(key_part2),MAX(key_part2) FROM table_name where key_part1=10 |
ORDER BY
key_part_1,key_part_2
). The key is read in reverse order if all key
parts are followed by DESC
. See section How MySQL Optimises ORDER BY
.
SELECT key_part3 FROM table_name WHERE key_part1=1 |
Suppose you issue the following SELECT
statement:
mysql> SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2; |
If a multiple-column index exists on col1
and col2
, the
appropriate rows can be fetched directly. If separate single-column
indexes exist on col1
and col2
, the optimiser tries to
find the most restrictive index by deciding which index will find fewer
rows and using that index to fetch the rows.
If the table has a multiple-column index, any leftmost prefix of the
index can be used by the optimiser to find rows. For example, if you
have a three-column index on (col1,col2,col3)
, you have indexed
search capabilities on (col1)
, (col1,col2)
, and
(col1,col2,col3)
.
MySQL can't use a partial index if the columns don't form a
leftmost prefix of the index. Suppose you have the SELECT
statements shown here:
mysql> SELECT * FROM tbl_name WHERE col1=val1; mysql> SELECT * FROM tbl_name WHERE col2=val2; mysql> SELECT * FROM tbl_name WHERE col2=val2 AND col3=val3; |
If an index exists on (col1,col2,col3)
, only the first query
shown above uses the index. The second and third queries do involve
indexed columns, but (col2)
and (col2,col3)
are not
leftmost prefixes of (col1,col2,col3)
.
MySQL also uses indexes for LIKE
comparisons if the argument
to LIKE
is a constant string that doesn't start with a wildcard
character. For example, the following SELECT
statements use indexes:
mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Patrick%"; mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Pat%_ck%"; |
In the first statement, only rows with "Patrick" <= key_col <
"Patricl"
are considered. In the second statement, only rows with
"Pat" <= key_col < "Pau"
are considered.
The following SELECT
statements will not use indexes:
mysql> SELECT * FROM tbl_name WHERE key_col LIKE "%Patrick%"; mysql> SELECT * FROM tbl_name WHERE key_col LIKE other_col; |
In the first statement, the LIKE
value begins with a wildcard
character. In the second statement, the LIKE
value is not a
constant.
MySQL 4.0 does another optimisation on LIKE
. If you use
... LIKE "%string%"
and string
is longer than 3 characters,
MySQL will use the Turbo Boyer-Moore
algorithm to initialise the
pattern for the string and then use this pattern to perform the search
quicker.
Searching using column_name IS NULL
will use indexes if column_name
is an index.
MySQL normally uses the index that finds the least number of rows. An
index is used for columns that you compare with the following operators:
=
, >
, >=
, <
, <=
, BETWEEN
, and a
LIKE
with a non-wildcard prefix like 'something%'
.
Any index that doesn't span all AND
levels in the WHERE
clause
is not used to optimise the query. In other words: To be able to use an
index, a prefix of the index must be used in every AND
group.
The following WHERE
clauses use indexes:
... WHERE index_part1=1 AND index_part2=2 AND other_column=3 ... WHERE index=1 OR A=10 AND index=2 /* index = 1 OR index = 2 */ ... WHERE index_part1='hello' AND index_part_3=5 /* optimised like "index_part1='hello'" */ ... WHERE index1=1 and index2=2 or index1=3 and index3=3; /* Can use index on index1 but not on index2 or index 3 */ |
These WHERE
clauses do NOT use indexes:
... WHERE index_part2=1 AND index_part3=2 /* index_part_1 is not used */ ... WHERE index=1 OR A=10 /* Index is not used in both AND parts */ ... WHERE index_part1=1 OR index_part2=10 /* No index spans all rows */ |
Note that in some cases MySQL will not use an index, even if one would be available. Some of the cases where this happens are:
LIMIT
to only retrieve
part of the rows, MySQL will use an index anyway, as it can
much more quickly find the few rows to return in the result.
All MySQL column types can be indexed. Use of indexes on the
relevant columns is the best way to improve the performance of SELECT
operations.
The maximum number of keys and the maximum index length is defined per storage engine. See section MySQL Table Types. You can with all storage engines have at least 16 keys and a total index length of at least 256 bytes.
For CHAR
and VARCHAR
columns, you can index a prefix of a
column. This is much faster and requires less disk space than indexing the
whole column. The syntax to use in the CREATE TABLE
statement to
index a column prefix looks like this:
KEY index_name (col_name(length)) |
The example here creates an index for the first 10 characters of the
name
column:
mysql> CREATE TABLE test ( -> name CHAR(200) NOT NULL, -> KEY index_name (name(10))); |
For BLOB
and TEXT
columns, you must index a prefix of the
column. You cannot index the entire column.
In MySQL Version 3.23.23 or later, you can also create special
FULLTEXT
indexes. They are used for full-text search. Only the
MyISAM
table type supports FULLTEXT
indexes. They can be
created only from CHAR
, VARCHAR
, and TEXT
columns.
Indexing always happens over the entire column and partial indexing is not
supported. See MySQL Full-text Search for details.
MySQL can create indexes on multiple columns. An index may
consist of up to 15 columns. (On CHAR
and VARCHAR
columns you
can also use a prefix of the column as a part of an index.)
A multiple-column index can be considered a sorted array containing values that are created by concatenating the values of the indexed columns.
MySQL uses multiple-column indexes in such a way that queries are
fast when you specify a known quantity for the first column of the index in a
WHERE
clause, even if you don't specify values for the other columns.
Suppose a table is created using the following specification:
mysql> CREATE TABLE test ( -> id INT NOT NULL, -> last_name CHAR(30) NOT NULL, -> first_name CHAR(30) NOT NULL, -> PRIMARY KEY (id), -> INDEX name (last_name,first_name)); |
Then the index name
is an index over last_name
and
first_name
. The index will be used for queries that specify
values in a known range for last_name
, or for both last_name
and first_name
.
Therefore, the name
index will be used in the following queries:
mysql> SELECT * FROM test WHERE last_name="Widenius"; mysql> SELECT * FROM test WHERE last_name="Widenius" -> AND first_name="Michael"; mysql> SELECT * FROM test WHERE last_name="Widenius" -> AND (first_name="Michael" OR first_name="Monty"); mysql> SELECT * FROM test WHERE last_name="Widenius" -> AND first_name >="M" AND first_name < "N"; |
However, the name
index will NOT be used in the following queries:
mysql> SELECT * FROM test WHERE first_name="Michael"; mysql> SELECT * FROM test WHERE last_name="Widenius" -> OR first_name="Michael"; |
For more information on the manner in which MySQL uses indexes to improve query performance, see MySQL indexes.
When you run mysqladmin status
, you'll see something like this:
Uptime: 426 Running threads: 1 Questions: 11082 Reloads: 1 Open tables: 12 |
This can be somewhat perplexing if you only have 6 tables.
MySQL is multi-threaded, so it may have many queries on the same table
simultaneously. To minimise the problem with two threads having
different states on the same file, the table is opened independently by
each concurrent thread. This takes some memory but will normaly increase
performance. With ISAM
and MyISAM
tables this also requires
one extra file descriptor for the datafile. With these table types the index
file descriptor is shared between all threads.
You can read more about this topic in the next section. See section How MySQL Opens and Closes Tables.
table_cache
, max_connections
, and max_tmp_tables
affect the maximum number of files the server keeps open. If you
increase one or both of these values, you may run up against a limit
imposed by your operating system on the per-process number of open file
descriptors. However, you can increase the limit on many systems.
Consult your OS documentation to find out how to do this, because the
method for changing the limit varies widely from system to system.
table_cache
is related to max_connections
. For example,
for 200 concurrent running connections, you should have a table cache of
at least 200 * n
, where n
is the maximum number of tables
in a join. You also need to reserve some extra file descriptors for
temporary tables and files.
Make sure that your operating system can handle the number of open file
descriptors implied by the table_cache
setting. If
table_cache
is set too high, MySQL may run out of file
descriptors and refuse connections, fail to perform queries, and be very
unreliable. You also have to take into account that the MyISAM
storage
engine needs two file descriptors for each unique open table. You can
in increase the number of file descriptors available for MySQL with
the --open-files-limit=#
startup option. See section File Not Found.
The cache of open tables will be kept at a level of table_cache
entries. The default value is 64; this can be changed with the -O
table_cache=#
option to mysqld
). Note that MySQL may
temporarily open even more tables to be able to execute queries.
A not used table is closed and removed from the table cache under the following circumstances:
table_cache
entries and
a thread is no longer using a table.
mysqladmin refresh
or
mysqladmin flush-tables
.
FLUSH TABLES
statement.
When the table cache fills up, the server uses the following procedure to locate a cache entry to use:
A table is opened for each concurrent access. This means that
if you have two threads accessing the same table or access the table
twice in the same query (with AS
) the table needs to be opened twice.
The first open of any table takes two file descriptors; each additional
use of the table takes only one file descriptor. The extra descriptor
for the first open is used for the index file; this descriptor is shared
among all threads.
If you are opening a table with the HANDLER table_name OPEN
statement, a dedicated table object is allocated for the thread.
This table object is not shared by other threads an will not be closed
until the thread calls HANDLER table_name CLOSE
or the thread dies.
See section HANDLER
. When this happens, the table is put
back in the table cache (if it isn't full).
You can check if your table cache is too small by checking the mysqld
variable Opened_tables
. If this is quite big, even if you
haven't done a lot of FLUSH TABLES
, you should increase your table
cache. See section Opened_tables
.
If you have many files in a directory, open, close, and create operations will
be slow. If you execute SELECT
statements on many different tables,
there will be a little overhead when the table cache is full, because for
every table that has to be opened, another must be closed. You can reduce
this overhead by making the table cache larger.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] |
Hosting by: Hurra Communications Ltd.
Generated: 2007-01-26 17:58:45