Go to the first, previous, next, last section, table of contents.


How to get maximum performance out of MySQL

How does one change the size of MySQL buffers?

You can get the current buffer sizes with:

> ./mysqld --help

This should result in a list of all mysqld options and configurable variables like the following.

Possibly variables to option --set-variable (-O) are: 
back_log              current value: 5
join_buffer           current value: 131072
key_buffer            current value: 1048568
max_allowed_packet    current value: 65536
max_connections       current value: 90
max_join_size         current value: 4294967295
max_sort_length       current value: 1024
net_buffer_length     current value: 8192
record_buffer         current value: 131072
sort_buffer           current value: 2097144
table_cache           current value: 64
tmp_table_size        current value: 1048576
thread_stack          current value: 65536
back_log How many outstanding connection requests may MySQL have. This comes into play when the main MySQL thread gets VERY many connection requests in a very short time. It then takes some time (but very short) for the main thread to check the connection and start a new thread. The back_log is how many connects can be stacked during this short time before MySQL momentarily stops answering new requests. You only need to increase this if you expect a large number of connections in a short period of time. In other words, the size of the listen queue for incoming tcp/ip connections. The manual page for the unix system call listen(2) should have more details. Check your OS documentation for the maximum value for this variable.
join_buffer This buffer is used for full joins (without indexes). It is allocated one time for each full join between two tables. Increase this to get a faster full join when adding indexes is not possible. Normally the best way to get fast joins is by adding indexes.
key_buffer Buffers index blocks and are shared by all threads. You might want to increase this when doing many delete/inserts on a table with lots of indexes. To get even more speed use LOCK TABLES. See section LOCK TABLES syntax.
max_allowed_packet Max size of one packet. This allows the message buffer to grow up to this limit when needed (it is initiated to net_buffer_length). May be set very big because this is mainly to find erroneous packets. You must increase this if you are using big BLOBS. It should be as big as the biggest BLOB you want to use.
max_connections How many simultaneous clients are allowed. If you increase this you probably has to increase the number of file descriptors mysqld has. This is Operating system depended so look at you OS documentation.
max_join_size Joins that touch more records than max_join_size return an error. Set this if you have users to tend to make joins without a WHERE that take a long time and return millions of rows.
max_sort_length The number of bytes to use when sorting on BLOB or TEXT columns.
net_buffer_length The communication buffer is reset to this size between queries. This should not normally be changed, but if you have very little memory you can set it to the expected size of a query.
record_buffer Each thread that is doing a sequential scan allocates a buffer of this size for each table it scans. If you do many sequential scans you may want to increase this.
sort_buffer Each thread that needs to do a sort allocates a buffer of this size. Increase this for faster ORDER BY or GROUP BY. A sort also allocates one or two temporary files. The maximum disk-space needed is (length_of_what_is_sorted + sizeof(database_pointer)) * number_of_rows * 2. sizeof(database_pointer) is usally 4 but may grow in the future for really big tables.
table_cache Number of open tables for all threads. If this is increased you must see to that the number of open file descriptor is also increased. MySQL needs two file descriptors for each unique table.
tmp_table_size If a temporary table gets bigger than this a The table ### is full error will be generated. Increase this if you do many advanced GROUP BY queries.
thread_stack How big will each threads C stack be. A lot of the limits detected by crash-me are dependent on this. The default is normally enough.
MySQL uses algorithms that are very scalable so one can usually run with very little memory or give MySQL more memory to get better performance. If you have much memory and many tables and want maximum performance with a moderate number of clients you should use something like:
> safe_mysqld -O key_buffer=16M -O table_cache=128 \
        -O sort_buffer=4M -O record_buffer=1M &
If you have little memory with lots of connections, use something like:
> safe_mysqld -O key_buffer=512k -O sort_buffer=100k -O record_buffer=100k &
or even
> safe_mysqld -O key_buffer=512k -O sort_buffer=16k -O table_cache=32 \
        -O record_buffer=8k -O net_buffer=1K &
Note that if you change an option to mysqld it is only for that instance. To see the effects of a parameter change do something like this mysqld -O keybuffer=32m --help. If there are very many connections, 'swapping problems' may occur if mysqld has not been configured to use very little memory for each connection. It also works better if you have a enough memory for all connections of course. For example, for 200 open connections one should have a table cache of at least 200 * (max_number of tables in join).

How does MySQL use memory ?

When running mysqld, ps and other programs may report that it uses a lot of memory. This may be caused by thread-stacks on different memory addresses. For example, the Solaris ps calculates the unused memory between stacks as used memory. You can verify this by checking available swap with 'swap -s'. We have tested mysqld with commercial memory-leakage detectors so there should not be any memory leaks.

How does MySQL use indexes?

All indexes, PRIMARY, UNIQUE and INDEX(), are stored in B trees. Strings are automatically prefix- and end-space compressed.

INDEX(col1, col2) creates a multiple index over the two columns. The index can be seen as a concatenation of the given columns. If you use INDEX(col1), INDEX(col2) instead of INDEX(col1,col2) you get two separate indexes instead.

SELECT * FROM table WHERE col1=# AND col2=#

In a case of INDEX(col1,col2) the right row(s) can be fetched directly. In a case of INDEX(col1), INDEX(col2) the optimiser decides which index will find fewer rows and this index will be used to fetch the rows.

If the table has an index INDEX(col1,col2,col3...) the prefix of this can be used by the optimiser to find the rows. This means that the above gives you search capabilities on: INDEX(col1) and INDEX(col1,col2) and INDEX(col1,col2,col3)...

MySQL can't use a portion of an index to locate rows through an index.

With the definition INDEX(col1,col2,col3):

SELECT * FROM table WHERE col1=#
SELECT * FROM table WHERE col2=#
SELECT * FROM table WHERE col2=# and col3=#

only the first query will use indexes.

What kind of optimisation is done on the WHERE clause?

(Incomplete, MySQL does a lot of optimisations.)

The first issue about making a slow SELECT ... WHERE faster is to check if one could add an index. All references between different tables should usually be done with indexes. One can use the EXPLAIN command to check which indexes are used in a select. See section EXPLAIN syntax. Get information about a SELECT.. See section How does MySQL use indexes?

How does MySQL open & close tables?

The cache of open tables can grow to a maximum of table-cache (default 64, changeable with -O table_cache=#). A table is never closed, except when the cache is full and another thread tries to open a table or if one uses 'mysqladmin refresh'.

When the limit is reached, MySQL closes as many tables as possible, until the cache size has been reached or there are no more unused tables. This means that if all tables are in use by some threads, there will be more open tables than the cache limit, but the extra tables will be closed eventually. Tables are closed according to last-used order.

A table is opened (again) for each concurrent access. This means that if one has two threads running on the same table or access the table twice in the same query (with AS) the table needs to be opened twice. The first open of any table takes two file descriptors, each following use of the table takes only one file descriptor.

If MySQL notices that a table is a symbolic linked it will resolve the symlink and use the table it points to instead. MySQL will not notice if the database directory is symbolic linked! If you symlink a database to another and use the same table trough both databases, the tables will be inconsistent after any update to the tables.

What are the drawbacks of creating possibly thousands of tables in a database?

Each table is actually three files. If you have many files in a directory open, close and create will be slow. If you also do selects on many different tables there will be a little overhead because when the table cache is full, for every table that has to be opened another has to be closed. One can make the overhead smaller by making the table cache larger.

How does MySQL lock tables?

All locking in MySQL is deadlock free. This is managed by always requesting all needed locks at once at query start and always locking the tables in the same order.

The locking method MySQL uses for WRITE lock works as follows:

If there is no locks on the table, put a write lock on it, else put the lock in the write lock queue.

The locking method MySQL uses for READ locks works as follows:

If there is no write locks on the table, put a read lock on it else put the lock in the read lock queue.

When a lock is released first use give the lock to the threads in the write lock queue and after this to the threads in the read lock queue.

This means that if you have many updates on the same table, select statements will be waiting until there is no more updates.

To fix this in the case where you do many inserts and many selects on the same table you could insert rows in another table and once in a while update the other table with all records from the temporary table.

This can be done with the following code:

LOCK TABLES real_table WRITE, insert_table WRITE
insert into real_table select * from insert_table
delete from insert_table
UNLOCK TABLES

One could also change the locking code in mysys/thr_lock.c to use only one queue. In this case write locks would have the same priority that read locks and this could help some applications.

How should I arrange my table to be as fast/small as possible?

To check how you are doing, run isamchk -evi on the .ISM file. @c See section The MySQL table check, optimise and repair program.

What affects the speed of the INSERT statement?

The time to insert a record consists of:

Where (number) is proportional time. This does not take into consideration the initial overhead to open tables (which is done once for each simultaneous running query).

The size of the table slows down the insert of indexes with N log N (B-trees).

A way of speeding up inserts is to lock your table during the inserts.

LOCK TABLES a WRITE;

INSERT INTO a VALUES (1,23)
INSERT INTO a VALUES (2,34)
INSERT INTO a VALUES (4,33)
INSERT INTO a VALUES (8,26)
INSERT INTO a VALUES (6,29)

UNLOCK TABLES;

The main speed difference is that the index buffer is only flushed once to disk for all inserts. Normally there would be as many index buffer flushes as there are inserts.

Locking will also lower the total time of multi-connection test but the maximum wait time for some threads will go up.

For example:

thread 1 does 1000 inserts
thread 2, 3, and 4 does 1 insert
thread 5 does 1000 inserts

If you don't use locking, 2, 3 and 4 will finish before 1 and 5. If you use locking 2,3,4 may finish before 1 or 5 but probably not, but the total time should be about 40% faster.

As INSERTs, UPDATEs and DELETEs are very fast in MySQL, one will obtain better overall performance by adding locks around everything that does more than about 5 inserts/updates in a row. If one does very many inserts in a row one could do a UNLOCK TABLES followed by a LOCK TABLES once in a while (about each 1000 rows) to give other threads access to the table. This would still give a nice performance gain.

Of course LOAD DATA INFILE is much faster still.

What affects the speed of DELETE statements?

The delete time of a record is exactly proportional to the number of indexes. To increase the speed of deletes you can increase the size of the index cache. The default index cache is 1M and to get faster deletes it should be increased a couple of times (try 16M if you have enough memory).

How do I get MySQL to run at full speed?

What are the different row formats? Or when to use VARCHAR/CHAR?

MySQL dosen't have true SQL VARCHAR() types.

MySQL has instead 3 different ways to store records and uses this to emulate VARCHAR():

If one doesn't use any of the VARCHAR, BLOB or TEXT column types a fixed row size is used, otherwise a dynamic row size is used. CHAR() and VARCHAR() are treated identically from the applications point of view; Both truncates end space from the column when the column is accessed.

You can check the format used in a table with isamchk -d.

MySQL has three different table formats:

  1. Fixed length tables.
  2. Dynamic tables
  3. Compressed tables:

Why so many open tables?

When you run mysqladmin status you get something like:

Uptime: 426 Running threads: 1 Questions: 11082 Reloads: 1 Open tables: 12

This can be somewhat perplexing if you only have 6 tables.

As MySQL is multithreaded it may have many queries on the same table at once. To minimise the problem with two threads having different states on the same file, I open the table again for each concurrent thread. This takes some memory and one extra file descriptor for the data file. The index file descriptor is shared between all threads.


Go to the first, previous, next, last section, table of contents.


Casa de Bender