[Top] [Contents] [Index] [ ? ]

MySQL Internals Manual for version 5.0.2-alpha.

Copyright © 1998-2004 MySQL AB


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1. Coding Guidelines

The purpose of this chapter is to establish a set of guidelines that will help you, the developer, to follow the established MySQL coding styles and standards when writing code. Consistent style is important as it makes it easier to maintain our code and allows you to find specific routines much more quickly.

The standards contained within this chapter apply to the mysql server, and do not necessarily apply to other projects such as JDBC, ODBC, Administrator, etc.

Indentation and Spacing

The following rules apply to indentation and spacing of code within a source file:

Naming Conventions

Commenting Code

General Development Guidelines

Suggested mode in emacs:

 
(require 'font-lock)
(require 'cc-mode)
(setq global-font-lock-mode t) ;;colors in all buffers that support it
(setq font-lock-maximum-decoration t) ;;maximum color
(c-add-style "MY"
 '("K&R"
   '("MY"
     (c-basic-offset . 2)
     (c-comment-only-line-offset . 0)
     (c-offsets-alist . ((statement-block-intro . +)
                         (knr-argdecl-intro . 0)
                         (substatement-open . 0)
                         (label . -)
                         (statement-cont . +)
                         (arglist-intro . c-lineup-arglist-intro-after-paren)
                         (arglist-close . c-lineup-arglist)
                         ))
     )))

(setq c-mode-common-hook '(lambda ()
                            (c-set-style "MY")
                            (setq tab-width 8)
                            (setq indent-tabs-mode t)
                            (setq comment-column 48)))

(c-set-style "MY")
(setq c-default-style "MY")

Basic vim setup:

 
set tabstop=8
set shiftwidth=2
set backspace=2
set softtabstop
set smartindent
set cindent
set cinoptions=g0:0t0c2C1(0f0l1
"set expandtab "uncomment if you don't want to use tabstops

Another vim setup:

 
set tabstop=8
set shiftwidth=2
set bs=2
set et
set sts=2
set tw=78
set formatoptions=cqroa1
set cinoptions=g0:0t0c2C1(0f0l1
set cindent

function InsertShiftTabWrapper()
  let num_spaces = 48 - virtcol('.')
  let line = ' '
  while (num_spaces > 0)
    let line = line . ' '
    let num_spaces = num_spaces - 1
  endwhile
  return line
endfunction
" jump to 48th column by Shift-Tab - to place a comment there
inoremap <S-tab> <c-r>=InsertShiftTabWrapper()<cr>
" highlight trailing spaces as errors
let c_space_errors=1

DBUG Tags

Here are some of the DBUG tags we now use:

enter

Arguments to the function.

exit

Results from the function.

info

Something that may be interesting.

warning

When something doesn't go the usual route or may be wrong.

error

When something went wrong.

loop

Write in a loop, that is probably only useful when debugging the loop. These should normally be deleted when you are satisfied with the code and it has been in real use for a while.

Some tags specific to mysqld, because we want to watch these carefully:

trans

Starting/stopping transactions.

quit

info when mysqld is preparing to die.

query

Print query.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2. The Optimizer

Definitions

This description uses a narrow definition: The OPTIMISER is the set of routines which decide what execution path the DBMS should take for queries.

MySQL changes these routines frequently, so you should compare what is said here with what's in the current source code. To make that easy, this description includes notes referring to the relevant routine, for example "See: `/sql/select_cc', optimize_cond()".

When one query is changed into another query which delivers the same result, that is a TRANSFORMATION. For example, the DBMS could change

 
SELECT ... WHERE 5 = a

to

 
SELECT ...WHERE a = 5

Most transformations are less obvious. Some transformations result in faster execution.

The Code

Here is a diagram showing the code structure of handle_select() in `/sql/sql_select.cc', the server code that handles a query:

 
handle_select()
   mysql_select()
     JOIN::prepare()
       setup_fields()
     JOIN::optimize()            /* optimizer is from here ... */
       optimize_cond()
       opt_sum_query()
       make_join_statistics()
         get_quick_record_count()
         choose_plan()
           /* Find the best way to access tables */
           /* as specified by the user.          */
           optimize_straight_join()
             best_access_path()
           /* Find a (sub-)optimal plan among all or subset */
           /* of all possible query plans where the user    */
           /* controlls the exhaustiveness of the search.   */
           greedy_search()
             best_extension_by_limited_search()
               best_access_path()
           /* Perform an exhaustive search for an optimal plan */
           find_best()
       make_join_select()        /* ... to here */
     JOIN::exec()

The indentation in the diagram shows what calls what. Thus you can see that handle_select() calls mysql_select() which calls JOIN::prepare() which calls setup_fields(), and so on. The first part of mysql_select() is JOIN::prepare() which is for context analysis, metadata setup, and some subquery transformations. The optimizer is JOIN::optimize() and all its subordinate routines. When the optimizer finishes, JOIN::exec() takes over and does the job that JOIN::optimize() decides upon.

Although the word "JOIN" appears, these optimizer routines are for all query types.

The optimize_cond() and opt_sum_query() routines do transformations. The make_join_statistics() routine puts together all the information it can find about indexes that might be useful for accessing the query's tables.

Constant Propagation

A transformation takes place for expressions like this:

 
WHERE column1 = column2 AND column2 = 'x'

For such expressions, since it is known that "if A=B and B=C then A=C" (the Transitivity Law), the transformed condition becomes:

 
WHERE column1='x' AND column2='x'

This transformation occurs for column1 <operator> column2 conditions if and only if <operator> is one of these operators:

 
=, <, >, <=, >=, <>, <=>, LIKE

That is, transitive transformations don't apply for BETWEEN. Probably they should not apply for LIKE either, but that's a story for another day.

Constant propagation happens in a loop, so the output from one "propagation step" can be input for the next step.

See: `/sql/sql_select.cc', change_cond_ref_to_const(). Or see: `/sql/sql_select.cc', propagate_cond_constants().

Dead Code Elimination

A transformation takes place for always-true conditions:

 
WHERE 0=0 AND column1='y'

The first condition is always true, so it is removed, leaving:

 
WHERE column1='y'

See: `/sql/sql_select.cc', remove_eq_conds().

A transformation takes place for always-false conditions:

 
WHERE (0 = 1 AND s1 = 5) OR s1 = 7

The parenthesized part is always false, so it is removed, reducing the expression above to:

 
WHERE s1 = 7

Sometimes the optimizer might eliminate the whole WHERE clause:

 
WHERE (0 = 1 AND s1 = 5)

The EXPLAIN statement will show the words Impossible WHERE. Informally, we at MySQL say: "The WHERE has been optimized away."

If a column cannot be NULL, the optimizer removes any non-relevant IS NULL conditions. Thus,

 
WHERE not_null_column IS NULL

is an always-false situation, and

 
WHERE not_null_column IS NOT NULL

is an always-true situation -- so such columns are also eliminated from the conditional expression. This can be tricky. For example, in an OUTER JOIN, a column which is defined as NOT NULL might still contain a NULL. The optimizer leaves IS NULL conditions alone in such exceptional situations.

The optimizer will not detect all possible impossible WHERE situations -- there are too many. For example:

 
CREATE TABLE Table1 (column1 CHAR(1));
...
SELECT * FROM Table1 WHERE column1 = 'Canada';

The optimizer will not eliminate the condition in the query, even though the CREATE TABLE definition makes it an impossible condition.

Constant Folding

A transformation takes place for this expression:

 
WHERE column1 = 1 + 2

which becomes:

 
WHERE column1 = 3

Before you say "but I never would write 1 + 2 in the first place" -- remember what was said earlier about constant propagation. It is quite easy for the optimizer to put such expressions together. This process simplifies the result.

Constants and Constant Tables

A MySQL "constant" is something more than a mere literal in the query. It can also be the contents of a "constant table," which is defined as follows:

  1. A table with zero rows, or with only one row
  2. A table expression that is restricted with a WHERE condition, containing expressions of the form column = constant, for all the columns of the table's PRIMARY KEY, or for all the columns of any of the table's UNIQUE keys (provided that the UNIQUE columns are also defined as NOT NULL).

For example, if the table definition for Table0 contains

 
... PRIMARY KEY (column1,column2)

then this expression

 
FROM Table0 ... WHERE column1=5 AND column2=7 ...

returns a constant table. More simply, if the table definition for Table1 contains

 
... unique_not_null_column INT NOT NULL UNIQUE

then this expression

 
FROM Table1 ... WHERE unique_not_null_column=5

returns a constant table.

These rules mean that a constant table has at most one row value. MySQL will evaluate a constant table in advance, to find out what that value is. Then MySQL will "plug" that value into the query. Here's an example:

 
SELECT Table1.unique_not_null_column, Table2.any_column
    FROM Table1, Table2
    WHERE Table1.unique_not_null_column = Table2.any_column
    AND Table1.unique_not_null_column = 5;

When evaluating this query, MySQL first finds that table Table1 -- after restriction with Table1.unique_not_null_column -- is a constant table according to the second definition above. So it retrieves that value.

If the retrieval fails (there is no row in the table with unique_not_null_column = 5), then the constant table has zero rows and you will see this message if you run EXPLAIN for the statement:

 
Impossible WHERE noticed after reading const tables

Alternatively, if the retrieval succeeds (there is exactly one row in the table with unique_not_null_column = 5), then the constant table has one row and MySQL transforms the query to this:

 
SELECT 5, Table2.any_column
    FROM Table1, Table2
    WHERE 5 = Table2.any_column
    AND 5 = 5;

Actually this is a grand-combination example. The optimizer does some of the transformation because of constant propagation, which we described earlier. By the way, we described constant propagation first because it happens happens before MySQL figures out what the constant tables are. The sequence of optimizer steps sometimes makes a difference.

Although many queries have no constant-table references, it should be kept in mind that whenever the word "constant" is mentioned hereafter, it refers either to a literal or to the contents of a constant table.

See: `/sql/sql_select.cc', make_join_statistics().

Join Type

When evaluating a conditional expression, MySQL decides what "join type" the expression has. (Again: despite the word "join," this applies for all conditional expressions, not just join expressions. A term like "access type" would be clearer.) These are the documented join types, in order from best to worst:

 
system          ... a system table which is a constant table
const           ... a constant table
eq_ref          ... unique/primary index with '=' for joining
ref             ... index with '='
ref_or_null     ... index with '=', possibly NULL
range           ... index with BETWEEN, IN, >=, LIKE, etc.
index           ... sequential scan of index
ALL             ... sequential scan of table

See: `/sql/sql_select.h', enum join_type{}. Notice that there are a few other (undocumented) join types too, for subqueries.

The optimizer can use the join type to pick a "driver expression." For example, consider this query:

 
SELECT *
FROM Table1
WHERE indexed_column = 5 AND unindexed_column = 6

Since indexed_column has a better join type, it is more likely to be the driver. You'll see various exceptions as this description proceeds, but this is a simple first rule.

What is significant about a driver? Consider that there are two execution paths for the query:

(The Bad Execution Path) Read every row in the table. (This is called a "sequential scan of Table1" or just "table scan.") For each row, examine the values in indexed_column and in unindexed_column, to see if they meet the conditions.

(The Good Execution Plan) Via the index, look up the rows which have indexed_column = 5. (This is called an "indexed search.") For each row, examine the value in unindexed_column to see if it meets the condition.

An indexed search generally involves fewer accesses than a sequential scan, and far fewer accesses if the table is large but the index is UNIQUE. That is why it is better to access with "The Good Execution Plan," and that is why it is often good to choose indexed_column as the driver.

The 'range' Join Type

Some conditions can work with indexes, but over a (possibly wide) range of keys. These are known as "range" conditions, and are most often encountered with expressions involving these operators: >, >=, <, <=, IN, LIKE, BETWEEN

To the optimizer, this expression:

 
column1 IN (1,2,3)

is the same as this one:

 
column1 = 1 OR column1 = 2 OR column1 = 3

and MySQL treats them the same -- there is no need to change IN to OR for a query, or vice versa.

The optimizer will use an index (range search) for

 
column1 LIKE 'x%'

but not for

 
column1 LIKE '%x'

That is, there is no range search if the first character in the pattern is a wildcard.

To the optimizer,

 
column1 BETWEEN 5 AND 7

is the same as this expression

 
column1 >= 5 AND column1 <= 7

and again, MySQL treats both expressions the same.

The optimizer may change a Range to an ALL join type if a condition would examine too many index keys. Such a change is particularly likely for < and > conditions and multiple-level secondary indexes. See: (for MyISAM indexes) `/myisam/mi_range.c', mi_records_in_range().

The index Join Type

Consider this query:

 
SELECT column1 FROM Table1;

If column1 is indexed, then the optimizer may choose to retrieve the values from the index rather than from the table. An index which is used this way is called a "covering index" in most texts. MySQL just uses the word "index" in EXPLAIN descriptions.

For this query:

 
SELECT column1, column2 FROM Table1;

the optimizer will use "join type = index" only if the index has this definition:

 
CREATE INDEX ... ON Table1 (column1, column2);

In other words, all columns in the select list must be in the index. (The order of the columns in the index does not matter.) Thus it might make sense to define a multiple-column index strictly for use as a covering index, regardless of search considerations.

Transposition

MySQL supports transpositions (reversing the order of operands around a relational operator) for simple expressions only. In other words:

 
WHERE - 5 = column1

becomes:

 
WHERE column1 = -5

However, MySQL does not support transpositions where arithmetic exists. Thus:

 
WHERE 5 = -column1

is not treated the same as:

 
WHERE column1 = -5

Transpositions to expressions of the form <column>=<constant> are ideal for index lookups. If an expression of this form refers to an indexed column, then MySQL always uses the index, regardless of the table size. (Exception: if the table has only zero rows or only one row, it is a constant table and receives special treatment. See the earlier section "Constants and Constant Tables".)

AND

The ANDed search has the form <condition> AND <condition>, as in:

 
WHERE column1 = 'x' AND column2 = 'y'

Here the optimizer's decision is:

  1. If (neither condition is indexed) use sequential scan.
  2. Otherwise, if (one condition has better join type) then pick a driver based on join type (see the earlier section "Join Type").
  3. Otherwise, since (both conditions are indexed and have equal join type) pick a driver based on the first index that was created.

Here's an example:

 
CREATE TABLE Table1 (s1 INT, s2 INT);
CREATE INDEX Index1 ON Table1 (s2);
CREATE INDEX Index2 ON Table1 (s1);
...
SELECT * FROM Table1 WHERE s1 = 5 AND s2 = 5;

When choosing a strategy to solve this query, the optimizer picks s2 = 5 as the driver because the index for s2 was created first. Regard this as an accidental effect rather than a rule -- it could change at any moment.

OR

The ORed search has the form "<condition> OR <condition>" as in:

 
WHERE column1 = 'x' OR column2 = 'y'

Here the optimizer's decision is:

 
Use a sequential scan.

In theory there is another choice if both column1 and column2 are indexed:

 
index search for the first condition,
index search for the second condition,
merge the two result sets

MySQL never does that. But MySQL will do that in future, the code for doing so already exists.

The above warning does not apply if the same column is used in both conditions. For example:

 
WHERE column1 = 'x' OR column1 = 'y'

In such a case, the search is indexed because the expression is a range search. This subject will be revisited during the discussion of the IN predicate.

AND plus OR

Consider this search expression:

 
WHERE column1 = 5 AND (column2 = 5 OR column3 = 5)

If column1 is indexed, then the optimizer will choose to use the index. In other words, the "sequential scan if OR" rule does not apply if the OR is subordinate to an AND.

UNION

All SELECT statements within a UNION are optimized separately. Therefore, for this query:

 
SELECT * FROM Table1 WHERE column1 = 'x'
UNION ALL
SELECT * FROM TABLE1 WHERE column2 = 'y'

if both column1 and column2 are indexed, then each SELECT is done using an indexed search, and the result sets are merged. Notice that this query might produce the same results as the query used in the OR example, which uses a sequential scan.

NOT, <>

It is a logical rule that

 
column1 <> 5

is the same as

 
column1 < 5 OR column1 > 5

However, MySQL does not transform in this circumstance. If you think that a range search would be better, then you should do your own transforming in such cases.

It is also a logical rule that

 
WHERE NOT (column1 != 5)

is the same as

 
WHERE column1 = 5

However, MySQL does not transform in this circumstance either.

We expect to add optimizations soon for both the above cases.

ORDER BY

In general, the optimizer will skip the sort procedure for the ORDER BY clause if it sees that the rows will be in order anyway. But let's examine some exceptional situations.

For the query:

 
SELECT column1 FROM Table1 ORDER BY 'x';

the optimizer will throw out the ORDER BY clause. This is another example of dead code elimination.

For the query:

 
SELECT column1 FROM Table1 ORDER BY column1;

the optimizer will use an index on column1, if it exists.

For the query:

 
SELECT column1 FROM Table1 ORDER BY column1+1;

the optimizer will use an index on column1, if it exists. But don't let that fool you! The index is only for finding the values. (It's cheaper to do a sequential scan of the index than a sequential scan of the table, that's why index is a better join type than ALL -- see "The 'index' Join Type" section, earlier.) There will still be a full sort of the results.

For the query:

 
SELECT * FROM Table1
WHERE column1 > 'x' AND column2 > 'x'
ORDER BY column2;

if both column1 and column2 are indexed, the optimizer will choose an index on ... column1. The fact that ordering takes place by column2 values does not affect the choice of driver in this case.

See: `/sql/sql_select.cc', test_if_order_by_key(), and `/sql/sql_select.cc', test_if_skip_sort_order().

There is a description of the internal sort procedure in the MySQL Reference Manual, in section 5.2.8 "How MySQL optimizes ORDER BY." We will not repeat it here, but urge you to read it because it includes a description of how the buffering and the quicksort operate.

See: `/sql/sql_select.cc', create_sort_index().

GROUP BY

These are the main optimizations that take place for GROUP BY and related items (HAVING, COUNT(), MAX(), MIN(), SUM(), AVG(), DISTINCT()).

See: `/sql/sql_select.cc', opt_sum_query(), and `/sql/sql_select.cc', remove_duplicates().

JOIN

Bad join choices can cause more damage than bad choices in single-table searches, so MySQL developers have spent proportionally more time making sure that the tables in a query are joined in an optimal order and that optimal access methods (often called "access paths") are chosen to retrieve table data. A combination of a fixed order in which tables are joined and the corresponding table access methods for each table is called "query execution plan" (QEP). The goal of the query optimizer is to find an optimal QEP among all possible such plans. There are several general ideas behind join optimization.

Each plan (or part of plan) is assigned a "cost." The cost of a plan reflects roughly the resources needed to compute a query according to the plan, where the main factor is the number of rows that will be accessed while computing a query. Once we have a way to assign costs to different QEPs we have a way to compare them. Thus, the goal of the optimizer is to find a QEP with minimal cost among all possible plans.

In MySQL, the search for an optimal QEP is performed in a bottom-up manner. The optimizer first considers all plans for one table, then all plans for two tables, and so on, until it builds a complete optimal QEP. Query plans that consist of only some of the tables (and predicates) in a query are called "partial plans." The optimizer relies on the fact that the more tables are added to a partial plan, the greater its cost. This allows the optimizer to expand with more tables only the partial plans with lower cost than the current best complete plan.

The key routine that performs the search for an optimal QEP is `sql/sql_select.cc', find_best(). It performs an exhaustive search of all possible plans and thus guarantees it will find an optimal one.

Below we represent find_best() in an extremely free translation to pseudocode. It is recursive, so some input variables are labeled "so far" to indicate that they come from a previous iteration.

 
remaining_tables = {t1, ..., tn}; /* all tables referenced in a query */

procedure find_best(
   partial_plan in,      /* in, partial plan of tables-joined-so-far */
   partial_plan_cost,    /* in, cost of partial_plan */
   remaining_tables,     /* in, set of tables not referenced in partial_plan */
   best_plan_so_far,     /* in/out, best plan found so far */
   best_plan_so_far_cost)/* in/out, cost of best_plan_so_far */
{
   for each table T from remaining_tables
   {
     /* Calculate the cost of using table T. Factors that the
        optimizer takes into account may include:
          Many rows in table (bad)
          Many key parts in common with tables so far (very good)
          Restriction mentioned in the WHERE clause (good)
          Long key (good)
          Unique or primary key (good)
          Full-text key (bad)
        Other factors that may at some time be worth considering:
          Many columns in key
          Short average/maximum key length
          Small table file
          Few levels in index
          All ORDER BY / GROUP columns come from this table */
     cost = complex-series-of-calculations;
     /* Add the cost to the cost so far. */
     partial_plan_cost+= cost;

     if (partial_plan_cost >= best_plan_so_far_cost)
       /* partial_plan_cost already too great, stop search */
       continue;

     partial_plan= expand partial_plan by best_access_method;
     remaining_tables= remaining_tables - table T;
     if (remaining_tables is not an empty set)
     {
       find_best(partial_plan, partial_plan_cost,
                 remaining_tables,
                 best_plan_so_far, best_plan_so_far_cost);
     }
     else
     {
       best_plan_so_far_cost= partial_plan_cost;
       best_plan_so_far= partial_plan;
     }
   }
}

Here the optimizer applies a "depth-first search algorithm." It tries estimates for every table in the FROM clause. It will stop a search early if the estimate becomes worse than the best estimate so far. The order of scanning will depend on the order that the tables appear in the FROM clause.

See: `/sql/table.h', struct st_table.

ANALYZE TABLE may affect some of the factors that the optimizer considers.

See also: `/sql/sql_sqlect.cc', make_join_statistics().

The straightforward use of find_best() and greedy_search() will not apply for LEFT JOIN or RIGHT JOIN. For example, starting with MySQL 4.0.14, the optimizer may change a left join to a straight join and swap the table order in some cases. See also "5.2.7 How MySQL Optimises LEFT JOIN and RIGHT JOIN" in the MySQL Reference Manual.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.1 The Index Merge Join Type


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.1.1 Overview

Index Merge is used when table condition can be converted to form:

 
cond_1 OR cond_2 ... OR cond_N

The conditions for conversion are that each cond_i can be used for a range scan, and no pair (cond_i, cond_j) uses the same index. (If cond_i and cond_j use the same index, then cond_i OR cond_j can be combined into a single range scan and no merging is necessary.)

For example, Index Merge can be used for the following queries:

 
SELECT * FROM t WHERE key1=c1 OR key2<c2 OR key3 IN (c3,c4);

SELECT * FROM t WHERE (key1=c1 OR key2<c2) AND nonkey=c3;

Index Merge is implemented as a "container" for range key scans constructed from cond_i conditions. When doing Index Merge, MySQL retrieves rows for each of the keyscans and then runs them through a duplicate elimination procedure. Currently the Unique class is used for duplicate elimination.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.1.2 Index Merge Optimizer


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.1.2.1 Range Optimizer

For Range-type queries, the MySQL optimizer builds a SEL_TREE object which represents a condition in this form:

 
range_cond = (cond_key_1 AND cond_key_2 AND ... AND cond_key_N)

Each of cond_key_i is a condition that refers to components of one key. MySQL creates a cond_key_i condition for each of the usable keys. Then the cheapest condition cond_key_i is used for doing range scan.

A single cond_key_i condition is represented by a pointer-linked network of SEL_ARG objects. Each SEL_ARG object refers to particular part of the key and represents the following condition:

 
  sel_arg_cond= (inf_val < key_part_n AND key_part_n < sup_val) (1)
                AND next_key_part_sel_arg_cond                  (2)
                OR left_sel_arg_cond                            (3)
                OR right_sel_arg_cond                           (4)
  1. is for an interval, possibly without upper or lower bound, either including or not including boundary values.
  2. is for a SEL_ARG object with condition on next key component.
  3. is for a SEL_ARG object with interval on the same field as this SEL_ARG object. Intervals of current and "left" object are disjoint and left_sel_arg_cond.sup_val <= inf_val.
  4. is for a SEL_ARG object with interval on the same field as this SEL_ARG object. Intervals of current and "right" object are disjoint and left_sel_arg_cond.min_val >= max_val.

MySQL is able to convert arbitrary-depth nested AND-OR conditions to the above conjunctive form.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.1.2.2 Index Merge Optimizer

A single SEL_TREE object cannot be constructed for conditions that have different members of keys in the OR clause, like in condition:

 
key1 < c1 OR key2 < c2

Beginning with MySQL 5.0, these conditions are handled with the Index Merge method, and its range optimizer structure, class SEL_IMERGE. SEL_IMERGE represents a disjunction of several SEL_TREE objects, which can be expressed as:

 
sel_imerge_cond = (t_1 OR t_1 OR ... OR t_n)

where each of t_i stands for a SEL_TREE object, and no pair (t_i, t_j) of distinct SEL_TREE objects can be combined into single SEL_TREE object.

The current implementation builds SEL_IMERGE only if no single SEL_TREE object can be built for the part of the query condition it has analyzed, and discards SEL_TREE immediately if it discovers that a single SEL_TREE object can be constructed. This is actually a limitation, and can cause worse row retrieval strategy to be used. E.g. for query:

 
SELECT * FROM t WHERE (goodkey1=c1 OR goodkey1=c2) AND badkey=c3

scan on badkey will be chosen even if Index Merge on (goodkey1, goodkey) would be faster.

The Index Merge optimizer collects a list of possible ways to access rows with Index Merge. This list of SEL_IMERGE structures represents the following condition:

 
  (t_11 OR t_12 OR ... OR t_1k) AND
  (t_21 OR t_22 OR ... OR t_2l) AND
   ...                          AND
  (t_M1 OR t_M2 OR ... OR t_mp)

where t_ij is one SEL_TREE and one line is for one SEL_IMERGE object.

The SEL_IMERGE object with minimal cost is used for row retrieval.

In `sql/opt_range.cc', see imerge_list_and_list(), imerge_list_or_list(), and SEL_IMERGE class member functions for more details of Index Merge construction.

See the get_index_merge_params function in the same file for Index Merge cost calculation algorithm.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.1.3 Row Retrieval Algorithm

Index Merge works in two steps:

Preparation step:

 
activate 'index only';
foreach key_i in (key_scans \ clustered_pk_scan)
{
  while (retrieve next (key, rowid) pair from key_i)
  {
    if (no clustered PK scan ||
        row doesn't match clustered PK scan condition)
      put rowid into Unique;
  }
}
deactivate 'index only';

Row retrieval step:

 
for each rowid in Unique
{
  retrieve row and pass it to output;
}
if (clustered_pk_scan)
{
  while (retrieve next row for clustered_pk_scan)
   pass row to output;
}

See: `sql/opt_range.cc', QUICK_INDEX_MERGE_SELECT class members for Index Merge row retrieval code.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3. Important Algorithms and Structures

MySQL uses many different algorithms and structures. This chapter tries to describe some of them.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.1 How MySQL Does Sorting (filesort)

In those cases where MySQL must sort the result, it uses the following filesort algorithm before MySQL 4.1:

  1. Read all rows according to key or by table scanning. Rows that don't match the WHERE clause are skipped.
  2. For each row, store a pair of values in a buffer (the sort key and the row pointer). The size of the buffer is the value of the sort_buffer_size system variable.
  3. When the buffer gets full, run a qsort (quicksort) on it and store the result in a temporary file. Save a pointer to the sorted block. (If all pairs fit into the sort buffer, no temporary file is created.)
  4. Repeat the preceding steps until all rows have been read.
  5. Do a multi-merge of up to MERGEBUFF (7) regions to one block in another temporary file. Repeat until all blocks from the first file are in the second file.
  6. Repeat the following until there are fewer than MERGEBUFF2 (15) blocks left.
  7. On the last multi-merge, only the pointer to the row (the last part of the sort key) is written to a result file.
  8. Read the rows in sorted order by using the row pointers in the result file. To optimize this, we read in a big block of row pointers, sort them, and use them to read the rows in sorted order into a row buffer. The size of the buffer is the value of the read_rnd_buffer_size system variable. The code for this step is in the `sql/records.cc' source file.

One problem with this approach is that it reads rows twice: One time when evaluating the WHERE clause, and again after sorting the pair values. And even if the rows were accessed successively the first time (for example, if a table scan is done), the second time they are accessed randomly. (The sort keys are ordered, but the row positions are not.)

In MySQL 4.1 and up, a filesort optimization is used that records not only the sort key value and row position, but also the columns required for the query. This avoids reading the rows twice. The modified filesort algorithm works like this:

  1. Read the rows that match the WHERE clause, as before.
  2. For each row, record a tuple of values consisting of the sort key value and row position, and also the columns required for the query.
  3. Sort the tuples by sort key value
  4. Retrieve the rows in sorted order, but read the required columns directly from the sorted tuples rather than by accessing the table a second time.

Using the modified filesort algorithm, the tuples are longer than the pairs used in the original method, and fewer of them fit in the sort buffer (the size of which is given by sort_buffer_size). As a result, it is possible for the extra I/O to make the modified approach slower, not faster. To avoid a slowdown, the optimization is used only if the total size of the extra columns in the sort tuple does not exceed the value of the max_length_for_sort_data system variable. (A symptom of setting the value of this variable too high is that you will see high disk activity and low CPU activity.)


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.2 Bulk Insert

The logic behind bulk insert optimization is simple.

Instead of writing each key value to B-tree (that is, to the key cache, although the bulk insert code doesn't know about the key cache), we store keys in a balanced binary (red-black) tree, in memory. When this tree reaches its memory limit, we write all keys to disk (to key cache, that is). But since the key stream coming from the binary tree is already sorted, inserting goes much faster, all the necessary pages are already in cache, disk access is minimized, and so forth.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.3 How MySQL Does Caching

MySQL has the following caches. (Note that the some of the filenames contain an incorrect spelling of the word "cache.")

Key Cache

A shared cache for all B-tree index blocks in the different NISAM files. Uses hashing and reverse linked lists for quick caching of the most recently used blocks and quick flushing of changed entries for a specific table. (`mysys/mf_keycash.c')

Record Cache

This is used for quick scanning of all records in a table. (`mysys/mf_iocash.c' and `isam/_cash.c')

Table Cache

This holds the most recently used tables. (`sql/sql_base.cc')

Hostname Cache

For quick lookup (with reverse name resolving). This is a must when you have a slow DNS. (`sql/hostname.cc')

Privilege Cache

To allow quick change between databases, the last used privileges are cached for each user/database combination. (`sql/sql_acl.cc')

Heap Table Cache

Many uses of GROUP BY or DISTINCT cache all found rows in a HEAP table. (This is a very quick in-memory table with hash index.)

Join Buffer Cache

For every "full join" in a SELECT statement the rows found are cached in a join cache. (A "full join" here means there were no keys that could be used to find rows for the next table in the list.) In the worst case, one SELECT query can use many join caches.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.4 How MySQL Uses the Join Buffer Cache

Basic information about the join buffer cache:

Assume you have the following join:

 
Table name      Type
t1              range
t2              ref
t3              ALL

The join is then done as follows:

 
- While rows in t1 matching range
 - Read through all rows in t2 according to reference key
  - Store used fields from t1, t2 in cache
  - If cache is full
    - Read through all rows in t3
      - Compare t3 row against all t1, t2 combinations in cache
        - If row satisfies join condition, send it to client
    - Empty cache

- Read through all rows in t3
 - Compare t3 row against all stored t1, t2 combinations in cache
   - If row satisfies join condition, send it to client

The preceding description means that the number of times table t3 is scanned is determined as follows:

 
S = size-of-stored-row(t1,t2)
C = accepted-row-combinations(t1,t2)
scans = (S * C)/join_buffer_size + 1

Some conclusions:


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.5 How MySQL Handles FLUSH TABLES


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.6 Full-text Search in MySQL

Hopefully, sometime there will be complete description of full-text search algorithms. For now, it's just unsorted notes.

Weighting in boolean mode

The basic idea is as follows: In an expression of the form A or B or (C and D and E), either A or B alone is enough to match the whole expression, whereas C, D, and E should all match. So it's reasonable to assign weight 1 to each of A, B, and (C and D and E). Furthermore, C, D, and E each should get a weight of 1/3.

Things become more complicated when considering boolean operators, as used in MySQL full-text boolean searching. Obviously, +A +B should be treated as A and B, and A B - as A or B. The problem is that +A B can not be rewritten in and/or terms (that's the reason why this--extended--set of operators was chosen). Still, aproximations can be used. +A B C can be approximated as A or (A and (B or C)) or as A or (A and B) or (A and C) or (A and B and C). Applying the above logic (and omitting mathematical transformations and normalization) one gets that for +A_1 +A_2 ... +A_N B_1 B_2 ... B_M the weights should be: A_i = 1/N, B_j=1 if N==0, and, otherwise, in the first rewriting approach B_j = 1/3, and in the second one - B_j = (1+(M-1)*2^M)/(M*(2^(M+1)-1)).

The second expression gives a somewhat steeper increase in total weight as number of matched B_j values increases, because it assigns higher weights to individual B_j values. Also, the first expression is much simpler, so it is the first one that is implemented in MySQL.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.7 FLOAT and DOUBLE data types and their representation.

A quotation from the MySQL Reference Manual regarding numeric types:

"12.2 Numeric Types ... For floating-point column types, MySQL uses four bytes for single-precision values and eight bytes for double-precision values.

The FLOAT type is used to represent approximate numeric data types. The SQL standard allows an optional specification of the precision (but not the range of the exponent) in bits following the keyword FLOAT in parentheses. The MySQL implementation also supports this optional precision specification, but the precision value is used only to determine storage size. A precision from 0 to 23 results in four-byte single-precision FLOAT column. A precision from 24 to 53 results in eight-byte double-precision DOUBLE column.

When the keyword FLOAT is used for a column type without a precision specification, MySQL uses four bytes to store the values. MySQL also supports variant syntax with two numbers given in parentheses following the FLOAT keyword. The first number represents the display width and the second number specifies the number of digits to be stored and displayed following the decimal point (as with DECIMAL and NUMERIC). When MySQL is asked to store a number for such a column with more decimal digits following the decimal point than specified for the column, the value is rounded to eliminate the extra digits when the value is stored.

In standard SQL, the REAL and DOUBLE PRECISION types do not accept precision specifications. MySQL supports a variant syntax with two numbers given in parentheses following the type name. The first number represents the display width and the second number specifies the number of digits to be stored and displayed following the decimal point. As an extension to the SQL standard, MySQL recognizes DOUBLE as a synonym for the DOUBLE PRECISION type. In contrast with the standard's requirement that the precision for REAL be smaller than that used for DOUBLE PRECISION, MySQL implements both as eight-byte double-precision floating-point values (unless the server SQL mode includes the REAL_AS_FLOAT option).

For maximum portability, code requiring storage of approximate numeric data values should use FLOAT or DOUBLE PRECISION with no specification of precision or number of decimal points."

The following discussion concentrates on the case where no display width and decimals are given. This means that FLOAT is stored as whatever the C type float is and REAL or DOUBLE [PRECISION] is stored as whatever the C type double is. The field length is selected by the MySQL code.

This document was created when Bug #4457 (Different results in SQL-Statements for the same record) was fixed at the end of August 2004. Until then there was some confusion in the double-to-string conversion at different places in the code.

The bugfix for Bug #4937 (INSERT + SELECT + UNION ALL + DATE to VARCHAR(8) conversion problem) produced a conversion function which was a promising approach to the conversion problems. Unfortunately it was only used for direct field conversions and not for function results etc. It did not take small numbers (absolute value less than 1) and negative numbers into account. It did not take the limited precision of float and double data types into account. The bugfix was developed in two steps: The first attempt looked like this (in principle):

 
length= sprintf(buff, "%.*g", field_length, nr);
if (length > field_length)
  length= sprintf(buff, "%.*g", field_length-5, nr);

If the libc conversion brings too much characters, the precision is reduced by the space required for the scientific notation (1.234e+05). Thus the printf() conversion is forced to switch to the scientific notation, since the value would not fit otherwise. Or, if it was scientific already, the precision was reduced and also uses less space. I left out some important stuff around limit checking just to show the idea. This simple algorithm should work quite well in most cases, but has been discarded for the sake of performance. The double call to the slow printf() conversion %g didn't seem reasonable, though it would only be used for extreme values and small fields. During my explorations of the code I didn't find places where float or double were to be converted into small fields. Remeber that I talk only of conversions where field length and precision are not given. In this case a sufficient field length is selected at several places, except of a bug where it was selected wrongly. If a field length is given, a different conversion is used anyway. But since the code is quite complex, I don't claim to have the full insigth and as such may be in error. Hence, let us look further:

The second attempt in the bugfix was like this:

 
bool use_scientific_notation=TRUE;
if (field_length < 32 && nr > 1)
{
  double e[]={1, 1e1, 1e2, 1e4, 1e8, 1e16 }, p=1;
  for (int i=sizeof(e), j=1<<i-- ; j; i--,  j>>=1 )
  {
    if (field_length & j)
      p*=e[i];
  }
  use_scientific_notation=(p < nr);
}
length= sprintf(buff, "%.*g", use_scientific_notation ?
                              field_length-5 : field_length, nr);

Here we evaluate if the string representation of a given number fits into field_length characters. If not, we reduce the precision to make it fit. Again, I left out important details. For example, the evaluation is done only once per field for the sake of performance. The downside here is the unconditional reduction of precision for field length > 31 (which doesn't really matter), for negative numbers and for small numbers (absolute value less than 1).

Both algorithms do not take the limited precision of float and double values into account. This could lead to conversions with ridiculous bogus precision output. For example a value of 0.7 converted with %.30g will give a lot of digits, which pretend to tell about deviations from the value 0.7 and are completely absurd: 0.699999988079071044921875. To understand more about the %g conversion, I quote from a comment introduced in the source at the beginning of bugfixing #4937 (this has been removed later, since it mainly describes, how the printf() conversion works. I think it's valuable enough to include it here):

 
/*
  Let's try to pretty print a floating point number. Here we use
  '%-*.*g' conversion string:
    '-' stands for right-padding with spaces, if such padding will take
  place
    '*' is a placeholder for the first argument, field_length, and
  signifies minimal width of result string. If result is less than
  field length it will be space-padded. Note, however, that we'll not
  pass spaces to Field_string::store(const char *, ...), due to
  strcend in the next line.
    '.*' is a placeholder for DBL_DIG and defines maximum number of
  significant digits in the result string. DBL_DIG is a hardware
  specific C define for maximum number of decimal digits of a floating
  point number, such that rounding to hardware floating point
  representation and back to decimal will not lead to loss of
  precision. I.e if DBL_DIG is 15, number 123456789111315 can be
  represented as double without precision loss.  As one can judge from
  this description, chosing DBL_DIG here is questionable, especially
  because it introduces a system dependency.
    'g' means that conversion will use [-]ddd.ddd (conventional) style,
  and fall back to [-]d.ddde[+|i]ddd (scientific) style if there is no
  enough space for all digits.
  Maximum length of result string (not counting spaces) is (I guess)
  DBL_DIG + 8, where 8 is 1 for sign, 1 for decimal point, 1 for
  exponent sign, 1 for exponent, and 4 for exponent value.
  XXX: why do we use space-padding and trim spaces in the next line?
*/
sprintf(to,"%-*.*g",(int) field_length,DBL_DIG,nr);
to=strcend(to,' ');

There is one small misapprehension in the comment. %g does not switch to scientific notation when there is 'not enough space for all digits'. As the commentator says, the field length gives the minimal output length. printf() happily outputs more characters if required to produce a result with 'precision' digits. In fact it switches to scientific when the value can no longer be represented by 'precision' digits in conventional notation. The man page says "Style e is used if the exponent from its conversion is less than -4 or greater than or equal to the precision." In explanation, a precision of 3 digits can print a value of 345 in conventional notation, but 3456 needs scientific notation, as it would require 4 digits (a precision of 4) in conventional notation. Thus, it is printed as 3.46e+03 (rounded).

Since we don't want spaces in the output, we should not give a field length, but always use "%.*g". However, the precision matters, as seen above. It is worth its own paragraph.

Since MySQL uses the machine-dependent binary representation of float and double to store values in the database, we have to care about these. Today, most systems use the IEEE standard 754 for binary floating-point arithmetic. It describes a representation for single precision numbers as 1 bit for sign, 8 bits for biased exponent and 23 bits for fraction and for double precision numbers as 1-bit sign, 11-bit biased exponent and 52-bit fraction. However, we can not rely in the fact that every system uses this representation. Luckily, the ISO C standard requires the standard C library to have a header `float.h' that describes some details of the floating point representation on a machine. The comment above describes the value DBL_DIG. There is an equivalent value FLT_DIG for the C data type float.

So, whenever we print a floating-point value, we must not specify a precision above DBL_DIG or FLT_DIG respectively. Otherwise we produce a bogus precision, which is wrong. For the honor of the writer of the first attempt above, I must say that his complete algorithm took DBL_DIG into account, if however only for the second call to sprintf(). But FLT_DIG has never been accounted for. At the place of the conversions it was not even known, if the value has come from a float or double field.

My attempt to solve the problems tries to take all this into account. I tried to concentrate all float/double-to-string conversions in one function, and to bring the knowledge about float versus double to this function wherever it is called. This solution managed to keep the test suite happy while solving the new problem of Bug #4457. Luckily, the first was not a big problem, as the test cases have been very carefully selected, so that they succeed as long as the machine uses IEEE 754 ;).

Nevertheless, the function is still not perfect. It is not possible to guess how many sigificant digits a number has, and as such it is not simple to tell how long the resulting string would be. This applies to numbers with an absolute value smaller then 1. There are probably ways to figure this out, but I doubt that we would win in terms of performance over the simple solution of the first attempt, let alone the chance for new bugs. The compromise taken here is to accept that the resulting string may exceed the destined field length by five characters in the worst case.

 
if (nr < 0.0)
{
  abs_nr= -nr;
  extra_space= 1;
}
else
{
  abs_nr= nr;
  extra_space= 0;
}
precision= is_float ? FLT_DIG : DBL_DIG;
if (precision > field_length)
  precision= field_length;

if (! initialized)
{
  /* Better switch to scientific too early than too late. */
  double mult;
  mult= 1e0;
  for (length= 0; length < DBL_DIG; length++)
    mult/= 1e1;
  mult= 1e1 - mult;

  double val;
  val= 1.0;
  for (int idx= 0; idx < DBL_DIG+1; idx++)
  {
    DBUG_PRINT("info",("double_to_string_conv: big[%d] %.*g",
                       idx, DBL_DIG+3, val));
    big_number[idx]= val;
    val*= mult;
  }
  small_number[0]= 1e0;
  small_number[1]= 1e0;
  small_number[2]= 1e0;
  small_number[3]= 1e-1;
  small_number[4]= 1e-2;
  small_number[5]= 1e-3;
  small_number[6]= 1e-4;
  /* %g switches to scientific when exponent < -4. */
  for (int idx= 7; idx < DBL_DIG+1; idx++)
    small_number[idx]= 1e-4;
  initialized= TRUE;
}

use_scientific_notation= (abs_nr != 0.0) &&
                         ((abs_nr >   big_number[precision]) ||
                          (abs_nr < small_number[precision]));

if (use_scientific_notation)
{
  if (((nr >= 0.0) && ((nr >= 1e+100) || (nr <= 1e-100))) ||
      ((nr < 0.0) && ((nr <= -1e+100) || (nr >= -1e-100))))
    extra_space+= 6; /* .e+100 or .e-100 */
  else
    extra_space+= 5; /* .e+99  or .e-99 */
}

if (field_length < extra_space)
  precision= 0;
else if (precision > (field_length - extra_space))
  precision= field_length - extra_space;

length= sprintf(buff, "%.*g", precision, nr);

This solution takes performance into account by initializing the limiting numbers arrays only once into static space. It copes with negative numbers and tries to decide even over small numbers. The latter has only small implications, as the prefix 0.000 is exactly the same size as the postfix e-100. But to know if scientific notation will be selected by sprintf() allows for saving one digit when the exponent is larger than -100.

The calculations for the big number array are less precise than in the second attempt, but faster. The precision is sufficient for the guess whether sprintf() uses scientific notation. There may be number to field length combinations which exploit the gap, but these won't emerge anyway as I found no situation where this function is called with small field lengths. Remember again that it is not called with user supplied field lengths.

However in the current stable releases (including gamma) we have some places where the field length is too small by one character. Thus, the precision is sometimes by one digit smaller than DBL_DIG would allow for. Consequently, we cannot use the simple algorithm in the stable releases. There is a chance for doing it in a development release, though.

Addendum:

There turned out to be a new solution to the "big number array" problem. We have a statically initialized array log_10, which holds the necessary values. But I did not check, if these values are safe. Even if computed by the compiler, they could carry values slightly above the decimal powers, which would be bad. In this case we needed to initialize by 9.99999999e+xxx, where the number of nines is equal to DBL_DIG. This must be protected by #if DBL_DIG == yy, so that a new DBL_DIG on a new platform is detected. And the array is of limited length. We must at least protect it by a DBUG_ASSERT(sizeof(log_10)/sizeof(log_10[0]) > DBL_DIG).

But all of this is probably completely unneccessary, since we are only speaking of ceses where no user-supplied field length is given. So MySQL selects the field length on its own. So it is totally possible, yes highly desired that MySQL selects a field length, which allows for a maximum of precision for all possible values. And these are DBL_DIG+7 or FLT_DIG+6 respectively as far as IEEE 754 is used. In this case we can have values of about +/-1e-307 to +/-1e+308 for double and +/-1e-37 to +/-1e+38 for float. That is, for example -1.<DBL_DIG-1 digits>e+100. For cases where a precision above IEEE 754 is possible, we may need +8 instead. We can detect this with #if DBL_MAX_10_EXP >= 1000. So using a field length of DBL_DIG+8 in all cases should be sufficient for a simple sprintf(buff, "%.*g", DBL_DIG, nr) or sprintf(buff, "%.*g", FLT_DIG, nr), respectively. To be safe, we should not use the machine dependend constants everywhere, but instead concentrate them into definitions like these:

 
#if (DBL_MAX_10_EXP > 9999) || (DBL_MIN_10_EXP < -9999)
#  error "Need new definition for UNSPECIFIED_DOUBLE_FIELD_LENGTH"
#elif (DBL_MAX_10_EXP > 999) || (DBL_MIN_10_EXP < -999)
#  define UNSPECIFIED_DOUBLE_FIELD_LENGTH (DBL_DIG+8)
#else
#  define UNSPECIFIED_DOUBLE_FIELD_LENGTH (DBL_DIG+7)
#endif

#if (FLT_MAX_10_EXP > 999) || (FLT_MIN_10_EXP < -999)
#error "Need new definition for UNSPECIFIED_FLOAT_FIELD_LENGTH"
#elif (FLT_MAX_10_EXP > 99) || (FLT_MIN_10_EXP < -99)
#  define UNSPECIFIED_FLOAT_FIELD_LENGTH (FLT_DIG+7)
#else
#  define UNSPECIFIED_FLOAT_FIELD_LENGTH (FLT_DIG+6)
#endif

These definitions should be used wherever an item or field of type float or double without an explicit field length specification is encountered. We have to propagate these lengths though all deviated items and fields and to select the maximum of all field lengths wherever in an expression or function two or more of them are used.

We need to treat the precision (DBL_DIG/FLT_DIG) similarly, but have to select the smallest in expressions or functions.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.8 Functions in the mysys Library

Functions in mysys: (For flags see `my_sys.h')

int my_copy _A((const char *from, const char *to, myf MyFlags));

Copy file from from to to.

int my_rename _A((const char *from, const char *to, myf MyFlags));

Rename file from from to to.

int my_delete _A((const char *name, myf MyFlags));

Delete file name.

int my_redel _A((const char *from, const char *to, int MyFlags));

Delete from before rename of to to from. Copies state from old file to new file. If MY_COPY_TIME is set, sets old time.

int my_getwd _A((string buf, uint size, myf MyFlags));
int my_setwd _A((const char *dir, myf MyFlags));

Get and set working directory.

string my_tempnam _A((const char *dir, const char *pfx, myf MyFlags));

Make a unique temporary file name by using dir and adding something after pfx to make the name unique. The file name is made by adding a unique six character string and TMP_EXT after pfx. Returns pointer to malloc()'ed area for filename. Should be freed by free().

File my_open _A((const char *FileName,int Flags,myf MyFlags));
File my_create _A((const char *FileName, int CreateFlags, int AccsesFlags, myf MyFlags));
int my_close _A((File Filedes, myf MyFlags));
uint my_read _A((File Filedes, byte *Buffer, uint Count, myf MyFlags));
uint my_write _A((File Filedes, const byte *Buffer, uint Count, myf MyFlags));
ulong my_seek _A((File fd,ulong pos,int whence,myf MyFlags));
ulong my_tell _A((File fd,myf MyFlags));

Use instead of open, open-with-create-flag, close, read, and write to get automatic error messages (flag MYF_WME) and only have to test for != 0 if error (flag MY_NABP).

FILE *my_fopen _A((const char *FileName,int Flags,myf MyFlags));
FILE *my_fdopen _A((File Filedes,int Flags,myf MyFlags));
int my_fclose _A((FILE *fd,myf MyFlags));
uint my_fread _A((FILE *stream,byte *Buffer,uint Count,myf MyFlags));
uint my_fwrite _A((FILE *stream,const byte *Buffer,uint Count, myf MyFlags));
ulong my_fseek _A((FILE *stream,ulong pos,int whence,myf MyFlags));
ulong my_ftell _A((FILE *stream,myf MyFlags));

Same read-interface for streams as for files.

gptr _mymalloc _A((uint uSize,const char *sFile,uint uLine, myf MyFlag));
gptr _myrealloc _A((string pPtr,uint uSize,const char *sFile,uint uLine, myf MyFlag));
void _myfree _A((gptr pPtr,const char *sFile,uint uLine));
int _sanity _A((const char *sFile,unsigned int uLine));
gptr _myget_copy_of_memory _A((const byte *from,uint length,const char *sFile, uint uLine,myf MyFlag));

malloc(size,myflag) is mapped to these functions if not compiled with -DSAFEMALLOC.

void TERMINATE _A((void));

Writes malloc() info on stdout if compiled with -DSAFEMALLOC.

int my_chsize _A((File fd, ulong newlength, myf MyFlags));

Change size of file fd to newlength.

void my_error _D((int nr, myf MyFlags, ...));

Writes message using error number (see `mysys/errors.h') on stdout, or using curses, if MYSYS_PROGRAM_USES_CURSES() has been called.

void my_message _A((const char *str, myf MyFlags));

Writes str on stdout, or using curses, if MYSYS_PROGRAM_USES_CURSES() has been called.

void my_init _A((void ));

Start each program (in main()) with this.

void my_end _A((int infoflag));

Gives info about program. If infoflag & MY_CHECK_ERROR, prints if some files are left open. If infoflag & MY_GIVE_INFO, prints timing info and malloc() info about program.

int my_copystat _A((const char *from, const char *to, int MyFlags));

Copy state from old file to new file. If MY_COPY_TIME is set, sets old time.

string my_filename _A((File fd));

Returns filename of open file.

int dirname _A((string to, const char *name));

Copy name of directory from filename.

int test_if_hard_path _A((const char *dir_name));

Test if dir_name is a hard path (starts from root).

void convert_dirname _A((string name));

Convert dirname according to system. On Windows, changes all characters to capitals and changes `/' to `\'.

string fn_ext _A((const char *name));

Returns pointer to extension in filename.

string fn_format _A((string to,const char *name,const char *dsk,const char *form,int flag));

Format a filename with replacement of library and extension and convert between different systems. The to and name parameters may be identical. Function doesn't change name if name != to. flag may be:

1

Force replace filnames library with 'dsk'

2

Force replace extension with 'form' */

4

Force unpack filename (replace ~ with home directory)

8

Pack filename as short as possible for output to user

All open requests should always use at least open(fn_format(temp_buffer, name, "", "", 4), ...) to unpack home and convert filename to system-form.

string fn_same _A((string toname, const char *name, int flag));

Copies directory and extension from name to toname if needed. Copying can be forced by same flags used in fn_format().

int wild_compare _A((const char *str, const char *wildstr));

Compare if str matches wildstr. wildstr can contain `*' and `?' as wildcard characters. Returns 0 if str and wildstr match.

void get_date _A((string to, int timeflag));

Get current date in a form ready for printing.

void soundex _A((string out_pntr, string in_pntr))

Makes in_pntr to a 5 char long string. All words that sound alike have the same string.

int init_key_cache _A((ulong use_mem, ulong leave_this_much_mem));

Use caching of keys in MISAM, PISAM, and ISAM. KEY_CACHE_SIZE is a good size. Remember to lock databases for optimal caching.

void end_key_cache _A((void));

End key caching.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4. Charsets and Related Issues


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4.1 CHARSET_INFO Structure


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5. How MySQL Performs Different Selects


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.1 Steps of Select Execution

Every select is performed in these base steps:


[ < ] [ > ]   [