mirror of
https://github.com/MariaDB/server.git
synced 2025-01-25 00:04:33 +01:00
19013b3031
Docs/internals.texi: fulltext chapter added Docs/manual.texi: news updated BitKeeper/etc/ignore: Added Docs/internals.info to the ignore list myisam/ft_boolean_search.c: weighting scheme changed
834 lines
26 KiB
Text
834 lines
26 KiB
Text
\input texinfo @c -*-texinfo-*-
|
|
@c Copyright 2002 MySQL AB, TcX AB, Detron HB and Monty Program KB
|
|
@c
|
|
@c %**start of header
|
|
@setfilename internals.info
|
|
|
|
@c We want the types in the same index
|
|
@synindex cp fn
|
|
|
|
@iftex
|
|
@afourpaper
|
|
@end iftex
|
|
|
|
@c Get version and other info
|
|
@include include.texi
|
|
|
|
@ifclear tex-debug
|
|
@c This removes the black squares in the right margin
|
|
@finalout
|
|
@end ifclear
|
|
|
|
@c Set background for HTML
|
|
@set _body_tags BGCOLOR=#FFFFFF TEXT=#000000 LINK=#101090 VLINK=#7030B0
|
|
@settitle @strong{MySQL} Internals Manual for version @value{mysql_version}.
|
|
@setchapternewpage odd
|
|
@paragraphindent 0
|
|
|
|
@c %**end of header
|
|
|
|
@ifinfo
|
|
@format
|
|
START-INFO-DIR-ENTRY
|
|
* mysql-internals: (mysql-internals). @strong{MySQL} internals.
|
|
END-INFO-DIR-ENTRY
|
|
@end format
|
|
@end ifinfo
|
|
|
|
@titlepage
|
|
@sp 10
|
|
@center @titlefont{@strong{MySQL} Internals Manual}
|
|
@sp 10
|
|
@center Copyright @copyright{} 1998-2002 MySQL AB
|
|
@page
|
|
@end titlepage
|
|
|
|
@node Top, caching, (dir), (dir)
|
|
|
|
@ifinfo
|
|
This is a manual about @strong{MySQL} internals.
|
|
@end ifinfo
|
|
|
|
@menu
|
|
* caching:: How MySQL Handles Caching
|
|
* flush tables:: How MySQL Handles @code{FLUSH TABLES}
|
|
* filesort:: How MySQL Does Sorting (@code{filesort})
|
|
* coding guidelines:: Coding Guidelines
|
|
* mysys functions:: Functions In The @code{mysys} Library
|
|
* DBUG:: DBUG Tags To Use
|
|
* protocol:: MySQL Client/Server Protocol
|
|
* Fulltext Search:: Fulltext Search in MySQL
|
|
@end menu
|
|
|
|
|
|
@node caching, flush tables, Top, Top
|
|
@chapter How MySQL Handles Caching
|
|
|
|
@strong{MySQL} has the following caches:
|
|
(Note that the some of the filename have a wrong spelling of cache. :)
|
|
|
|
@table @strong
|
|
|
|
@item Key Cache
|
|
A shared cache for all B-tree index blocks in the different NISAM
|
|
files. Uses hashing and reverse linked lists for quick caching of the
|
|
last used blocks and quick flushing of changed entries for a specific
|
|
table. (@file{mysys/mf_keycash.c})
|
|
|
|
@item Record Cache
|
|
This is used for quick scanning of all records in a table.
|
|
(@file{mysys/mf_iocash.c} and @file{isam/_cash.c})
|
|
|
|
@item Table Cache
|
|
This holds the last used tables. (@file{sql/sql_base.cc})
|
|
|
|
@item Hostname Cache
|
|
For quick lookup (with reverse name resolving). Is a must when one has a
|
|
slow DNS.
|
|
(@file{sql/hostname.cc})
|
|
|
|
@item Privilege Cache
|
|
To allow quick change between databases the last used privileges are
|
|
cached for each user/database combination.
|
|
(@file{sql/sql_acl.cc})
|
|
|
|
@item Heap Table Cache
|
|
Many use of @code{GROUP BY} or @code{DISTINCT} caches all found rows in
|
|
a @code{HEAP} table. (This is a very quick in-memory table with hash index.)
|
|
|
|
@item Join Row Cache
|
|
For every full join in a @code{SELECT} statement (a full join here means
|
|
there were no keys that one could use to find the next table in a list),
|
|
the found rows are cached in a join cache. One @code{SELECT} query can
|
|
use many join caches in the worst case.
|
|
@end table
|
|
|
|
|
|
@node flush tables, filesort, caching, Top
|
|
@chapter How MySQL Handles @code{FLUSH TABLES}
|
|
|
|
@itemize @bullet
|
|
|
|
@item
|
|
Flush tables is handled in @file{sql/sql_base.cc::close_cached_tables()}.
|
|
|
|
@item
|
|
The idea of flush tables is to force all tables to be closed. This
|
|
is mainly to ensure that if someone adds a new table outside of
|
|
@strong{MySQL} (for example with @code{cp}) all threads will start using
|
|
the new table. This will also ensure that all table changes are flushed
|
|
to disk (but of course not as optimally as simple calling a sync on
|
|
all tables)!
|
|
|
|
@item
|
|
When one does a @code{FLUSH TABLES}, the variable @code{refresh_version}
|
|
will be incremented. Every time a thread releases a table it checks if
|
|
the refresh version of the table (updated at open) is the same as
|
|
the current @code{refresh_version}. If not it will close it and broadcast
|
|
a signal on @code{COND_refresh} (to wait any thread that is waiting for
|
|
all instanses of a table to be closed).
|
|
|
|
@item
|
|
The current @code{refresh_version} is also compared to the open
|
|
@code{refresh_version} after a thread gets a lock on a table. If the
|
|
refresh version is different the thread will free all locks, reopen the
|
|
table and try to get the locks again; This is just to quickly get all
|
|
tables to use the newest version. This is handled by
|
|
@file{sql/lock.cc::mysql_lock_tables()} and
|
|
@file{sql/sql_base.cc::wait_for_tables()}.
|
|
|
|
@item
|
|
When all tables has been closed @code{FLUSH TABLES} will return an ok
|
|
to client.
|
|
|
|
@item
|
|
If the thread that is doing @code{FLUSH TABLES} has a lock on some tables,
|
|
it will first close the locked tables, then wait until all other threads
|
|
have also closed them, and then reopen them and get the locks.
|
|
After this it will give other threads a chance to open the same tables.
|
|
|
|
@end itemize
|
|
|
|
@node filesort, coding guidelines, flush tables, Top
|
|
@chapter How MySQL Does Sorting (@code{filesort})
|
|
|
|
@itemize @bullet
|
|
|
|
@item
|
|
Read all rows according to key or by table scanning.
|
|
|
|
@item
|
|
Store the sort-key in a buffer (@code{sort_buffer}).
|
|
|
|
@item
|
|
When the buffer gets full, run a @code{qsort} on it and store the result
|
|
in a temporary file. Save a pointer to the sorted block.
|
|
|
|
@item
|
|
Repeat the above until all rows have been read.
|
|
|
|
@item
|
|
Repeat the following until there is less than @code{MERGEBUFF2} (15)
|
|
blocks left.
|
|
|
|
@item
|
|
Do a multi-merge of up to @code{MERGEBUFF} (7) regions to one block in
|
|
another temporary file. Repeat until all blocks from the first file
|
|
are in the second file.
|
|
|
|
@item
|
|
On the last multi-merge, only the pointer to the row (last part of
|
|
the sort-key) is written to a result file.
|
|
|
|
@item
|
|
Now the code in @file{sql/records.cc} will be used to read through them
|
|
in sorted order by using the row pointers in the result file.
|
|
To optimize this, we read in a big block of row pointers, sort these
|
|
and then we read the rows in the sorted order into a row buffer
|
|
(@code{record_buffer}).
|
|
|
|
@end itemize
|
|
|
|
|
|
@node coding guidelines, mysys functions, filesort, Top
|
|
@chapter Coding Guidelines
|
|
|
|
@itemize @bullet
|
|
|
|
@item
|
|
We are using @uref{http://www.bitkeeper.com/, BitKeeper} for source management.
|
|
|
|
@item
|
|
You should use the @strong{MySQL} 4.0 source for all developments.
|
|
|
|
@item
|
|
If you have any questions about the @strong{MySQL} source, you can post these
|
|
to @email{dev-public@@mysql.com} and we will answer them. Please
|
|
remember to not use this internal email list in public!
|
|
|
|
@item
|
|
Try to write code in a lot of black boxes that can be reused or use at
|
|
least a clean, easy to change interface.
|
|
|
|
@item
|
|
Reuse code; There is already a lot of algorithms in MySQL for list handling,
|
|
queues, dynamic and hashed arrays, sorting, etc. that can be reused.
|
|
|
|
@item
|
|
Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/
|
|
@code{my_malloc()} that you can find in the @code{mysys} library instead
|
|
of the direct system calls; This will make your code easier to debug and
|
|
more portable.
|
|
|
|
@item
|
|
Try to always write optimized code, so that you don't have to
|
|
go back and rewrite it a couple of months later. It's better to
|
|
spend 3 times as much time designing and writing an optimal function than
|
|
having to do it all over again later on.
|
|
|
|
@item
|
|
Avoid CPU wasteful code, even where it does not matter, so that
|
|
you will not develop sloppy coding habits.
|
|
|
|
@item
|
|
If you can write it in fewer lines, do it (as long as the code will not
|
|
be slower or much harder to read).
|
|
|
|
@item
|
|
Don't use two commands on the same line.
|
|
|
|
@item
|
|
Do not check the same pointer for @code{NULL} more than once.
|
|
|
|
@item
|
|
Use long function and variable names in English. This makes your code
|
|
easier to read.
|
|
|
|
@item
|
|
Use @code{my_var} as opposed to @code{myVar} or @code{MyVar} (@samp{_}
|
|
rather than dancing SHIFT to seperate words in identifiers).
|
|
|
|
@item
|
|
Think assembly - make it easier for the compiler to optimize your code.
|
|
|
|
@item
|
|
Comment your code when you do something that someone else may think
|
|
is not ``trivial''.
|
|
|
|
@item
|
|
Use @code{libstring} functions (in the @file{strings} directory)
|
|
instead of standard @code{libc} string functions whenever possible.
|
|
|
|
@item
|
|
Avoid using @code{malloc()} (its REAL slow); For memory allocations
|
|
that only need to live for the lifetime of one thread, one should use
|
|
@code{sql_alloc()} instead.
|
|
|
|
@item
|
|
Before making big design decisions, please first post a summary of
|
|
what you want to do, why you want to do it, and how you plan to do
|
|
it. This way we can easily provide you with feedback and also
|
|
easily discuss it thoroughly if some other developer thinks there is better
|
|
way to do the same thing!
|
|
|
|
@item
|
|
Class names start with a capital letter.
|
|
|
|
@item
|
|
Structure types are @code{typedef}'ed to an all-caps identifier.
|
|
|
|
@item
|
|
Any @code{#define}'s are in all-caps.
|
|
|
|
@item
|
|
Matching @samp{@{} are in the same column.
|
|
|
|
@item
|
|
Put the @samp{@{} after a @code{switch} on the same line, as this gives
|
|
better overall indentation for the switch statement:
|
|
|
|
@example
|
|
switch (arg) @{
|
|
@end example
|
|
|
|
@item
|
|
In all other cases, @samp{@{} and @samp{@}} should be on their own line, except
|
|
if there is nothing inside @samp{@{} and @samp{@}}.
|
|
|
|
@item
|
|
Have a space after @code{if}
|
|
|
|
@item
|
|
Put a space after @samp{,} for function arguments
|
|
|
|
@item
|
|
Functions return @samp{0} on success, and non-zero on error, so you can do:
|
|
|
|
@example
|
|
if(a() || b() || c()) @{ error("something went wrong"); @}
|
|
@end example
|
|
|
|
@item
|
|
Using @code{goto} is okay if not abused.
|
|
|
|
@item
|
|
Avoid default variable initalizations, use @code{LINT_INIT()} if the
|
|
compiler complains after making sure that there is really no way
|
|
the variable can be used uninitialized.
|
|
|
|
@item
|
|
Do not instantiate a class if you do not have to.
|
|
|
|
@item
|
|
Use pointers rather than array indexing when operating on strings.
|
|
|
|
@end itemize
|
|
|
|
Suggested mode in emacs:
|
|
|
|
@example
|
|
(load "cc-mode")
|
|
(setq c-mode-common-hook '(lambda ()
|
|
(turn-on-font-lock)
|
|
(setq comment-column 48)))
|
|
(setq c-style-alist
|
|
(cons
|
|
'("MY"
|
|
(c-basic-offset . 2)
|
|
(c-comment-only-line-offset . 0)
|
|
(c-offsets-alist . ((statement-block-intro . +)
|
|
(knr-argdecl-intro . 0)
|
|
(substatement-open . 0)
|
|
(label . -)
|
|
(statement-cont . +)
|
|
(arglist-intro . c-lineup-arglist-intro-after-paren)
|
|
(arglist-close . c-lineup-arglist)
|
|
))
|
|
)
|
|
c-style-alist))
|
|
(c-set-style "MY")
|
|
(setq c-default-style "MY")
|
|
@end example
|
|
|
|
|
|
@node mysys functions, DBUG, coding guidelines, Top
|
|
@chapter Functions In The @code{mysys} Library
|
|
|
|
Functions in @code{mysys}: (For flags see @file{my_sys.h})
|
|
|
|
@table @code
|
|
@item int my_copy _A((const char *from, const char *to, myf MyFlags));
|
|
Copy file from @code{from} to @code{to}.
|
|
|
|
@item int my_delete _A((const char *name, myf MyFlags));
|
|
Delete file @code{name}.
|
|
|
|
@item int my_getwd _A((string buf, uint size, myf MyFlags));
|
|
@item int my_setwd _A((const char *dir, myf MyFlags));
|
|
Get and set working directory.
|
|
|
|
@item string my_tempnam _A((const char *pfx, myf MyFlags));
|
|
Make a unique temporary file name by using dir and adding something after
|
|
@code{pfx} to make name unique. The file name is made by adding a unique
|
|
six character string and @code{TMP_EXT} after @code{pfx}.
|
|
Returns pointer to @code{malloc()}'ed area for filename. Should be freed by
|
|
@code{free()}.
|
|
|
|
@item File my_open _A((const char *FileName,int Flags,myf MyFlags));
|
|
@item File my_create _A((const char *FileName, int CreateFlags, int AccsesFlags, myf MyFlags));
|
|
@item int my_close _A((File Filedes, myf MyFlags));
|
|
@item uint my_read _A((File Filedes, byte *Buffer, uint Count, myf MyFlags));
|
|
@item uint my_write _A((File Filedes, const byte *Buffer, uint Count, myf MyFlags));
|
|
@item ulong my_seek _A((File fd,ulong pos,int whence,myf MyFlags));
|
|
@item ulong my_tell _A((File fd,myf MyFlags));
|
|
Use instead of open, open-with-create-flag, close, read, and write
|
|
to get automatic error messages (flag @code{MYF_WME}) and only have
|
|
to test for != 0 if error (flag @code{MY_NABP}).
|
|
|
|
@item int my_rename _A((const char *from, const char *to, myf MyFlags));
|
|
Rename file from @code{from} to @code{to}.
|
|
|
|
@item FILE *my_fopen _A((const char *FileName,int Flags,myf MyFlags));
|
|
@item FILE *my_fdopen _A((File Filedes,int Flags,myf MyFlags));
|
|
@item int my_fclose _A((FILE *fd,myf MyFlags));
|
|
@item uint my_fread _A((FILE *stream,byte *Buffer,uint Count,myf MyFlags));
|
|
@item uint my_fwrite _A((FILE *stream,const byte *Buffer,uint Count, myf MyFlags));
|
|
@item ulong my_fseek _A((FILE *stream,ulong pos,int whence,myf MyFlags));
|
|
@item ulong my_ftell _A((FILE *stream,myf MyFlags));
|
|
Same read-interface for streams as for files.
|
|
|
|
@item gptr _mymalloc _A((uint uSize,const char *sFile,uint uLine, myf MyFlag));
|
|
@item gptr _myrealloc _A((string pPtr,uint uSize,const char *sFile,uint uLine, myf MyFlag));
|
|
@item void _myfree _A((gptr pPtr,const char *sFile,uint uLine));
|
|
@item int _sanity _A((const char *sFile,unsigned int uLine));
|
|
@item gptr _myget_copy_of_memory _A((const byte *from,uint length,const char *sFile, uint uLine,myf MyFlag));
|
|
@code{malloc(size,myflag)} is mapped to these functions if not compiled
|
|
with @code{-DSAFEMALLOC}.
|
|
|
|
@item void TERMINATE _A((void));
|
|
Writes @code{malloc()} info on @code{stdout} if compiled with
|
|
@code{-DSAFEMALLOC}.
|
|
|
|
@item int my_chsize _A((File fd, ulong newlength, myf MyFlags));
|
|
Change size of file @code{fd} to @code{newlength}.
|
|
|
|
@item void my_error _D((int nr, myf MyFlags, ...));
|
|
Writes message using error number (see @file{mysys/errors.h}) on @code{stdout},
|
|
or using curses, if @code{MYSYS_PROGRAM_USES_CURSES()} has been called.
|
|
|
|
@item void my_message _A((const char *str, myf MyFlags));
|
|
Writes @code{str} on @code{stdout}, or using curses, if
|
|
@code{MYSYS_PROGRAM_USES_CURSES()} has been called.
|
|
|
|
@item void my_init _A((void ));
|
|
Start each program (in @code{main()}) with this.
|
|
|
|
@item void my_end _A((int infoflag));
|
|
Gives info about program.
|
|
If @code{infoflag & MY_CHECK_ERROR}, prints if some files are left open.
|
|
If @code{infoflag & MY_GIVE_INFO}, prints timing info and malloc info
|
|
about program.
|
|
|
|
@item int my_redel _A((const char *from, const char *to, int MyFlags));
|
|
Delete @code{from} before rename of @code{to} to @code{from}. Copies state
|
|
from old file to new file. If @code{MY_COPY_TIME} is set, sets old time.
|
|
|
|
@item int my_copystat _A((const char *from, const char *to, int MyFlags));
|
|
Copy state from old file to new file. If @code{MY_COPY_TIME} is set,
|
|
sets old time.
|
|
|
|
@item string my_filename _A((File fd));
|
|
Returns filename of open file.
|
|
|
|
@item int dirname _A((string to, const char *name));
|
|
Copy name of directory from filename.
|
|
|
|
@item int test_if_hard_path _A((const char *dir_name));
|
|
Test if @code{dir_name} is a hard path (starts from root).
|
|
|
|
@item void convert_dirname _A((string name));
|
|
Convert dirname according to system.
|
|
In MSDOS, changes all characters to capitals and changes @samp{/} to @samp{\}.
|
|
|
|
@item string fn_ext _A((const char *name));
|
|
Returns pointer to extension in filename.
|
|
|
|
@item string fn_format _A((string to,const char *name,const char *dsk,const char *form,int flag));
|
|
format a filename with replace of library and extension and
|
|
converts between different systems.
|
|
params to and name may be identicall
|
|
function dosn't change name if name != to
|
|
Flag may be: 1 force replace filnames library with 'dsk'
|
|
2 force replace extension with 'form' */
|
|
4 force Unpack filename (replace ~ with home)
|
|
8 Pack filename as short as possibly for output to
|
|
user.
|
|
All open requests should allways use at least:
|
|
"open(fn_format(temp_buffe,name,"","",4),...)" to unpack home and
|
|
convert filename to system-form.
|
|
|
|
@item string fn_same _A((string toname, const char *name, int flag));
|
|
Copys directory and extension from @code{name} to @code{toname} if neaded.
|
|
Copying can be forced by same flags used in @code{fn_format()}.
|
|
|
|
@item int wild_compare _A((const char *str, const char *wildstr));
|
|
Compare if @code{str} matches @code{wildstr}. @code{wildstr} can contain
|
|
@samp{*} and @samp{?} as wildcard characters.
|
|
Returns 0 if @code{str} and @code{wildstr} match.
|
|
|
|
@item void get_date _A((string to, int timeflag));
|
|
Get current date in a form ready for printing.
|
|
|
|
@item void soundex _A((string out_pntr, string in_pntr))
|
|
Makes @code{in_pntr} to a 5 char long string. All words that sound
|
|
alike have the same string.
|
|
|
|
@item int init_key_cache _A((ulong use_mem, ulong leave_this_much_mem));
|
|
Use caching of keys in MISAM, PISAM, and ISAM.
|
|
@code{KEY_CACHE_SIZE} is a good size.
|
|
Remember to lock databases for optimal caching.
|
|
|
|
@item void end_key_cache _A((void));
|
|
End key caching.
|
|
@end table
|
|
|
|
|
|
|
|
@node DBUG, protocol, mysys functions, Top
|
|
@chapter DBUG Tags To Use
|
|
|
|
Here is some of the tags we now use:
|
|
(We should probably add a couple of new ones)
|
|
|
|
@table @code
|
|
@item enter
|
|
Arguments to the function.
|
|
|
|
@item exit
|
|
Results from the function.
|
|
|
|
@item info
|
|
Something that may be interesting.
|
|
|
|
@item warning
|
|
When something doesn't go the usual route or may be wrong.
|
|
|
|
@item error
|
|
When something went wrong.
|
|
|
|
@item loop
|
|
Write in a loop, that is probably only useful when debugging
|
|
the loop. These should normally be deleted when one is
|
|
satisfied with the code and it has been in real use for a while.
|
|
@end table
|
|
|
|
Some specific to mysqld, because we want to watch these carefully:
|
|
|
|
@table @code
|
|
@item trans
|
|
Starting/stopping transactions.
|
|
|
|
@item quit
|
|
@code{info} when mysqld is preparing to die.
|
|
|
|
@item query
|
|
Print query.
|
|
@end table
|
|
|
|
|
|
@node protocol, Fulltext Search, DBUG, Top
|
|
@chapter MySQL Client/Server Protocol
|
|
|
|
@menu
|
|
* raw packet without compression::
|
|
* raw packet with compression::
|
|
* basic packets::
|
|
* communication::
|
|
* fieldtype codes::
|
|
@end menu
|
|
|
|
@node raw packet without compression, raw packet with compression, protocol, protocol
|
|
@section Raw Packet Without Compression
|
|
|
|
@example
|
|
+-----------------------------------------------+
|
|
| Packet Length | Packet no | Data |
|
|
| 3 Bytes | 1 Byte | n Bytes |
|
|
+-----------------------------------------------+
|
|
@end example
|
|
|
|
@table @asis
|
|
@item 3 Byte packet length
|
|
The length is calculated with int3store
|
|
See include/global.h for details.
|
|
The max packetsize can be 16 MB.
|
|
|
|
@item 1 Byte packet no
|
|
If no compression is used the first 4 bytes of each packet is the header
|
|
of the packet. The packet number is incremented for each sent packet.
|
|
The first packet starts with 0.
|
|
@item n Byte data
|
|
|
|
@end table
|
|
|
|
The packet length can be recalculated with:
|
|
|
|
@example
|
|
length = byte1 + (256 * byte2) + (256 * 256 * byte3)
|
|
@end example
|
|
|
|
|
|
@node raw packet with compression, basic packets, raw packet without compression, protocol
|
|
@section Raw Packet With Compression
|
|
|
|
@example
|
|
+---------------------------------------------------+
|
|
| Packet Length | Packet no | Uncomp. Packet Length |
|
|
| 3 Bytes | 1 Byte | 3 Bytes |
|
|
+---------------------------------------------------+
|
|
@end example
|
|
|
|
@table @asis
|
|
@item 3 Byte packet length
|
|
The length is calculated with int3store
|
|
See include/global.h for details.
|
|
The max packetsize can be 16 MB.
|
|
|
|
@item 1 Byte packet no
|
|
@item 3 Byte uncompressed packet length
|
|
@end table
|
|
|
|
If compression is used the first 7 bytes of each packet
|
|
is the header of the packet.
|
|
|
|
|
|
@node basic packets, communication, raw packet with compression, protocol
|
|
@section Basic Packets
|
|
|
|
@menu
|
|
* ok packet::
|
|
* error packet::
|
|
@end menu
|
|
|
|
|
|
@node ok packet, error packet, basic packets, basic packets
|
|
@subsection OK Packet
|
|
|
|
For details, see @file{sql/net_pkg.cc::send_ok()}.
|
|
|
|
@example
|
|
+-----------------------------------------------+
|
|
| Header | No of Rows | Affected Rows |
|
|
| | 1 Byte | 1-8 Byte |
|
|
|-----------------------------------------------|
|
|
| ID (last_insert_id) | Status | Length |
|
|
| 1-8 Byte | 2 Byte | 1-8 Byte |
|
|
|-----------------------------------------------|
|
|
| Messagetext |
|
|
| n Byte |
|
|
+-----------------------------------------------+
|
|
@end example
|
|
|
|
@table @asis
|
|
@item Header
|
|
@item 1 byte number of rows ? (always 0 ?)
|
|
@item 1-8 bytes affected rows
|
|
@item 1-8 byte id (last_insert_id)
|
|
@item 2 byte Status (usually 0)
|
|
@item If the OK-packege includes a message:
|
|
@item 1-8 bytes length of message
|
|
@item n bytes messagetext
|
|
@end table
|
|
|
|
|
|
@node error packet, , ok packet, basic packets
|
|
@subsection Error Packet
|
|
|
|
@example
|
|
+-----------------------------------------------+
|
|
| Header | Status code | Error no |
|
|
| | 1 Byte | 2 Byte |
|
|
|-----------------------------------------------|
|
|
| Messagetext | 0x00 |
|
|
| n Byte | 1 Byte |
|
|
+-----------------------------------------------+
|
|
@end example
|
|
|
|
@table @asis
|
|
@item Header
|
|
@item 1 byte status code (0xFF = ERROR)
|
|
@item 2 byte error number (is only sent to new 3.23 clients.
|
|
@item n byte errortext
|
|
@item 1 byte 0x00
|
|
@end table
|
|
|
|
|
|
@node communication, fieldtype codes, basic packets, protocol
|
|
@section Communication
|
|
|
|
> Packet from server to client
|
|
< Paket from client tor server
|
|
|
|
Login
|
|
------
|
|
> 1. packet
|
|
Header
|
|
1 byte protocolversion
|
|
n byte serverversion
|
|
1 byte 0x00
|
|
4 byte threadnumber
|
|
8 byte crypt seed
|
|
1 byte 0x00
|
|
2 byte CLIENT_xxx options (see include/mysql_com.h
|
|
that is supported by the server
|
|
1 byte number of current server charset
|
|
2 byte server status variables (SERVER_STATUS_xxx flags)
|
|
13 byte 0x00 (not used yet).
|
|
|
|
< 2. packet
|
|
Header
|
|
2 byte CLIENT_xxx options
|
|
3 byte max_allowed_packet for the client
|
|
n byte username
|
|
1 byte 0x00
|
|
8 byte crypted password
|
|
1 byte 0x00
|
|
n byte databasename
|
|
1 byte 0x00
|
|
|
|
> 3. packet
|
|
OK-packet
|
|
|
|
|
|
Command
|
|
--------
|
|
< 1. packet
|
|
Header
|
|
1 byte command type (e.g.0x03 = query)
|
|
n byte query
|
|
|
|
Result set (after command)
|
|
--------------------------
|
|
> 2. packet
|
|
Header
|
|
1-8 byte field_count (packed with net_store_length())
|
|
|
|
If field_count == 0 (command):
|
|
1-8 byte affected rows
|
|
1-8 byte insert id
|
|
2 bytes server_status (SERVER_STATUS_xx)
|
|
|
|
If field_count == NULL_LENGTH (251)
|
|
LOAD DATA LOCAL INFILE
|
|
|
|
If field_count > 0 Result Set:
|
|
|
|
> n packets
|
|
Header Info
|
|
Column description: 5 data object /column
|
|
(See code in unpack_fields())
|
|
|
|
Columninfo for each column:
|
|
1 data block table_name
|
|
1 byte length of block
|
|
n byte data
|
|
1 data block field_name
|
|
1 byte length of block...
|
|
n byte data
|
|
1 data block display length of field
|
|
1 byte length of block
|
|
3 bytes display length of filed
|
|
1 data block type field of type (enum_field_types)
|
|
1 byte length of block
|
|
1 bytexs field of type
|
|
1 data block flags
|
|
1 byte length of block
|
|
2 byte flags for the columns (NOT_NULL_FLAG, ZEROFILL_FLAG....)
|
|
1 byte decimals
|
|
|
|
if table definition:
|
|
1 data block default value
|
|
|
|
Actual result (one packet per row):
|
|
4 byte header
|
|
1-8 byte length of data
|
|
n data
|
|
|
|
|
|
@node fieldtype codes, , communication, protocol
|
|
@section Fieldtype Codes
|
|
|
|
@example
|
|
display_length |enum_field_type |flags
|
|
----------------------------------------------------
|
|
Blob 03 FF FF 00 |01 FC |03 90 00 00
|
|
Mediumblob 03 FF FF FF |01 FC |03 90 00 00
|
|
Tinyblob 03 FF 00 00 |01 FC |03 90 00 00
|
|
Text 03 FF FF 00 |01 FC |03 10 00 00
|
|
Mediumtext 03 FF FF FF |01 FC |03 10 00 00
|
|
Tinytext 03 FF 00 00 |01 FC |03 10 00 00
|
|
Integer 03 0B 00 00 |01 03 |03 03 42 00
|
|
Mediumint 03 09 00 00 |01 09 |03 00 00 00
|
|
Smallint 03 06 00 00 |01 02 |03 00 00 00
|
|
Tinyint 03 04 00 00 |01 01 |03 00 00 00
|
|
Varchar 03 XX 00 00 |01 FD |03 00 00 00
|
|
Enum 03 05 00 00 |01 FE |03 00 01 00
|
|
Datetime 03 13 00 00 |01 0C |03 00 00 00
|
|
Timestamp 03 0E 00 00 |01 07 |03 61 04 00
|
|
Time 03 08 00 00 |01 0B |03 00 00 00
|
|
Date 03 0A 00 00 |01 0A |03 00 00 00
|
|
@end example
|
|
|
|
@c The Index was empty, and ugly, so I removed it. (jcole, Sep 7, 2000)
|
|
|
|
@c @node Index
|
|
@c @unnumbered Index
|
|
|
|
@c @printindex fn
|
|
|
|
@node Fulltext Search, , protocol, Top
|
|
@chapter Fulltext Search in MySQL
|
|
|
|
Hopefully, sometime there will be complete description of
|
|
fulltext search algorithms.
|
|
Now it's just unsorted notes.
|
|
|
|
@menu
|
|
* Weighting in boolean mode::
|
|
@end menu
|
|
|
|
@node Weighting in boolean mode, , , Fulltext Search
|
|
@section Weighting in boolean mode
|
|
|
|
The basic idea is as follows: in expression
|
|
@code{A or B or (C and D and E)}, either @code{A} or @code{B} alone
|
|
is enough to match the whole expression. While @code{C},
|
|
@code{D}, and @code{E} should @strong{all} match. So it's
|
|
reasonable to assign weight 1 to @code{A}, @code{B}, and
|
|
@code{(C and D and E)}. And @code{C}, @code{D}, and @code{E}
|
|
should get a weight of 1/3.
|
|
|
|
Things become more complicated when considering boolean
|
|
operators, as used in MySQL FTB. Obvioulsy, @code{+A +B}
|
|
should be treated as @code{A and B}, and @code{A B} -
|
|
as @code{A or B}. The problem is, that @code{+A B} can @strong{not}
|
|
be rewritten in and/or terms (that's the reason why this - extended -
|
|
set of operators was chosen). Still, aproximations can be used.
|
|
@code{+A B C} can be approximated as @code{A or (A and (B or C))}
|
|
or as @code{A or (A and B) or (A and C) or (A and B and C)}.
|
|
Applying the above logic (and omitting mathematical
|
|
transformations and normalization) one gets that for
|
|
@code{+A_1 +A_2 ... +A_N B_1 B_2 ... B_M} the weights
|
|
should be: @code{A_i = 1/N}, @code{B_j=1} if @code{N==0}, and,
|
|
otherwise, in the first rewritting approach @code{B_j = 1/3},
|
|
and in the second one - @code{B_j = (1+(M-1)*2^M)/(M*(2^(M+1)-1))}.
|
|
|
|
The second expression gives somewhat steeper increase in total
|
|
weight as number of matched B's increases, because it assigns
|
|
higher weights to individual B's. Also the first expression in
|
|
much simplier. So it is the first one, that is implemented in MySQL.
|
|
|
|
@summarycontents
|
|
@contents
|
|
|
|
@bye
|