MySQL Client/Server Protocol Documentation
|
|
|
|
|
|
|
|
Introduction
|
|
------------
|
|
|
|
|
|
This paper has the objective of presenting a thorough description
|
|
of the client/server protocol that is embodied in MySQL. Particularly,
|
|
this paper aims to document and describe:
|
|
|
|
- manner in which MySQL server detects client connection requests and
|
|
creates connection
|
|
- manner in which MySQL client C API call connects to server
- the entire protocol of sending/receiving data by MySQL server and C API
code
|
|
- manner in which queries are sent by client C API calls to server
|
|
- manner in which query results are sent by server
|
|
- manner in which query results are resolved by server
|
|
- sending and receiving of error messages
|
|
|
|
|
|
This paper does not have the goal of describing or documenting other
|
|
related MySQL issues, like usage of thread libraries, MySQL standard
|
|
library set, MySQL strings library and other MySQL specific libraries,
|
|
type definitions and utilities.
|
|
|
|
Issues that are covered by this paper are contained in the following
|
|
source code files:
|
|
|
|
- libmysql/net.c and sql/net_serv.cc, the two being identical
|
|
- client/libmysql.c (not entire file is covered)
|
|
- include/mysql_com.h
|
|
- include/mysql.h
|
|
- sql/mysqld.cc (not entire file is covered)
|
|
- sql/net_pkg.cc
|
|
- sql/sql_base.cc (not entire file is covered)
|
|
- sql/sql_select.cc (not entire file is covered)
|
|
- sql/sql_parse.cc (not entire file is covered)
|
|
|
|
Note: libmysql/net.c was client/net.c prior to MySQL 3.23.11.
|
|
sql/net_serv.cc was sql/net_serv.c prior to MySQL 3.23.16.
|
|
|
|
Besides this introduction, this paper presents basic definitions,
|
|
constants, structures and global variables, all related functions in
|
|
server and in C API. A textual description of how the entire protocol
functions is given in the last chapter of this paper.
|
|
|
|
|
|
Constants, structures and global variables
|
|
------------------------------------------
|
|
|
|
This chapter will describe all constants, structures and
|
|
global variables relevant to client/server protocol.
|
|
|
|
Constants
|
|
|
|
They are important as they contain default values, the ones
|
|
that are valid if options are not set in any other way. Besides that,
|
|
MySQL source code does not contain a single non-defined constant in
|
|
its code. This description of constants does not include
|
|
configuration and conditional compilation #defines.
|
|
|
|
NAME_LEN - field and table name length, current value 64
|
|
HOSTNAME_LENGTH - length of the hostname, current value 64
|
|
USERNAME_LENGTH - username length, current value 16
|
|
MYSQL_PORT - default TCP/IP port number, current value 3306
|
|
MYSQL_UNIX_ADDR - full path of the default Unix socket file, current value
|
|
"/tmp/mysql.sock"
|
|
MYSQL_NAMEDPIPE - full path of the default NT pipe file, current value
|
|
"MySQL"
|
|
MYSQL_SERVICENAME - name of the MySQL Service on NT, current value "MySQL"
|
|
NET_HEADER_SIZE - size of the network header, when no
|
|
compression is used, current value 4
|
|
COMP_HEADER_SIZE - additional size of network header when
|
|
compression is used, current value 3
|
|
|
|
What follows is a set of constants, defined in source only, which
|
|
define capabilities of the client built with that version of C
|
|
API. Simply, when some new feature is added in client, that client
|
|
feature is defined, so that server can detect what capabilities a
|
|
client program has.
|
|
|
|
CLIENT_LONG_PASSWORD - client supports new more secure passwords
|
|
CLIENT_LONG_FLAG - client uses longer flags
|
|
CLIENT_CONNECT_WITH_DB - client can specify db on connect
|
|
CLIENT_COMPRESS - client can use compression protocol
|
|
CLIENT_ODBC - ODBC client
|
|
CLIENT_LOCAL_FILES - client can use LOAD DATA INFILE LOCAL
|
|
CLIENT_IGNORE_SPACE - client can ignore spaces before '('
|
|
CLIENT_CHANGE_USER - client supports the mysql_change_user()
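How a server might test such capability flags can be sketched as follows.
This is only an illustration: the DEMO_ names and their bit values are
assumptions made for this example (the real values live in
include/mysql_com.h), and the idea that usable features are the
intersection of client and server flags is likewise an assumption.

```c
#include <assert.h>

/* Illustrative bit values only (assumed); the real definitions are in
   include/mysql_com.h. */
#define DEMO_CLIENT_LONG_PASSWORD   1U
#define DEMO_CLIENT_CONNECT_WITH_DB 8U
#define DEMO_CLIENT_COMPRESS        32U

/* Returns nonzero when every bit in `wanted` is set in `client_flag`. */
int client_has(unsigned int client_flag, unsigned int wanted)
{
    assert(wanted != 0);
    return (client_flag & wanted) == wanted;
}

/* Assumed rule: a feature is usable only if both sides advertise it. */
unsigned int negotiated(unsigned int client_flag,
                        unsigned int server_capabilities)
{
    return client_flag & server_capabilities;
}
```

A caller would typically combine flags with bitwise OR and pass the result
in the client_flag field described later in this paper.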
|
|
|
|
What follows are other constants, pertaining to timeouts and sizes
|
|
|
|
MYSQL_ERRMSG_SIZE - maximum size of error message string, current value 200
|
|
NET_READ_TIMEOUT - read timeout, current value 30 seconds
|
|
NET_WRITE_TIMEOUT - write timeout, current value 60 seconds
|
|
NET_WAIT_TIMEOUT - wait for new query timeout, current value 8*60*60
|
|
seconds, that is, 8 hours
|
|
packet_error - value returned in case of socket errors, current
|
|
value -1
|
|
TEST_BLOCKING - used in debug mode for setting up blocking testing
|
|
RETRY_COUNT - number of times network read and write will be
|
|
retried, current value 1
|
|
|
|
There are also error messages for last_errno, which depict system
|
|
errors, and are used on the server only.
|
|
|
|
ER_NET_PACKET_TOO_LARGE - packet is larger than max_allowed_packet
|
|
ER_OUT_OF_RESOURCES - practically no more memory
|
|
ER_NET_ERROR_ON_WRITE - error in writing to NT Named Pipe
|
|
ER_NET_WRITE_INTERRUPTED - some signal or interrupt happened
|
|
during write
|
|
ER_NET_READ_ERROR_FROM_PIPE - error in reading from NT Named Pipe
|
|
ER_NET_FCNTL_ERROR - error in trying to set fcntl on socket
|
|
descriptor
|
|
ER_NET_PACKETS_OUT_OF_ORDER - packet numbers on client and
|
|
server side differ
|
|
ER_NET_UNCOMPRESS_ERROR - error in uncompress of compressed packet
|
|
|
|
|
|
Structs and enums
|
|
|
|
|
|
struct NET
|
|
|
|
This is MySQL's network handle structure, used in all client/server
|
|
read/write functions. On the server, it is initialized and preserved
|
|
in each thread. On the client, it is a part of the MYSQL struct,
|
|
which is the MySQL handle used in all C API functions. This structure
|
|
uniquely identifies a connection, either on the server or client
|
|
side. It consists of the following fields:
|
|
|
|
Vio* vio - explained above
|
|
HANDLE hPipe - Handle for NT Named Pipe file
|
|
my_socket fd - file descriptor used for both TCP/IP socket and
|
|
Unix socket file
|
|
int fcntl - contains info on fcntl options used on fd. Mostly
|
|
used for saving info if blocking is used or not
|
|
unsigned char *buff - network buffer used for storing data for
|
|
reading from/writing to socket
|
|
unsigned char *buff_end - points to the end of buff
|
|
unsigned char *write_pos - present writing position in buff
|
|
unsigned char *read_pos - present reading position in buff. This
|
|
pointer is used for reading data after
|
|
calling my_net_read function and functions
|
|
that are just its wrappers
|
|
char last_error[MYSQL_ERRMSG_SIZE] - holds last error message
|
|
unsigned int last_errno - holds last error code of the network
|
|
protocol. Its possible values are listed
|
|
in above constants. It is used only on
|
|
the server side
|
|
unsigned int max_packet - holds current value of buff size
|
|
unsigned int timeout - stores read timeout value for that connection
|
|
unsigned int pkt_nr - stores the value of the current packet number in
|
|
a batch of packets. Used primarily for
|
|
detection of protocol errors resulting in a
|
|
mismatch
|
|
my_bool error - holds either 1 or 0 depending on the error condition
|
|
my_bool return_errno - if its value != 0 then there is an error in
|
|
protocol mismatch between client and server
|
|
my_bool compress - if true compression is used in the protocol
|
|
unsigned long remain_in_buf - used only in reading compressed packets.
|
|
Explained in my_net_read
|
|
unsigned long length - used only for storing the length of the read
|
|
packet. Explained in my_net_read
|
|
unsigned long buf_length - used only in reading compressed packets.
|
|
Explained in my_net_read
|
|
unsigned long where_b - used only in reading compressed packets.
|
|
Explained in my_net_read
|
|
short int more - used for reporting in mysql_list_processes
|
|
char save_char - used in reading compressed packets for saving chars
|
|
in order to make zero-delimited strings. Explained
|
|
in my_net_read
|
|
|
|
A few typedefs will be defined for easier understanding of the text that
|
|
follows.
|
|
|
|
typedef char **MYSQL_ROW - data containing one row of values
|
|
|
|
typedef unsigned int MYSQL_FIELD_OFFSET - offset in bytes of the current field
|
|
|
|
typedef MYSQL_ROWS *MYSQL_ROW_OFFSET - offset in bytes of the current row
|
|
|
|
struct MYSQL_FIELD - contains all info on the attributes of a
|
|
specific column in a result set, plus info on lengths of the column in
|
|
a result set. This struct is tagged as st_mysql_field. This structure
|
|
consists of the following fields:
|
|
|
|
char *name - name of column
|
|
char *table - table of column if column was a field and not
|
|
an expression or constant
|
|
char *def - default value (set by mysql_list_fields)
|
|
enum enum_field_types type - see above
|
|
unsigned int length - width of column in the current row
|
|
unsigned int max_length - maximum width of that column in entire
|
|
result set
|
|
unsigned int flags - corresponding to Extra in DESCRIBE
|
|
unsigned int decimals - number of decimals in field
|
|
|
|
|
|
struct MYSQL_ROWS - a node for each row in the single linked
|
|
list forming entire result set. This struct is tagged as
|
|
st_mysql_rows, and has two fields:
|
|
|
|
struct st_mysql_rows *next - pointer to the next one
|
|
MYSQL_ROW data - see above
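Walking such a singly linked list of rows can be sketched as below. The
DEMO_ROW and DEMO_ROWS types are hypothetical stand-ins that mirror the
shape of MYSQL_ROW and st_mysql_rows as described here; they are not the
definitions from include/mysql.h.

```c
#include <assert.h>
#include <stddef.h>

typedef char **DEMO_ROW;            /* stand-in for MYSQL_ROW */

typedef struct demo_rows {          /* stand-in for st_mysql_rows */
    struct demo_rows *next;         /* pointer to the next node, NULL at end */
    DEMO_ROW data;                  /* column values of this row */
} DEMO_ROWS;

/* Count the nodes by following next pointers until the list ends. */
unsigned long count_rows(const DEMO_ROWS *head)
{
    unsigned long n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

Because each node only knows its successor, reaching row N requires
walking N nodes from the head, which is why a separate row counter is kept
alongside the list.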
|
|
|
|
|
|
struct MYSQL_DATA - contains all rows from result set. It is
|
|
tagged as st_mysql_data and has following fields:
|
|
|
|
my_ulonglong rows - how many rows
|
|
unsigned int fields - how many columns
|
|
MYSQL_ROWS *data - see above. This is the first node of the linked list
|
|
MEM_ROOT alloc - MEM_ROOT is MySQL memory allocation structure, and
|
|
this field is used to store all fields and rows.
|
|
|
|
|
|
struct st_mysql_options - holds various client options, and
|
|
contains following fields:
|
|
|
|
unsigned int connect_timeout - time in seconds for connection
|
|
unsigned int client_flag - used to hold client capabilities
|
|
my_bool compress - boolean for compression
|
|
my_bool named_pipe - is Named Pipe used? (on NT)
|
|
unsigned int port - what TCP port is used
|
|
char *host - host to connect to
|
|
char *init_command - command to be executed upon connection
|
|
char *user - account name on MySQL server
|
|
char *password - password for the above
|
|
char *unix_socket - full path for Unix socket file
|
|
char *db - default database
|
|
char *my_cnf_file - optional configuration file
|
|
char *my_cnf_group - optional header for options
|
|
|
|
|
|
struct MYSQL - MySQL client's handle. Required for any
|
|
operation issued from client to server. Tagged as st_mysql and having
|
|
following fields:
|
|
|
|
NET net - see above
|
|
char *host - host on which MySQL server is running
|
|
char *user - MySQL username
|
|
char *passwd - password for above
|
|
char *unix_socket- full path of Unix socket file
|
|
char *server_version - version of the server
|
|
char *host_info - contains info on how the connection has been
|
|
established, TCP port, socket or Named Pipe
|
|
char *info - used to store information on the query results,
|
|
like number of rows affected etc.
|
|
char *db - current database
|
|
unsigned int port - TCP port in use
|
|
unsigned int client_flag - client capabilities
|
|
unsigned int server_capabilities - server capabilities
|
|
unsigned int protocol_version - version of the protocol
|
|
unsigned int field_count - used for storing number of fields
|
|
immediately upon execution of a query,
|
|
but before fetching rows
|
|
unsigned long thread_id - server thread to which this connection
|
|
is attached
|
|
my_ulonglong affected_rows - used for storing number of rows
|
|
immediately upon execution of a query,
|
|
but before fetching rows
|
|
my_ulonglong insert_id - fetching LAST_INSERT_ID() through client C API
|
|
my_ulonglong extra_info - used by mysqlshow
|
|
unsigned long packet_length - saving size of the first packet upon
|
|
execution of a query
|
|
enum mysql_status status - see above
|
|
MYSQL_FIELD *fields - see above
|
|
MEM_ROOT field_alloc - memory used for storing previous field (fields)
|
|
my_bool free_me - boolean that flags if MYSQL was allocated in mysql_init
|
|
my_bool reconnect - used to automatically reconnect
|
|
struct st_mysql_options options - see above
|
|
char scramble_buff[9] - key for scrambling password before sending it
|
|
to server
|
|
|
|
|
|
struct MYSQL_RES - tagged as st_mysql_res and used to store
|
|
entire result set from a single query. Contains following fields:
|
|
|
|
my_ulonglong row_count - number of rows
|
|
unsigned int field_count - number of columns
|
|
unsigned int current_field - cursor for fetching fields
|
|
MYSQL_FIELD *fields - see above
|
|
MYSQL_DATA *data - see above, and used in buffered reads, that is,
|
|
mysql_store_result only
|
|
MYSQL_ROWS *data_cursor - pointing to the field of above "data"
|
|
MEM_ROOT field_alloc - memory allocation for above "fields"
|
|
MYSQL_ROW row - used for storing row by row in unbuffered reads,
|
|
that is, in mysql_use_result
|
|
MYSQL_ROW current_row - cursor to the current row for buffered reads
|
|
unsigned long *lengths - column lengths of current row
|
|
MYSQL *handle - see above, used in unbuffered reads, that is, in
|
|
mysql_use_result
|
|
my_bool eof - used by mysql_fetch_row as a marker for end of data
|
|
|
|
|
|
|
|
Global variables
|
|
|
|
|
|
unsigned long max_allowed_packet - maximum allowable value of network
|
|
buffer. Default value - 1MB
|
|
|
|
unsigned long net_buffer_length - default, starting value of network
|
|
buffer - 8KB
|
|
|
|
unsigned long bytes_sent - total number of bytes written since startup
|
|
of the server
|
|
|
|
unsigned long bytes_received - total number of bytes read since startup
|
|
of the server
|
|
|
|
|
|
Synopsis of the basic client/server protocol
|
|
--------------------------------------------
|
|
|
|
Purpose of this chapter is to provide a complete picture of
|
|
the basic client/server protocol implemented in MySQL. It was felt
|
|
it is necessary after writing descriptions for all of the functions
|
|
involved in basic protocol. There are at present 11 functions
|
|
involved, with several structures, many constants etc, which are all
|
|
described in detail. But as a forest could not be seen from the trees,
|
|
so the concept of the protocol could not be deciphered easily from a
|
|
thorough documentation of minutiae.
|
|
|
|
Although the concept of the protocol was not changed with the
introduction of the VIO system, embodied in the violite.c source file,
its introduction has changed the code substantially. Before
|
|
VIO was introduced, functions for reading from/writing to network
|
|
connection had to deal with various network standards. So, these functions
|
|
depended on whether TCP port or Unix socket file or NT Named Pipe file is
|
|
used. This is all changed now and single vio_ functions are called, while
|
|
all this diversity is covered by vio_ functions.
|
|
|
|
In MySQL a specific buffered network input/output transport model
|
|
has been implemented. Although each operating system may have its
|
|
own buffering for network connections, MySQL has added its own
|
|
buffering model. This is the same for each of the three transport protocol
|
|
types that are used in MySQL client/server communications, which
|
|
are TCP/IP sockets (on all systems), Unix socket files on Unix and
|
|
Unix-like operating systems and Named Pipe files on NT. Although
|
|
TCP/IP sockets are omnipresent, the latter two types have been added
|
|
for local connections. Those two connection types can be used in
|
|
local mode only, that is, when both client and server reside on the
|
|
same host, and are introduced because they enable better speeds for
|
|
local connections. This is especially useful for WWW type of
|
|
applications. Startup options of MySQL server allow that either
|
|
TCP/IP sockets or local connection (OS dependent) can be disallowed.
|
|
|
|
In order to implement buffered input/output, MySQL allocates a
|
|
buffer. The starting size of this buffer is determined by the value
|
|
of the global variable net_buffer_length, which can be changed at
|
|
MySQL server startup. This is, as explained, only the startup length
|
|
of MySQL network buffer. Because a single item that has to be read
|
|
or written can be larger than that value, MySQL will increase buffer
|
|
size until it reaches the value of the global variable
|
|
max_allowed_packet, which is also settable at server startup. Maximum
|
|
value of this variable is limited by the way MySQL stores/reads
|
|
sizes of packets to be sent/read, which means by the way MySQL
|
|
formats packets.
|
|
|
|
Basically each packet consists of two parts, a header and data. In
|
|
the case when compression is not used, header consists of 4 bytes
|
|
of which 3 contain the length of the packet to be sent and one holds
|
|
the packet number. When compression is used there are another 3
|
|
bytes which store the size of uncompressed data. Because of the way
|
|
MySQL packs length into 3 bytes, plus due to the usage of some
|
|
special values in the most significant byte, maximum size of
|
|
max_allowed_packet is limited to 16MB at present. So, if compression
|
|
is not used, at first 4 bytes are written to the buffer and then
|
|
data itself. As MySQL buffers I/O, logical packets are packed together
until packets fill up the entire size of the buffer. That size is no less
than net_buffer_length, but no greater than max_allowed_packet. So,
|
|
actual writing to the network is done when this buffer is filled
|
|
up. As a sequence of buffers frequently makes a logical unit, like a
|
|
result set, then at the end of sending data, even if buffer is not
|
|
full, data is written (flushed to the connection) with a call of
|
|
the net_flush function. So that no single packet can be larger than
|
|
this value, checks are made throughout the code to make sure that
|
|
no single field or command could exceed that value.
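The uncompressed header layout described above can be sketched as
follows. This is a minimal illustration and not the code from net.c: the
helper names put_header, get_length and get_pkt_nr are hypothetical, and
the low-byte-first order of the length is an assumption modelled on the
int3store helper mentioned later in this paper.

```c
#include <assert.h>

#define NET_HEADER_SIZE 4            /* value given in the constants list */
#define MAX_PACKET_LEN  0xFFFFFFUL   /* largest length that fits in 3 bytes */

/* Hypothetical helper: fill a 4-byte header with a 3-byte length followed
   by a 1-byte packet number. */
void put_header(unsigned char *hdr, unsigned long len, unsigned int pkt_nr)
{
    assert(len <= MAX_PACKET_LEN);
    hdr[0] = (unsigned char)(len & 0xFF);          /* low byte first (assumed) */
    hdr[1] = (unsigned char)((len >> 8) & 0xFF);
    hdr[2] = (unsigned char)((len >> 16) & 0xFF);
    hdr[3] = (unsigned char)(pkt_nr & 0xFF);       /* packet number */
}

/* Hypothetical helper: recover the length from the first three bytes. */
unsigned long get_length(const unsigned char *hdr)
{
    return (unsigned long)hdr[0]
         | ((unsigned long)hdr[1] << 8)
         | ((unsigned long)hdr[2] << 16);
}

/* Hypothetical helper: recover the packet number from the fourth byte. */
unsigned int get_pkt_nr(const unsigned char *hdr)
{
    return hdr[3];
}
```

Three length bytes can express at most 2^24 - 1 bytes, which is where the
16MB ceiling on a single packet comes from.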
|
|
|
|
In order to maintain coherency in consecutive packets, each packet
|
|
is numbered and their number stored as a part of a header, as
|
|
explained above. Packets start with 0, so whenever a logical packet
|
|
is written, that number is incremented. On the other side when
|
|
packets are read, value that is fetched is compared with the value
|
|
stored and if there is no mismatch that value is incremented, too.
|
|
Packet number is reset on the client side when unwanted packets
|
|
are removed from the connection and on the server side when a new
|
|
command has been started.
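The numbering check can be sketched as below, under stated assumptions:
struct seq_state is a hypothetical, stripped-down stand-in for the pkt_nr
field of struct NET, and the function names are invented for this example.
Only the low byte of the counter travels in the header, so the comparison
is made on one byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the pkt_nr field of struct NET. */
struct seq_state { unsigned int pkt_nr; };

/* Writer side: return the number to place in the header, then advance. */
unsigned char next_write_nr(struct seq_state *s)
{
    assert(s != NULL);
    return (unsigned char)(s->pkt_nr++ & 0xFF);
}

/* Reader side: returns 0 and advances when the received number matches
   the expected one; returns 1 (packets out of order) otherwise. */
int check_read_nr(struct seq_state *s, unsigned char received)
{
    assert(s != NULL);
    if (received != (unsigned char)(s->pkt_nr & 0xFF))
        return 1;
    s->pkt_nr++;
    return 0;
}

/* Reset, as done when a new command starts or the connection is cleared. */
void reset_nr(struct seq_state *s)
{
    assert(s != NULL);
    s->pkt_nr = 0;
}
```

A mismatch in this check corresponds to the ER_NET_PACKETS_OUT_OF_ORDER
condition listed among the error constants.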
|
|
|
|
|
|
So, before writing, the buffer contains a sequence of logical
|
|
packets, consisting of header plus data consecutively. If compression
|
|
is used, packet numbers are not stored in each header of the logical
|
|
packets, but a whole buffer, or a part of it if flushing is done,
|
|
containing one or more logical packets is compressed. In that case
a new, larger header is formed, and all logical packets contained
|
|
in the buffer are compressed together. This way only one packet is
|
|
formed which makes several logical packets, which improves both
|
|
speed and compression ratio. On the other side, when this large
|
|
compressed packet is read, it is first uncompressed, and then logical
|
|
packets are sent, one by one, to the calling functions.
|
|
|
|
|
|
All this functionality is described in detail in the following
|
|
chapter. It contains not only functions that form logical packets or
that read and write to connections, but also functions that are used
for initialization and clearing of connections. There are functions at
|
|
higher level dealing with sending fields, rows, establishing
|
|
connections, sending commands, but those are not explained in the
|
|
following chapter.
|
|
|
|
|
|
Functions utilized in client/server protocol
|
|
--------------------------------------------
|
|
|
|
First of all, functions are described that are involved in preparing,
|
|
reading, or writing data over TCP port, Unix socket file, or named
|
|
pipe, and functions directly related to those. All of these functions
|
|
are used both in server and client. Server and client specific code
|
|
segments are documented in each function description.
|
|
|
|
Each MySQL function checks for errors in memory allocation and
|
|
freeing, as well as in every OS call, like the one dealing with
|
|
files and sockets, and for errors in indigenous MySQL function
|
|
calls. This is expected, but has to be said here so as not to repeat
|
|
it in every function description.
|
|
|
|
Older versions of MySQL have utilized the following macros for
|
|
reading from or writing to a socket.
|
|
|
|
raw_net_read - calls the OS function recv, which reads N bytes
|
|
from a socket into a buffer. Number of bytes read is returned.
|
|
|
|
raw_net_write - calls OS function send to write N bytes from a
|
|
buffer to socket. Number of bytes written is returned.
|
|
|
|
These macros are replaced with VIO (Virtual I/O) functions.
|
|
|
|
|
|
Function name: my_net_init
|
|
|
|
Parameters: struct NET *, enum_net_type, struct Vio
|
|
|
|
Return value: 1 for error, 0 for success
|
|
|
|
Function purpose: To initialize properly all NET fields,
|
|
allocate memory and set socket options
|
|
|
|
Function description
|
|
|
|
First of all, buff field of NET struct is allocated to the size of
|
|
net_buffer_length, and on failure function exits with 1. All fields
|
|
in NET are set to their default or starting values. As net_buffer_length
|
|
and max_allowed_packet are configurable, max_allowed_packet is set
|
|
equal to net_buffer_length if the latter one is greater. max_packet
|
|
is set for that NET to net_buffer_length, and buff_end points to
|
|
buff end. vio field is set to the second parameter. If it is a
|
|
real connection, which is the case when second parameter is not
|
|
null, then fd field is set by calling vio_fd function. read_pos and
|
|
write_pos are set to buff, while remaining integers are set to 0. If function
|
|
is run on the MySQL server on Unix and server is started in a test
|
|
mode that would require testing of blocking, then vio_blocking
|
|
function is called. Last, fast throughput mode is set by a call to
|
|
vio_fastsend function.
|
|
|
|
|
|
Function name: net_end
|
|
|
|
Parameters: struct NET *
|
|
|
|
Return value: void
|
|
|
|
Function purpose: To release memory allocated to buff
|
|
|
|
|
|
|
|
Function name: net_realloc (private, static function)
|
|
|
|
Parameters: struct NET *, ulong (unsigned long)
|
|
|
|
Return value: 1 for error, 0 for success
|
|
|
|
Function purpose: To change memory allocated to buff
|
|
|
|
Function description
|
|
|
|
New length of buff field of NET struct is passed as second parameter.
|
|
It is first checked versus max_allowed_packet and if greater, an
|
|
error is returned. New length is aligned to 4096-byte boundary. Then,
|
|
buff is reallocated, buff_end, max_packet, and write_pos reset to
|
|
the same values as in my_net_init.
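The steps of this function can be sketched as follows. This is a
simplified illustration, not the source of net_realloc: struct mini_net,
mini_realloc and IO_ALIGN are hypothetical names, the limit is passed in
as a parameter rather than read from the global max_allowed_packet, and
the standard realloc is used in place of MySQL's own allocator.

```c
#include <assert.h>
#include <stdlib.h>

#define IO_ALIGN 4096UL   /* boundary named in the text */

/* Hypothetical, reduced version of the buffer fields of struct NET. */
struct mini_net {
    unsigned char *buff, *buff_end, *write_pos;
    unsigned long  max_packet;
};

/* Returns 1 for error, 0 for success, like the function described above. */
int mini_realloc(struct mini_net *net, unsigned long length,
                 unsigned long max_allowed_packet)
{
    unsigned long aligned;
    unsigned char *p;

    assert(net != NULL);
    if (length > max_allowed_packet)       /* request exceeds the limit */
        return 1;
    /* Round up to the next multiple of IO_ALIGN. */
    aligned = (length + IO_ALIGN - 1) & ~(IO_ALIGN - 1);
    p = (unsigned char *)realloc(net->buff, aligned);
    if (p == NULL)                         /* old buffer is left untouched */
        return 1;
    net->buff = net->write_pos = p;
    net->max_packet = aligned;
    net->buff_end = p + aligned;
    return 0;
}
```

Rounding with a bit mask works here only because the boundary is a power
of two.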
|
|
|
|
|
|
|
|
Function name: net_clear (used on client side only)
|
|
|
|
Parameters: struct NET *
|
|
|
|
Return value: void
|
|
|
|
Function purpose: To read unread packets
|
|
|
|
Function description
|
|
|
|
This function is used on client side only, and is executed
|
|
only if a program is not started in test mode. This function reads
|
|
unread packets without processing them. First, non-blocking mode is
|
|
set on systems that do not have non-blocking mode defined. This is
|
|
performed by checking the mode with vio_is_blocking function, and
setting non-blocking mode by vio_blocking function. If this operation
was successful, then packets are read by vio_read function, to which
vio field of NET is passed together with buff and max_packet field
values. If
|
|
blocking was active before reading is performed, blocking is set with
|
|
vio_blocking function. After reading has been performed, pkt_nr is
|
|
reset to 0 and write_pos reset to buff. In order to clarify some
|
|
matters: non-blocking mode enables an executing program to dissociate from
|
|
a connection, so that error in connection would not hang entire
|
|
program or its thread.
|
|
|
|
Function name: net_flush
|
|
|
|
Parameters: struct NET *
|
|
|
|
Return value: 1 for error, 0 for success
|
|
|
|
Function purpose: To write remaining bytes in buff to socket
|
|
|
|
Function description
|
|
|
|
net_real_write (described below) is performed if write_pos
|
|
differs from buff, both being fields of the only parameter. write_pos
|
|
is reset to buff. This function has to be used, as MySQL uses buffered
|
|
writes (as will be explained more in the function net_write_buff).
|
|
|
|
|
|
Function name: my_net_write
|
|
|
|
Parameters: struct NET *, const char *, ulong
|
|
|
|
Return value: 1 for error, 0 for success
|
|
|
|
Function purpose: Write a logical packet in the second parameter
|
|
of third parameter length
|
|
|
|
Function description
|
|
|
|
The purpose of this function is to prepare a logical packet such
|
|
that entire content of data, pointed to by second parameter and in
|
|
length of third parameter is sent to the other side. In case of
|
|
server, it is used for sending result sets, and in case of client
|
|
it is used for sending local data. This function foremost prepares
|
|
a header for the packet. Normally, the header consists of 4 bytes,
|
|
of which the first 3 bytes contain the length of the packet, thereby
|
|
limiting a maximum allowable length of a packet to 16MB, while the
|
|
fourth byte contains the packet number, which is used when one large
|
|
packet has to be divided into sequence of packets. This way each
|
|
sub-packet gets its number which should be matched on the other
|
|
side. When compression is used another three bytes are added to
|
|
packet header, thus packet header is in that case increased to 7
|
|
bytes. Additional three bytes are used to save the length of
|
|
uncompressed data. Because, on a connection that uses the compression option,
the code packs packets together, a header prepared by this function
|
|
is later not used in writing to / reading from network, but only
|
|
to distinguish logical packets within a buffered read operation.
|
|
|
|
|
|
This function first stores the value of the third parameter into the
|
|
first 3 bytes of local char variable of NET_HEADER_SIZE size by usage
|
|
of function int3store. Then, at this point, if compression is not
|
|
used, pkt_nr is increased, and its value stored in the last byte of
|
|
the said local char[] variable. If compression is used, 0 is stored in
|
|
both values. Then those four bytes are sent to other side by the usage
|
|
of the function net_write_buff (to be explained later on), and if
|
|
successful, entire packet in second parameter of the length described
|
|
in third parameter is sent by the usage of the same function.
|
|
|
|
|
|
Function name: net_write_command
|
|
|
|
Parameters: struct NET *, char, const char *, ulong
|
|
|
|
Return value: 1 for error, 0 for success
|
|
|
|
Function purpose: Send a command with a packet as in previous function
|
|
|
|
Function description
|
|
|
|
This function is very similar to the previous one. The only
|
|
difference is that first packet is enlarged by one byte, so that the
|
|
command precedes the packet to be sent. This is implemented by
|
|
increasing first packet by one byte, which contains a command code. As
|
|
command codes do not use the range of values that are used by character
|
|
sets, so when the other side receives a packet, first byte after
|
|
header contains a command code. This function is used by client for
|
|
sending all commands and queries, and by server in connection process
|
|
and for sending errors.
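The resulting byte layout of a command packet can be sketched as below.
The helper build_command_packet is hypothetical and writes into a flat
caller-supplied array rather than through net_write_buff; the
low-byte-first length order is an assumption, and the command code used in
any call is simply whatever byte the caller passes in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: lay out header + command byte + argument in out[].
   Returns the total number of bytes written, or 0 if it does not fit. */
size_t build_command_packet(unsigned char *out, size_t out_size,
                            unsigned char command,
                            const char *arg, size_t arg_len,
                            unsigned char pkt_nr)
{
    size_t body = arg_len + 1;                 /* command byte + argument */

    assert(out != NULL && arg != NULL);
    if (body > 0xFFFFFF || out_size < 4 + body)
        return 0;
    out[0] = (unsigned char)(body & 0xFF);     /* 3-byte length, low byte first */
    out[1] = (unsigned char)((body >> 8) & 0xFF);
    out[2] = (unsigned char)((body >> 16) & 0xFF);
    out[3] = pkt_nr;                           /* packet number */
    out[4] = command;                          /* first byte after the header */
    memcpy(out + 5, arg, arg_len);             /* the packet proper */
    return 4 + body;
}
```

Note that the length in the header counts the command byte, which is the
"enlarged by one byte" detail from the description above.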
|
|
|
|
|
|
Function name: net_write_buff (private, static function)
|
|
|
|
Parameters: struct NET *, const char *, uint
|
|
|
|
Return value: 1 for error, 0 for success
|
|
|
|
Function purpose: To write a packet of any size by cutting it
|
|
and using next function for writing it
|
|
|
|
Function description
|
|
|
|
This function was created after compression feature has been
|
|
added to MySQL. This function supposes that packets have already been
|
|
properly formatted, regarding packet header etc. The principal reason for
|
|
this function to exist is because a packet that is sent by client or
|
|
server does not have to be less than max_packet. So this function
|
|
first calculates how much data has been left in a buff, by getting a
|
|
difference between buff_end and write_pos and storing it to local
|
|
variable left_length. Then a loop is run as long as the length to be
|
|
sent is greater than length of left bytes (left_length). In a loop
|
|
data from second parameter is copied to buff at write_pos, as much as
|
|
it can be, that is, by left_length. Then net_real_write function is called
|
|
(see below) with NET, buff, and max_packet parameters. This function
|
|
is the lowest level function that writes data over established
|
|
connection. In the loop, write_pos is reset to buff, the pointer to data
|
|
(second parameter) is moved by the amount of data sent (left_length),
|
|
length of data to be sent (third parameter) is decreased by the amount
|
|
sent (left_length) and left_length is reset to max_packet value, which
|
|
ends the loop. This logic was necessary, as there could have been some
|
|
data yet unsent (write_pos != buff), while data to be sent could be as
|
|
large as necessary, thus requiring many loops. At the end of function,
|
|
remaining data in second parameter are copied to buff at write_pos, by
|
|
the remaining length of data to be sent (third parameter). So, in the
|
|
next call of this function remaining data will be sent, as buff is
|
|
used in the call to net_real_write. It is very important to note that if
|
|
a packet to be sent is less than the number of bytes that are still
|
|
available in buff, then there will be no writing over network, but
|
|
only logical packets will be added one after another. This will
|
|
accelerate network traffic, plus if compression is used, the
|
|
expected compression rate would be higher. That is why server or
|
|
client functions that send data call, at the end of the data, the net_flush
|
|
function described above.
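The buffering loop and the final flush can be sketched as follows, under
several assumptions: struct buf_net is a hypothetical reduction of struct
NET with a fixed 16-byte buffer standing in for max_packet, and
fake_real_write merely counts bytes instead of writing to a connection as
net_real_write would.

```c
#include <assert.h>
#include <string.h>

#define BUF_SIZE 16   /* tiny stand-in for max_packet */

struct buf_net {
    unsigned char  buff[BUF_SIZE];
    unsigned char *write_pos;
    unsigned long  bytes_on_wire;   /* bytes a real write would have sent */
    unsigned int   real_writes;     /* number of low-level writes */
};

void buf_init(struct buf_net *net)
{
    net->write_pos = net->buff;
    net->bytes_on_wire = 0;
    net->real_writes = 0;
}

/* Stand-in for net_real_write: records the write instead of sending it. */
int fake_real_write(struct buf_net *net, const unsigned char *p,
                    unsigned long len)
{
    (void)p;
    net->bytes_on_wire += len;
    net->real_writes++;
    return 0;
}

/* Buffered write: send full buffers, keep the remainder for later. */
int buf_write(struct buf_net *net, const unsigned char *packet,
              unsigned long len)
{
    unsigned long left_length =
        (unsigned long)(net->buff + BUF_SIZE - net->write_pos);

    while (len > left_length) {
        memcpy(net->write_pos, packet, left_length);  /* top up the buffer */
        if (fake_real_write(net, net->buff, BUF_SIZE))
            return 1;
        net->write_pos = net->buff;
        packet += left_length;
        len -= left_length;
        left_length = BUF_SIZE;
    }
    memcpy(net->write_pos, packet, len);              /* stays in the buffer */
    net->write_pos += len;
    return 0;
}

/* Flush: write whatever is pending, then rewind write_pos. */
int buf_flush(struct buf_net *net)
{
    int error = 0;
    if (net->write_pos != net->buff) {
        error = fake_real_write(net, net->buff,
                                (unsigned long)(net->write_pos - net->buff));
        net->write_pos = net->buff;
    }
    return error;
}
```

With this sketch, a 10-byte write followed by a 40-byte write triggers
three full-buffer writes and leaves 2 bytes pending until the flush, so
small packets cost no network write of their own.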
|
|
|
|
|
|
Function name: net_real_write
|
|
|
|
Parameters: struct NET *, const char *, ulong
|
|
|
|
Return value: 1 for error, 0 for success
|
|
|
|
Function purpose: To write data to a socket or pipe, with
|
|
compression if used
|
|
|
|
Function description
|
|
|
|
First, more field is set to 2, to enable reporting in
|
|
mysql_list_processes. Then if compression is enabled on that
|
|
connection, a new local buffer (variable b) is initialized to the
|
|
length of total header (normal header + compression header) and if no
|
|
memory is available, an error is returned. This buffer (b) is used for
|
|
holding the final, compressed packet to be written over the
|
|
connection. Furthermore in compression initialization, second
|
|
parameter at length of third parameter is copied to the local buffer
|
|
b, and MySQL's wrapped zlib's compression function is run at total
|
|
header offset of the local buffer. Please, do note that this function
|
|
does not test effectiveness of compression. If compression is turned
|
|
on in some connection, it is used all of the time. Also, it is very
|
|
important to be cognizant of the fact that this algorithm makes
|
|
possible that a single compressed packet contains several logical
|
|
packets. In this way compression rate is increased and network
|
|
throughput is increased as well. However, this algorithm has
|
|
consequences on the other side, that reads compressed packet, which
|
|
is covered in my_net_read function. After compression is done, the full
|
|
compression header is properly formed with the packet number,
|
|
compressed and uncompressed lengths. At the end of the compression
code, the third parameter is increased by the total header length, as
the original header is not used (see above), and the second parameter,
the pointer to data, is set to point to the local buffer b, so that
the further flow of the function is independent of compression. If the
function is executed on the server side, a thread alarm is initialized
and, if non-blocking mode is active, set to NET_WRITE_TIMEOUT. Two
local (char *) pointers are initialized: pos at the beginning of the
second parameter, and end at the end of the data. Then a loop is run
until all data is written, that is, as long as pos != end. First the
vio_write function is called, with the vio field, pos and the size of
the data (end - pos) as parameters. The number of bytes written over
the connection is saved in a local variable (length). If an error is
returned, the local bool variable (interrupted) is set according to
the return value of vio_should_retry, called with the vio field as
parameter. This bool variable indicates whether writing was
interrupted in some way or not.
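
As an illustration of the header formed after compression, the sketch
below fills in a total header of 7 bytes. The layout used here (3 bytes
of compressed length, 1 byte of packet number, 3 bytes of uncompressed
length, lengths stored low byte first) is an assumption stated for this
example, and all sketch_ names are hypothetical.

```c
#include <stddef.h>

#define SKETCH_NET_HEADER_SIZE  4  /* 3-byte length + 1-byte packet number */
#define SKETCH_COMP_HEADER_SIZE 3  /* 3-byte uncompressed length */

/* Store the low 24 bits of v, low byte first. */
static void sketch_int3store(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
}

/* Fill the total header at the start of b; return the header length. */
static size_t sketch_comp_header(unsigned char *b, unsigned long comp_len,
                                 unsigned char pkt_nr,
                                 unsigned long uncomp_len)
{
    sketch_int3store(b, comp_len);        /* length of compressed data */
    b[3] = pkt_nr;                        /* packet sequence number */
    sketch_int3store(b + 4, uncomp_len);  /* length before compression */
    return SKETCH_NET_HEADER_SIZE + SKETCH_COMP_HEADER_SIZE;
}
```

The compressed data would follow immediately after these 7 bytes in
the local buffer b.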
|
|
|
|
Further, an error from vio_write is treated differently on Unix versus
other OS's (Win32 or OS/2). On Unix an alarm is set if one is not
in use, no bytes have been written and there has been no interruption.
Also, in that case, if the connection is not in blocking mode, a
sub-loop is run until blocking is set with the vio_blocking function.
Within the sub-loop another attempt is made, based on the return value
of the vio_should_retry function, provided that the number of repeated
attempts is less than RETRY_COUNT. If that is not the case, the error
field of struct NET is set to 1 and the function exits. At the exit of
the sub-loop the number of reruns already executed is reset to zero
and another run of the above vio_write function is attempted. If the
function is run on Win32 or OS/2, and the function flow was not
interrupted and the thread alarm is not in use, the main loop is again
continued as long as pos != end. If this function is executed in a
thread safe client program, the communication flow is tested for
EINTR, caused by context switching, by use of the vio_errno function,
in which case the loop is continued. At the end of the processing of
an error from vio_write, the error field of struct NET is set, and if
on the server, the last_errno field is set to ER_NET_WRITE_INTERRUPTED
if the local bool variable (interrupted) is true, or otherwise to
ER_NET_ERROR_ON_WRITE. Before the end of the loop, in order to make
evaluation of the loop condition possible, pos is increased by the
number of bytes written in the last iteration (length). The global
variable bytes_sent is also increased by the same value, for status
purposes. At the end of the function the more field is reset, the
compression buffer (b) memory is released if compression was used,
and, if the thread alarm is still in use, it is ended and the blocking
state is reset to its original state. The function returns an error if
not all bytes were written.
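
The core of the write loop can be sketched as follows. This is a
simplified illustration only: the callback type sketch_write_fn stands
in for vio_write, the alarm, blocking and retry handling described
above are left out, and every name beginning with sketch_ is
hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for vio_write: bytes written, or -1 on error. */
typedef long (*sketch_write_fn)(void *ctx, const unsigned char *data,
                                size_t len);

/* Write len bytes, tolerating partial writes; 0 = success, 1 = error. */
static int sketch_real_write(sketch_write_fn wr, void *ctx,
                             const unsigned char *packet, size_t len,
                             unsigned long *bytes_sent)
{
    const unsigned char *pos = packet;
    const unsigned char *end = packet + len;

    while (pos != end) {
        long length = wr(ctx, pos, (size_t)(end - pos));
        if (length <= 0)
            break;                 /* the real code may retry here */
        pos += length;             /* advance by what was written */
        *bytes_sent += (unsigned long)length;
    }
    return pos != end;             /* error if not all bytes were written */
}

/* Demo writer that accepts at most 3 bytes per call and counts calls. */
static long sketch_chunk_writer(void *ctx, const unsigned char *data,
                                size_t len)
{
    size_t *calls = (size_t *)ctx;
    (void)data;
    (*calls)++;
    return (long)(len < 3 ? len : 3);
}
```

With the demo writer, a 10 byte packet takes four calls (3, 3, 3 and 1
bytes), which shows why the loop condition is pos != end rather than a
single call.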
|
|
|
|
|
|
|
|
Function name: my_real_read (private, static function)
|
|
|
|
Parameters: struct NET *, ulong *
|
|
|
|
Return value: number of bytes read, or packet_error on error
|
|
|
|
Function purpose: low level network connection read function
|
|
|
|
Function description
|
|
|
|
This function was made a separate one when compression was introduced
in the MySQL client/server protocol. It contains the basic, low level
network reading functionality, while all dealings with compressed
packets are handled in the next function. Compression is handled in
this function only to the extent of extracting the length of the
uncompressed data. First, the blocking state of the connection is
saved in the local bool variable net_blocking, and the more field is
set to 1 for detailed reporting in mysqld_list_processes. A new thread
alarm is initialized, in order to enable read timeout handling, and if
on the server and the connection can block the program, the alarm is
set to the value of the timeout field. A local pointer is set to the
position of the next logical packet, with its header skipped, which is
at the where_b offset from buff. Next, a loop is entered that is run
exactly two times. In the first run the number of bytes to be fetched
(remain) is set to the header size, which differs depending on whether
compression is used on the connection. After the first fetch has been
done, the number of bytes that will be received in the second
iteration is known, as the fetched header contains the size of the
packet, the packet number and, in the case of compression, the size of
the uncompressed packet. Then, as long as there are bytes to read, an
inner loop first reads data from the network connection with the
vio_read function, called with the vio field, the current position and
the remaining number of bytes as parameters. The last value is held in
the local variable (remain). The number of bytes read is returned in
the local length variable. If an error is returned, the local bool
variable (interrupted) is set according to the return value of
vio_should_retry, called with the vio field as parameter. This bool
variable indicates whether reading was interrupted in some way or not.
|
|
|
|
Further, an error from vio_read is treated differently on Unix versus
other OS's (Win32 or OS/2). On Unix an alarm is set if one is not
in use, no bytes have been read and there has been no interruption.
Also, in that case, if the connection is not in blocking mode, a
sub-loop is run until blocking is set with the vio_blocking function.
Within the sub-loop another attempt is made, based on the return value
of the vio_should_retry function, provided that the number of repeated
attempts is less than RETRY_COUNT. If that is not the case, the error
field of struct NET is set to 1 and the function exits. At the exit of
the sub-loop the number of reruns already executed is reset to zero
and another run of the above vio_read function is attempted. If the
function is run on Win32 or OS/2, and the function flow was not
interrupted and the thread alarm is not in use, the main loop is again
continued as long as there are bytes remaining. If this function is
executed in a thread safe client program, the output of the
vio_should_retry function decides whether another run should be made,
in which case the loop is continued. At the end of the processing of
an error from vio_read, the error field of struct NET is set, and if
on the server, the last_errno field is set to ER_NET_READ_INTERRUPTED
if the local bool variable (interrupted) is true, or otherwise to
ER_NET_ERROR_ON_READ. In case of such an error this function exits
|
|
and returns an error. When there is no error, the number of remaining
bytes (remain) is decreased by the number of bytes read. The result
should be zero, but if it is not, the code is still inside the
while (remain > 0) loop, which is exited as soon as remain reaches
zero. This has been done to accommodate errors at the traffic level
and very slow connections. The current position in the buff field is
also moved by the number of bytes read by the vio_read function, and
the global variable bytes_received is increased by the same value in a
thread safe manner. When the loop that runs until the necessary bytes
(remain) are read is finished, and the external loop is in the first
of its two runs, packet sequencing is tested for consistency by
comparing the number contained in the 4th byte of the header with the
pkt_nr field. The header is located at the where_b offset from the
buff field. Usage of where_b is obligatory due to the possible use of
compression. If there is no compression on a connection, then where_b
is always 0. If there is a discrepancy, the first byte of the header
is checked for the value 255, because when an error is sent by the
server, or by a client if it is sending data (as in LOAD DATA LOCAL
INFILE ...), the first byte in the header is set to 255. If it is not
255, an error about packets being out of order is printed. In any
case, on the server, the last_errno field is set to
ER_NET_PACKETS_OUT_OF_ORDER and the function returns with an error,
that is, the value returned is packet_error. If the check on the
serial number of the packet is successful,
|
|
the pkt_nr field is incremented in order to enable checking of the
packet order with the next packet, and if compression is used, the
uncompressed length is extracted from its position in the header and
returned in the second parameter of this function. The length of the
packet is saved, to be used as the return value of this function.
Still in the first iteration of the main loop, a check must be made
whether the buff field can accommodate the entire incoming packet in
both its compressed and uncompressed form. This is done because zlib's
compress and uncompress functions use the same memory area for
compression and uncompression. The necessary length of the buff field
is equal to the current offset of the data (where_b, which is zero
without compression), plus the larger of the compressed and
uncompressed sizes of the packet to be read in the second run. If this
value is larger than the current length of the buff field, which is
read from the max_packet field, then the buff field has to be
reallocated. If reallocation with the net_realloc function fails, the
function returns an error. Before the second run of the loop is
started, the length to be read is set to the length of the expected
data and the current position (pos) is set at the where_b offset from
the buff field. At the end of the function, if the alarm is set, which
is the case if it is run on the server, or on a client if the function
was interrupted and another run of vio_read was attempted, the alarm
is ended and the blocking state is restored from the saved local bool
variable net_blocking. The function returns the number of bytes read
or the error (packet_error).
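
The work done between the two runs of the loop can be sketched as
below: the header is decoded, the packet number is compared with the
expected one, and the necessary length of buff is computed. The header
layout (lengths stored in 3 bytes, low byte first, with the
uncompressed length following the packet number) is an assumption of
this example, and the sketch_ names are hypothetical.

```c
#include <stddef.h>

/* Decoded header fields; hypothetical type for this sketch only. */
typedef struct {
    unsigned long len;      /* length of the packet body that follows */
    unsigned char pkt_nr;   /* sequence number from the 4th header byte */
    unsigned long complen;  /* uncompressed length; 0 if not compressed */
} sketch_header;

/* Read a 3-byte length stored low byte first. */
static unsigned long sketch_uint3korr(const unsigned char *p)
{
    return (unsigned long)p[0] |
           ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16);
}

/* Decode the header; return 1 if the packet is out of order, else 0. */
static int sketch_parse_header(const unsigned char *h, int compressed,
                               unsigned char expected_nr,
                               sketch_header *out)
{
    out->len = sketch_uint3korr(h);
    out->pkt_nr = h[3];
    out->complen = compressed ? sketch_uint3korr(h + 4) : 0;
    return out->pkt_nr != expected_nr;
}

/* Bytes buff must hold: offset plus the larger of the two sizes. */
static size_t sketch_needed_buff(size_t where_b, const sketch_header *hd)
{
    unsigned long larger = hd->len > hd->complen ? hd->len : hd->complen;
    return where_b + (size_t)larger;
}
```

If the value returned by sketch_needed_buff exceeds the current size
of the buffer, the real function reallocates buff before the second
run of the loop.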
|
|
|
|
|
|
Function name: my_net_read
|
|
|
|
Parameters: struct NET *
|
|
|
|
Return value: length of the logical packet read, or packet_error
|
|
|
|
Function purpose: Highest level general purpose reading function
|
|
|
|
Function description
|
|
|
|
First, if compression is not used, my_real_read is called, with
struct NET * as the first parameter and a pointer to the local ulong
complen as the second parameter, although its value is not used here.
The number of bytes read is returned in the local ulong variable len.
The read_pos field is set to the where_b offset from the buff field.
The where_b field denotes where in the buff field the current packet
is located. If the returned number of bytes read (local variable len)
does not signal that an error in packet transmission occurred (that
is, it is not set to packet_error), then the string at read_pos is
zero terminated: the byte at read_pos + len, just past the end of the
string starting at read_pos, is set to zero. This is done because
mysql_use_result expects a zero terminated string. The function then
returns the value of the local variable len. This ends the function
when compression is not used; the remaining code is executed only if
compression is enabled on the connection.
|
|
|
|
In order to explain how a compressed packet is logically cut into
meaningful packets, the full meaning of several NET fields should
be explained. First of all, fields in NET are used rather than local
variables, as all values must be preserved between consecutive calls
of this function. This function is called in order to return logical
packets, but it does not need to call the my_real_read function every
time, because a large packet, once uncompressed, may (but need not)
contain several logical packets. The remain_in_buf field denotes how
many bytes of the entire uncompressed packet are still contained
within buff. The buf_length field saves the length of the entire
uncompressed packet. The save_char field is used to save the
character at the position where the packet ends, which has to be
replaced with a zero, '\0', in order to make a logical packet zero
delimited for mysql_use_result. The length field stores the length of
the compressed packet. The read_pos field, as usual, points to the
current reading position. This char * pointer is used by all
functions that call this function in order to fetch their data. The
buff field is not used for that purpose; read_pos is used instead.
This change was introduced with compression, when the algorithm began
to accommodate the grouping of several packets together.
|
|
|
|
Now that the meanings of all relevant NET fields are explained,
we can proceed with the flow of this function for the case when
compression is active. First, if there are remaining portions of a
compressed packet in the buff field, the saved character value is put
back at the position where the zero char '\0' was inserted to make
the string zero delimited for mysql_use_result. Then a loop is
started. In the first part of the loop, if there are remaining bytes,
the local uchar *pos variable is set to the current position in the
buff field where a new packet starts. This position is at the
(buf_length - remain_in_buf) offset in the buff field. As it is
possible that the next logical packet is not contained to its full
length in the remainder of the buff field, several things have to be
inspected. It should be noted that data read by my_real_read contains
only logical packets with 4 byte headers, as prepared by my_net_write
or net_write_command. But, when written, a logical packet could be so
divided that only a part of its header is read in. Therefore, after
the pointer to the start of the next packet has been saved, a check
is made whether the number of remaining bytes in the buffer is less
than 4, being 3 bytes for the length and one byte for the packet
number. If at least 4 bytes remain, the length of the logical packet
is extracted and saved in the length field. Then a check is made
whether the logical packet is fully contained in the buffer. In that
case, the number of bytes remaining in the buffer is decreased by the
full length of the logical packet (4 + length field), read_pos is
moved forward by 4 bytes to skip the header and point at the
beginning of the data in the logical packet, the length field is
saved as the value to be returned by the function and the loop is
exited. If the entire logical packet is not contained within the
buffer, then if the length of the entire buffer differs from the
remaining length of the logical packet, it (the logical packet) is
moved to the beginning of the buff field. If the length of the entire
buffer equals the remaining length of the logical packet, the where_b
and buf_length fields are set to 0. This is done so that in both
cases the buffer is ready to accept the next part of the packet.
|
|
|
|
In order to get the next part of a packet, still within the loop, the
my_real_read function is called. The length of the compressed packet
is returned in the local len variable, and the length of the
uncompressed data is returned in the complen variable. Without
compression the value of complen is zero. If packet_error is returned
from the my_real_read function, this function also returns
packet_error. Otherwise, the my_uncompress function is called to
uncompress the data. It is called with the where_b offset from the
buff field, as that is the position where the compressed packet
starts, and with the len and complen values, being the lengths of the
compressed and uncompressed data. If there is no compression, 0 is
returned for the uncompressed size from the my_real_read function,
and the my_uncompress wrapper function skips the zlib uncompress in
that case. If an error is returned from my_uncompress, the error
field is set to 1, last_errno is set to ER_NET_UNCOMPRESS_ERROR if on
the server, the loop is exited and the function returns packet_error.
If not, the buf_length and remain_in_buf fields are set to the
uncompressed size of the buffer and the loop is continued. When the
loop is exited, the save_char field is used to save the char at the
end of the logical packet, which is at an offset of len from the
position in the buff field pointed to by the read_pos field, so that
a zero char can be set at the same position for mysql_use_result.
The function returns the length of the logical packet without its
header.
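
The way logical packets are handed out of an already uncompressed
buffer, including the handling of save_char, can be sketched as
follows. This is a simplified illustration, not the code of
my_net_read: the sketch_reader type, the have_saved flag and the
SKETCH_NEED_MORE value are hypothetical, the length is assumed to be
stored low byte first, and the step that reads and uncompresses more
data is reduced to a return value.

```c
#include <stddef.h>

/* Hypothetical subset of the NET fields used when splitting packets. */
typedef struct {
    unsigned char *buff;          /* uncompressed data plus 1 spare byte */
    size_t         buf_length;    /* length of uncompressed data in buff */
    size_t         remain_in_buf; /* bytes not yet handed to the caller */
    unsigned char *read_pos;      /* start of data of the current packet */
    unsigned char  save_char;     /* byte overwritten by the '\0' */
    int            have_saved;    /* sketch-only flag: save_char is valid */
} sketch_reader;

#define SKETCH_NEED_MORE ((size_t)-1)  /* caller must read more data */

/* Return the length of the next logical packet, zero terminating it. */
static size_t sketch_next_packet(sketch_reader *r)
{
    unsigned char *pos = r->buff + (r->buf_length - r->remain_in_buf);
    size_t len;

    if (r->have_saved) {          /* undo the '\0' set by the last call */
        *pos = r->save_char;
        r->have_saved = 0;
    }
    if (r->remain_in_buf < 4)     /* header itself is incomplete */
        return SKETCH_NEED_MORE;
    len = (size_t)pos[0] | ((size_t)pos[1] << 8) | ((size_t)pos[2] << 16);
    if (r->remain_in_buf < 4 + len)   /* body is incomplete */
        return SKETCH_NEED_MORE;

    r->remain_in_buf -= 4 + len;
    r->read_pos = pos + 4;            /* skip the 4 byte header */
    r->save_char = r->read_pos[len];  /* remember the byte we overwrite */
    r->read_pos[len] = '\0';
    r->have_saved = 1;
    return len;
}
```

Because the zero char is written one byte past the packet data, it
may land on the first header byte of the following packet, which is
why the saved character is restored before the next header is
decoded.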
|