mariadb/storage/innobase/pars/pars0lex.l
aivanov@mysql.com 1d7de700e2 Applied innodb-5.1-ss594 snapshot.
Fixed BUG#19542 "InnoDB doesn't increase the Handler_read_prev counter".
 Fixed BUG#19609 "Case sensitivity of innodb_data_file_path gives stupid error".
 Fixed BUG#19727 "InnoDB crashed server and crashed tables are not recoverable".
 Also:
 * Remove remnants of the obsolete concept of memoryfixing tables and indexes.
 * Remove unused dict_table_LRU_trim().
 * Remove unused 'trx' parameter from dict_table_get_on_id_low(),
   dict_table_get(), dict_table_get_and_increment_handle_count().
 * Add a normal linked list implementation.
 * Add a work queue implementation.
 * Add 'level' parameter to mutex_create() and rw_lock_create().
   Remove mutex_set_level() and rw_lock_set_level().
 * Rename SYNC_LEVEL_NONE to SYNC_LEVEL_VARYING.
 * Add support for bound ids in InnoDB's parser.
 * Define UNIV_BTR_DEBUG for enabling consistency checks of
   FIL_PAGE_NEXT and FIL_PAGE_PREV when accessing sibling
   pages of B-tree indexes.
   btr_validate_level(): Check the validity of the doubly linked
   list formed by FIL_PAGE_NEXT and FIL_PAGE_PREV.
 * Adapt InnoDB to the new tablename to filename encoding in MySQL 5.1.
   ut_print_name(), ut_print_name1(): Add parameter 'table_id' for
   distinguishing names of tables from other identifiers.
   New: innobase_convert_from_table_id(), innobase_convert_from_id(),
        innobase_convert_from_filename(), innobase_get_charset.
   dict_accept(), dict_scan_id(), dict_scan_col(), dict_scan_table_name(),
   dict_skip_word(), dict_create_foreign_constraints_low(): Add
   parameter 'cs' so that isspace() can be replaced with my_isspace(),
   whose operation depends on the connection character set.
   dict_scan_id(): Convert identifier to UTF-8.
   dict_str_starts_with_keyword(): New extern function, to replace
   dict_accept() in row_search_for_mysql().
   mysql_get_identifier_quote_char(): Replaced with innobase_print_identifier().
   ha_innobase::create(): Remove the thd->convert_string() call. Pass the
   statement to InnoDB in the connection character set and let InnoDB
   convert the identifier to UTF-8.
 * Add max_row_size to dict_table_t.
 * btr0cur.c
   btr_copy_externally_stored_field(): Only set the 'offset' variable
   when needed.
 * buf0buf.c
   buf_page_io_complete(): Write to the error log if the page number or
   the space id on the disk do not match those in memory. Also write to
   the error log if a page was read from the doublewrite buffer. The
   doublewrite buffer should be only read by the lower-level function
   fil_io() at database startup.
 * dict0dict.c
   dict_scan_table_name(): Remove fallback to differently encoded name
   when the table is not found. The encoding is handled at a higher level.
 * ha_innodb.cc
   Increment statistic counter in ha_innobase::index_prev() (bug 19542).
   Add innobase_convert_string wrapper function and a new file
   ha_prototypes.h.
   innobase_print_identifier(): Remove TODO comment before calling
   get_quote_char_for_identifier(). That function apparently assumes
   the identifier to be encoded in UTF-8.
 * ibuf0ibuf.c|h
   ibuf_count_get(), ibuf_counts[], ibuf_count_inited(): Define these
   only #ifdef UNIV_IBUF_DEBUG. Previously, when compiled without
   UNIV_IBUF_DEBUG, invoking ibuf_count_get() would crash InnoDB.
   The function is only being called #ifdef UNIV_IBUF_DEBUG.
 * innodb.result
   Adjust the results for changes in the foreign key error messages.
 * mem0mem.c|h
   New: mem_heap_dup(), mem_heap_printf(), mem_heap_cat().
 * os0file.c
   Check the page trailers also after writing to disk. This improves
   chances of diagnosing bug 18886.
   os_file_check_page_trailers(): New function for checking that the
   two copies of the LSN stamped on the page match.
   os_aio_simulated_handle(): Call os_file_check_page_trailers()
   before and after os_file_write().
 * row0mysql.c
   Move trx_commit_for_mysql(trx) calls before calls to
   row_mysql_unlock_data_dictionary(trx) (bug 19727).
 * row0sel.c
   row_fetch_print(): Handle SQL NULL values without crashing.
   row_sel_store_mysql_rec(): Remove useless call to rec_get_nth_field
   when handling an externally stored column.
   Fetch externally stored fields when using InnoDB's internal SQL
   parser.
   Optimize BLOB selects by using prebuilt->blob_heap directly instead
   of first reading BLOB data to a temporary heap and then copying it
   to prebuilt->blob_heap.
 * srv0srv.c
   srv_master_thread(): Remove unreachable code.
 * srv0start.c
   srv_parse_data_file_paths_and_sizes(): Accept lower-case 'm' and
   'g' as abbreviations of megabyte and gigabyte (bug 19609).
   srv_parse_megabytes(): New function.
 * ut0dbg.c|h
   Implement InnoDB assertions (ut_a and ut_error) with abort() when
   the code is compiled with GCC 3 or later on other platforms than
   Windows or Netware. Also disable the variable ut_dbg_stop_threads
   and the function ut_dbg_stop_thread() in this case, unless
   UNIV_SYNC_DEBUG is defined. This should allow the compiler to
   generate more compact code for assertions.
 * ut0list.c|h
   Add ib_list_create_heap().
2006-06-01 10:34:04 +04:00
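
The srv0start.c entry above says that lower-case 'm' and 'g' are now accepted as size suffixes (bug 19609). The following is a minimal sketch of that idea only. The name parse_megabytes, its signature, and the treatment of a suffix-less number as a byte count are assumptions made for illustration; this is not the code of srv_parse_megabytes().

```c
#include <stdlib.h>

/* Hypothetical helper for illustration (not InnoDB code).
Parses a decimal number with an optional size suffix and stores the
result, expressed in megabytes, in *megs. Assumption: a number without
a suffix is a byte count. Returns a pointer to the first character
after the parsed size. */
static const char*
parse_megabytes(
	const char*	str,	/* in: string such as "10m" or "2G" */
	unsigned long*	megs)	/* out: size in megabytes */
{
	char*		endp;
	unsigned long	size = strtoul(str, &endp, 10);

	switch (*endp) {
	case 'G': case 'g':	/* gigabytes, either case */
		size *= 1024;
		endp++;
		break;
	case 'M': case 'm':	/* megabytes, either case */
		endp++;
		break;
	default:		/* no suffix: assume bytes */
		size /= 1024 * 1024;
		break;
	}

	*megs = size;
	return(endp);
}
```

Called on "512M:autoextend", this sketch stores 512 and returns a pointer to the ':' character, so a caller can continue parsing the rest of a path specification from there.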


/******************************************************
SQL parser lexical analyzer: input file for the GNU Flex lexer generator
(c) 1997 Innobase Oy
Created 12/14/1997 Heikki Tuuri
Published under the GPL version 2
The InnoDB parser is frozen because MySQL takes care of SQL parsing.
Therefore we normally keep the InnoDB parser C files as they are, and do
not automatically generate them from pars0grm.y and pars0lex.l.
How to make the InnoDB parser and lexer C files:
1. Run ./make_flex.sh to generate lexer files.
2. Run ./make_bison.sh to generate parser files.
These instructions seem to work at least with bison-1.875d and flex-2.5.31 on
Linux.
*******************************************************/
%option nostdinit
%option 8bit
%option warn
%option pointer
%option never-interactive
%option nodefault
%option noinput
%option nounput
%option noyywrap
%option noyy_scan_buffer
%option noyy_scan_bytes
%option noyy_scan_string
%option nounistd
%{
#define YYSTYPE que_node_t*
#include "univ.i"
#include "pars0pars.h"
#include "pars0grm.h"
#include "pars0sym.h"
#include "mem0mem.h"
#include "os0proc.h"
#define malloc(A) ut_malloc(A)
#define free(A) ut_free(A)
#define realloc(P, A) ut_realloc(P, A)
#define exit(A) ut_error
#define YY_INPUT(buf, result, max_size) pars_get_lex_chars(buf, &result, max_size)
/* String buffer for removing quotes */
static ulint stringbuf_len_alloc = 0; /* Allocated length */
static ulint stringbuf_len = 0; /* Current length */
static char* stringbuf; /* Start of buffer */
/* Appends a string to the buffer. */
static
void
string_append(
/*==========*/
	const char*	str,	/* in: string to be appended */
	ulint		len)	/* in: length of the string */
{
	if (stringbuf == NULL) {
		stringbuf = malloc(1);
		stringbuf_len_alloc = 1;
	}
	if (stringbuf_len + len > stringbuf_len_alloc) {
		while (stringbuf_len + len > stringbuf_len_alloc) {
			stringbuf_len_alloc <<= 1;
		}
		stringbuf = realloc(stringbuf, stringbuf_len_alloc);
	}
	memcpy(stringbuf + stringbuf_len, str, len);
	stringbuf_len += len;
}
%}
DIGIT [0-9]
ID [a-z_A-Z][a-z_A-Z0-9]*
BOUND_LIT \:[a-z_A-Z0-9]+
BOUND_ID \$[a-z_A-Z0-9]+
%x comment
%x quoted
%x id
%%
{DIGIT}+ {
	yylval = sym_tab_add_int_lit(pars_sym_tab_global,
		atoi(yytext));
	return(PARS_INT_LIT);
}
{DIGIT}+"."{DIGIT}* {
	ut_error; /* not implemented */
	return(PARS_FLOAT_LIT);
}
{BOUND_LIT} {
	ulint type;
	yylval = sym_tab_add_bound_lit(pars_sym_tab_global,
		yytext + 1, &type);
	return(type);
}
{BOUND_ID} {
	yylval = sym_tab_add_bound_id(pars_sym_tab_global,
		yytext + 1);
	return(PARS_ID_TOKEN);
}
"'" {
	/* Quoted character string literals are handled in an explicit
	start state 'quoted'. This state is entered and the buffer for
	the scanned string is emptied upon encountering a starting quote.
	In the state 'quoted', only two actions are possible (defined below). */
	BEGIN(quoted);
	stringbuf_len = 0;
}
<quoted>[^\']+ {
	/* Got a sequence of characters other than "'":
	append to string buffer */
	string_append(yytext, yyleng);
}
<quoted>"'"+ {
	/* Got a sequence of "'" characters:
	append half of them to string buffer,
	as "''" represents a single "'".
	We apply truncating division,
	so that "'''" will result in "'". */
	string_append(yytext, yyleng / 2);
	/* If we got an odd number of quotes, then the
	last quote we got is the terminating quote.
	At the end of the string, we return to the
	initial start state and report the scanned
	string literal. */
	if (yyleng % 2) {
		BEGIN(INITIAL);
		yylval = sym_tab_add_str_lit(
			pars_sym_tab_global,
			(byte*) stringbuf, stringbuf_len);
		return(PARS_STR_LIT);
	}
}
\" {
	/* Quoted identifiers are handled in an explicit start state 'id'.
	This state is entered and the buffer for the scanned string is emptied
	upon encountering a starting quote.
	In the state 'id', only two actions are possible (defined below). */
	BEGIN(id);
	stringbuf_len = 0;
}
<id>[^\"]+ {
	/* Got a sequence of characters other than '"':
	append to string buffer */
	string_append(yytext, yyleng);
}
<id>\"+ {
	/* Got a sequence of '"' characters:
	append half of them to string buffer,
	as '""' represents a single '"'.
	We apply truncating division,
	so that '"""' will result in '"'. */
	string_append(yytext, yyleng / 2);
	/* If we got an odd number of quotes, then the
	last quote we got is the terminating quote.
	At the end of the string, we return to the
	initial start state and report the scanned
	identifier. */
	if (yyleng % 2) {
		BEGIN(INITIAL);
		yylval = sym_tab_add_id(
			pars_sym_tab_global,
			(byte*) stringbuf, stringbuf_len);
		return(PARS_ID_TOKEN);
	}
}
"NULL" {
yylval = sym_tab_add_null_lit(pars_sym_tab_global);
return(PARS_NULL_LIT);
}
"SQL" {
/* Implicit cursor name */
yylval = sym_tab_add_str_lit(pars_sym_tab_global,
(byte*) yytext, yyleng);
return(PARS_SQL_TOKEN);
}
"AND" {
return(PARS_AND_TOKEN);
}
"OR" {
return(PARS_OR_TOKEN);
}
"NOT" {
return(PARS_NOT_TOKEN);
}
"PROCEDURE" {
return(PARS_PROCEDURE_TOKEN);
}
"IN" {
return(PARS_IN_TOKEN);
}
"OUT" {
return(PARS_OUT_TOKEN);
}
"BINARY" {
return(PARS_BINARY_TOKEN);
}
"BLOB" {
return(PARS_BLOB_TOKEN);
}
"INT" {
return(PARS_INT_TOKEN);
}
"INTEGER" {
return(PARS_INT_TOKEN);
}
"FLOAT" {
return(PARS_FLOAT_TOKEN);
}
"CHAR" {
return(PARS_CHAR_TOKEN);
}
"IS" {
return(PARS_IS_TOKEN);
}
"BEGIN" {
return(PARS_BEGIN_TOKEN);
}
"END" {
return(PARS_END_TOKEN);
}
"IF" {
return(PARS_IF_TOKEN);
}
"THEN" {
return(PARS_THEN_TOKEN);
}
"ELSE" {
return(PARS_ELSE_TOKEN);
}
"ELSIF" {
return(PARS_ELSIF_TOKEN);
}
"LOOP" {
return(PARS_LOOP_TOKEN);
}
"WHILE" {
return(PARS_WHILE_TOKEN);
}
"RETURN" {
return(PARS_RETURN_TOKEN);
}
"SELECT" {
return(PARS_SELECT_TOKEN);
}
"SUM" {
return(PARS_SUM_TOKEN);
}
"COUNT" {
return(PARS_COUNT_TOKEN);
}
"DISTINCT" {
return(PARS_DISTINCT_TOKEN);
}
"FROM" {
return(PARS_FROM_TOKEN);
}
"WHERE" {
return(PARS_WHERE_TOKEN);
}
"FOR" {
return(PARS_FOR_TOKEN);
}
"CONSISTENT" {
return(PARS_CONSISTENT_TOKEN);
}
"READ" {
return(PARS_READ_TOKEN);
}
"ORDER" {
return(PARS_ORDER_TOKEN);
}
"BY" {
return(PARS_BY_TOKEN);
}
"ASC" {
return(PARS_ASC_TOKEN);
}
"DESC" {
return(PARS_DESC_TOKEN);
}
"INSERT" {
return(PARS_INSERT_TOKEN);
}
"INTO" {
return(PARS_INTO_TOKEN);
}
"VALUES" {
return(PARS_VALUES_TOKEN);
}
"UPDATE" {
return(PARS_UPDATE_TOKEN);
}
"SET" {
return(PARS_SET_TOKEN);
}
"DELETE" {
return(PARS_DELETE_TOKEN);
}
"CURRENT" {
return(PARS_CURRENT_TOKEN);
}
"OF" {
return(PARS_OF_TOKEN);
}
"CREATE" {
return(PARS_CREATE_TOKEN);
}
"TABLE" {
return(PARS_TABLE_TOKEN);
}
"INDEX" {
return(PARS_INDEX_TOKEN);
}
"UNIQUE" {
return(PARS_UNIQUE_TOKEN);
}
"CLUSTERED" {
return(PARS_CLUSTERED_TOKEN);
}
"DOES_NOT_FIT_IN_MEMORY" {
return(PARS_DOES_NOT_FIT_IN_MEM_TOKEN);
}
"ON" {
return(PARS_ON_TOKEN);
}
"DECLARE" {
return(PARS_DECLARE_TOKEN);
}
"CURSOR" {
return(PARS_CURSOR_TOKEN);
}
"OPEN" {
return(PARS_OPEN_TOKEN);
}
"FETCH" {
return(PARS_FETCH_TOKEN);
}
"CLOSE" {
return(PARS_CLOSE_TOKEN);
}
"NOTFOUND" {
return(PARS_NOTFOUND_TOKEN);
}
"TO_CHAR" {
return(PARS_TO_CHAR_TOKEN);
}
"TO_NUMBER" {
return(PARS_TO_NUMBER_TOKEN);
}
"TO_BINARY" {
return(PARS_TO_BINARY_TOKEN);
}
"BINARY_TO_NUMBER" {
return(PARS_BINARY_TO_NUMBER_TOKEN);
}
"SUBSTR" {
return(PARS_SUBSTR_TOKEN);
}
"REPLSTR" {
return(PARS_REPLSTR_TOKEN);
}
"CONCAT" {
return(PARS_CONCAT_TOKEN);
}
"INSTR" {
return(PARS_INSTR_TOKEN);
}
"LENGTH" {
return(PARS_LENGTH_TOKEN);
}
"SYSDATE" {
return(PARS_SYSDATE_TOKEN);
}
"PRINTF" {
return(PARS_PRINTF_TOKEN);
}
"ASSERT" {
return(PARS_ASSERT_TOKEN);
}
"RND" {
return(PARS_RND_TOKEN);
}
"RND_STR" {
return(PARS_RND_STR_TOKEN);
}
"ROW_PRINTF" {
return(PARS_ROW_PRINTF_TOKEN);
}
"COMMIT" {
return(PARS_COMMIT_TOKEN);
}
"ROLLBACK" {
return(PARS_ROLLBACK_TOKEN);
}
"WORK" {
return(PARS_WORK_TOKEN);
}
"UNSIGNED" {
return(PARS_UNSIGNED_TOKEN);
}
"EXIT" {
return(PARS_EXIT_TOKEN);
}
"FUNCTION" {
return(PARS_FUNCTION_TOKEN);
}
{ID} {
yylval = sym_tab_add_id(pars_sym_tab_global,
(byte*)yytext,
ut_strlen(yytext));
return(PARS_ID_TOKEN);
}
".." {
return(PARS_DDOT_TOKEN);
}
":=" {
return(PARS_ASSIGN_TOKEN);
}
"<=" {
return(PARS_LE_TOKEN);
}
">=" {
return(PARS_GE_TOKEN);
}
"<>" {
return(PARS_NE_TOKEN);
}
"(" {
return((int)(*yytext));
}
"=" {
return((int)(*yytext));
}
">" {
return((int)(*yytext));
}
"<" {
return((int)(*yytext));
}
"," {
return((int)(*yytext));
}
";" {
return((int)(*yytext));
}
")" {
return((int)(*yytext));
}
"+" {
return((int)(*yytext));
}
"-" {
return((int)(*yytext));
}
"*" {
return((int)(*yytext));
}
"/" {
return((int)(*yytext));
}
"%" {
return((int)(*yytext));
}
"{" {
return((int)(*yytext));
}
"}" {
return((int)(*yytext));
}
"?" {
return((int)(*yytext));
}
"/*" BEGIN(comment); /* eat up comment */
<comment>[^*]*
<comment>"*"+[^*/]*
<comment>"*"+"/" BEGIN(INITIAL);
[ \t\n]+ /* eat up whitespace */
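. {
	/* Cast to unsigned char so that bytes >= 0x80 are not
	sign-extended when printed with %02x. */
	fprintf(stderr, "Unrecognized character: %02x\n",
		(unsigned char) *yytext);
	ut_error;
	return(0);
}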
%%
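
The 'quoted' and 'id' start states above share one decoding rule: a run of n quote characters contributes n / 2 literal quotes, and an odd n means the last quote of the run terminates the token. The standalone sketch below restates that rule in plain C, outside of Flex, so it can be checked in isolation. The function unquote() and its signature are hypothetical and are not part of InnoDB; it writes into a fixed-size caller buffer instead of the growing stringbuf used by string_append().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration (not InnoDB code).
Decodes a quoted token that begins at the opening quote character q.
Writes the decoded bytes to out (no terminating NUL is added).
Returns the decoded length, or -1 if the token is not terminated or
does not fit into out_cap bytes. */
static long
unquote(
	const char*	in,	/* in: text starting at the opening quote */
	char		q,	/* in: quote character, ' or " */
	char*		out,	/* out: decoded bytes */
	size_t		out_cap)/* in: capacity of out */
{
	size_t	n = 0;

	if (*in++ != q) {
		return(-1);
	}
	for (;;) {
		size_t	run = 0;

		/* A run of characters other than the quote:
		copy it unchanged. */
		while (*in != '\0' && *in != q) {
			if (n == out_cap) {
				return(-1);
			}
			out[n++] = *in++;
		}
		/* A maximal run of quote characters. */
		while (in[run] == q) {
			run++;
		}
		if (run == 0) {
			return(-1);	/* input ended before closing quote */
		}
		/* Emit half of the run, using truncating division. */
		if (run / 2 > out_cap - n) {
			return(-1);
		}
		memset(out + n, q, run / 2);
		n += run / 2;
		in += run;
		/* An odd run ends with the terminating quote. */
		if (run % 2) {
			return((long) n);
		}
	}
}
```

Under these assumptions, the input 'it''s' decodes to the four bytes it's, and an opening quote followed by a run of three quotes decodes to a single quote character, which matches the truncating-division comment in the lexer rules.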