The root cause of the issue was that the CREATE FUNCTION grammar,
for User Defined Functions, was using the sp_name rule.
The sp_name rule is intended for fully qualified stored procedure names,
like either ident.ident, or just ident but with a default database
implicitly selected.
A UDF does not have a fully qualified name, only a name (ident), and should
not use the sp_name grammar fragment during parsing.
The fix is to re-organize the CREATE FUNCTION grammar, to better separate:
- creating UDF (no definer, can have AGGREGATE, simple ident)
- creating Stored Functions (definer, no AGGREGATE, fully qualified name)
With the test case provided, another issue was exposed which is also fixed:
the DROP FUNCTION statement was using sp_name and also failing when no database
is implicitly selected, when droping UDF functions.
The fix is also to change the grammar so that DROP FUNCTION works with
both the ident.ident syntax (to drop a stored function), or just the ident
syntax (to drop either a UDF or a Stored Function, in the current database)
Bug#29816 Syntactically wrong query fails with misleading error message
The core problem is that an SQL-invoked function name can be a <schema
qualified routine name> that contains no <schema name>, but the mysql
parser insists that all stored procedures (function, procedures and
triggers) must have a <schema name>, which is not true for functions.
This problem is especially visible when trying to create a function
or when a query contains a syntax error after a function call (in the
same query), both will fail with a "No database selected" message if
the session is not attached to a particular schema, but the first
one should succeed and the second fail with a "syntax error" message.
Part of the fix is to revamp the sp name handling so that a schema
name may be omitted for functions -- this means that the internal
function name representation may not have a dot, which represents
that the function doesn't have a schema name. The other part is
to place schema checks after the type (function, trigger or procedure)
of the routine is known.
Previously, UDF *_init functions were passed constant strings with erroneous lengths.
The length came from the containing variable's size, not the length of the value itself.
Now the *_init functions get the constant as a null terminated string with the correct
length supplied.
crashes server
Check for null value is reliable only after calling some of the
val_xxx() methods. If the val_xxx() method is not called
the null_value flag will be set only for certain types of NULL
values (like SQL constant NULLs for example).
This caused a crash while trying to dereference a NULL pointer
that is returned by val_str() for NULL values.
Fixed by swapping the order of val_xxx() and null_value check.
the UDF
When deleting a user defined function MySQL must remove it from both the
in-memory hash table and the mysql.proc system table.
Finding (and removal therefore) from the internal hash table is case
insensitive (or whatever the default charset is), whereas finding and
removal from the system table is case sensitive.
As a result if you supply a function name that is not in the same character
case to DROP FUNCTION the server will remove the function only from the
in-memory hash table and will keep the row in mysql.proc system table.
This will cause inconsistency between the two structures (that is fixed
only by restarting the server).
Fixed by using the name in the precise case (from the in-memory hash table)
to delete the row in the mysql.proc system table.
Before this fix, a call to a User Defined Function (UDF) could,
under some circumstances, be interpreted as a call to a Stored function
instead. This occurred if a native function was invoked in the parameters
for the UDF, as in "select my_udf(abs(x))".
The root cause of this defect is the introduction, by the fix for Bug 21809,
of st_select_lex::udf_list, and it's usage in the parser in sql_yacc.yy
in the rule function_call_generic (in 5.1).
While the fix itself for Bug 21809 is correct in 5.0, the code change
merged into the 5.1 release created the issue, because the calls in 5.1 to :
- lex->current_select->udf_list.push_front(udf)
- lex->current_select->udf_list.pop()
are not balanced in case of native functions, causing the udf_list,
which is really a stack, to be out of sync with the internal stack
maintained by the bison parser.
Instead of moving the call to udf_list.pop(), which would have fixed the
symptom, this patch goes further and removes the need for udf_list.
This is motivated by two reasons:
a) Maintaining a stack in the MySQL code in sync with the stack maintained
internally in sql_yacc.cc (not .yy) is extremely dependent of the
implementation of yacc/bison, and extremely difficult to maintain.
It's also totally dependent of the structure of the grammar, and has a risk
to break with regression defects each time the grammar itself is changed.
b) The previous code did report construct like "foo(expr AS name)" as
syntax errors (ER_PARSER_ERROR), which is incorrect, and misleading.
The syntax is perfectly valid, as this expression is valid when "foo" is
a UDF. Whether this syntax is legal or not depends of the semantic of "foo".
With this change:
a) There is only one stack (in bison), and no List<udf_func> to maintain.
b) "foo(expr AS name)", when used incorrectly, is reported as semantic error:
- ER_WRONG_PARAMETERS_TO_NATIVE_FCT (for native functions)
- ER_WRONG_PARAMETERS_TO_STORED_FCT (for stored functions)
This is achieved by the changes implemented in item_create.cc
Bug#21025 (misleading error message when creating functions named 'x', or 'y')
Bug#22619 (Spaces considered harmful)
This change contains a fix to report warnings or errors, and multiple tests
cases.
Before this fix, name collisions between:
- Native functions
- User Defined Functions
- Stored Functions
were not systematically reported, leading to confusing behavior.
I) Native / User Defined Function
Before this fix, is was possible to create a UDF named "foo", with the same
name as a native function "foo", but it was impossible to invoke the UDF,
since the syntax "foo()" always refer to the native function.
After this fix, creating a UDF fails with an error if there is a name
collision with a native function.
II) Native / Stored Function
Before this fix, is was possible to create a SF named "db.foo", with the same
name as a native function "foo", but this was confusing since the syntax
"foo()" would refer to the native function. To refer to the Stored Function,
the user had to use the "db.foo()" syntax.
After this fix, creating a Stored Function reports a warning if there is a
name collision with a native function.
III) User Defined Function / Stored Function
Before this fix, creating a User Defined Function "foo" and a Stored Function
"db.foo" are mutually exclusive operations. Whenever the second function is
created, an error is reported. However, the test suite did not cover this
behavior.
After this fix, the behavior is unchanged, and is now covered by test cases.
Note that the code change in this patch depends on the fix for Bug 21114.
The code that set up data to be passed to user-defined functions was very
old and analyzed the "Type" of the data that was passed into the UDF, when
it really should analyze the "return_type", which is hard-coded for simple
Items and works correctly for complex ones like functions.
---
Added test at Sergei's behest.
select OK.
The SQL parser was using Item::name to transfer user defined function attributes
to the user defined function (udf). It was not distinguishing between user defined
function call arguments and stored procedure call arguments. Setting Item::name
was causing Item_ref::print() method to print the argument as quoted identifiers
and caused views that reference aggregate functions as udf call arguments (and
rely on Item::print() for the text of the view to store) to throw an undefined
identifier error.
Overloaded Item_ref::print to print aggregate functions as such when printing
the references to aggregate functions taken out of context by split_sum_func2()
Fixed the parser to properly detect using AS clause in stored procedure arguments
as an error.
Fixed printing the arguments of udf call to print properly the udf attribute.
The problem was that the grammar allows to create a function with an optional
definer clause, and define it as a UDF with the SONAME keyword.
Such combination should be reported as an error.
The solution is to not change the grammar itself, and to introduce a
specific check in the yacc actions in 'create_function_tail' for UDF,
that now reports ER_WRONG_USAGE when using both DEFINER and SONAME.
When there is no index defined filesort is used to sort the result of a
query. If there is a function in the select list and the result set should be
ordered by it's value then this function will be evaluated twice. First time to
get the value of the sort key and second time to send its value to a user.
This happens because filesort when sorts a table remembers only values of its
fields but not values of functions.
All functions are affected. But taking into account that SP and UDF functions
can be both expensive and non-deterministic a temporary table should be used
to store their results and then sort it to avoid twice SP evaluation and to
get a correct result.
If an expression referenced in an ORDER clause contains a SP or UDF
function, force the use of a temporary table.
A new Item_processor function called func_type_checker_processor is added
to check whether the expression contains a function of a particular type.
The is_null value was initialized once and thereafter only set to indicate
NULL, and never unset to indicate not-NULL.
Now set is_null to false, in addition to only setting it to true when the value
in question is null.
- Pass "buffers[i]" to val_str() in udf_handler::fix_fields insteead of NULL.
- Add testcase for UDF that will load and run the udf_example functions
if available