Commit graph

16 commits

Author SHA1 Message Date
Alexander Barkov
1ca595fbf7 LDML refactoring for "MDEV-9711 NO PAD collations"
- Moving detection of the MY_CS_CSSORT, MY_CS_PUREASCII, MY_CS_NONASCII
  flags of loadable collations from add_collation() in mysys.c
  to my_cset_init_8bit() and my_coll_init_simple() in ctype-simple.c.

- Adding tests that these flags are set properly for loadable collations

- Moving LDML test related *.xml files from mysql-test/std_data/
  to mysql-test/std_data/ldml/, as there will be more *.xml test files
2016-09-03 09:05:56 +04:00
Alexander Barkov
25e68c5e46 MDEV-8686 A user defined collation utf8_confusables doesn't work
The collation customization code for the UCA (Unicode Collation Alrorithm)
based collations now allows to reset to and shift of characters with
implicit weights. Previously reset/shift worked only for the characters
with explicit DUCET weights. An attempt to use reset/shift with
character with implicit weights made the server crash.
2016-06-23 14:25:48 +04:00
Alexander Barkov
d25d7ec589 Fixing malformed data in mysql-test/std_data/Index.xml 2013-11-26 10:53:21 +04:00
Alexander Barkov
fad7371191 Merging xxx_unicode_520_ci and xxx_vietnamese_ci from MySQL-5.6. 2013-11-12 16:48:57 +04:00
Alexander Barkov
bd3dc54261 A few minor Unicode collation customization improvements were made,
which makes it possible to add more world language collations
with very complex collation rules (e.g. Myanmar):
- Weight string for a single character in a user defined collation
  was erroneously limited to 7 weights (instead of 8 weights).
  Added an extra element in the user-defined weight arrays,
  to fit 8 non-zero weights.
- Weight string limit for contractions was made two times longer (16 weights),
  which allows longer contractions without affecting the performance
  of filesort.
- A user-defined collation now refuses to initialize and reports an error
  in case if a weight string gets longer than 8 weights for a single character,
  or longer than 16 weights for a contraction. Previously weight strings
  for such characters (and contractions) were cut, so a collation
  could silently start with wrong rules.
- Fixed a bug in handling rules like "&a << b" in combination with
  shift-after-method="expand". The primary weight for "b" was not
  correctly calculated, which erroneously made "b" primary greater than "a"
  instead of primary equal to "a".
2013-10-31 14:24:24 +04:00
Alexander Barkov
426d246f5b MDEV-5163 Merge WEIGHT_STRING function from MySQL-5.6 2013-10-23 20:25:52 +04:00
Alexander Barkov
0b6c4bb34f MDEV-4928 Merge collation customization improvements
Merging the following MySQL-5.6 changes:
- WL#5624: Collation customization improvements
  http://dev.mysql.com/worklog/task/?id=5624

- WL#4013: Unicode german2 collation
  http://dev.mysql.com/worklog/task/?id=4013

- Bug#62429 XML: ExtractValue, UpdateXML max arg length 127 chars
  http://bugs.mysql.com/bug.php?id=62429
  (required by WL#5624)
2013-10-02 15:04:07 +04:00
Alexander Barkov
8994fad85d Backporting WL#1213
config/ac-macros/character_sets.m4:
  - Adding configure definitions for utf8mb4, utf16, utf32
include/config-win.h:
  - Enabling utf8mb4, utf16, utf32 in Windows build
include/m_ctype.h:
  - Adding new flags
  - Adding new shared functions prototypes
mysql-test/include/ctype_datetime.inc:
  - Adding test to check that datetime functions
    work with "real" multibyte character sets.
mysql-test/include/ctype_like.inc:
  - Adding LIKE tests
mysql-test/include/have_utf16.inc:
  New file
mysql-test/include/have_utf32.inc:
  New file
mysql-test/include/have_utf8mb4.inc:
  New file
mysql-test/r/ctype_ldml.result:
  - Adding tests for utf8mb4, utf16, utf32
mysql-test/r/ctype_many.result:
  - Adding tests to check superset/subset relations
    between all Unicode character sets.
mysql-test/r/ctype_utf16.result:
  New file
mysql-test/r/ctype_utf16_uca.result:
  New file
mysql-test/r/ctype_utf32.result:
  New file
mysql-test/r/ctype_utf32_uca.result:
  New file
mysql-test/r/ctype_utf8.result:
  - Adding tests for utf8mn3 alias
mysql-test/r/ctype_utf8mb4.result:
  - Adding tests for utf8mb4
mysql-test/r/have_utf16.require:
  New file
mysql-test/r/have_utf32.require:
  New file
mysql-test/r/have_utf8mb4.require:
  New file
mysql-test/std_data/Index.xml:
  - Adding tests for loadable utf8m4, utf16, utf32 collations
mysql-test/suite/sys_vars/r/character_set_client_basic.result:
  - Adding tests for utf16, utf32.
  - Fixing new number of character sets
mysql-test/suite/sys_vars/r/character_set_connection_basic.result:
  - Fixing new number of character sets
mysql-test/suite/sys_vars/r/character_set_database_basic.result:
  - Fixing new number of character sets
mysql-test/suite/sys_vars/r/character_set_filesystem_basic.result:
  - Fixing new number of character sets
mysql-test/suite/sys_vars/r/character_set_results_basic.result:
  - Fixing new number of character sets
mysql-test/suite/sys_vars/t/character_set_client_basic.test:
  - Adding tests for new character sets
mysql-test/suite/sys_vars/t/character_set_connection_basic.test:
  - Adding dependency on utf8mb4, utf16, utf32
mysql-test/suite/sys_vars/t/character_set_database_basic.test:
  - Adding dependency on utf8mb4, utf16, utf32
mysql-test/suite/sys_vars/t/character_set_filesystem_basic.test:
  - Adding dependency on utf8mb4, utf16, utf32
mysql-test/suite/sys_vars/t/character_set_results_basic.test:
  - Adding dependency on utf8mb4, utf16, utf32
mysql-test/t/ctype_ldml.test:
  - Adding tests for dynamic utf8mb4, utf16, utf32 collations
mysql-test/t/ctype_many.test:
  - Adding tests to check superset/subset relations
    between all Unicode character sets
mysql-test/t/ctype_utf16.test:
  New file
mysql-test/t/ctype_utf16_uca.test:
  New file
mysql-test/t/ctype_utf32.test:
  New file
mysql-test/t/ctype_utf32_uca.test:
  New file
mysql-test/t/ctype_utf8.test:
  - Adding tests for utf8mb4 alias
mysql-test/t/ctype_utf8mb4.test:
  New file
mysys/charset-def.c:
  - Adding initialization of utf8mb4, utf16, utf32 built-int collations
mysys/charset.c:
  - Adding initialization of utf8mb4, utf16, utf32 dynamic collations
sql/field.cc:
  - Fixing "truncated" error with datetime functions:
    Force conversion in case of non-ascii character sets.
sql/item.cc:
  - Adding superset/subset relation check for utf8mb4/utf8
sql/item_strfunc.cc:
  - Fixing a problem with CHAR(x USING utf32)
sql/sql_string.cc:
  - Fixing problems with zero padding for UTF32
sql/sql_table.cc:
  - Fixing buffer size, to make utf32 comma fit.
strings/ctype-mb.c:
  - Making handlers for multi-byte binary collations public
strings/ctype-uca.c:
  - Adding definitions for utf8mb4, utf16, utf32 UCA collations
strings/ctype-ucs2.c:
  - Adding functions which are shared between ucs2, utf16, utf32
  - Ading utf16 implementation
  - Adding utf32 implementation
strings/ctype-utf8.c:
  - Adding functions shared between utf8 and utf8mb4
  - Adding implementation of utf8mb4
2010-02-24 13:15:34 +04:00
Alexander Nozdrin
5f0c09dd72 Manual merge from mysql-trunk-merge.
Conflicts:
  - include/my_no_pthread.h
  - mysql-test/r/sp-ucs2.result
  - sql/log.cc
  - sql/sql_acl.cc
  - sql/sql_yacc.yy
2009-12-16 21:02:21 +03:00
Alexey Kopytov
f1e83a4163 Manual merge of mysql-5.1-bugteam into mysql-trunk-merge. 2009-12-16 16:47:07 +03:00
Alexander Barkov
cff23162ec Bug#49134 5.1 server segfaults with 2byte collation file
Problem: add_collation did not check that cs->number is smaller
than the number of elements in the array all_charsets[],
so server could crash when loading an Index.xml file with
a collation ID greater the number of elements 
(for example when downgrading from 5.5).

Fix: adding a condition to check that cs->number is not out of valid range.
2009-12-15 13:48:29 +04:00
Alexander Barkov
7e8b208d3b Backporting Bug#37129 LDML lacks <i> rule 2009-11-09 13:45:40 +04:00
Alexander Barkov
dcb8bb23c2 Merging mysql-next-mr-merge to mysql-next-mr. 2009-10-21 15:48:22 +05:00
Alexander Barkov
3929dddcd7 Backporting WL#4164 Two-byte collation IDs 2009-10-15 15:17:32 +05:00
V Narayanan
64e89a7ed2 Bug#46448 trailing spaces are not ignored when user collation maps space != 0x20
In MySQL when the mapping for space is changed to something other than
0x20 by defining a different collation, then space is not ignored when
comparing two strings.

This was happening because the function that performs the comparison
of two strings while ignoring ending spaces, was comparing the collation
value of a space with the ascii value of the ' ' character. This should
be changed to do comparison between the collated values.

mysql-test/r/ctype_ldml.result:
  Bug#46448 trailing spaces are not ignored when user collation maps space != 0x20
  
  Result file for test case.
mysql-test/std_data/Index.xml:
  Bug#46448 trailing spaces are not ignored when user collation maps space != 0x20
  
  Added entry for new test collation in the index file.
mysql-test/std_data/latin1.xml:
  Bug#46448 trailing spaces are not ignored when user collation maps space != 0x20
  
  Added support for new collation for test.
mysql-test/t/ctype_ldml.test:
  Bug#46448 trailing spaces are not ignored when user collation maps space != 0x20
  
  Added test case to ensure trailing spaces are not ignored.
strings/ctype-simple.c:
  Bug#46448 trailing spaces are not ignored when user collation maps space != 0x20
  
  change my_strnncollsp_simple to compare collated values when checking
  for trailing spaces. Currently the comparison happens between a collated
  value and the ascii value.
2009-10-12 13:13:15 +05:30
unknown
af0a3afe05 Bug#28916 LDML doesn't work for utf8
and is not described in the manual
- Adding missing initialization for utf8 collations
- Minor code clean-ups: renaming variables,
  moving code into a new separate function.
- Adding test, to check that both ucs2 and utf8 user
  defined collations work (ucs2_test_ci and utf8_test_ci)
- Adding Vietnamese collation as a complex user defined
  collation example.


include/m_ctype.h:
  Renaming variable names to match collation names (for convenience).
mysys/charset-def.c:
  - Removing redundant declarations for variables declared in m_ctype.h
  - Renaming variable names to match collation names (for convenience).
mysys/charset.c:
  - Renaming "new" to "newcs", to avoid using C reserved word as a variable name
  - Moving UCA initialization code into a separate function
  - The bug fix itself: adding initialization of utf8 collations
strings/ctype-uca.c:
  Renaming variable names to match collation names (for convenience).
strings/ctype.c:
  Increasing buffer size to fit tailoring for languages
  with complex rules (e.g. Vietnamese).
mysql-test/r/ctype_ldml.result:
  Adding test case
mysql-test/std_data/Index.xml:
  Adding Index.xml example with user defined collations.
mysql-test/t/ctype_ldml-master.opt:
  Adding OPT file for the test case,
  to use the example Index.xml file.
mysql-test/t/ctype_ldml.test:
  Adding test case
2007-06-07 17:55:55 +05:00