Mirror of https://github.com/MariaDB/server.git (synced 2026-05-15 19:37:16 +02:00)
Merge branch 'merge-pcre' into 10.0
commit 0b4f5060bb
41 changed files with 1695 additions and 768 deletions
@@ -8,7 +8,7 @@ Email domain: cam.ac.uk
 University of Cambridge Computing Service,
 Cambridge, England.
 
-Copyright (c) 1997-2014 University of Cambridge
+Copyright (c) 1997-2015 University of Cambridge
 All rights reserved
 
 
@@ -19,7 +19,7 @@ Written by: Zoltan Herczeg
 Email local part: hzmester
 Email domain: freemail.hu
 
-Copyright(c) 2010-2014 Zoltan Herczeg
+Copyright(c) 2010-2015 Zoltan Herczeg
 All rights reserved.
 
 
@@ -30,7 +30,7 @@ Written by: Zoltan Herczeg
 Email local part: hzmester
 Email domain: freemail.hu
 
-Copyright(c) 2009-2014 Zoltan Herczeg
+Copyright(c) 2009-2015 Zoltan Herczeg
 All rights reserved.
 
 
|
|||
167
pcre/ChangeLog
|
|
@@ -1,6 +1,173 @@
|
|||
ChangeLog for PCRE
|
||||
------------------
|
||||
|
||||
Version 8.37 28-April-2015
|
||||
--------------------------
|
||||
|
||||
1. When an (*ACCEPT) is triggered inside capturing parentheses, it arranges
|
||||
for those parentheses to be closed with whatever has been captured so far.
|
||||
However, it was failing to mark any other groups between the highest
|
||||
capture so far and the current group as "unset". Thus, the ovector for
|
||||
those groups contained whatever was previously there. An example is the
|
||||
pattern /(x)|((*ACCEPT))/ when matched against "abcd".
|
||||
|
||||
2. If an assertion condition was quantified with a minimum of zero (an odd
|
||||
thing to do, but it happened), SIGSEGV or other misbehaviour could occur.
|
||||
|
||||
3. If a pattern in pcretest input had the P (POSIX) modifier followed by an
|
||||
unrecognized modifier, a crash could occur.
|
||||
|
||||
4. An attempt to do global matching in pcretest with a zero-length ovector
|
||||
caused a crash.
|
||||
|
||||
5. Fixed a memory leak during matching that could occur for a subpattern
|
||||
subroutine call (recursive or otherwise) if the number of captured groups
|
||||
that had to be saved was greater than ten.
|
||||
|
||||
6. Catch a bad opcode during auto-possessification after compiling a bad UTF
|
||||
string with NO_UTF_CHECK. This is a tidyup, not a bug fix, as passing bad
|
||||
UTF with NO_UTF_CHECK is documented as having an undefined outcome.
|
||||
|
||||
7. A UTF pattern containing a "not" match of a non-ASCII character and a
|
||||
subroutine reference could loop at compile time. Example: /[^\xff]((?1))/.
|
||||
|
||||
8. When a pattern is compiled, it remembers the highest back reference so that
|
||||
when matching, if the ovector is too small, extra memory can be obtained to
|
||||
use instead. A conditional subpattern whose condition is a check on a
|
||||
capture having happened, such as, for example in the pattern
|
||||
/^(?:(a)|b)(?(1)A|B)/, is another kind of back reference, but it was not
|
||||
setting the highest backreference number. This mattered only if pcre_exec()
|
||||
was called with an ovector that was too small to hold the capture, and there
|
||||
was no other kind of back reference (a situation which is probably quite
|
||||
rare). The effect of the bug was that the condition was always treated as
|
||||
FALSE when the capture could not be consulted, leading to incorrect
|
||||
behaviour by pcre_exec(). This bug has been fixed.
|
||||
|
||||
9. A reference to a duplicated named group (either a back reference or a test
|
||||
for being set in a conditional) that occurred in a part of the pattern where
|
||||
PCRE_DUPNAMES was not set caused the amount of memory needed for the pattern
|
||||
to be incorrectly calculated, leading to overwriting.
|
||||
|
||||
10. A mutually recursive set of back references such as (\2)(\1) caused a
|
||||
segfault at study time (while trying to find the minimum matching length).
|
||||
The infinite loop is now broken (with the minimum length unset, that is,
|
||||
zero).
|
||||
|
||||
11. If an assertion that was used as a condition was quantified with a minimum
|
||||
of zero, matching went wrong. In particular, if the whole group had
|
||||
unlimited repetition and could match an empty string, a segfault was
|
||||
likely. The pattern (?(?=0)?)+ is an example that caused this. Perl allows
|
||||
assertions to be quantified, but not if they are being used as conditions,
|
||||
so the above pattern is faulted by Perl. PCRE has now been changed so that
|
||||
it also rejects such patterns.
|
||||
|
||||
12. A possessive capturing group such as (a)*+ with a minimum repeat of zero
|
||||
failed to allow the zero-repeat case if pcre2_exec() was called with an
|
||||
ovector too small to capture the group.
|
||||
|
||||
13. Fixed two bugs in pcretest that were discovered by fuzzing and reported by
|
||||
Red Hat Product Security:
|
||||
|
||||
(a) A crash if /K and /F were both set with the option to save the compiled
|
||||
pattern.
|
||||
|
||||
(b) Another crash if the option to print captured substrings in a callout
|
||||
was combined with setting a null ovector, for example \O\C+ as a subject
|
||||
string.
|
||||
|
||||
14. A pattern such as "((?2){0,1999}())?", which has a group containing a
|
||||
forward reference repeated a large (but limited) number of times within a
|
||||
repeated outer group that has a zero minimum quantifier, caused incorrect
|
||||
code to be compiled, leading to the error "internal error:
|
||||
previously-checked referenced subpattern not found" when an incorrect
|
||||
memory address was read. This bug was reported as "heap overflow",
|
||||
discovered by Kai Lu of Fortinet's FortiGuard Labs and given the CVE number
|
||||
CVE-2015-2325.
|
||||
|
||||
23. A pattern such as "((?+1)(\1))/" containing a forward reference subroutine
|
||||
call within a group that also contained a recursive back reference caused
|
||||
incorrect code to be compiled. This bug was reported as "heap overflow",
|
||||
discovered by Kai Lu of Fortinet's FortiGuard Labs, and given the CVE
|
||||
number CVE-2015-2326.
|
||||
|
||||
24. Computing the size of the JIT read-only data in advance has been a source
|
||||
of various issues, and new ones still appear, unfortunately. To fix
|
||||
existing and future issues, size computation is eliminated from the code,
|
||||
and replaced by on-demand memory allocation.
|
||||
|
||||
25. A pattern such as /(?i)[A-`]/, where characters in the other case are
|
||||
adjacent to the end of the range, and the range contained characters with
|
||||
more than one other case, caused incorrect behaviour when compiled in UTF
|
||||
mode. In that example, the range a-j was left out of the class.
|
||||
|
||||
26. Fix JIT compilation of conditional blocks whose assertion
|
||||
is converted to (*FAIL). E.g: /(?(?!))/.
|
||||
|
||||
27. The pattern /(?(?!)^)/ caused references to random memory. This bug was
|
||||
discovered by the LLVM fuzzer.
|
||||
|
||||
28. The assertion (?!) is optimized to (*FAIL). This was not handled correctly
|
||||
when this assertion was used as a condition, for example (?(?!)a|b). In
|
||||
pcre2_match() it worked by luck; in pcre2_dfa_match() it gave an incorrect
|
||||
error about an unsupported item.
|
||||
|
||||
29. For some types of pattern, for example /Z*(|d*){216}/, the auto-
|
||||
possessification code could take exponential time to complete. A recursion
|
||||
depth limit of 1000 has been imposed to limit the resources used by this
|
||||
optimization.
|
||||
|
||||
30. In a pattern such as /(*UTF)[\S\V\H]/, which contains a negated special class
|
||||
such as \S in non-UCP mode, explicit wide characters (> 255) can be ignored
|
||||
because \S ensures they are all in the class. The code for doing this was
|
||||
interacting badly with the code for computing the amount of space needed to
|
||||
compile the pattern, leading to a buffer overflow. This bug was discovered
|
||||
by the LLVM fuzzer.
|
||||
|
||||
31. A pattern such as /((?2)+)((?1))/ which has mutual recursion nested inside
|
||||
other kinds of group caused stack overflow at compile time. This bug was
|
||||
discovered by the LLVM fuzzer.
|
||||
|
||||
32. A pattern such as /(?1)(?#?'){8}(a)/ which had a parenthesized comment
|
||||
between a subroutine call and its quantifier was incorrectly compiled,
|
||||
leading to buffer overflow or other errors. This bug was discovered by the
|
||||
LLVM fuzzer.
|
||||
|
||||
33. The illegal pattern /(?(?<E>.*!.*)?)/ was not being diagnosed as missing an
|
||||
assertion after (?(. The code was failing to check the character after
|
||||
(?(?< for the ! or = that would indicate a lookbehind assertion. This bug
|
||||
was discovered by the LLVM fuzzer.
|
||||
|
||||
34. A pattern such as /X((?2)()*+){2}+/ which has a possessive quantifier with
|
||||
a fixed maximum following a group that contains a subroutine reference was
|
||||
incorrectly compiled and could trigger buffer overflow. This bug was
|
||||
discovered by the LLVM fuzzer.
|
||||
|
||||
35. A mutual recursion within a lookbehind assertion such as (?<=((?2))((?1)))
|
||||
caused a stack overflow instead of the diagnosis of a non-fixed length
|
||||
lookbehind assertion. This bug was discovered by the LLVM fuzzer.
|
||||
|
||||
36. The use of \K in a positive lookbehind assertion in a non-anchored pattern
|
||||
(e.g. /(?<=\Ka)/) could make pcregrep loop.
|
||||
|
||||
37. There was a similar problem to 36 in pcretest for global matches.
|
||||
|
||||
38. If a greedy quantified \X was preceded by \C in UTF mode (e.g. \C\X*),
|
||||
and a subsequent item in the pattern caused a non-match, backtracking over
|
||||
the repeated \X did not stop, but carried on past the start of the subject,
|
||||
causing reference to random memory and/or a segfault. There were also some
|
||||
other cases where backtracking after \C could crash. This set of bugs was
|
||||
discovered by the LLVM fuzzer.
|
||||
|
||||
39. The function for finding the minimum length of a matching string could take
|
||||
a very long time if mutual recursion was present many times in a pattern,
|
||||
for example, /((?2){73}(?2))((?1))/. A better mutual recursion detection
|
||||
method has been implemented. This infelicity was discovered by the LLVM
|
||||
fuzzer.
|
||||
|
||||
40. Static linking against the PCRE library using the pkg-config module was
|
||||
failing on missing pthread symbols.
|
||||
|
||||
|
||||
Version 8.36 26-September-2014
|
||||
------------------------------
|
||||
|
||||
|
|
|
|||
|
|
@@ -6,7 +6,8 @@ and semantics are as close as possible to those of the Perl 5 language.
 
 Release 8 of PCRE is distributed under the terms of the "BSD" licence, as
 specified below. The documentation for PCRE, supplied in the "doc"
-directory, is distributed under the same terms as the software itself.
+directory, is distributed under the same terms as the software itself. The data
+in the testdata directory is not copyrighted and is in the public domain.
 
 The basic library functions are written in C and are freestanding. Also
 included in the distribution is a set of C++ wrapper functions, and a
@@ -24,7 +25,7 @@ Email domain: cam.ac.uk
 University of Cambridge Computing Service,
 Cambridge, England.
 
-Copyright (c) 1997-2014 University of Cambridge
+Copyright (c) 1997-2015 University of Cambridge
 All rights reserved.
 
 
@@ -35,7 +36,7 @@ Written by: Zoltan Herczeg
 Email local part: hzmester
 Email domain: freemail.hu
 
-Copyright(c) 2010-2014 Zoltan Herczeg
+Copyright(c) 2010-2015 Zoltan Herczeg
 All rights reserved.
 
 
@@ -46,7 +47,7 @@ Written by: Zoltan Herczeg
 Email local part: hzmester
 Email domain: freemail.hu
 
-Copyright(c) 2009-2014 Zoltan Herczeg
+Copyright(c) 2009-2015 Zoltan Herczeg
 All rights reserved.
 
 
|
|||
|
|
@@ -1,6 +1,14 @@
 News about PCRE releases
 ------------------------
 
+Release 8.37 28-April-2015
+--------------------------
+
+This is a bug-fix release. Note that this library (now called PCRE1) is now being
+maintained for bug fixes only. New projects are advised to use the new PCRE2
+libraries.
+
+
 Release 8.36 26-September-2014
 ------------------------------
 
|
|||
|
|
@@ -1,6 +1,14 @@
|
|||
Building PCRE without using autotools
|
||||
-------------------------------------
|
||||
|
||||
NOTE: This document relates to PCRE releases that use the original API, with
|
||||
library names libpcre, libpcre16, and libpcre32. January 2015 saw the first
|
||||
release of a new API, known as PCRE2, with release numbers starting at 10.00
|
||||
and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old libraries
|
||||
(now called PCRE1) are still being maintained for bug fixes, but there will be
|
||||
no new development. New projects are advised to use the new PCRE2 libraries.
|
||||
|
||||
|
||||
This document contains the following sections:
|
||||
|
||||
General
|
||||
|
|
@@ -761,4 +769,4 @@ There is also a mirror here:
|
|||
http://www.vsoft-software.com/downloads.html
|
||||
|
||||
==========================
|
||||
Last Updated: 14 May 2013
|
||||
Last Updated: 10 February 2015
|
||||
|
|
|
|||
13
pcre/README
|
|
@@ -1,7 +1,16 @@
|
|||
README file for PCRE (Perl-compatible regular expression library)
|
||||
-----------------------------------------------------------------
|
||||
|
||||
The latest release of PCRE is always available in three alternative formats
|
||||
NOTE: This set of files relates to PCRE releases that use the original API,
|
||||
with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
|
||||
first release of a new API, known as PCRE2, with release numbers starting at
|
||||
10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
|
||||
libraries (now called PCRE1) are still being maintained for bug fixes, but
|
||||
there will be no new development. New projects are advised to use the new PCRE2
|
||||
libraries.
|
||||
|
||||
|
||||
The latest release of PCRE1 is always available in three alternative formats
|
||||
from:
|
||||
|
||||
ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
|
||||
|
|
@ -990,4 +999,4 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
|
|||
Philip Hazel
|
||||
Email local part: ph10
|
||||
Email domain: cam.ac.uk
|
||||
Last updated: 24 October 2014
|
||||
Last updated: 10 February 2015
|
||||
|
|
|
|||
|
|
@@ -506,6 +506,11 @@ echo "---------------------------- Test 106 -----------------------------" >>tes
|
|||
(cd $srcdir; echo "a" | $valgrind $pcregrep -M "|a" ) >>testtrygrep 2>&1
|
||||
echo "RC=$?" >>testtrygrep
|
||||
|
||||
echo "---------------------------- Test 107 -----------------------------" >>testtrygrep
|
||||
echo "a" >testtemp1grep
|
||||
echo "aaaaa" >>testtemp1grep
|
||||
(cd $srcdir; $valgrind $pcregrep --line-offsets '(?<=\Ka)' $builddir/testtemp1grep) >>testtrygrep 2>&1
|
||||
echo "RC=$?" >>testtrygrep
|
||||
|
||||
# Now compare the results.
|
||||
|
||||
|
|
|
|||
|
|
@@ -9,17 +9,17 @@ dnl The PCRE_PRERELEASE feature is for identifying release candidates. It might
 dnl be defined as -RC2, for example. For real releases, it should be empty.
 
 m4_define(pcre_major, [8])
-m4_define(pcre_minor, [36])
+m4_define(pcre_minor, [37])
 m4_define(pcre_prerelease, [])
-m4_define(pcre_date, [2014-09-26])
+m4_define(pcre_date, [2015-04-28])
 
 # NOTE: The CMakeLists.txt file searches for the above variables in the first
 # 50 lines of this file. Please update that if the variables above are moved.
 
 # Libtool shared library interface versions (current:revision:age)
-m4_define(libpcre_version, [3:4:2])
-m4_define(libpcre16_version, [2:4:2])
-m4_define(libpcre32_version, [0:4:0])
+m4_define(libpcre_version, [3:5:2])
+m4_define(libpcre16_version, [2:5:2])
+m4_define(libpcre32_version, [0:5:0])
 m4_define(libpcreposix_version, [0:3:0])
 m4_define(libpcrecpp_version, [0:1:0])
 
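The libtool triples in this hunk change only their middle field (revision 4 to 5), which under libtool's current:revision:age convention signals a changed implementation with an unchanged interface. As an illustration only, the sketch below converts such a triple into the numeric suffix that libtool conventionally gives a shared object on Linux, (current - age).age.revision. The function name `libtool_suffix` is hypothetical, and the mapping is platform dependent, so treat the result as an assumption about Linux-style naming rather than something this commit states.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse "current:revision:age" and write the
   Linux-style suffix "(current-age).age.revision" into out.
   Returns 0 on success, -1 on malformed input or a too-small buffer. */
static int
libtool_suffix(const char *version, char *out, size_t outlen)
{
unsigned int current, revision, age;
int n;

assert(version != NULL && out != NULL);

if (sscanf(version, "%u:%u:%u", &current, &revision, &age) != 3) return -1;
if (age > current) return -1;      /* libtool requires age <= current */

n = snprintf(out, outlen, "%u.%u.%u", current - age, age, revision);
if (n < 0 || (size_t)n >= outlen) return -1;
return 0;
}
```

Called with "3:5:2" this yields "1.2.5", and with "0:5:0" it yields "0.0.5".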
|
|||
|
|
@@ -1,6 +1,14 @@
|
|||
Building PCRE without using autotools
|
||||
-------------------------------------
|
||||
|
||||
NOTE: This document relates to PCRE releases that use the original API, with
|
||||
library names libpcre, libpcre16, and libpcre32. January 2015 saw the first
|
||||
release of a new API, known as PCRE2, with release numbers starting at 10.00
|
||||
and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old libraries
|
||||
(now called PCRE1) are still being maintained for bug fixes, but there will be
|
||||
no new development. New projects are advised to use the new PCRE2 libraries.
|
||||
|
||||
|
||||
This document contains the following sections:
|
||||
|
||||
General
|
||||
|
|
@@ -761,4 +769,4 @@ There is also a mirror here:
|
|||
http://www.vsoft-software.com/downloads.html
|
||||
|
||||
==========================
|
||||
Last Updated: 14 May 2013
|
||||
Last Updated: 10 February 2015
|
||||
|
|
|
|||
|
|
@@ -1,7 +1,16 @@
|
|||
README file for PCRE (Perl-compatible regular expression library)
|
||||
-----------------------------------------------------------------
|
||||
|
||||
The latest release of PCRE is always available in three alternative formats
|
||||
NOTE: This set of files relates to PCRE releases that use the original API,
|
||||
with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
|
||||
first release of a new API, known as PCRE2, with release numbers starting at
|
||||
10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
|
||||
libraries (now called PCRE1) are still being maintained for bug fixes, but
|
||||
there will be no new development. New projects are advised to use the new PCRE2
|
||||
libraries.
|
||||
|
||||
|
||||
The latest release of PCRE1 is always available in three alternative formats
|
||||
from:
|
||||
|
||||
ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
|
||||
|
|
@@ -990,4 +999,4 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
|
|||
Philip Hazel
|
||||
Email local part: ph10
|
||||
Email domain: cam.ac.uk
|
||||
Last updated: 24 October 2014
|
||||
Last updated: 10 February 2015
|
||||
|
|
|
|||
|
|
@@ -13,13 +13,24 @@ from the original man page. If there is any nonsense in it, please consult the
|
|||
man page, in case the conversion went wrong.
|
||||
<br>
|
||||
<ul>
|
||||
<li><a name="TOC1" href="#SEC1">INTRODUCTION</a>
|
||||
<li><a name="TOC2" href="#SEC2">SECURITY CONSIDERATIONS</a>
|
||||
<li><a name="TOC3" href="#SEC3">USER DOCUMENTATION</a>
|
||||
<li><a name="TOC4" href="#SEC4">AUTHOR</a>
|
||||
<li><a name="TOC5" href="#SEC5">REVISION</a>
|
||||
<li><a name="TOC1" href="#SEC1">PLEASE TAKE NOTE</a>
|
||||
<li><a name="TOC2" href="#SEC2">INTRODUCTION</a>
|
||||
<li><a name="TOC3" href="#SEC3">SECURITY CONSIDERATIONS</a>
|
||||
<li><a name="TOC4" href="#SEC4">USER DOCUMENTATION</a>
|
||||
<li><a name="TOC5" href="#SEC5">AUTHOR</a>
|
||||
<li><a name="TOC6" href="#SEC6">REVISION</a>
|
||||
</ul>
|
||||
<br><a name="SEC1" href="#TOC1">INTRODUCTION</a><br>
|
||||
<br><a name="SEC1" href="#TOC1">PLEASE TAKE NOTE</a><br>
|
||||
<P>
|
||||
This document relates to PCRE releases that use the original API,
|
||||
with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
|
||||
first release of a new API, known as PCRE2, with release numbers starting at
|
||||
10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
|
||||
libraries (now called PCRE1) are still being maintained for bug fixes, but
|
||||
there will be no new development. New projects are advised to use the new PCRE2
|
||||
libraries.
|
||||
</P>
|
||||
<br><a name="SEC2" href="#TOC1">INTRODUCTION</a><br>
|
||||
<P>
|
||||
The PCRE library is a set of functions that implement regular expression
|
||||
pattern matching using the same syntax and semantics as Perl, with just a few
|
||||
|
|
@@ -115,7 +126,7 @@ clashes. In some environments, it is possible to control which external symbols
|
|||
are exported when a shared library is built, and in these cases the
|
||||
undocumented symbols are not exported.
|
||||
</P>
|
||||
<br><a name="SEC2" href="#TOC1">SECURITY CONSIDERATIONS</a><br>
|
||||
<br><a name="SEC3" href="#TOC1">SECURITY CONSIDERATIONS</a><br>
|
||||
<P>
|
||||
If you are using PCRE in a non-UTF application that permits users to supply
|
||||
arbitrary patterns for compilation, you should be aware of a feature that
|
||||
|
|
@@ -149,7 +160,7 @@ against this: see the PCRE_EXTRA_MATCH_LIMIT feature in the
|
|||
<a href="pcreapi.html"><b>pcreapi</b></a>
|
||||
page.
|
||||
</P>
|
||||
<br><a name="SEC3" href="#TOC1">USER DOCUMENTATION</a><br>
|
||||
<br><a name="SEC4" href="#TOC1">USER DOCUMENTATION</a><br>
|
||||
<P>
|
||||
The user documentation for PCRE comprises a number of different sections. In
|
||||
the "man" format, each of these is a separate "man page". In the HTML format,
|
||||
|
|
@@ -188,7 +199,7 @@ follows:
|
|||
In the "man" and HTML formats, there is also a short page for each C library
|
||||
function, listing its arguments and results.
|
||||
</P>
|
||||
<br><a name="SEC4" href="#TOC1">AUTHOR</a><br>
|
||||
<br><a name="SEC5" href="#TOC1">AUTHOR</a><br>
|
||||
<P>
|
||||
Philip Hazel
|
||||
<br>
|
||||
|
|
@@ -202,11 +213,11 @@ Putting an actual email address here seems to have been a spam magnet, so I've
|
|||
taken it away. If you want to email me, use my two initials, followed by the
|
||||
two digits 10, at the domain cam.ac.uk.
|
||||
</P>
|
||||
<br><a name="SEC5" href="#TOC1">REVISION</a><br>
|
||||
<br><a name="SEC6" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 08 January 2014
|
||||
Last updated: 10 February 2015
|
||||
<br>
|
||||
Copyright © 1997-2014 University of Cambridge.
|
||||
Copyright © 1997-2015 University of Cambridge.
|
||||
<br>
|
||||
<p>
|
||||
Return to the <a href="index.html">PCRE index page</a>.
|
||||
|
|
|
|||
|
|
@@ -1,6 +1,18 @@
|
|||
.TH PCRE 3 "08 January 2014" "PCRE 8.35"
|
||||
.TH PCRE 3 "10 February 2015" "PCRE 8.37"
|
||||
.SH NAME
|
||||
PCRE - Perl-compatible regular expressions
|
||||
PCRE - Perl-compatible regular expressions (original API)
|
||||
.SH "PLEASE TAKE NOTE"
|
||||
.rs
|
||||
.sp
|
||||
This document relates to PCRE releases that use the original API,
|
||||
with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
|
||||
first release of a new API, known as PCRE2, with release numbers starting at
|
||||
10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
|
||||
libraries (now called PCRE1) are still being maintained for bug fixes, but
|
||||
there will be no new development. New projects are advised to use the new PCRE2
|
||||
libraries.
|
||||
.
|
||||
.
|
||||
.SH INTRODUCTION
|
||||
.rs
|
||||
.sp
|
||||
|
|
@@ -213,6 +225,6 @@ two digits 10, at the domain cam.ac.uk.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 08 January 2014
|
||||
Copyright (c) 1997-2014 University of Cambridge.
|
||||
Last updated: 10 February 2015
|
||||
Copyright (c) 1997-2015 University of Cambridge.
|
||||
.fi
|
||||
|
|
|
|||
|
|
@@ -13,7 +13,18 @@ PCRE(3) Library Functions Manual PCRE(3)
|
|||
|
||||
|
||||
NAME
|
||||
PCRE - Perl-compatible regular expressions
|
||||
PCRE - Perl-compatible regular expressions (original API)
|
||||
|
||||
PLEASE TAKE NOTE
|
||||
|
||||
This document relates to PCRE releases that use the original API, with
|
||||
library names libpcre, libpcre16, and libpcre32. January 2015 saw the
|
||||
first release of a new API, known as PCRE2, with release numbers start-
|
||||
ing at 10.00 and library names libpcre2-8, libpcre2-16, and
|
||||
libpcre2-32. The old libraries (now called PCRE1) are still being main-
|
||||
tained for bug fixes, but there will be no new development. New
|
||||
projects are advised to use the new PCRE2 libraries.
|
||||
|
||||
|
||||
INTRODUCTION
|
||||
|
||||
|
|
@@ -179,8 +190,8 @@ AUTHOR
|
|||
|
||||
REVISION
|
||||
|
||||
Last updated: 08 January 2014
|
||||
Copyright (c) 1997-2014 University of Cambridge.
|
||||
Last updated: 10 February 2015
|
||||
Copyright (c) 1997-2015 University of Cambridge.
|
||||
------------------------------------------------------------------------------
|
||||
|
||||
|
||||
|
|
|
|||
|
|
@@ -1704,6 +1704,7 @@ Arguments:
   utf      TRUE in UTF-8 / UTF-16 / UTF-32 mode
   atend    TRUE if called when the pattern is complete
   cd       the "compile data" structure
+  recurses chain of recurse_check to catch mutual recursion
 
 Returns:   the fixed length,
              or -1 if there is no fixed length,
@@ -1713,10 +1714,11 @@ Returns: the fixed length,
 */
 
 static int
-find_fixedlength(pcre_uchar *code, BOOL utf, BOOL atend, compile_data *cd)
+find_fixedlength(pcre_uchar *code, BOOL utf, BOOL atend, compile_data *cd,
+  recurse_check *recurses)
 {
 int length = -1;
-
+recurse_check this_recurse;
 register int branchlength = 0;
 register pcre_uchar *cc = code + 1 + LINK_SIZE;
 
@@ -1741,7 +1743,8 @@ for (;;)
     case OP_ONCE:
     case OP_ONCE_NC:
     case OP_COND:
-    d = find_fixedlength(cc + ((op == OP_CBRA)? IMM2_SIZE : 0), utf, atend, cd);
+    d = find_fixedlength(cc + ((op == OP_CBRA)? IMM2_SIZE : 0), utf, atend, cd,
+      recurses);
     if (d < 0) return d;
     branchlength += d;
     do cc += GET(cc, 1); while (*cc == OP_ALT);
@@ -1775,7 +1778,15 @@ for (;;)
     cs = ce = (pcre_uchar *)cd->start_code + GET(cc, 1);  /* Start subpattern */
     do ce += GET(ce, 1); while (*ce == OP_ALT);           /* End subpattern */
     if (cc > cs && cc < ce) return -1;                    /* Recursion */
-    d = find_fixedlength(cs + IMM2_SIZE, utf, atend, cd);
+    else   /* Check for mutual recursion */
+      {
+      recurse_check *r = recurses;
+      for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
+      if (r != NULL) return -1;   /* Mutual recursion */
+      }
+    this_recurse.prev = recurses;
+    this_recurse.group = cs;
+    d = find_fixedlength(cs + IMM2_SIZE, utf, atend, cd, &this_recurse);
     if (d < 0) return d;
     branchlength += d;
     cc += 1 + LINK_SIZE;
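The find_fixedlength() hunks above thread a chain of recurse_check nodes through the recursion so that a group already being expanded is recognised when it is reached again. The following is a minimal sketch of that idea on a toy model, not PCRE's compiled-pattern format: groups are array indices, each group has its own length and calls at most one other group, and the names `toy_fixedlength`, `own_len`, `calls` and `NO_CALL` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NO_CALL (-1)   /* Hypothetical marker: group calls no other group */

typedef struct recurse_check {
  struct recurse_check *prev;   /* Enclosing expansion, or NULL at the top */
  int group;                    /* Group being expanded at this level */
} recurse_check;

/* Returns the total fixed length of group g, or -1 if expanding it would
   re-enter a group that is already being expanded (direct or mutual
   recursion). */
static int
toy_fixedlength(const int *own_len, const int *calls, int g,
  recurse_check *recurses)
{
recurse_check this_recurse;
recurse_check *r;
int d;

assert(own_len != NULL && calls != NULL);

/* Walk the chain of groups currently being expanded. */
for (r = recurses; r != NULL; r = r->prev)
  if (r->group == g) return -1;

if (calls[g] == NO_CALL) return own_len[g];

/* Push this group; the node lives in this stack frame, so nothing needs
   to be freed when the call returns. */
this_recurse.prev = recurses;
this_recurse.group = g;
d = toy_fixedlength(own_len, calls, calls[g], &this_recurse);
if (d < 0) return d;
return own_len[g] + d;
}
```

Because each node is a local variable linked to its caller's node, the chain always mirrors the current call stack and costs no heap allocation.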
@@ -2129,32 +2140,60 @@ for (;;)
|
|||
{
|
||||
case OP_CHAR:
|
||||
case OP_CHARI:
|
||||
case OP_NOT:
|
||||
case OP_NOTI:
|
||||
case OP_EXACT:
|
||||
case OP_EXACTI:
|
||||
case OP_NOTEXACT:
|
||||
case OP_NOTEXACTI:
|
||||
case OP_UPTO:
|
||||
case OP_UPTOI:
|
||||
case OP_NOTUPTO:
|
||||
case OP_NOTUPTOI:
|
||||
case OP_MINUPTO:
|
||||
case OP_MINUPTOI:
|
||||
case OP_NOTMINUPTO:
|
||||
case OP_NOTMINUPTOI:
|
||||
case OP_POSUPTO:
|
||||
case OP_POSUPTOI:
|
||||
case OP_NOTPOSUPTO:
|
||||
case OP_NOTPOSUPTOI:
|
||||
case OP_STAR:
|
||||
case OP_STARI:
|
||||
case OP_NOTSTAR:
|
||||
case OP_NOTSTARI:
|
||||
case OP_MINSTAR:
|
||||
case OP_MINSTARI:
|
||||
case OP_NOTMINSTAR:
|
||||
case OP_NOTMINSTARI:
|
||||
case OP_POSSTAR:
|
||||
case OP_POSSTARI:
|
||||
case OP_NOTPOSSTAR:
|
||||
case OP_NOTPOSSTARI:
|
||||
case OP_PLUS:
|
||||
case OP_PLUSI:
|
||||
case OP_NOTPLUS:
|
||||
case OP_NOTPLUSI:
|
||||
case OP_MINPLUS:
|
||||
case OP_MINPLUSI:
|
||||
case OP_NOTMINPLUS:
|
||||
case OP_NOTMINPLUSI:
|
||||
case OP_POSPLUS:
|
||||
case OP_POSPLUSI:
|
||||
case OP_NOTPOSPLUS:
|
||||
case OP_NOTPOSPLUSI:
|
||||
case OP_QUERY:
|
||||
case OP_QUERYI:
|
||||
case OP_NOTQUERY:
|
||||
case OP_NOTQUERYI:
|
||||
case OP_MINQUERY:
|
||||
case OP_MINQUERYI:
|
||||
case OP_NOTMINQUERY:
|
||||
case OP_NOTMINQUERYI:
|
||||
case OP_POSQUERY:
|
||||
case OP_POSQUERYI:
|
||||
case OP_NOTPOSQUERY:
|
||||
case OP_NOTPOSQUERYI:
|
||||
if (HAS_EXTRALEN(code[-1])) code += GET_EXTRALEN(code[-1]);
|
||||
break;
|
||||
}
|
||||
|
|
@@ -2334,11 +2373,6 @@ Arguments:
|
|||
Returns: TRUE if what is matched could be empty
|
||||
*/
|
||||
|
||||
typedef struct recurse_check {
|
||||
struct recurse_check *prev;
|
||||
const pcre_uchar *group;
|
||||
} recurse_check;
|
||||
|
||||
static BOOL
|
||||
could_be_empty_branch(const pcre_uchar *code, const pcre_uchar *endcode,
|
||||
BOOL utf, compile_data *cd, recurse_check *recurses)
|
||||
|
|
@@ -2469,8 +2503,8 @@ for (code = first_significant_code(code + PRIV(OP_lengths)[*code], TRUE);
|
|||
empty_branch = FALSE;
|
||||
do
|
||||
{
|
||||
if (!empty_branch && could_be_empty_branch(code, endcode, utf, cd, NULL))
|
||||
empty_branch = TRUE;
|
||||
if (!empty_branch && could_be_empty_branch(code, endcode, utf, cd,
|
||||
recurses)) empty_branch = TRUE;
|
||||
code += GET(code, 1);
|
||||
}
|
||||
while (*code == OP_ALT);
|
||||
|
|
@@ -3065,7 +3099,7 @@ Returns: TRUE if the auto-possessification is possible
 
 static BOOL
 compare_opcodes(const pcre_uchar *code, BOOL utf, const compile_data *cd,
-  const pcre_uint32 *base_list, const pcre_uchar *base_end)
+  const pcre_uint32 *base_list, const pcre_uchar *base_end, int *rec_limit)
 {
 pcre_uchar c;
 pcre_uint32 list[8];
@@ -3082,6 +3116,9 @@ pcre_uint32 chr;
 BOOL accepted, invert_bits;
 BOOL entered_a_group = FALSE;
 
+if (*rec_limit == 0) return FALSE;
+--(*rec_limit);
+
 /* Note: the base_list[1] contains whether the current opcode has greedy
 (represented by a non-zero value) quantifier. This is different from
 other character type lists, which stores here that the character iterator
@@ -3152,7 +3189,8 @@ for(;;)
 
       while (*next_code == OP_ALT)
         {
-        if (!compare_opcodes(code, utf, cd, base_list, base_end)) return FALSE;
+        if (!compare_opcodes(code, utf, cd, base_list, base_end, rec_limit))
+          return FALSE;
         code = next_code + 1 + LINK_SIZE;
         next_code += GET(next_code, 1);
         }
@@ -3172,7 +3210,7 @@ for(;;)
       /* The bracket content will be checked by the
       OP_BRA/OP_CBRA case above. */
       next_code += 1 + LINK_SIZE;
-      if (!compare_opcodes(next_code, utf, cd, base_list, base_end))
+      if (!compare_opcodes(next_code, utf, cd, base_list, base_end, rec_limit))
         return FALSE;
 
       code += PRIV(OP_lengths)[c];
@ -3605,11 +3643,20 @@ register pcre_uchar c;
|
|||
const pcre_uchar *end;
|
||||
pcre_uchar *repeat_opcode;
|
||||
pcre_uint32 list[8];
|
||||
int rec_limit;
|
||||
|
||||
for (;;)
|
||||
{
|
||||
c = *code;
|
||||
|
||||
/* When a pattern with bad UTF-8 encoding is compiled with NO_UTF_CHECK,
|
||||
it may compile without complaining, but may get into a loop here if the code
|
||||
pointer points to a bad value. This is, of course a documentated possibility,
|
||||
when NO_UTF_CHECK is set, so it isn't a bug, but we can detect this case and
|
||||
just give up on this optimization. */
|
||||
|
||||
if (c >= OP_TABLE_LENGTH) return;
|
||||
|
||||
if (c >= OP_STAR && c <= OP_TYPEPOSUPTO)
|
||||
{
|
||||
c -= get_repeat_base(c) - OP_STAR;
|
||||
|
|
@ -3617,7 +3664,8 @@ for (;;)
|
|||
get_chr_property_list(code, utf, cd->fcc, list) : NULL;
|
||||
list[1] = c == OP_STAR || c == OP_PLUS || c == OP_QUERY || c == OP_UPTO;
|
||||
|
||||
if (end != NULL && compare_opcodes(end, utf, cd, list, end))
|
||||
rec_limit = 1000;
|
||||
if (end != NULL && compare_opcodes(end, utf, cd, list, end, &rec_limit))
|
||||
{
|
||||
switch(c)
|
||||
{
|
||||
|
|
@ -3673,7 +3721,8 @@ for (;;)
|
|||
|
||||
list[1] = (c & 1) == 0;
|
||||
|
||||
if (compare_opcodes(end, utf, cd, list, end))
|
||||
rec_limit = 1000;
|
||||
if (compare_opcodes(end, utf, cd, list, end, &rec_limit))
|
||||
{
|
||||
switch (c)
|
||||
{
|
||||
|
|
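The hunks above thread a `rec_limit` counter, initialised to 1000 by the caller, through every `compare_opcodes()` call. A minimal sketch of that pattern follows, assuming the callee decrements the shared counter on entry and gives up when it is exhausted (the body of `compare_opcodes()` is not shown in these hunks). The `node` type and `all_non_negative()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tree node standing in for a compiled pattern item. */
typedef struct node {
  const struct node *left;
  const struct node *right;
  int value;
} node;

/* Returns true if every node holds a non-negative value. Each call on a
non-NULL node spends one unit of *budget; when the budget runs out the function
gives up and returns false, which a caller can treat as "optimization not
applicable". */
static bool all_non_negative(const node *n, int *budget)
{
  if (n == NULL) return true;
  if (--(*budget) <= 0) return false;      /* Too much work: give up */
  if (n->value < 0) return false;
  return all_non_negative(n->left, budget) &&
         all_non_negative(n->right, budget);
}
```

Passing the counter by pointer means all branches of the recursion draw on one shared budget, so the total work is bounded, not just the depth.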
@ -3947,14 +3996,14 @@ Arguments:
|
|||
adjust the amount by which the group is to be moved
|
||||
utf TRUE in UTF-8 / UTF-16 / UTF-32 mode
|
||||
cd contains pointers to tables etc.
|
||||
save_hwm the hwm forward reference pointer at the start of the group
|
||||
save_hwm_offset the hwm forward reference offset at the start of the group
|
||||
|
||||
Returns: nothing
|
||||
*/
|
||||
|
||||
static void
|
||||
adjust_recurse(pcre_uchar *group, int adjust, BOOL utf, compile_data *cd,
|
||||
pcre_uchar *save_hwm)
|
||||
size_t save_hwm_offset)
|
||||
{
|
||||
pcre_uchar *ptr = group;
|
||||
|
||||
|
|
@ -3966,7 +4015,8 @@ while ((ptr = (pcre_uchar *)find_recurse(ptr, utf)) != NULL)
|
|||
/* See if this recursion is on the forward reference list. If so, adjust the
|
||||
reference. */
|
||||
|
||||
for (hc = save_hwm; hc < cd->hwm; hc += LINK_SIZE)
|
||||
for (hc = (pcre_uchar *)cd->start_workspace + save_hwm_offset; hc < cd->hwm;
|
||||
hc += LINK_SIZE)
|
||||
{
|
||||
offset = (int)GET(hc, 0);
|
||||
if (cd->start_code + offset == ptr + 1)
|
||||
|
|
@ -4171,7 +4221,11 @@ if ((options & PCRE_CASELESS) != 0)
|
|||
range. Otherwise, use a recursive call to add the additional range. */
|
||||
|
||||
else if (oc < start && od >= start - 1) start = oc; /* Extend downwards */
|
||||
else if (od > end && oc <= end + 1) end = od; /* Extend upwards */
|
||||
else if (od > end && oc <= end + 1)
|
||||
{
|
||||
end = od; /* Extend upwards */
|
||||
if (end > classbits_end) classbits_end = (end <= 0xff ? end : 0xff);
|
||||
}
|
||||
else n8 += add_to_class(classbits, uchardptr, options, cd, oc, od);
|
||||
}
|
||||
}
|
||||
|
|
@ -4411,7 +4465,7 @@ const pcre_uchar *tempptr;
|
|||
const pcre_uchar *nestptr = NULL;
|
||||
pcre_uchar *previous = NULL;
|
||||
pcre_uchar *previous_callout = NULL;
|
||||
pcre_uchar *save_hwm = NULL;
|
||||
size_t save_hwm_offset = 0;
|
||||
pcre_uint8 classbits[32];
|
||||
|
||||
/* We can fish out the UTF-8 setting once and for all into a BOOL, but we
|
||||
|
|
@ -5470,6 +5524,12 @@ for (;; ptr++)
|
|||
PUT(previous, 1, (int)(code - previous));
|
||||
break; /* End of class handling */
|
||||
}
|
||||
|
||||
/* Even though any XCLASS list is now discarded, we must allow for
|
||||
its memory. */
|
||||
|
||||
if (lengthptr != NULL)
|
||||
*lengthptr += (int)(class_uchardata - class_uchardata_base);
|
||||
#endif
|
||||
|
||||
/* If there are no characters > 255, or they are all to be included or
|
||||
|
|
@ -5870,6 +5930,7 @@ for (;; ptr++)
|
|||
{
|
||||
register int i;
|
||||
int len = (int)(code - previous);
|
||||
size_t base_hwm_offset = save_hwm_offset;
|
||||
pcre_uchar *bralink = NULL;
|
||||
pcre_uchar *brazeroptr = NULL;
|
||||
|
||||
|
|
@ -5924,7 +5985,7 @@ for (;; ptr++)
|
|||
if (repeat_max <= 1) /* Covers 0, 1, and unlimited */
|
||||
{
|
||||
*code = OP_END;
|
||||
adjust_recurse(previous, 1, utf, cd, save_hwm);
|
||||
adjust_recurse(previous, 1, utf, cd, save_hwm_offset);
|
||||
memmove(previous + 1, previous, IN_UCHARS(len));
|
||||
code++;
|
||||
if (repeat_max == 0)
|
||||
|
|
@ -5948,7 +6009,7 @@ for (;; ptr++)
|
|||
{
|
||||
int offset;
|
||||
*code = OP_END;
|
||||
adjust_recurse(previous, 2 + LINK_SIZE, utf, cd, save_hwm);
|
||||
adjust_recurse(previous, 2 + LINK_SIZE, utf, cd, save_hwm_offset);
|
||||
memmove(previous + 2 + LINK_SIZE, previous, IN_UCHARS(len));
|
||||
code += 2 + LINK_SIZE;
|
||||
*previous++ = OP_BRAZERO + repeat_type;
|
||||
|
|
@ -6011,26 +6072,25 @@ for (;; ptr++)
|
|||
for (i = 1; i < repeat_min; i++)
|
||||
{
|
||||
pcre_uchar *hc;
|
||||
pcre_uchar *this_hwm = cd->hwm;
|
||||
size_t this_hwm_offset = cd->hwm - cd->start_workspace;
|
||||
memcpy(code, previous, IN_UCHARS(len));
|
||||
|
||||
while (cd->hwm > cd->start_workspace + cd->workspace_size -
|
||||
WORK_SIZE_SAFETY_MARGIN - (this_hwm - save_hwm))
|
||||
WORK_SIZE_SAFETY_MARGIN -
|
||||
(this_hwm_offset - base_hwm_offset))
|
||||
{
|
||||
size_t save_offset = save_hwm - cd->start_workspace;
|
||||
size_t this_offset = this_hwm - cd->start_workspace;
|
||||
*errorcodeptr = expand_workspace(cd);
|
||||
if (*errorcodeptr != 0) goto FAILED;
|
||||
save_hwm = (pcre_uchar *)cd->start_workspace + save_offset;
|
||||
this_hwm = (pcre_uchar *)cd->start_workspace + this_offset;
|
||||
}
|
||||
|
||||
for (hc = save_hwm; hc < this_hwm; hc += LINK_SIZE)
|
||||
for (hc = (pcre_uchar *)cd->start_workspace + base_hwm_offset;
|
||||
hc < (pcre_uchar *)cd->start_workspace + this_hwm_offset;
|
||||
hc += LINK_SIZE)
|
||||
{
|
||||
PUT(cd->hwm, 0, GET(hc, 0) + len);
|
||||
cd->hwm += LINK_SIZE;
|
||||
}
|
||||
save_hwm = this_hwm;
|
||||
base_hwm_offset = this_hwm_offset;
|
||||
code += len;
|
||||
}
|
||||
}
|
||||
|
|
@ -6075,7 +6135,7 @@ for (;; ptr++)
|
|||
else for (i = repeat_max - 1; i >= 0; i--)
|
||||
{
|
||||
pcre_uchar *hc;
|
||||
pcre_uchar *this_hwm = cd->hwm;
|
||||
size_t this_hwm_offset = cd->hwm - cd->start_workspace;
|
||||
|
||||
*code++ = OP_BRAZERO + repeat_type;
|
||||
|
||||
|
|
@ -6097,22 +6157,21 @@ for (;; ptr++)
|
|||
copying them. */
|
||||
|
||||
while (cd->hwm > cd->start_workspace + cd->workspace_size -
|
||||
WORK_SIZE_SAFETY_MARGIN - (this_hwm - save_hwm))
|
||||
WORK_SIZE_SAFETY_MARGIN -
|
||||
(this_hwm_offset - base_hwm_offset))
|
||||
{
|
||||
size_t save_offset = save_hwm - cd->start_workspace;
|
||||
size_t this_offset = this_hwm - cd->start_workspace;
|
||||
*errorcodeptr = expand_workspace(cd);
|
||||
if (*errorcodeptr != 0) goto FAILED;
|
||||
save_hwm = (pcre_uchar *)cd->start_workspace + save_offset;
|
||||
this_hwm = (pcre_uchar *)cd->start_workspace + this_offset;
|
||||
}
|
||||
|
||||
for (hc = save_hwm; hc < this_hwm; hc += LINK_SIZE)
|
||||
for (hc = (pcre_uchar *)cd->start_workspace + base_hwm_offset;
|
||||
hc < (pcre_uchar *)cd->start_workspace + this_hwm_offset;
|
||||
hc += LINK_SIZE)
|
||||
{
|
||||
PUT(cd->hwm, 0, GET(hc, 0) + len + ((i != 0)? 2+LINK_SIZE : 1));
|
||||
cd->hwm += LINK_SIZE;
|
||||
}
|
||||
save_hwm = this_hwm;
|
||||
base_hwm_offset = this_hwm_offset;
|
||||
code += len;
|
||||
}
|
||||
|
||||
|
|
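The change from `save_hwm` (a pointer) to `save_hwm_offset` (a `size_t`) avoids holding an address into a workspace that `expand_workspace()` may move. A minimal sketch of why an offset survives growth while a raw pointer does not is given below. The `workspace` type and the `ws_*` helpers are hypothetical, and `realloc` is used here for brevity; this is not a claim about how `expand_workspace()` itself allocates.

```c
#include <stdlib.h>

/* Hypothetical growable workspace, standing in for cd->start_workspace. */
typedef struct workspace {
  unsigned char *start;   /* Current base address; may change on growth */
  size_t size;            /* Allocated bytes */
  size_t used;            /* Bytes in use (the "high water mark") */
} workspace;

/* Allocates the initial block. Returns 0 on success, -1 on failure. */
static int ws_init(workspace *ws, size_t size)
{
  ws->start = malloc(size);
  if (ws->start == NULL) return -1;
  ws->size = size;
  ws->used = 0;
  return 0;
}

/* Doubles the workspace. Returns 0 on success, -1 on allocation failure.
Any raw pointer into the old block must be treated as invalid afterwards. */
static int ws_expand(workspace *ws)
{
  size_t newsize = ws->size * 2;
  unsigned char *p = realloc(ws->start, newsize);
  if (p == NULL) return -1;
  ws->start = p;
  ws->size = newsize;
  return 0;
}

/* Appends one byte, growing the workspace first if it is full. */
static int ws_push(workspace *ws, unsigned char b)
{
  if (ws->used == ws->size && ws_expand(ws) != 0) return -1;
  ws->start[ws->used++] = b;
  return 0;
}
```

A position remembered as `ws.used` before further pushes can always be turned back into an address with `ws.start + offset`, whatever the base address has become in the meantime.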
@ -6208,7 +6267,7 @@ for (;; ptr++)
|
|||
{
|
||||
int nlen = (int)(code - bracode);
|
||||
*code = OP_END;
|
||||
adjust_recurse(bracode, 1 + LINK_SIZE, utf, cd, save_hwm);
|
||||
adjust_recurse(bracode, 1 + LINK_SIZE, utf, cd, save_hwm_offset);
|
||||
memmove(bracode + 1 + LINK_SIZE, bracode, IN_UCHARS(nlen));
|
||||
code += 1 + LINK_SIZE;
|
||||
nlen += 1 + LINK_SIZE;
|
||||
|
|
@ -6342,7 +6401,7 @@ for (;; ptr++)
|
|||
else
|
||||
{
|
||||
*code = OP_END;
|
||||
adjust_recurse(tempcode, 1 + LINK_SIZE, utf, cd, save_hwm);
|
||||
adjust_recurse(tempcode, 1 + LINK_SIZE, utf, cd, save_hwm_offset);
|
||||
memmove(tempcode + 1 + LINK_SIZE, tempcode, IN_UCHARS(len));
|
||||
code += 1 + LINK_SIZE;
|
||||
len += 1 + LINK_SIZE;
|
||||
|
|
@ -6391,7 +6450,7 @@ for (;; ptr++)
|
|||
|
||||
default:
|
||||
*code = OP_END;
|
||||
adjust_recurse(tempcode, 1 + LINK_SIZE, utf, cd, save_hwm);
|
||||
adjust_recurse(tempcode, 1 + LINK_SIZE, utf, cd, save_hwm_offset);
|
||||
memmove(tempcode + 1 + LINK_SIZE, tempcode, IN_UCHARS(len));
|
||||
code += 1 + LINK_SIZE;
|
||||
len += 1 + LINK_SIZE;
|
||||
|
|
@ -6420,15 +6479,25 @@ for (;; ptr++)
|
|||
parenthesis forms. */
|
||||
|
||||
case CHAR_LEFT_PARENTHESIS:
|
||||
newoptions = options;
|
||||
skipbytes = 0;
|
||||
bravalue = OP_CBRA;
|
||||
save_hwm = cd->hwm;
|
||||
reset_bracount = FALSE;
|
||||
|
||||
/* First deal with various "verbs" that can be introduced by '*'. */
|
||||
|
||||
ptr++;
|
||||
|
||||
/* First deal with comments. Putting this code right at the start ensures
|
||||
that comments have no bad side effects. */
|
||||
|
||||
if (ptr[0] == CHAR_QUESTION_MARK && ptr[1] == CHAR_NUMBER_SIGN)
|
||||
{
|
||||
ptr += 2;
|
||||
while (*ptr != CHAR_NULL && *ptr != CHAR_RIGHT_PARENTHESIS) ptr++;
|
||||
if (*ptr == CHAR_NULL)
|
||||
{
|
||||
*errorcodeptr = ERR18;
|
||||
goto FAILED;
|
||||
}
|
||||
continue;
|
||||
}
|
||||
|
||||
/* Now deal with various "verbs" that can be introduced by '*'. */
|
||||
|
||||
if (ptr[0] == CHAR_ASTERISK && (ptr[1] == ':'
|
||||
|| (MAX_255(ptr[1]) && ((cd->ctypes[ptr[1]] & ctype_letter) != 0))))
|
||||
{
|
||||
|
|
@ -6549,10 +6618,18 @@ for (;; ptr++)
|
|||
goto FAILED;
|
||||
}
|
||||
|
||||
/* Initialize for "real" parentheses */
|
||||
|
||||
newoptions = options;
|
||||
skipbytes = 0;
|
||||
bravalue = OP_CBRA;
|
||||
save_hwm_offset = cd->hwm - cd->start_workspace;
|
||||
reset_bracount = FALSE;
|
||||
|
||||
/* Deal with the extended parentheses; all are introduced by '?', and the
|
||||
appearance of any of them means that this is not a capturing group. */
|
||||
|
||||
else if (*ptr == CHAR_QUESTION_MARK)
|
||||
if (*ptr == CHAR_QUESTION_MARK)
|
||||
{
|
||||
int i, set, unset, namelen;
|
||||
int *optset;
|
||||
|
|
@ -6561,17 +6638,6 @@ for (;; ptr++)
|
|||
|
||||
switch (*(++ptr))
|
||||
{
|
||||
case CHAR_NUMBER_SIGN: /* Comment; skip to ket */
|
||||
ptr++;
|
||||
while (*ptr != CHAR_NULL && *ptr != CHAR_RIGHT_PARENTHESIS) ptr++;
|
||||
if (*ptr == CHAR_NULL)
|
||||
{
|
||||
*errorcodeptr = ERR18;
|
||||
goto FAILED;
|
||||
}
|
||||
continue;
|
||||
|
||||
|
||||
/* ------------------------------------------------------------ */
|
||||
case CHAR_VERTICAL_LINE: /* Reset capture count for each branch */
|
||||
reset_bracount = TRUE;
|
||||
|
|
@ -6620,8 +6686,13 @@ for (;; ptr++)
|
|||
if (tempptr[1] == CHAR_QUESTION_MARK &&
|
||||
(tempptr[2] == CHAR_EQUALS_SIGN ||
|
||||
tempptr[2] == CHAR_EXCLAMATION_MARK ||
|
||||
tempptr[2] == CHAR_LESS_THAN_SIGN))
|
||||
(tempptr[2] == CHAR_LESS_THAN_SIGN &&
|
||||
(tempptr[3] == CHAR_EQUALS_SIGN ||
|
||||
tempptr[3] == CHAR_EXCLAMATION_MARK))))
|
||||
{
|
||||
cd->iscondassert = TRUE;
|
||||
break;
|
||||
}
|
||||
|
||||
/* Other conditions use OP_CREF/OP_DNCREF/OP_RREF/OP_DNRREF, and all
|
||||
need to skip at least 1+IMM2_SIZE bytes at the start of the group. */
|
||||
|
|
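The hunk above tightens the test for an assertion used as a condition: `(?<` alone no longer qualifies, it must be followed by `=` or `!`. A small sketch of the same check on a plain `char` string is shown here; `starts_assertion()` is a hypothetical helper, not a PCRE function.

```c
#include <stdbool.h>

/* p points at the '(' that opens the condition of a conditional group.
Returns true only for (?=, (?!, (?<= and (?<!. A plain "(?<" is not enough,
because (?<name>...) opens a named group rather than a lookbehind. The checks
are ordered so that no byte beyond a terminating NUL is read. */
static bool starts_assertion(const char *p)
{
  if (p[0] != '(' || p[1] != '?') return false;
  if (p[2] == '=' || p[2] == '!') return true;
  return p[2] == '<' && (p[3] == '=' || p[3] == '!');
}
```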
@ -6698,8 +6769,7 @@ for (;; ptr++)
|
|||
ptr++;
|
||||
}
|
||||
namelen = (int)(ptr - name);
|
||||
if (lengthptr != NULL && (options & PCRE_DUPNAMES) != 0)
|
||||
*lengthptr += IMM2_SIZE;
|
||||
if (lengthptr != NULL) *lengthptr += IMM2_SIZE;
|
||||
}
|
||||
|
||||
/* Check the terminator */
|
||||
|
|
@ -6735,6 +6805,7 @@ for (;; ptr++)
|
|||
goto FAILED;
|
||||
}
|
||||
PUT2(code, 2+LINK_SIZE, recno);
|
||||
if (recno > cd->top_backref) cd->top_backref = recno;
|
||||
break;
|
||||
}
|
||||
|
||||
|
|
@ -6757,6 +6828,7 @@ for (;; ptr++)
|
|||
int offset = i++;
|
||||
int count = 1;
|
||||
recno = GET2(slot, 0); /* Number from first found */
|
||||
if (recno > cd->top_backref) cd->top_backref = recno;
|
||||
for (; i < cd->names_found; i++)
|
||||
{
|
||||
slot += cd->name_entry_size;
|
||||
|
|
@ -7114,11 +7186,11 @@ for (;; ptr++)
|
|||
|
||||
if (!is_recurse) cd->namedrefcount++;
|
||||
|
||||
/* If duplicate names are permitted, we have to allow for a named
|
||||
reference to a duplicated name (this cannot be determined until the
|
||||
second pass). This needs an extra 16-bit data item. */
|
||||
/* We have to allow for a named reference to a duplicated name (this
|
||||
cannot be determined until the second pass). This needs an extra
|
||||
16-bit data item. */
|
||||
|
||||
if ((options & PCRE_DUPNAMES) != 0) *lengthptr += IMM2_SIZE;
|
||||
*lengthptr += IMM2_SIZE;
|
||||
}
|
||||
|
||||
/* In the real compile, search the name table. We check the name
|
||||
|
|
@ -7475,12 +7547,22 @@ for (;; ptr++)
|
|||
goto FAILED;
|
||||
}
|
||||
|
||||
/* Assertions used not to be repeatable, but this was changed for Perl
|
||||
compatibility, so all kinds can now be repeated. We copy code into a
|
||||
/* All assertions used not to be repeatable, but this was changed for Perl
|
||||
compatibility. All kinds can now be repeated except for assertions that are
|
||||
conditions (Perl also forbids these to be repeated). We copy code into a
|
||||
non-register variable (tempcode) in order to be able to pass its address
|
||||
because some compilers complain otherwise. */
|
||||
because some compilers complain otherwise. At the start of a conditional
|
||||
group whose condition is an assertion, cd->iscondassert is set. We unset it
|
||||
here so as to allow assertions later in the group to be quantified. */
|
||||
|
||||
if (bravalue >= OP_ASSERT && bravalue <= OP_ASSERTBACK_NOT &&
|
||||
cd->iscondassert)
|
||||
{
|
||||
previous = NULL;
|
||||
cd->iscondassert = FALSE;
|
||||
}
|
||||
else previous = code;
|
||||
|
||||
previous = code; /* For handling repetition */
|
||||
*code = bravalue;
|
||||
tempcode = code;
|
||||
tempreqvary = cd->req_varyopt; /* Save value before bracket */
|
||||
|
|
@ -7727,7 +7809,7 @@ for (;; ptr++)
|
|||
const pcre_uchar *p;
|
||||
pcre_uint32 cf;
|
||||
|
||||
save_hwm = cd->hwm; /* Normally this is set when '(' is read */
|
||||
save_hwm_offset = cd->hwm - cd->start_workspace; /* Normally this is set when '(' is read */
|
||||
terminator = (*(++ptr) == CHAR_LESS_THAN_SIGN)?
|
||||
CHAR_GREATER_THAN_SIGN : CHAR_APOSTROPHE;
|
||||
|
||||
|
|
@ -8054,6 +8136,7 @@ int length;
|
|||
unsigned int orig_bracount;
|
||||
unsigned int max_bracount;
|
||||
branch_chain bc;
|
||||
size_t save_hwm_offset;
|
||||
|
||||
/* If set, call the external function that checks for stack availability. */
|
||||
|
||||
|
|
@ -8071,6 +8154,8 @@ bc.current_branch = code;
|
|||
firstchar = reqchar = 0;
|
||||
firstcharflags = reqcharflags = REQ_UNSET;
|
||||
|
||||
save_hwm_offset = cd->hwm - cd->start_workspace;
|
||||
|
||||
/* Accumulate the length for use in the pre-compile phase. Start with the
|
||||
length of the BRA and KET and any extra bytes that are required at the
|
||||
beginning. We accumulate in a local variable to save frequent testing of
|
||||
|
|
@ -8212,7 +8297,7 @@ for (;;)
|
|||
int fixed_length;
|
||||
*code = OP_END;
|
||||
fixed_length = find_fixedlength(last_branch, (options & PCRE_UTF8) != 0,
|
||||
FALSE, cd);
|
||||
FALSE, cd, NULL);
|
||||
DPRINTF(("fixed length = %d\n", fixed_length));
|
||||
if (fixed_length == -3)
|
||||
{
|
||||
|
|
@ -8273,7 +8358,7 @@ for (;;)
|
|||
{
|
||||
*code = OP_END;
|
||||
adjust_recurse(start_bracket, 1 + LINK_SIZE,
|
||||
(options & PCRE_UTF8) != 0, cd, cd->hwm);
|
||||
(options & PCRE_UTF8) != 0, cd, save_hwm_offset);
|
||||
memmove(start_bracket + 1 + LINK_SIZE, start_bracket,
|
||||
IN_UCHARS(code - start_bracket));
|
||||
*start_bracket = OP_ONCE;
|
||||
|
|
@ -8497,6 +8582,7 @@ do {
|
|||
case OP_RREF:
|
||||
case OP_DNRREF:
|
||||
case OP_DEF:
|
||||
case OP_FAIL:
|
||||
return FALSE;
|
||||
|
||||
default: /* Assertion */
|
||||
|
|
@ -9081,6 +9167,7 @@ cd->dupnames = FALSE;
|
|||
cd->namedrefcount = 0;
|
||||
cd->start_code = cworkspace;
|
||||
cd->hwm = cworkspace;
|
||||
cd->iscondassert = FALSE;
|
||||
cd->start_workspace = cworkspace;
|
||||
cd->workspace_size = COMPILE_WORK_SIZE;
|
||||
cd->named_groups = named_groups;
|
||||
|
|
@ -9118,13 +9205,6 @@ if (length > MAX_PATTERN_SIZE)
|
|||
goto PCRE_EARLY_ERROR_RETURN;
|
||||
}
|
||||
|
||||
/* If there are groups with duplicate names and there are also references by
|
||||
name, we must allow for the possibility of named references to duplicated
|
||||
groups. These require an extra data item each. */
|
||||
|
||||
if (cd->dupnames && cd->namedrefcount > 0)
|
||||
length += cd->namedrefcount * IMM2_SIZE * sizeof(pcre_uchar);
|
||||
|
||||
/* Compute the size of the data block for storing the compiled pattern. Integer
|
||||
overflow should no longer be possible because nowadays we limit the maximum
|
||||
value of cd->names_found and cd->name_entry_size. */
|
||||
|
|
@ -9183,6 +9263,7 @@ cd->name_table = (pcre_uchar *)re + re->name_table_offset;
|
|||
codestart = cd->name_table + re->name_entry_size * re->name_count;
|
||||
cd->start_code = codestart;
|
||||
cd->hwm = (pcre_uchar *)(cd->start_workspace);
|
||||
cd->iscondassert = FALSE;
|
||||
cd->req_varyopt = 0;
|
||||
cd->had_accept = FALSE;
|
||||
cd->had_pruneorskip = FALSE;
|
||||
|
|
@ -9319,7 +9400,7 @@ if (cd->check_lookbehind)
|
|||
int end_op = *be;
|
||||
*be = OP_END;
|
||||
fixed_length = find_fixedlength(cc, (re->options & PCRE_UTF8) != 0, TRUE,
|
||||
cd);
|
||||
cd, NULL);
|
||||
*be = end_op;
|
||||
DPRINTF(("fixed length = %d\n", fixed_length));
|
||||
if (fixed_length < 0)
|
||||
|
|
|
|||
|
|
@ -2736,9 +2736,10 @@ for (;;)
|
|||
condcode == OP_DNRREF)
|
||||
return PCRE_ERROR_DFA_UCOND;
|
||||
|
||||
/* The DEFINE condition is always false */
|
||||
/* The DEFINE condition is always false, and the assertion (?!) is
|
||||
converted to OP_FAIL. */
|
||||
|
||||
if (condcode == OP_DEF)
|
||||
if (condcode == OP_DEF || condcode == OP_FAIL)
|
||||
{ ADD_ACTIVE(state_offset + codelink + LINK_SIZE + 1, 0); }
|
||||
|
||||
/* The only supported version of OP_RREF is for the value RREF_ANY,
|
||||
|
|
|
|||
178
pcre/pcre_exec.c
|
|
@ -1136,93 +1136,81 @@ for (;;)
|
|||
printf("\n");
|
||||
#endif
|
||||
|
||||
if (offset < md->offset_max)
|
||||
if (offset >= md->offset_max) goto POSSESSIVE_NON_CAPTURE;
|
||||
|
||||
matched_once = FALSE;
|
||||
code_offset = (int)(ecode - md->start_code);
|
||||
|
||||
save_offset1 = md->offset_vector[offset];
|
||||
save_offset2 = md->offset_vector[offset+1];
|
||||
save_offset3 = md->offset_vector[md->offset_end - number];
|
||||
save_capture_last = md->capture_last;
|
||||
|
||||
DPRINTF(("saving %d %d %d\n", save_offset1, save_offset2, save_offset3));
|
||||
|
||||
/* Each time round the loop, save the current subject position for use
|
||||
when the group matches. For MATCH_MATCH, the group has matched, so we
|
||||
restart it with a new subject starting position, remembering that we had
|
||||
at least one match. For MATCH_NOMATCH, carry on with the alternatives, as
|
||||
usual. If we haven't matched any alternatives in any iteration, check to
|
||||
see if a previous iteration matched. If so, the group has matched;
|
||||
continue from afterwards. Otherwise it has failed; restore the previous
|
||||
capture values before returning NOMATCH. */
|
||||
|
||||
for (;;)
|
||||
{
|
||||
matched_once = FALSE;
|
||||
code_offset = (int)(ecode - md->start_code);
|
||||
|
||||
save_offset1 = md->offset_vector[offset];
|
||||
save_offset2 = md->offset_vector[offset+1];
|
||||
save_offset3 = md->offset_vector[md->offset_end - number];
|
||||
save_capture_last = md->capture_last;
|
||||
|
||||
DPRINTF(("saving %d %d %d\n", save_offset1, save_offset2, save_offset3));
|
||||
|
||||
/* Each time round the loop, save the current subject position for use
|
||||
when the group matches. For MATCH_MATCH, the group has matched, so we
|
||||
restart it with a new subject starting position, remembering that we had
|
||||
at least one match. For MATCH_NOMATCH, carry on with the alternatives, as
|
||||
usual. If we haven't matched any alternatives in any iteration, check to
|
||||
see if a previous iteration matched. If so, the group has matched;
|
||||
continue from afterwards. Otherwise it has failed; restore the previous
|
||||
capture values before returning NOMATCH. */
|
||||
|
||||
for (;;)
|
||||
md->offset_vector[md->offset_end - number] =
|
||||
(int)(eptr - md->start_subject);
|
||||
if (op >= OP_SBRA) md->match_function_type = MATCH_CBEGROUP;
|
||||
RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md,
|
||||
eptrb, RM63);
|
||||
if (rrc == MATCH_KETRPOS)
|
||||
{
|
||||
md->offset_vector[md->offset_end - number] =
|
||||
(int)(eptr - md->start_subject);
|
||||
if (op >= OP_SBRA) md->match_function_type = MATCH_CBEGROUP;
|
||||
RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md,
|
||||
eptrb, RM63);
|
||||
if (rrc == MATCH_KETRPOS)
|
||||
offset_top = md->end_offset_top;
|
||||
ecode = md->start_code + code_offset;
|
||||
save_capture_last = md->capture_last;
|
||||
matched_once = TRUE;
|
||||
mstart = md->start_match_ptr; /* In case \K changed it */
|
||||
if (eptr == md->end_match_ptr) /* Matched an empty string */
|
||||
{
|
||||
offset_top = md->end_offset_top;
|
||||
ecode = md->start_code + code_offset;
|
||||
save_capture_last = md->capture_last;
|
||||
matched_once = TRUE;
|
||||
mstart = md->start_match_ptr; /* In case \K changed it */
|
||||
if (eptr == md->end_match_ptr) /* Matched an empty string */
|
||||
{
|
||||
do ecode += GET(ecode, 1); while (*ecode == OP_ALT);
|
||||
break;
|
||||
}
|
||||
eptr = md->end_match_ptr;
|
||||
continue;
|
||||
do ecode += GET(ecode, 1); while (*ecode == OP_ALT);
|
||||
break;
|
||||
}
|
||||
|
||||
/* See comment in the code for capturing groups above about handling
|
||||
THEN. */
|
||||
|
||||
if (rrc == MATCH_THEN)
|
||||
{
|
||||
next = ecode + GET(ecode,1);
|
||||
if (md->start_match_ptr < next &&
|
||||
(*ecode == OP_ALT || *next == OP_ALT))
|
||||
rrc = MATCH_NOMATCH;
|
||||
}
|
||||
|
||||
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
|
||||
md->capture_last = save_capture_last;
|
||||
ecode += GET(ecode, 1);
|
||||
if (*ecode != OP_ALT) break;
|
||||
eptr = md->end_match_ptr;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (!matched_once)
|
||||
/* See comment in the code for capturing groups above about handling
|
||||
THEN. */
|
||||
|
||||
if (rrc == MATCH_THEN)
|
||||
{
|
||||
md->offset_vector[offset] = save_offset1;
|
||||
md->offset_vector[offset+1] = save_offset2;
|
||||
md->offset_vector[md->offset_end - number] = save_offset3;
|
||||
next = ecode + GET(ecode,1);
|
||||
if (md->start_match_ptr < next &&
|
||||
(*ecode == OP_ALT || *next == OP_ALT))
|
||||
rrc = MATCH_NOMATCH;
|
||||
}
|
||||
|
||||
if (allow_zero || matched_once)
|
||||
{
|
||||
ecode += 1 + LINK_SIZE;
|
||||
break;
|
||||
}
|
||||
|
||||
RRETURN(MATCH_NOMATCH);
|
||||
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
|
||||
md->capture_last = save_capture_last;
|
||||
ecode += GET(ecode, 1);
|
||||
if (*ecode != OP_ALT) break;
|
||||
}
|
||||
|
||||
/* FALL THROUGH ... Insufficient room for saving captured contents. Treat
|
||||
as a non-capturing bracket. */
|
||||
if (!matched_once)
|
||||
{
|
||||
md->offset_vector[offset] = save_offset1;
|
||||
md->offset_vector[offset+1] = save_offset2;
|
||||
md->offset_vector[md->offset_end - number] = save_offset3;
|
||||
}
|
||||
|
||||
/* VVVVVVVVVVVVVVVVVVVVVVVVV */
|
||||
/* VVVVVVVVVVVVVVVVVVVVVVVVV */
|
||||
if (allow_zero || matched_once)
|
||||
{
|
||||
ecode += 1 + LINK_SIZE;
|
||||
break;
|
||||
}
|
||||
|
||||
DPRINTF(("insufficient capture room: treat as non-capturing\n"));
|
||||
|
||||
/* VVVVVVVVVVVVVVVVVVVVVVVVV */
|
||||
/* VVVVVVVVVVVVVVVVVVVVVVVVV */
|
||||
RRETURN(MATCH_NOMATCH);
|
||||
|
||||
/* Non-capturing possessive bracket with unlimited repeat. We come here
|
||||
from BRAZERO with allow_zero = TRUE. The code is similar to the above,
|
||||
|
|
@ -1388,6 +1376,7 @@ for (;;)
|
|||
break;
|
||||
|
||||
case OP_DEF: /* DEFINE - always false */
|
||||
case OP_FAIL: /* From optimized (?!) condition */
|
||||
break;
|
||||
|
||||
/* The condition is an assertion. Call match() to evaluate it - setting
|
||||
|
|
@ -1404,8 +1393,11 @@ for (;;)
|
|||
condition = TRUE;
|
||||
|
||||
/* Advance ecode past the assertion to the start of the first branch,
|
||||
but adjust it so that the general choosing code below works. */
|
||||
but adjust it so that the general choosing code below works. If the
|
||||
assertion has a quantifier that allows zero repeats we must skip over
|
||||
the BRAZERO. This is a lunatic thing to do, but somebody did! */
|
||||
|
||||
if (*ecode == OP_BRAZERO) ecode++;
|
||||
ecode += GET(ecode, 1);
|
||||
while (*ecode == OP_ALT) ecode += GET(ecode, 1);
|
||||
ecode += 1 + LINK_SIZE - PRIV(OP_lengths)[condcode];
|
||||
|
|
@ -1474,7 +1466,18 @@ for (;;)
|
|||
md->offset_vector[offset] =
|
||||
md->offset_vector[md->offset_end - number];
|
||||
md->offset_vector[offset+1] = (int)(eptr - md->start_subject);
|
||||
if (offset_top <= offset) offset_top = offset + 2;
|
||||
|
||||
/* If this group is at or above the current highwater mark, ensure that
|
||||
any groups between the current high water mark and this group are marked
|
||||
unset and then update the high water mark. */
|
||||
|
||||
if (offset >= offset_top)
|
||||
{
|
||||
register int *iptr = md->offset_vector + offset_top;
|
||||
register int *iend = md->offset_vector + offset;
|
||||
while (iptr < iend) *iptr++ = -1;
|
||||
offset_top = offset + 2;
|
||||
}
|
||||
}
|
||||
ecode += 1 + IMM2_SIZE;
|
||||
break;
|
||||
|
|
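The hunk above implements ChangeLog item 1: when a group at or above the high water mark is closed, the slots between the old mark and that group are set to -1. A standalone sketch of the same bookkeeping on a bare `int` array follows; `set_capture()` is a hypothetical helper, and the layout (group n in slots 2n and 2n+1) follows the ovector convention.

```c
/* Record capture group `number` (1-based) as [start, end) in ovector, where
group n occupies slots 2n and 2n+1. *offset_top is the first slot index not yet
known to be valid. Slots that are skipped over are set to -1 ("unset") so that
stale values are never reported as captures. */
static void set_capture(int *ovector, int *offset_top, int number,
  int start, int end)
{
  int offset = number * 2;
  ovector[offset] = start;
  ovector[offset + 1] = end;
  if (offset >= *offset_top)
    {
    int i;
    for (i = *offset_top; i < offset; i++) ovector[i] = -1;
    *offset_top = offset + 2;
    }
}
```

For example, with only slots 0 and 1 valid, closing group 2 leaves group 1's pair as -1, -1 instead of whatever the array held before.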
@ -1826,7 +1829,11 @@ for (;;)
|
|||
are defined in a range that can be tested for. */
|
||||
|
||||
if (rrc >= MATCH_BACKTRACK_MIN && rrc <= MATCH_BACKTRACK_MAX)
|
||||
{
|
||||
if (new_recursive.offset_save != stacksave)
|
||||
(PUBL(free))(new_recursive.offset_save);
|
||||
RRETURN(MATCH_NOMATCH);
|
||||
}
|
||||
|
||||
/* Any return code other than NOMATCH is an error. */
|
||||
|
||||
|
|
@ -3476,7 +3483,7 @@ for (;;)
|
|||
if (possessive) continue; /* No backtracking */
|
||||
for(;;)
|
||||
{
|
||||
if (eptr == pp) goto TAIL_RECURSE;
|
||||
if (eptr <= pp) goto TAIL_RECURSE;
|
||||
RMATCH(eptr, ecode, offset_top, md, eptrb, RM23);
|
||||
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
|
||||
#ifdef SUPPORT_UCP
|
||||
|
|
@ -3897,7 +3904,7 @@ for (;;)
|
|||
if (possessive) continue; /* No backtracking */
|
||||
for(;;)
|
||||
{
|
||||
if (eptr == pp) goto TAIL_RECURSE;
|
||||
if (eptr <= pp) goto TAIL_RECURSE;
|
||||
RMATCH(eptr, ecode, offset_top, md, eptrb, RM30);
|
||||
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
|
||||
eptr--;
|
||||
|
|
@ -4032,7 +4039,7 @@ for (;;)
|
|||
if (possessive) continue; /* No backtracking */
|
||||
for(;;)
|
||||
{
|
||||
if (eptr == pp) goto TAIL_RECURSE;
|
||||
if (eptr <= pp) goto TAIL_RECURSE;
|
||||
RMATCH(eptr, ecode, offset_top, md, eptrb, RM34);
|
||||
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
|
||||
eptr--;
|
||||
|
|
@ -5603,7 +5610,7 @@ for (;;)
|
|||
if (possessive) continue; /* No backtracking */
|
||||
for(;;)
|
||||
{
|
||||
if (eptr == pp) goto TAIL_RECURSE;
|
||||
if (eptr <= pp) goto TAIL_RECURSE;
|
||||
RMATCH(eptr, ecode, offset_top, md, eptrb, RM44);
|
||||
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
|
||||
eptr--;
|
||||
|
|
@ -5645,12 +5652,17 @@ for (;;)
|
|||
|
||||
if (possessive) continue; /* No backtracking */
|
||||
|
||||
/* We use <= pp rather than == pp to detect the start of the run while
|
||||
backtracking because the use of \C in UTF mode can cause BACKCHAR to
|
||||
move back past pp. This is just palliative; the use of \C in UTF mode
|
||||
is fraught with danger. */
|
||||
|
||||
for(;;)
|
||||
{
|
||||
int lgb, rgb;
|
||||
PCRE_PUCHAR fptr;
|
||||
|
||||
if (eptr == pp) goto TAIL_RECURSE; /* At start of char run */
|
||||
if (eptr <= pp) goto TAIL_RECURSE; /* At start of char run */
|
||||
RMATCH(eptr, ecode, offset_top, md, eptrb, RM45);
|
||||
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
|
||||
|
||||
|
|
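The comment in the hunk above explains why the loop guard became `eptr <= pp`: after `\C` has matched a single byte of a multi-byte character, stepping back a whole character can land before `pp`. A minimal sketch reproducing that situation with UTF-8 bytes is shown here; `back_char()` and `backtrack_steps()` are hypothetical stand-ins for `BACKCHAR` and the backtracking loop.

```c
/* Move p back to the first byte of the UTF-8 character containing it, by
skipping continuation bytes of the form 10xxxxxx. The caller must ensure that a
non-continuation byte precedes p within the same buffer. */
static const unsigned char *back_char(const unsigned char *p)
{
  while ((*p & 0xc0) == 0x80) p--;
  return p;
}

/* Count how many character-sized steps back from eptr are taken before the
start of the run, pp, is reached or passed. With "==" in place of "<=" the loop
would not stop if pp pointed into the middle of a character. */
static int backtrack_steps(const unsigned char *pp, const unsigned char *eptr)
{
  int steps = 0;
  for (;;)
    {
    if (eptr <= pp) return steps;    /* At or before start of char run */
    eptr--;
    eptr = back_char(eptr);
    steps++;
    }
}
```

With the bytes `0xc3 0xa9 'x'` and `pp` pointing at the continuation byte `0xa9`, the second step back jumps from `'x'` straight to `0xc3`, one byte before `pp`.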
@ -5668,7 +5680,7 @@ for (;;)
|
|||
|
||||
for (;;)
|
||||
{
|
||||
if (eptr == pp) goto TAIL_RECURSE; /* At start of char run */
|
||||
if (eptr <= pp) goto TAIL_RECURSE; /* At start of char run */
|
||||
fptr = eptr - 1;
|
||||
if (!utf) c = *fptr; else
|
||||
{
|
||||
|
|
@ -5918,7 +5930,7 @@ for (;;)
|
|||
if (possessive) continue; /* No backtracking */
|
||||
for(;;)
|
||||
{
|
||||
if (eptr == pp) goto TAIL_RECURSE;
|
||||
if (eptr <= pp) goto TAIL_RECURSE;
|
||||
RMATCH(eptr, ecode, offset_top, md, eptrb, RM46);
|
||||
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
|
||||
eptr--;
|
||||
|
|
|
|||
|
|
@ -2446,6 +2446,7 @@ typedef struct compile_data {
|
|||
BOOL had_pruneorskip; /* (*PRUNE) or (*SKIP) encountered */
|
||||
BOOL check_lookbehind; /* Lookbehinds need later checking */
|
||||
BOOL dupnames; /* Duplicate names exist */
|
||||
BOOL iscondassert; /* Next assert is a condition */
|
||||
int nltype; /* Newline type */
|
||||
int nllen; /* Newline string length */
|
||||
pcre_uchar nl[4]; /* Newline string when fixed length */
|
||||
|
|
@ -2459,6 +2460,13 @@ typedef struct branch_chain {
|
|||
pcre_uchar *current_branch;
|
||||
} branch_chain;
|
||||
|
||||
/* Structure for mutual recursion detection. */
|
||||
|
||||
typedef struct recurse_check {
|
||||
struct recurse_check *prev;
|
||||
const pcre_uchar *group;
|
||||
} recurse_check;
|
||||
|
||||
/* Structure for items in a linked list that represents an explicit recursive
|
||||
call within the pattern; used by pcre_exec(). */
|
||||
|
||||
|
|
|
|||
File diff suppressed because it is too large
|
|
@ -51,8 +51,6 @@ POSSIBILITY OF SUCH DAMAGE.
|
|||
|
||||
#include "pcre_internal.h"
|
||||
|
||||
#define PCRE_BUG 0x80000000
|
||||
|
||||
/*
|
||||
Letter characters:
|
||||
\xe6\x92\xad = 0x64ad = 25773 (kanji)
|
||||
|
|
@ -69,6 +67,9 @@ POSSIBILITY OF SUCH DAMAGE.
|
|||
\xc3\x89 = 0xc9 = 201 (E')
|
||||
\xc3\xa1 = 0xe1 = 225 (a')
|
||||
\xc3\x81 = 0xc1 = 193 (A')
|
||||
\x53 = 0x53 = S
|
||||
\x73 = 0x73 = s
|
||||
\xc5\xbf = 0x17f = 383 (long S)
|
||||
\xc8\xba = 0x23a = 570
|
||||
\xe2\xb1\xa5 = 0x2c65 = 11365
|
||||
\xe1\xbd\xb8 = 0x1f78 = 8056
|
||||
|
|
@ -78,6 +79,10 @@ POSSIBILITY OF SUCH DAMAGE.
|
|||
\xc7\x84 = 0x1c4 = 452
|
||||
\xc7\x85 = 0x1c5 = 453
|
||||
\xc7\x86 = 0x1c6 = 454
|
||||
Caseless sets:
|
||||
ucp_Armenian - \x{531}-\x{556} -> \x{561}-\x{586}
|
||||
ucp_Coptic - \x{2c80}-\x{2ce3} -> caseless: XOR 0x1
|
||||
ucp_Latin - \x{ff21}-\x{ff3a} -> \x{ff41}-\x{ff5a}
|
||||
|
||||
Mark property:
|
||||
\xcc\x8d = 0x30d = 781
|
||||
|
|
@ -626,6 +631,9 @@ static struct regression_test_case regression_test_cases[] = {
|
|||
{ MUA, 0, "(?P<Name>a)?(?P<Name2>b)?(?(Name)c|d)+?dd", "bcabcacdb bdddd" },
|
||||
{ MUA, 0, "(?P<Name>a)?(?P<Name2>b)?(?(Name)c|d)+l", "ababccddabdbccd abcccl" },
|
||||
{ MUA, 0, "((?:a|aa)(?(1)aaa))x", "aax" },
|
||||
{ MUA, 0, "(?(?!)a|b)", "ab" },
|
||||
{ MUA, 0, "(?(?!)a)", "ab" },
|
||||
{ MUA, 0 | F_NOMATCH, "(?(?!)a|b)", "ac" },
|
||||
|
||||
/* Set start of match. */
|
||||
{ MUA, 0, "(?:\\Ka)*aaaab", "aaaaaaaa aaaaaaabb" },
|
||||
|
|
@ -944,7 +952,7 @@ static void setstack16(pcre16_extra *extra)
|
|||
|
||||
pcre16_assign_jit_stack(extra, callback16, getstack16());
|
||||
}
|
||||
#endif /* SUPPORT_PCRE8 */
|
||||
#endif /* SUPPORT_PCRE16 */
|
||||
|
||||
#ifdef SUPPORT_PCRE32
|
||||
static pcre32_jit_stack *stack32;
|
||||
|
|
@ -967,7 +975,7 @@ static void setstack32(pcre32_extra *extra)
|
|||
|
||||
pcre32_assign_jit_stack(extra, callback32, getstack32());
|
||||
}
|
||||
#endif /* SUPPORT_PCRE8 */
|
||||
#endif /* SUPPORT_PCRE32 */
|
||||
|
||||
#ifdef SUPPORT_PCRE16
|
||||
|
||||
|
|
@ -1177,7 +1185,7 @@ static int regression_tests(void)
|
|||
#elif defined SUPPORT_PCRE16
|
||||
pcre16_config(PCRE_CONFIG_UTF16, &utf);
|
||||
pcre16_config(PCRE_CONFIG_UNICODE_PROPERTIES, &ucp);
|
||||
#elif defined SUPPORT_PCRE16
|
||||
#elif defined SUPPORT_PCRE32
|
||||
pcre32_config(PCRE_CONFIG_UTF32, &utf);
|
||||
pcre32_config(PCRE_CONFIG_UNICODE_PROPERTIES, &ucp);
|
||||
#endif
|
||||
|
|
|
|||
|
|
@ -70,7 +70,7 @@ Arguments:
|
|||
code pointer to start of group (the bracket)
|
||||
startcode pointer to start of the whole pattern's code
|
||||
options the compiling options
|
||||
int RECURSE depth
|
||||
recurses chain of recurse_check to catch mutual recursion
|
||||
|
||||
Returns: the minimum length
|
||||
-1 if \C in UTF-8 mode or (*ACCEPT) was encountered
|
||||
|
|
@ -80,12 +80,13 @@ Returns: the minimum length
|
|||
|
||||
static int
|
||||
find_minlength(const REAL_PCRE *re, const pcre_uchar *code,
|
||||
const pcre_uchar *startcode, int options, int recurse_depth)
|
||||
const pcre_uchar *startcode, int options, recurse_check *recurses)
|
||||
{
|
||||
int length = -1;
|
||||
/* PCRE_UTF16 has the same value as PCRE_UTF8. */
|
||||
BOOL utf = (options & PCRE_UTF8) != 0;
|
||||
BOOL had_recurse = FALSE;
|
||||
recurse_check this_recurse;
|
||||
register int branchlength = 0;
|
||||
register pcre_uchar *cc = (pcre_uchar *)code + 1 + LINK_SIZE;
|
||||
|
||||
|
|
@ -130,7 +131,7 @@ for (;;)
|
|||
case OP_SBRAPOS:
|
||||
case OP_ONCE:
|
||||
case OP_ONCE_NC:
|
||||
d = find_minlength(re, cc, startcode, options, recurse_depth);
|
||||
d = find_minlength(re, cc, startcode, options, recurses);
|
||||
if (d < 0) return d;
|
||||
branchlength += d;
|
||||
do cc += GET(cc, 1); while (*cc == OP_ALT);
|
||||
|
|
@ -393,7 +394,7 @@ for (;;)
|
|||
ce = cs = (pcre_uchar *)PRIV(find_bracket)(startcode, utf, GET2(slot, 0));
|
||||
if (cs == NULL) return -2;
|
||||
do ce += GET(ce, 1); while (*ce == OP_ALT);
|
||||
if (cc > cs && cc < ce)
|
||||
if (cc > cs && cc < ce) /* Simple recursion */
|
||||
{
|
||||
d = 0;
|
||||
had_recurse = TRUE;
|
||||
|
|
@ -401,8 +402,22 @@ for (;;)
|
|||
}
|
||||
else
|
||||
{
|
||||
int dd = find_minlength(re, cs, startcode, options, recurse_depth);
|
||||
if (dd < d) d = dd;
|
||||
recurse_check *r = recurses;
|
||||
for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
|
||||
if (r != NULL) /* Mutual recursion */
|
||||
{
|
||||
d = 0;
|
||||
had_recurse = TRUE;
|
||||
break;
|
||||
}
|
||||
else
|
||||
{
|
||||
int dd;
|
||||
this_recurse.prev = recurses;
|
||||
this_recurse.group = cs;
|
||||
dd = find_minlength(re, cs, startcode, options, &this_recurse);
|
||||
if (dd < d) d = dd;
|
||||
}
|
||||
}
|
||||
slot += re->name_entry_size;
|
||||
}
|
||||
|
|
@ -418,14 +433,26 @@ for (;;)
|
|||
ce = cs = (pcre_uchar *)PRIV(find_bracket)(startcode, utf, GET2(cc, 1));
|
||||
if (cs == NULL) return -2;
|
||||
do ce += GET(ce, 1); while (*ce == OP_ALT);
|
||||
if (cc > cs && cc < ce)
|
||||
if (cc > cs && cc < ce) /* Simple recursion */
|
||||
{
|
||||
d = 0;
|
||||
had_recurse = TRUE;
|
||||
}
|
||||
else
|
||||
{
|
||||
d = find_minlength(re, cs, startcode, options, recurse_depth);
|
||||
recurse_check *r = recurses;
|
||||
for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
|
||||
if (r != NULL) /* Mutual recursion */
|
||||
{
|
||||
d = 0;
|
||||
had_recurse = TRUE;
|
||||
}
|
||||
else
|
||||
{
|
||||
this_recurse.prev = recurses;
|
||||
this_recurse.group = cs;
|
||||
d = find_minlength(re, cs, startcode, options, &this_recurse);
|
||||
}
|
||||
}
|
||||
}
|
||||
else d = 0;
|
||||
|
|
@ -474,12 +501,21 @@ for (;;)
|
|||
case OP_RECURSE:
|
||||
cs = ce = (pcre_uchar *)startcode + GET(cc, 1);
|
||||
do ce += GET(ce, 1); while (*ce == OP_ALT);
|
||||
if ((cc > cs && cc < ce) || recurse_depth > 10)
|
||||
if (cc > cs && cc < ce) /* Simple recursion */
|
||||
had_recurse = TRUE;
|
||||
else
|
||||
{
|
||||
branchlength += find_minlength(re, cs, startcode, options,
|
||||
recurse_depth + 1);
|
||||
recurse_check *r = recurses;
|
||||
for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
|
||||
if (r != NULL) /* Mutual recursion */
|
||||
had_recurse = TRUE;
|
||||
else
|
||||
{
|
||||
this_recurse.prev = recurses;
|
||||
this_recurse.group = cs;
|
||||
branchlength += find_minlength(re, cs, startcode, options,
|
||||
&this_recurse);
|
||||
}
|
||||
}
|
||||
cc += 1 + LINK_SIZE;
|
||||
break;
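The hunks above replace the fixed `recurse_depth > 10` cut-off with a chain of `recurse_check` records that is searched before each recursive call. The following is a minimal sketch of that idea on a toy model, not the PCRE code itself: `toy_group`, `recurse_link`, and `toy_minlength` are hypothetical names introduced only for illustration, and a group is reduced to a literal length plus an optional call to another group.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model: each group matches `literal_len` characters and
   then optionally calls another group (an index into the array, or -1). */
typedef struct toy_group {
  int literal_len;
  int calls;
} toy_group;

/* One link per active call. Each link lives on the stack of the call that
   created it, so the chain needs no heap allocation. */
typedef struct recurse_link {
  const struct recurse_link *prev;
  int group;
} recurse_link;

/* Returns a lower bound on the match length of group `idx`. If `idx` is
   already on the chain, the call is mutual recursion: it contributes 0 and
   *had_recurse is set, instead of recursing without bound. */
static int toy_minlength(const toy_group *groups, int idx,
                         const recurse_link *chain, int *had_recurse)
{
  const recurse_link *r;
  recurse_link this_link;
  int length;

  assert(idx >= 0);

  /* Walk the chain of callers looking for the group we are about to enter. */
  for (r = chain; r != NULL; r = r->prev)
    if (r->group == idx) break;

  if (r != NULL)                 /* Mutual recursion */
    {
    *had_recurse = 1;
    return 0;
    }

  /* Not seen before: push this group and descend. */
  this_link.prev = chain;
  this_link.group = idx;

  length = groups[idx].literal_len;
  if (groups[idx].calls >= 0)
    length += toy_minlength(groups, groups[idx].calls, &this_link, had_recurse);
  return length;
}
```

A top-level caller passes `NULL` as the chain, which matches the way the study code now calls `find_minlength(re, code, code, re->options, NULL)`. Unlike a depth counter, the chain stops exactly when a group is re-entered, so deep but non-recursive nesting is still measured in full.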
|
||||
|
|
@ -1503,7 +1539,7 @@ if ((re->options & PCRE_ANCHORED) == 0 &&
|
|||
|
||||
/* Find the minimum length of subject string. */
|
||||
|
||||
switch(min = find_minlength(re, code, code, re->options, 0))
|
||||
switch(min = find_minlength(re, code, code, re->options, NULL))
|
||||
{
|
||||
case -2: *errorptr = "internal error: missing capturing bracket"; return NULL;
|
||||
case -3: *errorptr = "internal error: opcode not recognized"; return NULL;
|
||||
|
|
|
|||
|
|
@ -1582,12 +1582,15 @@ while (ptr < endptr)
|
|||
int endlinelength;
|
||||
int mrc = 0;
|
||||
int startoffset = 0;
|
||||
int prevoffsets[2];
|
||||
unsigned int options = 0;
|
||||
BOOL match;
|
||||
char *matchptr = ptr;
|
||||
char *t = ptr;
|
||||
size_t length, linelength;
|
||||
|
||||
prevoffsets[0] = prevoffsets[1] = -1;
|
||||
|
||||
/* At this point, ptr is at the start of a line. We need to find the length
|
||||
of the subject string to pass to pcre_exec(). In multiline mode, it is the
|
||||
length remainder of the data in the buffer. Otherwise, it is the length of
|
||||
|
|
@ -1729,55 +1732,86 @@ while (ptr < endptr)
|
|||
{
|
||||
if (!invert)
|
||||
{
|
||||
if (printname != NULL) fprintf(stdout, "%s:", printname);
|
||||
if (number) fprintf(stdout, "%d:", linenumber);
|
||||
int oldstartoffset = startoffset;
|
||||
|
||||
/* Handle --line-offsets */
|
||||
/* It is possible, when a lookbehind assertion contains \K, for the
|
||||
same string to be found again. The code below advances startoffset, but
|
||||
until it is past the "bumpalong" offset that gave the match, the same
|
||||
substring will be returned. The PCRE1 library does not return the
|
||||
bumpalong offset, so all we can do is ignore repeated strings. (PCRE2
|
||||
does this better.) */
|
||||
|
||||
if (line_offsets)
|
||||
fprintf(stdout, "%d,%d\n", (int)(matchptr + offsets[0] - ptr),
|
||||
offsets[1] - offsets[0]);
|
||||
|
||||
/* Handle --file-offsets */
|
||||
|
||||
else if (file_offsets)
|
||||
fprintf(stdout, "%d,%d\n",
|
||||
(int)(filepos + matchptr + offsets[0] - ptr),
|
||||
offsets[1] - offsets[0]);
|
||||
|
||||
/* Handle --only-matching, which may occur many times */
|
||||
|
||||
else
|
||||
if (prevoffsets[0] != offsets[0] || prevoffsets[1] != offsets[1])
|
||||
{
|
||||
BOOL printed = FALSE;
|
||||
omstr *om;
|
||||
prevoffsets[0] = offsets[0];
|
||||
prevoffsets[1] = offsets[1];
|
||||
|
||||
for (om = only_matching; om != NULL; om = om->next)
|
||||
if (printname != NULL) fprintf(stdout, "%s:", printname);
|
||||
if (number) fprintf(stdout, "%d:", linenumber);
|
||||
|
||||
/* Handle --line-offsets */
|
||||
|
||||
if (line_offsets)
|
||||
fprintf(stdout, "%d,%d\n", (int)(matchptr + offsets[0] - ptr),
|
||||
offsets[1] - offsets[0]);
|
||||
|
||||
/* Handle --file-offsets */
|
||||
|
||||
else if (file_offsets)
|
||||
fprintf(stdout, "%d,%d\n",
|
||||
(int)(filepos + matchptr + offsets[0] - ptr),
|
||||
offsets[1] - offsets[0]);
|
||||
|
||||
/* Handle --only-matching, which may occur many times */
|
||||
|
||||
else
|
||||
{
|
||||
int n = om->groupnum;
|
||||
if (n < mrc)
|
||||
BOOL printed = FALSE;
|
||||
omstr *om;
|
||||
|
||||
for (om = only_matching; om != NULL; om = om->next)
|
||||
{
|
||||
int plen = offsets[2*n + 1] - offsets[2*n];
|
||||
if (plen > 0)
|
||||
int n = om->groupnum;
|
||||
if (n < mrc)
|
||||
{
|
||||
if (printed) fprintf(stdout, "%s", om_separator);
|
||||
if (do_colour) fprintf(stdout, "%c[%sm", 0x1b, colour_string);
|
||||
FWRITE(matchptr + offsets[n*2], 1, plen, stdout);
|
||||
if (do_colour) fprintf(stdout, "%c[00m", 0x1b);
|
||||
printed = TRUE;
|
||||
int plen = offsets[2*n + 1] - offsets[2*n];
|
||||
if (plen > 0)
|
||||
{
|
||||
if (printed) fprintf(stdout, "%s", om_separator);
|
||||
if (do_colour) fprintf(stdout, "%c[%sm", 0x1b, colour_string);
|
||||
FWRITE(matchptr + offsets[n*2], 1, plen, stdout);
|
||||
if (do_colour) fprintf(stdout, "%c[00m", 0x1b);
|
||||
printed = TRUE;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (printed || printname != NULL || number) fprintf(stdout, "\n");
|
||||
if (printed || printname != NULL || number) fprintf(stdout, "\n");
|
||||
}
|
||||
}
|
||||
|
||||
/* Prepare to repeat to find the next match */
|
||||
/* Prepare to repeat to find the next match. If the pattern contained
|
||||
a lookbehind that included \K, it is possible that the end of the match
|
||||
might be at or before the actual starting offset we have just used. We
|
||||
need to start one character further on. Unfortunately, for unanchored
|
||||
patterns, the actual start offset can be greater than the one that was
|
||||
set as a result of "bumpalong". PCRE1 does not return the actual start
|
||||
offset, so we have to check against the original start offset. This may
|
||||
lead to duplicates - we need the fudge above to avoid printing them.
|
||||
(PCRE2 does this better.) */
|
||||
|
||||
match = FALSE;
|
||||
if (line_buffered) fflush(stdout);
|
||||
rc = 0; /* Had some success */
|
||||
startoffset = offsets[1]; /* Restart after the match */
|
||||
if (startoffset <= oldstartoffset)
|
||||
{
|
||||
if ((size_t)startoffset >= length)
|
||||
goto END_ONE_MATCH; /* We were at the end */
|
||||
startoffset = oldstartoffset + 1;
|
||||
if (utf8)
|
||||
while ((matchptr[startoffset] & 0xc0) == 0x80) startoffset++;
|
||||
}
|
||||
goto ONLY_MATCHING_RESTART;
|
||||
}
|
||||
}
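The pcregrep hunk above guards its output with `prevoffsets` so that a match reported twice in a row (possible when a lookbehind contains `\K`) is printed only once. A minimal sketch of that filter in isolation follows; `match_filter`, `match_filter_init`, and `match_filter_accept` are hypothetical names, and the sketch assumes offsets are plain non-negative `int` values as in the PCRE1 ovector.

```c
#include <assert.h>

/* Remembers the offsets of the most recently reported match. */
typedef struct match_filter {
  int prev_start;
  int prev_end;
} match_filter;

/* No match has been seen yet: -1 cannot equal any valid offset. */
static void match_filter_init(match_filter *f)
{
  f->prev_start = f->prev_end = -1;
}

/* Returns 1 if the match should be printed, 0 if it repeats the previous
   one. Only the immediately preceding match is compared, which is all that
   is needed while the start offset is being bumped along one step at a
   time. */
static int match_filter_accept(match_filter *f, int start, int end)
{
  assert(start >= 0 && end >= 0);
  if (f->prev_start == start && f->prev_end == end) return 0;
  f->prev_start = start;
  f->prev_end = end;
  return 1;
}
```

The filter would be initialised once per line, mirroring `prevoffsets[0] = prevoffsets[1] = -1;` at the top of the per-line loop, and consulted before each match is written out.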
|
||||
|
|
@ -1974,6 +2008,7 @@ while (ptr < endptr)
|
|||
/* Advance to after the newline and increment the line number. The file
|
||||
offset to the current line is maintained in filepos. */
|
||||
|
||||
END_ONE_MATCH:
|
||||
ptr += linelength + endlinelength;
|
||||
filepos += (int)(linelength + endlinelength);
|
||||
linenumber++;
|
||||
|
|
|
|||
|
|
@ -2257,16 +2257,19 @@ if (callout_extra)
|
|||
fprintf(f, "Callout %d: last capture = %d\n",
|
||||
cb->callout_number, cb->capture_last);
|
||||
|
||||
for (i = 0; i < cb->capture_top * 2; i += 2)
|
||||
if (cb->offset_vector != NULL)
|
||||
{
|
||||
if (cb->offset_vector[i] < 0)
|
||||
fprintf(f, "%2d: <unset>\n", i/2);
|
||||
else
|
||||
for (i = 0; i < cb->capture_top * 2; i += 2)
|
||||
{
|
||||
fprintf(f, "%2d: ", i/2);
|
||||
PCHARSV(cb->subject, cb->offset_vector[i],
|
||||
cb->offset_vector[i+1] - cb->offset_vector[i], f);
|
||||
fprintf(f, "\n");
|
||||
if (cb->offset_vector[i] < 0)
|
||||
fprintf(f, "%2d: <unset>\n", i/2);
|
||||
else
|
||||
{
|
||||
fprintf(f, "%2d: ", i/2);
|
||||
PCHARSV(cb->subject, cb->offset_vector[i],
|
||||
cb->offset_vector[i+1] - cb->offset_vector[i], f);
|
||||
fprintf(f, "\n");
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
@ -2519,7 +2522,7 @@ re->name_entry_size = swap_uint16(re->name_entry_size);
|
|||
re->name_count = swap_uint16(re->name_count);
|
||||
re->ref_count = swap_uint16(re->ref_count);
|
||||
|
||||
if (extra != NULL)
|
||||
if (extra != NULL && (extra->flags & PCRE_EXTRA_STUDY_DATA) != 0)
|
||||
{
|
||||
pcre_study_data *rsd = (pcre_study_data *)(extra->study_data);
|
||||
rsd->size = swap_uint32(rsd->size);
|
||||
|
|
@ -2700,7 +2703,7 @@ re->name_entry_size = swap_uint16(re->name_entry_size);
|
|||
re->name_count = swap_uint16(re->name_count);
|
||||
re->ref_count = swap_uint16(re->ref_count);
|
||||
|
||||
if (extra != NULL)
|
||||
if (extra != NULL && (extra->flags & PCRE_EXTRA_STUDY_DATA) != 0)
|
||||
{
|
||||
pcre_study_data *rsd = (pcre_study_data *)(extra->study_data);
|
||||
rsd->size = swap_uint32(rsd->size);
|
||||
|
|
@ -3453,7 +3456,7 @@ while (!done)
|
|||
pcre_extra *extra = NULL;
|
||||
|
||||
#if !defined NOPOSIX /* There are still compilers that require no indent */
|
||||
regex_t preg;
|
||||
regex_t preg = { NULL, 0, 0} ;
|
||||
int do_posix = 0;
|
||||
#endif
|
||||
|
||||
|
|
@ -5603,6 +5606,12 @@ while (!done)
|
|||
|
||||
if (!do_g && !do_G) break;
|
||||
|
||||
if (use_offsets == NULL)
|
||||
{
|
||||
fprintf(outfile, "Cannot do global matching without an ovector\n");
|
||||
break;
|
||||
}
|
||||
|
||||
/* If we have matched an empty string, first check to see if we are at
|
||||
the end of the subject. If so, the /g loop is over. Otherwise, mimic what
|
||||
Perl's /g option does. This turns out to be rather cunning. First we set
|
||||
|
|
@ -5618,9 +5627,33 @@ while (!done)
|
|||
g_notempty = PCRE_NOTEMPTY_ATSTART | PCRE_ANCHORED;
|
||||
}
|
||||
|
||||
/* For /g, update the start offset, leaving the rest alone */
|
||||
/* For /g, update the start offset, leaving the rest alone. There is a
|
||||
tricky case when \K is used in a positive lookbehind assertion. This can
|
||||
cause the end of the match to be less than or equal to the start offset.
|
||||
In this case we restart at one past the start offset. This may return the
|
||||
same match if the original start offset was bumped along during the
|
||||
match, but eventually the new start offset will hit the actual start
|
||||
offset. (In PCRE2 the true start offset is available, and this can be
|
||||
done better. It is not worth doing more than making sure we do not loop
|
||||
at this stage in the life of PCRE1.) */
|
||||
|
||||
if (do_g) start_offset = use_offsets[1];
|
||||
if (do_g)
|
||||
{
|
||||
if (g_notempty == 0 && use_offsets[1] <= start_offset)
|
||||
{
|
||||
if (start_offset >= len) break; /* End of subject */
|
||||
start_offset++;
|
||||
if (use_utf)
|
||||
{
|
||||
while (start_offset < len)
|
||||
{
|
||||
if ((bptr[start_offset] & 0xc0) != 0x80) break;
|
||||
start_offset++;
|
||||
}
|
||||
}
|
||||
}
|
||||
else start_offset = use_offsets[1];
|
||||
}
|
||||
|
||||
/* For /G, update the pointer and length */
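The `/g` handling in the hunk above can be read as a small pure function: restart at the end of the match unless that would not move forward, in which case step one character past the old start offset. The sketch below isolates that rule under stated assumptions: `next_start_offset` is a hypothetical helper, the `g_notempty` condition from the real code is left out, and a return value of -1 stands in for the `break` at the end of the subject.

```c
#include <assert.h>

/* Computes where the next /g iteration should start.
   subject      the subject bytes (UTF-8 when utf is non-zero)
   len          length of the subject in bytes
   start_offset the offset the last match attempt started from
   match_end    the end offset of the last match (ovector[1])
   Returns the new start offset, or -1 when the subject is exhausted. */
static int next_start_offset(const unsigned char *subject, int len,
                             int start_offset, int match_end, int utf)
{
  assert(len >= 0);

  /* Normal case: the match ended beyond where we started. */
  if (match_end > start_offset) return match_end;

  /* \K inside a lookbehind can leave the match end at or before the
     start offset, so force progress by one character. */
  if (start_offset >= len) return -1;
  start_offset++;

  /* In UTF-8 mode skip continuation bytes (10xxxxxx) so that the new
     offset lands on a character boundary. */
  if (utf)
    while (start_offset < len && (subject[start_offset] & 0xc0) == 0x80)
      start_offset++;

  return start_offset;
}
```

With the subject bytes `C5 BF C5 BF` (U+017F twice, the character used in the new testinput5 cases), a stalled match at offset 0 advances to offset 2 in UTF-8 mode and to offset 1 otherwise.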
|
||||
|
||||
|
|
@ -5637,7 +5670,7 @@ while (!done)
|
|||
CONTINUE:
|
||||
|
||||
#if !defined NOPOSIX
|
||||
if (posix || do_posix) regfree(&preg);
|
||||
if ((posix || do_posix) && preg.re_pcre != 0) regfree(&preg);
|
||||
#endif
|
||||
|
||||
if (re != NULL) new_free(re);
|
||||
|
|
|
|||
8
pcre/testdata/grepoutput
vendored
|
|
@ -743,3 +743,11 @@ RC=0
|
|||
---------------------------- Test 106 -----------------------------
|
||||
a
|
||||
RC=0
|
||||
---------------------------- Test 107 -----------------------------
|
||||
1:0,1
|
||||
2:0,1
|
||||
2:1,1
|
||||
2:2,1
|
||||
2:3,1
|
||||
2:4,1
|
||||
RC=0
|
||||
|
|
|
|||
10
pcre/testdata/testinput1
vendored
|
|
@ -5720,4 +5720,14 @@ AbcdCBefgBhiBqz
|
|||
/[\Q]a\E]+/
|
||||
aa]]
|
||||
|
||||
/(?:((abcd))|(((?:(?:(?:(?:abc|(?:abcdef))))b)abcdefghi)abc)|((*ACCEPT)))/
|
||||
1234abcd
|
||||
|
||||
/(\2)(\1)/
|
||||
|
||||
"Z*(|d*){216}"
|
||||
|
||||
"(?1)(?#?'){8}(a)"
|
||||
baaaaaaaaac
|
||||
|
||||
/-- End of testinput1 --/
|
||||
|
|
|
|||
2
pcre/testdata/testinput11
vendored
|
|
@ -134,4 +134,6 @@ is required for these tests. --/
|
|||
|
||||
/(((a\2)|(a*)\g<-1>))*a?/B
|
||||
|
||||
/((?+1)(\1))/B
|
||||
|
||||
/-- End of testinput11 --/
|
||||
|
|
|
|||
8
pcre/testdata/testinput12
vendored
|
|
@ -87,4 +87,12 @@ and a couple of things that are different with JIT. --/
|
|||
/^12345678abcd/mS++
|
||||
12345678abcd
|
||||
|
||||
/-- Test pattern compilation --/
|
||||
|
||||
/(?:a|b|c|d|e)(?R)/S++
|
||||
|
||||
/(?:a|b|c|d|e)(?R)(?R)/S++
|
||||
|
||||
/(a(?:a|b|c|d|e)b){8,16}/S++
|
||||
|
||||
/-- End of testinput12 --/
|
||||
|
|
|
|||
74
pcre/testdata/testinput2
vendored
|
|
@ -1380,6 +1380,8 @@
|
|||
1X
|
||||
123456\P
|
||||
|
||||
//KF>/dev/null
|
||||
|
||||
/abc/IS>testsavedregex
|
||||
<testsavedregex
|
||||
abc
|
||||
|
|
@ -4078,4 +4080,76 @@ backtracking verbs. --/
|
|||
|
||||
/\x{whatever}/
|
||||
|
||||
"((?=(?(?=(?(?=(?(?=()))))))))"
|
||||
a
|
||||
|
||||
"(?(?=)==)(((((((((?=)))))))))"
|
||||
a
|
||||
|
||||
/^(?:(a)|b)(?(1)A|B)/I
|
||||
aA123\O3
|
||||
aA123\O6
|
||||
|
||||
'^(?:(?<AA>a)|b)(?(<AA>)A|B)'
|
||||
aA123\O3
|
||||
aA123\O6
|
||||
|
||||
'^(?<AA>)(?:(?<AA>a)|b)(?(<AA>)A|B)'J
|
||||
aA123\O3
|
||||
aA123\O6
|
||||
|
||||
'^(?:(?<AA>X)|)(?:(?<AA>a)|b)\k{AA}'J
|
||||
aa123\O3
|
||||
aa123\O6
|
||||
|
||||
/(?<N111>(?J)(?<N111>1(111111)11|)1|1|)(?(<N111>)1)/
|
||||
|
||||
/(?(?=0)?)+/
|
||||
|
||||
/(?(?=0)(?=00)?00765)/
|
||||
00765
|
||||
|
||||
/(?(?=0)(?=00)?00765|(?!3).56)/
|
||||
00765
|
||||
456
|
||||
** Failers
|
||||
356
|
||||
|
||||
'^(a)*+(\w)'
|
||||
g
|
||||
g\O3
|
||||
|
||||
'^(?:a)*+(\w)'
|
||||
g
|
||||
g\O3
|
||||
|
||||
//C
|
||||
\O\C+
|
||||
|
||||
"((?2){0,1999}())?"
|
||||
|
||||
/((?+1)(\1))/BZ
|
||||
|
||||
/(?(?!)a|b)/
|
||||
bbb
|
||||
aaa
|
||||
|
||||
"((?2)+)((?1))"
|
||||
|
||||
"(?(?<E>.*!.*)?)"
|
||||
|
||||
"X((?2)()*+){2}+"BZ
|
||||
|
||||
"X((?2)()*+){2}"BZ
|
||||
|
||||
"(?<=((?2))((?1)))"
|
||||
|
||||
/(?<=\Ka)/g+
|
||||
aaaaa
|
||||
|
||||
/(?<=\Ka)/G+
|
||||
aaaaa
|
||||
|
||||
/((?2){73}(?2))((?1))/
|
||||
|
||||
/-- End of testinput2 --/
|
||||
|
|
|
|||
5
pcre/testdata/testinput4
vendored
|
|
@ -722,4 +722,9 @@
|
|||
/^#[^\x{ffff}]#[^\x{ffff}]#[^\x{ffff}]#/8
|
||||
#\x{10000}#\x{100}#\x{10ffff}#
|
||||
|
||||
"[\S\V\H]"8
|
||||
|
||||
/\C(\W?ſ)'?{{/8
|
||||
\\C(\\W?ſ)'?{{
|
||||
|
||||
/-- End of testinput4 --/
|
||||
|
|
|
|||
8
pcre/testdata/testinput5
vendored
|
|
@ -790,4 +790,12 @@
|
|||
|
||||
/[b-d\x{200}-\x{250}]*[ae-h]?#[\x{200}-\x{250}]{0,8}[\x00-\xff]*#[\x{200}-\x{250}]+[a-z]/8BZ
|
||||
|
||||
/[^\xff]*PRUNE:\x{100}abc(xyz(?1))/8DZ
|
||||
|
||||
/(?<=\K\x{17f})/8g+
|
||||
\x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
|
||||
|
||||
/(?<=\K\x{17f})/8G+
|
||||
\x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
|
||||
|
||||
/-- End of testinput5 --/
|
||||
|
|
|
|||
6
pcre/testdata/testinput6
vendored
|
|
@ -1496,4 +1496,10 @@
|
|||
/^s?c/mi8
|
||||
scat
|
||||
|
||||
/[A-`]/i8
|
||||
abcdefghijklmno
|
||||
|
||||
/\C\X*QT/8
|
||||
Ӆ\x0aT
|
||||
|
||||
/-- End of testinput6 --/
|
||||
|
|
|
|||
4
pcre/testdata/testinput8
vendored
|
|
@ -4837,4 +4837,8 @@
|
|||
'\A(?:[^\"]++|\"(?:[^\"]++|\"\")*+\")++'
|
||||
NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
|
||||
|
||||
/(?(?!)a|b)/
|
||||
bbb
|
||||
aaa
|
||||
|
||||
/-- End of testinput8 --/
|
||||
|
|
|
|||
18
pcre/testdata/testoutput1
vendored
|
|
@ -9411,4 +9411,22 @@ No match
|
|||
aa]]
|
||||
0: aa]]
|
||||
|
||||
/(?:((abcd))|(((?:(?:(?:(?:abc|(?:abcdef))))b)abcdefghi)abc)|((*ACCEPT)))/
|
||||
1234abcd
|
||||
0:
|
||||
1: <unset>
|
||||
2: <unset>
|
||||
3: <unset>
|
||||
4: <unset>
|
||||
5:
|
||||
|
||||
/(\2)(\1)/
|
||||
|
||||
"Z*(|d*){216}"
|
||||
|
||||
"(?1)(?#?'){8}(a)"
|
||||
baaaaaaaaac
|
||||
0: aaaaaaaaa
|
||||
1: a
|
||||
|
||||
/-- End of testinput1 --/
|
||||
|
|
|
|||
17
pcre/testdata/testoutput11-16
vendored
|
|
@ -231,7 +231,7 @@ Memory allocation (code space): 73
|
|||
------------------------------------------------------------------
|
||||
|
||||
/(?P<a>a)...(?P=a)bbb(?P>a)d/BM
|
||||
Memory allocation (code space): 57
|
||||
Memory allocation (code space): 61
|
||||
------------------------------------------------------------------
|
||||
0 24 Bra
|
||||
2 5 CBra 1
|
||||
|
|
@ -733,4 +733,19 @@ Memory allocation (code space): 14
|
|||
41 End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/((?+1)(\1))/B
|
||||
------------------------------------------------------------------
|
||||
0 20 Bra
|
||||
2 16 Once
|
||||
4 12 CBra 1
|
||||
7 9 Recurse
|
||||
9 5 CBra 2
|
||||
12 \1
|
||||
14 5 Ket
|
||||
16 12 Ket
|
||||
18 16 Ket
|
||||
20 20 Ket
|
||||
22 End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/-- End of testinput11 --/
|
||||
|
|
|
|||
17
pcre/testdata/testoutput11-32
vendored
|
|
@ -231,7 +231,7 @@ Memory allocation (code space): 155
|
|||
------------------------------------------------------------------
|
||||
|
||||
/(?P<a>a)...(?P=a)bbb(?P>a)d/BM
|
||||
Memory allocation (code space): 117
|
||||
Memory allocation (code space): 125
|
||||
------------------------------------------------------------------
|
||||
0 24 Bra
|
||||
2 5 CBra 1
|
||||
|
|
@ -733,4 +733,19 @@ Memory allocation (code space): 28
|
|||
41 End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/((?+1)(\1))/B
|
||||
------------------------------------------------------------------
|
||||
0 20 Bra
|
||||
2 16 Once
|
||||
4 12 CBra 1
|
||||
7 9 Recurse
|
||||
9 5 CBra 2
|
||||
12 \1
|
||||
14 5 Ket
|
||||
16 12 Ket
|
||||
18 16 Ket
|
||||
20 20 Ket
|
||||
22 End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/-- End of testinput11 --/
|
||||
|
|
|
|||
17
pcre/testdata/testoutput11-8
vendored
|
|
@ -231,7 +231,7 @@ Memory allocation (code space): 45
|
|||
------------------------------------------------------------------
|
||||
|
||||
/(?P<a>a)...(?P=a)bbb(?P>a)d/BM
|
||||
Memory allocation (code space): 34
|
||||
Memory allocation (code space): 38
|
||||
------------------------------------------------------------------
|
||||
0 30 Bra
|
||||
3 7 CBra 1
|
||||
|
|
@ -733,4 +733,19 @@ Memory allocation (code space): 10
|
|||
60 End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/((?+1)(\1))/B
|
||||
------------------------------------------------------------------
|
||||
0 31 Bra
|
||||
3 25 Once
|
||||
6 19 CBra 1
|
||||
11 14 Recurse
|
||||
14 8 CBra 2
|
||||
19 \1
|
||||
22 8 Ket
|
||||
25 19 Ket
|
||||
28 25 Ket
|
||||
31 31 Ket
|
||||
34 End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/-- End of testinput11 --/
|
||||
|
|
|
|||
8
pcre/testdata/testoutput12
vendored
|
|
@ -176,4 +176,12 @@ No match, mark = m (JIT)
|
|||
12345678abcd
|
||||
0: 12345678abcd (JIT)
|
||||
|
||||
/-- Test pattern compilation --/
|
||||
|
||||
/(?:a|b|c|d|e)(?R)/S++
|
||||
|
||||
/(?:a|b|c|d|e)(?R)(?R)/S++
|
||||
|
||||
/(a(?:a|b|c|d|e)b){8,16}/S++
|
||||
|
||||
/-- End of testinput12 --/
|
||||
|
|
|
|||
219
pcre/testdata/testoutput2
vendored
|
|
@ -561,7 +561,7 @@ Failed: assertion expected after (?( at offset 3
|
|||
Failed: reference to non-existent subpattern at offset 7
|
||||
|
||||
/(?(?<ab))/
|
||||
Failed: syntax error in subpattern name (missing terminator) at offset 7
|
||||
Failed: assertion expected after (?( at offset 3
|
||||
|
||||
/((?s)blah)\s+\1/I
|
||||
Capturing subpattern count = 1
|
||||
|
|
@ -1566,30 +1566,35 @@ Need char = 'b'
|
|||
|
||||
/a(?(1)b)(.)/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
No options
|
||||
First char = 'a'
|
||||
No need char
|
||||
|
||||
/a(?(1)bag|big)(.)/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
No options
|
||||
First char = 'a'
|
||||
Need char = 'g'
|
||||
|
||||
/a(?(1)bag|big)*(.)/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
No options
|
||||
First char = 'a'
|
||||
No need char
|
||||
|
||||
/a(?(1)bag|big)+(.)/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
No options
|
||||
First char = 'a'
|
||||
Need char = 'g'
|
||||
|
||||
/a(?(1)b..|b..)(.)/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
No options
|
||||
First char = 'a'
|
||||
Need char = 'b'
|
||||
|
|
@ -3379,24 +3384,28 @@ Need char = 'a'
|
|||
|
||||
/(?(1)ab|ac)(.)/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
No options
|
||||
First char = 'a'
|
||||
No need char
|
||||
|
||||
/(?(1)abz|acz)(.)/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
No options
|
||||
First char = 'a'
|
||||
Need char = 'z'
|
||||
|
||||
/(?(1)abz)(.)/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
No options
|
||||
No first char
|
||||
No need char
|
||||
|
||||
/(?(1)abz)(1)23/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
No options
|
||||
No first char
|
||||
Need char = '3'
|
||||
|
|
@ -5605,6 +5614,10 @@ No match
|
|||
123456\P
|
||||
No match
|
||||
|
||||
//KF>/dev/null
|
||||
Compiled pattern written to /dev/null
|
||||
Study data written to /dev/null
|
||||
|
||||
/abc/IS>testsavedregex
|
||||
Capturing subpattern count = 0
|
||||
No options
|
||||
|
|
@ -6336,6 +6349,7 @@ No need char
|
|||
|
||||
/^(?P<A>a)?(?(A)a|b)/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
Named capturing subpatterns:
|
||||
A 1
|
||||
Options: anchored
|
||||
|
|
@ -6353,6 +6367,7 @@ No match
|
|||
|
||||
/(?:(?(ZZ)a|b)(?P<ZZ>X))+/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
Named capturing subpatterns:
|
||||
ZZ 1
|
||||
No options
|
||||
|
|
@ -6370,6 +6385,7 @@ Failed: reference to non-existent subpattern at offset 9
|
|||
|
||||
/(?:(?(ZZ)a|b)(?(ZZ)a|b)(?P<ZZ>X))+/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
Named capturing subpatterns:
|
||||
ZZ 1
|
||||
No options
|
||||
|
|
@ -6381,6 +6397,7 @@ Need char = 'X'
|
|||
|
||||
/(?:(?(ZZ)a|\(b\))\\(?P<ZZ>X))+/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
Named capturing subpatterns:
|
||||
ZZ 1
|
||||
No options
|
||||
|
|
@ -10226,6 +10243,7 @@ No starting char list
|
|||
(?(1)|.) # check that there was an empty component
|
||||
/xiIS
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
Options: anchored caseless extended
|
||||
No first char
|
||||
Need char = ':'
|
||||
|
|
@ -10255,6 +10273,7 @@ Failed: different names for subpatterns of the same number are not allowed at of
|
|||
b(?<quote> (?<apostrophe>')|(?<realquote>")) )
|
||||
(?('quote')[a-z]+|[0-9]+)/JIx
|
||||
Capturing subpattern count = 6
|
||||
Max back reference = 1
|
||||
Named capturing subpatterns:
|
||||
apostrophe 2
|
||||
apostrophe 5
|
||||
|
|
@ -10317,6 +10336,7 @@ No match
|
|||
End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 4
|
||||
Max back reference = 4
|
||||
Named capturing subpatterns:
|
||||
D 4
|
||||
D 1
|
||||
|
|
@ -10364,6 +10384,7 @@ No match
|
|||
End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 4
|
||||
Max back reference = 1
|
||||
Named capturing subpatterns:
|
||||
A 1
|
||||
A 4
|
||||
|
|
@ -10486,6 +10507,7 @@ No starting char list
|
|||
|
||||
/()i(?(1)a)/SI
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
No options
|
||||
No first char
|
||||
Need char = 'i'
|
||||
|
|
@ -14206,4 +14228,199 @@ Failed: digits missing in \x{} or \o{} at offset 3
|
|||
/\x{whatever}/
|
||||
Failed: non-hex character in \x{} (closing brace missing?) at offset 3
|
||||
|
||||
"((?=(?(?=(?(?=(?(?=()))))))))"
|
||||
a
|
||||
0:
|
||||
1:
|
||||
2:
|
||||
|
||||
"(?(?=)==)(((((((((?=)))))))))"
|
||||
a
|
||||
No match
|
||||
|
||||
/^(?:(a)|b)(?(1)A|B)/I
|
||||
Capturing subpattern count = 1
|
||||
Max back reference = 1
|
||||
Options: anchored
|
||||
No first char
|
||||
No need char
|
||||
aA123\O3
|
||||
Matched, but too many substrings
|
||||
0: aA
|
||||
aA123\O6
|
||||
0: aA
|
||||
1: a
|
||||
|
||||
'^(?:(?<AA>a)|b)(?(<AA>)A|B)'
|
||||
aA123\O3
|
||||
Matched, but too many substrings
|
||||
0: aA
|
||||
aA123\O6
|
||||
0: aA
|
||||
1: a
|
||||
|
||||
'^(?<AA>)(?:(?<AA>a)|b)(?(<AA>)A|B)'J
|
||||
aA123\O3
|
||||
Matched, but too many substrings
|
||||
0: aA
|
||||
aA123\O6
|
||||
Matched, but too many substrings
|
||||
0: aA
|
||||
1:
|
||||
|
||||
'^(?:(?<AA>X)|)(?:(?<AA>a)|b)\k{AA}'J
|
||||
aa123\O3
|
||||
Matched, but too many substrings
|
||||
0: aa
|
||||
aa123\O6
|
||||
Matched, but too many substrings
|
||||
0: aa
|
||||
1: <unset>
|
||||
|
||||
/(?<N111>(?J)(?<N111>1(111111)11|)1|1|)(?(<N111>)1)/
|
||||
|
||||
/(?(?=0)?)+/
|
||||
Failed: nothing to repeat at offset 7
|
||||
|
||||
/(?(?=0)(?=00)?00765)/
|
||||
00765
|
||||
0: 00765
|
||||
|
||||
/(?(?=0)(?=00)?00765|(?!3).56)/
|
||||
00765
|
||||
0: 00765
|
||||
456
|
||||
0: 456
|
||||
** Failers
|
||||
No match
|
||||
356
|
||||
No match
|
||||
|
||||
'^(a)*+(\w)'
|
||||
g
|
||||
0: g
|
||||
1: <unset>
|
||||
2: g
|
||||
g\O3
|
||||
Matched, but too many substrings
|
||||
0: g
|
||||
|
||||
'^(?:a)*+(\w)'
|
||||
g
|
||||
0: g
|
||||
1: g
|
||||
g\O3
|
||||
Matched, but too many substrings
|
||||
0: g
|
||||
|
||||
//C
|
||||
\O\C+
|
||||
Callout 255: last capture = -1
|
||||
--->
|
||||
+0 ^
|
||||
Matched, but too many substrings
|
||||
|
||||
"((?2){0,1999}())?"
|
||||
|
||||
/((?+1)(\1))/BZ
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
Once
|
||||
CBra 1
|
||||
Recurse
|
||||
CBra 2
|
||||
\1
|
||||
Ket
|
||||
Ket
|
||||
Ket
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/(?(?!)a|b)/
|
||||
bbb
|
||||
0: b
|
||||
aaa
|
||||
No match
|
||||
|
||||
"((?2)+)((?1))"
|
||||
|
||||
"(?(?<E>.*!.*)?)"
|
||||
Failed: assertion expected after (?( at offset 3
|
||||
|
||||
"X((?2)()*+){2}+"BZ
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
X
|
||||
Once
|
||||
CBra 1
|
||||
Recurse
|
||||
Braposzero
|
||||
SCBraPos 2
|
||||
KetRpos
|
||||
Ket
|
||||
CBra 1
|
||||
Recurse
|
||||
Braposzero
|
||||
SCBraPos 2
|
||||
KetRpos
|
||||
Ket
|
||||
Ket
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
"X((?2)()*+){2}"BZ
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
X
|
||||
CBra 1
|
||||
Recurse
|
||||
Braposzero
|
||||
SCBraPos 2
|
||||
KetRpos
|
||||
Ket
|
||||
CBra 1
|
||||
Recurse
|
||||
Braposzero
|
||||
SCBraPos 2
|
||||
KetRpos
|
||||
Ket
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
"(?<=((?2))((?1)))"
|
||||
Failed: lookbehind assertion is not fixed length at offset 17
|
||||
|
||||
/(?<=\Ka)/g+
|
||||
aaaaa
|
||||
0: a
|
||||
0+ aaaa
|
||||
0: a
|
||||
0+ aaaa
|
||||
0: a
|
||||
0+ aaa
|
||||
0: a
|
||||
0+ aa
|
||||
0: a
|
||||
0+ a
|
||||
0: a
|
||||
0+
|
||||
|
||||
/(?<=\Ka)/G+
|
||||
aaaaa
|
||||
0: a
|
||||
0+ aaaa
|
||||
0: a
|
||||
0+ aaa
|
||||
0: a
|
||||
0+ aa
|
||||
0: a
|
||||
0+ a
|
||||
0: a
|
||||
0+
|
||||
|
||||
/((?2){73}(?2))((?1))/
|
||||
|
||||
/-- End of testinput2 --/
|
||||
|
|
|
|||
6
pcre/testdata/testoutput4
vendored
|
|
@ -1271,4 +1271,10 @@ No match
|
|||
#\x{10000}#\x{100}#\x{10ffff}#
|
||||
0: #\x{10000}#\x{100}#\x{10ffff}#
|
||||
|
||||
"[\S\V\H]"8
|
||||
|
||||
/\C(\W?ſ)'?{{/8
|
||||
\\C(\\W?ſ)'?{{
|
||||
No match
|
||||
|
||||
/-- End of testinput4 --/
|
||||
|
|
|
|||
45
pcre/testdata/testoutput5
vendored
|
|
@ -1897,4 +1897,49 @@ Failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 5
|
|||
End
|
||||
------------------------------------------------------------------
|
||||
|
||||
/[^\xff]*PRUNE:\x{100}abc(xyz(?1))/8DZ
|
||||
------------------------------------------------------------------
|
||||
Bra
|
||||
[^\x{ff}]*
|
||||
PRUNE:\x{100}abc
|
||||
CBra 1
|
||||
xyz
|
||||
Recurse
|
||||
Ket
|
||||
Ket
|
||||
End
|
||||
------------------------------------------------------------------
|
||||
Capturing subpattern count = 1
|
||||
Options: utf
|
||||
No first char
|
||||
Need char = 'z'
|
||||
|
||||
/(?<=\K\x{17f})/8g+
|
||||
\x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
|
||||
0: \x{17f}
|
||||
0+ \x{17f}\x{17f}\x{17f}\x{17f}
|
||||
0: \x{17f}
|
||||
0+ \x{17f}\x{17f}\x{17f}\x{17f}
|
||||
0: \x{17f}
|
||||
0+ \x{17f}\x{17f}\x{17f}
|
||||
0: \x{17f}
|
||||
0+ \x{17f}\x{17f}
|
||||
0: \x{17f}
|
||||
0+ \x{17f}
|
||||
0: \x{17f}
|
||||
0+
|
||||
|
||||
/(?<=\K\x{17f})/8G+
|
||||
\x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
|
||||
0: \x{17f}
|
||||
0+ \x{17f}\x{17f}\x{17f}\x{17f}
|
||||
0: \x{17f}
|
||||
0+ \x{17f}\x{17f}\x{17f}
|
||||
0: \x{17f}
|
||||
0+ \x{17f}\x{17f}
|
||||
0: \x{17f}
|
||||
0+ \x{17f}
|
||||
0: \x{17f}
|
||||
0+
|
||||
|
||||
/-- End of testinput5 --/
|
||||
|
|
|
|||
8
pcre/testdata/testoutput6
vendored
|
|
@ -2461,4 +2461,12 @@ No match
|
|||
scat
|
||||
0: sc
|
||||
|
||||
/[A-`]/i8
|
||||
abcdefghijklmno
|
||||
0: a
|
||||
|
||||
/\C\X*QT/8
|
||||
Ӆ\x0aT
|
||||
No match
|
||||
|
||||
/-- End of testinput6 --/
|
||||
|
|
|
|||
6
pcre/testdata/testoutput8
vendored
|
|
@ -7785,4 +7785,10 @@ Matched, but offsets vector is too small to show all matches
|
|||
NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
|
||||
0: NON QUOTED "QUOT""ED" AFTER
|
||||
|
||||
/(?(?!)a|b)/
|
||||
bbb
|
||||
0: b
|
||||
aaa
|
||||
No match
|
||||
|
||||
/-- End of testinput8 --/
|
||||
|
|
|
|||