Merge branch 'merge-pcre' into 10.0

2026-05-15 19:37:16 +02:00 · 2015-05-04 22:25:57 +02:00 · 0b4f5060bb
commit 0b4f5060bb
parent 6c5ee86286 c4cc91cdc9
41 changed files with 1695 additions and 768 deletions
--- a/pcre/AUTHORS
+++ b/pcre/AUTHORS
@ -8,7 +8,7 @@ Email domain:     cam.ac.uk
 University of Cambridge Computing Service,
 Cambridge, England.

-Copyright (c) 1997-2014 University of Cambridge
+Copyright (c) 1997-2015 University of Cambridge
 All rights reserved


@ -19,7 +19,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Email domain:     freemail.hu

-Copyright(c) 2010-2014 Zoltan Herczeg
+Copyright(c) 2010-2015 Zoltan Herczeg
 All rights reserved.


@ -30,7 +30,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Email domain:     freemail.hu

-Copyright(c) 2009-2014 Zoltan Herczeg
+Copyright(c) 2009-2015 Zoltan Herczeg
 All rights reserved.


--- a/pcre/ChangeLog
+++ b/pcre/ChangeLog
@ -1,6 +1,173 @@
 ChangeLog for PCRE
 ------------------

+Version 8.37 28-April-2015
+--------------------------
+
+1.  When an (*ACCEPT) is triggered inside capturing parentheses, it arranges
+    for those parentheses to be closed with whatever has been captured so far.
+    However, it was failing to mark any other groups between the highest
+    capture so far and the current group as "unset". Thus, the ovector for
+    those groups contained whatever was previously there. An example is the
+    pattern /(x)|((*ACCEPT))/ when matched against "abcd".
+
+2.  If an assertion condition was quantified with a minimum of zero (an odd
+    thing to do, but it happened), SIGSEGV or other misbehaviour could occur.
+
+3.  If a pattern in pcretest input had the P (POSIX) modifier followed by an
+    unrecognized modifier, a crash could occur.
+
+4.  An attempt to do global matching in pcretest with a zero-length ovector
+    caused a crash.
+
+5.  Fixed a memory leak during matching that could occur for a subpattern
+    subroutine call (recursive or otherwise) if the number of captured groups
+    that had to be saved was greater than ten.
+
+6.  Catch a bad opcode during auto-possessification after compiling a bad UTF
+    string with NO_UTF_CHECK. This is a tidyup, not a bug fix, as passing bad
+    UTF with NO_UTF_CHECK is documented as having an undefined outcome.
+
+7.  A UTF pattern containing a "not" match of a non-ASCII character and a
+    subroutine reference could loop at compile time. Example: /[^\xff]((?1))/.
+
+8. When a pattern is compiled, it remembers the highest back reference so that
+   when matching, if the ovector is too small, extra memory can be obtained to
+   use instead. A conditional subpattern whose condition is a check on a
+   capture having happened, such as, for example in the pattern
+   /^(?:(a)|b)(?(1)A|B)/, is another kind of back reference, but it was not
+   setting the highest backreference number. This mattered only if pcre_exec()
+   was called with an ovector that was too small to hold the capture, and there
+   was no other kind of back reference (a situation which is probably quite
+   rare). The effect of the bug was that the condition was always treated as
+   FALSE when the capture could not be consulted, leading to incorrect
+   behaviour by pcre_exec(). This bug has been fixed.
+
+9. A reference to a duplicated named group (either a back reference or a test
+   for being set in a conditional) that occurred in a part of the pattern where
+   PCRE_DUPNAMES was not set caused the amount of memory needed for the pattern
+   to be incorrectly calculated, leading to overwriting.
+
+10. A mutually recursive set of back references such as (\2)(\1) caused a
+    segfault at study time (while trying to find the minimum matching length).
+    The infinite loop is now broken (with the minimum length unset, that is,
+    zero).
+
+11. If an assertion that was used as a condition was quantified with a minimum
+    of zero, matching went wrong. In particular, if the whole group had
+    unlimited repetition and could match an empty string, a segfault was
+    likely. The pattern (?(?=0)?)+ is an example that caused this. Perl allows
+    assertions to be quantified, but not if they are being used as conditions,
+    so the above pattern is faulted by Perl. PCRE has now been changed so that
+    it also rejects such patterns.
+
+12. A possessive capturing group such as (a)*+ with a minimum repeat of zero
+    failed to allow the zero-repeat case if pcre_exec() was called with an
+    ovector too small to capture the group.
+
+13. Fixed two bugs in pcretest that were discovered by fuzzing and reported by
+    Red Hat Product Security:
+
+    (a) A crash if /K and /F were both set with the option to save the compiled
+    pattern.
+
+    (b) Another crash if the option to print captured substrings in a callout
+    was combined with setting a null ovector, for example \O\C+ as a subject
+    string.
+
+14. A pattern such as "((?2){0,1999}())?", which has a group containing a
+    forward reference repeated a large (but limited) number of times within a
+    repeated outer group that has a zero minimum quantifier, caused incorrect
+    code to be compiled, leading to the error "internal error:
+    previously-checked referenced subpattern not found" when an incorrect
+    memory address was read. This bug was reported as "heap overflow",
+    discovered by Kai Lu of Fortinet's FortiGuard Labs and given the CVE number
+    CVE-2015-2325.
+
+23. A pattern such as "((?+1)(\1))/" containing a forward reference subroutine
+    call within a group that also contained a recursive back reference caused
+    incorrect code to be compiled. This bug was reported as "heap overflow",
+    discovered by Kai Lu of Fortinet's FortiGuard Labs, and given the CVE
+    number CVE-2015-2326.
+
+24. Computing the size of the JIT read-only data in advance has been a source
+    of various issues, and new ones still appear, unfortunately. To fix
+    existing and future issues, size computation is eliminated from the code,
+    and replaced by on-demand memory allocation.
+
+25. A pattern such as /(?i)[A-`]/, where characters in the other case are
+    adjacent to the end of the range, and the range contained characters with
+    more than one other case, caused incorrect behaviour when compiled in UTF
+    mode. In that example, the range a-j was left out of the class.
+
+26. Fix JIT compilation of conditional blocks whose assertion
+    is converted to (*FAIL). E.g.: /(?(?!))/.
+
+27. The pattern /(?(?!)^)/ caused references to random memory. This bug was
+    discovered by the LLVM fuzzer.
+
+28. The assertion (?!) is optimized to (*FAIL). This was not handled correctly
+    when this assertion was used as a condition, for example (?(?!)a|b). In
+    pcre_exec() it worked by luck; in pcre_dfa_exec() it gave an incorrect
+    error about an unsupported item.
+
+29. For some types of pattern, for example /Z*(|d*){216}/, the auto-
+    possessification code could take exponential time to complete. A recursion
+    depth limit of 1000 has been imposed to limit the resources used by this
+    optimization.
+
+30. In a pattern such as /(*UTF)[\S\V\H]/, which contains a negated special
+    class such as \S in non-UCP mode, explicit wide characters (> 255) can be
+    ignored because \S ensures they are all in the class. The code for doing
+    this was interacting badly with the code for computing the amount of space
+    needed to compile the pattern, leading to a buffer overflow. This bug was
+    discovered by the LLVM fuzzer.
+
+31. A pattern such as /((?2)+)((?1))/ which has mutual recursion nested inside
+    other kinds of group caused stack overflow at compile time. This bug was
+    discovered by the LLVM fuzzer.
+
+32. A pattern such as /(?1)(?#?'){8}(a)/ which had a parenthesized comment
+    between a subroutine call and its quantifier was incorrectly compiled,
+    leading to buffer overflow or other errors. This bug was discovered by the
+    LLVM fuzzer.
+
+33. The illegal pattern /(?(?<E>.*!.*)?)/ was not being diagnosed as missing an
+    assertion after (?(. The code was failing to check the character after
+    (?(?< for the ! or = that would indicate a lookbehind assertion. This bug
+    was discovered by the LLVM fuzzer.
+
+34. A pattern such as /X((?2)()*+){2}+/ which has a possessive quantifier with
+    a fixed maximum following a group that contains a subroutine reference was
+    incorrectly compiled and could trigger buffer overflow. This bug was
+    discovered by the LLVM fuzzer.
+
+35. A mutual recursion within a lookbehind assertion such as (?<=((?2))((?1)))
+    caused a stack overflow instead of the diagnosis of a non-fixed length
+    lookbehind assertion. This bug was discovered by the LLVM fuzzer.
+
+36. The use of \K in a positive lookbehind assertion in a non-anchored pattern
+    (e.g. /(?<=\Ka)/) could make pcregrep loop.
+
+37. There was a similar problem to 36 in pcretest for global matches.
+
+38. If a greedy quantified \X was preceded by \C in UTF mode (e.g. \C\X*),
+    and a subsequent item in the pattern caused a non-match, backtracking over
+    the repeated \X did not stop, but carried on past the start of the subject,
+    causing reference to random memory and/or a segfault. There were also some
+    other cases where backtracking after \C could crash. This set of bugs was
+    discovered by the LLVM fuzzer.
+
+39. The function for finding the minimum length of a matching string could take
+    a very long time if mutual recursion was present many times in a pattern,
+    for example, /((?2){73}(?2))((?1))/. A better mutual recursion detection
+    method has been implemented. This infelicity was discovered by the LLVM
+    fuzzer.
+
+40. Static linking against the PCRE library using the pkg-config module was
+    failing on missing pthread symbols.
+
+
 Version 8.36 26-September-2014
 ------------------------------

--- a/pcre/LICENCE
+++ b/pcre/LICENCE
@ -6,7 +6,8 @@ and semantics are as close as possible to those of the Perl 5 language.

 Release 8 of PCRE is distributed under the terms of the "BSD" licence, as
 specified below. The documentation for PCRE, supplied in the "doc"
-directory, is distributed under the same terms as the software itself.
+directory, is distributed under the same terms as the software itself. The data
+in the testdata directory is not copyrighted and is in the public domain.

 The basic library functions are written in C and are freestanding. Also
 included in the distribution is a set of C++ wrapper functions, and a
@ -24,7 +25,7 @@ Email domain:     cam.ac.uk
 University of Cambridge Computing Service,
 Cambridge, England.

-Copyright (c) 1997-2014 University of Cambridge
+Copyright (c) 1997-2015 University of Cambridge
 All rights reserved.


@ -35,7 +36,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Email domain:     freemail.hu

-Copyright(c) 2010-2014 Zoltan Herczeg
+Copyright(c) 2010-2015 Zoltan Herczeg
 All rights reserved.


@ -46,7 +47,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Email domain:     freemail.hu

-Copyright(c) 2009-2014 Zoltan Herczeg
+Copyright(c) 2009-2015 Zoltan Herczeg
 All rights reserved.


--- a/pcre/NEWS
+++ b/pcre/NEWS
@ -1,6 +1,14 @@
 News about PCRE releases
 ------------------------

+Release 8.37 28-April-2015
+--------------------------
+
+This is a bug-fix release. Note that this library (now called PCRE1) is now
+being maintained for bug fixes only. New projects are advised to use the new
+PCRE2 libraries.
+
+
 Release 8.36 26-September-2014
 ------------------------------

--- a/pcre/NON-AUTOTOOLS-BUILD
+++ b/pcre/NON-AUTOTOOLS-BUILD
@ -1,6 +1,14 @@
 Building PCRE without using autotools
 -------------------------------------

+NOTE: This document relates to PCRE releases that use the original API, with
+library names libpcre, libpcre16, and libpcre32. January 2015 saw the first
+release of a new API, known as PCRE2, with release numbers starting at 10.00
+and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old libraries
+(now called PCRE1) are still being maintained for bug fixes, but there will be
+no new development. New projects are advised to use the new PCRE2 libraries.
+
+
 This document contains the following sections:

  General
@ -761,4 +769,4 @@ There is also a mirror here:
  http://www.vsoft-software.com/downloads.html

 ==========================
-Last Updated: 14 May 2013
+Last Updated: 10 February 2015
--- a/pcre/README
+++ b/pcre/README
@ -1,7 +1,16 @@
 README file for PCRE (Perl-compatible regular expression library)
 -----------------------------------------------------------------

-The latest release of PCRE is always available in three alternative formats
+NOTE: This set of files relates to PCRE releases that use the original API,
+with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
+first release of a new API, known as PCRE2, with release numbers starting at
+10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
+libraries (now called PCRE1) are still being maintained for bug fixes, but
+there will be no new development. New projects are advised to use the new PCRE2
+libraries.
+
+
+The latest release of PCRE1 is always available in three alternative formats
 from:

  ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
@ -990,4 +999,4 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 24 October 2014
+Last updated: 10 February 2015
--- a/pcre/RunGrepTest
+++ b/pcre/RunGrepTest
@ -506,6 +506,11 @@ echo "---------------------------- Test 106 -----------------------------" >>tes
 (cd $srcdir; echo "a" | $valgrind $pcregrep -M "|a" ) >>testtrygrep 2>&1
 echo "RC=$?" >>testtrygrep

+echo "---------------------------- Test 107 -----------------------------" >>testtrygrep
+echo "a" >testtemp1grep
+echo "aaaaa" >>testtemp1grep
+(cd $srcdir; $valgrind $pcregrep  --line-offsets '(?<=\Ka)' $builddir/testtemp1grep) >>testtrygrep 2>&1
+echo "RC=$?" >>testtrygrep

 # Now compare the results.

--- a/pcre/configure.ac
+++ b/pcre/configure.ac
@ -9,17 +9,17 @@ dnl The PCRE_PRERELEASE feature is for identifying release candidates. It might
 dnl be defined as -RC2, for example. For real releases, it should be empty.

 m4_define(pcre_major, [8])
-m4_define(pcre_minor, [36])
+m4_define(pcre_minor, [37])
 m4_define(pcre_prerelease, [])
-m4_define(pcre_date, [2014-09-26])
+m4_define(pcre_date, [2015-04-28])

 # NOTE: The CMakeLists.txt file searches for the above variables in the first
 # 50 lines of this file. Please update that if the variables above are moved.

 # Libtool shared library interface versions (current:revision:age)
-m4_define(libpcre_version, [3:4:2])
-m4_define(libpcre16_version, [2:4:2])
-m4_define(libpcre32_version, [0:4:0])
+m4_define(libpcre_version, [3:5:2])
+m4_define(libpcre16_version, [2:5:2])
+m4_define(libpcre32_version, [0:5:0])
 m4_define(libpcreposix_version, [0:3:0])
 m4_define(libpcrecpp_version, [0:1:0])
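
Review note on the libtool bump above: only the revision field changes (4 to 5), which signals a new implementation of an unchanged interface. As a rough sketch of what the current:revision:age triple means for the installed file name, the C helper below applies the mapping libtool commonly uses on Linux/ELF (lib<name>.so.<current-age>.<age>.<revision>). The struct and function names are invented for illustration, and other platforms map the triple differently.

```c
#include <assert.h>
#include <stdio.h>

/* Libtool interface version triple, as in
   m4_define(libpcre_version, [3:5:2]). Hypothetical type for illustration. */
typedef struct {
  unsigned current;   /* newest interface number implemented */
  unsigned revision;  /* implementation revision of that interface */
  unsigned age;       /* number of older interfaces still supported */
} libtool_version;

/* Format the versioned file name under the Linux/ELF convention
   lib<name>.so.<current-age>.<age>.<revision>. This convention is an
   assumption of the sketch; it is not taken from the diff. Returns the
   value of snprintf(), i.e. the length of the formatted name. */
static int
format_linux_soname(char *buf, size_t size, const char *name,
                    libtool_version v)
{
  assert(v.age <= v.current);   /* age can never exceed current */
  return snprintf(buf, size, "lib%s.so.%u.%u.%u", name,
                  v.current - v.age, v.age, v.revision);
}
```

Under that assumption, the triple {3, 5, 2} with the name "pcre" formats as libpcre.so.1.2.5, so the major number seen by existing binaries (1) is unchanged by this release.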

--- a/pcre/doc/html/NON-AUTOTOOLS-BUILD.txt
+++ b/pcre/doc/html/NON-AUTOTOOLS-BUILD.txt
@ -1,6 +1,14 @@
 Building PCRE without using autotools
 -------------------------------------

+NOTE: This document relates to PCRE releases that use the original API, with
+library names libpcre, libpcre16, and libpcre32. January 2015 saw the first
+release of a new API, known as PCRE2, with release numbers starting at 10.00
+and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old libraries
+(now called PCRE1) are still being maintained for bug fixes, but there will be
+no new development. New projects are advised to use the new PCRE2 libraries.
+
+
 This document contains the following sections:

  General
@ -761,4 +769,4 @@ There is also a mirror here:
  http://www.vsoft-software.com/downloads.html

 ==========================
-Last Updated: 14 May 2013
+Last Updated: 10 February 2015
--- a/pcre/doc/html/README.txt
+++ b/pcre/doc/html/README.txt
@ -1,7 +1,16 @@
 README file for PCRE (Perl-compatible regular expression library)
 -----------------------------------------------------------------

-The latest release of PCRE is always available in three alternative formats
+NOTE: This set of files relates to PCRE releases that use the original API,
+with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
+first release of a new API, known as PCRE2, with release numbers starting at
+10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
+libraries (now called PCRE1) are still being maintained for bug fixes, but
+there will be no new development. New projects are advised to use the new PCRE2
+libraries.
+
+
+The latest release of PCRE1 is always available in three alternative formats
 from:

  ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
@ -990,4 +999,4 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 24 October 2014
+Last updated: 10 February 2015
--- a/pcre/doc/html/pcre.html
+++ b/pcre/doc/html/pcre.html
@ -13,13 +13,24 @@ from the original man page. If there is any nonsense in it, please consult the
 man page, in case the conversion went wrong.
 <br>
 <ul>
-<li><a name="TOC1" href="#SEC1">INTRODUCTION</a>
-<li><a name="TOC2" href="#SEC2">SECURITY CONSIDERATIONS</a>
-<li><a name="TOC3" href="#SEC3">USER DOCUMENTATION</a>
-<li><a name="TOC4" href="#SEC4">AUTHOR</a>
-<li><a name="TOC5" href="#SEC5">REVISION</a>
+<li><a name="TOC1" href="#SEC1">PLEASE TAKE NOTE</a>
+<li><a name="TOC2" href="#SEC2">INTRODUCTION</a>
+<li><a name="TOC3" href="#SEC3">SECURITY CONSIDERATIONS</a>
+<li><a name="TOC4" href="#SEC4">USER DOCUMENTATION</a>
+<li><a name="TOC5" href="#SEC5">AUTHOR</a>
+<li><a name="TOC6" href="#SEC6">REVISION</a>
 </ul>
-<br><a name="SEC1" href="#TOC1">INTRODUCTION</a><br>
+<br><a name="SEC1" href="#TOC1">PLEASE TAKE NOTE</a><br>
+<P>
+This document relates to PCRE releases that use the original API,
+with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
+first release of a new API, known as PCRE2, with release numbers starting at
+10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
+libraries (now called PCRE1) are still being maintained for bug fixes, but
+there will be no new development. New projects are advised to use the new PCRE2
+libraries.
+</P>
+<br><a name="SEC2" href="#TOC1">INTRODUCTION</a><br>
 <P>
 The PCRE library is a set of functions that implement regular expression
 pattern matching using the same syntax and semantics as Perl, with just a few
@ -115,7 +126,7 @@ clashes. In some environments, it is possible to control which external symbols
 are exported when a shared library is built, and in these cases the
 undocumented symbols are not exported.
 </P>
-<br><a name="SEC2" href="#TOC1">SECURITY CONSIDERATIONS</a><br>
+<br><a name="SEC3" href="#TOC1">SECURITY CONSIDERATIONS</a><br>
 <P>
 If you are using PCRE in a non-UTF application that permits users to supply
 arbitrary patterns for compilation, you should be aware of a feature that
@ -149,7 +160,7 @@ against this: see the PCRE_EXTRA_MATCH_LIMIT feature in the
 <a href="pcreapi.html"><b>pcreapi</b></a>
 page.
 </P>
-<br><a name="SEC3" href="#TOC1">USER DOCUMENTATION</a><br>
+<br><a name="SEC4" href="#TOC1">USER DOCUMENTATION</a><br>
 <P>
 The user documentation for PCRE comprises a number of different sections. In
 the "man" format, each of these is a separate "man page". In the HTML format,
@ -188,7 +199,7 @@ follows:
 In the "man" and HTML formats, there is also a short page for each C library
 function, listing its arguments and results.
 </P>
-<br><a name="SEC4" href="#TOC1">AUTHOR</a><br>
+<br><a name="SEC5" href="#TOC1">AUTHOR</a><br>
 <P>
 Philip Hazel
 <br>
@ -202,11 +213,11 @@ Putting an actual email address here seems to have been a spam magnet, so I've
 taken it away. If you want to email me, use my two initials, followed by the
 two digits 10, at the domain cam.ac.uk.
 </P>
-<br><a name="SEC5" href="#TOC1">REVISION</a><br>
+<br><a name="SEC6" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 08 January 2014
+Last updated: 10 February 2015
 <br>
-Copyright &copy; 1997-2014 University of Cambridge.
+Copyright &copy; 1997-2015 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE index page</a>.
--- a/pcre/doc/pcre.3
+++ b/pcre/doc/pcre.3
@ -1,6 +1,18 @@
-.TH PCRE 3 "08 January 2014" "PCRE 8.35"
+.TH PCRE 3 "10 February 2015" "PCRE 8.37"
 .SH NAME
-PCRE - Perl-compatible regular expressions
+PCRE - Perl-compatible regular expressions (original API)
+.SH "PLEASE TAKE NOTE"
+.rs
+.sp
+This document relates to PCRE releases that use the original API,
+with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
+first release of a new API, known as PCRE2, with release numbers starting at
+10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
+libraries (now called PCRE1) are still being maintained for bug fixes, but
+there will be no new development. New projects are advised to use the new PCRE2
+libraries.
+.
+.
 .SH INTRODUCTION
 .rs
 .sp
@ -213,6 +225,6 @@ two digits 10, at the domain cam.ac.uk.
 .rs
 .sp
 .nf
-Last updated: 08 January 2014
-Copyright (c) 1997-2014 University of Cambridge.
+Last updated: 10 February 2015
+Copyright (c) 1997-2015 University of Cambridge.
 .fi
--- a/pcre/doc/pcre.txt
+++ b/pcre/doc/pcre.txt
@ -13,7 +13,18 @@ PCRE(3)                    Library Functions Manual                    PCRE(3)


 NAME
-       PCRE - Perl-compatible regular expressions
+       PCRE - Perl-compatible regular expressions (original API)
+
+PLEASE TAKE NOTE
+
+       This  document relates to PCRE releases that use the original API, with
+       library names libpcre, libpcre16, and libpcre32. January 2015  saw  the
+       first release of a new API, known as PCRE2, with release numbers start-
+       ing  at  10.00  and  library   names   libpcre2-8,   libpcre2-16,   and
+       libpcre2-32. The old libraries (now called PCRE1) are still being main-
+       tained for bug fixes,  but  there  will  be  no  new  development.  New
+       projects are advised to use the new PCRE2 libraries.
+

 INTRODUCTION

@ -179,8 +190,8 @@ AUTHOR

 REVISION

-       Last updated: 08 January 2014
-       Copyright (c) 1997-2014 University of Cambridge.
+       Last updated: 10 February 2015
+       Copyright (c) 1997-2015 University of Cambridge.
 ------------------------------------------------------------------------------


--- a/pcre/pcre_compile.c
+++ b/pcre/pcre_compile.c
@ -1704,6 +1704,7 @@ Arguments:
  utf      TRUE in UTF-8 / UTF-16 / UTF-32 mode
  atend    TRUE if called when the pattern is complete
  cd       the "compile data" structure
+  recurses    chain of recurse_check to catch mutual recursion

 Returns:   the fixed length,
             or -1 if there is no fixed length,
@ -1713,10 +1714,11 @@ Returns:   the fixed length,
 */

 static int
-find_fixedlength(pcre_uchar *code, BOOL utf, BOOL atend, compile_data *cd)
+find_fixedlength(pcre_uchar *code, BOOL utf, BOOL atend, compile_data *cd,
+  recurse_check *recurses)
 {
 int length = -1;
-
+recurse_check this_recurse;
 register int branchlength = 0;
 register pcre_uchar *cc = code + 1 + LINK_SIZE;

@ -1741,7 +1743,8 @@ for (;;)
    case OP_ONCE:
    case OP_ONCE_NC:
    case OP_COND:
-    d = find_fixedlength(cc + ((op == OP_CBRA)? IMM2_SIZE : 0), utf, atend, cd);
+    d = find_fixedlength(cc + ((op == OP_CBRA)? IMM2_SIZE : 0), utf, atend, cd,
+      recurses);
    if (d < 0) return d;
    branchlength += d;
    do cc += GET(cc, 1); while (*cc == OP_ALT);
@ -1775,7 +1778,15 @@ for (;;)
    cs = ce = (pcre_uchar *)cd->start_code + GET(cc, 1);  /* Start subpattern */
    do ce += GET(ce, 1); while (*ce == OP_ALT);           /* End subpattern */
    if (cc > cs && cc < ce) return -1;                    /* Recursion */
-    d = find_fixedlength(cs + IMM2_SIZE, utf, atend, cd);
+    else   /* Check for mutual recursion */
+      {
+      recurse_check *r = recurses;
+      for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
+      if (r != NULL) return -1;   /* Mutual recursion */
+      }
+    this_recurse.prev = recurses;
+    this_recurse.group = cs;
+    d = find_fixedlength(cs + IMM2_SIZE, utf, atend, cd, &this_recurse);
    if (d < 0) return d;
    branchlength += d;
    cc += 1 + LINK_SIZE;
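
Review note: the hunk above threads a chain of recurse_check records through find_fixedlength() so that a subroutine call to a group that is already being measured is reported as "no fixed length" instead of recursing without bound. The standalone sketch below shows the same idea on a toy model in which each group has a length of its own and at most one subroutine call. The toy_* names and the data layout are invented for illustration and are not PCRE's.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: group i matches own_len[i] characters and then optionally
   calls group calls[i] as a subroutine (-1 means "no call"). */
typedef struct toy_pattern {
  int ngroups;
  const int *own_len;
  const int *calls;
} toy_pattern;

/* One link per active subroutine call. Each link lives in the stack frame
   of the call that created it, like this_recurse in the diff. */
typedef struct toy_recurse_check {
  const struct toy_recurse_check *prev;
  int group;
} toy_recurse_check;

/* Returns the fixed length of a group, or -1 if the group is directly or
   mutually recursive and therefore has no fixed length. */
static int
toy_fixedlength(const toy_pattern *p, int group,
                const toy_recurse_check *recurses)
{
  const toy_recurse_check *r;
  toy_recurse_check this_recurse;
  int callee, d;

  assert(group >= 0 && group < p->ngroups);
  callee = p->calls[group];

  if (callee < 0) return p->own_len[group];   /* no subroutine call */
  if (callee == group) return -1;             /* direct recursion */

  /* Is the callee already being measured further up the call chain? */
  for (r = recurses; r != NULL; r = r->prev)
    if (r->group == callee) return -1;        /* mutual recursion */

  this_recurse.prev = recurses;
  this_recurse.group = callee;
  d = toy_fixedlength(p, callee, &this_recurse);
  return (d < 0) ? d : p->own_len[group] + d;
}
```

Because every link is a local variable of the frame that pushed it, the chain needs no heap allocation and unwinds automatically when the calls return.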
@ -2129,32 +2140,60 @@ for (;;)
      {
      case OP_CHAR:
      case OP_CHARI:
+      case OP_NOT:
+      case OP_NOTI:
      case OP_EXACT:
      case OP_EXACTI:
+      case OP_NOTEXACT:
+      case OP_NOTEXACTI:
      case OP_UPTO:
      case OP_UPTOI:
+      case OP_NOTUPTO:
+      case OP_NOTUPTOI:
      case OP_MINUPTO:
      case OP_MINUPTOI:
+      case OP_NOTMINUPTO:
+      case OP_NOTMINUPTOI:
      case OP_POSUPTO:
      case OP_POSUPTOI:
+      case OP_NOTPOSUPTO:
+      case OP_NOTPOSUPTOI:
      case OP_STAR:
      case OP_STARI:
+      case OP_NOTSTAR:
+      case OP_NOTSTARI:
      case OP_MINSTAR:
      case OP_MINSTARI:
+      case OP_NOTMINSTAR:
+      case OP_NOTMINSTARI:
      case OP_POSSTAR:
      case OP_POSSTARI:
+      case OP_NOTPOSSTAR:
+      case OP_NOTPOSSTARI:
      case OP_PLUS:
      case OP_PLUSI:
+      case OP_NOTPLUS:
+      case OP_NOTPLUSI:
      case OP_MINPLUS:
      case OP_MINPLUSI:
+      case OP_NOTMINPLUS:
+      case OP_NOTMINPLUSI:
      case OP_POSPLUS:
      case OP_POSPLUSI:
+      case OP_NOTPOSPLUS:
+      case OP_NOTPOSPLUSI:
      case OP_QUERY:
      case OP_QUERYI:
+      case OP_NOTQUERY:
+      case OP_NOTQUERYI:
      case OP_MINQUERY:
      case OP_MINQUERYI:
+      case OP_NOTMINQUERY:
+      case OP_NOTMINQUERYI:
      case OP_POSQUERY:
      case OP_POSQUERYI:
+      case OP_NOTPOSQUERY:
+      case OP_NOTPOSQUERYI:
      if (HAS_EXTRALEN(code[-1])) code += GET_EXTRALEN(code[-1]);
      break;
      }
@ -2334,11 +2373,6 @@ Arguments:
 Returns:      TRUE if what is matched could be empty
 */

-typedef struct recurse_check {
-  struct recurse_check *prev;
-  const pcre_uchar *group;
-} recurse_check;
-
 static BOOL
 could_be_empty_branch(const pcre_uchar *code, const pcre_uchar *endcode,
  BOOL utf, compile_data *cd, recurse_check *recurses)
@ -2469,8 +2503,8 @@ for (code = first_significant_code(code + PRIV(OP_lengths)[*code], TRUE);
      empty_branch = FALSE;
      do
        {
-        if (!empty_branch && could_be_empty_branch(code, endcode, utf, cd, NULL))
-          empty_branch = TRUE;
+        if (!empty_branch && could_be_empty_branch(code, endcode, utf, cd,
+          recurses)) empty_branch = TRUE;
        code += GET(code, 1);
        }
      while (*code == OP_ALT);
@ -3065,7 +3099,7 @@ Returns:      TRUE if the auto-possessification is possible

 static BOOL
 compare_opcodes(const pcre_uchar *code, BOOL utf, const compile_data *cd,
-  const pcre_uint32 *base_list, const pcre_uchar *base_end)
+  const pcre_uint32 *base_list, const pcre_uchar *base_end, int *rec_limit)
 {
 pcre_uchar c;
 pcre_uint32 list[8];
@ -3082,6 +3116,9 @@ pcre_uint32 chr;
 BOOL accepted, invert_bits;
 BOOL entered_a_group = FALSE;

+if (*rec_limit == 0) return FALSE;
+--(*rec_limit);
+
 /* Note: the base_list[1] contains whether the current opcode has greedy
 (represented by a non-zero value) quantifier. This is a different from
 other character type lists, which stores here that the character iterator
@ -3152,7 +3189,8 @@ for(;;)

    while (*next_code == OP_ALT)
      {
-      if (!compare_opcodes(code, utf, cd, base_list, base_end)) return FALSE;
+      if (!compare_opcodes(code, utf, cd, base_list, base_end, rec_limit))
+        return FALSE;
      code = next_code + 1 + LINK_SIZE;
      next_code += GET(next_code, 1);
      }
@ -3172,7 +3210,7 @@ for(;;)
    /* The bracket content will be checked by the
    OP_BRA/OP_CBRA case above. */
    next_code += 1 + LINK_SIZE;
-    if (!compare_opcodes(next_code, utf, cd, base_list, base_end))
+    if (!compare_opcodes(next_code, utf, cd, base_list, base_end, rec_limit))
      return FALSE;

    code += PRIV(OP_lengths)[c];
@ -3605,11 +3643,20 @@ register pcre_uchar c;
 const pcre_uchar *end;
 pcre_uchar *repeat_opcode;
 pcre_uint32 list[8];
+int rec_limit;

 for (;;)
  {
  c = *code;

+  /* When a pattern with bad UTF-8 encoding is compiled with NO_UTF_CHECK,
+  it may compile without complaining, but may get into a loop here if the code
+  pointer points to a bad value. This is, of course, a documented possibility
+  when NO_UTF_CHECK is set, so it isn't a bug, but we can detect this case and
+  just give up on this optimization. */
+
+  if (c >= OP_TABLE_LENGTH) return;
+
  if (c >= OP_STAR && c <= OP_TYPEPOSUPTO)
    {
    c -= get_repeat_base(c) - OP_STAR;
@ -3617,7 +3664,8 @@ for (;;)
      get_chr_property_list(code, utf, cd->fcc, list) : NULL;
    list[1] = c == OP_STAR || c == OP_PLUS || c == OP_QUERY || c == OP_UPTO;

-    if (end != NULL && compare_opcodes(end, utf, cd, list, end))
+    rec_limit = 1000;
+    if (end != NULL && compare_opcodes(end, utf, cd, list, end, &rec_limit))
      {
      switch(c)
        {
@ -3673,7 +3721,8 @@ for (;;)

      list[1] = (c & 1) == 0;

-      if (compare_opcodes(end, utf, cd, list, end))
+      rec_limit = 1000;
+      if (compare_opcodes(end, utf, cd, list, end, &rec_limit))
        {
        switch (c)
          {
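
Review note: the rec_limit parameter added to compare_opcodes() is a call budget shared by the whole recursion. Each caller in auto_possessify() resets it to 1000, every call spends one unit, and an exhausted budget is answered with FALSE, which simply skips the optimization. The sketch below reproduces that pattern on a deliberately exponential toy recursion; toy_compare() is a hypothetical stand-in and does not model PCRE's opcode comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Toy recursion that makes two recursive calls per level, so an unbounded
   run would visit 2^(depth+1) - 1 nodes. The shared counter caps the total
   number of calls across the whole tree, not the nesting depth alone. */
static int
toy_compare(int depth, int *rec_limit)
{
  assert(rec_limit != NULL && depth >= 0);

  if (*rec_limit == 0) return 0;   /* budget spent: give up, answer "no" */
  --(*rec_limit);

  if (depth == 0) return 1;

  /* && short-circuits, so the first failure unwinds without more calls. */
  return toy_compare(depth - 1, rec_limit) &&
         toy_compare(depth - 1, rec_limit);
}
```

Passing the counter by pointer is what makes the limit global: sibling branches draw from the same budget, so the work stays bounded even when no single branch is deep.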
@ -3947,14 +3996,14 @@ Arguments:
  adjust     the amount by which the group is to be moved
  utf        TRUE in UTF-8 / UTF-16 / UTF-32 mode
  cd         contains pointers to tables etc.
-  save_hwm   the hwm forward reference pointer at the start of the group
+  save_hwm_offset   the hwm forward reference offset at the start of the group

 Returns:     nothing
 */

 static void
 adjust_recurse(pcre_uchar *group, int adjust, BOOL utf, compile_data *cd,
-  pcre_uchar *save_hwm)
+  size_t save_hwm_offset)
 {
 pcre_uchar *ptr = group;

@ -3966,7 +4015,8 @@ while ((ptr = (pcre_uchar *)find_recurse(ptr, utf)) != NULL)
  /* See if this recursion is on the forward reference list. If so, adjust the
  reference. */

-  for (hc = save_hwm; hc < cd->hwm; hc += LINK_SIZE)
+  for (hc = (pcre_uchar *)cd->start_workspace + save_hwm_offset; hc < cd->hwm;
+       hc += LINK_SIZE)
    {
    offset = (int)GET(hc, 0);
    if (cd->start_code + offset == ptr + 1)
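
Review note: replacing the save_hwm pointer with save_hwm_offset means the saved position is an offset from cd->start_workspace, so it stays meaningful if the workspace is reallocated and moves. The sketch below illustrates the general technique with a small growable buffer; toy_workspace, toy_expand() and toy_push() are invented names and are not the PCRE workspace code.

```c
#include <assert.h>
#include <stdlib.h>

/* A growable byte buffer whose base address may change when it grows. */
typedef struct toy_workspace {
  unsigned char *start;
  size_t size;
  size_t used;
} toy_workspace;

/* Double the buffer. Returns 0 on success, -1 if allocation fails. */
static int
toy_expand(toy_workspace *w)
{
  size_t newsize = (w->size == 0) ? 16 : w->size * 2;
  unsigned char *p = realloc(w->start, newsize);
  if (p == NULL) return -1;
  w->start = p;          /* any raw pointer into the old block is now stale */
  w->size = newsize;
  return 0;
}

/* Append one byte, growing as needed. Returns the OFFSET of the stored
   byte, which remains valid across later growth, or (size_t)-1 on failure. */
static size_t
toy_push(toy_workspace *w, unsigned char value)
{
  assert(w != NULL);
  while (w->used >= w->size)
    if (toy_expand(w) != 0) return (size_t)-1;
  w->start[w->used] = value;
  return w->used++;
}
```

A caller that remembers the returned offset can always recover the element as w->start[offset], whereas a saved unsigned char * would have to be recomputed after every expansion.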
@ -4171,7 +4221,11 @@ if ((options & PCRE_CASELESS) != 0)
      range. Otherwise, use a recursive call to add the additional range. */

      else if (oc < start && od >= start - 1) start = oc; /* Extend downwards */
-      else if (od > end && oc <= end + 1) end = od;       /* Extend upwards */
+      else if (od > end && oc <= end + 1)
+        {
+        end = od;       /* Extend upwards */
+        if (end > classbits_end) classbits_end = (end <= 0xff ? end : 0xff);
+        }
      else n8 += add_to_class(classbits, uchardptr, options, cd, oc, od);
      }
    }
@ -4411,7 +4465,7 @@ const pcre_uchar *tempptr;
 const pcre_uchar *nestptr = NULL;
 pcre_uchar *previous = NULL;
 pcre_uchar *previous_callout = NULL;
-pcre_uchar *save_hwm = NULL;
+size_t save_hwm_offset = 0;
 pcre_uint8 classbits[32];

 /* We can fish out the UTF-8 setting once and for all into a BOOL, but we
@ -5470,6 +5524,12 @@ for (;; ptr++)
      PUT(previous, 1, (int)(code - previous));
      break;   /* End of class handling */
      }
+
+    /* Even though any XCLASS list is now discarded, we must allow for
+    its memory. */
+
+    if (lengthptr != NULL)
+      *lengthptr += (int)(class_uchardata - class_uchardata_base);
 #endif

    /* If there are no characters > 255, or they are all to be included or
@ -5870,6 +5930,7 @@ for (;; ptr++)
      {
      register int i;
      int len = (int)(code - previous);
+      size_t base_hwm_offset = save_hwm_offset;
      pcre_uchar *bralink = NULL;
      pcre_uchar *brazeroptr = NULL;

@ -5924,7 +5985,7 @@ for (;; ptr++)
        if (repeat_max <= 1)    /* Covers 0, 1, and unlimited */
          {
          *code = OP_END;
-          adjust_recurse(previous, 1, utf, cd, save_hwm);
+          adjust_recurse(previous, 1, utf, cd, save_hwm_offset);
          memmove(previous + 1, previous, IN_UCHARS(len));
          code++;
          if (repeat_max == 0)
@ -5948,7 +6009,7 @@ for (;; ptr++)
          {
          int offset;
          *code = OP_END;
-          adjust_recurse(previous, 2 + LINK_SIZE, utf, cd, save_hwm);
+          adjust_recurse(previous, 2 + LINK_SIZE, utf, cd, save_hwm_offset);
          memmove(previous + 2 + LINK_SIZE, previous, IN_UCHARS(len));
          code += 2 + LINK_SIZE;
          *previous++ = OP_BRAZERO + repeat_type;
@ -6011,26 +6072,25 @@ for (;; ptr++)
            for (i = 1; i < repeat_min; i++)
              {
              pcre_uchar *hc;
-              pcre_uchar *this_hwm = cd->hwm;
+              size_t this_hwm_offset = cd->hwm - cd->start_workspace;
              memcpy(code, previous, IN_UCHARS(len));

              while (cd->hwm > cd->start_workspace + cd->workspace_size -
-                     WORK_SIZE_SAFETY_MARGIN - (this_hwm - save_hwm))
+                     WORK_SIZE_SAFETY_MARGIN -
+                     (this_hwm_offset - base_hwm_offset))
                {
-                size_t save_offset = save_hwm - cd->start_workspace;
-                size_t this_offset = this_hwm - cd->start_workspace;
                *errorcodeptr = expand_workspace(cd);
                if (*errorcodeptr != 0) goto FAILED;
-                save_hwm = (pcre_uchar *)cd->start_workspace + save_offset;
-                this_hwm = (pcre_uchar *)cd->start_workspace + this_offset;
                }

-              for (hc = save_hwm; hc < this_hwm; hc += LINK_SIZE)
+              for (hc = (pcre_uchar *)cd->start_workspace + base_hwm_offset;
+                   hc < (pcre_uchar *)cd->start_workspace + this_hwm_offset;
+                   hc += LINK_SIZE)
                {
                PUT(cd->hwm, 0, GET(hc, 0) + len);
                cd->hwm += LINK_SIZE;
                }
-              save_hwm = this_hwm;
+              base_hwm_offset = this_hwm_offset;
              code += len;
              }
            }
@ -6075,7 +6135,7 @@ for (;; ptr++)
        else for (i = repeat_max - 1; i >= 0; i--)
          {
          pcre_uchar *hc;
-          pcre_uchar *this_hwm = cd->hwm;
+          size_t this_hwm_offset = cd->hwm - cd->start_workspace;

          *code++ = OP_BRAZERO + repeat_type;

@ -6097,22 +6157,21 @@ for (;; ptr++)
          copying them. */

          while (cd->hwm > cd->start_workspace + cd->workspace_size -
-                 WORK_SIZE_SAFETY_MARGIN - (this_hwm - save_hwm))
+                 WORK_SIZE_SAFETY_MARGIN -
+                 (this_hwm_offset - base_hwm_offset))
            {
-            size_t save_offset = save_hwm - cd->start_workspace;
-            size_t this_offset = this_hwm - cd->start_workspace;
            *errorcodeptr = expand_workspace(cd);
            if (*errorcodeptr != 0) goto FAILED;
-            save_hwm = (pcre_uchar *)cd->start_workspace + save_offset;
-            this_hwm = (pcre_uchar *)cd->start_workspace + this_offset;
            }

-          for (hc = save_hwm; hc < this_hwm; hc += LINK_SIZE)
+          for (hc = (pcre_uchar *)cd->start_workspace + base_hwm_offset;
+               hc < (pcre_uchar *)cd->start_workspace + this_hwm_offset;
+               hc += LINK_SIZE)
            {
            PUT(cd->hwm, 0, GET(hc, 0) + len + ((i != 0)? 2+LINK_SIZE : 1));
            cd->hwm += LINK_SIZE;
            }
-          save_hwm = this_hwm;
+          base_hwm_offset = this_hwm_offset;
          code += len;
          }

@ -6208,7 +6267,7 @@ for (;; ptr++)
              {
              int nlen = (int)(code - bracode);
              *code = OP_END;
-              adjust_recurse(bracode, 1 + LINK_SIZE, utf, cd, save_hwm);
+              adjust_recurse(bracode, 1 + LINK_SIZE, utf, cd, save_hwm_offset);
              memmove(bracode + 1 + LINK_SIZE, bracode, IN_UCHARS(nlen));
              code += 1 + LINK_SIZE;
              nlen += 1 + LINK_SIZE;
@ -6342,7 +6401,7 @@ for (;; ptr++)
        else
          {
          *code = OP_END;
-          adjust_recurse(tempcode, 1 + LINK_SIZE, utf, cd, save_hwm);
+          adjust_recurse(tempcode, 1 + LINK_SIZE, utf, cd, save_hwm_offset);
          memmove(tempcode + 1 + LINK_SIZE, tempcode, IN_UCHARS(len));
          code += 1 + LINK_SIZE;
          len += 1 + LINK_SIZE;
@ -6391,7 +6450,7 @@ for (;; ptr++)

        default:
        *code = OP_END;
-        adjust_recurse(tempcode, 1 + LINK_SIZE, utf, cd, save_hwm);
+        adjust_recurse(tempcode, 1 + LINK_SIZE, utf, cd, save_hwm_offset);
        memmove(tempcode + 1 + LINK_SIZE, tempcode, IN_UCHARS(len));
        code += 1 + LINK_SIZE;
        len += 1 + LINK_SIZE;
@ -6420,15 +6479,25 @@ for (;; ptr++)
    parenthesis forms.  */

    case CHAR_LEFT_PARENTHESIS:
-    newoptions = options;
-    skipbytes = 0;
-    bravalue = OP_CBRA;
-    save_hwm = cd->hwm;
-    reset_bracount = FALSE;
-
-    /* First deal with various "verbs" that can be introduced by '*'. */
-
    ptr++;
+
+    /* First deal with comments. Putting this code right at the start ensures
+    that comments have no bad side effects. */
+
+    if (ptr[0] == CHAR_QUESTION_MARK && ptr[1] == CHAR_NUMBER_SIGN)
+      {
+      ptr += 2;
+      while (*ptr != CHAR_NULL && *ptr != CHAR_RIGHT_PARENTHESIS) ptr++;
+      if (*ptr == CHAR_NULL)
+        {
+        *errorcodeptr = ERR18;
+        goto FAILED;
+        }
+      continue;
+      }
+
+    /* Now deal with various "verbs" that can be introduced by '*'. */
+
    if (ptr[0] == CHAR_ASTERISK && (ptr[1] == ':'
         || (MAX_255(ptr[1]) && ((cd->ctypes[ptr[1]] & ctype_letter) != 0))))
      {
@ -6549,10 +6618,18 @@ for (;; ptr++)
      goto FAILED;
      }

+    /* Initialize for "real" parentheses */
+
+    newoptions = options;
+    skipbytes = 0;
+    bravalue = OP_CBRA;
+    save_hwm_offset = cd->hwm - cd->start_workspace;
+    reset_bracount = FALSE;
+
    /* Deal with the extended parentheses; all are introduced by '?', and the
    appearance of any of them means that this is not a capturing group. */

-    else if (*ptr == CHAR_QUESTION_MARK)
+    if (*ptr == CHAR_QUESTION_MARK)
      {
      int i, set, unset, namelen;
      int *optset;
@ -6561,17 +6638,6 @@ for (;; ptr++)

      switch (*(++ptr))
        {
-        case CHAR_NUMBER_SIGN:                 /* Comment; skip to ket */
-        ptr++;
-        while (*ptr != CHAR_NULL && *ptr != CHAR_RIGHT_PARENTHESIS) ptr++;
-        if (*ptr == CHAR_NULL)
-          {
-          *errorcodeptr = ERR18;
-          goto FAILED;
-          }
-        continue;
-
-
        /* ------------------------------------------------------------ */
        case CHAR_VERTICAL_LINE:  /* Reset capture count for each branch */
        reset_bracount = TRUE;
@ -6620,8 +6686,13 @@ for (;; ptr++)
        if (tempptr[1] == CHAR_QUESTION_MARK &&
              (tempptr[2] == CHAR_EQUALS_SIGN ||
               tempptr[2] == CHAR_EXCLAMATION_MARK ||
-               tempptr[2] == CHAR_LESS_THAN_SIGN))
+                 (tempptr[2] == CHAR_LESS_THAN_SIGN &&
+                   (tempptr[3] == CHAR_EQUALS_SIGN ||
+                    tempptr[3] == CHAR_EXCLAMATION_MARK))))
+          {
+          cd->iscondassert = TRUE;
          break;
+          }

        /* Other conditions use OP_CREF/OP_DNCREF/OP_RREF/OP_DNRREF, and all
        need to skip at least 1+IMM2_SIZE bytes at the start of the group. */
@ -6698,8 +6769,7 @@ for (;; ptr++)
            ptr++;
            }
          namelen = (int)(ptr - name);
-          if (lengthptr != NULL && (options & PCRE_DUPNAMES) != 0)
-            *lengthptr += IMM2_SIZE;
+          if (lengthptr != NULL) *lengthptr += IMM2_SIZE;
          }

        /* Check the terminator */
@ -6735,6 +6805,7 @@ for (;; ptr++)
            goto FAILED;
            }
          PUT2(code, 2+LINK_SIZE, recno);
+          if (recno > cd->top_backref) cd->top_backref = recno;
          break;
          }

@ -6757,6 +6828,7 @@ for (;; ptr++)
          int offset = i++;
          int count = 1;
          recno = GET2(slot, 0);   /* Number from first found */
+          if (recno > cd->top_backref) cd->top_backref = recno;
          for (; i < cd->names_found; i++)
            {
            slot += cd->name_entry_size;
@ -7114,11 +7186,11 @@ for (;; ptr++)

          if (!is_recurse) cd->namedrefcount++;

-          /* If duplicate names are permitted, we have to allow for a named
-          reference to a duplicated name (this cannot be determined until the
-          second pass). This needs an extra 16-bit data item. */
+          /* We have to allow for a named reference to a duplicated name (this
+          cannot be determined until the second pass). This needs an extra
+          16-bit data item. */

-          if ((options & PCRE_DUPNAMES) != 0) *lengthptr += IMM2_SIZE;
+          *lengthptr += IMM2_SIZE;
          }

        /* In the real compile, search the name table. We check the name
@ -7475,12 +7547,22 @@ for (;; ptr++)
      goto FAILED;
      }

-    /* Assertions used not to be repeatable, but this was changed for Perl
-    compatibility, so all kinds can now be repeated. We copy code into a
+    /* All assertions used not to be repeatable, but this was changed for Perl
+    compatibility. All kinds can now be repeated except for assertions that are
+    conditions (Perl also forbids these to be repeated). We copy code into a
    non-register variable (tempcode) in order to be able to pass its address
-    because some compilers complain otherwise. */
+    because some compilers complain otherwise. At the start of a conditional
+    group whose condition is an assertion, cd->iscondassert is set. We unset it
+    here so as to allow assertions later in the group to be quantified. */
+
+    if (bravalue >= OP_ASSERT && bravalue <= OP_ASSERTBACK_NOT &&
+        cd->iscondassert)
+      {
+      previous = NULL;
+      cd->iscondassert = FALSE;
+      }
+    else previous = code;

-    previous = code;                      /* For handling repetition */
    *code = bravalue;
    tempcode = code;
    tempreqvary = cd->req_varyopt;        /* Save value before bracket */
@ -7727,7 +7809,7 @@ for (;; ptr++)
        const pcre_uchar *p;
        pcre_uint32 cf;

-        save_hwm = cd->hwm;   /* Normally this is set when '(' is read */
+        save_hwm_offset = cd->hwm - cd->start_workspace;   /* Normally this is set when '(' is read */
        terminator = (*(++ptr) == CHAR_LESS_THAN_SIGN)?
          CHAR_GREATER_THAN_SIGN : CHAR_APOSTROPHE;

@ -8054,6 +8136,7 @@ int length;
 unsigned int orig_bracount;
 unsigned int max_bracount;
 branch_chain bc;
+size_t save_hwm_offset;

 /* If set, call the external function that checks for stack availability. */

@ -8071,6 +8154,8 @@ bc.current_branch = code;
 firstchar = reqchar = 0;
 firstcharflags = reqcharflags = REQ_UNSET;

+save_hwm_offset = cd->hwm - cd->start_workspace;
+
 /* Accumulate the length for use in the pre-compile phase. Start with the
 length of the BRA and KET and any extra bytes that are required at the
 beginning. We accumulate in a local variable to save frequent testing of
@ -8212,7 +8297,7 @@ for (;;)
      int fixed_length;
      *code = OP_END;
      fixed_length = find_fixedlength(last_branch,  (options & PCRE_UTF8) != 0,
-        FALSE, cd);
+        FALSE, cd, NULL);
      DPRINTF(("fixed length = %d\n", fixed_length));
      if (fixed_length == -3)
        {
@ -8273,7 +8358,7 @@ for (;;)
        {
        *code = OP_END;
        adjust_recurse(start_bracket, 1 + LINK_SIZE,
-          (options & PCRE_UTF8) != 0, cd, cd->hwm);
+          (options & PCRE_UTF8) != 0, cd, save_hwm_offset);
        memmove(start_bracket + 1 + LINK_SIZE, start_bracket,
          IN_UCHARS(code - start_bracket));
        *start_bracket = OP_ONCE;
@ -8497,6 +8582,7 @@ do {
       case OP_RREF:
       case OP_DNRREF:
       case OP_DEF:
+       case OP_FAIL:
       return FALSE;

       default:     /* Assertion */
@ -9081,6 +9167,7 @@ cd->dupnames = FALSE;
 cd->namedrefcount = 0;
 cd->start_code = cworkspace;
 cd->hwm = cworkspace;
+cd->iscondassert = FALSE;
 cd->start_workspace = cworkspace;
 cd->workspace_size = COMPILE_WORK_SIZE;
 cd->named_groups = named_groups;
@ -9118,13 +9205,6 @@ if (length > MAX_PATTERN_SIZE)
  goto PCRE_EARLY_ERROR_RETURN;
  }

-/* If there are groups with duplicate names and there are also references by
-name, we must allow for the possibility of named references to duplicated
-groups. These require an extra data item each. */
-
-if (cd->dupnames && cd->namedrefcount > 0)
-  length += cd->namedrefcount * IMM2_SIZE * sizeof(pcre_uchar);
-
 /* Compute the size of the data block for storing the compiled pattern. Integer
 overflow should no longer be possible because nowadays we limit the maximum
 value of cd->names_found and cd->name_entry_size. */
@ -9183,6 +9263,7 @@ cd->name_table = (pcre_uchar *)re + re->name_table_offset;
 codestart = cd->name_table + re->name_entry_size * re->name_count;
 cd->start_code = codestart;
 cd->hwm = (pcre_uchar *)(cd->start_workspace);
+cd->iscondassert = FALSE;
 cd->req_varyopt = 0;
 cd->had_accept = FALSE;
 cd->had_pruneorskip = FALSE;
@ -9319,7 +9400,7 @@ if (cd->check_lookbehind)
      int end_op = *be;
      *be = OP_END;
      fixed_length = find_fixedlength(cc, (re->options & PCRE_UTF8) != 0, TRUE,
-        cd);
+        cd, NULL);
      *be = end_op;
      DPRINTF(("fixed length = %d\n", fixed_length));
      if (fixed_length < 0)
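
Note on the save_hwm to save_hwm_offset change in the pcre_compile.c hunks above: the removed lines that recomputed save_hwm and this_hwm after expand_workspace() suggest that the workspace base can move when it grows, so a raw pointer saved earlier may go stale while an offset from the base stays valid. The following is a minimal sketch of that idea, not PCRE's code. The workspace struct and the ws_* helpers are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer standing in for the compile workspace. */
typedef struct {
  unsigned char *start;   /* base of the allocation; may move on growth */
  unsigned char *hwm;     /* high-water mark: next free byte */
  size_t size;            /* current capacity in bytes */
} workspace;

/* Allocate the initial buffer. Returns 0 on success, -1 on failure. */
static int ws_init(workspace *ws, size_t size)
{
  assert(size > 0);
  ws->start = malloc(size);
  if (ws->start == NULL) return -1;
  ws->hwm = ws->start;
  ws->size = size;
  return 0;
}

/* Double the capacity by copying into a fresh block. Only hwm is rebased
   here; any other raw pointer into the old block is stale afterwards. */
static int ws_expand(workspace *ws)
{
  size_t used = (size_t)(ws->hwm - ws->start);
  size_t newsize = ws->size * 2;
  unsigned char *p = malloc(newsize);
  if (p == NULL) return -1;
  memcpy(p, ws->start, used);
  free(ws->start);
  ws->start = p;
  ws->hwm = p + used;
  ws->size = newsize;
  return 0;
}

/* Append one byte, growing the buffer first if it is full. */
static int ws_push(workspace *ws, unsigned char b)
{
  if ((size_t)(ws->hwm - ws->start) == ws->size && ws_expand(ws) != 0)
    return -1;
  *ws->hwm++ = b;
  return 0;
}

/* Remember the current position as an offset from the base. */
static size_t ws_mark(const workspace *ws)
{
  return (size_t)(ws->hwm - ws->start);
}

/* Turn a saved offset back into a pointer against the current base. */
static unsigned char *ws_at(const workspace *ws, size_t offset)
{
  assert(offset <= (size_t)(ws->hwm - ws->start));
  return ws->start + offset;
}
```

A caller would take ws_mark() before appending and call ws_at() only at the point of use, which is the same shape as computing `cd->start_workspace + save_hwm_offset` inside the loop header rather than caching the pointer.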
--- a/pcre/pcre_dfa_exec.c
+++ b/pcre/pcre_dfa_exec.c
@ -2736,9 +2736,10 @@ for (;;)
            condcode == OP_DNRREF)
          return PCRE_ERROR_DFA_UCOND;

-        /* The DEFINE condition is always false */
+        /* The DEFINE condition is always false, and the assertion (?!) is
+        converted to OP_FAIL. */

-        if (condcode == OP_DEF)
+        if (condcode == OP_DEF || condcode == OP_FAIL)
          { ADD_ACTIVE(state_offset + codelink + LINK_SIZE + 1, 0); }

        /* The only supported version of OP_RREF is for the value RREF_ANY,
--- a/pcre/pcre_exec.c
+++ b/pcre/pcre_exec.c
@ -1136,93 +1136,81 @@ for (;;)
    printf("\n");
 #endif

-    if (offset < md->offset_max)
+    if (offset >= md->offset_max) goto POSSESSIVE_NON_CAPTURE;
+
+    matched_once = FALSE;
+    code_offset = (int)(ecode - md->start_code);
+
+    save_offset1 = md->offset_vector[offset];
+    save_offset2 = md->offset_vector[offset+1];
+    save_offset3 = md->offset_vector[md->offset_end - number];
+    save_capture_last = md->capture_last;
+
+    DPRINTF(("saving %d %d %d\n", save_offset1, save_offset2, save_offset3));
+
+    /* Each time round the loop, save the current subject position for use
+    when the group matches. For MATCH_MATCH, the group has matched, so we
+    restart it with a new subject starting position, remembering that we had
+    at least one match. For MATCH_NOMATCH, carry on with the alternatives, as
+    usual. If we haven't matched any alternatives in any iteration, check to
+    see if a previous iteration matched. If so, the group has matched;
+    continue from afterwards. Otherwise it has failed; restore the previous
+    capture values before returning NOMATCH. */
+
+    for (;;)
      {
-      matched_once = FALSE;
-      code_offset = (int)(ecode - md->start_code);
-
-      save_offset1 = md->offset_vector[offset];
-      save_offset2 = md->offset_vector[offset+1];
-      save_offset3 = md->offset_vector[md->offset_end - number];
-      save_capture_last = md->capture_last;
-
-      DPRINTF(("saving %d %d %d\n", save_offset1, save_offset2, save_offset3));
-
-      /* Each time round the loop, save the current subject position for use
-      when the group matches. For MATCH_MATCH, the group has matched, so we
-      restart it with a new subject starting position, remembering that we had
-      at least one match. For MATCH_NOMATCH, carry on with the alternatives, as
-      usual. If we haven't matched any alternatives in any iteration, check to
-      see if a previous iteration matched. If so, the group has matched;
-      continue from afterwards. Otherwise it has failed; restore the previous
-      capture values before returning NOMATCH. */
-
-      for (;;)
+      md->offset_vector[md->offset_end - number] =
+        (int)(eptr - md->start_subject);
+      if (op >= OP_SBRA) md->match_function_type = MATCH_CBEGROUP;
+      RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md,
+        eptrb, RM63);
+      if (rrc == MATCH_KETRPOS)
        {
-        md->offset_vector[md->offset_end - number] =
-          (int)(eptr - md->start_subject);
-        if (op >= OP_SBRA) md->match_function_type = MATCH_CBEGROUP;
-        RMATCH(eptr, ecode + PRIV(OP_lengths)[*ecode], offset_top, md,
-          eptrb, RM63);
-        if (rrc == MATCH_KETRPOS)
+        offset_top = md->end_offset_top;
+        ecode = md->start_code + code_offset;
+        save_capture_last = md->capture_last;
+        matched_once = TRUE;
+        mstart = md->start_match_ptr;    /* In case \K changed it */
+        if (eptr == md->end_match_ptr)   /* Matched an empty string */
          {
-          offset_top = md->end_offset_top;
-          ecode = md->start_code + code_offset;
-          save_capture_last = md->capture_last;
-          matched_once = TRUE;
-          mstart = md->start_match_ptr;    /* In case \K changed it */
-          if (eptr == md->end_match_ptr)   /* Matched an empty string */
-            {
-            do ecode += GET(ecode, 1); while (*ecode == OP_ALT);
-            break;
-            }
-          eptr = md->end_match_ptr;
-          continue;
+          do ecode += GET(ecode, 1); while (*ecode == OP_ALT);
+          break;
          }
-
-        /* See comment in the code for capturing groups above about handling
-        THEN. */
-
-        if (rrc == MATCH_THEN)
-          {
-          next = ecode + GET(ecode,1);
-          if (md->start_match_ptr < next &&
-              (*ecode == OP_ALT || *next == OP_ALT))
-            rrc = MATCH_NOMATCH;
-          }
-
-        if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-        md->capture_last = save_capture_last;
-        ecode += GET(ecode, 1);
-        if (*ecode != OP_ALT) break;
+        eptr = md->end_match_ptr;
+        continue;
        }

-      if (!matched_once)
+      /* See comment in the code for capturing groups above about handling
+      THEN. */
+
+      if (rrc == MATCH_THEN)
        {
-        md->offset_vector[offset] = save_offset1;
-        md->offset_vector[offset+1] = save_offset2;
-        md->offset_vector[md->offset_end - number] = save_offset3;
+        next = ecode + GET(ecode,1);
+        if (md->start_match_ptr < next &&
+            (*ecode == OP_ALT || *next == OP_ALT))
+          rrc = MATCH_NOMATCH;
        }

-      if (allow_zero || matched_once)
-        {
-        ecode += 1 + LINK_SIZE;
-        break;
-        }
-
-      RRETURN(MATCH_NOMATCH);
+      if (rrc != MATCH_NOMATCH) RRETURN(rrc);
+      md->capture_last = save_capture_last;
+      ecode += GET(ecode, 1);
+      if (*ecode != OP_ALT) break;
      }

-    /* FALL THROUGH ... Insufficient room for saving captured contents. Treat
-    as a non-capturing bracket. */
+    if (!matched_once)
+      {
+      md->offset_vector[offset] = save_offset1;
+      md->offset_vector[offset+1] = save_offset2;
+      md->offset_vector[md->offset_end - number] = save_offset3;
+      }

-    /* VVVVVVVVVVVVVVVVVVVVVVVVV */
-    /* VVVVVVVVVVVVVVVVVVVVVVVVV */
+    if (allow_zero || matched_once)
+      {
+      ecode += 1 + LINK_SIZE;
+      break;
+      }

-    DPRINTF(("insufficient capture room: treat as non-capturing\n"));
-
-    /* VVVVVVVVVVVVVVVVVVVVVVVVV */
-    /* VVVVVVVVVVVVVVVVVVVVVVVVV */
+    RRETURN(MATCH_NOMATCH);

    /* Non-capturing possessive bracket with unlimited repeat. We come here
    from BRAZERO with allow_zero = TRUE. The code is similar to the above,
@ -1388,6 +1376,7 @@ for (;;)
      break;

      case OP_DEF:     /* DEFINE - always false */
+      case OP_FAIL:    /* From optimized (?!) condition */
      break;

      /* The condition is an assertion. Call match() to evaluate it - setting
@ -1404,8 +1393,11 @@ for (;;)
        condition = TRUE;

        /* Advance ecode past the assertion to the start of the first branch,
-        but adjust it so that the general choosing code below works. */
+        but adjust it so that the general choosing code below works. If the
+        assertion has a quantifier that allows zero repeats we must skip over
+        the BRAZERO. This is a lunatic thing to do, but somebody did! */

+        if (*ecode == OP_BRAZERO) ecode++;
        ecode += GET(ecode, 1);
        while (*ecode == OP_ALT) ecode += GET(ecode, 1);
        ecode += 1 + LINK_SIZE - PRIV(OP_lengths)[condcode];
@ -1474,7 +1466,18 @@ for (;;)
      md->offset_vector[offset] =
        md->offset_vector[md->offset_end - number];
      md->offset_vector[offset+1] = (int)(eptr - md->start_subject);
-      if (offset_top <= offset) offset_top = offset + 2;
+
+      /* If this group is at or above the current highwater mark, ensure that
+      any groups between the current high water mark and this group are marked
+      unset and then update the high water mark. */
+
+      if (offset >= offset_top)
+        {
+        register int *iptr = md->offset_vector + offset_top;
+        register int *iend = md->offset_vector + offset;
+        while (iptr < iend) *iptr++ = -1;
+        offset_top = offset + 2;
+        }
      }
    ecode += 1 + IMM2_SIZE;
    break;
@ -1826,7 +1829,11 @@ for (;;)
        are defined in a range that can be tested for. */

        if (rrc >= MATCH_BACKTRACK_MIN && rrc <= MATCH_BACKTRACK_MAX)
+          {
+          if (new_recursive.offset_save != stacksave)
+            (PUBL(free))(new_recursive.offset_save);
          RRETURN(MATCH_NOMATCH);
+          }

        /* Any return code other than NOMATCH is an error. */

@ -3476,7 +3483,7 @@ for (;;)
          if (possessive) continue;    /* No backtracking */
          for(;;)
            {
-            if (eptr == pp) goto TAIL_RECURSE;
+            if (eptr <= pp) goto TAIL_RECURSE;
            RMATCH(eptr, ecode, offset_top, md, eptrb, RM23);
            if (rrc != MATCH_NOMATCH) RRETURN(rrc);
 #ifdef SUPPORT_UCP
@ -3897,7 +3904,7 @@ for (;;)
          if (possessive) continue;    /* No backtracking */
          for(;;)
            {
-            if (eptr == pp) goto TAIL_RECURSE;
+            if (eptr <= pp) goto TAIL_RECURSE;
            RMATCH(eptr, ecode, offset_top, md, eptrb, RM30);
            if (rrc != MATCH_NOMATCH) RRETURN(rrc);
            eptr--;
@ -4032,7 +4039,7 @@ for (;;)
          if (possessive) continue;    /* No backtracking */
          for(;;)
            {
-            if (eptr == pp) goto TAIL_RECURSE;
+            if (eptr <= pp) goto TAIL_RECURSE;
            RMATCH(eptr, ecode, offset_top, md, eptrb, RM34);
            if (rrc != MATCH_NOMATCH) RRETURN(rrc);
            eptr--;
@ -5603,7 +5610,7 @@ for (;;)
        if (possessive) continue;    /* No backtracking */
        for(;;)
          {
-          if (eptr == pp) goto TAIL_RECURSE;
+          if (eptr <= pp) goto TAIL_RECURSE;
          RMATCH(eptr, ecode, offset_top, md, eptrb, RM44);
          if (rrc != MATCH_NOMATCH) RRETURN(rrc);
          eptr--;
@ -5645,12 +5652,17 @@ for (;;)

        if (possessive) continue;    /* No backtracking */

+        /* We use <= pp rather than == pp to detect the start of the run while
+        backtracking because the use of \C in UTF mode can cause BACKCHAR to
+        move back past pp. This is just palliative; the use of \C in UTF mode
+        is fraught with danger. */
+
        for(;;)
          {
          int lgb, rgb;
          PCRE_PUCHAR fptr;

-          if (eptr == pp) goto TAIL_RECURSE;   /* At start of char run */
+          if (eptr <= pp) goto TAIL_RECURSE;   /* At start of char run */
          RMATCH(eptr, ecode, offset_top, md, eptrb, RM45);
          if (rrc != MATCH_NOMATCH) RRETURN(rrc);

@ -5668,7 +5680,7 @@ for (;;)

          for (;;)
            {
-            if (eptr == pp) goto TAIL_RECURSE;   /* At start of char run */
+            if (eptr <= pp) goto TAIL_RECURSE;   /* At start of char run */
            fptr = eptr - 1;
            if (!utf) c = *fptr; else
              {
@ -5918,7 +5930,7 @@ for (;;)
        if (possessive) continue;    /* No backtracking */
        for(;;)
          {
-          if (eptr == pp) goto TAIL_RECURSE;
+          if (eptr <= pp) goto TAIL_RECURSE;
          RMATCH(eptr, ecode, offset_top, md, eptrb, RM46);
          if (rrc != MATCH_NOMATCH) RRETURN(rrc);
          eptr--;
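
Note on the offset_top hunk in pcre_exec.c above (ChangeLog item 1 for 8.37): when a group closes at or above the current high-water mark, the slots for any groups between the old mark and this group are set to -1 so that they read as unset instead of holding leftover values. A minimal sketch of that bookkeeping follows, assuming a flat int array of start/end pairs. The function name set_capture and its parameters are hypothetical and are not part of PCRE.

```c
#include <assert.h>

/* Store the start/end pair for the group whose first slot is `offset`
   (an even index into ovector). If that slot is at or above the current
   high-water mark, mark every slot between the old mark and this group as
   unset (-1) and raise the mark. Returns the new high-water mark. */
static int set_capture(int *ovector, int offset_top, int offset,
                       int start, int end)
{
  int i;

  assert(offset >= 0 && offset % 2 == 0);
  ovector[offset] = start;
  ovector[offset + 1] = end;

  if (offset >= offset_top)
    {
    for (i = offset_top; i < offset; i++) ovector[i] = -1;
    offset_top = offset + 2;
    }
  return offset_top;
}
```

With a mark of 2 and a capture stored at slot 6, slots 2 to 5 become -1 and the mark moves to 8. A later capture below the mark only overwrites its own pair.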
--- a/pcre/pcre_internal.h
+++ b/pcre/pcre_internal.h
@ -2446,6 +2446,7 @@ typedef struct compile_data {
  BOOL had_pruneorskip;             /* (*PRUNE) or (*SKIP) encountered */
  BOOL check_lookbehind;            /* Lookbehinds need later checking */
  BOOL dupnames;                    /* Duplicate names exist */
+  BOOL iscondassert;                /* Next assert is a condition */
  int  nltype;                      /* Newline type */
  int  nllen;                       /* Newline string length */
  pcre_uchar nl[4];                 /* Newline string when fixed length */
@ -2459,6 +2460,13 @@ typedef struct branch_chain {
  pcre_uchar *current_branch;
 } branch_chain;

+/* Structure for mutual recursion detection. */
+
+typedef struct recurse_check {
+  struct recurse_check *prev;
+  const pcre_uchar *group;
+} recurse_check;
+
 /* Structure for items in a linked list that represents an explicit recursive
 call within the pattern; used by pcre_exec(). */

--- a/pcre/pcre_jit_compile.c
+++ b/pcre/pcre_jit_compile.c
--- a/pcre/pcre_jit_test.c
+++ b/pcre/pcre_jit_test.c
@ -51,8 +51,6 @@ POSSIBILITY OF SUCH DAMAGE.

 #include "pcre_internal.h"

-#define PCRE_BUG 0x80000000
-
 /*
 Letter characters:
   \xe6\x92\xad = 0x64ad = 25773 (kanji)
@ -69,6 +67,9 @@ POSSIBILITY OF SUCH DAMAGE.
      \xc3\x89 = 0xc9 = 201 (E')
   \xc3\xa1 = 0xe1 = 225 (a')
      \xc3\x81 = 0xc1 = 193 (A')
+   \x53 = 0x53 = S
+     \x73 = 0x73 = s
+     \xc5\xbf = 0x17f = 383 (long S)
   \xc8\xba = 0x23a = 570
      \xe2\xb1\xa5 = 0x2c65 = 11365
   \xe1\xbd\xb8 = 0x1f78 = 8056
@ -78,6 +79,10 @@ POSSIBILITY OF SUCH DAMAGE.
   \xc7\x84 = 0x1c4 = 452
     \xc7\x85 = 0x1c5 = 453
     \xc7\x86 = 0x1c6 = 454
+ Caseless sets:
+   ucp_Armenian - \x{531}-\x{556} -> \x{561}-\x{586}
+   ucp_Coptic - \x{2c80}-\x{2ce3} -> caseless: XOR 0x1
+   ucp_Latin - \x{ff21}-\x{ff3a} -> \x{ff41}-\x{ff5a}

 Mark property:
   \xcc\x8d = 0x30d = 781
@ -626,6 +631,9 @@ static struct regression_test_case regression_test_cases[] = {
 	{ MUA, 0, "(?P<Name>a)?(?P<Name2>b)?(?(Name)c|d)+?dd", "bcabcacdb bdddd" },
 	{ MUA, 0, "(?P<Name>a)?(?P<Name2>b)?(?(Name)c|d)+l", "ababccddabdbccd abcccl" },
 	{ MUA, 0, "((?:a|aa)(?(1)aaa))x", "aax" },
+	{ MUA, 0, "(?(?!)a|b)", "ab" },
+	{ MUA, 0, "(?(?!)a)", "ab" },
+	{ MUA, 0 | F_NOMATCH, "(?(?!)a|b)", "ac" },

 	/* Set start of match. */
 	{ MUA, 0, "(?:\\Ka)*aaaab", "aaaaaaaa aaaaaaabb" },
@ -944,7 +952,7 @@ static void setstack16(pcre16_extra *extra)

 	pcre16_assign_jit_stack(extra, callback16, getstack16());
 }
-#endif /* SUPPORT_PCRE8 */
+#endif /* SUPPORT_PCRE16 */

 #ifdef SUPPORT_PCRE32
 static pcre32_jit_stack *stack32;
@ -967,7 +975,7 @@ static void setstack32(pcre32_extra *extra)

 	pcre32_assign_jit_stack(extra, callback32, getstack32());
 }
-#endif /* SUPPORT_PCRE8 */
+#endif /* SUPPORT_PCRE32 */

 #ifdef SUPPORT_PCRE16

@ -1177,7 +1185,7 @@ static int regression_tests(void)
 #elif defined SUPPORT_PCRE16
 	pcre16_config(PCRE_CONFIG_UTF16, &utf);
 	pcre16_config(PCRE_CONFIG_UNICODE_PROPERTIES, &ucp);
-#elif defined SUPPORT_PCRE16
+#elif defined SUPPORT_PCRE32
 	pcre32_config(PCRE_CONFIG_UTF32, &utf);
 	pcre32_config(PCRE_CONFIG_UNICODE_PROPERTIES, &ucp);
 #endif
--- a/pcre/pcre_study.c
+++ b/pcre/pcre_study.c
@ -70,7 +70,7 @@ Arguments:
  code            pointer to start of group (the bracket)
  startcode       pointer to start of the whole pattern's code
  options         the compiling options
-  int             RECURSE depth
+  recurses        chain of recurse_check to catch mutual recursion

 Returns:   the minimum length
           -1 if \C in UTF-8 mode or (*ACCEPT) was encountered
@ -80,12 +80,13 @@ Returns:   the minimum length

 static int
 find_minlength(const REAL_PCRE *re, const pcre_uchar *code,
-  const pcre_uchar *startcode, int options, int recurse_depth)
+  const pcre_uchar *startcode, int options, recurse_check *recurses)
 {
 int length = -1;
 /* PCRE_UTF16 has the same value as PCRE_UTF8. */
 BOOL utf = (options & PCRE_UTF8) != 0;
 BOOL had_recurse = FALSE;
+recurse_check this_recurse;
 register int branchlength = 0;
 register pcre_uchar *cc = (pcre_uchar *)code + 1 + LINK_SIZE;

@ -130,7 +131,7 @@ for (;;)
    case OP_SBRAPOS:
    case OP_ONCE:
    case OP_ONCE_NC:
-    d = find_minlength(re, cc, startcode, options, recurse_depth);
+    d = find_minlength(re, cc, startcode, options, recurses);
    if (d < 0) return d;
    branchlength += d;
    do cc += GET(cc, 1); while (*cc == OP_ALT);
@ -393,7 +394,7 @@ for (;;)
        ce = cs = (pcre_uchar *)PRIV(find_bracket)(startcode, utf, GET2(slot, 0));
        if (cs == NULL) return -2;
        do ce += GET(ce, 1); while (*ce == OP_ALT);
-        if (cc > cs && cc < ce)
+        if (cc > cs && cc < ce)     /* Simple recursion */
          {
          d = 0;
          had_recurse = TRUE;
@ -401,8 +402,22 @@ for (;;)
          }
        else
          {
-          int dd = find_minlength(re, cs, startcode, options, recurse_depth);
-          if (dd < d) d = dd;
+          recurse_check *r = recurses;
+          for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
+          if (r != NULL)           /* Mutual recursion */
+            {
+            d = 0;
+            had_recurse = TRUE;
+            break;
+            }
+          else
+            {
+            int dd;
+            this_recurse.prev = recurses;
+            this_recurse.group = cs;
+            dd = find_minlength(re, cs, startcode, options, &this_recurse);
+            if (dd < d) d = dd;
+            }
          }
        slot += re->name_entry_size;
        }
@ -418,14 +433,26 @@ for (;;)
      ce = cs = (pcre_uchar *)PRIV(find_bracket)(startcode, utf, GET2(cc, 1));
      if (cs == NULL) return -2;
      do ce += GET(ce, 1); while (*ce == OP_ALT);
-      if (cc > cs && cc < ce)
+      if (cc > cs && cc < ce)    /* Simple recursion */
        {
        d = 0;
        had_recurse = TRUE;
        }
      else
        {
-        d = find_minlength(re, cs, startcode, options, recurse_depth);
+        recurse_check *r = recurses;
+        for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
+        if (r != NULL)           /* Mutual recursion */
+          {
+          d = 0;
+          had_recurse = TRUE;
+          }
+        else
+          {
+          this_recurse.prev = recurses;
+          this_recurse.group = cs;
+          d = find_minlength(re, cs, startcode, options, &this_recurse);
+          }
        }
      }
    else d = 0;
@ -474,12 +501,21 @@ for (;;)
    case OP_RECURSE:
    cs = ce = (pcre_uchar *)startcode + GET(cc, 1);
    do ce += GET(ce, 1); while (*ce == OP_ALT);
-    if ((cc > cs && cc < ce) || recurse_depth > 10)
+    if (cc > cs && cc < ce)    /* Simple recursion */
      had_recurse = TRUE;
    else
      {
-      branchlength += find_minlength(re, cs, startcode, options,
-        recurse_depth + 1);
+      recurse_check *r = recurses;
+      for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
+      if (r != NULL)           /* Mutual recursion */
+        had_recurse = TRUE;
+      else
+        {
+        this_recurse.prev = recurses;
+        this_recurse.group = cs;
+        branchlength += find_minlength(re, cs, startcode, options,
+          &this_recurse);
+        }
      }
    cc += 1 + LINK_SIZE;
    break;
@ -1503,7 +1539,7 @@ if ((re->options & PCRE_ANCHORED) == 0 &&

 /* Find the minimum length of subject string. */

-switch(min = find_minlength(re, code, code, re->options, 0))
+switch(min = find_minlength(re, code, code, re->options, NULL))
  {
  case -2: *errorptr = "internal error: missing capturing bracket"; return NULL;
  case -3: *errorptr = "internal error: opcode not recognized"; return NULL;
--- a/pcre/pcregrep.c
+++ b/pcre/pcregrep.c
@ -1582,12 +1582,15 @@ while (ptr < endptr)
  int endlinelength;
  int mrc = 0;
  int startoffset = 0;
+  int prevoffsets[2];
  unsigned int options = 0;
  BOOL match;
  char *matchptr = ptr;
  char *t = ptr;
  size_t length, linelength;

+  prevoffsets[0] = prevoffsets[1] = -1;
+
  /* At this point, ptr is at the start of a line. We need to find the length
  of the subject string to pass to pcre_exec(). In multiline mode, it is the
  length remainder of the data in the buffer. Otherwise, it is the length of
@ -1729,55 +1732,86 @@ while (ptr < endptr)
      {
      if (!invert)
        {
-        if (printname != NULL) fprintf(stdout, "%s:", printname);
-        if (number) fprintf(stdout, "%d:", linenumber);
+        int oldstartoffset = startoffset;

-        /* Handle --line-offsets */
+        /* It is possible, when a lookbehind assertion contains \K, for the
+        same string to be found again. The code below advances startoffset, but
+        until it is past the "bumpalong" offset that gave the match, the same
+        substring will be returned. The PCRE1 library does not return the
+        bumpalong offset, so all we can do is ignore repeated strings. (PCRE2
+        does this better.) */

-        if (line_offsets)
-          fprintf(stdout, "%d,%d\n", (int)(matchptr + offsets[0] - ptr),
-            offsets[1] - offsets[0]);
-
-        /* Handle --file-offsets */
-
-        else if (file_offsets)
-          fprintf(stdout, "%d,%d\n",
-            (int)(filepos + matchptr + offsets[0] - ptr),
-            offsets[1] - offsets[0]);
-
-        /* Handle --only-matching, which may occur many times */
-
-        else
+        if (prevoffsets[0] != offsets[0] || prevoffsets[1] != offsets[1])
          {
-          BOOL printed = FALSE;
-          omstr *om;
+          prevoffsets[0] = offsets[0];
+          prevoffsets[1] = offsets[1];

-          for (om = only_matching; om != NULL; om = om->next)
+          if (printname != NULL) fprintf(stdout, "%s:", printname);
+          if (number) fprintf(stdout, "%d:", linenumber);
+
+          /* Handle --line-offsets */
+
+          if (line_offsets)
+            fprintf(stdout, "%d,%d\n", (int)(matchptr + offsets[0] - ptr),
+              offsets[1] - offsets[0]);
+
+          /* Handle --file-offsets */
+
+          else if (file_offsets)
+            fprintf(stdout, "%d,%d\n",
+              (int)(filepos + matchptr + offsets[0] - ptr),
+              offsets[1] - offsets[0]);
+
+          /* Handle --only-matching, which may occur many times */
+
+          else
            {
-            int n = om->groupnum;
-            if (n < mrc)
+            BOOL printed = FALSE;
+            omstr *om;
+
+            for (om = only_matching; om != NULL; om = om->next)
              {
-              int plen = offsets[2*n + 1] - offsets[2*n];
-              if (plen > 0)
+              int n = om->groupnum;
+              if (n < mrc)
                {
-                if (printed) fprintf(stdout, "%s", om_separator);
-                if (do_colour) fprintf(stdout, "%c[%sm", 0x1b, colour_string);
-                FWRITE(matchptr + offsets[n*2], 1, plen, stdout);
-                if (do_colour) fprintf(stdout, "%c[00m", 0x1b);
-                printed = TRUE;
+                int plen = offsets[2*n + 1] - offsets[2*n];
+                if (plen > 0)
+                  {
+                  if (printed) fprintf(stdout, "%s", om_separator);
+                  if (do_colour) fprintf(stdout, "%c[%sm", 0x1b, colour_string);
+                  FWRITE(matchptr + offsets[n*2], 1, plen, stdout);
+                  if (do_colour) fprintf(stdout, "%c[00m", 0x1b);
+                  printed = TRUE;
+                  }
                }
              }
-            }

-          if (printed || printname != NULL || number) fprintf(stdout, "\n");
+            if (printed || printname != NULL || number) fprintf(stdout, "\n");
+            }
          }

-        /* Prepare to repeat to find the next match */
+        /* Prepare to repeat to find the next match. If the pattern contained
+        a lookbehind that included \K, it is possible that the end of the match
+        might be at or before the actual starting offset we have just used. We
+        need to start one character further on. Unfortunately, for unanchored
+        patterns, the actual start offset can be greater than the one that was
+        set as a result of "bumpalong". PCRE1 does not return the actual start
+        offset, so we have to check against the original start offset. This may
+        lead to duplicates - so we need the fudge above to avoid printing them.
+        (PCRE2 does this better.) */

        match = FALSE;
        if (line_buffered) fflush(stdout);
        rc = 0;                      /* Had some success */
        startoffset = offsets[1];    /* Restart after the match */
+        if (startoffset <= oldstartoffset)
+          {
+          if ((size_t)startoffset >= length)
+            goto END_ONE_MATCH;              /* We were at the end */
+          startoffset = oldstartoffset + 1;
+          if (utf8)
+            while ((matchptr[startoffset] & 0xc0) == 0x80) startoffset++;
+          }
        goto ONLY_MATCHING_RESTART;
        }
      }
@ -1974,6 +2008,7 @@ while (ptr < endptr)
  /* Advance to after the newline and increment the line number. The file
  offset to the current line is maintained in filepos. */

+  END_ONE_MATCH:
  ptr += linelength + endlinelength;
  filepos += (int)(linelength + endlinelength);
  linenumber++;
--- a/pcre/pcretest.c
+++ b/pcre/pcretest.c
@ -2257,16 +2257,19 @@ if (callout_extra)
  fprintf(f, "Callout %d: last capture = %d\n",
    cb->callout_number, cb->capture_last);

-  for (i = 0; i < cb->capture_top * 2; i += 2)
+  if (cb->offset_vector != NULL)
    {
-    if (cb->offset_vector[i] < 0)
-      fprintf(f, "%2d: <unset>\n", i/2);
-    else
+    for (i = 0; i < cb->capture_top * 2; i += 2)
      {
-      fprintf(f, "%2d: ", i/2);
-      PCHARSV(cb->subject, cb->offset_vector[i],
-        cb->offset_vector[i+1] - cb->offset_vector[i], f);
-      fprintf(f, "\n");
+      if (cb->offset_vector[i] < 0)
+        fprintf(f, "%2d: <unset>\n", i/2);
+      else
+        {
+        fprintf(f, "%2d: ", i/2);
+        PCHARSV(cb->subject, cb->offset_vector[i],
+          cb->offset_vector[i+1] - cb->offset_vector[i], f);
+        fprintf(f, "\n");
+        }
      }
    }
  }
@ -2519,7 +2522,7 @@ re->name_entry_size = swap_uint16(re->name_entry_size);
 re->name_count = swap_uint16(re->name_count);
 re->ref_count = swap_uint16(re->ref_count);

-if (extra != NULL)
+if (extra != NULL && (extra->flags & PCRE_EXTRA_STUDY_DATA) != 0)
  {
  pcre_study_data *rsd = (pcre_study_data *)(extra->study_data);
  rsd->size = swap_uint32(rsd->size);
@ -2700,7 +2703,7 @@ re->name_entry_size = swap_uint16(re->name_entry_size);
 re->name_count = swap_uint16(re->name_count);
 re->ref_count = swap_uint16(re->ref_count);

-if (extra != NULL)
+if (extra != NULL && (extra->flags & PCRE_EXTRA_STUDY_DATA) != 0)
  {
  pcre_study_data *rsd = (pcre_study_data *)(extra->study_data);
  rsd->size = swap_uint32(rsd->size);
@ -3453,7 +3456,7 @@ while (!done)
  pcre_extra *extra = NULL;

 #if !defined NOPOSIX  /* There are still compilers that require no indent */
-  regex_t preg;
+  regex_t preg = { NULL, 0, 0} ;
  int do_posix = 0;
 #endif

@ -5603,6 +5606,12 @@ while (!done)

      if (!do_g && !do_G) break;

+      if (use_offsets == NULL)
+        {
+        fprintf(outfile, "Cannot do global matching without an ovector\n");
+        break;
+        }
+
      /* If we have matched an empty string, first check to see if we are at
      the end of the subject. If so, the /g loop is over. Otherwise, mimic what
      Perl's /g options does. This turns out to be rather cunning. First we set
@ -5618,9 +5627,33 @@ while (!done)
        g_notempty = PCRE_NOTEMPTY_ATSTART | PCRE_ANCHORED;
        }

-      /* For /g, update the start offset, leaving the rest alone */
+      /* For /g, update the start offset, leaving the rest alone. There is a
+      tricky case when \K is used in a positive lookbehind assertion. This can
+      cause the end of the match to be less than or equal to the start offset.
+      In this case we restart at one past the start offset. This may return the
+      same match if the original start offset was bumped along during the
+      match, but eventually the new start offset will hit the actual start
+      offset. (In PCRE2 the true start offset is available, and this can be
+      done better. It is not worth doing more than making sure we do not loop
+      at this stage in the life of PCRE1.) */

-      if (do_g) start_offset = use_offsets[1];
+      if (do_g)
+        {
+        if (g_notempty == 0 && use_offsets[1] <= start_offset)
+          {
+          if (start_offset >= len) break;  /* End of subject */
+          start_offset++;
+          if (use_utf)
+            {
+            while (start_offset < len)
+              {
+              if ((bptr[start_offset] & 0xc0) != 0x80) break;
+              start_offset++;
+              }
+            }
+          }
+        else start_offset = use_offsets[1];
+        }

      /* For /G, update the pointer and length */

@ -5637,7 +5670,7 @@ while (!done)
  CONTINUE:

 #if !defined NOPOSIX
-  if (posix || do_posix) regfree(&preg);
+  if ((posix || do_posix) && preg.re_pcre != 0) regfree(&preg);
 #endif

  if (re != NULL) new_free(re);
--- a/pcre/testdata/grepoutput
+++ b/pcre/testdata/grepoutput
@ -743,3 +743,11 @@ RC=0
 ---------------------------- Test 106 -----------------------------
 a
 RC=0
+---------------------------- Test 107 -----------------------------
+1:0,1
+2:0,1
+2:1,1
+2:2,1
+2:3,1
+2:4,1
+RC=0
--- a/pcre/testdata/testinput1
+++ b/pcre/testdata/testinput1
@ -5720,4 +5720,14 @@ AbcdCBefgBhiBqz
 /[\Q]a\E]+/
    aa]]

+/(?:((abcd))|(((?:(?:(?:(?:abc|(?:abcdef))))b)abcdefghi)abc)|((*ACCEPT)))/
+    1234abcd
+
+/(\2)(\1)/
+
+"Z*(|d*){216}"
+
+"(?1)(?#?'){8}(a)"
+    baaaaaaaaac
+
 /-- End of testinput1 --/
--- a/pcre/testdata/testinput11
+++ b/pcre/testdata/testinput11
@ -134,4 +134,6 @@ is required for these tests. --/

 /(((a\2)|(a*)\g<-1>))*a?/B

+/((?+1)(\1))/B
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testinput12
+++ b/pcre/testdata/testinput12
@ -87,4 +87,12 @@ and a couple of things that are different with JIT. --/
 /^12345678abcd/mS++
    12345678abcd

+/-- Test pattern compilation --/ 
+
+/(?:a|b|c|d|e)(?R)/S++
+
+/(?:a|b|c|d|e)(?R)(?R)/S++
+
+/(a(?:a|b|c|d|e)b){8,16}/S++
+
 /-- End of testinput12 --/
--- a/pcre/testdata/testinput2
+++ b/pcre/testdata/testinput2
@ -1380,6 +1380,8 @@
    1X
    123456\P

+//KF>/dev/null
+
 /abc/IS>testsavedregex
 <testsavedregex
    abc
@ -4078,4 +4080,76 @@ backtracking verbs. --/

 /\x{whatever}/

+"((?=(?(?=(?(?=(?(?=()))))))))"
+    a
+
+"(?(?=)==)(((((((((?=)))))))))"
+    a
+
+/^(?:(a)|b)(?(1)A|B)/I
+    aA123\O3
+    aA123\O6
+
+'^(?:(?<AA>a)|b)(?(<AA>)A|B)'
+    aA123\O3
+    aA123\O6
+
+'^(?<AA>)(?:(?<AA>a)|b)(?(<AA>)A|B)'J
+    aA123\O3
+    aA123\O6
+
+'^(?:(?<AA>X)|)(?:(?<AA>a)|b)\k{AA}'J
+    aa123\O3
+    aa123\O6
+
+/(?<N111>(?J)(?<N111>1(111111)11|)1|1|)(?(<N111>)1)/
+
+/(?(?=0)?)+/
+
+/(?(?=0)(?=00)?00765)/
+     00765
+
+/(?(?=0)(?=00)?00765|(?!3).56)/
+     00765
+     456
+     ** Failers
+     356   
+
+'^(a)*+(\w)'
+    g
+    g\O3
+
+'^(?:a)*+(\w)'
+    g
+    g\O3
+
+//C
+    \O\C+
+
+"((?2){0,1999}())?"
+
+/((?+1)(\1))/BZ
+
+/(?(?!)a|b)/
+    bbb
+    aaa 
+
+"((?2)+)((?1))"
+
+"(?(?<E>.*!.*)?)"
+
+"X((?2)()*+){2}+"BZ
+
+"X((?2)()*+){2}"BZ
+
+"(?<=((?2))((?1)))"
+
+/(?<=\Ka)/g+
+    aaaaa
+
+/(?<=\Ka)/G+
+    aaaaa
+
+/((?2){73}(?2))((?1))/
+
 /-- End of testinput2 --/
--- a/pcre/testdata/testinput4
+++ b/pcre/testdata/testinput4
@ -722,4 +722,9 @@
 /^#[^\x{ffff}]#[^\x{ffff}]#[^\x{ffff}]#/8
    #\x{10000}#\x{100}#\x{10ffff}#

+"[\S\V\H]"8
+
+/\C(\W?ſ)'?{{/8
+    \\C(\\W?ſ)'?{{
+
 /-- End of testinput4 --/
--- a/pcre/testdata/testinput5
+++ b/pcre/testdata/testinput5
@ -790,4 +790,12 @@

 /[b-d\x{200}-\x{250}]*[ae-h]?#[\x{200}-\x{250}]{0,8}[\x00-\xff]*#[\x{200}-\x{250}]+[a-z]/8BZ

+/[^\xff]*PRUNE:\x{100}abc(xyz(?1))/8DZ
+
+/(?<=\K\x{17f})/8g+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+
+/(?<=\K\x{17f})/8G+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+
 /-- End of testinput5 --/
--- a/pcre/testdata/testinput6
+++ b/pcre/testdata/testinput6
@ -1496,4 +1496,10 @@
 /^s?c/mi8
    scat

+/[A-`]/i8
+    abcdefghijklmno
+
+/\C\X*QT/8
+    Ӆ\x0aT
+
 /-- End of testinput6 --/
--- a/pcre/testdata/testinput8
+++ b/pcre/testdata/testinput8
@ -4837,4 +4837,8 @@
 '\A(?:[^\"]++|\"(?:[^\"]++|\"\")*+\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED

+/(?(?!)a|b)/
+    bbb
+    aaa 
+
 /-- End of testinput8 --/
--- a/pcre/testdata/testoutput1
+++ b/pcre/testdata/testoutput1
@ -9411,4 +9411,22 @@ No match
    aa]]
 0: aa]]

+/(?:((abcd))|(((?:(?:(?:(?:abc|(?:abcdef))))b)abcdefghi)abc)|((*ACCEPT)))/
+    1234abcd
+ 0: 
+ 1: <unset>
+ 2: <unset>
+ 3: <unset>
+ 4: <unset>
+ 5: 
+
+/(\2)(\1)/
+
+"Z*(|d*){216}"
+
+"(?1)(?#?'){8}(a)"
+    baaaaaaaaac
+ 0: aaaaaaaaa
+ 1: a
+
 /-- End of testinput1 --/
--- a/pcre/testdata/testoutput11-16
+++ b/pcre/testdata/testoutput11-16
@ -231,7 +231,7 @@ Memory allocation (code space): 73
 ------------------------------------------------------------------

 /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
-Memory allocation (code space): 57
+Memory allocation (code space): 61
 ------------------------------------------------------------------
  0  24 Bra
  2   5 CBra 1
@ -733,4 +733,19 @@ Memory allocation (code space): 14
 41     End
 ------------------------------------------------------------------

+/((?+1)(\1))/B
+------------------------------------------------------------------
+  0  20 Bra
+  2  16 Once
+  4  12 CBra 1
+  7   9 Recurse
+  9   5 CBra 2
+ 12     \1
+ 14   5 Ket
+ 16  12 Ket
+ 18  16 Ket
+ 20  20 Ket
+ 22     End
+------------------------------------------------------------------
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testoutput11-32
+++ b/pcre/testdata/testoutput11-32
@ -231,7 +231,7 @@ Memory allocation (code space): 155
 ------------------------------------------------------------------

 /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
-Memory allocation (code space): 117
+Memory allocation (code space): 125
 ------------------------------------------------------------------
  0  24 Bra
  2   5 CBra 1
@ -733,4 +733,19 @@ Memory allocation (code space): 28
 41     End
 ------------------------------------------------------------------

+/((?+1)(\1))/B
+------------------------------------------------------------------
+  0  20 Bra
+  2  16 Once
+  4  12 CBra 1
+  7   9 Recurse
+  9   5 CBra 2
+ 12     \1
+ 14   5 Ket
+ 16  12 Ket
+ 18  16 Ket
+ 20  20 Ket
+ 22     End
+------------------------------------------------------------------
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testoutput11-8
+++ b/pcre/testdata/testoutput11-8
@ -231,7 +231,7 @@ Memory allocation (code space): 45
 ------------------------------------------------------------------

 /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
-Memory allocation (code space): 34
+Memory allocation (code space): 38
 ------------------------------------------------------------------
  0  30 Bra
  3   7 CBra 1
@ -733,4 +733,19 @@ Memory allocation (code space): 10
 60     End
 ------------------------------------------------------------------

+/((?+1)(\1))/B
+------------------------------------------------------------------
+  0  31 Bra
+  3  25 Once
+  6  19 CBra 1
+ 11  14 Recurse
+ 14   8 CBra 2
+ 19     \1
+ 22   8 Ket
+ 25  19 Ket
+ 28  25 Ket
+ 31  31 Ket
+ 34     End
+------------------------------------------------------------------
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testoutput12
+++ b/pcre/testdata/testoutput12
@ -176,4 +176,12 @@ No match, mark = m (JIT)
    12345678abcd
 0: 12345678abcd (JIT)

+/-- Test pattern compilation --/ 
+
+/(?:a|b|c|d|e)(?R)/S++
+
+/(?:a|b|c|d|e)(?R)(?R)/S++
+
+/(a(?:a|b|c|d|e)b){8,16}/S++
+
 /-- End of testinput12 --/
--- a/pcre/testdata/testoutput2
+++ b/pcre/testdata/testoutput2
@ -561,7 +561,7 @@ Failed: assertion expected after (?( at offset 3
 Failed: reference to non-existent subpattern at offset 7

 /(?(?<ab))/
-Failed: syntax error in subpattern name (missing terminator) at offset 7
+Failed: assertion expected after (?( at offset 3

 /((?s)blah)\s+\1/I
 Capturing subpattern count = 1
@ -1566,30 +1566,35 @@ Need char = 'b'

 /a(?(1)b)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 No need char

 /a(?(1)bag|big)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 Need char = 'g'

 /a(?(1)bag|big)*(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 No need char

 /a(?(1)bag|big)+(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 Need char = 'g'

 /a(?(1)b..|b..)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 Need char = 'b'
@ -3379,24 +3384,28 @@ Need char = 'a'

 /(?(1)ab|ac)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 No need char

 /(?(1)abz|acz)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 Need char = 'z'

 /(?(1)abz)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 No first char
 No need char

 /(?(1)abz)(1)23/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 No first char
 Need char = '3'
@ -5605,6 +5614,10 @@ No match
    123456\P
 No match

+//KF>/dev/null
+Compiled pattern written to /dev/null
+Study data written to /dev/null
+
 /abc/IS>testsavedregex
 Capturing subpattern count = 0
 No options
@ -6336,6 +6349,7 @@ No need char

 /^(?P<A>a)?(?(A)a|b)/I
 Capturing subpattern count = 1
+Max back reference = 1
 Named capturing subpatterns:
  A   1
 Options: anchored
@ -6353,6 +6367,7 @@ No match

 /(?:(?(ZZ)a|b)(?P<ZZ>X))+/I
 Capturing subpattern count = 1
+Max back reference = 1
 Named capturing subpatterns:
  ZZ   1
 No options
@ -6370,6 +6385,7 @@ Failed: reference to non-existent subpattern at offset 9

 /(?:(?(ZZ)a|b)(?(ZZ)a|b)(?P<ZZ>X))+/I
 Capturing subpattern count = 1
+Max back reference = 1
 Named capturing subpatterns:
  ZZ   1
 No options
@ -6381,6 +6397,7 @@ Need char = 'X'

 /(?:(?(ZZ)a|\(b\))\\(?P<ZZ>X))+/I
 Capturing subpattern count = 1
+Max back reference = 1
 Named capturing subpatterns:
  ZZ   1
 No options
@ -10226,6 +10243,7 @@ No starting char list
  (?(1)|.)                    # check that there was an empty component
  /xiIS
 Capturing subpattern count = 1
+Max back reference = 1
 Options: anchored caseless extended
 No first char
 Need char = ':'
@ -10255,6 +10273,7 @@ Failed: different names for subpatterns of the same number are not allowed at of
    b(?<quote> (?<apostrophe>')|(?<realquote>")) ) 
    (?('quote')[a-z]+|[0-9]+)/JIx
 Capturing subpattern count = 6
+Max back reference = 1
 Named capturing subpatterns:
  apostrophe   2
  apostrophe   5
@ -10317,6 +10336,7 @@ No match
        End
 ------------------------------------------------------------------
 Capturing subpattern count = 4
+Max back reference = 4
 Named capturing subpatterns:
  D   4
  D   1
@ -10364,6 +10384,7 @@ No match
        End
 ------------------------------------------------------------------
 Capturing subpattern count = 4
+Max back reference = 1
 Named capturing subpatterns:
  A   1
  A   4
@ -10486,6 +10507,7 @@ No starting char list
    
 /()i(?(1)a)/SI 
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 No first char
 Need char = 'i'
@ -14206,4 +14228,199 @@ Failed: digits missing in \x{} or \o{} at offset 3
 /\x{whatever}/
 Failed: non-hex character in \x{} (closing brace missing?) at offset 3

+"((?=(?(?=(?(?=(?(?=()))))))))"
+    a
+ 0: 
+ 1: 
+ 2: 
+
+"(?(?=)==)(((((((((?=)))))))))"
+    a
+No match
+
+/^(?:(a)|b)(?(1)A|B)/I
+Capturing subpattern count = 1
+Max back reference = 1
+Options: anchored
+No first char
+No need char
+    aA123\O3
+Matched, but too many substrings
+ 0: aA
+    aA123\O6
+ 0: aA
+ 1: a
+
+'^(?:(?<AA>a)|b)(?(<AA>)A|B)'
+    aA123\O3
+Matched, but too many substrings
+ 0: aA
+    aA123\O6
+ 0: aA
+ 1: a
+
+'^(?<AA>)(?:(?<AA>a)|b)(?(<AA>)A|B)'J
+    aA123\O3
+Matched, but too many substrings
+ 0: aA
+    aA123\O6
+Matched, but too many substrings
+ 0: aA
+ 1: 
+
+'^(?:(?<AA>X)|)(?:(?<AA>a)|b)\k{AA}'J
+    aa123\O3
+Matched, but too many substrings
+ 0: aa
+    aa123\O6
+Matched, but too many substrings
+ 0: aa
+ 1: <unset>
+
+/(?<N111>(?J)(?<N111>1(111111)11|)1|1|)(?(<N111>)1)/
+
+/(?(?=0)?)+/
+Failed: nothing to repeat at offset 7
+
+/(?(?=0)(?=00)?00765)/
+     00765
+ 0: 00765
+
+/(?(?=0)(?=00)?00765|(?!3).56)/
+     00765
+ 0: 00765
+     456
+ 0: 456
+     ** Failers
+No match
+     356   
+No match
+
+'^(a)*+(\w)'
+    g
+ 0: g
+ 1: <unset>
+ 2: g
+    g\O3
+Matched, but too many substrings
+ 0: g
+
+'^(?:a)*+(\w)'
+    g
+ 0: g
+ 1: g
+    g\O3
+Matched, but too many substrings
+ 0: g
+
+//C
+    \O\C+
+Callout 255: last capture = -1
+--->
+ +0 ^    
+Matched, but too many substrings
+
+"((?2){0,1999}())?"
+
+/((?+1)(\1))/BZ
+------------------------------------------------------------------
+        Bra
+        Once
+        CBra 1
+        Recurse
+        CBra 2
+        \1
+        Ket
+        Ket
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+
+/(?(?!)a|b)/
+    bbb
+ 0: b
+    aaa 
+No match
+
+"((?2)+)((?1))"
+
+"(?(?<E>.*!.*)?)"
+Failed: assertion expected after (?( at offset 3
+
+"X((?2)()*+){2}+"BZ
+------------------------------------------------------------------
+        Bra
+        X
+        Once
+        CBra 1
+        Recurse
+        Braposzero
+        SCBraPos 2
+        KetRpos
+        Ket
+        CBra 1
+        Recurse
+        Braposzero
+        SCBraPos 2
+        KetRpos
+        Ket
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+
+"X((?2)()*+){2}"BZ
+------------------------------------------------------------------
+        Bra
+        X
+        CBra 1
+        Recurse
+        Braposzero
+        SCBraPos 2
+        KetRpos
+        Ket
+        CBra 1
+        Recurse
+        Braposzero
+        SCBraPos 2
+        KetRpos
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+
+"(?<=((?2))((?1)))"
+Failed: lookbehind assertion is not fixed length at offset 17
+
+/(?<=\Ka)/g+
+    aaaaa
+ 0: a
+ 0+ aaaa
+ 0: a
+ 0+ aaaa
+ 0: a
+ 0+ aaa
+ 0: a
+ 0+ aa
+ 0: a
+ 0+ a
+ 0: a
+ 0+ 
+
+/(?<=\Ka)/G+
+    aaaaa
+ 0: a
+ 0+ aaaa
+ 0: a
+ 0+ aaa
+ 0: a
+ 0+ aa
+ 0: a
+ 0+ a
+ 0: a
+ 0+ 
+
+/((?2){73}(?2))((?1))/
+
 /-- End of testinput2 --/
--- a/pcre/testdata/testoutput4
+++ b/pcre/testdata/testoutput4
@ -1271,4 +1271,10 @@ No match
    #\x{10000}#\x{100}#\x{10ffff}#
 0: #\x{10000}#\x{100}#\x{10ffff}#

+"[\S\V\H]"8
+
+/\C(\W?ſ)'?{{/8
+    \\C(\\W?ſ)'?{{
+No match
+
 /-- End of testinput4 --/
--- a/pcre/testdata/testoutput5
+++ b/pcre/testdata/testoutput5
@ -1897,4 +1897,49 @@ Failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 5
        End
 ------------------------------------------------------------------

+/[^\xff]*PRUNE:\x{100}abc(xyz(?1))/8DZ
+------------------------------------------------------------------
+        Bra
+        [^\x{ff}]*
+        PRUNE:\x{100}abc
+        CBra 1
+        xyz
+        Recurse
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+No first char
+Need char = 'z'
+
+/(?<=\K\x{17f})/8g+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}
+ 0: \x{17f}
+ 0+ 
+
+/(?<=\K\x{17f})/8G+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}
+ 0: \x{17f}
+ 0+ 
+
 /-- End of testinput5 --/
--- a/pcre/testdata/testoutput6
+++ b/pcre/testdata/testoutput6
@ -2461,4 +2461,12 @@ No match
    scat
 0: sc

+/[A-`]/i8
+    abcdefghijklmno
+ 0: a
+
+/\C\X*QT/8
+    Ӆ\x0aT
+No match
+
 /-- End of testinput6 --/
--- a/pcre/testdata/testoutput8
+++ b/pcre/testdata/testoutput8
@ -7785,4 +7785,10 @@ Matched, but offsets vector is too small to show all matches
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 0: NON QUOTED "QUOT""ED" AFTER 

+/(?(?!)a|b)/
+    bbb
+ 0: b
+    aaa 
+No match
+
 /-- End of testinput8 --/