Most of the regular expression engines use backtracking to try all possible execution paths of the regular expression when evaluating
an input, in some cases it can cause performance issues, called catastrophic backtracking situations. In the worst case, the complexity
of the regular expression is exponential in the size of the input, this means that a small carefully-crafted input (like 20 chars) can trigger
catastrophic backtracking and cause a denial of service of the application. Super-linear regex complexity can lead to the same impact too
with, in this case, a large carefully-crafted input (thousands chars).
This rule detects regular expression patterns known to have potential performance issues:
Nested quantifiers which are quantifiers inside a group that is itself repeated by a quantifier (eg: /(a+)+/). There is a risk if you answered yes to any of those questions.
To avoid catastrophic backtracking situations, when possible:
{1,5} instead of
+ for instance. nested quantifiers to limit the number of way the inner group can be matched by the outer quantifier, for instance this
nested quantifier situation (ba+)+ doesn't cause performance issues, indeed, the inner group can be matched only if it exists exactly
one b char per repetition of the group. possessive quantifiers and atomic grouping. The first regex evaluation will never end in OpenJDK <= 9 and the second regex evaluation will never end in any versions of
OpenJDK:
java.util.regex.Pattern.compile("(a+)+").matcher(
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaa!").matches(); // Sensitive
java.util.regex.Pattern.compile("(h|h|ih(((i|a|c|c|a|i|i|j|b|a|i|b|a|a|j))+h)ahbfhba|c|i)*").matcher(
"hchcchicihcchciiicichhcichcihcchiihichiciiiihhcchi"+
"cchhcihchcihiihciichhccciccichcichiihcchcihhicchcciicchcccihiiihhihihihi"+
"chicihhcciccchihhhcchichchciihiicihciihcccciciccicciiiiiiiiicihhhiiiihchccch"+
"chhhhiiihchihcccchhhiiiiiiiicicichicihcciciihichhhhchihciiihhiccccccciciihh"+
"ichiccchhicchicihihccichicciihcichccihhiciccccccccichhhhihihhcchchihih"+
"iihhihihihicichihiiiihhhhihhhchhichiicihhiiiiihchccccchichci").matches(); // Sensitive
Possessive quantifiers do not keep backtracking positions, thus can be used, if possible, to avoid performance issues:
java.util.regex.Pattern.compile("(a+)++").matcher(
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaa!").matches(); // Compliant
java.util.regex.Pattern.compile("(h|h|ih(((i|a|c|c|a|i|i|j|b|a|i|b|a|a|j))+h)ahbfhba|c|i)*+").matcher(
"hchcchicihcchciiicichhcichcihcchiihichiciiiihhcchi"+
"cchhcihchcihiihciichhccciccichcichiihcchcihhicchcciicchcccihiiihhihihihi"+
"chicihhcciccchihhhcchichchciihiicihciihcccciciccicciiiiiiiiicihhhiiiihchccch"+
"chhhhiiihchihcccchhhiiiiiiiicicichicihcciciihichhhhchihciiihhiccccccciciihh"+
"ichiccchhicchicihihccichicciihcichccihhiciccccccccichhhhihihhcchchihih"+
"iihhihihihicichihiiiihhhhihhhchhichiicihhiiiiihchccccchichci").matches(); // Compliant