Module lessons (4/4)
Writing ReDoS-safe patterns
ReDoS (Regular expression Denial of Service) is a class of attacks where a carefully crafted input makes the matching time of a regex explode. The JS engine uses backtracking: if a pattern is "ambiguous", on an unlucky input it can perform an exponential number of attempts.
The classic catastrophic pattern
Pattern: ^(a+)+$
Input: "aaaaaaaaaaaaaaaaaaaaab"To match, the engine tries every possible partition of "a" between
the two + quantifiers. On 20 "a"s that's 2^20 = ~1 million attempts.
On 30 "a"s, a billion.
Other dangerous examples:
(a*)*b-- "nested stars".(a|a)*-- overlapping alternatives.(\w+)+@-- greedy groups with an outer quantifier.
How to spot the danger
Look for these structures:
- Quantifier inside a quantifier:
(...+)+,(...*)*,(...{n,})+. - Overlapping alternatives under a quantifier:
(a|ab)*,(\w|\d)+. - Lookahead/lookbehind with complex alternatives repeated.
Defensive rewriting
Dangerous: ^(\w+\s+)+$
Safe: ^(?:\w+\s+)*\w+$The safe version separates the base case from the repeated case, removing the ambiguity. Other techniques:
- Possessive quantifiers (
a++) -- not supported in JS, but available in other engines. - Atomic patterns (
(?>...)) -- not in JS. - Limit the length of the input before matching.
- Timeout on matching (this course's editor uses a worker with a 500 ms timeout).
Recommendations for ReDoS-safe patterns
- Ensure that options in alternation are mutually exclusive (e.g. do not use
(a|a+)). - Avoid applying nested quantifiers to overlapping expressions (e.g.
(\\s*)*). - Always prefer specific negated character classes (
[^\\n]*) over the generic wildcard dot (.*).
Try it
Identify patterns of repeated letters -- but in a ReDoS-safe way. Use a single character class, NOT nested groups like (a+)+.
Show hint
Replace the dangerous (a+)+ with a simple a+. The result is the same, but safe.
Solution available after 3 attempts
Review exercise
Recognize a sequence of words separated by spaces, in a ReDoS-safe way. Use the structure `(?:word+spaces+)*word` instead of `(word+space+)+`.
Show hint
Replace the capturing group (\\w+\\s+)+ with the disjoint version (?:\\w+\\s+)*\\w+.
Solution available after 3 attempts
Additional challenge
Make the vulnerable pattern `(\w+\s*)+:` which causes ReDoS safe, by removing the overlapping whitespace recursion.
Show hint
Move the space so it doesn't overlap with the group repetition: (?:\w+\s+)*\w+:
Solution available after 3 attempts