मुख्य सामग्री पर जाएं
eLearner.app
मॉड्यूल 8 · पाठ 4 का 4पाठ्यक्रम में 32/32~14 min
मॉड्यूल पाठ (4/4)

ReDoS-सुरक्षित पैटर्न लिखना

ReDoS (Regular expression Denial of Service) is a class of attacks where a carefully crafted input makes the matching time of a regex explode. The JS engine uses backtracking: if a pattern is "ambiguous", on an unlucky input it can perform an exponential number of attempts.

The classic catastrophic pattern

Code
Pattern: ^(a+)+$
Input:   "aaaaaaaaaaaaaaaaaaaaab"

To match, the engine tries every possible partition of "a" between the two + quantifiers. On 20 "a"s that's 2^20 = ~1 million attempts. On 30 "a"s, a billion.

Other dangerous examples:

  • (a*)*b -- "nested stars".
  • (a|a)* -- overlapping alternatives.
  • (\w+)+@ -- greedy groups with an outer quantifier.

How to spot the danger

Look for these structures:

  1. Quantifier inside a quantifier: (...+)+, (...*)*, (...{n,})+.
  2. Overlapping alternatives under a quantifier: (a|ab)*, (\w|\d)+.
  3. Lookahead/lookbehind with complex alternatives repeated.

Defensive rewriting

Code
Dangerous:  ^(\w+\s+)+$
Safe:       ^(?:\w+\s+)*\w+$

The safe version separates the base case from the repeated case, removing the ambiguity. Other techniques:

  • Possessive quantifiers (a++) -- not supported in JS, but available in other engines.
  • Atomic patterns ((?>...)) -- not in JS.
  • Limit the length of the input before matching.
  • Timeout on matching (this course's editor uses a worker with a 500 ms timeout).

Recommendations for ReDoS-safe patterns

  1. Ensure that options in alternation are mutually exclusive (e.g. do not use (a|a+)).
  2. Avoid applying nested quantifiers to overlapping expressions (e.g. (\\s*)*).
  3. Always prefer specific negated character classes ([^\\n]*) over the generic wildcard dot (.*).

Try it

व्यायाम#regex.m8.l4.e1
प्रयास: 0लोड हो रहा है...

Identify patterns of repeated letters -- but in a ReDoS-safe way. Use a single character class, NOT nested groups like (a+)+.

संपादक लोड हो रहा है...
संकेत दिखाएँ

Replace the dangerous (a+)+ with a simple a+. The result is the same, but safe.

3 प्रयासों के बाद समाधान उपलब्ध है

Review exercise

व्यायाम#regex.m8.l4.e2
प्रयास: 0लोड हो रहा है...

Recognize a sequence of words separated by spaces, in a ReDoS-safe way. Use the structure `(?:word+spaces+)*word` instead of `(word+space+)+`.

संपादक लोड हो रहा है...
संकेत दिखाएँ

Replace the capturing group (\\w+\\s+)+ with the disjoint version (?:\\w+\\s+)*\\w+.

3 प्रयासों के बाद समाधान उपलब्ध है

Additional challenge

व्यायाम#regex.m8.l4.e3
प्रयास: 0लोड हो रहा है...

Make the vulnerable pattern `(\w+\s*)+:` which causes ReDoS safe, by removing the overlapping whitespace recursion.

संपादक लोड हो रहा है...
संकेत दिखाएँ

Move the space so it doesn't overlap with the group repetition: (?:\w+\s+)*\w+:

3 प्रयासों के बाद समाधान उपलब्ध है