Pular para o conteúdo principal
eLearner.app
Módulo 2 · Lição 4 de 48/32 no curso~12 min
Lições do módulo (4/4)

Retrocesso: uma visão geral

When a greedy quantifier "overshoots", the engine performs backtracking: it steps back, tries another configuration, retries. On simple patterns it is invisible; on poorly written patterns it can become exponential and freeze the browser for seconds. It is the root of ReDoS (Regular expression Denial of Service).

Code
Pattern: .*\d
Sample:  abc123def
Steps:
  1.  .*  consumes everything: "abc123def"
  2.  \d  fails (end of string).
  3.  Backtrack: .* drops 'f' -> "abc123de", \d fails on 'f'.
  4.  Backtrack: .* drops 'ef' -> "abc123d", \d fails on 'e'.
  5.  ... and so on until \d finds a digit: match = "abc123" + "?"... actually
       the pattern matches "abc123" because .* leaves "def" and \d looks at '3'? No: the
       backtrack continues and .* ends up being "abc12", \d=3.  Final match: "abc123".

The engine always gets there, but if a greedy quantifier contains another greedy quantifier (e.g. (a+)+), the number of configurations to try can explode.

Catastrophic patterns

Code
Pattern: (a+)+b
Sample:  aaaaaaaaaaaaaaaaaaaaa

No trailing b → the pattern fails, but to get there the engine tries every possible partition of the as. Exponential growth: with 30 as you can wait for hours.

Perils of backtracking and ReDoS

When a pattern contains overlapping or nested quantifiers (e.g. (a+)+), the engine evaluates an exponential number of paths before failing. Replacing generic wildcards with excluding character classes dramatically reduces backtracking paths.

Try it

Exercício#regex.m2.l4.e1
Tentativas: 0Carregando…

Extract the content between square brackets for every occurrence in the text. Use the lazy version of the quantifier to avoid jumping to later closings.

Carregando editor…
Mostrar dica

\\[.*?\\] stops at the first closing ]: no heavy backtracking.

Solução disponível após 3 tentativas

Review exercise

Exercício#regex.m2.l4.e2
Tentativas: 0Carregando…

Find every `http://...` or `https://...` URL in the text, using the `?` quantifier for the optional `s` and `\\S+` for the non-space characters of the path.

Carregando editor…
Mostrar dica

https? means 'http' followed by an optional 's'. Then //, then \\S+ (non-spaces).

Solução disponível após 3 tentativas

Additional challenge

Exercício#regex.m2.l4.e3
Tentativas: 0Carregando…

Match strings enclosed in double quotes, avoiding excessive backtracking by using the negated class `[^"]*` instead of `.*`.

Carregando editor…
Mostrar dica

Replace the dot with the negated class [^"] to restrict the engine's backtracking paths.

Solução disponível após 3 tentativas