Greedy and Lazy

By default, quantifiers are greedy: they consume as much as possible while still letting the overall pattern match. Adding ? after a quantifier (*?, +?, ??, {n,m}?) makes it lazy: it consumes as little as possible.

Code

Sample: <b>uno</b> e <i>due</i>

Greedy pattern  <.*>     matches:  <b>uno</b> e <i>due</i>     (1 match: the whole sample)
Lazy pattern    <.*?>    matches:  <b>, </b>, <i>, </i>        (4 matches with the g flag)

The two patterns differ by a single character (the ? added to the quantifier), yet the results are completely different.
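
The comparison above can be reproduced with a minimal Python sketch. Python is an assumption here (the lesson does not name a language); `re.findall` plays the role of the g flag by returning every non-overlapping match.

```python
import re

sample = "<b>uno</b> e <i>due</i>"

# Greedy: .* runs to the end of the string, then gives characters back
# until the final '>' can match, so the whole sample is one match.
greedy = re.findall(r"<.*>", sample)

# Lazy: .*? grows only as far as the first '>' that completes the pattern,
# so each tag is a separate match.
lazy = re.findall(r"<.*?>", sample)

print(greedy)  # ['<b>uno</b> e <i>due</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```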

When it matters

To extract the content between delimiters (HTML tags, quotes, parentheses) the lazy version is almost always the right one.
To match up to the end of the line, you usually want the greedy version (.*).
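
As a rough illustration of both cases, here is a short Python sketch. The sample text and variable names are made up for this example and are not part of the lesson.

```python
import re

# Hypothetical two-line sample, invented for this sketch.
text = 'say "ciao" and "mondo"\nsecond line'

# Between delimiters, lazy keeps each quoted string separate.
lazy_quotes = re.findall(r'"(.*?)"', text)
print(lazy_quotes)    # ['ciao', 'mondo']

# Greedy runs from the first opening quote to the last closing quote on the line.
greedy_quotes = re.findall(r'"(.*)"', text)
print(greedy_quotes)  # ['ciao" and "mondo']

# Up to end of line, greedy is what you want: by default '.' does not
# match a newline, so .* stops at the end of the first line.
rest_greedy = re.search(r"and .*", text).group()
print(rest_greedy)    # and "mondo"

# Lazy with nothing after it is satisfied with zero characters.
rest_lazy = re.search(r"and .*?", text).group()
print(repr(rest_lazy))  # 'and '
```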

Greedy vs Lazy strategies in the engine

A greedy quantifier consumes as much text as possible and gives characters back (backtracks) only when the rest of the pattern fails to match. A lazy quantifier (with ? added) starts from the minimum and extends one character at a time, checking after each step whether the rest of the pattern can now match.
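
The two strategies can be sketched with a toy simulation. This is a deliberate simplification, not how a real regex engine is implemented: the hypothetical helper `match_between` only models `.*` or `.*?` followed by a literal closing string, and records the wildcard lengths it tries in order.

```python
def match_between(text, start, closer, lazy):
    """Simulate '.*' (or '.*?') followed by the literal `closer`.

    Returns (end index of the match or None, list of wildcard lengths tried).
    """
    # '.' does not cross a newline, so the wildcard spans at most to the next one.
    newline = text.find("\n", start)
    limit = (len(text) if newline == -1 else newline) - start
    # Greedy tries the longest span first and shrinks (backtracking);
    # lazy tries the shortest span first and grows one character at a time.
    lengths = range(0, limit + 1) if lazy else range(limit, -1, -1)
    tried = []
    for n in lengths:
        tried.append(n)
        if text.startswith(closer, start + n):
            return start + n + len(closer), tried
    return None, tried


sample = "<b>uno</b> e <i>due</i>"

# Index 1 is the position just after the first '<'.
greedy_end, greedy_tried = match_between(sample, 1, ">", lazy=False)
lazy_end, lazy_tried = match_between(sample, 1, ">", lazy=True)

print(sample[:greedy_end], greedy_tried)  # <b>uno</b> e <i>due</i> [22, 21]
print(sample[:lazy_end], lazy_tried)      # <b> [0, 1]
```

Greedy starts by taking everything and steps back once to find the last `>`; lazy starts with nothing and steps forward once to find the first `>`.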

Try it

Exercise #regex.m2.l3.e1

Extract every HTML tag from the sample (e.g. `<b>`, `</b>`, `<i>`, `</i>`) using the lazy version `.*?`.

Hint

Greedy <.*> matches from the start of the first tag to the end of the last one. Lazy <.*?> stops at the first > it encounters.

Review exercise

Exercise #regex.m2.l3.e2

Extract every string between double quotes in the text (e.g. "ciao", "mondo"). Use the lazy version of the quantifier so that a match does not run on to a later closing quote.

Hint

".*?" stops at the first closing double quote.

Additional challenge

Exercise #regex.m2.l3.e3

Extract all text blocks enclosed in square brackets (e.g. `[text]`), including the brackets, using a lazy quantifier so as not to merge separate blocks.

Hint

Use \[ for the opening bracket, then .*?, and finally \] for the closing bracket.
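
A small Python sketch of the pattern described in the hint, shown next to its greedy counterpart. The sample text is hypothetical and is not the exercise's actual input.

```python
import re

# Hypothetical sample, invented for this sketch.
text = "see [alpha] and [beta] here"

# Lazy: each bracketed block is its own match, brackets included.
lazy_blocks = re.findall(r"\[.*?\]", text)
print(lazy_blocks)    # ['[alpha]', '[beta]']

# Greedy: the two blocks are merged into a single match.
greedy_blocks = re.findall(r"\[.*\]", text)
print(greedy_blocks)  # ['[alpha] and [beta]']
```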
