eLearner.app
Module 2 · Lesson 3 of 4 · 7/32 in the course · ~10 min

Greedy and Lazy

By default, quantifiers are greedy: they consume as much as possible while still letting the overall pattern match. Adding ? after a quantifier (*?, +?, ??, {n,m}?) gives the lazy version, which consumes as little as possible.

Code
Sample: <b>uno</b> e <i>due</i>

Greedy pattern  <.*>     matches:  <b>uno</b> e <i>due</i>     (1 match: everything)
Lazy pattern    <.*?>    matches:  <b>   </b>   <i>   </i>     (4 matches with the g flag)

The two patterns differ by a single character (the ? added to the quantifier), yet the results are completely different.
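The comparison above can be reproduced in a few lines. This is a minimal sketch in Python, chosen here only for illustration; `re.findall` returns every match, which plays the role of the g flag mentioned above.

```python
import re

sample = "<b>uno</b> e <i>due</i>"

# Greedy: .* runs to the end of the string, then backs off to the last '>'.
greedy = re.findall(r"<.*>", sample)

# Lazy: .*? stops at the first '>' after each '<'.
lazy = re.findall(r"<.*?>", sample)

print(greedy)  # ['<b>uno</b> e <i>due</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```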

When it matters

  • To extract the content between delimiters (HTML tags, quotes, parentheses) the lazy version is almost always the right one.
  • To match up to the end of a line you usually want greedy (.*).
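Both cases can be sketched side by side. The strings `expr` and `text` below are made-up inputs, not part of the lesson; the example assumes Python's `re` module, where `.` does not match a newline by default.

```python
import re

# Delimiters: lazy keeps each parenthesized group separate.
expr = "f(a) + g(b)"
lazy_groups = re.findall(r"\(.*?\)", expr)
greedy_groups = re.findall(r"\(.*\)", expr)
print(lazy_groups)    # ['(a)', '(b)']
print(greedy_groups)  # ['(a) + g(b)']

# Up to end of line: greedy takes the rest of the line and stops at the newline.
text = "name: Ada Lovelace\nrole: engineer"
greedy_rest = re.search(r"name: (.*)", text).group(1)
lazy_rest = re.search(r"name: (.*?)", text).group(1)
print(repr(greedy_rest))  # 'Ada Lovelace'
print(repr(lazy_rest))    # ''
```

With nothing after the lazy group to force it to grow, `(.*?)` is satisfied with zero characters, which is why greedy is usually the better choice when matching to the end of a line.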

Greedy vs Lazy strategies in the engine

A greedy quantifier first consumes as much text as possible and gives characters back (backtracks) only when the rest of the pattern cannot match. A lazy quantifier (with ? added) first consumes the minimum allowed and extends one character at a time, only as far as needed for the rest of the pattern to match.
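The two strategies can be imitated with a toy model. The sketch below is not how a real regex engine is implemented; `match_end` is a hypothetical helper that handles only the patterns <.*> and <.*?> and assumes the text contains no newlines. The only difference between the two modes is the order in which candidate lengths for the quantifier are tried.

```python
import re

def match_end(text, start, lazy):
    """Toy model: end index of a <.*> (or <.*?>) match beginning at `start`."""
    if text[start] != "<":
        return None
    max_n = len(text) - (start + 1)  # most characters the quantifier could take
    # Greedy tries the longest candidate first and backs off;
    # lazy tries the shortest first and grows one character at a time.
    lengths = range(0, max_n + 1) if lazy else range(max_n, -1, -1)
    for n in lengths:
        close = start + 1 + n        # where the closing '>' would have to be
        if close < len(text) and text[close] == ">":
            return close + 1
    return None

sample = "<b>uno</b> e <i>due</i>"
print(sample[:match_end(sample, 0, lazy=False)])  # <b>uno</b> e <i>due</i>
print(sample[:match_end(sample, 0, lazy=True)])   # <b>

# Cross-check the toy model against the real engine.
assert match_end(sample, 0, lazy=False) == re.match(r"<.*>", sample).end()
assert match_end(sample, 0, lazy=True) == re.match(r"<.*?>", sample).end()
```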

Try it

Exercise #regex.m2.l3.e1

Extract every HTML tag from the sample (e.g. `<b>`, `</b>`, `<i>`, `</i>`) using the lazy version `.*?`.

Hint

Greedy <.*> matches from the start of the first tag to the end of the last one. Lazy <.*?> stops at the first > it encounters.


Review exercise

Exercise #regex.m2.l3.e2

Extract every string between double quotes in the text (e.g. "ciao", "mondo"). Use the lazy version of the quantifier so that a match does not run on to a later closing quote.

Hint

".*?" stops at the first closing double quote.
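As an illustration of the hint, here is a small sketch in Python. The exercise text is not shown on this page, so `text` below is an invented stand-in that merely contains the two example strings.

```python
import re

# Hypothetical input; the real exercise text may differ.
text = 'Dice "ciao" e poi "mondo".'

lazy_quotes = re.findall(r'".*?"', text)
greedy_quotes = re.findall(r'".*"', text)

print(lazy_quotes)    # ['"ciao"', '"mondo"']
print(greedy_quotes)  # ['"ciao" e poi "mondo"']
```

The greedy version runs from the first opening quote to the last closing quote and merges the two strings into one match.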


Additional challenge

Exercise #regex.m2.l3.e3

Extract all text blocks enclosed in square brackets (e.g. `[text]`), including the brackets, using a lazy quantifier so as not to merge separate blocks.

Hint

Use \[ for the opening bracket, then .*? and finally \] for the closing bracket.
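The hint can be tried out as follows. This is only a sketch in Python: `text` is an invented input, since the exercise text itself is not shown on this page.

```python
import re

# Hypothetical input; the real exercise text may differ.
text = "See [one] and [two] here"

# \[ and \] are escaped because unescaped brackets define a character class.
lazy_blocks = re.findall(r"\[.*?\]", text)
greedy_blocks = re.findall(r"\[.*\]", text)

print(lazy_blocks)    # ['[one]', '[two]']
print(greedy_blocks)  # ['[one] and [two]']
```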
