Positive lookahead: `(?=...)`
A positive lookahead (?=...) is a zero-width assertion:
it checks that a certain pattern follows the current position, but does
not consume those characters.
Pattern: \d+(?= euro)
Sample: Prezzo 100 euro, sconto 25 euro, totale 75 dollari.
Matches: 100 and 25
The lookahead (?= euro) requires that the digits be followed by " euro", but
the match includes only the digits. 75 dollari does not match (no euro).
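The example above can be reproduced with a short sketch. It assumes Python's `re` module, which the lesson itself does not name; `re.findall` is used here to collect every match, and the variable names are illustrative only.

```python
import re

sample = "Prezzo 100 euro, sconto 25 euro, totale 75 dollari."

# \d+ matches a run of digits; (?= euro) only checks that " euro" follows
# and adds nothing to the matched text.
prices = re.findall(r"\d+(?= euro)", sample)

print(prices)  # ['100', '25']
```

The `75` is skipped because it is followed by " dollari", so the lookahead fails at that position.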
Why "zero-width"
Think of the lookahead as a condition on the position, not as a part of the match:
- The match stops where the lookahead begins.
- The text inspected by the lookahead starts at the end of the match and stays outside it.
- With the g flag, the next match attempt resumes from the end of the match, so the text the lookahead inspected is still available.
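A minimal sketch of the points above, assuming Python's `re.finditer` as a stand-in for a global (`g`) search; the lesson does not say which engine it targets, so treat the choice as an assumption. It records where each match starts and ends and shows that the text right after the match is left untouched.

```python
import re

sample = "Prezzo 100 euro, sconto 25 euro, totale 75 dollari."
pattern = re.compile(r"\d+(?= euro)")

spans = []
for m in pattern.finditer(sample):
    # m.end() is where the lookahead started looking; the match stops there.
    spans.append((m.group(), m.start(), m.end()))
    # The five characters after the match were inspected but not consumed.
    print(m.group(), m.span(), repr(sample[m.end():m.end() + 5]))

# Printed lines:
# 100 (7, 10) ' euro'
# 25 (24, 26) ' euro'
```

Each span covers only the digits, and the slice after `m.end()` still begins with " euro", which is what "zero-width" means in practice.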
This makes it perfect for extracting values without their context: prices without the currency, words before a punctuation mark, and so on.
Lookahead features and zero-width scan
The positive lookahead (?=...) guarantees that the specified pattern follows the current point, but scanning resumes from the end of the match, that is, from the position where the lookahead started. This avoids "consuming" parts of the text that might be needed for subsequent matches.
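To see why not consuming the text matters, here is a small comparison, again sketched with Python's `re` module as an assumed engine. The sample string and variable names are hypothetical. Both patterns look for a digit followed by a space and another digit, but only the first one consumes the second digit.

```python
import re

text = "1 2 3"

# Consuming version: the second digit becomes part of the match,
# so it cannot start a match of its own.
consuming = re.findall(r"\d \d", text)

# Lookahead version: the second digit is only inspected,
# so the scan can still match it on the next attempt.
lookahead = re.findall(r"\d(?= \d)", text)

print(consuming)  # ['1 2']
print(lookahead)  # ['1', '2']
```

With the consuming pattern, `2` is used up by the first match and `2 3` is never found; with the lookahead, `2` remains available and matches in its own right.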
Try it
Extract ONLY the digits of the euro prices (the sequences of digits followed by ' euro'). No currency in the match.
Hint: Move ' euro' inside a lookahead (?= euro): the string will not be part of the match.
Review exercise
Find every word immediately followed by a colon `:` (but without including the colon in the match).
Hint: Same logic: the `:` is not part of the match; it is only a condition on the position.
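One possible answer, sketched with Python's `re` module as an assumed engine. The pattern `\w+(?=:)` is only one way to read "word" (here, a run of letters, digits, or underscores), and the sample text is made up for illustration.

```python
import re

# Hypothetical sample text, not taken from the exercise.
text = "Name: Anna. Role: editor. Status active."

# \w+ matches the word; (?=:) requires a colon right after it
# without adding the colon to the match.
labels = re.findall(r"\w+(?=:)", text)

print(labels)  # ['Name', 'Role']
```

"Status" is not returned because it is followed by a space rather than a colon.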
Additional challenge
Find every word (e.g. a function name) immediately followed by an open parenthesis `(`, excluding the parenthesis from the match.
Hint: Move \( (escaped open parenthesis) inside the positive lookahead (?=...).
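One possible answer, again assuming Python's `re` module and treating a "word" as `\w+`. The sample line is hypothetical; it includes a name separated from its parenthesis by a space to show what "immediately" excludes.

```python
import re

# Hypothetical sample text, not taken from the exercise.
code = "print(len(items)) and total = compute (x)"

# \( is an escaped literal parenthesis; inside (?=...) it is checked
# but left out of the match.
names = re.findall(r"\w+(?=\()", code)

print(names)  # ['print', 'len']
```

`compute` is not matched because a space sits between the name and the parenthesis.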