Word boundaries: `\b` `\B`
\b is a word boundary anchor: it matches the position between a
word character (\w) and a non-word character (\W, or the start/end of
the string). Like ^ and $, it does NOT consume characters.
Pattern: \bgatto\b
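Sample: Il gatto e la gattina giocano.
           ^^^^^
gatto matches only as a whole word. Inside gattina the sequence gatto
is not there (the final o is missing), so the literal text already
fails. The closing \b matters for a word such as gattone: it contains
gatto, but the n that follows is a word character, so there is no
boundary after the o and the pattern does not match.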
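The example above can be reproduced with a short Python sketch. It assumes Python's standard `re` module; the extra word `gattone` is only an illustrative test input, not part of the lesson's sample.

```python
import re

sample = "Il gatto e la gattina giocano."
# Raw string: without the r prefix, Python would read \b as a backspace.
pattern = re.compile(r"\bgatto\b")

# Collect (start, end, text) for every match in the sample sentence.
matches = [(m.start(), m.end(), m.group()) for m in pattern.finditer(sample)]
print(matches)  # [(3, 8, 'gatto')]

# "gattone" contains "gatto", but "n" follows, so the closing \b fails.
print(pattern.search("gattone"))  # None
```

The match spans exactly the five letters of gatto: the two \b anchors add nothing to its length.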
\B is the opposite: it matches a position that is NOT a word
boundary.
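As a rough illustration of \B, the Python sketch below looks for `if` only when it sits strictly inside a word, with a word character on both sides. The sample text is an assumption chosen for the demonstration.

```python
import re

text = "if gift lifetime"

# \Bif\B: "if" with no word boundary before it and none after it.
inside = [m.start() for m in re.finditer(r"\Bif\B", text)]
print(inside)  # [4, 9]  (the "if" in "gift" and in "lifetime")

# For contrast, \bif\b finds only the isolated word at index 0.
whole = [m.start() for m in re.finditer(r"\bif\b", text)]
print(whole)  # [0]
```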
Finding "whole words"
The most typical use of \b is "match the word X only when isolated, not
as part of another word":
\bif\b matches 'if' but not 'sniff', 'gift', 'lifetime'.
Word boundaries and non-word characters
The boundary \b does not match any physical character; it is a position test. A \b boundary exists between a \w character and a non-\w character (or start/end of text). The negation \B asserts that the current position is not a word boundary.
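One way to see that \b is a position test is to list where it holds. The sketch below assumes Python's `re` module and made-up sample strings; every \b match is empty, so its start and end index coincide.

```python
import re

text = "if gift"

# Each \b match is zero-width: it marks a position, not a character.
boundaries = [m.start() for m in re.finditer(r"\b", text)]
print(boundaries)  # [0, 2, 3, 7]
print(all(m.start() == m.end() for m in re.finditer(r"\b", text)))  # True

# The anchors add no length: the match below is just the two letters "if".
m = re.search(r"\bif\b", "sniff if gift lifetime")
print(m.span(), m.group())  # (6, 8) if
```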
Try it
Find every occurrence of the whole word `cat` (case-insensitive). It must NOT match `category`, `concatenate`, `scatter`.
Hint: wrap 'cat' between two \b: a boundary at the start AND a boundary at the end.
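A candidate pattern for this exercise can be checked with a few lines of Python. The sketch follows the hint and assumes the `re.IGNORECASE` flag for case-insensitivity; the test sentence is invented for the check.

```python
import re

text = "Cat category concatenate scatter CAT cat."

# \b on both sides, plus IGNORECASE so "Cat" and "CAT" also count.
found = re.findall(r"\bcat\b", text, flags=re.IGNORECASE)
print(found)  # ['Cat', 'CAT', 'cat']
```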
Review exercise
Find every integer that is NOT part of an identifier (e.g. `42` yes, but neither `var42` nor `42abc`). Use `\b` on both sides.
Hint: \b\d+\b matches only 'isolated' digit sequences. In abc42, the c before 42 is a word character, so there is no boundary before the digits.
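The pattern from the hint can be tried out with a small Python sketch. The input string is an assumption; note that the underscore counts as a word character, so the `3` in `total_3` is not isolated.

```python
import re

text = "42 var42 42abc x = 7; total_3 (100)"

# Digits with a word boundary on both sides.
ints = re.findall(r"\b\d+\b", text)
print(ints)  # ['42', '7', '100']
```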
Additional challenge
Find the sequence `cat` only when it starts a longer word, not when it appears as a whole word or at the end of a word (e.g. match `catalog` but not `wildcat` or isolated `cat`).
Hint: use \b before cat (word boundary) and \B after it (non-word boundary).
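The combination described in the hint can be verified the same way. This is a minimal sketch with an invented word list; it records where each match starts.

```python
import re

text = "catalog wildcat cat category"

# \b before "cat" (start of a word), \B after it (the word continues).
starts = [m.start() for m in re.finditer(r"\bcat\B", text)]
print(starts)  # [0, 20]  ("catalog" and "category")
```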