Module lessons (3/4)
The wildcard: the dot `.`
The dot . in regex is the wildcard: it matches any single
character… with one important exception: it does NOT match the
newline (\n).
Pattern: c.t
Sample: cat cot cut c@t c\nt
        ^^^ ^^^ ^^^ ^^^
Three letters: a c, any character, a t. No newline in between, so c\nt
is not matched (by default).
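The same check can be reproduced in code. This is a minimal sketch in Python, chosen here only for illustration (the lesson itself is engine-agnostic); `re.findall` is assumed to play the role of a global search, and the sample string is the one shown above with a real newline in the last item.

```python
import re

# Sample text from the lesson; the last item contains a real newline.
sample = "cat cot cut c@t c\nt"

# With no flags, "." matches any single character except "\n".
matches = re.findall(r"c.t", sample)

print(matches)  # ['cat', 'cot', 'cut', 'c@t']
```

The last item is skipped because the character between c and t is a newline, which the dot does not match by default.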
The dot is extremely useful but also dangerous: used without discipline it
captures more than you intended. Combined with the quantifiers in module 2
(.*, .+?) it is one of the most common sources of patterns that "don't work
the way I expected".
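To make the over-capturing concrete, here is a small sketch in Python (an illustrative choice, not something the lesson prescribes). The sample sentence is hypothetical; it compares the greedy form `.*` with the lazy form `.*?` on the same text.

```python
import re

# Hypothetical sample: two quoted words on a single line.
text = 'say "hello" and "bye"'

# Greedy: ".*" runs to the end of the line, then backtracks to the LAST quote.
greedy = re.findall(r'".*"', text)

# Lazy: ".*?" expands one character at a time and stops at the FIRST closing quote.
lazy = re.findall(r'".*?"', text)

print(greedy)  # ['"hello" and "bye"']
print(lazy)    # ['"hello"', '"bye"']
```

The greedy pattern returns one long match that swallows both quoted words, while the lazy pattern returns each quoted word separately.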
The s flag: "dotAll"
With the s flag (also called dotAll or single-line) the dot matches
every character, newline included. This is useful for extracting blocks
that span multiple lines.
Pattern: <p>.*</p>
Flag: gs
Sample: <p>prima\nseconda</p>
        ^^^^^^^^^^^^^^^^^^^^^
Limits and behavior of the dot wildcard
The dot . is a powerful wildcard, but by default it does not match newline characters (\n). If you want the dot to match absolutely everything, including newlines, you must enable the s (dotAll) flag. Be careful when combining the dot with quantifiers (.*): the combination is greedy and tends to consume more text than you intended.
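The effect of the s flag can be sketched as follows. The example uses Python as an assumption for illustration, where the flag is spelled `re.DOTALL` (alias `re.S`); other engines write it as a trailing `s`. The sample is the `<p>` block from this lesson, with a real newline inside it.

```python
import re

# The paragraph from the lesson, spanning two lines.
html = "<p>prima\nseconda</p>"

# Default: the dot cannot cross the newline, so the pattern never reaches </p>.
without_flag = re.findall(r"<p>.*</p>", html)

# With DOTALL (the "s" flag), the dot also matches "\n".
with_flag = re.findall(r"<p>.*</p>", html, flags=re.DOTALL)

print(without_flag)  # []
print(with_flag)     # ['<p>prima\nseconda</p>']
```

The pattern is identical in both calls; only the flag changes whether the match can span the line break.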
Try it
Find every triplet of characters delimited by parentheses, e.g. `(abc)`, `(xyz)`. Use the wildcard for the 3 inner characters.
Hint
Three dots for three arbitrary characters. The parentheses are metacharacters: they must be escaped with a backslash (\).
Review exercise
Extract the block between `[INIZIO]` and `[FINE]`, which can span multiple lines. You will need the `s` flag so the dot also matches newlines, and the 'lazy' version of the quantifier (`.*?`, module 2).
Hint
Without the s flag, the dot stops at the end of the line, so add the flag. The lazy form .*? stops the match at the first [FINE].
Additional challenge
Find all 3-character sequences starting with `c` and ending with `t` (e.g. `cat`, `cot`, `c-t`).
Hint
The pattern uses the dot '.' to represent the wildcard middle character.