Basic classes: \d \w \s
Beyond literal characters, regex gives you predefined classes: shortcuts for "any digit", "any word character", "any whitespace character". They are the first step toward truly useful patterns.
| Class | Matches |
|---|---|
| \d | A digit (0-9) |
| \w | A word character (A-Za-z0-9_) |
| \s | A whitespace character (space, tab, newline, etc.) |
| \D | NOT a digit |
| \W | NOT a word character |
| \S | NOT a whitespace character |
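As a rough illustration of the table, the sketch below checks single characters against each class. It assumes a JavaScript engine; the `classes` object and the `matchingClasses` helper are hypothetical names made up for this example, not part of the lesson.

```javascript
// Each predefined class paired with a regex for it (illustrative names).
const classes = {
  "\\d": /\d/,
  "\\w": /\w/,
  "\\s": /\s/,
  "\\D": /\D/,
  "\\W": /\W/,
  "\\S": /\S/,
};

// Return the names of the classes that match `ch`.
// Assumption: `ch` is a string holding exactly one character.
function matchingClasses(ch) {
  return Object.keys(classes).filter((name) => classes[name].test(ch));
}

console.log(matchingClasses("7").join(" ")); // \d \w \S
console.log(matchingClasses("_").join(" ")); // \w \D \S
console.log(matchingClasses(" ").join(" ")); // \s \D \W
console.log(matchingClasses("!").join(" ")); // \D \W \S
```

Every character lands in exactly three of the six classes: for each lowercase/uppercase pair it matches one and not the other.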
Each one matches a single character. To match "one or more characters"
you need quantifiers (+, in module 2), but we will use + right away
because "find me all the numbers" is too useful to wait for.
Pattern: \d+
Sample: I have 3 apples, 12 pears and 100 plums.
Matches: 3, 12, 100
\d+ means one or more consecutive digits: it matches 3, 12 and
100 as three separate matches (with the g flag).
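To see the difference between a bare class and a class with +, here is a minimal sketch assuming JavaScript and its `String.prototype.match` method. The variable names and the sample sentence are chosen for illustration only.

```javascript
// Illustrative sample sentence containing three numbers.
const sample = "I have 3 apples, 12 pears and 100 plums.";

// \d alone matches one digit at a time, so multi-digit numbers are split up.
const singleDigits = sample.match(/\d/g);

// \d+ matches each run of consecutive digits as one match.
const numbers = sample.match(/\d+/g);

// Without the g flag, match() stops at the first match.
const firstOnly = sample.match(/\d+/)[0];

console.log(singleDigits); // [ '3', '1', '2', '1', '0', '0' ]
console.log(numbers);      // [ '3', '12', '100' ]
console.log(firstOnly);    // 3
```

The g flag is what turns "the first number" into "all the numbers".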
\w: word characters
\w is equivalent to [A-Za-z0-9_] (ASCII letters, digits, underscore). It
does NOT include accented letters, Greek letters or emoji: for those you need
Unicode property escapes (\p{L}, module 5).
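The ASCII-only behavior is easy to check with a small sketch, again assuming JavaScript. The sample text is made up, and the class `[\p{L}\p{N}_]` is only one possible Unicode-aware stand-in for \w, not an exact equivalent. In JavaScript, \p{...} only works when the u flag is set.

```javascript
// Illustrative text: "café Ωmega x_1", written with escapes so the
// accented and Greek letters are unambiguous single code points.
const text = "caf\u00e9 \u03a9mega x_1";

// \w stops at the accented letter and skips the Greek one.
const asciiWords = text.match(/\w+/g);

// Assumed alternative: any Unicode letter, any Unicode number, or underscore.
// The u flag is required for \p{...} in JavaScript.
const unicodeWords = text.match(/[\p{L}\p{N}_]+/gu);

console.log(asciiWords);   // [ 'caf', 'mega', 'x_1' ]
console.log(unicodeWords); // [ 'café', 'Ωmega', 'x_1' ]
```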
Deep dive into basic classes and negations
The predefined classes \d, \w and \s save you from writing out character sets by hand. The uppercase versions (\D, \W, \S) negate the set. For instance, \S+ matches any run of non-whitespace characters, such as a whole word together with its attached punctuation.
\s matches not only the ordinary space but also tabs (\t) and line-break characters (\n and \r).
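A short sketch of both points, assuming JavaScript. The sample string and variable names are made up for illustration.

```javascript
// Illustrative text mixing a space, a tab and a newline as separators.
const text = "Hello, world!\tTabs\nand newlines.";

// \S+ grabs each run of non-whitespace characters, punctuation included.
const chunks = text.match(/\S+/g);

// \s matches each individual whitespace character, whatever its kind.
const gaps = text.match(/\s/g);

console.log(chunks);      // [ 'Hello,', 'world!', 'Tabs', 'and', 'newlines.' ]
console.log(gaps.length); // 4 (space, tab, newline, space)

// A carriage return also counts as whitespace.
console.log(/\s/.test("\r")); // true
```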
Try it
Extract every sequence of consecutive digits (numbers) from the text. Use the `\d` class with the `+` quantifier.
Hint: `\d+` captures one or more digits. With the g flag you collect every match.
Review exercise
Find every 'word' in the text: a continuous sequence of word characters (`\w+`).
Hint: `\w+` matches sequences of letters, digits and underscores. Punctuation is skipped.
Additional challenge
Find all sequences of one or more consecutive whitespace characters in the text (including spaces, tabs, and newlines).
Hint: Use the `\s` class with the `+` quantifier to capture consecutive whitespace sequences.