Module lessons (1/4)
Literal patterns
In a regex, "normal" alphanumeric characters simply match themselves: write
ciao as the pattern and the engine will look for the substring ciao inside
the text. There is nothing magical about it: the engine scans the text left to right, one position at a time.
Pattern: ciao
Sample: Buongiorno, ciao mondo! Ti dico anche ciao.
                    ^^^^                      ^^^^
A match always carries two key pieces of information:
- the matched text (here ciao);
- the index (0-based offset) where it starts in the sample (here 12 and 38).
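As a rough illustration of those two pieces of information, the sketch below collects the text and start offset of every match in TypeScript. The helper name findMatches and the LiteralMatch shape are invented for this example; only RegExp and its exec method come from the language itself.

```typescript
// Hypothetical shape for one match: the matched text plus its 0-based offset.
interface LiteralMatch {
  text: string;
  index: number;
}

// Hypothetical helper: runs the pattern over the sample and records every match.
function findMatches(source: string, sample: string): LiteralMatch[] {
  // The g flag makes exec() resume after the previous match on each call.
  const regex = new RegExp(source, "g");
  const found: LiteralMatch[] = [];
  let m: RegExpExecArray | null = regex.exec(sample);
  while (m !== null) {
    found.push({ text: m[0], index: m.index });
    // Guard against an endless loop if the pattern matches the empty string.
    if (m[0].length === 0) {
      regex.lastIndex++;
    }
    m = regex.exec(sample);
  }
  return found;
}

const greetingSample = "Buongiorno, ciao mondo! Ti dico anche ciao.";
const greetingMatches = findMatches("ciao", greetingSample);
console.log(greetingMatches);
// [ { text: "ciao", index: 12 }, { text: "ciao", index: 38 } ]
```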
By default, regexes are case-sensitive: ciao does not match Ciao or
CIAO. To ignore case, add the i flag (case-insensitive).
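A small TypeScript sketch of the difference, assuming JavaScript-style regex literals where the flag follows the closing slash. The variable names are illustrative only.

```typescript
const caseSensitive = /ciao/;    // default behaviour: letter case must match exactly
const caseInsensitive = /ciao/i; // i flag: letter case is ignored

const greetings = ["ciao", "Ciao", "CIAO"];

// test() reports whether the pattern matches anywhere in the given string.
const strictResults = greetings.map((word) => caseSensitive.test(word));
const relaxedResults = greetings.map((word) => caseInsensitive.test(word));

console.log(strictResults);  // [ true, false, false ]
console.log(relaxedResults); // [ true, true, true ]
```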
How the engine works
The regex engine inspects the text one character at a time. When searching for the literal pattern ciao, it first looks for the letter c. Once it finds one, it checks whether the next character is i, then a, and finally o. If any of these checks fails, the engine abandons the attempt, moves on to the next starting position, and starts looking for c again.
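The scan described above can be sketched by hand in TypeScript. This is only a simplified model of the idea, not how a real regex engine is implemented (production engines are considerably more optimized), and the function name findLiteral is hypothetical.

```typescript
// Hypothetical helper: returns the 0-based offset of the first occurrence of
// `pattern` in `text` at or after `from`, or -1 if there is none.
function findLiteral(text: string, pattern: string, from: number = 0): number {
  // Try every starting position that leaves enough room for the pattern.
  for (let start = from; start + pattern.length <= text.length; start++) {
    let offset = 0;
    // Compare character by character: c, then i, then a, then o.
    while (
      offset < pattern.length &&
      text.charAt(start + offset) === pattern.charAt(offset)
    ) {
      offset++;
    }
    if (offset === pattern.length) {
      return start; // every character matched
    }
    // A character differed: give up on this start and try the next one.
  }
  return -1;
}

const scanSample = "Buongiorno, ciao mondo! Ti dico anche ciao.";
const firstHit = findLiteral(scanSample, "ciao");                // 12
const secondHit = findLiteral(scanSample, "ciao", firstHit + 1); // 38
console.log(firstHit, secondHit);
```

Note how the c in dico and the c in anche both start an attempt that fails on the very next character, after which the scan simply moves on.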
Literal patterns and the g flag
Without the g (global) flag the engine stops at the first match it finds. With
g it continues and collects every subsequent match until the end of the string. In this course we will
almost always turn g on: we want to see every match present in the text.
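To make the effect of g concrete, here is a short TypeScript sketch using the standard String.prototype.match method. The variable names are illustrative. Note that in JavaScript and TypeScript, match with the g flag returns only the matched strings, without their offsets.

```typescript
const globalSample = "Buongiorno, ciao mondo! Ti dico anche ciao.";

// Without g: only the first match is reported, along with its offset.
const firstOnly = globalSample.match(/ciao/);
const firstOnlyIndex = firstOnly === null ? -1 : firstOnly.index;

// With g: every match is collected, as a plain list of matched strings.
const everyMatch = globalSample.match(/ciao/g);
const everyMatchCount = everyMatch === null ? 0 : everyMatch.length;

console.log(firstOnlyIndex);  // 12
console.log(everyMatch);      // [ "ciao", "ciao" ]
console.log(everyMatchCount); // 2
```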
Try it
Find every exact (case-sensitive) occurrence of the word `ciao` in the sample. Hint: remember the `g` flag.
Hint: The pattern is the word itself. Uppercase letters must not match, so do not add the i flag.
Review exercise
Find every occurrence of the word `errore` in the text, ignoring upper/lower case.
Hint: Use the same word as the pattern, but add the i flag (case-insensitive) alongside g.
Additional challenge
Identify and collect all exact occurrences of the word `WARNING` (uppercase, case-sensitive) in the sample log text.
Hint: Look directly for the literal string WARNING, making sure not to activate the i flag.