Modular lesson (1/4)
Literal patterns
In a regex, "normal" alphanumeric characters simply match themselves: write
ciao as the pattern and the engine will look for the substring ciao inside
the text. There is no magic involved: just a left-to-right scan, position by position.
Pattern: ciao
Sample: Buongiorno, ciao mondo! Ti dico anche ciao.
                    ^^^^                      ^^^^
A match always carries two key pieces of information:
- the matched text (here ciao);
- the index (0-based offset) where it starts in the sample (here 12 and 38).
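As a minimal sketch of those two pieces of information, the snippet below uses Python's `re` module (an assumption: the lesson itself is not tied to any programming language) to list the matched text and the 0-based start offset of each match in the sample above.

```python
import re

sample = "Buongiorno, ciao mondo! Ti dico anche ciao."

# Each match object exposes the matched text (group) and its 0-based start offset.
matches = [(m.group(), m.start()) for m in re.finditer(r"ciao", sample)]

print(matches)  # [('ciao', 12), ('ciao', 38)]
```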
By default, regexes are case-sensitive: ciao does not match Ciao or
CIAO. To ignore case, add the i flag (case-insensitive).
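A small illustration of the difference, again assuming Python, where `re.IGNORECASE` plays the role of the i flag. The test string is made up for this example.

```python
import re

# Hypothetical test string containing three casings of the same word.
text = "ciao Ciao CIAO"

# Default behaviour: case-sensitive, so only the lowercase form matches.
strict = re.findall(r"ciao", text)

# re.IGNORECASE is Python's counterpart of the i flag.
relaxed = re.findall(r"ciao", text, flags=re.IGNORECASE)

print(strict)   # ['ciao']
print(relaxed)  # ['ciao', 'Ciao', 'CIAO']
```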
How the engine works
The regex engine inspects the text one character at a time. When searching for the literal pattern ciao, it first looks for the letter c. Once found, it checks whether the next character is i, then a, and finally o. If any of these checks fails, the engine abandons that attempt, moves on to the next starting position, and starts looking for c again.
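The procedure just described can be sketched as a naive scan in Python. This is only an illustration of the idea, not how any real engine is implemented (production engines use far more optimized search strategies). The function name `find_literal` is hypothetical.

```python
def find_literal(pattern: str, text: str) -> list:
    """Return the 0-based start offset of every non-overlapping occurrence."""
    if not pattern:
        raise ValueError("pattern must be non-empty")

    starts = []
    start = 0
    while start + len(pattern) <= len(text):
        offset = 0
        # Compare one character at a time: c, then i, then a, then o.
        while offset < len(pattern) and text[start + offset] == pattern[offset]:
            offset += 1
        if offset == len(pattern):
            starts.append(start)
            start += len(pattern)  # resume scanning right after the match
        else:
            start += 1  # mismatch: retry from the next starting position
    return starts


sample = "Buongiorno, ciao mondo! Ti dico anche ciao."
print(find_literal("ciao", sample))  # [12, 38]
```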
Literal patterns and the g flag
Without the g (global) flag the engine stops at the first match it finds. With
g it keeps going and collects every subsequent match until the end of the string. In this course we will
almost always turn g on: we want to see every match present in the text.
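Python has no g flag as such, so the sketch below approximates the two behaviours with two different calls: `search` returns only the first match, while `finditer` walks through all of them. Treat the mapping as an analogy rather than an exact equivalent of the flag.

```python
import re

sample = "Buongiorno, ciao mondo! Ti dico anche ciao."
pattern = re.compile(r"ciao")

# Comparable to running without g: stop at the first match.
first = pattern.search(sample)

# Comparable to running with g: collect every match up to the end of the string.
all_starts = [m.start() for m in pattern.finditer(sample)]

print(first.start())  # 12
print(all_starts)     # [12, 38]
```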
Try it
Find every exact (case-sensitive) occurrence of the word `ciao` in the sample. Hint: remember the `g` flag.
Hint
The pattern is the word itself. Uppercase letters must not match: no i flag.
Review exercise
Find every occurrence of the word `errore` in the text, ignoring upper/lower case.
Hint
Same word as the pattern, but add the i flag (case-insensitive) alongside g.
Additional challenge
Identify and collect all exact occurrences of the word `WARNING` (uppercase, case-sensitive) in the sample log text.
Hint
Look directly for the literal string WARNING, making sure not to activate the i flag.