Relation:regex
Compares and replaces text based on a regular expression
Description
text regex pattern
regexreplace(text, pattern, replacepattern)
Parameters
text: the text to be matched
pattern: any valid regular expression
replacepattern: the replacement text; it can refer to captured subexpressions as $1, $2, ...
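The two operations can be approximated outside the tool to get a feel for the parameters. The following is a minimal Python sketch, not the tool's implementation: the helper names `regex_match` and `regex_replace` are hypothetical, the match is assumed to succeed when the pattern is found anywhere in the text, and `$1`-style references are translated to Python's `\g<1>` form because Python's `re` module does not use the `$` notation.

```python
import re


def regex_match(text: str, pattern: str) -> bool:
    """Return True if pattern is found anywhere in text (assumed semantics)."""
    return re.search(pattern, text) is not None


def regex_replace(text: str, pattern: str, replacepattern: str) -> str:
    """Replace every match of pattern, accepting $1, $2, ... in replacepattern."""
    # Translate "$1" into Python's "\g<1>" group reference syntax.
    python_repl = re.sub(r"\$(\d+)", r"\\g<\1>", replacepattern)
    return re.sub(pattern, python_repl, text)


print(regex_match("Godard", "[Gg]od.*"))
print(regex_replace("Pierrot le fou", r"(\s?)([Ll]e|[Dd]er|[Dd]ie)(\s)", "$1the$3"))
```

Here `$1` and `$3` carry over the whitespace captured around the article, so only the article itself is replaced.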
Regex cheat sheet
| pattern | description |
|---|---|
| . | Any character |
| [a-z] | Any character of set |
| [^a-z] | Any character not in set |
| \d \D | Digit, Not a Digit |
| \w \W | Word character [a-zA-Z0-9_], Not a word character |
| \s \S | Whitespace, Not whitespace |
| \b \B | Word boundary, Not word boundary |
| \n \r \t \f \- | Newline, carriage return, tab, form feed, literal hyphen |
| (abc) | Subexpression, can be captured as $1 in replacepattern |
| x? | 0 or 1 x |
| x* | 0 or more x |
| x+ | 1 or more x |
| x+? | 1 or more x, not greedy |
| x{i,j} | i to j times x |
| a\|b | a or b |
| ^ $ | Beginning and end |
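A few rows of the cheat sheet can be tried out in isolation. The sketch below uses Python's `re` module as a stand-in; its dialect is close to PCRE for these basic constructs but is not identical, so results from the tool itself may differ in edge cases. The sample string is taken from the film titles used in the examples.

```python
import re

sample = "Week-End 1967"

digits = re.findall(r"\d+", sample)        # runs of digits
words = re.findall(r"\w+", sample)         # runs of word characters; "-" and " " split them
greedy = re.search(r"e+", "Weeek").group()    # takes as many "e" as possible
lazy = re.search(r"e+?", "Weeek").group()     # takes as few "e" as possible
bounded = re.search(r"\d{2,3}", "1967").group()  # between 2 and 3 digits, greedy
whole_word = re.search(r"\bEnd\b", sample) is not None  # "End" sits between word boundaries
inside_word = re.search(r"\bnd\b", sample) is not None  # "nd" is preceded by a word character
either = re.findall(r"a|b", "cab")         # alternation
at_start = re.search(r"^Week", sample) is not None
at_end = re.search(r"1967$", sample) is not None

print(digits, words, greedy, lazy, bounded)
print(whole_word, inside_word, either, at_start, at_end)
```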
Examples
Using the sample relation films.csv
read "films.csv"
select director regex "[Gg]od.*"
| film | director | year |
|---|---|---|
| A bout de souffle | Godard | 1960 |
| Pierrot le fou | Godard | 1965 |
| Week-End | Godard | 1967 |
read "films.csv"
extend article regexreplace(film,"(\s?)([Ll]e|[Dd]er|[Dd]ie)(\s)","$1the$3")
| film | director | year | article |
|---|---|---|---|
| A bout de souffle | Godard | 1960 | A bout de souffle |
| Tirez sur le pianiste | Truffaut | 1960 | Tirez sur the pianiste |
| Cléo de 5 à 7 | Varda | 1962 | Cléo de 5 à 7 |
| Jules et Jim | Truffaut | 1962 | Jules et Jim |
| Pierrot le fou | Godard | 1965 | Pierrot the fou |
| Week-End | Godard | 1967 | Week-End |
| Die verlorene Ehre der Katharina Blum | von Trotta | 1975 | the verlorene Ehre the Katharina Blum |
| Der starke Ferdinand | Kluge | 1976 | the starke Ferdinand |
| Sans toi ni loi | Varda | 1985 | Sans toi ni loi |
Comments
A bad regex pattern may crash the application.
Use a web app such as regexpal to test patterns.
Reference: PCRE
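Besides an online tester, a pattern can be checked for syntax errors in a scripting language before it is passed to the tool. This is a minimal sketch using Python's `re` module, with a hypothetical helper name `check_pattern`. Python's regex dialect is not PCRE, so a pattern that compiles here is only likely, not guaranteed, to be accepted by the application.

```python
import re
from typing import Optional


def check_pattern(pattern: str) -> Optional[str]:
    """Return None if the pattern compiles, otherwise the error message."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


for candidate in ["[Gg]od.*", "(unclosed"]:
    problem = check_pattern(candidate)
    print(candidate, "->", "ok" if problem is None else problem)
```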
