Relation:regex

Compares and replaces text based on a regular expression

text regex pattern

regexreplace(text, pattern, replacepattern)

text: the text to be matched

pattern: any valid regular expression

replace

columnlist: a comma separated list of valid names

replacepattern: the replacement text; it can reference captured subexpressions as $1...

pattern	description
.	Any character
[a-z]	Any character of set
[^a-z]	Any character not in set
\d \D	Digit, Not a Digit
\w \W	Word character [a-zA-Z0-9_], Not a word character
\s \S	Space, Not space
\b \B	Word boundary, Not word boundary
\n \r \t \f \-	Newline, carriage return, tab, form feed, literal hyphen
(abc)	Subexpression, can be captured as $1 in replacepattern
x?	0 or 1 x
x*	0 or more x
x+	1 or more x
x+?	1 or more x, not greedy
x{i,j}	i to j times x
a|b	a or b
^ $	Beginning and end
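
A few rows of the table above can be tried out in a short sketch. It uses Python's `re` module as a stand-in, which is an assumption: the tool itself references PCRE, and the two engines agree on these basic constructs but not on everything. The sample strings are made up for illustration.

```python
import re

# Greedy vs. non-greedy: x+ takes as many characters as it can, x+? as few.
greedy = re.search(r"a+", "caaat").group()
lazy = re.search(r"a+?", "caaat").group()

# Alternation plus a bounded repeat: "ab" or "cd", followed by 2 to 3 digits.
bounded = re.findall(r"(?:ab|cd)\d{2,3}", "ab12 cd3456 ef78")

# Word boundaries: match "der" only when it stands as a whole word.
whole_word = re.findall(r"\bder\b", "der Wunder der Kinder")

print(greedy, lazy, bounded, whole_word)
```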

Using the sample relation films.csv

read "films.csv" select director regex "[Gg]od.*"

read "films.csv" extend article regexreplace(film,"(\s?)([Ll]e|[Dd]er|[Dd]ie)(\s)","$1the$3")

film	director	year	article
A bout de souffle	Godard	1960	A bout de souffle
Tirez sur le pianiste	Truffaut	1960	Tirez sur the pianiste
Cléo de 5 à 7	Varda	1962	Cléo de 5 à 7
Jules et Jim	Truffaut	1962	Jules et Jim
Pierrot le fou	Godard	1965	Pierrot the fou
Week-End	Godard	1967	Week-End
Die verlorene Ehre der Katharina Blum	von Trotta	1975	the verlorene Ehre the Katharina Blum
Der starke Ferdinand	Kluge	1976	the starke Ferdinand
Sans toi ni loi	Varda	1985	Sans toi ni loi
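
The two examples can be reproduced outside the tool as a rough cross-check. The sketch below is written in Python, not in the tool's own language, and rests on two assumptions: that the `regex` comparison matches anywhere in the value, and that the article pattern uses `\s` (whitespace) around the article, which is what the result table implies. Python writes group references as `\g<1>` where the tool uses `$1`.

```python
import re

# Film and director columns of the sample relation, copied from the table above.
films = [
    ("A bout de souffle", "Godard"),
    ("Tirez sur le pianiste", "Truffaut"),
    ("Cléo de 5 à 7", "Varda"),
    ("Jules et Jim", "Truffaut"),
    ("Pierrot le fou", "Godard"),
    ("Week-End", "Godard"),
    ("Die verlorene Ehre der Katharina Blum", "von Trotta"),
    ("Der starke Ferdinand", "Kluge"),
    ("Sans toi ni loi", "Varda"),
]

# Counterpart of: select director regex "[Gg]od.*"
godard_films = [film for film, director in films
                if re.search(r"[Gg]od.*", director)]

# Counterpart of: regexreplace(film, pattern, "$1the$3")
article_pattern = re.compile(r"(\s?)([Ll]e|[Dd]er|[Dd]ie)(\s)")

def add_article(title: str) -> str:
    """Replace a French or German article with 'the', keeping the whitespace."""
    return article_pattern.sub(r"\g<1>the\g<3>", title)

articles = [add_article(film) for film, _ in films]
for film, article in zip((f for f, _ in films), articles):
    print(f"{film}\t{article}")
```

Titles such as "Jules et Jim" stay unchanged because the "le" inside "Jules" is not followed by whitespace.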

A bad regex pattern may crash the application.

Use a web app such as regexpal to test patterns.
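
Patterns can also be screened for syntax errors in a script before they are used. The following is only a sketch: `is_valid_pattern` is a hypothetical helper, and it checks against Python's `re` engine, which is close to PCRE but not identical, so a pattern that passes here is not guaranteed to be accepted by the application. It also does not detect patterns that are valid but extremely slow.

```python
import re

def is_valid_pattern(pattern: str) -> bool:
    """Return True if the pattern compiles under Python's re engine."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

print(is_valid_pattern("[Gg]od.*"))  # well-formed pattern
print(is_valid_pattern("(s?"))       # unbalanced parenthesis
print(is_valid_pattern("[a-z"))      # unterminated character set
```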

Reference: PCRE