Replace with RegEx [Strings]
Description
Transforms input strings using a regular expression replacement.
Input
SOURCE [STRING]: Input strings
Output
RESULT [STRING]: the modified strings
Parameters
- Pattern RegEx: the regular expression to use for the match in SOURCE.
- Replacement RegEx: the regular expression to use for the replacement in RESULT.
- Occurrences:
  - First: replace only the first occurrence in each input string
  - All: replace all the occurrences in each input string
- Case-sensitive: if set to false, upper/lower case is ignored
Output scores can be aggregated and/or normalized.
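The effect of Occurrences and Case-sensitive can be sketched in Python. This is only an illustration: the function name `replace_with_regex` and its arguments are hypothetical stand-ins for the parameters above, and Python's `re` module is not the PCRE engine used internally, so details may differ.

```python
import re

def replace_with_regex(source, pattern, replacement,
                       occurrences="All", case_sensitive=True):
    """Hypothetical stand-in: apply one replacement to each input string."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # In re.sub, count=0 means "replace every occurrence".
    count = 1 if occurrences == "First" else 0
    compiled = re.compile(pattern, flags)
    return [compiled.sub(replacement, s, count=count) for s in source]

source = ["a-b-c", "A-B-C"]
first_only = replace_with_regex(source, "-", "+", occurrences="First")
all_hits = replace_with_regex(source, "-", "+", occurrences="All")
ignore_case = replace_with_regex(source, "b", "x", case_sensitive=False)

print(first_only)   # ['a+b-c', 'A+B-C']
print(all_hits)     # ['a+b+c', 'A+B+C']
print(ignore_case)  # ['a-x-c', 'A-x-C']
```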
Regular Expressions
Regular expressions are internally evaluated by a PCRE engine. For a syntax reference, see this page. For a 1-page syntax reference, see this cheat-sheet.
Some of the Most Common Questions and Mistakes
- Regular expressions are different from glob patterns using wildcards. In particular, * does NOT mean "anything", .* does.
- All special characters (. * + ? | \ ( ) [ ] ^ $) must be escaped (prefixed with \) when they are meant literally in the Pattern RegEx. They are always meant literally (thus, no escaping!) in the Replacement RegEx (except group references, see below).
- Capturing groups are indicated by parentheses, and back-references by either \n or $n, with n being the n-th group in the pattern.
- Parentheses can also be used to group sub-expressions together, for example in choices: (one|two|three). To use parentheses only for grouping and not capturing, use the ?: prefix, as in (?:one|two|three).
- ^ indicates the beginning of an input text, or negation when used at the start of a bracketed character set (e.g., [^\d-_]).
- $ indicates the end of an input text.
- \b indicates a word boundary (spaces, punctuation, etc.).
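A few of these points can be checked quickly in Python. This is a rough sketch rather than the engine used here: Python's `re` module is close to PCRE for these basics, but its replacement strings use \1 or \g<1> for group references (not $1) and do interpret backslashes. All variable names are illustrative.

```python
import re

# Glob vs. regex: "*" repeats the previous token; ".*" matches any run of characters.
glob_style = re.fullmatch(r"file*", "filename")    # no match: "e*" only repeats "e"
regex_style = re.fullmatch(r"file.*", "filename")  # matches the whole string

# An unescaped "." matches any character; "\." matches a literal dot only.
unescaped = re.sub(r".", "", "3.14")   # every character is removed
escaped = re.sub(r"\.", "", "3.14")    # only the dot is removed

# (?:...) groups without capturing, so group 1 is the (\w+) part.
fruit = re.sub(r"(?:one|two|three) (\w+)", r"\1", "two apples")

# \b leaves the "cat" inside "concat" untouched.
pets = re.sub(r"\bcat\b", "dog", "cat concat cat.")

print(glob_style is None, regex_style is not None)
print(repr(unescaped), escaped, fruit, pets)
```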
Examples
- Normalize spaces (with Occurrences = All):
  Pattern RegEx: \s+
  Replacement RegEx: ⎵ (a single space)
- Turn Smith, John into John Smith:
  Pattern RegEx: ^([^,]+)\s*,\s*(.+)$
  Replacement RegEx: $2 $1
- Extract any day of the week (with Case-sensitive = false):
  Pattern RegEx: .*\b((?:mon|tues|wednes|thurs|fri|satur|sun)day)\b.*
  Replacement RegEx: $1
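For readers who want to try these patterns outside the tool, the three examples can be approximated with Python's `re.sub`. This is a sketch under two assumptions: Python's engine behaves like PCRE for these patterns, and the $1/$2 references are rewritten as \1/\2, which is the form Python expects. The sample strings are made up for illustration.

```python
import re

# Normalize spaces: collapse every run of whitespace into one space.
normalized = re.sub(r"\s+", " ", "too   many \t spaces")

# Turn "Smith, John" into "John Smith" by swapping the two captured groups.
swapped = re.sub(r"^([^,]+)\s*,\s*(.+)$", r"\2 \1", "Smith, John")

# Extract a day of the week, ignoring case. The alternatives use the full
# stems ("tues", "satur") so that "tuesday" and "saturday" can match.
day_pattern = r".*\b((?:mon|tues|wednes|thurs|fri|satur|sun)day)\b.*"
day = re.sub(day_pattern, r"\1", "See you next Friday at noon",
             flags=re.IGNORECASE)

print(normalized)  # too many spaces
print(swapped)     # John Smith
print(day)         # Friday
```

In this sketch, a string that contains no day name is returned unchanged, because the pattern does not match and nothing is replaced.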