Filter/Match by String
Description
Finds matches between the STRING-columns in the inputs.
Several comparison functions are available: equal, contains, containsWholeWord, startsWith, endsWith, levenshtein (edit distance) and jaro-winkler.
Input
A [OBJ,STRING]: a list of candidates, in which the STRING-column will be used for comparison and the OBJ-column will be the result
B [STRING]: a list of candidate strings, to be used for comparison
Output
FILTER [OBJ,STRING]: the filtered [OBJ,STRING] from A
MATCH [OBJ,STRING,OBJ]: the matched [OBJ,STRING] from A and [STRING] from B
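The difference between the two outputs can be illustrated with a small Python sketch. It is not the block's implementation: the tuple representation of rows, the function name filter_and_match, and the choice to emit each row of A at most once in FILTER are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical in-memory representation of the inputs:
# A is a list of (OBJ, STRING) rows, B is a list of strings.
RowA = Tuple[str, str]


def filter_and_match(
    a: List[RowA],
    b: List[str],
    compare: Callable[[str, str], bool],
) -> Tuple[List[RowA], List[Tuple[str, str, str]]]:
    """Return (FILTER, MATCH) for a given comparison predicate."""
    # MATCH: one row per (A row, B string) pair that satisfies the comparison.
    matched = [(obj, s, t) for obj, s in a for t in b if compare(s, t)]
    # FILTER: the rows of A that match at least one string in B.
    # Assumption: each A row appears at most once, in input order.
    filtered = [(obj, s) for obj, s in a if any(compare(s, t) for t in b)]
    return filtered, matched


a = [("o1", "apple pie"), ("o2", "banana"), ("o3", "apple")]
b = ["apple", "pie"]
filtered, matched = filter_and_match(a, b, lambda s, t: t in s)
```

With a "contains"-style predicate, the row ("o1", "apple pie") appears once in FILTER but twice in MATCH, once for each string in B that it contains.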
Parameters
Comparison: comparison function to use
  equal: the strings must be equal
  contains: the string in B must be contained in A
  containsWholeWord: the string in B must be contained in A, as a whole word (only punctuation/spaces around)
  startsWith: the string in A must start with B
  endsWith: the string in A must end with B
  levenshtein: the string in A may differ from B by at most Max edit-distance edits (character insertions, deletions or substitutions). The distance does not affect the score of the match.
  jaro-winkler: the strings in A and B must have a Jaro-Winkler similarity score not smaller than Min similarity. The similarity does not affect the score of the match.
Case-sensitive: if set to false, upper/lower case is ignored
Output scores can be aggregated and/or normalized.
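The comparison functions listed above can be approximated in Python as follows. This is a sketch under stated assumptions, not the block's actual code: the function names and the max_edit_distance argument are hypothetical, whole-word matching is approximated with a regular expression that forbids word characters on either side, levenshtein uses the standard definition (insertions, deletions and substitutions), and jaro-winkler is left out for brevity.

```python
import re


def levenshtein(s: str, t: str) -> int:
    """Standard Levenshtein distance, computed row by row."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(
                prev[j] + 1,               # deletion from s
                cur[j - 1] + 1,            # insertion into s
                prev[j - 1] + (cs != ct),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def compare(a: str, b: str, comparison: str,
            case_sensitive: bool = True,
            max_edit_distance: int = 1) -> bool:
    """Return True if string a (from A) matches string b (from B)."""
    if not case_sensitive:
        # Assumption: ignoring case is modelled as lower-casing both sides.
        a, b = a.lower(), b.lower()
    if comparison == "equal":
        return a == b
    if comparison == "contains":
        return b in a
    if comparison == "containsWholeWord":
        # Assumption: a "whole word" has no word character directly
        # before or after it (string edges also count as boundaries).
        pattern = r"(?<!\w)" + re.escape(b) + r"(?!\w)"
        return re.search(pattern, a) is not None
    if comparison == "startsWith":
        return a.startswith(b)
    if comparison == "endsWith":
        return a.endswith(b)
    if comparison == "levenshtein":
        return levenshtein(a, b) <= max_edit_distance
    raise ValueError(f"unsupported comparison: {comparison}")
```

For example, compare("Hard Rock, live", "rock", "containsWholeWord", case_sensitive=False) holds, while compare("rocket", "rock", "containsWholeWord") does not, although "rocket" does satisfy both contains and startsWith for "rock".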