Stem
Description
Extracts the stem (available for various languages) from all strings in a [OBJ,STRING] input.
Strings are expected to be single words (see Tokenize block).
Input
SOURCE [OBJ,STRING]: a 2-column input with an object-string pair. Typically obtained with the Extract string and Tokenize blocks.
Output
RESULT [OBJ,STRING]: the pairs from SOURCE, where the string has been modified
STRINGS [STRING]: the modified strings, without the object they were paired to
Parameters
Stemming: strings (single words) can be stemmed for a specific language or left as they are
Output scores can be aggregated and/or normalized.
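The relationship between SOURCE, RESULT and STRINGS can be illustrated with a minimal Python sketch. This is not the block's implementation: the function `stem_block`, the `stemming` argument, and the crude suffix-stripping rule in `toy_english_stem` are hypothetical stand-ins chosen for illustration. The stemming algorithm the block actually uses is not specified here.

```python
from typing import Callable, Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (object identifier, single-word string)


def toy_english_stem(word: str) -> str:
    """Crude suffix stripping; a hypothetical stand-in for a real stemmer."""
    lowered = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters remain.
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


# Assumed mapping from a language name to a stemming function.
STEMMERS: Dict[str, Callable[[str], str]] = {"english": toy_english_stem}


def stem_block(
    source: List[Pair], stemming: Optional[str] = None
) -> Tuple[List[Pair], List[str]]:
    """Return (RESULT, STRINGS) for a list of [OBJ,STRING] pairs.

    With stemming=None the strings are left as they are.
    """
    stem = STEMMERS[stemming] if stemming else (lambda w: w)
    result = [(obj, stem(string)) for obj, string in source]
    strings = [string for _, string in result]
    return result, strings


source = [("doc1", "walking"), ("doc1", "dogs"), ("doc2", "jumped")]
result, strings = stem_block(source, stemming="english")
print(result)   # [('doc1', 'walk'), ('doc1', 'dog'), ('doc2', 'jump')]
print(strings)  # ['walk', 'dog', 'jump']
```

RESULT keeps each object paired with its stemmed string, while STRINGS carries the same strings in the same order without the objects.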