IDF

Description

Computes the IDF (inverse document frequency) of each string T in the DT pairs. Strings are expected to be tokens.

Input

  • DT [OBJ,STRING]: a list of object-string pairs.

Output

  • Tidf [STRING]: IDF-weighted strings
  • DTidf [STRING]: IDF-weighted pairs

Parameters

  • Property Q: The lower Q is, the more strongly "obscure terms" are penalized compared to standard IDF. The range is ]0, 1] (0 excluded, 1 included). Q=1 yields standard IDF. A suggested starting point is 0.6.

Output scores can be aggregated and/or normalized.
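
As a point of reference, the Q=1 case (standard IDF) can be sketched as below. This is a minimal illustration, not this component's implementation: the function names `standard_idf` and `weight_pairs` are hypothetical, and the sketch assumes the natural logarithm, the textbook form idf(T) = ln(N / df(T)), and N counted as the number of distinct objects in DT. The Q-based penalty is not modeled, since its formula is not given here.

```python
import math
from collections import defaultdict


def standard_idf(dt_pairs):
    """Return {token: idf} for (object, token) pairs, using idf = ln(N / df).

    N is the number of distinct objects; df is the number of distinct
    objects a token occurs in. Both choices are assumptions of this sketch.
    """
    objects_per_token = defaultdict(set)
    all_objects = set()
    for obj, token in dt_pairs:
        objects_per_token[token].add(obj)
        all_objects.add(obj)
    n = len(all_objects)
    return {tok: math.log(n / len(objs)) for tok, objs in objects_per_token.items()}


def weight_pairs(dt_pairs, idf):
    """Attach each token's IDF score to every (object, token) pair."""
    return [(obj, token, idf[token]) for obj, token in dt_pairs]


# Hypothetical input: three objects, three distinct tokens.
dt = [
    ("doc1", "apple"), ("doc1", "pear"),
    ("doc2", "apple"),
    ("doc3", "apple"), ("doc3", "kiwi"),
]
t_idf = standard_idf(dt)          # analogous to the Tidf output
dt_idf = weight_pairs(dt, t_idf)  # analogous to the DTidf output
```

In this sample, "apple" occurs in every object and so receives a score of 0, while "pear" and "kiwi" each occur in one of three objects and receive ln(3).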