Match by Double (Blocking)

Description

Finds matches between the DOUBLE-columns of A and B. Only the pairs listed in Candidates are compared.

Input

  • A [OBJ,DOUBLE]: a list of candidates, in which the DOUBLE-column is used for comparison and the OBJ-column is emitted in the result
  • Candidates [OBJ,OBJ]: candidate pairs; only As and Bs that appear in Candidates will be matched
  • B [OBJ,DOUBLE]: a list of candidates, in which the DOUBLE-column is used for comparison and the OBJ-column is emitted in the result

Output

  • RESULT [OBJ,OBJ]: the matched objects from A and B
  • NOTA [OBJ]: the objects from A that did not match with an item from B
  • NOTB [OBJ]: the objects from B that did not match with an item from A
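
The relationship between the inputs and the three outputs can be sketched in Python. This is only an illustration, not the actual implementation: the function name match_blocked, the dict-based representation of A and B, and the compare callback are all hypothetical. It also assumes that NOTA and NOTB report every unmatched object, including objects that never appear in Candidates, which the description above does not state explicitly.

```python
def match_blocked(a, b, candidates, compare):
    """Illustrative sketch only.

    a, b:        dict mapping object -> double (stand-ins for [OBJ,DOUBLE])
    candidates:  iterable of (obj_a, obj_b) pairs (stand-in for [OBJ,OBJ])
    compare:     function (double, double) -> bool
    """
    result = []
    for obj_a, obj_b in candidates:
        # Only pairs listed in Candidates are ever compared (the blocking step).
        if obj_a in a and obj_b in b and compare(a[obj_a], b[obj_b]):
            result.append((obj_a, obj_b))

    matched_a = {x for x, _ in result}
    matched_b = {y for _, y in result}
    # Assumption: every unmatched object is reported, even if it was
    # never part of a candidate pair.
    nota = [x for x in a if x not in matched_a]
    notb = [y for y in b if y not in matched_b]
    return result, nota, notb


a = {"a1": 1.0, "a2": 5.0}
b = {"b1": 1.0, "b2": 7.0}
candidates = [("a1", "b1"), ("a2", "b2")]
result, nota, notb = match_blocked(a, b, candidates, lambda x, y: x == y)
```

In this example only a1 and b1 end up in RESULT; a2 and b2 were compared but did not match, so they land in NOTA and NOTB respectively.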

Parameters

  • Comparison: Comparison function to use (=,!=,<,>,<=,>=,distance)
  • Max distance: when Comparison is set to distance, the match is valid only if the difference between the two doubles is not greater than this value
  • Slope: Positive value that determines the slope of the ranking curve. The higher the slope, the closer together the resulting scores.
  • Exclude self-matches: when enabled, a match is not emitted if the objects in A and B are the same. Mostly useful when A and B come from the same source
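
As a rough sketch of how Comparison, Max distance and Exclude self-matches might interact for a single candidate pair, consider the Python function below. The name is_match and its signature are hypothetical. The sketch assumes that "difference" means the absolute difference and that self-matches are detected by object equality. Slope is left out because the ranking curve is not specified here.

```python
import operator

# Mapping from the Comparison parameter values to Python operators.
COMPARISONS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def is_match(obj_a, val_a, obj_b, val_b, comparison,
             max_distance=0.0, exclude_self=False):
    """Illustrative sketch: decide whether one candidate pair matches."""
    # Assumption: a self-match means the two objects compare equal.
    if exclude_self and obj_a == obj_b:
        return False
    if comparison == "distance":
        # Assumption: the difference is taken as an absolute value.
        return abs(val_a - val_b) <= max_distance
    return COMPARISONS[comparison](val_a, val_b)


close_enough = is_match("x", 1.0, "y", 1.4, "distance", max_distance=0.5)
too_far = is_match("x", 1.0, "y", 2.0, "distance", max_distance=0.5)
self_pair = is_match("x", 1.0, "x", 1.0, "=", exclude_self=True)
```

Here close_enough is true, too_far is false, and self_pair is false because both sides refer to the same object.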