Extract Strings (1 per object)

Description

Extracts the string value of a specific property from each object. For example, it can extract the title of a document or the name of a person. The property is provided as a parameter. With additional blocks, the extracted values can be transformed or used in a retrieval model. This variant extracts exactly 1 string per object:

  • If an object has more than 1 string for a given property, the "best" one is chosen
  • If an object has no string for a given property, a default value is used
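
The two rules above can be illustrated with a minimal Python sketch. It is not the block's actual implementation: the data model, the function name extract_one_per_object, and the reading of "best" as "highest score" are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical data model: each object ID maps to its candidate strings
# for one property, each with a score. Not the block's real input format.
Candidates = Dict[str, List[Tuple[str, float]]]


def extract_one_per_object(candidates: Candidates, default_string: str) -> Dict[str, str]:
    """Return exactly one string per object."""
    result: Dict[str, str] = {}
    for obj, values in candidates.items():
        if values:
            # Assumption: "best" means highest score; max() keeps the
            # first candidate when scores tie.
            best_value, _score = max(values, key=lambda pair: pair[1])
            result[obj] = best_value
        else:
            # No string for this property: fall back to the default.
            result[obj] = default_string
    return result


docs: Candidates = {
    "doc1": [("Moby Dick", 0.9), ("The Whale", 0.4)],
    "doc2": [],
}
titles = extract_one_per_object(docs, default_string="(untitled)")
print(titles)
```

Every input object ends up with exactly one string, whether it started with several candidates or with none.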

Input

  • SOURCE [OBJ]: a list of objects

Output

  • PAIR [OBJ,STRING]: for each object in the input source, the extracted string value is provided as the second column in [OBJ,STRING]. Because this variant extracts exactly 1 string per object, each input object appears in exactly one pair.
  • RESULT [STRING]: the extracted values on their own, detached from their parent object. Use the score aggregation parameter to define how multiple occurrences of the same value are combined.
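
The relation between the two outputs can be sketched as follows. This is an illustration under stated assumptions, not the block's implementation: the (object, string, score) tuple layout, the function name to_result, and the choice of sum and max as aggregation functions are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical PAIR row: (object, extracted string, score).
Pair = Tuple[str, str, float]


def to_result(
    pairs: List[Pair],
    aggregate: Callable[[Iterable[float]], float] = sum,
) -> Dict[str, float]:
    """Drop the object column and merge rows that share the same string."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for _obj, value, score in pairs:
        grouped[value].append(score)
    # The aggregation function decides how repeated values are scored.
    return {value: aggregate(scores) for value, scores in grouped.items()}


pairs: List[Pair] = [
    ("person1", "Smith", 0.5),
    ("person2", "Smith", 0.25),
    ("person3", "Jones", 1.0),
]
result_sum = to_result(pairs)                 # repeated values add up
result_max = to_result(pairs, aggregate=max)  # repeated values keep the top score
print(result_sum)
print(result_max)
```

With summing, a string shared by many objects ranks higher; with max, it ranks only as high as its strongest occurrence.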

Parameters

  • Property: the property to extract the values from. Use * to extract values from all properties.
  • Use sub-properties: when set to true, the values of all sub-properties are also included. Sub-properties can be defined in the data with the rdfs:subPropertyOf relation.
  • Language: when a language is selected, only the strings in this language are extracted. This uses the language tags that are defined in the data.
  • DefaultString: when an object has no value for the given property, use this string.
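
As a rough illustration of how the Property, Use sub-properties, and Language parameters narrow the candidate strings, consider the sketch below. The Literal record, the function names, and the property names (dc:title, ex:altTitle) are hypothetical. It also assumes that sub-properties are followed transitively and that each property has at most one direct super-property, which is a simplification.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class Literal(NamedTuple):
    """Hypothetical record for one string value in the data."""
    obj: str
    prop: str
    value: str
    lang: Optional[str]  # language tag from the data, if any


def matching_properties(prop: str, sub_property_of: Dict[str, str], use_sub: bool) -> Set[str]:
    """Return prop plus, optionally, everything declared below it."""
    props = {prop}
    changed = use_sub
    while changed:
        changed = False
        for child, parent in sub_property_of.items():
            if parent in props and child not in props:
                props.add(child)
                changed = True
    return props


def select(literals: List[Literal], prop: str, sub_property_of: Dict[str, str],
           use_sub: bool = False, language: Optional[str] = None) -> List[str]:
    """Filter candidate strings by property and language."""
    # "*" means every property qualifies.
    props = None if prop == "*" else matching_properties(prop, sub_property_of, use_sub)
    selected = []
    for lit in literals:
        if props is not None and lit.prop not in props:
            continue
        if language is not None and lit.lang != language:
            continue
        selected.append(lit.value)
    return selected


literals = [
    Literal("doc1", "dc:title", "Moby Dick", "en"),
    Literal("doc1", "ex:altTitle", "The Whale", "en"),
    Literal("doc1", "dc:title", "Moby Dick ou le Cachalot", "fr"),
]
sub_property_of = {"ex:altTitle": "dc:title"}  # child -> parent

plain = select(literals, "dc:title", sub_property_of)
with_sub_en = select(literals, "dc:title", sub_property_of, use_sub=True, language="en")
all_fr = select(literals, "*", sub_property_of, language="fr")
print(plain, with_sub_en, all_fr)
```

If the filtered list for an object comes back empty, that is the case in which DefaultString would be used.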