Result Description
With the SPARQUE Data Virtualization Layer you can create a graph model of your dataset(s). But what do you give to your users when they access your data? This is where SPARQUE Desk comes in. In SPARQUE Desk you design your tailored search strategies.
There is one step before we can start. Your designed strategies return results (objects, resources, items, etc.). The final part of data virtualization is defining the information returned for each result. We call these result descriptions. By default, all available attributes (e.g., a name, a description, a date) are returned. If this default is sufficient for your case, you do not need to define any result descriptions.
Including or Excluding Information
In some cases, however, you may not want to return all attributes. For example, your data may contain information that should not be shared with everyone; financial data is a common case. By defining a result description, you can exclude this information from what is returned.
In other cases you may want to return more information than the attributes alone. Consider the case where the author of a document is modeled as a relation. Because the author is then not an attribute of the document, it is not included in the result by default. In the result description, you define which relations to include in a result.
A third case is that the information you want to provide for a result differs depending on the context. In autocompletion you may only need a label, whereas in full-text search you want to return as much detail as possible. You achieve this by providing result descriptions at different levels.
Defining Result Descriptions
Result descriptions are stored in the application.xml workspace in the resultdescriptions element.
Result descriptions are defined per class. So you can define one result description for articles and another one for authors.
Example of a result description:
<resultconcept class="http://example.org/article" level="1" mode="FULL">
  <attribute name="title" aggr="MOST_FREQUENT"/>
  <attribute name="author" aggr="AS_SET"/>
  <include cardinality="*" prefix="image" predicate="hasImage"
           class="http://example.org/image" />
</resultconcept>
Example of a result generated by this result description:
{
  "id": "article1001",
  "class": [ "http://example.org/article" ],
  "attributes": {
    "title": "Information retrieval",
    "author": ["John Sea", "Mary Rec"],
    "image": [
      {
        "url": "http://example.org/image1_thumb.jpg",
        "size": "100x100"
      },
      {
        "url": "http://example.org/image1_full.jpg",
        "size": "1028*768"
      }
    ]
  }
}
In this example, the content of the attributes object is generated by the result description. The id and the class are included for each result.
For the title, we added a single value (aggr=MOST_FREQUENT). For the author, we added all authors (aggr=AS_SET). Read more about the aggregation options below.
For the image, we included all images through an include relation. Each image is an object in itself, with a URL and a size. Read more about data inclusion below.
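As an illustration of how a client might consume such a result, the following minimal Python sketch parses the example result shown above and reads the values produced by the result description. It only uses the field names from the example; nothing here is specific to a SPARQUE client library.

```python
import json

# The example result from above, as a client would receive it.
result_json = """
{
  "id": "article1001",
  "class": ["http://example.org/article"],
  "attributes": {
    "title": "Information retrieval",
    "author": ["John Sea", "Mary Rec"],
    "image": [
      {"url": "http://example.org/image1_thumb.jpg", "size": "100x100"},
      {"url": "http://example.org/image1_full.jpg", "size": "1028*768"}
    ]
  }
}
"""

result = json.loads(result_json)
attributes = result["attributes"]

title = attributes["title"]      # single value (aggr=MOST_FREQUENT)
authors = attributes["author"]   # JSON array (aggr=AS_SET)
# Each included image is an object of its own, so we read its url property.
image_urls = [image["url"] for image in attributes.get("image", [])]

print(title)
print(authors)
print(image_urls)
```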
When you specify a result description for a class in FULL mode, as in the example above, only the attributes listed are included in the result. Attributes that are not listed in the result description are not displayed in the result.
If no result description is specified for a class, by default all direct attributes are used for the output display.
Note that the result descriptions are created at indexing time, are set project-wide, and are not affected by strategies. Strategies only determine which objects are returned as results and how relevant they are to the query; they do not determine how result objects are represented.
Modes
What to specify for a result description depends on the mode, set with <resultconcept ... mode="<MODE>">. The default mode is APPEND. There are three modes for result descriptions:
APPEND (default): Append attributes to the (automatically derived) set of direct attributes.
FULL: Start from an empty list and explicitly specify all attributes to be included in the result description.
REMOVE: Given the (automatically derived) set of direct attributes, specify the attributes that should be removed from this list.
APPEND is most useful when you just want to add a relation. FULL is useful when a short representation of objects is needed, such as one with only names in it. REMOVE can be used, for example, to exclude private information, such as internal comments.
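The effect of the three modes on the final attribute list can be pictured with a small Python model. This is only a sketch of the behavior described above, not SPARQUE code: the function name apply_mode and the sample attribute names are hypothetical.

```python
def apply_mode(direct_attributes, listed, mode="APPEND"):
    """Return the attribute names that end up in the result.

    direct_attributes: the automatically derived direct attributes.
    listed: the attributes named in the result description.
    """
    if mode == "APPEND":
        # Keep the direct attributes and add listed ones not already present.
        return direct_attributes + [a for a in listed if a not in direct_attributes]
    if mode == "FULL":
        # Start from an empty list: only the listed attributes are returned.
        return list(listed)
    if mode == "REMOVE":
        # Start from the direct attributes and drop the listed ones.
        return [a for a in direct_attributes if a not in listed]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical direct attributes of a document class.
direct = ["title", "abstract", "internal_comment"]

print(apply_mode(direct, ["author"], "APPEND"))
print(apply_mode(direct, ["title"], "FULL"))
print(apply_mode(direct, ["internal_comment"], "REMOVE"))
```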
Levels
Result descriptions are made per level (specified with the XML attribute level="<N>"). N must be a natural number (1 or higher). A result object can have a different representation in each level.
Using Result Descriptions in the REST API
By default, the REST API returns results at the default level, which is level 1.
To request results on a different level, add ?level=<n> to the request URL.
You can specify ?level=0 to get only the internal ID of the results (usually the URL of the object).
All levels (except level 0) can be customized by defining result descriptions.
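A client could add the level parameter as in the following Python sketch. The base URL is a placeholder, not a real SPARQUE endpoint, and the helper name results_url is hypothetical; only the level query parameter comes from the description above.

```python
from urllib.parse import urlencode

# Placeholder request URL; substitute the results URL of your own API.
BASE_URL = "https://example.org/api/results"


def results_url(base_url, level=None):
    """Return the request URL, optionally asking for a specific level."""
    if level is None:
        # No parameter: the API falls back to the default level (level 1).
        return base_url
    # Use "&" when the URL already carries query parameters.
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{urlencode({'level': level})}"


print(results_url(BASE_URL))           # default level
print(results_url(BASE_URL, level=0))  # internal IDs only
print(results_url(BASE_URL, level=2))  # a customized level
```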
Attributes
A result description specifies which attributes should be included. Attributes need to be defined as:
<attribute name="..." type="<type>" [options...]/>
The name is the identifier of the attribute in the data. In case of RDF this is a URL.
Type
The result description does not know the type of the attribute. By default it assumes type="STRING".
It is good practice to always include the type information in an attribute definition.
NOTE: Specifying the type attribute may be required in future versions.
If the type of the attribute does not match the type used in the graph, the attribute will have no value.
Formatting Options
To specify a format for a date (the default is yyyy-MM-dd), add format="<pattern>" as an XML attribute, for example format="yyyy-MM-dd".
Choosing Values
With fields="..." you define which property of the data should be selected. By default, this is the same as name. By defining the fields separately, you can rename attributes. For example:
<attribute name="creator" fields="author"/>
The author property is selected from the data and included in the result as the creator attribute.
You can also define multiple fields to select a value from. The value from the first available attribute will be selected. For example:
<attribute name="place" fields="city,municipality,country"/>
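The fallback across multiple fields can be modeled as follows. This Python sketch is an illustration only: the function name select_value is hypothetical, and it assumes that "available" means the property is present and non-empty, which the description above does not spell out.

```python
def select_value(record, fields):
    """Return the value of the first available field, or None.

    fields is the comma-separated list from the fields="..." XML attribute.
    Assumption: a field counts as available when it is present and not empty.
    """
    for field in fields.split(","):
        field = field.strip()
        if field in record and record[field] not in (None, ""):
            return record[field]
    return None


# Hypothetical source data for two objects.
with_city = {"city": "Utrecht", "country": "Netherlands"}
without_city = {"municipality": "De Bilt", "country": "Netherlands"}

print(select_value(with_city, "city,municipality,country"))
print(select_value(without_city, "city,municipality,country"))
```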
Aggregation Options
Since SPARQUE allows you to model your data freely, there may be multiple values for the same thing.
For example, the name could be both “John Doe” and “Doe, John”.
In these cases you can choose what to output with aggr="<value>", where <value> is one of:
AS_SET: List all possible values as a set (duplicates removed) in a JSON array. The order of the values is undefined.
AUTO_SET: List all possible values as a set (duplicates removed) in a JSON array, except when there is only one value. In this case, the value will be returned as is (without the array). The order of the values is undefined.
AS_SEQUENCE: List all possible values including duplicates in a JSON array. The order of the values is undefined.
MOST_FREQUENT: Pick the most frequent value.
LONGEST: Pick the string value that is the longest (in characters).
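The aggregation options listed above can be modeled with a short Python sketch. It is not SPARQUE's implementation, and the function name aggregate is hypothetical. Several details are assumptions: sets are sorted only to make the output reproducible (the real order is undefined), AUTO_SET counts values after duplicates are removed, the input is non-empty, and ties for MOST_FREQUENT and LONGEST go to the value seen first.

```python
from collections import Counter


def aggregate(values, aggr):
    """Model of the aggr options for a non-empty list of string values."""
    if aggr == "AS_SET":
        # Duplicates removed; sorted here only for reproducible output.
        return sorted(set(values))
    if aggr == "AUTO_SET":
        unique = sorted(set(values))
        # A single remaining value is returned as is, without the array.
        return unique[0] if len(unique) == 1 else unique
    if aggr == "AS_SEQUENCE":
        # All values, duplicates included.
        return list(values)
    if aggr == "MOST_FREQUENT":
        return Counter(values).most_common(1)[0][0]
    if aggr == "LONGEST":
        # Longest string, measured in characters.
        return max(values, key=len)
    raise ValueError(f"unknown aggregation: {aggr}")


names = ["John Doe", "Doe, John", "John Doe"]
for option in ("AS_SET", "AUTO_SET", "AS_SEQUENCE", "MOST_FREQUENT", "LONGEST"):
    print(option, aggregate(names, option))
```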
Includes
With data inclusions, you can add information about related objects. For example, if there is a relationship between a document and its author, we want to include the author's name in the document. This is called a relational inclusion. You can also include information from a tree (ancestors or descendants).
Relational Include
<include type="relation" prefix="..." predicate="..." class="..." cardinality="..."/>
predicate: Predicate to traverse (optionally add reverse="true" to invert the traversal).
class: The desired target class.
prefix: The name in the output representation.
cardinality: Determines whether it will be represented as a single value or a set of values (values can be ?, 1, *, +).
level: The representation level to include.
dense: Determines whether the objects should be represented in dense formatting (avoiding white space).
Tree Include
<include type="tree" prefix="..." axis="..." tree="..."/>
tree: The name of the tree to traverse.
axis: Either “ancestor” or “descendant”.
class: The desired target type class (to filter the traversed items). This filter will be applied after the tree has been traversed.
prefix: The name in the output representation.
level: The representation level to include.
cardinality: Determines whether it will be represented as a single value or a set of values (values can be ?, 1, *, +).
dense: Determines whether the objects should be represented in dense formatting (avoiding white space).
distanceFrom: Ensures that only items that are closest to the node are included (set to 0 to include itself).
distanceTo: Ensures that only items that are the farthest away from the node are included. Set to 1 to only include the parent or child. Set to 2 to also include grandparent or grandchild.
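One way to read distanceFrom and distanceTo is as the lower and upper bound on the number of tree steps between the node and the included items. The Python sketch below models that reading for the ancestor axis. It is an interpretation, not SPARQUE's implementation: the function name, the sample tree, and the default lower bound of 1 (which excludes the node itself) are all assumptions.

```python
def ancestors_in_range(node, parent_of, distance_from=1, distance_to=None):
    """Collect the node's ancestors whose distance lies within the bounds.

    parent_of maps each node to its parent; root nodes have no entry.
    Distance 0 is the node itself, 1 its parent, 2 its grandparent.
    """
    selected = []
    current, distance = node, 0
    while current is not None:
        if distance_to is not None and distance > distance_to:
            break  # everything further up is out of range
        if distance >= distance_from:
            selected.append(current)
        current = parent_of.get(current)
        distance += 1
    return selected


# Hypothetical category tree: sneakers -> shoes -> clothing.
parent_of = {"sneakers": "shoes", "shoes": "clothing"}

print(ancestors_in_range("sneakers", parent_of, distance_to=1))
print(ancestors_in_range("sneakers", parent_of, distance_to=2))
print(ancestors_in_range("sneakers", parent_of, distance_from=0, distance_to=1))
```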