Utility Functions

Utility functions perform common operations on your data within your data mappings. To make them available, declare the xmlns:su="com.spinque.tools.importStream.Utils" namespace on the stylesheet element:

<xsl:stylesheet 
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:spinque="com.spinque.tools.importStream.EmitterWrapper"
    xmlns:su="com.spinque.tools.importStream.Utils"
    extension-element-prefixes="spinque">

    <xsl:template match="table/row"/>

</xsl:stylesheet>

Below is a description of the most commonly used utility functions:

su:e

The su:e function generates an identifier for a resource based on its type and zero to five identifiers provided:

String e(String type[, String id1, String id2, String id3, String id4, String id5])

You use su:e to create an identifier for a resource when it is impossible or undesirable to create a URI for it. If you want to create a URI, use the function su:uri.

Call su:e with only the resource type when your data does not contain an identifier for the resource. su:e then returns the type with a universally unique identifier appended to it.

Given the following virtual XML fragment:

<table>
  <row line="2211">
    <field column="0" name="title">Pulp Fiction</field>
  </row>
</table>

the mapping:

<xsl:stylesheet 
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:spinque="com.spinque.tools.importStream.EmitterWrapper"
    xmlns:su="com.spinque.tools.importStream.Utils"
    extension-element-prefixes="spinque">

    <xsl:template match="table/row">
        <xsl:variable name="movieID" select="su:e('movie')"/>
        <spinque:relation subject="{$movieID}" predicate="rdf:type" object="rdfs:Resource" />
        <spinque:attribute subject="{$movieID}" attribute="https://schema.org/title" value="{field[@name='title']}" type="string"/>
    </xsl:template>

</xsl:stylesheet>

generates these triples:

(_:@movie:62b7c7c9-0c43-442e-b71b-abe6be4b39f3-0-8d0d, rdf:type, rdfs:Resource)
(_:@movie:62b7c7c9-0c43-442e-b71b-abe6be4b39f3-0-8d0d, https://schema.org/title, "Pulp Fiction")

When your data contains one or more identifiers for the resource, call su:e with the type and one to five identifiers. su:e then returns the type with the identifiers appended to it.

Given the following virtual XML fragment:

<table>
  <row line="2211">
    <field column="0" name="id">tt0110912</field>
    <field column="1" name="title">Pulp Fiction</field>
  </row>
</table>

the mapping:

<xsl:stylesheet 
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:spinque="com.spinque.tools.importStream.EmitterWrapper"
    xmlns:su="com.spinque.tools.importStream.Utils"
    extension-element-prefixes="spinque">

    <xsl:template match="table/row">
        <xsl:variable name="movieID" select="su:e('movie', field[@name='id'])"/>
        <spinque:relation subject="{$movieID}" predicate="rdf:type" object="rdfs:Resource" />
        <spinque:attribute subject="{$movieID}" attribute="https://schema.org/title" value="{field[@name='title']}" type="string"/>
    </xsl:template>

</xsl:stylesheet>

generates these triples:

(_:@movie:tt0110912, rdf:type, rdfs:Resource)
(_:@movie:tt0110912, https://schema.org/title, "Pulp Fiction")
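The signature above also accepts more than one identifier, which can help when no single field identifies a resource on its own. The following is a minimal sketch, not a definitive mapping: it assumes a hypothetical input in which each row has a title field and a year field that together identify a movie. How su:e combines several identifiers in the returned value is not shown in this section, so inspect the generated triples before relying on a particular form.

```xml
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:spinque="com.spinque.tools.importStream.EmitterWrapper"
    xmlns:su="com.spinque.tools.importStream.Utils"
    extension-element-prefixes="spinque">

    <xsl:template match="table/row">
        <!-- Assumption: 'title' and 'year' are hypothetical fields that together identify a movie -->
        <xsl:variable name="movieID" select="su:e('movie', field[@name='title'], field[@name='year'])"/>
        <spinque:relation subject="{$movieID}" predicate="rdf:type" object="rdfs:Resource" />
        <spinque:attribute subject="{$movieID}" attribute="https://schema.org/title" value="{field[@name='title']}" type="string"/>
    </xsl:template>

</xsl:stylesheet>
```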

su:parseDate

The su:parseDate function parses a date that is written in one of the provided formats and converts it into a string in the format 'yyyy-MM-dd':

String parseDate(String date, String languageTag, String format[, String format2, ..., String format10])

Given the following virtual XML fragment:

<table>
  <row line="2211">
    <field column="0" name="id">tt0110912</field>
    <field column="1" name="date_published">28-10-1994</field>
  </row>
</table>

the mapping:

<xsl:stylesheet 
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:spinque="com.spinque.tools.importStream.EmitterWrapper"
    xmlns:su="com.spinque.tools.importStream.Utils"
    extension-element-prefixes="spinque">

    <xsl:template match="table/row">
        <xsl:variable name="movieID" select="su:e('movie', field[@name='id'])"/>
        <spinque:relation subject="{$movieID}" predicate="rdf:type" object="rdfs:Resource" />
        <spinque:attribute subject="{$movieID}" attribute="https://schema.org/datePublished" value="{su:parseDate(field[@name='date_published'], 'en-US', 'dd-MM-yyyy')}" type="date"/>
    </xsl:template>

</xsl:stylesheet>

generates these triples:

(_:@movie:tt0110912, rdf:type, rdfs:Resource)
(_:@movie:tt0110912, https://schema.org/datePublished, "1994-10-28")

If your data contains dates in different formats, you can pass additional formats, up to ten in total, so that those dates are parsed as well.
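As an illustration of passing more than one format, the sketch below assumes a hypothetical source in which some rows write the date as 28-10-1994 and others as 1994/10/28. The second pattern, 'yyyy/MM/dd', is an assumption that uses the same pattern letters as the example above. The order in which su:parseDate tries the formats is not described in this section, so avoid patterns that could match the same input in two different ways.

```xml
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:spinque="com.spinque.tools.importStream.EmitterWrapper"
    xmlns:su="com.spinque.tools.importStream.Utils"
    extension-element-prefixes="spinque">

    <xsl:template match="table/row">
        <xsl:variable name="movieID" select="su:e('movie', field[@name='id'])"/>
        <!-- Assumption: date_published holds either dd-MM-yyyy or yyyy/MM/dd values (hypothetical data) -->
        <xsl:variable name="published" select="su:parseDate(field[@name='date_published'], 'en-US', 'dd-MM-yyyy', 'yyyy/MM/dd')"/>
        <spinque:relation subject="{$movieID}" predicate="rdf:type" object="rdfs:Resource" />
        <spinque:attribute subject="{$movieID}" attribute="https://schema.org/datePublished" value="{$published}" type="date"/>
    </xsl:template>

</xsl:stylesheet>
```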

su:split

The su:split function breaks a string around matches of a given regular expression and returns an XPath NodeSet that you can iterate over in your mapping:

XNodeSet split(String content, String separatorRegex[, int limit])

su:split is often used to create a separate entity for each value of a field that holds multiple values.

The genre field in the following virtual XML fragment contains two values separated by a comma:

<table>
  <row line="2211">
    <field column="0" name="id">tt0110912</field>
    <field column="1" name="genre">Crime, Drama</field>
  </row>
</table>

The mapping splits the genre field on commas, producing a list of genres. The xsl:for-each element runs its body once for each item in this list. Inside the loop, normalize-space(.) returns the current item, the name of a genre, with leading and trailing whitespace removed:

<xsl:stylesheet 
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:spinque="com.spinque.tools.importStream.EmitterWrapper"
    xmlns:su="com.spinque.tools.importStream.Utils"
    extension-element-prefixes="spinque">

    <xsl:template match="table/row">
        <xsl:variable name="movieID" select="su:e('movie', field[@name='id'])"/>
        <spinque:relation subject="{$movieID}" predicate="rdf:type" object="rdfs:Resource" />
        
        <xsl:for-each select="su:split(field[@name='genre'], ',')">
            <xsl:variable name="genreID" select="su:e('genre', normalize-space(.))"/>
            <spinque:relation subject="{$movieID}" predicate="https://schema.org/genre" object="{$genreID}"/>
            <spinque:relation subject="{$genreID}" predicate="http://www.w3.org/1999/02/22-rdf-syntax-ns#type" object="https://imdb.com/schema/Genre"/>
        </xsl:for-each>

    </xsl:template>
	
</xsl:stylesheet>

The mapping generates these triples:

(_:@movie:tt0110912, rdf:type, rdfs:Resource)
(_:@movie:tt0110912, https://schema.org/genre, _:@genre:Crime)
(_:@genre:Crime, rdf:type, https://imdb.com/schema/Genre)
(_:@movie:tt0110912, https://schema.org/genre, _:@genre:Drama)
(_:@genre:Drama, rdf:type, https://imdb.com/schema/Genre)
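Because the separator is a regular expression, one call can handle more than one delimiter. The sketch below is an illustration under assumptions: it supposes a hypothetical genre field such as "Crime; Drama, Thriller" in which values are separated by either a comma or a semicolon, and it uses the character class [,;], which is written the same way in most regular expression dialects. The optional limit argument from the signature is left out here because its behavior is not described in this section.

```xml
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:spinque="com.spinque.tools.importStream.EmitterWrapper"
    xmlns:su="com.spinque.tools.importStream.Utils"
    extension-element-prefixes="spinque">

    <xsl:template match="table/row">
        <xsl:variable name="movieID" select="su:e('movie', field[@name='id'])"/>

        <!-- Assumption: genres are separated by ',' or ';' (hypothetical data), so split on either character -->
        <xsl:for-each select="su:split(field[@name='genre'], '[,;]')">
            <!-- Trim the whitespace left around each value after splitting -->
            <xsl:variable name="genreName" select="normalize-space(.)"/>
            <xsl:variable name="genreID" select="su:e('genre', $genreName)"/>
            <spinque:relation subject="{$movieID}" predicate="https://schema.org/genre" object="{$genreID}"/>
        </xsl:for-each>

    </xsl:template>

</xsl:stylesheet>
```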

su:uri

The su:uri function generates a URI from a base and one to four paths:

String uri(String base, String path1[, String path2, String path3, String path4])

To create an identifier that is not a URI, use the function su:e.

Given the following virtual XML fragment:

<table>
  <row line="2211">
    <field column="0" name="id">tt0110912</field>
    <field column="1" name="title">Pulp Fiction</field>
  </row>
</table>

the mapping:

<xsl:stylesheet 
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:spinque="com.spinque.tools.importStream.EmitterWrapper"
    xmlns:su="com.spinque.tools.importStream.Utils"
    extension-element-prefixes="spinque">

    <xsl:template match="table/row">
        <xsl:variable name="movieID" select="su:uri('https://imdb.com/data/movie/', field[@name='id'])"/>
        <spinque:relation subject="{$movieID}" predicate="rdf:type" object="rdfs:Resource" />
        <spinque:attribute subject="{$movieID}" attribute="https://schema.org/title" value="{field[@name='title']}" type="string"/>
    </xsl:template>

</xsl:stylesheet>

generates these triples:

(https://imdb.com/data/movie/tt0110912, rdf:type, rdfs:Resource)
(https://imdb.com/data/movie/tt0110912, https://schema.org/title, "Pulp Fiction")
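The signature above allows up to four paths after the base. The following sketch shows such a call with two paths; it is only an illustration. The base 'https://imdb.com/data/' and the literal path 'movie' are assumptions chosen to mirror the example above, and how su:uri joins the paths (for instance, whether it inserts separators) is not described in this section, so check the generated triples before depending on the exact URI.

```xml
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:spinque="com.spinque.tools.importStream.EmitterWrapper"
    xmlns:su="com.spinque.tools.importStream.Utils"
    extension-element-prefixes="spinque">

    <xsl:template match="table/row">
        <!-- Assumption: base and 'movie' path are illustrative; the id field supplies the last path -->
        <xsl:variable name="movieID" select="su:uri('https://imdb.com/data/', 'movie', field[@name='id'])"/>
        <spinque:relation subject="{$movieID}" predicate="rdf:type" object="rdfs:Resource" />
        <spinque:attribute subject="{$movieID}" attribute="https://schema.org/title" value="{field[@name='title']}" type="string"/>
    </xsl:template>

</xsl:stylesheet>
```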