Evaluation

Like any designer, you want to know whether your design works well; for a strategy, that means whether it returns the 'right' results or recommendations. Sparque Desk lets you find out by having stakeholders or users assess the output of your strategy for tests you have created for it. With these assessments you can measure whether your strategy is performing well or needs to be altered. You manage this process in this section of Sparque Desk.

Creating a Testset

You create a testset by giving it a name and setting its template, either by selecting a predefined template from the list or by creating a new one. The template of a testset should match the API Template of the strategy you want to use it for. For a 'movie search' testset, for example, the strategy expects a strategy parameter 'query' of type [STRING]:


Create a Testset

Once you have created a testset, you can fill it with tests. Each test is an instantiation of one or more strategy parameters:


Fill a Testset
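The relation between a template and its tests can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Testset` class, its fields, and the `add_test` method are hypothetical names and not part of Sparque Desk, which handles all of this through its user interface.

```python
from dataclasses import dataclass, field


# Hypothetical data model for illustration; not Sparque Desk's own format.
@dataclass
class Testset:
    name: str
    template: dict[str, str]  # parameter name -> type, e.g. {"query": "STRING"}
    tests: list[dict[str, str]] = field(default_factory=list)

    def add_test(self, **params: str) -> None:
        # A test instantiates one or more parameters declared by the template.
        if not params or not set(params) <= set(self.template):
            raise ValueError(
                f"expected one or more of {sorted(self.template)}, got {sorted(params)}"
            )
        self.tests.append(params)


# The 'movie search' testset expects a single 'query' parameter of type STRING.
movie_search = Testset(name="movie search", template={"query": "STRING"})
movie_search.add_test(query="space adventure")
movie_search.add_test(query="romantic comedy")
```

A test that uses a parameter the template does not declare, such as `add_test(title="x")`, is rejected in this sketch, which mirrors the requirement that the testset template matches the API Template of the strategy.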

Inviting Assessors

Several people can contribute to assessing the returned results. Every assessor manages her own set of assessments and can delete them. You cannot delete other people's assessments.


Testset menu1

You can specify how many assessors are needed and how many topics each of them needs to assess. The topics are allocated automatically among the assessors, so that the assessments are distributed evenly over the topics.
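How Sparque Desk allocates topics internally is not documented here. As a rough sketch of one way such an even allocation could work, the following hypothetical `allocate_topics` function hands out topics round-robin, so that the number of assessments per topic differs by at most one.

```python
from itertools import cycle, islice


def allocate_topics(topics, assessors, topics_per_assessor):
    """Assign topics to assessors round-robin (illustrative assumption only)."""
    if topics_per_assessor > len(topics):
        raise ValueError("an assessor cannot assess more topics than exist")
    rotation = cycle(topics)  # shared iterator: each assessor continues where the last stopped
    return {
        assessor: list(islice(rotation, topics_per_assessor))
        for assessor in assessors
    }


allocation = allocate_topics(
    topics=["t1", "t2", "t3", "t4"],
    assessors=["ann", "bob", "cat"],
    topics_per_assessor=2,
)
# ann gets t1 and t2, bob gets t3 and t4, cat starts again at t1 and t2.
```

Because all assessors draw from the same rotation, no assessor receives the same topic twice and no topic gets far ahead of the others.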


Assessors invite1

Assessors invite2

Assessors invite3

After you provide the email addresses of the assessors, a unique assessment link is created for each assessor, which you can share with her.

Scoring Results


Testset menu2

When the assessor clicks a topic, its results are shown. It is up to her to decide whether each result is relevant and to indicate this with a thumbs up, a thumbs down, or a so-so.


Assess results example

Checking Overall Strategy Quality

By aggregating all assessments into an overall score, you can determine how well a strategy performs. Testsets and assessments are saved and archived, so you can try out different versions in parallel and see whether and how well the alternatives perform.
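The exact formula Sparque Desk uses for the overall score is not given here. As a minimal sketch of the idea, assume each judgment maps to a number (thumbs up = 1.0, so-so = 0.5, thumbs down = 0.0) and the overall score is the plain average; both the weights and the `overall_score` function are assumptions made for illustration.

```python
from statistics import mean

# Assumed weights per judgment; the real scoring may differ.
SCORE = {"up": 1.0, "so-so": 0.5, "down": 0.0}


def overall_score(assessments):
    """Average the judgments of (test, result_id, judgment) tuples."""
    if not assessments:
        raise ValueError("no assessments to aggregate")
    return mean(SCORE[judgment] for _, _, judgment in assessments)


assessments = [
    ("space adventure", "movie-1", "up"),
    ("space adventure", "movie-2", "down"),
    ("romantic comedy", "movie-3", "so-so"),
    ("romantic comedy", "movie-4", "up"),
]
score = overall_score(assessments)  # (1.0 + 0.0 + 0.5 + 1.0) / 4 = 0.625
```

Computing the same score for the results of two strategy versions over the same testset gives a simple basis for comparing them.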


Testset menu3

Important notice: A strategy can return (new) results that have not been assessed yet. Before comparing scores with other strategies, make sure that the top 10 results of every test have been assessed.

To check this, choose options > assess results from a strategy and walk through the tests to see whether any non-assessed (white) results are shown.
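The check described above is done by hand in Sparque Desk. Its logic can be sketched as follows, assuming you have the ranked results per test and the set of already assessed results available as plain Python data; the function name and data shapes are hypothetical.

```python
def unassessed_top_results(ranked_results, assessed, k=10):
    """Return, per test, the top-k results that have no assessment yet.

    ranked_results: dict mapping a test to its result ids in rank order
    assessed:       dict mapping a test to the set of assessed result ids
    """
    gaps = {}
    for test, results in ranked_results.items():
        done = assessed.get(test, set())
        missing = [r for r in results[:k] if r not in done]
        if missing:
            gaps[test] = missing
    return gaps


ranked = {
    "space adventure": ["m1", "m2", "m3"],
    "romantic comedy": ["m4", "m5"],
}
assessed = {"space adventure": {"m1", "m2", "m3"}, "romantic comedy": {"m4"}}
gaps = unassessed_top_results(ranked, assessed)
# gaps lists only 'romantic comedy', whose result 'm5' still needs an assessment.
```

An empty result means every test has a fully assessed top 10, so the scores of the strategies can be compared fairly.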