Match

The match API predicts linked rows, text field values, or field features.

Description

Match can be used to predict A) a linked row, B) a text field value, or C) a field feature. The match API is conceptually similar to the predict API.

The differences are:

API endpoint

/api/v1/_match

Format:

    {
      "from" : From, 
      "where" : null | Proposition, 
      "select" : null | Selection, 
      "match" : Get, 
      "basedOn" : null | PropositionSet, 
      "offset" : null | long, 
      "limit" : null | long
    }
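As a minimal sketch of the format above, the following Python helper assembles a request body. The helper name `build_match_query` and the choice to drop optional fields that are left unset are assumptions made for illustration, not part of the API.

```python
# Optional field names taken from the format above.
OPTIONAL_KEYS = ("where", "select", "basedOn", "offset", "limit")

def build_match_query(from_table, match, **optional):
    """Assemble a _match request body (hypothetical helper).

    "from" and "match" are always included; optional fields are
    included only when a non-None value is given.
    """
    unknown = set(optional) - set(OPTIONAL_KEYS)
    if unknown:
        raise ValueError(f"unknown query fields: {sorted(unknown)}")
    query = {"from": from_table, "match": match}
    for key in OPTIONAL_KEYS:
        if optional.get(key) is not None:
            query[key] = optional[key]
    return query

query = build_match_query(
    "messages",
    "product",
    where={"message": "Recommend me a premium laptop."},
    limit=2,
)
```

Rejecting unknown keys early is a design choice of this sketch: it catches typos such as `basedon` before the request is sent.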

See also:

Normal Match Query

Match a product to the customer's wish.

POST /api/v1/_match

    {
       "from" : "messages",
       "where": {
         "message" : "Recommend me a premium laptop."
       },
       "match" : "product",
       "limit" : 2
    }
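The query above could be posted from Python roughly as follows. This is a sketch using only the standard library; the host `https://example.invalid` is a placeholder, the function name `make_match_request` is hypothetical, and any authentication headers your deployment requires are omitted because they are not covered here.

```python
import json
import urllib.request

def make_match_request(base_url, query):
    """Build (but do not send) a POST request for the _match endpoint."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/_match",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

query = {
    "from": "messages",
    "where": {"message": "Recommend me a premium laptop."},
    "match": "product",
    "limit": 2,
}

# Placeholder host (assumption); replace with the URL of your own instance.
request = make_match_request("https://example.invalid", query)

# Sending is left to the caller, for example:
#   with urllib.request.urlopen(request) as response:
#       result = json.load(response)
```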

Result

The above query returns an array of results, sorted by descending probability.

    {
      "offset" : 0,
      "total" : 11,
      "hits" : [ {
        "$p" : 0.7628980762378328,
        "description" : "hp spectre is a premium laptop, that is compatible with phones",
        "id" : 4,
        "name" : "spectre",
        "price" : 1500.0,
        "tags" : "windows laptop premium",
        "title" : "hp spectre"
      }, {
        "$p" : 0.22730328858683865,
        "description" : "apple macbook is the top laptop in the market",
        "id" : 3,
        "name" : "macbook",
        "price" : 1500.0,
        "tags" : "macosx laptop premium",
        "title" : "apple macbook"
      } ]
    }
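A response like the one above can be consumed with a few lines of Python. The sketch below uses a trimmed copy of the result shown here; the helper name `ranked_titles` is hypothetical.

```python
import json

# Excerpt of the result above, with some fields trimmed for brevity.
response_text = """
{
  "offset": 0,
  "total": 11,
  "hits": [
    {"$p": 0.7628980762378328, "id": 4, "title": "hp spectre"},
    {"$p": 0.22730328858683865, "id": 3, "title": "apple macbook"}
  ]
}
"""

def ranked_titles(response):
    """Return (title, probability) pairs in the order the hits were returned."""
    return [(hit["title"], hit["$p"]) for hit in response["hits"]]

response = json.loads(response_text)
pairs = ranked_titles(response)

# Sanity check: each hit should be at least as probable as the one after it.
is_sorted = all(a[1] >= b[1] for a, b in zip(pairs, pairs[1:]))
```

Note that `total` (11) counts all candidates, while `hits` holds only the `limit` rows requested.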

Matching, while narrowing the results

Recommend a query phrase that starts with 'p' after an iPhone has been added to the shopping basket.

POST /api/v1/_match

    {
       "from" : "impressions",
       "where": {
         "prevProduct" : 0,
         "queryPhrase" : { "$startsWith": "p" }
       },
       "match" : "queryPhrase",
       "limit" : 2
    }

Result

The above query returns an array of results, sorted by descending probability.

    {
      "offset" : 0,
      "total" : 2,
      "hits" : [ {
        "$p" : 0.9989477881661053,
        "field" : "",
        "feature" : "phone cover"
      }, {
        "$p" : 0.0010522118338946083,
        "field" : "",
        "feature" : "phone"
      } ]
    }
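When a field value is matched, each hit carries the predicted value in `feature`. A client might keep only the sufficiently probable suggestions, as in the sketch below. The function name `features_above` and the 0.01 cut-off are arbitrary choices for illustration.

```python
def features_above(hits, threshold):
    """Keep predicted feature values whose probability is at least threshold."""
    return [hit["feature"] for hit in hits if hit["$p"] >= threshold]

# Hits copied from the result above.
hits = [
    {"$p": 0.9989477881661053, "field": "", "feature": "phone cover"},
    {"$p": 0.0010522118338946083, "field": "", "feature": "phone"},
]

# The 0.01 threshold is an assumption; tune it for your own data.
suggestions = features_above(hits, 0.01)
```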

Matching linked items

Find typical things that are said before the user leaves the service.

POST /api/v1/_match

    {
       "from" : "messages",
       "where": {
         "message" : "Bye bye."
       },
       "match" : "prev.message",
       "select" : ["$p", "$value", "$highlight"],
       "limit" : 1
    }

Result

The above query returns an array of results, sorted by descending probability.

    {
      "offset" : 0,
      "total" : 30,
      "hits" : [ {
        "$p" : 0.5601657022594406,
        "$value" : "I want to buy one",
        "$highlight" : [ {
          "score" : 6.462414284059194,
          "field" : "",
          "highlight" : "<font color=\"green\">I</font> <font color=\"green\">want</font> to <font color=\"green\">buy</font> <font color=\"green\">one</font>"
        } ]
      } ]
    }
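The `highlight` string marks matching terms with `<font color="green">` tags. If a client needs the plain text or just the highlighted terms, a sketch like the following may help. It assumes the markup always has exactly the form shown in the result above; the function names are hypothetical.

```python
import re

# Patterns assume the exact markup shown in the result above.
HIGHLIGHT_TAG = re.compile(r'<font color="green">(.*?)</font>')
ANY_FONT_TAG = re.compile(r"</?font[^>]*>")

def highlighted_terms(highlight):
    """Return the terms wrapped in green font tags, in order of appearance."""
    return HIGHLIGHT_TAG.findall(highlight)

def plain_text(highlight):
    """Remove all font tags and return the underlying text."""
    return ANY_FONT_TAG.sub("", highlight)

highlight = (
    '<font color="green">I</font> <font color="green">want</font> to '
    '<font color="green">buy</font> <font color="green">one</font>'
)

terms = highlighted_terms(highlight)
text = plain_text(highlight)
```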

Example Match Query with basedOn

Sometimes the matched item contains fields that introduce noise and bias into the results. This is especially a problem when you have a lot of features and a limited amount of data. In such cases, Aito may conclude that a customer likes a product because of a rare but meaningless detail. Long text fields are particularly susceptible to this problem.

The 'basedOn' parameter reduces the feature space and may improve both results and performance in cases like these. The following example ignores the noisy description field and uses only the 'id', 'title' and 'tags' fields in scoring.

POST /api/v1/_match

    {
       "from" : "messages",
       "where": {
         "message" : "Recommend me a good laptop."
       },
       "match" : "product",
       "basedOn" : [
          "id",
          "title",
          "tags"
        ],
       "select" : ["$p", "title", "$highlight"],
       "limit" : 1
    }

Result

    {
      "offset" : 0,
      "total" : 11,
      "hits" : [ {
        "$p" : 0.9821466858734174,
        "title" : "hp spectre",
        "$highlight" : [ {
          "score" : 2.4569185384102754,
          "field" : "tags",
          "highlight" : "windows <font color=\"green\">laptop</font> <font color=\"green\">premium</font>"
        }, {
          "score" : 1.9801066803303529,
          "field" : "id",
          "highlight" : "<font color=\"green\">4</font>"
        }, {
          "score" : 1.9131765517059516,
          "field" : "title",
          "highlight" : "<font color=\"green\">hp</font> spectre"
        } ]
      } ]
    }
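To see which of the `basedOn` fields contributed most to a hit, the `$highlight` entries can be ordered by their `score`. The sketch below does this for the result above; the helper name `fields_by_score` is hypothetical, and the highlight strings are left out for brevity.

```python
def fields_by_score(hit):
    """Return (field, score) pairs from $highlight, highest score first."""
    ranked = sorted(hit["$highlight"], key=lambda h: h["score"], reverse=True)
    return [(h["field"], h["score"]) for h in ranked]

# Hit copied from the result above, without the highlight markup.
hit = {
    "$p": 0.9821466858734174,
    "title": "hp spectre",
    "$highlight": [
        {"score": 2.4569185384102754, "field": "tags"},
        {"score": 1.9801066803303529, "field": "id"},
        {"score": 1.9131765517059516, "field": "title"},
    ],
}

ranking = fields_by_score(hit)
field_names = [field for field, _ in ranking]
```

Only 'id', 'title' and 'tags' appear here, which is consistent with the description field having been excluded from scoring.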

Example Match Query with diagnostics

Match a product to the customer's wish.

POST /api/v1/_match

    {
       "from" : "messages",
       "where": {
         "message" : "Recommend me a laptop."
       },
       "match" : "product",
       "basedOn" : ["tags"],
       "select" : ["$p", "title", "$why"],
       "limit" : 1
    }

Result

Diagnostics provide a detailed explanation of the results. The explanations are often verbose, because the scoring process typically involves a large number of components. The explanation segment splits the document score into its base components, which include: