
Predict

The predict API is used to predict the features of a field.

Description

The predict API is used to predict the likelihood of a feature given a hypothesis. For example, if you know the previous product, you may try to predict the user's next query.

API endpoint

/api/v1/_predict

Format:

    {
      "from" : From, 
      "where" : null | Proposition, 
      "predict" : PropositionSet, 
      "exclusiveness" : null | boolean, 
      "select" : null | Selection, 
      "offset" : null | long, 
      "limit" : null | long
    }

Normal Predict Query

Given the hypothesis that the customer id is 5, the goal is to predict the customer's next query.

POST /api/v1/_predict

    {
       "from" : "impressions",
       "where": {
         "customer" : 5
       },
       "predict" : "query",
       "limit" : 3
    }
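
As a rough sketch, the request above could be built from Python using only the standard library. The host in BASE_URL, the Content-Type header, and the helper name build_predict_request are assumptions made for illustration; only the path and the JSON body come from this page.

```python
import json
import urllib.request

# Hypothetical host; replace with the address of your own server.
BASE_URL = "http://localhost:8080"


def build_predict_request(body, base_url=BASE_URL):
    """Build (but do not send) a POST request for the predict endpoint."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/api/v1/_predict",
        data=data,
        # Assumption: the server expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The query from the example above.
query = {
    "from": "impressions",
    "where": {"customer": 5},
    "predict": "query",
    "limit": 3,
}

request = build_predict_request(query)

# Sending requires a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```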

Result

The above query returns a list of results sorted in descending order of probability ($p).

    {
      "offset" : 0,
      "total" : 16,
      "hits" : [ {
        "$p" : 0.2132183405811782,
        "field" : "query",
        "feature" : "best"
      }, {
        "$p" : 0.1945083897627545,
        "field" : "query",
        "feature" : "phone"
      }, {
        "$p" : 0.18158636046795404,
        "field" : "query",
        "feature" : "appl"
      } ]
    }
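
A minimal sketch of reading such a response in Python, assuming the JSON above has already been received as text. The variable names are illustrative only.

```python
import json

# The sample response from above, as it might arrive over the wire.
response_text = """
{
  "offset": 0,
  "total": 16,
  "hits": [
    {"$p": 0.2132183405811782, "field": "query", "feature": "best"},
    {"$p": 0.1945083897627545, "field": "query", "feature": "phone"},
    {"$p": 0.18158636046795404, "field": "query", "feature": "appl"}
  ]
}
"""

result = json.loads(response_text)
hits = result["hits"]

# Hits are documented as sorted by descending probability; check it.
probabilities = [hit["$p"] for hit in hits]
is_sorted = all(a >= b for a, b in zip(probabilities, probabilities[1:]))

# The most likely feature is therefore the first hit.
top = hits[0]

for hit in hits:
    print(f'{hit["field"]}={hit["feature"]}  p={hit["$p"]:.3f}')
```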

Predict Query with select

You can choose which fields are returned, and request explanations for the results, with the select expression.

POST /api/v1/_predict

    {
       "from" : "impressions",
       "where": {
         "customer" : 5
       },
       "predict" : "query",
       "select" :  ["$p", "feature", "$why"],
       "limit" : 1
    }

Result

The results now omit the 'field' field and include explanations. Note the explanation format. For an estimate of the form p(X_i|A, B, C), an explanation contains essentially 3 different components:

  1. The normalizer, of the form 1 / (p(X_0) + p(X_1) + ...)

    • The normalizer is only used when exclusiveness is on. In this case, it is assumed that exactly one feature is true at a time. In practice, exclusiveness forces the probabilities of the alternative features to sum to 1.0.

  2. The base probability p(X)

  3. The probability lifts. A probability lift (p(A|X) / p(A)) tells how much more likely X (e.g. a click) is under condition A (e.g. the product has a 5 star review). For example, the lift may say that a product is clicked with 2.3x likelihood (130% higher likelihood) when it has 5 stars.
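
The arithmetic behind the normalizer and the lifts can be sketched in Python. The probabilities below are made-up illustration values, not output of the API, and the function names are hypothetical.

```python
import math


def normalizer(scores):
    """Return 1 / (p(X_0) + p(X_1) + ...) for the alternative features."""
    return 1.0 / sum(scores)


def lift(p_a_given_x, p_a):
    """Return the probability lift p(A|X) / p(A)."""
    return p_a_given_x / p_a


def lift_as_percent_change(lift_value):
    """Express a lift as a percentage change, e.g. 2.3x is about +130%."""
    return (lift_value - 1.0) * 100.0


# Made-up unnormalized estimates for three mutually exclusive features.
scores = [0.2, 0.1, 0.1]
z = normalizer(scores)
normalized = [z * s for s in scores]

# With exclusiveness on, the normalized probabilities sum to 1.0.
assert math.isclose(sum(normalized), 1.0)

# Made-up example: p(5 stars | click) = 0.46 and p(5 stars) = 0.2.
example_lift = lift(0.46, 0.2)
percent = lift_as_percent_change(example_lift)
print(f"lift = {example_lift:.1f}x, i.e. {percent:.0f}% higher likelihood")
```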

    {
      "offset" : 0,
      "total" : 16,
      "hits" : [ {
        "$p" : 0.2132183405811782,
        "feature" : "best",
        "$why" : {
          "type" : "product",
          "factors" : [ {
            "type" : "normalizer",
            "value" : 0.5461266221766155
          }, {
            "type" : "baseP",
            "value" : 0.1798780487804878
          }, {
            "type" : "relatedVariableLift",
            "variable" : "customer:5",
            "value" : 2.3999899776609483
          }, {
            "type" : "relatedVariableLift",
            "variable" : "customer.tags:nyc",
            "value" : 0.8849735066158054
          } ]
        }
      } ]
    }
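
As a sketch, the factors of a "product" explanation can be combined by multiplying them together. This assumes that "product" means a plain product of the listed factor values, which this page does not state explicitly. For the sample above the product comes to roughly 0.209, close to but not identical to the reported $p of about 0.213, so treat it as an approximate reading of the explanation. The helper name combine_factors is hypothetical.

```python
import math

# The "$why" object from the sample response above.
why = {
    "type": "product",
    "factors": [
        {"type": "normalizer", "value": 0.5461266221766155},
        {"type": "baseP", "value": 0.1798780487804878},
        {"type": "relatedVariableLift", "variable": "customer:5",
         "value": 2.3999899776609483},
        {"type": "relatedVariableLift", "variable": "customer.tags:nyc",
         "value": 0.8849735066158054},
    ],
}


def combine_factors(explanation):
    """Multiply the factor values (assumed meaning of type "product")."""
    if explanation["type"] != "product":
        raise ValueError("unsupported explanation type: " + explanation["type"])
    return math.prod(factor["value"] for factor in explanation["factors"])


estimate = combine_factors(why)

# Lifts above 1.0 raise the estimate; lifts below 1.0 lower it.
for factor in why["factors"]:
    if factor["type"] == "relatedVariableLift":
        direction = "raises" if factor["value"] > 1.0 else "lowers"
        print(f'{factor["variable"]} {direction} the estimate ({factor["value"]:.2f}x)')

print(f"combined estimate: {estimate:.4f}")
```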
    

Predicting through links

You can also predict through links. In the example below, the predicted field product.tags is reached through the product link.

POST /api/v1/_predict

    {
       "from" : "messages",
       "where": {
         "message" : "Show me the laptops"
       },
       "predict" : "product.tags",
       "limit" : 1
    }

Result

    {
      "offset" : 0,
      "total" : 9,
      "hits" : [ {
        "$p" : 0.2464245064165069,
        "field" : "product.tags",
        "feature" : "premium"
      } ]
    }