Prediction API for CrowdFlower AI

Overview

Make your data useful with the CrowdFlower AI API.

This API supports data submission and result retrieval from CrowdFlower AI. It accepts input data rows that need to be processed for a specified workflow and allows retrieval of processed results.

Note that due to the nature of CrowdFlower's Human-In-The-Loop processing, some results are generated in seconds while other results may take a few minutes or hours to complete.

Customers who license and integrate our AI product into their workflow may do so via API. Interested? Visit us here to sign up.

Prerequisites for usage of the REST API for AI include:

  • An Enterprise license with AI module
  • An AI ID in Ready state
  • A pre-configured AI-specific job alias
  • An API key from CrowdFlower. Supply the key either as a bearer token (Authorization: Bearer YOUR_API_KEY) or as the username of a Basic Auth header.
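
The two ways of supplying the key can be sketched as plain header dictionaries. This is a minimal illustration, not an official client: the function names are hypothetical, and the empty Basic Auth password is an assumption based on the `curl -u YOUR_API_TOKEN:` form used in this article.

```python
import base64


def bearer_header(api_key):
    """Authorization header carrying the API key as a bearer token."""
    return {"Authorization": "Bearer " + api_key}


def basic_header(api_key):
    """Authorization header carrying the API key as the Basic Auth username.

    Assumption: the password is left empty, as `curl -u KEY:` does.
    """
    credentials = (api_key + ":").encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + token}


print(bearer_header("YOUR_API_KEY"))
print(basic_header("YOUR_API_KEY"))
```

Either dictionary can be passed as the headers of an HTTP request; only one of the two schemes is needed per call.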

API Requests

Data Input Service for CrowdFlower AI

POST /workflow/{workflowId}/data

Description

Accepts data in JSON or CSV formats and returns a count of the rows received.

Note that data submission is asynchronous: rows can be uploaded successfully and still fail processing, in part or in full. This can occur if the content is invalid or if your account is not eligible for additional processing. The reason a given row was not processed is returned along with the results during data retrieval.

An example curl request looks like:

curl -u YOUR_API_TOKEN: -H 'Content-Type: application/json' --data "@PATH_TO_JSON.json" https://workflow.crowdflower.com/v1/workflow/WORKFLOW_ID/data

Another example demonstrating the JSON format to be used (for both inline and file input):

curl -u YOUR_API_TOKEN: -H 'Content-Type: application/json' -d '{ "rows": [ { "features": [ { "key": "foo", "value": "bar" } ] } ] }' https://workflow.crowdflower.com/v1/workflow/WORKFLOW_ID/data
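
If your source data is held as ordinary dictionaries, the rows/features structure can be generated rather than written by hand. The sketch below is illustrative only: to_source_rows is a hypothetical helper, and converting every value to a string is an assumption based on the keyValuePair model (key and value are both strings).

```python
import json


def to_source_rows(records):
    """Convert a list of flat dicts into the {"rows": [{"features": [...]}]} body."""
    rows = []
    for record in records:
        # One key/value feature per dictionary entry; values are sent as strings.
        features = [{"key": str(k), "value": str(v)} for k, v in record.items()]
        rows.append({"features": features})
    return {"rows": rows}


payload = to_source_rows([{"foo": "bar"}])
print(json.dumps(payload))
```

The resulting payload has the same shape as the inline JSON in the curl example and can be saved to a file or posted directly.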

Input parameters

workflowId(required) - string
Path Parameter — Unique identifier of a workflow
inputData(required) - SourceRows
Body Parameter — Source data to be fed into the AI system

Consumes

This API call consumes the following media types via the Content-Type request header:

  • application/json

 

Response format

This service returns a status indicating whether the data upload was successful, along with the number of rows uploaded. The reason a given row was not processed is returned along with the results during data retrieval.

Code | Description | Example Data
202 | Data upload successful | "rowCount": 1
400 | Invalid parameters | "Bad Request"
    |                    | "Parsing error. undefined method `first' for nil:NilClass"
    |                    | "Prediction error. Validation failed: Row count exceeds limit of 1000"
    |                    | "Prediction error. Validation failed: At least one of the following keys must be present: column1,column2,column3"
401 | Unable to authenticate |
403 | Not authorized |
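
The example 400 message "Row count exceeds limit of 1000" suggests that a single POST should carry at most 1000 rows. Assuming that limit applies to your workflow, larger data sets can be split into several request bodies before upload. batch_rows is a hypothetical helper written for this illustration.

```python
def batch_rows(rows, limit=1000):
    """Yield request bodies of at most `limit` rows each.

    The default of 1000 is taken from the example error message above;
    treat it as an assumption, not a documented guarantee.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(rows), limit):
        yield {"rows": rows[start:start + limit]}


rows = [{"features": [{"key": "id", "value": str(i)}]} for i in range(2500)]
sizes = [len(body["rows"]) for body in batch_rows(rows)]
print(sizes)
```

Each yielded body can then be posted separately to the data endpoint.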

 

Retrieval of Prediction Data from CrowdFlower AI

GET /workflow/{workflowId}/results

Description

Retrieves an array of result objects in JSON format. Every result contains the AI judgement. A contributor judgement is also available for rows that were routed to the contributor network because the AI confidence did not meet the required threshold.

Results data is maintained for a period of 90 days from the time processing is completed, after which results may be moved to our client data archive.

The sequenceId parameter determines the number of rows to skip while retrieving the result set. Set it to 0 for the first call. Each call returns between 0 (if no new results are available) and 20 new result rows.

Clients can query this service periodically (e.g. every 5-15 minutes) after submitting data rows, until all results have been retrieved. Because the workflow may require multiple steps for each source row (processing by one or both of our AI and contributor systems), the time to generate results varies from row to row. While AI results are typically available within minutes of submission, rows directed to the contributor network may take several hours. A source row is returned only after it has completed all required steps in the workflow.

If no results are available with a sequence number greater than the one supplied as a parameter, this call returns a 204 response with an empty results array.
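
The paging behaviour described above can be sketched as a loop that starts at sequence 0 and passes the highest sequence number seen so far on each following call. This is a minimal sketch under stated assumptions: fetch_all_results and fetch_page are hypothetical names, and reading the sequence number from a "sequence" field on each row is inferred from the ResultRow model.

```python
def fetch_all_results(fetch_page, start_sequence=0, max_calls=1000):
    """Collect result rows page by page.

    fetch_page(sequence_id) is supplied by the caller and must return
    (status_code, parsed_json_or_None) for one GET .../results call.
    """
    collected = []
    sequence_id = start_sequence
    for _ in range(max_calls):
        status, body = fetch_page(sequence_id)
        if status == 204:
            break  # nothing newer than sequence_id is available yet
        if status != 200:
            raise RuntimeError("unexpected status {0}".format(status))
        rows = (body or {}).get("rows", [])
        if not rows:
            break
        collected.extend(rows)
        # Assumption: each row carries its sequence number in "sequence".
        sequence_id = max(row["sequence"] for row in rows)
    return collected, sequence_id
```

In practice fetch_page would wrap an HTTP GET with the sequenceId query parameter. The returned sequence_id can be stored and reused as start_sequence on the next polling run, so rows already retrieved are not fetched again.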

An example curl request looks like:

curl -u YOUR_API_TOKEN: https://workflow.crowdflower.com/v1/workflow/WORKFLOW_ID/results

Input parameters

workflowId(required) - string
Path Parameter — Unique identifier of a workflow
sequenceId(optional) - sequenceId
Query Parameter — Pass the sequence number of the last row you retrieved, and the system returns the next set of available results (if any). The default value is 0, which starts at the beginning of the results for this workflow
detailedResults(optional) - boolean
Query Parameter — Due to the human-in-the-loop nature of this product, some rows may have more than one result: one from AI and another from the contributor network. By default, only the final result is returned. Set detailedResults to true if both results should be included
Default — false
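
As an illustration of how the two query parameters combine, the sketch below builds a results URL with the standard library. results_url is a hypothetical helper, and sending the boolean as the lowercase string "true" is an assumption about how the service parses it.

```python
from urllib.parse import urlencode

BASE_URL = "https://workflow.crowdflower.com/v1/workflow"


def results_url(workflow_id, sequence_id=0, detailed=False):
    """Build the GET URL for the results endpoint."""
    query = {"sequenceId": sequence_id}
    if detailed:
        # Assumption: the service accepts the literal string "true".
        query["detailedResults"] = "true"
    return "{0}/{1}/results?{2}".format(BASE_URL, workflow_id, urlencode(query))


print(results_url("WORKFLOW_ID"))
print(results_url("WORKFLOW_ID", sequence_id=20, detailed=True))
```

Omitting detailedResults leaves the default (false) in effect, so only the final result is returned for each row.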

Response format

This API call produces the following media types according to the Accept request header; the media type will be conveyed by the Content-Type response header.

  • application/json

Code | Description | Example Data
200 | Data retrieval successful | {ResultData}
204 | Successful response, but no new results available at this time |
400 | Invalid parameters |
429 | Rate limit exceeded. Limit = 180 calls per minute. |
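
A polling client has to react differently to each of these codes. The sketch below maps a status code to a suggested next step and wait time. The policy is an assumption made for illustration (a 5-minute wait on 204 taken from the polling guidance in this article, and exponential backoff on 429 capped at one minute); next_action is a hypothetical name, not part of the API.

```python
# 180 calls per minute means at least one third of a second between calls.
MIN_INTERVAL_SECONDS = 60.0 / 180


def next_action(status_code, attempt=0):
    """Return (action, delay_seconds) for a results response.

    attempt counts consecutive 429 responses, starting at 0.
    """
    if status_code == 200:
        return ("process", MIN_INTERVAL_SECONDS)
    if status_code == 204:
        return ("wait", 300.0)  # no new results; poll again in 5 minutes
    if status_code == 429:
        # Back off exponentially, but never wait longer than 60 seconds.
        return ("retry", min(60.0, MIN_INTERVAL_SECONDS * 2 ** attempt))
    return ("fail", 0.0)  # 400 and any other code: do not retry blindly


for code in (200, 204, 429, 400):
    print(code, next_action(code))
```

A caller would sleep for the returned delay before issuing the next request, and stop on "fail" to inspect the error body.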

Models

sequenceNumber - integer(int32)
A sequential, consecutive identifier assigned to each record in a workflow as its processing completes
rowCount - number
Number of source rows received in the post
Provider - string
Type of processor that produced the result
  - type - string
    Enum(AI, Contributor)
SourceRow - string
A JSON-formatted set of features provided as input to the AI system
  - features - array(keyValuePair)
SourceRows - string
  - rows - array(SourceRow)
Status - object
Processing status for the row. Rejected or Failed indicates that we were unable to process the SourceRow for the reason given in the status message. statusMessage is present only when statusCode is not Successful.
  - statusCode - string
    Enum(Successful, Rejected, Failed)
  - statusMessage - string
keyValuePair - object
  - key - string
  - value - string
Confidence - number(float)
The confidence percentage as returned by the prediction provider
ResultRow - object
  - sequence - sequenceNumber
  - srcData - SourceRow
  - status - Status
  - questions - array(resultSet)
resultSet - object
  - question - string
  - finalResult - object(stepResult)
  - detailedResults - object(detailedResults)
detailedResults - array(stepResult)
stepResult - object
  - confidence - object(Confidence)
  - answer - string
  - provider - object(Provider)
ResultData - object
Contains up to 20 result rows, depending on the number of new results generated with a sequence number greater than the one provided as a parameter.
  - rows - array(ResultRow)
Error - object
Workflow Error Message Object.
  - errors - array(string)
  - status - number
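
To show how these models fit together, the sketch below walks a ResultData object and pulls out the final answer for each question. The sample data is invented for illustration, and the exact JSON shape (for example, confidence as a plain number and provider as an object with a type field) is inferred from the model list above rather than taken from a real response.

```python
def final_answers(result_data):
    """Flatten ResultData into one dict per question (or per failed row)."""
    flattened = []
    for row in result_data.get("rows", []):
        status = row.get("status", {})
        if status.get("statusCode") != "Successful":
            # Rejected/Failed rows carry a statusMessage instead of answers.
            flattened.append({"sequence": row.get("sequence"),
                              "error": status.get("statusMessage")})
            continue
        for result_set in row.get("questions", []):
            final = result_set.get("finalResult", {})
            flattened.append({
                "sequence": row.get("sequence"),
                "question": result_set.get("question"),
                "answer": final.get("answer"),
                "confidence": final.get("confidence"),
                "provider": final.get("provider", {}).get("type"),
            })
    return flattened


# Hypothetical sample shaped after the models above.
sample = {"rows": [
    {"sequence": 1, "status": {"statusCode": "Successful"},
     "questions": [{"question": "sentiment",
                    "finalResult": {"answer": "positive", "confidence": 0.93,
                                    "provider": {"type": "AI"}}}]},
    {"sequence": 2,
     "status": {"statusCode": "Rejected", "statusMessage": "invalid content"},
     "questions": []},
]}
answers = final_answers(sample)
print(answers)
```

Checking statusCode first keeps rejected or failed rows from being mistaken for rows that simply have no questions.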

Client References 

Python Requests Library for Data Input Service:

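import requests

# Build the Data Input Service URL for the target workflow.
protocol = "https"
host = "workflow.crowdflower.com"
base_url = "v1/workflow"
workflow_id = "WORKFLOW_ID"
url = "{0}://{1}/{2}/{3}/data".format(protocol, host, base_url, workflow_id)

# Each row is a list of key/value features.
data = { "rows": [ { "features": [ { "key": "foo", "value": "bar" } ] } ] }
api_key = "YOUR_API_KEY"

headers = {
    'content-type': "application/json",
    'cache-control': "no-cache"
    }

# Basic Auth: API key as the username, empty password (as in `curl -u YOUR_API_TOKEN:`).
response = requests.post(url, json=data, headers=headers, auth=(api_key, ""))

# A successful upload returns 202 and the number of rows received.
print(response.status_code, response.text)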


Python Requests Library for Prediction Data Service:

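import requests

# Build the results URL for the target workflow.
protocol = "https"
host = "workflow.crowdflower.com"
base_url = "v1/workflow"
workflow_id = "WORKFLOW_ID"

url = "{0}://{1}/{2}/{3}/results".format(protocol, host, base_url, workflow_id)

api_key = "YOUR_API_KEY"

headers = {
    'accept': "application/json",
    'cache-control': "no-cache"
    }

# Use 0 for the first call, then the sequence number of the last row retrieved.
params = { 'sequenceId': 0 }

# Basic Auth: API key as the username, empty password (as in `curl -u YOUR_API_TOKEN:`).
response = requests.get(url, headers=headers, params=params, auth=(api_key, ""))

# 200 carries result rows; 204 means no new results are available yet.
print(response.status_code, response.text)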


CrowdFlower will offer JavaScript and Ruby clients for your convenience shortly. Stay tuned.


