Crowdsourcing is a very powerful tool, and it's important that we use a common language. In this article you will find all of the key terms we use at CrowdFlower.
Job
A job is composed of a customizable interface that connects your data to an online workforce. Each job on the CrowdFlower platform has rows of data, instructions, a user interface (CML form), test questions, and access to contributors. Contributors submit judgments on rows of data via the job's user interface. Jobs are listed on your job dashboard and are cataloged and accessed via a unique numeric ID.
Page
Jobs are served to contributors in pages. Each page is a collection of one or more randomly selected rows of data. Each time a contributor clicks the submit button in a job, they are completing a page. Contributor pay is page-based. If your job uses Test Questions, each page will contain one Test Question by default.
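The page structure described above can be sketched as follows. This is a hypothetical illustration of how rows and one Test Question might be combined into a page, not CrowdFlower's internal page-assembly logic; the function name, page size, and data are all made up for the example.

```python
import random

def build_page(rows, test_questions, rows_per_page=4):
    """Assemble one page: randomly selected rows plus one Test Question.

    Illustrative sketch only -- CrowdFlower's actual page assembly is
    internal to the platform.
    """
    page = random.sample(rows, rows_per_page)
    page.append(random.choice(test_questions))  # one Test Question per page
    random.shuffle(page)  # contributors can't tell which item is the test
    return page

rows = [f"row-{i}" for i in range(20)]
tests = ["tq-1", "tq-2", "tq-3"]
page = build_page(rows, tests)  # 4 regular rows + 1 Test Question
```

Shuffling the page matters because, as noted under Test Questions, contributors should not be able to tell which item is the test.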
Data File
The dataset that you upload to your job. This is usually a UTF-8 encoded CSV file.
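A data file like this can be produced with Python's standard csv module. The column names and rows below are purely illustrative (CrowdFlower does not require these particular columns); the point is the UTF-8 encoding and the header row, whose column names become available to your CML form.

```python
import csv

# Hypothetical rows for an image-labeling job; the column names are
# illustrative, not required by the platform.
rows = [
    {"image_url": "https://example.com/cat.jpg", "category_hint": "animal"},
    {"image_url": "https://example.com/car.jpg", "category_hint": "vehicle"},
]

# newline="" is the csv-module convention; encoding="utf-8" matches the
# format CrowdFlower expects.
with open("data_file.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["image_url", "category_hint"])
    writer.writeheader()
    writer.writerows(rows)
```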
Row
A row of data uploaded to the job from a data file or via the API. A judgment is a contributor's answer on a row, so a job cannot collect judgments without at least one row of data. If you create a job "without data," an empty row of data is created in your job.
Test Question
Test Questions serve the dual purpose of training contributors and monitoring contributor performance. Contributors are given a Trust Score that reflects their accuracy on Test Questions in a given job. If a contributor answers a Test Question incorrectly, their Trust Score is reduced and they are provided with the correct answer and a reason explaining why that answer is correct. Contributors will not know which questions are test questions and which ones aren't.
Judgment
A judgment is the set of answers submitted by a contributor on a row of data. By default, a single row collects 3 judgments.
Trusted Judgment
A trusted judgment is an answer from a contributor with a Trust Score higher than the Minimum Accuracy you set on the settings page. All trusted judgments are included in your results. Judgments can only be deemed trusted or untrusted in jobs that contain Test Questions, because Test Questions are what determine a contributor's Trust Score.
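The trusted/tainted split can be expressed as a simple filter. This is a simplified sketch, not platform code: judgments are represented here as (trust score, answer) pairs, a stand-in for the per-row data CrowdFlower tracks internally.

```python
def trusted_judgments(judgments, minimum_accuracy=0.7):
    """Keep only answers from contributors above the Minimum Accuracy.

    `judgments` is a list of (trust_score, answer) pairs -- a simplified
    stand-in for the data the platform tracks per row.
    """
    return [answer for trust, answer in judgments if trust > minimum_accuracy]

judgments = [(0.95, "yes"), (0.65, "no"), (0.80, "yes")]
# The 0.65 contributor is below the threshold, so their answer is tainted
# and excluded from results.
result = trusted_judgments(judgments)
```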
Judgments Needed
The number of judgments still needed for the job to complete. This number can fluctuate because contributors occasionally move from trusted to untrusted. It is only relevant while a job is running.
Tainted Judgment
A Tainted Judgment (Untrusted Judgment) is an answer from a contributor whose Trust Score has fallen below the Minimum Accuracy you set on the settings page (see "Trust Score"). Tainted Judgments are not included in your results unless you specify otherwise. You will not collect any tainted judgments if you run a job without Test Questions.
Untrusted Judgment is synonymous with Tainted Judgment.
Requester
The requester is the individual building the crowdsourcing job and requesting (ordering) the work. If you use the CrowdFlower platform to get work done, then you are a requester!
Channels
CrowdFlower has scaled its crowd to the world's largest pool of online contributors by partnering with dozens of websites that maintain large online communities. We call these partners "Channels." Our contributors access CrowdFlower jobs via offer walls on Channel websites. Examples of Channels include Clicksense, Swagbucks and Neobux.
Contributor
A Contributor is an individual member of the crowd – i.e., a person who will work on jobs. Individual contributors are identifiable by a Contributor ID.
Trust Score
Trust Score is the accuracy score of a contributor on Test Questions in a job. A contributor is assigned a Trust Score for each job that they work on. If a contributor's Trust Score falls below a preset threshold, they become "untrusted," their judgments are tainted, and they are no longer allowed to participate in the job. Contributors who maintain a Trust Score above this threshold are called "trusted." For CrowdFlower Basic jobs, the trust threshold is set at 70%. The trust threshold is adjustable for CrowdFlower Pro and Enterprise accounts.
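As a worked example of the definition above: a Trust Score is simply accuracy on Test Questions within one job, compared against the threshold. The function and numbers below are illustrative, assuming the 70% Basic threshold mentioned above.

```python
def trust_score(correct, answered):
    """Accuracy on Test Questions within a single job."""
    return correct / answered

THRESHOLD = 0.70  # CrowdFlower Basic trust threshold

# A contributor who answered 6 of 8 Test Questions correctly has a
# Trust Score of 0.75 and remains trusted in this job.
score = trust_score(correct=6, answered=8)
is_trusted = score >= THRESHOLD
```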
Row States
Each data row has a state that describes its status. The states available to a row are:
- New – A row that has not yet been ordered and will not collect judgments.
- Judgeable – A row that has been ordered and can collect judgments.
- Finalized – A row that has received enough trusted judgments to be considered complete and will no longer collect judgments.
- Golden – A Test Question.
- Hidden – A disabled Test Question.
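The five states above form a small, fixed set, so they map naturally onto an enumeration. This is an illustrative sketch for code that processes downloaded results, not an official client type; the string values are simply the glossary names lowercased.

```python
from enum import Enum

class RowState(Enum):
    """Row lifecycle states, as described in the glossary above.

    String values are illustrative, not necessarily the exact strings
    the CrowdFlower API returns.
    """
    NEW = "new"              # not yet ordered; will not collect judgments
    JUDGEABLE = "judgeable"  # ordered; can collect judgments
    FINALIZED = "finalized"  # enough trusted judgments; complete
    GOLDEN = "golden"        # a Test Question
    HIDDEN = "hidden"        # a disabled Test Question

def collects_judgments(state):
    """Only judgeable rows are actively collecting judgments."""
    return state is RowState.JUDGEABLE
```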
Note: "Unit" is an older term for "row"; it has been eliminated from the UI.
Form Element (CML)
Form elements are data inputs that allow contributors to submit work through your job's user interface. They allow you to dictate the type of answer you receive for each question you ask in your job. CML provides a variety of common form elements (e.g., text inputs, radio buttons, checkboxes, etc.). At least one form (CML) element is required to launch your job.
CML (CrowdFlower Markup Language)
CML is CrowdFlower’s very own markup language, developed to improve the experience of building online forms for crowdsourcing. The language features a broad collection of specialized Form Elements that support a wide array of user interfaces. You can read more about CML in the CrowdFlower Success Center.
CML Attributes
CML attributes are part of the CrowdFlower Markup Language. They are used to modify the questions you ask in your job. They come in name/value pairs like name="value" and are always specified in the start tag – just like HTML attributes. More information about CML attributes can be found here.
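For instance, a radio-button question might look like the fragment below. The `cml:radios` and `cml:radio` tags with `label`, `value`, and `validates` attributes follow CrowdFlower's CML conventions, but the question text and values here are purely illustrative; check the Success Center for the authoritative element and attribute list.

```xml
<!-- Hypothetical CML form element: a required radio-button question.
     The validates="required" attribute modifies the question, just as
     described above. -->
<cml:radios label="Is this tweet about a product?" validates="required">
  <cml:radio label="Yes" value="yes" />
  <cml:radio label="No" value="no" />
</cml:radios>
```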
Confidence Score
Once a job is complete, all of the judgments on a row of data are aggregated with a confidence score. The confidence score describes the level of agreement between multiple contributors (weighted by each contributor's Trust Score) and indicates our "confidence" in the validity of the aggregated answer for each row of data. The aggregate result is chosen based on the response with the greatest confidence.
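A minimal sketch of that trust-weighted aggregation, assuming confidence is each answer's share of the total trust on the row. This is a simplified illustration of the idea, not CrowdFlower's exact production formula.

```python
from collections import defaultdict

def aggregate(judgments):
    """Pick the answer with the greatest trust-weighted agreement.

    `judgments` is a list of (answer, trust_score) pairs. Confidence is
    modeled here as the winning answer's share of total trust -- a
    simplified stand-in for the platform's weighting.
    """
    weights = defaultdict(float)
    for answer, trust in judgments:
        weights[answer] += trust
    total = sum(weights.values())
    best = max(weights, key=weights.get)
    return best, weights[best] / total

judgments = [("yes", 0.9), ("yes", 0.8), ("no", 0.6)]
answer, confidence = aggregate(judgments)
# "yes" wins: two trusted contributors agreeing outweigh one dissent,
# with confidence 1.7 / 2.3
```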
Quiz Mode
Before a contributor can enter your job, they must pass Quiz Mode, which is composed entirely of Test Questions. This ensures that only contributors who prove they can complete your job accurately are able to enter it. Contributors who fail Quiz Mode are not paid and are permanently disqualified from working on your job.
Throughput
Throughput is the speed at which the crowd completes your job. This is measured as the number of finalized rows of data per hour.
Workflow
A Workflow describes the ecosystem of jobs, scripts, and other components involved in your crowdsourced solution. A Workflow can consist of one or many crowd jobs. These jobs can run in series or in parallel and can be augmented with machine learning or other non-crowd processes. Typically, data collected in one job feeds into the next, and so on. Complex Workflows can achieve very high accuracy by breaking up solutions that require multiple decisions and do not fit naturally into a single job. Data from all steps is consolidated at the end of the Workflow.