CrowdFlower makes it easy to leverage the power of the crowd to carry out data transformations, cleansing, and analysis. Once you’ve loaded data into a job, either by uploading a file (.CSV, .TSV, .XLSX, .ODS) or by pulling from a data feed (RSS, Atom, XML, JSON), you can manage it from the Data tab.
Fig. 1: Data page
Data Page Options
The Data Page options are as follows:
Note: These features are disabled when a job is running.
Fig. 2: Data Page options
To delete specific rows from your uploaded dataset, select the checkbox next to each row you want to remove and then select Delete from the Data Page bar. Only units in the "New" unit state can be deleted.
Add More Data
Whether you’re uploading data to a job for the first time, or supplementing the data you’ve already loaded, you can select Add More Data in the Data menu to add new data to your job.
Note: .CSV files are the preferred file type for data imports, though .TSV, .XLSX, and .ODS file formats are also supported. All data must be UTF-8 encoded.
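Because all data must be UTF-8 encoded, it can help to check a file locally before uploading it. The following is a minimal sketch using only the Python standard library. The function names, file paths, and the `latin-1` source encoding are assumptions for illustration and are not part of the platform.

```python
def is_utf8(path):
    """Return True if the whole file decodes cleanly as UTF-8."""
    try:
        with open(path, encoding="utf-8") as f:
            for _ in f:
                pass
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(src, dst, source_encoding):
    """Copy src to dst, re-encoding from source_encoding to UTF-8.

    source_encoding must be the file's real encoding (for example
    "latin-1" or "cp1252"); this sketch does not try to detect it.
    """
    # newline="" keeps the original line endings untouched.
    with open(src, encoding=source_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)
```

A typical use would be to call `is_utf8("data.csv")` and, if it returns `False`, run `convert_to_utf8("data.csv", "data_utf8.csv", "latin-1")` with the encoding your export tool actually used.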
Tip: Each column header serves as a unique identifier for the data it contains. Avoid duplicating column headers anywhere within the data of a given job unless you intend to supplement a given column with new data. In this case, it’s important that the column headers of your new data source are identical to the existing column headers.
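Before supplementing existing columns, you may want to compare the headers of the new file against the headers already in the job. The sketch below is a local pre-upload check, not the platform's own matching logic. It assumes headers must match exactly, including case, and the function names are hypothetical.

```python
import csv


def read_headers(path):
    """Return the first row of a UTF-8 CSV file as a list of headers."""
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def compare_headers(existing, new):
    """Compare two header lists using exact, case-sensitive matching."""
    existing_set, new_set = set(existing), set(new)
    return {
        "matching": sorted(existing_set & new_set),
        "only_in_new": sorted(new_set - existing_set),
        "only_in_existing": sorted(existing_set - new_set),
        # Headers that appear more than once inside the new file itself.
        "duplicated_in_new": sorted({h for h in new if new.count(h) > 1}),
    }


report = compare_headers(["url", "title"], ["url", "Title", "url"])
print(report["only_in_new"])        # ['Title']
print(report["duplicated_in_new"])  # ['url']
```

In this example, `Title` and `title` are reported as different headers, which is the kind of mismatch worth fixing before you upload.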
The Force Upload checkbox appears when you are adding more data to your job. The platform automatically matches any newly uploaded column headers to column headers that are already in the job. Enabling "Force Upload" tells the platform to disregard the column headers when accepting an additional dataset to a job with existing data. Force Upload is mainly used in the following cases:
- When source data has already been uploaded, and you need to upload a test question dataset that contains additional column headers.
- When all source data has been deleted, and you want to upload a new source dataset with different column headers. Previous dataset columns are maintained in the job, even after dataset deletion.
In cases where multiple values are stored in the cells of the same column, you can use the Split Column function to parse the data into two or more columns by specifying a delimiter (most typically a newline character).
Fig. 3: Split Column modal in Data Page
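To illustrate what splitting a column by a delimiter does to the data, here is a minimal Python sketch that performs a similar transformation locally. It is not the platform's implementation. The function name and the `<column>_1`, `<column>_2` naming of the resulting columns are assumptions, and the Split Column function may name its output columns differently.

```python
def split_column(rows, column, delimiter="\n"):
    """Split one column of each row into several columns on a delimiter.

    rows is a list of dicts (for example from csv.DictReader). Rows with
    fewer parts than the widest row are padded with empty strings.
    """
    parts_per_row = [row[column].split(delimiter) for row in rows]
    width = max((len(parts) for parts in parts_per_row), default=0)
    new_names = [f"{column}_{i + 1}" for i in range(width)]

    result = []
    for row, parts in zip(rows, parts_per_row):
        new_row = {k: v for k, v in row.items() if k != column}
        padded = parts + [""] * (width - len(parts))
        new_row.update(zip(new_names, padded))
        result.append(new_row)
    return result


rows = [{"id": "1", "tags": "red\nblue"}, {"id": "2", "tags": "green"}]
print(split_column(rows, "tags"))
# [{'id': '1', 'tags_1': 'red', 'tags_2': 'blue'},
#  {'id': '2', 'tags_1': 'green', 'tags_2': ''}]
```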
Convert Uploaded Test Questions
The Convert Uploaded Test Questions option allows you to create test questions (gold units) from a data file. This button automatically converts the units of your source data that have gold values into gold units. This saves time: instead of uploading a source data file and a gold data file separately, you can upload them together.
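Before uploading a combined file, it can be useful to see which rows carry gold values and would therefore be candidates for conversion. The sketch below is only a local preview under stated assumptions: you pass in the names of the columns that hold gold values, and the column name `sentiment_gold` in the example is hypothetical. It does not reproduce how the platform itself identifies gold values.

```python
def rows_with_gold_values(rows, gold_columns):
    """Return the rows that have a non-blank value in any gold column.

    rows is a list of dicts; gold_columns is the list of column names
    you have chosen to hold gold values.
    """
    return [
        row for row in rows
        if any((row.get(col) or "").strip() for col in gold_columns)
    ]


rows = [
    {"text": "Great service", "sentiment_gold": "positive"},
    {"text": "It was fine", "sentiment_gold": ""},
]
print(len(rows_with_gold_values(rows, ["sentiment_gold"])))  # 1
```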
Working With Your Data
When you’ve finished adding your data, there are several options you can use to best visualize it.
CrowdFlower displays 25 units on each page. Switch between pages using the controls at the bottom of the data file.
Fig. 4: Data Page Navigation
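With 25 units per page, the number of pages and the page on which a given unit appears follow from simple arithmetic. A small sketch, assuming units are counted from position 0 in the current sort order (the function names are illustrative):

```python
import math

UNITS_PER_PAGE = 25  # page size stated above


def page_count(total_units):
    """Number of pages needed to show total_units units."""
    return math.ceil(total_units / UNITS_PER_PAGE)


def page_of_unit(position):
    """1-based page number for a unit at a zero-based position."""
    return position // UNITS_PER_PAGE + 1


print(page_count(26))    # 2
print(page_of_unit(25))  # 2
```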
Sort units according to the values in a given column by clicking the column’s header to toggle between ascending and descending order.