There are two ways to ingest data into Curiosity from simple text formats, such as comma-separated values (CSV) or tab-separated values (TSV):
- ingesting text data into a new type of node
- ingesting text data into an existing type of node
This article covers both methods. Either one requires an administrator account for your Curiosity application.
Ingesting text data into a new type of node
Go to the Data Hub (via the menu button at the top left of the screen).
Click on the "Import data" option within the "Quick actions" section at the top of the page.
Leave the "Import as" drop down selection as "Create new" and either select a file by clicking the icon alongside the text input underneath the "File" label or click the "Free Text" option and paste in the text content.
If the first line of the text file contains the column names, leave the "Header" option on the "CSV has headers" selection. If the first line does not list the headers then change the Header option to "No headers".
Before importing, you will be able to confirm the column names. You can change them if the data includes a header row but you prefer a different naming convention, or provide them if the data does not include a header row.
For example, a table of data may be represented by comma-separated values text content like this:
Alice Mutton,20 - 1 kg tins,39.00,1
Aniseed Syrup,12 - 550 ml bottles,10.00,0
Boston Crab Meat,24 - 4 oz tins,18.40,0
Camembert Pierrot,15 - 300 g rounds,34.00,0
Carnarvon Tigers,16 kg pkg.,62.50,0
Chai (New),10 boxes x 20 bags,18.00,0
Chang,24 - 12 oz bottles,19.00,0
Chartreuse verte,750 cc per bottle,18.00,0
Chef Anton's Cajun Seasoning,48 - 6 oz jars,22.00,0
Chef Anton's Gumbo Mix,36 boxes,21.35,1
Côte de Blaye,12 - 75 cl bottles,263.50,0
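Before uploading, it can help to confirm that every row splits into the same number of fields. The following is a minimal local sketch using Python's standard `csv` module; it is not part of Curiosity, and the helper name `field_counts` is hypothetical.

```python
import csv
import io

# First three rows of the sample above, pasted verbatim.
SAMPLE = """Alice Mutton,20 - 1 kg tins,39.00,1
Aniseed Syrup,12 - 550 ml bottles,10.00,0
Boston Crab Meat,24 - 4 oz tins,18.40,0
"""


def field_counts(text, delimiter=","):
    """Return the set of distinct field counts found across non-empty rows."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return {len(row) for row in reader if row}


counts = field_counts(SAMPLE)
print(counts)  # {4}
```

A result containing more than one number means that at least one row has a missing or extra delimiter and is worth inspecting before import.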
After selecting the input file or pasting in the text content, the "Mapping Settings" section will automatically be updated.
If your text content uses tabs between fields, rather than commas, change the "Delimiter" drop down selection to "tab". Changing this also automatically updates the "Mapping Settings" section.
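If you are unsure which delimiter a file uses, a rough local check is to count tabs and commas in its first line. This is only a heuristic sketch (a comma-separated file whose values contain tabs would fool it), and the function name `guess_delimiter` is hypothetical rather than anything provided by Curiosity.

```python
def guess_delimiter(first_line):
    """Return the tab character if the line has more tabs than commas, else a comma."""
    tabs = first_line.count("\t")
    commas = first_line.count(",")
    return "\t" if tabs > commas else ","


print(repr(guess_delimiter("Chang\t24 - 12 oz bottles\t19.00\t0")))  # '\t'
print(repr(guess_delimiter("Chang,24 - 12 oz bottles,19.00,0")))     # ','
```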
Precisely one column must be set as the "Key" column and its values must be unique across all rows of the data. Ticking one "Key" box will un-tick whichever other is currently ticked.
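To check in advance whether a column is suitable as the key, you can look for repeated values locally. The sketch below assumes comma-separated text and a zero-based column index; `duplicate_keys` is a hypothetical helper, not a Curiosity function.

```python
import csv
import io
from collections import Counter

# Two rows taken from the sample data; both have a price of 18.00.
DATA = """Chai (New),10 boxes x 20 bags,18.00,0
Chartreuse verte,750 cc per bottle,18.00,0
"""


def duplicate_keys(text, key_index=0, delimiter=",", has_header=False):
    """Return the sorted values that occur more than once in the chosen column."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    if has_header:
        rows = rows[1:]  # skip the column names
    counts = Counter(row[key_index] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


print(duplicate_keys(DATA, key_index=0))  # []
print(duplicate_keys(DATA, key_index=2))  # ['18.00']
```

Here the first column has no repeats and could serve as the key, while the third column (index 2) could not.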
The search icon at the end of each row will display a popup that lists every value in that column in the current data.
By default, data from all columns will be imported but you can alter this by changing the "Keep" toggle to "Ignore" for any column that you do not want.
At this point, you may change the name of any of the columns for when they are imported into Curiosity. The value from the header row is shown to the left of the text box for reference and this header row value is the default text in the editable text box. If you are happy with the header row values as column names then you do not need to edit any of them.
When you import data that does not have a header row then the column names will be pre-populated with names such as "Column_1". In this case, it will make sense to edit all of them so that they are descriptive. Also note that in this case, no assumption is made about columns to keep (they all default to "Ignore") or which column may be the key column (when there is a header row, it is presumed that the first column contains a unique identifier).
Finally, enter a name for "Node Type" (in the text box above the field mapping rows) and click "Import".
A successful import will display a popup showing what was loaded.
When you close this popup, you will be taken to the Data Hub view of the new node type where you may customize the appearance of the node type, specify what fields will be searchable and much more.
Note: Creating a node type in this manner will result in all of the fields being classified as strings. This limits some of the filtering functionality within Curiosity and so it is often better to define a schema first and then import data from text content into that pre-defined node type (see the Data Hub article for more details on defining schemas and see the next section for information on importing data into them).
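As a general illustration of why string-typed fields behave differently from numeric ones, the Python sketch below sorts the same price values first as text and then as numbers. It shows ordinary string versus numeric comparison only; it does not describe how Curiosity implements filtering internally.

```python
# Price values taken from the sample data, held as strings.
prices = ["39.00", "10.00", "18.40", "263.50"]

as_text = sorted(prices)                # compared character by character
as_numbers = sorted(prices, key=float)  # compared by numeric value

print(as_text)     # ['10.00', '18.40', '263.50', '39.00']
print(as_numbers)  # ['10.00', '18.40', '39.00', '263.50']
```

As text, "263.50" sorts before "39.00" because "2" comes before "3", which is why numeric ranges and comparisons need a numeric field type.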
Ingesting text data into an existing node type
If there is already a node type defined that you wish to import new data into, there are two ways that you can do so.
1. Data Hub / Import Data
This is a very similar process to that outlined in the previous section.
Change the "Import as" drop down selection from "Create new" to the name of an existing node type.
This replaces the editable field name text boxes with drop down lists of fields for the target node type, allowing you to map the column names in the text data to the field names in the node type. Each target field will
Any columns in the text data that you do not want to map to the node type can be set to "Ignore", rather than "Keep".
Note that the "Key" check boxes also disappear in this view as the Key field for the target schema has already been set and cannot be changed.
Note: The import process in the previous section resulted in a schema where every field was a string and so any value was valid. When importing into an existing schema, some fields may not be string values (they may be whole or fractional numbers or they may be boolean values, for example) and it's important that the text data column values are in the correct format for fields that are of a type other than "string". Rows that do not have data in the right format cannot be imported. You have two options for importing data that contains invalid rows:
- Import all of the rows except for the invalid ones: to do this, leave the "Ignore errors" option at its default value of "Ignore all errors". When the import is complete, if at least one row was successfully imported then you will be shown a popup that lists the newly imported items. When you close this popup you will see, behind it, another popup listing which rows were found to be invalid. If no rows were successfully imported then you will be taken immediately to this error list popup.
- Import all valid rows up until the first invalid one: to do this, change the "Ignore errors" option to "Stop on first error". As above, if any rows were successfully imported then these will be displayed in a popup that lists the new items, behind which there will be a popup listing the first error that was encountered.
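The difference between the two options can be mimicked locally to find invalid rows before importing. The sketch below is an assumption-laden stand-in, not Curiosity's validation logic: the `SCHEMA` column types are hypothetical, the accepted boolean spellings in `parse_bool` are a guess, and the second data row has been deliberately altered to contain an invalid price.

```python
import csv
import io


def parse_bool(text):
    """Parse a boolean; the accepted spellings are an assumption."""
    lowered = text.strip().lower()
    if lowered in ("true", "1"):
        return True
    if lowered in ("false", "0"):
        return False
    raise ValueError(f"not a boolean: {text!r}")


# Hypothetical target schema: one parser per column, in column order.
SCHEMA = [str, str, float, parse_bool]

# Sample rows; the price on row 2 was changed to "n/a" to force an error.
DATA = """Chai (New),10 boxes x 20 bags,18.00,0
Chang,24 - 12 oz bottles,n/a,0
Chartreuse verte,750 cc per bottle,18.00,0
"""


def check_rows(text, schema, stop_on_first_error=False):
    """Return (valid_rows, errors); errors are (row_number, message) pairs."""
    valid, errors = [], []
    for number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        try:
            if len(row) != len(schema):
                raise ValueError(f"expected {len(schema)} fields, got {len(row)}")
            valid.append([parse(cell) for parse, cell in zip(schema, row)])
        except ValueError as exc:
            errors.append((number, str(exc)))
            if stop_on_first_error:
                break  # later rows are not examined
    return valid, errors


valid, errors = check_rows(DATA, SCHEMA)
print(len(valid), [n for n, _ in errors])  # 2 [2]

valid, errors = check_rows(DATA, SCHEMA, stop_on_first_error=True)
print(len(valid), [n for n, _ in errors])  # 1 [2]
```

With errors ignored, rows 1 and 3 pass and row 2 is reported; with stop-on-first-error, only row 1 passes because checking halts at row 2.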
2. Data Hub / Select Data Type / Import
This is also very similar to the steps described in the previous sections.
Go to the Data Hub and then click on the name of the node type that you wish to import into. Click "Import" at the top of the left-hand side menu.
The only difference is that there is no "Import as" drop down because the target node type has already been selected.