Formatting CSV Files

Learn how to format CSV files to use within the super.AI platform.

Comma-separated values (CSV) files store tabular data, much like a spreadsheet. They are simpler than Excel files because the data is stored as plain text, with each field separated by a comma (hence "comma-separated values"). With super.AI, you can use CSV files to add large numbers of data points for processing.

The same CSV file viewed in Excel and in a text editor. In the text editor, you can see the commas separating the values.

If you are new to CSV, this page serves as an introduction and provides examples of how to format your CSV files for upload to super.AI.

We also offer bulk upload via JSON, if you prefer that format.

CSV basics

The basic structure of a CSV file is a series of rows (i.e., new lines) within a text document. The first—or header—line in the file is used to present the property keys, separated by commas, which can be thought of as the column titles in a spreadsheet. Each new line following the header line uses comma-separated values that match the order of the property keys, as you can see in this example:

Movie name,Lead actor,Release year
Die Hard,Bruce Willis,1988
Titanic,Leonardo DiCaprio,1997
Forrest Gump,Tom Hanks,1994
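
To see how the header line maps onto each row, here is a minimal sketch that reads the example above with Python's standard csv module. The names CSV_TEXT and read_rows are made up for this illustration and are not part of super.AI.

```python
import csv
import io

# The example file from above, held in a string so the sketch is self-contained.
CSV_TEXT = """Movie name,Lead actor,Release year
Die Hard,Bruce Willis,1988
Titanic,Leonardo DiCaprio,1997
Forrest Gump,Tom Hanks,1994
"""


def read_rows(csv_text):
    """Return each data row as a dict keyed by the property keys in the header line."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return list(reader)


rows = read_rows(CSV_TEXT)
for row in rows:
    # Every value is read as text, including the release year.
    print(row["Movie name"], "-", row["Release year"])
```

Each row becomes a dictionary, so a value is looked up by its column title rather than by its position in the line.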

How to use Microsoft Excel to create CSV files

You can use Excel (or most other spreadsheet programs, including Google Sheets) to create CSV files. Simply create a spreadsheet where the first cell in each column is a property key (e.g., image_url) and fill each cell below with an input. For example:

Then go to File > Save As and choose CSV as the file format.

This step varies between spreadsheet programs. In Google Sheets, for example, go to File > Download and select the CSV format from the menu.

Formatting CSV files for super.AI

The following rules for formatting your CSV file for upload to super.AI apply regardless of which input file type(s) your project uses:

  • You cannot have duplicate input columns, i.e., two columns with the same property key
  • You cannot have missing input columns. For every input type, there must be a matching value in every row, even if that value is empty. You can see an example of optional inputs below.
  • You can define additional property keys to add extra data alongside your inputs, e.g., you could create an ID property key where in every row you pass a unique ID for use in your external database. See the example of adding extra data below for more information.
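
If you want to check a file against the first two rules before uploading, a local check could look like the sketch below. It is only an illustration: find_format_problems is a hypothetical helper, not a super.AI tool, and the platform's own validation may differ.

```python
import csv
import io


def find_format_problems(csv_text):
    """Return a list of readable problems; an empty list means none were found."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]

    header = rows[0]
    problems = []

    # Rule 1: no two columns may share the same property key.
    for key in sorted({key for key in header if header.count(key) > 1}):
        problems.append(f"duplicate column: {key}")

    # Rule 2: every row needs one value per column (an empty value is fine).
    for row_number, row in enumerate(rows[1:], start=1):
        if not row:
            continue  # assumption: completely blank lines are ignored
        if len(row) != len(header):
            problems.append(
                f"row {row_number}: expected {len(header)} values, found {len(row)}"
            )
    return problems


print(find_format_problems("audioUrl,text\nhttps://cdn.super.ai/example01.mp3\n"))
```

A row that ends with a trailing comma passes the check, because the value after the comma is present but empty.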

How to format CSV files for image inputs

For each image you want to submit, you will need to provide an image URL pointing to the input file. Each URL requires a separate row.

image_url
https://cdn.super.ai/example01.jpeg
https://cdn.super.ai/example02.jpeg
https://cdn.super.ai/example03.jpeg

How to format CSV files for text inputs

Text is provided as a raw text string. As with images, you can submit multiple data points for processing by using multiple rows.

text
"As a writer, Orwell produced literary criticism and poetry, fiction and polemical journalism; and is best known for the allegorical novella Animal Farm (1945) and the dystopian novel Nineteen Eighty-Four (1949)"
"In 1938, his radio anthology series The Mercury Theatre on the Air gave Welles the platform to find international fame as the director and narrator of a radio adaptation of H. G. Wells's novel The War of the Worlds, which caused widespread panic because many listeners thought that an invasion by extraterrestrial beings was actually occurring."
"Nabokov joined the staff of Wellesley College in 1941 as resident lecturer in comparative literature. The position, created specifically for him, provided an income and free time to write creatively and pursue his lepidoptery."

Escape quotation marks inside text strings

Any quotation marks in the text must be escaped. Escaping means adding a backslash (\) immediately before each quotation mark. The easiest way to handle this is to create the CSV with a tool (e.g., Microsoft Excel).
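
If you generate the file with a script, the backslash escaping described above can be sketched with Python's csv module. This assumes the upload expects backslash-escaped quotation marks as stated here. Wrapping every value in quotation marks is a choice made for this sketch, and write_text_csv is a hypothetical helper name. Many CSV tools escape by doubling the quotation mark instead, so check a small sample file first.

```python
import csv
import io


def write_text_csv(texts):
    """Build a one-column CSV in which inner quotation marks are backslash-escaped."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        quoting=csv.QUOTE_ALL,  # wrap every value in quotation marks
        doublequote=False,      # do not double inner quotation marks...
        escapechar="\\",        # ...put a backslash in front of them instead
        lineterminator="\n",
    )
    writer.writerow(["text"])  # header line with the property key
    for text in texts:
        writer.writerow([text])
    return buffer.getvalue()


escaped_csv = write_text_csv(['She said "hello" and left'])
print(escaped_csv)
```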

How to format CSV files for multiple inputs

If the project type you’re using takes multiple input data types, e.g., audio and text, you should set each input type in the header line, and include a set of inputs in each row. As always, you can use multiple rows to submit multiple data points for processing. Here’s an example of how it should look:

audioUrl,text
https://cdn.super.ai/example01.mp3,The rain in Spain falls mainly on the plains
https://cdn.super.ai/example02.mp3,"In Hertford, Hereford, and Hampshire, hurricanes hardly happen"
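
A text value that contains commas has to be wrapped in quotation marks so that it is read as a single field. As a rough sketch, Python's csv module adds that wrapping automatically when writing. The helper name build_csv is made up for this example, and the texts contain no quotation marks of their own.

```python
import csv
import io

ROWS = [
    ("https://cdn.super.ai/example01.mp3",
     "The rain in Spain falls mainly on the plains"),
    ("https://cdn.super.ai/example02.mp3",
     "In Hertford, Hereford, and Hampshire, hurricanes hardly happen"),
]


def build_csv(header, rows):
    """Write a header line and data rows, quoting only the values that need it."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = build_csv(["audioUrl", "text"], ROWS)
print(csv_text)

# Reading the text back confirms that every row still has exactly two values.
parsed = list(csv.reader(io.StringIO(csv_text)))
```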

How to format CSV files with extra data

If you wish to include custom extra data with your inputs, you can do so by creating additional property keys. We won’t do anything with this data, but will return it to you when you download your results. In this example, an ID property key allows us to decorate each input with a custom ID:

image_url,ID
https://cdn.super.ai/example01.jpeg,000001
https://cdn.super.ai/example02.jpeg,000002
https://cdn.super.ai/example03.jpeg,000003
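
If you build this kind of file with a script, the sketch below shows one way to attach a zero-padded ID to each URL. The helper build_csv_with_ids and the six-digit ID format are assumptions for illustration. Use whatever identifier your external database expects.

```python
import csv
import io

URLS = [
    "https://cdn.super.ai/example01.jpeg",
    "https://cdn.super.ai/example02.jpeg",
    "https://cdn.super.ai/example03.jpeg",
]


def build_csv_with_ids(urls):
    """Write one row per URL with an extra ID column alongside the input."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["image_url", "ID"], lineterminator="\n"
    )
    writer.writeheader()
    for number, url in enumerate(urls, start=1):
        # Format the ID as text so the leading zeros are kept.
        writer.writerow({"image_url": url, "ID": f"{number:06d}"})
    return buffer.getvalue()


csv_with_ids = build_csv_with_ids(URLS)
print(csv_with_ids)
```

Writing the IDs as text matters because spreadsheet programs such as Excel tend to treat 000001 as the number 1 when they open a CSV file, which drops the leading zeros.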