OpenDataSoft provides different ways to add data to a dataset. Each method solves a specific use case: you may want to upload a static referential, stay synchronized with a remote service, extract data from a geographical information system, from an API...
You can attach a static file to your dataset by uploading a file from your computer via the Upload a file button.
The size limit for a file is 240 MB. If your file is too big, you can compress it before uploading it to the platform.
The Paste data box can be used to paste data directly (in CSV format). It is useful for quick tests.
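The compression step can be scripted before the upload. The sketch below is only an illustration: the function name `compress_if_too_big` is hypothetical, and the limit is read as 240 decimal megabytes, which is an assumption about how the platform counts the size.

```python
import gzip
import shutil
from pathlib import Path

# Assumption: the documented limit is interpreted as decimal megabytes.
SIZE_LIMIT_BYTES = 240 * 1000 * 1000


def compress_if_too_big(path: Path, limit: int = SIZE_LIMIT_BYTES) -> Path:
    """Return `path` unchanged if it fits the limit, else a gzip-compressed copy."""
    if path.stat().st_size <= limit:
        return path
    # Keep the original extension so the inner format stays recognizable.
    target = path.with_name(path.name + ".gz")
    with path.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

The returned path is the file to upload; the original file is left untouched.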
Remote file (http, https, ftp and ftps)
By entering a URL in the Enter an URL box, you can import files stored on a remote server. The following protocols are supported: http and ftp.
And their secured versions: https and ftps.
http links to a single file, for example http://example.org/mydata.csv
Using a directory is often the preferred solution to automate incremental updates between a customer’s information system and the OpenDataSoft platform. All the files in the directory need to have the same format and schema (for instance, CSV files with the same column titles). Whenever the dataset is published, new and updated files are fetched from the remote location and processed. Thanks to OpenDataSoft’s native deduplication strategy, similar records are not processed twice (see the Special fields documentation).
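Because every file in the directory must share the same schema, it can help to check the column titles locally before publishing. The following is a minimal sketch for CSV files, run on a local copy of the directory; the helper names and the default comma delimiter are assumptions, and it is not part of the platform.

```python
import csv
from pathlib import Path


def read_header(path: Path, delimiter: str = ",") -> list[str]:
    """Return the first row of a CSV file (its column titles)."""
    with path.open(newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle, delimiter=delimiter), [])


def files_with_mismatched_header(directory: Path, delimiter: str = ",") -> list[Path]:
    """List the CSV files whose header differs from the first file's header."""
    files = sorted(directory.glob("*.csv"))
    if not files:
        return []
    reference = read_header(files[0], delimiter)
    return [f for f in files[1:] if read_header(f, delimiter) != reference]
```

An empty result means all CSV files in the directory carry the same column titles in the same order.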
When synchronizing from a remote FTP location, OpenDataSoft keeps a persistent cache and does not automatically prune files missing from the remote directory. Please contact OpenDataSoft’s support if you need some cleanup to be performed.
We do not support the sftp protocol, which is completely different from the ftps protocol.
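Since sftp and ftps are easy to confuse, a URL can be checked against the supported schemes before it is entered. This is a small sketch using Python's standard `urllib.parse`; the function name `is_supported_url` is hypothetical.

```python
from urllib.parse import urlparse

# Schemes listed in this section; sftp is deliberately absent.
SUPPORTED_SCHEMES = {"http", "https", "ftp", "ftps"}


def is_supported_url(url: str) -> bool:
    """Return True when the URL uses one of the supported protocols."""
    return urlparse(url).scheme.lower() in SUPPORTED_SCHEMES
```

For example, `is_supported_url("http://example.org/mydata.csv")` is true, while any `sftp://` URL is rejected.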
API - Specific connector
Sometimes it is convenient to connect a dataset to a remote data source that exposes data records over an HTTP API.
OpenDataSoft natively supports the following APIs (contact your local support team to get these activated on your domain):
OpenDataSoft can also develop and integrate customer-specific Web APIs. The OpenDataSoft connectivity toolkit makes it possible to develop efficient and secure connectors that support incremental data processing.
The following table lists the supported formats and describes the configuration options for each format.
|Format|Extensions|Description|Configuration|
|---|---|---|---|
|CSV|.csv, .tsv, .txt, .dat|The classic Comma-Separated Values file.|CSV connector|
|Microsoft Excel|.xls, .xlsx||Spreadsheet connector|
|OpenDocument SpreadSheet|.ods||Spreadsheet connector|
|JSON|.json|Simple JSON documents are supported. The platform lets you choose the root path (path to the table of elements to be considered as rows) and the properties path (path to the dictionary holding the list of fields for an element).|JSON connectors|
|GeoJSON|.json, .geojson||GeoJSON connector|
|KML/KMZ|.kml, .kmz||KML and KMZ connector|
|Shapefile|.zip|A zip archive containing at least the following files: <NAME>.shp, <NAME>.dbf, <NAME>.prj|ShapeFile connector|
|MapInfo|.zip|A zip archive containing either <NAME>.mid and <NAME>.mif files or <NAME>.map, <NAME>.tab, <NAME>.id and <NAME>.dat|MapInfo connector|
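To make the JSON row more concrete, the sketch below shows one way to read the root path and the properties path. It is only an illustration of the idea: the dot-separated path syntax, the function names and the sample document are all assumptions, not the platform's actual configuration syntax.

```python
import json


def follow(node, path: str):
    """Walk a dot-separated path (assumed syntax) through nested dictionaries."""
    for key in filter(None, path.split(".")):
        node = node[key]
    return node


def extract_rows(document, root_path: str, properties_path: str = "") -> list:
    """Return one field dictionary per element found under the root path."""
    elements = follow(document, root_path)  # the table of elements (rows)
    return [follow(element, properties_path) for element in elements]


# Hypothetical document: rows live under result.items, fields under "fields".
document = json.loads(
    '{"result": {"items": ['
    '{"id": 1, "fields": {"name": "a"}}, '
    '{"id": 2, "fields": {"name": "b"}}]}}'
)
rows = extract_rows(document, "result.items", "fields")
```

Here `rows` holds two dictionaries, one per element of `result.items`, each containing only the content of its `fields` entry.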
File format support can be extended to match specific requirements (for instance, to support a complex XML DTD or a non-standard flat file format). Contact your local support team if you need more information on extending file format support.
OpenDataSoft supports compressed files (zip, bz2, tar, gz, gzip, tar.gz, tgz, tar.bz2).
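For zip-based formats such as Shapefile, the archive content can be verified before upload. The following sketch checks that the .shp, .dbf and .prj files listed in the table are present for the same <NAME>; the function name is hypothetical and only the first .shp file found is considered.

```python
import zipfile
from pathlib import PurePosixPath

REQUIRED_SUFFIXES = {".shp", ".dbf", ".prj"}


def missing_shapefile_parts(archive_path) -> set[str]:
    """Return the required extensions that are absent from the zip archive."""
    with zipfile.ZipFile(archive_path) as archive:
        names = [PurePosixPath(name) for name in archive.namelist()]
    shapefiles = [n for n in names if n.suffix.lower() == ".shp"]
    if not shapefiles:
        return set(REQUIRED_SUFFIXES)
    # The companion files must share the base name of the .shp file.
    stem = shapefiles[0].stem
    present = {n.suffix.lower() for n in names if n.stem == stem}
    return REQUIRED_SUFFIXES - present
```

An empty set means the three required files are all in the archive.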
Files (images) with metadata
To upload files (and images) to the platform, you first have to build a ZIP archive. This archive must contain the following files:
- a CSV file which lists the files (images) and metadata
- the files (images) at the same level (no subdirectory)
The CSV file must contain a column with the file (image) names; other columns are treated as additional fields. For instance:
The CSV file and the images must be located at the root of the archive.
You can find an example of a dataset with images on discovery. The source can be downloaded.
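Such an archive can be assembled with Python's standard `zipfile` and `csv` modules. This is a minimal sketch under stated assumptions: the CSV name `metadata.csv`, the column title `filename` and the function name `build_archive` are illustrative choices, not names required by the platform.

```python
import csv
import io
import zipfile
from pathlib import Path


def build_archive(files: dict[Path, dict[str, str]], target: Path,
                  name_column: str = "filename") -> None:
    """Write a ZIP holding every file at its root plus a CSV that lists them."""
    extra_columns = sorted({key for meta in files.values() for key in meta})
    listing = io.StringIO()
    writer = csv.DictWriter(listing, fieldnames=[name_column, *extra_columns])
    writer.writeheader()
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, meta in files.items():
            # arcname=path.name drops directories, so no subdirectory is created.
            archive.write(path, arcname=path.name)
            writer.writerow({name_column: path.name, **meta})
        # Hypothetical CSV name; it sits at the root next to the files.
        archive.writestr("metadata.csv", listing.getvalue())
```

Each key of `files` is a local path and each value holds the additional fields for that file, so a call such as `build_archive({Path("photos/a.png"): {"title": "First"}}, Path("images.zip"))` produces an archive with `a.png` and `metadata.csv` side by side.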