NAV Navbar
Ods logo

OpenDataSoft API Documentation

shell

Search API v2

Search API endpoint for examples domain

https://examples.opendatasoft.com/api/v2

The OpenDataSoft search API v2 is organized around REST. It provides access to all the data available through the platform in a coherent, hierarchical way.

Authentication

An authenticated user can be granted access to restricted datasets and benefit from extended quotas for API calls. The API features an authentication mechanism for users to be granted their specific authorizations.

For the platform to authenticate a user, you need to either:

Finding and generating API keys

API keys are managed via your user profile page at https://<youropendatasoftportal>.com/account/ or by clicking on your name in the header.

Link to account settings

Go to the tab named My API keys to see your existing API keys, revoke them and create new ones.

Account's API keys page

Providing API keys within requests

Unauthenticated request on private portal

> GET https://private-portal.opendatasoft.com/api/v2/catalog/datasets/ HTTP/1.1

< HTTP/1.0 401 Unauthorized

Request authenticated with an API key

> GET https://private-portal.opendatasoft.com/api/v2/catalog/datasets/?apikey=7511e8cc6d6dbe65f9bc8dae19e08c08a2cab96ef45a86112d303eee HTTP/1.1

< HTTP/1.0 200 OK
{
    "total_count": 4,
    "links": [{
        "href": "https://private-portal.opendatasoft.com/api/v2/catalog/datasets?start=0&include_app_metas=False&rows=10",
        "rel": "self"
    }, {
        "href": "https://private-portal.opendatasoft.com/api/v2/catalog/datasets?start=0&include_app_metas=False&rows=10",
        "rel": "first"
    }, {
        "href": "https://private-portal.opendatasoft.com/api/v2/catalog/datasets?start=0&include_app_metas=False&rows=10",
        "rel": "last"
    }],
    "datasets": [...]
}

API keys are passed along requests through the query parameter apikey.

For example, accessing a private portal’s catalog unauthenticated will return a 401 Unauthorized error.

But passing the API key of an authorized user will return the JSON response with the list of accessible datasets for this user on the portal.

Using OAuth2 authorization

Overview

OpenDataSoft implements the OAuth2 authorization flow, allowing third party application makers to access the data hosted on an OpenDataSoft platform on behalf of a user while never having to deal with a password, thus avoiding any user credential to be compromised.

The OpenDataSoft OAuth2 authorization flow is compliant with RFC 6749 and makes use of Bearer Tokens in compliance with RFC 6750.

Application developers who want to use the OpenDataSoft APIs with OAuth2 must go through the following steps, which will be explained in this section.

  1. Register their application with the OpenDataSoft platform.
  2. Request approval from users via an OAuth2 authorization grant.
  3. Request a bearer token that will allows them to query the OpenDataSoft platform APIs for a limited amount of time.
  4. Refresh the Bearer Token when it expires.

Currently, applications are registered on a specific domain and can only access data on this domain.

Register an application for OAuth2 authentication

OAuth2 applications management interface

  1. Go to the My applications tab of your account page on the domain you want to register the application on.
  2. Fill the registration form with the following information:
    • Application name: the name of the application
    • Type:
      • confidential: client password is kept secret from the user and only used from a trusted environment (e.g: a web service, where the client password is stored server-side and never sent to the user)
      • public: client password is embedded in a client-side application, making it potentially available to the world (e.g: a mobile or desktop application)
    • Redirection URL: the URL users will be redirected to after they have granted you permission to access their data
  3. Store the resulting client ID and client secret that will be needed to perform the next steps.

Getting an authorization grant

Example call to /oauth2/authorize/

GET /oauth2/authorize/?
    client_id=123456789&
    redirect_uri=https://example.com&
    response_type=code&
    state=ilovedata&
    scope=all HTTP/1.1

To get an authorization grant from a user:

  1. Redirect them to /oauth2/authorize/ with the appropriate query parameters.
  2. The user will then be authenticated in the platform and redirected to a page identifying your application.
  3. From there, the user will review the information you filled in the form described above and the scope of the requested access, and grant your application the right to access their data.
  4. Once the user has accepted those terms, they will be redirected to your application’s redirection URL with query parameters describing your authorization grant.

The query parameters you need to supply when redirecting the user are the following:

Redirection following a successful authorization

HTTP/1.0 302 FOUND
Location: https://example.com?state=ilovedata&code=gKnAQc2yIfdz2mY25xxgpTY2uyG5Sv

The authorization grant redirect will have these values:

The 30-character authorization code must now be converted into a bearer token within 1 hour before expiring.

Converting an authorization grant to a bearer token

Example call to /oauth2/token/

POST /oauth2/token/ HTTP/1.1

client_id=cid&
    client_secret=csc&
    grant_type=authorization_code&
    code=GokshWxRFXmW0MaLHkDv5HrG6wieGs&
    scopes=all&
    redirect_uri=https://example.com&
    state=ilovedata

To receive a bearer token, convert the previously obtained authorization grant via a POST request to /oauth2/token/ with the following parameters:

Alternative call with an Authorization header

POST /oauth2/token/ HTTP/1.1
Authorization: Basic Y2lkOmNzYw==

grant_type=authorization_code&
    code=GokshWxRFXmW0MaLHkDv5HrG6wieGs&
    scopes=all&
    redirect_uri=https://example.com&state=ilovedata

Alternatively, you can pass your client ID and client secret through the Authorization header

Example response for a bearer token request

HTTP/1.0 200 OK
Content-Type: application/json
{
    "access_token": "9kxoTUYvSxnAiMpv008NBqRiqk5xWt",
    "expires_in": 3600,
    "token_type": "Bearer",
    "state": "ilovedata",
    "scope": "all",
    "refresh_token": "jFfDUcsK9zzNMs1zwczzJxGrimPtmf"
}

The response to this request is a JSON representation of a bearer token, which contains the following values:

Using the bearer token

Using the token as a query parameter

GET /api/end/point?access_token=9kxoTUYvSxnAiMpv008NBqRiqk5xWt HTTP/1.1

Using the token in an Authorization header

GET /api/end/point HTTP/1.1
Authorization: Bearer 9kxoTUYvSxnAiMpv008NBqRiqk5xWt

Using the token in the request body

GET /api/end/point HTTP/1.1

access_token=9kxoTUYvSxnAiMpv008NBqRiqk5xWt

The bearer token can be passed along requests for authentication in three different ways:

Refreshing a bearer token

Example token refresh call

POST /oauth2/token/ HTTP/1.1

client_id=cid&
    client_secret=csc&
    grant_type=refresh_token&
    refresh_token=jFfDUcsK9zzNMs1zwczzJxGrimPtmf&
    scopes=all&
    redirect_uri=https://example.com&
    state=ilovedata

To refresh an expired bearer token, send a request to the /oauth2/token/ endpoint, with the following query parameters:

The response to this request is identical to the bearer token response.

Sources

Retrieve a list of available sources on public domain

curl 'https://examples.opendatasoft.com/api/v2/'

The above command returns

{
    "links": [{
            "href": "https://examples.opendatasoft.com/api/v2",
            "rel": "self"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/catalog",
            "rel": "catalog"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/monitoring",
            "rel": "monitoring"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/opendatasoft",
            "rel": "opendatasoft"
        }
    ]
}

The search API v2 can access data from 3 different sources, depending on the type of data to search.

Every call from the catalog or dataset APIs will be performed on the chosen data source.

Catalog source

Get a list of published datasets on public domain

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets'

The catalog source works on the published dataset of a requested domain. Use this source to retrieve actual data from a specific domain.

HTTP Request

GET /api/v2/catalog/

Monitoring source

Get a list of monitoring datasets on public domain

curl 'https://examples.opendatasoft.com/api/v2/monitoring/datasets'

The monitoring source allows to search and work on special datasets providing analysis information about a requested domain.

HTTP Request

GET /api/v2/monitoring/

OpenDataSoft source

Get a list of all public datasets on OpenDataSoft Data Hub

curl 'https://examples.opendatasoft.com/api/v2/opendatasoft/datasets'

The OpenDataSoft allows to search and work on all available public datasets from the OpenDataSoft data network.

HTTP Request

GET /api/v2/opendatasoft/

Metadata

Metadata is data describing the dataset itself.

Each metadata belongs to a metadata template. There are 3 different types of metadata template:

Listing metadata template types

List metadata template types

curl 'https://examples.opendatasoft.com/api/v2/catalog/metadata_templates'

API response:

{
    "links": [{
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates",
            "rel": "self"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates/interop",
            "rel": "Interoperatibility"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates/basic",
            "rel": "Basic"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates/extra",
            "rel": "Extra"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates/admin",
            "rel": "Admin"
        }
    ]
}
HTTP Request

GET /api/v2/catalog/metadata_templates

This endpoint returns the list of all available metadata template types.

Listing metadata templates for a specific type

List templates for basic type

curl 'https://examples.opendatasoft.com/api/v2/catalog/metadata_templates/basic'

API response:

{
    "links": [{
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates/basic",
            "rel": "self"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates",
            "rel": "metadata_templates"
        }
    ],
    "metadata_templates": [{
        "links": [{
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates/basic/default",
            "rel": "self"
        }],
        "matadata_template": {
            "type": "basic",
            "name": "default",
            "schema": [{
                    "widget": "textinput",
                    "name": "title",
                    "uri": "http://purl.org/dc/terms/title",
                    "label": "Title",
                    "values": null,
                    "self_suggest": false,
                    "allow_empty": false,
                    "help_text": null,
                    "hidden": true,
                    "type": "text",
                    "values_domain_property": null
                },
                {
                    "widget": "richtextinput",
                    "name": "description",
                    "uri": "http://purl.org/dc/terms/description",
                    "label": "Description",
                    "values": null,
                    "self_suggest": false,
                    "allow_empty": true,
                    "help_text": null,
                    "hidden": false,
                    "type": "longstring",
                    "values_domain_property": null
                },
                ...
            ]
        }
    }]
}
HTTP Request

GET /api/v2/catalog/metadata_templates/<TYPE>

This endpoint returns the list of existing metadata templates for a chosen type.

Listing metadata for a specific template

List metadata for default template

curl 'https://examples.opendatasoft.com/api/v2/catalog/metadata_templates/basic/default'

API response:

{
    "links": [{
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates/basic/default",
            "rel": "self"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates/basic",
            "rel": "basic"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates",
            "rel": "metadata_templates"
        }
    ],
    "metadata_template": {
        "type": "basic",
        "name": "default",
        "schema": [{
                "widget": "textinput",
                "name": "title",
                "uri": "http://purl.org/dc/terms/title",
                "label": "Title",
                "values": null,
                "self_suggest": false,
                "allow_empty": false,
                "help_text": null,
                "hidden": true,
                "type": "text",
                "values_domain_property": null
            },
            {
                "widget": "richtextinput",
                "name": "description",
                "uri": "http://purl.org/dc/terms/description",
                "label": "Description",
                "values": null,
                "self_suggest": false,
                "allow_empty": true,
                "help_text": null,
                "hidden": false,
                "type": "longstring",
                "values_domain_property": null
            },
            {
                "widget": "multidatalist",
                "name": "theme",
                "uri": "http://www.w3.org/ns/dcat#theme",
                "label": "Themes",
                "values": null,
                "self_suggest": false,
                "allow_empty": true,
                "help_text": null,
                "hidden": false,
                "type": "list",
                "values_domain_property": "metadata.themes"
            },
            ...
        ]
    }
}
HTTP Request

GET /api/v2/catalog/metadata_templates/<TYPE>/<TEMPLATE>

This endpoint returns the list of existing metadata for a chosen template.

Catalog

List available entrypoints on a catalog

curl 'https://examples.opendatasoft.com/api/v2/catalog/'

API Response


{
    "links": [{
            "href": "https://examples.opendatasoft.com/api/v2/catalog",
            "rel": "self"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/catalog/datasets",
            "rel": "datasets"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/catalog/exports",
            "rel": "exports"
        },
        {
            "href": "https://examples.opendatasoft.com/api/v2/catalog/metadata_templates",
            "rel": "metadata_templates"
        }
    ]
}

A catalog is the list of datasets sourced in a domain.

The catalog API allows to:

Each endpoint above is documented in its own section, along with its available parameters. Some of these parameters however accept field literals, which are documented right below. We recommend reading the Field literal in catalog queries section before diving into the catalog API.

Field literal in catalog queries

Some parameters, such as select, where or group_by, accept field literals, which can either be technical fields or metadata.

Dataset technical fields

Use technical field as field literal

# Count dataset grouped by their features
curl 'https://examples.opendatasoft.com/api/v2/catalog/aggregates?select=count(*)&group_by=features'
# Note: (since a dataset can have multiple features, total count is not the number of datasets in the domain)
Field name Description
datasetid Human readable dataset identifier
has_records Boolean field indicating if a dataset contains records
features List of dataset features. Possible values: calendar, geo, image, apiproxy, timeserie and aggregate
Dataset metadata

Use metadata as field literal

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets?where=default.modified>2015'
# Since modified is a `basic` metadata, `where` expression can be simplified to `modified>2015`
curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets?where=modified>2015'

# Get datasets that have been downloaded more than a 100 times
curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets?where=explore.download_count>100'

All metadata can be used as field literal in a query parameter.

Metadata must be fully qualified with their template name. It means that the name of the metadata must be prefixed by its template name, followed by a dot. Example: <template_name>.<metadata_name> For basic metadata, this prefix is optional.

The list of metadata and their types for a domain can be obtained with the metadata API.

Searching datasets

Get first 10 datasets

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets?rows=10'

Get 10 datasets starting at the 10th result

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets?rows=10&start=10'

Search datasets containing world in their metas

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets?where="world"'

This endpoint provides a search facility in the dataset catalog.

HTTP Request

GET /api/v2/catalog/datasets

URL Parameters
Parameter Default Description
where None Filter expression used to restrict returned datasets (ODSQL documentation)
start 0 Index of the first item to return
rows 10 Number of items to return. Max value: 100
include_app_metas false Explicitely request application metadata for each datasets
timezone UTC Timezone applied on datetime fields in query and response

Aggregating datasets

Aggregation query without group_by

curl 'https://examples.opendatasoft.com/api/v2/catalog/aggregates/?select=count(*) as count'

API Response

{
    "aggregations": [
        {
            "count": 2
        }
    ]
}

Aggregation query with a single group_by

curl 'https://examples.opendatasoft.com/api/v2/catalog/aggregates/?select=features,count(*) as count&group_by=features'

API Response

{
    "aggregations": [
        {
            "count": 2,
            "features": "analyze"
        },
        {
            "count": 2,
            "features": "timeserie"
        },
        {
            "count": 1,
            "features": "geo"
        }
    ]
}

Invalid aggregation with a selected field not present in group_by

curl 'https://examples.opendatasoft.com/api/v2/catalog/aggregates/?select=records_count'

Valid aggregation with an aggregation function

curl 'https://examples.opendatasoft.com/api/v2/catalog/aggregates/?select=sum(records_count)'

API Response

{
    "aggregations": [
        {
            "sum(records_count)": 3893
        }
    ]
}

Aggregation with an multiple group_by

curl 'https://examples.opendatasoft.com/api/v2/catalog/aggregates/?select=features,theme,count(*)&group_by=features,theme'

API Response

{
    "links": [{
        "href": "https://examples.opendatasoft.com/api/v2/catalog/aggregates",
        "rel": "self"
    }],
    "aggregations": [{
            "theme": "Administration, Government, Public finances, Citizenship",
            "count(*)": 1,
            "features": "analyze"
        },
        {
            "theme": "Culture, Heritage",
            "count(*)": 1,
            "features": "analyze"
        },
        {
            "theme": "Administration, Government, Public finances, Citizenship",
            "count(*)": 1,
            "features": "timeserie"
        },
        {
            "theme": "Culture, Heritage",
            "count(*)": 1,
            "features": "timeserie"
        },
        {
            "theme": "Culture, Heritage",
            "count(*)": 1,
            "features": "geo"
        }
    ]
}

This endpoint provides an aggregation facility in the datasets catalog.

An aggregation query returns a JSON array containing an object for each group created by the query. Each JSON object contains a key/value pair for each select instruction. However, without the group_by parameter, the query returns an array with only one object.

select parameter can only be composed of aggregation function or by aggregated value. It means that literal field in select clause outside aggregation function must be present in group_by clauses.

If query contains multiple group_by clauses, returned groups are combined together.

HTTP Request

GET /api/v2/catalog/aggregates

URL Parameters
Parameter Default Description
where None Filter expression used to restrict returned datasets (see where clause in ODSQL documentation)
select None Select clause for aggregation (see select clause in ODSQL documentation)
group_by None Group by clause for aggregation (see group_by clause in ODSQL documentation)
timezone UTC Timezone applied on datetime fields in query and response
limit None Number of items to return

Exporting datasets

Get a list of available export formats

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/'

The endpoint allows to download all datasets for a requested domain.

A dataset catalog can be exported in 7 different formats:

HTTP Request

GET /api/v2/catalog/exports

Exporting a catalog in JSON

Export datasets in json format

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/json'
HTTP Request

GET /api/v2/catalog/exports/json

Exporting a catalog in CSV

Export datasets in csv format using , as delimiter

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/csv?delimiter=,'

In the CSV format, the default separator is ;. It can be changed with the delimiter parameter.

HTTP Request

GET /api/v2/catalog/exports/csv

URL Parameters
Parameter Default Description
delimiter ; Delimiter used between column values
list_separator , Character used to separate values in a list

Exporting a catalog in XLS

Export datasets in xls format

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/xls'

Export datasets to an XLS format using SpreadsheetML specification.

HTTP Request

GET /api/v2/catalog/exports/xls

Exporting a catalog in RSS

Export datasets in rss format

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/rss'
HTTP Request

GET /api/v2/catalog/exports/rss

Exporting a catalog in TTL

Export datasets in turle rdf format

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/ttl'

Export datasets to a Turtle RDF format using DCAT application for data portals in Europe.

HTTP Request

GET /api/v2/catalog/exports/ttl

Exporting a catalog in RDF

Export datasets in rdf-xml format

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/rdf'

Export datasets to an XML-RDF format using DCAT application for data portals in Europe.

HTTP Request

GET /api/v2/catalog/exports/rdf

Exporting a catalog in Data.json

Export datasets in data.json format

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/data.json'

Export datasets in the Project Open Data Metadata Schema v1.1 (data.json).

HTTP Request

GET /api/v2/catalog/exports/data.json

Looking up a dataset

Lookup Unesco dataset

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/world-heritage-unesco-list'

This endpoint allows to retrieve information about a specific datasets.

HTTP Request

GET /api/v2/catalog/datasets/<dataset_id>

Dataset

All datasets contain specific data called “records”.

The dataset API allows to work on these records. More specifically, the dataset API allows to:

Each endpoint above is documented in its own section, along with its available parameters. Some of these parameters however accept field literals, which are documented right below. We recommend reading the Field literal in dataset queries section before diving into the dataset API.

Field literal in dataset queries

Some parameters, such as select, where or group_by, accept field literal. In dataset search, a field literal can either be a technical field or a field from dataset mapping.

Dataset technical fields
# Sort records by their technical size
curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/baby_names_nc_2013/records?sort=record_size'
Field name Description
datasetid Human readable dataset identifier
record_timestamp Date field indicating the publishing date
recordid Unique identifier of the record
record_size Size of the record
Dataset fields

Use a field name as field_literal

#Use field_name `name` to restrict records where `name` is Jonathan
curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/baby_names_nc_2013/records?where=name="Jonathan"'

# Select only `name` column
curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/baby_names_nc_2013/records?select=name'

Any field name from a dataset can be used as field literal in query parameters.

The list of fields for a specific dataset can be obtained with the dataset lookup API.

Searching records

Get first 10 records

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/baby_names_nc_2013/records?rows=10'

Get 10 records starting at the 10th result

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/baby_names_nc_2013/records?rows=10&start=10'

Search datasets containing noa in their fields

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/baby_names_nc_2013/records?where="Noa"'

This endpoint provides a search facility in the dataset catalog.

HTTP Request

GET /api/v2/catalog/datasets/<dataset_id>/records

URL Parameters
Parameter Default Description
select * Select expression used to retrieve specific fields (see ODSQL documentation)
where None Filter expression used to restrict returned datasets (see ODSQL documentation)
start 0 Index of the first item to return
rows 10 Number of items to return. Max value: 100
include_app_metas false Explicitely request application metadata for each dataset
timezone UTC Timezone applied on datetime fields in query and response

Aggregating records

Aggregation query without group_by

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/baby_names_nc_2013/aggregates?select=count(*) as count'

Returns an array with one element

{
    "aggregations": [
        {
            "count": 2841
        }
    ]
}

Aggregation query with a single group_by

# Retrieve population, state name, number of cities for each state (for the 1000 largest cities in the US)
curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/largest-us-cities/aggregates?select=count(*) as num_cities,state,sum(population) as sum_population&group_by=state'

Returns an array with an object for each feature containing feature’s name and number of datasets

{
    "aggregations": [
        {
            "state": "California",
            "num_cities": 212,
            "sum_population": 27910620
        },
        {
            "state": "Texas",
            "num_cities": 83,
            "sum_population": 14836230
        },
        ...
    ]
}

Invalid aggregation with a selected field not present in group_by

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/largest-us-cities/aggregates?select=state'

Valid aggregation with an aggregation function

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/largest-us-cities/aggregates?select=sum(population)'

Aggregation with an multiple group_by

# Retrieve number of Unesco sites grouped by continent and cities
curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/world-heritage-unesco-list/aggregates?select=continent_en,country_en,count(*)&group_by=continent_en,country_en'

This endpoint provides an aggregation facility for records.

An aggregation query returns a JSON array containing an object for each group created by the query. Each JSON object contains a key/value pair for each select instruction. However, without the group_by parameter, it returns an array with only one object.

select parameter can only be composed of aggregation function or by aggregated value. It means that literal field in select clause outside aggregation function must be present in group_by clauses.

If a query contains multiple group_by clauses, returned groups are combined together.

HTTP Request

GET /api/v2/catalog/datasets/<dataset_id>/aggregates

URL Parameters
Parameter Default Description
where None Filter expression used to restrict returned datasets (see where clause in ODSQL documentation)
select None Select clause for aggregation (see select clause in ODSQL documentation)
group_by None Group by clause for aggregation (see group_by clause in ODSQL documentation)
timezone UTC Timezone applied on datetime fields in query and response
limit None Number of items to return

Exporting records

Get a list of available export formats

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/world-heritage-unesco-list/exports'

This endpoint allows to download all records for a requested dataset.

Records can be exported in 10 different formats:

HTTP Request

GET /api/v2/catalog/<dataset_id>/exports

Exporting records in JSON

Export records in json format

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/world-heritage-unesco-list/exports/json'
HTTP Request

GET /api/v2/catalog/<dataset_id>/exports/json

Exporting records in GeoJSON

Export records in GeoJSON format

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/world-heritage-unesco-list/exports/geojson'
HTTP Request

GET /api/v2/catalog/<dataset_id>/exports/geojson

Export records to a GeoJSON format.

Exporting records in JSON Lines

Export records in json lines format

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/world-heritage-unesco-list/exports/jsonl'
HTTP Request

GET /api/v2/catalog/<dataset_id>/exports/jsonl

Export records to a NDJSON (Json lines) format. The JSONlines format returns a record by line. It can be useful for streaming operations.

Exporting records in CSV

Export records in csv format using , as delimiter

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/world-heritage-unesco-list/exports/csv?delimiter=,'

Export records to CSV format. Default separator is ;. It can be changed with delimiter parameter.

HTTP Request

GET /api/v2/catalog/<dataset_id>/exports/csv

URL Parameters
Parameter Default Description
delimiter ; Delimiter used between column values
list_separator , Character used to separate values in a list

Exporting records in XLS

Export records in xls format

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/xls'

Export records to an XLS format using SpreadsheetML specification.

HTTP Request

GET /api/v2/catalog/<dataset_id>/exports/xls

Exporting records in Shapefile

Export records to shapefile format

curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/world-heritage-unesco-list/exports/shp'

Export datasets to a Shapefile format.

HTTP Request

GET /api/v2/catalog/<dataset_id>/exports/shp

Exporting records in Turtle RDF

Export records in turle rdf format

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/turtle'
HTTP Request

GET /api/v2/catalog/<dataset_id>/exports/turtle

Exporting records in RDF-XML

Export records in rdf-xml format

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/rdfxml'
HTTP Request

GET /api/v2/catalog/<dataset_id>/exports/rdfxml

Exporting records in N3 RDF

Export records in n3 rdf format

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/n3'
HTTP Request

GET /api/v2/catalog/<dataset_id>/exports/n3

Exporting records in JSON-LD RDF

Export records in json-ld rdf format

curl 'https://examples.opendatasoft.com/api/v2/catalog/exports/jsonld'
HTTP Request

GET /api/v2/catalog/<dataset_id>/exports/jsonld

Looking up a record

Lookup airbnb-listings dataset


# Get eiffel tower specific record from Unesco dataset
curl 'https://examples.opendatasoft.com/api/v2/catalog/datasets/world-heritage-unesco-list/records/0ef334837810f591330d1c6bc0e9289d00ff1c9d'

This endpoint allows to retrieve information about a specific record.

HTTP Request

GET /api/v2/catalog/datasets/<dataset_id>/records/<record_id>

ODSQL

Filtering features are built in the core of OpenDataSoft API engine.

The OpenDataSoft Query Language (ODSQL) makes it possible to express complex queries as a filtering context for datasets or records, but also to build aggregations or computed fields.

Note that a given filtering context can simply be copied from one API to another. For example, it is possible to build a user interface which first allows the user to visually select the records they are are interested in, using full text search, facets and geo filtering ; and then allowing them to download these records with the same filtering context.

Introduction

The ODSQL is split into 5 different kinds of clauses:

These clauses are used as parameters in the Search API v2 for searching, aggregating and exporting datasets and records. Depending on the used endpoint, some features of the query language are available or not in the request.

Language elements

ODSQL clauses are composed of basic language elements. These can either be literals or reserved keywords.

Literals in ODSQL clauses

Literals are used in comparison, assignments or functions.

There are 7 types of literal:

Field literal

Example of field literal

my_field > 10  # my_field is a string literal
`12` > 10      # without back-quotes 12 is considered as a numeric literal
`and`: "value" # AND is a keyword

A field literals is a literal not enclosed in quotes. It can only contain alphanumeric characters or underscores.

String literal

Example of string literal

"Word"
"Multiple words"
'Using single quotes'

A string literal is a literal enclosed in either single or double quotes.

Number literal

Example of number literal

100
5.8
my_field > 108.7

A number literal is either an integer or a decimal value. It is not enclosed in quotes.

Date literal

Example of date literal

date'2017-04-03T08:02'
date'2018/04/01'

A date literal is defined with a date keyword followed by a valid date format enclosed in single quotes.

A valid date can be:

Boolean literal

Example of boolean literal

my_boolean_field is TRUE
my_boolean_field: FALSE

A boolean literal can either be a TRUE or a FALSE keyword (case insensitive). It should be used in boolean filters.

Geometry literal

Example of geometry literal

distance(my_geo_field, geom'POINT(1 1)', 10km)
geometry(my_geo_field, geom'{"type": "Polygon","coordinates":[[[100.0, 0.0],[101.0, 0.0],[101.0, 1.0],[100.0, 1.0],[100.0,0.0]]]}')

A geometry literal is defined with a geom keyword followed by a valid geometry expression enclosed in single quotes.

Supported geometry expressions are:

Scalar functions

A scalar function can be used in select arithmetic expressions or filter expressions.

Function Parameters Description Limitation
length string literal or string field literal Returns the number of characters
now no parameter Returns the current date Only works on filter expressions
year date field literal Returns the year of the field literal
month date field literal Returns the month of the field literal
day date field literal Returns the day of the field literal
hour date field literal Returns the hour of the field literal
minute date field literal Returns the minute of the field literal
second date field literal Returns the second of the field literal
date_format date field literal Returns the formatted date (see Group by date format for examples)

Reserved keywords in ODSQL clauses

not is a reserved keywords and must be escaped with back-quotes if referred as field literal

my_field_literal is not true # my_field_literal is not a reserved keyword, no need to escape it
`not` is not true # not is a reserved keyword and must be escaped

Reserved keywords can be used inside clauses for building ODSQL expressions. When used in a clause as a field literal, reserved keyword must be escaped with back-quotes.

List of reserved keywords:

Select clause

The select clause can be used in records search APIs as the parameter select.

The select clause allows to: - choose the fields that will be returned for each row - transform fields using arithmetic - rename fields - add computed virtual fields to fields - include or exclude fields based on a pattern

A select clause is composed of a single select expression or a list of comma-separated expressions.

A select expression can be:

Except for the include/exclude function, a select expression can define a label with the keyword AS. This label will be used in the output of the API as key for the select expression result.

Select field literal

Examples of select field literal

*                           # Select all fields
field1, field2, field3      # Only select field1, field2 and field3
field1 AS my_field, field2  # Renaming field1 as my_field and select field2

A select field literal is the simplest form of select expression. It takes a field literal that must be returned in the result. It also accepts the special character * to select all fields (it is the default behaviour).

Include and exclude functions

Example of include/exclude

include(pop) # will only include fields which name is pop
exclude(pop) # will exclude fields which name is pop
include(pop*) # Will include fields beginning with pop

Include and exclude are functions that accept fields names.

Fields listed in an include function are present in the result whereas fields listed in an exclude function are absent from the result.

Fields can contain a wildcard suffix (the * character). In that case, the inclusion/exclusion works on all field names beginning with the value preceding the wildcard.

Arithmetic select expression

Example of include/exclude

2 as const_2 # Creates a field `const_2` containing the value `2`
2 * population as double_population # Create a field `double_population` containing the double of population field
"hello" as hello # Creates a field containing "hello" value
length(country_name) # Get length (number of characters) of country_name field values

An arithmetic select expression accepts simple arithmetic operations. It accepts field literals, constant numeric or text values and scalar functions. More complex arithmetic expressions can be formed by connecting these elements with arithmetic operators:

Select aggregation

Examples of aggregation expression

SUM(population) as sum_population # Will compute the sum of all values for the field `population` returned as sum_population
COUNT(*) # Return number of elements

Like in the SQL language, a select can also express an aggregation expression.

Available aggregation functions are:

Count aggregation

Examples of count aggregation

COUNT(*) # Return number of elements
count(population) as population_count_not_empty # Return number of elements where `population` field is not empty

This function computes numbers of elements.

It accepts the following parameters: - a field literal: only returns the count for not null value of this field - a * : returns the count of all elements

Max aggregation

Examples of max aggregation

max(population) as max_population # Return max value for population field

This function takes a numeric field literal. It returns the maximum value (max) of this field.

Min aggregation

Examples of min aggregation

min(population) as min_population # Return min value for population field

This function takes a numeric field literal. It returns the minimum value (min) of this field.

Avg aggregation

Examples of avg aggregation

avg(population) as avg_population # Return the average of the population

This function takes a numeric field literal. It returns the average (avg) of this field.

Where clause

Where clause with boolean operators

my_numeric_field > 10 and my_text_field like "paris" or distance(my_geo_field, geom'POINT(1 1)', 1 km)

This where clause filters results where numeric_field > 10 and (my_text_field contains the word paris or distance between my_geo_field and the point with 1,1 as lat,lon is under 1 kilometer)

The where clause can be used in the whole search API as the parameter where.

The where clause allows to filter rows with a combination of where expressions.

A where expression can be:

Where expressions can be combined with boolean operators (see Boolean operations documentation section below) and grouped via parenthesis.

Boolean operators

Boolean operators

my_boolean_field OR my_numeric_field > 50 and my_date_field > date'1972'
# Results can have my_boolean_field to true. They can also have my_numeric_field greater than 50 and my_date_field older than 1972

(my_boolean_field OR my_numeric_field > 50) and my_date_field > date'1972'
# Results must have my_date_field older than 1972. They also must have my_boolean_field to true or my_numeric_field greater than 50

Where expressions can use boolean operators to express boolean filter.

There are 3 different boolan operations:

AND has precedence over OR operator. It means that, in the expression a or b and c, the sub-expression b and c is interpreted and executed first. It can also be written with parenthesis: a or (b and c).

In order to change operator precedence, it is possible to use parenthesis in the expression. To give precedence to OR operator, the above expression can be written (a or b) and c.

Search query filter

Examples for search query

"tree"
"tree" AND "flower"
"tree" OR "car"
NOT "dog"
"dog" AND NOT "cat"

Examples of search query with multiple words

"film"           # returns results that contain film
"action movies"  # returns results that contain action and movies.

Example of wildcarded search query

"film*"      # returns results that contain film, films, filmography, etc.

Filter search queries are queries that don’t refer to fields, they only contain quoted strings and boolean operators. Filter search queries perform full-text searches on all visible fields of each record and return matching rows.

If the string contains more than one word, the query will be an AND query on each tokenized word.

It is possible to perform a greedy query by adding a wildcard * at the end of a word.

Filter functions

Filter functions are built-in functions that can be used in a where clause.

There are 3 filter functions:

Distance function

Distance function examples

DISTANCE(field_name, GEOM'<geometry>', 1 km)
DISTANCE(field_name, GEOM'<geometry>', 100 yd)

The distance function limits the result set to a geographical area defined by a circle. This circle must be defined by its center and a distance.

Geometry function

Geometry function examples

GEOMETRY(field_name, GEOM'<geometry>', INTERSECT)
GEOMETRY(field_name, GEOM'<geometry>', DISJOINT)
GEOMETRY(field_name, GEOM'<geometry>', WITHIN)

The geometry function limits the result set to a geographical area defined by a polygon.

This polygon must be defined with both:

Bbox function

Bbox function example

BBOX(field_name, GEOM'<geometry>', GEOM'<geometry>')

The bbox function limits the result set to a rectangular box.

This rectangular box is defined by its top left and its bottom right coordinates, both expressed with 2 geometry literals.

Comparison filter

3 kinds of comparison filter can be used in a where clause:

Text comparison filter

Operators Description
:,= Perform an exact query (not tokenized and not normalized) on the specified field

Numeric comparison filter

Operators Description
:,= Match a numeric value
>,<,>=,<= Return results whose field values are larger, smaller, larger or equal, smaller or equal to the given value

Date comparison filter

Operators Description
:,= Match a date value
>,<,>=,<= Return results whose field date are after or before the given value.

Filter expression

3 kinds of filter expression can be used in a where clause:

Boolean field filter

Example of a boolean field filter

my_boolean_field          # Filters results where boolean_field is true
my_boolean_field is false # Filters results where boolean_field is false

A boolean field filter takes a boolean field and restricts results only if the boolean value is true.

There are 2 ways of creating a filter expression:

Format:

in which <field_literal> must be a valid boolean field

Like filter

Example of a like expression

film_name like "star"      # matches `star wars` and `Star Trek`
film_name like "star wars" # match fields containing `star` and `wars`

A like filter restricts results to field literal values containing a defined string literal.

Format:

<field_literal> LIKE <string_literal>

In filter

In filter on numeric

numeric_field IN [1..10] # Filter results such as 1 <= numeric_field <= 10
numeric_field IN ]1..10[ # Filter results such as 1 < numeric_field < 10
numeric_field: [1..10]   # Use `:` instead of `IN` operator

In filter on date

date_field IN [date'2017'..date'2018'] # Filter results such as date_field date is between year 2017 and 2018

In filters results are based on a numeric or a date range. In filters only work on a field literal.

Format:

Group by clause

Simple group by expression with label

group_by=my_field as myfield

multiple group by expressions with label

group_by=my_field1,my_field2 as my_field

The group by clause can be used in aggregations of the search API as the parameter group_by.

The group by clause creates groups from data depending on a group by expression.

A group by clause can contain:

Like selects, a group by expression can have an AS statement to give it a label.

A group by expression can be:

Group by field

Simple group by field

group_by=my_field

A group by field expression allows to group specified field values. It creates a group for each different field value.

Format:

group_by=<field_literal>

Group by static range

group by static range examples

RANGE(population, ]10, 50, 100[) # Creates 4 groups: *-10, 10-50, 50-100 and 100-*
RANGE(population, [20.5[)        # Creates 1 group: 20.5-*
RANGE(population, [1,2,3])       # Creates 2 groups: 1-2 and 2-3

The static range function takes 2 parameters:

The side of the brackets determines if the values lower than the lower bound and higher than the higher bound should be grouped together or ignored.

Format:

group_by=range(<field_literal>, [|] <numeric_literal> [,<numeric_literal>]* [|]) in which <field_literal> must be a numeric field

Group by equi range

group by equi range examples

RANGE(population, EQUI(5, 10, 30))  # 5 is step value. 10 is the lower bound and 30 The higher bound.

Creates the following group: markdown - *-10 - 10-15 - 15-20 - 20-25 - 25-30 - 30-*

An equi range function can be used in a static range function replacing the static range parameter.

The equi range function takes 4 parameters:

The equi range function creates a group for the lower bound. It then creates another group at each step, adding the step value from the previous value until the higher bound is reached.

Format:

group_by=range(<field_literal>, EQUI(<numeric_literal>[,<numeric_literal>]*)) in which <field_literal> must be a numeric field

Group by date functions

group by date examples

year(date_field) # Create a group for each different years in date_field values
hour(date_field) # Create a group for each different hours in date_field values

Group by date functions allow to group data on a date field by a specific unit of time.

Function name Description
year Groups by year
month Groups by month
day Groups by day
hour Groups by hour
minute Groups by minute
second Groups by second
millisecond Groups by millisecond
Format:

group_by=<date_function>(<field_literal>) in which <field_literal> must be a datetime field

Group by date format

group by date format examples

date_format(date_field, "YYYY-MM-dd'T'HH") # Creates a group for each minutes in date_field and returning date with an pseudo ISO 8061 format
date_format(date_field, "w") # Create a group for each different week in date_field

A group by date format expression allows to group by a custom date format.

A date format is a string enclosed in double-quotes. Every character between a-z and A-Z is considered to be a pattern representing a date unit. In order to use these characters as simple characters and not pattern, they must be enclosed in single-quotes.

The formats below are available for a date format expression. They come from joda time documentation.

Symbol Meaning Presentation Examples
G era text AD
C century of era (>=0) number 20
Y year of era (>=0) year 1996
x weekyear year 1996
w week of weekyear number 27
e day of week number 2
E day of week text Tuesday; Tue
y year year 1996
D day of year number 189
M month of year month July; Jul; 07
d day of month number 10
a halfday of day text PM
K hour of halfday (0~11) number 0
h clockhour of halfday (1~12) number 12
H hour of day (0~23) number 0
k clockhour of day (1~24) number 24
m minute of hour number 30
s second of minute number 55
S fraction of second number 978
z time zone text Pacific Standard Time; PST
Z time zone offset/id zone -0800; -08:00; America/Los_Angeles
escape for text delimiter
single quote literal

The count of pattern letters determine the format.

Format:

group_by=date_format(<string_literal>) in which <string_literal> contains a date format