Exporting Collections

After data has been imported into a Collection and enriched or transformed, it can be exported to a file in a Data Bucket. The file will be in line-delimited JSON format. This mechanism is useful for exporting data to be consumed by another system, such as a search engine, database, or data warehouse.

DX Graph provides the Export Collection Job Type to export data from a Collection into a file in a Data Bucket.

Export Collection Job

The Export Collection job exports every record from a Collection that matches a specified filter (if provided) into a file in a Data Bucket. Any transformation or schema errors are uploaded to an errors file in the same Data Bucket. More details on error files are here. Exported files are in line-delimited JSON format.
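As a rough illustration of how a downstream system might consume an exported file, the following Python sketch reads line-delimited JSON one record at a time. It assumes the file has already been retrieved from the Data Bucket. The function name `read_jsonl` and the sample record fields are hypothetical and are not part of DX Graph.

```python
import io
import json
from typing import Iterator, TextIO


def read_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a line-delimited JSON stream."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        yield json.loads(line)


# Hypothetical sample content standing in for an exported file.
sample = io.StringIO('{"id": "1", "title": "Alien"}\n{"id": "2", "title": "Heat"}\n')
records = list(read_jsonl(sample))
```

Because each line is a complete JSON document, the file can be streamed record by record without loading the whole export into memory.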

Job Parameters

| Name | Parameter | Required | Description |
| --- | --- | --- | --- |
| Collection Code | collectionCode | Yes | The Collection to export from. |
| Target Data Bucket | targetBucketCode | Yes | The DX Graph Bucket that the Data File will be exported to. |
| Target Filename | filenamePattern | Yes | The name of the file that the data will be written to. You can use the placeholder {{timestamp}} to include the timestamp of the export request. Example: products_{{timestamp}}.jsonl would produce files named like products_20230514_131001.jsonl. See Filename Patterns for more information. |
| Filter | filter | No | A DX Graph filter that will be applied to the source records. If a filter is not provided, all records will be exported. |
| Record Layout Configuration | recordLayoutConfig | No | Defines which fields and relationships to return. If specified, the records are exported in the Expanded Record Format. You can see details here. If not specified, the records are exported as-is. |
| Record Limit | limit | No | The maximum number of records that should be exported. If a limit is not provided, all records will be exported. |
| Transformers | transformers | No | An array of transformations applied to each source record in the Collection. |

For examples of using filter, recordLayoutConfig, and limit, see Querying DX Graph Collections.
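To make the {{timestamp}} placeholder concrete, here is a small Python sketch that mimics the substitution locally. The timestamp layout `YYYYMMDD_HHMMSS` is inferred from the example filename in the Target Filename description, and `expand_filename_pattern` is a hypothetical helper, not a DX Graph function. The real expansion happens inside DX Graph and may differ in details such as time zone.

```python
from datetime import datetime


def expand_filename_pattern(pattern: str, requested_at: datetime) -> str:
    """Replace {{timestamp}} with the request time (layout is an assumption)."""
    # Assumed layout, inferred from the documented example
    # products_20230514_131001.jsonl
    stamp = requested_at.strftime("%Y%m%d_%H%M%S")
    return pattern.replace("{{timestamp}}", stamp)


name = expand_filename_pattern(
    "products_{{timestamp}}.jsonl",
    datetime(2023, 5, 14, 13, 10, 1),
)
print(name)  # products_20230514_131001.jsonl
```

A pattern without the placeholder, such as movies.jsonl, is returned unchanged by this sketch.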

info

When specifying transformers, keep in mind that the source record is in the Expanded Record Format. The transformers have access to Expanded Record Format functions.

POST {{engineUrl}}/job-types/exportCollection/_execute
Content-Type: application/json
X-Customer-Code: {{customerCode}}
Authorization: Bearer {{apiKey}}

{
  "params": {
    "customerCode": "{{customerCode}}",
    "collectionCode": "movie",
    "targetBucketCode": "processed",
    "filenamePattern": "movies.jsonl"
  }
}
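The same request could be assembled from a script. The Python sketch below builds it with the standard library only, using the endpoint path, headers, and parameter names from the request shown here. The function `build_export_request`, the example URL, and the credential values are placeholders for illustration. Passing `limit` as an integer is an assumption based on its description, and the shape of the response is not covered in this section, so the sketch stops short of sending the request or parsing a reply.

```python
import json
import urllib.request
from typing import Optional


def build_export_request(
    engine_url: str,
    customer_code: str,
    api_key: str,
    collection_code: str,
    target_bucket_code: str,
    filename_pattern: str,
    limit: Optional[int] = None,
) -> urllib.request.Request:
    """Build (but do not send) an Export Collection job request."""
    params = {
        "customerCode": customer_code,
        "collectionCode": collection_code,
        "targetBucketCode": target_bucket_code,
        "filenamePattern": filename_pattern,
    }
    if limit is not None:
        params["limit"] = limit  # optional; omitted means all records

    return urllib.request.Request(
        url=f"{engine_url}/job-types/exportCollection/_execute",
        data=json.dumps({"params": params}).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Customer-Code": customer_code,
            "Authorization": f"Bearer {api_key}",
        },
    )


# Placeholder values; substitute your own engine URL, customer code, and API key.
request = build_export_request(
    "https://engine.example.com", "acme", "secret-key",
    "movie", "processed", "movies_{{timestamp}}.jsonl", limit=100,
)
# To submit the job: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the payload before running an export against a real Collection.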