DX Graph Overview

Conscia's DX Graph™ allows you to create a modernization layer on top of your legacy systems so that your content can be made available to modern experiences via APIs from a centralized repository. Conscia syncs data and content from external repositories to create an experience graph that provides a 360-degree view of all business entities involved in building a customer experience.

Use Cases

The DX Graph can be used for the following use cases:

  • Content Graph - A unified view of content from multiple CMSs
  • Customer Identity Graph
  • Master Data Management - A connected view of data from multiple source systems
  • Product Information Management (PIM)

Having both customer data and content in one connected graph allows digital teams to build personalized content experiences.

Elements of the DX Graph

Many elements support the complex nature of the DX Graph. These elements work together to create the combined graph structure. We'll break them down to better understand the graph and its usage:


Data Collections

In DX Graph, data resides in Data Collections. A Collection contains records of a certain user-defined type, such as customers, products, stores, or categories.

Collections can be used to manage and/or analyze various types of data such as:

  • Master data (e.g. Customers, Stores, Products, Promotions)
  • Lookup data (e.g. Customer Types, Regions, Countries, Colours). When editing a record in another Collection via DX Studio, a drop-down list may be presented that is populated by the data records in a Lookup Collection.
  • Categorization data (e.g. Customer Audiences/Segments, Product Categories). Categorization data can be configured to be hierarchical (i.e. records may have a parent record).

Collections may also contain data that falls under more than one of the above types.
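
As an illustration of hierarchical categorization data, the sketch below models Product Category records in which each record may point to a parent record. The field names (`name`, `parentId`) and the helper `category_path` are hypothetical and are not taken from DX Graph's schema.

```python
from typing import Optional

# Hypothetical category records keyed by identifier; field names are assumptions.
categories = {
    "apparel": {"name": "Apparel", "parentId": None},
    "shoes": {"name": "Shoes", "parentId": "apparel"},
    "sneakers": {"name": "Sneakers", "parentId": "shoes"},
}

def category_path(record_id: str) -> list:
    """Walk parent links from a record up to the root and return names root-first."""
    path = []
    seen = set()
    current: Optional[str] = record_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        record = categories[current]
        path.append(record["name"])
        current = record["parentId"]
    return list(reversed(path))

print(" > ".join(category_path("sneakers")))  # Apparel > Shoes > Sneakers
```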

Schemas

Each Collection has a schema that allows for an extremely flexible record layout. A field can be of any type and structure.

Relationships

Collections may have relationships between them. There are three types of relationships:

1-to-1

A 1-to-1 relationship is where a single value on a record must be the same as another record's Data Record Identifier (i.e. unique id).

1-to-Many

A 1-to-many relationship is where each value in an array of values on a record must be the same as another record's Data Record Identifier.
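
The two rules above can be sketched as simple membership checks. This is a minimal illustration that assumes in-memory collections keyed by Data Record Identifier; the collection contents and the function names `valid_one_to_one` and `valid_one_to_many` are hypothetical.

```python
# Hypothetical collections keyed by Data Record Identifier.
brands = {"b1": {"name": "Acme"}}
categories = {"c1": {"name": "Shoes"}, "c2": {"name": "Sale"}}

# A hypothetical product record with one single-valued and one array-valued field.
product = {"id": "p1", "brand": "b1", "categories": ["c1", "c2"]}

def valid_one_to_one(value, target):
    """1-to-1: the single value must equal an identifier in the target collection."""
    return value in target

def valid_one_to_many(values, target):
    """1-to-many: every value in the array must equal an identifier in the target."""
    return all(v in target for v in values)

print(valid_one_to_one(product["brand"], brands))            # True
print(valid_one_to_many(product["categories"], categories))  # True
print(valid_one_to_many(["c1", "c9"], categories))           # False
```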

Dynamic

Dynamic Relationships are used when neither 1-to-1 nor 1-to-many suffices.

Shadow Fields

A field can be designated as a Shadow Field, which is a field whose value is calculated using an expression that references other fields from the same Collection and/or related Collections. The expression can be either a JavaScript expression or a JSONata expression. In addition to the expression, the referenced relationship(s) and field(s) need to be specified so that the DX Graph's query engine knows how to calculate the Shadow Field.

Shadow Field expressions cannot reference other Shadow Fields.

For a Shadow Field to be used for searching and filtering, it must be materialized. Materialization is the process of persisting the Shadow Field values onto the records so that they do not need to be calculated at query time. When querying a Collection, you can specify whether Shadow Fields are calculated (to see the freshest data) or whether the materialized Shadow Field values are used.
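
The difference between calculated and materialized values can be sketched as follows. This is an illustration only: the Python lambda stands in for a JavaScript or JSONata expression, and the record layout, the `_materialized` key, and the `materialize` and `query` helpers are hypothetical rather than part of the DX Graph API.

```python
# Hypothetical related collection and records.
brands = {"b1": {"name": "Acme"}}
products = [
    {"id": "p1", "title": "Runner", "brand": "b1"},
    {"id": "p2", "title": "Trail", "brand": "b1"},
]

# A shadow field derived from a field on the record and a field on the related brand.
shadow_fields = {
    "displayName": lambda rec: f'{brands[rec["brand"]]["name"]} {rec["title"]}',
}

def materialize(records):
    """Persist shadow field values onto each record so they need not be recalculated."""
    for rec in records:
        rec["_materialized"] = {name: expr(rec) for name, expr in shadow_fields.items()}

def query(records, use_materialized):
    """Return shadow values either from the persisted copy or freshly calculated."""
    if use_materialized:
        return [rec["_materialized"]["displayName"] for rec in records]
    return [shadow_fields["displayName"](rec) for rec in records]

materialize(products)
brands["b1"]["name"] = "Acme Co"  # the related record changes after materialization

print(query(products, use_materialized=True))   # ['Acme Runner', 'Acme Trail']
print(query(products, use_materialized=False))  # ['Acme Co Runner', 'Acme Co Trail']
```

The stale values in the first result show why calculating at query time gives the freshest data, at the cost of evaluating the expression on every query.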

Snapshots

At any given time, a Snapshot can be taken of a Collection. A Snapshot takes every record in a Data Collection and exports it to a line-delimited JSON file in the Snapshots Data Bucket.
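
For readers unfamiliar with line-delimited JSON, the sketch below writes one JSON document per line and reads the result back. It assumes nothing about how DX Graph produces its files: the records are hypothetical and an in-memory buffer stands in for a file in the Snapshots Data Bucket.

```python
import io
import json

def write_snapshot(records, out):
    """Write one JSON document per line (line-delimited JSON)."""
    for record in records:
        out.write(json.dumps(record, sort_keys=True) + "\n")

def read_snapshot(source):
    """Parse a line-delimited JSON stream back into a list of records."""
    return [json.loads(line) for line in source if line.strip()]

records = [{"id": "p1", "title": "Runner"}, {"id": "p2", "title": "Trail"}]
buffer = io.StringIO()  # stands in for a snapshot file
write_snapshot(records, buffer)
buffer.seek(0)
print(read_snapshot(buffer) == records)  # True
```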

Intel

If an Intel Data Table has been created in the Snapshots Data Bucket (which makes it a Snapshot Data Table), a periodic job will sync the latest snapshot file into the Data Table. This is useful when the data from a Collection is required for Computed Values.

Relationships between Data are managed within the Admin section of Control Center:

In this example, we look at a Product collection and its different relationship configurations. Brand is an attribute on the product and is therefore set up as a one-to-one (1:1) mapping between the product and the brand.

A product can belong to many categories, which a 1-to-Many Lookup supports. The product above is given both a lookup collection and the field on that collection to display. Multiple category values will appear on the product using either the one-line template or the large template, depending on the page rendering it.

A product showing the multiple categories it belongs to.

Dynamic relationships provide the flexibility to bridge separate sources of information based on common attributes. In the above illustration, Products are being connected to SKUs through their common key (master_key).

As a real-world example, dynamic relationships can interconnect a multitude of data elements to form a graph-like structure. Starting from one element, users can see all the connected elements, drill down through them, and connect to other elements in turn. In the illustration above, the left portion showcases all the different elements connected to this movie. Clicking on one of them (e.g. the Live Action genre) will show the set of all movies connected with that genre.

Relationships

Without the ability to connect entities to each other, a "Graph" would be meaningless. The data within a Data Collection may have relationships between them. There are three types of relationships in Conscia:

  1. 1-to-1 Lookup Relationship
  2. 1-to-Many Lookup Relationship
  3. Dynamic Relationships

Lookup Relationships

Lookup Relationships map a Data Record in a Data Collection to either one (One-to-One Lookup) or more (One-to-Many Lookup) Data Records in another Data Collection. When a Lookup Relationship is defined for a Data Collection, the following must be specified:

  • Lookup Data Collection - The Data Collection from which users can select one or more records
  • One-to-One or One-to-Many - If One-to-One, users can select, at most, one Data Record. If One-to-Many, users can select one or more Data Records.
  • Display Field - The field in the Lookup Data Collection whose value will be displayed in the Studio Admin (instead of showing the underlying identifier of the Data Record)
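
The role of the Display Field can be sketched as a lookup from stored identifiers to a human-readable value. The `lookup` dictionary, its keys, and the `display_values` helper are hypothetical names chosen to mirror the three settings listed above; they are not DX Graph configuration properties.

```python
# Hypothetical Lookup Data Collection keyed by Data Record Identifier.
categories = {"c1": {"name": "Shoes"}, "c2": {"name": "Sale"}}

# Hypothetical relationship settings: target collection, cardinality, display field.
lookup = {"collection": categories, "cardinality": "one-to-many", "displayField": "name"}

def display_values(stored, lookup):
    """Map stored identifier(s) to the Display Field value of each looked-up record."""
    ids = [stored] if lookup["cardinality"] == "one-to-one" else list(stored)
    return [lookup["collection"][i][lookup["displayField"]] for i in ids]

print(display_values(["c1", "c2"], lookup))  # ['Shoes', 'Sale']
```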

Dynamic Relationships

Dynamic Relationships are defined using a Join Expression that relates the Data Records in one Data Collection to those that reside in another Data Collection.

  • A Join Expression can be as simple as relating IDs in one Data Collection to the same IDs in another Data Collection.
  • A more complex Join Expression could relate Data Records in one Data Collection to Data Records in another Data Collection that have the same first 4 characters in a specific field (e.g. name field) while satisfying a filter (e.g. age > 21).
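
The more complex case above can be sketched in plain Python. This is not Join Expression syntax, which the text does not show; the records, the `join` helper, and the choice to apply the age filter to the target records are all assumptions made for illustration.

```python
# Hypothetical source and target Data Records.
source = [{"id": "s1", "name": "Johnson"}, {"id": "s2", "name": "Smith"}]
target = [
    {"id": "t1", "name": "Johnathan", "age": 34},
    {"id": "t2", "name": "Johnny", "age": 19},
    {"id": "t3", "name": "Smithers", "age": 45},
]

def join(source, target):
    """Relate records whose names share the first 4 characters, keeping targets with age > 21."""
    matches = {}
    for s in source:
        matches[s["id"]] = [
            t["id"]
            for t in target
            if t["name"][:4] == s["name"][:4] and t["age"] > 21
        ]
    return matches

print(join(source, target))  # {'s1': ['t1'], 's2': ['t3']}
```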

When a Dynamic Relationship is defined for a Data Collection, the following must be specified:

  • Relationship Code - The unique identifier for this relationship on the Data Collection. It must be unique within a single Data Collection.
  • Target Data Collection - The Data Collection this relationship points to. The Source Data Collection is the Data Collection that the relationship is being defined on.

Intel Time Partitions

Every Data Table will typically have a timestamp or date field that identifies when each record (i.e. event/transaction) occurred. That field is referred to as the timestampField. Records in a Data Table are processed (made queryable) by putting them into a Time Partition. The size of a Time Partition is specified when creating a Data Table using the partitionGranularity property. It can be YEAR, MONTH, DAY or HOUR.

An example of creating a Data Table is as follows:

PUT https://io.conscia.ai/vue/_api/v1/data-buckets/{{dataBucketCode}}/data-tables/{{dataTableCode}}
Authorization: Bearer {{apiKey}}
content-type: application/json
X-Customer-Code: {{customerCode}}

{
  "name": "{{dataTableName}}",
  "dataTableConfig": {
    "schema": [
      {"fieldName": "CRM_Order_Object_GUID", "incomingDatatype": "varchar", "processedDatatype": "varchar"},
      {"fieldName": "Fiscal_Year_Variant", "incomingDatatype": "varchar", "processedDatatype": "varchar"},
      {"fieldName": "Store", "incomingDatatype": "varchar", "processedDatatype": "varchar"},
      {"fieldName": "District", "incomingDatatype": "varchar", "processedDatatype": "varchar"},
      {"fieldName": "Article", "incomingDatatype": "varchar", "processedDatatype": "varchar"},
      {"fieldName": "POS_Transaction_Date", "incomingDatatype": "varchar", "processedDatatype": "date", "expression": "DATE( substr(pos_transaction_date,1,4) || '-' || substr(pos_transaction_date,5,2) || '-' || substr(pos_transaction_date,7,2) )"},
      {"fieldName": "Region", "incomingDatatype": "varchar", "processedDatatype": "varchar"},
      {"fieldName": "Site_Group", "incomingDatatype": "varchar", "processedDatatype": "varchar"},
      {"fieldName": "Product_Quantity", "incomingDatatype": "varchar", "processedDatatype": "double"},
      {"fieldName": "Unit_of_Measure", "incomingDatatype": "varchar", "processedDatatype": "varchar"},
      {"fieldName": "Total_Points", "incomingDatatype": "varchar", "processedDatatype": "double"},
      {"fieldName": "Net_Amount", "incomingDatatype": "varchar", "processedDatatype": "double"},
      {"fieldName": "process_day", "incomingDatatype": "varchar", "processedDatatype": "varchar"}
    ],
    "format": "CSV",
    "csv_escape": "\\",
    "csv_quote": "\"",
    "csv_separator": ",",
    "skip_header_line_count": 1,
    "partition": ["bucket(POS_Transaction_Number, 10)"],
    "bucketCount": 50,
    "timestampField": "POS_Transaction_Date"
  },
  "partitionGranularity": "MONTH"
}
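
The expression on the POS_Transaction_Date field rebuilds a date from three substrings of the incoming value. A rough Python equivalent is sketched below, assuming the incoming value is an eight-character YYYYMMDD string (which the substring offsets suggest but the text does not state). The function name is hypothetical.

```python
from datetime import date

def parse_pos_transaction_date(raw: str) -> date:
    """Rebuild YYYY-MM-DD from a YYYYMMDD string, mirroring the schema expression."""
    # substr(x, 1, 4), substr(x, 5, 2), substr(x, 7, 2) are 1-based; slices are 0-based.
    iso = raw[0:4] + "-" + raw[4:6] + "-" + raw[6:8]
    return date.fromisoformat(iso)

print(parse_pos_transaction_date("20220315"))  # 2022-03-15
```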

Processing a Data Table is performed using the following API call:

POST https://io.conscia.ai/vue/_api/v1/data-buckets/{{dataBucketCode}}/data-tables/{{dataTableCode}}/incoming/_processDataRecords
Authorization: Bearer {{apiKey}}
content-type: application/json
X-Customer-Code: {{customerCode}}

{
  "filenamePattern": "%",
  "partition": {
    "year": 2022
  }
}

This API call will:

  • Scan all the records in the uploaded data files that match filenamePattern (% means all files).
  • Put each record into the time partition based on the record's timestamp field. In the above example, every record with a timestamp field falling in 2022 will be put into its corresponding month time partition (i.e. 01-2022 through 12-2022).
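
The partition assignment described above can be sketched as grouping records by a label derived from the timestamp field. This is an illustration of the idea, not the service's implementation: the `assign_partitions` helper and the label formats for YEAR, DAY and HOUR are assumptions (only the MM-YYYY month form appears in the text), and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Label format per granularity; only the MONTH form (e.g. 01-2022) comes from the text.
FORMATS = {"YEAR": "%Y", "MONTH": "%m-%Y", "DAY": "%d-%m-%Y", "HOUR": "%H-%d-%m-%Y"}

def assign_partitions(records, timestamp_field, granularity, year):
    """Group records from the requested year into time partitions."""
    partitions = defaultdict(list)
    for rec in records:
        ts = rec[timestamp_field]
        if ts.year != year:
            continue  # outside the year requested in the processing call
        partitions[ts.strftime(FORMATS[granularity])].append(rec)
    return dict(partitions)

records = [
    {"POS_Transaction_Date": datetime(2022, 1, 15), "Net_Amount": 10.0},
    {"POS_Transaction_Date": datetime(2022, 1, 20), "Net_Amount": 5.0},
    {"POS_Transaction_Date": datetime(2022, 12, 1), "Net_Amount": 7.5},
    {"POS_Transaction_Date": datetime(2021, 6, 1), "Net_Amount": 3.0},
]
result = assign_partitions(records, "POS_Transaction_Date", "MONTH", 2022)
print({key: len(rows) for key, rows in result.items()})  # {'01-2022': 2, '12-2022': 1}
```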