Skip to main content

Analyze Spreadsheets with Conscia DX Graph Collections

You can follow this tutorial that explains how to import two spreadsheets as Collections and then uses Conscia DX Graph to explore potential intersections between the two Collections. In this example, one spreadsheet contains information about individuals and another spreadsheet contains information about companies. Importing both spreadsheets into DX Graph Collections and comparing values in Fields of each Collection can help to identify relationships between records in the two Collections, specifically which individuals might be associated with what companies.

Tutorial Structure

This tutorial consists of the following steps:

  1. Prepare for DX Graph API Access: Choose a Conscia API endpoint, obtain an API Token, and configure your workstation to invoke DX Graph APIs.
  2. Prepare Files for Import to DX Graph: Evaluate and format files to import.
  3. Prepare Buckets: If Buckets don’t already exist, create Buckets for storing files.
  4. Prepare Collections: Define DX Graph Collections, Schemas, Relationships, Shadow Fields, Custom Views, and any other elements for use with these Collections.
  5. Define Relationships: Define Relationships beween the Collections, such as matching on a column value.
  6. Shadow Fields: Define and materialize any Shadow Fields.
  7. Custom Viewers: Implement any Custom Viwewers for the Collections.
  8. Upload and Files: Upload files into DX Graph.
  9. Import Records from Files: Import Records from files into DX Graph Collections.
  10. Update Navigation: Update the DX Graph UI navigation definition to include the Collections.
  11. Export: Export data from the Collections.

1. Prepare for DX Graph API Access

Choose a Conscia API endpoint, obtain an API Token, and configure your workstation to invoke DX Graph APIs.

This defines the $DXG_BASE and $DXG_HOST environment variables used by curl commands in this tutorial, as well as the ~/bin/.dxcurlrc file that contains the X-Customer-Code and Authorization HTTP headers used for Conscia Webservice API calls.

2. Prepare Files for Import to DX Graph

Prepare files for import into Conscia DX Graph.

You can import CSV files (Comma-Separated Values) into Conscia DX Graph as Collections. Often, you can save spreadsheets as CSV files.

For each spreadsheet or existing CSV file to import, open the file in a text editor. If the file uses semicolons or another delimiter rather than comma, add the following (or equivalent) as the first line in the file.

sep=;

Close the text editor and open the file in Microsoft Excel. Ensure that the header row contains column titles, or create column titles.

If there is no column in the spreadsheet that provides a unique identifier for each row, then create one. For example, insert column A, set the value of cell A1 to ID, set the value of cell A2 to 1, set the value of cell A3 to =a2+1, and drag the bottom right corner of cell A3 to the last row of data in the sheet.

Review the spreadsheet for errors. Save the file as a new CSV. Alternatively, specify the alternate delimiter when importing Records from the file.

3. Prepare Buckets

Create Buckets to manage files used to import Records into Conscia DX Graph Collections.

To import Records from a file into a Collection, you will need four Buckets:

  • inbound - Files uploaded.
  • processed - Records processed during an import process.
  • skipped - Records skipped during an import process.
  • invalid - Records determined invalid by an import process.

Delete a Bucket

To delete a Bucket. replace {bucket-code}:

curl -X DELETE -K ~/bin/.dxcurlrc $DXG_BASE/buckets/{bucket-code}

Create a Bucket

To create a Bucket, replace {bucket-code} and {bucket-name}:

curl -X PUT -K ~/bin/.dxcurlrc $DXG_BASE/buckets/{bucket-code} -H 'Content-Type: application/json' -d '{"dataBucketEntry": {"name": "{bucket-name}" }}'

4. Prepare Collections

Prepare two Collections, one for each of the files to import.

Before we can import the Records from a file into Conscia DX Graph, we need to create a Collection.

You can follow these instructions to prepare Collections to import CSV files into Conscia DX Graph.

The example curl commands use Connect to Conscia DX Graph APIs to manage Conscia Webservice API connection details.

To Delete a Collection

To delete a Collection, replace {collection-code}:

curl -K ~/bin/.dxcurlrc -X DELETE $DXG_BASE/collections/{collection-code}

To Create a Collection

To create a Collection, you must specify a key Field, which provides a unique identifier for Records in the Collection. For spreadsheets, this is the name of a column that uniquely identifies each row. For the tutorial example, this is ID.

You can use a curl command line such as the following to define a Collection with a schema, replacing {collection-code}, {collection-name}, {collection-description}, and {key-field}.

curl -X POST -K ~/bin/.dxcurlrc https://io-staging.conscia.ai/vue/_api/v1/collections -H 'Content-Type: application/json' -d '{"collectionCode": "{collection-code}", "name": "{collection-name}", "description": "{collection-description}", "dataRecordIdentifierProperty": "{key-field}"}'

To Define a Schema for a Collection

You can define a Schema for the Records in a Collection.

curl -X POST -K ~/bin/.dxcurlrc 'https://io-staging.conscia.ai/vue/_api/v1/collections/{collection-code}/schema' -H 'Content-Type: application/json' -d '{
"fields": {
//TODO: additional fields
"{key-field}": {
"jsonSchema": { "type": "number" } // or string
}
}
}'

5. Define Relationships

You can define relationships between the Collections.

For this tutorial, the individuals Collection defines a Dynamic Relationship with the companies Collection using the current_company Field in the individuals Collection and the Company Field in the companies Collection.

In the DX Graph UI:

  1. In the top navigation, click Settings, and then click Data Model. The Data Collections page appears.
  2. In the list of collections, right-click on the Collection that will contain the Relationship (for this tutorial, the individuals Collection), then click Edit, and then click Edit Relationships. The Edit Data Collection Relationships wizard appears.
  3. From the drop-down at the top right, select Dynamic Relationships.
  4. Enter a unique Relationship Key, such as individuals-companies.
  5. Expand Relationship Definition and enter a Display Name such as Individuals and Companies for the relationship.
  6. For Data Repository, select Master Content.
  7. For Related Data Collection, select the Collection with which to define the Relationship (for this tutorial, the companies Collection).
  8. For Display Field, select a Field from the Collection with which to define the Relationship. The value of this field will appear in the Collection that contains the Relationship.
  9. For One-Line Template, enter a Handlebars template for how to display records from the Collection with which to define the Relationship. For example, to render the Company field from the companies Collection, enter:
{{Company}}
  1. For Join Expression, enter an expression that joins the two collections, where {{base}} represents a Record in the Collection that defines the Relationship and {{target}} represents the Record in the related Collection.
LEVENSHTEIN_MATCH({{target}}.Company, {{base}}.current_company, 2)

The LEVENSHTEIN_MATCH() function calculates a factor in the likelihood of two expressions matching. To adjust the sensitivity of term matching, you can modify the final argument, where higher values generate more potential matches.

6. Shadow Fields

Define and materialize any Shadow Fields.

Implement any Shadow Fields. This tutorial includes a Shadow Field in the individuals Collection that shows the first company Record associated with an individual Record through the relationship defined between those two Collections.

To define the Shadow Field, in the DX Graph UI:

  1. Click Settings, and then click Data Model. The Data Collections page appears.
  2. Right-click the collection, then click Edit, and then click Edit Schema. The Edit Data Collection Schema wizard appears.
  3. Drag a text field onto the schema and set its Name to Company.
  4. Select the Shadow Field Checkbox.
  5. Set Shadow Field Expression. For this tutorial, the Shadow Field will show the first Company (entities[0]) in the companies Collection associated with the individual record (base) through the individuals-companies Relationship.
{{base}}.relationships['linkedin-buildwith'].entities[0].values.Company
  1. Click Submit.

To materialize shadow fields, replace {customer-code}, {collection-code}, and {repository-code}, which is generally {content-master}

curl -K ~/bin/.dxcurlrc -X PATCH $DXG_HOST/drs/_api/v1/customers/{customer-code}/dataRepositories/{repository-code}/dataCollections/{collection-code}/dataRecords/_materialize

7. Custom Views

For this tutorial, you can use a Custom View to present data from both the individuals Collection and the companies Collection when a Conscia user selects a row in the individuals Collection.

For this tutorial, the Record in the individuals Collection contains a Dynamic Relationship with the companies Collection. Therefore, there can be zero or more company Records associated with an individual. The following Custom View for the individuals Collection iterates those company Records.

curl -X PUT -K ~/bin/.dxcurlrc -H 'Content-type: application/json' -d '{ "template": "<table border=\"0\"><tr><td valign=\"top\" width=\"300\">{{#if selectedDataRecord.avatar}}<img height=\"40px\" width=\"40px\" src=\"{{selectedDataRecord.avatar}}\" />{{/if}}{{selectedDataRecord.full_name}}<br />{{#if selectedDataRecord.current_company_position}}<i>{selectedDataRecord.current_company_position}}</i><br />{{/if}}{{#if selectedDataRecord.headline}}<i>{{selectedDataRecord.headline}}</i><br />{{/if}}{{#if selectedDataRecord.current_company}}<strong>{{selectedDataRecord.current_company}}</strong><br />{{/if}}{{#if selectedDataRecord.location_name}}{{selectedDataRecord.location_name}}{{/if}}</td><td valign=\"top\">{{#if selectedDataRecordRelatedValues.linkedin-buildwith}}BuiltWith: <ul>{{#each selectedDataRecordRelatedValues.linkedin-buildwith}}<li><strong>{{this.Company}}</strong></li>{{/each}}</ul>{{/if}}</td></tr></table>" }' $DXG_BASE/collections/linkedin/recordTemplate

8. Upload Files

You can use the DX Graph UI to upload files, or you can use the Conscia Webservice API.

To upload files through the DX Graph UI:

  1. In the top navigation, click Files. The Buckets and Files page appears.
  2. Select the Bucket that will receive the files.
  3. Drag files into the Upload area or click Browse to upload files.

To use the API to upload a file into a Bucket, replace {bucket-code} and {filename} in this curl command.

curl --X PUT -K ~/bin/.dxcurlrc $DXG_BASE/{bucket-code}/upload' --form 'file[]=@{filename}'

9. Import Records from Files

You can use the Conscia Webservice API to import Records from a file in a Bucket into a DX Graph Collection.

To use the Conscia Webservice API to import Records from a file in a Bucket into a DX Graph Collection, in the following curl command, replace:

  • {upload-bucket-code}: The code of the Bucket that contains the file(s).
  • {skipped-bucket-code}: The code of the Bucket to store data skipped during import.
  • {processed-bucket-code}: The code of the Bucket to store processed data.
  • {invalid-bucket-code}: The code of the Bucket to store invalid data.
  • {filename} - Import records from files with matching names.
  • {key-field} - Identifier for unique key Field in Collection.
curl -K ~/bin/.dxcurlrc -X POST $DXG_BASE/buckets/{upload_bucket-code}/files/_import -H 'Content-Type: application/json' -d '
{"skippedBucketCode": "{skipped-bucket-code}",
"processedBucketCode": "{processed-bucket-code}",
"invalidBucketCode": "{invalid-bucket-code}",
"filenamePattern": "{filename}", "skipInvalidRecords": false,
"recordIdentifierField": "{key-field}",
"collectionCode": "{collection-code}",
"parseOptions": {"format": "DELIMITED","delimiter": ",","quoteChar": "\"","escapeChar": "\"" }}'

10. Update Navigation

To define the DX Graph UI navigation, replace {title}, {label}, and {collection-code}, and add any additional collections. For more options, see:

curl -K ~/bin/.dxcurlrc -X PUT $DXG_BASE/applications/dx-graph/pages/source/_configureLeftNav -H 'Content-Type: application/json' -d '{
"navigationConfiguration": [
{
"title": "{title}",
"ordinal": 1,
"active": true,
"actions": [
//TODO: additional collections
{
"label": "{label}",
"dataRepositoryCode": "content-master",
"dataCollectionCode": "{collection-code}"
}
]
}
]
}'

For the Source view in the tutorial:

curl -K ~/bin/.dxcurlrc -X PUT $DXG_BASE/applications/dx-graph/pages/source/_configureLeftNav -H 'Content-Type: application/json' -d '{
"navigationConfiguration": [
{
"title": "Individuals and Companies",
"ordinal": 1,
"active": true,
"actions": [
{
"label": "Individuals",
"dataRepositoryCode": "content-master",
"dataCollectionCode": "individuals"
},
{
"label": "Companies",
"dataRepositoryCode": "content-master",
"dataCollectionCode": "companies"
}
]
}
]
}'

And for Master view:

curl -K ~/bin/.dxcurlrc -X PUT $DXG_BASE/applications/dx-graph/pages/master/_configureLeftNav -H 'Content-Type: application/json' -d '{
"navigationConfiguration": [
{
"title": "Individuals and Companies",
"ordinal": 1,
"active": true,
"actions": [
{
"label": "Individuals",
"dataRepositoryCode": "content-master",
"dataCollectionCode": "individuals"
}
]
}
]
}'

11. Export Collection

You can use the DX Graph user interface to export a Collection.

To export a collection, in the DX Graph user interface:

  1. Select the Collection to export.
  2. At the top right, click Email Data Collection.
  3. For Export File format, select the appropriate format, such as Comma Separated Values (csv).
  4. Click Submit. DX Graph emails the exported file to the email address associated with your account.