Data Enrichment

The Conscia platform empowers business users to access, improve, categorize, enhance and manage data and content from an intuitive user interface. With a myriad of transformation and machine learning features available, users can enrich their data to address a variety of use cases. Control Center supports a growing number of data enrichment features designed to aid product and content managers in managing content as well as improve the findability and browsability of products and content in end user experiences.

These include:

Data Quality Validation

Create Inspector cards to identify gaps, inaccuracies and inconsistencies in your data, enabling analysts and content managers to reduce manual processes and improve overall data quality.
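As a rough illustration of the kind of check an Inspector card might surface, the sketch below counts gaps and flags inconsistent spellings in a single column. It is not Conscia's implementation: the function name `inspect_column`, the record layout (a list of dictionaries), and the choice of checks are all assumptions made for this example.

```python
from collections import defaultdict


def inspect_column(records, column):
    """Report missing values and inconsistent spellings for one column.

    Hypothetical helper for illustration only; not part of the Conscia API.
    """
    missing = 0
    variants = defaultdict(set)  # normalized value -> raw spellings seen
    for record in records:
        value = record.get(column)
        if value is None or str(value).strip() == "":
            missing += 1  # gap: the field is absent, None, or blank
            continue
        raw = str(value)
        variants[raw.strip().lower()].add(raw)
    # A value is inconsistent when it appears under more than one raw spelling.
    inconsistent = {
        key: sorted(spellings)
        for key, spellings in variants.items()
        if len(spellings) > 1
    }
    return {"missing": missing, "inconsistent": inconsistent}
```

Calling `inspect_column(records, "brand")` on a product collection would return the number of records with no brand and a map of brands that appear with differing case or stray whitespace.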

AI to Categorize Unstructured Data Automatically

Large volumes of incoming content make manual curation of categories impractical. To facilitate auto-categorization, the Conscia platform supports creating a machine learning model and then applying that model to incoming content records.

Creating the Categorization Model

  1. Select the “Learn Categorization” feature from the context menu.

  2. In the Learn Categorization modal, enter the following information:

    • Category Column - the column containing the category value to learn from
    • Input Columns - the series of columns that serve as input to the categorization model
    • Model File Folder - storage for the categorization model
    • Model Name - name of the categorization model
  3. Submit the categorization model.

Applying the Categorization Model

  1. Select the “Auto Categorize” feature from the context menu.

  2. In the auto-categorization modal, enter the following information:

    - Input Columns - the series of input columns to evaluate. These should map to the same columns used in building the categorization model.
    - Category Output Column - the name of the new column to create
    - Model File Folder - the location of the previously created categorization model
    - Model Name - the name of the previously created categorization model
  3. Hit Submit.

  4. View the categorization values in the newly created column.
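The learn-then-apply flow above can be sketched in a few lines of Python. The platform does not document which algorithm it uses, so this example substitutes a small multinomial naive Bayes classifier purely for illustration. The function names `learn_categorization` and `auto_categorize` are hypothetical, chosen to mirror the modal fields (Category Column, Input Columns, Category Output Column); saving the model under a Model File Folder and Model Name is omitted.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer."""
    return text.lower().split()


def learn_categorization(records, category_column, input_columns):
    """Build word statistics per category from already-categorized records."""
    word_counts = defaultdict(Counter)  # category -> word frequencies
    category_counts = Counter()         # category -> number of records
    vocabulary = set()
    for record in records:
        category = record[category_column]
        category_counts[category] += 1
        for column in input_columns:
            for word in tokenize(record.get(column, "")):
                word_counts[category][word] += 1
                vocabulary.add(word)
    return {
        "word_counts": word_counts,
        "category_counts": category_counts,
        "vocabulary": vocabulary,
    }


def auto_categorize(model, records, input_columns, output_column):
    """Write the most likely category for each record into output_column."""
    total = sum(model["category_counts"].values())
    vocab_size = len(model["vocabulary"])
    for record in records:
        words = [w for c in input_columns for w in tokenize(record.get(c, ""))]
        best_category, best_score = None, -math.inf
        for category, count in model["category_counts"].items():
            counts = model["word_counts"][category]
            denominator = sum(counts.values()) + vocab_size
            score = math.log(count / total)  # log prior
            for word in words:
                # Laplace smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_category, best_score = category, score
        record[output_column] = best_category
    return records
```

The sketch also shows why the Input Columns used when applying the model should match those used when building it: the model only holds word statistics for the columns it was trained on.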

Criteria-based Categorization of Data Records

Assign categories to data records based on business rules. Use metadata that is descriptive (e.g. title, author, keywords), structural (e.g. how information is put together on a page) or administrative (e.g. how and when content is created or deleted) to improve navigation, search and recommendations. This is covered in further detail in the Taxonomy Section.
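A minimal sketch of rule-based categorization follows, assuming each rule pairs a category with a set of per-field criteria and that the first matching rule wins. The rule format, the function name `categorize_by_rules`, and the field names are hypothetical; the Taxonomy Section describes how rules are actually defined in the platform.

```python
def categorize_by_rules(record, rules, default="Uncategorized"):
    """Return the category of the first rule whose criteria all match."""
    for category, criteria in rules:
        if all(check(record.get(field)) for field, check in criteria.items()):
            return category
    return default


# Example rules (assumed for illustration): one uses descriptive metadata
# (keywords), the other administrative metadata (creation date, ISO 8601).
rules = [
    ("Cooking", {"keywords": lambda v: v is not None and "recipe" in v}),
    ("Archive", {"created": lambda v: v is not None and v < "2015-01-01"}),
]
```

Because ISO 8601 date strings sort chronologically, the `created` criterion can compare them as plain strings.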

Analyze Images to Support Visual Search

Use image analysis to extract labels, colors, and text from images to improve the searchability of products and content.


Natural Language Processing (NLP)

Natural Language Processing (NLP) is a means to extract important aspects from textual information such as people, places, things, sentiment and more.

The following NLP content enrichment features are available:

NLP Feature | Description
NLP (Categories) | This feature extracts key concepts and stores them as a separate output column. This feature variant obtains its processing universe from the given data corpus, providing a much stricter context.
NLP (Key Concepts) | This feature extracts key concepts and stores them as a separate output column. This feature variant obtains its processing universe from the given data corpus, providing a much stricter context.
NLP (Keywords) | This feature extracts key concepts and stores them as a separate output column. This feature variant obtains its processing universe from the given data corpus, providing a much stricter context.
NLP (Entities) | This feature extracts key concepts and stores them as a separate output column. This feature variant obtains its processing universe from the given data corpus, providing a much stricter context.
Sentiment Analysis | Identifies and categorizes opinions expressed in a piece of text to determine whether the attitude towards a particular topic is positive, negative or neutral.

Data Normalization Features

When data is pulled from multiple sources or managed by siloed departments and teams, inconsistencies are common.

The following transformations are available in Control Center to standardize and normalize your data. They can also be operationalized as data preparation recipes that are applied on a schedule.

Enrichment Feature | Description
Change Case | Changes the case of values in the selected column and data set.
Concatenate Fields | Combines two columns into a new column with values separated by a user-specified delimiter.
Convert Array to String | Converts the values in an array to a delimited string.
Convert Date Field Format | Converts date values in a selected field to an alternative date format.
Convert Field to Number | Changes the type of a field by converting it from text to number.
Convert Field to Text | Converts the type of a field to text.
Remove Non-printable Characters | Removes any non-printable characters found within the values in a field.
Trim Whitespace | Removes whitespace from values in the selected column.
Find and Replace (Single) | Finds a specified pattern in the data set and replaces it with the user-defined value or pattern.
Find and Replace (Multiple) | Allows a user to reference a collection containing a series of find and replace values. When a value is found in the target column, it is replaced with the corresponding value from the lookup table.
Mass Apply (Single) | Overwrites a field value with a user-defined target value.
Mass Apply (Multiple) | Applies values to the target column. The user can choose either to overwrite the target column with the mass-applied values or to append the values to the end.
Mass Remove | Removes a value from a selected field.
Group Children Records | Aggregates child records under a parent record.
Extract Pattern | Takes in a regular expression (regex) pattern and outputs a new column containing the text that matches the user-defined pattern. Each match is separated by a pipe (|) in the new column.
Extract Unique Values | Extracts all unique values stored within a field. Output is stored as a separate collection.
Spell Check | Checks the spelling of words in a string column and suggests correct spellings.
Split Field (Single) | Splits the contents of a column on a given character and stores the result in a single new array column.
Split Field (Multiple) | Splits a value within the target field into multiple fields. The number of generated fields is based on the number of delimiters found.
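To make a few of these transformations concrete, the sketch below shows roughly equivalent logic in Python for a single field value. These are illustrative stand-ins rather than the platform's implementation, and the function names are invented for this example. Note that the non-printable check relies on Python's `str.isprintable`, which also treats tabs and newlines as non-printable; the platform's definition may differ.

```python
import re


def trim_whitespace(value):
    """Trim Whitespace: drop leading and trailing whitespace."""
    return value.strip()


def remove_non_printable(value):
    """Remove Non-printable Characters, per Python's str.isprintable."""
    return "".join(ch for ch in value if ch.isprintable())


def extract_pattern(value, pattern):
    """Extract Pattern: join every regex match with a pipe (|)."""
    return "|".join(match.group(0) for match in re.finditer(pattern, value))


def find_and_replace_multiple(value, lookup):
    """Find and Replace (Multiple): swap a value using a lookup table."""
    return lookup.get(value, value)


def split_field_single(value, delimiter):
    """Split Field (Single): split on a character into one array value."""
    return value.split(delimiter)
```

For example, `extract_pattern("Sizes: 10cm, 25cm and 3m", r"\d+(?:cm|m)")` yields `10cm|25cm|3m`, matching the pipe-separated output described for Extract Pattern.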

Accessing Enrichment Features

All content enrichment features in the Conscia platform are accessible through the action ribbon or through inline context menus. Once triggered, each enrichment feature presents the user with a guide to the feature and how to apply it.

Enrichments in the Action Ribbon

Depending on the nature of a content collection, the actions in the ribbon will vary. To access enrichment features that appear on the action ribbon:

  1. Click on the respective button in the action ribbon. Hovering over the button displays the feature name.

  2. In the modal that appears, enter the corresponding information for the feature. The left panel shows the name and description of the feature. The right panel lists the information required for the feature to run, which varies by enrichment type.

  3. Take note of the number of records the enrichment feature will be applied to. This is available in the bottom left corner of the modal.

  4. Hit Submit to apply the enrichment.

Enrichments in Inline Context Menus

Enrichments are also available in the grid and can be triggered from the inline context menus. To access the feature:

  1. Hover over the header icon of the column you want to enrich. The icon will change to an arrow.
  2. Left-click the arrow to bring up the context menu.
  3. Navigate to the desired enrichment and left-click to bring up the modal.