MCP Server for Technical Docs
This recipe defines an end-to-end architecture for making structured technical documentation—originally authored in Heretto CCMS—discoverable, contextual, and actionable by AI agents like ChatGPT, Perplexity, or custom assistants. It combines the strengths of Conscia’s Core Services, DX Graph, Hybrid Search, and DX Engine orchestration, all exposed through a unified Model Context Protocol (MCP) Server. The goal is to implement a real-time, agent-native retrieval flow—where content is indexed using both keyword and vector search, enriched at query time with graph relationships, ranked using business rules (and optionally LLMs), and delivered to agents in a structured, consumable format. This approach supports RAG techniques like Vector-RAG, Graph-RAG, and Tool-RAG, empowering LLMs to not just find content, but understand and reason over it—securely and at scale.
At the heart of this approach is the recognition that structured content is a prerequisite for intelligent, AI-driven retrieval. Formats like DITA XML are not just publishing frameworks—they encode meaning, relationships, topic types, and intent. That structure gives your content machine-understandable semantics: enabling RAG (Retrieval-Augmented Generation) techniques to identify whether a topic is a task, concept, or reference; to resolve prerequisites; and to follow RELATED or CONCEPT_OF links—all without training a model on your domain from scratch.
Why export content into DX Graph?
Exporting Heretto’s structured DITA topics into DX Graph turns static documentation into a live, multi-domain knowledge fabric that your search, chat, and commerce experiences can explore in real time.
- Cross-domain linking: Join docs to SKUs, support issues, personas, and any third-party data—well beyond Heretto’s own reltables.
- Graph queries at speed: DX Graph answers relationship-heavy queries in less than 50 ms; the Heretto Deploy API was never built for that load.
- Hybrid RAG ready: Blend keyword + vector search with on-the-fly graph enrichment, giving LLMs richer context.
- Delivery decoupled from authoring: Content teams keep writing in Heretto; delivery teams tune retrieval rules in DX Engine—no CMS changes.
- Governance & observability: Core-Services Jobs, Triggers, and audit logs provide retries, metrics, and policy enforcement.
- Easy downstream syndication: Same graph feed can power analytics, help centers, or partner portals via export jobs.
In short, Heretto remains the single source of truth, while DX Graph becomes the high-performance, relationship-aware runtime that unlocks personalization, RAG accuracy, and cross-system reuse.
1 Building‑Block Cheat‑Sheet
Layer | Product / Service | Purpose |
---|---|---|
Authoring | Heretto CCMS (DITA topics & maps) | Writers craft structured technical docs. |
Export | Heretto Deploy API | Streams JSON/DITA OT bundles on every Publish. |
Data Ops | Conscia Core Services | Jobs, Buckets, Triggers, Integration Patterns for ETL in/out of DX Graph. |
Graph | DX Graph | Unified KG: tech docs, products, relationships. |
Search | OpenSearch (keyword + vector) | Hybrid BM‑25 + dense‑vector retrieval. |
Flow | DX Engine discover.docs.rag | Orchestrates search → graph → ranking (+optional LLM). |
API Gateway | Universal MCP Server | Publishes capability discoverDocs; proxies to DX Engine. |
Agent / UI | ChatGPT, website chat, voice bot, etc. | Calls discoverDocs and renders enriched answers. |
2 Ingestion Pipeline (Core‑Services)
Step | Core‑Services Component | Action |
---|---|---|
A1 | Incoming Bucket | Nightly Heretto JSON dump (dita_*.json) lands in heretto-dumps (S3/Azure). |
A2 | importDataFiles Job | Cron 0 2 * * * parses each file → writes records to dita-topics collection in DX Graph. |
A3 | Transform Job (optional) | transformDataFiles normalises metadata, maps Heretto IDs to product SKUs. |
A4 | DX Graph Event | DataRecordCreated/Updated event emitted for each topic. |
A5 | Trigger ➜ callDxEngine Job | Trigger ditaUpdated calls DX Engine template dita-to-opensearch to: ① enrich with related entities ② embed body text ③ upsert doc in OpenSearch. |
A6 | Integration Pattern 2.1 | If external systems need change notifications, event stream can hit a webhook or Kafka topic. |
A7 | Scheduled Export (optional) | exportCollection → downstream lake / BI via processCollectionWithWebserviceEndpoint. |
Example Job & Trigger
// importDataFiles Job (runs nightly)
{
  "jobDefinitionCode": "dita-import",
  "jobType": "importDataFiles",
  "schedule": { "cron": "0 2 * * *" },
  "params": {
    "incomingBucketCode": "heretto-dumps",
    "processedBucketCode": "processed",
    "invalidBucketCode": "invalid",
    "filenamePattern": "dita_*.json",
    "recordIdentifierField": "topicId",
    "collectionCode": "dita-topics"
  }
}
// Trigger → DX Engine
{
  "eventType": "DataRecordUpdated",
  "triggerCode": "ditaUpdated",
  "criteria": "`event.data.dataCollectionCode === 'dita-topics'`",
  "job": {
    "jobType": "callDxEngine",
    "params": {
      "templateCode": "dita-to-opensearch",
      "context": "`event`"
    }
  }
}
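Example Topic Record (assumed shape)
Only topicId (the configured recordIdentifierField) and the fields used by later steps (title, topicType, body, and the related edges) are meaningful here; the exact shape of Heretto's export is an assumption for illustration, not a documented contract.
// illustrative record from a dita_*.json dump
{
  "topicId": "t_pump_reset",
  "title": "Resetting the hydraulic pump",
  "topicType": "Task",
  "body": "Before resetting the pump, …",
  "related": [
    { "edge": "PREREQUISITE", "target": "t_pump_safety" },
    { "edge": "RELATED", "target": "t_pump_troubleshooting" }
  ]
}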
3 Heretto Deploy API Cheat‑Sheet
Item | Detail |
---|---|
Base URL | https://deploy.<org-id>.heretto.com/api/v2/ |
Auth | X-API-Key: <key> (Simple Key) or Authorization: Bearer <token> |
Full Map | GET /deployments/{depId}/maps/{mapId}?locale=en-US&depth=all&format=json |
Single Topic | GET /deployments/{depId}/topics/{topicId}?locale=en-US&format=json |
Delta Feed | GET /deployments/{depId}/delta?modifiedSince=2025-07-26T00:00:00Z |
Caching | Responses are prerendered & server‑cached → safe for CDN. |
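A minimal Node.js sketch of pulling content through these endpoints, using built-in fetch (Node 18+); the org ID, deployment ID, and map ID are placeholders, and error handling is reduced to the essentials.
// Minimal Deploy API client sketch (Node 18+ built-in fetch).
// Replace <org-id> and the IDs below with real values; HERETTO_API_KEY holds a Simple Key.
const BASE = 'https://deploy.<org-id>.heretto.com/api/v2';
const HEADERS = { 'X-API-Key': process.env.HERETTO_API_KEY };

// Full map as JSON ("Full Map" row above).
async function fetchMap(depId, mapId) {
  const url = `${BASE}/deployments/${depId}/maps/${mapId}?locale=en-US&depth=all&format=json`;
  const res = await fetch(url, { headers: HEADERS });
  if (!res.ok) throw new Error(`Deploy API returned ${res.status}`);
  return res.json();
}

// Topics changed since the last sync ("Delta Feed" row above).
async function fetchDelta(depId, modifiedSince) {
  const url = `${BASE}/deployments/${depId}/delta?modifiedSince=${encodeURIComponent(modifiedSince)}`;
  const res = await fetch(url, { headers: HEADERS });
  if (!res.ok) throw new Error(`Deploy API returned ${res.status}`);
  return res.json();
}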
4 Runtime Discovery Algorithm
discoverDocs(q, k) ───► DX Flow discover.docs.rag
① hybridSearch(OpenSearch, q, k) // BM25 + vector
② for each hit ➜ DX Graph lookup(id) // RELATED, PREREQUISITE, etc.
③ scoring
base = 0.6*bm25 + 0.4*vector
+10 if topicType == 'Task'
+2 per RELATED edge 'Troubleshooting'
LLM‑rerank top 20 (optional)
④ return top N enriched docs
Business rules live in DX Engine; change weights or filters without code deploys.
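Read as code, the flow above amounts to the sketch below. This is a conceptual illustration, not the DX Engine implementation (that lives in the declarative components of Sections 7 and 8); hybridSearch, graphLookup, and llmRerank are caller-supplied stand-ins for components A, C1, and E.
// Conceptual sketch of discover.docs.rag; the production flow is DX Engine config.
// hybridSearch, graphLookup and llmRerank are adapters supplied by the caller.
async function discoverDocs(q, k, { hybridSearch, graphLookup, llmRerank, useLLM = false }) {
  // ① hybrid keyword + vector retrieval
  const hits = await hybridSearch(q, k);

  // ② per-hit graph enrichment (RELATED, PREREQUISITE, ...)
  for (const hit of hits) {
    hit.related = await graphLookup(hit.id);
  }

  // ③ scoring: 0.6*bm25 + 0.4*vector, +10 for Tasks, +2 per RELATED 'Troubleshooting' edge
  for (const hit of hits) {
    let boost = hit.topicType === 'Task' ? 10 : 0;
    boost += 2 * hit.related.filter(r => r.edge === 'Troubleshooting').length;
    hit.score = 0.6 * hit.bm25Score + 0.4 * hit.vectorScore + boost;
  }
  hits.sort((a, b) => b.score - a.score);

  // Optional LLM re-rank of the top 20, then ④ return the top N enriched docs.
  const ranked = useLLM ? await llmRerank(q, hits.slice(0, 20)) : hits;
  return ranked.slice(0, k);
}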
5 Architecture Diagram
6 End‑to‑End Sequence
# | Actor | Action |
---|---|---|
1 | Heretto Publish | Writers release a new map; Deploy API JSON lands in bucket. |
2 | Job dita-import | Parses file; upserts topics into DX Graph. |
3 | Graph Event → Trigger | ditaUpdated fires callDxEngine job. |
4 | DX Engine Flow | Enriches doc, embeds text, indexes in OpenSearch. |
5 | Agent | Calls discoverDocs({q, k}) via UMCP. |
6 | OpenSearch | Returns top k keyword+vector hits. |
7 | DX Graph Lookup | Graph‑augment step fetches relations per hit. |
8 | DX Engine Rules Engine | Applies rules / LLM re‑rank; returns enriched list. |
9 | Agent LLM | Crafts conversational answer & follow‑ups. |
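From the agent's side, steps 5 through 9 reduce to a single HTTP call against the MCP capability. A minimal sketch, assuming a placeholder UMCP host and a response shaped by the responseMap fields from Section 8:
// Call the discoverDocs capability on the Universal MCP Server (host name is a placeholder).
const res = await fetch('https://mcp.example.com/discoverDocs', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ q: 'How do I reset the hydraulic pump?', k: 5, useLLM: true }),
});
const { hits } = await res.json();
// Each hit carries the fields mapped by component F, e.g.
// { id, title, score, topicType, related: [{ edge: 'Troubleshooting', ... }] }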
7 Orchestration Flow Design for Search Ranking
# | Component (type) | Key config | Purpose |
---|---|---|---|
A | Universal API – search-os | Method = POST /opensearch/_search | Hybrid keyword + vector query. |
B | Object Mapper – mapHits | hits.hits[*] → $.hits[] | Extract id, bm25Score, vectorScore, topicType. |
C | For-Each Loop – enrichHits | Iterate over $.hits[] | Drives per-hit graph look-up. |
C1 | Universal API – graphLookup | Path /collections/dita-topics/records/{{hit.id}}?include=related | Pull RELATED / PREREQUISITE edges. |
C2 | Data-Transformation Script – mergeGraph | JS snippet (see below) | Append related[] to the hit object. |
D | Data-Transformation Script – scoreHits | JS scoring logic | 0.6 × bm25 + 0.4 × vector + rule boosts. |
E | (Optional) LLM Scorer – llmRerank | Model gpt-4o-mini | Re-orders top 20 JSON docs. |
F | Object Mapper – responseMap | Map to MCP schema | Final payload for UMCP. |
8 Key Component Configurations
A Universal API search-os
method: POST
url: https://search.acme.io/opensearch/_search
headers:
  Content-Type: application/json
bodyTemplate: |
  {
    "size": {{variables.k}},
    "query": {
      "hybrid": {
        "query": "{{variables.q}}",
        "vector": "{{variables.q | embed 'text-embedding-3'}}"
      }
    },
    "_source": ["title","topicType"]
  }
responsePath: "$.hits"
timeoutMs: 200
B Object Mapper mapHits
mappings:
  - path: "$.hits.hits[*]"
    to: "$.hits[]"
    fields:
      id: "$._id"
      title: "$._source.title"
      topicType: "$._source.topicType"
      bm25Score: "$._score.keyword"
      vectorScore: "$._score.vector"
C1 Universal API graphLookup
method: GET
url: https://graph.conscia.ai/collections/dita-topics/records/{{item.id}}
params:
  include: related
responsePath: "$"
timeoutMs: 100
C2 · Data Transformation Script mergeGraph
ctx.item.related = ctx.api.related || [];
return ctx.item;
D · Data Transformation Script scoreHits
function ruleBoost(hit) {
  let b = 0;
  // Prefer actionable content: Task topics get a flat boost.
  if (hit.topicType === 'Task') b += 10;
  // +2 per RELATED 'Troubleshooting' edge, per the scoring rules in Section 4.
  b += 2 * (hit.related || []).filter(r => r.edge === 'Troubleshooting').length;
  return b;
}
ctx.hits.forEach(h => {
  // Weighted blend of keyword and vector relevance, plus business-rule boosts.
  h.baseScore = 0.6 * h.bm25Score + 0.4 * h.vectorScore;
  h.score = h.baseScore + ruleBoost(h);
});
// Highest score first, then trim to the requested k.
ctx.hits.sort((a, b) => b.score - a.score);
ctx.hits = ctx.hits.slice(0, ctx.variables.k);
return ctx.hits;
E · LLM Scorer llmRerank
enabled: "{{ variables.k <= 20 && variables.useLLM === true }}"
model: gpt-4o-mini
prompt: |
  Rank the following JSON docs by relevance to "{{variables.q}}".
  Return the docs in the same JSON shape, sorted best → worst.
input: "{{ json ctx.hits }}"
F · Object Mapper responseMap
mappings:
  - path: "$.hits[*]"
    to: "$.hits[]"
    fields:
      id: "$.id"
      title: "$.title"
      score: "$.score"
      topicType: "$.topicType"
      related: "$.related"
9 · Flow Variables
Variable | Source | Used by |
---|---|---|
q | UMCP request | Component A |
k | UMCP request (default 10) | A, D |
hits[] | Mapper mapHits | Loop C, Script D, LLM Scorer E |
useLLM | UMCP flag (bool) | LLM Scorer |
10 · Live Tuning
- Weights & boosts live in
scoreHits
— change them in the console; no redeploy. - Search backend swap — just update
search-os
URL/body. - Disable LLM — set
useLLM:false
in the UMCP request.
11 · Universal MCP Server Stub
paths:
  /discoverDocs:
    post:
      summary: Hybrid tech‑doc discovery
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                q:
                  type: string
                  description: Natural‑language query
                k:
                  type: integer
                  description: Max hits
                  default: 10
      responses:
        "200":
          description: Enriched docs
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/DocHitList'
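The stub references DocHitList without defining it; the sketch below is one possible definition, assuming only the fields produced by the responseMap mapper in Section 8 (id, title, score, topicType, related).
components:
  schemas:
    DocHitList:
      type: object
      properties:
        hits:
          type: array
          items:
            $ref: '#/components/schemas/DocHit'
    DocHit:
      type: object
      properties:
        id:
          type: string
        title:
          type: string
        score:
          type: number
        topicType:
          type: string
        related:
          type: array
          items:
            type: object
            description: Graph edge returned by DX Graph (RELATED, PREREQUISITE, ...)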
12 · Implementation Hints
- Scalability – Core‑Services Jobs are stateless, so they can run in parallel pods to scale ingestion throughput.
- Latency Targets – Keep LLM re‑rank to top 20 docs only. Aim for less than 200 ms p95.
- Versioning – Name capability paths /v1/discoverDocs so future schema changes won’t break agents.
- Monitoring – Core‑Services exposes Job execution logs, retries, and Trigger audit trails—pipe them to Grafana/Datadog or use Conscia's Observability dashboard.
✅ Result
A fully governed, end‑to‑end pipeline that:
- Automates ingestion of Heretto DITA exports via Core‑Services Jobs & Buckets.
- Enriches & embeds topics in DX Engine, then indexes them in OpenSearch.
- Serves hybrid, graph‑aware search through a single Universal MCP Server capability (discoverDocs).
- Empowers agents & UIs to deliver rich, context‑aware technical answers—in milliseconds.