Introduction
Starleads integrates Retrieval-Augmented Generation (RAG) to give your AI agents access to your own knowledge. Instead of relying solely on a static prompt, your agents can search through your documents in real-time during conversations — retrieving the most relevant information to answer questions accurately and naturally.

A knowledge base in Starleads is built around datasets that contain your uploaded documents (PDFs, text files, etc.). Once uploaded and parsed, these documents are indexed and made searchable. You then create a chat assistant that connects to one or more datasets, defining how the AI retrieves and uses your content — including similarity thresholds, prompt templates, and response behavior.

The general workflow follows four steps:

1. Dataset: create a container.
2. Documents: upload and parse your files.
3. Chat: configure retrieval settings.
4. Agent: connect the knowledge base to your voice agent.

Once connected, your agent can draw on your documents during live conversations, providing precise and context-aware responses.

Getting Started
Create a dataset
A dataset is a container for your documents. Give it a descriptive name that reflects its content (e.g. “Product Documentation”, “FAQ 2026”).

You can configure parser settings at creation time or update them later. See Advanced Configuration for parser options.

Endpoint:
POST /Dataset

Upload documents
Upload files to your dataset using multipart/form-data. Starleads accepts common document formats such as PDF, TXT, DOCX, and more.

There are two limits enforced at the gateway level:
- Maximum 5 MB per file
- Maximum 10 files per request
Endpoint:
POST /Dataset/{datasetId}/document

Parse documents
After uploading, documents need to be parsed to extract and index their content. Parsing is asynchronous — the API returns immediately and processing continues in the background.

You can check the parsing status by listing the documents in your dataset. Each document includes a status field indicating whether parsing is pending, in progress, or complete.

Endpoint:
POST /Dataset/{datasetId}/document/parse

Create a chat assistant
A chat assistant ties together one or more datasets with retrieval configuration. It defines how your knowledge base searches for relevant content and how the LLM generates responses.

You can configure similarity thresholds, the number of top results to consider, and a custom prompt template to control response formatting.

The LLM model is managed by the Starleads platform and cannot be changed via the API. The model name does not appear in API requests or responses.

Endpoint:
POST /KnowledgeBaseChat

Connect to an agent
Finally, connect your chat assistant to a voice agent. Once linked, the agent will use the knowledge base during conversations to retrieve relevant information and answer questions.

An agent can be connected to one chat assistant at a time. Connecting a new chat replaces the previous one.

Endpoint:
PUT /Agent/{agentId}/chat

Code Examples
Create a dataset
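A minimal sketch of creating a dataset with POST /Dataset. The base URL, bearer-token auth, and the request field names (name, parserConfig) are assumptions for illustration; only the endpoint and the parser parameters come from this guide. The request is built with the standard library and not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.starleads.example"  # hypothetical base URL
API_KEY = "YOUR_API_KEY"                    # hypothetical auth scheme

# Field names (name, parserConfig) are illustrative assumptions.
payload = {
    "name": "Product Documentation",
    "parserConfig": {
        "chunkTokenNum": 256,     # tokens per chunk (see Advanced Configuration)
        "layoutRecognize": True,  # better accuracy for complex PDFs
    },
}

req = urllib.request.Request(
    f"{BASE_URL}/Dataset",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# To send it: dataset = json.load(urllib.request.urlopen(req))
```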
Upload documents
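Before calling POST /Dataset/{datasetId}/document, it helps to check the two gateway limits client-side so a batch is never rejected. The helper below is illustrative, not part of the Starleads API; the limits themselves (5 MB per file, 10 files per request) are from this guide.

```python
MAX_FILE_BYTES = 5 * 1024 * 1024  # gateway limit: 5 MB per file
MAX_FILES_PER_REQUEST = 10        # gateway limit: 10 files per request

def check_upload_batch(files):
    """files: list of (filename, size_in_bytes) tuples.

    Returns True when the batch satisfies both gateway limits for
    POST /Dataset/{datasetId}/document; split it into smaller
    requests otherwise. The files themselves travel as
    multipart/form-data.
    """
    if len(files) > MAX_FILES_PER_REQUEST:
        return False
    return all(size <= MAX_FILE_BYTES for _, size in files)
```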
Create a chat assistant
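A sketch of creating a chat assistant with POST /KnowledgeBaseChat. The payload field names (name, datasetIds) and the dataset id are assumptions; similarityThreshold, topN, and the {knowledge} placeholder are documented under Advanced Configuration. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.starleads.example"  # hypothetical base URL

# Field names (name, datasetIds) are illustrative assumptions.
payload = {
    "name": "Support Assistant",
    "datasetIds": ["ds_123"],      # hypothetical dataset id
    "similarityThreshold": 0.2,    # keep only reasonably relevant chunks
    "topN": 6,                     # at most six chunks in the context
    "prompt": "Answer using only the context below.\n\n{knowledge}",
}

req = urllib.request.Request(
    f"{BASE_URL}/KnowledgeBaseChat",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    method="POST",
)
```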
Connect to an agent
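A sketch of linking a chat assistant to an agent via PUT /Agent/{agentId}/chat. The payload shape (chatId) and both ids are assumptions for illustration; remember that this call replaces any previously connected chat assistant.

```python
import json
import urllib.request

BASE_URL = "https://api.starleads.example"  # hypothetical base URL
AGENT_ID = "agent_456"                      # hypothetical agent id

# Payload shape is an illustrative assumption.
payload = {"chatId": "chat_789"}            # hypothetical chat assistant id

req = urllib.request.Request(
    f"{BASE_URL}/Agent/{AGENT_ID}/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    method="PUT",
)
```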
Advanced Configuration
Parser configuration
When creating or updating a dataset, you can configure how documents are parsed and chunked:

| Parameter | Type | Description |
|---|---|---|
| chunkTokenNum | integer | Number of tokens per chunk. Controls how documents are split into searchable segments. |
| layoutRecognize | boolean | Enable layout recognition for structured documents (tables, columns, headers). Improves accuracy for complex PDFs. |
| delimiter | string | Custom delimiter for splitting text. Use this when your documents have a consistent separator (e.g. \n---\n). |
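To make the delimiter parameter concrete, here is a rough sketch of delimiter-based splitting. The actual parser runs server-side; this only illustrates what the setting means for a document with a consistent separator.

```python
# Sketch only: the real chunking happens on the Starleads platform.
delimiter = "\n---\n"

document = "Shipping policy\n---\nReturn policy\n---\nWarranty terms"
chunks = [c.strip() for c in document.split(delimiter)]
# Each chunk becomes a separately searchable segment.
```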
LLM settings
You can fine-tune how the LLM generates responses from retrieved content:

| Parameter | Type | Description |
|---|---|---|
| temperature | number | Controls randomness in responses. Lower values (e.g. 0.1) produce more focused answers; higher values (e.g. 0.9) produce more creative ones. |
| topP | number | Nucleus sampling threshold. Controls the diversity of token selection. |
| presencePenalty | number | Penalizes tokens that have already appeared, encouraging the model to cover new topics. |
| frequencyPenalty | number | Penalizes tokens based on their frequency, reducing repetition. |
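An illustrative settings fragment for a focused, low-repetition knowledge-base assistant. How these fields nest in the request body is an assumption; the parameter names and meanings are from the table above.

```python
# Illustrative values; tune per use case.
llm_settings = {
    "temperature": 0.1,       # focused, deterministic answers
    "topP": 0.9,              # sample from the top 90% probability mass
    "presencePenalty": 0.4,   # nudge the model toward new topics
    "frequencyPenalty": 0.6,  # damp verbatim repetition
}
```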
The LLM model itself is managed by the Starleads platform. You can adjust generation parameters but cannot select or change the underlying model.
Prompt settings
Control how retrieved documents are used to generate responses:

| Parameter | Type | Description |
|---|---|---|
| similarityThreshold | number | Minimum similarity score (0 to 1) for a document chunk to be considered relevant. Higher values return fewer but more precise results. |
| topN | integer | Maximum number of document chunks to retrieve and include in the context. |
| prompt | string | Custom prompt template. Use {knowledge} as a placeholder where retrieved content will be injected. |
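A sketch of a prompt template using the documented {knowledge} placeholder, with the injection simulated locally. The template wording is illustrative; the platform performs the real substitution with the retrieved chunks.

```python
# Illustrative template; {knowledge} is the documented placeholder.
prompt = (
    "You are a support assistant. Answer strictly from the context below.\n"
    "If the context does not contain the answer, say you don't know.\n\n"
    "Context:\n{knowledge}"
)

# Simulate what the platform does when it fills the placeholder:
rendered = prompt.replace("{knowledge}", "Chunk A\n\nChunk B")
```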
Knowledge Graph
Knowledge Graph is available on the Business plan only.
- Enable — Activate the knowledge graph feature on a dataset (POST /Dataset/{datasetId}/knowledge-graph/enable).
- Build — Trigger the graph construction. This is an asynchronous operation that processes all parsed documents in the dataset (POST /Dataset/{datasetId}/knowledge-graph/build).
- Check status — Monitor the build progress (GET /Dataset/{datasetId}/knowledge-graph/status). The status indicates whether the build is in progress, complete, or encountered an error.
- Retrieve — Once built, retrieve the knowledge graph data (GET /Dataset/{datasetId}/knowledge-graph).
- Delete — Remove the knowledge graph from a dataset (DELETE /Dataset/{datasetId}/knowledge-graph).
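The lifecycle above can be sketched as a set of requests built with the standard library. The base URL, auth header, and dataset id are hypothetical; the endpoints and methods are the ones listed above. Nothing is sent here.

```python
import urllib.request

BASE_URL = "https://api.starleads.example"  # hypothetical base URL
DATASET_ID = "ds_123"                       # hypothetical dataset id

def kg_request(method, suffix=""):
    """Build (but do not send) a knowledge-graph request for one dataset."""
    return urllib.request.Request(
        f"{BASE_URL}/Dataset/{DATASET_ID}/knowledge-graph{suffix}",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        method=method,
    )

enable = kg_request("POST", "/enable")  # 1. activate the feature
build = kg_request("POST", "/build")    # 2. start the async build
status = kg_request("GET", "/status")   # 3. poll until complete
graph = kg_request("GET")               # 4. fetch the graph data
# urllib.request.urlopen(...) sends any of these; poll `status` until
# the build reports complete before sending `graph`.
```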
API Reference
Explore all RAG-related endpoints:

Datasets
Create, list, update, and delete datasets that hold your documents.
Documents
Upload, download, parse, and manage documents within your datasets.
Knowledge Base Chat
Configure chat assistants that connect your datasets to AI retrieval.
Knowledge Graph
Build and manage knowledge graphs for advanced document retrieval.

