Document Grounding

Introduction

This guide provides examples of how to manage data in SAP Document Grounding. It is divided into two main sections: Data Ingestion and Data Retrieval.

Warning

All classes under any of the ...model packages are generated from an OpenAPI specification and marked as @Beta. This means that these model classes are not guaranteed to be stable and may change with future releases. They are safe to use, but may require updates even in minor releases.

Prerequisites

Before using the Document Grounding module, ensure that you have met all the general requirements outlined in the overview. Additionally, include the necessary Maven dependency in your project.

Maven Dependencies

Add the following dependency to your pom.xml file:

<dependency>
<groupId>com.sap.ai.sdk</groupId>
<artifactId>document-grounding</artifactId>
<version>${ai-sdk.version}</version>
</dependency>

See an example pom.xml in our Spring Boot application.

Usage

In addition to the prerequisites above, we assume you have already set up the following to carry out the examples in this guide:

  • A running instance of SAP AI Core with correctly set up credentials, including a resource group ID.

Data Ingestion

The following APIs are available for data ingestion: Pipeline and Vector.

Pipeline API

Consider the following code sample to read pipelines, create a new one and get its status:

var api = new GroundingClient().pipelines();
var resourceGroupId = "default";

// get all pipelines
Pipelines pipelines = api.getAllPipelines(resourceGroupId);

// create new pipeline
var type = "MSSharePoint"; // or "S3" or "SFTP"
var pipelineSecret = "my-secret-name";
var config = PipelinePostRequstConfiguration.create().destination(pipelineSecret);
var request = PipelinePostRequst.create().type(type)._configuration(config);
PipelineId pipeline = api.createPipeline(resourceGroupId, request);

// get pipeline status
PipelineStatus status = api.getPipelineStatus(resourceGroupId, pipeline.getPipelineId());
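
If a newly created pipeline needs time before its status settles, the status call can simply be repeated. The following is a minimal sketch of such a polling loop in plain Java; it is not part of the SDK. The names StatusPoller and pollUntil are hypothetical, and the sketch makes no assumption about which status values the service returns: the caller supplies both the status source and the test for a terminal state.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper (not part of the SDK): re-reads a status until it is terminal. */
final class StatusPoller {
  private StatusPoller() {}

  /**
   * Calls statusSource up to maxAttempts times, pausing between attempts.
   * Returns the first status accepted by isTerminal, or empty if none was seen
   * or the thread was interrupted while waiting.
   */
  static <T> Optional<T> pollUntil(
      Supplier<T> statusSource, Predicate<T> isTerminal, int maxAttempts, Duration pause) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      T status = statusSource.get();
      if (isTerminal.test(status)) {
        return Optional.of(status);
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(pause.toMillis());
        } catch (InterruptedException e) {
          // Restore the interrupt flag and stop polling.
          Thread.currentThread().interrupt();
          return Optional.empty();
        }
      }
    }
    return Optional.empty();
  }
}
```

With the client from the sample above, the supplier could be `() -> api.getPipelineStatus(resourceGroupId, pipeline.getPipelineId())`, combined with a predicate that inspects whichever part of PipelineStatus signals completion in your SDK version.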

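Vector API

Consider the following code sample to add a document with a single text chunk to an existing collection: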

var api = new GroundingClient().vector();
var resourceGroupId = "default";

// resolve collection id
var collectionId = UUID.fromString("12345-123-123-123-0123456abcdef");

var request =
    DocumentCreateRequest.create()
        .documents(
            BaseDocument.create()
                .chunks(
                    TextOnlyBaseChunk.create()
                        .content("The dog makes _woof_")
                        .metadata(KeyValueListPair.create().key("animal").value("dog")))
                .metadata(DocumentKeyValueListPair.create().key("topic").value("sound")));
DocumentsListResponse response = api.createDocuments(resourceGroupId, collectionId, request);
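
The collection ID in the sample above is only a placeholder and is not in the canonical 8-4-4-4-12 UUID form. Because java.util.UUID.fromString is lenient about the length of each group, a malformed ID may be parsed into a different UUID instead of being rejected. The following minimal sketch, using only the Java standard library, accepts an ID only if it round-trips unchanged. CollectionIds and parseCanonical are hypothetical names, not SDK classes.

```java
import java.util.Optional;
import java.util.UUID;

/** Hypothetical helper (not part of the SDK): strict parsing of collection IDs. */
final class CollectionIds {
  private CollectionIds() {}

  /** Returns the UUID only if raw is a canonical 8-4-4-4-12 UUID string. */
  static Optional<UUID> parseCanonical(String raw) {
    if (raw == null) {
      return Optional.empty();
    }
    try {
      UUID parsed = UUID.fromString(raw);
      // A lenient parse pads short groups, so the round trip no longer matches.
      return parsed.toString().equalsIgnoreCase(raw) ? Optional.of(parsed) : Optional.empty();
    } catch (IllegalArgumentException e) {
      // Not parseable as a UUID at all.
      return Optional.empty();
    }
  }
}
```

A caller could run parseCanonical on the configured ID and fail fast with a clear message before calling createDocuments.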

Refer to the DeploymentController.java in our Spring Boot application for a complete example.

Data Retrieval

The following APIs are available for data retrieval: Retrieval and Orchestration.

Retrieval API

Consider the following code sample to search for relevant document grounding data based on a query:

var api = new GroundingClient().retrieval();
var resourceGroupId = "default";

var filter =
    RetrievalSearchFilter.create()
        .id("question")
        .dataRepositoryType(DataRepositoryType.VECTOR)
        .dataRepositories(List.of("*"))
        .searchConfiguration(SearchConfiguration.create().maxChunkCount(10));
var search = RetrievalSearchInput.create().query("What is SAP Cloud SDK for AI?").filters(filter);
RetievalSearchResults results = api.search(resourceGroupId, search);

Grounding via Orchestration

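You can also use the grounding service via orchestration. For details, see the Orchestration documentation on grounding. The following sample configures a grounding module that reads the "query" input parameter and writes the retrieved content to the "results" output parameter: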

var client = new OrchestrationClient();

var databaseFilter =
    DocumentGroundingFilter.create()
        .dataRepositoryType(DataRepositoryType.VECTOR)
        .searchConfig(GroundingFilterSearchConfiguration.create().maxChunkCount(3));
var groundingConfigConfig =
    GroundingModuleConfigConfig.create()
        .inputParams(List.of("query"))
        .outputParam("results")
        .addFiltersItem(databaseFilter);
var groundingConfig =
    GroundingModuleConfig.create()
        .type(GroundingModuleConfig.TypeEnum.DOCUMENT_GROUNDING_SERVICE)
        .config(groundingConfigConfig);
// GPT_4O is assumed to be statically imported from OrchestrationAiModel
var configWithGrounding =
    new OrchestrationModuleConfig()
        .withLlmConfig(GPT_4O)
        .withGroundingConfig(groundingConfig);

var inputParams = Map.of("query", "What is SAP Cloud SDK for AI?");

var prompt =
    new OrchestrationPrompt(
        inputParams,
        Message.system("Context message with embedded grounding results. {{?results}}"));

OrchestrationChatResponse response = client.chatCompletion(prompt, configWithGrounding);
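
In the sample above, the grounding output parameter "results" is referenced in the system message as {{?results}}. To illustrate how such a placeholder relates to a named value, here is a minimal local sketch in plain Java. It is an assumption-laden illustration only: PlaceholderSketch and render are hypothetical names, and the orchestration service performs the actual substitution on the server side in a way this guide does not describe.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration (not SDK code): fills {{?name}} placeholders from a map. */
final class PlaceholderSketch {
  // Matches placeholders of the form {{?name}} and captures the name.
  private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{\\?(\\w+)\\}\\}");

  private PlaceholderSketch() {}

  /** Replaces each known placeholder with its value; unknown placeholders stay as they are. */
  static String render(String template, Map<String, String> values) {
    Matcher matcher = PLACEHOLDER.matcher(template);
    StringBuffer out = new StringBuffer();
    while (matcher.find()) {
      String name = matcher.group(1);
      String replacement = values.getOrDefault(name, matcher.group());
      // quoteReplacement keeps '$' and '\' in the value from being read as group references.
      matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
    }
    matcher.appendTail(out);
    return out.toString();
  }
}
```

For example, rendering "Context: {{?results}}" with a map that binds "results" to a retrieved chunk yields the message text with that chunk embedded, which mirrors the role of the output parameter in the prompt.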