OpenSearch Guidelines for Generative AI Agents
Pre-General Availability: 2024-01-24
The following legal notice applies to Oracle pre-GA (Beta) releases. For copyright and other applicable notices, see Oracle Legal Notices.
Pre-General Availability Draft Documentation Notice
This documentation is in pre-General Availability status and is intended for demonstration and preliminary use only. It may not be specific to the hardware on which you are using the software. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to this documentation and will not be responsible for any loss, costs, or damages incurred due to the use of this documentation.
This documentation is not a commitment by Oracle to deliver any material, code, functionality or services. This documentation, and Oracle Pre-GA programs and services are subject to change at any time without notice and, accordingly, should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality for Oracle’s Pre-GA programs and services remains at the sole discretion of Oracle. All release dates or other predictions of future events are subject to change. The future availability of any future Oracle program or service should not be relied on in entering into any license or service agreement with Oracle.
To make your data available to OCI Generative AI Agents, you have two options:
- Object Storage data: You can upload your data files to OCI Object Storage and let Generative AI Agents automatically ingest the data. Skip this topic if your data files are in Object Storage.
- OpenSearch data: You can bring your own (BYO) ingested and indexed OCI Search with OpenSearch data for the agents to use. This topic provides guidelines for this option, assuming you are already familiar with OCI Search with OpenSearch.
Contact the Beta program for a link and instructions to download an OCI Resource Manager Terraform stack that creates an OCI Search with OpenSearch cluster with a public management instance. Use the following guidelines:
- Select Terraform version 1.2.x.
- If you're using OCI Identity Domain for authentication, in the Console, navigate to the domain section and copy the domain URL. For example,
https://idcs-xxx.identity.oraclecloud.com:443
- If you're using a federation-based tenancy, in the Console, navigate to Federation under Identity and select your identity provider. Get the OpenID URL by copying the IDCS URL. For example,
https://idcs-xxxx.identity.oraclecloud.com
After running the stack in the previous section, create indexes and add your data to OpenSearch with the following guidelines:
To ingest large documents into OpenSearch, you must first chunk those documents into files with fewer than 10,000 tokens each.
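The chunking requirement can be sketched as follows. This is a minimal illustration, not the service's own method: it assumes plain-text input and treats whitespace-separated words as a rough stand-in for tokens. The tokenizer that Generative AI Agents uses is not described here, so pick a limit well below 10,000. The function name `chunk_by_words` and the output file naming scheme are hypothetical.

```shell
# Sketch: split a plain-text file into chunks of at most MAX words each.
# Assumption: whitespace-separated words approximate tokens; the real
# tokenizer may count differently. Original line breaks are not preserved.
chunk_by_words() {
  local input="$1" max="$2" prefix="$3"
  awk -v max="$max" -v prefix="$prefix" '
    BEGIN { n = 0; part = 1; out = sprintf("%s-%04d.txt", prefix, part) }
    {
      for (i = 1; i <= NF; i++) {
        if (n == max) {            # current chunk is full: start the next one
          printf "\n" > out
          close(out)
          part++
          n = 0
          out = sprintf("%s-%04d.txt", prefix, part)
        }
        printf "%s%s", (n > 0 ? " " : ""), $i > out
        n++
      }
    }
    END { if (n > 0) { printf "\n" > out; close(out) } }
  ' "$input"
}
```

For example, `chunk_by_words large-document.txt 7000 large-document` would write `large-document-0001.txt`, `large-document-0002.txt`, and so on, each holding at most 7,000 words.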
- In the OCI Console, navigate to the details page of your stack. Click Stack resources, and copy the values for the following resources:
  - public_ip: The public IP address for the VM that was created by the stack.
  - opendashboard_private_ip: The private IP to use for the dashboard endpoint.
  - opensearch_private_ip: The private IP to use for the API endpoint.
  - private_key_pem: The private key to use for SSH access to the VM.
- Save the private_key_pem value to the management-instance-pk.pem file.
- Format the management-instance-pk.pem file. The sed -i "" form shown here is the macOS (BSD) syntax; with GNU sed on Linux, use sed -i without the empty string.
sed -i "" 's/\"//g' management-instance-pk.pem
sed -i "" 's/\\n/\n/g' management-instance-pk.pem
chmod 600 management-instance-pk.pem
- Use SSH to sign in to the VM that was created with the stack in the previous section. Use the private key that was created during the Terraform stack creation. For example,
ssh -C -N -v -t -L 127.0.0.1:5601:<your_OpenSearch_dashboards_private_IP>:5601 -L 127.0.0.1:9200:<your_OpenSearch_private_IP>:9200 opc@<your_VM_instance_public_IP> -i <path_to_your_private_key>
Reference: Search and visualize data using OCI Search Service with OpenSearch
- Format the private key file:
# Step 1: Create a private key file
touch <<private-key-file-name>>.pem
# Step 2: Edit the file
vim <<private-key-file-name>>.pem
# Paste the copied private key
# Press the Esc key and type :wq to save the file
# Step 3: Format the private key file
# To remove "
sed -i "" 's/\"//g' <<private-key-file-name>>.pem
# To replace \n with a new line:
sed -i "" 's/\\n/\n/g' <<private-key-file-name>>.pem
# Step 4: Check that the file content is formatted properly:
cat <<private-key-file-name>>.pem
# Step 5: Change the private key file access
chmod 600 <<private-key-file-name>>.pem
- Ingest data into OpenSearch:
# Step 1: SSH to the management instance that was created as part of the stack creation:
ssh -i private_key.pem opc@<<management-instance-ip>>
# Step 2: Create an index
curl -XPUT https://<<OpenSearch-cluster-ip>>:9200/<<index-name>> -u <<OpenSearch-cluster-username>>:<<password>> --insecure
# For example:
curl -XPUT https://254.125.0.0:9200/iaas -u pocuser:Poc@1234 --insecure
# Step 3: Load data
# Copy files to the management instance
scp -i <<private_key.pem>> <<local-files-path>> opc@<<management-instance-ip>>:<<path>>
# For example (-r is required when the source is a directory):
scp -r -i private_key.pem ~/Documents/Setup/ opc@207.211.175.225:data
# Using the ingestor script; supported file formats: pdf, html, docx
ingestor -f <<path-to-files-directory>> -ip <<OpenSearch-cluster-ip>> -po 9200 -u <<OpenSearch-cluster-username>> -pw <<password>> -i <<index-name>>
ingestor -p <<object-storage-par-URL>> -ip <<OpenSearch-cluster-ip>> -po 9200 -u <<OpenSearch-cluster-username>> -pw <<password>> -i <<index-name>>
# Using curl; supported file format: newline-delimited JSON (x-ndjson)
curl -H 'Content-Type: application/x-ndjson' -XPOST "https://<<OpenSearch-cluster-ip>>:9200/<<index-name>>/_bulk?pretty" --data-binary @<<file-name>>.json -u <<OpenSearch-cluster-username>>:<<password>> --insecure
# Step 4: Print the document count
curl -XGET "https://<<OpenSearch-cluster-ip>>:9200/<<index-name>>/_count?pretty" -u <<OpenSearch-cluster-username>>:<<password>> --insecure
# Step 5: Search data (optional)
curl -XGET https://<<OpenSearch-cluster-ip>>:9200/<<index-name>>/_search -u <<OpenSearch-cluster-username>>:<<password>> --insecure
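The curl-based ingestion path posts to the `_bulk` endpoint, which expects newline-delimited JSON: each document line is preceded by an action line, and the payload ends with a newline. The following is a minimal sketch of preparing such a file, assuming the input holds one JSON document per line and that automatically generated document IDs are acceptable. The function name `to_bulk_ndjson` and the file names in the usage note are hypothetical.

```shell
# Sketch: turn a file with one JSON document per line into a _bulk payload.
# Assumption: every non-blank input line is a complete JSON object.
to_bulk_ndjson() {
  local index="$1" input="$2"
  # Skip blank lines; emit an "index" action line before every document.
  awk -v idx="$index" 'NF { printf "{\"index\":{\"_index\":\"%s\"}}\n%s\n", idx, $0 }' "$input"
}
```

For example, `to_bulk_ndjson iaas docs.json > bulk.json` produces a file that can be passed to the bulk request with `--data-binary @bulk.json`.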
- Configure OpenID Connect for the OpenSearch cluster:
curl -XPUT "https://<<OpenSearch-cluster-ip>>:9200/_plugins/_security/api/securityconfig/config" -u <<OpenSearch-cluster-username>>:<<password>> --insecure -H 'Content-Type: application/json' -d'
{
  "dynamic": {
    "security_mode": "ENFORCING",
    "http": {
      "anonymous_auth_enabled": false,
      "xff": {
        "enabled": false
      }
    },
    "authc": {
      "openid_auth_domain": {
        "http_enabled": true,
        "transport_enabled": true,
        "order": 1,
        "http_authenticator": {
          "challenge": true,
          "type": "openid",
          "config": {
            "subject_key": "sub",
            "roles_key": "scope",
            "openid_connect_url": "<<IDCS-URL>>/.well-known/openid-configuration"
          }
        },
        "authentication_backend": {
          "type": "noop",
          "config": {}
        },
        "description": "Authenticate using OpenID Connect"
      },
      "basic_internal_auth_domain": {
        "http_enabled": true,
        "transport_enabled": true,
        "order": 0,
        "http_authenticator": {
          "challenge": false,
          "type": "basic",
          "config": {}
        },
        "authentication_backend": {
          "type": "intern",
          "config": {}
        },
        "description": "Authenticate via HTTP Basic against internal users database"
      }
    },
    "authz": null
  }
}'
Note
Ensure that you add the correct openid_connect_url value.
- Reset the OpenSearch cluster password:
In the OCI Console, navigate to the list of OpenSearch clusters and select your cluster. In the Security Information tab, click Update security information and update the password.
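Because the OpenID Connect configuration depends on a correct openid_connect_url value, it can help to build that value from the identity domain URL in one place. The following is a minimal sketch, assuming the URL that you copied from the Console may or may not end with a slash. The function name `openid_connect_url` is hypothetical.

```shell
# Sketch: build the openid_connect_url value from an identity domain URL.
openid_connect_url() {
  local base="${1%/}"   # drop one trailing slash, if present
  printf '%s/.well-known/openid-configuration\n' "$base"
}
```

From a host that can reach the identity domain, requesting the resulting URL with curl should return an OpenID Connect discovery document in JSON, which is a quick way to confirm the value before you apply the security configuration.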
Creating a Secret
Before you add your Search with OpenSearch data to a knowledge base in Generative AI Agents, you must create a secret for OpenSearch in OCI Vault service.
To create a secret for basic authentication and then use that secret for the knowledge base, perform the steps in the first of the following two sections. To create a secret for Identity Cloud Service (IDCS), use the guidelines in the second section. Follow the guidelines in only one of the two sections.
- In the OCI Console, create a vault.
- After the vault is active, create a key for the vault.
- For the vault, create a secret with the following specifics:
  - Select the key that you created in the previous step.
  - Manually enter the username and password for the OpenSearch cluster with the following format:
    - Secret Type Template: Plain-Text
    - Secret Contents: <OpenSearch-username>:<OpenSearch-password>
Creating a confidential application
Create a confidential application if you don't have one:
- In the IDCS Console, navigate to Applications, click Add application, and select Confidential Application.
- Create a resource server application with agent-endpoint as the primary audience by adding the following values in the Configure OAuth step:
  - Select Configure this application as a resource server now.
  - Access token expiration: 3600
  - Primary audience: https://agent-endpoint/
- Create a client application with agent-endpoint as the redirect URL by selecting the following options in the Authorization section:
  - Resource Owner
  - Client Credentials
  - JWT assertion
  - Refresh token
  - Authorization code
  - TLS client authentication
  For Redirect URL, enter https://agent-endpoint/.
- Click Finish to create the application.
- After the application is created, in the application's detail page, click Activate.
- For the resource server application, edit the OAuth configuration and select Add scope. Then add the scope, genaiagent.
- Edit the client application and select Add resources. Under Resources, click Add scope and select the resource server application that you created. The Scope field displays https://agent-endpoint/genaiagent.
Setting Up the OpenSearch OpenID
Reference: Configuring OpenID Connect for an OpenSearch Cluster
# Step 1: Add OpenID Config
curl -XPUT "https://<opensearch-ip>:9200/_plugins/_security/api/securityconfig/config" -u username:password --insecure -H 'Content-Type: application/json' -d'
{
  "dynamic": {
    "security_mode": "ENFORCING",
    "http": {
      "anonymous_auth_enabled": false,
      "xff": {
        "enabled": false
      }
    },
    "authc": {
      "openid_auth_domain": {
        "http_enabled": true,
        "transport_enabled": true,
        "order": 0,
        "http_authenticator": {
          "challenge": false,
          "type": "openid",
          "config": {
            "subject_key": "sub",
            "roles_key": "scope",
            "openid_connect_url": "https://<openid-domain-host>/.well-known/openid-configuration"
          }
        },
        "authentication_backend": {
          "type": "noop",
          "config": {}
        },
        "description": "Authenticate using OpenId connect"
      },
      "basic_internal_auth_domain": {
        "http_enabled": true,
        "transport_enabled": true,
        "order": 1,
        "http_authenticator": {
          "challenge": true,
          "type": "basic",
          "config": {}
        },
        "authentication_backend": {
          "type": "intern",
          "config": {}
        },
        "description": "Authenticate via HTTP Basic against internal users database"
      }
    },
    "authz": null
  }
}'
# Step 5: Create Readonly Role
curl -XPUT "https://<opensearch-ip>:9200/_plugins/_security/api/roles/genaiagent_readall" -u username:password --insecure -H 'Content-Type: application/json' -d'{
  "description": "Role to be used by Generative AI Agent having read only permission to all Indexes",
  "cluster_permissions": [
    "cluster_composite_ops_ro"
  ],
  "index_permissions": [{
    "index_patterns": [
      "*"
    ],
    "fls": [],
    "masked_fields": [],
    "allowed_actions": [
      "read"
    ]
  }],
  "tenant_permissions": []
}'
# Step 6: Add role mapping for genaiagent_readall
curl -XPUT "https://<opensearch-ip>:9200/_plugins/_security/api/rolesmapping/genaiagent_readall" -u username:password --insecure -H 'Content-Type: application/json' -d'{
  "backend_roles" : [ "genaiagent" ],
  "hosts" : [],
  "users" : []
}'
# Step 7: Test Access
curl --location 'https://<domain-host>/oauth2/v1/token' \
--header 'authorization: Basic <Base64 clientId:clientSecret>' \
--header 'content-type: application/x-www-form-urlencoded; charset=utf-8' \
--data 'grant_type=client_credentials&scope=<scope>'
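The two placeholders in the test request, `<Base64 clientId:clientSecret>` and `<scope>`, can be derived as sketched below. This is an illustration under stated assumptions: the scope is taken to be the primary audience followed by the scope name, as the Scope field of the client application shows, and the function names `basic_auth_header` and `full_scope` are hypothetical.

```shell
# Sketch: build the pieces of the client-credentials token request.
basic_auth_header() {
  # Base64-encode "clientId:clientSecret"; tr removes the line wraps that
  # some base64 implementations insert for long inputs.
  printf 'Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
}

full_scope() {
  # Primary audience followed by the scope name, for example
  # https://agent-endpoint/ followed by genaiagent.
  printf '%s%s\n' "$1" "$2"
}
```

The output of `basic_auth_header <clientId> <clientSecret>` is the value for the authorization header, and `full_scope https://agent-endpoint/ genaiagent` yields the scope string. If the scope contains characters that need escaping in form data, curl's `--data-urlencode` option can be used for that field instead of `--data`.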
Creating a Vault secret for IDCS client credential
- In the OCI Console, create a vault.
- After the vault is active, create a key for the vault.
- For the vault, create a secret with the following specifics:
  - Select the key that you created in the previous step.
  - Manually enter the IDCS client secret for the OpenSearch cluster with the following format:
    - Secret Type Template: Plain-Text
    - Secret Contents: clientSecret