Knowledge & Memory

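Storage structure in Eliza

cache:

Eliza has a cache manager, ICacheManager.

The interface is simple: it only has get, set, and delete.

CacheStore comes in three kinds: redis, database, and filesystem.

The concrete type behind cacheManager is one of the CacheAdapter implementations, which determines where the data is actually stored.

There are three basic cache adapter classes: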

MemoryCacheAdapter
FsCacheAdapter
DbCacheAdapter

By default, the cache is stored in the database.
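
The relationship between the manager and its adapters can be sketched as below. This is a minimal illustration, not Eliza's actual type definitions: the names CacheAdapterSketch, MemoryAdapterSketch, and CacheManagerSketch are hypothetical, and the methods are kept synchronous for brevity (assume the real ones are asynchronous).

```typescript
// Hypothetical, simplified shapes; not Eliza's actual type definitions.
interface CacheAdapterSketch {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
  delete(key: string): void;
}

// In-memory adapter: data lives in a Map and is lost on restart.
class MemoryAdapterSketch implements CacheAdapterSketch {
  private data = new Map<string, string>();

  get(key: string): string | undefined {
    return this.data.get(key);
  }
  set(key: string, value: string): void {
    this.data.set(key, value);
  }
  delete(key: string): void {
    this.data.delete(key);
  }
}

// The manager serializes values to JSON and delegates storage to the adapter,
// so swapping the adapter changes where the data ends up.
class CacheManagerSketch {
  private adapter: CacheAdapterSketch;

  constructor(adapter: CacheAdapterSketch) {
    this.adapter = adapter;
  }
  get<T>(key: string): T | undefined {
    const raw = this.adapter.get(key);
    return raw === undefined ? undefined : (JSON.parse(raw) as T);
  }
  set<T>(key: string, value: T): void {
    this.adapter.set(key, JSON.stringify(value));
  }
  delete(key: string): void {
    this.adapter.delete(key);
  }
}

const cache = new CacheManagerSketch(new MemoryAdapterSketch());
cache.set("greeting", { text: "hello" });
const hit = cache.get<{ text: string }>("greeting");
cache.delete("greeting");
const miss = cache.get<{ text: string }>("greeting");
```

A filesystem or database adapter would implement the same three methods against a different backend, which is why the manager itself can stay this small.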

database:

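The agent defines a general-purpose database.

Two kinds of db are supported:

sqlite, the default
postgres, selected when POSTGRES_URL is defined
others

The db is the data persistence layer. The cache depends on the database.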
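
The selection rule can be sketched as a small pure function. This is an illustration only: chooseDatabase, DbChoice, and the default file name are hypothetical, and a real setup would open a connection instead of returning a descriptor.

```typescript
// Hypothetical descriptor; real adapters open actual connections.
type DbChoice =
  | { kind: "postgres"; connectionString: string }
  | { kind: "sqlite"; file: string };

// Pick postgres when POSTGRES_URL is set, otherwise fall back to sqlite.
// The environment is passed in as a plain object so the rule is easy to test.
function chooseDatabase(
  env: Record<string, string | undefined>,
  sqliteFile: string = "db.sqlite" // assumed default path, for illustration only
): DbChoice {
  const url = env["POSTGRES_URL"];
  if (url !== undefined && url.trim() !== "") {
    return { kind: "postgres", connectionString: url };
  }
  return { kind: "sqlite", file: sqliteFile };
}

const fromEmptyEnv = chooseDatabase({});
const fromPostgresEnv = chooseDatabase({
  POSTGRES_URL: "postgres://localhost/eliza",
});
```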

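Knowledge concepts

Each knowledge item is parsed into a string, and a unique id is generated from it with the uuid generation algorithm. Before writing, documentsManager is checked for that id; if the item has already been processed, it is skipped.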
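
The skip-if-already-processed check relies on the id being a deterministic function of the text. A minimal sketch, assuming an in-memory store: idFromText is a stand-in (a 32-bit FNV-1a hash rendered as hex), not a real UUID and not the algorithm Eliza uses, and seenDocuments and writeOnce are hypothetical names.

```typescript
// Stand-in for a deterministic id function: the same text always yields the
// same id. This is a 32-bit FNV-1a hash, NOT a UUID and not Eliza's algorithm.
function idFromText(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Hypothetical in-memory stand-in for the document store.
const seenDocuments = new Map<string, string>();

// Returns true if the item was written, false if it was skipped as a duplicate.
function writeOnce(text: string): boolean {
  const id = idFromText(text);
  if (seenDocuments.has(id)) {
    return false; // already processed earlier
  }
  seenDocuments.set(id, text);
  return true;
}

const firstWrite = writeOnce("Eliza stores knowledge as memories.");
const secondWrite = writeOnce("Eliza stores knowledge as memories.");
const otherWrite = writeOnce("A different knowledge item.");
```

Because the id depends only on the content, restarting the agent and feeding it the same knowledge again does not produce duplicate records.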

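Knowledge processing module: src/core/knowledge.ts. Main entry point: the set method.

Writing a knowledge item takes two steps (before writing, the item is split into smaller fragments according to chunkSize):

The first step writes the document to documentsManager; its embedding is a zero vector, i.e. empty data.
The second step writes the actual knowledge fragments, which carry the embedding vectors, to knowledgeManager.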
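
The split-then-write flow can be sketched with two in-memory stores. This is a simplification under stated assumptions: splitWithOverlap works on characters (the real splitter may count tokens), fakeEmbed is a placeholder for an embedding model, and documentStore, fragmentStore, and writeKnowledge are hypothetical names.

```typescript
// Character-based splitter with overlap between neighbouring chunks.
function splitWithOverlap(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("need chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

interface StoredRecord {
  id: string;
  text: string;
  embedding: number[];
}

// Hypothetical stand-ins for the two stores.
const documentStore: StoredRecord[] = [];
const fragmentStore: StoredRecord[] = [];

// Placeholder embedding; a real system would call an embedding model here.
function fakeEmbed(text: string): number[] {
  return [text.length, text.charCodeAt(0) || 0];
}

function writeKnowledge(id: string, text: string): void {
  // Step 1: the whole document, with a zero vector as its embedding.
  documentStore.push({ id, text, embedding: [0, 0] });
  // Step 2: one record per fragment, each with its own embedding.
  splitWithOverlap(text, 10, 2).forEach((fragment, index) => {
    fragmentStore.push({
      id: `${id}-${index}`,
      text: fragment,
      embedding: fakeEmbed(fragment),
    });
  });
}

writeKnowledge("doc-1", "abcdefghijklmnopqrstuvwxyz");
```

With a chunk size of 10 and an overlap of 2, the 26-character input yields three fragments, and each one repeats the last two characters of the previous fragment.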

Knowledge fragment identifiers: agentId, roomId, userId

These make it possible to retrieve the fragments in the matching scope later, based on context.
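
Scoped retrieval can be pictured as a filter over those three identifiers. The sketch below is an assumption about how such a filter might look, not Eliza's query code; ScopedFragment and inScope are hypothetical names.

```typescript
interface ScopedFragment {
  agentId: string;
  roomId: string;
  userId: string;
  text: string;
}

// Only the fields present in the scope are compared; omitted fields match anything.
function inScope(
  fragments: ScopedFragment[],
  scope: { agentId?: string; roomId?: string; userId?: string }
): ScopedFragment[] {
  return fragments.filter(
    (f) =>
      (scope.agentId === undefined || f.agentId === scope.agentId) &&
      (scope.roomId === undefined || f.roomId === scope.roomId) &&
      (scope.userId === undefined || f.userId === scope.userId)
  );
}

const sampleFragments: ScopedFragment[] = [
  { agentId: "agent-1", roomId: "room-a", userId: "user-1", text: "shared fact" },
  { agentId: "agent-1", roomId: "room-b", userId: "user-2", text: "room b fact" },
  { agentId: "agent-2", roomId: "room-a", userId: "user-1", text: "other agent" },
];

const agentOnly = inScope(sampleFragments, { agentId: "agent-1" });
const agentAndRoom = inScope(sampleFragments, { agentId: "agent-1", roomId: "room-b" });
```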

memoryManagers holds all of the agent's memory managers.

It contains the following components:

messageManager
descriptionManager
loreManager
documentsManager (holds the raw knowledge data)
knowledgeManager

The agent also keeps an extension point, opt.managers; if needed, you can use it to register your own memory managers.
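
One way to picture this extension point is a registry that is filled with the built-in managers first and then with whatever the caller supplies. The sketch is hypothetical throughout: AgentSketch, ManagerSketch, and the rule that a duplicate name is rejected are assumptions for illustration, not Eliza's runtime code.

```typescript
// Hypothetical shape: a manager is identified by a name.
interface ManagerSketch {
  name: string;
}

class AgentSketch {
  private managers = new Map<string, ManagerSketch>();

  constructor(opts: { managers?: ManagerSketch[] } = {}) {
    // Built-in managers, using the names listed above.
    const builtIn = [
      "messageManager",
      "descriptionManager",
      "loreManager",
      "documentsManager",
      "knowledgeManager",
    ];
    for (const name of builtIn) {
      this.register({ name });
    }
    // Custom managers supplied by the caller.
    for (const manager of opts.managers ?? []) {
      this.register(manager);
    }
  }

  // Returns false when the name is already taken, so built-ins are not replaced.
  register(manager: ManagerSketch): boolean {
    if (this.managers.has(manager.name)) return false;
    this.managers.set(manager.name, manager);
    return true;
  }

  has(name: string): boolean {
    return this.managers.has(name);
  }

  count(): number {
    return this.managers.size;
  }
}

const agent = new AgentSketch({ managers: [{ name: "notesManager" }] });
const duplicateAccepted = agent.register({ name: "loreManager" });
```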

There is a dedicated RAG knowledge manager: ragKnowledgeManager, of type RAGKnowledgeManager.

Knowledge is stored on top of Memory; these two concepts are easy to confuse in Eliza.

The basic unit of memory:

export interface Memory {
  /** Optional unique identifier */
  id?: UUID;
  /** Associated user ID */
  userId: UUID;
  /** Associated agent ID */
  agentId: UUID;
  /** Optional creation timestamp */
  createdAt?: number;
  /** Memory content */
  content: Content;
  /** Optional embedding vector */
  embedding?: number[];
  /** Associated room ID */
  roomId: UUID;
  /** Whether memory is unique */
  unique?: boolean;
  /** Embedding similarity score */
  similarity?: number;
}

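Why it gets slow over time: searchMemories in adapter-sqlite computes the vector distance for every stored row. All memories sit in one table and are distinguished by type (the table name); the search computes the distance to each vector and then takes the top n.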
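
The cost pattern can be sketched as a brute-force scan. This is an in-memory illustration of the idea, not the adapter's actual SQL: Row, squaredDistance, and searchTopN are hypothetical names, and squared Euclidean distance is an assumed metric.

```typescript
interface Row {
  type: string; // plays the role of the table name
  text: string;
  embedding: number[];
}

// Squared Euclidean distance; missing components are treated as 0.
function squaredDistance(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => {
    const diff = x - (b[i] ?? 0);
    return sum + diff * diff;
  }, 0);
}

// Brute-force search: one distance per row of the requested type, then sort.
// The work grows linearly with the number of stored rows.
function searchTopN(rows: Row[], type: string, query: number[], n: number): Row[] {
  return rows
    .filter((row) => row.type === type)
    .map((row) => ({ row, distance: squaredDistance(row.embedding, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, n)
    .map((scored) => scored.row);
}

const rows: Row[] = [
  { type: "fragments", text: "near", embedding: [1, 0] },
  { type: "fragments", text: "far", embedding: [0, 5] },
  { type: "messages", text: "other type", embedding: [1, 0] },
  { type: "fragments", text: "mid", embedding: [2, 1] },
];

const top2 = searchTopN(rows, "fragments", [1, 0], 2);
```

Since every query touches every row of that type, the latency keeps rising as the table grows; an index over the vectors or a cap on the scanned rows would be the usual ways to bound it.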

Usage examples

Taking the usage in direct-client as an example:

Writing memory:

const responseMessage: Memory = {
  id: stringToUuid(messageId + "-" + runtime.agentId),
  ...userMessage,
  userId: runtime.agentId,
  content: response,
  embedding: getEmbeddingZeroVector(),
  createdAt: Date.now(),
};

await runtime.messageManager.createMemory(responseMessage);

Reading memory:

Reading recent chat history:

const [actorsData, recentMessagesData, goalsData]: [
  Actor[],
  Memory[],
  Goal[],
] = await Promise.all([
  getActorDetails({ runtime: this, roomId }),
  this.messageManager.getMemories({
    roomId,
    count: conversationLength,
    unique: false,
  }),
  getGoals({
    runtime: this,
    count: 10,
    onlyInProgress: false,
    roomId,
  }),
]);

Retrieving knowledge:

Knowledge is retrieved from either knowledgeManager or ragKnowledgeManager.

let knowledgeData = [];
let formattedKnowledge = "";

if (this.character.settings?.ragKnowledge) {
  const recentContext = recentMessagesData
    .slice(-3) // Last 3 messages
    .map((msg) => msg.content.text)
    .join(" ");

  knowledgeData = await this.ragKnowledgeManager.getKnowledge({
    query: message.content.text,
    conversationContext: recentContext,
    limit: 5,
  });

  formattedKnowledge = formatKnowledge(knowledgeData);
} else {
  knowledgeData = await knowledge.get(this, message);
  formattedKnowledge = formatKnowledge(knowledgeData);
}

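By default, ragKnowledge = false.

In that case knowledgeManager is used, via runtime.knowledgeManager.searchMemoriesByEmbedding.

With ragKnowledge enabled:

The dedicated RAG resource, ragKnowledgeManager, performs the search.

knowledgeManager is a structure similar to memory storage.

The difference between knowledge and memory is that memory mainly stores room-related information, while knowledge mainly stores more general information.

Writing knowledge:

See the knowledge.set method: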

async function set(
  runtime: AgentRuntime,
  item: KnowledgeItem,
  chunkSize = 512,
  bleed = 20
) {
  await runtime.documentsManager.createMemory({
    id: item.id,
    agentId: runtime.agentId,
    roomId: runtime.agentId,
    userId: runtime.agentId,
    createdAt: Date.now(),
    content: item.content,
    embedding: getEmbeddingZeroVector(),
  });
 
  const preprocessed = preprocess(item.content.text);
  const fragments = await splitChunks(preprocessed, chunkSize, bleed);
 
  for (const fragment of fragments) {
    const embedding = await embed(runtime, fragment);
    await runtime.knowledgeManager.createMemory({
      // We namespace the knowledge base uuid to avoid id
      // collision with the document above.
      id: stringToUuid(item.id + fragment),
      roomId: runtime.agentId,
      agentId: runtime.agentId,
      userId: runtime.agentId,
      createdAt: Date.now(),
      content: {
        source: item.id,
        text: fragment,
      },
      embedding,
    });
  }
}

If ragKnowledge is enabled, write through the following methods of runtime.ragKnowledgeManager:

createKnowledge
processFile

Getting embedding data:

import { embed, getEmbeddingZeroVector } from "….core/embedding.ts";
const embedding = await embed(runtime, processed);