Knowledge Graph Module¶

Added in: v0.16.0
File: src/selectools/knowledge_graph.py
Classes: Triple, TripleStore, InMemoryTripleStore, SQLiteTripleStore, KnowledgeGraphMemory

Table of Contents¶

Overview
Quick Start
Triple Dataclass
TripleStore Protocol
Store Implementations
KnowledgeGraphMemory
LLM-Powered Extraction
Agent Integration
Observer Events
Querying the Graph
Best Practices
Testing
API Reference

Overview¶

The Knowledge Graph module builds a graph of relationships between entities extracted from conversations. Entity Memory tracks individual entities and their attributes; the Knowledge Graph tracks how those entities relate to each other, forming a structured, queryable web of knowledge.

Purpose¶

Relationship Tracking: Capture subject-relation-object triples from conversation
LLM Extraction: Automatically extract relationships using an LLM provider
Keyword Query: Retrieve relevant triples by keyword or entity name
Context Injection: Feed relationship context into the system prompt
Persistence: Store triples in memory or SQLite for durability

How It Differs from Entity Memory¶

Feature	Entity Memory	Knowledge Graph
Tracks	Individual entities + attributes	Relationships between entities
Structure	Key-value (entity -> attributes)	Graph (subject -> relation -> object)
Example	Alice:	Alice --works_at--> Acme Corp
Query	By entity name	By keyword, subject, or object
Best for	"What do I know about X?"	"How are X and Y related?"

Quick Start¶

from selectools import Agent, AgentConfig, OpenAIProvider, ConversationMemory, Message, Role
from selectools.knowledge_graph import KnowledgeGraphMemory, InMemoryTripleStore

kg = KnowledgeGraphMemory(
    store=InMemoryTripleStore(),
    provider=OpenAIProvider(),  # used for LLM-based extraction
)

agent = Agent(
    tools=[],
    provider=OpenAIProvider(),
    memory=ConversationMemory(max_messages=50),
    config=AgentConfig(knowledge_graph=kg),
)

# Turn 1 -- relationships extracted automatically
result = agent.run([
    Message(role=Role.USER, content="Alice works at Acme Corp. Acme Corp is based in Seattle.")
])

# Turn 2 -- agent has relationship context
result = agent.run([
    Message(role=Role.USER, content="Where does Alice's company operate?")
])
# Agent knows: Alice works_at Acme Corp, Acme Corp located_in Seattle

Triple Dataclass¶

Each relationship is represented as a Triple:

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Triple:
    subject: str                       # source entity
    relation: str                      # relationship type (e.g., "works_at")
    object: str                        # target entity
    confidence: float = 1.0            # extraction confidence (0.0 - 1.0)
    source_turn: Optional[int] = None  # conversation turn where extracted
    created_at: Optional[datetime] = None

Example Triples¶

Triple(subject="Alice", relation="works_at", object="Acme Corp", confidence=0.95)
Triple(subject="Acme Corp", relation="located_in", object="Seattle", confidence=0.90)
Triple(subject="Alice", relation="manages", object="Project Atlas", confidence=0.85)
Triple(subject="Bob", relation="reports_to", object="Alice", confidence=0.80)

TripleStore Protocol¶

All backends implement the TripleStore protocol:

from typing import Protocol, List, Optional

class TripleStore(Protocol):
    def add(self, triples: List[Triple]) -> None:
        """Add triples to the store. Duplicates are ignored."""
        ...

    def query(
        self,
        subject: Optional[str] = None,
        relation: Optional[str] = None,
        object: Optional[str] = None,
    ) -> List[Triple]:
        """Query triples by any combination of subject, relation, object.
        None fields act as wildcards.
        """
        ...

    def search(self, keywords: List[str], top_k: int = 20) -> List[Triple]:
        """Search triples matching any of the given keywords.
        Matches against subject, relation, and object fields.
        """
        ...

    def delete(
        self,
        subject: Optional[str] = None,
        relation: Optional[str] = None,
        object: Optional[str] = None,
    ) -> int:
        """Delete matching triples. Returns the number of triples deleted."""
        ...

    def all(self) -> List[Triple]:
        """Return all triples in the store."""
        ...

    def clear(self) -> None:
        """Remove all triples."""
        ...

    def count(self) -> int:
        """Return the total number of triples."""
        ...
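
To make the protocol's semantics concrete, here is a minimal sketch of a store that follows the same rules: duplicates are ignored on `add`, and `None` fields act as wildcards in `query` and `delete`. The class name `SketchTripleStore` and the trimmed-down `Triple` are hypothetical stand-ins so the snippet runs on its own; this is not the selectools implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Triple:
    # Trimmed-down stand-in for the Triple dataclass described above.
    subject: str
    relation: str
    object: str
    confidence: float = 1.0


class SketchTripleStore:
    """Hypothetical minimal store; illustrates the protocol semantics only."""

    def __init__(self) -> None:
        # Keyed by (subject, relation, object) so duplicates collapse.
        self._triples: Dict[Tuple[str, str, str], Triple] = {}

    def add(self, triples: List[Triple]) -> None:
        for t in triples:
            # setdefault keeps the first copy and ignores later duplicates.
            self._triples.setdefault((t.subject, t.relation, t.object), t)

    def query(
        self,
        subject: Optional[str] = None,
        relation: Optional[str] = None,
        object: Optional[str] = None,
    ) -> List[Triple]:
        # A None field matches anything (wildcard).
        return [
            t
            for t in self._triples.values()
            if (subject is None or t.subject == subject)
            and (relation is None or t.relation == relation)
            and (object is None or t.object == object)
        ]

    def delete(
        self,
        subject: Optional[str] = None,
        relation: Optional[str] = None,
        object: Optional[str] = None,
    ) -> int:
        doomed = self.query(subject, relation, object)
        for t in doomed:
            del self._triples[(t.subject, t.relation, t.object)]
        return len(doomed)

    def count(self) -> int:
        return len(self._triples)


store = SketchTripleStore()
store.add([
    Triple("Alice", "works_at", "Acme Corp"),
    Triple("Alice", "works_at", "Acme Corp"),  # duplicate, ignored
    Triple("Bob", "reports_to", "Alice"),
])
```

One consequence of the wildcard rule in this sketch: calling `delete()` with no arguments matches every triple, so it behaves like `clear()`.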

Store Implementations¶

1. InMemoryTripleStore¶

Best for: Prototyping, testing, short-lived sessions

from selectools.knowledge_graph import InMemoryTripleStore, Triple

store = InMemoryTripleStore()

store.add([
    Triple(subject="Alice", relation="works_at", object="Acme Corp"),
    Triple(subject="Acme Corp", relation="located_in", object="Seattle"),
])

# Query by subject
results = store.query(subject="Alice")
# [Triple(subject="Alice", relation="works_at", object="Acme Corp")]

# Keyword search
results = store.search(keywords=["Alice", "Seattle"], top_k=10)
# Returns triples mentioning Alice or Seattle

Features:

No dependencies
Fast in-memory lookup
No persistence (lost on restart)
Suitable for up to ~10k triples
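
The keyword search used in the example above can be pictured roughly as follows. This sketch assumes case-insensitive substring matching and ranks triples by how many keywords they contain; the actual matching and ranking rules of `search()` are not specified in this document, so treat both as assumptions. The standalone `search` function and the `Triple` tuple are hypothetical stand-ins.

```python
from typing import List, NamedTuple, Sequence


class Triple(NamedTuple):
    # Stand-in for the library's Triple, reduced to the searchable fields.
    subject: str
    relation: str
    object: str


def search(triples: Sequence[Triple], keywords: List[str], top_k: int = 20) -> List[Triple]:
    """Return triples matching any keyword, best matches first (assumed ranking)."""
    needles = [k.strip().lower() for k in keywords if k.strip()]
    scored = []
    for t in triples:
        haystack = f"{t.subject} {t.relation} {t.object}".lower()
        hits = sum(1 for needle in needles if needle in haystack)
        if hits:
            scored.append((hits, t))
    # Stable sort: triples with equal hit counts keep their insertion order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:top_k]]


triples = [
    Triple("Alice", "works_at", "Acme Corp"),
    Triple("Acme Corp", "located_in", "Seattle"),
    Triple("Bob", "lives_in", "Portland"),
]
matches = search(triples, ["alice", "seattle"], top_k=10)
```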

2. SQLiteTripleStore¶

Best for: Single-instance production deployments, persistent knowledge graphs

from selectools.knowledge_graph import SQLiteTripleStore

store = SQLiteTripleStore(db_path="knowledge_graph.db")

Schema:

CREATE TABLE triples (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    subject TEXT NOT NULL,
    relation TEXT NOT NULL,
    object TEXT NOT NULL,
    confidence REAL DEFAULT 1.0,
    source_turn INTEGER,
    created_at TEXT,
    UNIQUE(subject, relation, object)
);

CREATE INDEX idx_subject ON triples(subject);
CREATE INDEX idx_object ON triples(object);
CREATE INDEX idx_relation ON triples(relation);

Features:

Persistent storage
Indexed queries
Duplicate-safe (UNIQUE constraint)
ACID transactions
Suitable for up to ~100k triples
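
The duplicate-safe behavior follows directly from the `UNIQUE(subject, relation, object)` constraint. The sketch below uses Python's standard `sqlite3` module with the schema shown above and an `INSERT OR IGNORE` statement; whether `SQLiteTripleStore` uses exactly this statement is an assumption, and the in-memory database is used only so the snippet leaves no file behind.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS triples (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    subject TEXT NOT NULL,
    relation TEXT NOT NULL,
    object TEXT NOT NULL,
    confidence REAL DEFAULT 1.0,
    source_turn INTEGER,
    created_at TEXT,
    UNIQUE(subject, relation, object)
);
CREATE INDEX IF NOT EXISTS idx_subject ON triples(subject);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

rows = [
    ("Alice", "works_at", "Acme Corp", 0.95),
    ("Alice", "works_at", "Acme Corp", 0.95),  # duplicate, skipped by OR IGNORE
    ("Acme Corp", "located_in", "Seattle", 0.90),
]

# The connection context manager commits on success and rolls back on error.
with conn:
    conn.executemany(
        "INSERT OR IGNORE INTO triples (subject, relation, object, confidence) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )

total = conn.execute("SELECT COUNT(*) FROM triples").fetchone()[0]
alice = conn.execute(
    "SELECT subject, relation, object FROM triples WHERE subject = ?",
    ("Alice",),
).fetchall()
```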

Choosing a Store¶

Feature	InMemory	SQLite
Persistence	No	Yes
Dependencies	None	None
Max Triples	~10k	~100k
Query Speed	Fast	Fast (indexed)
Setup	None	DB path

KnowledgeGraphMemory¶

KnowledgeGraphMemory wraps a TripleStore with LLM-powered extraction and context building:

Constructor¶

class KnowledgeGraphMemory:
    def __init__(
        self,
        store: TripleStore,
        provider: Optional[Provider] = None,
        extraction_model: Optional[str] = None,
        max_context_triples: int = 30,
        min_confidence: float = 0.5,
    ):
        """
        Args:
            store: Backend triple store.
            provider: LLM provider for relationship extraction.
                      If None, extraction is disabled (manual-only).
            extraction_model: Override model for extraction calls.
            max_context_triples: Max triples to include in context injection.
            min_confidence: Minimum confidence threshold for context inclusion.
        """

Core Methods¶

def extract_triples(self, text: str) -> List[Triple]:
    """Extract relationship triples from text using the LLM provider.

    Returns a list of Triple objects parsed from the LLM response.
    """

def update(self, triples: List[Triple]) -> None:
    """Add triples to the underlying store."""

def query(self, keywords: List[str], top_k: int = 20) -> List[Triple]:
    """Search the triple store by keywords.

    Filters results by min_confidence threshold.
    """

def build_context(self, keywords: Optional[List[str]] = None) -> str:
    """Build a context string for system prompt injection.

    If keywords are provided, only relevant triples are included.
    Otherwise, the most recent triples (up to max_context_triples) are used.
    """

def clear(self) -> None:
    """Clear all triples from the store."""

def to_dict(self) -> Dict[str, Any]:
    """Serialize for persistence (used by session storage)."""

@classmethod
def from_dict(cls, data: Dict[str, Any], store: TripleStore) -> "KnowledgeGraphMemory":
    """Restore from serialized data."""

LLM-Powered Extraction¶

When a provider is configured, extract_triples() sends text to the LLM with a structured prompt:

Extract all relationships from the following text as subject-relation-object triples.

For each triple, provide:
- subject: the source entity
- relation: the relationship (use snake_case, e.g., "works_at", "located_in", "manages")
- object: the target entity
- confidence: how confident you are (0.0 to 1.0)

Respond as a JSON array.

Text:
"""
Alice is a senior engineer at Acme Corp. She manages the Atlas project
and reports to Bob, the VP of Engineering. Acme Corp is headquartered in Seattle.
"""

The LLM responds:

[
    {"subject": "Alice", "relation": "works_at", "object": "Acme Corp", "confidence": 0.95},
    {"subject": "Alice", "relation": "has_role", "object": "senior engineer", "confidence": 0.95},
    {"subject": "Alice", "relation": "manages", "object": "Atlas project", "confidence": 0.90},
    {"subject": "Alice", "relation": "reports_to", "object": "Bob", "confidence": 0.90},
    {"subject": "Bob", "relation": "has_role", "object": "VP of Engineering", "confidence": 0.90},
    {"subject": "Acme Corp", "relation": "headquartered_in", "object": "Seattle", "confidence": 0.95}
]
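
LLM output is not guaranteed to be well-formed, so a parser for a response like the one above usually needs to be defensive. The following is a sketch of one way to do that, not the library's parser: `parse_triples` is a hypothetical helper, and its choices (skip incomplete items, clamp confidence to the 0.0 to 1.0 range, return an empty list on invalid JSON) are assumptions.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Triple:
    # Stand-in for the library's Triple dataclass.
    subject: str
    relation: str
    object: str
    confidence: float = 1.0


def parse_triples(raw: str) -> List[Triple]:
    """Parse a JSON array of triple objects, skipping malformed entries."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    triples: List[Triple] = []
    for item in items:
        if not isinstance(item, dict):
            continue
        subject = str(item.get("subject", "")).strip()
        relation = str(item.get("relation", "")).strip()
        obj = str(item.get("object", "")).strip()
        if not (subject and relation and obj):
            continue  # incomplete triple
        try:
            confidence = float(item.get("confidence", 1.0))
        except (TypeError, ValueError):
            confidence = 1.0
        # Clamp to the documented 0.0 - 1.0 range.
        confidence = min(1.0, max(0.0, confidence))
        triples.append(Triple(subject, relation, obj, confidence))
    return triples


raw = """[
    {"subject": "Alice", "relation": "works_at", "object": "Acme Corp", "confidence": 0.95},
    {"subject": "Bob"},
    {"subject": "Acme Corp", "relation": "headquartered_in", "object": "Seattle", "confidence": 7}
]"""
parsed = parse_triples(raw)
```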

Agent Integration¶

Configuration¶

from selectools import Agent, AgentConfig, OpenAIProvider, ConversationMemory
from selectools.knowledge_graph import KnowledgeGraphMemory, SQLiteTripleStore

kg = KnowledgeGraphMemory(
    store=SQLiteTripleStore(db_path="kg.db"),
    provider=OpenAIProvider(model="gpt-4o-mini"),
    max_context_triples=30,
    min_confidence=0.6,
)

agent = Agent(
    tools=[...],
    provider=OpenAIProvider(),
    memory=ConversationMemory(max_messages=50),
    config=AgentConfig(knowledge_graph=kg),
)

Context Injection Flow¶

run() / arun() called
    |
    +-- knowledge_graph.extract_triples(user_message)
    |   +-- LLM extracts relationship triples
    |
    +-- knowledge_graph.update(extracted_triples)
    |   +-- Store triples in backend
    |
    +-- Extract keywords from user message
    |
    +-- knowledge_graph.build_context(keywords)
    |   +-- "[Known Relationships]
    |   |    - Alice works_at Acme Corp (0.95)
    |   |    - Acme Corp headquartered_in Seattle (0.95)
    |   |    - Alice manages Atlas project (0.90)"
    |
    +-- Prepend context to system message
    |
    +-- Execute agent loop
    |
    +-- Return AgentResult

Context Format¶

The build_context() method produces:

[Known Relationships]
- Alice works_at Acme Corp (0.95)
- Alice manages Atlas project (0.90)
- Alice reports_to Bob (0.90)
- Acme Corp headquartered_in Seattle (0.95)
- Bob has_role VP of Engineering (0.90)
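
A formatter that produces this layout could look like the sketch below. It is a standalone, hypothetical function rather than the real `build_context()` method; treating `min_confidence` as an inclusive threshold and returning an empty string when nothing qualifies are both assumptions.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    # Stand-in for the library's Triple dataclass.
    subject: str
    relation: str
    object: str
    confidence: float = 1.0


def build_context(
    triples: List[Triple],
    min_confidence: float = 0.5,
    max_context_triples: int = 30,
) -> str:
    """Format qualifying triples as a [Known Relationships] block."""
    # Drop low-confidence triples first, then cap the list length.
    kept = [t for t in triples if t.confidence >= min_confidence]
    kept = kept[:max_context_triples]
    if not kept:
        return ""
    lines = ["[Known Relationships]"]
    lines += [
        f"- {t.subject} {t.relation} {t.object} ({t.confidence:.2f})" for t in kept
    ]
    return "\n".join(lines)


triples = [
    Triple("Alice", "works_at", "Acme Corp", 0.95),
    Triple("Alice", "likes", "tea", 0.30),  # below the threshold, omitted
    Triple("Acme Corp", "headquartered_in", "Seattle", 0.95),
]
context = build_context(triples, min_confidence=0.5)
```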

Observer Events¶

Knowledge graph extraction fires an observer event:

from selectools import AgentObserver

class KGWatcher(AgentObserver):
    def on_kg_extraction(
        self,
        run_id: str,
        triples_extracted: int,
        triples_total: int,
        triples: list,
    ) -> None:
        print(f"[{run_id}] Extracted {triples_extracted} triples, {triples_total} total in store")
        for t in triples:
            print(f"  {t.subject} --{t.relation}--> {t.object} ({t.confidence:.2f})")

Event	When	Parameters
`on_kg_extraction`	After extracting and storing triples	`run_id`, `triples_extracted`, `triples_total`, `triples`

Querying the Graph¶

By Subject¶

# All relationships where Alice is the subject
triples = kg.store.query(subject="Alice")
# Alice works_at Acme Corp
# Alice manages Atlas project
# Alice reports_to Bob

By Object¶

# All relationships pointing to Acme Corp
triples = kg.store.query(object="Acme Corp")
# Alice works_at Acme Corp

By Relation Type¶

# All "manages" relationships
triples = kg.store.query(relation="manages")
# Alice manages Atlas project

By Keywords¶

# Free-text keyword search
triples = kg.query(keywords=["Alice", "engineering"], top_k=10)
# Returns triples mentioning Alice or engineering

Combined Queries¶

# All relationships between Alice and Acme Corp
triples = kg.store.query(subject="Alice", object="Acme Corp")
# Alice works_at Acme Corp
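
Questions such as "How are X and Y related?" often span more than one triple. The sketch below chains `query(subject=...)` calls in a breadth-first search to find a directed path between two entities. `find_path` and `ListStore` are hypothetical helpers written for this example; the module described here is not documented as providing a path-finding method.

```python
from collections import deque
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    # Stand-in for the library's Triple dataclass.
    subject: str
    relation: str
    object: str


class ListStore:
    """Tiny stand-in exposing the same query(subject=...) call as a TripleStore."""

    def __init__(self, triples: List[Triple]) -> None:
        self._triples = list(triples)

    def query(self, subject: Optional[str] = None) -> List[Triple]:
        return [t for t in self._triples if subject is None or t.subject == subject]


def find_path(store, start: str, goal: str, max_hops: int = 3) -> Optional[List[Triple]]:
    """Breadth-first search along subject -> object edges, up to max_hops."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        entity, path = queue.popleft()
        if entity == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop limit
        for t in store.query(subject=entity):
            if t.object not in seen:
                seen.add(t.object)
                queue.append((t.object, path + [t]))
    return None


store = ListStore([
    Triple("Alice", "works_at", "Acme Corp"),
    Triple("Acme Corp", "located_in", "Seattle"),
    Triple("Bob", "reports_to", "Alice"),
])
path = find_path(store, "Alice", "Seattle")
```

Edges are followed in one direction only, so `find_path(store, "Seattle", "Alice")` finds nothing in this sketch; searching in both directions would need an additional `query(object=...)` lookup per step.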

Best Practices¶

1. Use SQLite for Persistent Graphs¶

# Prototyping (`provider` is any configured provider, e.g. OpenAIProvider())
kg = KnowledgeGraphMemory(store=InMemoryTripleStore(), provider=provider)

# Production
kg = KnowledgeGraphMemory(
    store=SQLiteTripleStore(db_path="knowledge.db"),
    provider=provider,
)

2. Filter by Confidence¶

# Only high-confidence triples in context
kg = KnowledgeGraphMemory(
    store=store,
    provider=provider,
    min_confidence=0.8,  # ignore uncertain extractions
)

3. Use a Cost-Effective Extraction Model¶

# Use a smaller model for extraction
kg = KnowledgeGraphMemory(
    store=store,
    provider=OpenAIProvider(model="gpt-4o-mini"),
)

4. Limit Context Size¶

# Prevent context from growing too large
kg = KnowledgeGraphMemory(
    store=store,
    provider=provider,
    max_context_triples=20,  # cap at 20 triples in prompt
)

5. Combine with Entity Memory¶

from selectools import Agent, AgentConfig, ConversationMemory, OpenAIProvider
from selectools.entity_memory import EntityMemory
from selectools.knowledge_graph import KnowledgeGraphMemory, SQLiteTripleStore

agent = Agent(
    tools=[...],
    provider=OpenAIProvider(),
    memory=ConversationMemory(),
    config=AgentConfig(
        entity_memory=EntityMemory(max_entities=100, provider=OpenAIProvider()),
        knowledge_graph=KnowledgeGraphMemory(
            store=SQLiteTripleStore(db_path="kg.db"),
            provider=OpenAIProvider(),
        ),
    ),
)
# Agent gets both [Known Entities] and [Known Relationships] context

6. Seed Domain Knowledge¶

from selectools.knowledge_graph import Triple

kg.update([
    Triple(subject="Python", relation="is_a", object="programming language", confidence=1.0),
    Triple(subject="selectools", relation="written_in", object="Python", confidence=1.0),
    Triple(subject="selectools", relation="supports", object="OpenAI", confidence=1.0),
    Triple(subject="selectools", relation="supports", object="Anthropic", confidence=1.0),
])

Testing¶

from selectools.knowledge_graph import InMemoryTripleStore, KnowledgeGraphMemory, Triple


def test_triple_store_add_and_query():
    store = InMemoryTripleStore()

    store.add([
        Triple(subject="Alice", relation="works_at", object="Acme"),
        Triple(subject="Bob", relation="works_at", object="Acme"),
    ])

    results = store.query(subject="Alice")
    assert len(results) == 1
    assert results[0].object == "Acme"

    results = store.query(object="Acme")
    assert len(results) == 2


def test_triple_store_keyword_search():
    store = InMemoryTripleStore()

    store.add([
        Triple(subject="Alice", relation="works_at", object="Acme Corp"),
        Triple(subject="Bob", relation="lives_in", object="Seattle"),
    ])

    results = store.search(keywords=["Alice"], top_k=10)
    assert len(results) == 1
    assert results[0].subject == "Alice"


def test_duplicate_triples_ignored():
    store = InMemoryTripleStore()

    store.add([
        Triple(subject="A", relation="r", object="B"),
        Triple(subject="A", relation="r", object="B"),  # duplicate
    ])

    assert store.count() == 1


def test_build_context():
    store = InMemoryTripleStore()
    store.add([
        Triple(subject="Alice", relation="works_at", object="Acme", confidence=0.9),
    ])

    kg = KnowledgeGraphMemory(store=store, max_context_triples=10)
    context = kg.build_context()

    assert "[Known Relationships]" in context
    assert "Alice" in context
    assert "works_at" in context
    assert "Acme" in context


def test_confidence_filtering():
    store = InMemoryTripleStore()
    store.add([
        Triple(subject="A", relation="r1", object="B", confidence=0.9),
        Triple(subject="C", relation="r2", object="D", confidence=0.3),
    ])

    kg = KnowledgeGraphMemory(store=store, min_confidence=0.5)
    results = kg.query(keywords=["A", "C"], top_k=10)

    assert len(results) == 1
    assert results[0].subject == "A"

API Reference¶

Class	Description
`Triple(subject, relation, object, confidence)`	Dataclass for a subject-relation-object relationship
`TripleStore`	Protocol defining the add/query/search/delete/all/clear/count interface
`InMemoryTripleStore()`	In-memory triple store for prototyping
`SQLiteTripleStore(db_path)`	SQLite-backed persistent triple store
`KnowledgeGraphMemory(store, provider, extraction_model, max_context_triples, min_confidence)`	LLM-powered knowledge graph with context injection

Method	Returns	Description
`extract_triples(text)`	`List[Triple]`	Extract triples from text via LLM
`update(triples)`	`None`	Add triples to the store
`query(keywords, top_k)`	`List[Triple]`	Search triples by keywords
`build_context(keywords)`	`str`	Build `[Known Relationships]` context string
`clear()`	`None`	Remove all triples
`to_dict()`	`Dict[str, Any]`	Serialize for persistence
`from_dict(data, store)`	`KnowledgeGraphMemory`	Restore from serialized data

AgentConfig Field	Type	Description
`knowledge_graph`	`Optional[KnowledgeGraphMemory]`	Knowledge graph instance for relationship tracking