Python API

Complete reference for the CortexaDB Python package.

CortexaDB

The main database class.

CortexaDB.open(path, **kwargs)

Opens or creates a database at the specified path.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| path | str | Required | Database directory path |
| dimension | int? | None | Vector dimension. Required if no embedder is set |
| embedder | Embedder? | None | Embedding provider for auto-embedding |
| sync | str | "strict" | Sync policy: "strict", "async", or "batch" |
| max_entries | int? | None | Maximum number of memories before eviction |
| max_bytes | int? | None | Maximum storage size in bytes before eviction |
| index_mode | str or dict | "exact" | "exact", "hnsw", or HNSW config dict |
| record | str? | None | Path to replay log file for recording |

Returns: CortexaDB

Example:

from cortexadb import CortexaDB
from cortexadb.providers.openai import OpenAIEmbedder

# With embedder
db = CortexaDB.open("agent.mem", embedder=OpenAIEmbedder())

# With manual dimension
db = CortexaDB.open("agent.mem", dimension=128, sync="batch")

# With HNSW indexing
db = CortexaDB.open("agent.mem", dimension=128, index_mode={
    "type": "hnsw", "m": 16, "ef_search": 50, "metric": "cos"
})
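
The parameter rules above (a dimension is required when no embedder is set, and sync accepts only three values) can be checked before calling CortexaDB.open(). The following is a minimal sketch of such a check. `build_open_kwargs` and `VALID_SYNC` are hypothetical names, not part of the cortexadb package, and the library may perform its own validation.

```python
from typing import Any, Dict, Optional

# Sync policies listed in the parameter table above.
VALID_SYNC = {"strict", "async", "batch"}


def build_open_kwargs(
    dimension: Optional[int] = None,
    embedder: Any = None,
    sync: str = "strict",
    index_mode: Any = "exact",
) -> Dict[str, Any]:
    """Validate and assemble keyword arguments for CortexaDB.open().

    Hypothetical helper for illustration only.
    """
    # The table states that dimension is required when no embedder is set.
    if dimension is None and embedder is None:
        raise ValueError("pass either dimension or embedder")
    if sync not in VALID_SYNC:
        raise ValueError(f"sync must be one of {sorted(VALID_SYNC)}")

    kwargs: Dict[str, Any] = {"sync": sync, "index_mode": index_mode}
    if dimension is not None:
        kwargs["dimension"] = dimension
    if embedder is not None:
        kwargs["embedder"] = embedder
    return kwargs
```

The result would then be passed as CortexaDB.open("agent.mem", **build_open_kwargs(dimension=128, sync="batch")).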

CortexaDB.replay(log_path, db_path, **kwargs)

Rebuilds a database from a replay log file.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| log_path | str | Required | Path to the replay log file |
| db_path | str | Required | Path for the new database |
| sync | str | "strict" | Sync policy for the new database |
| strict | bool | False | If True, raises on first error. If False, skips bad operations |

Returns: CortexaDB

Example:

db = CortexaDB.replay("session.log", "restored.mem", strict=False)
report = db.last_replay_report

Memory Operations

.add(text=None, vector=None, metadata=None, collection=None)

Stores a new memory entry. If an embedder is configured and no vector is provided, the text is embedded automatically.

Parameters:

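| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str? | None | Text content to store |
| vector | list[float]? | None | Pre-computed embedding vector |
| metadata | dict[str, str]? | None | Key-value metadata pairs |
| collection | str? | None | Target collection ("default" when omitted) |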

Returns: int - The assigned memory ID

Example:

mid = db.add("User prefers dark mode")
mid = db.add("text", metadata={"source": "onboarding"})
mid = db.add("text", vector=[0.1, 0.2, ...], collection="agent_a")
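
Because metadata is typed as dict[str, str], non-string values presumably need converting before they are passed to .add(). The sketch below shows one way to do that. `stringify_metadata` is a hypothetical helper, and encoding non-string values as JSON is an assumption made here, not documented library behavior.

```python
import json
from typing import Any, Dict, Mapping


def stringify_metadata(raw: Mapping[str, Any]) -> Dict[str, str]:
    """Convert arbitrary metadata values to strings (hypothetical helper).

    Strings pass through unchanged; everything else is JSON-encoded so it
    can be decoded again with json.loads() after reading the memory back.
    """
    return {
        str(key): value if isinstance(value, str) else json.dumps(value)
        for key, value in raw.items()
    }
```

A call such as db.add("text", metadata=stringify_metadata({"turn": 3})) would then store "3" under the "turn" key.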

.query(text=None, vector=None)

Starts a fluent query builder to search across the database.

Methods:

| Method | Description |
| --- | --- |
| .limit(n) | Set maximum number of results (default 5) |
| .collection(name) | Filter to a specific collection |
| .use_graph() | Enable hybrid graph traversal |
| .recency_bias() | Boost recent memories in scoring |
| .execute() | Run the query and return list[Hit] |

Example:

hits = db.query("What does the user prefer?").limit(5).use_graph().execute()

for hit in hits:
    print(f"ID: {hit.id}, Score: {hit.score:.3f}")
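
The builder methods can be wrapped in a small function when the same query shape is used repeatedly. This is a sketch under two assumptions: each builder method returns the builder itself (as the term "fluent" suggests), and `search_ids` and `min_score` are hypothetical names that do not exist in the package.

```python
from typing import List, Optional


def search_ids(
    db,
    text: str,
    limit: int = 5,
    min_score: float = 0.0,
    collection: Optional[str] = None,
) -> List[int]:
    """Return IDs of hits scoring at least min_score (hypothetical helper)."""
    # Assumes every builder method returns the builder, so calls can chain.
    builder = db.query(text).limit(limit)
    if collection is not None:
        builder = builder.collection(collection)
    hits = builder.execute()
    # Hit exposes .id and .score, as described in the Types section.
    return [hit.id for hit in hits if hit.score >= min_score]
```

The score filter is applied after the limit, so fewer than `limit` IDs may come back.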

.get(mid)

Retrieves a full memory entry by ID.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| mid | int | Memory ID |

Returns: Memory

Raises: CortexaDBNotFoundError if the memory doesn't exist.

Example:

mem = db.get(42)
print(mem.id)          # 42
print(mem.content)     # b"User prefers dark mode"
print(mem.collection)  # "default"
print(mem.metadata)    # {"source": "onboarding"}
print(mem.created_at)  # 1709654400
print(mem.importance)  # 0.0
print(mem.embedding)   # [0.1, 0.2, ...] or None
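
Since content is returned as bytes and created_at as a Unix timestamp, a caller will usually convert both before display. The sketch below assumes the content is UTF-8 text and the timestamp is in seconds. `describe_memory` is a hypothetical helper, not part of the package.

```python
from datetime import datetime, timezone


def describe_memory(mem) -> str:
    """Format a Memory-like object as one readable line (hypothetical helper)."""
    # Assumption: content holds UTF-8 text; undecodable bytes are replaced.
    text = mem.content.decode("utf-8", errors="replace")
    # Assumption: created_at counts seconds since the Unix epoch.
    created = datetime.fromtimestamp(mem.created_at, tz=timezone.utc)
    return f"[{mem.id}] {text} ({mem.collection}, {created.isoformat()})"
```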

.delete(mid)

Permanently deletes a memory and updates all indexes.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| mid | int | Memory ID to delete |

Raises: CortexaDBNotFoundError if the memory doesn't exist.

Example:

db.delete(42)

Graph Operations

.connect(from_id, to_id, relation)

Creates a directed edge between two memories.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| from_id | int | Source memory ID |
| to_id | int | Target memory ID |
| relation | str | Relationship label |

Example:

db.connect(1, 2, "relates_to")
db.connect(1, 3, "caused_by")

Both memories must be in the same collection. Cross-collection edges are forbidden.


.get_neighbors(mid)

Returns all outgoing edges from a memory.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| mid | int | Memory ID |

Returns: list[Edge] where each Edge has to (int) and relation (str) fields.

Example:

neighbors = db.get_neighbors(1)
for edge in neighbors:
    print(f"→ {edge.to} ({edge.relation})")
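
Because .get_neighbors() returns only outgoing edges for one memory, walking further into the graph takes repeated calls. The following is a minimal breadth-first sketch built on the documented `to` field of Edge. `reachable` and `max_depth` are hypothetical names, and this is not how .use_graph() is implemented.

```python
from collections import deque
from typing import Dict


def reachable(db, start: int, max_depth: int = 2) -> Dict[int, int]:
    """Map each memory ID reachable from start to its hop distance.

    Hypothetical helper: follows outgoing edges breadth-first and stops
    expanding once max_depth hops have been taken.
    """
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        depth = seen[current]
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for edge in db.get_neighbors(current):
            if edge.to not in seen:
                seen[edge.to] = depth + 1
                queue.append(edge.to)
    return seen
```

The `seen` dictionary doubles as the visited set, so cycles in the graph do not cause repeated visits.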

Document Ingestion

.ingest(text, strategy="recursive", chunk_size=512, overlap=50, metadata=None, collection=None)

Chunks text and stores each chunk as a memory.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | Required | Text to chunk and store |
| strategy | str | "recursive" | Chunking strategy |
| chunk_size | int | 512 | Target chunk size in characters |
| overlap | int | 50 | Overlap between chunks |
| metadata | dict? | None | Metadata to attach to all chunks |
| collection | str? | None | Target collection |

Returns: list[int] - Memory IDs of stored chunks
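
To give a feel for how chunk_size and overlap interact, here is a plain fixed-window chunker. It is not the library's "recursive" strategy, whose splitting rules are not described on this page. It only illustrates the general idea that consecutive chunks share `overlap` characters. `sliding_chunks` is a hypothetical name.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int = 512, overlap: int = 50) -> List[str]:
    """Split text into fixed-size windows that share `overlap` characters.

    Illustrative stand-in only; CortexaDB's own strategies may split
    differently (for example on paragraph or heading boundaries).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks
```

With chunk_size=4 and overlap=1, the text "abcdefghij" yields "abcd", "defg", and "ghij": each chunk begins with the final character of the previous one.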


.load(file_path, strategy="markdown", chunk_size=512, overlap=50, metadata=None, collection=None)

Loads a file, chunks it, and stores each chunk.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| file_path | str | Required | Path to the file |
| strategy | str | "markdown" | Chunking strategy |
| chunk_size | int | 512 | Target chunk size |
| overlap | int | 50 | Overlap between chunks |
| metadata | dict? | None | Metadata for all chunks |
| collection | str? | None | Target collection |

Supported formats: .txt, .md, .json, .docx (requires cortexadb[docs]), .pdf (requires cortexadb[pdf])

Example:

db.load("README.md", strategy="markdown")
db.load("paper.pdf", strategy="recursive", chunk_size=1024)
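
The supported-format list can be turned into a pre-flight check that reports which optional extra a file needs before .load() is called. This is a sketch. `required_extra` and `EXTRAS_BY_SUFFIX` are hypothetical names, and matching on the lowercase file suffix is an assumption about how formats are detected.

```python
from pathlib import Path
from typing import Optional

# Suffix -> optional extra, taken from the supported-format list above.
EXTRAS_BY_SUFFIX = {
    ".txt": None,
    ".md": None,
    ".json": None,
    ".docx": "docs",  # requires cortexadb[docs]
    ".pdf": "pdf",    # requires cortexadb[pdf]
}


def required_extra(file_path: str) -> Optional[str]:
    """Return the extra needed to load file_path, or None if none is needed.

    Hypothetical helper; raises ValueError for unlisted file types.
    """
    suffix = Path(file_path).suffix.lower()
    if suffix not in EXTRAS_BY_SUFFIX:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    return EXTRAS_BY_SUFFIX[suffix]
```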

Collections

.collection(name, readonly=False)

Returns a scoped view of the database for a specific collection.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | Required | Collection name |
| readonly | bool | False | If True, write operations raise errors |

Returns: Collection

Example:

col = db.collection("agent_a")
mid = col.add("text")
hits = col.query("query").execute()
col.delete(mid)
col.ingest("long text")

Maintenance Operations

.compact()

Reclaims disk space by removing deleted entries from segment files.

.flush()

Forces all pending writes to be synced to disk.

.checkpoint()

Creates a binary snapshot of the current state and truncates the WAL. Also saves the HNSW index if using HNSW mode.

.stats()

Returns database statistics.

Returns: Stats

Example:

stats = db.stats()
print(stats.entries)              # Total memory count
print(stats.indexed_embeddings)   # Embeddings in vector index
print(stats.wal_length)           # WAL file size in bytes
print(stats.vector_dimension)     # Configured vector dimension
print(stats.storage_version)      # Storage format version
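
The stats fields can drive simple maintenance decisions, such as when to call .checkpoint(). The sketch below is one possible heuristic. `maintenance_hints` and the 64 MiB default threshold are hypothetical. Reading entries minus indexed_embeddings as "entries without an indexed vector" is also an assumption.

```python
from typing import List


def maintenance_hints(stats, max_wal_bytes: int = 64 * 1024 * 1024) -> List[str]:
    """Suggest maintenance actions from a Stats-like object (hypothetical helper)."""
    hints = []
    # A large WAL suggests a checkpoint, which truncates it.
    if stats.wal_length > max_wal_bytes:
        hints.append("checkpoint")
    # Assumption: the difference counts entries that have no indexed embedding.
    missing = stats.entries - stats.indexed_embeddings
    if missing > 0:
        hints.append(f"{missing} entries without an indexed embedding")
    return hints
```

A caller might run db.checkpoint() whenever "checkpoint" appears in maintenance_hints(db.stats()).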

Replay Properties

.last_replay_report

Diagnostic report from the most recent replay() call.

Type: dict

| Key | Type | Description |
| --- | --- | --- |
| total_ops | int | Total operations in the log |
| applied | int | Successfully applied |
| skipped | int | Skipped (malformed) |
| failed | int | Failed (execution error) |
| op_counts | dict | Per-type counts |
| failures | list | Up to 50 failure details |
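
After a non-strict replay, the report keys above are enough to decide whether the restored database can be trusted. The following sketch condenses a report into one line. `summarize_replay` is a hypothetical helper that relies only on the documented keys.

```python
def summarize_replay(report: dict) -> str:
    """Condense a replay report into a one-line summary (hypothetical helper)."""
    total = report["total_ops"]
    applied = report["applied"]
    # Treat an empty log as fully applied to avoid dividing by zero.
    ratio = applied / total if total else 1.0
    return (
        f"{applied}/{total} ops applied ({ratio:.0%}), "
        f"{report['skipped']} skipped, {report['failed']} failed"
    )
```

For example, a report with 10 total operations, 8 applied, 1 skipped, and 1 failed produces "8/10 ops applied (80%), 1 skipped, 1 failed".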

.last_export_replay_report

Diagnostic report from the most recent export_replay() call.

Type: dict

| Key | Type | Description |
| --- | --- | --- |
| exported | int | Memories written |
| skipped_missing_embedding | int | Entries without vectors |
| skipped_missing_id | int | Gaps in ID space |
| errors | list | Unexpected errors |

Export

.export_replay(path)

Exports the current database state as a replay log.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| path | str | Output file path |

Types

Hit

Query result from .query().execute().

| Field | Type | Description |
| --- | --- | --- |
| id | int | Memory ID |
| score | float | Relevance score (0.0 to 1.0) |

Memory

Full memory entry from .get().

| Field | Type | Description |
| --- | --- | --- |
| id | int | Memory ID |
| collection | str | Collection name |
| content | bytes | Raw content |
| embedding | list[float]? | Vector embedding |
| metadata | dict[str, str] | Key-value metadata |
| created_at | int | Unix timestamp |
| importance | float | Importance score |

Stats

Database statistics from .stats().

| Field | Type | Description |
| --- | --- | --- |
| entries | int | Total memory count |
| indexed_embeddings | int | Embeddings in index |
| wal_length | int | WAL size in bytes |
| vector_dimension | int | Vector dimension |
| storage_version | int | Format version |

ChunkResult

Result from chunk().

| Field | Type | Description |
| --- | --- | --- |
| text | str | Chunk content |
| index | int | Zero-based index |
| metadata | dict? | Optional metadata |

Standalone Functions

chunk(text, strategy="recursive", chunk_size=512, overlap=50)

Chunks text without storing it.

Returns: list[ChunkResult]

Example:

from cortexadb import chunk

chunks = chunk("Long text...", strategy="recursive", chunk_size=512, overlap=50)

Exceptions

| Exception | Description |
| --- | --- |
| CortexaDBError | Base exception for all CortexaDB errors |
| CortexaDBNotFoundError | Memory or file not found |
| CortexaDBConfigError | Invalid configuration |
| CortexaDBIOError | I/O failure |
