Database security // tenant isolation
The Multi-Tenant RAG Nightmare
One forgotten filter should never let Tenant A's assistant retrieve Tenant B's contracts. Put the boundary inside PostgreSQL, where every query must obey it.
Security invariant
The application chooses a tenant; the database enforces the boundary
Retrieval code should not be trusted to remember isolation. It should establish an authenticated tenant context, while PostgreSQL decides which rows exist for that transaction. A query with no tenant context must see nothing.
01 // The fragile filter
WHERE tenant_id = $1 is useful, but not a security boundary
Application filtering fails open. One copied query omits the clause, an ORM's default scope is disabled for a maintenance job, or a new similarity endpoint accepts an optional tenant parameter. The SQL remains valid and returns plausible results, so the mistake can survive tests until a semantically similar document crosses the boundary in production.
A safe query becomes unsafe when one predicate disappears.
Raw SQL or an unscoped relation bypasses application conventions.
A client-controlled tenant value is trusted instead of verified membership.
Keep explicit tenant predicates for clarity and query planning, but make them defense in depth. Row-Level Security adds policy expressions to table access regardless of which controller, ORM, or background worker issued the query.
02 // Schema and policy
Build a fail-closed vector table
Use separate migration and runtime roles. The application login
must be NOSUPERUSER and NOBYPASSRLS.
PostgreSQL superusers and roles with BYPASSRLS always
bypass policies.
CREATE EXTENSION IF NOT EXISTS vector;
CREATE ROLE rag_app
LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE NOINHERIT NOBYPASSRLS;
CREATE TABLE rag_chunks (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
tenant_id uuid NOT NULL,
document_id uuid NOT NULL,
content text NOT NULL,
embedding vector(1536) NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX rag_chunks_tenant_idx ON rag_chunks (tenant_id);
CREATE INDEX rag_chunks_embedding_hnsw
ON rag_chunks USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
ALTER TABLE rag_chunks ENABLE ROW LEVEL SECURITY;
ALTER TABLE rag_chunks FORCE ROW LEVEL SECURITY;
CREATE POLICY tenant_rows ON rag_chunks
TO rag_app
USING (
tenant_id = NULLIF(current_setting('app.tenant_id', true), '')::uuid
)
WITH CHECK (
tenant_id = NULLIF(current_setting('app.tenant_id', true), '')::uuid
);
GRANT SELECT, INSERT, UPDATE, DELETE ON rag_chunks TO rag_app;
GRANT USAGE, SELECT ON SEQUENCE rag_chunks_id_seq TO rag_app;
Table owners normally bypass RLS. FORCE ROW LEVEL
SECURITY makes the owner subject to policy too. It does not
constrain superusers or BYPASSRLS roles, so the
production connection must never use either.
USING controls visible existing rows;
WITH CHECK rejects inserts or updates that assign a
different tenant. Missing context returns null, and the comparison
denies every row.
03 // HNSW behavior
Security and recall are different problems
Do not invent a multicolumn HNSW index containing
tenant_id. pgvector recommends a normal index on the
filter column plus an HNSW index on the vector. With approximate
indexes, filtering occurs after the index scan, so selective tenant
policies can reduce returned neighbors. RLS still protects rows;
recall may need tuning.
SET LOCAL hnsw.iterative_scan = strict_order;
SET LOCAL hnsw.ef_search = 100;
SELECT id, document_id, content,
embedding <=> $1::vector AS distance
FROM rag_chunks
ORDER BY embedding <=> $1::vector
LIMIT $2;
pgvector 0.8.0 introduced iterative scans, which continue scanning until enough filtered results are found or a scan limit is reached. For very large tenants, benchmark partitioning as well.
04 // Transaction-local identity
Inject verified tenant context with set_config
Resolve the tenant from a verified session and server-side
membership lookup—not a request header supplied by the browser.
Check out one pooled client, begin a transaction, and set the custom
configuration value locally. The third argument to
set_config is true, so context disappears
at transaction end and cannot bleed into the next pooled request.
export async function retrieve(pool, auth, embedding, limit = 8) {
const tenantId = await requireTenantMembership(auth);
const client = await pool.connect();
try {
await client.query("BEGIN READ ONLY");
await client.query(
"SELECT set_config('app.tenant_id', $1, true)",
[tenantId]
);
await client.query("SET LOCAL hnsw.iterative_scan = strict_order");
const result = await client.query(
`SELECT id, document_id, content
FROM rag_chunks
ORDER BY embedding <=> $1::vector
LIMIT $2`,
[JSON.stringify(embedding), limit]
);
await client.query("COMMIT");
return result.rows;
} catch (error) {
await client.query("ROLLBACK");
throw error;
} finally {
client.release();
}
}
Never split this across pool.query calls.
node-postgres requires every transaction statement to use the same
client. Parameterizing set_config also avoids building
SQL from identity strings.
The runtime role can set this custom value, so RLS prevents omitted-filter bugs—not a fully compromised application from forging tenant context. Never expose the database login to clients. Validate membership server-side. If database clients are mutually hostile, use distinct database identities or databases instead of trusting a shared role to assert tenancy.
05 // Proving isolation
Test the invariant through the runtime role
Seed two tenants with identical embeddings and unique canary text.
Run tests through rag_app, not an owner or administrator,
and deliberately omit application tenant predicates. The database
must still isolate results.
test("RLS survives a missing application filter", async () => {
await asTenant(TENANT_A, (db) =>
db.query(INSERT_CHUNK, [TENANT_A, DOC_A, "A_CANARY", VECTOR])
);
await asTenant(TENANT_B, (db) =>
db.query(INSERT_CHUNK, [TENANT_B, DOC_B, "B_CANARY", VECTOR])
);
const rows = await asTenant(TENANT_A, (db) =>
db.query(`
SELECT tenant_id, content
FROM rag_chunks
ORDER BY embedding <=> $1::vector
LIMIT 10
`, [JSON.stringify(VECTOR)])
);
assert.equal(rows.rows.some((row) => row.content === "B_CANARY"), false);
assert.equal(rows.rows.every((row) => row.tenant_id === TENANT_A), true);
});
test("missing tenant context defaults to deny", async () => {
const result = await runtimeClient.query("SELECT * FROM rag_chunks");
assert.equal(result.rowCount, 0);
});
The asTenant helper should use the same transaction
wrapper as production. Do not mock PostgreSQL, the policy, or the
pool: these tests exist to exercise the real execution identity and
transaction lifecycle. Run them against a fresh database created
from production migrations. Seed through tenant-scoped transactions
or a CI-only fixture role that is never available to the service.
Tenant A's nearest-neighbor query never returns Tenant B's canary.
A query before set_config returns zero rows.
A inserts or updates a row with B's tenant ID and receives an RLS error.
After A commits, a new transaction on the same connection sees no inherited context.
Assert rolsuper = false and rolbypassrls = false.
A similarity query with no explicit WHERE remains isolated.
Add these cases to CI and retain results with migration reviews, policy definitions, grants, and role-attribute evidence. This supports SOC 2 logical-access testing and GDPR security obligations; it does not automatically create compliance. Auditors need proof that the control is deployed, monitored, and cannot be bypassed by the production identity.
Test query plans and result counts too. An HNSW tuning change may
reduce recall without weakening isolation, while a role or policy
migration can weaken isolation even if retrieval quality looks
perfect. Alert separately on both classes of regression. Finally,
verify backup, analytics, and migration jobs use explicit privileged
identities with audited purposes; never “temporarily” grant
BYPASSRLS to the web application's role.
06 // The invariant
Make the unsafe query return nothing
RLS turns a forgotten filter from a breach into a harmless bug. Keep runtime roles unprivileged, force policy application, bind tenant context to one transaction, tune HNSW recall separately, and continuously test cross-tenant denial. The safest RAG query is not the one every developer remembers to secure; it is the one the database refuses to make unsafe. That invariant should survive every refactor, framework upgrade, and emergency patch.
Sources // primary documentation