Secure Multi-Tenant RAG with pgvector and PostgreSQL RLS

Security invariant

The application chooses a tenant; the database enforces the boundary

Retrieval code should not be trusted to remember isolation. It should establish an authenticated tenant context, while PostgreSQL decides which rows exist for that transaction. A query with no tenant context must see nothing.

01 // The fragile filter

`WHERE tenant_id = $1` is useful, but not a security boundary

Application filtering fails open. One copied query omits the clause, an ORM's default scope is disabled for a maintenance job, or a new similarity endpoint accepts an optional tenant parameter. The SQL remains valid and returns plausible results, so the mistake can survive tests until a semantically similar document crosses the boundary in production.

Copy-paste drift

A safe query becomes unsafe when one predicate disappears.

ORM escape hatch

Raw SQL or an unscoped relation bypasses application conventions.

Identity confusion

A client-controlled tenant value is trusted instead of verified membership.

Keep explicit tenant predicates for clarity and query planning, but make them defense in depth. Row-Level Security adds policy expressions to table access regardless of which controller, ORM, or background worker issued the query.

02 // Schema and policy

Build a fail-closed vector table

Use separate migration and runtime roles. The application login must be NOSUPERUSER and NOBYPASSRLS. PostgreSQL superusers and roles with BYPASSRLS always bypass policies.

001_secure_vectors.sqlPostgreSQL + pgvector

CREATE EXTENSION IF NOT EXISTS vector;

CREATE ROLE rag_app
  LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE NOINHERIT NOBYPASSRLS;

CREATE TABLE rag_chunks (
  id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  tenant_id   uuid NOT NULL,
  document_id uuid NOT NULL,
  content     text NOT NULL,
  embedding   vector(1536) NOT NULL,
  created_at  timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX rag_chunks_tenant_idx ON rag_chunks (tenant_id);
CREATE INDEX rag_chunks_embedding_hnsw
  ON rag_chunks USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

ALTER TABLE rag_chunks ENABLE ROW LEVEL SECURITY;
ALTER TABLE rag_chunks FORCE ROW LEVEL SECURITY;

CREATE POLICY tenant_rows ON rag_chunks
  TO rag_app
  USING (
    tenant_id = NULLIF(current_setting('app.tenant_id', true), '')::uuid
  )
  WITH CHECK (
    tenant_id = NULLIF(current_setting('app.tenant_id', true), '')::uuid
  );

GRANT SELECT, INSERT, UPDATE, DELETE ON rag_chunks TO rag_app;
GRANT USAGE, SELECT ON SEQUENCE rag_chunks_id_seq TO rag_app;

Why FORCE matters

Table owners normally bypass RLS. FORCE ROW LEVEL SECURITY makes the owner subject to policy too. It does not constrain superusers or BYPASSRLS roles, so the production connection must never use either.

USING controls visible existing rows; WITH CHECK rejects inserts or updates that assign a different tenant. Missing context returns null, and the comparison denies every row.

03 // HNSW behavior

Security and recall are different problems

Do not invent a multicolumn HNSW index containing tenant_id. pgvector recommends a normal index on the filter column plus an HNSW index on the vector. With approximate indexes, filtering occurs after the index scan, so selective tenant policies can reduce returned neighbors. RLS still protects rows; recall may need tuning.

transaction tuning and searchCosine distance

SET LOCAL hnsw.iterative_scan = strict_order;
SET LOCAL hnsw.ef_search = 100;

SELECT id, document_id, content,
       embedding <=> $1::vector AS distance
FROM rag_chunks
ORDER BY embedding <=> $1::vector
LIMIT $2;

pgvector 0.8.0 introduced iterative scans, which continue scanning until enough filtered results are found or a scan limit is reached. For very large tenants, benchmark partitioning as well.

04 // Transaction-local identity

Inject verified tenant context with `set_config`

Resolve the tenant from a verified session and server-side membership lookup—not a request header supplied by the browser. Check out one pooled client, begin a transaction, and set the custom configuration value locally. The third argument to set_config is true, so context disappears at transaction end and cannot bleed into the next pooled request.

retrieve.jsnode-postgres

export async function retrieve(pool, auth, embedding, limit = 8) {
  const tenantId = await requireTenantMembership(auth);
  const client = await pool.connect();

  try {
    await client.query("BEGIN READ ONLY");
    await client.query(
      "SELECT set_config('app.tenant_id', $1, true)",
      [tenantId]
    );
    await client.query("SET LOCAL hnsw.iterative_scan = strict_order");

    const result = await client.query(
      `SELECT id, document_id, content
         FROM rag_chunks
        ORDER BY embedding <=> $1::vector
        LIMIT $2`,
      [JSON.stringify(embedding), limit]
    );

    await client.query("COMMIT");
    return result.rows;
  } catch (error) {
    await client.query("ROLLBACK");
    throw error;
  } finally {
    client.release();
  }
}

Never split this across pool.query calls. node-postgres requires every transaction statement to use the same client. Parameterizing set_config also avoids building SQL from identity strings.

Know the trust boundary

The runtime role can set this custom value, so RLS prevents omitted-filter bugs—not a fully compromised application from forging tenant context. Never expose the database login to clients. Validate membership server-side. If database clients are mutually hostile, use distinct database identities or databases instead of trusting a shared role to assert tenancy.

05 // Proving isolation

Test the invariant through the runtime role

Seed two tenants with identical embeddings and unique canary text. Run tests through rag_app, not an owner or administrator, and deliberately omit application tenant predicates. The database must still isolate results.

tenant-isolation.test.jsIntegration test sketch

test("RLS survives a missing application filter", async () => {
  await asTenant(TENANT_A, (db) =>
    db.query(INSERT_CHUNK, [TENANT_A, DOC_A, "A_CANARY", VECTOR])
  );
  await asTenant(TENANT_B, (db) =>
    db.query(INSERT_CHUNK, [TENANT_B, DOC_B, "B_CANARY", VECTOR])
  );

  const rows = await asTenant(TENANT_A, (db) =>
    db.query(`
      SELECT tenant_id, content
      FROM rag_chunks
      ORDER BY embedding <=> $1::vector
      LIMIT 10
    `, [JSON.stringify(VECTOR)])
  );

  assert.equal(rows.rows.some((row) => row.content === "B_CANARY"), false);
  assert.equal(rows.rows.every((row) => row.tenant_id === TENANT_A), true);
});

test("missing tenant context defaults to deny", async () => {
  const result = await runtimeClient.query("SELECT * FROM rag_chunks");
  assert.equal(result.rowCount, 0);
});

The asTenant helper should use the same transaction wrapper as production. Do not mock PostgreSQL, the policy, or the pool: these tests exist to exercise the real execution identity and transaction lifecycle. Run them against a fresh database created from production migrations. Seed through tenant-scoped transactions or a CI-only fixture role that is never available to the service.

Cross-tenant read

Tenant A's nearest-neighbor query never returns Tenant B's canary.

Missing context

A query before set_config returns zero rows.

Write violation

A inserts or updates a row with B's tenant ID and receives an RLS error.

Pool reuse

After A commits, a new transaction on the same connection sees no inherited context.

Role posture

Assert rolsuper = false and rolbypassrls = false.

Raw SQL regression

A similarity query with no explicit WHERE remains isolated.

Add these cases to CI and retain results with migration reviews, policy definitions, grants, and role-attribute evidence. This supports SOC 2 logical-access testing and GDPR security obligations; it does not automatically create compliance. Auditors need proof that the control is deployed, monitored, and cannot be bypassed by the production identity.

Test query plans and result counts too. An HNSW tuning change may reduce recall without weakening isolation, while a role or policy migration can weaken isolation even if retrieval quality looks perfect. Alert separately on both classes of regression. Finally, verify backup, analytics, and migration jobs use explicit privileged identities with audited purposes; never “temporarily” grant BYPASSRLS to the web application's role.

06 // The invariant

Make the unsafe query return nothing

RLS turns a forgotten filter from a breach into a harmless bug. Keep runtime roles unprivileged, force policy application, bind tenant context to one transaction, tune HNSW recall separately, and continuously test cross-tenant denial. The safest RAG query is not the one every developer remembers to secure; it is the one the database refuses to make unsafe. That invariant should survive every refactor, framework upgrade, and emergency patch.

Sources // primary documentation

The Multi-Tenant RAG Nightmare

The application chooses a tenant; the database enforces the boundary

`WHERE tenant_id = $1` is useful, but not a security boundary

Build a fail-closed vector table

Security and recall are different problems

Inject verified tenant context with `set_config`

Test the invariant through the runtime role

Make the unsafe query return nothing

Technical references

The application chooses a tenant; the database enforces the boundary

WHERE tenant_id = $1 is useful, but not a security boundary

Build a fail-closed vector table

Security and recall are different problems

Inject verified tenant context with set_config

Test the invariant through the runtime role

Make the unsafe query return nothing

Technical references

`WHERE tenant_id = $1` is useful, but not a security boundary

Inject verified tenant context with `set_config`