Solution
End-to-end attested LLM.
Verify the answer came from the model you trust, on hardware you can prove. Privasys AI runs open-weight models inside Intel TDX confidential VMs with NVIDIA H100 GPUs in Confidential Computing (CC) mode, so every chat session ships with a hardware-signed receipt of the exact code, model and configuration that produced it.
Three things every Privasys AI session gives you.
Confidential VM
Inference runs inside an Intel TDX trust domain with the GPU in NVIDIA Confidential Computing mode. CPU memory and GPU VRAM are encrypted and isolated from the cloud operator. Even Privasys cannot see your prompts.
Reproducible inference
Every response carries the model digest, server image hash and seed metadata. Anyone can rebuild the exact runtime from source and replay the same generation. The answer is auditable, not just trusted.
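A minimal sketch of what that replay check can look like. The receipt field names (`model_digest`, `image_hash`, `seed`) are illustrative, not the actual Privasys response schema; the point is that anyone holding the receipt can rebuild the artifacts from source and compare digests.

```python
import hashlib

# Hypothetical receipt attached to a response -- field names are illustrative.
receipt = {
    "model_digest": hashlib.sha256(b"model-weights").hexdigest(),
    "image_hash": hashlib.sha256(b"server-image").hexdigest(),
    "seed": 42,
}

def matches_local_rebuild(receipt, local_model_bytes, local_image_bytes):
    """Compare the receipt's digests against artifacts rebuilt from source.

    If both digests match and generation is seeded, replaying with the same
    seed should reproduce the same output.
    """
    return (
        receipt["model_digest"] == hashlib.sha256(local_model_bytes).hexdigest()
        and receipt["image_hash"] == hashlib.sha256(local_image_bytes).hexdigest()
    )

assert matches_local_rebuild(receipt, b"model-weights", b"server-image")
assert not matches_local_rebuild(receipt, b"tampered-weights", b"server-image")
```

Any divergence between the receipt and your rebuild is immediately visible as a digest mismatch, rather than a matter of trust.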
Attested chat
Before the first prompt leaves your browser, the chat client verifies a fresh TDX quote bound to the connection’s TLS key. You see the exact model and code hash you are talking to, signed by the hardware.
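The binding between quote and connection can be sketched as follows. This assumes the common RA-TLS convention of placing a SHA-512 hash of the connection's TLS public key in the quote's 64-byte report-data field; the exact binding scheme Privasys uses may differ.

```python
import hashlib

def bind_report_data(tls_pubkey_der: bytes) -> bytes:
    """What the trust domain embeds in the quote's 64-byte report_data field:
    a SHA-512 hash of the TLS public key it serves (SHA-512 output is exactly
    64 bytes, matching the field size)."""
    return hashlib.sha512(tls_pubkey_der).digest()

def quote_matches_connection(report_data: bytes, tls_pubkey_der: bytes) -> bool:
    """Client-side check: the key the hardware attested is the same key this
    TLS connection negotiated, so no intermediary can sit between them."""
    return report_data == bind_report_data(tls_pubkey_der)

pubkey = b"hypothetical DER-encoded TLS public key"
quote_report_data = bind_report_data(pubkey)  # extracted from a verified quote
assert quote_matches_connection(quote_report_data, pubkey)
assert not quote_matches_connection(quote_report_data, b"mitm key")
```

Because the quote is signed by the hardware and bound to the key, a fresh quote over a different key (a man-in-the-middle) fails this check even if the quote itself is genuine.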
How it works.
Hardware-rooted trust chain
The TDX module measures the boot kernel, the verified read-only root filesystem, the inference server image and the model weights into the TDX runtime measurement registers (RTMRs). The H100 attests its CC-mode firmware over SPDM. Both evidence trees are folded into the TLS certificate the chat client sees.
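RTMRs work as an ordered hash chain: each measurement extends the register by hashing the old value together with the new event digest (TDX RTMRs are 48-byte SHA-384 registers). A minimal sketch, with illustrative artifact names:

```python
import hashlib

def rtmr_extend(rtmr: bytes, event_digest: bytes) -> bytes:
    """TDX-style measurement extension: the new register value hashes the old
    value together with the event digest, so both content and order matter."""
    return hashlib.sha384(rtmr + event_digest).digest()

rtmr = b"\x00" * 48  # registers start zeroed
for artifact in (b"kernel", b"rootfs", b"inference-server-image", b"model-weights"):
    rtmr = rtmr_extend(rtmr, hashlib.sha384(artifact).digest())

# Changing any artifact, or the order they are measured in, changes the
# final register value -- and therefore the quote the client verifies.
```

This is why a single 48-byte value in the quote can commit to the entire boot-to-model stack.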
OpenAI-compatible API, attested edges
Each fleet exposes a vLLM-backed OpenAI-compatible endpoint behind a gateway that performs remote-attestation TLS (RA-TLS). Existing tooling works unchanged; the only difference is that you can prove who answered.
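"Unchanged" means the request body is standard OpenAI chat-completions JSON; a client only needs to point its base URL at the fleet. The URL and model id below are hypothetical placeholders:

```python
import json

# Hypothetical endpoint and model id -- substitute your fleet's values.
BASE_URL = "https://fleet.example/v1"

payload = {
    "model": "llama-3-70b",  # illustrative open-weight model id
    "messages": [{"role": "user", "content": "Hello"}],
}

# This is the exact body an OpenAI-compatible client would POST to
# BASE_URL + "/chat/completions"; no Privasys-specific fields are required.
body = json.dumps(payload)
```

Official OpenAI SDKs that accept a configurable base URL can talk to such an endpoint without code changes; the attestation happens at the TLS layer underneath.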
Distributed attestation verifier
Quote signature verification runs against an independent attestation server, so the inference VM cannot lie about its own attestation. The verifier's policy and code hashes are part of the published trust chain.
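Conceptually, the independent verifier holds a published allowlist of acceptable measurements and decides whether evidence passes, outside the inference VM's control. A minimal sketch with illustrative policy keys:

```python
# Hypothetical published policy: the measurement values the independent
# verifier will accept. Key names and values are illustrative only.
POLICY = {
    "server_image_hash": {"abc123"},          # allowed inference-server builds
    "model_digest": {"def456", "789aaa"},     # allowed model builds
}

def verify_against_policy(evidence: dict) -> bool:
    """The verifier, not the inference VM, decides whether evidence passes:
    every policy key must be present in the evidence with an allowed value."""
    return all(evidence.get(key) in allowed for key, allowed in POLICY.items())

assert verify_against_policy({"server_image_hash": "abc123", "model_digest": "def456"})
assert not verify_against_policy({"server_image_hash": "evil", "model_digest": "def456"})
```

Because the policy and the verifier's own code hashes are published, you can audit what "passes verification" actually means rather than trusting the operator's word.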
Per-tenant isolation by design
Dedicated fleets get their own VM, their own model menu and their own private retrieval store. Public fleets share infrastructure but never state. Quota and identity are enforced by Privasys ID, not by the inference node.
How it compares.
vs. closed APIs
OpenAI, Anthropic and the like give you a smart endpoint and a policy promise. There is no cryptographic proof of which model answered, what code ran, or where your data went after it left your browser. Privasys AI gives you all three on every request.
vs. self-hosted vLLM
Self-hosted vLLM gives you control of the box. Privasys AI gives you the same control plus a hardware-attested trust chain, a managed attestation verifier, reproducibility metadata, and a chat front-end your users can verify without reading PEM files.
Honest about the boundaries.
Confidential computing protects against the cloud operator and the host OS, not against bugs in the model itself or in the inference server. Attestation proves what code ran; it does not prove that code is correct. We publish the full source, the build recipe and the patch set, and we make it easy to rebuild and diff. The trust chain is only as strong as what you actually verify.