Technology

How Enclara runs AI on encrypted medical images

A deep dive into the cryptography, model design, and system architecture that makes private diagnosis possible.

Primer

What is Fully Homomorphic Encryption?

FHE is a form of encryption that allows computation on ciphertext. The result, when decrypted, matches the result of performing the same operations on the plaintext — but no one except the key holder ever sees the unencrypted data.

1. Encrypt

Data is encrypted client-side with a secret key. It becomes mathematically opaque to anyone without the key.

2. Compute

The server performs additions and multiplications directly on the encrypted values. It processes data it cannot read.

3. Decrypt

The encrypted result is returned to the client. Only the secret key holder can decrypt the output.

Enc(x) + Enc(y) = Enc(x + y)   |   Enc(x) * Enc(y) = Enc(x * y)
The server sees ciphertext. The math still works. The patient sees results.
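To make the identities above concrete, here is a deliberately insecure toy scheme in Python that satisfies both of them. This is an illustration only: real FHE schemes (such as the TFHE variant underlying Concrete-ML) add randomized noise, work over polynomial rings, and manage noise growth with bootstrapping. None of the names below come from Enclara's code.

```python
# Toy multiplicative-mask cipher: Enc(x) = x * k mod p.
# INSECURE; it only illustrates the homomorphic identities.
import random

p = 2**61 - 1  # public prime modulus (toy parameter)

def keygen():
    return random.randrange(2, p)

def encrypt(k, x):
    return (x * k) % p

def decrypt(k, c, depth=1):
    # After `depth` multiplications the mask is k**depth, so invert it
    return (c * pow(k, -depth, p)) % p

k = keygen()
cx, cy = encrypt(k, 6), encrypt(k, 7)

assert decrypt(k, (cx + cy) % p) == 13            # Enc(x) + Enc(y) decrypts to x + y
assert decrypt(k, (cx * cy) % p, depth=2) == 42   # Enc(x) * Enc(y) decrypts to x * y
```

Note how multiplication deepens the mask: real FHE circuits face the same effect as noise growth, which is why multiplicative depth dominates the cost discussion below.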

Model

QuantVGG11Patch

A 5-bit quantized VGG11 variant purpose-built for FHE. Every layer is chosen to be compatible with encrypted arithmetic.

Standard VGG11               | QuantVGG11Patch (FHE)          | Why
Conv2d                       | QuantConv2d (5-bit weights)    | Reduces multiplication depth
ReLU                         | QuantReLU (5-bit activations)  | Bounded activations for FHE polynomial approximation
MaxPool2d                    | AvgPool2d                      | Max is a comparison, expensive in FHE
FC layers (4096, 4096, 1000) | Single QuantLinear(512, 7)     | Fewer parameters = faster encrypted inference
224×224 input                | 32×32 patch input              | Smaller input = manageable FHE circuit size
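As a sketch of how such a network could be declared with Brevitas quantized layers (the QAT frontend Concrete-ML compiles from): the layer choices mirror the table, but the exact quantizer settings and structure are assumptions, not Enclara's actual model code.

```python
# Sketch of a QuantVGG11Patch-style network in PyTorch + Brevitas.
import torch.nn as nn
import brevitas.nn as qnn

# VGG11 feature config with every pool swapped for AvgPool ("P")
CFG = [64, "P", 128, "P", 256, 256, "P", 512, 512, "P", 512, 512, "P"]

class QuantVGG11Patch(nn.Module):
    def __init__(self, num_classes=7, bits=5):
        super().__init__()
        layers = [qnn.QuantIdentity(bit_width=bits)]  # quantize the input
        in_ch = 3
        for v in CFG:
            if v == "P":
                layers.append(nn.AvgPool2d(2))  # FHE-friendly: additions + constant scale
            else:
                layers += [
                    qnn.QuantConv2d(in_ch, v, kernel_size=3, padding=1,
                                    weight_bit_width=bits),
                    qnn.QuantReLU(bit_width=bits),  # bounded, low-bit activations
                ]
                in_ch = v
        self.features = nn.Sequential(*layers)
        # Five 2x2 pools shrink a 32x32 patch to a 1x1x512 feature map
        self.classifier = qnn.QuantLinear(512, num_classes, bias=True,
                                          weight_bit_width=bits)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```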

PatchAggregator

The aggregation layer sits outside the FHE circuit. It takes the 49 per-patch classification vectors (each with 7 logits), computes the max logit per class across all patches, and returns the final diagnosis. This runs in plaintext on the client after decryption — no encrypted comparisons needed.
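The aggregation itself is only a few lines. A minimal NumPy sketch of the behavior described above (the function name is illustrative):

```python
import numpy as np

def aggregate_patches(patch_logits: np.ndarray) -> int:
    """patch_logits: (49, 7) array of decrypted per-patch logits."""
    per_class_max = patch_logits.max(axis=0)  # strongest evidence per class across patches
    return int(per_class_max.argmax())        # index of the final diagnosis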

Pipeline

From training to encrypted inference

01. Train (train.py)
  • Fine-tune QuantVGG11Patch on HAM10000 (10,015 dermoscopic images, 7 classes)
  • Initialize conv layers from pretrained ImageNet VGG11 weights
  • Split train/val by lesion_id to prevent data leakage between patients
  • Inverse-frequency class weighting handles severe class imbalance (this and the split are sketched below)
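A minimal sketch of the lesion-level split and the class weighting, assuming the standard HAM10000 metadata file with lesion_id and dx columns; names and paths are illustrative, not Enclara's actual train.py.

```python
import pandas as pd
import torch
from sklearn.model_selection import GroupShuffleSplit

meta = pd.read_csv("HAM10000_metadata.csv")

# Group by lesion_id: all images of one lesion land on the same side,
# so the same patient's lesion never appears in both train and val
splitter = GroupShuffleSplit(n_splits=1, test_size=0.15, random_state=0)
train_idx, val_idx = next(splitter.split(meta, groups=meta["lesion_id"]))

# Inverse-frequency weights counter the severe class imbalance
counts = meta.iloc[train_idx]["dx"].value_counts().sort_index()
class_weights = torch.tensor((counts.sum() / counts).values, dtype=torch.float32)
criterion = torch.nn.CrossEntropyLoss(weight=class_weights)
```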

02. Compile to FHE (fhe_convert.py)
  • Load trained weights into QuantVGG11Patch
  • Build calibration dataset from random validation patches
  • Compile to FHE circuit via Concrete-ML’s compile_brevitas_qat_model
  • Optionally sweep rounding_threshold_bits (8→4) for speed/accuracy tradeoffs
  • Outputs client.zip (quantization params + keys) and server circuit (see the sketch below)
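A minimal sketch of this step using Concrete-ML's compile and deployment APIs, assuming `model` is the trained QuantVGG11Patch; the calibration stand-in and paths are illustrative.

```python
import numpy as np
from concrete.ml.torch.compile import compile_brevitas_qat_model
from concrete.ml.deployment import FHEModelDev

# Calibration set: in practice, random 32x32 patches from validation images
calib = np.random.rand(100, 3, 32, 32).astype(np.float32)

quantized_module = compile_brevitas_qat_model(
    model,
    calib,
    rounding_threshold_bits=6,  # sweep 8 -> 4 to trade accuracy for speed
)

# Writes the client artifact (quantization params) and the server circuit
FHEModelDev("deployment", quantized_module).save()
```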

03. Client encryption (mobile app)
  • On first launch: generate FHE secret key + evaluation key via Concrete-ML
  • Upload evaluation key to server with a random client ID for reuse
  • For each scan: crop to 224×224, split into 49 patches (32×32)
  • Quantize each patch per client.zip parameters, encrypt, send one-by-one (sketched below)
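A minimal sketch of the client side with Concrete-ML's deployment client, assuming the client.zip produced above; the upload transport and client-ID handling are omitted.

```python
import numpy as np
from concrete.ml.deployment import FHEModelClient

client = FHEModelClient("deployment", key_dir="keys")
client.generate_private_and_evaluation_keys()        # first launch only
eval_keys = client.get_serialized_evaluation_keys()  # upload once with a random client ID

image = np.random.rand(3, 224, 224).astype(np.float32)  # stand-in for a cropped scan
patches = [
    image[:, r:r + 32, c:c + 32][None]  # shape (1, 3, 32, 32)
    for r in range(0, 224, 32)
    for c in range(0, 224, 32)
]  # 49 patches on the 7x7 grid

# Quantize + encrypt each patch, then send them to the server one by one
encrypted_patches = [client.quantize_encrypt_serialize(p) for p in patches]
```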

04. Server inference (AWS EC2)
  • Receive 49 encrypted patches + client ID
  • Load evaluation key for that client
  • Run FHE circuit on each encrypted patch
  • Return 49 encrypted classification vectors (7 logits each; see the sketch below)
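A minimal sketch of the server loop (the HTTP layer is omitted), assuming the server circuit artifact and per-client evaluation keys described above.

```python
from concrete.ml.deployment import FHEModelServer

server = FHEModelServer("deployment")
server.load()

def infer(encrypted_patches, eval_keys):
    # Each result is still a ciphertext; the server never sees plaintext
    return [server.run(patch, eval_keys) for patch in encrypted_patches]
```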

Design decisions

Why every choice exists

Why patches instead of the full image?

FHE circuit complexity grows with input size. A 32×32 patch keeps the circuit small enough for practical inference times. The 7×7 grid covers the full 224×224 image.

Why 5-bit quantization?

Lower bit-width reduces the multiplicative depth of the encrypted circuit. 5 bits is the sweet spot where model accuracy is preserved while FHE computation remains feasible.

Why AvgPool instead of MaxPool?

Max requires comparison operations, which are extremely expensive in FHE: a comparison cannot be evaluated directly on ciphertexts and must be built from many costly encrypted operations. Average pooling uses only additions and division by a constant.

Why a single linear classifier?

VGG11 normally ends in three fully-connected layers (4096, 4096, and 1000 neurons). Each encrypted multiplication adds latency. A single QuantLinear(512, 7) minimizes the FC overhead.

Why send patches one-by-one?

Encrypted payloads are large: each ciphertext patch dwarfs its plaintext, and the evaluation key alone is sizable. Sending patches sequentially keeps peak memory manageable on both client and server.

Why client-side aggregation?

The PatchAggregator takes the max logit per class across 49 patches. Max is a comparison — doing this in FHE would be expensive. In plaintext after decryption, it’s trivial.

Architecture

Client-server trust model

Client (iOS app)

  • Holds secret key — never leaves device
  • Generates evaluation key (public, sent to server once)
  • Encrypts patches, decrypts results
  • Runs PatchAggregator in plaintext (decryption and aggregation are sketched after this list)
  • Stores scan history locally as JSON
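
Continuing the client-side sketch from the pipeline section: once the 49 encrypted vectors come back, decryption and aggregation take a couple of lines (again assuming Concrete-ML's deployment API; `encrypted_results` is illustrative).

```python
import numpy as np

# `client` is the FHEModelClient from the earlier sketch
logits = np.vstack(
    [client.deserialize_decrypt_dequantize(r) for r in encrypted_results]
)  # shape (49, 7)
diagnosis = int(logits.max(axis=0).argmax())  # PatchAggregator: max per class, then argmax
```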

Server (AWS EC2)

  • Holds only evaluation keys and FHE circuit
  • Cannot decrypt any patient data
  • Processes encrypted patches through FHE circuit
  • Returns encrypted classification vectors
  • Stateless per-inference — no patient data stored

Key insight: Even a compromised server reveals nothing. Without the secret key, the encrypted patches and results are computationally indistinguishable from random noise.

Programmable Cryptography

Cryptography as a cooperation tool

FHE transforms the trust model for medical AI. Instead of asking patients to trust a server, Enclara lets them verify mathematically that their data was never exposed. This is what programmable cryptography unlocks — useful computation between parties who don't need to trust each other.