
ZisK Prover Setup

Raiko uses the ZisK GPU backend to generate ZK proofs for Surge blocks. This guide covers the two-VM setup (recommended), the same-VM setup (4x L40+ only), and how to wire it into a Surge devnet via simple-surge-node.

How real-time proving works

  1. Catalyst receives a new L2 block.
  2. Raiko generates a ZK proof using the ZisK GPU backend (~10–17s steady state).
  3. The proof is submitted atomically with the block proposal to the RealTimeInbox L1 contract.
  4. The block is finalized immediately on L1.

No bonds, no proving windows, no on-chain prover registration.
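
Once the devnet is running, you can watch this loop live by tailing the Catalyst container (the container name comes from the Troubleshooting section below; the grep pattern is a guess, so adjust it to whatever your build actually logs):

docker logs -f l2-catalyst-node 2>&1 | grep -iE 'proof|propose'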

Hardware Requirements

Component | Minimum (prover only)    | Recommended
GPU       | 1x RTX 3090 (24 GB VRAM) | 8x RTX 5090
RAM       | 64 GB                    | 256 GB
CPU       | 32 cores                 | 64 cores
Disk      | 150 GB SSD               | 300 GB NVMe
OS        | Ubuntu 22.04+ (Docker)   | Ubuntu 24.04 (bare metal needs GLIBC ≥ 2.36)

A single RTX 3090 is the floor: proofs will land, but the cold start takes several minutes and steady-state proving is much slower than the multi-GPU rows below. If you want the devnet to feel real-time, plan on at least a multi-GPU L40 or RTX 5090 setup.
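
Before installing anything, a quick inventory check confirms the machine clears the minimum row. These are standard tools, nothing Raiko-specific:

# GPU model and VRAM
nvidia-smi --query-gpu=name,memory.total --format=csv
# CPU cores, RAM, and free disk
nproc
free -g
df -h ~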

Performance

GPU config  | Steady-state proof time
1x L40      | ~27s
4x L40      | ~15s
8x L40      | ~13–14s
8x RTX 5090 | ~10–11s
Cold start on a single GPU is much longer than the table suggests

The performance table measures steady-state proof time. The first proof after Raiko starts (or restarts with a new chain spec) goes through ZisK proofman init, fixed-poly loading, SNARK prover init, and ASM microservice startup before it can prove. On a single L40 this takes ~16 minutes, not the ~70 seconds older docs claimed. Subsequent proofs settle to ~3 min on the same hardware. Multi-GPU configs reach the steady-state numbers in the table much sooner because they parallelize the warm-up.

If curl http://<prover-ip>:8080/guest_data already returns the vkey, Raiko is listening but not necessarily warm — the warm-up happens on the first real proof request, not on container start.

Step 1 — On the Prover VM

# Clone raiko
git clone https://github.com/NethermindEth/raiko.git
cd raiko

# Install system dependencies (apt packages, Rust, CUDA 12.x with linker pinning).
# Also detects and purges legacy CUDA 11.x leftovers that cause
# "undefined symbol: cudaGetDeviceProperties_v2" on Hopper/Blackwell GPUs.
./script/install-zisk-deps.sh

# Install ZisK SDK + proving keys (~150 GB to ~/.zisk)
TARGET=zisk make install
# Or, on a small home dir:
# ZISK_DIR=/path/to/large/disk TARGET=zisk make install
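
# Optional sanity check (not part of the Raiko tooling): confirm the target
# disk really has ~150 GB free before kicking off the download
df -h "${ZISK_DIR:-$HOME}"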

# Start Raiko
cp docker/.env.sample.zk docker/.env
docker compose -f docker/docker-compose-zk.yml up -d

# Wait for the vkey (~4–5 min multi-GPU; ~16 min single-GPU cold start)
curl localhost:8080/guest_data

Expected vkey response:

{"zisk":{"batch_vkey":"2ccafa601f5e29e4d61fc2d96474c98c3b752999b97af38c4c988c0fee24f0a0"}}

Open TCP/8080 to the L2 stack VM and verify reachability:

# From the L2 stack VM
curl http://<prover-ip>:8080/guest_data

Step 2 — On the L2 stack VM

git clone https://github.com/NethermindEth/simple-surge-node.git
cd simple-surge-node
git submodule update --init --recursive
cp .env.devnet .env

Edit .env:

RAIKO_HOST_ZKVM=http://<prover-ip>:8080
note

Don't pass RAIKO_HOST_ZKVM inline — deploy-surge-full.sh sources .env on every run and inline exports get overridden.
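
If you prefer scripting the edit, a sed one-liner does it (this assumes the .env copied from .env.devnet already contains a RAIKO_HOST_ZKVM line; append one with echo if it doesn't):

# Point Catalyst's proof requests at the prover VM
sed -i 's|^RAIKO_HOST_ZKVM=.*|RAIKO_HOST_ZKVM=http://<prover-ip>:8080|' .env
grep RAIKO_HOST_ZKVM .env  # verify the value took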

Deploy:

./deploy-surge-full.sh \
--environment devnet \
--deploy-devnet true \
--deployment remote \
--stack-option 2 \
--mode silence \
--force

The script automatically:

  1. Fetches the ZisK batch vkey from <prover-ip>:8080/guest_data
  2. Registers it on the on-chain ZisK verifier
  3. Generates configs/chain_spec_list.json and configs/config.json
  4. Starts Catalyst configured to send proof requests to the prover
  5. Runs the built-in Raiko readiness check (see deploy-surge.mdx)
  6. Sends DEX setup transactions on L1 and L2 (setL1Vault, resolver registration)
Step 5 will halt the deploy on its own — that's by design

After Catalyst is up, the script polls <prover-ip>:8080/guest_data and aborts with a clear error before any DEX/L2 transaction if Raiko isn't responding with the expected vkey. This is the proof gate: until you complete step 3 below (sync configs + restart Raiko), the prover is still running against its default chain spec and any L2 tx the deploy sent would reorg out.

What you'll typically see:

  • Multi-GPU — step 3's restart can finish during the readiness window, the check passes, and the deploy continues straight through DEX setup.
  • Single-GPU — the cold start is longer than the 30-min readiness budget. Expect step 2's deploy to abort at the readiness check with the "sync configs + force-recreate" hint. Do step 3, then come back and re-run via step 4.

Step 3 — Sync configs back to the Prover VM (proof gate)

Raiko started with the default chain spec; replace it with the real one, restart, and wait for it to be warm again before doing anything else:

# From the L2 stack VM
scp configs/chain_spec_list.json <prover-host>:~/raiko/host/config/devnet/chain_spec_list.json
scp configs/config.json <prover-host>:~/raiko/host/config/devnet/config.json

# On the Prover VM
cd ~/raiko
docker compose -f docker/docker-compose-zk.yml up -d --force-recreate

# Gate: do not proceed until the new vkey is back
curl localhost:8080/guest_data
# {"zisk":{"batch_vkey":"<64 hex>"}} — must match the vkey deploy registered

--force-recreate is needed so the container picks up the remounted config files. The first proof after this restart triggers another cold start (~16 min on single-GPU; quicker on multi-GPU). /guest_data returning the vkey only means Raiko is listening; on single-GPU configs, the warm-up only completes once the first real proof request lands (~16 min later), so also watch nvidia-smi and wait for GPU utilisation to drop back to idle.
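
To make the gate mechanical instead of eyeball-driven, you can loop until the prover reports the exact vkey the deploy registered (a sketch; substitute your registered vkey for the placeholder):

# Block until Raiko serves the registered vkey
EXPECTED="<registered-vkey>"
until curl -sf localhost:8080/guest_data | grep -q "$EXPECTED"; do
  echo "vkey not served yet, retrying in 30s..."
  sleep 30
done
echo "Raiko is serving the expected vkey"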

Step 4 — Finish the L2 deploy (single-GPU only)

If step 2's deploy aborted at the Raiko readiness check (single-GPU expected behaviour), re-run after step 3:

# On the L2 stack VM
cd simple-surge-node
./deploy-surge-full.sh \
--environment devnet \
--deploy-devnet false \
--deployment remote \
--stack-option 2 \
--mode silence \
--force

--deploy-devnet false skips the Kurtosis L1 redeploy. The script picks up at the readiness check; with Raiko now on the right chain spec, the check passes, setL1Vault lands in an L2 block, Catalyst gets a real proof, and the DEX setup completes.
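
A quick way to confirm blocks are really finalizing again is to watch the L2 head advance (a hypothetical check; point --rpc-url at your stack's exposed L2 RPC endpoint):

# The number should increase between calls once proofs are flowing
cast block-number --rpc-url http://<l2-rpc-url>
sleep 12
cast block-number --rpc-url http://<l2-rpc-url>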

Only delete locks if the readiness check ran AFTER DEX work started

With the readiness check now ahead of DEX deployment, a step-2 abort halts before any DEX lock files are touched, so the re-run picks up cleanly. If you're recovering from an older deploy that failed during DEX linking, the link/dex .lock files may have been written before a cast send actually confirmed — delete them before retrying:

rm -f deployment/link_vaults_l1.lock \
deployment/link_vaults_l2.lock \
deployment/cross_chain_dex.lock

If cast call <l2_vault> "l1Vault()(address)" returns "contract does not have any code", the L2 was reorged back to genesis and cross-chain-dex-l2.json's addresses are now stale — tear down and redeploy the L2 stack instead of retrying.

Same-VM deployment

Realistic only with 4x L40+ on one host. Catalyst's container reaches Raiko via the host gateway:

# In simple-surge-node/.env
RAIKO_HOST_ZKVM=http://host.docker.internal:8080

A bare localhost won't work — Catalyst is in a container.

The rest of the flow matches two-VM, minus the scp + restart step (configs are local to the host).
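
Once the stack is up, you can verify the gateway route with the same reachability check used in Troubleshooting, run from inside the Catalyst container:

docker exec l2-catalyst-node curl -m 5 http://host.docker.internal:8080/guest_data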

Switching from mock to real prover

If your devnet is already running with --mock-prover, the on-chain verifier needs to change (from ProofVerifierDummy to SurgeVerifier), so the L1 contracts must be redeployed:

# Wipe everything except the L1 Kurtosis enclave
./remove-surge-full.sh \
--remove-l1-devnet false \
--remove-l2-stack true \
--remove-data true \
--remove-configs true \
--force

# Set RAIKO_HOST_ZKVM in .env first, then:
./deploy-surge-full.sh \
--environment devnet \
--deploy-devnet false \
--deployment remote \
--stack-option 2 \
--mode silence \
--force

Sync the new configs to the Prover VM and restart Raiko with --force-recreate, exactly as in Step 3.

Troubleshooting

Batches: 0 in Catalyst logs. Test reachability from inside the Catalyst container:

# Source .env first so $RAIKO_HOST_ZKVM is set in your shell
docker exec l2-catalyst-node curl -m 5 "$RAIKO_HOST_ZKVM/guest_data"

If that fails, the endpoint is wrong. Same-VM uses host.docker.internal:8080; two-VM uses the prover's public/LAN IP.

Driver logs show currOperator=0x0. Harmless legacy log line. The realtime fork doesn't use the preconf whitelist — proposing flows through RealTimeInbox.propose() permissionlessly. Catalyst already considers itself the operator (Operator has changed from 0x0 to 0x59… earlier in the log).

L2 stuck at block 1 for 15+ minutes. Single-GPU cold start. Watch nvidia-smi -l 5 on the prover — proofman + SNARK init are burning ~50% GPU. Wait for the first proof; subsequent proofs are ~5x faster.

TCP/8080 not reachable from the L2 stack VM. Open the port on the prover VM:

sudo ufw allow 8080/tcp
# or your provider's security group

lib-float build failure on RTX 5090 / Blackwell. install-zisk-deps.sh passes CARGO_BUILD_JOBS=1 automatically. If invoking make install outside the script:

CARGO_BUILD_JOBS=1 TARGET=zisk make install

undefined symbol: cudaGetDeviceProperties_v2 at link time. Apt's nvidia-cuda-toolkit (CUDA 11.5) leftovers on the linker path shadow the CUDA 12 libs. Re-run install-zisk-deps.sh — it detects this "cleanup-only" state and offers to purge.
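
To see whether CUDA 11 leftovers are still shadowing the CUDA 12 libraries before re-running the script, these standard diagnostics help:

# CUDA 11.x entries here point to apt leftovers on the linker path
ldconfig -p | grep -i cudart
dpkg -l | grep -i nvidia-cuda-toolkit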

GPU memory errors on heavy blocks. Bridge-heavy transactions need more VRAM. Spot-check with nvidia-smi -l 1 (don't run continuously — nvidia-smi briefly locks the GPU during reads).

VKey changes after guest code update. Re-fetch and redeploy with the new vkey:

curl localhost:8080/guest_data

Intermittent ZisK failures

ZisK occasionally fails on certain transaction types (especially bridge-related, due to a known keccak256 issue). Catalyst retries automatically.
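
To confirm a failure was actually retried rather than silently dropped, grep the Catalyst logs (the exact log strings are an assumption; tune the pattern to whatever your build emits):

docker logs l2-catalyst-node 2>&1 | grep -iE 'retry|proof.*fail|keccak'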