print("[SwiftLM] ⚠️ Model does not support SSD expert streaming (\(profile.modelType) is not MoE). Ignoring --stream-experts flag.")
This warning appears to be triggered incorrectly when running with:
--model Qwen/Qwen3.5-397B-A17B (or an 8-bit MXFP8 quantized version of it) --stream-experts
The output is:
"Model does not support SSD expert streaming (qwen3_5_moe is not MoE). Ignoring --stream-experts flag."
Shortly afterwards the process is killed:
"zsh: killed ./SwiftLM....."
The same --model and ./SwiftLM parameters work fine with the latest release binary (SwiftLM b648).
Example:
./SwiftLM \
--model "/Users/user/models/Qwen3.5-397B-A17B-mxfp8-grp32" \
--port 5413 --stream-experts --thinking --ssd-prefetch
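To help narrow this down, it may be worth confirming which model type the model directory itself declares, so it can be compared with the "qwen3_5_moe" value printed in the warning. The following is only a sketch: it assumes the directory follows the usual Hugging Face layout with a top-level config.json, and show_model_type is a hypothetical helper name, not part of SwiftLM.

```shell
# Hypothetical helper (not part of SwiftLM): print every "model_type" entry
# found in a model directory's config.json.
# Assumes the Hugging Face layout, with config.json at the top of the directory.
show_model_type() {
  local config="$1/config.json"
  if [ ! -f "$config" ]; then
    echo "no config.json found in $1" >&2
    return 1
  fi
  # -o prints only the matching key/value pairs, one per line.
  grep -o '"model_type"[[:space:]]*:[[:space:]]*"[^"]*"' "$config"
}

# Usage (path taken from the example above):
#   show_model_type "/Users/user/models/Qwen3.5-397B-A17B-mxfp8-grp32"
```

If the value printed here matches the one in the warning, the model type is being read correctly and the problem is more likely in how that type is classified as MoE or not.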
Code reference: SwiftLM/Sources/SwiftLM/Server.swift, line 348 in d5a9d11