๐Ÿช LM Studio Model Benchmark

Cold-start + warm inference comparison · Local GPU Server · Feb 6, 2026

Cold Start (model load + inference)
Warm Response (routing test)
Prompt Injection Test

โฑ Cold Start Time (lower is better)

0s10s20s30s40s50s
Qwen3-VL-8B ๐Ÿ‘‘ WINNER
12.4s
12.4s
x-coder-rl-qwen3-8b
19.2s
19.2s
Qwen2-VL-7B
29.9s
29.9s
Qwen2.5-VL-7B
41.7s
41.7s
uigen-t2-7b
44.0s
44.0s
InternVL3.5-8B ๐Ÿ”ฅ FASTEST
7.7s
7.7s

⚡ Warm Response Time (lower is better)

Qwen3-VL-8B 👑: 1.3s
Qwen2-VL-7B: 0.96s
Qwen2.5-VL-7B: 4.3s
x-coder-rl-qwen3-8b: N/A (5.4s, no output)
uigen-t2-7b: 25.0s
InternVL3.5-8B 🔥: 0.45s
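
The page does not say how the cold and warm figures were collected. As a rough sketch only, the distinction can be measured by timing the first request after a model is selected (load plus inference) and then averaging a few follow-up requests. Here `send_prompt` is a hypothetical placeholder for whatever function sends one request to the local server, and `warm_runs` is an assumed parameter, not something taken from the benchmark.

```python
import time
from typing import Callable, Dict


def time_call(fn: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by a single call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def cold_and_warm(send_prompt: Callable[[], object], warm_runs: int = 3) -> Dict[str, float]:
    """Time one cold request, then average warm_runs follow-up requests.

    Assumption: the model is not yet loaded when this is called, so the
    first request pays the load cost. send_prompt is a hypothetical callable.
    """
    cold = time_call(send_prompt)
    warm = [time_call(send_prompt) for _ in range(warm_runs)]
    return {"cold_s": cold, "warm_avg_s": sum(warm) / len(warm)}
```

Using `time.perf_counter` rather than `time.time` avoids jumps from system clock adjustments during a long model load.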

🎯 Quality Scores

๐Ÿ† Qwen3-VL-8B

Greetingโ˜…โ˜…โ˜…โ˜…โ˜†
Routingโ˜…โ˜…โ˜…โ˜…โ˜…
Injection Resistโ˜…โ˜…โ˜…โ˜…โ˜†
Cold Start12.4s
Warm Avg~1s
Overall4.3 / 5

Qwen2-VL-7B-Instruct

Greeting: ★★★☆☆
Routing: ★★☆☆☆
Injection Resistance: ★★★☆☆
Cold Start: 29.9s
Warm Avg: ~1s
Overall: 2.7 / 5

Qwen2.5-VL-7B

Greeting: ★★★☆☆
Routing: ★★★☆☆
Injection Resistance: ★★☆☆☆
Cold Start: 41.7s
Warm Avg: 4.3s
Overall: 2.7 / 5

uigen-t2-7b

Greeting: ★★☆☆☆
Routing: ★★★★☆
Injection Resistance: ★★★★☆
Cold Start: 44.0s
Warm Avg: 25.0s
Overall: 3.3 / 5

โŒ x-coder-rl-qwen3-8b

Greetingโ˜†โ˜†โ˜†โ˜†โ˜†
Routingโ˜†โ˜†โ˜†โ˜†โ˜†
Injection Resistโ˜…โ˜†โ˜†โ˜†โ˜†
IssueNo user output (think-only)
Overall0 / 5

🔥 InternVL3.5-8B

Greeting: ★★★★☆
Routing: ★★★★☆
Injection Resistance: ★★★★★
Cold Start: 7.7s
Warm Avg: ~0.4s
Overall: 4.3 / 5
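
The page does not state how Overall is derived, but the figures on five of the six cards match an unweighted mean of the three star ratings, rounded to one decimal. The sketch below checks that assumption; the function name `overall` and the `ratings` table are illustrative, with star counts copied from the cards above.

```python
def overall(greeting: int, routing: int, injection: int) -> float:
    """Unweighted mean of three 0-5 star ratings, rounded to one decimal.

    Assumption: equal weighting. The source does not document its formula.
    """
    return round((greeting + routing + injection) / 3, 1)


# (greeting, routing, injection resistance) stars per model, from the cards.
ratings = {
    "Qwen3-VL-8B": (4, 5, 4),
    "Qwen2-VL-7B-Instruct": (3, 2, 3),
    "Qwen2.5-VL-7B": (3, 3, 2),
    "uigen-t2-7b": (2, 4, 4),
    "x-coder-rl-qwen3-8b": (0, 0, 1),
    "InternVL3.5-8B": (4, 4, 5),
}

for model, stars in ratings.items():
    print(f"{model}: {overall(*stars)} / 5")
```

Under this formula x-coder-rl-qwen3-8b would come out at 0.3 rather than the listed 0 / 5, so its card presumably reflects a manual zero for producing no user-visible output.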

๐Ÿ† Verdict

Two-way tie: Qwen3-VL-8B and InternVL3.5-8B both score 4.3/5 on quality. InternVL3.5-8B wins on speed (7.7s cold and 0.45s warm, against 12.4s and 1.3s), while Qwen3-VL-8B routes slightly better (5/5 against 4/5). InternVL3.5-8B also has perfect injection resistance (5/5, against 4/5 for Qwen3-VL-8B). Either is production-ready with a hardened system prompt.