My Progress with Local AI - Qwen3.5 Models Impressed Me
Qwen3.5 is Alibaba's latest series of LLM models. They first released the large versions, Qwen3.5-397B-A17B and Qwen3.5-122B-A10B, then the medium versions, Qwen3.5-35B-A3B and Qwen3.5-27B, and now they've finally released the small versions: Qwen3.5-9B, Qwen3.5-4B, Qwen3.5-2B, and the tiny Qwen3.5-0.8B. I downloaded some of these models, and from my testing and online research, they're really good!

For starters, my favorite Qwen3.5 model is the 4B one. I tested it on my Manga Translation task, and it did a better job than the old Qwen3 model of the same size while being 30% faster!
Apparently the dense models (2B, 4B, and 9B) have Mixture-of-Experts layers in them, so they only activate a subset of their parameters during inference.
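As a toy illustration of the general idea (this is not Qwen's actual code, and the expert functions and scores below are made up): an MoE layer has a router that scores its experts for each token and only runs the top-k of them, so most of the layer's weights stay idle on any given step.

```python
# Toy sketch of Mixture-of-Experts routing (illustrative only).
# A "router" scores each expert; only the top-k experts actually run.

def moe_forward(token, experts, router_scores, k=2):
    """Run only the k highest-scoring experts and average their outputs."""
    top_k = sorted(range(len(experts)),
                   key=lambda i: router_scores[i],
                   reverse=True)[:k]
    outputs = [experts[i](token) for i in top_k]
    return sum(outputs) / len(outputs), top_k

# Eight tiny stand-in "experts", each just a different linear function.
experts = [lambda x, w=w: w * x for w in range(1, 9)]
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.4, 0.05, 0.6]

out, active = moe_forward(5.0, experts, scores, k=2)
print(active)  # only the two highest-scoring experts (indices 1 and 3) ran
```

With k=2 out of 8 experts, only a quarter of the "experts" do any work per token, which is why these models can be faster than a fully dense model of the same total size.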
While the Qwen3.5 2B model should be comparable to Qwen3's 4B in general performance, it does noticeably worse in vision capabilities than the old Qwen3 4B model, so I'm going to stick with the bigger model for my vision tasks, even though it'll be slower.
Also, in my tests, both the Qwen3.5 9B and 35B-A3B models were so slow it was almost unbearable. I heard that blinding them (removing their vision capabilities) makes them run faster, but I didn't know how to do that at the time. (As of writing this, I think I do now...)
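For anyone curious what "blinding" could look like in practice: this is an assumption about the setup, but in a llama.cpp-style GGUF workflow the vision part usually lives in a separate multimodal projector (mmproj) file, so simply not loading that file runs the model text-only. The file names here are placeholders, not real downloads:

```shell
# Assumes a llama.cpp-style setup; model/mmproj file names are placeholders.

# Multimodal: loads the language model plus the vision projector.
llama-server -m qwen-model.gguf --mmproj qwen-mmproj.gguf

# Text-only ("blinded"): omit --mmproj so the vision tower never loads,
# saving memory and skipping the image-encoding work entirely.
llama-server -m qwen-model.gguf
```

If your runner bundles vision into a single file instead, this trick may not apply there.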
Apparently the 4B model isn't bad at coding either, while the 2B and 0.8B models shouldn't be used for that purpose. The best thing about these small models is that they can run on mobile devices offline. They're also easy to finetune, according to Unsloth, so I'm expecting to see a lot of specialized models built on them with Unsloth!
What do you think? I'd love to write more about the Local AI models I'm playing with on my device, especially the small ones that can run on an Android phone or a Steam Deck, and I'd love to hear what you guys think!
Related Threads
https://inleo.io/threads/view/ahmadmanga/re-leothreads-2er3vtnqj
https://inleo.io/threads/view/ahmadmanga/re-leothreads-howh315w