Developer Mia has released Qwable 27B on Hugging Face, following a June 15, 2026 announcement. The model is a full fine-tune of Alibaba's Qwen3.6-27B trained on a Fable 5-style reasoning dataset. It replicates the structured thinking approach of Anthropic's Fable 5 while running entirely on local hardware, with no API costs and no mandatory data retention. Shortly after, open-source contributor Huihui-ai released an abliterated version that removes built-in refusal behavior by modifying the model's weights with llama.cpp's cvector-generator. The releases followed a week in which the U.S. government ordered Fable 5 pulled for all foreign nationals over a disputed jailbreak finding. Both Qwable variants offer local alternatives to cloud-based AI services, with no server dependency and no third-party data processing.
Qwable 27B is a full fine-tune of Alibaba's Qwen3.6-27B base model, built by developer Mia on a dataset of Fable 5-style reasoning examples. The training approach is instruction fine-tuning on trace-style examples: the developer collected examples formatted like Fable 5's step-by-step answers and trained Qwen to produce similar output structures. The resulting 27-billion-parameter model targets Fable 5's instruction-following structure, producing more guided, explanatory, step-by-step outputs than the base Qwen model.
The model is distributed in GGUF, the single-file model format used by llama.cpp and LM Studio. The Q4 quantized build requires approximately 16.5 GB of storage. All processing occurs locally without sending data to external servers, so the mandatory 30-day data retention that Fable 5 imposed on all traffic, including enterprise customers with previous zero-retention agreements, does not apply.
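As a rough sanity check on the 16.5 GB figure, the sketch below converts a file size into average bits stored per parameter. It assumes exactly 27 billion parameters (the "27B" label is nominal) and decimal gigabytes, and it ignores file metadata; the function names are illustrative and do not come from any library.

```python
def bits_per_weight(file_size_gb: float, n_params: float) -> float:
    """Average bits stored per parameter, assuming 1 GB = 10**9 bytes."""
    return file_size_gb * 1e9 * 8 / n_params


def file_size_gb(bits: float, n_params: float) -> float:
    """Approximate file size in GB for a given average bit width per parameter."""
    return n_params * bits / 8 / 1e9


N_PARAMS = 27e9  # assumption: the nominal parameter count implied by "27B"

# Average bit width implied by the reported Q4 build size.
q4_bits = bits_per_weight(16.5, N_PARAMS)

# Size the same weights would occupy at 16 bits each, for comparison.
fp16_size = file_size_gb(16, N_PARAMS)

print(f"Q4 build: about {q4_bits:.2f} bits per weight")
print(f"16-bit weights: about {fp16_size:.0f} GB")
```

Under these assumptions the Q4 build averages a little under 5 bits per weight, which fits a nominally 4-bit format that also stores scaling data and keeps some tensors at higher precision.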
Huihui-ai applied abliteration to produce Huihui-Qwable-3.6-27b-abliterated, a variant that eliminates the model's refusal behavior. The process identifies a refusal direction embedded in the model weights by running the model on large sets of harmful and harmless prompts, measuring the differences in internal activations, then modifying the weights to eliminate that difference. After abliteration, the weights no longer carry the activation difference that triggers refusal responses.
Huihui-ai applied the technique directly to the Qwable GGUF using llama.cpp's cvector-generator, requiring no Python environment, full-weight retraining, or rented servers. Abliteration differs from jailbreaking in that it permanently modifies the model's weights rather than exploiting prompt vulnerabilities. The model card specifies that the abliterated version is for research and controlled environments only, with legal and ethical responsibility resting entirely with users.
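The general idea behind abliteration can be illustrated with a toy example. The sketch below is not Huihui-ai's procedure and does not use cvector-generator; it assumes the common difference-of-means formulation, in which the refusal direction is the normalized gap between mean activations on harmful and harmless prompts, and a weight matrix is edited so its outputs have no component along that direction. All function names and the tiny 3-dimensional data are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of the two mean activations."""
    mu_harmful = mean_vector(harmful_acts)
    mu_harmless = mean_vector(harmless_acts)
    diff = [a - b for a, b in zip(mu_harmful, mu_harmless)]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0:
        raise ValueError("mean activations are identical; no direction found")
    return [x / norm for x in diff]


def ablate_direction(weight, direction):
    """Return W' = (I - d d^T) W, so W' x has no component along d.

    weight is a list of rows; each row corresponds to one output dimension.
    """
    n_rows, n_cols = len(weight), len(weight[0])
    # d^T W: how strongly each input column writes along the direction.
    d_t_w = [sum(direction[i] * weight[i][j] for i in range(n_rows))
             for j in range(n_cols)]
    return [[weight[i][j] - direction[i] * d_t_w[j] for j in range(n_cols)]
            for i in range(n_rows)]


def matvec(weight, x):
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


# Toy activations (hypothetical): the two prompt sets differ along axis 0 only.
harmful = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
harmless = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
direction = refusal_direction(harmful, harmless)

# Toy 3x2 weight matrix that writes into the 3-dimensional activation space.
weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
ablated = ablate_direction(weight, direction)

output = matvec(ablated, [1.0, 1.0])
component = sum(o * d for o, d in zip(output, direction))
print("component along refusal direction after ablation:", component)
```

In a real model the activations would be collected from many prompts at chosen layers, and the projection would be applied to each matrix that writes into the residual stream; the toy version only shows why the edit is permanent, since the change lives in the weights rather than in the prompt.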
The abliterated Qwable is available on Hugging Face in three builds. The recommended Q4_K_M_Q8 version weighs approximately 19 GB and represents the smallest, most consumer-friendly option. A version supporting multi-token prediction is available for systems with sufficient computational resources, providing faster response generation. Both the standard Qwable and abliterated variant run on consumer hardware through local runtimes like LM Studio.
The standard Qwable suits coding assistance, technical debugging, and workflows requiring models that display reasoning processes rather than producing direct answers. It runs in local agent setups and most local runtimes. The abliterated version serves security researchers requiring raw model behavior without provider-side filtering, synthetic data pipelines needing outputs on sensitive topics, and evaluation work testing model capabilities without content policy interference. The model card warns that reduced safety filtering means outputs can be sensitive, controversial, or inappropriate.
What is Qwable 27B and when was it released?
Qwable 27B is a full fine-tune of Alibaba's Qwen3.6-27B trained on a Fable 5-style reasoning dataset, announced by developer Mia on June 15, 2026. The model runs locally in GGUF format and requires approximately 16.5 GB in its Q4 quantized build.
How does the abliterated version differ from the standard Qwable model?
The abliterated version, created by Huihui-ai, removes refusal behavior by modifying model weights using llama.cpp's cvector-generator. The process eliminates the mathematical signals that trigger refusal responses, resulting in a model that processes all prompts without content filtering while maintaining full functionality.
What are the hardware requirements for running Qwable models?
The Q4 quantized build requires approximately 16.5 GB of storage, while the recommended Q4_K_M_Q8 abliterated version weighs around 19 GB. Both models run on consumer hardware through local runtimes like LM Studio or llama.cpp, with a multi-token prediction version available for systems with higher computational capacity.