Edge core
In designtargeting 1.5 – 2 B params
On-device reasoning. Laptop and phone budgets.
- spec
-
- INT8 weights, FP16 activations on high-sensitivity layers.
- Context window: 8 K tokens initial, scaling pending memory profile.
- Distilled from a frontier teacher. No fine-tune fork from a public base.
- target deploy
- WebGPU on laptops, Apple Neural Engine on phone-class silicon, commodity ARM via llama.cpp.
- eval plan
- A frozen suite covering reasoning, instruction following, and known refusal patterns. Reported as the geomean against a stated FP32 teacher baseline; per-task numbers in the model card.