RELEASE: FIKRA-NANO v0.2
The Foundry.
Open weights for the post-GPU era. Our Fikra series uses 1.58-bit ternary quantization ({-1, 0, 1}) to deliver state-of-the-art reasoning on commodity edge hardware.
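The {-1, 0, 1} weight format can be illustrated with the absmean ternary quantizer popularized by the BitNet b1.58 line of work. Whether Fikra uses exactly this recipe is not stated here, so treat the snippet as a sketch, not the Fikra implementation:

```python
import numpy as np

def ternary_quantize(w: np.ndarray):
    """Quantize a weight tensor to {-1, 0, 1} plus one fp scale.

    Sketch of an absmean scheme (BitNet b1.58 style); not necessarily
    the exact recipe used by the Fikra models.
    """
    gamma = np.abs(w).mean()                      # per-tensor scale
    q = np.clip(np.round(w / (gamma + 1e-8)), -1, 1)
    return q.astype(np.int8), float(gamma)        # ternary codes + scale

# Dequantized approximation: w ≈ gamma * q
w = np.random.randn(8, 8).astype(np.float32)
q, gamma = ternary_quantize(w)
assert set(np.unique(q).tolist()) <= {-1, 0, 1}
```

Each weight is stored as one of three values (hence log2(3) ≈ 1.58 bits), with a single floating-point scale per tensor to recover magnitude.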
BibTeX Citation
```bibtex
@misc{lacesse2026fikra,
  title     = {Fikra: 1.58-bit Ternary LLMs for Edge},
  author    = {Lacesse Research},
  year      = {2026},
  publisher = {Lacesse Foundry}
}
```
Available Weights (2 models)
| Model ID | Params | Bit-Width | Context | Download |
|---|---|---|---|---|
| Fikra-1B-Nano (Stable, updated Feb 10, 2026) | 1.2B | w1.58 | 4,096 | HuggingFace ↗ |
| Fikra-3B-Edge (Training, ETA Q2 2026) | 3.0B | w1.58 | 8,192 | Coming Soon |
Quickstart
```shell
# Install the Lacesse Inference Engine
$ pip install lacesse-fikra

# Run inference on CPU (no GPU required)
$ python3 -m lacesse.run --model "Fikra-1B-Nano"
```
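One reason ternary weights run well on CPUs without a GPU: a matrix-vector product over {-1, 0, 1} needs no per-element multiplications, only adds and subtracts plus one final scale. The NumPy sketch below illustrates the arithmetic only; real engines pack ternary codes into 2-bit lanes and use SIMD, and this is not the Lacesse engine's actual kernel:

```python
import numpy as np

def ternary_matvec(q: np.ndarray, gamma: float, x: np.ndarray) -> np.ndarray:
    """Multiply ternary weights q in {-1, 0, 1} by vector x.

    Each output element is (sum of x where q=+1) minus (sum where q=-1),
    scaled once by gamma: no per-element multiplications are needed.
    """
    pos = np.where(q == 1, x, 0.0).sum(axis=1)
    neg = np.where(q == -1, x, 0.0).sum(axis=1)
    return gamma * (pos - neg)

q = np.array([[1, 0, -1], [0, 1, 1]], dtype=np.int8)
x = np.array([2.0, 3.0, 4.0])
print(ternary_matvec(q, 1.0, x))  # → [-2.  7.]
```

This matches an ordinary `q @ x`, which is why multiplication-free kernels can be a drop-in replacement for dense matmuls at inference time.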
Inference Speed (CPU)
| Model | Throughput (tok/s) |
|---|---|
| Fikra-1B (Ternary) | 48.5 |
| Llama-3-8B (FP16) | 4.2 |

*Benchmarks run on an Apple M1 Air (8 GB RAM). Fikra-1B decodes roughly 11x faster than the FP16 Llama-3-8B baseline on the same machine.*
Minimum Specs
- **Memory (RAM):** 2 GB (Fikra-1B) / 4 GB (Fikra-3B)
- **Processor:** Any AVX2-compatible CPU (Intel i5 8th gen or newer) or ARM64 (e.g. Raspberry Pi 5)
- **GPU:** Not required
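The RAM figures above follow from back-of-envelope weight sizing: at 1.58 bits per weight, a 1.2B-parameter model's weights occupy roughly a tenth of their FP16 size. This sketch counts weights only; real usage also includes activations, the KV cache, and runtime overhead:

```python
def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (decimal), ignoring activations
    and KV cache. Illustrative arithmetic, not an official sizing tool."""
    return params * bits_per_weight / 8 / 1e9

fp16 = weight_memory_gb(1.2e9, 16)        # ~2.40 GB: tight on a 2 GB budget
ternary = weight_memory_gb(1.2e9, 1.58)   # ~0.24 GB: fits comfortably
print(f"FP16: {fp16:.2f} GB, ternary: {ternary:.2f} GB")
```

This is why a 1.2B ternary model can leave headroom on a 2 GB device where the same model in FP16 could not even load.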