RELEASE: FIKRA-NANO v0.2
The Foundry.
Open weights for the post-GPU era. Our Fikra series uses 1.58-bit ternary quantization ({-1, 0, 1}) to deliver state-of-the-art reasoning on commodity edge hardware.
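The {-1, 0, 1} weight format can be illustrated with the absmean ternary quantizer popularized by the BitNet b1.58 line of work. Whether Fikra uses exactly this recipe is not stated here, so treat the snippet as a sketch, not the Fikra implementation:

```python
import numpy as np

def ternary_quantize(w: np.ndarray):
    """Quantize a weight tensor to {-1, 0, 1} plus one fp scale.

    Sketch of an absmean scheme (BitNet b1.58 style); not necessarily
    the exact recipe used by the Fikra models.
    """
    gamma = np.abs(w).mean()                      # per-tensor scale
    q = np.clip(np.round(w / (gamma + 1e-8)), -1, 1)
    return q.astype(np.int8), float(gamma)        # ternary codes + scale

# Dequantized approximation: w ≈ gamma * q
w = np.random.randn(8, 8).astype(np.float32)
q, gamma = ternary_quantize(w)
assert set(np.unique(q).tolist()) <= {-1, 0, 1}
```

Each weight is stored as one of three values (hence log2(3) ≈ 1.58 bits), with a single floating-point scale per tensor to recover magnitude.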
BibTeX Citation
```bibtex
@misc{lacesse2026fikra,
  title     = {Fikra: 1.58-bit Ternary LLMs for Edge},
  author    = {Lacesse Research},
  year      = {2026},
  publisher = {Lacesse Foundry}
}
```
Available Weights (2 models)
| Model ID | Params | Bit-Width | Context | Download |
|---|---|---|---|---|
| Fikra-1B-Nano (Stable, updated Feb 10, 2026) | 1.2B | w1.58 | 4,096 | HuggingFace ↗ |
| Fikra-3B-Edge (Training, ETA Q2 2026) | 3.0B | w1.58 | 8,192 | Coming Soon |
Quickstart
```shell
# Install the Lacesse Inference Engine
$ pip install lacesse-fikra

# Run inference on CPU (no GPU required)
$ python3 -m lacesse.run --model "Fikra-1B-Nano"
```
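One reason ternary weights run well on CPUs without a GPU: a matrix-vector product over {-1, 0, 1} needs no per-element multiplications, only adds and subtracts plus one final scale. The NumPy sketch below illustrates the arithmetic only; real engines pack ternary codes into 2-bit lanes and use SIMD, and this is not the Lacesse engine's actual kernel:

```python
import numpy as np

def ternary_matvec(q: np.ndarray, gamma: float, x: np.ndarray) -> np.ndarray:
    """Multiply ternary weights q in {-1, 0, 1} by vector x.

    Each output element is (sum of x where q=+1) minus (sum where q=-1),
    scaled once by gamma: no per-element multiplications are needed.
    """
    pos = np.where(q == 1, x, 0.0).sum(axis=1)
    neg = np.where(q == -1, x, 0.0).sum(axis=1)
    return gamma * (pos - neg)

q = np.array([[1, 0, -1], [0, 1, 1]], dtype=np.int8)
x = np.array([2.0, 3.0, 4.0])
print(ternary_matvec(q, 1.0, x))  # → [-2.  7.]
```

This matches an ordinary `q @ x`, which is why multiplication-free kernels can be a drop-in replacement for dense matmuls at inference time.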
Inference Speed (CPU)
| Model | Throughput (tok/s) |
|---|---|
| Fikra-1B (Ternary) | 48.5 |
| Llama-3-8B (FP16) | 4.2 |

*Benchmarks run on an Apple M1 Air (8 GB RAM). Fikra-1B decodes roughly 11x faster than the FP16 Llama-3-8B baseline on the same machine.*
Minimum Specs
- **Memory (RAM):** 2 GB (Fikra-1B) / 4 GB (Fikra-3B)
- **Processor:** Any AVX2-compatible CPU (Intel i5 8th gen or newer) or ARM64 (e.g. Raspberry Pi 5)
- **GPU:** Not required
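The RAM figures above follow from back-of-envelope weight sizing: at 1.58 bits per weight, a 1.2B-parameter model's weights occupy roughly a tenth of their FP16 size. This sketch counts weights only; real usage also includes activations, the KV cache, and runtime overhead:

```python
def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (decimal), ignoring activations
    and KV cache. Illustrative arithmetic, not an official sizing tool."""
    return params * bits_per_weight / 8 / 1e9

fp16 = weight_memory_gb(1.2e9, 16)        # ~2.40 GB: tight on a 2 GB budget
ternary = weight_memory_gb(1.2e9, 1.58)   # ~0.24 GB: fits comfortably
print(f"FP16: {fp16:.2f} GB, ternary: {ternary:.2f} GB")
```

This is why a 1.2B ternary model can leave headroom on a 2 GB device where the same model in FP16 could not even load.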