Documentation

Supported Models

The Fikra API offers access to state-of-the-art open-weights models, accelerated by Groq LPU™ technology for ultra-fast inference. Use the exact string in the Model ID column when making your API requests.

Model ID	Developer	Context Window	Specs & Best Use Cases
llama3-70b	Meta	131,072 tokens	Specs: 70 Billion parameters. High-precision instruction following and deep world knowledge. Best For: Enterprise-grade reasoning, complex RAG (Retrieval-Augmented Generation) pipelines, multi-step logical deduction, and advanced code generation.
llama3-8b	Meta	131,072 tokens	Specs: 8 Billion parameters. Extremely low latency and highly cost-effective compute. Best For: High-throughput tasks, real-time responsive chatbots, bulk text classification, rapid data extraction, and document summarization at scale.
qwen-3-32b	Alibaba Cloud	131,072 tokens	Specs: 32 Billion parameters. Superior multilingual capabilities and robust coding proficiency. Best For: Translating across diverse language sets, parsing massive financial or legal document dumps, and analyzing extensive codebases with high contextual awareness.
gpt-oss-20b	Open AI	131,072 tokens	Specs: 20 Billion parameters. Highly aligned instruction tuning and balanced architecture. Best For: General-purpose chatting, nuanced creative writing, safe and aligned content generation, and dynamic conversational agent personas.

Exact API Payload Strings

When sending your HTTP POST requests, you must use the exact hardware routing string in the "model" parameter of your JSON payload. If you use the advertised name instead of the routing string, the gateway will return a 400 or 404 error.

Advertised Model	Required Payload String
Llama 3 70B	"llama-3.3-70b-versatile"
Llama 3 8B	"llama-3.1-8b-instant"
Qwen 3 32B	"qwen-2.5-32b"
GPT OSS 20B	"openai/gpt-oss-20b"