Documentation

Supported Models

The Fikra API offers access to state-of-the-art open-weights models, accelerated by Groq LPU™ technology for ultra-fast inference. Use the exact string in the Model ID column when making your API requests.

Model ID Developer Context Window Specs & Best Use Cases
llama3-70b Meta 131,072 tokens Specs: 70 Billion parameters. High-precision instruction following and deep world knowledge.
Best For: Enterprise-grade reasoning, complex RAG (Retrieval-Augmented Generation) pipelines, multi-step logical deduction, and advanced code generation.
llama3-8b Meta 131,072 tokens Specs: 8 Billion parameters. Extremely low latency and highly cost-effective compute.
Best For: High-throughput tasks, real-time responsive chatbots, bulk text classification, rapid data extraction, and document summarization at scale.
qwen-3-32b Alibaba Cloud 131,072 tokens Specs: 32 Billion parameters. Superior multilingual capabilities and robust coding proficiency.
Best For: Translating across diverse language sets, parsing massive financial or legal document dumps, and analyzing extensive codebases with high contextual awareness.
gpt-oss-20b Open AI 131,072 tokens Specs: 20 Billion parameters. Highly aligned instruction tuning and balanced architecture.
Best For: General-purpose chatting, nuanced creative writing, safe and aligned content generation, and dynamic conversational agent personas.

Exact API Payload Strings

When sending your HTTP POST requests, you must use the exact hardware routing string in the "model" parameter of your JSON payload. If you use the advertised name instead of the routing string, the gateway will return a 400 or 404 error.

Advertised Model Required Payload String
Llama 3 70B "llama-3.3-70b-versatile"
Llama 3 8B "llama-3.1-8b-instant"
Qwen 3 32B "qwen-2.5-32b"
GPT OSS 20B "openai/gpt-oss-20b"