
Groq's Language Processing Unit (LPU) platform offers some of the fastest inference speeds available for open-source models such as Llama and Mixtral. Its custom chip design achieves hundreds of tokens per second with very low latency, making it a common benchmark for AI inference speed. A generous free tier is available.
Released
February 20, 2024
Parameters
N/A (inference platform)
Context
128K
Pricing
Free/Paid
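
As a hedged sketch of how such an inference platform is typically used: Groq exposes an OpenAI-compatible HTTP API, so requests follow the familiar chat-completions shape. The endpoint URL, model name, and environment-variable name below are assumptions for illustration, not guaranteed values.

```python
# Sketch of calling an OpenAI-compatible chat-completions endpoint.
# GROQ_URL, the model name, and GROQ_API_KEY are assumptions.
import json
import os
import urllib.request

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"  # assumed endpoint


def build_request(prompt: str, model: str = "llama-3.1-8b-instant") -> dict:
    """Build an OpenAI-style chat-completion payload (model name assumed)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def complete(prompt: str) -> str:
    """Send the payload; requires GROQ_API_KEY to be set in the environment."""
    req = urllib.request.Request(
        GROQ_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Standard OpenAI-compatible response shape.
    return body["choices"][0]["message"]["content"]


# Offline demonstration: inspect the payload without making a network call.
payload = build_request("Hello")
print(payload["model"])
```

Because the API mirrors OpenAI's, existing OpenAI client code can usually be pointed at Groq by swapping only the base URL and API key.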