Model List

226 models

Anthropic: Claude 3.7 Sonnet
Updated a month ago
Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and extended, step-by-step processing for complex tasks. The model demonstrates notable improvements in coding, particularly in front-end development and full-stack updates, and excels in agentic workflows, where it can autonomously navigate multi-step processes.

Claude 3.7 Sonnet maintains performance parity with its predecessor in standard mode while offering an extended reasoning mode for enhanced accuracy in math, coding, and instruction-following tasks.

by anthropic|200K context|$3/M input tokens|$15/M output tokens
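Each entry in this list ends with a pipe-delimited line giving the provider, the context size, and the per-million-token prices. As a minimal sketch, assuming that layout holds for every entry, the line can be parsed and used to estimate the cost of a request. The names `Listing`, `parse_listing`, and `estimate_cost` are hypothetical helpers introduced for illustration, not part of any provider's API, and the token counts in the usage lines are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    provider: str
    context: str               # raw label from the page, e.g. "200K"
    input_per_million: float   # USD per million input tokens
    output_per_million: float  # USD per million output tokens


def parse_listing(line: str) -> Listing:
    """Parse 'by <provider>|<size> context|$X/M input tokens|$Y/M output tokens'."""
    provider, context, price_in, price_out = (p.strip() for p in line.split("|"))
    return Listing(
        provider=provider.removeprefix("by ").strip(),
        context=context.removesuffix(" context"),
        # Keep only the number between the "$" and the "/M" unit marker.
        input_per_million=float(price_in.removeprefix("$").split("/M")[0]),
        output_per_million=float(price_out.removeprefix("$").split("/M")[0]),
    )


def estimate_cost(listing: Listing, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed prices."""
    return (
        input_tokens * listing.input_per_million
        + output_tokens * listing.output_per_million
    ) / 1_000_000


claude = parse_listing("by anthropic|200K context|$3/M input tokens|$15/M output tokens")
# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
print(f"${estimate_cost(claude, 10_000, 2_000):.2f}")  # prints "$0.06"
```

Output tokens are priced five times higher than input tokens for this entry, so the 2,000 output tokens cost as much as the 10,000 input tokens in this example.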
DeepSeek: R1 Distill 70B
GPU TEE
Updated 2 months ago
DeepSeek R1 Distill 70B is a distilled large language model.
by phala|16K context|$0.23/M input tokens|$0.69/M output tokens
DeepSeek: DeepSeek V3
Updated 2 months ago
DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction-following and coding abilities of the previous versions. The model was pre-trained on nearly 15 trillion tokens, and the reported evaluations indicate that it outperforms other open-source models and rivals leading closed-source models.

by deepseek|64K context|$0.14/M input tokens|$0.28/M output tokens
DeepSeek: DeepSeek R1
Updated 2 months ago
DeepSeek R1 is an open-source model with fully open reasoning tokens. It has 671B parameters, of which 37B are active in an inference pass.

Fully open-source model.

MIT licensed: Distill & commercialize freely!
by deepseek|163K context|$7/M input tokens|$7/M output tokens
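The DeepSeek R1 description above gives two parameter counts: 671B in total and 37B active in an inference pass. As a small worked example, the share of parameters used per pass follows from dividing one by the other. The function name `active_fraction` is a hypothetical helper, and the calculation assumes the two published figures are directly comparable.

```python
TOTAL_PARAMS_B = 671   # total parameters, in billions (from the description)
ACTIVE_PARAMS_B = 37   # parameters active per inference pass, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the fraction of parameters active in a single inference pass."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


print(f"{active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B):.1%}")  # prints "5.5%"
```

Roughly one parameter in eighteen is active on any given pass, according to these figures.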
Meta: Llama 3.3 70B Instruct
GPU TEE
Updated 4 months ago
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in/text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many of the available open-source and closed chat models on common industry benchmarks.

Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

by phala|131K context|$0.12/M input tokens|$0.3/M output tokens
Meta: Llama 3.3 70B Instruct
Updated 4 months ago
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in/text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many of the available open-source and closed chat models on common industry benchmarks.

Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

by meta-llama|131K context|$0.13/M input tokens|$0.4/M output tokens
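Llama 3.3 70B Instruct appears twice in this list, once by phala and once by meta-llama, at different prices. The sketch below compares the two listings for a given workload, using the prices quoted in the two entries above. The helper names `workload_cost` and `cheapest_provider` are hypothetical, and the one-million-token workload is an assumed figure chosen only to make the arithmetic easy to follow.

```python
# (input USD per million tokens, output USD per million tokens), from the listings
PRICES = {
    "phala": (0.12, 0.30),
    "meta-llama": (0.13, 0.40),
}


def workload_cost(price, input_tokens, output_tokens):
    """Return the USD cost of a workload at one provider's listed prices."""
    input_price, output_price = price
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_provider(prices, input_tokens, output_tokens):
    """Return the provider name with the lowest cost for this workload."""
    return min(
        prices,
        key=lambda name: workload_cost(prices[name], input_tokens, output_tokens),
    )


# Assumed workload: one million input tokens and one million output tokens.
for name, price in PRICES.items():
    print(name, round(workload_cost(price, 1_000_000, 1_000_000), 2))
print(cheapest_provider(PRICES, 1_000_000, 1_000_000))  # prints "phala"
```

Because the phala listing is lower on both input and output prices, it comes out cheaper for any mix of tokens. Price is only one factor: the phala entry is also marked GPU TEE, which the meta-llama entry is not.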
Amazon: Nova Lite 1.0
Updated 4 months ago
Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that focuses on fast processing of image, video, and text inputs to generate text output. Amazon Nova Lite can handle real-time customer interactions, document analysis, and visual question-answering tasks with high accuracy.

With an input context of 300K tokens, it can analyze multiple images or up to 30 minutes of video in a single input.
by amazon|300K context|$0.06/M input tokens|$0.24/M output tokens
Amazon: Nova Micro 1.0
Updated 4 months ago
Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost. With a context length of 128K tokens and optimized for speed and cost, Amazon Nova Micro excels at tasks such as text summarization, translation, content classification, interactive chat, and brainstorming. It has simple mathematical reasoning and coding abilities.
by amazon|128K context|$0.035/M input tokens|$0.14/M output tokens
Amazon: Nova Pro 1.0
Updated 4 months ago
Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and cost for a wide range of tasks. As of December 2024, it achieves state-of-the-art performance on key benchmarks including visual question answering (TextVQA) and video understanding (VATEX).

Amazon Nova Pro demonstrates strong capabilities in processing both visual and textual information and in analyzing financial documents.

NOTE: Video input is not supported at this time.
by amazon|300K context|$0.8/M input tokens|$3.2/M output tokens
Qwen: QwQ 32B Preview
Updated 4 months ago
QwQ-32B-Preview is an experimental research model focused on AI reasoning capabilities developed by the Qwen Team. As a preview release, it demonstrates promising analytical abilities while having several important limitations:

Language Mixing and Code-Switching: The model may mix languages or switch between them unexpectedly, affecting response clarity.

Recursive Reasoning Loops: The model may enter circular reasoning patterns, leading to lengthy responses without a conclusive answer.

Safety and Ethical Considerations: The model requires enhanced safety measures to ensure reliable and secure performance, and users should exercise caution when deploying it.

Performance and Benchmark Limitations: The model excels in math and coding but has room for improvement in other areas, such as common sense reasoning and nuanced language understanding.
by qwen|32K context|$0.15/M input tokens|$0.6/M output tokens
EVA Qwen2.5 72B
Updated 4 months ago
A roleplay and storywriting specialist model, a full-parameter finetune of Qwen2.5-72B on a mixture of synthetic and natural data.

It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity, and "flavor" of the resulting model.
by eva-unit-01|16K context|$4/M input tokens|$6/M output tokens
OpenAI: GPT-4o (2024-11-20)
Updated 4 months ago
The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more natural, engaging, and tailored writing to improve relevance & readability. It’s also better at working with uploaded files, providing deeper insights & more thorough responses.

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities.
by openai|128K context|$2.5/M input tokens|$10/M output tokens
Mistral Large 2411
Updated 4 months ago
Mistral Large 2 2411 is an update of Mistral Large 2.

It provides a significant upgrade on the previous version, with notable improvements in long context understanding, a new system prompt, and more accurate function calling.
by mistralai|128K context|$2/M input tokens|$6/M output tokens
Mistral Large 2407
Updated 4 months ago
This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more.

It supports dozens of languages including French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, along with 80+ coding languages including Python, Java, C, C++, JavaScript, and Bash. Its long context window allows precise information recall from large documents.
by mistralai|128K context|$2/M input tokens|$6/M output tokens
Mistral: Pixtral Large 2411
Updated 4 months ago
Pixtral Large is a 124B-parameter, open-weight, multimodal model. It can understand documents, charts, and natural images.

The model is available under the Mistral Research License (MRL) for research and educational use, and the Mistral Commercial License for experimentation, testing, and production for commercial purposes.
by mistralai|128K context|$2/M input tokens|$6/M output tokens
xAI: Grok Vision Beta
Updated 4 months ago
Grok Vision Beta is xAI's experimental language model with vision capability.
by x-ai|8K context|$5/M input tokens|$15/M output tokens
Mistral Nemo Inferor 12B
Updated 5 months ago
Inferor is a merge of top roleplay models, specializing in immersive narratives and storytelling.

This model was merged using the merge method using as a base.
by infermatic|32K context|$0.25/M input tokens|$0.5/M output tokens
Qwen2.5 Coder 32B Instruct
Updated 5 months ago
Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

Significant improvements in code generation, code reasoning, and code fixing.

A more comprehensive foundation for real-world applications such as Code Agents. It not only enhances coding capabilities but also maintains its strengths in mathematics and general competencies.

by qwen|32K context|$0.08/M input tokens|$0.18/M output tokens
SorcererLM 8x22B
Updated 5 months ago
SorcererLM is an advanced RP and storytelling model, built as a low-rank 16-bit LoRA fine-tune.

Advanced reasoning and emotional intelligence for engaging and immersive interactions

Vivid writing capabilities enriched with spatial and contextual awareness

Enhanced narrative depth, promoting creative and dynamic storytelling
by raifle|16K context|$4.5/M input tokens|$4.5/M output tokens
EVA Qwen2.5 32B
Updated 5 months ago
A roleplaying/storywriting specialist model, a full-parameter finetune of Qwen2.5-32B on a mixture of synthetic and natural data.

It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity, and "flavor" of the resulting model.
by eva-unit-01|16K context|$2.6/M input tokens|$3.4/M output tokens
