Chat · Apache 2.0
Qwen3 235B A22B
Alibaba Qwen · Qwen
Flagship open-weight Qwen3 MoE model often chosen for demanding reasoning, coding, multilingual work, and agent experiments.
Best for: Builders testing frontier-style open-weight reasoning and coding in hosted or multi-GPU environments.
Reasoning · MIT
DeepSeek R1
DeepSeek · DeepSeek
Open model family commonly used as a reference point for reasoning-heavy open-weight workflows.
Best for: Reasoning experiments, coding workflows, and evaluating open reasoning models against closed alternatives.
Local: Can be tested locally through smaller distilled or quantized variants; the full model is better suited to server-class hardware.
Code · DeepSeek License
DeepSeek Coder V2
DeepSeek · DeepSeek
Coding-specialized open model family worth testing for completion, refactoring, and coding assistant workflows.
Best for: Developers comparing open coding models for IDE assistants and coding agents.
Local: Can be tested locally with smaller or quantized builds; larger variants need serious GPU memory.
Hardware: 24GB+ for smaller quantized variants
Updated: 2026
Agents · Modified MIT
Kimi K2
Moonshot AI · Kimi
Large open-weight MoE model from Moonshot AI often discussed for agentic coding and tool-use workflows.
Best for: Builders testing agentic workflows, coding tasks, and tool-use behavior with open weights.
Agents · Check model card
GLM-4.5
Z.ai · GLM
Open-weight MoE model family from Z.ai designed around agent, coding, and reasoning workflows.
Best for: Teams evaluating open models for tool-calling, software engineering, and agent workflows.
Reasoning · Apache 2.0 for referenced 80k release
MiniMax M1
MiniMax · MiniMax
Open-weight hybrid-attention reasoning model worth testing for long-context and agent-oriented workloads.
Best for: Builders comparing long-context reasoning models and hosted open-weight options.
Chat · Llama 3 Community
Llama 3 70B
Meta · Llama
Meta open-weight model family still commonly used as a baseline for local AI stacks and app prototypes.
Best for: Builders who want a widely supported open-weight chat model with broad runtime compatibility.
Local: Commonly run locally through quantized builds, but 70B-class models run best on high-memory GPUs or workstation/server hardware.
Chat · Gemma Terms
Gemma 3 27B
Google · Gemma
Open-weight Gemma-family model worth testing where Google ecosystem support and local deployment matter.
Best for: Developers testing capable medium-sized chat models with broad tooling support.
Local: Can be tested locally with quantized builds on higher-end consumer GPUs or unified-memory systems.
Chat · Apache 2.0
Mistral Small 3.1
Mistral AI · Mistral
Efficient open-weight Mistral-family model often considered for practical local chat and app workloads.
Best for: Builders who want a smaller capable model with permissive licensing and good local runtime support.
Local: A practical local candidate when quantized; still test memory use and context length on your hardware.
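As a rough first check before downloading anything, the memory needed for the weights alone can be estimated from parameter count and bits per weight. The sketch below is back-of-envelope arithmetic only: the function name is hypothetical, the parameter counts are read off the model names in this listing, and the estimate ignores KV cache, activations, and runtime overhead. Real quantized files usually carry some extra overhead beyond the nominal bit width, so treat the numbers as a lower bound.

```python
def estimate_weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough size of the model weights alone, in gigabytes (1 GB = 1e9 bytes).

    Hypothetical helper for illustration; excludes KV cache and runtime overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Parameter counts are taken from the model names in this listing.
for name, params_b in [("Llama 3 70B", 70), ("Gemma 3 27B", 27)]:
    for bits in (16, 8, 4):
        gb = estimate_weight_memory_gb(params_b, bits)
        print(f"{name} at {bits}-bit: ~{gb:.1f} GB for weights")
```

Longer context windows add KV-cache memory on top of this figure, which is why measuring on your own hardware still matters.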
Edge · MIT
Phi-4 Mini
Microsoft · Phi
Small open model family useful for lightweight local prototypes and edge-oriented AI experiments.
Best for: Builders testing small local models on laptops, CPUs, and constrained hardware.
Local: A small-model option for local experiments on laptops, CPUs, and low-VRAM machines.
Code · Check exact model card
Qwen3 Coder
Alibaba Qwen · Qwen
Coding-focused Qwen model family worth testing for editor assistants, code review, and agentic coding workflows.
Best for: Developers comparing open coding models for Continue, Aider, Cline, and Roo Code workflows.
Local: Smaller or quantized coder variants can be tested locally for IDE and coding-agent workflows.
Reranking · Check model card
BGE Reranker v2 M3
BAAI · BGE
Lightweight multilingual reranker commonly used to improve RAG result ordering after vector search.
Best for: RAG builders who need a practical reranker after Qdrant, Chroma, pgvector, or other vector search.
Local: Commonly used in local RAG stacks as a reranking step after vector search.
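To make the "rerank after vector search" step concrete, here is a minimal sketch of where a reranker sits in a pipeline. It does not call BGE or any real model: `token_overlap_score` is a hypothetical stand-in that a real reranker's query/passage scoring would replace, and the candidate passages are made up for illustration.

```python
from typing import Callable, List, Tuple


def token_overlap_score(query: str, passage: str) -> float:
    """Placeholder relevance score: fraction of query tokens found in the passage.

    A real reranker model would replace this function.
    """
    query_tokens = set(query.lower().split())
    passage_tokens = set(passage.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & passage_tokens) / len(query_tokens)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = token_overlap_score,
    top_n: int = 3,
) -> List[Tuple[float, str]]:
    """Score each candidate against the query and return the best top_n."""
    scored = [(score_fn(query, passage), passage) for passage in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Candidates as they might come back from a vector search, in retrieval order.
candidates = [
    "Qdrant stores vectors and payloads",
    "How to reset a password in the admin console",
    "Steps to reset your account password",
]
for score, passage in rerank("reset account password", candidates, top_n=2):
    print(f"{score:.2f}  {passage}")
```

Keeping the scorer as a pluggable function makes it easy to compare different rerankers on the same candidate set.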
Embedding · MIT
Multilingual E5 Large
Microsoft / intfloat · E5
Widely used open embedding model for multilingual semantic search and RAG prototypes.
Best for: Teams building multilingual retrieval, semantic search, and RAG pipelines.
Local: Runs locally for many embedding and semantic search prototypes on CPU or modest GPU hardware.
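The retrieval step that an embedding model feeds can be sketched as a cosine-similarity search. The vectors below are tiny made-up stand-ins, not real E5 output, and the document IDs and function names are hypothetical. In practice the vectors would come from the embedding model and a vector store would handle the search.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: List[float], index: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the IDs of the k documents most similar to the query vector."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Toy 3-dimensional "embeddings"; real models produce much larger vectors.
toy_index = {
    "doc_billing": [0.9, 0.1, 0.0],
    "doc_shipping": [0.1, 0.9, 0.1],
    "doc_returns": [0.2, 0.7, 0.3],
}
query_vec = [0.0, 1.0, 0.2]
print(top_k(query_vec, toy_index, k=2))
```

A brute-force scan like this is fine for small prototypes. Larger collections usually move to an approximate nearest-neighbor index.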
Embedding · Check exact model card
Qwen3 Embedding
Alibaba Qwen · Qwen
Qwen embedding model family worth testing for modern RAG and multilingual retrieval stacks.
Best for: Builders who want a newer embedding family to compare against E5, BGE, and Jina.
Local: Smaller embedding variants are practical for local RAG and retrieval experiments.
Vision · Check exact model card
Qwen3 VL
Alibaba Qwen · Qwen
Vision-language Qwen family useful when workflows need images, screenshots, documents, or UI understanding.
Best for: Builders adding visual understanding to open AI workflows.
Local: Can be tested locally when compatible checkpoints and runtimes are available; serving multimodal models is more demanding than serving text-only models.
Audio · MIT
Whisper Large V3
OpenAI · Whisper
Open speech recognition model commonly used for transcription and multilingual audio workflows.
Best for: Builders adding local transcription, podcast processing, meeting notes, or audio translation.
Local: Commonly used for local transcription workflows; GPU improves batch throughput.