Open Source AI Models

Filterable directory of open-source and open-weight models: practical use cases, hardware notes, licenses, and source links.

Chat · Apache 2.0

Qwen3 235B A22B

Alibaba Qwen · Qwen

Flagship open-weight Qwen3 MoE model often chosen for serious reasoning, coding, multilingual work, and agent experiments.

Best for: Builders testing frontier-style open-weight reasoning and coding in hosted or multi-GPU environments.

Hardware: Server-class · Updated: 2026

Reasoning · MIT

DeepSeek R1

DeepSeek · DeepSeek

Open model family commonly used as a reference point for reasoning-heavy open-weight workflows.

Best for: Reasoning experiments, coding workflows, and evaluating open reasoning models against closed alternatives.

Local: Can be tested locally through smaller distilled or quantized variants; the full model is better suited to server-class hardware.

Hardware: Server-class · Updated: 2026

Code · DeepSeek License

DeepSeek Coder V2

DeepSeek · DeepSeek

Coding-specialized open model family worth testing for completion, refactoring, and coding assistant workflows.

Best for: Developers comparing open coding models for IDE assistants and coding agents.

Local: Can be tested locally with smaller or quantized builds; larger variants need serious GPU memory.

Hardware: 24GB+ for smaller quantized variants · Updated: 2026

Agents · Modified MIT

Kimi K2

Moonshot AI · Kimi

Large open-weight MoE model from Moonshot AI often discussed for agentic coding and tool-use workflows.

Best for: Builders testing agentic workflows, coding tasks, and tool-use behavior with open weights.

Hardware: Server-class · Updated: 2026

Agents · Check model card

GLM-4.5

Z.ai · GLM

Open-weight MoE model family from Z.ai designed around agent, coding, and reasoning workflows.

Best for: Teams evaluating open models for tool-calling, software engineering, and agent workflows.

Hardware: Server-class · Updated: 2026

Reasoning · Apache 2.0 for referenced 80k release

MiniMax M1

MiniMax · MiniMax

Open-weight hybrid-attention reasoning model worth testing for long-context and agent-oriented workloads.

Best for: Builders comparing long-context reasoning models and hosted open-weight options.

Hardware: Server-class · Updated: 2026

Chat · Llama 3 Community

Llama 3 70B

Meta · Llama

Meta open-weight model family still commonly used as a baseline for local AI stacks and app prototypes.

Best for: Builders who want a widely supported open-weight chat model with broad runtime compatibility.

Local: Commonly used in local workflows through quantized builds, but 70B-class models are best with high-memory GPUs or workstation/server hardware.

Hardware: 48GB · Updated: 2026
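
The hardware notes in this directory can be sanity-checked with simple arithmetic: weight memory is roughly parameter count times bits per weight, divided by 8. The sketch below is only a rough estimate. The function name and the 1.2 overhead factor (for KV cache and runtime buffers) are assumptions for illustration, not measured values, and real usage depends on the runtime, context length, and quantization format.

```python
def estimate_memory_gb(params_billions: float,
                       bits_per_weight: float,
                       overhead: float = 1.2) -> float:
    """Rough memory estimate in GB (10^9 bytes) for holding model weights.

    overhead is an assumed multiplier for KV cache and runtime buffers;
    it is a placeholder, not a measured value.
    """
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9


if __name__ == "__main__":
    # Parameter counts are taken from the model names in this directory.
    for name, params in [("Llama 3 70B", 70), ("Gemma 3 27B", 27)]:
        for bits in (16, 4):
            gb = estimate_memory_gb(params, bits)
            print(f"{name} at {bits}-bit: ~{gb:.1f} GB")
```

Under these assumptions a 70B model at 4-bit comes out near 42 GB and a 27B model near 16 GB, which is in line with the 48GB and 24GB notes on the cards. Treat the numbers as a starting point and measure on your own hardware.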

Chat · Gemma Terms

Gemma 3 27B

Google · Gemma

Open-weight Gemma-family model worth testing where Google ecosystem support and local deployment matter.

Best for: Developers testing capable medium-sized chat models with broad tooling support.

Local: Can be tested locally with quantized builds on higher-end consumer GPUs or unified-memory systems.

Hardware: 24GB · Updated: 2026

Chat · Apache 2.0

Mistral Small 3.1

Mistral AI · Mistral

Efficient open-weight Mistral-family model often considered for practical local chat and app workloads.

Best for: Builders who want a smaller capable model with permissive licensing and good local runtime support.

Local: A practical local candidate when quantized; still test memory use and context length on your hardware.

Hardware: 16GB+ quantized · Updated: 2026

Edge · MIT

Phi-4 Mini

Microsoft · Phi

Small open model family useful for lightweight local prototypes and edge-oriented AI experiments.

Best for: Builders testing small local models on laptops, CPUs, and constrained hardware.

Local: A small-model option for local experiments on laptops, CPUs, and low-VRAM machines.

Hardware: 4GB · Updated: 2026

Code · Check exact model card

Qwen3 Coder

Alibaba Qwen · Qwen

Coding-focused Qwen model family worth testing for editor assistants, code review, and agentic coding workflows.

Best for: Developers comparing open coding models for Continue, Aider, Cline, and Roo Code workflows.

Local: Smaller or quantized coder variants can be tested locally for IDE and coding-agent workflows.

Hardware: Varies by size · Updated: 2026

Reranking · Check model card

BGE Reranker v2 M3

BAAI · BGE

Lightweight multilingual reranker commonly used to improve RAG result ordering after vector search.

Best for: RAG builders who need a practical reranker after Qdrant, Chroma, pgvector, or other vector search.

Local: Commonly used in local RAG stacks as a reranking step after vector search.

Hardware: CPU or small GPU · Updated: 2026
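
The "rerank after vector search" step described on this card can be sketched as a second pass over the candidates that vector search returns. This is a minimal illustration, not the BGE reranker API: `rerank` and `token_overlap_score` are hypothetical names, and the token-overlap scorer is only a stand-in for a real reranker model that would score each (query, document) pair.

```python
from typing import Callable, List


def token_overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query tokens found in the document.

    Stand-in for a real reranker model; used here only so the sketch runs.
    """
    query_tokens = set(query.lower().split())
    doc_tokens = set(doc.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


def rerank(query: str,
           candidates: List[str],
           score_fn: Callable[[str, str], float],
           top_k: int = 3) -> List[str]:
    """Re-order vector-search candidates by a pairwise relevance score."""
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    # Highest score first; ties keep their original vector-search order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


if __name__ == "__main__":
    # Pretend these came back from Qdrant, Chroma, or pgvector.
    hits = [
        "how to bake bread",
        "reranking improves rag result ordering",
        "vector search with pgvector",
    ]
    print(rerank("rag reranking ordering", hits, token_overlap_score, top_k=2))
```

In a real stack the scoring function would call the reranker model instead of counting tokens; the surrounding retrieve-then-reorder shape stays the same.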

Embedding · MIT

Multilingual E5 Large

Microsoft / intfloat · E5

Widely used open embedding model for multilingual semantic search and RAG prototypes.

Best for: Teams building multilingual retrieval, semantic search, and RAG pipelines.

Local: Runs locally for many embedding and semantic search prototypes on CPU or modest GPU hardware.

Hardware: CPU or small GPU · Updated: 2026
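
Semantic search with an embedding model generally reduces to comparing a query vector against stored document vectors. The sketch below shows that comparison with cosine similarity over hand-written toy vectors. The vectors, document ids, and function names are hypothetical; a real pipeline would obtain the vectors from the embedding model (the E5 model card describes prefixing inputs with `query: ` and `passage: `, so check the card before encoding).

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec: List[float],
           index: Dict[str, List[float]],
           top_k: int = 2) -> List[str]:
    """Return the ids of the top_k most similar documents."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for real model embeddings.
    toy_index = {
        "doc_fr": [0.9, 0.1, 0.0],
        "doc_de": [0.1, 0.9, 0.0],
        "doc_es": [0.7, 0.3, 0.1],
    }
    print(search([1.0, 0.0, 0.0], toy_index))
```

A vector database performs the same ranking at scale with approximate nearest-neighbor indexes; the brute-force loop here is only meant for small prototypes.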

Embedding · Check exact model card

Qwen3 Embedding

Alibaba Qwen · Qwen

Qwen embedding model family worth testing for modern RAG and multilingual retrieval stacks.

Best for: Builders who want a newer embedding family to compare against E5, BGE, and Jina.

Local: Smaller embedding variants are practical for local RAG and retrieval experiments.

Hardware: CPU or small GPU · Updated: 2026

Vision · Check exact model card

Qwen3 VL

Alibaba Qwen · Qwen

Vision-language Qwen family useful when workflows need images, screenshots, documents, or UI understanding.

Best for: Builders adding visual understanding to open AI workflows.

Local: Can be tested locally when compatible checkpoints and runtimes are available; multimodal serving is more demanding than text-only models.

Hardware: Varies by size · Updated: 2026

Audio · MIT

Whisper Large V3

OpenAI · Whisper

Open speech recognition model commonly used for transcription and multilingual audio workflows.

Best for: Builders adding local transcription, podcast processing, meeting notes, or audio translation.

Local: Commonly used for local transcription workflows; GPU improves batch throughput.

Hardware: 10GB · Updated: 2026