Mixture-of-Experts (MoE) is a neural network architecture that routes different inputs through different subsets of the network. It can be loosely pictured as a system in which a small router (gating network) dispatches each input to a handful of specialized expert sub-networks and combines their outputs.
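
A minimal sketch of what such a layer might look like, written in PyTorch as an assumption (none of the models named below necessarily implement it this way): a linear router scores each token, the top-k experts process it, and their outputs are blended by the routing weights. Names like `MoELayer`, `n_experts`, and `top_k` are illustrative, not taken from any real codebase.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Illustrative top-k mixture-of-experts feed-forward layer."""

    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router (gating network): scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts)
        # Experts: independent feed-forward sub-networks.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.ReLU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (batch, d_model), one token per row
        scores = self.router(x)                           # (batch, n_experts)
        top_w, top_idx = scores.topk(self.top_k, dim=-1)  # pick k experts per token
        top_w = F.softmax(top_w, dim=-1)                  # normalize their weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e              # tokens routed to expert e
                if mask.any():
                    out[mask] += top_w[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(4, 64)
print(MoELayer()(x).shape)  # torch.Size([4, 64])
```

Only `top_k` of the `n_experts` sub-networks run for any given token, which is the property the rest of this section relies on.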

In theory, this architecture makes models easier to train and cheaper to run, because each prediction activates only a small fraction of the model's parameters rather than the entire network.
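
To make that concrete, here is a rough parameter count using the hypothetical `MoELayer` above: every expert contributes to the total model size, but only `top_k` of them are used per token.

```python
layer = MoELayer(d_model=64, d_hidden=256, n_experts=8, top_k=2)

# Total expert parameters stored in memory vs. those used for one token.
total = sum(p.numel() for p in layer.experts.parameters())
active = total * layer.top_k // len(layer.experts)  # 2 of 8 experts -> ~1/4 of expert weights
print(f"total expert params: {total}, active per token: {active}")
```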

High-profile models that use this architecture include Gemini 1.5, Mixtral 8x7B, and DeepSeek-V3.