A brief glossary of key terms related to ModularMind and AI in general.

LLM

Large language models (LLMs) are AI models with a very large number of parameters that can perform a variety of surprisingly useful tasks. These models are trained on vast amounts of text data and can generate human-like text, answer questions, summarize information, and more. ModularMind supports multiple LLMs to help you get the most out of your AI assistant by leveraging the strengths of different models.

Context Length

The “context length”, also known as the “context window”, refers to the amount of text a language model can look back on and reference when generating new text. This is different from the large corpus of data the language model was trained on, and instead represents a “working memory” for the model. A longer context length allows the model to understand and respond to more complex and lengthy prompts, while a shorter context length may limit the model’s ability to handle longer prompts or maintain coherence over extended instructions and data.
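For example, a prompt that exceeds the context window cannot be processed in full. The sketch below is only an illustration: it assumes the open-source tiktoken tokenizer (cl100k_base encoding) and a hypothetical 8,192-token window; actual tokenizers and context lengths vary by model.

```python
# Minimal sketch: check whether a prompt leaves room for the model's reply.
# Assumes tiktoken's cl100k_base encoding and an 8,192-token window purely
# for illustration; real models use their own tokenizers and window sizes.
import tiktoken

CONTEXT_LENGTH = 8192  # hypothetical context window, in tokens
enc = tiktoken.get_encoding("cl100k_base")

def fits_in_context(prompt: str, max_output_tokens: int = 512) -> bool:
    """Return True if the prompt plus the expected reply fits in the window."""
    prompt_tokens = len(enc.encode(prompt))
    return prompt_tokens + max_output_tokens <= CONTEXT_LENGTH

long_document = "word " * 20_000  # stand-in for a lengthy input
print(fits_in_context("Summarize this report: " + long_document))  # False
```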

Tokens

Tokens are the smallest individual units of text that a language model processes, and can correspond to words, subwords, characters, or even bytes (in the case of Unicode). For English text, a token corresponds to approximately 3 characters on average, though the exact ratio varies depending on the language and the tokenizer. Tokens are typically hidden when interacting with language models at the “text” level but become relevant when examining a model’s exact inputs and outputs.
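As an illustration, the sketch below tokenizes a short string with the open-source tiktoken library (using its cl100k_base encoding as a stand-in; each model’s actual tokenizer differs) and prints the token IDs, the text each ID maps back to, and the resulting characters-per-token ratio.

```python
# Minimal tokenization sketch using tiktoken's cl100k_base encoding as a
# stand-in; token boundaries and the characters-per-token ratio depend on
# the tokenizer each model actually uses.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "ModularMind supports multiple LLMs."
token_ids = enc.encode(text)

print(token_ids)                              # list of integer token IDs
print([enc.decode([t]) for t in token_ids])   # each ID mapped back to its text
print(len(text) / len(token_ids))             # rough characters per token
```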