Question 1

Is Pygmalion.chat the same as the Pygmalion models?

Accepted Answer

No, and this is the most common confusion. The Pygmalion models (Pygmalion-6B, Pygmalion-13B, MythoMax-L2-13B) are open-source neural network weights distributed on Hugging Face as GGUF or similar files. Pygmalion.chat is the hosted browser frontend run by the same collective. You can use either independently: the models work in any llama.cpp-compatible runtime (SillyTavern, KoboldCpp, LM Studio, Oobabooga, Ollama), and pygmalion.chat works even if you have never touched a local model.

Question 2

What hardware do I actually need to run Pygmalion-13B or MythoMax locally?

Accepted Answer

For smooth 13B inference at usable speed, target an NVIDIA GPU with 8 GB VRAM or more — RTX 3060 12GB and RTX 4060 Ti 16GB are the most cited sweet-spot cards. CPU-only is possible with 16+ GB system RAM but generation is slow enough (seconds per token) that most users give up. Quantization makes a real difference: Q4_K_M is the standard balance of quality and footprint, Q5 / Q6 trade speed for slightly better outputs. Context window of 4096–8192 tokens covers most RP use; longer contexts hit memory limits fast on consumer cards.

Question 3

Pygmalion vs MythoMax — which model should I actually download?

Accepted Answer

MythoMax-L2-13B is the safer default in 2026. Built on Llama 2 and trained on the MythoLogic dataset, it has cleaner narrative fluency and better character consistency than the original Pygmalion-6B / 13B fine-tunes from GPT-J / GPT-NeoX bases. Most SillyTavern users have moved to MythoMax as their daily driver. Original Pygmalion models remain interesting for users specifically wanting the older base behaviour or for fine-tune experimentation, but for chatting, MythoMax is the standard.

Question 4

Is Pygmalion right for me?

Accepted Answer

PygmalionAI isn't a single chat app — it's an ecosystem of open-source models and compatible frontends designed specifically for roleplay. The core distinction: when you visit pygmalion.chat, you're using a hosted browser frontend, but the real power lies in the Pygmalion models themselves: Pygmalion-6B, Pygmalion-13B, and the community-driven MythoMax-L2-13B. These are neural network weights released on Hugging Face, fine-tuned from GPT-J, GPT-NeoX, and Llama bases to excel at character roleplay — emotional depth, consistent personas, and uncensored dialogue.

Most serious RP users eventually load Pygmalion models into SillyTavern or KoboldCpp for deeper control. But if you want to jump in without local setup, pygmalion.chat offers a simple entry point: browse community character cards, chat with them in the browser, and connect your own backend if desired. The platform embodies the philosophy of the F cluster (local open-source RP ecosystem): put the user in control, no content filters, no paywalls between you and quality roleplay.

Question 5

How do I get started in ten minutes?

Accepted Answer

Starting with Pygmalion.chat is straightforward. Visit the homepage, log in (or browse anonymously), and scroll the character library. Characters are user-created cards with personality profiles, conversation histories, and scenario details. Click any character and begin chatting in the browser. The platform works in plain vanilla browser — no client download required.

If you want to level up, link your own backend. Pygmalion.chat supports connecting KoboldCpp, LM Studio, or any KoboldAI-compatible API. This bypasses the platform's default model and injects your local Pygmalion-13B or MythoMax instance. Go to settings, paste your backend URL (e.g., http://localhost:5000), and you're instantly using your own open-source firepower. The beauty: same character cards, same UI, but you control the model and parameters — temperature, context length, sampling strategy, all at your fingertips.

Question 6

What are the Pygmalion and MythoMax models?

Accepted Answer

Pygmalion-6B and Pygmalion-13B were the first models explicitly optimized for roleplay. Built by the community and released on Hugging Face, they use dialogue data (not Q&A), preserve emotional markers like smiles and hugs you, and maintain character consistency across long conversations. The 13B version is the workhorse: 13 billion parameters fine-tuned from GPT-NeoX, strong emotional expression, no content filters, entirely open-source.

MythoMax-L2-13B is the community's refinement. Based on Llama 2 and trained on the MythoLogic dataset, it offers superior narrative fluency and character depth compared to the original Pygmalions. Most experienced RP users now prefer MythoMax-L2-13B, and it's become the de facto standard model in the SillyTavern community. Both models are distributed as GGUF (quantized, efficient format), typically 4–8 GB files that run on consumer GPUs with 8GB VRAM or better. Quantization levels (Q4_K_M, Q5, Q6) trade speed for quality; Q4 is the sweet spot for most setups.

Question 7

How do I run this locally?

Accepted Answer

Local deployment is where Pygmalion truly shines. Download the GGUF model from Hugging Face, grab KoboldCpp or LM Studio, and within minutes you have a personal RP engine with zero cost, zero latency, zero data leaving your machine. KoboldCpp is the minimal path: a single .exe that wraps llama.cpp and serves the KoboldAI API. Download it, select your GGUF file, click Launch, configure SillyTavern to point to localhost:5000, and you're running MythoMax-13B locally.

LM Studio offers a friendlier GUI: drop your GGUF file into the model folder, click Load, and chat directly in the interface. For hardened users, run the raw llama.cpp or OobaBooga TextGen WebUI. All paths converge on the same principle: load Pygmalion/MythoMax locally, then control every generation parameter. Typical hardware needs: NVIDIA GPU with 8GB+ VRAM for smooth 13B inference, or 16GB+ RAM if CPU-only. Context windows of 4096–8192 tokens are standard for RP, ensuring character memories and plot details persist.

Question 8

How does it compare to other platforms?

Accepted Answer

Pygmalion models stand apart from general-purpose LLMs by design. Compared to Llama 3, Claude, or GPT-4, Pygmalion excels at emotional roleplay, character consistency, and creative freedom — but lacks the broad knowledge and reasoning power of those models. They're not for homework help or coding; they're for inhabiting fictional worlds. In the RP platform landscape, Pygmalion occupies the intersection of SillyTavern (the UI powerhouse) and open-source ideology: full control, no corporate surveillance, no content moderation.

Best suited for: users migrating from Character.AI who value uncensored RP; SillyTavern veterans hunting better models; writers building long-form narratives; and hobbyists who own a decent GPU and want to experiment. If you're comfortable with local setup and value privacy, Pygmalion is the benchmark. If you want plug-and-play and don't mind UI limitations, pygmalion.chat gets you started. Either way, you're part of a deep community: Discord servers, character libraries (Chub AI, CharacterHub), and parameter guides shared by thousands of RP enthusiasts worldwide.

Alternatives: SillyTavern, Agnai, Chub AI

Question 9

Migrating from Character.AI to Pygmalion

Accepted Answer

The most common journey into Pygmalion goes through Character.AI dissatisfaction. The pattern is consistent enough to be worth mapping: users start on C.AI, accumulate favourite characters, hit policy or filter friction, search for alternatives, land first on Janitor or Chub for the catalogue, then notice that the actual model behind those platforms matters as much as the UI. That is the entry point where Pygmalion (or really MythoMax) becomes a target — the open-source RP-tuned model is what gives the new platform its character voice.

The practical migration has three steps. First, export or recreate the character. Character.AI does not export character definitions as Tavern V2 cards, so the option is to copy the persona description, scenario, and a few example dialogues into a fresh card on Chub or directly into your local frontend. Tavern V2 PNG format is the universal currency: SillyTavern, KoboldCpp + SillyTavern, Agnai, Backyard, Chub, and Janitor all read it. Second, pick a frontend: SillyTavern for maximum control and extensions, Agnai if browser convenience and multi-user matters, Backyard if a packaged desktop app is what you want. Third, pick a model: MythoMax-L2-13B as default, original Pygmalion-13B for the older RP flavour, or a Llama 3 fine-tune if you have the VRAM headroom.

Three frictions are worth being honest about. Generation speed: a 13B model on a mid-range GPU runs 5–20 tokens per second, which feels noticeably slower than cloud chat — most users adapt within a week, but the first session can feel jarring. Memory ceiling: 4096–8192 token contexts mean very long RP sessions need either summarisation discipline or vector-memory extensions like SillyTavern's smart-context. Model behaviour drift: the open-source RP models are less polished than frontier general-purpose models — phrasing is sometimes awkward, knowledge gaps are real, and the right prompt template (Alpaca, ChatML, Vicuna) is part of getting good outputs.

The upside on the other side of those frictions is the part that keeps users on Pygmalion. No filters of any kind. No subscription. No remote logging of intimate conversations. Full control over sampler settings (temperature, top-p, top-k, repetition penalty), prompt templates, and context handling. The same character card works across multiple models, so swapping between MythoMax and Pygmalion is a one-click test rather than a rebuild. And the community library — Chub, CharacterHub, r/PygmalionAI, r/SillyTavernAI — is large enough that almost any RP scenario already exists in someone's published card.

Question 10

Common pitfalls and how to avoid them

Accepted Answer

The single most common pitfall is picking the wrong prompt template. Pygmalion-13B was trained with one instruction format, MythoMax with another, and Llama-base fine-tunes with yet another. Using the wrong template silently degrades output quality — characters become repetitive, dialogue breaks structure, and the model feels worse than it actually is. SillyTavern, Agnai, and KoboldCpp all let you pick the template per chat; check the model card on Hugging Face for the recommended format and apply it before judging the model.

The second pitfall is quantization confusion. GGUF files come in multiple quantization levels — Q2, Q3, Q4_K_M, Q5_K_M, Q6, Q8 — and the file size difference is large. Q2 will fit on 4 GB VRAM but generation quality is visibly degraded; Q4_K_M is the standard balance; Q6 and Q8 produce noticeably cleaner outputs at the cost of more VRAM. For a first-time Pygmalion / MythoMax user with an 8 GB card, Q4_K_M is the right pick; Q5_K_M if you have 12 GB; Q6 if you have 16 GB or more. There is no value in running Q2 unless your hardware leaves no other choice.

The third pitfall is expecting the model to "just remember" everything. Open-source RP models do not have hidden long-term memory; the context window is the entire memory. A 4096-token context covers roughly the last few thousand words of chat. Beyond that, anything older is gone unless re-introduced. SillyTavern's Author's Note, Lorebook, and vector-memory extensions are the standard ways to keep long-running RP sessions coherent — without them, characters forget the relationship setup after the first few hours of chat.

Finally, community-shared sampler settings are a shortcut, not a cargo cult. Reddit threads on r/SillyTavernAI publish recommended sampler presets for specific models. Those presets are good starting points and worth trying. They are not magic — temperature and repetition penalty interact with the prompt and the character description, so a preset that someone praises may underperform on your card. Keep one preset as a baseline, change one parameter at a time, and revert when an experiment makes things worse.

Pygmalion.chat

Overview

Preview

Get to know Pygmalion.chat