hyppyhyppo edited this page Sep 12, 2023 · 18 revisions
  • Where can I put the training photo dataset?

In the Concepts tab. First add a config, then add a concept to that config. Clicking on the concept opens a window where you specify the path to your dataset. You can define several concepts for a config.

  • Where can I put the trigger word?

The trigger word for a LoRA or embedding is simply the name of the output model: name it tomcruise.safetensors if you want tomcruise as the trigger word. For your sample, use a prompt calling the LoRA by name, or <embedding>, which acts as a placeholder. LoRA: photo of man <lora:tomcruise> Embedding: photo of a <embedding>
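The naming convention above can be sketched in a few lines (a hedged illustration only: "tomcruise" and both prompts are example values, not OneTrainer defaults):

```python
# Hedged sketch: the trigger word is just the output model's base name.
# "tomcruise" is an example value, not anything OneTrainer requires.
model_name = "tomcruise"
output_file = f"{model_name}.safetensors"   # the file you configure as output

# Sample prompts for previews during training:
lora_prompt = f"photo of man <lora:{model_name}>"  # LoRA: call the model by name
embedding_prompt = "photo of a <embedding>"        # embedding: use the placeholder

print(output_file)       # tomcruise.safetensors
print(lora_prompt)       # photo of man <lora:tomcruise>
print(embedding_prompt)  # photo of a <embedding>
```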

  • [Embedding] How shall I use <embedding> when training embeddings?

For embeddings, the trigger word is the embedding name. If it is TomCruise:

Do you add <TomCruise> or <embedding> to captions? Answer: <embedding>.

Do you add <TomCruise> or TomCruise to Embedding tab -> Initial embedding text? Answer: neither; use a brief description of your subject to help your embedding train faster, so just "*", "man", or "short man".

Do you add <TomCruise>, TomCruise, or <embedding> to the Sampling tab? Answer: <embedding>.

Note: TomCruise is only used as the output model name: TomCruise.safetensors.

  • [Embedding] How to set the initial embedding text?

In short, the initial embedding text should describe your subject, but its token count should not exceed the embedding's token count.

Here is an explanation. Take the example prompt "photograph of <embedding> with", and (for simplicity) assume every word is encoded into a single token. This could result in the token IDs [1, 2, ..., 3], where "..." is the place your embedding is inserted. If you have a 3-token embedding, the result would be, for example, [1, 2, 100, 101, 102, 3].
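The splice described above can be sketched in a few lines (the token IDs are invented for illustration; a real trainer uses the CLIP tokenizer):

```python
# Toy splice of embedding token IDs into a prompt's token IDs.
# IDs are made up for illustration and match the example in the text.
prompt_ids = [1, 2, None, 3]     # "photograph of <embedding> with"; None marks the slot
embedding_ids = [100, 101, 102]  # a 3-token embedding

slot = prompt_ids.index(None)
spliced = prompt_ids[:slot] + embedding_ids + prompt_ids[slot + 1:]
print(spliced)  # [1, 2, 100, 101, 102, 3]
```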

Now let's say you set an init text of "blond woman wearing a pink hat". That's 6 tokens, but the embedding only supports 3 tokens, so only the first 3 of the 6 are actually used; the rest are truncated.

This also works the other way around. If you supply a shorter text (like "blond woman"), the trainer doesn't know what to use as the third token. In OneTrainer, missing tokens are padded with the "*" token, so "blond woman" becomes "blond woman *".
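Both behaviors can be sketched with a toy one-word-per-token tokenizer standing in for CLIP's (fit_init_text is a hypothetical helper for illustration, not OneTrainer's actual code):

```python
# Toy sketch of fitting the initial embedding text to the embedding's token count.
# Assumption: one word = one token; OneTrainer really uses the CLIP tokenizer.
def fit_init_text(init_text: str, token_count: int, pad_token: str = "*") -> list[str]:
    tokens = init_text.split()[:token_count]             # too long -> truncate
    tokens += [pad_token] * (token_count - len(tokens))  # too short -> pad with "*"
    return tokens

print(fit_init_text("blond woman wearing a pink hat", 3))  # ['blond', 'woman', 'wearing']
print(fit_init_text("blond woman", 3))                     # ['blond', 'woman', '*']
```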
