
Update documentation
schorndorfer committed Nov 10, 2023
1 parent 1e13ccc commit e688c91
Showing 3 changed files with 13 additions and 27 deletions.
19 changes: 6 additions & 13 deletions _sources/llm-defined.md
@@ -22,13 +22,6 @@ A `Language Model` (LM) is a model that assigns probabilities to sequences of words
</video>
:::
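
As a quick illustration (toy numbers, added here for clarity, not from the original page): a language model scores a sequence by chaining conditional next-word probabilities together.

```python
# P("the cat sat") = P("the") * P("cat" | "the") * P("sat" | "the cat")
# The probabilities below are made up for illustration only.
probs = {"the": 0.05, "cat | the": 0.01, "sat | the cat": 0.2}

sequence_probability = 1.0
for p in probs.values():
    sequence_probability *= p

print(sequence_probability)   # 1e-04
```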

<!-- ```{figure} ./images/animated-transfomer.mp4
---
width: 600px
name: animated-transfomer
---
[Animated Transfomer](https://prvnsmpth.github.io/animated-transformer/)
``` -->

```{figure} ./images/lm-hist.png
---
@@ -40,9 +33,9 @@ name: lm-hist

:::{admonition} **[Large](https://en.wikipedia.org/wiki/Large_language_model#List)**
:class: tip
We call a language model `large` when it has
- Many parameters (billions)
- And has been trained on large quantities of language data (billions of words/tokens)
We call a language model `large` when:
- The model has billions of parameters
- The model has been trained on billions of words/tokens
:::
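
To make the scale concrete, here is a minimal sketch (an illustration added here, not part of the original page) that counts the parameters of a toy Transformer encoder with PyTorch. Real LLMs use the same kind of architecture but with thousands of hidden dimensions and dozens of layers, which pushes the count into the billions.

```python
# A toy parameter count -- this configuration is hypothetical and far smaller
# than any production LLM (GPT-3, for example, has roughly 175 billion parameters).
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, dim_feedforward=2048)
model = nn.TransformerEncoder(layer, num_layers=6)

n_params = sum(p.numel() for p in model.parameters())
print(f"toy model: {n_params / 1e6:.1f}M parameters")  # ~19M for this configuration
```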

```{figure} ./images/wikipedia-list.png
@@ -55,7 +48,7 @@ name: wikipedia-list

:::{admonition} **Transformer**
:class: tip
Transformers were the key innovation that allowed language models to get large. They are a deep learning architecture that allow massive parallelization of training and inference on GPUs
`Transformers` were the key innovation that allowed language models to get large. They are a deep learning architecture that allows massive parallelization of training and inference on GPUs.
:::
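
The sketch below (an added illustration, not taken from the page above) shows the scaled dot-product attention at the heart of the Transformer: every position in a sequence is processed in one batch of matrix multiplications, which is the property that makes training and inference parallelize so well on GPUs.

```python
# Minimal scaled dot-product attention over a whole batch of sequences at once.
import math
import torch

def attention(q, k, v):
    # q, k, v: (batch, seq_len, d_model); all positions are handled in parallel
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

q = k = v = torch.randn(2, 128, 64)   # 2 sequences of 128 tokens, 64-dim embeddings
print(attention(q, k, v).shape)       # torch.Size([2, 128, 64])
```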

```{figure} ./images/ai-2-transformer.png
@@ -76,12 +69,12 @@ name: attention

:::{admonition} **Pre-trained**
:class: tip
Pre-trained language models have been trained via self-supervision on vast quantities of text. These are also called [`foundation`](https://en.wikipedia.org/wiki/Foundation_models) models. They are not typically useful until...
`Pre-trained` language models have been trained via self-supervision on vast quantities of text. These are also called [`foundation`](https://en.wikipedia.org/wiki/Foundation_models) models. They are not typically useful until...
:::
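
As a rough sketch of what self-supervision means in practice (hypothetical toy numbers, not from the original text): the training targets are simply the input tokens shifted by one position, so the raw text itself supplies the labels.

```python
# Next-token prediction: the text provides its own training targets.
import torch
import torch.nn.functional as F

token_ids = torch.tensor([[5, 17, 42, 8, 99]])          # a toy tokenized sentence
inputs, targets = token_ids[:, :-1], token_ids[:, 1:]   # predict each next token

vocab_size = 128
logits = torch.randn(1, inputs.size(1), vocab_size, requires_grad=True)  # stand-in for model output
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()   # in real pre-training this gradient updates billions of parameters
print(loss.item())
```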

:::{admonition} **Generative**
:class: tip
Generative models are foundation models that have been further trained via supervised fine-tuning and reinforcement learning from human feedback (RLHF) to behave in a useful and safe manner, for example by responding to questions with answers like a chat assistant.
`Generative` models are foundation models that have been further trained via supervised fine-tuning and reinforcement learning from human feedback (RLHF) to behave in a useful and safe manner, for example by responding to questions with answers like a chat assistant.
:::
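
The sketch below (hypothetical data, added for illustration) shows the kind of prompt/response pair used in the supervised fine-tuning step; RLHF then further adjusts the model's behavior based on human preference judgments.

```python
# One supervised fine-tuning example: a user prompt paired with the desired assistant reply.
fine_tuning_example = {
    "messages": [
        {"role": "user", "content": "What is a transformer?"},
        {"role": "assistant", "content": "A deep learning architecture built around attention..."},
    ]
}

def to_training_text(example):
    """Flatten a chat example into a single string the model is trained to reproduce."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in example["messages"])

print(to_training_text(fine_tuning_example))
```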

:::{card} [OpenAI](https://en.wikipedia.org/wiki/OpenAI):
19 changes: 6 additions & 13 deletions llm-defined.html
@@ -411,13 +411,6 @@ <h1>Generative Pre-Trained Transformer (GPT)<a class="headerlink" href="#generat
</video>
</div>
</div>
<!-- ```{figure} ./images/animated-transfomer.mp4
---
width: 600px
name: animated-transfomer
---
[Animated Transfomer](https://prvnsmpth.github.io/animated-transformer/)
``` -->
<figure class="align-default" id="lm-hist">
<a class="reference internal image-reference" href="_images/lm-hist.png"><img alt="_images/lm-hist.png" src="_images/lm-hist.png" style="width: 600px;" /></a>
<figcaption>
@@ -426,10 +419,10 @@ <h1>Generative Pre-Trained Transformer (GPT)<a class="headerlink" href="#generat
</figure>
<div class="tip admonition">
<p class="admonition-title"><strong><a class="reference external" href="https://en.wikipedia.org/wiki/Large_language_model#List">Large</a></strong></p>
<p>We call a language model <code class="docutils literal notranslate"><span class="pre">large</span></code> when it has</p>
<p>We call a language model <code class="docutils literal notranslate"><span class="pre">large</span></code> when:</p>
<ul class="simple">
<li><p>Many parameters (billions)</p></li>
<li><p>And has been trained on large quantities of language data (billions of words/tokens)</p></li>
<li><p>The model has billions of parameters</p></li>
<li><p>The model has been trained on billions of words/tokens</p></li>
</ul>
</div>
<figure class="align-default" id="wikipedia-list">
@@ -440,7 +433,7 @@ <h1>Generative Pre-Trained Transformer (GPT)<a class="headerlink" href="#generat
</figure>
<div class="tip admonition">
<p class="admonition-title"><strong>Transformer</strong></p>
<p>Transformers were the key innovation that allowed language models to get large. They are a deep learning architecture that allow massive parallelization of training and inference on GPUs</p>
<p><code class="docutils literal notranslate"><span class="pre">Transformers</span></code> were the key innovation that allowed language models to get large. They are a deep learning architecture that allow massive parallelization of training and inference on GPUs</p>
</div>
<figure class="align-default" id="trans-subset">
<a class="reference internal image-reference" href="_images/ai-2-transformer.png"><img alt="_images/ai-2-transformer.png" src="_images/ai-2-transformer.png" style="width: 600px;" /></a>
@@ -454,11 +447,11 @@ <h1>Generative Pre-Trained Transformer (GPT)<a class="headerlink" href="#generat
</figure>
<div class="tip admonition">
<p class="admonition-title"><strong>Pre-trained</strong></p>
<p>Pre-trained language models have been trained via self-supervision on vast quantities of text. These are also called <a class="reference external" href="https://en.wikipedia.org/wiki/Foundation_models"><code class="docutils literal notranslate"><span class="pre">foundation</span></code></a> models. They are not typically useful until…</p>
<p><code class="docutils literal notranslate"><span class="pre">Pre-trained</span></code> language models have been trained via self-supervision on vast quantities of text. These are also called <a class="reference external" href="https://en.wikipedia.org/wiki/Foundation_models"><code class="docutils literal notranslate"><span class="pre">foundation</span></code></a> models. They are not typically useful until…</p>
</div>
<div class="tip admonition">
<p class="admonition-title"><strong>Generative</strong></p>
<p>Generative models are foundation models that have been further trained via supervised fine-tuning and reinforcement learning from human feedback (RLHF) to behave in a useful and safe manner, for example by responding to questions with answers like a chat assistant.</p>
<p><code class="docutils literal notranslate"><span class="pre">Generative</span></code> models are foundation models that have been further trained via supervised fine-tuning and reinforcement learning from human feedback (RLHF) to behave in a useful and safe manner, for example by responding to questions with answers like a chat assistant.</p>
</div>
<div class="sd-card sd-sphinx-override sd-mb-3 sd-shadow-sm docutils">
<div class="sd-card-body docutils">
2 changes: 1 addition & 1 deletion searchindex.js

Large diffs are not rendered by default.
