# Large Language Models: The Digital Grimoires of the 21st Century
Ing. Flavio Cordari

--

## Agenda
- Introduction
- Deep Learning
- Natural Language Processing
- Large Language Models
- LLMs Horizons

--

## Why Large Language Models?

--

LLMs represent a significant leap in artificial intelligence and natural language processing capabilities. Their ability to understand, generate, and interact using human-like language has opened up new possibilities in AI, from creating more intuitive user interfaces to generating content and even coding.

--

## Why Grimoires?

--

![[grimoire.webp]]

Notes:
The analogy here is that just as grimoires were the repositories of arcane knowledge and power in their time, LLMs are the contemporary digital equivalents, holding vast amounts of human knowledge. However, instead of spells and magical rites, LLMs contain the collective textual data of humanity, capable of generating insights, answers, and even creating new content based on this data.

--

## The "Imitation Game" | ||
|
||
-- | ||
|
||
The Turing Test was designed to assess a machine’s ability to exhibit intelligent verbal behavior comparable to that of a human. Turing proposed that a human evaluator would engage in natural language conversations with both a human and a machine, and if the evaluator could not distinguish between them, the machine would demonstrate its capacity for faithfully imitating human verbal behavior. | ||
|
||
-- | ||
|
||
> ChatGPT-4 exhibits behavioral and personality traits that are statistically indistinguishable from a random human from tens of thousands of human subjects from more than 50 countries. | ||
[A Turing test of whether AI chatbots are behaviorally similar to humans](https://www.pnas.org/doi/10.1073/pnas.2313925121) | ||
|
||
-- | ||
## Characteristica Universalis and Calculus Ratiocinator

This concept envisioned a universal language or symbolism that could represent all human knowledge in a formal, logical system. Leibniz imagined this as a means to encode ideas, arguments, and principles in a way that they could be analyzed and manipulated logically. The ultimate goal was to reduce reasoning to a form of computation, where arguments could be settled with the same certainty as mathematical equations.

Notes:
Gottfried Wilhelm Leibniz (1646–1716) was a German polymath and philosopher who made significant contributions across a wide range of academic fields, including mathematics, logic, philosophy, ethics, theology, law, and history. He is perhaps best known for his development of calculus independently of Sir Isaac Newton, which led to a notorious dispute over priority. Beyond his advancements in mathematics, Leibniz's work in philosophy is also highly regarded, particularly his ideas regarding metaphysics, the problem of evil, and his optimistic belief that we live in the best of all possible worlds.

--

An LLM can be seen as a realization of Leibniz's vision in several ways. It processes natural language (a form of universal language) to understand, generate, and manipulate information. Though not precisely the purely symbolic system Leibniz envisioned, natural language processing (NLP) technologies achieve a similar end: encoding and reasoning about human knowledge.

--

## Compression is Comprehension

--

In information theory, compression is about representing information in a way that reduces redundancy without losing the essence of the original data. This is done through various algorithms that identify patterns and represent them more efficiently.
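
A quick way to see redundancy removal in practice is to compare how well repetitive and random data compress. A minimal sketch using Python's standard-library zlib (the toy strings are made up for illustration):

```python
import os
import zlib

# Highly repetitive text compresses well...
redundant = b"the cat sat on the mat. " * 100
# ...while random bytes have no patterns to exploit.
noise = os.urandom(len(redundant))

for label, data in [("redundant", redundant), ("random", noise)]:
    packed = zlib.compress(data)
    ratio = len(packed) / len(data)
    print(f"{label}: {len(data)} -> {len(packed)} bytes (ratio {ratio:.2f})")
```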

--

In the context of cognitive science, our brains understand and learn about the world by compressing sensory inputs and experiences into models, schemas, or concepts that are simpler than the sum total of possible data. This process allows us to make sense of complex environments and predict future events based on past experiences.

--

As early as 1969, neuroscientist Horace Barlow wrote that the operations involved in the compression of information:

> “… have a rather fascinating similarity to the task of answering an intelligence test, finding an appropriate scientific concept, or other exercises in the use of inductive reasoning. Thus, compression of information may lead one towards understanding something about the organization of memory and intelligence, as well as pattern recognition and discrimination.”

--

Training an LLM amounts to a lossy compression of its textual dataset; despite this loss, the resulting model can still generate coherent text.

--

## What about consciousness?

--

There are reported examples of individuals who believe that ChatGPT is conscious. As reported by The New York Times on 23 July 2022, Google fired engineer Blake Lemoine for claiming that Google’s Language Model for Dialogue Applications (LaMDA) was sentient, i.e., experiencing sensations, perceptions, and other subjective experiences.

--

## Consciousness vs Intelligence

--

According to Daniel Kahneman, humans possess two complementary cognitive systems: “System 1”, which involves rapid, intuitive, automatic, and non-conscious information processing; and “System 2”, which encompasses slower, reflective, conscious reasoning and decision-making.

--

The fast neural network computation performed by LLMs, resulting in convincing dialogues, aligns with the fast thinking associated with “System 1”. In Kahneman’s terms, operating at the “System 1” level means that LLMs lack consciousness, which, in this context, is characteristic of “System 2”.
# DL - Deep Learning

--
# NLP - Natural Language Processing

--
# LLMs - Large Language Models

--

## What is a Language Model?

--

A language model is a statistical and computational tool that enables a computer to understand, interpret, and generate human language based on the likelihood of occurrence of words and sequences of words.
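
Formally, "the likelihood of sequences of words" can be written as a probability that factors into a chain of next-word predictions, which is exactly what a language model estimates:

$$P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})$$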

--

**Statistical Language Models:** These earlier models rely on the statistical properties of language, using the probabilities of sequences of words (n-grams) to predict the likelihood of the next word in a sequence.

[Bigrams Example](https://colab.research.google.com/drive/1ikJuNYOOliuy8tTl9csKuWDlVdHJhVQg?usp=sharing)
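
The linked notebook is not reproduced here; as a minimal sketch of the bigram idea, next-word probabilities can be estimated from plain counts (the toy corpus below is made up for illustration):

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_word_probs(word):
    """P(next word | word), estimated from bigram counts."""
    counts = following[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probs("the"))  # {'cat': 0.67, 'mat': 0.33}, roughly
```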

--

**Neural Language Models:** These models use **neural networks** to predict the likelihood of a sequence of words, learning and representing language in high-dimensional spaces.

[Simplified NLM Example](https://colab.research.google.com/drive/1ON9CO6LUtX1mbDmYIq3Pt5mSqoxzGxPr?usp=sharing)
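
Again as a sketch rather than the notebook's actual code: the smallest neural language model embeds the previous token and projects that embedding to logits over the vocabulary (assuming PyTorch is available; sizes are illustrative):

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 100, 16

# A neural "bigram" model: embed the previous token,
# then project to logits over the next token.
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),
)

prev_tokens = torch.tensor([3, 17, 42])  # a batch of token ids
logits = model(prev_tokens)              # shape: (3, vocab_size)
probs = torch.softmax(logits, dim=-1)    # next-token distributions
```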

--

## What is a *Large* Language Model?

--

A Large Language Model is a Neural Language Model
- that is trained on very large datasets
- whose underlying neural network uses billions of parameters (see the sketch below)
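
To make "billions of parameters" concrete, a back-of-the-envelope estimate of the memory needed just to hold the weights (model sizes here are illustrative):

```python
BYTES_PER_PARAM = 2  # 16-bit floats (fp16/bf16), common for inference

for name, params in [("7B model", 7e9), ("70B model", 70e9)]:
    gigabytes = params * BYTES_PER_PARAM / 1e9
    print(f"{name}: ~{gigabytes:.0f} GB of weights")
# 7B model: ~14 GB of weights; 70B model: ~140 GB of weights
```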

Notes:
A large language model is a type of artificial intelligence algorithm designed to understand, generate, and work with human language in a way that mimics human-like understanding and production. These models are "large" both in terms of the size of the neural network architecture they are based on and the amount of data they are trained on.

--

## Modern Large Language Model Architectures

--

## Transformer-Based Models

- **BERT (Bidirectional Encoder Representations from Transformers)**
- **GPT (Generative Pre-trained Transformer) Series**
- **T5 (Text-to-Text Transfer Transformer)**

--

## Attention Is All You Need

![[1706.03762.pdf]]

Notes:
Developed by Google, BERT was one of the first transformer-based models to use bidirectional training to understand the context of words in a sentence. It significantly improved the performance of NLP tasks such as question answering and language inference.

OpenAI's GPT series, including GPT-3 and its successors, are known for their generative capabilities, enabling them to produce human-like text. These models are pre-trained on diverse internet text and fine-tuned for specific tasks, showcasing remarkable language understanding and creativity.

Developed by Google, T5 approaches NLP tasks by converting all text-based language problems into a unified text-to-text format, allowing it to perform a wide range of tasks from translation to summarization with the same model architecture.

--

### Sparse Models

- **Mixture of Experts (MoE)**

### Hybrid Models

- **ERNIE (Enhanced Representation through kNowledge Integration)**

Notes:
The MoE architecture involves a set of expert models (typically, neural networks) where each expert is trained on a subset of the data. A gating mechanism decides which expert to use for a given input. This approach allows for more scalable and efficient training on large datasets.
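
As a toy illustration of the gating idea (not any particular production MoE; all names and sizes below are invented), top-1 routing can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, dim = 2, 4

# Toy "experts": plain linear maps with different weights.
experts = [rng.normal(size=(dim, dim)) for _ in range(n_experts)]
# The gate scores each expert for a given input.
gate = rng.normal(size=(dim, n_experts))

def moe_forward(x):
    scores = x @ gate                # one score per expert
    chosen = int(np.argmax(scores))  # top-1 routing: run only one expert
    return experts[chosen] @ x, chosen

y, which = moe_forward(rng.normal(size=dim))
print(f"input routed to expert {which}")
```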

Developed by Baidu, ERNIE is designed to better understand the syntax and semantic information in a language by integrating knowledge graphs with text, leading to improved performance on NLP tasks that require world knowledge and reasoning.

--

## Real World Examples

--

Large language models (LLMs) can also be categorized based on their availability as either open source, where the model architecture and weights are publicly accessible, or closed source, where the model details are proprietary and access is restricted.

--

- Closed source
  - OpenAI's GPT-3 / GPT-4
  - ...

--

- Open source
  - [OpenAI's GPT-2](https://github.com/openai/gpt-2)
  - Google's BERT models
  - [Hugging Face’s Transformers](https://huggingface.co/) (repository of open source models)
  - ...

--

- Mixed open/closed source
  - [Meta's LLaMA](https://github.com/Meta-Llama/llama)
    - the company has provided some level of access to the research community but still maintains control over the distribution and usage of the model

--

- [Navigating the World of Large Language Models](https://www.bentoml.com/blog/navigating-the-world-of-large-language-models)

--

## LLaMA 2

![[10000000_662098952474184_2584067087619170692_n.pdf]]

--
# LLMs Horizons

--

## Tools Use

--

## LLMs OS

--

## LLMs Security
title: "Hello World!" | ||
theme: "cloudogu" | ||
slides: 'C:\Users\bitwise\projects\presentations\slides\0000-00-00\slides.html' | ||
width: 1920 | ||
height: 1080 | ||
show_notes_for_printing: false |
<section data-markdown="./Chapter 0 - Introduction.md" data-separator="^(\r)?\n---(\r)?\n$" data-separator-vertical="^(\r)?\n--(\r)?\n$"></section>
<section data-markdown="./Chapter 1 - Deep Learning.md" data-separator="^(\r)?\n---(\r)?\n$" data-separator-vertical="^(\r)?\n--(\r)?\n$"></section>
<section data-markdown="./Chapter 2 - Natural Language Processing.md" data-separator="^(\r)?\n---(\r)?\n$" data-separator-vertical="^(\r)?\n--(\r)?\n$"></section>
<section data-markdown="./Chapter 3 - Large Language Models.md" data-separator="^(\r)?\n---(\r)?\n$" data-separator-vertical="^(\r)?\n--(\r)?\n$"></section>