[Update (2020-08-06)] The model in the backend of the website has been upgraded to GPT-3 - the code and the description below refer to the GPT-2-generated content and are therefore out of date.
This is the project repo for www.infinite-infinite-jest.com
Infinite Infinite Jest is an AI-powered art project. The main page displays prose generated by a state-of-the-art machine learning model (Open AI's GPT-2) fine-tuned on Infinite Jest, a novel by David Foster Wallace.
Part of each sequence of several paragraphs generated by the model is fed back into it as the prompt for the next original chunk of machine-generated prose.
The process repeats. The result is Entertainment.
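In code, the loop looks roughly like this, assuming a GPT-2 checkpoint already fine-tuned with gpt-2-simple (the toolkit credited in the acknowledgements below); the run name, chunk lengths, and sampling settings are illustrative rather than the site's actual configuration:

```python
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name="infinite_jest")  # hypothetical name of the fine-tuned run

prompt = "..."  # the novel's opening line served as the very first prompt (not quoted here)

while True:
    # Generate the next chunk of prose, conditioned on the current prompt.
    chunk = gpt2.generate(sess,
                          run_name="infinite_jest",
                          prefix=prompt,
                          length=500,        # illustrative chunk length, in tokens
                          temperature=0.7,
                          return_as_list=True)[0]
    print(chunk)  # in the real project, this is where the text would be stored and served

    # Feed part of the freshly generated sequence back in as the next prompt.
    prompt = chunk[-1000:]  # illustrative: keep roughly the last few paragraphs
```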
I am fascinated by generative AI models, and a fan of David Foster Wallace's work.
In February 2019, a non-profit AI research company, Open AI, ignited a media storm with their unusual decision not to release their most recent language model. The model was said to generate text on any topic with previously unseen coherence. The decision was made amid fears that the model could be weaponised to flood the internet with convincing fake news and propaganda.
After reading the initial statement by Open AI and seeing the examples of text written by GPT-2, I couldn't shake the visceral sense that something profound had changed, or was about to. The change wouldn't be limited to how effective swarms of social media bots could become; it would extend to the wider question of whether creative writing is an inherently human endeavour.
I thought there was no better book than Infinite Jest to feed into this neural network of ominous fame. The idea of seemingly infinite machine-generated prose based on a book about losing battles to addiction and to weaponised modern entertainment seemed poignant and worth exploring.
After a couple of months, Open AI shared their plan for a staged release of GPT-2. The delay was meant to give society time to prepare for this new quality of text generation. In November 2019, Open AI published the largest version of the model, which is the one used to generate Infinite Infinite Jest.
I've read hundreds of pages of generated text samples while babysitting the model. Prose generated by GPT-2-Wall-Es can be stubbornly repetitious, contradictory, and barely coherent over longer stretches. But it sure has its moments of wit, and of true, airy Wallacean absurdity.
(tl;dr: No.) The generated text contains characters from the novel, as well as its major themes and locations, but other than the opening line used as the first prompt, I haven't found a single sentence taken directly from DFW. There shouldn’t be anything that would end up spoiling the experience of reading the actual book. That said, I now realise I’m not sure what an effective spoiler to Infinite Jest would look like.
The neural network model (GPT-2) is fine-tuned on Infinite Jest:
GPT-2 is an unsupervised artificial neural network, a generative language model developed by Open AI. The original model (available here) was trained on a dataset containing over 8 million documents (40GB of text) shared in Reddit submissions with at least 3 upvotes.
Various interactive examples (like Talk to Transformer) demonstrate the model's ability to generate coherent sequences of text based on a user-defined prompt. The apparent versatility of the model is unparalleled. Give it a news headline, and it will continue describing the event in a journalistic fashion; feed it a few verses of a poem, and it will try to carry it on. Ask it a question, and it will give you an answer: truthful or bogus, but usually coherent.
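As a rough illustration, prompting the generic model through gpt-2-simple (the toolkit credited in the acknowledgements below) might look like this; the checkpoint size, the prompt, and the sampling settings are purely illustrative:

```python
import gpt_2_simple as gpt2

gpt2.download_gpt2(model_name="124M")    # smallest public checkpoint, chosen only for illustration
sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, model_name="124M")  # load the generic, non-fine-tuned model

# Give it the beginning of a news item and let it continue in a journalistic register.
print(gpt2.generate(sess,
                    model_name="124M",
                    prefix="A tornado touched down outside Topeka on Tuesday, damaging",
                    length=150,
                    temperature=0.7,
                    return_as_list=True)[0])
```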
These abilities belong to the generic model shared by Open AI. What happens if we fine-tune it on a single book, orienting it towards a specific novel's style and content?
In the broader context of machine learning, “fine-tuning” a model means taking a model trained for one task and adapting it to another. The process involves freezing most of the network’s layers and retraining the remaining layers on a new dataset.
In our case, fine-tuning GPT-2 should allow the model to retain its knowledge of high-level linguistic patterns (learnt on the 40GB Reddit dataset) while paying particular attention to patterns specific to the text it is fine-tuned on (i.e. Infinite Jest). I was aiming to train the model so that it learns the themes and characters of the book and convincingly imitates the author's writing style, without overfitting to the text, i.e. without copying sentences from the book verbatim.
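A minimal sketch of what such fine-tuning looks like with gpt-2-simple; the checkpoint size, dataset file name, step count, and run name below are placeholders, and the project itself uses the largest (1558M) release, which likely needs a heavier setup (the acknowledgements below mention free Colab TPUs) than this example implies:

```python
import gpt_2_simple as gpt2

# Download a pretrained GPT-2 checkpoint. "355M" is a placeholder size that is
# much easier to fine-tune than the full 1558M model used for this project.
gpt2.download_gpt2(model_name="355M")

sess = gpt2.start_tf_sess()

# Retrain the pretrained weights on a plain-text dump of the novel.
# Dataset path, step count, and run name are placeholders.
gpt2.finetune(sess,
              dataset="infinite_jest.txt",
              model_name="355M",
              steps=2000,        # illustrative; too many steps risks memorising the text verbatim
              run_name="infinite_jest",
              sample_every=200,  # print samples during training to watch for overfitting
              save_every=500)

# Sample from the fine-tuned model to check the style it has picked up.
print(gpt2.generate(sess, run_name="infinite_jest", return_as_list=True)[0])
```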
It’s not, but so far it’s already significantly longer than the original book.
- For some reason, the model picked up on the monologues of Jim's (James Incandenza’s) dad. I found it funny that many sentences in some of the rambling monologues the model spews end with ", Jim.". It's been a while since I last read the book, but I don't think these parts comprised more than, say, 5% of it, so it's interesting that the model fixated on them so much. My pet theory is that the model sees the rambling, semi-coherent monologue it's producing, looks back at it, and thinks "Oh, I must be inside one of these dumb Jim's dad monologues".
- The text contains neologisms that I was sure were taken verbatim from the book, but that turned out to be original to the generated text.
- I found the phrase "to eat cheese" (frequently used in the book, meaning: to share something incriminating about somebody with the authorities) appearing numerous times in the text in a benign culinary context. Very cute.
- Word repetitions: sometimes the model fixates on a single word and uses it excessively within a single paragraph. Repetition is certainly a feature of David Foster Wallace’s style [citation needed], but GPT-2-Wall-Es sometimes takes it too far.
- I’ve seen long sequences of sentences beginning with “That (...)”, clearly inspired by similar segments of the source book (pages 200-205 in my copy) containing life lessons from the rehab. Due to the feedback loop (part of the last sequence is used as the prompt for the next one), once the model starts generating lists of sentences starting with “That”, it’s hard for it to escape.
- Feeding the model numbered lists as a prompt resulted in output resembling Infinite Jest’s endnotes (mostly bogus descriptions of pharmaceuticals, etc.).
Raf ([email protected]) - concept, machine learning, writing the accompanying article.
Mateusz Zaręba ([email protected]) - frontend engineering
To the best of my knowledge, none of the authors are affiliated and/or associated with Quebecois separatist movements.
Max Woolf (@minimaxir) - for creating gpt-2-simple, a toolkit for fine-tuning GPT-2
Google Cloud Platform - for making TPUs available for free via Google Colab
Open AI - for their hard work on advancing the field of artificial intelligence while setting precedents for the responsible release of transformative technologies.
Anna Warso, Piotr Migdał (@pmigdal) - for guidance, proofreading the human-generated text above.