YAZGPT - an experimental bibliographic chatbot

The purpose of this little project is to build a basic platform where I can experiment with the ability of an LLM to generate queries, issue searches against a bibliographic database, and interpret the results. The idea is to use a chatbot as a way to quickly explore and prototype prompts and models that might support different types of library applications like cataloging, collection analysis, resource sharing, etc.

I'm incredibly rusty as a developer and know no Python, so please don't judge.

Caveat

In its present form, this is not meant tobe a useful tool for any specific use case -- it is just intended as a fun way to explore whether generative API can have a role in bibliographic data flows. The present version is hardwired to search the Libray of Congress.

Because full MARC records take up a lot of room in the context buffer, the LLM has the ability to retrieve brief (summary) or detailed (full MARC) records.

Prerequisites

YAZGPT.py needs to zee "zoomsh" in its path. Zoomsh is part of the YAZ toolkit. You can easily find it for almost any platform. I run this on a Mac.

You need an OpenAI API key. The code looks for it in the OPENAI_API_KEY environment variable.

Synopsis

python3 YAZGPT.py --help     # for an overview of command-line options and parameters

Example sessions

Note, the AI has been told that bibliographic search targets prefer normalized names, so it swaps surname/name when needed.

Welcome to the YAZGPT AI-powered Z39.50 client

YAZGPT>> find publications by Robert Pirsig, combine the results by title and list
          editions under each title
                                                               
Function Call search {
  "author": "Pirsig, Robert"
}
Hitcount:  11
Here are the publications by Robert Pirsig, grouped by title and listed with their editions:

1. Title: John Dewey, Robert Pirsig, and the art of living
   - Edition 1: New York: Palgrave Macmillan, 2006

2. Title: Lila
   - Edition 1: New York: Bantam Books, 1991
   - Edition 2: New York, N.Y.: Bantam Audio, p1991 (Contributor: Will Patton)

3. Title: On quality
   - Edition 1: Boston: Mariner Books, [2022] (Editor: Wendy K. Pirsig)

4. Title: Zen and the art of motorcycle maintenance
   - Edition 1: New York: W. Morrow, 1999
   - Edition 2: New York: Morrow, 1974
   - Edition 3: New York: Quill, 1999
   - Edition 4: New York: Morrow, 1984, c1974
   - Edition 5: Los Angeles, CA: Audio Renaissance ; p1999 (Contributor: Lawrence Pressman)

5. Title: Zen and now
   - Edition 1: New York: Alfred A. Knopf, 2008

Please note that "Lila" and "Zen and the art of motorcycle maintenance" have multiple editions.

YAZGPT>>

Here is a fun recursive example -- note, lots of tuning and tweaking needed to make this operational, but a really interesting example of the AI using the output of one search to launch another.

YAZGPT>> find the book with isbn 0688002307, then use a title/author search to locate
            potential matching editions. Make a list of only those results where the
            Author and Title are similar to the original book
     
Function Call search {
  "isbn": "0688002307"
}
Hitcount:  1
Function Call search {
  "title": "Zen and the art of motorcycle maintenance:",
  "author": "Pirsig, Robert M."
}
Hitcount:  7
Here are the potential matching editions of the book "Zen and the art of motorcycle maintenance" by
   Robert M. Pirsig that have similar titles and authors:

1. Title: Zen and the art of motorcycle maintenance:
   - Author: Robert M. Pirsig
   - Publication: New York, Morrow, 1974.
   - ISBN: 0688002307

2. Title: Zen and the art of motorcycle maintenance :
   - Author: Robert M. Pirsig
   - Publication: New York, W. Morrow, 1999.
   - ISBN: 0688171664 (pbk.)

3. Title: Zen and the art of motorcycle maintenance :
   - Author: Robert M. Pirsig
   - Publication: New York, Morrow, 1984, c1974.
   - ISBN: 0688032079

Please note that these are the editions that have similar titles and authors to the original
   book. If you need more information about any of these editions, please let me know.

YAZGPT>>

TODO

Explore use cases and sumarize results: General research, copy cataloging, resource sharing

Would it make sense to predefine multiple search targets and let the LLM choose the modst appropriate one? Create etc/targets and a way to specify a list of available targets for a session.

List remaining context buffer space; add some kind of cleanout/compression of older messages. Right now, everything is kept which is likely to lead to crummy results is a session goes on too long anyway.

Make the temperature a command line setting to allow experimentation. Possibly, t=0 would be preferable for these kinds of explorations but right now the "creativity" of temp=1 may be good for discovery/ideation.

Name		Name	Last commit message	Last commit date
Latest commit History 52 Commits
.idea		.idea
etc/prompts		etc/prompts
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
YAZGPT.py		YAZGPT.py
tmp		tmp
zlog.py		zlog.py
zoomsh_wrapper.py		zoomsh_wrapper.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

YAZGPT - an experimental bibliographic chatbot

About

Releases

Packages

Languages

License

sebhammer/YAZGPT

Folders and files

Latest commit

History

Repository files navigation

YAZGPT - an experimental bibliographic chatbot

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages