Libretexts: get real cover page #68

benoit74 · 2024-11-13T14:20:31Z

Currently, to build the index (#55), soon the detailed licensing (#54), and probably the table of contents (#56), we need the URL and ID of the "cover page" of the current book.

In #55, we built this logic by walking up the tree and searching for a page with the "article": "topic-category" property (see the get_page_parent_book_id logic).
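
For illustration, here is a minimal sketch of that kind of upward walk. It is not the actual get_page_parent_book_id implementation: the page shape (`id`, `parentId`, `article`), the sample data, and the function name `findParentBookId` are all assumptions, and the sketch assumes the walk starts at the page itself.

```javascript
// Hypothetical page records keyed by id; the field names are assumptions.
const pages = new Map([
  ["1", { id: "1", parentId: null, article: "topic-category" }],
  ["2", { id: "2", parentId: "1", article: "topic-guide" }],
  ["3", { id: "3", parentId: "2", article: "topic" }],
]);

// Walk from a page towards the root and return the id of the first page
// (the page itself included) whose "article" property is "topic-category".
function findParentBookId(pages, pageId) {
  const seen = new Set(); // guards against cycles in malformed data
  let current = pages.get(pageId);
  while (current && !seen.has(current.id)) {
    if (current.article === "topic-category") return current.id;
    seen.add(current.id);
    current = current.parentId === null ? undefined : pages.get(current.parentId);
  }
  return null; // no book found above this page
}

console.log(findParentBookId(pages, "3")); // prints "1"
```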

The real logic used online on libretexts.org is based on tags, in the getCoverpage function of https://cdn.libretexts.net/github/LibreTextsMain/Miscellaneous/reuse.js:

    /**
     * Locates the parent page that is the coverpage, if it exists
     * @param url - page to look up the coverpage for
     * @returns {Promise<string>} - path to the coverpage
     */
    async function getCoverpage(url = window.location.href) {
        if (typeof getCoverpage.coverpage === 'undefined') {
            const urlArray = url.replace("?action=edit", "").split("/");
            for (let i = urlArray.length; i > 3; i--) {
                let path = urlArray.slice(3, i).join("/");
                if (!path)
                    break;
                let response = await LibreTexts.authenticatedFetch(path, 'tags?dream.out.format=json');
                let tags = await response.json();
                if (tags.tag) {
                    if (tags.tag.length) {
                        tags = tags.tag.map((tag) => tag["@value"]);
                    }
                    else {
                        tags = tags.tag["@value"];
                    }
                    if (tags.includes("coverpage:yes") || tags.includes("coverpage:toc") || tags.includes("coverpage:nocommons")) {
                        getCoverpage.coverpage = path;
                        break;
                    }
                }
            }
        }
        return getCoverpage.coverpage;
    }

As one can see, this code walks up the tree of pages and looks for the first page tagged coverpage:yes, coverpage:toc, or coverpage:nocommons.
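
The tag check itself could be isolated as below. This is a sketch only, and the helper names `extractTagValues` and `isCoverpage` are hypothetical. It handles both shapes of the tags payload seen in getCoverpage (an array of tag objects, or a single object). Note one deliberate difference: it matches tag values exactly, whereas the original falls back to a substring match on a string when the page has a single tag.

```javascript
// Tags that mark a page as a cover page, as listed in getCoverpage.
const COVERPAGE_TAGS = ["coverpage:yes", "coverpage:toc", "coverpage:nocommons"];

// The tags payload holds either an array of tag objects or a single object
// under "tag"; return a plain array of tag values in both cases.
function extractTagValues(payload) {
  if (!payload || !payload.tag) return [];
  const tags = Array.isArray(payload.tag) ? payload.tag : [payload.tag];
  return tags.map((tag) => tag["@value"]);
}

// True when at least one of the page's tags is a cover page marker.
function isCoverpage(payload) {
  const values = extractTagValues(payload);
  return COVERPAGE_TAGS.some((tag) => values.includes(tag));
}

console.log(isCoverpage({ tag: { "@value": "coverpage:toc" } })); // prints true
console.log(isCoverpage({ tag: [{ "@value": "article:topic" }] })); // prints false
```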

Obviously, this is something we might have to consider at some point (probably easier to compute on the tree of pages at crawl time).

This should at least be analyzed, if not integrated into the crawler: finding all page tags means querying for them, we do not use them at the moment, and this will have an impact on crawl time.


benoit74 · 2024-11-13T15:21:04Z

In fact, we need to retrieve tags only for the backmatter pages and their parents. Quite a limited impact, so let's do it now.
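
To show why the impact stays limited, here is a rough sketch of how the set of pages needing a tags query could be computed. The `parentOf` map, the page ids, and the function name `pagesNeedingTags` are assumptions for illustration, not the crawler's actual data model.

```javascript
// Return the ids of the given pages plus all their ancestors, i.e. the only
// pages whose tags need to be fetched. `parentOf` maps a page id to its
// parent id (hypothetical structure; the root maps to null).
function pagesNeedingTags(parentOf, startIds) {
  const needed = new Set();
  for (const startId of startIds) {
    let id = startId;
    // Stop at the root, or once we reach a page already collected
    // (its ancestors were collected along with it).
    while (id != null && !needed.has(id)) {
      needed.add(id);
      id = parentOf.get(id);
    }
  }
  return needed;
}

// Hypothetical tree: one book with a chapter and two backmatter pages.
const parentOf = new Map([
  ["book", null],
  ["chapter", "book"],
  ["backmatter", "book"],
  ["index", "backmatter"],
  ["glossary", "backmatter"],
]);

console.log([...pagesNeedingTags(parentOf, ["index", "glossary"])]);
```

In this sample tree the "chapter" page is never queried, since it is neither a backmatter page nor an ancestor of one.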

benoit74 added the enhancement and question labels Nov 13, 2024

benoit74 added this to the 0.1 milestone Nov 13, 2024

benoit74 self-assigned this Nov 13, 2024

benoit74 removed the question label Nov 13, 2024

benoit74 mentioned this issue Nov 13, 2024

Get real cover page of books #70

Merged

benoit74 closed this as completed in #70 Nov 19, 2024
