Search functionality #18

Open
Beatrice-nava opened this issue May 17, 2023 · 11 comments

@Beatrice-nava
Collaborator

1) Full text search (currently in the collection of letters, but we will add the writings)

  • Transcription (show the paragraph level <p>, <opener> or <closer>)
    <div type="original" xml:lang="nl">

  • Translation (show the paragraph level <p>, <opener> or <closer>)
    <div type="translation" xml:lang="en">

  • Postal data (show the div)
    <div> @type="postalData"

  • Annotations (show the note level <note>)
    <note>
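
A minimal sketch of how the four full-text layers above could be collected into searchable units. The sample fragment, the layer labels, and the function names are assumptions made for illustration; the fragment is namespace-free, whereas real TEI files usually declare the TEI namespace, so the tag comparisons would need adjusting. Note that in this sketch the text of a <note> inside a <p> is also part of that paragraph's text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free letter fragment (not taken from the corpus).
SAMPLE = """
<text>
  <div type="original" xml:lang="nl">
    <opener>Beste vriend,</opener>
    <p>Ik heb het schilderij af. <note>Zie brief 12.</note></p>
  </div>
  <div type="translation" xml:lang="en">
    <opener>Dear friend,</opener>
    <p>I have finished the painting.</p>
  </div>
  <div type="postalData">Amsterdam 16.2.09</div>
</text>
"""

UNIT_TAGS = {"p", "opener", "closer"}
LAYERS = {"original": "transcription", "translation": "translation"}


def text_of(elem):
    """Return the element's full text with whitespace normalised."""
    return " ".join("".join(elem.itertext()).split())


def search_units(root):
    """Build (layer, tag, text) tuples at the display level named in the issue."""
    units = []
    for div in root.iter("div"):
        kind = div.get("type")
        if kind in LAYERS:
            # Transcription and translation: one unit per <p>, <opener>, <closer>.
            for elem in div.iter():
                if elem.tag in UNIT_TAGS:
                    units.append((LAYERS[kind], elem.tag, text_of(elem)))
        elif kind == "postalData":
            # Postal data: the whole div is one unit.
            units.append(("postal data", "div", text_of(div)))
    # Annotations: one unit per <note>.
    for note in root.iter("note"):
        units.append(("annotation", "note", text_of(note)))
    return units


def search(units, query):
    """Naive case-insensitive substring search over the units."""
    q = query.lower()
    return [u for u in units if q in u[2].lower()]


units = search_units(ET.fromstring(SAMPLE))
hits = search(units, "painting")
```

A real implementation would use a proper search index rather than substring matching; the point here is only the mapping from TEI elements to result units.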

2) Facet search

  • Manuscript identifier
    <idno> child of <msIdentifier>

  • Country of preservation
    <country> child of <msIdentifier>

  • Place of preservation
    <institution> child of <msIdentifier>

  • Number of the letter
    <idno> @type="letterId" child of <altIdentifier>

  • Correspondent
    <correspAction> @type="received"
    <name>
    (use @key to find the name in bio.xml eventually)

  • Mentioned person
    <rs> @type="person" (use @key to find the name in bio.xml eventually)

  • Mondrian’s artwork
    <rs> @type="artwork-m" (use @key to find data in RKD db eventually)

  • Other’s artwork
    <rs> @type="artwork" (ignore for the moment)

  • Period
    <date> @when child of <correspAction> @type="sent"
    (show the value of @when)

  • Place
    <placeName> child of <correspAction> @type="sent"

  • Exhibition
    <rs> @type="exhibition" (use @key to find data in expo.xml eventually)

  • Organizations (Museum, associations etc.) to be added later.
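
As an illustration, the correspondence and <rs> facets listed above could be read from a letter roughly as follows. This is a sketch under assumptions: the sample letter, its placeholder names and key values, and the function names are all invented here, and the sample omits the TEI namespace that real files normally declare.

```python
import xml.etree.ElementTree as ET

# Hypothetical letter with placeholder values (not taken from the corpus).
SAMPLE = """
<TEI>
  <teiHeader>
    <correspDesc>
      <correspAction type="sent">
        <name>Sender (placeholder)</name>
        <placeName>Amsterdam</placeName>
        <date when="1909-02-16">16 February 1909</date>
      </correspAction>
      <correspAction type="received">
        <name key="person-001">Correspondent (placeholder)</name>
      </correspAction>
    </correspDesc>
  </teiHeader>
  <text>
    <p>About <rs type="person" key="person-002">a friend</rs> and
       <rs type="artwork-m" key="rkd-0001">a painting</rs> shown at
       <rs type="exhibition" key="expo-001">an exhibition</rs>.</p>
  </text>
</TEI>
"""


def rs_targets(root, rs_type):
    """Collect the @key (or, failing that, @ref) of every <rs> of one type."""
    targets = []
    for rs in root.iterfind(f".//rs[@type='{rs_type}']"):
        target = rs.get("key") or rs.get("ref")
        if target:
            targets.append(target)
    return targets


def letter_facets(root):
    """Map the facet names from the issue to values found in one letter."""
    sent = ".//correspAction[@type='sent']"
    received = ".//correspAction[@type='received']"
    date = root.find(f"{sent}/date")
    return {
        "correspondent": root.findtext(f"{received}/name"),
        "period": date.get("when") if date is not None else None,
        "place": root.findtext(f"{sent}/placeName"),
        "mentioned_person": rs_targets(root, "person"),
        "artwork_m": rs_targets(root, "artwork-m"),
        "exhibition": rs_targets(root, "exhibition"),
    }


facets = letter_facets(ET.fromstring(SAMPLE))
```

The <rs> facets hold only the raw reference values; turning them into display names requires a lookup in files such as bio.xml or expo.xml.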

Search results should be sortable by

  • Date (asc/desc)
  • Correspondent
  • Place
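
A small sketch of the three sort orders, assuming each search result is a dict carrying the facet values as strings (the record layout and the `sort_results` helper are hypothetical). Because @when values are ISO-style strings, plain string comparison puts them in chronological order, including partial dates such as a year and month only.

```python
def sort_results(results, by="date", descending=False):
    """Sort result dicts by one facet; results lacking that facet go last."""
    present = [r for r in results if r.get(by)]
    missing = [r for r in results if not r.get(by)]
    present.sort(key=lambda r: r[by].casefold(), reverse=descending)
    return present + missing


# Hypothetical results with placeholder values.
results = [
    {"id": "letter-b", "date": "1914-07", "correspondent": "B", "place": "Paris"},
    {"id": "letter-a", "date": "1909-02-16", "correspondent": "C", "place": "Amsterdam"},
    {"id": "letter-c", "date": None, "correspondent": "A", "place": None},
]

by_date_asc = [r["id"] for r in sort_results(results, by="date")]
by_date_desc = [r["id"] for r in sort_results(results, by="date", descending=True)]
by_correspondent = [r["id"] for r in sort_results(results, by="correspondent")]
```

Keeping undated results at the end in both directions is a design choice of this sketch, not something the issue specifies.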
Beatrice-nava converted this from a draft issue May 17, 2023
Beatrice-nava changed the title from "Search functions" to "Search functionality" May 17, 2023
@pboot
Collaborator

pboot commented May 19, 2023

Something to take into account from the start is perhaps that the search facility will also be used for documents other than letters: the writings of course but also the introductions, biography, etc. Most of the above facets will not be applicable for these documents, but there should be a sort of super-facet: which type of document are we searching in or for.
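
One way to picture such a super-facet, as a sketch only: every indexed record carries a document type, and the letter-specific facets are counted only within the selected type. The record layout, the `doc_type` field name, and the type labels are assumptions for illustration.

```python
from collections import Counter


def facet_counts(records, facet, doc_type=None):
    """Count the values of one facet, optionally restricted to one document type."""
    selected = [r for r in records if doc_type is None or r["doc_type"] == doc_type]
    return Counter(r[facet] for r in selected if r.get(facet))


# Hypothetical index records; non-letters simply lack the letter facets.
records = [
    {"doc_type": "letter", "place": "Amsterdam"},
    {"doc_type": "letter", "place": "Paris"},
    {"doc_type": "letter", "place": "Paris"},
    {"doc_type": "writing"},
    {"doc_type": "introduction"},
]

types = facet_counts(records, "doc_type")
letter_places = facet_counts(records, "place", doc_type="letter")
writing_places = facet_counts(records, "place", doc_type="writing")
```

An empty count for a facet, as for `writing_places` here, could be the signal to hide that facet in the interface for the chosen document type.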

@dirkroorda
Member

Let's talk a little about this, e.g. the manuscript identifier.

In one of the letters we see:

<sourceDesc>
    <msDesc>
        <msIdentifier>
            <country>Nederland</country>
            <settlement>Otterlo</settlement>
            <institution>Kröller Müller Museum</institution>
            <idno>KM 123.397</idno>
            <altIdentifier><idno type="letterId">19090216y_IONG_1303</idno></altIdentifier>
            <altIdentifier><idno type="def"/></altIdentifier>
        </msIdentifier>
        <physDesc>
            <objectDesc form="correspondentiekaart"/>
            <decoDesc>
                <decoNote/>
            </decoDesc>
        </physDesc>
    </msDesc>
</sourceDesc>

Does that mean that every result in the divs in this letter should show up in the facets:

  • manuscript identifier under the value KM 123.397
  • country under the value Nederland
  • letter number under the value 19090216y_IONG_1303

And do we ignore the objectDesc?
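
For concreteness, a sketch of reading those facet values from the <msIdentifier> shown above. The function name and the facet keys are hypothetical, and the snippet is parsed without a namespace, as quoted here; <physDesc> is left out because it is not read.

```python
import xml.etree.ElementTree as ET

# The <msIdentifier> part of the <sourceDesc> quoted in this comment.
SOURCE_DESC = """
<sourceDesc>
    <msDesc>
        <msIdentifier>
            <country>Nederland</country>
            <settlement>Otterlo</settlement>
            <institution>Kröller Müller Museum</institution>
            <idno>KM 123.397</idno>
            <altIdentifier><idno type="letterId">19090216y_IONG_1303</idno></altIdentifier>
            <altIdentifier><idno type="def"/></altIdentifier>
        </msIdentifier>
    </msDesc>
</sourceDesc>
"""


def manuscript_facets(source_desc):
    """Read the manuscript-level facets from one <sourceDesc>."""
    ms = source_desc.find(".//msIdentifier")
    if ms is None:
        return {}
    return {
        # "idno" matches only the direct child, not the ones in <altIdentifier>.
        "manuscript_identifier": ms.findtext("idno"),
        "country": ms.findtext("country"),
        "place_of_preservation": ms.findtext("institution"),
        "letter_number": ms.findtext("altIdentifier/idno[@type='letterId']"),
    }


ms_facets = manuscript_facets(ET.fromstring(SOURCE_DESC))
```

Every search result located in this letter would then inherit these values as its facet values.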

@pboot
Collaborator

pboot commented May 22, 2023

Yes and yes.
(1) If they don't appear in the facets, the user can't use the facets to make further selections.
(2) To Beatrice and me this didn't seem urgent.
But some of this may be fine-tuned on the basis of feedback from Wietse and Leo once they see what it looks like.

@dirkroorda
Member

@pboot Shall we also pick up the sender of the letter? I assume that there are also letters in the corpus that have been sent to Mondriaan?

And even if that is not the case, if somebody later combines this dataset with other letters sent to Mondriaan, then it is nice to have the sender metadata in place.
Anyway, the info is there, and it is easy to pick it up.

@dirkroorda
Member

dirkroorda commented May 25, 2023

I can also pick up the date in short form (from the when attribute) and in long form (from the element content). I have already done this.
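
A sketch of the two forms, using a made-up <date> element; the `date_forms` helper is hypothetical and is not the Text-Fabric code referred to here.

```python
import xml.etree.ElementTree as ET


def date_forms(date_elem):
    """Return (short, long): the @when value and the normalised element content."""
    short = date_elem.get("when")
    long = " ".join("".join(date_elem.itertext()).split())
    return short, long or None


# Hypothetical example element.
date = ET.fromstring('<date when="1909-02-16">16 februari 1909</date>')
short_form, long_form = date_forms(date)
```

The short form suits sorting and the Period facet, while the long form keeps the date as it is written in the edition.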

@dirkroorda
Member

By the way, the <rs> elements point to their targets through either the key or the ref attribute. For persons I see ref, for artworks I see key.
@Beatrice-nava Is there an intentional difference?

@Beatrice-nava
Collaborator Author

Yes, because at some point we decided to use both attributes within the <rs> element:

  • the @key attribute to point to external objects (RKD artwork).
  • the @ref attribute to point to an xml:id in our database.

But things will probably change, as Peter mentioned in our last meeting, as we are discussing the possibility of using the RKD database to fill our xml files. This implies that we will create an artwork.xml similar to bio.xml, etc., and will probably refer only to our database, using @ref. We will keep you updated on this!
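
The current convention could be captured in a small helper like the one below. It is a sketch: the function name, the "internal"/"external" labels, and the assumption that @ref values may be written as `#id` or `file.xml#id` are not confirmed anywhere in this thread.

```python
import xml.etree.ElementTree as ET


def rs_reference(rs):
    """Classify an <rs>: @ref points to an xml:id in our data, @key to an external object."""
    ref = rs.get("ref")
    if ref:
        # Assumption: keep only the part after an optional "#".
        return ("internal", ref.rsplit("#", 1)[-1])
    key = rs.get("key")
    if key:
        return ("external", key)
    return (None, None)


# Hypothetical elements with placeholder identifiers.
person = ET.fromstring('<rs type="person" ref="bio.xml#person-001">a friend</rs>')
artwork = ET.fromstring('<rs type="artwork-m" key="rkd-0001">a painting</rs>')
person_ref = rs_reference(person)
artwork_ref = rs_reference(artwork)
```

If the project later moves to @ref only, the "external" branch simply stops being used.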

dirkroorda added a commit that referenced this issue May 25, 2023
@pboot
Collaborator

pboot commented May 25, 2023

> Shall we also pick up the sender of the letter?

Yes, that'll be useful too.

@dirkroorda
Member

All the features mentioned under 2) are in Text-Fabric now, and put through to the WATM annotations.
However, as far as the rs elements are concerned, the feature values only contain the contents of the ref or key attributes.

Later we will use that to follow those references and pull data out of related files, such as artwork.xml, bio.xml, biblio.xml.
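
Following such a reference might look like the sketch below. The structure of bio.xml is not shown in this thread, so the <listPerson>/<person>/<persName> layout, the identifiers, and the helper names are assumptions; only the idea of looking up a stored ref or key value by xml:id comes from the discussion.

```python
import xml.etree.ElementTree as ET

# Attribute name under which ElementTree exposes xml:id.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical stand-in for bio.xml.
BIO = """
<listPerson>
  <person xml:id="person-001"><persName>Placeholder Person One</persName></person>
  <person xml:id="person-002"><persName>Placeholder Person Two</persName></person>
</listPerson>
"""


def build_index(root):
    """Map every xml:id in a reference file to its element."""
    return {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}


def resolve_person(index, target):
    """Turn a stored ref/key value into a display name, or None if unknown."""
    entry = index.get(target.rsplit("#", 1)[-1])
    if entry is None:
        return None
    return entry.findtext("persName")


bio_index = build_index(ET.fromstring(BIO))
name = resolve_person(bio_index, "bio.xml#person-002")
```

Returning None for an unknown target lets the caller fall back to showing the raw reference value.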

@dirkroorda
Member

@Beatrice-nava
A question came up: shouldn't we add this to the metadata of the letter as well?

teiHeader > fileDesc > sourceDesc > msDesc > physDesc > objectDesc, the attribute form="correspondentiekaart"

@Beatrice-nava
Collaborator Author

I think we decided to ignore it for the time being, as we basically only have letter and postcard as objectDesc, but that could change based on Wietse and Leo's feedback (I think we have already mentioned this in a previous comment, but for the moment there is no news).

3 participants