Ontology is unusable without domains and ranges #121
@azaroth42, thanks for shining a light on this issue—you are spot on! I'm reminded of something I heard Nicola Guarino say: "Interoperability is not compatible with underspecification. […] A well-founded computational ontology is a specific artifact expressing the intended meaning of a vocabulary in a machine-readable form." As an ontology, BIBFRAME is underspecified, and that is bound to be a barrier to adoption.

Your point about […]. To reproduce: the declaration of […]. Fixing the bug requires removing the […].

The broader issue of underspecification is still one that needs to be addressed—although data validation is a separate issue from conceptual modeling, and some of the interoperability issues are bound to be solved through community consensus, application profiles, SHACL shapes, etc.

Finally, I also can't help myself, as you say, Rob 😄 I was reminded of your 2015 report, Analysis of the BIBFRAME Ontology for Linked Data Best Practices, namely, section 2.4.5, "Only Define What Matters": […]
Of course, that was nearly 10 years ago, but the line of thinking expressed there was influential at the time and, ironically, paved the way for the modeling decisions that you now rightly critique!
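On the SHACL point above, here is the kind of shape a community application profile might impose (a hypothetical sketch; the shape name and the choice to constrain bf:place are mine, not anything LC has published):

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix ex: <http://example.org/shapes/> .

# Hypothetical profile rule: values of bf:place must be IRIs typed bf:Place.
ex:PlaceValueShape a sh:NodeShape ;
    sh:targetSubjectsOf bf:place ;
    sh:property [
        sh:path bf:place ;
        sh:nodeKind sh:IRI ;    # rules out literal values
        sh:class bf:Place       # rules out Concepts, dates, etc.
    ] .
```

A validator such as pySHACL could then flag a literal-valued bf:place even though the ontology itself permits it.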
Thanks Tim :) My 2015 understanding of good practice and the 2024 understanding are definitely different on that particular topic, and the differences (I think) come from the renewed emphasis in the ontology world on foundational models, rather than domain specific ones. I'm surprised you still have a copy of the report!

The challenge that 2.4.5 (and 2.4.1) was attempting to address was the proliferation of predicates. We can see the end result of going down this line in the 1100 predicates of RDA. We see it rearing its ugly head already with the proliferation of organization based classes such as ShelfMarkNlm and the proposed OCLC class in #120. BF 2.x is much, much better than 1.0 in these regards, but we can still improve further. I still agree with the final sentence quoted above, but the approach to get there is to have broader semantics with a deeper conceptual class hierarchy.

To take #19 as the example... the root cause is an incomplete definition (and thereby understanding) of "location". To say that "online" can be a Place expands the notion of location from purely physical/spatial into digital. Or potentially into the abstract, to say that the location is "in storage" (a state, or at best a classification of the intended use of a Place). Rather than working out how to improve the model, the existing relationships have instead been broadened far beyond their initial intent.

We don't want "placeGeographic", "placeObject", "placeConceptual", "placeDigital" -- that would be falling into the same trap we got out of in 2.0, of having every predicate also have the name of the range class embedded in it. Instead, there should be a digital thing class that can have a locator in digital space. Physical objects can be at a location in geographic space, or related to some other object (the letter is in the folder, the folder is on the shelf). Conversely, it doesn't make sense to say that Concepts or Works have a location, so these would need to be in a different branch of the class hierarchy. Other interesting cases would be the beginning of existence of a thing, and the partitioning of things.

2.4.1 (reuse vocabulary terms) is where I think my understanding has changed the most significantly over the past decade, with experience of Open Annotation, BF, IIIF, CIDOC CRM, RICO and Linked Art. Grafting predicates between conceptual models risks unintentionally importing undesirable semantics. Better to have a single core conceptual model with a strong ontology that allows sufficient scope for reuse across domains.
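To sketch the kind of branching I mean (every name below is an illustrative placeholder, not a proposed BIBFRAME term):

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/sketch/> .

ex:Location       a owl:Class .   # purely geographic/spatial
ex:PhysicalObject a owl:Class .   # letters, folders, shelves
ex:DigitalObject  a owl:Class .   # files, web resources

# Physical things sit in geographic space or inside other physical things...
ex:locatedIn a owl:ObjectProperty ;
    rdfs:domain ex:PhysicalObject ;
    rdfs:range  [ a owl:Class ; owl:unionOf ( ex:Location ex:PhysicalObject ) ] .

# ...while digital things have locators in digital space.
ex:digitalLocator a owl:DatatypeProperty ;
    rdfs:domain ex:DigitalObject ;
    rdfs:range  xsd:anyURI .

# Concepts and Works simply fall outside both domains, so "the location of a
# Concept" is not expressible, rather than vacuously permitted.
```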
The Guarino quote is interesting; it called to mind the work Guarino & Welty did on "ontology cleaning" back in the day (circa 2000). Guarino & Welty used symbolic modal logic to suggest a backbone architecture of meta-properties for ontology development. A nice exercise was published in 2007 that applied one part (roles) of the ontology cleaning method to FRBR (https://experts.illinois.edu/en/publications/three-of-the-four-frbr-group-1-entity-types-are-roles-not-types) but the authors, in a stroke of deep generosity to the profession, suggested that […]
Indeed. For the problems of bibliographic control, it would seem that the actual world is perhaps world enough. To advance generously, then, we might view BIBFRAME as a denormalized ontology. I like to think of BIBFRAME as bringing forth a rich bibliographic cataloging tradition into linked data. Svenonius wrote in the preface to The Intellectual Foundation of Information Organization, on the aims of the book, that […]
I often look to my colleagues for help in making sense of a beautifully pragmatic cataloging practice. I count as colleagues those who have guided the changes we see in BIBFRAME today. I support their work; pragmatically and empirically, BIBFRAME is used in disparate systems and interoperates just fine. I've presented a little on useful approaches, but they aren't the only way to interoperate; though, like the actual world, it is enough.
@azaroth42, yes, speaking of digital space... :) it looks as though I recovered a copy from Google Drive in 2018!
Great examples. In addition to space, BIBFRAME also lacks a theory or model of time. It's all practice (porting MARC 21) and no theory. The lack of attention to definitions (compared to more robust models) is another case in point. In BIBFRAME, time is reduced to a unidimensional […]
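For contrast, the W3C Time Ontology treats a span of time as a first-class interval with a beginning and an end, rather than a single opaque value (a minimal sketch; the publication-period resource is made up):

```turtle
@prefix time: <http://www.w3.org/2006/time#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .

# A publication period modeled as an interval, not one flat literal.
ex:publicationPeriod a time:ProperInterval ;
    time:hasBeginning [ a time:Instant ;
        time:inXSDDateTimeStamp "1901-01-01T00:00:00Z"^^xsd:dateTimeStamp ] ;
    time:hasEnd [ a time:Instant ;
        time:inXSDDateTimeStamp "1910-12-31T00:00:00Z"^^xsd:dateTimeStamp ] .
```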
Right, alignment with upper-level ontologies or conceptual models can do more toward advancing interoperability. The LRMoo approach looks appealing, though I need to study the model more closely.
Thanks, @jimfhahn, for the references--I'll take a look! I do appreciate where you're coming from, although you may be romanticizing cataloging practice a bit :) I speak from experience, having worked in the MARC 21 milieu for a few years. The problem, to me, is that BIBFRAME seems to be modeling the world of cataloging practice rather than the actual world itself. Within the cataloging community, shared rules and norms are probably enough to drive interoperability through consensus. But if we are interested at all in interoperability broadly speaking, we should think about how to model our data in a way that's coherent and well defined. Projects like openWEMI are also interesting in this regard.
Rather than making snarky comments on the myriad issues, I'll open a high level issue for the problems that are being caused by them.
By reducing every domain and range to rdfs:Resource, you have destroyed any usability or interoperability of the ontology to the point where it's completely worthless. Why? Because rdfs:Literal is a subClass of rdfs:Resource.
So all of those properties like `agent`, `carrier`, `language`, `place`, etc. can all be either an entity of any class or a literal value of any type. A language with a value of a date? No problem! A place that's a Concept ... sure, why not!

At this point, all the ontology actually provides is a flat list of names of properties and classes that implementers can choose from and mix and match freely to their hearts' content. Two implementations that follow the ontology but take diametrically opposed approaches to almost any modeling choice are both completely valid, and thereby interoperability is gone. This makes usability by software engineers either wonderful (no constraints, so everything is correct) or impossible (no constraints, so everything has to be tested for, which is impossible across arbitrary data).
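To make that concrete, all of the following is perfectly consistent with the ontology as published (a contrived sketch; the resource names are invented):

```turtle
@prefix bf:   <http://id.loc.gov/ontologies/bibframe/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .

# Nothing in the ontology rules any of this out:
ex:instance1
    bf:language "2024-01-01"^^xsd:date ;   # a language whose value is a date
    bf:place ex:heroism .                  # a place that is a concept

ex:heroism a skos:Concept .
```

No reasoner or validator working from the ontology alone can object, because every value is an rdfs:Resource.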