class: TaxonConcept #1
Comments
I have adopted the text from Franz & Peet (2009) for now, but I am not sure it was intended as a definition, as it is more a description of the relationship between a taxon concept and its name. For me, defining the taxon concept by reference to a taxon name is like putting the world upside-down, so I would prefer something like this:
I had 'a taxonomic group of organisms' first, but I think we should just refer to the definition in Darwin Core to indicate that our usage of the term 'taxon' also includes groups that are not normally considered taxa, such as hybrids and cultivars. Franz and Peet's (2009) text could still very well serve as a comment. |
It was intended as a definition (just to clarify that). To further clarify, it makes sense to center this definition around the relationship between a label/string and its referential extension as viewed by a certain author/source and at a given time, if the intent is not to use the term "Taxon", which is not mentioned once in Franz & Peet (2009). I understand the contextual requirement for compatibility with DwC, however. In some sense, given that the 2009 article was not really meant to be all that compatible with DwC, there might be a case here for just omitting it; or say: for an alternative conception, see... |
It is not obvious to me that the author's full blown view of a name is going to be an extension, or a single extension. (Similarly there will be extensions, such as those hypothesized algorithmically or those for "partly blown views", that are not any author's full blown view.) For example, the full blown view might be vague or ambiguous or incomplete or inconsistent, perhaps intentionally so. A view and an extension are very different kinds of things - they have different properties and identity criteria. I could try to invent a definition that I like better but I'd insist on starting with use cases - those should be able to dictate any missing details of the particular sense that we would like to assign to this class name. Can someone offer high-quality examples where some particular thing (hypothesis, extension, text, etc.) would be a member of class:taxonConcept? Maybe there are some in TCS ...? It is always better to build a class up from examples than to think about it in a vacuum. The latter generally just leads to a lot of unproductive philosophizing and arguing. I'm not disagreeing with @nfranz , I'm saying that there are easier ways to hone definitions than just talking about them abstractly. |
@nfranz, what is the rationale behind trying to avoid the term "Taxon"? As that is where I would start and that has got nothing to do with Darwin Core. |
Thanks, @jar398. I may, or may not, be able to rescue "full blown" by reading it narrowly to just mean: all that one (someone else) can justifiably (according to mainstream systematic criteria for acceptable practice at the time) infer from that source about the concept's extension. Agree though, ok to move forward from that. |
@nielsklazenga It is one term that has two I think very important flavors or functional domains in biology, a more realist one (referring [however imperfectly] to natural, causally sustained phenomena) and a more constructivist one (modeling human data evolution); and, as often defined and applied in DwC, it can support both or kind of either in context, but as just one term it is not well suited to keep the two flavors apart consistently and explicitly, when and where that is needed. (And this is my vote not to continue this subthread further here; I am merely answering a question I was asked.) |
@jar398 , not sure what sort of examples you are after, but the use cases @jgerbracht and @camwebb presented at the IG meeting in September might be a good start. I have plenty of examples too, but they are mosses and I think it is better to illustrate with examples of better-known organisms. At Biodiversity_Next, Olaf Banki had a nice example of the African Elephant (I think that came originally from David Remsen). @jliljeblad spoke at that symposium as well, so may have some insect examples. And @nfranz just posted an example. |
I'm looking not just for pointers to documents but designation of particular entities either in or described in documents. The point is to be able to nail down identity criteria, use vs. mention questions, and other ontological fundamentals (so as to enable interoperability). Are we talking (in an example) about the text in an article, or the meaning of the text (that is different), or the extension of the meaning of some text? Or something else? If there are distinct taxon concepts with the same extension, what is an example of that? Having more examples is not necessarily better since different examples may indicate different answers to the general questions; that is why I think a small number of "best" or "canonical" examples (one might say: "type" examples) is better. They should be relevant to some existing project like eBird, not aspirational. OBO tries to include one or more examples with every class definition, and I think that is a good idea, but again, one has to be a bit careful here or else the text describing the example will be too ambiguous. (In particular, taxonomic names are ambiguous in multiple ways.) A use case may have to not just give data, but talk a bit about how it will be used, because without that there are likely to be ambiguities. - I agree that this shouldn't be difficult, I'm just not sure I'm the right person to be putting such things forward. You say @nfranz posted an example but even in that careful article I can't quite tell whether the given 'taxon concepts' are meant to be extensions, or entities (perhaps bibliographic or conceptual) with associated extensions. Is it possible to have two distinct taxon concepts with the same extension? That's not answered. They have 'membership compositions' - that is useful information - but do they have anything else? Talking about the inferences that happen inside of real applications is going to hone these questions more than looking at prose like Nico's article. 
So when I say "example" I really mean, ideally, examples of digital data (like a row of a csv file) in its natural habitat (such as Euler/X or iNaturalist). Stuff of the sort that this ontology is targeting. Prose is not so helpful as an example source (unless the entity in question is textual), and if there is no inference (such as deduplication) it is hard to know what is intended. Again, this is not a matter of speculation. We should be able to look at what our applications do and figure it out from there. |
In TCS 1 taxon concepts are definitely textual in nature - at least, I don't see how to read it otherwise, although it doesn't come out and say so. Evidence: a TaxonConcept can have only one taxon name. (by comparison, an extension might have many taxon names, if it has many descriptions/publications.) We could say that TaxonConcept is carried over as compatibly as possible from TCS 1 to TCS 2, but I think examples would still be in order, to clarify questions like this. |
Thanks, @jar398. I get that. I am not sure we have that on tap and ready to be deployed. A while ago I started more of a HowTo guide; but that is semi abandoned. Two sensible outs I see. (1) Decide that that is more implementation than TNC document specification and take a pass, either pragmatically or perhaps even more profoundly (see below). (2) Yes, do that work for an existing, sufficiently structured, relevant source. eBird would be great. Avibase might be easier. Both options (two phrases back) may have downsides. Here is the way I have preferred to implement this, to address your questions. I have looked to maximize intensional congruence across concepts in separate treatments, wherever I could imagine sitting in front of an audience of skeptics and holding my ground well enough. My base challenge to myself has been: at what point can I no longer claim with a straight face that intensional congruence can somehow be rescued for any subset of the concepts being aligned? Inversely: I have looked for rather unassailable evidence, textual or contextual or otherwise, of non-congruence; and in the absence of that, given congruence the benefit of the doubt. So I have asserted RCC-5 articulations for meanings, more so than texts. Extension has been mostly a synonym of meaning. Because there is now a kind of pay-off by saying: two separately published concepts are congruent ("same extension") - that is the supposedly helpful integration product being offered - yes of course two concepts can be distinct minimally in the sense of: two non-identical name sec. source labels; while having congruent extensions. In the Perelleschus paper referenced above, 54 concepts are recognized. There are numerous instances of congruent extensions among these, hence there are far fewer instances of reciprocally non-congruent sets or clusters of concepts (if that makes sense). That is further explored here: https://doi.org/10.1371/journal.pone.0118247 (which has input data files for reasoning). 
Look for "Alignment 1 — Voss (1954) and Günther (1936)" to dig deeper. Hard though for anyone, I suppose, not just me, to divorce this effort then from some related political aspirations. I'd rather be vague and allow different more specific implementations of a purposefully under-specified TNC document fight for however local and temporary functional adoption, than overly constrain future application through examples that might limit someone's freedoms to actually use RCC-5 productively. The experienced truth, I think, and way out of this maybe false choice, is to ask ourselves: ok, research communities in biodiversity are often quite shy with this RCC-5 business. How, minimally and through well chosen examples, can the TNC document serve to reduce that reluctance? |
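For anyone who wants something concrete behind the RCC-5 articulations discussed above, here is a minimal Python sketch. It assumes, purely for illustration, that each concept's extension can be written down as a finite, non-empty set of member identifiers. The function name `rcc5`, the "name sec. source" labels, and the member sets are all made up; none of this is taken from Euler/X or from the Perelleschus input files.

```python
def rcc5(a: frozenset, b: frozenset) -> str:
    """Return the RCC-5 relation between two non-empty extensions."""
    if not a or not b:
        raise ValueError("this sketch only handles non-empty extensions")
    if a == b:
        return "congruent"        # often written ==
    if a < b:
        return "is included in"   # proper part, often written <
    if a > b:
        return "includes"         # inverse proper part, often written >
    if a & b:
        return "overlaps"         # partial overlap, often written ><
    return "disjoint"             # no shared members, often written !


# Hypothetical 'name sec. source' labels with made-up member sets.
concepts = {
    "Aus bus sec. Smith 1950": frozenset({"s1", "s2", "s3"}),
    "Aus bus sec. Jones 2001": frozenset({"s1", "s2", "s3"}),
    "Aus cus sec. Jones 2001": frozenset({"s3", "s4"}),
}

print(rcc5(concepts["Aus bus sec. Smith 1950"],
           concepts["Aus bus sec. Jones 2001"]))
```

The first two entries show the point made above: two concepts can be distinct as labels (different sec. sources) while their extensions are congruent. In practice the articulations are asserted by an expert rather than computed from enumerated members, so this only shows what the five relations mean.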
Yes, I think it is definitely the intention to carry over the TCS 1 TaxonConcept as compatibly as possible and I agree that we need a lot more than TCS 1 has, including examples of taxon concepts (and also examples of what are not taxon concepts). The definition in the TCS user guide, by the way, is:
There is some good stuff in sect. 14.1. |
@nfranz, I think there is enough current practice that (a) we don't have to be political and (b) there is little need for underspecification. I think that if you are dealing with an existing specification or platform that is underspecified, then that underspecification needs to be preserved. DwC seems to be like this. But that is not the case here. TCS 1 is pretty well specified (if I remember correctly - would need to review it), and the TCS 2 features that are not in TCS 1 are new so we can be totally prescriptive (subject to a desire for utility). - to repeat what I said before, underspecification is a recipe for non-interoperability, chaos, and errors. The political gains of underspecification are short term and rely on debts that always have to be paid off later. There are limits to how sharply anything can or should be specified but that's not what I'm talking about here. Looks to me like 'Taxon concept' sensu TCS 1 should be preserved, possibly under a different name, and possibly a new, separate class 'extension' added - and conceivably a third one, for 'intension', as you might be suggesting - each with different identity criteria. I'm not sure where our discussion most recently ended on that. I think I had suggested using 'extension' informally in the documentation but not turning it into an ontology term but I'm not going to voice much of a position here. - but, again, the deciding factor here should be which things we need for data exchange, and that depends mostly on what kind of reasoning our various platforms and applications are doing (also on what we consider erroneous inputs, misuse, etc. of the platforms), and that's an empirical question. If having two entities (rows, etc.) with the same extension is the wrong way to use a given application, then those entities are probably intended to represent extensions, not taxon concepts. If it's right then they're taxon concepts or taxon intensions (which can be distinguished by the same method), etc. 
We can tell the difference by looking at how the application works and what input constraints have to be observed to get good results from it. You'll probably have a chance to be politically aspirational in the TCS 2 documentation. Or maybe your aspirations are captured well by Euler/X and the ways it's used, and targeting Euler/X as a use case could ensure that they're represented. I understand you've voiced a nuanced view above and maybe I'm not treating it with sufficient care, let me know if any of this helps |
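A rough sketch of the empirical test described above, assuming a hypothetical application whose input rows each carry an `id`, a `label`, and an enumerated `members` set. The field names and the data are invented for illustration. The check only reports which rows share an extension. Whether such sharing counts as an input error is exactly the question the application itself has to answer.

```python
from collections import defaultdict


def rows_sharing_extension(rows):
    """Group row ids by extension and return the groups with more than one row."""
    groups = defaultdict(list)
    for row in rows:
        groups[frozenset(row["members"])].append(row["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up input rows for a hypothetical application.
rows = [
    {"id": "r1", "label": "Aus bus sec. Smith 1950", "members": {"s1", "s2"}},
    {"id": "r2", "label": "Aus bus sec. Jones 2001", "members": {"s2", "s1"}},
    {"id": "r3", "label": "Aus cus sec. Jones 2001", "members": {"s3"}},
]

shared = rows_sharing_extension(rows)
print(shared)
```

If the application treats the group `r1`, `r2` as a duplicate to be rejected or merged, its rows behave like extensions. If it accepts both rows as legitimate and distinct, they behave like taxon concepts (or intensions) that happen to have the same extension.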
Thanks, @jar398! I don't have much to add. Yes, for the paradigm case higher-level taxonomic concept alignment, say "the oak genus of the Chinese Flora" vs. "the oak genus of the Mexican Flora", my taxonomic instincts have pointed to: intensionally congruent; extensionally overlapping (some children being widespread). But not all data worthy of alignment offer that duality, that clearly. Sometimes, a single "congruent" is the most sensible, and will do good services that way. That said, I feel your comment is pointing in the right direction. |
I am wondering why "Taxon Concept" (the label) and not "Taxonomic Concept". The latter makes more sense to me, but I am not a native English speaker. In Spanish it is also better "Concepto Taxonómico" than "Concepto de Taxon". |
About the "Definition" of Taxon[omic] Concept: "The underlying meaning, or referential extension, of a scientific name as stated by a particular author in a particular publication." That part is overly complex. |
The terms are used interchangeably, but it was |
We could lose the ', or referential extension' bit, if that helps.
That is sort of what it was in TCS 1. The problem is that this definition excludes a lot of things that we consider taxon concepts, such as checklist entries etc. A Taxon Concept needs neither a description nor a scientific name (so for me the problem with the (current) definition is the word 'scientific'). They need a label and sufficient context to be able to compare them with other Taxon Concepts (a lot of things that we have to deal with as Taxon Concepts do not even have that). I would have gone from the Taxon rather than the Taxon Name, so something like:
...but most people do not like that, and it still does not cover all the things that we want to treat as Taxon Concepts. |
How about 'The delimitation of a taxon as stated by a particular author in a particular publication' ?
Jeff Gerbracht
Lead Application Developer, Birds of the World, Cornell Lab of Ornithology

In reply to Niels Klazenga's comment (sent Monday, January 31, 2022, 10:19 PM), quoted here:
About the "Definition" of Taxon[omic] Concept: "The underlying meaning, or referential extension, of a scientific name as stated by a particular author in a particular publication." That part is overly complex.
We could lose the ', or referential extension' bit, if that helps.
Simpler/simplest: "A description or a definition of a taxon denoted by a scientific name, as stated by a particular author in a particular publication."
That is sort of what it was in TCS 1. The problem is that this definition excludes a lot of things that we consider taxon concepts, such as checklist entries etc. A Taxon Concept needs neither a description nor a scientific name (so for me the problem with the (current) definition is the word 'scientific'). They need a label and sufficient context to be able to compare them with other Taxon Concepts (a lot of things that we have to deal with as Taxon Concepts do not even have that).
I would have gone from the Taxon rather than the Taxon Name, so something like:
An opinion about the delimitation of a Taxon (sensu Darwin Core) or taxonomic group as stated by a particular author in a particular publication.
...but most people do not like that, and it still does not cover all the things that we want to treat as Taxon Concepts.
|
I would be very happy with that. Maybe make it '...stated or implied by...'? I am sure we can pick holes in this definition, as we can in all others, but I think perfection is unattainable here. |
My contribution during the meeting, take it or leave it 🙂: "The delimitation (or boundaries) of a taxon, usually by humans, often I'd revise it to: "The delimitation of a taxon, often established through the work of taxonomic circumscription, usually communicated in a publication." or even... "The delimitation of a taxon, usually communicated in a publication." |
Sorry to keep bringing this up, after we spent almost all of last meeting on it, but I think this is the whole ball of wax and we will not get anywhere unless we can settle this properly, as we (or other people if we do not) will keep revisiting it if we do not. I do not really think that anybody in the meeting thought this would be the end of the discussion, but, after having thought about it for a while (almost three weeks now), I think we cannot just put it to bed and move on. Needless to say, after this, that I am not happy with the definition as it stands now (#1 (comment)). It entirely lacks the "concept" and is less a definition of a term than a description of the data we want to put in there. I think that, if we define terms based on what the data looks like, we will always be going in circles. I have been thinking about language a lot in the last two weeks in order to understand my own thinking and to try to understand why I have so much trouble explaining things that I think I have so clearly in my mind to other people (or why other people do not get it really). I think that people's brains are wired (slightly) differently because of the language they grew up with. Also, if English is not your first language, if you want to really understand something, you always fall back on your first language. I have been living in Australia for 22 years and, since I graduated from university, have never written anything significant in any language other than English, and I still do that. So, I hope the following is helpful. We have the word 'concept' in Dutch, but we are much more likely to use one of its synonyms (https://www.interglot.com/dictionary/en/nl/translate/concept), 'begrip', which literally translates in English to 'understanding', or 'opvatting', which translates (also literally) to 'opinion'. So, in my mind, all the definition of 'Taxon Concept' needs to (and should) be is:
This is what I have always understood taxon concepts to be and what my colleagues, who know nothing about biodiversity informatics and have never heard of TCS, understand them to be. Taxon concepts were not invented by TCS or Franz and Peet (2009), they have always been there; maybe not exactly as the combination of words 'taxon concept' (that would be one word in Dutch if we had one), but we certainly always have been talking about someone's concept of a taxon. Note that the definition above is the same in meaning, if not verbatim, as the definition from Franz & Peet (2009) that we started with. Franz and Peet's definition was for a different audience and they tried to avoid the term 'taxon', because of difficulties with the term for that audience. For our audience, the term is unproblematic, as there is a perfectly adequate definition of Taxon in Darwin Core. On the other hand, I think it is important to avoid 'scientific name' to make clear to an audience of largely non-systematists that names are not the things we are interested in, but are the labels of the things we are interested in. I also removed the 'as stated by a particular author in a particular publication', as I do not think being published (or having an 'according to') makes something a taxon concept. A notion (which funnily enough also translates to 'begrip' in Dutch) about a taxon in someone's head is just as much a taxonomic concept as a published opinion. Of course we cannot do anything in TCS with taxon concepts that are not communicated in some way, shape or form, and in TCS taxon concepts need to have an 'according to' (and a label), but that has got nothing to do with definition. I think we have focused way too much on names and publications and possibly have lost track a bit of what we really want to describe, convey, or exchange. That is what happens when you look too closely at the data – or get at it from the data. 
You stop seeing the forest for the trees (or the domain for the data). I think less is more here and that removing every reference to names and publications actually makes the definition clearer and makes it easier for people to understand what things are taxon concepts and what things are not. I think it is clear, for example, that there is no difference between the taxon concepts in individual publications ("taxon name usages") and the so-called "deep" taxon concepts in e.g. AviBase (this is absolutely not to take away from AviBase, which I think is great). It is also clear that synonyms, no matter how broadly you take the term, are not taxon concepts. At the data level, I myself, like most of us, have always treated synonyms as taxon concepts (or the same as taxon concepts), and not just because this is the only way you can deal with synonyms in TCS 1 and Darwin Core, but I have never thought they are taxon concepts (they are names) and I do not think this should be accommodated by the standard (and certainly not by the definition). That would just stop people from looking for better ways...and there is a better way. @deepreef is not the only one who can write long comments. |
A way to provide a functional definition may be this? An identifiable taxonomic position that can be aligned to other such positions via [TCS-compatible] relationships. I like this because it shifts the work of productive definitional precision (and productive ambiguity) to those agents that are providing the relationships. And it's the production of relationships that we really should try to incentivize (I assume that is a shared view). If and when these agents (humans, human-specified algorithms) feel justified in producing alignments, well I suppose then we others are justified in harvesting the information integration benefits, thereby granting in turn that what has been aligned somehow met the functional thresholds of being taxonomic concepts. |
I think @nfranz is on the right track here. I'm not sure the word "position" is right (maybe "assertion"? But that's not much better, and may be worse), so there needs to be some wordsmithing, but my gut tells me this is the right direction to go. Now, for some elaboration: This general issue has been extensively discussed/debated for decades, and remains unresolved. Ironically, it parallels the "species concept" debate (i.e., no end in sight), even though "concept" is used in a different sense. I would STRONGLY prefer to avoid the word "concept" -- in part because of the "species concept" confusion, but mostly because of the excessive amount of "baggage" that word carries. By "baggage" I mean that almost everyone in our space (Biodiversity/Taxonomy/Informatics/etc.) has a clear (in their own mind) understanding of the meaning of that word, yet there are dozens (hundreds?) of subtly (and not-so-subtly) different interpretations of its meanings. The problem is that when people see that word, they immediately interpret it in their own sense by default, even if provided with a specific definition. Keeping the word "concept" as part of the term will perpetuate that barrier to effective communication indefinitely. While certainly not perfect, I think the word "circumscription" suffers far less baggage and associated heterogeneity in meaning within our assorted relevant communities. It immediately invokes the notion of a set of things, and filters out any meaning associated with classification/hierarchy (which some definitions of "concept" include). Aside from the term, we also wrestle with the "thing" that forms the basis of one of these instances (concept/circumscription). I think we all agree that the "thing" is not a scientific name. The "thing" involves actual physical biological organisms, and the scientific name is just a crude and inconsistently applied text-string label that has historically been used to (roughly) represent the "thing". 
So I hope we can all agree that the name is not the "thing". But we still have several candidates for the "thing". I think the two most commonly discussed options are:
Option 1 implies that identifiers are minted for TNUs, and we have secondary data structures that track sets of TNU-circumscriptions asserted to be congruent or asserted to have other RCC-5 relationships with other TNUs. The advantages of this approach are:
The disadvantages of this approach are:
Option 2 implies that we have some mechanism for recognizing/defining a particular abstract circumscription of organisms, and we assign a single identifier to each unique circumscription. One or more TNUs would be linked to each identified/defined circumscription, but would remain as separate "things" (perhaps they could be framed as "instances" of a particular identified/defined abstract circumscription). The advantages of this approach are:
The disadvantages of this approach are:
For most of the past couple of decades, I've been a firm supporter of Option 1, on the grounds that it's relatively easy to define a TNU in a way that most people would implement them in the same way, but it's almost impossible to define a "taxon circumscription" independently of any particular TNU in a way that would be used consistently and semi-objectively. However, over the past year or so, Dave Remsen, Nicolas Bailly and I have been meeting every Thursday to brainstorm this stuff, and we think we're on to an approach for Option 2 that could work pretty well. I originally suggested it at a workshop hosted by Bob Peet to establish FGDC metadata standards for taxonomy back in the late 1990s (I don't remember exactly when, but Stan Blum, Walter Berendsohn and others in this space were there). Basically, I pointed out that taxon concepts/circumscriptions could be defined at different levels of granularity: taxonomic, geographic, population, and individual organism. The last (individual organism) is the most granular, but also the most useless (in that the vast, vast, vast majority of organisms on Earth are never seen or documented or recorded by humans, nor ever will be). Defining concept circumscriptions based on geography or specific populations is fraught with peril at many levels. That leaves defining taxon concepts based on taxonomy -- which is the least granular, but definitely the most practical. Using the word "taxonomy" in this sense is misleading, because specifically what this approach does is define taxonomic circumscriptions by included vs. excluded name-bearing type specimens. What has changed in recent months through discussions with Dave and Nicolas is the realization that we can devise specific mechanisms for tracking these kinds of circumscriptions based on, for lack of a better term, "Protonym Count". This post is already WAY too long, and the amount of text and diagrams necessary to adequately communicate our ideas about this would be enormous. 
But we're chipping away on documentation to explain and illustrate these ideas, and we'll certainly share them with this group as soon as they're ready. But the point is, I see enough promise in this approach that I've shifted my decades-long stance supporting "TNUs as proxies for taxonomic circumscriptions" (Option 1 above) to "sets of implied Protonyms as explicit definitions of taxonomic circumscriptions" (Option 2 above). I strongly doubt that this post has added any clarity to the discussion, but at the very least I can reclaim my throne as provider of overly long comments... |
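To make the contrast between the two options a little more tangible, here is a minimal Python sketch. It is only one reading of "circumscriptions defined by included name-bearing types": every TNU keeps its own identifier (as in Option 1), and a single circumscription identifier is assigned to each unique set of included protonyms (as in Option 2). The class `TNU`, the function `assign_circumscriptions`, the identifier formats, and the data are all hypothetical. The sketch ignores excluded types and does not attempt to represent the "Protonym Count" mechanism, which is not described in this thread.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class TNU:
    """A taxon name usage: a name as used in one particular reference."""
    tnu_id: str
    name: str
    reference: str
    included_protonyms: frozenset  # protonyms whose types fall inside the usage


def assign_circumscriptions(tnus):
    """Map each TNU id to one identifier per unique set of included protonyms."""
    circ_ids = {}
    counter = count(1)
    tnu_to_circ = {}
    for tnu in tnus:
        key = tnu.included_protonyms
        if key not in circ_ids:
            circ_ids[key] = f"circ-{next(counter)}"
        tnu_to_circ[tnu.tnu_id] = circ_ids[key]
    return tnu_to_circ


# Made-up usages: two references agree on what is included, a third does not.
tnus = [
    TNU("tnu-1", "Aus bus", "Smith 1950", frozenset({"Aus bus", "Aus dus"})),
    TNU("tnu-2", "Aus bus", "Jones 2001", frozenset({"Aus dus", "Aus bus"})),
    TNU("tnu-3", "Aus bus", "Lee 2015", frozenset({"Aus bus"})),
]

mapping = assign_circumscriptions(tnus)
print(mapping)
```

Here `tnu-1` and `tnu-2` remain separate things but point at the same circumscription, while `tnu-3` uses the same name for a narrower one.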
And I am with @ghwhitbread . Greg just beat me to it, so I am going to repeat a lot of what he said. I do not really see the difference between options 1 and 2, but I think option 1 is supposed to be what TCS, the TDWG Ontology, Franz & Peet 2009, and the OpenBiodiv Ontology – and AviBase for that matter – call Taxon[omic] Concepts. This includes taxonomic treatments, entries in field guides, entries in checklists, entries in databases like Catalogue of Life, and clades in published cladograms. I see some solid differences between these things. [AviBase also calls the things it assigns AviBase IDs to – which I think are supposed to be option 2 – Taxon[omic] Concepts]. But, as Greg also said, "none of this matters". What matters is what we want to do with these things. If we want to classify them, align them and, most importantly, push them around with TCS, they have to be TCS Taxon Concepts. We cannot have two classes doing the same thing. So TCS Taxon Concept has to be a pretty big tent.
Does anybody really think that As Greg already hinted at, 'Taxon Concept' is only a label. I think it is a good label, but, even if I did not, it has so much history and it has been used so often, that it would be crazy to change.
Couldn't have said it better myself. That is exactly what we are supposed to be doing. |
This is exactly why it DOES matter! If we want to classify them, align them, and share them via well-defined terms for classes and properties, then we need to have a shared understanding of what they actually "are". If the "thing" is a TNU, then don't call it a "Taxon Concept". There is a fundamental (and important) difference between how people will use the standard depending on Option 1 vs. Option 2. Option 1: Option 2: Option 1 encourages conflation of the name, rank, higher classification, and other properties of the TNU as though they are properties of the Option 2 allows the I used to favor Option 1 because I saw no pathway to overcome the problem of Option 2. I switched sides because I can now see that pathway. In any case, my stubbornness/arguments here have very little to do with philosophical arm-waving, and very much to do with practical implications of the standard. A big part of the reason that TCS 1.0 failed to become widely adopted is that it wasn't practical for a lot of providers/consumers to implement.
YES!!! It absolutely is an occurrence ("An existence of an Organism at a particular place at a particular time.")! One of the main reasons there is now a |
@deepreef, you still do not seem to understand what we are doing here. Please read the charter of the TCS 2 Task Group at https://www.tdwg.org/community/tnc/tcs2/. We are NOT changing the TCS Taxon Concept class. A Taxon[omic] Concept is a real thing, whether you like it or not, or whether you get it or not, at least to people like @nfranz and myself (and the makers of AviBase from the looks of it) and that is the thing we are after. If you do not agree with that, that is fine, but the entire purpose of the TCS 2 Task Group, not to mention the TCS Maintenance Group, and what we have been chartered to do, is to make the TCS Taxon Concept work, not to reinvent the wheel. This is not some book club with free-flowing discussion that can go anywhere and does not get us anywhere. |
I specifically read the charter before posting, to make sure I was not off base. I just re-read it again now. Maybe you can quote the part of the charter that you believe I am misunderstanding?
You apparently haven't been reading my posts very carefully. The problem is not that I don't recognize a "Taxonomic Concept" as a "real thing". The problem is that I see two very distinct "real things" it could refer to, and I am very explicitly trying to pin down which version of the "real thing" that If you genuinely believe that "Option 2" is off the table because it somehow represents a "significant change to the meaning of terms", then fine -- we'll introduce it in TCS 3.0. In that case, I will (again) suggest a better definition for "A set of organisms, explicitly indicated and/or implied to exist, that are asserted by a particular static reference to be taxonomically homogeneous and collectively represent the entirety of a taxon." |
I also see 2 distinct real things, and in my view the TCS 2.0 standard will not be an improvement over TCS 1 unless we have a clearer definition of TaxonConcept (I think we can all agree on that). As I stated at the beginning of this entire TCS 2 process, the lack of a clear definition was the main reason that we didn't adopt TCS 1.0: the definition of TaxonConcept was vague enough that it could be, and often was, interpreted as either of these 2 distinct real things or even less constrained interpretations of a taxon. I also recall from early discussions that we would agree to tackle the 'deep' concept in a later phase, as @deepreef is also floating as an option, and that is still fine with me if it helps us make progress (but we should clearly state that we are shelving the 'deep' concept for a later effort). Since we do have 2 distinct real things which were 'combined' in TCS 1.0, we need to decide which of those two should be labeled tcs:TaxonConcept, and we need a very clear definition of tcs:TaxonConcept. I think we need to split the current tcs:TaxonConcept into these 2 real things and I'm OK with either of these being assigned as the tcs:TaxonConcept, but if we try to have tcs:TaxonConcept represent a TN through time, a TNU and a 'deep' Taxon Concept, then I don't think we've done what we really needed to do to improve TCS. If we decide that tcs:TaxonConcept is a TNU, then this definition is a great start. "A set of organisms, explicitly indicated and/or implied to exist, that are asserted by a particular static reference to be taxonomically homogeneous and collectively represent the entirety of a taxon." I might remove 'static', as that implies to me that it has been published, and we run into cases where we need to exchange observational data (and associated concept data) pertaining to a TaxonConcept (TNU) which has not been published. |
I am awfully sorry that I missed this vivid earlier discussion. We all seem to agree there are 2 things that can be defined, so why don't we just define both instead of choosing one over the other? That would also help to have a better definition. I would also like us to think about how to express Plazi style treatment data in TCS2 with real examples, as this is an important source of actual digital taxonomic data. Plazi uses the terminology I worked on the Berlin Model quite a bit. It was a major influence on TCS, as was Prometheus from Jessie Kennedy's (TCS) team. The potential taxa sensu Berendsohn were clearly like the TaxonConcepts in TCS1: "taxa as circumscribed by a reference". They are TNUs with references varying from journal articles to personal communication or just a person's name in a given year. I always found this wide range of reference types a barrier to working with concepts. This is where the taxon concept explosion starts. Limiting them only to treatments, i.e. statically published TNUs through scholarly works, makes it tangible, as you can progressively capture immutable data. Usages maintained in databases can be in permanent flux and are much harder to work with. For that reason alone I wouldn't mind having an explicit Treatment class in TCS - but it is also very limiting for sharing database work for current taxonomic activities. Darwin Core, on the other hand, never focused on taxonomy. In the earlier days it was expected that at some stage the TDWG standards could be joined into a larger model, and different standards had a different focus on what is modelled. TCS1 has placeholders for specimens or literature references for this reason. DwC needed some taxonomic terms in order to exchange occurrence data though, but there was no need to structure it properly. Only later did we kind of hijack DwC to also share standalone taxonomic data without occurrences. 
This evolution of DwC over time and the desire to keep terms stable and not rename them (there is no strict versioning in dwc) led to inconsistent naming of things. dwc:Taxon existed early on; it makes sense to speak of a taxon in the light of an identification/occurrence, as you never actively tie an observation to a synonym. dwc:taxonID therefore was born as the primary key, just as there is occurrenceID for Occurrence. Once we wanted to share traditional taxonomic checklists, though, we needed a way to also share synonyms. And ideally also taxon concepts as per TCS1 or the Berlin Model's potential taxa. By that time we liked to refer to name usages instead of taxa, hence parentNameUsageID, acceptedNameUsageID and originalNameUsageID were born - but referring to a taxonID, not a nameUsageID of a NameUsage class. dwc:Taxon was considered to be treated as NameUsage, i.e. a TNU. taxonConceptID and scientificNameID never got much attention and use, but were originally created with the desire to be able to distinguish between names (scientificNameID), name usages (taxonID) and also taxon concepts (taxonConceptID) in the sense of Rich's Option 2 or Avibase IDs. For both, no explicit class term was created and it was all pooled under the broad dwc:Taxon class. What exactly dwc:taxonConceptID points to was never really clear. I saw it basically as a shortcut to define RCC5 equals relations between name usages - either by picking an existing taxonID of a name usage as the representative usage for the concept (Option 1) - think of type specimens or reference sequences in OTU clustering - or by creating explicit concept ids for just the purpose of aggregating all congruent name usages, as Avibase does (Option 2). As we know, the taxon terms in Darwin Core are not its strength, so I would not use them as a source for TCS terms. I think it is better to come up with something consistent. So much for the history. Sorry for not having a clear proposal, I probably only added to the confusion. |
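To make the two readings of dwc:taxonConceptID described above concrete, here is a minimal sketch in Python. Everything in it is assumed for illustration: the records, the identifiers (`u1`, `c1`, ...) and the helper names `congruent_groups` and `partition` are hypothetical and come from neither Darwin Core nor TCS. It only shows that, whether the concept ID reuses a representative taxonID (Option 1) or is a dedicated identifier (Option 2), grouping by it yields the same sets of congruent name usages.

```python
from collections import defaultdict

# Hypothetical name-usage records; all identifiers and names are invented.
# Option 1: taxonConceptID reuses the taxonID of a representative usage.
usages_option1 = [
    {"taxonID": "u1", "scientificName": "Aus bus", "nameAccordingTo": "Smith 2001", "taxonConceptID": "u1"},
    {"taxonID": "u2", "scientificName": "Aus bus", "nameAccordingTo": "Jones 2010", "taxonConceptID": "u1"},
    {"taxonID": "u3", "scientificName": "Aus cus", "nameAccordingTo": "Jones 2010", "taxonConceptID": "u3"},
]

# Option 2: taxonConceptID is a dedicated concept identifier.
usages_option2 = [
    {"taxonID": "u1", "scientificName": "Aus bus", "nameAccordingTo": "Smith 2001", "taxonConceptID": "c1"},
    {"taxonID": "u2", "scientificName": "Aus bus", "nameAccordingTo": "Jones 2010", "taxonConceptID": "c1"},
    {"taxonID": "u3", "scientificName": "Aus cus", "nameAccordingTo": "Jones 2010", "taxonConceptID": "c2"},
]


def congruent_groups(usages):
    """Map each taxonConceptID to the set of taxonIDs that share it."""
    groups = defaultdict(set)
    for usage in usages:
        groups[usage["taxonConceptID"]].add(usage["taxonID"])
    return dict(groups)


def partition(usages):
    """Return the groups of congruent usages, ignoring how they are labelled."""
    return {frozenset(ids) for ids in congruent_groups(usages).values()}
```

Under these assumptions, `partition(usages_option1)` and `partition(usages_option2)` are equal; the two options differ only in what the group label itself refers to, which is the point under discussion.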
Sorry I stopped engaging. This was taking up too much of my thoughts and got in the way of my work (not to mention my sleep). Also, I went on leave. I know there is no hope of cutting this short, but there is also no hope of ever getting TCS ratified as a usable standard if we cannot put this if not behind then beside us, so I am going to try. Nothing is off the table, but not everything is in scope for the TCS 2 Task Group. At the moment, the TCS Maintenance Group cannot action anything, as we haven't got a standard to action it on. Unless of course people want another XML Schema, but I thought that if there is only one thing we all agree on, it is that we want a vocabulary standard that fits in with other TDWG standards like Darwin Core. We all have slightly different visions for TCS, so if we try to make it into the perfect standard right now we will never get there, because even if we agree there will be part of the community that does not. I think the only way to get anything ratified is to stick as closely as possible to (at least the intent of) what has already been ratified, so that, if people object to something we propose, we can say that they are objecting to something we already have. So, the task of the TCS 2 Task Group is just to turn the XML Schema Definition (XSD) into a vocabulary. Simple maintenance job, right? I had not realised how big of a problem it is that there are no definitions in TCS (1) and that nothing is explicit. TCS assumes people know what the terms (elements and attributes) mean and frankly so did I. Turns out that people read things differently. So, while when I read in the User Guide that a Taxon Concept is "a name with a description" I think that is not a very good definition of a Taxon Concept, other people read this as meaning that a Taxon Concept is a Treatment. I had not realised that. So, that is one thing we need to get to the bottom of. 
Since we can apparently not come to an agreement based on what we intuitively think a Taxon Concept is, can we get some help from outside and agree that the TCS Taxon Concept is the same thing as the Taxonomic Concept sensu Franz & Peet, 2009 and sensu Senderov et al., 2018? So, since @deepreef spelt out the different "options", "Option 2", in my opinion, is the TCS Taxon Concept and "Option 1" is the TCS 1 Publication (or part thereof) and is the I think, or I thought, that there was something else going on in this thread as well, namely that people do understand what a Taxon Concept is, but want a different object. That, of course, we cannot do. But this might have been miscommunication. Regarding extra objects, so new terms, when I say that something is not in scope, it means that it is not in scope for the Task Group, or this body of work, not that it cannot be in TCS 2 (which is just a working name and the '2' is not necessarily a major version). It is not about me trying to stifle the discussion or control what is in and what is out, but about finding the path of least resistance and making sure that there will be a TCS 2 some day and the Task Group does not turn into an Interest Group. Also, staying as close as possible to TCS 1 is a means to an end, not a hard rule. Apart from the set of terms that we really need – we cannot decide that the Taxon Concept is too hard, for example – for all other terms the real criterion is whether we can get it ratified easily. @mdoering 's suggestion to have both the Taxon Concept and the Treatment is very tempting, as it will settle the issue about the meaning of Taxon Concept we are having right now, but adding Treatment is far from straightforward, as I can think of many questions that need to be answered. That being said, nobody is stopping anybody from opening an issue and making the case for it (that is the important bit) independently of the Task Group. In fact, that is encouraged. 
Then, if it is ready to go for review when the rest is, it goes with and otherwise we (the Maintenance Group) will get to it when we get to it. |
It seems that we might have a consensus here. So why don’t we fix it? As none of the TCS2 properties has a domain or range and none of the TCS1 elements has formal definitions, we could add the Taxon Name Usage (TNU) class and sort the definitions of TNU and TaxonConcept. … And maybe tweak some property names. |
I like it. Let's just add the issue for now. That is for Treatment, TaxonNameUsage we already have (#51). |
How about I also create a new issue for TaxonConcept and rename this one? |
My apologies for not engaging myself. I'm dealing with a family health situation, and am way behind on many work-related things. Here is a very brief summary of my view of things:
We have discussed for many years using TNUs as representatives of Taxon Concepts, and perhaps using TNU IDs for Taxon Concepts (this is something I strongly supported for many years). My main problem with specific IDs for Taxon Concepts is that there was nothing "real" to link them to, other than a specific TNU. No one had proposed a clean definition for a Concept (to which an ID would be assigned) that did not involve heavily subjective expert assessments, or in a way that was automatable or scalable, so I believed that it was best to "stack" sets of TNUs representing congruent concepts, and select one of those TNU IDs to serve as a proxy ID for the associated Taxon Concept. My view on this has now changed. We badly need a clean definition of a TNU, and a separate clean definition of a "Taxon Concept".
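The "stacking" approach described above (the one I previously supported) can be sketched roughly as follows. This is only an illustration under stated assumptions: the TNU identifiers, the pairwise congruence assertions, the publication years and the function names `stack_congruent` and `proxy_ids` are all hypothetical, and choosing the earliest TNU as the proxy is just one possible selection rule, not something agreed in this thread. Congruence is treated as transitive, so stacks are built with a simple union-find.

```python
def stack_congruent(tnu_ids, congruent_pairs):
    """Group TNU IDs into stacks, given pairwise 'congruent' assertions.

    Congruence is assumed to be transitive (an assumption of this sketch).
    """
    parent = {tnu: tnu for tnu in tnu_ids}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in congruent_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    stacks = {}
    for tnu in tnu_ids:
        stacks.setdefault(find(tnu), set()).add(tnu)
    return list(stacks.values())


def proxy_ids(stacks, year_of):
    """Pick one TNU per stack as the proxy ID for the concept.

    Hypothetical rule: the earliest TNU wins, ties broken by ID.
    """
    return {min(stack, key=lambda t: (year_of[t], t)): stack for stack in stacks}


# Invented example data.
tnus = ["tnu:1", "tnu:2", "tnu:3", "tnu:4"]
pairs = [("tnu:1", "tnu:2"), ("tnu:2", "tnu:3")]
years = {"tnu:1": 1990, "tnu:2": 1985, "tnu:3": 2001, "tnu:4": 2010}

proxies = proxy_ids(stack_congruent(tnus, pairs), years)
```

The sketch also shows where the subjectivity sits: the `pairs` input is exactly the expert assessment of congruence, and nothing in the code can supply it.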
We need to come up with a clean, stable definition of a TNU, which maps to PLAZI Treatments (all PLAZI treatments correspond to a single TNU, but all heterotypic synonyms included within a Treatment correspond to separate TNUs). This is fundamentally important, because TNUs literally underpin ALL nomenclature and ALL taxonomy. They are, by far, the most important informatic units of taxonomy in general, and therefore it's extremely important to get the definition and properties right.
I recently gave a presentation at TDWG explaining how the term "Taxon Concept" often conflates three different things: Classification, Nomenclature, and Circumscription. Based on the work that Dave Remsen, Nicolas Bailly and I have been doing these past 2+ years, it has become clear that all three of these things are best modelled as sets of Protonyms. Classifications are actually an ordered array of Protonyms, starting with the terminal taxon name and stepping up through the ranks all the way up to Kingdom/Domain. Nomenclature is a subset of classification, representing either a single Protonym (for names at the rank of genus or higher) or an ordered array of Protonyms (for names below the rank of genus), and some algorithms to format text strings based on these Protonym Arrays and relevant Code rules. Circumscriptions are sets of Protonyms that are collectively regarded as heterotypic synonyms at the same taxonomic rank. This requires a LOT more text and graphics to explain, which Dave, Nicolas and I are currently working on (we have 14 "Chapters" already, so this will likely end up as a book). We think we've solved all the major informatic questions (yes, @mdoering, in a way that very elegantly handles taxonomic "splits" -- among other edge cases). The question is: what is a "Taxon Concept"? We tend to favor restricting it to "Circumscription" only, keeping "Classification" and "Nomenclature" separate. However, because all three of them are represented by ordered or unordered arrays/sets of Protonyms, they could all collectively represent the "Taxon Concept". But I think "Circumscription" is the part most people want an informatic solution for.
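As a rough illustration of the three Protonym-based structures just described, here is a minimal Python sketch. It is not the model Dave, Nicolas and I are writing up; the `Protonym` class, the simplified rank list, the placeholder names ("Aus", "bus", ...) and the functions `nomenclature`, `format_name` and `is_circumscription` are all assumptions made for this example, and real name formatting under the Codes is far more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protonym:
    id: str       # hypothetical identifier
    epithet: str  # the name element established by this Protonym
    rank: str


# Simplified, assumed list of ranks below genus.
RANKS_BELOW_GENUS = {"species", "subspecies"}


def nomenclature(classification):
    """Derive the Protonyms that make up a name from a classification.

    `classification` is an ordered tuple of Protonyms, from the terminal
    taxon name up to the kingdom.
    """
    terminal = classification[0]
    if terminal.rank not in RANKS_BELOW_GENUS:
        return (terminal,)  # genus or higher: a single Protonym
    parts = []
    for protonym in classification:
        parts.append(protonym)
        if protonym.rank == "genus":
            return tuple(parts)
    raise ValueError("classification below genus rank has no genus")


def format_name(nomen):
    """Very simplified text formatting: genus first, then lower epithets."""
    return " ".join(p.epithet for p in reversed(nomen))


def is_circumscription(protonyms):
    """A circumscription here is a non-empty set of Protonyms at one rank."""
    return len(protonyms) > 0 and len({p.rank for p in protonyms}) == 1


# Invented placeholder data.
animalia = Protonym("p1", "Animalia", "kingdom")
aidae = Protonym("p2", "Aidae", "family")
aus = Protonym("p3", "Aus", "genus")
bus = Protonym("p4", "bus", "species")
cus = Protonym("p5", "cus", "species")

classification = (bus, aus, aidae, animalia)  # ordered array, terminal name first
circumscription = frozenset({bus, cus})       # regarded as heterotypic synonyms
```

With these assumptions, `format_name(nomenclature(classification))` yields the binomial built from the genus and species Protonyms, while the circumscription is an unordered set that stays separate from both classification and nomenclature.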
I will not be satisfied with any new taxonomic "standard" unless it will allow us to deprecate both TCS 1.0 and OK, that's enough for now. I'm still dealing with the family health situation (I'm in Florida right now, helping my wife's sisters take care of their ailing father, my father-in-law), so I will not be able to engage again on this topic until late next week at the soonest. I need to reconnect with Dave and Nicolas to flesh out our ideas in writing and images, based on feedback we got at TDWG (including some really important discussions with Donat and Guido of PLAZI, as well as Pensoft and some of the IPNI folks). I'm hoping that by the end of this year or early in 2023, Dave, Nicolas and I will have something more complete to share, to express where we're coming from on the "Protonym Sets" approach to modelling taxonomy and nomenclature. |
Thanks @deepreef.
All of this is wildly out of scope for the TCS 2 Task Group of course. Our job is just to get a set of terms ready for review and we are going to do that before mid-2023. |
Thanks, @nielsklazenga -- the views I expressed were general thoughts on the informatics needs of the community, without any constraint on whatever may or may not be included in TCS 1.0. I disagree with your point 2 (at several levels), but unfortunately I don't have time to elaborate further right now. Perhaps next week, if you're interested in continuing the discussion. No dispute on point 3. My post was not intended as a rebuttal -- it was just an opportunity to make up for my long prior silence by summarizing my current views. I agree that point 4 is out of scope for this particular task group -- it is really something that should be addressed in the context of DwC. |
TaxonConcept (class)
Mapping