He Kupu Tawhito is a multilingual concordance program for digital texts encoded in TEI P5 (http://www.tei-c.org/Guidelines/P5/index.xml) and manipulated using makefiles, XSLT and XQuery.
He Kupu Tawhito was originally started as an entry for http://www.mixandmash.org.nz/ but was never actually entered.
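To give a rough idea of what a multilingual concordance over TEI text involves, here is a minimal sketch in Python. It is an illustration only, not the project's own pipeline, which uses makefiles, XSLT and XQuery. The sample document, the `build_concordance` function and its `width` parameter are all hypothetical names invented for this example, and the sample is deliberately tiny (it omits the `teiHeader` a valid TEI P5 document would need).

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, deliberately minimal sample (not schema-valid TEI P5).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p xml:lang="mi">He kupu tawhito he kupu hou</p>
      <p xml:lang="en">An old word and a new word</p>
    </body>
  </text>
</TEI>"""


def build_concordance(tei_xml, width=2):
    """Map (language, word) to a list of keyword-in-context strings."""
    root = ET.fromstring(tei_xml)
    concordance = defaultdict(list)
    for p in root.iter(f"{{{TEI_NS}}}p"):
        # Assumption: language is marked with xml:lang on each paragraph.
        lang = p.get(XML_LANG, "und")
        # Lowercase the paragraph text and split it into word tokens.
        tokens = re.findall(r"\w+", "".join(p.itertext()).lower())
        for i, token in enumerate(tokens):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            concordance[(lang, token)].append(f"{left} [{token}] {right}".strip())
    return dict(concordance)


if __name__ == "__main__":
    for (lang, word), contexts in sorted(build_concordance(SAMPLE).items()):
        for context in contexts:
            print(lang, word, context, sep="\t")
```

Keying entries on the language as well as the word keeps identical spellings in different languages apart, which is the main thing a multilingual concordance adds over a monolingual one.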
The Text Encoding Initiative (TEI) is a community of practice in the area now known as textual digital humanities. Since 1994 the primary output of the TEI has been the TEI/XML guidelines, a standard for the interchange of textual data. A main focus of the TEI is the TEI-L mailing list; the TEI also has a presence on GitHub and Docker, a communal repository called TAPAS, and an academic journal, jTEI.
TEI/XML can be thought of as a sibling of HTML (they're approximately the same age, depending on how you measure it) that evolved with a focus on defined textual semantics rather than defined display semantics. TEI by Example is a good introduction to TEI/XML.
The Text Encoding Initiative Wikipedia article contains some short examples.
The TEI/XML standard is used by content-based projects such as
the British National Corpus,
the Perseus Project,
the Women Writers Project,
the Oxford Text Archive,
the Digital Tripitaka and
SARIT,
and tool-based projects such as
CorrespSearch,
EpiDoc,
Anthologize,
Versioning Machine,
and many more diverse projects.