Knowledge Graph Lite, Part II

The Mighty Thesaurus

Aug 04, 2025


This article is Part II of a three-part series. Part I can be found here.

Foundation for Shared Understanding

The Simple Knowledge Organization System (SKOS) ontology supports the thoughtful curation of concepts and the structuring of primary relations between them. This high-level modeling begins by defining what we call candidate concepts. Candidate concepts account for every term under consideration for promotion, whether as a preferred label, an alternative label or a hidden label. The chosen concepts are first modeled as a controlled vocabulary, or flat(ter) list.
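To make the three label types concrete, here is a minimal sketch in plain Python of a single concept serialized as SKOS Turtle. The properties skos:prefLabel, skos:altLabel, skos:hiddenLabel and skos:definition come from the SKOS vocabulary itself; the `Concept` class, the `ex:` namespace and the sample labels are assumptions made up for illustration.

```python
from dataclasses import dataclass, field

# The SKOS namespace is real; the ex: namespace is a hypothetical placeholder.
PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/vocab/> .\n"
)


@dataclass
class Concept:
    """One curated concept with its three kinds of SKOS labels."""
    local_name: str
    pref_label: str                     # the single promoted label
    alt_labels: list[str] = field(default_factory=list)     # synonyms, acronyms
    hidden_labels: list[str] = field(default_factory=list)  # misspellings, search-only variants
    definition: str = ""

    def to_turtle(self, lang: str = "en") -> str:
        # Note: labels are not escaped here, so avoid double quotes in sample data.
        statements = ["a skos:Concept", f'skos:prefLabel "{self.pref_label}"@{lang}']
        statements += [f'skos:altLabel "{label}"@{lang}' for label in self.alt_labels]
        statements += [f'skos:hiddenLabel "{label}"@{lang}' for label in self.hidden_labels]
        if self.definition:
            statements.append(f'skos:definition "{self.definition}"@{lang}')
        return f"ex:{self.local_name} " + " ;\n    ".join(statements) + " ."


concept = Concept(
    local_name="knowledgeGraph",
    pref_label="knowledge graph",
    alt_labels=["KG"],
    hidden_labels=["knowlege graph"],  # a deliberate misspelling, kept for search only
)
turtle = concept.to_turtle()
print(PREFIXES)
print(turtle)
```

A candidate term that loses out as the preferred label is not thrown away; it usually survives as an alternative or hidden label on the winning concept.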

https://www.hedden-information.com/skos-taxonomies/

After the curation stage, the refined concepts are structured as a taxonomy, or hierarchy. Some practitioners choose to model a taxonomy and a thesaurus simultaneously when working with SKOS. As with any other practice, each order of operations has its benefits and pitfalls. My recommendation is to model each stage individually, as SKOS modeling is iterative anyway. You will continue to revise and refine your SKOS model throughout the lifecycle of the thesaurus. So get used to it.
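The taxonomy stage can be sketched as a mapping from each concept to its broader concepts, which mirrors skos:broader. The snippet below is an illustrative sketch only: the sample concepts and the helper names `ancestors` and `narrower_index` are assumptions, and the cycle check is a curation aid rather than something SKOS itself enforces.

```python
def ancestors(concept, broader):
    """Walk upward through broader links and return every ancestor found.

    `broader` maps a concept label to a list of its direct broader concepts.
    Raises ValueError if the concept turns out to be its own ancestor.
    """
    seen = []
    stack = list(broader.get(concept, []))
    while stack:
        parent = stack.pop()
        if parent == concept:
            raise ValueError(f"cycle detected: {concept!r} is its own ancestor")
        if parent not in seen:
            seen.append(parent)
            stack.extend(broader.get(parent, []))
    return seen


def narrower_index(broader):
    """Invert the broader mapping to get parent -> children (skos:narrower)."""
    narrower = {}
    for child, parents in broader.items():
        for parent in parents:
            narrower.setdefault(parent, []).append(child)
    return narrower


# Hypothetical three-level taxonomy.
broader = {
    "SPARQL": ["query language"],
    "query language": ["semantic web"],
    "RDF": ["semantic web"],
}

print(ancestors("SPARQL", broader))
print(narrower_index(broader))
```

Running the check on every concept after each revision is a cheap way to catch hierarchy mistakes early in an iterative workflow.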

Most candidate concepts can be acquired by gathering every concept already in use within a system: terms scraped from the systems themselves, or collected from lists of metadata terms, data catalogs, disparate taxonomies or a conglomeration of several source vocabularies. The idea is to collect as many of the vocabularies and terms in use as possible, to establish a collection of terms that will represent a shared understanding of a system, a domain or specific subjects.

Not all candidate concepts become classes or official members of a SKOS concept collection. The idea is to collapse and refine candidate concepts to arrive at a reconciled, disambiguated SKOS vocabulary, where all concepts are defined and unique.

The work of refining a vocabulary takes rigor and discipline, as it’s easy to turn a blind eye to ambiguous terms, or to succumb to stakeholder pressure when colleagues insist on certain concepts such as “Other implementations”. The key here is to resist peer pressure and work against ambiguity, for the sake of knowledge. For a more extended overview of controlled vocabularies and ambiguity, I suggest you bookmark and read ANSI/NISO Z39.19-2005 (R2010), Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies.

What’s With a Shared Vocabulary?

The process of modeling system concepts using SKOS is ultimately focused on establishing a shared vocabulary, bolstered by SKOS’s ontological logic. A SKOS vocabulary reduces ambiguity by guiding the modeler through the thoughtful curation of concepts and by establishing natural workflows that support the reconciliation of duplicate or near-duplicate concepts.

For example, when I was handed a flat list of over 10,000 terms, the result of scraping GitHub and Stack Overflow, SKOS provided the framework to reconcile synonyms and acronyms while shaping the long list into a three-level SKOS taxonomy. The 10,000 terms swiftly became 2,500 well-defined concepts, with parent-child relations, alternative labels, hidden labels and related concepts. SKOS modeling is a critical foundation for the development of more complex ontologies, as it is impossible to model messy, undefined, unrefined data. A shared vocabulary establishes baseline definitions and agreed-upon vocabularies, and therefore a shared understanding of what is to be further moulded and modeled.
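As a rough illustration of that kind of reconciliation, the sketch below collapses spelling variants and acronyms into one preferred label each. It is not the process I used on the scraped list; the `normalize` and `reconcile` helpers, the sample terms and the hand-curated `synonym_map` are all assumptions for the sake of the example.

```python
from collections import defaultdict


def normalize(term):
    """Lowercase, treat hyphens and underscores as spaces, collapse whitespace."""
    return " ".join(term.lower().replace("-", " ").replace("_", " ").split())


def reconcile(raw_terms, synonym_map):
    """Group raw terms under a preferred label.

    `synonym_map` maps a normalized variant (for example an acronym) to the
    preferred label chosen by the curator. Terms without an entry fall back
    to their own normalized form as the preferred label.
    Returns {preferred label: sorted list of the other variants seen}.
    """
    groups = defaultdict(set)
    for term in raw_terms:
        key = normalize(term)
        preferred = synonym_map.get(key, key)
        groups[preferred].add(term)
    return {pref: sorted(variants - {pref}) for pref, variants in groups.items()}


# Hypothetical scraped terms and a hand-curated acronym mapping.
raw_terms = ["Knowledge Graph", "knowledge-graph", "KG", "kg", "SPARQL", "sparql"]
synonym_map = {"kg": "knowledge graph"}

result = reconcile(raw_terms, synonym_map)
for preferred, variants in result.items():
    print(preferred, "<-", variants)
```

Each key is a candidate for skos:prefLabel, and the variants are candidates for skos:altLabel or skos:hiddenLabel. Deciding which is which, and writing the definitions, remains human curation work.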

Support Iterative Development

Lightweight ontologies are agile: they can be quickly prototyped, tested, and adapted as understanding evolves. This scaffolding approach helps teams learn together, gradually maturing their knowledge model before locking in the heavy constraints that accompany lower-level, more complex ontologies. SKOS is a fabulous model for getting your ontology feet wet, a welcome primer for ontology modeling. With all of the head scratching around how to build ontologies, and even around what an ontology is, SKOS helps to teach ontology logic through the ontological structuring of taxonomies and thesauri.

Just as machines appreciate the logical discipline introduced by relationship constraints and definitions, human modelers benefit from the limitations presented by SKOS’s inability to model more complex relationship types. This is the very essence of ontology modeling, as building ontologies requires focus and discipline. SKOS exists as ontology training wheels, a way to model a shared understanding by way of a simplified domain model.

A glossary, data catalog, taxonomy and thesaurus are all excellent candidates for SKOS modeling. Constrain the use cases, start with what exists within any given system, model with SKOS and iterate, until your data asset can stand on its own as a rich, disambiguated SKOS ontology model.

Experiment with your SKOS model. Use it for Retrieval Augmented Generation (RAG), train your AI model on SKOS, and measure whether the AI output improves. Operationalize SKOS early and often to provide a feedback loop for iterative modeling, until the shared vocabulary model provides enough context, enough meaning, to deliver meaningful results.
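One simple way to put a SKOS vocabulary to work in a retrieval pipeline is to map whatever label a user types back to the preferred label before searching. The sketch below shows only that lookup step, under stated assumptions: the concept dictionaries, the helper names `build_label_index` and `match_concepts`, and the sample query are hypothetical, and a real RAG setup would pass the matched concepts on to its own retriever.

```python
import re


def build_label_index(concepts):
    """Map every label (preferred, alternative, hidden) to its preferred label."""
    index = {}
    for concept in concepts:
        labels = [
            concept["prefLabel"],
            *concept.get("altLabels", []),
            *concept.get("hiddenLabels", []),
        ]
        for label in labels:
            index[label.lower()] = concept["prefLabel"]
    return index


def match_concepts(query, index):
    """Return preferred labels of the concepts mentioned in the query."""
    text = query.lower()
    found = []
    for label, preferred in index.items():
        # Whole-word match so that "kg" does not fire inside another word.
        if re.search(rf"\b{re.escape(label)}\b", text) and preferred not in found:
            found.append(preferred)
    return found


# Hypothetical vocabulary entries.
concepts = [
    {"prefLabel": "knowledge graph", "altLabels": ["KG"], "hiddenLabels": ["knowlege graph"]},
    {"prefLabel": "retrieval augmented generation", "altLabels": ["RAG"]},
]

index = build_label_index(concepts)
matches = match_concepts("How do I use a KG with RAG?", index)
print(matches)
```

Queries that match nothing are useful feedback too: they point to terms the vocabulary does not yet cover.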

Welcome to Collaboration

One of the biggest issues with ontologies is the expertise and know-how they demand. Building ontologies is not for the faint of heart and is normally reserved for expert ontologists and well-trained semantic engineers. SKOS is the ideal entry-level ontology, perfectly positioned to support stakeholder engagement and collaboration.

Domain experts and non-technical stakeholders alike can more easily participate in SKOS-based modeling as the primary tasks associated with SKOS involve defining concepts and modeling defined concepts as parent–child relations. SKOS’s simplified logic lowers the barriers for collaboration and validation, ensuring the resulting ontology reflects real-world knowledge and needs. And the best feature of SKOS is its simplicity (it’s in the name, after all).

Intentional Arrangement