The Question Is the Contract
Competency Questions for System Architectures
Every information system ever built shares one essential job—to satisfy questions. Questions are the reason the system exists. Build without questions in mind, and you are left with a system unable to justify its existence.
Questions have been the bedrock of Western and Eastern intellectual traditions for centuries — and yet they are largely ignored in modern system design.
Questions All The Way Down
Aristotle organized knowledge by asking what kinds of questions each domain was equipped to answer. Plato’s dialogues are structured around the pursuit of answerable questions — Socrates famously claimed to know nothing, only to ask. The philosophy of inquiry treats questions as prior to propositions—you cannot begin to know something until you have formulated what it is you are asking.1
R.G. Collingwood took this so far as to argue that propositional logic should be replaced by the logic of question and answer, in which neither the question nor its answer is the more fundamental unit, but both together.2 His provocation taps into the intuitive nature of questions as vehicles for inquiry—a necessary process from which knowledge can be organized, communicated, or retrieved.
The philosophical literature on questions is dense because philosophy as a discipline is about questions. Philosophy distinguishes question types, presuppositions, direct and indirect answers, and the conditions under which a question can be said to be resolved.3 For system designers, there is a simpler version of the same structure—a question defines a space of possible answers, and a system that cannot identify that space cannot reliably produce anything meaningful.
Contemporary epistemologists describe the activity of inquiry as intentional and directed — not a passive acquisition of information but a deployment of capacities aimed at a specific epistemic target.4 Inquiry involves wondering, examining, deliberating, and acting on information once found. The question is not the end of that process, but the beginning and the framework.
Traditional System Design Methodologies
General system design methodology — whether the scalability-oriented framework that decomposes requirements into components and allocates them to hardware and software layers, or the human-centered design approach that begins with user empathy and iterates toward a solution through prototyping — treats the system as the primary object of design.56 The verification question present in both traditions is essentially behavioral—does the system perform its function, handle its load, solve its users’ problem?
Requirements are stated as capabilities — “the system shall” — and validated by testing whether the capability exists.7 Buchanan’s analysis of the relationship between systems thinking and design thinking identifies a key limitation of both. Systems analysis provides no clear identification of the specific problems designers must address; it describes the whole without specifying what the system must produce in response to any particular demand.8 The problem space remains open-ended. What counts as “done” is diffuse.
Information retrieval is different in kind, not degree. When a system exists to answer queries — whether from a human using a search interface, an AI agent traversing a knowledge graph, or a retrieval layer deciding what context to surface — the output is an answer to a specific question, and that answer is either correct or it is not.9 The verification question is closed, not open.
And for these reasons, competency questions fit information retrieval system design in a way that general requirements frameworks do not. A competency question is a natural-language question with known correct answers, stated before the architecture exists in order to guide it. Because the answers are known, the question can be evaluated before the system is built and re-evaluated every time the system changes. The design methodology and the acceptance test are the same artifact, which means the system’s requirements persist through every component.
General system design frameworks do not bring this consistency and rigor to design, architecture, and testing, because they were built for different problems.
General system design frameworks — whether waterfall requirements engineering, agile user stories, or human-centered design — evolved to solve the challenge of building systems that do things: process transactions, render interfaces, route messages, allocate resources. The primary design question in those cases is behavioral. What should the system be capable of? The output is a function, a feature, a service. Whether that output is correct is largely a matter of whether it executes as specified.
Information retrieval has been treated, for most of its history, as a subspecialty of system design: first a library problem, then a database problem, then a search problem — rather than a foundational system design concern. The broader system design tradition absorbed it as one feature among many—“the system shall support search.” That framing immediately loses the thing that makes retrieval distinct, which is that the quality of the output depends entirely on the relationship between the question asked and the knowledge represented. A transaction either processes or it doesn’t. A query returns results that are relevant or irrelevant, complete or partial, correctly attributed or hallucinated — and those distinctions are invisible to a capability-based requirement.
There’s also a historical asymmetry in who was doing the theorizing. The people building systems frameworks — software engineers, systems architects — were not the same people studying query negotiation, relevance, and information need. Library science and information retrieval research developed a sophisticated theory of the question as a design object, but that literature stayed largely inside its own domain. It didn’t get absorbed into software engineering methodology the way, say, database normalization did.
The result is that every system built to answer questions — search indexes, knowledge graphs, RAG pipelines, agentic workflows — tends to be designed with tools that weren’t built for the job. Those tools can specify that retrieval exists, but they can’t specify what retrieval must return.
Ultimately, failing to persist questions throughout the design and architecture process produces a system that is not designed for information retrieval and that answers questions inaccurately, if at all. General system design frameworks treat information retrieval as a game of slots—any part of the system can be blamed for a failure, but it’s a gamble at best.
Questions, Retrieval and Library Science
Long before anyone built a search engine or a vector database, librarians were studying what it means to satisfy an information need — and discovering that the question a user asks is almost never the question they actually have.
The reference interview, developed as a formal practice across the twentieth century, exists because the expressed question and the real question diverge.10 Ross and Nilsen documented this systematically—users present an initial query that has been compressed by uncertainty, social context, and incomplete knowledge of what the system contains. The reference librarian’s job is to negotiate the question — to translate the proxy into the real need — before any retrieval begins.11 Their workshop for library professionals was privately known as “Why Didn’t You Say So in the First Place?” That title distills a powerful system design principle.
The reference interview is also failure analysis. Studies found that in roughly half of all reference transactions, no interview was conducted at all — the librarian began retrieving against the initial query without clarification.12 The failure rate in those cases was predictably higher. The initial question, taken at face value, produced results that addressed what the user said, not what they needed. The discipline of the reference interview exists to close the gap between the question as asked and what the patron is really looking for.
Information retrieval research has tracked the same problem at the architectural level. Blair and Maron’s 1985 evaluation of a large operational full-text retrieval system — a system users regarded as working well — found that it was returning under twenty percent of the relevant documents for a given search.13 The problem was not the retrieval algorithm but rather the representational gap between how users expressed their information needs and how the documents had been indexed. The question, as asked, did not map onto the collection as described. Building a better search mechanism against the same representational mismatch would not solve it.14
Warner, writing in 1999, argued that the entire precision-and-recall paradigm of information retrieval evaluation had the wrong founding assumption — that delivering relevant records in response to a stated query was the goal of an information system.15 The alternative he proposed: systems should be designed around enhanced discriminatory power, the capacity to let users make informed distinctions among information and navigate toward what they actually need. The point was not to deliver an answer; it was to support the ongoing activity of inquiry.16
The same quandaries that philosophy, library science, and information retrieval research identified persist in modern digital systems, where questions are not defined before the architecture is designed and built.
The Discipline of Competency Questions
In ontology engineering, researchers formalized the art of asking questions as the competency question (CQ). A competency question is a natural-language question that a system, combined with real instance data, must be able to answer. Grüninger and Fox introduced the concept in 1995 as a methodology for building and evaluating ontologies. Well before any ontology is designed and formal declarations are made, questions are defined for the future knowledge model to answer. The ontology and knowledge graph are then verified for completeness by confirming that every question is answerable. Failure to answer a question points to a structural deficiency in the ontology and is not treated as a search problem.17
The mechanism proposed is strict. An ontology must contain a necessary and sufficient set of axioms to represent and solve each question in the set. If the axioms cannot support the answer, the ontology is incomplete for its stated purpose. The competency question set is both the specification for what to build and the acceptance test for the built ontology.18
Noy and McGuinness restate this by framing competency questions as a litmus test for whether the knowledge base contains enough information to answer the types of questions the domain requires, at the right level of detail.19 Ren et al. extend the concept into test-driven development for ontology authoring—write the question, derive the structural requirements the question presupposes, and build only what is needed to satisfy them.20 Wiśniewski et al. demonstrated that the translation from natural-language competency questions to formal queries follows consistent, learnable patterns — 106 distinct question types across multiple ontologies, each with identifiable structural signatures.21
Keet and Khan, in a 2024 analysis of competency questions across the full ontology engineering lifecycle, found that questions serve at least five distinct purposes depending on when in the development process they are used: scoping, validating, establishing foundational commitments, articulating relationships, and specifying metaproperties.22 Treating all competency questions as a single undifferentiated category, they argue, misses the different work they do — and leads to questions that are poorly formed because their purpose has not been identified.23
A 2023 survey of 63 practicing ontology engineers confirmed that competency questions are used most effectively when they define ontology scope and serve as the basis for evaluating whether the resulting model represents domain knowledge correctly.24 Engineers who formalize competency questions before building make principled decisions about what to exclude — and can explain those decisions when requirements change. Engineers who skip the question set make assumptions that either prove correct by accident or fail later; both are expensive mistakes.25
The discipline of competency questions is rigorous and tested, yet rarely applied outside the ontology domain, which is a shame. If information and knowledge systems, including AI, are designed to be queried, shouldn’t competency questions extend past ontology engineering? Are competency questions, as a discipline, what is missing from AI and agentic system architectures?
The Missing Element Across Systems
A vector database, a GraphRAG pipeline, an agentic workflow, an enterprise search platform, a context graph, an API layer, a data warehouse — all of these are systems designed to answer questions. The questions are asked by humans, by AI agents, by recommendation engines, by ML models scoring relevance, by orchestration layers deciding what to retrieve next. The questioner may change, but the structure of the problem remains.
Every such system has components that exist in service of the questions the system must answer. The ingestion pipeline decides what to preserve; the indexing strategy determines what to make findable; the retrieval layer decides what is surfaced in response to queries; the inference configuration determines what to derive and how; the schema dictates which relationships can be traversed. Each of these decisions ripples through the others in an architecture meant to answer questions.
If the questions that govern those architectural decisions are not stated explicitly before the decisions are made, each component is designed against implicit or tacit assumptions — and those assumptions will most likely not agree with each other.
The result is a system that misfires: questions produce incorrect or incomplete results. Even more troubling is the noise introduced when systems are not architected for purpose. This is Blair and Maron’s finding restated in infrastructure terms. The system works, in the sense that it returns results. It fails, in the sense that the results do not address the real question.
Applying the Discipline
The competency question discipline, extracted from ontology engineering and applied to system design in general, enforces a specific order of operations—questions first, architecture second, validation third. Repeat.
Define the Questions Before Defining the Schema
The question "which steps in the deployment pipeline have been waiting for human approval for more than forty-eight hours, and which team owns the upstream stage that triggered them?" is not a generic workflow question. It tells you that the schema must link pipeline steps to approval states, approval states to timestamps, pipeline steps to triggering stages, and stages to owning teams — and that all four links must be traversable in a single query.
A schema built from a topic description, say "deployment pipeline data", might contain all four links or none of them. The competency question above makes the ask, and the expected answers, explicit. It also defines what "waiting for human approval" means so it can be modeled. The competency question forces a distinction between a step that has not yet reached the approval gate and one that has reached the gate and stalled — something a topic description cannot express.
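To make the four links concrete, here is a minimal sketch in plain Python. Every class and field name is hypothetical, invented for illustration; the point is that each link the question traverses exists as a typed field, so the answer is computable in a single pass.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical minimal schema derived from the competency question.
@dataclass
class Stage:
    name: str
    owning_team: str                            # stage -> owning team

@dataclass
class Step:
    name: str
    approval_state: str                         # step -> approval state
    approval_requested_at: Optional[datetime]   # approval state -> timestamp
    triggered_by: Stage                         # step -> triggering stage

def stalled_approvals(steps, now):
    """Steps pending human approval for more than 48 hours, paired with
    the team owning the upstream stage that triggered them."""
    cutoff = now - timedelta(hours=48)
    return [
        (s.name, s.triggered_by.owning_team)
        for s in steps
        if s.approval_state == "pending"            # reached the gate...
        and s.approval_requested_at is not None     # ...not merely unstarted
        and s.approval_requested_at < cutoff        # ...and stalled > 48h
    ]

build = Stage("build", owning_team="platform")
steps = [
    Step("deploy-prod", "pending", datetime(2024, 1, 1), build),
    Step("deploy-stage", "pending", None, build),   # never reached the gate
]
print(stalled_approvals(steps, now=datetime(2024, 1, 4)))
# -> [('deploy-prod', 'platform')]
```

The code itself is disposable; the constraint is not. Each field exists because the question requires it, and the distinction between a step that never reached the gate and one that stalled there is explicit in the data model.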
Treat Each Question as an Acceptance Test
A system that returns the correct answer to every competency question is complete for its stated scope. When a schema changes — a new field is added, a relationship is renamed, a property range is updated — the competency question set becomes the regression test. The question that stopped returning the correct answer is the question that identifies what broke. This is the practical application of what Bezerra et al. formalized as a structured evaluation protocol for ontologies—each question specifies a test case, and the system’s response constitutes the measured output.26
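As a sketch of that practice, a competency question set can be held as data and replayed after every change. The names here (`CQ`, `evaluate`, the toy store) are illustrative, not from any specific framework.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class CQ:
    question: str              # the natural-language competency question
    query: Callable[[], Any]   # its formalization against the system
    expected: Any              # the known correct answer

def evaluate(cqs):
    """Return the questions that no longer yield their expected answer.
    A non-empty list pinpoints exactly what a change broke."""
    return [cq.question for cq in cqs if cq.query() != cq.expected]

# Toy "system": a dict standing in for a queryable store.
store = {("deploy-prod", "owning_team"): "platform"}

suite = [
    CQ("Which team owns the stage that triggered deploy-prod?",
       lambda: store.get(("deploy-prod", "owning_team")),
       "platform"),
]

print(evaluate(suite))   # [] -- complete for its stated scope

# A schema change renames the field; the regression surfaces immediately.
store[("deploy-prod", "team")] = store.pop(("deploy-prod", "owning_team"))
print(evaluate(suite))   # the broken question identifies itself
```

Because the suite names the question rather than the component, the failure report says what the system can no longer answer, not merely which module threw.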
Use Questions to Define Retrieval Layer Requirements
The retrieval layer is among the most important components of an AI system: it is where context is organized to drive accuracy, reliability, and effective information retrieval. A retrieval-augmented generation pipeline, for example, has multiple stages between the question and the answer. The retrieval layer surfaces candidate context, while the model synthesizes a response. Errors accumulate at both stages and are often indistinguishable from each other in the output.
A competency question with a known correct answer exposes both the retrieval layer and the synthesized response. If the retrieval layer returns incomplete context, the answer is partial. If the model hallucinates a relationship that does not exist in the retrieved context, the answer is wrong in a specific, identifiable way. Without the expected answer to the question, neither failure mode is detectable — partial answers look plausible and hallucinated relationships look correct.
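A sketch of that diagnosis, under the simplifying assumption that facts and answers can be compared as plain strings; the function and variable names are hypothetical.

```python
def diagnose(retrieved_facts, generated_answer, required_facts, expected_answer):
    """Attribute a wrong answer to the pipeline stage that produced it."""
    missing = [f for f in required_facts if f not in retrieved_facts]
    if missing:
        return ("retrieval", missing)           # context was incomplete
    if generated_answer != expected_answer:
        return ("synthesis", generated_answer)  # model erred despite full context
    return ("ok", expected_answer)

# The CQ's known answer exposes a hallucinated relationship.
result = diagnose(
    retrieved_facts={"Germany requires CE marking"},
    generated_answer="CE marking and FCC certification",  # FCC is invented
    required_facts={"Germany requires CE marking"},
    expected_answer="CE marking",
)
print(result)   # ('synthesis', 'CE marking and FCC certification')
```

Without `required_facts` and `expected_answer`, which only a competency question supplies, both branches collapse into "the answer looked plausible."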
The Tetherless World Explanation Ontology demonstrates this at scale in a working AI system—competency questions govern both system design and real-time query response, with each question mapped to a candidate answer and a corresponding query that verifies whether the system’s output matches the expected result.27 The evaluation is bounded, verifiable, and traceable to a requirement.
Questions Define What a System Need Not Contain
This is the less obvious application of competency questions, but it is structurally important. An agentic system that must answer “what regulatory approvals are required before this product can be shipped to Germany?” needs to contain regulatory frameworks, product classifications and jurisdiction mappings. It does not need historical pricing data, supplier contacts, or product photos because no competency question requires them. Their absence is principled and based upon the requirements of the system.
When a stakeholder later asks why the system cannot answer a pricing question, the answer is that no competency question required it. A system built without a question set cannot give that answer for a query that extends past its scope. If the failed query instead reveals a gap in competency question coverage, the architect can only say it wasn’t thought of, which is a different situation entirely. But what a great feedback loop for system designers: the failure inspires another competency question to guide the architecture and meet retrieval demands.
The Question and the Question to be Answered
The library science reference interview principle applies directly to agentic AI systems. An agent receiving a query from a user or an orchestration layer will receive a proxy for the real information need — compressed, underspecified, and shaped by what the requester assumed the system could handle.
The system that takes that proxy at face value retrieves information against the surface phrasing of the input query rather than the underlying need it imperfectly expresses. The system designed around explicit competency questions — the real questions the system must answer, derived from domain requirements rather than first-contact phrasing — retrieves what the user intended to discover and needs to know. The gap between those two outcomes is the reference interview, reproduced in software.
The Question Before the Graph
Recent work on using large language models to generate competency questions from existing knowledge structures — and to retrofit question sets onto systems that were built without them — has made the discipline of competency questions more accessible.28 Gangemi et al.’s FrODO system demonstrates that ontology drafts can be generated directly from competency questions using frame semantics, making the question set the structural input to knowledge architecture rather than a post-hoc annotation.29 Alharbi et al. shows that generative AI can reconstruct competency questions from existing ontologies, enabling evaluation baselines for systems that predate formal question sets.30
These tools and low-cost implementations make it harder to claim that the discipline of competency questions is too expensive to adopt. The cost amounts to the conversation — with domain experts, data engineers, product managers, and end users — that defines what the system must answer before a single schema element is declared. That conversation has always been possible; we just rarely build systems from questions.
As an artifact, the discipline of competency questions produces a list of natural-language questions specific enough to evaluate and legible enough for non-technical stakeholders to assess, with enough precision to constrain every design decision that follows in any system architecture.
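One possible shape for that artifact, sketched as plain data. The field names loosely follow Keet and Khan's purpose categories; every value is invented for illustration.

```python
# The competency-question artifact: legible to non-technical stakeholders,
# precise enough to constrain design and to test against later.
competency_questions = [
    {
        "id": "CQ-01",
        "purpose": "scoping",
        "question": "What regulatory approvals are required before "
                    "this product can be shipped to Germany?",
        "expected_answer_shape": "list of approval names",
        "known_answer_example": ["CE marking"],
    },
    {
        "id": "CQ-02",
        "purpose": "validating",
        "question": "Which team owns the stage that triggered deploy-prod?",
        "expected_answer_shape": "single team name",
        "known_answer_example": "platform",
    },
]

# Every subsequent design decision can be checked against this list.
print([cq["id"] for cq in competency_questions])   # ['CQ-01', 'CQ-02']
```

A spreadsheet or YAML file works equally well; what matters is that each entry carries a question, its purpose, and a known answer a test can compare against.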
The reference librarian asks a user questions to discover “what are you really trying to find?” The intention is not to slow down retrieval, but to ensure that retrieval is focused on the right things. The ontology engineer writes competency questions before writing a single class declaration for the same reason. The same discipline, applied before building a context graph, a retrieval pipeline, a knowledge API, or an agentic workflow, produces systems that can be evaluated because someone, before the first design decision, wrote down what the system is designed to answer.
Cronon’s guide to historical research makes a point that holds across every domain—a well-articulated question defines not only what to look for, but which methods to use to find it, and when to stop.31 A poorly formed question or no question at all is simply a topic, at best. Topics do not produce systems. Questions do.
Footnotes
Glanzberg, M. “Questions.” Stanford Encyclopedia of Philosophy. First published February 11, 2014; revised March 22, 2022. https://plato.stanford.edu/entries/questions/
Ibid.
Ibid.
Réhault, S. “Inquiry, Questions, and Actions.” Dialogue: Canadian Philosophical Review, Cambridge University Press. https://doi.org/10.1017/S0012217324000167
Xu, A. System Design Interview: An Insider’s Guide. Independently published, 2020. https://bytes.usc.edu/~saty/courses/docs/data/SystemDesignInterview.pdf
Both, T. “Human-Centered, Systems-Minded Design.” Stanford Social Innovation Review, March 9, 2018. https://ssir.org/articles/entry/human_centered_systems_minded_design
ScienceDirect Topics. “System Requirements.” https://www.sciencedirect.com/topics/computer-science/system-requirement
Buchanan, R. “Systems Thinking and Design Thinking: The Search for Principles in the World We Are Making.” She Ji: The Journal of Design, Economics, and Innovation, 5(2), 2019, pp. 85–104. https://www.sciencedirect.com/science/article/pii/S2405872618301370
Blair, D.C. & Maron, M.E. (1985). “An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System.” Communications of the ACM, 28(3), 289–299. https://doi.org/10.1145/3166.3197
Ross, C.S., Nilsen, K., & Radford, M.L. Conducting the Reference Interview, 3rd ed. ALA Neal-Schuman, 2019. https://alastore.ala.org/content/conducting-reference-interview-third-edition
Blair & Maron, “Retrieval Effectiveness.”
Ross et al., Conducting the Reference Interview.
Blair & Maron, "Retrieval Effectiveness."
Ibid.
Warner, J. (1999). “’In the catalogue ye go for men’: Evaluation criteria for information retrieval systems.” Information Research, 4(4). https://informationr.net/ir/4-4/paper62.html
Ibid.
Grüninger, M. & Fox, M.S. (1995). “The Role of Competency Questions in Enterprise Engineering.” In Rolstadås, A. (ed.) Benchmarking — Theory and Practice. IFIP Advances in Information and Communication Technology, pp. 22–31. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-34847-6_3
Noy, N.F. & McGuinness, D.L. (2001). “Ontology Development 101: A Guide to Creating Your First Ontology.” Stanford Knowledge Systems Laboratory Technical Report KSL-01-05. https://protege.stanford.edu/publications/ontology_development/ontology101.pdf
Ibid.
Ren, Y., Parvizi, A., Mellish, C., Pan, J.Z., van Deemter, K., & Stevens, R. (2014). “Towards Competency Question-Driven Ontology Authoring.” In The Semantic Web: Trends and Challenges (ESWC 2014). Lecture Notes in Computer Science, vol. 8465, pp. 752–767. Springer. https://doi.org/10.1007/978-3-319-07443-6_50
Wiśniewski, D., Potoniec, J., Ławrynowicz, A., & Keet, C.M. (2019). “Analysis of Ontology Competency Questions and their Formalizations in SPARQL-OWL.” Journal of Web Semantics, 59, 100534. https://doi.org/10.1016/j.websem.2019.100534
Keet, C.M. & Khan, Z.C. (2025). “On the Roles of Competency Questions in Ontology Engineering.” In Knowledge Engineering and Knowledge Management (EKAW 2024). Lecture Notes in Computer Science, vol. 15370. Springer. https://doi.org/10.1007/978-3-031-77792-9_8
Ibid.
Monfardini, G.K.Q., Salamon, J.S., & Barcellos, M.P. (2023). “Use of Competency Questions in Ontology Engineering: A Survey.” In Conceptual Modeling (ER 2023). Lecture Notes in Computer Science, vol. 14320, pp. 45–64. Springer. https://doi.org/10.1007/978-3-031-47262-6_3
Ibid.
Bezerra, C., Freitas, F., & Santana, F. (2013). “Evaluating Ontologies with Competency Questions.” In Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), Volume 03, pp. 284–285. IEEE Computer Society. https://doi.org/10.1109/WI-IAT.2013.199
Tetherless World Constellation, Rensselaer Polytechnic Institute. “Explanation Ontology: Competency Questions.” https://tetherless-world.github.io/explanation-ontology/competencyquestions/
Rebboud, Y., Tailhardat, L., Lisena, P., & Troncy, R. (2024). “A RAG Approach for Generating Competency Questions in Ontology Engineering.” arXiv:2409.08820. https://arxiv.org/abs/2409.08820
Gangemi, A., Lippolis, A.S., Lodi, G., & Nuzzolese, A.G. (2022). “Automatically Drafting Ontologies from Competency Questions with FrODO.” In Towards a Knowledge-Aware AI (SEMANTiCS 2022). Studies on the Semantic Web, vol. 55, pp. 107–121. IOS Press. https://doi.org/10.3233/SSW220014
Alharbi, R., Tamma, V., Grasso, F., & Payne, T.R. (2025). “The Role of Generative AI in Competency Question Retrofitting.” In The Semantic Web: ESWC 2024 Satellite Events. Lecture Notes in Computer Science, vol. 15344, pp. 3–13. Springer. https://doi.org/10.1007/978-3-031-78952-6_1
Hung, P. & Popp, A. “How to Frame a Researchable Question.” In Cronon, W. (ed.) Learning to Do Historical Research: A Primer. https://www.williamcronon.net/researching/questions.htm
about me. I’m a Semantic Engineer, Information Architect, and knowledge infrastructure strategist dedicated to building information systems. With more than 25 years of experience in enterprise architecture, e-commerce content systems, digital libraries, and knowledge management, I specialize in transforming fragmented information into coherent, machine-readable knowledge systems.
I am the founder of the Ontology Pipeline™, a structured framework for building semantic knowledge infrastructures from first principles. The Ontology Pipeline™ emphasizes progressive context-building: moving from controlled vocabularies to taxonomies, thesauri, ontologies, and ultimately fully realized knowledge graphs.
Professionally, I have led semantic architecture initiatives at organizations including Adobe, where I architected an RDF-based knowledge graph to support Adobe’s Digital Experience ecosystem, and Amazon, where I worked in information architecture and taxonomy. I am also the founder of Contextually LLC, providing consulting and coaching services in ontology modelling, NLP integration, knowledge graphs and knowledge infrastructure design.
I am also the founder, curriculum designer, and teacher of The Knowledge Graph Academy, a cohort-based educational program designed to train and upskill future semantic engineers and ontologists. The Academy balances ontology and knowledge graph theory with practice, preparing graduates to work confidently as ontologists and semantic engineers.
An educator and thought leader, I publish regularly on my Substack newsletter, Intentional Arrangement, where my writing frequently explores the relationship between semantic systems and AI.
Have you considered saving and organizing your Competency Questions in a graph?
https://jonathanvajda.github.io/OntoEagle/cq-ferret.html