Journal of Knowledge Management Practice, December 2003

Natural Language And The Problem Of Modeling Knowledge

Steve McKinlay, School of Information Technology, Wellington Institute of Technology, Petone, New Zealand

ABSTRACT:

This paper examines the relationship between language and our attempts to codify and manage business knowledge. I argue that knowledge is inextricably bound to language and that our conceptualizations, which ultimately constitute knowledge, are inseparable from language. Furthermore, if we wish to develop a model of knowledge that can be represented computationally, we need to understand the extent to which language can be mapped to its empirical content and/or to sentences which correlate with equivalent sentences. Such a project, according to Quine and others, is doomed due to inherent indeterminacies in language. In light of Quine's indeterminacy argument I conclude that knowledge cannot be logically and systematically represented under current computational frameworks.


Introduction

“A traveller of good judgement may mistake his way, and be unawares led into a wrong track; and while the road is fair before him, he may go on without suspicion and be followed by others; but when it ends in a coal pit, it requires no great judgement to know he hath gone wrong, nor perhaps to find what misled him.” (Reid, 1764)

One might ponder the worth of being concerned with clarifying the logical nature of value statements. The kind of statements I refer to are those which we might consider business knowledge. This concern, I suggest, ought to be precisely the worry of those wishing to model knowledge, for it is an organization's ability to realize successful value judgments in conjunction with relevant empirical data that characterizes its knowledge or lack thereof. It follows that the success of decisions based on such value judgments ultimately determines the success of any organization and thus, I venture, of its knowledge management system.

What is evident to any executive is that “true” and “false” are held to be largely inapplicable to such judgments. The reality, nonetheless, is that many individuals wishing to model knowledge fail abysmally to recognize the implications of their endeavor. Knowledge, I argue, resists logical representation.

A skeptical stance in regard to the logical representation of knowledge is not new. In fact, the failed verificationist project of logical positivism stands as a sombre testament to those who tried. Even a cursory study of this historical project ought to be compulsory for all would-be knowledge modelers.

Efforts in the domain of knowledge modeling, I argue, focus on the wrong problems. There is a herd-like obsession with what particular kinds of knowledge might be modeled, and proponents feel obliged to reiterate Polanyi's tacit and explicit knowledge distinction as the starting point for any and all discussions.

“A distinction between tacit and explicit knowledge is critical to understanding the working mechanisms of knowledge management.” (Gupta & McDaniel, 2002)

Logical Syntax And Natural Language

I argue that the tacit/explicit distinction is not the salient issue for knowledge modelers. My concern lies with two other issues: first, a dichotomy initially discussed during developments in modern logic in the 1930s and 1940s, the distinction between logical syntax and natural language; and second, the issues arising out of the translation and reduction of natural language.

To date, typical attempts to model data, and consequently information, within computer systems involve the formalization of structures we characteristically know as entities or objects (the structural aspect), relationships between those objects (the integrity aspect) and a set of operators that enables users to access the information (the manipulative aspect) (Date, 2000, p. 58).

Such constructions, appropriately formalized and defined, represent data axiomatically within computer systems (usually databases). These structures are necessarily deductive and hence tautological in nature. That is, queries executed over such objects are in fact truth-valued expressions evaluating to either true or false.

With relational databases in particular there is an isomorphic relationship between the predicates that define entities and the meaning of the subsequent rows (or tuples) contained therein.  This is in point of fact a good thing; we want our databases to contain precisely defined information.  The worth of any truth-valued expression would be lost if the meaning of such data were not fixed. 
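
By way of illustration, the following is a minimal sketch in Python using the standard sqlite3 module; the Customer table, its columns and its rows are hypothetical examples, not drawn from any particular system. It shows the structural, integrity and manipulative aspects noted above, and the sense in which a query is a truth-valued expression over rows that each assert the defining predicate holds of their values.

    import sqlite3

    # An in-memory database. The table definition gives the structural aspect,
    # the declared constraints the integrity aspect, and the queries we run
    # over it the manipulative aspect.
    conn = sqlite3.connect(":memory:")

    # A hypothetical predicate Customer(id, name, region) with key and
    # NOT NULL constraints.
    conn.execute("""
        CREATE TABLE Customer (
            id     INTEGER PRIMARY KEY,
            name   TEXT NOT NULL,
            region TEXT NOT NULL
        )
    """)

    # Each inserted row asserts that the predicate is true of these values.
    conn.execute("INSERT INTO Customer VALUES (1, 'Smith', 'Wellington')")
    conn.execute("INSERT INTO Customer VALUES (2, 'Jones', 'Auckland')")

    # A query over the relation: within this closed system the question
    # "is there a customer in Wellington?" evaluates deductively to true or
    # false; nothing outside the stored tuples is consulted.
    answer = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM Customer WHERE region = 'Wellington')"
    ).fetchone()[0]
    print(bool(answer))  # True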

Furthermore, any connection between natural language and the predicates represented in a database, or the structures defined within a formal (syntactical) language, is at best contingent. We could easily define database table headers (predicates) as x, y, z or similar, and we often do. This is in keeping with the formality of the logic (although database administrators would advise against it). This approach differs fundamentally from the implicit appeals to intuition and experience present in the way in which we use and understand natural language. There is no one-to-one correspondence (except in a few rare cases) between natural language terms and the things or concepts those terms seek to denote.
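
To make the contingency point concrete, here is a further hypothetical sketch continuing the assumed example above: reduce the predicate and its attributes to bare letters and nothing the system can deduce changes; only the informal link to natural language is lost.

    import sqlite3

    conn = sqlite3.connect(":memory:")

    # The same kind of relation with its predicate and attributes reduced to
    # bare letters. Formally nothing is lost: constraints and the truth values
    # of queries behave exactly as before.
    conn.execute(
        "CREATE TABLE t (x INTEGER PRIMARY KEY, y TEXT NOT NULL, z TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO t VALUES (1, 'Smith', 'Wellington')")

    # The query still evaluates to true, but any connection between 'z' and
    # the natural language notion of a region now rests entirely on convention
    # held outside the system, in the minds of its users.
    answer = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM t WHERE z = 'Wellington')"
    ).fetchone()[0]
    print(bool(answer))  # True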

The point of difference ought to be clear. Natural language is semantically rich, contextual, instructive and often normative, and thus logically imperfect and unsystematic. The practical difficulties in deriving a set of systematic and logical rules of syntax for natural language, with respect to certain formalized languages, are profound, and the attempt is ultimately futile.

One way to understand such difficulties is to consider what distinctive traits are shared by the episodes appropriate to the observation referenced by a natural language statement, what Quine terms an “observation sentence”. It might appear prima facie that such problems could be overcome by appeal to synonymy or a natural extension of terms, but I think the problem is deeper (albeit outside the scope of this paper) than the mere translation or mapping of natural language; it relates to the acquisition of behavioral dispositions and the translation of those dispositions into some kind of formal language.

My interpretation, then, with regard to “knowledge management” and associated modeling efforts points toward an acute confusion between modeling, on the one hand, statements, propositions or entities which exist within a formal system and are thus logically derivable a priori (Kant’s “theoretical statements”) and, on the other, synthetic value statements (comprising knowledge, or the observation statements discussed above) which rely largely upon experience for their confirmation or disconfirmation.

This discussion is not intended to support the idea that natural language has any kind of “mystical quality” absent from formalized languages.  The point is to clarify the usage of formalized languages in contrast to natural language. 

Knowledge And Natural Language

At this point the reader may well ponder what strength of connection between knowledge and natural language I have assumed. Quine observes, “Each of us learns his language from other people, through observable mouthing of words under conspicuously intersubjective circumstances.” (1960, p. 1). This develops as an ordinary language of physical things in relation to non-verbal stimulation. The development of conceptualizations is thus inseparable from language, since conceptualizations relate to other concepts (expressed linguistically) and/or to our ordinary language of physical things.

If we wish to improve our understanding of our conceptualizations of more complex matters, it does no good to attempt to impose some sort of reduction on the language within which they are expressed; such understanding comes from the clarification of causal connections between physical things and other matters (ibid., p. 3).

It is through language that we develop our knowledge of the world. All manner of complex, intricately related conceptual talk may well bear little relation to non-verbal stimulation; however, any empirical content of such talk must be sheeted back to our sense-interaction with our world.

If, then, we decide to develop some kind of knowledge model, we may wish to consider the extent to which sentences in the language can be mapped to empirical content or correlated with equivalent sentences.

Quine discusses such problems in detail. In his 1960 publication Word and Object, Quine outlines what is termed the indeterminacy argument, summarized succinctly by William Alston (1986): no one means anything determinate by any of his terms or by any of his non-observation sentences. Given that all language is constituted by the speech activity of its users, it follows that no term or non-observation sentence in a language means anything determinate.

Although on a first reading seemingly extreme, these claims ought not to surprise anyone. Semantic indeterminacy can be recognized (in its most simplistic form) as vagueness of degree: how many inhabitants does it take for a ‘village’ to be called a ‘town’, or a ‘town’ a ‘city’? Indeterminacy in meaning is well documented by epistemologists and logicians alike: Wittgenstein’s “family resemblance” (1953), Waismann’s “open texture” (1945) and what Michael Dummett (1974) termed the “inextricability thesis”, “the view that there is no sharp line between what belongs to the meaning of a word and what belongs to widely shared and firmly held beliefs about what the word denotes” (Alston, 1986).

Implications For Knowledge Management

If there is no objectively correct translation of individual terms, then the complexities associated with codifying multifaceted business value statements, and their relationship with particular users' experience, intuition or states of belief, are profound.

The point to be made is the significance of the distinction between the kind of language and associated behaviors that impart knowledge, and the futility of attempting to codify them in tautological formal languages that admit no appeal to experience or intuition.

One might question where the evidence is, in the first place, that such assumptions exist among those concerned with modeling knowledge. Unfortunately, my reply would be that examples abound.

Engle, in a recent article, demonstrates such a confusion, regarding the aim of knowledge modelers (amongst other like-minded individuals) to be the “modeling of the complete subject-knowledge of business users and departments at all corporate levels” (Engle, 2003).

I take it that by use of the term “modeling” Engle refers to the formal process of constructing a complete and systematic methodology for representing his “complete subject-knowledge”. If not, there seems to be no way of meaningfully talking about different kinds of “subject-knowledge” in regard to the model; if so, the complications generated by this aspiration are unequivocal.

Tom Finneran imagines “Knowledge is collected from all existing sources including people, systems, data stores, file cabinets and desktops. All knowledge of value is stored in the organizational knowledge repository. For virtual teams, this knowledge would be immediately conveyed to those people and systems that could use it. The right knowledge will go to the right person or system at the right time. Current knowledge can be retrieved from the system at any time in the future.” (Finneran, 2000)

The conflation between the principles of data management, under which data captured within the confines of a logically complete formalized structure (namely a relational database) can be interpreted as information, and the nature of knowledge is clearly evident here.

Barclay and Murray, knowledge management consultants, add further fuel to the fire: “Knowledge management often encompasses identifying and mapping intellectual assets within the organization, generating new knowledge for competitive advantage within the organization, making vast amounts of corporate information accessible, sharing of best practices, and technology that enables all of the above — including groupware and intranets.” (Barclay & Murray, 1997)

Conceptions of, and attempts to model, knowledge within current computational frameworks, fraught as they are with requirements for formal codification and the associated applied formalisms, seem doomed. If language does not lend itself to determinacy, and knowledge is language dependent, then knowledge is in the mind of the beholder.

In summary:

•      Data, and thus the information appropriately inferred and interpreted from it, can be logically represented within formal systems (such as relational databases).

•      Data represented in such a way eliminates appeals to intuition and experience in its interpretation; its computational representation sits within a logically closed system.

•      Knowledge inherently involves appeals to intuition and experience for its confirmation or disconfirmation.

•      Knowledge is reliant upon natural language for its expression.

•      Natural language, in light of Quine's indeterminacy arguments, cannot be systematically and logically represented.

•      Thus, knowledge cannot be logically and systematically represented.

The upshot of this analysis is that we ought to abandon our typical conceptions of knowledge management, which rely upon existing data- and information-oriented architectures; attempts to model knowledge within such systems are futile. Alternatively, we could drop the hype and be more honest about what we are presently doing, which is simply another form of information management. The best strategy for knowledge management I can think of is to pay your experts a higher salary than your competitors can.

References

Alston, W., 1986, Quine on Meaning, in The Philosophy of W.V. Quine, expanded edition, Hahn and Schilpp (eds.), Open Court, London

Barclay, R.O., Murray, P.C., 1997, What is Knowledge Management, http://www.media-access.com/whatis.html

Date, C., 2000, An Introduction to Database Systems, Addison Wesley Longman Inc., New York

Dummett, M., 1974, The Significance of Quine's Indeterminacy Thesis, Synthese 27, 1974, pp. 351-97. Reprinted in Truth and Other Enigmas, Duckworth, London, 1978, pp. 375-419.

Engle, P., 2003, Data Modeling, Left and Right, The Data Administration Newsletter (TDAN.com), http://www.tdan.com/i024hy03.htm

Finneran, T., 2000, A Component Based Knowledge Management System, The Data Administration Newsletter (TDAN.com), http://www.tdan.com/i009hy04.htm

Gupta, A., McDaniel, J., 2002, Creating Competitive Advantage By Effectively Managing Knowledge: A Framework For Knowledge Management, Journal of Knowledge Management Practice, Volume 3, 2002. http://www.tlainc.com/articl39.htm

Quine, W.V.O., 1960, Word and Object, The MIT Press, Cambridge.

Reid, T., 1764, An Inquiry into the Human Mind on the Principles of Common Sense, Brookes, D.R. (ed.), Edinburgh University Press, 2000.

Waismann, F., 1945, Verifiability, Proceedings of the Aristotelian Society, supplementary volume 19. Reprinted in G.H.R. Parkinson (ed.), The Theory of Meaning (1968)

Wittgenstein, L., 1953, Philosophical Investigations (Anscombe, G.E.M., trans.), Basil Blackwell, Oxford.


Contact the Author:

Steve McKinlay, BA(Hons), B.Bus., Lecturer – Computing, School of Information Technology, Wellington Institute of Technology, Petone, New Zealand

Tel: (04) 9202 691; Cell (0274) 405 560; Email: steve.mckinlay@weltec.ac.nz; Web: http://www.weltec.ac.nz/