The distinctions made in my previous article seem to have left some of my readers perplexed because it is not quite clear what they are good for. Well, they are the first step in a line of thought that is difficult to fit into a single blog article. Unlike scientific articles or books, blog posts had better be short and more or less self-contained, or most people will not read them. Complex matters have to be divided into several articles.
The problem is then that the motivation behind the parts might not become clear until all the pieces of the puzzle are on the table. So let me give a road map of the whole thing for better orientation. This road-map article itself is going to be rather long, but subsequent articles should generally be much shorter. Don’t worry if some of the details do not become clear now. From such a bird’s-eye view, a lot of the detail cannot be seen, so that is normal. On the other hand, the significance of future articles providing the details is hard to see without such an overview. I hope that the whole thing will become clearer in hindsight.
I have taken important parts of this line of thought out of some (partially unpublished) articles by Kurt Ammon and I have tried here to map his rather abstract mathematical notions back into concepts describing cognitive processes as we know them. Any errors or misconceptions I might have made in this “mapping” are my own, so partially, this is an interpretation of Kurt’s work.
Each of the points below will have to be expanded into one or several articles. It is possible that during the project, I am going to revise this “road map” as well. I hope that Kurt Ammon is going to publish his article or a revised version of it soon. However, I expect it to be quite dense and abstract, so what I am trying here is to build a bridge into those abstract thoughts to make the basic insights contained in them accessible.
The aim of the project is to gain an understanding of some aspects of cognition. I want to make a preliminary remark here. I have been talking about “data” and “programs” as if human cognition could be described in terms of computer science concepts. The validity of such an approach can indeed be questioned. Data in a computer is discrete, or “digital”. The information flowing through our brains might well be something continuous. Neural information seems to be encoded in the frequency or density of impulses, and this frequency is not restricted to a limited number of discrete values. Likewise, programs running in a computer are deterministic, and this determinism is connected to the discrete, non-continuous nature of the data. Biological neural networks might instead operate in a non-deterministic way at times. Let us keep these considerations at the back of our minds for the time being. I intend to come back to them towards the end of the whole investigation, in a separate article. Up to that point, we may pretend that speaking about “data” and “programs” is valid, or we may view the whole argument as valid only for artificial intelligence based on digital computers, leaving its relevance for human and animal cognition open.
Another preliminary remark: this is not a complete description of how cognitive systems work. Important aspects are left out entirely. For example, we have goals, drives and emotions that give our cognition a direction: there is something like a teleology. Moreover, there must be processes of selection by which fitting knowledge is retained while knowledge that does not prove itself is “weeded out”. I am not looking into these aspects here (yet). What I am looking into here is the process of creativity. To compare this with biology, creativity corresponds to processes of mutation, while the “teleological” aspects and the “fitting” of knowledge to reality correspond to processes of selection.
The whole line of argument may be developed according to the following “road map”:
- Knowledge can be viewed as consisting of programs. Declarative knowledge can also be viewed as programs, as soon as it is applied in some cognitive process. I have written about this point before (see Knowing How and Knowing That).
- Computer programs might go into infinite loops. They “hang” and “do not come back”. This does not happen in cognitive processes. If they do not yield a result after some time, they will be interrupted. So cognitive programs produce an output for every input (even if that output is sometimes just the information that they did not come up with a solution, something akin to an error message). The programs our knowledge consists of may therefore be viewed as programs computing “total functions”, i.e. functions that yield an output for every input.
- Learning can be thought of as executing a program that produces new knowledge. We put in any data, including other programs, and get out new programs.
- Now it can be shown mathematically that the set of all programs calculating total functions is not “Turing-enumerable” (the usual textbook term is “recursively enumerable”). This means that every program that produces only such programs as its output is incomplete. There are always programs computing total functions that it cannot produce. If we modify the program to include those, the result is in turn incomplete.
- For our purposes this means that every learning algorithm is limited. If we define general intelligence as the ability of a system to generate arbitrary knowledge, this result means that a generally intelligent system cannot be an algorithm. There can be a process that will always produce something new, but if we make this process part of the learning algorithm, the resulting algorithm is incomplete again. The system might contain some creative part that extends it but this part must not be under the control of the algorithm itself.
- Likewise, it can be shown that the set of programs producing other programs (I have called them “meta-programs” or “learning-programs”) is also not Turing-enumerable. There is no program that can produce all of them (if there were, it would be possible to construct a program that produces all programs computing total functions, and we know already that that is not possible).
- However, it has been shown that the “meta-programs”, i.e. the programs producing other programs (or, more technically, the “Gödel numbers” of those programs), form what is known as a “productive set”. They cannot be enumerated by any program, but it is possible to have a program that will produce a new one if you give it (a description of) the known ones as an input. This program is known as a “productive function”. So again, there is a process by which we can extend a set of known programs. We could try to integrate this into the algorithm, but the resulting program would be incomplete. Instead, if we look at the cognitive system as an algorithm, the process extending it must not be under the control of the algorithm itself. There must be an independent second input or trigger to the system.
- So what I am planning to do here is the following. I am first mapping cognitive processes onto mathematical concepts. This is an abstraction whose purpose is to make results from theoretical computer science applicable. Cognitive processes are here viewed as the execution of programs. Knowledge is viewed as consisting of programs. Programs are viewed as algorithms that can be modeled by the notions developed in theoretical computer science for this purpose, e.g. “Turing machines” (or any equivalent formalism, like “Lambda-calculus”, “recursive functions” etc.). Data of any kind will be represented as natural numbers. Programs can be represented by numbers as well (known as the programs’ “Gödel numbers”). So what we will be looking at are computable total functions of natural numbers. With these abstractions, we conceptually “translate” cognitive processes into a form that allows us to apply basic results from theoretical computer science, especially the theory of computability. By retranslating these results into something nearer to the cognitive processes we know, it can then be shown that general intelligence cannot be an algorithm although its components might be describable as algorithms. Each algorithm is limited and each learning algorithm that produces other algorithms is limited as well.
- In previous articles (e.g. here) I have argued that a cognitive system can be extended by integrating new information from the environment into its knowledge, information that could not have been derived inside the previous stage of the system. However, I now think that this is an incomplete view. The reason is that in order to integrate new information, you need some process to do so, some process that will apply the new information as knowledge or turn it into a program. However, each set of learning programs is limited, so even if the system integrates new information, there would be areas into which it could not develop as long as it is an algorithm.
- Creative systems, however, still seem possible. They must contain separate processes, like a productive function, by which the system can be extended or modified. The resulting creative system would not be an algorithm since the application of the extension mechanism cannot be controlled by the algorithm but must be separately triggered by physical events external to it. So creative systems as physical systems are possible but they are not algorithms, although their components can be described as algorithms.
- It may be that the non-discreteness and non-determinism of biological neural networks are sufficient to provide this kind of creativity in humans and animals.
- Biological evolution may be viewed as a creative process of this kind as well. While some modification or re-editing of genetic information can happen under the control of the genetic information itself (e.g. processes of crossing-over between chromosomes), uncontrolled mutations triggered by physical events not under the organism’s control are required for evolution to find new ways.
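To make the diagonal argument behind the road map a little more tangible, here is a minimal Python sketch. It is only an illustration under assumptions, not a proof and not taken from Kurt Ammon's articles: the names `enumerate_total_functions`, `nth`, `diagonal` and `extend` are hypothetical, and the toy "learning program" simply lists the total functions n -> n*k + k. The point is that for any such enumeration one can construct a total function that differs from every listed function, and that adding it to the list just produces a new list with the same gap.

```python
from itertools import count
from typing import Callable, Iterator

TotalFn = Callable[[int], int]
Enumerator = Callable[[], Iterator[TotalFn]]


def enumerate_total_functions() -> Iterator[TotalFn]:
    """Hypothetical 'learning program': lists the total functions
    f_k(n) = n * k + k for k = 0, 1, 2, ... (an assumption for illustration)."""
    for k in count():
        # k=k freezes the current value of k inside each lambda
        yield lambda n, k=k: n * k + k


def nth(enumerator: Enumerator, i: int) -> TotalFn:
    """Return the i-th function (counting from 0) produced by the enumerator."""
    for index, fn in enumerate(enumerator()):
        if index == i:
            return fn
    raise IndexError("enumerator ended before index %d" % i)


def diagonal(enumerator: Enumerator) -> TotalFn:
    """Build a total function that differs from the n-th listed function
    at input n, so it cannot appear anywhere in the enumeration."""
    def d(n: int) -> int:
        return nth(enumerator, n)(n) + 1
    return d


def extend(enumerator: Enumerator, new_fn: TotalFn) -> Enumerator:
    """Return a new enumerator that lists new_fn first, then everything
    the old enumerator listed."""
    def extended() -> Iterator[TotalFn]:
        yield new_fn
        yield from enumerator()
    return extended


if __name__ == "__main__":
    d = diagonal(enumerate_total_functions)
    # d disagrees with the i-th listed function at input i
    print(all(d(i) != nth(enumerate_total_functions, i)(i) for i in range(10)))

    # Adding d to the list does not help: the extended list has its own diagonal
    bigger = extend(enumerate_total_functions, d)
    d2 = diagonal(bigger)
    print(all(d2(i) != nth(bigger, i)(i) for i in range(10)))
```

Note that `extend` is called from outside the enumerator here. If we built that step into the enumerator, we would just have another enumerator, and `diagonal` would apply to it again.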
In order to make writing those articles easier, I might add articles to discuss certain mathematical concepts or methods. I plan not to go into formal details or use formal language (at least I will try) since I want this to be understandable without much of a background in mathematics, formal logic, programming or other formal disciplines. When explaining proofs, I will also not do so in a formal way. I think that all of these ideas can be understood without formalism (readers interested in the formal details might consult any textbook on theoretical computer science, computability theory etc.). So there might be articles on topics like functions, total functions, Gödel numbers, diagonalization and Cantor’s “diagonal method”, algorithms, Turing machines, computability and decidability, productive sets and productive functions, etc. I can then just use these ideas and concepts in other articles without explaining them. So the approach is to work on the language first, to provide the conceptual background that then makes explaining and understanding the whole line of thought simple.
Most of the articles belonging to this project will not be re-published on the asifoscope, although I might at times put some summary there. So if you are interested in following this up, you might consider becoming a follower of the Creativistic Philosophy blog.
Since I don’t have much time and there are several other projects I am working on, the whole thing might take a lot of time. I do not have any schedule yet.
(The picture is from https://commons.wikimedia.org/wiki/File:Puzzle-piece.jpg)