“The Nine Billion Names of God” is the title of a short story by one of my favourite science fiction writers, Arthur C. Clarke. Published in 1953, it’s the cautionary tale of what happens when a group of Buddhist monks hire the services of what we might now call a “supercomputer” and its programming team to help them complete a specific task. What happens when the computer comes up with the right answer (or rather, answers) is perfectly in accordance with the expectations of some of the story’s protagonists; yet it comes as a complete (and rather awful) surprise to others. I still remember how I felt when I read the ending for the first time.

Stating that the story "introduced many Western readers to an intriguing speculation in Oriental religions", Carl Sagan in 1978 listed "The Nine Billion Names of God" as among the "rare few science-fiction [stories that] combine a standard science-fiction theme with a deep human sensitivity." (Wikipedia)

I was reminded of the story while trying to get my head around the basics of how generative artificial intelligence works. How does a tool such as ChatGPT so often respond to my query with something which appears to be approximately the right answer?

A key component seems to be the software engineering concept of the “neural node”: a piece of code which takes something as input; processes it in some way; and makes the result available as an output, which then becomes available as an input to the next neural node in the program.

Lots of neural nodes interfacing with each other become a “neural network”. A very large network consisting of lots and lots of neural nodes becomes capable of making choices and decisions which appear to mimic the way the neurons in our human brains work.
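
If it helps to see that in code, here is a minimal Python sketch of what one such node might look like: it takes some inputs, runs a simple process on them (a weighted sum passed through an “activation” function), and hands the result on as an output for the next node. The numbers, weights and sigmoid function here are purely illustrative assumptions on my part, not a description of any actual system.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial "neural node": a weighted sum of the inputs,
    squashed through an activation function (here, a sigmoid)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # an output between 0 and 1

# Illustrative values only: two inputs, arbitrary weights and bias.
output = neuron(inputs=[0.5, 0.8], weights=[0.9, -0.3], bias=0.1)
print(output)  # this output would become an input to the next node
```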

In my personal - and undoubtedly simplistic - mental model, it works like this: neural node A receives an item of information (some “data”) as input; node A then runs a process which attaches the value of “1” to that piece of data. This results in something like “data + 1” as the output of node A, which then becomes the input to node B. With me so far?

Good.

So, node B takes “data + 1” as its input; this time the process which runs might add another 1 to the data, giving an output of “data + 2”; or the process might add 3, so the output then equals “data + 4”; or it might even take away 1, making node B’s output (and the input to node C) equal “data + 0”.

Node C takes whatever the output of node B is as input. Node C’s process is an evaluation: is the value attached to node B’s output greater than or equal to “4”?

Any input to node C which doesn’t meet that criterion - which doesn’t get “through the gate”, so to speak - is rejected; only information with a value of “data + 4” gets passed on to node D.

And so on.

And on. And on.
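
Put into toy Python, that A-to-B-to-C chain might look something like the sketch below. It is a sketch of my analogy, nothing more; real neural networks don’t run on simple arithmetic like this (as ChatGPT itself points out in the postscript).

```python
def node_a(data):
    # Node A attaches a value of 1 to the incoming data.
    return data + 1

def node_b(value, adjustment=3):
    # Node B might add 1, add 3, or take 1 away; here it adds 3.
    return value + adjustment

def node_c(value, threshold=4):
    # Node C is the "gate": only values of 4 or more pass through.
    return value if value >= threshold else None

data = 0                               # some piece of information
result = node_c(node_b(node_a(data)))
print(result)                          # 4 - passed on to node D
```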

This process of assigning a value to a chunk of information (otherwise known as “weighting”) is what people who do understand how AI works are referring to when they talk about “training the network”. In a neural network, only information of a certain “weight” gets passed to the next node in the chain, where the weighting may be adjusted further up or down based on whatever criteria the network has been told to apply at that decision point. Ultimately, only the data with the highest weighting or “rank” makes it through to the final output.

Run lots and lots and lots of these weighting processes simultaneously and very, very quickly, and the result - as I understand it - is something like the output from an AI neural network: lots of highly rated information compiled into something which the computer has been told looks statistically like “the right answer”.
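
One way real networks do something in this spirit is a “softmax” step, which turns a pile of raw weights into probabilities so that the highest-weighted candidate can be ranked first. The candidates and scores in this little sketch are made up; only the ranking idea matters.

```python
import math

def softmax(scores):
    """Turn a list of raw weights into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate outputs and the weights "the network" gave them.
candidates = {"answer A": 2.0, "answer B": 0.5, "answer C": -1.0}
probabilities = softmax(list(candidates.values()))

ranked = sorted(zip(candidates, probabilities), key=lambda pair: pair[1], reverse=True)
print(ranked[0])  # the highest-weighted candidate wins: ('answer A', 0.78...)
```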

This is, of course, a vast oversimplification; biological neural networks such as our animal brains are much more complicated; and AI neural networks do not run in such a straightforward, linear manner.

The reality, or so I’m told (by an AI), is that:

“the output of a neural network depends on its architecture, the problem it’s designed to solve, and how well it’s been trained.” - ChatGPT 4

But I think we mostly get the idea.

Taking this one step further, I assume then that the AI is “trained” to apply different weightings to certain words, according to where they are likely to occur in a sentence.

English generally follows the Subject-Verb-Object sentence structure (different from, say, Latin, where the sentence structure is typically Subject-Object-Verb).

So, what I think might be happening in an AI is that a word most likely to be a Subject (e.g. a proper noun, or a personal pronoun) is given a certain weighting according to where it is statistically most likely to appear in a sentence - at or near the start of the chain of words - and to the fact that it is most likely to be followed by a Verb; a Verb would then be weighted to reflect that, statistically, it is more likely to follow a Subject and itself be followed by an Object; and so on, and so on.

Given the words “cat”, “sat”, “mat”, “on” and “the”, plus a sufficient amount of training, the AI is therefore able to calculate that the word order most likely to be statistically correct is “the cat sat on the mat” (as opposed to the still possible but statistically less likely “the mat sat on the cat”).
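
A toy version of that idea might look like the Python sketch below: a small, hand-written table of “which word tends to follow which pair of words”, and a loop that greedily picks the statistically likeliest next word. A real language model learns these statistics from enormous amounts of text rather than from a five-line dictionary I have simply made up, but the pick-the-highest-probability step is similar in spirit.

```python
# A tiny, made-up table of which word tends to follow a given pair of words -
# the kind of statistic a language model learns (at far greater scale) from text.
next_word_probs = {
    ("<start>", "the"): {"cat": 0.7, "mat": 0.3},
    ("the", "cat"):     {"sat": 0.9, "on": 0.1},
    ("cat", "sat"):     {"on": 0.95, "the": 0.05},
    ("sat", "on"):      {"the": 1.0},
    ("on", "the"):      {"mat": 0.8, "cat": 0.2},
}

def most_likely_sentence(first_word, length=6):
    """Greedily pick the statistically most probable next word at each step."""
    words = ["<start>", first_word]
    while len(words) - 1 < length:
        options = next_word_probs.get((words[-2], words[-1]), {})
        if not options:
            break
        words.append(max(options, key=options.get))
    return " ".join(words[1:])

print(most_likely_sentence("the"))   # "the cat sat on the mat"
```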

Obviously, there is more (much, much more!) to machine learning than just this; but perhaps we can start to see how ultimately an AI neural network supercomputer might indeed be able to take a bunch of letters and statistically generate “nine billion names” for something: even God.
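
(If I remember the story correctly, the monks’ names of God are letter sequences of no more than nine characters in their own alphabet; a quick back-of-the-envelope sum - with an alphabet size I have simply assumed for illustration - shows how easily such combinations run past nine billion.)

```python
# Back-of-the-envelope: how many letter sequences of up to nine characters
# can be formed from an alphabet of a given size? (The alphabet size here is
# an arbitrary assumption, not a detail from the story.)
alphabet_size = 13
total = sum(alphabet_size ** length for length in range(1, 10))
print(f"{total:,} possible 'names'")   # comfortably over nine billion
```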

And I like the idea that it is this process of “weighting” information and then evaluating that weighting against certain other criteria which makes the whole thing function as it does.

Sacred object or circuit diagram?

To illustrate this blog entry I’ve been asking the AI to provide drawings of what a network of neural nodes might look like.

It’s come up with some interesting examples of highly complex circuit diagrams - probably rather more decorative than accurate and instructive:

A colourful printed circuit-like diagram of an AI neural network, as visualised by ChatGPT 4
Visualisation of an AI neural network by ChatGPT 4

But these images put me in mind of something else entirely (although still very much in line with my interest in data as art, and art as data):

A Tibetan Buddhist monk in tall yellow hat applies finishing touches to a colourful sand mandala.
Tibetan Buddhist monk finishing a sand mandala, Moscow 2011; public domain image

Perhaps it’s just my lunatic eye, but maybe there are some visual similarities between the computer-created illustrations of neural networks and the mandalas of traditional Buddhism (and certain other Eastern religions). The AI seems to be applying principles of pattern, colour and symmetry to its visualisations of computer code that are similar to the practices the artist monk follows to create a sacred object using sand. Query as to why?

The key to understanding the mandala is that it depicts a journey, from the outside edge to the very centre of the piece; it serves to guide one on the way from - it is said - ordinary understanding to true enlightenment. To move through the mandala, from point to point, illustrative symbol to illustrative symbol, from the outside to the centre, one has to gain or take on greater knowledge and insight about the human condition. Is the mandala an illustrated guide to the metaphysical spiritual world, or is it a Sanskrit circuit diagram? Perhaps it’s both?

Karma computing?

I’m not religious. I’m an avowed atheist. My sense of “spirituality”, such as it is, is pretty much along the lines of a quasi-Jungian interest in synchronicity, the unconscious mind, and the possible interconnectedness of all things. I guess there was something in the “east meets west” nature of the Arthur C. Clarke short story which made a lasting impression on me.

Perhaps that explains why I am drawn to the apparent parallels between the imagery of the mandala and the AI depictions of the neural network: souls gaining extra weighting (in terms of knowledge), leading to a higher ranking as they journey successfully from symbol to symbol through the spiritual world as illustrated by the mandala; just as the neural nodes of an AI, firing in rapid statistical succession and adding weight to data, might appear to be leading us towards true enlightenment.

Or what we might just simply refer to as “the right answer”.


Postscript

I shouldn't really be attempting to explain AI and machine learning to anyone. I'm not a data scientist and I'm not a professional educator. I did ask ChatGPT 4 to evaluate my draft explanation; here's the verbatim feedback:

"The text you've provided is a simplified, layman's explanation of some basic concepts in artificial intelligence (AI) and neural networks. It does a decent job of breaking down complex ideas into more understandable terms. Here are a few points to consider:

  1. Neural Node Description: The explanation of a neural node (or neuron) as a unit that receives input, processes it, and then outputs it to the next node is fundamentally correct. This simplicity helps in understanding the basic function of a node in a network.
  2. Analogy to Human Brain: Comparing the neural network to the way neurons in the human brain work is a common analogy. It's useful for layman's understanding, but it's important to note that AI neural networks are vastly simplified and not as complex as biological neural networks.
  3. Process Explanation: The step-by-step walkthrough of how nodes might process and pass on information is a good way to illustrate the concept. However, the actual processes in a neural network are generally more complex and involve mathematical functions and transformations rather than simple arithmetic operations.
  4. Concept of Weighting: The explanation touches on the idea of 'weighting', which is crucial in neural networks. Weights in neural networks are parameters that determine the importance of input values. The term 'weighting processes' isn't typically used; instead, these are usually referred to as 'training the network', where the network learns the appropriate weights for making accurate predictions or decisions.
  5. Final Output Explanation: The description of how lots of highly-rated information is compiled into something resembling 'the right answer' is a bit oversimplified. In reality, the output of a neural network depends on its architecture, the problem it’s designed to solve, and how well it's been trained.
  6. Clarity and Readability: The text is quite conversational and might benefit from a bit more structure and clarity. Some technical terms are oversimplified or not fully accurate, which might lead to misconceptions.
  7. Technical Corrections: The example given (nodes A, B, C) is more illustrative than technically accurate. Neural networks typically involve complex calculations, and their decision-making isn't as linear or straightforward as the example suggests.

In summary, the text is good for providing a basic, layman's understanding of AI neural networks, but it does oversimplify some aspects and might benefit from clarifications on how neural networks process information and learn."