I don’t have a formal CS background and for a long time the whole idea of Turing machines seemed a bit abstract to me, having little to do with actual computers or programming. At some point things ‘clicked’ for me, but not because it was clearly explained somewhere, so I thought I’d give it a go and explain the relation between Turing machines and actual computers and programming.

If this is obvious to you, you are not the target audience of this post. Perhaps there is no target audience and I’m basically explaining to myself what’s obvious to everyone else. Nevertheless, here goes:

### The relation between Turing machines, computers and programming

One of the first things I was confused about after studying Turing machines: are they computers or are they programs? On the one hand, it’s clearly a *machine,* with a pretty ‘physical’ description. That seems to come closest to a computer. On the other hand, it has this *transition function*, which encodes the rules according to which the machine operates. It’s unclear where it’s stored, how it is represented and how it performs its operations, but it definitely seems more like a program than like a computer. So which is it?

The answer is: both. In the end almost every machine consists of both the physical parts and a ‘program’ that guides its operation. Even if the program consists only of “pressing this button turns the machine on”, then logically and physically there is a state transition. I think the easiest way to look at it is from the other side: an actual CPU with its firmware is a Turing machine, where the physical CPU is the machine and the firmware is the transition function.

A major difference between automata and Turing machines is that a Turing machine can generate output. A bunch of proofs around Turing machines rely on the fact that the output of a Turing machine can be (the transition function for) another Turing machine and that’s what made it click for me. That initially seems like just a neat trick, until I realized that that is exactly what’s going on in practice: a computer contains layers upon layers of Turing machines, defining each other, defining programs for each other and rewriting input for each other.

### An example

The Ruby interpreter is a Turing machine that accepts valid Ruby programs as input and outputs bytecode that is input for its internal bytecode interpreter, which is another Turing machine, which outputs machine code for the CPU microcode Turing machine which finally outputs the instructions for the hardware Turing machine, the CPU.

At the same time, the representation of the Ruby interpreter Turing machine on disk is itself a direct input for the CPU microcode Turing machine (I’m hand waving some details here). But it could also have been an input for, e.g. the Java virtual machine bytecode interpreter. Which is of course another Turing machine.