Legal Boundaries in Computers

The following is a conversation about the nature of legal boundaries between pieces of software. Of particular concern is how, or whether, the GPL, which is a particularly 'infectious' software license, can bleed across boundaries to 'infect' other software under a different license.

On Wed, Dec 05, 2001 at 04:53:17AM -0800, George was heard to remark:
> On Wed, Dec 05, 2001 at 12:01:06PM +0000, Alan Cox wrote:
> > > A "derivative work" is a work based upon one or more preexisting works,
> >                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > So, which of these covers merely utilizing another program?
> > 
> > Duh

I asked these same questions of an intellectual property lawyer nearly ten years ago. After some confusion, a fairly clear answer emerged: the boundary lines lie along address spaces (virtual or not). Since processes are in different address spaces, there is no contamination. Since libraries usually are not, there is.

A 'different address space' is usually "obvious" when it's a different CPU, connected by Ethernet. When the different CPU is connected by some other means, it's trickier. Let's be complete: connected by bus, e.g. by a memory bus (SMP on the same CEC, 'central electronics complex' aka motherboard, or two CPUs on the same chip), I/O bus (another CPU on an adapter card) or shared-mem Ethernet device (yes, there are weird cards that make it look like you have 'shared memory' with a remote CPU -- i.e. make it SMP-like with a CPU that's a kilometer away). In these cases, the criterion is:

> Well, it depends on your definition of 'based upon'.  

The normal definition is 'incorporates source code derived from'

> For example a
> proprietary xterm replacement might still run bash as a shell.  Does this
> constitute infringement?  

No, as long as the xterm didn't use source code from bash. Since bash runs in a different address space than xterm, it's not contaminated.

> Or if a proprietary software does:
> 
> system ("cp foo bar");
> 
> And it is run on a system where cp is a GPL'ed utility?

Same answer as above. system() is a C library routine that ends up making kernel calls (fork, exec, wait) to run the command in a new process. The kernel is considered to be in its own address space (since the kernel runs in supervisor state, you can try to argue this, but again, the common legal perception is that the kernel is distinct), so the kernel can be GPL'ed without contaminating the proprietary program. And cp is certainly in a different address space.

> However I'm not claiming knowledge of the Copyright Law.  (Mostly because I'm
> not able to fake it enough).  Also really I don't care what proprietary
> software must or must not do.  However it does pose a question for non-GPL
> compatible free software.

I am not a lawyer. I don't know how much of this 'address space' stuff is codified in court rulings, although I was led to believe a lot of it was settled in the 1960s/70s, with IBM as a party ...

Similar lines of argument can be used when talking about auto-generated code (compiled, vs. interpreted, m4 macro'ed, config filed, e.g. what glade does with the dynamic libglade XML files).

> Can such a software use an out-of-process GPL component?  

Yes. The boundary is not the process boundary, but the address space boundary. Process boundaries are murky, because one process can cause another process to do something it otherwise wouldn't have done. That is, any IPC, whether shmem, semaphore, pipes, tcpip, or even the 'system()' 'execve()' syscalls, etc. tends to 'feel like' a library call because the cause-effect relationship is the same as a library call. That is why you asked the questions about system ("cp src dest"); because this "feels like" a library call, and thus it "feels like" a violation of the GPL when cp is GPL'ed.

The cause-n-effect model is totally contaminating: your proprietary web browser can make a library-like cause-n-effect action on my GPL'ed web-server. Not only does it feel "library-like", but things like CORBA, RPC and XML-RPC help make it feel even more "library-like". This does not imply that XML-RPC servers need to be LGPL'ed to allow proprietary clients to talk to them.

Murkier 'process' questions arise in more obscure architectures. Think 'lisp machine': is 'eval' like a process? or like a call? Alternately, think of dataflow CPU architectures, where every instruction is scheduled as if it were a separate process, and one would not have/use something like the Linux scheduler to context switch or mediate time slices. This isn't entirely academic: modern out-of-order execution & register-coloring design borrows more or less directly from the earlier dataflow archs.

> If the component is
> not 'required', that is the user does some action to activate it.  Isn't this
> sort of like running it in a shell or something.  (Thus I could run a GPL
> program from a non-GPL compatible shell).  

Answered above.

> Also, does the law know the
> difference between in-process and out-of-process.  To me it doesn't seem like
> such a fundamental difference.  

It isn't, and that's the point.

> Imagine that you would write an emulator for
> Linux x86 ELF binaries.  Then if you load the binary into the emulator you're
> running it in-process.  The binary itself doesn't know.  So can you run GPL'ed
> code in a non-GPL compatible emulator?  

Yes. The emulator provides an address space to the emulated binary. The emulated binary cannot 'break out' of that address space to corrupt the emulator, or corrupt the address spaces of other emulated binaries that might also be running. (bugs don't count).

Note that e.g. the Transmeta chips are 'emulators' in a sense, and they're proprietary. Transmeta chips aren't the first semi-hardware emulators ever built. Rather totally weirdly, a lot of the mainframe 'hardware', what looks-n-feels like hardware to the programmer, is in fact a giant and complex emulator under the covers. They don't advertise this.

> So wouldn't running Linux under
> VMware be license infringement?  

No.

> I'm rambling, it's 4am.  

Let me ramble. There's an interesting case of GPL vs. LGPL when the hardware supports distinct address spaces for libraries. If the library is GPL'ed, but lives in its own address space, then I believe that it would not 'infect' the calling program.

If you stand on your head, the Linux kernel is kind-of like this. When you make a system call (e.g. 'write()') into the Linux kernel, it's not just an ordinary subroutine call into an ordinary library. There is some special funny assembler linkage code that causes an address space switch to occur. Once you are in the kernel, you are still in the same 'process': the Linux current process pointer has not changed, the FPU context hasn't been changed, other CPU context, such as misc masks and bits, hasn't changed, and the stack pointer is still using the same process stack. So you really are in the 'same process' in a certain definite sense. Unix system calls are thus a kind of 'library call' that changes the address space, but not the process.

The Linux kernel is GPL'ed, but I can run proprietary software on it. How can this be, when a Linux system call is essentially a library call? Answer: it's because the address space changed in the call linkage, when the 'svc' instruction caused a very special trap to occur.



Copyright (c) 2000, 2001 Linas Vepstas
pub  1024D/01045933 2001-02-01 Linas Vepstas (Labas!) 
PGP Key fingerprint = 8305 2521 6000 0B5E 8984  3F54 64A9 9A82 0104 5933