SAN FRANCISCO (09/19/2003) - At SCO Forum 2003 last month, The SCO Group Inc. presented its first specific example of the Unix code that has allegedly found its way into Linux. The sample occupied only a couple of slides in a much longer presentation, but a forum attendee who works at the German magazine Heise snapped photos of the slides. The Linux community, hungrily awaiting any presentation of proof from SCO, leaped on the evidence and tore it to shreds. The exercise was no doubt politically satisfying, but it does little to counter SCO's position.
SCO continues to keep the Unix System V source code under wraps, a standard tactic in many intellectual property disputes. After all, you wouldn't prove your secrets have been stolen by revealing them to the public. The Unix code on the slides is limited to comments preceding the case in point -- the kernel function "atealloc." The second slide shows a portion of the Linux code for that function, which SCO claims is substantively identical to its Unix code.
Advocates have challenged the lineage of the Linux code in question, claiming much of it dates back to early editions of Unix that had been placed in the public domain. If the presented example can be traced to a public release of Unix, SCO's point will be invalidated.
The entire System V source code tree is available to hundreds of developers who work for licensees. They can't publish SCO's code, but they can validate or disprove SCO's claim. Eric Raymond, a thoughtful and outspoken member of the open source power structure, compared System V Release 4 with Linux 2.4. He concluded that, yes, the atealloc functions are effectively identical. Waffling aside, it appears that Linux is, in this isolated case, busted.
Does this mean that SCO can expect a blank check from IBM Corp. and can demand license payments from commercial users of Linux? In this case, there is no proof the code was leaked by IBM, the central target of SCO's suit. Raymond posits that the leak may have occurred when a Linux contributor copied code from Unix sources that he or she thought were public. Assertions about the leak's innocent beginnings may not sway a judge. The Linux kernel function SCO presented as evidence was completely rewritten for Version 2.6 of Linux. Raymond speculates that the code was probably reworked because it was ugly -- that is, inefficient or difficult to understand.
Raymond's analysis of the SCO slides was completed within 48 hours of the Heise publication. The many engineers who have access to both Linux and System V source code are certainly doing diffs (textual difference mapping) between the file trees to find out where matches exist.
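For readers unfamiliar with diffs, the idea can be sketched in a few lines of Python using the standard library's difflib module. This is only an illustration of line-by-line matching between two files, not the tooling any licensee is known to use; the function name matching_runs, the min_lines threshold, and the two short sample files are all hypothetical.

```python
import difflib

def matching_runs(a_lines, b_lines, min_lines=3):
    """Return (a_start, b_start, length) for runs of identical lines.

    Runs shorter than min_lines are ignored, since one or two
    identical lines (a lone brace, say) prove very little.
    """
    matcher = difflib.SequenceMatcher(None, a_lines, b_lines, autojunk=False)
    return [
        (m.a, m.b, m.size)
        for m in matcher.get_matching_blocks()
        if m.size >= min_lines
    ]

# Hypothetical stand-ins for one file from each source tree.
file_a = ["int x;", "x = alloc(n);", "if (!x)", "    return 0;", "return x;"]
file_b = ["/* header */", "x = alloc(n);", "if (!x)", "    return 0;", "done();"]

runs = matching_runs(file_a, file_b)
for a_start, b_start, length in runs:
    print(f"{length} matching lines: A line {a_start + 1}, B line {b_start + 1}")
```

A pairwise comparison like this is exact but slow when every file in one tree must be checked against every file in another, which is why a faster approach is attractive for whole source trees.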
Raymond has taken on this work himself, using a rapid text-differencing technique that's used to manage large software projects. If Raymond's so-called "comparator" works well, it will find all the bits of Unix that have leaked into Linux. He can also put the comparator into reverse and uncover any Linux or BSD code that SCO has lifted without attribution.
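The column does not describe how Raymond's tool works internally. One common way to compare large trees quickly, offered here purely as an assumption and not as his implementation, is to hash short overlapping runs of normalized lines ("shingles") and look for hashes that appear in both trees. The names normalize, shingles, and common_shingles, the three-line window, and the sample trees below are all hypothetical.

```python
import hashlib
from collections import defaultdict

def normalize(line):
    """Strip all whitespace so that reindented code still matches."""
    return "".join(line.split())

def shingles(lines, size=3):
    """Yield a SHA-1 hash for each run of `size` consecutive non-blank lines."""
    cleaned = [normalize(line) for line in lines]
    cleaned = [line for line in cleaned if line]  # drop blank lines
    for i in range(len(cleaned) - size + 1):
        chunk = "\n".join(cleaned[i:i + size])
        yield hashlib.sha1(chunk.encode("utf-8")).hexdigest()

def common_shingles(tree_a, tree_b, size=3):
    """Return sorted (file_in_a, file_in_b) pairs that share a shingle.

    Each tree is a dict mapping a file name to its list of lines.
    """
    index = defaultdict(set)
    for name, lines in tree_a.items():
        for digest in shingles(lines, size):
            index[digest].add(name)
    hits = set()
    for name, lines in tree_b.items():
        for digest in shingles(lines, size):
            for other in index.get(digest, ()):
                hits.add((other, name))
    return sorted(hits)

# Hypothetical miniature "trees" standing in for two code bases.
tree_a = {"a/alloc.c": ["x = alloc(n);", "if (!x)", "    return 0;"]}
tree_b = {
    "b/mem.c": ["x=alloc(n);", "", "if (!x)", "return 0;"],
    "b/other.c": ["foo();", "bar();", "baz();"],
}
pairs = common_shingles(tree_a, tree_b)
```

Because the index is built from one tree and probed with the other, swapping the arguments runs the comparison in the opposite direction. A hash match only flags a candidate; someone still has to read both files to judge whether the overlap is meaningful.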
SCO claims Linux contains millions of lines of Unix code. IBM, Linux advocates, and every interested System V licensee know, or can learn, just how much matching code there is. While the cry goes out to SCO to "show us the code," those best positioned to help Linux have already seen everything SCO will use against them. The ball is now in IBM's court. Its AIX source code can prove or disprove SCO's claim that IBM was the major vector for the leaks.