Article > SOAP is stupid
Description :: Universal communication isn't here yet.

I figured I'd get the acronyms out of the way. All of these refer to standards (sometimes vendor-based, sometimes not) that let programs make function calls into other programs, sometimes across a network, and sometimes into code written in an entirely different language, by way of some basic common language. The concept is a useful one, but I have a problem with the advertising behind at least SOAP (if not all of them).

My problem is that SOAP is described as finally being the truly universal, do-anything protocol that will make everything better. It's a silver bullet. It's a lie. SOAP usually travels over port 80, because it's typically just XML data sent in the form of HTTP requests. As many businesses host their own webserver, and because they allow their employees to use the internet from work, port 80 (used to send and receive webpages and webpage requests) is open in both directions. SOAP is "good" because, unlike the other standards, businesses allow for it by default (and unintentionally). Secondly, SOAP uses XML, a file format intended to be somewhat human-readable. It can technically send any sort of data, though you'll likely only see plain text in the files. Like the rest of the standards, SOAP is limited to sending certain basic datatypes: numbers, text, lists of items, dictionaries (maps) of items, and structures (a structure being a single item made up of several named and typed fields, fill-in-the-blank style).
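
To make the "it's just XML in an HTTP request" point concrete, here is a minimal sketch in Python of what a SOAP message amounts to. The envelope namespace is the SOAP 1.1 one; the GetPrice operation, the Symbol field, and the example.com namespace are made up for illustration, and nothing is actually sent over the wire.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/stock"  # hypothetical application namespace


def build_request(symbol: str) -> bytes:
    """Wrap a hypothetical GetPrice call in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}GetPrice")
    ET.SubElement(call, f"{{{APP_NS}}}Symbol").text = symbol
    # The result is plain XML text; a client would put it in an HTTP POST body.
    return ET.tostring(envelope, encoding="utf-8")


def read_symbol(message: bytes) -> str:
    """Pull the Symbol field back out on the receiving side."""
    root = ET.fromstring(message)
    path = f"{{{SOAP_NS}}}Body/{{{APP_NS}}}GetPrice/{{{APP_NS}}}Symbol"
    return root.find(path).text


request = build_request("ACME")
symbol = read_symbol(request)
```

Note that the receiver only gets the symbol back out because it already knows to look for GetPrice and Symbol. The envelope carries the data; it doesn't explain it.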

So why isn't it universal? Because, like the other standards, it's just a transport protocol. It isn't a translation box, it isn't an anything-to-anything converter. It's a taxi. You'll hear that "IIOP can't talk to COM" -- but SOAP is the same way. The client and the server both have to know that SOAP is the protocol being used. But it's worse than that.

XML in general has been used recently as a universal file format. Why? Well, it's text-based, and there are plenty of libraries in most languages for it. It solves the very first problem in handling files: making some sort of sense of all those bits. But it stops there -- although XML can bring along a description of the document (what's allowed, where, and so forth) it doesn't bring along "meaning."

Just because two businesses can input/output XML (and particularly SOAP) messages doesn't mean they can make any use of the data being received. Most businesses don't set up their databases in exactly the same way. When they send chunks of those databases to multiple customers, they either have to know exactly what each customer can understand, or they just send what they'd understand themselves, and let the customers figure it out. If my company expects to receive a document with exactly one name on it, and your company sends me a document with several names on it, then the transport protocol, whether CORBA or SOAP or any other, hasn't solved the basic problem of understanding each other.
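
The one-name-versus-several-names case can be sketched in a few lines of Python. The order and name element names and both sample documents are invented for the example; the point is only that both documents parse cleanly as XML, and the receiver still can't use one of them.

```python
import xml.etree.ElementTree as ET


def read_customer_name(document: str) -> str:
    """My (hypothetical) company's reader: it expects exactly one <name>."""
    root = ET.fromstring(document)  # the parsing itself succeeds either way
    names = root.findall("name")
    if len(names) != 1:
        raise ValueError(f"expected exactly one name, got {len(names)}")
    return names[0].text


mine = "<order><name>Alice Smith</name></order>"
yours = "<order><name>Alice Smith</name><name>Bob Jones</name></order>"

name = read_customer_name(mine)

try:
    read_customer_name(yours)
    understood = True
except ValueError:
    # Well-formed XML, delivered intact, and still useless to the receiver.
    understood = False
```

Which of the two names is the customer? The XML can't say. Somebody from each company still has to sit down and agree on that.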

Why have we gone so long without this? When two businesses decided to send data to each other, they would generally get together, look at the data, and decide what to send. They'd then hand the specifications to their programmers, who would hash out some means of getting the data across. It's a tedious job, admittedly, to design a file format each time something new comes up. But the most important work was already done: they knew what was being sent, regardless of how it was being sent. After that, it really didn't matter if they sent the file in some terribly proprietary manner, or if they wrote it down on paper and sent it, or yelled it at each other, or used SOAP -- the "what" of the matter was decided.

Transport protocols are only a "how." To say that SOAP allows businesses to instantly network together and understand each other is like saying that thanks to TCP/IP (the basic packet-routing protocol of the internet), suddenly an FTP server and a chat program can talk meaningfully. They can send packets to each other, sure. But after the first few, the rest are bound to be mostly error messages, which the receiver likely won't even understand. "Speech" is a wonderful means of communication, but just because it's standard doesn't mean that an American and a Russian can suddenly talk to each other usefully.

So, why is SOAP popular? Because XML is. Why is XML popular? Aside from the hype, it's text-based. Quite a few scripting languages, favored for quick hacks and small jobs, administration work, or other input/output stuff, don't work very well with arbitrary binary data. They don't do pointers, buffers, and all the junk necessary to do it well. But text, they understand. It's a breeze to write code using pre-written XML libraries to handle new file formats. And that's "good," for a lot of people.

But it's not the silver bullet they tell you. Remember that part about "writing code," just above? That still has to be done. Establishing some standard as to the contents of the message? That still has to be done. Transferring data to and from a database that isn't part of the standard? That has to be done too.

And did I mention that XML likely uses twice as much bandwidth as a customized solution would, simply because it repeats itself constantly with "tags" to indicate what's what? So that humans, who will likely never see the messages, can read them slightly more easily? Yeah, it does that too.
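
For a rough feel of the overhead, here is a small Python sketch that encodes the same records both ways. The sensor readings, the element names, and the fixed binary layout are all made up for the comparison, and the exact ratio will depend heavily on the data and on how long the tag names are.

```python
import struct
import xml.etree.ElementTree as ET

# Made-up (sensor id, value) pairs, purely for illustration.
readings = [(1, 20.5), (2, 21.0), (3, 19.75)]


def as_xml(rows) -> bytes:
    """Encode the rows as tagged XML text."""
    root = ET.Element("readings")
    for sensor_id, value in rows:
        row = ET.SubElement(root, "reading")
        ET.SubElement(row, "sensor").text = str(sensor_id)
        ET.SubElement(row, "value").text = str(value)
    return ET.tostring(root)


def as_binary(rows) -> bytes:
    """Encode the rows in a fixed layout agreed on in advance:
    unsigned 32-bit id plus 64-bit float, network byte order, 12 bytes per row."""
    return b"".join(struct.pack("!Id", sensor_id, value) for sensor_id, value in rows)


xml_size = len(as_xml(readings))
binary_size = len(as_binary(readings))
overhead = xml_size / binary_size
```

In this toy case the binary form is 36 bytes, and the XML comes out well over twice that, nearly all of it tags. Of course, the binary layout only works because both sides agreed on it beforehand, which is the same "what" problem all over again.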