The Ventriloquist's Dummy

Web Services and Web Browsers
Alex Selkirk
14th November 2019

Contents

1. The Trouble with the Web
2. The Web Browser: Gagging the User
3. What about ActiveX, Plug-ins, Java Applets and Scripting?
4. Standards, Formats and Protocol Layers
5. Third-Party Web Browser Services
6. Web Services
7. The Ventriloquist's Dummy: Web Services with Web Browsers
8. What is the Alternative?
9. The Open Web Browser Manifesto
References and Links

Introduction

"Ventriloquism is a practiced skill that is achieved by throwing one's voice. That is, a ventriloquist uses a wooden dummy or doll which he places on his hand in order to operate its movements. The ventriloquist then pretends to carry on a conversation with the dummy by moving its mouth and providing its voice. The dummy's voice actually comes from the ventriloquist, but since there is no sign that he or she is moving his or her mouth or lips, then the voice seems to come from the wooden dummy. Our eyes naturally try to zero-in on the source of the noises or the voice that we hear. But, by controlling the movement of his or her lips, and by not moving the mouth, a ventriloquist successfully tricks us into thinking that the voice we are hearing is coming from another source, such as his or her wooden dummy. "
(from The History of Ventriloquism.)

Web services, Microsoft's .NET project and Microsoft Passport: these are popular subjects for discussion amongst web developers, but when it comes to describing their rationale and purpose, nobody has come up with a concise explanation. Defenders of web services, .NET or Passport usually answer with a question: "Wouldn't it be nice to be able to do such-and-such?". But that does not explain the technological reasons for the structure of the architectures being built. This paper is a personal view of the reasons behind web services, beginning with a look at the current state of the web and the ubiquitous tool for viewing it: the web browser.

1. The Trouble with the Web

There is unease amongst web developers: a feeling that the high hopes engendered in the XML community since the creation of the XML standard in 1998 have not come to fruition. There is a host of interesting standards, created by the W3C, that have very few implementations. A primary reason XML was 'invented' was to create a foundation for additional and better display languages, instead of extending HTML with non-standard markup tags. This aim of advancing beyond HTML has not been achieved: HTML is still the web markup language. Its rivals are non-XML formats such as the Portable Document Format (PDF) and Flash. Nobody has yet pinpointed the cause: the current web browser, the supposed 'universal canvas', which has progressed only incrementally since the early versions of Netscape and Internet Explorer.

The first clue that something has gone wrong comes from looking at how Microsoft's Passport "web service" works. The protocol is clever, even a work of genius, and yet to me it strikes a highly discordant note. It brings out the ugliness of the protocols that make up today's World Wide Web.

The Passport Protocol

By using a series of HTTP redirects and information embedded in GET commands, a web user can log onto the Passport web site and be 'authenticated' for all sites using Passport. But examine the protocols and consider this carefully: is it an authentication service? The answer is no - Microsoft does not check the details of users on the Passport web site, so a user could fill in entirely spurious data. A more accurate description of Passport is that it is a store for personal information. So if Passport merely stores personal information, why can a user not store his own information on his own computer? There is no lack of storage space - a modern computer usually has gigabytes of spare hard-drive space. The answer is that the web browser architecture prevents this. Instead a third party has to act on the user's behalf to provide his own details. The ventriloquist act has begun.

The second clue is the advancement of proprietary formats such as Adobe's Portable Document Format (PDF) and Macromedia's Flash animation. They do not work with the common HTML and scripting protocols; they work instead of them. Finally there is the rise of stand-alone applications that use web protocols such as HTTP and XML but do not use the web browser. For an application that is supposed to be at the center of the modern user experience, the web browser is being sidelined for many web activities.

2. The Web Browser: Gagging the User

Two protocols underpinned the World Wide Web when it first began: HTTP and HTML. They remain the dominant protocols today. From a cursory look at HTTP it is clear that there were two intentions: retrieving web pages (using the GET command) and providing the web server with information from the client (using the POST command). The GET command is dominant: HTML uses the POST command only for the submission of web forms. HTML forms are basic user-interface controls, such as a text edit box, radio button or check box, which can return text and simple choices to the web server. The limitations of web forms as a means of user expression are easy to see.

What are the other ways for the client to provide information to the server? There are cookies, but they merely send back to the server information that the server has provided - there is no client input or interaction. The GET command can be used to give the server more information than just the requested web page - form data, for example, can be returned using GET as well as POST. If all the ways a web browser formats the GET command are listed, a pattern emerges. The two principal methods are clicking a hyperlink and redirection from another web server. In both cases the web server chooses the URL: the hyperlink merely grants the client a choice about whether to use it (which the redirect does not). The final method is to enter a URL manually. This is the only method completely controlled by the client.
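The distinction between form data travelling in a GET query string and in a POST body can be sketched with Python's standard library. The field names here are hypothetical; the point is that in both cases the structure of the data is dictated by the server's form, with the client merely filling in values.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields that a server-supplied web form might define.
form_data = {"name": "A. User", "choice": "yes"}

# GET: the encoded data is appended to the URL as a query string.
get_request = "http://example.com/submit?" + urlencode(form_data)

# POST: exactly the same encoding travels in the request body instead.
post_body = urlencode(form_data)

# Either way, the server decodes it back to the same fields.
decoded = parse_qs(post_body)
print(get_request)
print(decoded["name"][0])  # "A. User"
```

Either transport carries only what the form's structure permits, which is the limitation the text describes.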

To sum up, information to a web server consists of the manual input of a URL, web forms (a manual mechanism whose structure is controlled by a web server), hyperlinks (controlled by a web server) and redirects (controlled by a web server). In conclusion, the current web browser is a choke point, preventing a normal computer user from speaking words other than those given by a web server. Never mind: if the user is mute, let someone speak words on his behalf! This is the dynamic behind web services: they are the programmed ventriloquists speaking words on behalf of the dummy. If you want a web site to access or change your calendar, use a third-party calendar web service. If a web site wishes you to create a digital signature, it will redirect to the digital signature web service. The user is the dummy because there is no software on his computer, integrated with the web browser, that can perform the required functionality. Professional ventriloquists charge a fee for their services - does anyone expect web services to be any different?

3. What about ActiveX, Plug-ins, Java Applets and Scripting?

So far, only HTTP and HTML have been considered, but there are a number of additional features that a web designer can use to get interaction from the client. For example there are embedded scripts such as JavaScript (now ECMAScript) or VBScript. Java applets can create new channels for client-server communication. ActiveX controls (for Internet Explorer) or Plug-Ins (for Netscape) can extend the browser beyond simple HTML. Can one or all of these be the way forward for client communication?

Scripts, applets, ActiveX controls and plugins can be divided into two categories. First there is the mobile code category, into which scripts and applets fall. Mobile code is downloaded once to act in combination with a particular web page, and is then discarded. Since mobile code is sent by a potentially dangerous web server, its use needs to be severely restricted, and therefore only limited interaction with a client and his data is possible. Mobile code can be used to carry out calculations on form data, but a client has no way of telling whether these calculations are what he requires, since the mobile code is, like everything else, created at the web server. The inherent security danger in mobile code limits each type's functionality: Java applets to a sandbox allowing only safe actions, scripts to simple calculations or manipulating other web objects.

The second category contains ActiveX controls and plug-ins. Some are installed along with the web browser, but they are usually downloaded from a web site. Once installed they can be used by any web page, so they form an extension to the browser itself. An ActiveX control or plug-in can either be called from a web page or handle its own special format (such as a PDF file, a Flash animation or SVG). These extensions are browser-specific, and there is no process to standardize them so that they are accepted by all companies and users. It is also difficult to distinguish functionally between controls: some may be appropriate to use in certain situations, but there is no way of telling which control is being used. Nor is there a standard method for coordination or communication between multiple controls. The web browser should be providing functionality to enable this coordination, but it does not.

These concerns can best be illustrated by an example. Suppose a control is created that stores user information, such as name, address and credit card details. This could be used by virtual shops when the user wishes to buy goods online, but would be unacceptable at other sites that wish to accumulate user information for their own purposes. So when a web site wishes to use this control, it must be the user's prerogative to know that this particular control is being used, and that it is appropriate for this particular web site to use it. In other words there should be a protocol that allows the user and the web server to agree on access to the user's information. What format should the protocol be in? The standard web format is XML, but there is no support for XML in web browsers beyond a display transformation such as XSLT or CSS. For such a control to work, it would have to recreate the parsing and calculations that the web browser should be providing.
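What such an access-agreement protocol might look like can be sketched in a few lines. The XML element names, attribute names and the policy below are entirely illustrative assumptions, not taken from any real standard; the point is that the user's side holds a local policy and grants fields selectively.

```python
import xml.etree.ElementTree as ET

# A hypothetical access-request message from a shopping site.
# Element and attribute names are made up for illustration.
request = """<access-request site="https://shop.example.com">
  <field name="name" purpose="delivery"/>
  <field name="address" purpose="delivery"/>
  <field name="credit-card" purpose="payment"/>
</access-request>"""

root = ET.fromstring(request)

# The user's side of the protocol: a local policy decides which
# fields this particular site may read. Here only delivery data
# is released; payment data is withheld.
allowed_purposes = {"delivery"}
granted = [f.get("name") for f in root.findall("field")
           if f.get("purpose") in allowed_purposes]
print(granted)  # ['name', 'address']
```

The parsing and policy evaluation shown here is precisely the machinery that, as the text notes, each control would currently have to recreate for itself.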

This brief survey leads to the conclusion that the current extensions to HTML are either too dangerous, too limited, too implementation-specific or need to recreate so much functionality that there is little chance that they will ever be used as standard ways of carrying out client interaction with the server.

4. Standards, Formats and Protocol Layers

A further criticism of current web browsers is that there are too many languages using different ad-hoc formats. A quick list of standards that come to mind: HTTP, HTML, CSS, SSL, cookies, JavaScript, Java applets, ActiveX, Netscape plug-ins, XML and XSLT. In general, these standards can be used together only by separating them into completely separate protocol layers (e.g. HTML and HTTP) or by splitting them into separate communications (HTML and CSS). Separate protocol layers have the advantage of a clean interface, but the disadvantage that information is difficult to integrate or pass from one layer to another. It is difficult for a Java applet on an HTML web page to know that the communication has been encrypted using the SSL protocol and is therefore secure, or that the web server's X.509 certificate (which is in yet another format) indicates who owns the web server.

Surely the time has come to rationalize this situation, so that all the disparate pieces of information are, as far as possible, incorporated into one layer and one vocabulary framework. This can best be achieved using XML, which can use the namespace standard to distinguish different vocabularies, and schemas to define the data structure, including how the various information items interoperate.

5. Third-Party Web Browser Services

The basic concept behind using the web browser to support calculations on client data is to transfer the data to a third-party server and ask it to do the calculations. The tool for achieving this is the HTTP redirect. This mechanism was originally intended for when a web site or URL moved, so that the new location could be found automatically. Now it is used to send a user to a different web site, along with information that allows the return redirect. When a web server requires the results of the calculations, the web browser is redirected to the third-party site, a cookie identifies the user, the calculations are made, and the site redirects back to the original server with the calculated results. The information transfer can be carried in the HTTP information itself, such as the local part of the URL.

[Figure: a third-party service for a client using a web browser]
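The redirect round trip can be sketched as URL manipulation alone, since the browser contributes nothing but obedience. The site names and the stand-in "calculation" below are hypothetical; real protocols add cookies and signatures, but the information flow is the same.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical participants; the URLs are illustrative only.
SERVICE = "https://thirdparty.example.com/calc"
RETURN = "https://shop.example.com/checkout"

# Step 1: the original server redirects the browser to the third
# party, embedding the return address in the query string.
outbound = SERVICE + "?" + urlencode({"return": RETURN})

# Step 2: the third party performs its calculation (a stand-in value
# here) and redirects back, carrying the result in the URL itself.
return_to = parse_qs(urlparse(outbound).query)["return"][0]
inbound = return_to + "?" + urlencode({"result": "42"})

# Step 3: the original server reads the result out of the redirect.
print(parse_qs(urlparse(inbound).query)["result"][0])  # "42"
```

At no point does the user's computer compute anything: it merely follows the two redirects, which is the passivity the text complains of.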

This system is currently used by Microsoft Passport and by web advertising companies such as DoubleClick. For Passport, the web user inputs his personal details. For web advertising, involuntarily acquired client data such as browsing habits is used to provide a tailored advert, and the main beneficiary is the original web site, which is paid for the advertising, rather than the web user. Third-party services operate with little or no control by the user - the redirects are automatic, the cookie retrieval is automatic. In some situations the user may have to agree to the use of the information, but for most of the protocol there is no interaction. Nor is the user's computer used for calculations or information storage, although it is perfectly capable of both. No, the user is the dummy and the third-party service is the ventriloquist, providing the words and moving the lips. With web services, this system will be rationalized (with a common data exchange format) and expanded (new services and specifications), but the ventriloquist act looks set to continue.

6. Web Services

"Web services" is a blanket description for a number of initiatives by major software companies. The central technical theme appears to be two-way communication in XML over HTTP via the SOAP protocol. On the commercial front, the software companies are talking about selling the use of services built on this technology. For example, Microsoft's web service initiative is called ".NET My Services" (formerly project Hailstorm). The Microsoft White Paper lists its initial services.

These are mostly stand-alone APIs - not mixable, not upgradeable, and difficult to make interoperable with other XML standards. In addition they all fit the current third-party mechanism of web service plus web browser. As far as can be forecast, updates between user and web service will be carried out using a stand-alone web service-enabled application (such as Microsoft Office), while other parties access the data via the web browser redirect mechanism previously described. After all, the only other reason for storing documents, lists or other information on a web server is computer-independence, but such schemes have been tried before and have failed. Only a few years back, the technological hype was all about the Net computer, or the JavaServer computer. The model was to store all user information centrally so that it could be downloaded to the user's "dumb terminal". Most users do not want to work that way. It will only work if there is a technological block preventing the standard way of working. That block is the web browser.

The current remedy to the web-browser 'gag' is the stand-alone web service. The growth in stand-alone applications using web technologies is another sign that the current web browser has failed. Web services can be placed into two categories. The first is the completely self-contained web service - a complete vocabulary that does not rely on any other XML vocabularies. This type may not even use namespaces, since it is not used in a way or at a place that would confuse it with other XML data. The second type reuses one or more standard XML vocabularies. Because it intermingles other vocabularies, it will use namespaces and possibly also XML Schema. To be implemented, it will use an API created by a large company to interpret the standard XML vocabularies, linked into the new web service with integration code. The general mechanism is to create a DOM tree of the XML data, then send the sub-tree containing the standard vocabulary to the API for processing. The rest of the DOM tree is then processed and combined with the API's results.
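The sub-tree dispatch mechanism just described can be sketched with a DOM-style parse. The namespaces, element names and the stand-in "API" below are invented for illustration; the pattern is what matters: the standard-vocabulary sub-tree goes to a vendor API, the rest to the service's own code.

```python
import xml.etree.ElementTree as ET

# A hypothetical message mixing a service-specific vocabulary with a
# reused "standard" one (both namespace URIs are made up).
message = """<order xmlns="urn:example:shop" xmlns:sig="urn:example:sig">
  <item>book</item>
  <sig:signature>deadbeef</sig:signature>
</order>"""

def signature_api(elem):
    # Stand-in for a vendor API that understands the standard
    # vocabulary; here it just checks the expected value.
    return elem.text == "deadbeef"

root = ET.fromstring(message)

# Walk the tree, handing the standard-vocabulary sub-tree to the API
# and keeping the service's own elements for local processing.
for child in root:
    if child.tag == "{urn:example:sig}signature":
        print("signature valid:", signature_api(child))
    else:
        print("service element:", child.tag)
```

The integration code is exactly this dispatch loop; each new service must write its own, because nothing in the browser or platform provides it.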

7. The Ventriloquist's Dummy: Web Services with Web Browsers

The future model for new web services is to use web server-driven redirects to pick up and process client information. The redirect information will be in XML according to the web service specification, but compressed and Base64-encoded so that it is in a text format suitable for transmission within HTTP. Web services will be chained, each picking up information from other services - and the user identity kept by the 'authentication' service will be the key base service, so that each web service knows which user's information to recover. An example of a web browser binding to a web service is the OASIS SAML draft standard, which contains a draft model illustrating the technique. The XML Key Management Specification (XKMS) proposes web services for holding, locating and verifying users' security information. These are but the tip of the forthcoming iceberg of web service standards.
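The compress-then-Base64 packaging can be sketched in a few lines of Python. The XML payload and service URL are hypothetical, and real bindings (such as SAML's) differ in detail, but the round trip is the same idea: squeeze an XML message into a form that survives a redirect URL's query string.

```python
import base64
import zlib
from urllib.parse import quote, unquote

# A hypothetical XML message a web service might pass along a redirect.
xml_msg = '<request id="123">calendar lookup</request>'

# Compress, Base64-encode, then URL-escape so the XML can ride inside
# a redirect URL - the text-safe packaging the essay describes.
packed = quote(base64.b64encode(zlib.compress(xml_msg.encode())).decode())
redirect_url = "https://service.example.com/sso?msg=" + packed

# The receiving service reverses the steps to recover the XML.
unpacked = zlib.decompress(base64.b64decode(unquote(packed))).decode()
print(unpacked == xml_msg)  # True
```

Note that the browser carrying this URL has no idea what it is transporting; it simply follows the redirect, which is the essay's point.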

The future use of web services can be illustrated with an example: suppose a commercial web site wishes to access the user's calendar to check delivery times for a package. First it redirects to Passport.com to pick up the user's identity. Then it redirects to a "client preferences" web service to find which web service the client uses for his calendar. Then it redirects to the calendar web service to pick up the user's calendar information. The client will probably have to give permission for the server to obtain the preferences and calendar information, but his computer does not have to perform any of the calculations. It appears as if the user is giving the calendar information to the web server, but in fact he is acting as the dummy, while the web services act as the ventriloquist, making the user's mouth move and providing the words.

8. What is the Alternative?

Now is the time to work against the more blatantly commercial aspects of the web service tide. Web services may be in the interests of big businesses, allowing them to charge for processing and data storage. They may be in the interests of the highly mobile user with ten different network devices, or of one who has no permanent personal computer. But they may not be in the interests of most web users. The choice between controlling information on the local computer and via a web service should be up to the individual. Sometimes a web service will be what is wanted, sometimes not, but it should not be compulsory to store personal information using web services. Today's computers are powerful, with plenty of storage capacity, so centralized storage on a more powerful computer is not a major benefit; indeed it may be a hindrance. Denial-of-service attacks are a frequent occurrence, as are break-ins to web servers. It is therefore not guaranteed that a user's data, stored in a web service, can be accessed at all, certainly not with the same ease as data on the local computer.

The key point is that the individual should control information according to individual preference. This means that web browsers and alternative applications must be upgraded so that they can communicate with web servers using protocols that make use of the user's data. Since there are many different data structures a client may wish to use, the new web browser must be easily extensible to new protocols.

9. The Open Web Browser Manifesto

In my opinion, the current web browser has outlasted its usefulness. We should be considering the shape of a new type of web browser to replace it. The vision I have is of the Open Web Browser: a foundation stone that any programmer can build upon to create truly integrated vocabularies. The following are the key points that this browser requires:-


References and Links

HTMLX, a similar Open Web Browser vision
http://inet2019.com/post/Features/HTMLX/htmlx.html

SAML, the Security Assertion Markup Language
http://www.oasis-open.org/committees/security/

Building User-Centric Experiences, An Introduction to .NET My Services, a Microsoft White Paper
http://www.microsoft.com/myservices/services/userexperiences.asp

Passport, Microsoft's Initial Web Service
http://www.passport.com/

Web Services, Business Models, and Storage, an essay by Dan Bricklin
http://www.bricklin.com/serviceandstorage.htm