The World Wide Web as an Engineering Paradigm
Matthew Lee Andrew Montgomery
Steven Shapiro Veeral Shah
Qian Wang
December 15, 2000
6.933/STS.420 The Structure of Engineering Revolutions, Professor David Mindell and Professor George Pratt
THE STRUCTURE OF ENGINEERING REVOLUTIONS THE WEB AS AN ENGINEERING PARADIGM
Introduction
"You see things and say 'Why? '
but I dream things that never were, and I say 'Why not? '"
George Bernard Shaw (1856-1950)
The World Wide Web is a compelling example of an engineering paradigm. The
adoption of the Hypertext Markup Language (HTML) and the Hypertext Transfer
Protocol (HTTP) as standards represents the acceptance of this new paradigm: a
revolution in information sharing. This work details the technical evolution of HTML
and HTTP, from their inception to their acceptance as the technological foundation of the
World Wide Web.
This project history first frames the Web's development as an engineering paradigm,
with a discussion of how it departs from the Kuhnian model of scientific paradigms.
Next, a history of the important events preceding the development of the Web sets the
stage for an understanding of the background in which its creation took place. This will
help to illuminate the presence of particular anomalies. We will then examine the period
of crisis that ensued: the question of how to share information across multiple
platforms. In the presence of this crisis, competing articulations and solutions arose,
such as Gopher, WAIS, and Guide. Finally, this history delineates the rapid acceptance
of the World Wide Web, the adoption of a new paradigm.
By no means did the World Wide Web signal the advent of online
communications. The core technologies that the Web is based upon, such as TCP/IP and
hypertext, existed for years and in some cases decades before the Web was invented.
Files were transferred with FTP, network news was read through NNTP, and network
packets circled the globe through the Internet. Similar technologies to the Web existed as
well, including Gopher, a system developed at the University of Minnesota.
Despite all of these perfectly functional technologies, the Internet was fraught
with incompatibilities, as data was stored in a myriad of often-incompatible formats and
the usage of protocols was inconsistent across different platforms. For example, the
simple act of retrieving a file could prove troublesome. First, one would have to locate a
remote server that the file would be hosted on, through a primitive search service such as
Archie or by being told about a server from a colleague. Then the file would have to be
retrieved via FTP, which involves a series of directory navigation commands and some
protocol negotiation as well. Finally, the file would have to be viewed locally, and the
viewer might be incompatible or nonexistent for a particular format, especially when the
file was created on another platform. Tim Berners-Lee invented the Web in part to
resolve these incompatibilities, relying on a small set of simple protocols to share data in
a platform-independent manner. This is the story of the Web.
Scientific Revolutions
Thomas Kuhn, in his seminal work "The Structure of Scientific Revolutions," outlined
the development of a paradigm:
1) Paradigm / Pre-Paradigmatic state: At this time, there may or may not be a
conceptual framework by which people assess the day's ideas. This framework
represents the "agreed upon world view" at the time. Within this state, the
progress that takes place is "normal science:" these developments are incremental,
and do not challenge the prevailing paradigms.
2) Anomaly: An incompatibility with the existing paradigm arises. Evidence or
observations appear to contradict the currently understood framework.
3) Crisis State: This stage is the reaction to the anomaly. The period can be
characterized by confusion, and the existing paradigm is challenged. Competing
articulations emerge, to try to explain the newly found evidence.
4) Resolution: The resolution of the crisis may take one of three forms:
a. Postponement: The anomaly is put aside because it is not yet well
understood.
b. Integration: The new phenomenon is sometimes found to be compatible
with the previously existing paradigm, and the two are integrated into a
fuller, more complete paradigm.
c. Paradigm Shift: This is the most momentous of the three types of
reactions. Out of the competing solutions emerges an undisputed new
consensus. It is this consensus that defines the new conceptual framework
as the prevailing paradigm.
Engineering Revolutions
The characteristics of engineering revolutions differ along some important metrics from
the Kuhnian model, which is more suitable for describing scientific change. Noted here are
several key differences.
Technology Driven: The primary difference between scientific and technological
revolutions is, quite simply, the artifacts at hand. In scientific revolutions, what
previously exists is a framework upon which to base thought processes and research. In
contrast, technological revolutions involve technological artifacts, with a newer 'artifact'
dominating. In these cases, both artifacts work properly, and provide some benefit to
their users, but the mere existence (and growing usage) of the newer technology instead
drives the revolution.
Not Fundamental or Global: Technological revolutions are characterized by a 'better
way of doing things.' Unlike the acceptance of a broad scientific paradigm, the
acceptance of a new engineering paradigm does not have fundamental implications for the
whole world.
Consensus is Not Necessarily Universal: Often a new engineering paradigm is not
universally accepted. It is only important that the new framework dominates its field.
The rest of this paper explores the advent of the World Wide Web, providing a chronicle
of its development within the defined technological revolution framework.
The Existing Information Sharing Paradigm
We have just introduced our concept of the structure of engineering revolutions.
Any such revolution must begin with either an existing paradigm or a pre-paradigmatic
state that resembles the crisis state we have discussed. The engineering revolution that is
the focus of this paper is that of the World Wide Web and its revolution of the existing
information-sharing paradigm. Before discussing the Web itself, however, we must first
examine the paradigm immediately preceding the Web, the new technologies which arose
that presented an anomaly in the paradigm, the crisis state that was created by this
anomaly and the competing solutions that arose to solve the crisis. We must understand
what occurred before the Web in order to understand why the Web was developed and
why it is the way to share information electronically today.
Computer systems prior to the 1960s consisted mostly of mainframes at large
research and educational institutions and businesses. No one had conceived of a personal
computer as we understand them today. The systems first used punch-cards as a method
of input. Each system had a designated operator that would feed it the cards dictating
operations that various users wanted performed. Only one set of cards could be inputed at
a time and the system could only do one set of operations. But, the systems were very
large and relatively fast and could handle much more work. Thus, timesharing of the
systems began, in which a terminal with a wire connecting it to the mainframe could be
placed on a users desk and punch cards were eliminated. Each system used its own
operating system and protocols. There was no need for the systems to be compatible with
each other, as there was no network to connect them. Data was thus confined to the local
machines and information could only be shared within the institution which housed the
mainframe. Vannevar Bush, director of the US Office of Scientific Research and
Development under President Roosevelt, wrote an article in 1945 titled "As We May Think,"
published in the Atlantic Monthly. It contained some of the first published thoughts
on the power of computers in information sharing. His "Memex"
sounds a great deal like today's personal computers, and the user's interaction with the
system closely resembles hypertext, even though no one would "invent" either
technology for decades.
This leads us to the three technologies that created an anomaly in the information
sharing paradigm. Collaboration and hypertext systems would give people a way to
organize and share their information. Pervasive networking would allow people to
connect their systems together over large distances. Personal computers would allow any
person, at the office or at home, to share their information. Not until all three of these
technologies were developed was there really an anomaly in the information sharing
paradigm. We will now examine each of these technologies in more detail in order to
understand this anomaly.
New Technologies Cause an Anomaly
Collaboration and Hypertext Systems
Had it not been for Vannevar Bush's provocative article, "As We May Think," Doug
Engelbart might not have invented the mouse or formed the idea of the first hypertext
system. Engelbart stumbled across the article and recalls that it had a
profound effect on him. "I remember being thrilled," he says. "Just the whole concept of
helping people work and think that way just excited me. I never have forgotten that." 1 He
was inspired and began to work on his oN-Line System (NLS) in 1963 and first publicly
demonstrated it in 1968 at the Fall Joint Computer Conference in San Francisco. The
system really was revolutionary for its time; it incorporated a graphical display, video
conferencing, hyperlinks for organizing information, and the computer mouse (which he
invented for the project) for navigation. See Figure 1, a screen shot of Engelbart's
demonstration of NLS in 1968. NLS was a completely successful implementation of what
Engelbart had imagined and people congratulated him for it. However, the same people
just could not see the purpose of it given their view of the role of computers:
large mainframes using punch cards. Engelbart was a man far ahead of his time. Decades
later, when asked about the lack of acceptance for NLS, he said, "You can't bring stuff
out against the prevailing paradigms." It would be more than twenty years before many
of Engelbart's ideas would be used to force a paradigm shift.
Other collaboration and hypertext systems were also in development during the
1960s. In fact, Ted Nelson would coin the term "hypertext" during a presentation at the
1965 Association for Computing Machinery conference in Pittsburgh. Ted Nelson would
be one of the first insightful individuals to develop Doug Engelbart and Vannevar Bush's
shared vision. Nelson first developed a system called Xanadu, which he described as a
computerized version of Bush's Memex. Xanadu will be discussed in more detail
shortly. Nelson went to work with Andy van Dam and some of his students at Brown
in 1967 to create the second hypertext system, the Hypertext Editing System (HES). HES
itself was not especially significant, but, more importantly, van Dam
and his students continued, after Nelson stopped collaborating in 1968, to work on a new
system called the File Retrieval and Editing System (FRESS). Although FRESS had no
networking (the first ARPANET node was not even online yet), it had two features that
later hypertext systems would find valuable. First, it had bi-directional links that allowed
the user to control the link structure without modifying the original document content.
Second, it had a form of metadata that allowed keyword searching of links. Other
systems that were to follow, such as Microcosm and Intermedia, borrowed
these ideas, separating the links from the content. FRESS proved to be a relative success in
the classroom for which it was designed. However, FRESS and other similar systems would
not solve the information sharing crisis. These systems separated the
links from the original source documents and maintained them in a central database. This
database had to be continuously maintained in order to retain the consistency of
the bi-directional links. This, in turn, prevented FRESS and other similar systems
from scaling to large (especially global) deployments.
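The design described above, with links held apart from document content in one central store, can be sketched roughly as follows. This is only an illustration of the general idea, not a reconstruction of how FRESS was actually built; the class and method names (LinkDatabase, links_from, links_to, search, remove_document) and the sample document names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """A link record stored outside the documents it connects."""
    source: str
    target: str
    keywords: frozenset = frozenset()


class LinkDatabase:
    """Hypothetical central link store; documents themselves are never modified."""

    def __init__(self):
        self._outgoing = defaultdict(set)  # document -> links leaving it
        self._incoming = defaultdict(set)  # document -> links arriving at it

    def add(self, source, target, keywords=()):
        link = Link(source, target, frozenset(keywords))
        # Both indexes are updated together; this is what makes the link bi-directional.
        self._outgoing[source].add(link)
        self._incoming[target].add(link)
        return link

    def links_from(self, doc):
        return sorted(link.target for link in self._outgoing.get(doc, ()))

    def links_to(self, doc):
        return sorted(link.source for link in self._incoming.get(doc, ()))

    def search(self, keyword):
        """Keyword search over link metadata, not over document text."""
        return sorted(
            (link.source, link.target)
            for links in self._outgoing.values()
            for link in links
            if keyword in link.keywords
        )

    def remove_document(self, doc):
        """Deleting a document forces a sweep of every link that touches it."""
        stale = self._outgoing.pop(doc, set()) | self._incoming.pop(doc, set())
        for link in stale:
            self._outgoing.get(link.source, set()).discard(link)
            self._incoming.get(link.target, set()).discard(link)
        return len(stale)
```

Every change to any document has to pass through this one store to keep both directions consistent, which hints at why such a database becomes a maintenance bottleneck as the number of documents and sites grows.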
Ted Nelson's Xanadu project continued for over two decades after its original
inception. It provided a framework for information sharing, culture and management.
Xanadu's biggest difference from other systems of the time was its micropayment
scheme. Xanadu's creator was not a member of the computer science community. This may
have been what caused Nelson to focus heavily on giving
credit to the author of every bit of information. It is suspected that Xanadu never really
became a complete system because of its visionary nature. A product must be concrete.
Xanadu presented itself almost as an entirely different way of life. Nelson's writing on
Xanadu, "Literary Machines," made references not only to information management, but also to
what people would think and how they would dress. Xanadu was just too far-reaching to
be seriously considered as a viable solution to the engineering crisis at hand.
Pervasive Networking
On October 4th, 1957, the Soviet Union made history when it launched Sputnik
I into orbit. President Eisenhower declared that the United States would never again be
caught off guard by the Soviet Union and went to Congress on January 7th, 1958 to
request funding for a new research and development organization. Congress quickly
approved funding for the new Advanced Research Projects Agency (ARPA) and, by
1962, ARPA's long-term goals had been refined. The two major parts of ARPA involved
ballistic missile defense and nuclear test detection. However, there was a small office,
using less than 10 per cent of ARPA's budget, called the Information Processing
Techniques Office (IPTO) that by 1965 was immersed in research for advanced
computing. Computer time-sharing among many different systems across the country was
the solution to the lack of funding the IPTO had and the multitude of requests from
computer science departments who each wanted to build very powerful computers. By
sharing the computer resources, more could be done. Thus, the IPTO began work on a
nationwide network to connect the various systems. By late 1966, plans for the ARPA
Computer Network, or ARPANET, had begun.
It was not immediately clear how the network should be built. Initial ideas from
the project's director, Larry Roberts, focused on just having each computer call up all the
other computers using the existing phone lines to exchange information. This would force
each computer to understand the operating system of all the others. There are obviously
many problems with this approach, as the network could not scale very well or be
flexible. Wesley Clark, an ex-colleague of Roberts, came up with the idea of having
small computers on the network, forming a sub-net, to route data around. This would mean the
large computers on the network would only have to know one protocol or language to
talk to each other: that of the small routing computers. The small computers were named
Interface Message Processors (IMPs) and their use was essential to the success of
ARPANET. Another important part of the design of ARPANET was that it used packet-switching.
This new idea, developed independently by both Paul Baran and Donald
Davies, cuts messages up into small chunks of the same size for transmission across the
network. It allowed a network to exist that would be able to survive machine failure on
one part of the network; the packets could just be rerouted through different parts of the
network.
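The packet-switching idea described above can be illustrated with a small sketch: a message is cut into numbered chunks, the chunks may arrive in any order (for example, after being rerouted around a failed machine), and the receiver puts them back together. This is a toy model under stated assumptions, not the ARPANET's actual packet format; the Packet fields, the 8-byte PACKET_SIZE, and the function names are invented for illustration, and the final chunk is simply allowed to be shorter than the rest.

```python
from dataclasses import dataclass
from typing import Iterable, List

PACKET_SIZE = 8  # payload bytes per packet; an arbitrary illustrative value


@dataclass(frozen=True)
class Packet:
    seq: int        # position of this chunk within the message
    total: int      # number of packets in the whole message
    payload: bytes  # the chunk itself


def packetize(message: bytes, size: int = PACKET_SIZE) -> List[Packet]:
    """Cut a message into numbered chunks of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    chunks = [message[i:i + size] for i in range(0, len(message), size)] or [b""]
    return [Packet(seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets: Iterable[Packet]) -> bytes:
    """Rebuild the message regardless of the order in which packets arrived."""
    ordered = sorted(packets, key=lambda p: p.seq)
    if not ordered:
        raise ValueError("no packets received")
    if [p.seq for p in ordered] != list(range(ordered[0].total)):
        raise ValueError("missing or duplicate packets")
    return b"".join(p.payload for p in ordered)
```

Because each packet carries its own sequence number, no single route through the network is required; reassemble(reversed(packetize(msg))) returns the original message just as an in-order delivery would.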
UCLA was the chosen location for the first IMP. The second was put into place
at the Stanford Research Institute (SRI) in October 1969, and with it the first packet-switching
computer network was established. ARPANET grew quickly, adding about one host per month.
In 1972, the first email was sent across ARPANET and the network was publicly demonstrated
for the first time at the first International Conference on Computer Communication (ICCC)
in Washington, DC. See Figure 2 for a picture of ARPANET in 1977.
The first protocols on the ARPANET were hastily thrown together; they were
telnet and FTP. Telnet allowed someone sitting at a terminal to log in to a computer at
another site. FTP, the File Transfer Protocol, allowed files to be transferred from one computer
to another. These two protocols were a good start, but hardly sufficient for the network.
In June 1973, Vint Cerf, Bob Kahn, Bob Metcalfe and Gérard Le Lann started to work on
the design of a host-to-host protocol for inter-networking. They called it the Transmission
Control Protocol (TCP). TCP proved to be a robust host-to-host protocol that allows
reliable communication over unreliable networks. In 1978, TCP was combined with
another protocol, the Internet Protocol (IP), to form TCP/IP. IP controlled the routing of
packets around the network. Thus, with the introduction of the Domain Name System
(DNS) by Paul Mockapetris and the adoption of TCP/IP by ARPANET in 1983, the
Internet as it is known today was born.
This pervasive network, the Internet, allowed any network of computers to be
connected to any other network of computers. People could now share information
electronically, using a universally agreed-upon protocol. But, in the early 1980s,
computers were still primarily at large institutions and there was a relatively small group
of users. Not until the rise of personal computers would there be a true crisis in the
paradigm of information sharing.
Personal Computers
Computers used in the existing information sharing paradigm were massive in
both size and price. Only large institutions could afford and support them. But, the
invention of the transistor in 1947, the integrated circuit in 1958, the microprocessor
in 1969, and magnetic memory in 1972, set the stage for computers to become smaller and
cheaper. They would become a part of most people's daily lives: a PC would soon sit on
their desks at work and at home.
The Xerox Palo Alto Research Center (PARC) began work on its Alto computer
in 1973. It was the first computer designed specifically for use by non-computer
scientists. It actually was the first system to use many of Doug Engelbart's ideas, most
notably the mouse. It had a graphical user interface (GUI) with windows, menus and
icons. It looked very much like today's PCs. The machine that evolved from the Alto, the
Xerox Star, was the first system that Xerox tried to sell, but very few were actually sold
because of its $18,000 price tag.
In January 1975, an almost completely different system was released. The Altair
8800, released by Micro Instrumentation Telemetry Systems (MITS), sold for $397.
Users had to assemble the system themselves and the system was essentially a series of
switches and lights. It was completely unclear how it even worked. But, many were sold
to those who were curious about computers and had never had the chance to be exposed
to them. It inspired them to be creative and innovate. Bill Gates and Paul Allen, for
example, saw the Altair and formed Microsoft to sell software that allowed BASIC
programming on the system. Steve Wozniak saw the Altair and was determined to build
something better; the cost of microprocessors and memory had fallen dramatically.
Wozniak joined forces with Steve Jobs to form Apple Computer and they released the
Apple II at the West Coast Computer Faire in April 1977, selling it for a more reasonable
$1,300. Many Apple IIs sold quickly and many companies started to develop software for
it.
The early 1980s saw the real birth of the personal computer as it is thought of today.
In 1981, IBM launched its PC, selling for $2,800. Businesses loved the PC. IBM's
decision to allow clones ensured rapid growth of the technology. Microsoft wrote the
operating system, MS-DOS, and would later write the Windows operating system. During
this same timeframe, Apple was releasing its next computer. In 1984, Apple released its
first Macintosh computer, selling it for $2,495. It was the first computer that brought
the technology that Xerox had developed based on Engelbart's ideas to the masses. The GUI
stunned people and the Macintosh quickly developed a strong following amongst desktop
publishers, something that it still maintains today. The IBM PC and the Macintosh
changed the face of personal computing forever.
A Crisis Develops
The exponential growth of personal computers into the 1990s signaled the
establishment of the third technology that was needed for an anomaly in the information
sharing paradigm. Recall that an anomaly occurs in an engineering paradigm when an
incompatibility with the existing paradigm arises. In this case, information that was
previously used by few individuals and stored on isolated systems suddenly had the
potential to be used by many people and on any system. The three technologies
(collaboration/hypertext systems, pervasive networking, and personal computers) created
this anomaly. But there was no single solution to the problem created by this
anomaly. No system was commonly accepted that allowed people to collaborate across a
network using their PC. Until such a solution was developed, the anomaly brought about
a crisis in the information sharing paradigm. Competing articulations for solving the
anomaly are characteristic of a crisis state, and we will now discuss those competing
solutions.
Any solution to this crisis state would have to have three characteristics for it to
be successful. First, it must possess the ability to discover the information that is desired.
The Internet allowed all the networks in the world to be connected, but provided no real
means for finding desired information scattered across the millions of machines that were
connected. Secondly, any solution would have to be able to retrieve the information
regardless of the operating system and software being used. A standard protocol
implemented on all platforms would be essential for success. Finally, the new system
would have to be able to display the information uniformly once it was received, again
regardless of the platform being used. Several solutions were developed that shared all
three of these characteristics.
Competing Solutions
Some would argue that creativity is at its peak during an engineering crisis state.
In the case of the crisis being described in this paper, we will see that this is the case.
There were many competing articulations that attempted to solve the discovery, display,
and retrieval crisis in information sharing described above. However, these solutions
solved the problem in a variety of creative and differing manners. There were more
competing articulations than can be adequately researched within the scope of this paper, but
several, such as WAIS, Archie, Guide, and Gopher, will be discussed in the following
sections. Two solutions are particularly relevant to the development of the World Wide Web
project as well as the question of 'what could have been if only...': Office Workstations
Limited's Guide and Mark McCahill's Gopher.
Wide Area Information Servers
Wide Area Information Servers (WAIS) could be construed as the opposite of
Xanadu. WAIS was concrete, concise and focused on a commercial marketplace. It
was the first successful information management system that was a commercial venture
from the start. Brewster Kahle, then working for the supercomputing company Thinking
Machines, was the head of the project. WAIS was similar to a modern day search engine.
Its key benefit was its ease of use and relatively intuitive interface. Figure 5 below is
one example screenshot of the first WAIS client from a paper written in part by Brewster
Kahle and published in the journal Electronic Networking. 3 Users could submit their query in
plain English. Since files were described by keywords, media such as video or sound
files could be searched as well as text. WAIS proved very successful and soon spun off
into WAIS Inc. which was sold in 1995 to America Online for $15 million. WAIS Inc.
was one of the first Internet companies and proved to many skeptics that an Internet
company was possible. WAIS did not become the new paradigm because it was missing
one key ingredient, a humanitarian vision. WAIS Inc. and Brewster Kahle's vision were
not compatible with a free world-wide solution. The company was merely creating a
product for profit.
Archie, Prospero and FTP
Telnet and FTP were the first protocols on ARPANET, and neither protocol's
clients were robust or easy to use. This is where Archie came into play as a possible
solution to the ongoing crisis. Before Archie, users who wanted to find programs on the
Internet were forced to conduct manual searches of known FTP servers. Even then, the
filenames of these programs might not yield useful descriptions. Alan Emtage, a student at
McGill University in Montreal, Canada, sought to alleviate some of the mundane
searching with Archie in 1990. Archie consisted of a few shell scripts that automated this
search. As Archie quickly became popular, Clifford Neuman, a Ph.D. student at the University of
Washington, saw his chance to improve upon Archie by integrating it with his distributed
filesystem, Prospero. Prospero was used to organize Archie's files as well as expedite the
discovery process. This system quickly became very popular because of the growing
urgency for a new paradigm. The Prospero-Archie-FTP system adequately solved
discovery, display and retrieval problems, but it did not become the new paradigm.
"Only hardcore users knew about FTP." 4 This system lacked the necessary ease of use
for the general, computer-illiterate population. In addition, although it was a breakthrough
for its time, the system left much room for improvement, especially with respect to its
display of information. Long filename lists with descriptions are hardly comparable to
the media-rich interface of the Web as we know it today.
Real Solutions
Two contemporaries of the Web deserve a more detailed look in this paper
because of how close they came to becoming the new paradigm. Interviews
with the systems' parents and corroborative documents shed light onto this rarely
documented turn of the Web. Tim Berners-Lee, the undisputed father of the World Wide
Web, approached both Office Workstations Limited (OWL) and the Gopher Team to try
to convince them to adapt their systems to conform to his humanitarian vision of a single
global information space.
Guide
Guide was first developed as an academic project by Peter Brown in 1983 at the
University of Kent in the United Kingdom. This academic version of the hypertext
system Guide worked a little differently than its later commercialized counterpart:
The way it works is that when you hit a button it replaces that button in-line so there is a sort
of metaphor in which you have an expanding document; you hit a button and that button expands, in-line, so it remains in the context where it was in the whole document. If you undo it, it goes
back to the previous [context].
Ian Ritchie, the founder of Office Workstations Limited (OWL), purchased Guide from
Peter Brown and the university with a sole license to the system.
OWL released their version of Guide for the Macintosh in August 1986, winning
the honor of becoming the first commercially successful hypertext system. Almost
exactly a year later, OWL released an identical version of Guide for the PC. However,
Apple Computer, Inc. forced OWL to concentrate on their newly released PC version of
Guide. In the middle of 1987, Apple released their hypertext system, HyperCard, free
with every Macintosh. Ian Ritchie notes in reference to OWL's shift to the PC market,
"You can compete with other things, but you can't really compete with free." 6 This idea
holds true in many other cases described in this paper.
OWL's implementation of Guide was extremely robust and much more advanced
than Berners-Lee's World Wide Web. Guide used a markup language called the
Hypertext Markup Language (HML), not to be confused with Berners-Lee's HTML. Ian
Ritchie describes HML:
That was a version of SGML with added tags to indicate things like a button or a button source or a
point, headers and body text, so forth... And by marking up your documentation in that way you could
automatically generate a hypertext display. And you could go from document to document. And if your
documents were on a file server you could go across a network to them... We did that
I mean the interesting thing now is that everybody is really adopting XML. But XML is... essentially
contains the kind of structures that we and other people had done in the late 80's with our markup
languages. I mean our markup language, old HML, was more than just a hypertext markup language. It
actually was a content structure language as well. It described for you, what was a heading... what
was a subheading... what was an author's name... what was an abstract... what was a caption... Yeah,
that's the kind of thing that XML does. In fact there's evidence of that in a variant we were doing for
ance, as almost everyone did at the time. But CERN and the hypertext community were not blind
to the potential of a networked information system. They rejected Tim Berners-Lee's project,
which is quite different from rejecting the premises of the World Wide Web. We will explore
some of the reasons CERN and the hypertext community rejected the early Web.
36 Email from Berners-Lee to Dan Connolly, included as Item 1 in Appendix A.
CERN
As we have said earlier, CERN is the center of European high-energy physics research. As
such, it had a tremendously diverse culture and many complex social needs. For example,
not everyone working on CERN projects was physically at CERN. Many of the scientists
involved in CERN projects were scattered all over Europe, the United States and even Japan.
There were also a large number of projects going on at the same time. Such a heterogeneous
environment cried out for a networked information system. Tim Berners-Lee, as a software
consultant for CERN, was keenly aware of this need. In his numerous proposals to CERN
management, Tim Berners-Lee always stressed the usefulness of his World Wide Web project to
the people at CERN. CERN was of course also well aware of its own needs, and it seems
puzzling that they did not at least explore Tim Berners-Lee's ideas more fully. However, the
puzzle is much less puzzling when we consider the fact that CERN's primary focus is physics
research. CERN's policy regarding software systems is "buy, not build." 37 However, this
policy is not the only reason CERN did not take up Tim Berners-Lee's World Wide Web
project. According to David Williams, Tim Berners-Lee focused a lot of his persuasive efforts
on the hypertext aspects of the Web.
Figure 6 Particle accelerator at CERN.
However, there were a number of people at CERN, Williams included, who did not feel that
hypertext was an appropriate way to represent information. Williams also thought, "Explaining
his ideas is not [Tim Berners-Lee's] real strength." 38 As we can see from Figure 7, Tim
Berners-Lee's diagram of how the Web works 40 presents quite a complex picture of his ideas.
This complex picture presents a direct contrast to the relatively simple implementation of
the Web. This disparity between the simplicity of the Web's implementation and the complexity
of Berners-Lee's explanations may explain why CERN hesitated to back a full-scale project to
develop the Web. As a physics research institution, CERN had to devote most of its resources
to that end. However, if Tim Berners-Lee had presented the Web as a less daunting project, it
is possible that CERN might have undertaken to support his efforts more.
37 Weaving the Web
38 David Williams email interview.
39 Information Management: A Proposal
40 Information Management: A Proposal
Figure 7 Tim Berners-Lee's flow chart for the Web
The Hypertext Community
In trying to minimize the development efforts required for the World Wide Web, Tim
Berners-Lee had approached several hypertext system builders about possible collaboration.
Berners-Lee believed that once people saw his simple implementation of the Web, they would
realize the enormous potential of a global information space. However, his early
implementation of the Web was not an impressive system to people within the hypertext
community. At the Hypertext '91 conference in San Antonio, Tim Berners-Lee demonstrated the
WorldWideWeb browser on his NeXT machine to the attendees. However, as a hypertext system,
HTML was far too simple to make for a convincing demo. 41 The key to Berners-Lee's idea was
of course the marriage of hypertext and the Internet.
But to the hypertext community, his implementation was not an interesting hypertext system because it addressed none of the issues facing hypertext systems of the time. For example, link consistency was a big concern for most hypertext systems in the early '90s: when content moves within a system, links that point to it can break. Many hypertext systems were designed to deal with this problem. The Web, on the other hand, completely ignored it. Tim Berners-Lee was aware of the link consistency problem, but he correctly recognized that there was no simple, scalable way to ensure global link consistency in a worldwide information system. However, by ignoring the research problems that interested the hypertext community, Berners-Lee was unable to persuade its members to help him develop the Web.

Why the Web Won

Despite the resistance Tim Berners-Lee faced at CERN and in the hypertext community, the Web beat out its competitors to become the most widely used electronic information system. The triumph of the Web came from a combination of factors: Tim Berners-Lee's promising vision, the simplicity and openness of the Web's design and initial implementation, the tremendous grass-roots support from users and developers, and fortuitous timing all contributed to the eventual success of the Web as a new paradigm.

Tim Berners-Lee, though he may have erred in presenting too grand a vision to CERN, received an entirely different reaction from the world at large. For people outside CERN, the idea of a global information space was very appealing. Even so, simplifying that vision was the only way for the Web to take root and grow. The simplicity of early HTML and HTTP made them not only easy to implement but also easy to extend incrementally.
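To give a rough sense of how little the earliest form of HTTP asked of an implementer, the following Python sketch handles a request in the style usually described for the 1991 protocol: a single line of the form "GET <path>", answered with the bare document and nothing else. The function name, the in-memory document table, and the sample path are hypothetical and chosen only for illustration; this is not a reconstruction of any code written at CERN.

```python
# Illustrative sketch only: an HTTP/0.9-style request handler.
# The names and the sample document below are hypothetical.

def handle_request(request_line: str, documents: dict[str, str]) -> str:
    """Return the response body for a one-line 'GET <path>' request."""
    parts = request_line.strip().split()
    # The earliest protocol had one method and no version string or headers.
    if len(parts) != 2 or parts[0] != "GET":
        return "<H1>Bad request</H1>"
    path = parts[1]
    # The response is just the document text; a real server would then
    # close the connection to signal the end of the document.
    return documents.get(path, "<H1>Not found</H1>")


documents = {"/welcome.html": "<TITLE>Welcome</TITLE>"}
print(handle_request("GET /welcome.html\r\n", documents))
```

Everything a real server adds on top of this (listening on a socket, reading files from disk) is ordinary systems programming rather than protocol logic.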
Unlike more mature hypertext systems such as Dynatext, which had consistency checks to ensure that their hypertext documents were valid, HTML was extremely lax about error checking. Although this created something of a problem, because ill-formed documents might not display properly, the lack of error checking was the most important factor in the growth of the Web. Since there was no real way to ensure that an HTML document conformed to the specifications given by Berners-Lee, browser implementers were free to add their own tags to HTML and have their own browsers interpret them appropriately. All a browser implementer had to do was let users know that the browser supported a new tag. For example, the developers at NCSA added tags for displaying graphics to their browser implementation in just such a unilateral fashion.

41 Peter Brown phone interview.

When Tim Berners-Lee released his creation to the public, the Web enjoyed tremendous grass-roots support on the Internet. Berners-Lee had always known the
importance of gaining a "critical mass" of users and information on the Web. In fact, the first Web server that he set up, at info.cern.ch, served documentation on the Web itself and instructed people on how to set up their own Web servers. Here too, the simplicity of HTTP and HTML was of great value. Early HTTP was so simple that an HTTP server could be implemented as a UNIX shell script. Similarly, the HTML defined by Berners-Lee was simple enough that several individuals were able to develop Web browsers from scratch. Pei Wei, then at the University of California at Berkeley, developed Viola, a graphical browser for the X Window System (Figure 8). Wei commented that the "simple and elegant" URL scheme used by HTTP attracted him to the Web. 43 He believes that if HTTP and HTML were not "plainly easy for a developer to work with," then the Web would have lost that developer and "that much of a resource for this grass-root supported system". 44

42 Mark McCahill phone interview.
43 Pei Wei email interview.
44 Pei Wei email interview.

Figure 8 ViolaWWW Screenshot.

Other developers of early browsers, such as Thomas Bruce, whose Cello was the earliest Windows-based browser, also credit the simplicity of the Web for its quick acceptance. Bruce believes that Berners-Lee was able to "avoid letting the perfect become the enemy of the good" by keeping early HTML and HTTP simple and adding more functionality as the Web grew. 45 However, the Windows platform was largely ignored by the early Web enthusiasts. Even Berners-Lee's cross-platform library, libwww, worked poorly under Windows because of memory allocation issues. In the course of writing Cello, Bruce rewrote the entire libwww for Windows. 46 For a lone developer, this was a great deal of work.
But because the Web was simple enough and exciting enough, it spread to all the important platforms through the work of its grass-roots supporters. Once free Web browsers became available on all platforms and independent HTTP servers began to come online, the Web took off quickly. Tim Berners-Lee observed that since 1991, the Web has been growing at an exponential pace. 47 The advent of cheap PCs and the commercialization of the Internet further fueled its incredible growth. In less than 10 years, the Web went from a system no one wanted to a household word, thanks in large part to the grass-roots support it enjoyed in its early days.

The World Wide Web had some additional advantages over its closest competitor, Gopher. In the Spring 1992 issue of the journal Electronic Networking, Berners-Lee wrote that since Gopher uses the "directory and file model to implement a global information system," it would "map into the Web very naturally, as each directory (menu) is represented by a list of text elements linked to other directories or files (documents)." 49 In other words, the Web was a more general system, capable of encompassing most of Gopher's capabilities. Another advantage the Web had over Gopher was that the "web gave information away and had ads to support it." 52 Companies saw the opportunity to make money and therefore chose the Web over Gopher. The Web was "very pretty," and while librarians loved Gopher for its orderliness, companies loved the Web for its graphics capabilities.

45 Thomas Bruce interview.
46 Thomas Bruce interview.
47 An Interview with Tim Berners-Lee (video).
49 Electronic Networking, pg. 56
51 Mark McCahill phone interview.
52 Mark McCahill phone interview
53 Thomas Bruce phone interview.

While the graphical capabilities of Mosaic did much to popularize the Web, such proprietary extensions were a double-edged sword.
On the one hand, they caused incompatibilities among different browser implementations, thus fragmenting Berners-Lee's vision of a universal information space. An HTML document containing Mosaic-specific tags would not display correctly on any browser except NCSA Mosaic. On the other hand, proprietary extensions were quick and direct ways for enterprising developers to add new capabilities to the Web. In the early days of the Web, the benefits of these extensions outweighed the drawbacks. The graphical capabilities of Mosaic were one of the main reasons the Web became so attractive to people outside academic and research circles. As the saying goes, "graphics sells," and in the case of the Web, the graphics gave it a tremendous leg up on competing systems such as Gopher. 54 However, as the Web grew and Mosaic became the dominant browser, the danger that a group of aggressive developers could hijack the development of the Web became increasingly real. It was clear that for Berners-Lee's vision of the universal Web to survive, some neutral body was needed to build consensus among the various forces that were pulling the Web in different directions.

Adoption of the New Paradigm

The Web needed to leave CERN to truly succeed; CERN was, after all, a particle research facility, not one focused on computer science research. Berners-Lee therefore moved his work and advocacy for the Web to the Massachusetts Institute of Technology's Laboratory for Computer Science and started the World Wide Web Consortium (W3C). "The W3C was founded in October 1994 to lead the World Wide Web to its full potential by developing common protocols that promote its evolution and ensure its interoperability." 55 Michael Dertouzos, the Director of the Laboratory for Computer Science, recalls that there was "tremendous synergy" between him and Berners-Lee when they met in Zurich in 1994.
Dertouzos was interested in creating an "information marketplace" and saw the World Wide Web as the "potential underlying mechanism" for his vision. 56 MIT had just successfully spun off the X Consortium, so the timing was good for Berners-Lee to set up the W3C when Dertouzos invited him to do so. Berners-Lee saw it as his opportunity to continue to drive the future of the Web as it grew.

54 Mark McCahill phone interview.
55 http://www.w3c.org web page.

CERN allowed the Web to leave and gave up its rights to the technology that Berners-Lee had developed there. David Williams, a manager at CERN, credits Berners-Lee and Robert Cailliau for this outcome. He says, "Tim was strongly of the opinion that only an open release would allow it [the Web] to take-off. While we are all happy with what happened I tend to feel that CERN should have tried to get a little more public recognition for our/his work." 57 But CERN did not try to do that, and the Web took off because of it.

The W3C is very active today in promoting standards for different Web technologies. Companies pay a fee for membership, and while they are not forced to abide by the recommended standards that the W3C develops, their products become incompatible with others if they choose not to. The aforementioned "browser wars," in which Mosaic and other browsers battled to outdo each other by adding more and more proprietary features, were brought under control by the consensus-building approach of the W3C. This stabilization and standardization fostered the growth of the Web, as developers and companies could depend on there being no major differences and incompatibilities among browsers.

The birth of the W3C signifies the acceptance of the Web as a new paradigm in information sharing. The W3C is a forum for the building of consensus on the direction of the Web.
It also represents the community that lives and works within this engineering paradigm. By working to extend the Web and fulfill its potential, the W3C carries out the kind of normal engineering that goes on within an established paradigm. Once people began to accept the W3C as an authority in matters regarding the Web, the engineering revolution had truly come full circle.

56 Michael Dertouzos interview.
57 David Williams email interview.

Conclusion

We have tried to tell the story of the Web in terms of an engineering revolution. The progression from the existing paradigm of unconnected computers and isolated islands of information to a connected world of universal information sharing is truly a revolutionary development. We have shown the similarities and differences between Kuhn's model of scientific revolutions and engineering revolutions. People today do not give a second thought to looking for information on the Web. While it may not always be obvious how to find the information, it is almost guaranteed to exist on some Web page. People just 10 years ago looked for their information in a dramatically different way: the world has truly experienced a paradigm shift.

Bibliography

Books

Berners-Lee, Tim. Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor. HarperCollins, 1999.

Gillies, James, and Cailliau, Robert. How the Web Was Born. Oxford University Press, 2000.

Naughton, John. A Brief History of the Future: The Origins of the Internet. London: Weidenfeld, 1999.

Papers and Journal Articles

Berners-Lee, T. J., Cailliau, R. and Groff, J.-F. "The World Wide Web." Computer Networks and ISDN Systems 25 (1992): pp. 454-59.

Kahle, B. et al., "Wide Area Information Servers: An Executive Information System for Unstructured Files", in: Electronic Networking: Research, Applications and Policy, Vol. 2, No. 1 (Meckler, New York, 1992) 59-68.

Berners-Lee, T. J. et al., "World-wide web: the information universe", in: Electronic Networking: Research, Applications and Policy, Vol. 2, No. 1 (Meckler, New York, 1992) 52-58.

Berners-Lee, T. J. et al., "World-wide web: an information infrastructure for high-energy physics", in: D. Perret-Gallix, ed., Proc. International Workshop on Software Engineering and Artificial Intelligence for High Energy Physics, La Londe, France, January 1992.

McCahill, M. et al., "The Internet Gopher: An Information Sheet", in: Electronic Networking: Research, Applications and Policy, Vol. 2, No. 1 (Meckler, New York, 1992) 67-71.

Berners-Lee, Tim. "Information Management: A Proposal." CERN, 1989.

Berners-Lee, Tim and Cailliau, Robert. "World Wide Web: Proposal for a HyperText Project." CERN, 1990.

"The Web Maestro: An Interview with Tim Berners-Lee", Technology Review, July 1996, archived at http://www.techreview.com/articles/july96/bernerslee.html

Roush, Wade, "MIT Reporter: Spinning a Better Web", Technology Review, April 1995, archived at http://www.techreview.com/articles/apr95/BernersInterview.html

Interviews

Email interview with David Williams, Computing and Networks Division, CERN, November 11, 2000
Interview with Alan Kotok, Associate Chairman of the World Wide Web Consortium, November 13, 2000
Telephone interview with Mark McCahill, Computer and Information Services, University of Minnesota, November 15, 2000
Email interview with Thomas Bruce, Legal Information Institute at Cornell Law School, November 15, 2000
Interview with Dave Gifford, Programming Systems Research Group, MIT Lab for Computer Science, November 15, 2000
Telephone interview with Peter Brown, University of Kent, November 17, 2000
Telephone interview with Ian Ritchie, Office Workstations Limited (OWL), November 20, 2000
Email interview with Pei Wei, University of California Berkeley/O'Reilly & Associates, November 21, 2000
Telephone interview with Doug Engelbart, Stanford Research Institute/Bootstrap Institute, November 27, 2000
Interview with Michael Dertouzos, Director of the MIT Lab for Computer Science, November 28, 2000

Videos

A conversation with the man who invented the World Wide Web [videorecording]: Tim Berner-Lee [sic], 1996, MIT Archives collection

Demonstration of NLS, Doug Engelbart, http://sloan.stanford.edu/MouseSite/1968Demo.html

Web Sites

RFC1630: Universal Resource Identifiers in WWW. http://www.ietf.org/rfc/rfc1630.txt?number=1630

Archives of the www-talk forum. http://ksi.cpsc.ucalgary.ca/archives/WWW-TALK/archives.html

List of Original HTML Tags, 1991. http://www.w3.org

Berners-Lee, Tim. HyperText Transfer Protocol Design Issues, 1991. http://www.w3.org

Grahn, J. Harmon. "The World Wide Web." http://www.olypen.com/harmon/np304.htm

Appendix A: Selected Emails from the WWW-Talk archive

All of these emails are hosted at http://ksi.cpsc.ucalgary.ca/archives/WWW-TALK/archives.html.

Item 1
Email from Tim Berners-Lee to Dan Connolly

Re: Motif browser status
timbl (Tim Berners-Lee)
Date: Fri, 8 Nov 91 13:35:26 GMT+0100
From: timbl (Tim Berners-Lee)
To: connolly@pixel.convex.com
Subject: Re: Motif browser status
Cc: kharris@pixel.convex.com, www-talk

Dan,

Thanks for your message. Obviously you know what you are doing with X11 browsers - we are impressed by what you have done to date. I was interested to hear that you are working on AVS - I have had some contact with AVS people at UNC.

You make a good point that the world has been waiting for a good formatted text widget under Motif. One exists under NeXTStep, Robert Cailliau is just adapting one for the Mac for hypertext, but under Motif it has been lacking. Of course, hundreds of people have written them: all the word processors have them in, and products like dynaText, etc. However, there is none in the public domain.

CERN like Convex has a copyright on all code, but we are doing our best to release W3 code as widely as possible, and possibly overcome this limitation. Why?

The concept of the web is of universal readership. If you publish a document on the web, it is important that anyone who has access to it can read it and link to it. In order to make this possible, we don't need very new technology -- what we do need is

1. A common open naming/addressing format
2. Sufficiently powerful underlying protocols
3. Sufficiently powerful data formats
4. Some free implementations

Now we have defined the (1), which did not exist before. We have supplemented the (2), where some protocols do exist. We have added a little to (3) though we will use all existing and new formats. We have written some code.

You say your work would be of considerable valuer to convex. Yes, that is true. You must ask yourself whether it would be of more value to convex if kept private or released for general consumption.
If you release it,

- Convex gets the credit and a higher profile, (as Thinking Machines has with WAIS indexers for example).
- Anyone in the world can read the information you supply with the same tool as they use for other information.
- You get a lot of useful feedback from users on the network
- A lot of people would be able to profit from what you have done

You have to compare this scenario with that if you keep the code private. You will be able to use it internally. Would convex be able to profit from by selling it? If so, how many people would actually buy it? Will the AVS project benefit from a closed private documentation scheme?

On these grounds alone, you may conclude that it is in Convex's interest to release the code. Still, you ask what we can "put on the table". If it would make it easier to justify the release of code, we would be happy to make all CERN-developed W3 code officially available to Convex under a more or less formal joint project agreement. Note that we are producing a parallel set of parsers and access mechanisms for HTML, newgroups, WAIS, prospero, etc. We have gateways, and other browsers. The line-mode browser you know, the Mac one is coming along, we may have a full-screen character grid browser too. We are currently unifying the browser architecture so that all access mechanisms can be used by all browsers. I'm not sure that either of our sides would want to be contractually bound to produce or maintain anything - the agreement would be just as-is code sharing of what exists when it exists, no strings.

You ask about graphics. That cannot be our next priority, as we need to get the new architecure and general format negociation worked out. In many cases, we find that there are GIF/TIFF viewers on various platforms, and one can link in to them.
We don't want to make a new graphics file format a la Mac/PICT, but we are intrerested in conversion code. Have you heard of editable Postscript? That might be what you are looking for. (See http://info.cern.ch/hypertext/Standards/PostScript/IPF.html)

I don't know whether your company has a mechanism for allowing code to be released into the public domain (or General Public License). If it is politically impossible, then that's a pity. (We do have a group of students in Finland working on an X implementation, and if that doesn't work out we could write it ourselves. It may also be that more that one implementation with a different style will be interesting. Obviously it would be rather a duplication of effort, though we are under a lot of pressure from our management and users to put this at the top of the agenda.)

I hope I have clarified the W3 team's philosophy, and perhaps convinced you to contribute, to our mutual (and the world's) benefit.

Tim

PS: Yes, I think you ought to be on www-talk, Dan. I'll put you on. The traffic is not too high.

__________________________________________________________
Tim Berners-Lee                timbl@info.cern.ch
World Wide Web project         (NeXTMail is ok)
CERN                           Tel: +41(22)767 3755
1211 Geneva 23, Switzerland    Fax: +41(22)767 7155