Photos of Larryblakeley
(Contact Info: larry at larryblakeley dot com)
The World Wide Web is a compelling example of an engineering paradigm. The adoption of the Hypertext Markup Language (HTML) and the Hypertext Transfer Protocol (HTTP) as standards represents the acceptance of this new paradigm: a revolution in information sharing. This work details the technical evolution of HTML and HTTP, from their inception to their acceptance as the technological foundation of the World Wide Web.
By no means did the World Wide Web signal the advent of online communications. The core technologies that the Web is based upon, such as TCP/IP and hypertext, existed for years and in some cases decades before the Web was invented. Files were transferred with FTP, network news was read through NNTP, and network packets circled the globe through the Internet. Technologies similar to the Web existed as well, including Gopher, a system developed at the University of Minnesota.
Despite all of these perfectly functional technologies, the Internet was fraught with incompatibilities, as data was stored in a myriad of often-incompatible formats and the usage of protocols was inconsistent across different platforms. Tim Berners-Lee invented the Web in part to resolve these incompatibilities, relying on a small set of simple protocols to share data in a platform-independent manner. This is the story of the Web.
Technological revolutions are characterized by a 'better way of doing things.' Unlike the acceptance of a broad scientific paradigm, the acceptance of a new engineering paradigm does not have fundamental implications for the whole world.
The rest of this paper explores the advent of the World Wide Web, providing a chronicle of its development within the defined technological revolution framework.
The Existing Information Sharing Paradigm
The engineering revolution that is the focus of this paper is that of the World Wide Web and its revolution of the existing information-sharing paradigm.
Before discussing the Web itself, however, we must first examine the paradigm immediately preceding the Web, the new technologies which arose that presented an anomaly in the paradigm, the crisis state that was created by this anomaly and the competing solutions that arose to solve the crisis.
We must understand what occurred before the Web in order to understand why the Web was developed and why it is the way to share information electronically today.
Computer systems prior to the 1960s consisted mostly of mainframes at large research and educational institutions and businesses. No one had conceived of a personal computer as we understand it today. The systems first used punch cards as a method of input. Each system had a designated operator who would feed it the cards dictating the operations that various users wanted performed. Only one set of cards could be input at a time, and the system could perform only one set of operations at a time.
But the systems were very large and relatively fast, and could handle much more work. Thus, timesharing of the systems began, in which a terminal with a wire connecting it to the mainframe could be placed on a user's desk and punch cards were eliminated. Each system used its own operating system and protocols.
There was no need for the systems to be compatible with each other, as there was no network to connect them. Data was thus confined to the local machines and information could only be shared within the institution which housed the mainframe.
Vannevar Bush, director of the US Office of Scientific Research and Development under President Roosevelt, wrote an article in 1945 titled "As We May Think," published in Atlantic Monthly. His "Memex" sounds a great deal like today's personal computers and the user's interaction with the system very much resembles that of hypertext, even though no one would "invent" either technology for decades.
This leads us to the three technologies that created an anomaly in the information sharing paradigm:
1. Collaboration and hypertext systems would give people a way to organize and share their information.
2. Pervasive networking would allow people to connect their systems together over large distances.
3. Personal computers would allow any person, at the office or at home, to share their information.
Not until all three of these technologies were developed was there really an anomaly in the information sharing paradigm. We will now examine each of these technologies in more detail in order to understand this anomaly.
Collaboration and Hypertext Systems
Had it not been for Vannevar Bush's provocative article, "As We May Think," Doug Engelbart might not have invented the mouse or formed the idea of the first hypertext system. Engelbart first stumbled across the Vannevar Bush article and recalls that it had a profound effect on him. "I remember being thrilled," he says. "Just the whole concept of helping people work and think that way just excited me. I never have forgotten that."
He was inspired and began to work on his oN-Line System (NLS) in 1963 and first publicly demonstrated it in 1968 at the Fall Joint Computer Conference in San Francisco. The system really was revolutionary for its time; it incorporated a graphical display, videoconferencing, hyperlinks for organizing information, and the computer mouse (which he invented for the project) for navigation.
NLS was a completely successful implementation of what Engelbart had imagined and people congratulated him for it. However, the same people just could not see the purpose of it given their view of the role of computers: large mainframes using punch cards.
Engelbart was a man far ahead of his time. Decades later, when asked about the lack of acceptance for NLS, he says "You can't bring stuff out against the prevailing paradigms." It would be more than twenty years before many of Engelbart's ideas would be used to force a paradigm shift.
Other collaboration and hypertext systems were also in development during the 1960s.
In fact, Ted Nelson would coin the term "hypertext" during a presentation at the 1965 Association for Computing Machinery conference in Pittsburgh. Ted Nelson would be one of the first insightful individuals to develop Doug Engelbart and Vannevar Bush's shared vision. Nelson first developed a system called Xanadu, which he described as a computerized version of Bush's Memex.
Nelson went to work with Andy van Dam and some of his students at Brown in 1967 to create the second hypertext system, the Hypertext Editing System (HES). The nature of these systems was to separate the links from the original source documents and maintain them in a central database. This database would have to be continuously maintained in order to retain the consistency of the bi-directional links.
Ted Nelson's Xanadu project continued for over two decades after its original inception. It is suspected that Xanadu never really became a complete system because of its visionary nature. A product must be concrete.
Xanadu presented itself almost as an entirely different way of life. Nelson's writing on Xanadu, "Literary Machines," made references not only to information management, but what people would think and how they would dress. Xanadu was just too far reaching to be seriously considered for a viable solution to the engineering crisis at hand.
On October 4th, 1957, the Soviet Union made history when it launched Sputnik I into orbit. President Eisenhower declared that the United States would never again be caught off guard by the Soviet Union and went to Congress on January 7th, 1958, to request funding for a new research and development organization.
Congress quickly approved funding for the new Advanced Research Projects Agency (ARPA) and, by 1962, ARPA's long-term goals had been refined. The two major parts of ARPA involved ballistic missile defense and nuclear test detection.
However, there was a small office, using less than 10 per cent of ARPA's budget, called the Information Processing Techniques Office (IPTO) that by 1965 was immersed in research for advanced computing.
Computer time-sharing among many different systems across the country was the solution to the lack of funding the IPTO had and the multitude of requests from computer science departments who each wanted to build very powerful computers. By sharing the computer resources, more could be done. Thus, the IPTO began work on a nationwide network to connect the various systems. By late 1966, plans for the ARPA Computer Network, or ARPANET, had begun.
It was not immediately clear how the network should be built. Initial ideas from the project's director, Larry Roberts, focused on just having each computer call up all the other computers using the existing phone lines to exchange information. This would force each computer to understand the operating system of all the others. There are obviously many problems with this approach, as the network could not scale very well or be flexible.
Wesley Clark, an ex-colleague of Roberts, came up with the idea of having small computers on the network to route data around, or sub-nets. This would mean the large computers on the network would only have to know one protocol or language to talk to each other – that of the small routing computers. The small computers were named Interface Message Processors (IMP) and their use was essential to the success of ARPANET.
Another important part of the design of ARPANET was that it used packet-switching. This new idea, developed independently by both Paul Baran and Donald Davies, cuts messages up into small chunks of the same size for transmission across the network. It allowed a network to exist that would be able to survive machine failure on one part of the network; the packets could just be rerouted through different parts of the network.
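The idea of cutting a message into equal-sized chunks that can travel independently can be illustrated with a minimal Python sketch. The sequence numbers, the 8-byte packet size, the zero padding, and the sample message are all assumptions made for illustration; this is not the actual ARPANET packet format.

```python
import random

def packetize(message: bytes, size: int = 8):
    """Split a message into numbered, fixed-size chunks (the last one is padded)."""
    packets = []
    for seq, start in enumerate(range(0, len(message), size)):
        chunk = message[start:start + size]
        packets.append((seq, chunk.ljust(size, b"\x00")))
    return packets

def reassemble(packets, length: int) -> bytes:
    """Reorder packets by sequence number and strip the padding."""
    data = b"".join(chunk for _, chunk in sorted(packets))
    return data[:length]

message = b"LOGIN request from UCLA to SRI"  # hypothetical payload
packets = packetize(message)

# Packets may take different routes and arrive in any order.
random.shuffle(packets)
restored = reassemble(packets, len(message))
```

Because every packet carries its own position, the receiver can rebuild the message no matter which path each packet took, which is what makes rerouting around a failed machine possible.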
UCLA was the chosen location for the first IMP. The second was put into place at the Stanford Research Institute (SRI) in October 1969, and with it the first packet-switching computer network was established.
ARPANET grew quickly, adding about one host per month. In 1972, the first email was sent across ARPANET and the network was publicly demonstrated for the first time at the first International Conference on Computer Communication (ICCC) in Washington, DC.
The first protocols on the ARPANET were hastily thrown together; they were telnet and FTP.
• Telnet allowed someone sitting at a terminal to log in to a computer at another site.
• FTP, the File Transfer Protocol, allowed files to be transferred from one computer to another.
These two protocols were a good start, but hardly sufficient for the network.
In June 1973, Vint Cerf, Bob Kahn, Bob Metcalfe and Gérard Le Lann started to work on the design of a host-to-host protocol for inter-networking. They called it the Transmission Control Protocol (TCP).
TCP proved to be a robust host-to-host protocol that allows reliable communication over unreliable networks.
In 1978, TCP was combined with another protocol, the Internet Protocol (IP), to form TCP/IP.
IP controlled the routing of packets around the network.
Thus, with the introduction of the Domain Name System (DNS) by Paul Mockapetris and the adoption of TCP/IP by ARPANET in 1983, the Internet as it is known today was born.
This pervasive network, the Internet, allowed any network of computers to be connected to any other network of computers. People could now share information electronically, using a universally agreed-upon protocol.
But in the early 1980s, computers were still primarily at large institutions and there was a relatively small group of users. Not until the rise of personal computers would there be a true crisis in the paradigm of information sharing.
Computers used in the existing information sharing paradigm were massive in both size and price. Only large institutions could afford and support them. But the invention of the transistor in 1947, the integrated circuit in 1958, the microprocessor in 1969, and magnetic memory in 1972 set the stage for computers to become smaller and cheaper. They would become a part of most people's daily lives: a PC would soon sit on their desks at work and at home.
The Xerox Palo Alto Research Center (PARC) began work on its Alto computer in 1973. It was the first computer designed specifically for use by non-computer scientists. It actually was the first system to use many of Doug Engelbart's ideas, most notably the mouse. It had a graphical user interface (GUI) with windows, menus and icons. It looked very much like today's PCs.
In January 1975, an almost completely different system was released. The Altair 8800, released by Micro Instrumentation Telemetry Systems (MITS), sold for $397. Users had to assemble the system themselves and the system was essentially a series of switches and lights.
Bill Gates and Paul Allen, for example, saw the Altair and formed Microsoft to sell software that allowed BASIC programming on the system.
Steve Wozniak saw the Altair and was determined to build something better; the cost of microprocessors and memory had fallen dramatically. Wozniak joined forces with Steve Jobs to form Apple Computer and they released the Apple II at the West Coast Computer Faire in May 1977, selling it for a more reasonable $1300. Apple IIs sold quickly and many companies started to develop software for the machine.
The early 1980s saw the real birth of the personal computer as it is thought of today. In 1981, IBM launched its PC, selling for $2,800. Businesses loved the PC.
IBM's decision to allow clones ensured rapid growth of the technology.
Microsoft wrote the operating system, MS-DOS, and would later write the Windows operating system.
During this same time frame, Apple was releasing its next computer. In 1984, Apple released its first Macintosh computer, selling it for $2,495. It was the first computer that brought the technology that Xerox had developed based on Engelbart's ideas to the masses. The GUI stunned people and the Macintosh quickly developed a strong following amongst desktop publishers, something that it still maintains today.
The IBM PC and the Macintosh changed the face of personal computing forever.
A Crisis Develops
The exponential growth of personal computers into the 1990s signaled the establishment of the third technology that was needed for an anomaly in the information sharing paradigm.
Recall that an anomaly occurs in an engineering paradigm when an incompatibility with the existing paradigm arises.
In this case, information that was previously used by few individuals and stored on isolated systems suddenly had the potential to be used by many people and on any system.
The three technologies (collaboration/hypertext systems, pervasive networking, and personal computers) created this anomaly. But there was no single solution to the problem created by this anomaly.
No system was commonly accepted that allowed people to collaborate across a network using their PC. Until such a solution was developed, the anomaly brought about a crisis in the information sharing paradigm.
Competing articulations for solving the anomaly are characteristic of a crisis state, and we will now discuss those competing solutions.
Any solution to this crisis state would have to have three characteristics for it to be successful.
First, it must possess the ability to discover the information that is desired. The Internet allowed all the networks in the world to be connected, but provided no real means for finding desired information scattered across the millions of machines that were connected.
Secondly, any solution would have to be able to retrieve the information regardless of the operating system and software being used. A standard protocol implemented on all platforms would be essential for success.
Finally, the new system would have to be able to display the information uniformly once it was received, again regardless of the platform being used.
Several solutions were developed that shared all three of these characteristics.
Some would argue that creativity is at its peak during an engineering crisis state. In the case of the crisis being described in this paper, we will see that this is the case.
There were many competing articulations that attempted to solve the discovery, display, and retrieval crisis in information sharing described above. However, these solutions solved the problem in a variety of creative and differing manners.
There were more competing articulations than can be adequately researched within the scope of this paper, but several, such as WAIS, Archie, Guide, and Gopher, will be discussed in the following sections.
Two solutions are particularly relevant to the development of the World Wide Web project, as well as to the question of 'what could have been if only...': Office Workstations Limited's Guide and Mark McCahill's Gopher.
Wide Area Information Servers
Wide Area Information Servers (WAIS) could be construed as the opposite of Xanadu. WAIS was concrete, concise and focused on a commercial marketplace.
It was the first successful information management system that was a commercial venture from the start. Brewster Kahle, then working for the supercomputing company Thinking Machines, was the head of the project.
WAIS was similar to a modern-day search engine. Its key benefit was its ease of use and relatively intuitive interface.
WAIS Inc. was one of the first Internet companies and proved to many skeptics that an Internet company was possible.
WAIS did not become the new paradigm because it was missing one key ingredient, a humanitarian vision.
WAIS Inc. and Brewster Kahle's vision were not compatible with a free world-wide solution. The company was merely creating a product for profit.
Archie, Prospero and FTP
Telnet and FTP were the first protocols on ARPANET and neither protocol's clients were robust or easy to use.
This is where Archie came into play as a possible solution to the ongoing crisis. To find programs on the Internet in 1983, users were forced to conduct manual searches of known FTP servers. Even then, the filenames of these programs might not yield useful descriptions.
Alan Emtage, a student at McGill University in Montreal, Canada, sought to alleviate some of the mundane searching with Archie in 1983. Archie was a few shell scripts that automated this search.
As Archie quickly became popular, Clifford Neuman, a Ph.D. student at the University of Washington, saw his chance to improve upon Archie by integrating it with his distributed file system, Prospero.
Prospero was used to organize Archie's files as well as expedite the discovery process. This system quickly became very popular because of the growing urgency for a new paradigm.
The Prospero-Archie-FTP system adequately solved discovery, display and retrieval problems, but it did not become the new paradigm. "Only hardcore users knew about FTP."
Two contemporaries of the Web deserve a more detailed look in this paper because of how close they came to becoming the new paradigm.
Interviews with the systems' parents and corroborating documents shed light on this rarely documented turn in the Web's history.
Tim Berners-Lee, the undisputed father of the World Wide Web, approached both Office Workstations Limited (OWL) and the Gopher team to try to convince them to adapt their systems to conform to his humanitarian vision of a single global information space.
Guide was first developed as an academic project by Peter Brown in 1983 at the University of Kent in the United Kingdom. This academic version of the hypertext system Guide worked a little differently than its later commercialized counterpart:
The way it works is that when you hit a button it replaces that button in-line so there is a sort of metaphor in which you have an expanding document; you hit a button and that button expands, in-line, so it remains in the context where it was in the whole document. If you undo it, it goes back to the previous [context].
Ian Ritchie, the founder of Office Workstations Limited (OWL), purchased Guide from Peter Brown and the university with a sole license to the system.
OWL released their version of Guide for the Macintosh in August 1986, winning the honor of becoming the first commercially successful hypertext system.
OWL's implementation of Guide was extremely robust and much more advanced than Berners-Lee's World Wide Web. Guide used a markup language called the Hypertext Markup Language (HML), not to be confused with Berners-Lee's HTML. Ian Ritchie describes HML:
"That was a version of SGML with added tags to indicate things like a button or a button source or a point, headers and body text, so forth. And by marking up your documentation in that way you could automatically generate a hypertext display. And you could go from document to document. And if your documents were on a file server you could go across a network to them. All that stuff was very similar to what people today call XML."
OWL even produced Guidex (originally IDEX), a networked version of Guide built as a solution for a particular client.
Ian Ritchie even described a similar version of Guide for a client that worked over the Internet!
It is mind boggling to look at this little-known hypertext system that was much more advanced than the famous World Wide Web before the Web was really started.
Tim Berners-Lee was looking for a jump start on his Web project in an already existing system. He finally saw one that piqued his interest when Guide was demonstrated at the first European Hypertext Conference at Versailles, near Paris, in 1990.
However, Guide and Guidex were not public domain, and OWL was a for-profit company. CERN was in no position to purchase external engineering as they were devoting most of their budget to particle accelerator projects and other expensive physics equipment.
If CERN had purchased Guide from OWL it quite possibly could have become what we think of as the World Wide Web today. However, Guide was much more advanced than the Web, so some of the well established problems of the Web today (such as dead links) may have already been solved.
In the late 1980s, all major universities had embraced the Internet and begun to digitize various content. These institutions were plagued by the same discovery, display and retrieval crisis as the rest of the networked community. In the early 1990s, Campus Wide Information Systems (CWIS) seemed to be the obvious solution. These systems were very different, but they all had a common goal: to organize the vast microcosm of the university's information space.
The University of Minnesota was no exception. A lot of people had an interest in participating in the university's effort, so a committee was formed to design their ideal system. Mark McCahill, an employee at the university's computing center, was eventually given the task of managing the project. In short, McCahill and his lead developer, Farhad Anklesaria, felt the messy decree delivered by the committee was unacceptable and followed their own design.
Their CWIS was named Gopher, after the university's mascot, and McCahill and Anklesaria went to the committee in 1991. Gopher met with enormous resistance from the University committee because it did not follow their decree. The University of Minnesota rejected Gopher.
However, the system was still released to the public domain in June of 1991. Gopher quickly became the Web's biggest contemporary competitor and was the first solution to the impending crisis that really opened the Internet to anyone. It was simple, intuitive and well-structured: a hierarchy of menus whose items could be submenus or objects (such as a document or a picture).
"The information for any given item is enough to identify the server it's on."
This feature was integral to the success of Gopher because it meant the information did not have to be stored on the computer you were using.
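How a menu item can carry enough information to locate its own server can be sketched in Python. The sketch assumes the tab-separated menu line layout described in the Gopher protocol specification (RFC 1436): a one-character type, a display string, a selector, a host, and a port. The host names and selectors below are hypothetical.

```python
from typing import NamedTuple

class MenuItem(NamedTuple):
    item_type: str  # "0" = text file, "1" = submenu (directory)
    display: str    # text shown to the user
    selector: str   # string sent to the server to fetch the item
    host: str       # server that holds the item
    port: int       # TCP port on that server

def parse_menu_line(line: str) -> MenuItem:
    """Parse one tab-separated Gopher menu line into its fields."""
    fields = line.rstrip("\r\n").split("\t")
    head, selector, host, port = fields[:4]
    return MenuItem(head[0], head[1:], selector, host, int(port))

# Hypothetical menu: two items living on two different servers.
menu = (
    "1Campus Departments\t/departments\tgopher.example.edu\t70\r\n"
    "0About this server\t/about.txt\tgopher.example.org\t70\r\n"
)
items = [parse_menu_line(line) for line in menu.splitlines()]
```

Each parsed item names its own host and port, so a single menu can mix local items with items stored on machines anywhere else on the network.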
One thing had to be added before Gopher's preliminary release: full-text database searching.
Tim Berners-Lee also recognized the value of Gopher as a world-wide system and approached the Gopher team at the 1991 Internet Engineering Task Force (IETF) meeting.
Mark McCahill was quick to reject Tim Berners-Lee's ideas for one very good reason: the Web project lacked structure. Gopher's biggest strength was its structure and ability to organize information for linear thinkers.
By 1993, Gopher had exhibited exponential growth in its user base. "In April of that year, there were just over 460 registered gopher servers; by December there were over 4800."
Soon, the University of Minnesota's computing center was forced to accept Gopher as it took off around the country's universities.
This was the system that could have been the new paradigm. It had a jump start on the WWW with a substantial user base.
However, the Gopher team would soon make a big mistake. Companies were going to the University and wanted to use Gopher for their systems. So, the university decided to start charging everyone except for non-profit and academic institutions. "It pissed a lot of people off," remembers McCahill.
In 1994, Gopher lost ground relatively quickly and was eventually consumed by the WWW.
Tim Berners-Lee's Vision
Tim Berners-Lee was the son of two mathematicians who programmed some of the world's first computers. As Berners-Lee notes in his recent book, "They were full of excitement over the idea that, in principle, a person could program a computer to do most anything." He recalls watching his father write a speech for Basil de Ferranti (of Ferranti Computers, a seller of the early Mark I machine). His father was researching the brain, thinking about the random associations that the brain makes and exploring the possibility of computers behaving similarly. The idea struck Berners-Lee and he continued to think about that topic throughout his life.
Berners-Lee continued in his education at Oxford, graduating in 1976 with a bachelor's in physics. After a few years at Plessey Telecommunications and DG Nash, Tim took a consulting job in 1980 with CERN, the European particle physics laboratory in Geneva.
Here, Berners-Lee wrote his first attempt at information management, a program he called Enquire. He just wrote it in his spare time for personal use, to remember the connections at CERN among the various computers, projects and people at the lab. Berners-Lee credits this as the time that a larger vision took root in his head.
"Suppose all the information stored on computers everywhere were linked. Suppose I could program my computer to create a space in which anything could be linked to anything. All the bits of information in every computer at CERN, and on the planet, would be available to me and anyone else. There would be a single, global information space."
The grand design for the system was for it to be a universal information space: a system that could store anything and the links between pieces of information.
Berners-Lee decided on a web-like structure using hypertext, in which nodes of information were connected by links, which could connect any two pieces of information together to describe a relationship between those pieces of information. By completely abandoning any notion of constrained structure, and by using simple protocols and formats, Berners-Lee created a system that could be acceptable to all: "a system with as few rules as possible."
To represent the nodes in this web of information, Berners-Lee chose a simple hypertext document format, represented by standard ASCII text documents.
Special spans of text in each document, such as headings and links, were set off from the rest of the text using simple tags like <BODY></BODY>.
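To make the idea of tagged plain text concrete, here is a minimal sketch that pulls the headings and links out of a small tagged document. It uses the modern Python standard-library parser, not any software from the period, and the sample document and file name are hypothetical.

```python
from html.parser import HTMLParser

class NodeOutline(HTMLParser):
    """Collect heading text and link targets from a tagged text document."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self.links = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <H1> arrives as "h1".
        if tag == "h1":
            self._in_heading = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self.headings.append(data.strip())

# Hypothetical document: ordinary text with a heading and one link.
document = (
    "<BODY>\n"
    "<H1>Project Notes</H1>\n"
    'See the <A HREF="minutes.html">meeting minutes</A> for details.\n'
    "</BODY>\n"
)
outline = NodeOutline()
outline.feed(document)
```

The document itself is nothing more than characters any platform can store and transmit; only the tags give a viewer the structure it needs.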
The most important aspect of this representation of hypertext is that it is extremely platform-neutral, as plain text can be read by every computer platform. Berners-Lee realized this, and used the cross-platform nature of ASCII text to truly exploit the diversity of different computer systems and networks that could possibly use the Web.
Following from the establishment of this arbitrary web structure for his universal information space, Berners-Lee made the key insight that since any node could be linked to another node, there existed an equivalency between all nodes from a systems point of view.
A page buried deep inside a proprietary help system would be no different from a transcript typed into a flat text file: each node would need to have a unique address, allowing a user to bridge across information systems seamlessly within a single viewer.
In addition to unifying different data formats, this common addressing scheme sought to unify various hardware platforms as well, since the address would be platform-neutral.
CERN's organizational structure dictated another important aspect of this new system. Since CERN's scientists were constantly coming and going, and doing work from remote locations, it was essential that the new information system be network-aware.
This measure strengthened the platform-independence of the system, as common networking protocols are often used to connect disparate systems together.
TCP/IP was chosen because it was used already by the Unix world, and VAX machines could be patched to support TCP/IP as well.
However, perhaps the most important and desirable feature of a network-distributed system is that new systems can come online without disrupting or modifying any existing systems, allowing the system as a whole to scale with the size of the network.
Scalability was a key issue that Berners-Lee was aware of from the beginning. It proved to be the downfall of many more robust systems attempting to solve the information management crisis.
He realized that any one central control point for any aspect of the system would limit its ability to scale up to thousands or millions of users or servers. It could also limit access to certain groups of people.
A key aspect of decentralization was the realization that a central link database would fatally cripple the system's ability to scale.
The benefit of a central link database is that there would never be a broken link; however, Berners-Lee decided that the ability to scale up was worth the occasional inconvenience of broken links.
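The trade-off can be shown with a small Python sketch in which every page stores its own outgoing links and no central table tracks them. The page names and the `broken_links` helper are hypothetical and exist only for illustration.

```python
def broken_links(pages: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (source, target) pairs whose target address no longer exists."""
    return [
        (source, target)
        for source, targets in pages.items()
        for target in targets
        if target not in pages
    ]

# Each page keeps its own outgoing links; there is no central link database.
pages = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": [],
}
before = broken_links(pages)

# The owner of c.html removes it without notifying anyone else.
del pages["c.html"]
after = broken_links(pages)
```

Removing a page requires no coordination with any other site, which is what lets the system grow without a bottleneck; the price is that the two links pointing at the removed page are now dangling.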
Collaboration, and more generally, online editing of hypertext documents was a persistent theme in Berners-Lee's vision for a universal information space. On the most fundamental level, users would be able to add new links from one node to another as they browsed.
Berners-Lee elaborated further on his vision of collaboration in 1996 in an interview published in the MIT Technology Review. He envisioned web documents as fully collaborative living documents, able to be edited in real time by several people simultaneously and to include live video and/or audio feeds from each participant.
Clearly, Berners-Lee envisioned a large degree of two-way interaction on the Web, especially for the purpose of facilitating group communication.
Above all, Berners-Lee desired a simple, straightforward design with lots of room for future expansion. Too many innovations and new concepts all at once would result in lower acceptance of his vision. Berners-Lee's decision to keep his vision simple and pragmatic allowed others to quickly adopt the Web into their own software efforts.
From Vision to Proposal
Berners-Lee wished to implement his vision as a new information management system for CERN. Berners-Lee observed that organizations and relationships at CERN often did not follow rigid hierarchical structures, especially when looked at over time.
Information about a project should be able to grow and evolve along with the organizations that work on it, and therefore "the method of storage must not place its own restraints on the information."
The Engineering Realities
Having created a vision of a universal information space that allows both viewing and cooperative editing of hypertext documents, Tim Berners-Lee faced the challenge of turning his ambitious vision into an engineering reality.
Berners-Lee's first thought was to add networking capabilities to existing hypertext systems. At the European Conference on Hypertext Technology in 1990, Berners-Lee approached Ian Ritchie of OWL Ltd. about merging his ideas with OWL's Guide hypertext system.
The Guide was "astonishingly like what [Berners-Lee] had envisioned for a Web browser," including the functionality to edit hypertext documents.
However, Berners-Lee was unable to capture Ritchie's interest. Ian Ritchie, the founder of OWL, had a different view. OWL's Guidex already had most of the features, including retrieval of documents over a network, that Berners-Lee wanted for the Web.
What OWL did not realize was that the networking aspect of an information system would prove to be more important than its hypertext aspects.
Dynatext, created by Electronic Book Technology, was another potential hypertext base for Tim Berners-Lee's Web, but here too, he was unable to make his case.
It was becoming clear that if the Web were ever to see the light of day, Tim Berners-Lee would have to design and build it himself, from scratch. The first implementation of the Web was completed in about two months by Tim Berners-Lee, working alone.
The HTTP protocol is therefore layered on top of existing Internet standards such as TCP/IP (Original HTTP, 1991).
For example, when a URL such as http://web.mit.edu/index.html is entered into a Web browser, the HTTP connection takes place in three phases:
1. the type of the URL request is determined to be HTTP.
2. the name of the remote server, web.mit.edu, is extracted from the URL.
3. the actual IP address is looked up via the DNS system.
Since no TCP port number on web.mit.edu has been specified, the default port for HTTP connections (port 80) is used.
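The steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the code of any actual browser: the function names parse_http_url and lookup_address are hypothetical, and the sketch relies on the standard library's urlsplit and gethostbyname to stand in for the URL parsing and DNS lookup described in the text.

```python
import socket
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # used when the URL names no port, as described above


def parse_http_url(url):
    """Split an HTTP URL into (host, port, path). Hypothetical helper."""
    parts = urlsplit(url)
    # Phase 1: determine that the request type is HTTP.
    if parts.scheme != "http":
        raise ValueError("not an HTTP URL: " + url)
    # Phase 2: extract the name of the remote server.
    host = parts.hostname
    if host is None:
        raise ValueError("URL has no host name: " + url)
    # No explicit port in the URL means the default port is used.
    port = parts.port if parts.port is not None else DEFAULT_HTTP_PORT
    path = parts.path or "/"
    return host, port, path


def lookup_address(host):
    """Phase 3: look up the IP address via DNS (requires network access)."""
    return socket.gethostbyname(host)


host, port, path = parse_http_url("http://web.mit.edu/index.html")
print(host, port, path)
```

Calling lookup_address(host) would then perform the DNS step; it is left out of the example run so the sketch works without a network connection.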
The Web browser now makes an ordinary TCP connection to port 80 on the computer named web.mit.edu. Once the connection is established, the Web browser sends a single line of ASCII characters, which is the actual HTTP request.
In our example the line will read "GET /index.html" (the path portion of the URL). The HTTP server running on web.mit.edu receives this request and tries to find the file "index.html." If the file exists and is readable by all users, then it is sent to the Web browser.
But if the file cannot be located or cannot be read, an error message is sent to the Web browser instead.
When the server has finished sending either the file or the error message, it closes the TCP connection with the Web browser, thus ending the HTTP connection.
It is then up to the Web browser to correctly display the document that it has received.
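The exchange just described can be imitated with a toy client and server. The following Python sketch is an approximation of the early one-line protocol, not a reproduction of Berners-Lee's software: all function names are hypothetical, the server runs on the loopback address with an operating-system-assigned port rather than port 80, documents are held in a dictionary instead of files, and the request path is written with a leading slash.

```python
import socket
import threading


def build_request(path):
    """The whole request: one ASCII line naming the document."""
    return ("GET " + path + "\r\n").encode("ascii")


def read_line(conn):
    """Read bytes from the connection up to and including a newline."""
    buf = b""
    while not buf.endswith(b"\n"):
        chunk = conn.recv(1)
        if not chunk:
            break
        buf += chunk
    return buf.decode("ascii").strip()


def serve_once(listener, documents):
    """Accept one connection, answer one request, then close it."""
    conn, _ = listener.accept()
    with conn:
        method, _, path = read_line(conn).partition(" ")
        body = documents.get(path) if method == "GET" else None
        if body is None:
            # Stand-in for the error message sent instead of the file.
            body = b"Error: document cannot be found or read"
        conn.sendall(body)
    # Leaving the with-block closes the socket, which ends the exchange.


def fetch(host, port, path):
    """Connect, send the one-line request, read until the server closes."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def run_demo():
    """Serve two requests on the loopback interface and collect the replies."""
    documents = {"/index.html": b"<title>Welcome</title>"}
    replies = {}
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        for path in ("/index.html", "/missing.html"):
            worker = threading.Thread(target=serve_once, args=(listener, documents))
            worker.start()
            replies[path] = fetch("127.0.0.1", port, path)
            worker.join()
    return replies


replies = run_demo()
```

Note that the client learns the response is complete only because the server closes the connection; there are no headers or length fields in this simplified exchange, which mirrors the simplicity the text attributes to the first version of HTTP.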
Its most important feature, however, was that it worked well enough to be a usable system. Similarly for HTML, simplicity and functionality were its strengths.
The simplicity of HTML and HTTP also had its drawbacks, however. One of the features Tim Berners-Lee had to sacrifice was the idea of cooperative editing of HTML pages over the Web. In order to implement such a two-way Web, the HTTP protocol would have needed a good authentication system so that users could be given permission to write HTML documents onto a remote computer.
The Struggle For Acceptance
When Tim Berners-Lee implemented the first version of the Web, the world did not beat a path to his door.
Two important sources of early resistance to the Web were CERN and the hypertext community. It is a common misconception, perpetuated to a degree by Tim Berners-Lee's own accounts, that neither CERN management nor the hypertext community saw the usefulness of the World Wide Web. It is true that they underestimated its importance, as almost everyone did at the time. But CERN and the hypertext community were not blind to the potential of a networked information system. We will explore some of the reasons CERN and the hypertext community rejected the early Web.
As we have said earlier, CERN is the center of European high-energy physics research. CERN's policy regarding software systems is "buy, not build." However, this policy is not the only reason CERN did not take up Tim Berners-Lee's World Wide Web project. According to David Williams, Tim Berners-Lee focused much of his persuasive effort on the hypertext aspects of the Web. However, there were a number of people at CERN, Williams included, who did not feel that hypertext was an appropriate way to represent information.
The Hypertext Community
The key to Berners-Lee's idea was of course the marriage of hypertext and the Internet.
But to the hypertext community, his implementation was not an interesting hypertext system because it addressed none of the issues facing hypertext systems of the time.
For example, link consistency was a big concern for most hypertext systems in the early '90s. When contents move in a system, links that point to the contents can become broken. Many hypertext systems were designed to deal with this problem. The Web, on the other hand, completely ignored the problem of link consistency.
Tim Berners-Lee was aware of the link consistency problem, but he correctly recognized that there was no simple, scalable way to ensure global link consistency in a worldwide information system.
Why the Web Won
Despite the resistance Tim Berners-Lee faced at CERN and in the hypertext community, the Web did beat out its competitors to become the most widely used electronic information system.
The triumph of the Web came from a combination of many factors, all of which contributed to its eventual success as a new paradigm:
• Tim Berners-Lee's promising vision,
• the simplicity and openness of the Web's design and initial implementation,
• the tremendous grass-roots support from users and developers, and
• fortuitous timing.
For people outside CERN, the idea of a global information space was very appealing. However, the simplification of Tim Berners-Lee's vision was the only way for the Web to take root and grow.
While it created something of a problem because ill-formed documents could not be viewed, the lack of error checking was the most important factor in the growth of the Web.
When Tim Berners-Lee released his creation to the people, the Web enjoyed tremendous grass-roots support on the Internet. Berners-Lee had always known the importance of gaining a "critical mass" of users and information on the Web.
In fact, the first Web server that he set up at info.cern.ch served documentation on the Web itself and instructed people on how they could set up their own Web servers.
Here too, the simplicity of HTTP and HTML was of great value. Similarly, the HTML defined by Berners-Lee was simple enough that several individuals were able to develop Web browsers from scratch.
Pei Wei, then at the University of California at Berkeley, developed Viola, a graphical browser for the X Windows environment. Other developers of early browsers, such as Tom Bruce, whose Cello was the earliest Windows-based browser, also credit the simplicity of the Web for its quick acceptance. Bruce believes that Berners-Lee was able to "avoid letting the perfect become the enemy of the good" by keeping early HTML and HTTP simple and adding more functionality as the Web grew.
But because the Web was simple enough and exciting enough, it spread to all the important platforms through the work of its grass-roots supporters.
Once free Web browsers became available on all platforms and independent HTTP servers began to come online, the Web took off quickly.
The advent of cheap PCs and the commercialization of the Internet further fueled its incredible growth. In less than 10 years, the Web went from a system no one wanted to a household word, thanks in large part to the grass-roots support it enjoyed in its early days.
The World Wide Web had some additional advantages over its closest competitor, Gopher. In the Spring 1992 issue of the Electronic Networking journal, Berners-Lee wrote that since Gopher uses the "directory and file model to implement a global information system," it would "map into the Web very naturally, as each directory (menu) is represented by a list of text elements linked to other directories or files (documents)."
In other words, the Web was a more general system, capable of encompassing most of Gopher's capabilities.
Another advantage the Web had over Gopher was that the "web gave information away and had ads to support it." Companies saw the opportunity to make money and therefore chose the web over Gopher.
The Web was "very pretty" and while librarians loved Gopher for its orderliness, companies loved the Web for its graphics capabilities. The graphical capabilities of Mosaic were one of the main reasons the Web became so attractive to people outside the circles of academics and research. As the saying goes, "graphics sells," and in the case of the Web, the graphics gave it a tremendous leg up on competing systems such as Gopher.
However, as the Web grew and Mosaic became the dominant browser, the danger that a group of aggressive developers could hijack the development of the Web became increasingly present. It was clear that in order for Berners-Lee's vision of the universal Web to survive, some neutral body was needed to build consensus among the various forces that were pulling the Web in different directions.
Adoption of the New Paradigm
The Web needed to leave CERN for it to truly succeed; CERN was, after all, a particle research facility and not focused on computer science research. Berners-Lee therefore moved his work and advocacy for the Web to the Massachusetts Institute of Technology's Laboratory for Computer Science and started the World Wide Web Consortium (W3C). "The W3C was founded in October 1994 to lead the World Wide Web to its full potential by developing common protocols that promote its evolution and ensure its interoperability."
Michael Dertouzos, the Director of the Laboratory for Computer Science, recalls that there was "tremendous synergy" between him and Berners-Lee when they met in Zurich in 1994. Dertouzos was interested in creating an "information marketplace" and saw the World Wide Web as the "potential underlying mechanism" for his vision.
MIT had just successfully spun off the X Consortium and this provided good timing for Berners-Lee to set up the W3C when he was invited to do so by Dertouzos. He saw it as his opportunity to continue to drive the future of the Web as it grew.
CERN allowed the Web to leave and gave up its rights to the technology that Berners-Lee had developed there.
The W3C is very active today in promoting standards for different web technologies.
The aforementioned "browser wars," in which Mosaic and other browsers battled to outdo each other by adding more and more proprietary features, were brought under control by the consensus-building approach of the W3C. This stabilization and standardization fostered the growth of the Web, as developers and companies could depend on there not being major differences and incompatibilities between the different browsers.
The birth of the W3C signifies the acceptance of the Web as a new paradigm in information sharing. The W3C is a forum for the building of consensus on the direction of the Web. It also represents the community that lives and works within this engineering paradigm. By working to extend the Web and fulfill its potential, the W3C carries out the kind of normal engineering that goes on within an established paradigm. Once people began to accept the W3C as an authority in matters regarding the Web, the engineering revolution had truly come full circle.
- "The World Wide Web as an Engineering Paradigm," Matthew Lee, Andrew Montgomery, Steven Shapiro, Veeral Shah, Qian Wang, December 15, 2000