Web Wandering

Tangled in Your Web

Michael Newcomb

This month, I'll present an overview of setting up your own World Wide Web page and a list of some tools you might find useful for that task. By most accounts, the number of home pages on the Web is growing geometrically, if not exponentially; if everyone else is out there, why shouldn't you be too?

Internet Tips

Owing to space limitations, there is only one Tip this month: check your phone line. Today's modems try to squeeze ever more information into the narrow pipeline provided by your phone company. This pipeline was designed to carry the importunate blandishments of telemarketers, not the vast quantities of data churned up by Web surfing.

Accordingly, it pays to see how fast your connection to the Web really is. Look carefully at the messages displayed when your modem connects to your Internet provider. If you use Windows 95, the Dial-Up Networking applet will tell you the connection speed reported by your modem; most other TCP/IP stacks display this speed at least for a moment. If you have a 28.8 kbps modem, does the speed message say "Connected at 28800"?

The answer may be "no" more often than you would think. For most 28.8 kbps modems, it's very common to connect at 26.4 kbps. Sometimes, the connect speed may be even slower. Why? There are three possible sources of trouble.

The Phone Company

One kind of trouble comes from the phone company. From what I have been able to find out, the latest digital switches used by NYNEX and the other regional phone companies use smart compression algorithms to allocate trunk bandwidth among the calls they carry.

There is a lot of compressible data in a voice conversation, including pauses between words and sentences that seem very long to microprocessors. Data calls aren't so lucky: given the compression built into most modems, there isn't much room left for the switch's compression.

If traffic is low, this isn't a problem: the switch "steals" bandwidth not being used by voice calls to carry the data. During high traffic periods, there may be no trunk bandwidth left to steal. In fact, if the number of calls is high enough, the switch may start reducing the net bandwidth allotted to each call. This reduction probably wouldn't be noticeable during a voice call, but your modem can sure feel it.

This allocation problem is compounded by the fact that Internet providers often have dozens or even hundreds of modems carried by the same switch. At peak usage periods, with every call trying to connect at 28.8 kbps, it's easy to see how the phone company can run out of bandwidth. The resulting degradation causes you to connect at 26.4 kbps or even 14.4 kbps.

There's very little you can do about this problem. The only real solutions are to bypass the voice-handling circuits altogether, which is one of the things ISDN is supposed to do, or to call only when traffic is low.

Your Internet Provider

The second kind of connection trouble you can't control is speed throttling at the provider's end. Obviously, your Internet provider's host equipment has only a given amount of processing power and only a certain amount of throughput to the Internet itself. If the provider adds more modems than his equipment or his Internet link can comfortably support, you get a lower-speed connection even if you nominally connect at 28.8 kbps.

Some providers will lower the maximum connect speed for their modems when their equipment becomes heavily loaded. Reducing the maximum connect speed effectively gives the host equipment more time to service each modem. This is a handy way to solve the provider's problems, but it can transform your Web surfing into Web wading or even Web trudging.

An ftp transfer can test the real speed of your connection. Most ftp clients tell you the connection throughput as part of their progress display. WS_FTP, for example, gives throughput in effective bits per second, transfer time, and transfer time remaining.

For a fair test, you should retrieve a file from the provider's system; this prevents the vagaries of the Internet from interfering with the results. For example, to test TIAC's throughput, I would connect to their utility ftp site ftp.tiac.net and retrieve any largish file. Since the file is coming directly from TIAC's hardware and not over the Internet, I should get accurate and repeatable results.

Having run this experiment on TIAC a few times, I can report that results vary drastically depending on the time of day. Throughput at peak usage times (business hours on a weekday) may be half the speed obtainable off peak (the middle of the night).
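If you would rather script this test than watch a progress display, here is a minimal Python sketch of the same measurement. Python is my choice for illustration, not something the tools above require; the function names are made up, and the remote path in the usage note is a placeholder you would replace with a real file on your own provider's server.

```python
from ftplib import FTP
import time


def effective_bps(bytes_transferred, seconds):
    """Return throughput in bits per second for a completed transfer."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return bytes_transferred * 8 / seconds


def time_download(host, remote_path):
    """Fetch one file by anonymous ftp and return its effective bits per second.

    The file contents are discarded; only the byte count and elapsed time matter.
    """
    received = 0

    def count(block):
        # Called by ftplib for every block of data that arrives.
        nonlocal received
        received += len(block)

    with FTP(host) as ftp:
        ftp.login()  # no arguments means an anonymous login
        start = time.monotonic()
        ftp.retrbinary("RETR " + remote_path, count)
        elapsed = time.monotonic() - start
    return effective_bps(received, elapsed)
```

A call such as time_download("ftp.tiac.net", "path/to/largish-file") would run the test described above (the path is hypothetical). Running it at several times of day should show the same peak and off-peak swing.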

Your Phone Line

The last source of modem asphyxia is under your control: your connection to the phone company's network. Problems here are more common than anyone wants to think about, and real solutions can be tedious and expensive.

The first test is really simple: plug a phone into the same line as your modem uses, lift the receiver, and dial a single digit to clear the dial tone. You should hear some approximation of silence. If there is static or other audible noise, you are almost guaranteed to have poor modem performance. The higher the speed of a data connection, the more sensitive it is to noise.

Try whistling into the phone. If the whistle echoes or "rings," you have a problem. Echoes and distortion are deadly to modem calls.

If you suspect that your line is not performing properly, you can ask the phone company to send someone out to test it. If you ask, the lineperson can give you noise and loop-quality statistics and explain to you what they mean. He can also tell you if your line is relatively good or bad.

Unfortunately, the quality levels used by the phone company are intended for voice calls, not high-speed modem transfers. A line can pass all the phone company tests with flying colors and still be practically useless for data transmission. If the phone company says your line is OK, you're basically out of luck: they guarantee only that voice calls will be adequate.

In the post-AT&T-breakup world of telephony, the phone company's responsibility stops at the walls of your home or business. If they bring a working line to you, any problems inside the structure are your problems. Your line can be excellent, but if you have old, faulty, or low-quality wiring inside the building, you'll never get a good connection.

Troubleshooting phone problems is way beyond the scope of this article, but you should know that equipment exists that can test your phone wiring. An electrician who specializes in telephone work may be a useful last resort.

You can also take practical steps like making the connection between your modem and the phone company as simple as possible. Don't "daisy chain" an endless series of telephonic gadgets between the modem and the wall jack. Know that the number of extensions and cordless phones you need may be considerably lower than the number you actually have.

Your modem may also be able to tell you things about your data connection. If you notice a long pause in your data conversation, look at your modem's status lights. Many modems will display a particular light pattern or status message when they are "resynchronizing" with the remote modem. Resynchronization usually happens when the modems decide they can't sustain the current connection speed. Frequent resynchronization is a sign of a poor phone link.

Call Waiting

No discussion of modem connection problems would be complete without a mention of Call Waiting. If you generally get good connections to remote systems, but your modem suddenly hangs up for no apparent reason, you may be a victim of Call Waiting.

Call Waiting is an optional service offered by the phone company. If you are on the phone and a call comes in, you will hear a tone or a clicking noise. The idea is that you can "flash" the phone's switch hook to accept the other call.

Unfortunately, the Call Waiting signal is usually lethal to a data connection. The solution is to block Call Waiting for the duration of your modem call. In most areas, you can do this by adding the string "*70," (star seven zero comma) as a prefix to the number dialed by your communications software. Dialing "*70" disables Call Waiting for the current call only (the comma adds a short pause).
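For anyone scripting their own dialer, the prefix logic can be sketched in a few lines of Python. This assumes a Hayes-compatible modem that accepts the common ATDT tone-dial command; the function name and the phone number in the usage note are made up for illustration.

```python
def dial_string(number, block_call_waiting=True, prefix="*70,"):
    """Build a Hayes-style tone-dial command for the given phone number.

    The default prefix is the "*70," string described above; pass a
    different prefix if your area uses another blocking code.
    """
    digits = (prefix if block_call_waiting else "") + number
    return "ATDT" + digits
```

For example, dial_string("5550100") yields ATDT*70,5550100, while passing block_call_waiting=False leaves the number untouched.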

This problem is so common that many terminal programs offer Call Waiting blocking as a built-in option. Win95's Dial-Up Networking applet, for example, permits automatic Call Waiting blocking from the "Dial Properties" dialog box.

Publish Yourself

Everybody wants to put up a Web page these days, or so it would seem. As a long-time Web surfer, I can only say that I wish more of the people who create Web pages had something worthwhile to publish for the world, but that's neither here nor there.

There are two main reasons why nearly everyone is exposing themselves on the Web: simplicity and cost. Creating a basic Web page is actually very easy, even if you have relatively little computer experience. At its simplest, a Web page can be nothing more than some text typed into a file.

It's a common misconception that publishing on the Web is free. It isn't. Unless you work for a university or an especially munificent corporation, you will have to pay something for Internet access. Even $20 per month adds up to $240 per year. To put that in perspective, imagine the "reverse" transaction: subscribing to a magazine. A print publication would have to be awfully good to be worth $240 per year.

You will also have to collect (and possibly pay for) some basic software tools and invest some time learning the HTML coding system used by the Web. The biggest expense, though, unless you count your time as worthless, is creating and maintaining the content. To attract repeat visitors to your Web page--if that's what you want--you will have to continuously change your site. This can turn out to be a lot of work.

Nuts and Bolts

Boiled down to its essentials, a Web site is a set of one or more files sitting on a computer that's connected to the Internet. This host computer runs software that parcels out these files to anyone who wants to look at them.

Visitors enter your site by telling their Web clients (e.g., Netscape Navigator) your site's address, called a URL. These Web clients, also called browsers, are actually incredibly simple: all they do is download files from Web hosts and interpret these files.

The Web client that started the big Internet brouhaha, a browser called Mosaic, was only 9,000 lines of code. That gives you an idea how a company like Netscape, with only a handful of employees, could create a product that made it suddenly "worth" several billion dollars. It also tells you why the Netscape phenomenon is so frightening: any company with a handful of employees can create a good browser in a few months. If your company's core product can be duplicated by almost anybody--most notably by companies like Microsoft that can afford to give their knockoff away until you're bankrupt--you'd better think about selling out while the selling's good.

To publish on the Web, you need a host computer and a set of files to put on it. Connecting your own computer to the Internet as a Web host is impractical for the small Web publisher, so you will probably want to sign up with an Internet provider that offers Web hosting services.

Providers and Web Hosting

Since so much of the interest in the Internet is driven by the Web, most providers offer Web hosting in one form or another. The cost of these services depends on several factors:

Each provider has a different fee structure; any one provider may incorporate any or all of the charges listed above. Some providers have doubtless thought up charges that I haven't; it can really pay to shop around and see what's available. You may even want to have an account on one provider for your Web page and another account elsewhere for your routine Web surfing.

To give a real-world example, Bedford's TIAC (The Internet Access Company), where the on-line version of PC Report is currently stored, offers free Web page hosting to all of its customers, commercial or not. Each basic Web account comes with a computer-generated URL and ten megabytes of disk space. The only fee TIAC imposes for Web hosting is a transfer charge: the first 1,000 megabytes transferred are free, and each additional megabyte transferred costs four cents.

Just to give you an idea, my TIAC Web site (which includes PC Report on-line) takes up about four megabytes at present. If every visitor read every document on the site, which is highly unlikely to say the least, I could accept 250 visitors per month without incurring transfer charges. Most Web sites are far smaller than mine.
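The arithmetic behind that 250-visitor figure is easy to rerun for your own site. Here is a rough Python sketch using the TIAC numbers quoted above (1,000 free megabytes, four cents per additional megabyte); the function names are mine, and it makes the same worst-case assumption that every visitor downloads the entire site.

```python
FREE_MB = 1000          # free monthly transfer allowance, in megabytes
CENTS_PER_EXTRA_MB = 4  # charge for each megabyte beyond the allowance


def free_visitors(site_mb, free_mb=FREE_MB):
    """How many full-site visits fit inside the free allowance."""
    return free_mb // site_mb


def monthly_charge_dollars(site_mb, visitors,
                           free_mb=FREE_MB, cents_per_mb=CENTS_PER_EXTRA_MB):
    """Transfer charge in dollars if every visitor reads the whole site."""
    transferred_mb = site_mb * visitors
    extra_mb = max(0, transferred_mb - free_mb)
    return extra_mb * cents_per_mb / 100
```

With a four-megabyte site, free_visitors(4) gives 250, and monthly_charge_dollars(4, 1000) shows that a thousand such visitors would cost $120 for the month.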

TIAC offers a number of additional services, such as domain registry, at extra cost. The only major downside to TIAC's Web hosting service is the fact that they provide no hit statistics whatsoever. If you set up on TIAC, you will have no idea whether you had one visitor or a thousand.

Creating Your Web Files

As I said before, Web sites are made up of ordinary ASCII text files. These files contain text tagged with a coding system called HTML (HyperText Markup Language).

HTML's formatting capabilities are extremely limited, but it does allow for the placement of graphics and for the construction of hyperlinks to other documents. These links form the basis of Web surfing: you jump from document to document by clicking on the links.

I'm not going to bother with a detailed description of HTML--there are plenty of books on this subject available at any bookstore. HTML books are the computer equivalent of books by O.J. Simpson jurors: they may not be worthwhile, but you just can't get away from them. If you would like to see an article about HTML, send me some e-mail (pcreport@miken.com) and I'll reconsider if there's enough interest.

You can generate HTML documents with any editor that can create plain text files. Web browsers ignore white space characters like carriage returns, linefeeds, tabs, and multiple spaces, so you can be fairly lackadaisical about the "look" of the file.

HTML documents are formatted by "tags." A tag is a code word enclosed in less-than and greater-than signs, like this:

<H1>Heading Level 1</H1>

These tags tell the browser that the phrase "Heading Level 1" should be formatted as a level one heading. What this means exactly depends on the browser: each browser implements a different set of HTML tags in a different way. All browsers respond to codes like "level one heading," but Mosaic, for example, may use bold type, while Navigator might use italics.

These discrepancies are the enduring curse of Web publishing. You can either develop your site for the lowest common denominator (and end up with a plain-vanilla look) or choose one browser as your standard. If you develop your site for one browser, you can never be exactly sure how other browsers will show the site.

It gets worse when you factor in the extensions supported by some browsers. Netscape has made a profession of unilaterally extending the HTML language; designing to Netscape Navigator may make your site completely illegible to visitors using other browsers. On the other hand, Netscape's extensions add vital capabilities to HTML, so the most interesting sites use Navigator as their standard and ignore the others.

Fortunately, browsers are supposed to ignore tags they don't understand, so you should see something recognizable no matter how different your Web client is from Navigator. That's the main virtue of HTML publishing: everybody should be able to see something, no matter what platform, operating system, or browser they are using. The downside is that you don't have an awful lot of control over exactly what your visitors will see; even fonts and display colors are only partially under your control.
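To make the "ignore what you don't understand" rule concrete, here is a toy Python sketch built on the standard html.parser module. It is nothing like a real browser: the class name is invented, it "understands" only H1, and it marks headings with asterisks as a stand-in for whatever style an actual browser would pick.

```python
from html.parser import HTMLParser


class TinyRenderer(HTMLParser):
    """Toy 'browser' that styles the one tag it knows and skips the rest."""

    KNOWN = {"h1"}  # hypothetical: the only tag this toy browser understands

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.KNOWN:
            self.parts.append("*")  # stand-in for the browser's heading style
        # Unknown tags are silently dropped; their text still comes through.

    def handle_endtag(self, tag):
        if tag in self.KNOWN:
            self.parts.append("*")

    def handle_data(self, data):
        self.parts.append(data)


def render(html):
    """Return the text a TinyRenderer would display for this document."""
    renderer = TinyRenderer()
    renderer.feed(html)
    renderer.close()
    # Browsers collapse runs of white space, so do the same here.
    return " ".join("".join(renderer.parts).split())
```

Feeding it a heading followed by text wrapped in Netscape's BLINK extension shows the idea: the heading is marked, the BLINK tags vanish, and the words inside them still appear.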

HTML Editors

Since HTML tags are just text, you can type them in by hand if you have to. Needless to say, there are better ways. There are four basic methods to make HTML editing easier: word processor extensions, tag editors, WYSIWYG editors, and converters. Table 1 lists some HTML tools.

Table 1. HTML Editors

Editor | Type | Location | Description
HotDog Pro | Tag Editor | www.sausage.com | Excellent shareware tag editor.
HotMetal | WYSIWYG Editor | www.sq.com | Probably will be remembered as the ancestor of a good visual HTML editor. Version 2.0, available soon, may rectify serious problems.
Microsoft Internet Assistant | Winword Extension | www.microsoft.com/msoffice/freestuf/msword/download/ia/default.htm | Powerful set of extensions to Word for Windows. Beta version for Word 95 released at press time.
WebEdit | Tag Editor | www.nesbitt.com | Popular shareware tag editor. WYSIWYG preview mode promised soon.
Live Markup | WYSIWYG Editor | www.digimark.net/mediatech | Commercial HTML editor.
Web Author | Winword Extension | www.qdeck.com | Commercial HTML editing toolkit by Quarterdeck of QEMM fame.
Web Wizard | HTML generator | www.halcyon.com/artamedia/webwizard | Very popular free tool creates simple HTML documents after an "interview" process similar to the "wizards" found in most commercial software programs today. Excellent starting place for beginning Web authors.
Internet Publisher | WordPerfect Extension | ftp://ftp.wordperfect.com/pub/wpapps/intpub/wpipzip.exe | HTML editing extension set for WordPerfect 6.1 for Windows.

For updated information on the rapidly-changing world of Web editors, check these URLs:

Word Processor Extensions

As their name implies, HTML extensions for word processors usually consist of style sheets and macros designed to help you create Web documents. They use your existing word processor for editing, so you have access to its spelling checker and other document tools.

You probably already know how to use your word processor, so the learning curve for this type of HTML tool is considerably shorter than the others. Of course, if you don't have a word processor, these tools are useless for you.

For the moment, the most powerful and versatile word processor HTML extensions are available only for Microsoft Word for Windows. There are several tool sets for Word, including the free, well-respected Microsoft Internet Assistant package (available at www.microsoft.com).

If you already own Word, the Internet Assistant can be an excellent way to begin creating Web documents. You get to use a tool you already know augmented by a well-documented set of extensions.

Tag Editors

Tag editors add HTML tag management to simple text editing. For the past several months, I have been using a shareware tag editor called HotDog Pro by Sausage Software (www.sausage.com).

HotDog has things like a list of HTML tags, an on-line HTML reference, wizards for the nastier tags, table and form generators, and so on. If you already understand HTML, a tag editor like HotDog can really speed up the creation and testing of Web documents.

HotDog is less suitable for beginners, who may find some of the things it does extremely inscrutable. HotDog's greatest strengths are in its support for highly complex tagging like that used for forms and tables, which are unlikely to interest beginning Web publishers.

Another weakness of tag editors is their develop-and-test usage cycle. Much as with a C compiler, every time you alter your Web page files with a tag editor, you must save the changes and switch to your browser to see the effect. Even though this step usually takes only a single keystroke or mouse click, it's still disorienting to constantly switch between the browser and the tag editor.

WYSIWYG Editors

This category has the fewest and most disappointing entries, but it is by far the most vital to the long-term viability of the Web. Just as using WordPerfect's hidden codes became unbearable once there was an alternative, the Web must eventually move away from cryptic manual HTML coding or perish.

The ideal WYSIWYG HTML editor would allow you to format text and position graphics much as a word processor or desktop publishing program does, with mouse clicks and drags. For now, there are no really acceptable entries in this category; HotMetal is the only functional WYSIWYG editor worth considering, and it remains annoyingly limited.

All this will change very soon, perhaps by the time you read this. Netscape has committed to releasing a new product called Navigator Gold that will combine Web browsing and WYSIWYG HTML editing into a single program. If they get Navigator Gold out on time, and if it isn't catastrophically buggy--two big ifs given Netscape's history--it should prolong the company's time on top of the Internet heap and give them time to put more nonstandard HTML extensions between them and the competition.

Whether Navigator Gold appears or not, there are certain to be more and better WYSIWYG HTML editors in the coming months. Since I strongly believe (and hope) this is the future, I will avidly watch these editors and keep you posted.

Converters

This is another emerging field. Converters are filters that take an existing word processing or desktop publishing document and try to represent its formatting using HTML tags. Most of the major word processor and DTP program vendors have promised built-in HTML converters, and several third-party conversion products are available.

So far, at least, converters are really only useful as a starting point. Even the most stripped-down word processor can create formatting too elaborate for HTML to emulate, so it's really a question of converting what you can and making up the difference with hand tagging.

Starting Up Your Site

Once you have created your HTML documents (I have deliberately glossed over a lot of details here), you will need to install them on your site. The exact procedure may differ depending on your provider, but most providers use the same basic system.

Even if they have nothing else, all Web sites have a top-level or "home" page. This is the document that a visitor sees when they first enter your site.

Your Web site, at least at first, might consist of only this one page. If you have other documents or files on the site (such as pictures or sound clips), they will probably be reached by following a link from the home page.

The provider's Web hosting software will usually require that your home page be given a specific file name. For example, TIAC requires that the home page be named INDEX.HTM. You can name your other documents and files anything you want (on TIAC, you can even create directory trees for your Web files), but the top-level page must have this special name.

It's probably obvious, but you should use sensible extensions for your files, because browsers will use a file's extension to determine how to interpret it (shades of Windows!). For example, HTML documents should end in .HTM or .HTML; GIF graphics in .GIF, and so on. Be aware that Web site file names are case sensitive: a link to "target.htm" will not find "TARGET.HTM" or "Target.htm".
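Case mismatches are easy to catch before you upload anything. The following Python sketch assumes you already have a list of the link targets used in your pages and the set of file names you plan to send; the function name and the sample names are purely illustrative.

```python
def broken_links(link_targets, uploaded_names):
    """Return links that do not match any uploaded file name exactly.

    Each result is a (target, near_miss) pair, where near_miss is a file
    whose name differs only by case, or None if there is no such file.
    """
    by_lower = {name.lower(): name for name in uploaded_names}
    problems = []
    for target in link_targets:
        if target not in uploaded_names:
            problems.append((target, by_lower.get(target.lower())))
    return problems
```

Checking the links target.htm and cat.gif against the files TARGET.HTM and cat.gif reports only the first link, along with the differently cased file it probably meant.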

You should also be aware that for the moment, the only graphic format accepted by all browsers is GIF. This format has a number of really serious disadvantages compared with more advanced formats like JPEG, but there is currently no escape.

Once you have your files in order, you will usually need to establish an ftp connection to the provider's Web host system. Previous Web Wandering articles have described the mechanics of this. It's worth mentioning that the ftp address of the Web server may not be the same as other ftp sites hosted by your provider; make sure you find out the exact address before starting.

Note that the ftp connection cannot be anonymous: you will need to provide your user ID and password. The reason for this is obvious: you don't want just anyone altering your Web site!

Once connected, you may need to change to a specific directory on the host. The Internet provider will be able to give you this information. If you use an ftp client like WS_FTP, you can store all these details in a profile so you only have to enter them once.

For my site at TIAC, I connect to ftp.www.tiac.net (note that this is different from TIAC's public ftp archive) with my user ID of "miken" and my password. Once I am logged in, I change to the directory "miken."

The first time you establish the ftp connection, you will probably see a default home page already installed. Most providers create a simple default Web page for each new account so that there is something at your URL even if you don't immediately install any documents of your own.

To install your documents, simply send them to the host system. That's it! Once the files have arrived on the host system, you're on the Web!
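If you prefer a script to an interactive ftp client, the whole procedure (log in, change directory, send the files) can be sketched with Python's standard ftplib module. Treat this as an outline under assumptions: the function names are mine, binary mode is my choice, and the host, user ID, password, and directory are whatever your own provider gives you.

```python
from ftplib import FTP
import os


def upload_files(ftp, directory, local_paths):
    """Change to the site directory and send each local file in binary mode."""
    ftp.cwd(directory)
    sent = []
    for path in local_paths:
        name = os.path.basename(path)  # keep the exact name, including case
        with open(path, "rb") as handle:
            ftp.storbinary("STOR " + name, handle)
        sent.append(name)
    return sent


def publish(host, user, password, directory, local_paths):
    """Log in with a real account (not anonymously) and upload the files."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        return upload_files(ftp, directory, local_paths)
```

For the TIAC setup described above, the call would look something like publish("ftp.www.tiac.net", "miken", password, "miken", ["INDEX.HTM"]), with the real password supplied at run time rather than stored in the script.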

You should definitely test your site after you have finished installing the files. Even the most expert HTML hackers make mistakes, so you should test links and check your documents to make sure you see what you intended to.

It can be a real kick to type your URL for the first time and watch what you painstakingly created magically appear, even if it is only information about your cat. And remember: what you can see, anybody else on the Internet can see!

The Catch

There aren't many free lunches in this cynical, end-of-millennium world. The Web can have some downsides too. Here is a quick overview of some of the horrors that can await Web publishers:

The Hit Storm

One of the most unsettling things about being a Web publisher is that you can run up a huge bill without actually doing anything. Even better, you won't usually find out that you've spent the money until your credit card bill comes.

How can this happen? Think back to those fees and charges that I listed above. Two of the fees, the hit charge and the transfer charge, are not directly expended by you. These fees (on your credit card!) are incurred by other people: the people who visit your site.

The more successful your site is, the more hits and transfers you generate. If your provider charges for these things, having a successful site can become an expensive proposition, depending on the provider's fee structure. The Internet, already a place rich in paradoxes and mirror logic, adds one more fable: if you're really successful, you'll lose a whole bunch of money!

Most people's pages are unlikely to draw heavy hit or transfer charges. There are a few isolated cases, though, where you could be in for a very nasty surprise:

I should emphasize again that none of these fates is likely to befall the average innocuous Web page, but you should be aware that the risks exist, just as in any other venture.

Unsung Hero

People caught up in Internet dreams often suggest that the Web will eventually replace paper (where have I heard that before?). Unfortunately, there are few obvious ways except print to tell people your URL so they can find your site.

It's perfectly possible to put up a great site and have nobody find it. To generate traffic, you will need to get the URL out to people who may be interested in what you have to say. For PC Report, we list my site's URL on the Contents page and in this article.

You can get some traffic by adding your site to the big Internet indexes like Infoseek (www.infoseek.com) and Yahoo (www.yahoo.com), but they are rapidly becoming overwhelmed by the sheer number of sites. It can be just as hard to find something on Yahoo as it is to find the same thing by wandering around the Net: exactly how many menus do you have to navigate before the whole exercise becomes pointless?

You can also try your hand at posting your URL in USENET groups, but this has to be done cautiously. If you seem excessively self-promotional, the flamers will come out and make your life miserable. Appropriate posting behavior might be a single message about your Mad Max Web site in the rec.movies group; inappropriate behavior would be a bunch of messages sent to all 10,000 (or so) newsgroups about your commercial immigration-law Web site.

Bandwidth Overload

It's a regrettable fact of life on the Web: the more visually interesting a site is, the longer it takes to download. Remember that the typical Web surfer has a 28.8 kbps connection at best, with many crawling along at 14.4 kbps. If your page takes too long to download, nobody will wait: they'll just jump to some other page.

It's sad to see the way that the most innovative sites are drifting beyond the reach of average surfers. You already need an ISDN connection to enjoy some of the great cutting-edge sites: all those real-time graphics, sound clips, and animations take a lot of bandwidth. How long will it be before you'll need a T1 connection for these same sites? How much more load can the Internet's beleaguered infrastructure take?

You can help solve this problem by using bandwidth intelligently. Text takes almost no time to download, but large graphics, sounds, and video take massive bandwidth. If your site features these expensive things, show them as thumbnails on your home page and let the visitor choose what he wants to see.

When you create graphics for your Web page, use reduced palettes. An uncompressed 256-color bitmapped graphic is twice as large as an uncompressed 16-color bitmap. Don't use more colors than you need. If you have photographic images, use the JPEG compression format, which will drastically reduce the size of an image without greatly compromising its quality.
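The palette arithmetic is simple enough to check for yourself. This Python sketch estimates the raw size of an uncompressed bitmap and how long it would take to download at a given modem speed; it ignores file headers, palette data, and row padding, and the function names and image dimensions are only examples.

```python
def bits_per_pixel(colors):
    """Smallest whole number of bits that can index a palette of this size."""
    return max(1, (colors - 1).bit_length())


def uncompressed_bytes(width, height, colors):
    """Size of the raw pixel data, rounded up to a whole number of bytes."""
    total_bits = width * height * bits_per_pixel(colors)
    return (total_bits + 7) // 8


def download_seconds(num_bytes, bits_per_second):
    """Ideal transfer time, assuming the full connect speed is available."""
    return num_bytes * 8 / bits_per_second
```

A 320 by 240 image comes to 76,800 bytes at 256 colors and 38,400 bytes at 16 colors. At 28.8 kbps, that is the difference between roughly 21 seconds and roughly 11 seconds of waiting, before any compression is applied.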

I said I would avoid HTML minutiae, but there is one major improvement you can make when embedding bitmaps in your Web pages: use the HEIGHT and WIDTH keywords in the IMG tag. If you use these keywords religiously, the text portion of your Web page will seem to download instantaneously, with the graphics filling in as time permits. Without these keywords, the browser must open all the graphics to see how large they are before it can lay out the page. This causes a long "blank screen" period before the Web page starts to appear.
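If you generate your pages with a script, it is easy to make sure no image goes out without its dimensions. Here is a small Python sketch that builds an IMG tag with explicit WIDTH and HEIGHT values; the helper name and the sample file are hypothetical, and you would supply each picture's real pixel dimensions.

```python
from html import escape


def img_tag(src, width, height, alt=""):
    """Build an IMG tag that tells the browser the image size up front."""
    return '<IMG SRC="{}" WIDTH={} HEIGHT={} ALT="{}">'.format(
        escape(src, quote=True),
        int(width),
        int(height),
        escape(alt, quote=True),
    )
```

For instance, img_tag("cat.gif", 120, 90, "My cat") produces a tag the browser can lay out immediately, before a single byte of the picture has arrived.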

Content Control

Nearly all providers have rules about what you can publish using their Web hosting equipment. It's a good idea to understand what the rules are before you go to the trouble of setting up a site.

The rules aren't likely to affect the average law-abiding Web publisher; they're generally designed to give the provider control over hard-core outlaws who set up pages suggesting (or illustrating) that Mickey Mouse is a sexual deviant.

Another problem area is copyright. If you could magically purge the Net of everything that was sampled from a commercial magazine, book, movie, television show, or music CD, there would be no World Wide Web. All this proves is that the copyright and patent laws designed to protect The House of Seven Gables and the cotton gin are woefully inadequate for the late twentieth century.

Various politicians, bureaucrats, judges, and lawyers have told us that they know the One True Way, but the actual state of copyright law on the Internet is a mystery to everyone, the sages included. Suffice it to say that, depending on who buys the most copies of Newt Gingrich's book, you may wake up one fine day and find yourself a felon.

Visit Web Wandering

Web Wandering (and indeed the whole text of PC Report) is available from my home page, which is currently on TIAC at www.miken.com. Hope to see you there, and be sure to let me know what you think!

