paper for WWW'94, CERN, May 1994 and GopherCON '94, Minneapolis, Minn, April 1994
At Los Alamos National Laboratory (LANL), we wanted to set up a World-Wide-Web (WWW) server. A few months earlier we had already set up a Gopher server. Rather than maintaining two separate systems, with a plain ASCII copy of the information in one system along with an HTML version in the other, we decided to write a WWW server that would make use of the existing Gopher information structure.
Of course, you don't need to do anything special in order to view Gopher information via the Web; you can simply use a URL of the form
    gopher://nodename/type/path
to point to your existing Gopher server. However, we wanted the ability to enhance the existing information using HTML, without having to duplicate all of it. The following goals led to the development of the gopherhttpd server:
This document describes the gopherhttpd server that achieves all of the above goals. The installation and operation of this server will be described, and examples from the LANL server will be shown.
Here is what the LANL Gopher server looks like.
Here is what the LANL WWW Home Page looks like.
The LANL WWW Home Page was generated "on the fly" based upon information from the LANL Gopher server. Note that all of the Gopher menu items appear in the WWW Home Page. Each menu item is "annotated" with a short description. In addition, the entire menu is preceded by a short description (the role that an "About" or "README" file would normally play in Gopher).
Note that the WWW Home Page takes full advantage of HTML, including the ability to present a logo.
This text is used to describe the current page. It is optional. The contents of this section are taken from the file README.html in the current Gopher directory. This file can contain any HTML code that is desired.
Each of the Menu Items in the template shown above is taken from the Gopher menu. The gopherhttpd server reads both the .cap files and the .Links files and produces a menu just like the Gopher menu. Each menu item corresponds to a specific file or directory in Gopherspace. The description of this file or directory is read from the file filename.about.html, where filename is the name of the Gopher file or directory. The optional icon is taken from the file .cap/filename.gif if it exists.
The only thing that was done to the Gopher server to make all of this work was to add the line
    ignore: .html
to the gopherd configuration file. Thus, any file ending in .html will be ignored by the Gopher server, and handled instead by gopherhttpd.
The Author name or email address shown at the bottom of the screen is taken from a file in the current Gopher directory called AUTHOR.html. If this file is not present, and is not found in any parent directories, default information contained in the gopherhttpd configuration file is placed here instead. Since this information can contain any HTML commands, you can provide links to your phone book entries.
That's it! This is the basic template for every HTML page created by gopherhttpd from your existing Gopher menus.
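The parent-directory lookup for AUTHOR.html described above could be sketched in Python roughly as follows. This is only an illustration and not the gopherhttpd source (which is written in perl). The function name find_author, its parameters, and the choice to stop searching at the Gopher root directory are all assumptions.

```python
import os

def find_author(directory, gopher_root, default_html):
    """Return the contents of the nearest AUTHOR.html, or default_html.

    Hypothetical helper: walks from `directory` up towards `gopher_root`
    (assumed to be the top of the Gopher data tree).
    """
    current = os.path.abspath(directory)
    root = os.path.abspath(gopher_root)
    while True:
        candidate = os.path.join(current, "AUTHOR.html")
        if os.path.isfile(candidate):
            with open(candidate) as handle:
                return handle.read()
        parent = os.path.dirname(current)
        # Stop at the Gopher root, or at the filesystem root as a safety net.
        if current == root or parent == current:
            return default_html
        current = parent
```

Here default_html stands in for the default signature text that the gopherhttpd configuration file would supply.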
Normally, gopherhttpd will automatically create the URL for each item in the menu to point to either another menu, or to the information file itself. If the file filename.html exists, then a link to this file will be generated instead. This allows you to completely override a file in Gopher with an HTML file by putting both into the same Gopher directory. Since Gopher ignores HTML files, Gopher users will only see the original data, and WWW users will only see the HTML data.
You can prevent Gopher items from appearing in the WWW page by adding a file to the Gopher directory called filename.ignore. Of course, you then also need to tell your Gopher server to ignore files that end in .ignore. To avoid this, gopherhttpd also recognizes the file .cap/filename.ignore. Now Gopher users will see the original filename, but gopherhttpd will omit it from the WWW menu.
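The override and ignore rules just described can be summarized in a short Python sketch. It is not taken from gopherhttpd; the function name menu_target and its return convention (None for a hidden item) are assumptions made for illustration.

```python
import os

def menu_target(directory, filename):
    """Pick the file a WWW menu item should link to, or None if hidden.

    Hypothetical helper illustrating the rules in the text:
    - filename.ignore or .cap/filename.ignore hides the item from WWW;
    - filename.html, if present, replaces the original Gopher file.
    """
    ignore_markers = (
        os.path.join(directory, filename + ".ignore"),
        os.path.join(directory, ".cap", filename + ".ignore"),
    )
    if any(os.path.exists(marker) for marker in ignore_markers):
        return None
    html_version = os.path.join(directory, filename + ".html")
    if os.path.exists(html_version):
        return html_version
    return os.path.join(directory, filename)
```

Checking the .cap variant as well means the Gopher server configuration does not need an extra ignore rule for .ignore files.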
Instead of the filename.about.html method described above, you can achieve this same effect through some extensions to the Gopher .cap/filename and .Links files. Gopher clients ignore lines in these files that do not start with recognized keywords. Thus, we have added some new keywords that gopherhttpd will recognize, but Gopher will ignore. Here is a list of the new keywords:
Desc
WWW
WWW=url option in the .cap file. The url specified on this line will be used as the menu item. Also, a line in the description field will automatically be generated that says
which points to the Gopher system.
If you put an asterisk (*) at the end of the URL, this Gopher message will be suppressed. A dollar sign ($) in the URL is expanded from the Host field. If the protocol string is missing from the front of the URL, http:// is added by default. Thus, the string
    WWW=$/welcome.html
will generate a menu item with a link of
    http://hostname/welcome.html
where hostname is taken from the Host=hostname line in the .cap file.
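The expansion rules for the WWW= value can be illustrated with a small Python sketch. This is an interpretation of the rules as stated, not the gopherhttpd code; the function name expand_www and the way a protocol prefix is detected (a scheme followed by "://") are assumptions.

```python
import re

def expand_www(value, host):
    """Expand a WWW= value into (url, suppress_gopher_message).

    Hypothetical helper:
    - a trailing '*' suppresses the automatic Gopher message;
    - '$' is replaced by the Host= value;
    - 'http://' is prepended when no protocol string is present.
    """
    suppress = value.endswith("*")
    if suppress:
        value = value[:-1]
    url = value.replace("$", host)
    # Assumption: a protocol string looks like "scheme://".
    if not re.match(r"[A-Za-z][A-Za-z0-9+.-]*://", url):
        url = "http://" + url
    return url, suppress
```

With these assumptions, expand_www("$/welcome.html", "hostname") yields the link http://hostname/welcome.html from the example above, with the Gopher message left in place.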
Pre
Post
Before
<dt> that flags the start of the menu item. You can think of this text as appearing after the Description text of the previous item. A common use of this item is with Before=<hr> to put a horizontal rule before the menu item.
When you add these new keywords to a .cap/filename file, be sure to put the new items after any existing Gopher items. Many Gopher clients abort their parsing of the .cap file when they reach an unknown keyword. Thus, all standard Gopher keywords should come first, followed by the extended gopherhttpd keywords.
The About file is named filename.about.html, where filename is the name of the Gopher file or directory, and it specifies the description that appears under the menu item. This is just the default use for the About file; you can do much more. If a line in the About file doesn't start with keyword=, then it is assumed to be a line in the menu description. A single line in this file is equivalent to putting a Desc=description line in the .cap file.
However, in the About file (the filename.about.html file), you can specify multiple description lines. In the HTML output, a <dd> is inserted at the beginning of each line to force a line break in the description.
You can also put any valid .cap information into the About file. Any information specified in the About file will override information in the .cap file. This allows you to further modify and customize your WWW page since you can change the Gopher information in the gopherhttpd About file for a given menu entry.
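As a rough illustration of how About file lines might be classified and merged with .cap information, here is a minimal Python sketch. It is not the gopherhttpd implementation. The names parse_fields and merge_item are hypothetical, and the test for a keyword line (a run of letters followed by "=") is an assumption.

```python
import re

# Assumption: a keyword line is a run of letters followed by "=".
KEYWORD_LINE = re.compile(r"^([A-Za-z]+)=(.*)$")

def parse_fields(text):
    """Split text into keyword fields and free-form description lines."""
    fields, description = {}, []
    for line in text.splitlines():
        match = KEYWORD_LINE.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
        elif line.strip():
            description.append(line)
    return fields, description

def merge_item(cap_text, about_text):
    """Merge .cap and About data; About fields override .cap fields."""
    cap_fields, _ = parse_fields(cap_text)
    about_fields, about_lines = parse_fields(about_text)
    fields = {**cap_fields, **about_fields}
    # Each free-form About line becomes its own <dd> in the HTML output.
    description_html = "\n".join("<dd>" + line for line in about_lines)
    return fields, description_html
```

How an explicit Desc= keyword combines with free-form description lines is not specified in the text, so the sketch keeps the two separate.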
              Internet Gopher Information Client 2.0 pl10

                  Root gopher server: gopher.lanl.gov

 -->  1.  News Flash 7-Mar-1994: What's New in the LANL Gopher....
      2.  ---------------------LANL Information---------------------.
      3.  News and Events/
      4.  Phone Book/
      5.  Job Openings/
      6.  Library Catalogs and Information/
      7.  Computing at LANL/
      8.  Information Architecture Project/
      9.  Software Archive/
      10. Information by Division/
      11. Information by Subject/
      12. -----------------------The Internet-----------------------.
      13. About the Internet/
      14. How to get Gopher/Mosaic Software/
      15. The Internet via Gopher/Mosaic/
      16. Finding People, Places, and Information/
      17. Selected Software Archives (FTP)/
      18. Network News (USENET)/

 Press ? for Help, q to Quit                                Page: 1/2
What we want the WWW home page to look like is something like this:
To do this, we change the Name of the section headings using the About file. Let's concentrate on the menu item titled "LANL Information". This Gopher menu item points to a file of information about LANL, with a filename of lanl. Here are the contents of the .cap/lanl Gopher file:
    Name=---------------------LANL Information---------------------
    Numb=2
Pretty simple. Now, here are the contents of the lanl.about.html file used by gopherhttpd:
    Name=LANL Information
    Desc=<dl>
    Before=<p>
The Name= line overrides the Gopher name that contains all of the hyphens. The Desc=<dl> line tells HTML to start a new description list for the following menu items. The Before= line adds some space between the previous menu item and the current one.
gopherhttpd will automatically create a link to the existing lanl Gopher file. If you want, you can create an HTML version of this file called lanl.html, and the link will automatically point to the HTML file rather than the ASCII file.
As described earlier, you can create a file called .cap/filename.ignore and gopherhttpd will omit the item from the menu listing. However, what if you only want a particular item to appear in the WWW page, and not the Gopher page? To create a new WWW-only menu item, simply create a filename.about.html file. For example, let's say we want to add a menu item that points to the master list of WWW servers. Obviously we don't want our Gopher users to see this, since it contains a list of WWW servers, not Gopher servers. We create a file called www-list.about.html with the following contents:
    Name=Master list of WWW servers around the world
    Numb=10
    WWW=http://info.cern.ch/hypertext/DataSources/bySubject/Overview.html
    Desc=A listing of registered World-Wide-Web servers maintained at CERN
The Name= line specifies the highlighted text of the menu item. The Numb= line specifies that this item appears in tenth place in the menu. Without the WWW= line, gopherhttpd would create a link to a file called www-list or www-list.html. By overriding this link, we can point to the server list at CERN instead. Finally, the Desc= line adds a short annotation for this menu item.
The file AUTHOR.html is used to sign the bottom of each WWW page. If the file does not exist in the current Gopher directory, or in any parent directories, default information from the gopherhttpd configuration file is used. This file can contain any HTML code. However, if the file contains a single line with the syntax
    text,nnnnnn
or
    nnnnnn:{text}
where nnnnnn is a six-digit number, then gopherhttpd automatically creates a link to the LANL phone book. The text will be highlighted and linked to the following URL:
    http://www.lanl.gov:52271/?-l+nnnnnn
The LANL phone book runs on port 52271 and takes a query. The -l tells the LANL phone book to output the long form of the record, and the six-digit number represents the LANL employee number.
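The two single-line AUTHOR.html forms could be recognized and turned into a phone book link along the following lines. This Python sketch is illustrative only. The function name author_link and the exact regular expressions are assumptions, and the HTML produced by the real server may differ.

```python
import re

PHONE_BOOK_URL = "http://www.lanl.gov:52271/?-l+"

# The two documented forms: "text,nnnnnn" and "nnnnnn:{text}".
TEXT_FIRST = re.compile(r"(?P<text>.+),(?P<number>\d{6})")
NUMBER_FIRST = re.compile(r"(?P<number>\d{6}):\{(?P<text>.*)\}")

def author_link(line):
    """Return an HTML link for a phone-book style line, else None.

    Hypothetical helper: None means the file should be treated as
    ordinary HTML and included unchanged.
    """
    line = line.strip()
    match = TEXT_FIRST.fullmatch(line) or NUMBER_FIRST.fullmatch(line)
    if match is None:
        return None
    number = match.group("number")
    text = match.group("text")
    return '<a href="' + PHONE_BOOK_URL + number + '">' + text + "</a>"
```

The name and employee number used to try this out would be placeholders; any six-digit number is accepted by the pattern.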
.cap/filename
and
filename.about.html
file syntax.
Status
Admin
    nnnnnn:{text of first contact},nnnnnn:{text of second contact}...
Each contact will be placed on a separate line.
GopherLink
Host, Port, Path keywords. You can override this using the GopherLink keyword. In particular, you can use a value of none if the system is not running a Gopher server.
WebLink
Host, Port, and Path keywords, possibly overridden with the WWW keyword. You can override this with the WebLink keyword. In particular, you can use a value of none if the system is not running a WWW server.
Here is an example .cap file for a system running both a Gopher and WWW server, and how gopherhttpd formats this entry:
    Name=DOE High Performance Computing Research Center (ACL)
    Type=1
    Host=gopher.acl.lanl.gov
    Port=70
    Status=production
    Path=
    WWW=http://www.acl.lanl.gov/Home.html
    Admin=102733:{Jerry DeLapp (jgd@acl.lanl.gov), Gopher}\n114212:{Ron Daniel (rdaniel@acl.lanl.gov), WWW}
    Desc=Information about the Advanced Computing Laboratory and all of the projects that they are involved in. ACL staff and facilities information. Link to central LANL server.
gopherhttpd displays this menu item like this:
    Status... production
    WWW...... http://www.acl.lanl.gov/Home.html
    Gopher... gopher://gopher.acl.lanl.gov/
    Admin.... Jerry DeLapp (jgd@acl.lanl.gov), Gopher
    ......... Ron Daniel (rdaniel@acl.lanl.gov), WWW
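One way to split an Admin value into its contacts is sketched below in Python. This is not the gopherhttpd code; the names parse_admin and format_admin are hypothetical, and matching each nnnnnn:{text} group with a regular expression is an assumed approach.

```python
import re

# One contact: a six-digit employee number followed by {text}.
CONTACT = re.compile(r"(\d{6}):\{([^}]*)\}")

def parse_admin(value):
    """Return a list of (employee_number, text) pairs from an Admin value."""
    return CONTACT.findall(value)

def format_admin(value):
    """Render one display line per contact, labelling only the first."""
    lines = []
    for index, (_number, text) in enumerate(parse_admin(value)):
        label = "Admin...." if index == 0 else "........."
        lines.append(label + " " + text)
    return lines
```

Matching the braces rather than splitting on commas matters here, because the contact text in the example itself contains a comma ("..., Gopher").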
If you put this information into the Gopher .Links file, rather than using a .cap file, you will end up with menu items in your Gopher server. At LANL, we put entries in the .Links file for all servers running Gopher, then create individual filename.about.html files for servers that do not run Gopher. This way, Gopher users see a list of all LANL Gopher servers, and WWW users see a nice annotated list of all Gopher and WWW servers at the Lab.
Installing gopherhttpd
Installation of gopherhttpd is very similar to the installation of a
Gopher server. gopherhttpd is meant to be run from the Unix inetd
daemon. Here are the steps involved in installation:
Add an entry to the /etc/services file for your new WWW server. This entry should look something like:
    http 80/tcp # WWW server
Add an entry to the /etc/inetd.conf file. This entry should look something like:
    http stream tcp nowait nobody /etc/gopherhttpd gopherhttpd /gopher /etc/gopherhttpd.conf
The meaning of the parameters will be listed in the next section. Note that this daemon is run as user nobody. This is recommended as a security precaution to prevent someone from gaining root access through unknown holes in gopherhttpd. This example is taken from a Sun SPARCstation. Some Unix systems do not allow you to specify the user id that your server runs as.
Run
    ps -ax | grep inetd
to determine the process id of inetd. Then issue a kill -1 pid to restart it.
The example above places gopherhttpd and its configuration file in the /etc directory. Feel free to use any directory you wish, and simply update the entry in inetd.conf to reflect the actual location of these files.
Here
is the source for gopherhttpd
gopherhttpd is written in perl. perl is an interpreted language that requires a run-time interpreter. The first line in gopherhttpd points to the location of the perl interpreter. The default location is /usr/bin/perl. If your perl interpreter is located in a different place, change the first line in gopherhttpd. If you don't have perl, go get it! No Unix system should be without it.
The second parameter is the location of the gopherhttpd configuration file. The contents of this configuration file are very similar to the contents of your Gopher configuration file. In particular, it contains information about MIME file types, access control lists, and miscellaneous information such as the node and port of your Gopher and WWW servers. The sample configuration file is full of comments that explain each parameter.
    access: ip-address access
where ip-address is the full or partial IP address of the system or network you want to control access on. access is either a + or - to allow or deny access. If the access field is any string beginning with an exclamation mark (!), access is denied; any other string not beginning with an ! allows access. The second form of the syntax makes the configuration file compatible with existing Gopher configuration files.
The ip-address field can actually be any Unix regular expression. Periods (.) not followed by a * or + are automatically escaped. Missing fields in the IP address are filled with .* automatically. Thus, the ip-address 128.165 expands to the regular expression 128\.165\..*\..* , matching any node beginning with the specified numbers.
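The expansion of a partial IP address into a regular expression, together with the allow/deny test, can be sketched in Python as follows. This is one reading of the rules above and not the gopherhttpd source. The function names are hypothetical, and anchoring the match at the start of the address is an assumption based on the phrase "beginning with the specified numbers".

```python
import re

def expand_ip_pattern(pattern):
    """Expand an access ip-address field into a regular expression."""
    # Escape every period that is not followed by * or +.
    escaped = re.sub(r"\.(?![*+])", r"\\.", pattern)
    # Count the dotted fields and pad up to four with ".*".
    fields = escaped.split("\\.")
    missing = max(4 - len(fields), 0)
    return escaped + "\\..*" * missing

def access_allowed(access):
    """'-' or any string starting with '!' denies; anything else allows."""
    return not (access == "-" or access.startswith("!"))

def rule_matches(pattern, address):
    """Assumption: the expression is matched from the start of the address."""
    return re.match(expand_ip_pattern(pattern), address) is not None
```

For the ip-address 128.165, expand_ip_pattern produces the same expression as in the text, so rule_matches("128.165", address) holds for any address beginning with 128.165.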