Linux-Mandrake:
User Guide and
Reference Manual

MandrakeSoft

January 2000
http://www.linux-mandrake.com


Chapter 2 : Command Line Utilities

The purpose of this chapter is to introduce a small number of command line tools which may prove useful for everyday use. Of course, you may skip this chapter if you only intend to use a graphical environment, but a quick glance may change your opinion :)

There is not really any organisation in this chapter. Utilities are listed as they come, from the most commonly used ones to the most arcane ones. Each command is illustrated by an example, and finding further uses for it is left as an exercise to you.

`grep`: Global Regular Expression Print

Okay, the name is not very intuitive, neither is its acronym, but its use is simple: looking for a pattern given as an argument in one or more files. Its syntax is:

grep [options] <pattern> [one or more file(s)]

If several files are mentioned, their names will precede each matching line displayed in the result. Use the -h option to suppress these names; use the -l option to get nothing but the names of the files which contain a match. It can be useful, especially with long argument lists, to browse the files with a shell loop and use the grep <pattern> <filename> /dev/null trick: the extra /dev/null argument makes grep print the filename even though only one real file is searched at a time.

The pattern is a regular expression, even though most of the time it consists of a simple word. The most frequently used options are the following:

-i: Make a case insensitive search.
-v: Invert search: display lines which do not match the pattern.
-n: Display the line number for each line found.
-w: Tells grep that the pattern should match a whole word.

Here's an example of how to use it:

$ cat victim
Hello dad
Hi daddy
So long dad
  # Search for the string "hi", no matter the case
$ grep -i hi victim
Hi daddy
  # Search for "dad" as a whole word, and print the
  #   line number in front of each match
$ grep -nw dad victim
1:Hello dad
3:So long dad
  # We want all lines not beginning with "H" to match
$ grep -v "^H" victim
So long dad
$
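The -h and -l options and the /dev/null trick mentioned earlier can be tried out with a short sketch like the following. The scratch directory and the files a.txt and b.txt are invented for the demonstration; they are not part of the example above.

```shell
# Scratch files invented for this demonstration
tmp=$(mktemp -d)
printf 'Hello dad\nHi daddy\n' > "$tmp/a.txt"
printf 'So long dad\nBye mum\n' > "$tmp/b.txt"

# Several files: each matching line is prefixed with its filename
(cd "$tmp" && grep dad a.txt b.txt)

# -h drops the filename prefix (here combined with -w)
no_names=$(cd "$tmp" && grep -h -w dad a.txt b.txt)

# -l prints only the names of the files which contain a match
matching=$(cd "$tmp" && grep -l mum a.txt b.txt)

# The /dev/null trick: the extra file forces the filename prefix
# even though only one real file is searched
with_name=$(cd "$tmp" && grep Hi a.txt /dev/null)

printf '%s\n' "$no_names" "$matching" "$with_name"
rm -rf "$tmp"
```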

In case you want to use grep in a pipe, you don't have to specify the filename as, by default, it takes its input from the standard input. Similarly, by default, it prints the results on the standard output, so you can pipe the output of a grep to yet another program without fear. Example:

$ cat /usr/doc/HOWTO/Parallel-Processing-HOWTO | \
  grep -n thread | less
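If you do not have that HOWTO at hand, the same idea can be sketched with input generated on the spot. The three input lines below are made up; the point is only that grep reads the standard input when no file is given, and that its output can feed another grep.

```shell
# grep reads standard input when no file is named
numbered=$(printf 'thread one\nprocess\nthread two\n' | grep -n thread)
printf '%s\n' "$numbered"

# The output of one grep can be piped into another one
remaining=$(printf '%s\n' "$numbered" | grep -v two)
printf '%s\n' "$remaining"
```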

`find`: find files according to certain criteria

find is a long-standing Unix utility. Its role is to recursively scan one or more directories and find files which match a certain set of criteria in these directories. Even though it is very useful, its syntax is truly arcane, and using it requires a little practice. The general syntax is:

find [options] [directories] [criterion] [action]

If you do not specify any directory, find will search the current directory. If you do not specify the criterion, this is equivalent to "true", thus all files will be found. The options, criteria and actions are so numerous that we will only mention a few of each here. Let's start with options:

-xdev: Do not search on directories located on other filesystems.
-mindepth <n>: Descend at least <n> levels below the specified directory before searching for files.
-maxdepth <n>: Search for files which are located at most n levels below the specified directory.
-follow: Follow symbolic links if they link to directories. By default, find does not follow them.
-daystart: When using tests related to time (see below), take the beginning of current day as a timestamp instead of the default (24 hours before current time).

A criterion can be one or more of several atomic tests; some useful tests are:

-type <type>: Search for a given type of file; <type> can be one of: f (regular file), d (directory), l (symbolic link), s (socket), b (block mode file), c (character mode file) or p (named pipe).
-name <pattern>: Find files whose names match the given <pattern>. With this option, <pattern> is treated as a shell globbing pattern (see chapter 35.0).
-iname <pattern>: Like -name, but ignore case.
-atime <n>, -amin <n>: Find files which have last been accessed <n> days ago (-atime) or <n> minutes ago (-amin). You can also specify +<n> or -<n>, in which case the search will be done for files accessed respectively more than or less than <n> days/minutes ago.
-anewer <file>: Find files which have been accessed more recently than file <file>.
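-ctime <n>, -cmin <n>, -cnewer <file>: Same as for -atime, -amin and -anewer, but applies to the last time the status of the file (owner, permissions, number of links and so on) was changed. To test the time when the contents of the file were last modified, use -mtime <n>, -mmin <n> and -newer <file> instead.
-regex <pattern>: As for -name, but <pattern> is treated as a regular expression, which must match the whole path of the file.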
-iregex <pattern>: As for -regex, but ignore case.

There are many other tests, refer to the man page for more details. To combine tests, you can use one of:

<c1> -a <c2>: True if both <c1> and <c2> are true; -a is implicit, therefore you can type <c1> <c2> <c3> ... if you want all tests <c1>, <c2>, ... to match.
<c1> -o <c2>: True if either <c1> or <c2> is true, or both. Note that -o has a lower precedence than -a, therefore if you want, say, to match files which match criteria <c1> or <c2> and match criterion <c3>, you will have to use parentheses and write ( <c1> -o <c2> ) -a <c3>. You must escape (deactivate) the parentheses, as otherwise they will be interpreted by the shell!
-not <c1>: Inverts test <c1>, therefore -not <c1> is true if <c1> is false.

Finally, you can specify an action for each file found. The most frequently used are:

-print: Just prints the name of each file on standard output. This is the default action if you don't specify any.
-ls: Prints the equivalent of ls -ilds on each file found on the standard output.
-exec <command>: Execute command <command> on each file found. The command line <command> must end with a ;, which you must escape so that the shell does not interpret it; the file position is marked with {}. See the examples of usage to figure this out.
-ok <command>: Same as -exec but ask confirmation for each command.

Still here? OK, now let's practice a little, as it's still the best way to figure out this monster. Let's say you want to find all directories in /usr/share. Then you will type:

find /usr/share -type d
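To see how -type combines with the -maxdepth and -mindepth options without scanning a real system directory, here is a small sketch on a throwaway tree. The directory layout and the file names are invented for the purpose.

```shell
# Throwaway tree invented for this demonstration
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b/c"
touch "$tmp/top.txt" "$tmp/a/mid.txt" "$tmp/a/b/c/deep.txt"

# All directories, starting from the current one
all_dirs=$(cd "$tmp" && find . -type d | LC_ALL=C sort)

# Regular files located at most one level below the starting point
shallow=$(cd "$tmp" && find . -maxdepth 1 -type f)

# Regular files located at least three levels below the starting point
deep=$(cd "$tmp" && find . -mindepth 3 -type f)

printf '%s\n' "$all_dirs" "$shallow" "$deep"
rm -rf "$tmp"
```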

Suppose you have an HTTP server, all your HTML files are in /home/httpd/html, which is also your current directory. You want to find all files whose contents have not been modified for a month. As you got pages from several writers, some files have the html extension and some have the htm extension. You want to link these files in directory /home/httpd/obsolete. You will then type:

find \( -name "*.htm" -o -name "*.html" \) -a -mtime +30 -exec ln {} /home/httpd/obsolete \;

Okay, this one is a little complex and requires a little explanation. The criterion is this:

\( -name "*.htm" -o -name "*.html" \) -a -mtime +30

which does what we want: it finds all files whose names end with either .htm or .html (\( -name "*.htm" -o -name "*.html" \)), and (-a) whose contents were last modified more than 30 days ago, which is roughly a month (-mtime +30; -mtime tests the time the contents were last modified, and the + sign means "more than"). Note the parentheses: they are necessary here, because -a has a higher precedence than -o. If there weren't any, all files ending with .htm would have been found, plus all files ending with .html which haven't been modified for a month, which is not what we want. Also note that the parentheses are escaped from the shell: if we had put ( .. ) instead of \( .. \), the shell would have interpreted them and tried to execute -name "*.htm" -o -name "*.html" in a subshell... Another solution would have been to put the parentheses between double quotes or single quotes, but a backslash is preferable here as we only have to isolate one character.

And finally, there is the command to be executed for each file:

-exec ln {} /home/httpd/obsolete \;

Here too, you have to escape the ; from the shell, as otherwise the shell interprets it as a command separator. If you don't do so, find will complain that -exec is missing an argument.
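The effect of the parentheses and of -exec can be checked on a scratch directory. This is only a sketch: the html and obsolete directories and the file names are invented, and the old files are aged with touch -t. It tests -mtime +30 (contents modified more than 30 days ago) because touch can set the modification time but not the status change time.

```shell
# Scratch directories and files invented for this demonstration
tmp=$(mktemp -d)
mkdir "$tmp/html" "$tmp/obsolete"
touch -t 200001010000 "$tmp/html/old.htm" "$tmp/html/old.html" "$tmp/html/old.txt"
touch "$tmp/html/fresh.htm" "$tmp/html/fresh.html"

# With escaped parentheses: (.htm or .html) and older than 30 days
grouped=$(cd "$tmp/html" &&
  find . \( -name "*.htm" -o -name "*.html" \) -a -mtime +30 | LC_ALL=C sort)

# Without them, -a binds tighter: .htm, or (.html and older than 30 days)
ungrouped=$(cd "$tmp/html" &&
  find . -name "*.htm" -o -name "*.html" -a -mtime +30 | LC_ALL=C sort)

# -exec runs ln once per file found; {} stands for the file, \; ends the command
(cd "$tmp/html" &&
  find . \( -name "*.htm" -o -name "*.html" \) -a -mtime +30 \
       -exec ln {} "$tmp/obsolete" \;)
linked=$(ls "$tmp/obsolete")

printf '%s\n' "$grouped" "--" "$ungrouped" "--" "$linked"
rm -rf "$tmp"
```

Without the parentheses, fresh.htm shows up as well, although it has just been created.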

A last example: you have a huge directory /shared/images, with all kinds of images in it. Regularly, you use the touch command to update the times of a file named stamp in this directory, so that you have a time reference. You want to find all JPEG images in it which are newer than the stamp file, and as you got images from various sources, these files have extensions jpg, jpeg, JPG or JPEG. You also want to avoid searching in directory old. You want to be mailed the list of these files, and your username is john:

find /shared/images -cnewer     \
     /shared/images/stamp       \
     -a -iregex ".*\.jpe?g"     \
     -a -not -regex ".*/old/.*" \
       | mail -s "New images" john
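Here is a way to try the stamp technique without a mail setup. It is a sketch under a few assumptions: GNU find is used (for -iregex and -not), the image names are invented, and -newer (which compares modification times) stands in for -cnewer, because modification times can be set with touch -t whereas status change times cannot.

```shell
# Scratch image directory invented for this demonstration
tmp=$(mktemp -d)
mkdir -p "$tmp/images/old"
touch -t 199901010000 "$tmp/images/before.jpg"   # older than the stamp
touch -t 200001010000 "$tmp/images/stamp"        # the time reference
touch "$tmp/images/new.JPEG" "$tmp/images/new.jpg" \
      "$tmp/images/notes.txt" "$tmp/images/old/skipped.jpg"

# Newer than the stamp, JPEG extension in any case, not below old/
found=$(find "$tmp/images" -newer "$tmp/images/stamp" \
             -a -iregex ".*\.jpe?g"                   \
             -a -not -regex ".*/old/.*" | LC_ALL=C sort)

printf '%s\n' "$found"
rm -rf "$tmp"
```

Only new.JPEG and new.jpg are listed: before.jpg is older than the stamp, notes.txt does not match the regular expression, and skipped.jpg lives below old.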

And here you are! Of course, this command is not very useful if you have to type it each time, and you would like it to be executed regularly... You can do so:

`crontab`: reporting or editing your `crontab` file

crontab is a command which allows you to execute commands at regular time intervals, with the added bonus that you don't have to be logged in and that the output report is mailed to you. You can specify the intervals in minutes, hours, days, and even months. Depending on the options, crontab will act differently:

-l: Print your current crontab file.
-e: Edit your crontab file.
-r: Remove your current crontab file.
-u <user>: Apply one of the above options for user <user>. Only root can do that.

Let's start by editing a crontab. If you type crontab -e, you will be in front of your favorite text editor if you have set the 'EDITOR' or 'VISUAL' environment variable, otherwise VI will be used. A line in a crontab file is made of six fields. The first five fields are time intervals for minutes, hours, days in the month, months and days in the week. The sixth field is the command to be executed. Lines beginning with a # are considered to be comments and will be ignored by crond (the program which is responsible for executing crontab files). Here is an example of crontab:

Note: in order to print this out in a readable font, we had to break up long lines. Therefore, some chunks must be typed on a single line. When the '\' character ends a line, this means this line has to be continued. This convention works in Makefile files and in the shell, as well as in other contexts.

# If you don't want to be sent mail, just uncomment
#   the following line
#MAILTO=""
#
# Report every 2 days about new images at 2 pm,
#   from the example above - after that, "retouch"
#   the "stamp" file. The ";" separates the two
#   commands. Do not use "%" for this: cron turns an
#   unescaped "%" into a newline and passes whatever
#   follows it to the command as standard input.
0 14 */2 * *  find /shared/images              \
  -cnewer /shared/images/stamp                 \
  -a -iregex ".*\.jpe?g"                       \
  -a -not -regex                               \
    ".*/old/.*"; touch /shared/images/stamp
#
# Every Christmas, play a melody :)
0 0 25 12 * mpg123 $HOME/sounds/merryxmas.mp3
#
# Every Tuesday at 5pm, print the shopping list...
0 17 * * 2 lpr $HOME/shopping-list.txt

There are several other ways to specify intervals than the ones shown in this example. For example, you can specify a set of discrete values separated by commas (1,14,23) or a range (1-15), or even combine both of them (1-10,12-20), optionally with a step (1-12,20-27/2). Now it's up to you to find useful commands to put in it :)
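As an illustration of these notations, the sketch below writes a few entries to a scratch file instead of touching your real crontab. The echo commands are placeholders, and the awk line is only a rough sanity check that each entry has five time fields followed by a command; it is not how crond itself validates the file.

```shell
tmp=$(mktemp -d)
cat > "$tmp/example.crontab" <<'EOF'
# minute hour day-of-month month day-of-week command
# A list: at minutes 0 and 30 of every hour
0,30 * * * * echo "half-hourly"
# Ranges: on the hour from 9am to 5pm, Monday to Friday
0 9-17 * * 1-5 echo "office hours"
# Two ranges, the second with a step: days 1 to 12, then 20, 22, 24, 26
0 6 1-12,20-27/2 * * echo "selected days"
EOF

# Count the entries, and the entries with fewer than six fields
entries=$(grep -v '^#' "$tmp/example.crontab" | wc -l | tr -d ' ')
bad=$(awk '!/^#/ && NF < 6 { n++ } END { print n + 0 }' "$tmp/example.crontab")
echo "$entries entries, $bad malformed"

# To install such a file (replacing your current crontab) you would run:
#   crontab "$tmp/example.crontab"
rm -rf "$tmp"
```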

`at`: schedule a command, but only once

You may also want to launch a command on a given day, but not regularly. For example, you want to be reminded of an appointment, today at 6pm. You run X, and you'd like to be notified at 5:30pm, for example, that you must go. at is what you want here:

$ at 5:30pm
  # You're now in front of the "at" prompt
at> xmessage "Time to go now! Appointment at 6pm"
  # Type C-d to exit
at> <EOT>
$

You can specify the time in different manners:

now + <interval>: Means, well, now, plus an optional interval (with no interval specified, the job is run right away). The syntax for the interval is
<n> (minutes|hours|days|weeks|months).
For example, you can specify now + 1 hour, now + 3 days and so on.
<time> <day>: Fully specify the date. The <time> parameter is mandatory. at is very liberal in what it accepts: you can for example type 0100, 04:20, 2am, 0530pm, 1800, or one of three special values: noon, teatime (4pm) or midnight. The <day> parameter is optional. You can specify it in different manners as well: 12/20/2001 for example, which stands for December 20th, 2001, or, the European way, 20.12.2001. You may omit the year, but then only the European notation is accepted: 20.12. You can also specify the month in full letters: Dec 20 or 20 Dec are both valid.

at also accepts different options:

-l: Prints the list of currently queued jobs; the first field is the job number. This is equivalent to the atq command.
-d <n>: Remove job number <n> from the queue. You can obtain job numbers from atq. This is equivalent to atrm <n>.

As usual, see the at(1) manpage for more options.
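Since at reads the commands to run from its standard input, the interactive session above can also be wrapped in a small shell function. The sketch below is hypothetical: the name remind is invented, it assumes that at, a running atd and xmessage are available, and it does not cope with double quotes inside the message.

```shell
# remind <message> <time specification...>
# Queue an xmessage reminder through at; "remind" is an invented name.
remind() {
    msg=$1
    shift
    # at reads the job from standard input; the remaining
    # arguments are passed to it as the time specification
    printf 'xmessage "%s"\n' "$msg" | at "$@"
}

# Possible uses (not run here, as they would queue real jobs):
#   remind "Time to go now! Appointment at 6pm" 5:30pm
#   remind "Tea is ready" now + 5 minutes
```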

`tar`: Tape ARchiver

Although we have already seen a use for tar in chapter 21.0, we haven't explained how it works. This is what this section is here for. As for find, tar is a long-standing Unix utility, and as such its syntax is a bit special. The syntax is:

tar [options] [files...]

Now, here is a list of options. Note that all of them have an equivalent long option, but you will have to refer to the manual page for this as they won't be listed here. And of course, not all options will be listed either :)

Note: the initial dash (-) of short options is optional with tar, except after a long option.

c: This option is used in order to create new archives.
x: This option is used in order to extract files from an existing archive.
t: List files from an existing archive.
v: This will simply list the files as they are added to an archive or extracted from an archive, or, in conjunction with the t option (see above), it outputs a long listing of files instead of a short one.
f <file>: Create archive with name <file>, extract from archive <file> or list files from archive <file>. If this parameter is not given, the default file will be /dev/rmt0, which is generally the special file associated to a streamer. If the file parameter is - (a dash), the input or output (depending on whether you create an archive or extract from one) will be associated to the standard input or standard output.
z: Tells tar that the archive to create should be compressed with gzip, or that the archive to extract from is compressed with gzip.
y: Same as z, but the program used for compression is bzip2. This letter depends on the version of tar: current versions of GNU tar use j for bzip2.
p: When extracting files from an archive, preserve all file attributes, including ownership, last access time and so on. Very useful for filesystem dumps.
r: Append the list of files given on the command line to an existing archive. Note that the archive to which you want to append files should not be compressed!
A: Append archives given on the command line to the one submitted with the f option. Similarly to r, the archives should not be compressed in order for this to work.
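The c, r and t options can be seen working together in a short sketch. The archive notes.tar and the two text files are invented for the demonstration; note that the archive is not compressed, as required for r.

```shell
# Scratch files invented for this demonstration
tmp=$(mktemp -d)
echo one > "$tmp/one.txt"
echo two > "$tmp/two.txt"

# Create an uncompressed archive, then append a second file to it
(cd "$tmp" && tar cf notes.tar one.txt && tar rf notes.tar two.txt)

# List the members: both files are now in the archive
members=$(tar tf "$tmp/notes.tar")
printf '%s\n' "$members"
rm -rf "$tmp"
```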

There are many, many, many other options; you may want to refer to the tar(1) manual page for a whole list. See, for example, the d option. Now, on to a little practice. Say you want to create an archive of all images in /shared/images, compressed with bzip2, named images.tar.bz2 and located in your home directory. You will then type:

#
 # Note: you must be in the directory from which
 #   you want to archive files!
 #
$ cd /shared
$ tar cyf ~/images.tar.bz2 images/

As you can see, we have used three options here: c told tar that we wanted to create an archive, y told it that we wanted it compressed with bzip2, and f ~/images.tar.bz2 told it that the archive was to be created in our home directory, with name images.tar.bz2. We may want to check if the archive is valid now. We can just check this out by listing its files:

#
 # Get back to our home directory
 #
$ cd
$ tar tyvf images.tar.bz2

Here, we told tar to list (t) files from archive images.tar.bz2 (f images.tar.bz2), warned that this archive was compressed with bzip2 (y), and that we wanted a long listing (v). Now, say you have erased the images directory. Fortunately, your archive is intact, and you now want to extract it back to its original place, in /shared. But as you don't want to break your find command for new images, you need to preserve all file attributes:

#
 # cd to the directory where you want to extract
 #
$ cd /shared
$ tar yxpf ~/images.tar.bz2

And here you are!

Now, let's say you want to extract the directory images/cars from the archive, and nothing else. Then you can type this:

$ tar yxf ~/images.tar.bz2 images/cars
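The whole cycle (create, list, extract one subdirectory) can be rehearsed safely in a scratch directory before you try it on real data. This sketch uses z (gzip) rather than y so that it does not depend on bzip2 support in your tar; the src and dest directories and the file names are invented.

```shell
# Scratch tree invented for this demonstration
tmp=$(mktemp -d)
mkdir -p "$tmp/src/images/cars" "$tmp/dest"
echo car  > "$tmp/src/images/cars/beetle.txt"
echo tree > "$tmp/src/images/tree.txt"

# Create the archive from the parent directory of images/
(cd "$tmp/src" && tar czf "$tmp/images.tar.gz" images/)

# List its contents to check that it is valid
listing=$(tar tzf "$tmp/images.tar.gz")

# Extract only images/cars, somewhere else
(cd "$tmp/dest" && tar xzf "$tmp/images.tar.gz" images/cars)
extracted=$(cd "$tmp/dest" && find . -type f)

printf '%s\n' "$listing" "--" "$extracted"
rm -rf "$tmp"
```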

There is no need to worry about special files: if you try to back them up, tar will take them as what they are, special files, and will not dump their contents. So yes, you can safely put /dev/mem in an archive :) Oh, and it also deals correctly with links, so do not worry about this either. For symbolic links, also look at the h option in the manpage.

`bzip2` and `gzip`: data compression programs

You can see that we have already talked about these two programs when dealing with tar. Unlike WinZip under Windows, archiving and compressing are done using two separate utilities -- tar for archiving, and the two programs which we will now introduce for compressing data, bzip2 and gzip.

bzip2 was originally written as a replacement for gzip. Its compression ratios are generally better, but on the other hand it is more memory-greedy. The reason why gzip is still here is that it is still more widespread than bzip2. Maybe bzip2 will eventually replace gzip, but maybe not.

Both commands have a similar syntax:

gzip [options] [file(s)]

If no filename is given, both gzip and bzip2 will wait for data from the standard input and send the result to the standard output. Therefore, you can use both programs in pipes. Both programs also have a set of common options:

-1, ..., -9: Set the compression ratio. The higher the number, the better the compression, but better also means slower: "There's no such thing as a free lunch".
-d: Uncompress file(s). This is equivalent to using gunzip or bunzip2.
-c: Dump the result of compression/decompression of files given on the command line to the standard output.

Watch out! By default, both gzip and bzip2 erase the file(s) that they have compressed (or uncompressed) if you don't use the -c option. You can avoid it with bzip2 by using the -k option, but versions of gzip older than 1.6 have no such option!
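The difference between compressing in place and using -c is easy to observe on scratch files. The sketch below uses gzip only, and the names keep.txt and replace.txt are invented for the demonstration.

```shell
# Scratch files invented for this demonstration
tmp=$(mktemp -d)
printf 'some text\n' > "$tmp/keep.txt"
cp "$tmp/keep.txt" "$tmp/replace.txt"

# With -c the result goes to standard output: the original stays
gzip -9 -c "$tmp/keep.txt" > "$tmp/keep.txt.gz"
[ -f "$tmp/keep.txt" ] && kept=yes || kept=no

# Without -c the original is replaced by replace.txt.gz
gzip -9 "$tmp/replace.txt"
[ -f "$tmp/replace.txt" ] && replaced=no || replaced=yes

# -d uncompresses; combined with -c nothing is written to disk
restored=$(gzip -dc "$tmp/replace.txt.gz")

# With no filename at all, gzip works as a filter in a pipe
piped=$(printf 'via a pipe\n' | gzip | gzip -d)

echo "kept=$kept replaced=$replaced restored=$restored piped=$piped"
rm -rf "$tmp"
```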

Now some examples. Let's say you want to compress all files ending with .txt in the current directory using bzip2, you will then use:

$ bzip2 -9 *.txt

Let's say you want to share your images archive with someone, but he hasn't got bzip2, only gzip. You don't need to uncompress the archive and recompress it, you can just uncompress to the standard output, use a pipe, compress from standard input and redirect the output to the new archive:

bzip2 -dc images.tar.bz2 | gzip -9 >images.tar.gz

And here you are. You could have typed bzcat instead of bzip2 -dc. There is an equivalent for gzip but its name is zcat, not gzcat. You also have bzless (resp. zless) if you want to view compressed files directly, without having to uncompress them first. As an exercise, try and find the command you would have to type in order to view compressed files without uncompressing them, and without using bzless or zless :)

Many, many more...

There are so many commands that a comprehensive book about them would be the size of an encyclopedia. This chapter hasn't even covered a tenth of the subject, yet you can do much with what you learnt here. If you wish, you may read some manual pages: sort(1), sed(1), zip(1) (yes, that's what you think: you can extract or make ZIP archives with Linux), convert(1), and so on. The best way to get accustomed to these tools is to practice and experiment with them, and you will probably find a lot of uses for them, even quite unexpected ones. Have fun! :)
