16. Compatibility with other versions of m4
This chapter describes the differences between this implementation of m4 and the implementation found under UNIX, notably System V, Release 3.
There are also differences in BSD flavors of m4. No attempt is made to summarize these here.
16.1 Extensions in GNU m4
16.2 Facilities in System V m4 not in GNU m4
16.3 Other incompatibilities
16.1 Extensions in GNU m4
This version of m4 contains a few facilities that do not exist in System V m4. These extra facilities are all suppressed by using the ‘-G’ command line option (see section Invoking m4), unless overridden by other command line options.
In the $n notation for macro arguments, n can contain several digits, while the System V m4 only accepts one digit. This allows macros in GNU m4 to take any number of arguments, and not only nine (see section Arguments to macros).
This means that define(`foo', `$11') is ambiguous between implementations. To portably choose between grabbing the first parameter and appending 1 to the expansion, or grabbing the eleventh parameter, you can do the following:
    define(`a1', `A1')
    ⇒
    dnl First argument, concatenated with 1
    define(`_1', `$1')define(`first1', `_1($@)1')
    ⇒
    dnl Eleventh argument, portable
    define(`_9', `$9')define(`eleventh', `_9(shift(shift($@)))')
    ⇒
    dnl Eleventh argument, GNU style
    define(`Eleventh', `$11')
    ⇒
    first1(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
    ⇒A1
    eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
    ⇒k
    Eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
    ⇒k
Also see the argn macro (see section Recursion in m4).
The divert macro (see section Diverting output) can manage more than 9 diversions. GNU m4 treats all positive numbers as valid diversions, rather than discarding diversions greater than 9.
Files included with include and sinclude are sought in a user-specified search path if they are not found in the working directory. The search path is specified by the ‘-I’ option and the M4PATH environment variable (see section Searching for include files).
Arguments to undivert can be non-numeric, in which case the named file will be included uninterpreted in the output (see section Undiverting output).
Formatted output is supported through the format builtin, which is modeled after the C library function printf (see section Formatting strings (printf-like)).
Searches and text substitution through regular expressions are supported by the regexp (see section Searching for regular expressions) and patsubst (see section Substituting text by regular expression) builtins.
The output of shell commands can be read into m4 with esyscmd (see section Reading the output of commands).
There is indirect access to any builtin macro with builtin (see section Indirect call of builtins).
Macros can be called indirectly through indir (see section Indirect call of macros).
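The name of the program, the current input file, and the current input line number are accessible through the builtins __program__, __file__, and __line__ (see section Printing current location).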
The format of the output from dumpdef and macro tracing can be controlled with debugmode (see section Controlling debugging output).
The destination of trace and debug output can be controlled with debugfile (see section Saving debugging output).
The maketemp macro (see section Making temporary files) behaves like mkstemp, creating a new file with a unique name on every invocation, rather than following the insecure behavior of replacing the trailing ‘X’ characters with the m4 process id.
In addition to the above extensions, GNU m4 implements the following command line options: ‘-F’, ‘-G’, ‘-I’, ‘-L’, ‘-R’, ‘-V’, ‘-W’, ‘-d’, ‘-i’, ‘-l’, ‘--debugfile’ and ‘-t’. See section Invoking m4, for a description of these options.
Also, the debugging and tracing facilities in GNU m4 are much more extensive than in most other versions of m4.
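A few of the extensions listed above can be tried in a short session. The following is only a sketch: it assumes GNU m4 run without ‘-G’, it follows the manual's convention that lines starting with ‘⇒’ show output rather than input, and the diversion number 12 and the sample strings are arbitrary choices made for illustration.

```m4
dnl Diversion 12 is kept, where System V m4 would discard it.
divert(`12')kept in diversion 12
divert(`0')dnl
undivert(`12')dnl
⇒kept in diversion 12
dnl printf-like formatting through the format builtin.
format(`%05d', `42')
⇒00042
dnl A macro whose name is not a valid word can only be called through indir.
define(`$$internal$macro', `Internal macro (name `$0')')
⇒
indir(`$$internal$macro')
⇒Internal macro (name $$internal$macro)
```

Input that has to run under other implementations as well should avoid all three constructs, since none of them is available in System V m4.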
16.2 Facilities in System V m4 not in GNU m4
The version of m4 from System V contains a few facilities that have not been implemented in GNU m4 yet. Additionally, POSIX requires some behaviors that GNU m4 has not implemented yet. Relying on these behaviors is non-portable, as a future release of GNU m4 may change.
System V m4 supports multiple arguments to defn, and POSIX requires it. This is not yet implemented in GNU m4. Unfortunately, this means it is not possible to mix builtins and other text into a single macro; a helper macro is required.
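POSIX requires an application to exit with non-zero status if it wrote an error message to stderr. This has not yet been consistently implemented for the various builtins that are required to issue an error (such as include (see section Including named files) when a file is unreadable, eval (see section Evaluating integer expressions) when an argument cannot be parsed, or using m4exit (see section Exiting from m4) with a non-numeric argument).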
Some traditional implementations only allow reading standard input once, but GNU m4 correctly handles multiple instances of ‘-’ on the command line.
POSIX requires m4wrap (see section Saving text until end of input) to act in FIFO (first-in, first-out) order, but GNU m4 currently uses LIFO order. Furthermore, POSIX states that only the first argument to m4wrap is saved for later evaluation, but GNU m4 saves and processes all arguments, with output separated by spaces.
However, it is possible to emulate POSIX behavior by including the file ‘m4-1.4.10/examples/wrapfifo.m4’ from the distribution:
    undivert(`wrapfifo.m4')dnl
    ⇒dnl Redefine m4wrap to have FIFO semantics.
    ⇒define(`_m4wrap_level', `0')dnl
    ⇒define(`m4wrap',
    ⇒`ifdef(`m4wrap'_m4wrap_level,
    ⇒       `define(`m4wrap'_m4wrap_level,
    ⇒               defn(`m4wrap'_m4wrap_level)`$1')',
    ⇒       `builtin(`m4wrap', `define(`_m4wrap_level',
    ⇒                                  incr(_m4wrap_level))dnl
    ⇒m4wrap'_m4wrap_level)dnl
    ⇒define(`m4wrap'_m4wrap_level, `$1')')')dnl
    include(`wrapfifo.m4')
    ⇒
    m4wrap(`a`'m4wrap(`c
    ', `d')')m4wrap(`b')
    ⇒
    ^D
    ⇒abc
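POSIX states that builtins that require arguments, but are called without arguments, have undefined behavior. Traditional implementations simply behave as though empty strings had been passed. For example, a`'define`'b would expand to ab. But GNU m4 ignores certain builtins if they have missing arguments, giving adefineb for the above example.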
Traditional implementations handle define(`f',`1') (see section Defining a macro) by undefining the entire stack of previous definitions, as if doing undefine(`f') first. GNU m4 replaces just the top definition on the stack, as if doing popdef(`f') followed by pushdef(`f',`1'). POSIX allows either behavior.
POSIX requires syscmd (see section Executing simple commands) to evaluate command output for macro expansion, but this appears to be a mistake in POSIX since traditional implementations did not do this. GNU m4 follows traditional behavior in syscmd, and offers the extension esyscmd, which provides the POSIX semantics.
POSIX requires changequote(arg) (see section Changing the quote characters) to use newline as the close quote, but this was a bug, and the next version of POSIX is anticipated to state that using empty strings or just one argument is unspecified. Meanwhile, the GNU m4 behavior of treating an empty end-quote delimiter as ‘'’ is not portable, as Solaris treats it as repeating the start-quote delimiter, and BSD treats it as leaving the previous end-quote delimiter unchanged. For predictable results, never call changequote with just one argument, or with empty strings for arguments.
POSIX requires changecom(arg,) (see section Changing the comment delimiters) to make it impossible to end a comment, but this is a bug, and the next version of POSIX is anticipated to state that using empty strings is unspecified. Meanwhile, the GNU m4 behavior of treating an empty end-comment delimiter as newline is not portable, as BSD treats it as leaving the previous end-comment delimiter unchanged. It is also impossible in BSD implementations to disable comments, even though that is required by POSIX. For predictable results, never call changecom with empty strings for arguments.
Most implementations of m4 give macros a higher precedence than comments when parsing, meaning that if the start delimiter given to changecom (see section Changing the comment delimiters) starts with a macro name, comments are effectively disabled. POSIX does not specify what the precedence is, so the GNU m4 parser recognizes comments, then macros, then quoted strings.
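Traditional implementations allow argument collection, but not string and comment processing, to span file boundaries. Thus, if ‘a.m4’ contains ‘len(’, and ‘b.m4’ contains ‘abc)’, m4 a.m4 b.m4 outputs ‘3’ with traditional m4, but gives an error message that the end of file was encountered inside a macro with GNU m4. On the other hand, traditional implementations do end of file processing for files included with include or sinclude (see section Including named files), while GNU m4 seamlessly integrates the content of those files. Thus include(`a.m4')include(`b.m4') will output ‘3’ instead of giving an error.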
Traditional m4 treats traceon (see section Tracing macro calls) without arguments as a global variable, independent of named macro tracing. Also, once a macro is undefined, named tracing of that macro is lost. On the other hand, when GNU m4 encounters traceon without arguments, it turns tracing on for all existing definitions at the time, but does not trace future definitions; traceoff without arguments turns tracing off for all definitions regardless of whether they were also traced by name; and tracing by name, such as with ‘-tfoo’ at the command line or traceon(`foo') in the input, is an attribute that is preserved even if the macro is currently undefined.
POSIX requires eval (see section Evaluating integer expressions) to treat all operators with the same precedence as C. However, earlier versions of GNU m4 followed the traditional behavior of other m4 implementations, where bitwise and logical negation (‘~’ and ‘!’) had lower precedence than equality operators, and where equality operators (‘==’ and ‘!=’) had the same precedence as relational operators (such as ‘<’). Use explicit parentheses to ensure proper precedence. As extensions to POSIX, GNU m4 gives well-defined semantics to operations that C leaves undefined, such as when overflow occurs, when shifting negative numbers, or when performing division by zero. POSIX also requires ‘=’ to cause an error, but many traditional implementations allowed it as an alias for ‘==’.
POSIX requires translit (see section Translating characters) to treat each character of the second and third arguments literally, but GNU m4 treats ‘-’ as a range operator.
POSIX requires m4 to honor the locale environment variables of LANG, LC_ALL, LC_CTYPE, LC_MESSAGES, and NLSPATH, but this has not yet been implemented in GNU m4.
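POSIX states that only unquoted leading newlines and blanks (that is, space and tab) are ignored when collecting macro arguments. However, this appears to be a bug in POSIX, since most traditional implementations also ignore all whitespace (formfeed, carriage return, and vertical tab). GNU m4 follows tradition and ignores all leading unquoted whitespace.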
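The advice on eval precedence and the note on translit ranges can be illustrated with a short session. This is a sketch only: it assumes a GNU m4 that treats ‘-’ in translit as a range operator, the sample expressions are arbitrary, and lines starting with ‘⇒’ show output rather than input.

```m4
dnl With explicit parentheses the result does not depend on whether
dnl equality and relational operators share a precedence level.
eval(`(1 == 2) > 0')
⇒0
eval(`1 == (2 > 0)')
⇒1
dnl GNU m4 expands `a-z' and `A-Z' as ranges; an implementation that
dnl follows POSIX literally would only map the characters a, -, and z.
translit(`GNUs not Unix', `a-z', `A-Z')
⇒GNUS NOT UNIX
```

Where the same input must work with other implementations, spelling out every character in the second and third arguments of translit avoids relying on the range extension.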
16.3 Other incompatibilities
There are a few other incompatibilities between this implementation of m4 and the System V version.
GNU m4 implements sync lines differently from System V m4 when text is being diverted. GNU m4 outputs the sync lines when the text is being diverted, and System V m4 outputs them when the diverted text is being brought back.
The problem is which lines and file names should be attached to text that is being, or has been, diverted. System V m4 regards all the diverted text as being generated by the source line containing the undivert call, whereas GNU m4 regards the diverted text as being generated at the time it is diverted.
The sync line option is used mostly when using m4 as a front end to a compiler. If a diverted line causes a compiler error, the error messages should most probably refer to the place where the diversion was made, and not where it was inserted again.
    divert(2)2
    divert(1)1
    divert`'0
    ⇒#line 3 "stdin"
    ⇒0
    ^D
    ⇒#line 2 "stdin"
    ⇒1
    ⇒#line 1 "stdin"
    ⇒2
The current m4 implementation has a limitation that the syncline output at the start of each diversion occurs no matter what, even if the previous diversion did not end with a newline. This goes contrary to the claim that synclines appear on a line by themselves, so this limitation may be corrected in a future version of m4. In the meantime, when using ‘-s’, it is wisest to make sure all diversions end with newline.
GNU m4 makes no attempt at prohibiting self-referential definitions like:
    define(`x', `x')
    ⇒
    define(`x', `x ')
    ⇒
There is nothing inherently wrong with defining ‘x’ to return ‘x’. The wrong thing is to expand ‘x’ unquoted, because that would cause an infinite rescan loop.
In m4, one might use macros to hold strings, as we do for variables in other programming languages, further checking them with:
    ifelse(defn(`holder'), `value', …)
In cases like this one, forbidding a macro from holding its own name would be a useless limitation. Of course, this leaves more rope for the GNU m4 user to hang himself! Rescanning hangs may be avoided through careful programming, much as endless loops are avoided in traditional programming languages.
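As an illustration of the pattern described above, the following sketch stores a macro's own name in it and then tests the stored value safely. The macro name holder comes from the fragment above; the two result strings are hypothetical and chosen only for this example. Lines starting with ‘⇒’ show output rather than input.

```m4
dnl holder contains its own name; defn returns the definition quoted,
dnl so the text is compared without being rescanned as a macro call.
define(`holder', `holder')
⇒
ifelse(defn(`holder'), `holder', `holds its own name', `holds something else')
⇒holds its own name
```

Writing holder unquoted at the top level after this definition would rescan forever, which is why the comparison goes through defn.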
This document was generated by Super-User on November, 6 2008 using texi2html 1.78.