Chapter 2: How does zsh differ from...?

As has already been mentioned, zsh is most similar to ksh, while many of the additions are to please csh users. Here are some more detailed notes. See also the article `UNIX shell differences and how to change your shell' posted frequently to the USENET group

2.1: Differences from sh and ksh

Most features of ksh (and hence also of sh) are implemented in zsh; problems can arise because the implementation is slightly different. Note also that not all ksh's are the same either. I have based this on the 11/16/88f version of ksh; differences from ksh93 will be more substantial.

As a summary of the status:

  1. because of all the options it is not safe to assume a general zsh run by a user will behave as if sh or ksh compatible;
  2. invoking zsh as sh or ksh (or if either is a symbolic link to zsh) sets appropriate options and improves compatibility (from within zsh itself, calling ARGV0=sh zsh will also work);
  3. from version 3.0 onward the degree of compatibility with sh under these circumstances is very high: zsh can now be used with GNU configure or perl's Configure, for example;
  4. the degree of compatibility with ksh is also high, but a few things are missing: for example the more sophisticated pattern-matching expressions are different for versions before 3.1.3 --- see the detailed list below;
  5. also from 3.0, the command `emulate' is available: `emulate ksh' and `emulate sh' set various options as well as changing the effect of single-letter option flags as if the shell had been invoked with the appropriate name. Including the command `emulate sh; setopt localoptions' in a shell function will turn on sh emulation for that function only. In version 4 (and in 3.0.6 through 8), this can be abbreviated as `emulate -L sh'.
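
As a rough illustration of the last point, the sketch below confines sh-style word splitting to a single function with `emulate -L sh'. The function name count_words is invented for this example, and the test on ZSH_VERSION is an addition so that the same text can be run unchanged by sh or bash; neither is part of the zsh distribution.

```shell
# Hypothetical helper: count the whitespace-separated words in a string,
# relying on sh-style splitting of an unquoted parameter.
count_words() {
  # In zsh, turn on sh emulation for this function only (-L makes it local).
  # The guard on ZSH_VERSION is an addition so other shells skip the line.
  if [ -n "${ZSH_VERSION:-}" ]; then
    emulate -L sh
  fi
  set -- $1      # unquoted on purpose: split into words as sh would
  echo $#
}

count_words "one two three"   # prints 3
```

Without the emulate line, zsh with its default options would leave the string as a single word; the options in force outside the function are not touched.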

The classic difference is word splitting, discussed in question 3.1; this catches out very many beginning zsh users. As explained there, this is actually a bug in every other shell. The answer is to set SH_WORD_SPLIT for backward compatibility. The next most classic difference is that unmatched glob patterns cause the command to abort; set NO_NOMATCH for those.
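
A small sketch of the second point, assuming the path /no/such/dir does not exist on your system (it is made up for the example). The guard on ZSH_VERSION is again an addition so that other shells skip the setopt line.

```shell
# Only zsh has (or needs) this option.
if [ -n "${ZSH_VERSION:-}" ]; then
  setopt NO_NOMATCH
fi

# /no/such/dir is a hypothetical path assumed not to exist, so the
# pattern matches nothing and is passed to echo unchanged.
unmatched=$(echo /no/such/dir/*.xyz)
echo "$unmatched"
```

With the option unset, zsh would report that there are no matches and would not run echo at all.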

Here is a list of various options which will increase ksh compatibility, though maybe decrease zsh's abilities: see the manual entries for GLOB_SUBST, IGNORE_BRACES (though brace expansion occurs in some versions of ksh), KSH_ARRAYS, KSH_GLOB, KSH_OPTION_PRINT, LOCAL_OPTIONS, NO_BAD_PATTERN, NO_BANG_HIST, NO_EQUALS, NO_HUP, NO_NOMATCH, NO_RCS, NO_SHORT_LOOPS, PROMPT_SUBST, RM_STAR_SILENT, POSIX_ALIASES, POSIX_BUILTINS, POSIX_IDENTIFIERS, SH_FILE_EXPANSION, SH_GLOB, SH_OPTION_LETTERS, SH_WORD_SPLIT (see question 3.1) and SINGLE_LINE_ZLE. Note that you can also disable any built-in commands which get in your way. If invoked as `ksh', the shell will try to set suitable options.

Here are some differences from ksh which might prove significant for ksh programmers, some of which may be interpreted as bugs; there must be more. Note that this list is deliberately rather full and that most of the items are fairly minor. Those marked `*' perform in a ksh-like manner if the shell is invoked with the name `ksh', or if `emulate ksh' is in effect. Capitalised words with underlines refer to shell options.

2.2: Similarities with csh

Although certain features aim to ease the withdrawal symptoms of csh (ab)users, the syntax is in general rather different and you should certainly not try to run scripts without modification. The c2z script is provided with the source (in Misc/c2z) to help convert .cshrc and .login files; see also the next question concerning aliases, particularly those with arguments.

Csh-compatibility additions include:

2.3: Why do my csh aliases not work? (Plus other alias pitfalls.)

First of all, check you are using the syntax

    alias newcmd='list of commands'
and not

    alias newcmd 'list of commands'
which won't work. (It tells you if `newcmd' and `list of commands' are already defined as aliases.)

Otherwise, your aliases probably contain references to the command line of the form \!*, etc. Zsh does not handle this behaviour as it has shell functions which provide a way of solving this problem more consistent with other forms of argument handling. For example, the csh alias

    alias cd 'cd \!*; echo $cwd'
can be replaced by the zsh function,

    cd() { builtin cd "$@"; echo $PWD; }
(the `builtin' tells zsh to use its own `cd', avoiding an infinite loop) or, perhaps better,

    cd() { builtin cd "$@"; print -D $PWD; }
(which converts your home directory to a ~). In fact, this problem is better solved by defining the special function chpwd() (see the manual). Note also that the ; at the end of the function is optional in zsh, but not in ksh or sh (for sh's where it exists).
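
A minimal sketch of the chpwd() approach, reusing the `print -D' call from the example above; treat it as a starting point rather than a recommended definition.

```shell
# zsh calls a function named chpwd, if one is defined, every time the
# current directory changes, so cd itself does not need to be redefined.
chpwd() {
  # print -D (a zsh builtin) shows directory prefixes such as $HOME as ~
  print -D "$PWD"
}
```

Because the hook runs on any change of directory, it also covers pushd and popd, which a redefined cd would miss.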

Here is Bart Schaefer's guide to converting csh aliases for zsh.

  1. If the csh alias references "parameters" (\!:1, \!* etc.), then in zsh you need a function (referencing $1, $* etc.). Otherwise, you can use a zsh alias.

  2. If you use a zsh function, you need to refer _at_least_ to $* in the body (inside the { }). Parameters don't magically appear inside the { } the way they get appended to an alias.

  3. If the csh alias references its own name (alias rm "rm -i"), then in a zsh function you need the "command" or "builtin" keyword (function rm() { command rm -i "$@" }), but in a zsh alias you don't (alias rm="rm -i").

  4. If you have aliases that refer to each other (alias ls "ls -C"; alias lf "ls -F" ==> lf == ls -C -F) then you must either:

    Those first four are all you really need, but here are four more for heavy csh alias junkies:

  5. Mapping from csh alias "parameter referencing" into zsh function (assuming SH_WORD_SPLIT and KSH_ARRAYS are NOT set in zsh):
          csh             zsh
         =====         ==========
         \!*           $*              (or $argv)
         \!^           $1              (or $argv[1])
         \!:1          $1
         \!:2          $2              (or $argv[2], etc.)
         \!$           $*[$#]          (or $argv[$#], or $*[-1])
         \!:1-4        $*[1,4]
         \!:1-         $*[1,$#-1]      (or $*[1,-2])
         \!^-          $*[1,$#-1]
         \!*:q         "$@"
         \!*:x         $=*             ($*:x doesn't work (yet))

  6. Remember that it is NOT a syntax error in a zsh function to refer to a position ($1, $2, etc.) greater than the number of parameters. (E.g., in a csh alias, a reference to \!:5 will cause an error if 4 or fewer arguments are given; in a zsh function, $5 is the empty string if there are 4 or fewer parameters.)

  7. To begin a zsh alias with a - (dash, hyphen) character, use alias --:
                 csh                            zsh
            ===============             ==================
            alias - "fg %-"             alias -- -="fg %-"

  8. Stay away from alias -g in zsh until you REALLY know what you're doing.
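
Two of these rules can be tried out with the short sketch below. The function names fifth, count_args and pass_all are invented for the illustration; the csh lines in the comments are only there for comparison.

```shell
# Rule 6. csh: alias fifth 'echo \!:5' fails unless at least five
# arguments are given. The function version just sees an empty $5.
fifth() {
  echo "fifth=[$5]"
}

# Rule 5. csh \!*:q becomes "$@", which keeps each argument as one word.
count_args() {
  echo $#
}
pass_all() {
  count_args "$@"
}

fifth a b c d          # prints fifth=[]
fifth a b c d e        # prints fifth=[e]
pass_all "a b" c       # prints 2
```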

There is one other serious problem with aliases: consider

    alias l='/bin/ls -F'
    l() { /bin/ls -la "$@" | more }
l in the function definition is in command position and is expanded as an alias, defining /bin/ls and -F as functions which call /bin/ls, which gets a bit recursive. This can be avoided if you use function to define a function, which doesn't expand aliases. It is possible to argue for extra warnings somewhere in this mess.

One workaround for this is to use the "function" keyword instead:

    alias l='/bin/ls -F'
    function l { /bin/ls -la "$@" | more }
The l after function is not expanded. Note you don't need the `()' in this case, although it's harmless.

You need to be careful if you are defining a function with multiple names; most people don't need to do this, so it's an unusual problem, but in case you do you should be aware that in versions of the shell before 5.1 names after the first were expanded:

    function a b c { ... }
Here, b and c, but not a, have aliases expanded. This oddity was fixed in version 5.1.

The rest of this item assumes you use the (more common, but equivalent) `()' definitions.

Bart Schaefer's rule is: Define first those aliases you expect to use in the body of a function, but define the function first if the alias has the same name as the function.

If you are aware of the problem, you can always escape part or all of the name of the function:

     'l'() { /bin/ls -la "$@" | more }
Adding the quotes has no effect on the function definition, but suppresses alias expansion for the function name. Hence this is guaranteed to be safe---unless you are in the habit of defining aliases for expressions such as 'l', which is valid, but probably confusing.

2.4: Similarities with tcsh

(The sections on csh apply too, of course.) Certain features have been borrowed from tcsh, including $watch, run-help, $savehist, periodic commands etc., extended prompts, sched and which built-ins. Programmable completion was inspired by, but is entirely different to, tcsh's complete. (There is a perl script called lete2ctl in the Misc directory of the source distribution to convert complete to compctl statements.) This list is not definitive: some features have gone in the other direction.

If you're missing the editor function run-fg-editor, try something with bindkey -s (which binds a string to a keystroke), e.g.

    bindkey -s '^z' '\eqfg %$EDITOR:t\n'
which pushes the current line onto the stack and tries to bring a job with the basename of your editor into the foreground. bindkey -s allows limitless possibilities along these lines. You can execute any command in the middle of editing a line in the same way, corresponding to tcsh's -c option:

    bindkey -s '^p' '\eqpwd\n'
In both these examples, the \eq saves the current input line to be restored after the command runs; a better effect with multiline buffers is achieved if you also have

    bindkey '\eq' push-input
to save the entire buffer. In version 4 and recent versions of zsh 3.1, you have the following more sophisticated option,

    run-fg-editor() {
      zle push-input
      BUFFER="fg %$EDITOR:t"
      zle accept-line
    }
    zle -N run-fg-editor
and can now bind run-fg-editor just like any other editor function.

2.5: Similarities with bash

The Bourne-Again Shell, bash, is another enhanced Bourne-like shell; the most obvious difference from zsh is that it does not attempt to emulate the Korn shell. Since both shells are under active development it is probably not sensible to be too specific here. Broadly, bash has paid more attention to standards compliance (i.e. POSIX) for longer, and has so far avoided the more abstruse interactive features (programmable completion, etc.) that zsh has.

In recent years there has been a certain amount of crossover in the extensions, however. Zsh (as of 3.1.6) has bash's `${var/old/new}' feature for replacing the text old with the text new in the parameter $var. Note one difference here: while both shells implement the syntax `${var/#old/new}' and `${var/%old/new}' for anchoring the match of old to the start or end of the parameter text, respectively, in zsh you can't put the `#' or `%' inside a parameter: in other words `${var/$old/new}' where old begins with a `#' treats that as an ordinary character in zsh, unlike bash. To do this sort of thing in zsh you can use (from 3.1.7) the new syntax for anchors in any pattern, `(#s)' to match the start of a string, and `(#e)' to match the end. These require the option EXTENDED_GLOB to be set.
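
The forms that the two shells share can be tried with the sketch below; the variable names and the sample string are made up for the illustration. It covers only the common syntax, not the `(#s)' and `(#e)' anchors.

```shell
text="foo-bar-foo"

# Unanchored: replace the first match of bar.
first=${text/bar/X}
# A leading # anchors the match to the start of the text.
at_start=${text/#foo/X}
# A leading % anchors the match to the end of the text.
at_end=${text/%foo/X}
# bar is not at the start, so the anchored match fails and nothing changes.
no_match=${text/#bar/X}

echo "$first"      # prints foo-X-foo
echo "$at_start"   # prints X-bar-foo
echo "$at_end"     # prints foo-bar-X
echo "$no_match"   # prints foo-bar-foo
```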

2.6: Shouldn't zsh be more/less like ksh/(t)csh?

People often ask why zsh has all these `unnecessary' csh-like features, or alternatively why zsh doesn't understand more csh syntax. This is far from a definitive answer and the debate will no doubt continue.

Paul's object in writing zsh was to produce a ksh-like shell which would have features familiar to csh users. For a long time, csh was the preferred interactive shell and there is a strong resistance to changing to something unfamiliar, hence the additional syntax and CSH_JUNKIE options. This argument still holds. On the other hand, the arguments for having what is close to a plug-in replacement for ksh are, if anything, even more powerful: the deficiencies of csh as a programming language are well known (look in any Usenet FAQ archive, e.g. shell/csh-whynot/faq.html if you are in any doubt) and zsh is able to run many standard scripts such as /etc/rc.

Of course, this makes zsh rather large and feature-ridden so that it seems to appeal mainly to hackers. The only answer, perhaps not entirely satisfactory, is that you have to ignore the bits you don't want. The introduction of loadable modules in version 3.1 should help.

2.7: What is zsh's support for Unicode/UTF-8?

`Unicode', or UCS for Universal Character Set, is the modern way of specifying character sets. It replaces a large number of ad hoc ways of supporting character sets beyond ASCII. `UTF-8' is an encoding of Unicode that is particularly natural on Unix-like systems.

The production branch of zsh, 4.2, has very limited support: the built-in printf command supports "\u" and "\U" escapes to output arbitrary Unicode characters; ZLE (the Zsh Line Editor) has no concept of character encodings, and is confused by multi-octet encodings.

However, the 4.3 branch has much better support, and furthermore this is now fairly stable. (Only a few minor areas need fixing before this becomes a production release.) This is discussed more fully below, see `Multibyte input and output'.