Chapter 2: How does zsh differ from...?

As has already been mentioned, zsh is most similar to ksh, while many of the additions are to please csh users. Here are some more detailed notes. See also the article `UNIX shell differences and how to change your shell' posted frequently to the USENET group comp.unix.shell.

2.1: Differences from sh and ksh

Most features of ksh (and hence also of sh) are implemented in zsh; problems can arise because the implementation is slightly different. Note also that not all ksh's are the same either. I have based this on the 11/16/88f version of ksh; differences from ksh93 will be more substantial.

As a summary of the status:

  1. because of all the options it is not safe to assume a general zsh run by a user will behave as if sh or ksh compatible;
  2. invoking zsh as sh or ksh (or if either is a symbolic link to zsh) sets appropriate options and improves compatibility (from within zsh itself, calling ARGV0=sh zsh will also work);
  3. from version 3.0 onward the degree of compatibility with sh under these circumstances is very high: zsh can now be used with GNU configure or perl's Configure, for example;
  4. the degree of compatibility with ksh is also high, but a few things are missing: for example the more sophisticated pattern-matching expressions are different for versions before 3.1.3 --- see the detailed list below;
  5. also from 3.0, the command `emulate' is available: `emulate ksh' and `emulate sh' set various options as well as changing the effect of single-letter option flags as if the shell had been invoked with the appropriate name. Including the command `emulate sh; setopt localoptions' in a shell function will turn on sh emulation for that function only. In version 4 (and in 3.0.6 through 3.0.8), this can be abbreviated as `emulate -L sh'.

The classic difference is word splitting, discussed in question 3.1; this catches out very many beginning zsh users. As explained there, this is actually a bug in every other shell. The answer is to set SH_WORD_SPLIT for backward compatibility. The next most classic difference is that unmatched glob patterns cause the command to abort; set NO_NOMATCH for those.

Here is a list of various options which will increase ksh compatibility, though maybe decrease zsh's abilities: see the manual entries for GLOB_SUBST, IGNORE_BRACES (though brace expansion occurs in some versions of ksh), KSH_ARRAYS, KSH_GLOB, KSH_OPTION_PRINT, LOCAL_OPTIONS, NO_BAD_PATTERN, NO_BANG_HIST, NO_EQUALS, NO_HUP, NO_NOMATCH, NO_RCS, NO_SHORT_LOOPS, PROMPT_SUBST, RM_STAR_SILENT, POSIX_ALIASES, POSIX_BUILTINS, POSIX_IDENTIFIERS, SH_FILE_EXPANSION, SH_GLOB, SH_OPTION_LETTERS, SH_WORD_SPLIT (see question 3.1) and SINGLE_LINE_ZLE. Note that you can also disable any built-in commands which get in your way. If invoked as `ksh', the shell will try to set suitable options.

Here are some differences from ksh which might prove significant for ksh programmers, some of which may be interpreted as bugs; there must be more. Note that this list is deliberately rather full and that most of the items are fairly minor. Those marked `*' perform in a ksh-like manner if the shell is invoked with the name `ksh', or if `emulate ksh' is in effect. Capitalised words with underlines refer to shell options.

itemize(

  • Syntax: itemize(
  • * Shell word splitting: see question 3.1.
  • * Arrays are (by default) more csh-like than ksh-like: subscripts start at 1, not 0; array[0] refers to array[1]; $array refers to the whole array, not $array[0]; braces are unnecessary: $a[1] == ${a[1]}, etc. Set the KSH_ARRAYS option for compatibility.
  • Furthermore, individual elements of arrays in zsh are always strings, not separate parameters. This means, for example, you can't `unset' an array element in zsh as you can in ksh; you can only set it to the empty string, or shorten the array. (You can unset elements of associative arrays in zsh because those are a completely different type of object.)
  • Coprocesses are established by coproc; |& behaves like csh. Handling of coprocess file descriptors is also different.
  • In cmd1 && cmd2 &, only cmd2 instead of the whole expression is run in the background in zsh. The manual implies this is a bug. Use { cmd1 && cmd2 } & as a workaround. )
  • Command line substitutions, globbing etc.: itemize(
  • * Failure to match a globbing pattern causes an error (use NO_NOMATCH).
  • * The results of parameter substitutions are treated as plain text: foo="*"; print $foo prints all files in ksh but * in zsh (use GLOB_SUBST).
  • * $PSn do not do parameter substitution by default (use PROMPT_SUBST).
  • * Standard globbing does not allow ksh-style `pattern-lists'. Equivalents:
    
    ----------------------------------------------------------------------
          ksh              zsh         Meaning
         ------           ------       ---------
         !(foo)            ^foo        Anything but foo.
                    or   foo1~foo2     Anything matching foo1 but foo2[1].
    @(foo1|foo2|...)  (foo1|foo2|...)  One of foo1 or foo2 or ...
         ?(foo)           (foo|)       Zero or one occurrences of foo.
         *(foo)           (foo)#       Zero or more occurrences of foo.
         +(foo)           (foo)##      One or more occurrences of foo.
    ----------------------------------------------------------------------
          
    
    The ^, ~ and # (but not |) forms require EXTENDED_GLOB. From version 3.1.3, the ksh forms are fully supported when the option KSH_GLOB is in effect; for previous versions you must use the table above.

    [1] See question 3.27 for more on the mysteries of ~ and ^.

  • Unquoted assignments do file expansion after :s (intended for PATHs).
  • * typeset and integer have special behaviour for assignments in ksh, but not in zsh. For example, this doesn't work in zsh:
    
              integer k=$(wc -l ~/.zshrc)
          
    
    because the output from wc includes leading whitespace which causes wordsplitting. Ksh handles the assignment specially as a single word. )
  • Command execution: itemize(
  • * There is no $ENV variable (use /etc/zshrc, ~/.zshrc; note also $ZDOTDIR).
  • * $PATH is not searched for commands specified at invocation without -c. )
  • Aliases and functions: itemize(
  • The order in which aliases and functions are defined is significant: function definitions with () expand aliases -- see question 2.3.
  • Aliases and functions cannot be exported.
  • There are no tracked aliases: command hashing replaces these.
  • The use of aliases for key bindings is replaced by `bindkey'.
  • * Options are not local to functions (use LOCAL_OPTIONS; note this may always be unset locally to propagate options settings from a function to the calling level).
  • Functions defined with `function funcname { body }' behave the same way as those defined with `funcname () { body }'. In ksh, the former behave as if the body were read from a file with `.', and only the latter behave as true functions. )
  • Traps and signals: itemize(
  • * Traps are not local to functions. The option LOCAL_TRAPS is available from 3.1.6.
  • TRAPERR has become TRAPZERR (this was forced by UNICOS which has SIGERR). )
  • Editing: itemize(
  • The options gmacs, viraw are not supported. Use bindkey to change the editing behaviour: set -o {emacs,vi} becomes `bindkey -{e,v}', although `set -o emacs' and `set -o vi' are supported for compatibility; for gmacs, go to emacs mode and use `bindkey \^t gosmacs-transpose-characters'.
  • The keyword option does not exist and -k is instead interactivecomments. (keyword is not in recent versions of ksh either.)
  • * Management of histories in multiple shells is different: the history list is not saved and restored after each command. The option SHARE_HISTORY appeared in 3.1.6 and is set in ksh compatibility mode to remedy this.
  • \ does not escape editing chars (use ^V).
  • Not all ksh bindings are set (e.g. <ESC>#; try <ESC>q).
  • * # in an interactive shell is not treated as a comment by default.
  • In vi command mode the keys "k" and "j" move the cursor to the end of the line. To move the cursor to the start instead, use
    
              bindkey -M vicmd 'k' vi-up-line-or-history
              bindkey -M vicmd 'j' vi-down-line-or-history
      
    
    )
  • Built-in commands: itemize(
  • Some built-ins (r, autoload, history, integer ...) were aliases in ksh.
  • There is no built-in command newgrp: use e.g. alias newgrp="exec newgrp"
  • jobs has no -n flag. )
  • Other idiosyncrasies: itemize(
  • select always redisplays the list of selections on each loop. ) )

    2.2: Similarities with csh

    Although certain features aim to ease the withdrawal symptoms of csh (ab)users, the syntax is in general rather different and you should certainly not try to run scripts without modification. The c2z script is provided with the source (in Misc/c2z) to help convert .cshrc and .login files; see also the next question concerning aliases, particularly those with arguments.

    Csh-compatibility additions include: itemize(

  • logout, rehash, source, (un)limit built-in commands.
  • *rc file for interactive shells.
  • Directory stacks.
  • cshjunkie*, ignoreeof options.
  • The CSH_NULL_GLOB option.
  • >&, |& etc. redirection. (Note that >file 2>&1 is the standard Bourne shell command for csh's >&file.)
  • foreach ... loops; alternative syntax for other loops.
  • Alternative syntax if ( ... ) ..., though this still doesn't work like csh: it expects a command in the parentheses. Also for, which.
  • $PROMPT as well as $PS1, $status as well as $?, $#argv as well as $#, ....
  • Escape sequences via % for prompts.
  • Special array variables $PATH etc. are colon-separated, $path are arrays.
  • !-type history (which may be turned off via setopt nobanghist).
  • Arrays have csh-like features (see under 2.1). )

    2.3: Why do my csh aliases not work? (Plus other alias pitfalls.)

    First of all, check you are using the syntax

    
        alias newcmd='list of commands'
      
    
    and not
    
        alias newcmd 'list of commands'
      
    
    which won't work. (It tells you if `newcmd' and `list of commands' are already defined as aliases.)

    Otherwise, your aliases probably contain references to the command line of the form \!*, etc. Zsh does not handle this behaviour as it has shell functions which provide a way of solving this problem more consistent with other forms of argument handling. For example, the csh alias

    
        alias cd 'cd \!*; echo $cwd'
      
    
    can be replaced by the zsh function,
    
        cd() { builtin cd "$@"; echo $PWD; }
      
    
    (the `builtin' tells zsh to use its own `cd', avoiding an infinite loop) or, perhaps better,
    
        cd() { builtin cd "$@"; print -D $PWD; }
      
    
    (which converts your home directory to a ~). In fact, this problem is better solved by defining the special function chpwd() (see the manual). Note also that the ; at the end of the function is optional in zsh, but not in ksh or sh (for sh's where it exists).

    Here is Bart Schaefer's guide to converting csh aliases for zsh.

    1. If the csh alias references "parameters" (\!:1, \!* etc.), then in zsh you need a function (referencing $1, $* etc.). Otherwise, you can use a zsh alias.

    2. If you use a zsh function, you need to refer _at_least_ to $* in the body (inside the { }). Parameters don't magically appear inside the { } the way they get appended to an alias.

    3. If the csh alias references its own name (alias rm "rm -i"), then in a zsh function you need the "command" or "builtin" keyword (function rm() { command rm -i "$@" }), but in a zsh alias you don't (alias rm="rm -i").

    4. If you have aliases that refer to each other (alias ls "ls -C"; alias lf "ls -F" ==> lf == ls -C -F) then you must either:
         • convert all of them to zsh functions; or
         • after converting, be sure your .zshrc defines all of your aliases before it defines any of your functions.

    Those first four are all you really need, but here are four more for heavy csh alias junkies:

    5. Mapping from csh alias "parameter referencing" into zsh function (assuming SH_WORD_SPLIT and KSH_ARRAYS are NOT set in zsh):

            csh             zsh
           =====         ==========
           \!*           $*              (or $argv)
           \!^           $1              (or $argv[1])
           \!:1          $1
           \!:2          $2              (or $argv[2], etc.)
           \!$           $*[$#]          (or $argv[$#], or $*[-1])
           \!:1-4        $*[1,4]
           \!:1-         $*[1,$#-1]      (or $*[1,-2])
           \!^-          $*[1,$#-1]
           \!*:q         "$@"
           \!*:x         $=*             ($*:x doesn't work (yet))

    6. Remember that it is NOT a syntax error in a zsh function to refer to a position ($1, $2, etc.) greater than the number of parameters. (E.g., in a csh alias, a reference to \!:5 will cause an error if 4 or fewer arguments are given; in a zsh function, $5 is the empty string if there are 4 or fewer parameters.)

    7. To begin a zsh alias with a - (dash, hyphen) character, use alias --:

                   csh                            zsh
              ===============             ==================
              alias - "fg %-"             alias -- -="fg %-"

    8. Stay away from alias -g in zsh until you REALLY know what you're doing.

    There is one other serious problem with aliases: consider

    
        alias l='/bin/ls -F'
        l() { /bin/ls -la "$@" | more }
      
    
    l in the function definition is in command position and is expanded as an alias, defining /bin/ls and -F as functions which call /bin/ls, which gets a bit recursive. This can be avoided if you use function to define a function, which doesn't expand aliases. It is possible to argue for extra warnings somewhere in this mess.

    One workaround for this is to use the "function" keyword instead:

    
        alias l='/bin/ls -F'
        function l { /bin/ls -la "$@" | more }
      
    
    The l after function is not expanded. Note you don't need the () in this case, although it's harmless.

    You need to be careful if you are defining a function with multiple names; most people don't need to do this, so it's an unusual problem, but in case you do you should be aware that in versions of the shell before 5.1 names after the first were expanded:

    
        function a b c { ... }
      
    
    Here, b and c, but not a, have aliases expanded. This oddity was fixed in version 5.1.

    The rest of this item assumes you use the (more common, but equivalent) () definitions.

    Bart Schaefer's rule is: Define first those aliases you expect to use in the body of a function, but define the function first if the alias has the same name as the function.

    If you are aware of the problem, you can always escape part or all of the name of the function:

    
         'l'() { /bin/ls -la "$@" | more }
      
    
    Adding the quotes has no effect on the function definition, but suppresses alias expansion for the function name. Hence this is guaranteed to be safe---unless you are in the habit of defining aliases for expressions such as 'l', which is valid, but probably confusing.

    2.4: Similarities with tcsh

    (The sections on csh apply too, of course.) Certain features have been borrowed from tcsh, including $watch, run-help, $savehist, periodic commands etc., extended prompts, sched and which built-ins. Programmable completion was inspired by, but is entirely different to, tcsh's complete. (There is a perl script called lete2ctl in the Misc directory of the source distribution to convert complete to compctl statements.) This list is not definitive: some features have gone in the other direction.

    If you're missing the editor function run-fg-editor, try something with bindkey -s (which binds a string to a keystroke), e.g.

    
        bindkey -s '^z' '\eqfg %$EDITOR:t\n'
      
    
    which pushes the current line onto the stack and tries to bring a job with the basename of your editor into the foreground. bindkey -s allows limitless possibilities along these lines. You can execute any command in the middle of editing a line in the same way, corresponding to tcsh's -c option:
    
        bindkey -s '^p' '\eqpwd\n'
      
    
    In both these examples, the \eq saves the current input line to be restored after the command runs; a better effect with multiline buffers is achieved if you also have
    
        bindkey '\eq' push-input
      
    
    to save the entire buffer. In version 4 and recent versions of zsh 3.1, you have the following more sophisticated option,
    
        run-fg-editor() {
          zle push-input
          BUFFER="fg %$EDITOR:t"
          zle accept-line
        }
        zle -N run-fg-editor
      
    
    and can now bind run-fg-editor just like any other editor function.

    2.5: Similarities with bash

    The Bourne-Again Shell, bash, is another enhanced Bourne-like shell; the most obvious difference from zsh is that it does not attempt to emulate the Korn shell. Since both shells are under active development it is probably not sensible to be too specific here. Broadly, bash has paid more attention to standards compliance (i.e. POSIX) for longer, and has so far avoided the more abstruse interactive features (programmable completion, etc.) that zsh has.

    In recent years there has been a certain amount of crossover in the extensions, however. Zsh (as of 3.1.6) has bash's `${var/old/new}' feature for replacing the text old with the text new in the parameter $var. Note one difference here: while both shells implement the syntax `${var/#old/new}' and `${var/%old/new}' for anchoring the match of old to the start or end of the parameter text, respectively, in zsh you can't put the `#' or `%' inside a parameter: in other words `${var/$old/new}' where old begins with a `#' treats that as an ordinary character in zsh, unlike bash. To do this sort of thing in zsh you can use (from 3.1.7) the new syntax for anchors in any pattern, `(#s)' to match the start of a string, and `(#e)' to match the end. These require the option EXTENDED_GLOB to be set.

    2.6: Shouldn't zsh be more/less like ksh/(t)csh?

    People often ask why zsh has all these `unnecessary' csh-like features, or alternatively why zsh doesn't understand more csh syntax. This is far from a definitive answer and the debate will no doubt continue.

    Paul's object in writing zsh was to produce a ksh-like shell which would have features familiar to csh users. For a long time, csh was the preferred interactive shell and there is a strong resistance to changing to something unfamiliar, hence the additional syntax and CSH_JUNKIE options. This argument still holds. On the other hand, the arguments for having what is close to a plug-in replacement for ksh are, if anything, even more powerful: the deficiencies of csh as a programming language are well known (look in any Usenet FAQ archive, e.g. http://www.cis.ohio-state.edu/hypertext/faq/usenet/unix-faq/shell/csh-whynot/faq.html if you are in any doubt) and zsh is able to run many standard scripts such as /etc/rc.

    Of course, this makes zsh rather large and feature-ridden so that it seems to appeal mainly to hackers. The only answer, perhaps not entirely satisfactory, is that you have to ignore the bits you don't want. The introduction of loadable modules in version 3.1 should help.

    2.7: What is zsh's support for Unicode/UTF-8?

    `Unicode', or UCS for Universal Character Set, is the modern way of specifying character sets. It replaces a large number of ad hoc ways of supporting character sets beyond ASCII. `UTF-8' is an encoding of Unicode that is particularly natural on Unix-like systems.

    The production branch of zsh, 4.2, has very limited support: the built-in printf command supports "\u" and "\U" escapes to output arbitrary Unicode characters; ZLE (the Zsh Line Editor) has no concept of character encodings, and is confused by multi-octet encodings.

    However, the 4.3 branch has much better support, and furthermore this is now fairly stable. (Only a few minor areas need fixing before this becomes a production release.) This is discussed more fully below; see `Multibyte input and output'.