Bell Laboratories

subject: Introduction to ksh-93        date: January 14, 1993
         Charge Case 311531-0101
         File Case 49059-6             from: David G. Korn
                                             MH 11267
                                             3C-526B x7975
                                             (ulysses!dgk)

TM 11267-930???-93

ABSTRACT

ksh-93 is a major rewrite of ksh, a program that serves as a command
language (shell) for the UNIX* operating system. As with ksh, ksh-93
is essentially compatible with the System V version of the Bourne
shell[1], and compatible with previous versions of ksh. ksh-93 is
intended to comply with the IEEE POSIX 1003.2 shell standard and the
ISO 9945-2[2] shell standard. In addition to changes in the language
required by these standards, the primary focus of ksh-93 is shell
programming. ksh-93 provides the programming power of several other
interpretive languages such as awk[3], FIT[4], perl[5], and tcl[6].

This memo assumes that the reader is already familiar with the Bourne
shell. It introduces most of the features of ksh-93 relative to the
Bourne shell, both as a command language and as a programming
language. The Appendix contains a sample script written in ksh-93.

__________
* UNIX is a registered trademark of USL

BELL LABORATORIES PROPRIETARY
Not for use or disclosure outside Bell Laboratories except by
written approval of the director of the distributing organization.

MEMORANDUM FOR FILE

1. INTRODUCTION

The term "shell" is used to describe a program that provides a
command language interface. Because the UNIX* system shell is a
user-level program, and not part of the operating system itself,
anyone can write a new shell or modify an existing one. This has led
to an evolutionary progression in the design and implementation of
shells, with the better ones surviving. The most widely available
UNIX system shells are the Bourne shell[7], written by Steve Bourne
at AT&T Bell Laboratories, the C shell[8], written by Bill Joy at the
University of California, Berkeley, and the KornShell language[9],
written by David Korn at AT&T Bell Laboratories. The Bourne shell is
available on almost all versions of the UNIX system. The C shell is
available with all Berkeley Software Distribution (BSD) UNIX systems
and on many other systems. The KornShell is available on System V
Release 4 systems and on many other systems. The source for the
KornShell language is available from the AT&T Toolchest, an
electronic software distribution system. It runs on all known
versions of the UNIX system and on many UNIX system look-alikes.

There have been several articles comparing the UNIX system shells.
Jason Levitt[10] highlights some of the new features introduced by
the KornShell language. Rich Bilancia[11] explains some of the
advantages of using the KornShell language. John Sebes[12] provides a
more detailed comparison of the three shells, both as a command
language and as a programming language.

The KornShell language is a superset of the Bourne shell. The
KornShell language has many of the popular C shell features, plus
additional features of its own. Its initial popularity stems
primarily from its improvements as a command language. The primary
interactive benefit of the KornShell command language is a visual
command line editor that allows you to make corrections to your
current command line or to earlier command lines, without having to
retype them.

However, in the long run, the power of the KornShell language as a
high-level programming language, as described by Dolotta and
Mashey[13], may prove to be of greater significance. ksh-93 provides
the programming power of several other interpretive languages such as
awk, FIT, perl, and tcl. In one case, an application that was
originally written in the C programming language was rewritten in the
KornShell language. More than 20,000 lines of C code were replaced
with KornShell scripts totaling fewer than 700 lines. In most
instances there was no perceptible difference in performance between
the two versions of the code.

The KornShell language has been embedded into windowing systems,
allowing graphical user interfaces to be developed in shell rather
than having to build applications that need to be compiled. The wksh
program[14] provides a method of developing OpenLook or Motif
applications as ksh scripts.

This memo is an introduction to ksh-93, the program that implements
an enhanced version of the KornShell language. It is referred to as
ksh in the rest of this memo. The memo describes the KornShell
language based on the features of the 02/25/93 release of ksh. This
memo is not a tutorial, only an introduction. The second edition of
reference [9] gives a more complete treatment of the KornShell
language.

A concerted effort has been made to achieve both System V Bourne
shell compatibility and IEEE POSIX compatibility so that scripts
written for either of these shells can run without modification with
ksh. In addition, ksh-93 attempts to be compatible with older
versions of ksh. When these versions of the shell conflict, ksh-93
selects the behavior dictated by the IEEE POSIX standard. The
description of features in this memo assumes that the reader is
already familiar with the Bourne shell.

2. COMMAND LANGUAGE

There is no separate command language. All features of the language,
except job control, can be used both within a script and
interactively from a terminal. However, features that are more likely
to be used while running commands interactively from a terminal are
presented here.

2.1 Setting Options

By convention, UNIX commands consist of a command name followed by
options and other arguments. Options are either of the form -letter,
or -letter value. In the former case, several options may be grouped
after a single -. The argument -- signifies an end to the option list
and is only required when the first non-option argument begins with
a -. When given incorrect arguments, most commands print an error
message that shows which options are permitted. Ordinarily, ksh
executes a command by using the command name to locate a program to
run and by running the program as a separate process. Some commands,
referred to as built-ins, are carried out by ksh itself, without
creating a separate process. The reasons that some commands are
built-in are presented later. In nearly all cases the distinction
between a command that is built-in and one that is not is invisible
to the user. However, nearly all built-in commands follow these
command line conventions. In addition, the option sequence -? causes
the command to print a usage message which lists the valid options.

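As a small illustration of the conventions above, the following
sketch uses -- so that an operand beginning with - is not mistaken
for an option. The scratch directory and the file name -r are made up
for the example, and it assumes that the mktemp utility is available;
rm and ls stand in for any command that follows the conventions.

```shell
# Work in a scratch directory so that nothing of value is touched.
scratch=$(mktemp -d) && cd "$scratch" || exit 1

# Create a file whose name looks like an option (hypothetical name).
: > ./-r

# Without --, rm would treat -r as an option rather than a file name.
# With --, everything that follows is taken as an operand.
rm -- -r

# The directory is empty again, so ls prints nothing.
remaining=$(ls)
echo "files left: ${remaining:-none}"
```

The same effect can be had by writing the operand as ./-r, which is
how the file is created in the sketch.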

ksh has several options that can be set by the user as command line
arguments and as option arguments to the set command. Most options
can be set with a single letter option or as a name that follows the
-o option. Use set -o to display the current option settings. Some of
these options, such as interactive and monitor (see Job Control
below), are enabled automatically by ksh when the shell is connected
to a terminal device. Other options, such as noclobber and ignoreeof,
are normally placed in a startup file. The noclobber option causes
ksh to print an error message when you use > to redirect to a file
that already exists. If you want to redirect to an existing file,
then you have to use >| to override the noclobber option. The
ignoreeof option is used to prevent the end-of-file character,
normally ^D (Control-d), from exiting the shell and possibly logging
you out. You must type exit to log out. Most of the options are
described in this memo as appropriate.

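The behavior of noclobber can be seen in a short sketch. The file
name report.txt and the scratch directory are placeholders, and the
sketch assumes that the mktemp utility is available.

```shell
scratch=$(mktemp -d) || exit 1
out=$scratch/report.txt          # hypothetical file name

set -o noclobber                 # refuse to overwrite existing files with >
echo first > "$out"              # the file does not exist yet, so this works

# A second > fails because the file now exists.  The shell prints an
# error message (discarded here) and the file keeps its old contents.
if ! { echo second > "$out"; } 2>/dev/null
then
    echo "noclobber blocked the overwrite"
fi
kept=$(cat "$out")

echo third >| "$out"             # >| overrides noclobber for this redirection
replaced=$(cat "$out")

set +o noclobber                 # turn the option off again
echo "$kept -> $replaced"
```

Placing set -o noclobber in a startup file gives the same protection
to every interactive session.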

2.2 Command Aliases

Command aliases provide a mechanism for associating a command name
and options with a shorter name. Aliases are defined with the alias
built-in. The form of an alias command definition is:

    alias name=value

As with other shell assignments, no space is allowed before or after
the =. The characters of an alias name cannot be characters that are
special to the shell. The replacement string, value, can contain any
valid shell script, including meta-characters such as pipe symbols
and i/o-redirection, provided that they are quoted. Unlike aliases in
csh, aliases in ksh cannot take arguments. The equivalent
functionality of aliases with arguments can only be achieved with
shell functions, described later.

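As a rough sketch of that workaround, the function below does what an
alias cannot: it uses its argument twice, in the middle of a command.
The function name backup and the file names are invented for the
illustration, and the POSIX name() form of definition is used here so
that the sketch also runs under other shells.

```shell
# Hypothetical helper: copy a file to file.bak.  The argument is used
# twice inside the command, which an alias cannot express.
backup()
{
    cp -- "$1" "$1.bak"
}

scratch=$(mktemp -d) || exit 1   # assumes the mktemp utility is available
echo data > "$scratch/notes"
backup "$scratch/notes"
ls "$scratch"                    # lists notes and notes.bak
```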

As a command is being read, the command name is checked against a
list of alias names. If it is found, the name is replaced by the
value associated with the alias and then rescanned. When rescanning
the value for an alias, alias substitutions are performed except for
an alias that is currently being processed. This prevents infinite
loops in alias substitutions. For example, with the aliases

    alias l=ls 'ls=ls -C'

the command name l becomes ls, which becomes ls -C. Ordinarily, only
the command name word is processed for alias substitution. However,
if the value of an alias ends in a space, then the word following the
alias is also checked for alias substitution. This makes it possible
to define an alias whose first argument is the name of a command and
have alias substitution performed on this argument, for example
nohup='nohup '.

Aliases can be used to redefine built-in commands so that the alias,

    alias test=./test

can be used to look for test in your current working directory rather
than using the built-in test command. Reserved words such as for and
while cannot be changed by aliasing. The command alias, without
arguments, generates a list of aliases and corresponding alias
values. The unalias command removes the name and text of an alias.

Aliases are used to save typing and to improve readability of
scripts. Several aliases are predefined by ksh. For example, the
predefined alias

    alias integer='typeset -i'

allows the integer variables i and j to be declared and initialized
with the command

    integer i=0 j=1

While aliases can be defined in scripts, it is not recommended. The
location of an alias command can be important since aliases are only
processed when a command is read. A . procedure (the shell equivalent
to an include file) is read all at once (unlike start up files which
are read a command at a time) so that any aliases defined there will
not affect any commands within this script. Predefined aliases do not
have this problem.

2.3 Command Re-entry

When run interactively, ksh saves the commands you type at a terminal
in a file. If the variable HISTFILE is set to the name of a file to
which the user has write access, then the commands are stored in this
history file. Otherwise the file $HOME/.sh_history is checked for
write access and if this fails an unnamed file is used to hold the
history lines. Commands are always appended to this file. Instances
of ksh that run concurrently and use the same history file name share
access to the history file so that a command entered in one shell
will be available for editing in another shell. The file may be
truncated when ksh determines that no other shell is using the
history file. The number of commands accessible to the user is
determined by the value of the HISTSIZE variable at the time the
shell is invoked. The default value is 128. Each command may consist
of one or more lines since a compound command is considered one
command. If the character ! is placed within the primary prompt
string, PS1, then it is replaced by the command number each time the
prompt is given.

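A startup file might set these variables along the following lines.
This is only a sketch: the history file name and the sizes are
arbitrary choices for the illustration, not defaults.

```shell
# Startup-file fragment; the values are placeholders.
HISTFILE=$HOME/.ksh_history   # hypothetical name; the file must be writable
HISTSIZE=500                  # commands kept accessible; read at invocation
PS1='[!] $ '                  # each ! becomes the current command number
```

The single quotes around the PS1 value keep the shell from treating
any of its characters specially at the time of the assignment.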

A built-in command named hist is used to list and/or edit any of
these saved commands. The option -l is used to specify listing of
previous commands. A range of one or more commands can always be
specified. The range can be specified by giving the command number,
relative or absolute, or by giving the first character or characters
of the command. When given without specifying the range, the last 16
commands are listed, each preceded by the command number.

If the listing option is not selected, then the range of commands
specified, or the last command if no range is given, is passed to an
editor program before being re-executed by ksh. The editor to be used
may be specified with the option -e followed by the editor name. If
this option is not specified, the value of the shell variable
HISTEDIT is used as the name of the editor, provided that this
variable has a non-null value. If this variable is not set, or is
null, and the -e option has not been selected, then /bin/ed is used.
When editing has been completed, the edited text automatically
becomes the input for ksh. As this text is read by ksh, it is echoed
onto the terminal.

The -s option bypasses the editing and just re-executes the command.
In this case only a single command can be specified as the range and
an optional argument of the form old=new may be added which requests
a simple string substitution prior to evaluation. A convenient alias,

    alias r='hist -s'

has been pre-defined so that the single key-stroke r can be used to
re-execute the previous command and the key-stroke sequence
r abc=def c can be used to re-execute the last command that starts
with the letter c with the first occurrence of the string abc
replaced with the string def. Typing r c > file re-executes the most
recent command starting with the letter c, with standard output
redirected to file.

2.4 In-line editing

Lines typed from a terminal frequently need changes made before
entering them. With the Bourne shell the only method to fix up
commands is by backspacing or killing the whole line. ksh offers
options that allow the user to edit parts of the current command line
before submitting the command. The in-line edit options make the
command line into a single line screen edit window. When the command
is longer than the width of the terminal, only a portion of the
command is visible. Moving within the line automatically brings the
relevant portion into view. Editing can be performed on this window
until the return key is pressed. The editing modes have editing
directives that access the history file in which previous commands
are saved. A user can copy any of the most recent HISTSIZE commands
from this file into the input edit window. You can locate commands by
searching or by position.

The in-line editing options do not use the termcap or terminfo
databases. They work on most standard terminals. They only require
that the backspace character moves the cursor left and the space
character overwrites the current character on the screen and moves
the cursor to the right. Nearly all terminals and terminal emulators
behave this way.

There is a choice of editor options. The emacs, gmacs, or vi option
is selected by turning on the corresponding option of the set
command. If the value of the EDITOR or VISUAL variable ends with any
of these names, the corresponding option is turned on. A large subset
of each of these editors' features is available within the shell.
Additional functions, such as file name completion, have also been
added.

In the emacs or gmacs mode the user positions the cursor to the point
needing correction and inserts, deletes, or replaces characters as
needed. The only difference between these two modes is the meaning of
the directive ^T. Control keys and escape sequences are used for
cursor positioning and control functions. The available editing
functions are listed in the manual page.

The vi editing mode starts in insert mode and enters control mode
when the user types ESC (033). The return key, which submits the
current command for processing, can be entered from either mode. The
cursor can be anywhere on the line. A subset of commonly used vi
editing directives is available. The k and j directives, which
normally move up and down by one line, move up and down one command
in the history file, copying the command into the input edit window.
For reasons of efficiency, the terminal is kept in canonical mode
until an ESC is typed. On some terminals, and on earlier versions of
the UNIX operating system, this doesn't work correctly. The viraw
option, which always uses raw or cbreak mode, must be used in this
case.

Most of the code for the editing options does not rely on the ksh
code and can be used in a stand-alone mode with almost any command to
add in-line edit capability. However, all versions of the in-line
editors have some features that use shell-specific code. For example,
with all edit modes, the ESC-= directive applied to command words
(the first word on the line, or the first word after a ;, |, (, or &)
lists all aliases, functions, or commands that match the given
portion of the current word. When applied to other words, this
directive prints the names of files that match the current word. The
ESC-* directive adds the expanded list of matching files to the
command line. A trailing * is added to the word if it doesn't contain
any file pattern matching characters before the expansion. In emacs
and gmacs mode, ESC-ESC indicates command completion when applied to
command names, otherwise it indicates pathname completion. With
command or pathname completion, the list generated by the ESC-=
directive is examined to find the longest common prefix. With command
completion, only the last component of the pathname is used to
compute the longest common prefix. If the longest common prefix is a
complete match, then the word is replaced by the pathname, and a / is
appended if the pathname is a directory, otherwise a space is added.
In vi mode, \ from control mode gives the same behavior.

2.5 Key Binding

It is possible to intercept keys as they are entered and apply new
meanings or bindings. A trap named KEYBD is evaluated each time the
user enters a key from the keyboard, except while entering a search
string or an argument to an edit directive such as r in vi-mode. The
action associated with this trap can change the value of the entered
key to cause the key to perform a different operation.

When the KEYBD trap is entered, the .sh.edtext variable contains the
contents of the current input line and the .sh.edcol variable gives
the current cursor position within this line. The .sh.edmode variable
contains the ESC character when the trap is entered from the insert
mode of vi mode. Otherwise, this value is null. The .sh.edchar
variable contains the character or escape sequence that caused the
trap. The value of .sh.edchar at the end of the trap will be used as
the input sequence.

Using the associative array facility of ksh described later, and the
function facility of ksh, it is easy to write a single trap so that
keys can be bound dynamically. For example,

    typeset -A Keybind        # associative array: key -> action
    trap 'eval "${Keybind[${.sh.edchar}]}"' KEYBD
    function keybind # key seq
    {
        Keybind[$1]=".sh.edchar=${.sh.edmode}$2"
    }

2.6 Job Control

The job control mechanism is almost identical to the version found in
csh of the Berkeley UNIX operating system, version 4.1 and later. The
job control feature allows the user to stop and restart programs, and
to move programs to and from the foreground and the background. It
will only work on systems that provide support for these features.
However, even systems without job control have a monitor option
which, when enabled, will report the progress of background jobs and
enable the user to kill jobs by job number or job name.

An interactive shell associates a job with each pipeline typed in
from the terminal and assigns it a small integer number called the
job number. If the job is run asynchronously, the job number is
printed at the terminal. At any given time, only one job owns the
terminal, i.e., keyboard signals are only sent to the processes in
one job. When ksh creates a foreground job, it gives it ownership of
the terminal. If you are running a job and wish to stop it, you hit
the key ^Z (control-Z), which sends a STOP signal to all processes in
the current job. The shell receives notification that the processes
have stopped and takes back control of the terminal.

There are commands to continue programs in the foreground and
background. There are several ways to refer to jobs. The character %
introduces a job name. You can refer to jobs by name or number as
described in the manual page. The built-in command bg allows you to
continue a job in the background, while the built-in command fg
allows you to continue a job in the foreground even though you may
have started it in the background.

A job being run in the background will stop if it tries to read from
the terminal. It is also possible to stop background jobs that try to
write on the terminal by setting the terminal options appropriately.

There is a built-in command jobs that lists the status of all running
and stopped jobs. In addition, you are informed of the change of
state (running or stopped) of any background jobs just before each
prompt. If you want to be notified about background job completions
as soon as they occur without waiting for a prompt, then use the
notify option. When you try to exit the shell while jobs are stopped
or running, you will receive a message from ksh. If you ignore this
message and try to exit again, all stopped processes will be
terminated. In addition, for login shells, the HUP signal will be
sent to all background jobs unless the job has been disowned with the
disown command.

A built-in version of kill makes it possible to use job numbers as
targets for signals. Signals can be selected by number or name. The
name of the signal is the name found in the include file
/usr/include/sys/signal.h with the prefix SIG removed. The -l option
of kill provides a means to map individual signal names to and from
signal numbers. In addition, if no signal name or number is given,
kill -l generates a list of valid signal names.

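The mapping done by kill -l, and the selection of a signal by name,
can be sketched as follows. The sketch runs as a script, where job
control is normally off, so it addresses the background job by its
process id ($!) rather than by job number; the sleep command is just
a stand-in for a long-running job.

```shell
# Map between signal numbers and names with the built-in kill.
name=$(kill -l 9)        # number -> name, without the SIG prefix
number=$(kill -l KILL)   # name -> number
echo "signal $number is $name"

# Start a background job and send it a signal selected by name.
# In an interactive shell the job could also be addressed as %1.
sleep 60 &
kill -s TERM $!
wait $! 2>/dev/null
status=$?                # greater than 128 when a signal ended the job
echo "sleep ended with status $status"
```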

2.7 Changing Directories

By default, ksh maintains a logical view of the file system hierarchy
which makes symbolic links transparent. For systems that have
symbolic links, this means that if /bin is a symbolic link to
/usr/bin and you change directory to /bin, pwd will indicate that you
are in /bin, not /usr/bin. pwd -P generates the physical pathname of
the present working directory by resolving all the symbolic links. By
default, the cd command will take you where you expect to go even if
you cross symbolic links. A subsequent cd .. in the example above
will place you in /, not /usr. On systems with symbolic links, cd -P
causes .. to be treated physically.

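The difference between the logical and physical views can be
reproduced without touching /bin. The sketch below builds its own
small tree in a scratch directory; the names real, sub, and link are
invented, and it assumes that the mktemp utility is available.

```shell
# Build a scratch tree:  real/sub  plus a symbolic link  link -> real/sub
base=$(cd "$(mktemp -d)" && pwd -P) || exit 1
mkdir -p "$base/real/sub"
ln -s real/sub "$base/link"

cd "$base/link"
logical=$(pwd)            # the path as typed:         $base/link
physical=$(pwd -P)        # symbolic links resolved:   $base/real/sub

cd ..                     # logical parent of $base/link
after_logical=$PWD        # $base

cd "$base/link"
cd -P ..                  # physical parent of $base/real/sub
after_physical=$PWD       # $base/real

printf '%s\n' "$logical" "$physical" "$after_logical" "$after_physical"
```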

ksh remembers your last directory in the variable OLDPWD. The cd
built-in can be given the argument - to return to the previous
directory; it then prints the name of that directory. Note that cd -
done twice returns you to the starting directory, not the second
previous directory. A directory stack manager has been written as
shell functions to push and pop directories from the stack.

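The toggling behavior of cd - can be seen in a few lines. The
directories / and /tmp are used only because they exist on nearly
every system.

```shell
cd /tmp
cd /
cd - > /dev/null      # back to /tmp; the printed directory name is discarded
first=$PWD
cd - > /dev/null      # a second cd - toggles back to /
second=$PWD
echo "$first $second $OLDPWD"
```

After the second cd -, OLDPWD again holds /tmp, so the two
directories simply swap roles on each use.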

2.8 Prompts

When ksh reads commands from a terminal, it issues a prompt whenever
it is ready to accept more input and then waits for the user to
respond. The TMOUT variable can be set to the number of seconds that
the shell will wait for input before terminating. A warning message
is printed 60 seconds before the shell terminates.

The shell uses two prompts. The primary prompt, defined by the value
of the PS1 variable, is issued at the start of each command. The
secondary prompt, defined by the value of the PS2 variable, is issued
when more input is needed to complete a command.

ksh allows the user to specify a list of files or directories to
check before issuing the PS1 prompt. The variable MAILPATH is a colon
( : ) separated list of file names to be checked for changes
periodically. The user is notified before the next prompt. Each of
the names in this list can be followed by a ? and a prompt to be
given when a change has been detected in the file. The prompt will be
evaluated for parameter substitution. The parameter $_ within a mail
message will evaluate to the name of the file that has changed. The
parameter MAILCHECK is used to specify the minimum interval in
seconds before new mail is checked for.

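A startup file might set these two variables as in the sketch below.
Both file names are placeholders chosen for the illustration, as is
the 60 second interval.

```shell
# Startup-file fragment.  The file names below are placeholders.
MAILCHECK=60          # look for changes at most once every 60 seconds
MAILPATH='/var/mail/dgk?You have mail in $_:/tmp/build.log?$_ has changed'
```

The single quotes matter: they keep $_ from being expanded at the
time of the assignment, so that it is evaluated when the message is
printed.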

In addition to replacing each ! in the prompt with the command
number, ksh expands the value of the PS1 variable for parameter
expansions, arithmetic expansions, and command substitutions as
described below to generate the prompt. The expansion characters that
are to be applied when the prompt is issued must be quoted to prevent
the expansions from occurring when assigning the value to PS1. For
example, PS1="$PWD" causes PS1 to be set to the value of PWD at the
time of the assignment whereas PS1='$PWD' causes PWD to be expanded
at the time the prompt is issued.

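The effect of the two kinds of quoting can be checked without an
interactive prompt. In this sketch the variables fixed and deferred
are stand-ins for two PS1 values, and eval imitates the expansion
that ksh performs each time it issues the prompt.

```shell
cd /
fixed="$PWD"         # double quotes: expanded now, so the value is /
deferred='$PWD'      # single quotes: the text $PWD is stored unexpanded

cd /tmp
# Imitate the expansion done when the prompt is issued.
eval "shown_fixed=\"$fixed\""
eval "shown_deferred=\"$deferred\""
echo "fixed: $shown_fixed   deferred: $shown_deferred"
```

Only the single-quoted value follows the change of directory.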

Command substitution may require a separate process to execute and
cause the prompt display to be somewhat slow, especially when the
return key is pressed several times in a row. Therefore, its use
within PS1 is discouraged. Some variables are maintained by ksh so
that their values can be used with PS1. The PWD variable stores the
pathname of the current working directory. The value of the SECONDS
variable is the value of the most recent assignment plus the elapsed
time. By default, the time is measured in milliseconds, but since
SECONDS is a floating point variable, the number of places after the
decimal point in the expanded value can be specified with
typeset -Fplaces SECONDS. In a roundabout way, this variable can be
used to generate a time stamp into the PS1 prompt without creating a
process at each prompt. The following code shows how you can do this
on System V. On BSD, you need another command to initialize the
SECONDS variable.

    # . this script and use $TIME as part of your PS1 string to
    # get the time of day in your prompt
    typeset -RZ2 _x1 _x2 _x3
    (( SECONDS=$(date '+3600*%H+60*%M+%S') ))
    _s='_x1=(SECONDS/3600)%24,_x2=(SECONDS/60)%60,_x3=SECONDS%60,0'
    TIME='"${_d[_s]}$_x1:$_x2:$_x3"'
    # PS1=${TIME}whatever

2.9 Tilde substitution

The character ~ at the beginning of a word has special meaning to
ksh. If the characters after the ~ up to a / match a user login name
in the password database, then the ~ and the name are replaced by
that user's login directory. If no match is found, the original word
is unchanged. A ~ by itself, or in front of a /, is replaced by the
value of the HOME parameter. A ~ followed by a + or - is replaced by
the value of $PWD or $OLDPWD, respectively.

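A brief sketch of these forms follows. The directories / and /tmp
are used only to give PWD and OLDPWD known values, and the path
~/bin is merely an example of a ~ in front of a /.

```shell
cd /tmp
cd /                      # now PWD is / and OLDPWD is /tmp
set -- ~ ~/bin ~+ ~-      # tilde substitution is applied to each word
echo "home=$1 bin=$2 cwd=$3 previous=$4"
```

Quoting the ~ suppresses the substitution, so '~' and "~" remain
literal characters.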
2.10 Output formats

Values in the output of built-in commands and traces are
quoted so that they can be re-input to the shell. This
makes it easy to cut and paste shell output on systems
that use a pointing device such as a mouse. In addition,
output can be saved in a file for reuse.

2.11 The ENV file

When an interactive ksh starts, it evaluates the $ENV
variable to arrive at a file name. If this value is not
null, ksh attempts to read and process commands in the file
with this name. Earlier versions of ksh read the ENV file
for all invocations of the shell, primarily to allow
function definitions to be available for all shell
invocations. The function search path, FPATH, described
later, eliminated the primary need for this capability, and
it was removed because the high performance cost was no
longer deemed acceptable.

3. PROGRAMMING LANGUAGE

The KornShell vastly extends the set of applications that
can be implemented efficiently at the shell level. It does
this by providing simple yet powerful mechanisms for
arithmetic, pattern matching, substring generation, and
arrays. Users can write applications as separate functions
that can be defined in the same file or in a library of
functions stored in a directory and loaded on demand.

3.1 String Processing

The shell is primarily a string processing language. By
default, variables hold variable-length strings. There is
no limit on the length of a string. Storage management is
handled by the shell automatically. Declarations are not
required. With most programming languages, string constants
are designated by enclosing characters in single quotes or
double quotes. Since most of the words in the language are
strings, the shell requires quotes only when a string
contains characters that are normally processed specially by
the shell but whose literal meaning is intended. However,
since the shell is a string processing language, and some
characters can occur both as literals and as language
metacharacters, quoting is an important part of the
language.

There are four quoting mechanisms in ksh. The simplest is
to enclose a sequence of characters inside single quotes.
All characters between a pair of single quotes have their
literal meaning; the single quote itself cannot appear. A $
immediately preceding a single quoted string causes all the
characters until the matching single quote to be interpreted
as an ANSI-C language string. Thus, '\n' represents the
characters \ and n, whereas $'\n' represents the new-line
character. Double quoted strings remove the special meaning
of all characters except $, `, and \, so that parameter
expansion and command substitution (defined below) are
performed. The final mechanism for quoting a character is
to precede it with the escape character \. This mechanism
works outside of quoted strings and for the characters $, `,
", and \ in double quoted strings.

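The four mechanisms can be compared side by side. The
fragment below is only an illustrative sketch; the variable
names s1 through s5 are made up for the example.

```shell
name=world
s1='$name\n'          # single quotes: every character is literal
s2=$'a\tb'            # ANSI-C string: \t becomes a tab character
s3="hello $name"      # double quotes: parameter expansion happens
s4="cost: \$5"        # \ quotes the $ inside double quotes
s5=two\ words         # \ quotes the space outside of quotes
printf '%s\n' "$s1" "$s2" "$s3" "$s4" "$s5"
```

Note that s2 holds three characters (a, tab, b), while s1
holds the eight characters typed between the quotes.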
Variable names consist of one or more strings of
alphanumeric characters, each beginning with an alphabetic
character, separated by a . (dot). Upper and lowercase
characters are distinct, so that A and a are names of
different variables. There is no limit to the length of the
name of a variable. You do not have to declare variables.
You can assign a value to a variable by writing the name of
the variable, followed by an equal sign, followed by a
character string that represents its value. To create a
variable whose name contains a ., the variable whose name
consists of the characters before the last . must already
exist. You reference a variable by putting the name inside
curly braces and preceding the braces with a dollar sign.
The braces may be omitted when the name is alphanumeric. If
x and y are two shell variables, then to define a new
variable, z, whose value is the concatenation of the values
of x and y, you just say z=$x$y. It is that easy.

The $ can be thought of as meaning "value of." You can also
capture the output of any command with the notation
$(command). This is referred to as command substitution.
For example, x=$(date) assigns the output from the date
command to the variable x. Command substitution in the
Bourne shell is denoted by enclosing the command between
backquotes (``). This notation suffers from some
complicated quoting rules. Thus, it is hard to write sed
patterns that contain backslashes within command
substitution. Putting the pattern in single quotes is of
little help. ksh accepts the Bourne shell command
substitution syntax for backward compatibility. The
$(command) notation allows the command itself to contain
quoted strings even if the substitution occurs within double
quotes. Nesting is legal.

The special command substitution of the form $(cat file) can
be replaced by $(< file), which is faster because the cat
command doesn't have to run.

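The following sketch puts these forms together. The path
/usr/local/bin/ksh is just sample text handed to basename
and dirname, and tmp is a scratch file created with mktemp,
which is assumed to be available on the system.

```shell
x=foo y=bar
z=$x$y                               # concatenation: foobar
up=$(printf '%s' "$z" | tr a-z A-Z)  # output of a pipeline
# Nested substitution, with quoted strings, inside double quotes.
base="$(basename "$(dirname "/usr/local/bin/ksh")")"
# $(< file) reads the file without running cat.
tmp=$(mktemp)
printf 'line one\n' > "$tmp"
text=$(< "$tmp")
rm -f "$tmp"
printf '%s\n' "$z" "$up" "$base" "$text"
```

Writing the nested line with backquotes would require each
inner backquote and double quote to be escaped.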
3.2 Shell Parameters and Variables

There are three types of parameters used by ksh: special
parameters, positional parameters, and named parameters,
which are called variables. ksh defines the same special
parameters, 0, *, @, #, ?, $, !, and -, as they are defined
in the Bourne shell.

Positional parameters are set when the shell is invoked, as
arguments to the set built-in, and by calls to functions
(see below) and . procedures. They are named by numbers
starting at 1.

The third type of parameter is a variable. As mentioned
earlier, ksh uses variables whose names consist of one or
more alphanumeric strings separated by a . (dot). There is
no need to specify the type of a variable in the shell
because, by default, variables store strings of arbitrary
length and values are automatically converted to numbers
when used in an arithmetic context. However, ksh variables
can have one or more attributes that control the internal
representation of the variable, the way the variable is
printed, and its access or scope. In addition, ksh allows
variables to represent arrays of values and references to
other variables. The typeset built-in command of ksh
assigns attributes to variables. Two of the attributes,
readonly and export, are available in the Bourne shell.
Most of the remaining attributes are discussed here. The
complete list of attributes appears in the manual. The
unset built-in of ksh removes values and attributes of
variables. When a variable is exported, certain of its
attributes are also exported.

Whenever a value is assigned to a variable, the value is
transformed according to the attributes of the variable.
Changing the attribute of a variable can change its value.
The attributes -L and -R are for left and right field
justification respectively. They are useful for aligning
columns in a report. For each of these attributes, a width
can be defined explicitly or else it is defined the first
time an assignment is made to the variable. Each assignment
causes justification of the field, truncating if necessary.
Assignment to fixed sized variables provides one way to
generate a substring consisting of a fixed number of
characters from the beginning or end of a string. Other
methods are discussed later.

The attributes -u and -l are used for upper case and lower
case formatting respectively. Since it makes no sense to
have both attributes on simultaneously, turning on either of
these attributes turns the other off. The following script,
using read and print which are described later, provides an
example of the use of shell variables with attributes. This
script reads a file of lines each consisting of five fields
separated by : and prints fields 4 and 2 in upper case in
columns 1-15, left justified, and columns 20-25,
right-justified, respectively.

typeset -uL15 f4                  # 15 character left justified
typeset -uR6 f2                   # 6 character right justified
IFS=:                             # set field separator to :
while   read -r f1 f2 f3 f4 f5    # read line, split into fields
do      print -r -- "$f4    $f2"  # print fields 4 and 2
done

The -i, -E, and -F attributes are used to represent
numbers. Each can be followed by a decimal number. The -i
attribute causes the value to be represented as an integer
and it can be followed by a number representing the numeric
base when expanding its value. Whenever a value is assigned
to an integer variable, it is evaluated as an arithmetic
expression and then truncated to an integer.

The -E attribute causes the value to be represented in
scientific notation whenever its value is expanded. The
number following the -E determines the number of significant
figures, and defaults to 6. The -F attribute causes the
value to be represented with a fixed number of places after
the decimal point. Assignments to variables with the -E or
-F attribute cause the right hand side of the assignment to
be evaluated as an arithmetic expression.

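The effect of the integer attribute on assignment can be
seen in a short sketch. Only -i is shown here; the variable
name n is arbitrary, and the -E and -F attributes are left
out because their output depends on the chosen precision.

```shell
typeset -i n          # give n the integer attribute
n=6*7                 # the right hand side is evaluated
printf '%s\n' "$n"    # 42
n=7/2                 # the result is an integer
printf '%s\n' "$n"    # 3
```

Without the attribute, the same assignments would store the
literal strings 6*7 and 7/2.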
ksh allows one-dimensional arrays in addition to simple
variables. There are two types of arrays: associative
arrays and indexed arrays. The subscript for an associative
array is an arbitrary string, whereas the subscript for an
indexed array is an arithmetic expression that is evaluated
to yield an integer index. Any variable can become an
indexed array by referring to it with a subscript. Not all
elements of an array need to exist. Subscripts for indexed
arrays must evaluate to an integer between 0 and some
maximum value, otherwise an error results. The maximum
value may vary from one machine to another but is at least
4095. Evaluation of subscripts is described in the next
section. Attributes apply to the whole array.

Assignments to array variables can be made to individual
elements via parameter assignment commands or the typeset
built-in. Additionally, values can be assigned sequentially
using the -A name option of the set command. Referencing of
subscripted variables requires the character $, but also
requires braces around the array element name. The braces
are needed to avoid conflicts with the file name generation
mechanism. The form of any array element reference is:
${name[subscript]}
A subscript value of * or @ can be used to generate all
elements of an array, as they are used for expansion of
positional parameters. The list of currently defined
subscripts for a given variable can be generated with
${!name[@]} or ${!name[*]}.

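A small indexed array makes these rules concrete. This is a
sketch with an arbitrary array name, a; it deliberately
leaves gaps between the subscripts to show that not all
elements need to exist.

```shell
unset a
a[2]=two              # assigning to a subscript creates the array
a[1+4]=five           # the subscript is an arithmetic expression
a[0]=zero
printf '%s\n' "${a[2]}"     # one element: two
printf '%s\n' "${a[@]}"     # all elements, in subscript order
printf '%s\n' "${!a[@]}"    # defined subscripts: 0, 2, and 5
printf '%s\n' "${#a[@]}"    # number of elements: 3
```

Without the braces, $a[2] would expand $a and then treat
[2] as a file name pattern.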
The nameref attribute causes the variable to be treated as a
reference to the variable defined by its value. Once this
attribute is set, all references to this variable become
references to the variable named by the value of this
variable. For example, if foo=bar, then setting the
reference attribute on foo will cause all subsequent
references to foo to behave as references to bar. Unsetting
this attribute breaks the association. Reference variables
are usually used inside functions whose arguments are the
names of shell variables. The names of reference variables
cannot contain a . (dot). Whenever a shell variable is
referenced, the portion of the variable up to the first .
is checked to see whether it matches the name of a reference
variable. If it does, then the name of the variable
actually used consists of the concatenation of the name of
the variable defined by the reference plus the remaining
portion of the original variable name. For example, using
the predefined alias, alias nameref='typeset -n',

.bar.home.bam="hello world"
nameref foo=.bar.home
print ${foo.bam}
hello world

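The typical use inside a function can be sketched as
follows. The function incr and the variables var and count
are hypothetical names invented for this example; the
function receives the name of a variable and modifies that
variable through a reference.

```shell
# incr NAME [AMOUNT]: add AMOUNT (default 1) to the variable NAME.
function incr
{
    typeset -n var=$1     # var now refers to the caller's variable
    (( var += ${2:-1} ))
}
count=5
incr count
incr count 10
printf '%s\n' "$count"    # 16
```

The reference is given a name that differs from the
caller's variable name to keep the two from being confused.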
3.3 Substring Generation

The expansion of a variable or parameter can be modified so
that only a portion of the value results. It is often
necessary to extract a portion of a shell variable or a
portion of an array. There are several parameter expansion
operators that can do this. One method to generate a
substring is with an expansion of the form
${name:offset:length} where offset is an arithmetic
expression that defines the offset of the first character
starting from 0, and length is an arithmetic expression that
defines the length of the substring. If :length is omitted,
the substring extends from offset to the end of the value
of name. The :offset:length operators can also be applied
to array expansions and to parameters * and @ to generate
portions of an array. For example, the expansion,
${name[@]:offset:length}, yields up to length elements of
the array name starting at the element offset.

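A brief sketch of the offset and length forms, using an
arbitrary string s and array arr:

```shell
s=KornShell
printf '%s\n' "${s:0:4}"        # Korn
printf '%s\n' "${s:4}"          # Shell (length omitted)
arr[0]=zero arr[1]=one arr[2]=two arr[3]=three
printf '%s\n' "${arr[@]:1:2}"   # elements 1 and 2: one, two
```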
The other parameter expansion modifiers use shell patterns
to describe portions of the string to modify and delete. A
description of shell patterns is contained below. When
these modifiers are applied to special parameters @ and * or
to array parameters given as name[@] or name[*], the
operation is performed on each element. There are four
parameter substitution modifiers that strip off leading and
trailing substrings during parameter substitution by
removing the characters matching a given pattern. An
expansion of the form ${name#pattern} causes the smallest
matching prefix of the value of name to be removed. The
largest prefix matching pattern is removed by using ##
instead of #. Similarly, an expansion of the form
${name%pattern} causes the smallest matching substring at
the end of name to be removed. Again, using %% instead of
% causes the largest matching trailing substring to be
deleted. For example, if the shell variable file has value
foo.c, then the expression ${file%.c}.o has value foo.o.

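The difference between the smallest and largest match is
easiest to see on a pathname. The path below is sample text
only; no such file needs to exist.

```shell
path=/usr/local/lib/libast.a
printf '%s\n' "${path#/*/}"     # local/lib/libast.a
printf '%s\n' "${path##*/}"     # libast.a
printf '%s\n' "${path%/*}"      # /usr/local/lib
printf '%s\n' "${path%%.*}"     # /usr/local/lib/libast
file=foo.c
printf '%s\n' "${file%.c}.o"    # foo.o, as in the text
```

The ##*/ and %/* forms give the effect of the basename and
dirname commands without creating a process.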
The value of an expansion can be changed by specifying a
pattern that matches the part that needs to be changed after
the parameter expansion modifier /. An expansion of the
form ${name/pattern/string} replaces the first match of
pattern in the value of variable name with string. The
second / is not necessary when string is null. The
expansion ${name//pattern/string} changes all occurrences of
the pattern into string. The parameter expansion modifiers
/# and /% cause the matching pattern to be anchored to the
beginning and end respectively.

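The replacement modifiers can be sketched on a single
sample word (the variable s is arbitrary):

```shell
s=banana
printf '%s\n' "${s/an/AN}"      # first match only:   bANana
printf '%s\n' "${s//an/AN}"     # every match:        bANANa
printf '%s\n' "${s/#b/B}"       # anchored at start:  Banana
printf '%s\n' "${s/%a/A}"       # anchored at end:    bananA
printf '%s\n' "${s//a}"         # null string:        bnn
```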
Finally, there are parameter expansion modifiers that yield
the name of the variable, the string length of the value, or
the number of elements of an array. ${!name} yields the
name of the variable, which will be name itself except when
name is a reference variable. In this case it will yield
the name of the variable it refers to. ${#name} will be the
length in bytes of $name. For an array variable ${#name[*]}
gives the number of elements in the array.

3.4 Arithmetic Evaluation

For the most part, the shell is a string processing
language. However, the need for arithmetic has long been
obvious. Many of the characters that are special to the
Bourne shell are needed as arithmetic operators. To make
arithmetic easy to use, and to maintain compatibility with
the Bourne shell, ksh uses matching (( and )) to delineate
arithmetic expressions. While single parentheses might have
been more desirable, these already mean subshell so that
another notation was required. The arithmetic expression
inside the double parentheses follows the same syntax,
associativity and precedence as the ANSI-C[15] programming
language. The characters between the matching double
parentheses are processed with the same rules used for
double quotes so that spaces can be used to aid readability
without additional quoting.

All arithmetic evaluations are performed using double
precision floating point arithmetic. Floating point
constants follow the same rules as the ANSI-C programming
language. Integer arithmetic constants are written as
base#number,
where base is a decimal integer between two and sixty-four
and number is any non-negative number. Base ten is used
when no base is specified. The digits are represented by
the characters 0-9a-zA-Z_@. For bases less than or equal to
36, upper and lower case characters can be used
interchangeably to represent the digits from 10 through 35.

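A few constants in the base#number notation, as a sketch
(the variable names are arbitrary, and only bases up to 16
are shown):

```shell
(( a = 2#1010 ))      # binary
(( b = 16#ff ))       # hexadecimal, lower case digits
(( c = 16#FF ))       # same value, upper case digits
(( d = 8#17 + 1 ))    # octal 17 plus decimal 1
printf '%s\n' "$a" "$b" "$c" "$d"   # 10, 255, 255, 16
```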
Arithmetic expressions are made from constants, variables,
and operators. Parentheses may be used for grouping. The
contents inside the double parentheses are processed with
the same expansions as occur in a double quoted string, so
that all $ expansions are performed before the expression
is evaluated. However, there is no need to use the $ to
get the value of a variable because the arithmetic evaluator
replaces the name of the variable by its value within an
arithmetic expression. The $ cannot be used when the
variable is the subject of assignment or an increment
operation. As a rule it is better not to use $ in front of
variables in an arithmetic expression.

An arithmetic command of the form (( ... )) is a command
that evaluates the enclosed arithmetic expression. For
example, the command
(( x++ ))
can be used to increment the variable x, assuming that x
contains some numerical value. The arithmetic command is
true (return value 0) when the resulting expression is
non-zero, and false (return value 1) when the expression
evaluates to zero. This makes the command easy to use with
the if and while compound commands.

The for compound command has been extended for use in
arithmetic contexts. The syntax,
for (( expr1; expr2 ; expr3 ))
can be used as the first line of a for loop with the same
semantics as the for statement in the ANSI-C programming
language.

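The arithmetic command and the arithmetic for loop can be
combined as in the following sketch, which sums the
integers from 1 to 10 and then counts a variable down to
zero. The names sum, i, and n are arbitrary.

```shell
sum=0
for (( i = 1; i <= 10; i++ ))
do
    (( sum += i ))        # no $ is needed inside (( ... ))
done
printf '%s\n' "$sum"      # 55

n=3
while (( n ))             # true while n is non-zero
do
    (( n-- ))
done
if (( sum > 50 && n == 0 ))
then
    printf 'done\n'
fi
```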
Arithmetic evaluations can also be performed as part of the
evaluation of a command line. The syntax $(( ... )) expands
to the value of the enclosed arithmetic expression. This
expansion can occur wherever parameter expansion is
performed. For example, using the ksh command print
(described later),
print $((2+2))
prints the number 4.

The following script prints the first n lines of its
standard input onto its standard output, where n can be
supplied as an optional argument whose default value is 20.

integer n=${1-20}                       # set n
while   (( n-- > 0 )) && read -r line   # at most n lines
do      print -r -- "$line"
done

3.5 Shell Expansions

The commands you enter from the terminal or from a script
are divided into words and each word undergoes several
expansions to generate the command name and its arguments.
This is done in two phases. The first phase recognizes
reserved words, spaces and operators to decide where command
boundaries lie. Alias substitutions take place during this
phase. The second phase performs expansions in the
following order:

+ Tilde substitution, parameter expansion, arithmetic
  expansion, and command substitution are performed from
  left to right.

+ The characters that result from parameter expansion and
  command substitution above are checked with the
  characters in the IFS variable for possible field
  splitting. (See a description of read below to see how
  IFS is used.) Setting IFS to a null value causes field
  splitting to be skipped.

+ Pathname generation (as described below) is performed
  on each of the fields. Any field that doesn't match a
  pathname is left alone. The option, -f or noglob, is
  used to disable pathname generation.

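The second and third steps can be observed with a short
sketch. The variable list and the helper variables n1, n2,
and f2 are made up for the example; noglob is turned on so
that the * in the value is not used for pathname
generation.

```shell
set -f                  # noglob: the * below stays a literal *
list='a:b c:*'
IFS=:
set -- $list            # unquoted: split at each : into fields
n1=$# f2=$2
IFS=
set -- $list            # null IFS: field splitting is skipped
n2=$#
set +f
unset IFS               # back to default field splitting
printf '%s\n' "$n1 fields, then $n2 field; second was <$f2>"
```

With the default IFS the same expansion would instead split
at the space, giving the two fields a:b and c:*.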
3.6 Pattern Matching

The shell is primarily a string processing language and uses
patterns for matching file names as well as for matching
strings. The characters ?, *, and [ are processed specially
by the shell when not quoted. These characters are used to
form patterns that match strings. Patterns are used by the
shell to match pathnames, to specify substrings, and for
case commands. The character ? matches any one character.
The character * matches zero or more characters. The
character sequence [...] defines a character class that
matches any character contained within []. A range of
characters can be specified by putting a - between the first
and last character of the range. An exclamation mark, !,
immediately after the [, means match all characters except
the characters specified. For example, the pattern
a?c*.[!a-z] matches any string beginning with an a, whose
third character is a c, and that ends in . (dot) followed
by any character except the lowercase letters, a-z. The
sequence [:alpha:] inside a character class matches any
character in the ANSI-C alpha class. Similarly,
[:class:] matches each of the characters in the given class
for all the ANSI-C character classes.

ksh treats strings of the form (pattern-list), where
pattern-list is a list of one or more patterns separated by
a |, specially when preceded by *, ?, +, @, or !. A ?
preceding (pattern-list) means that the pattern list
enclosed in () is optional. An @(pattern-list) matches any
pattern in the list of patterns enclosed in (). A
*(pattern-list) matches any string that contains zero or
more occurrences of any of the enclosed patterns, whereas
+(pattern-list) requires a match of one or more of any of
the given patterns. For instance, the pattern +([0-9])?(.)
matches one or more digits optionally followed by a . (dot).
A !(pattern-list) matches anything except any of the given
patterns. For example, print !(*.o) will display the names
of all files in the current directory that do not end in .o.

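These patterns can be tried against strings rather than
file names. The sketch below relies on the [[ ... ]]
command, described later in this memo, and on a
hypothetical helper function named matches; the sample
strings are arbitrary.

```shell
# matches STRING PATTERN: succeed when STRING matches PATTERN.
function matches { [[ $1 == $2 ]]; }
matches abcd.1 'a?c*.[!a-z]'  && printf 'basic pattern\n'
matches 2024.  '+([0-9])?(.)' && printf 'digits, optional dot\n'
matches b      '@(a|b|c)'     && printf 'one of a, b, c\n'
matches main.c '!(*.o)'       && printf 'not an object file\n'
matches main.o '!(*.o)'       || printf 'object file rejected\n'
matches x7 '[[:alpha:]]*([[:digit:]])' && printf 'letter, digits\n'
```

The patterns are kept in single quotes when passed to the
function so that they are not expanded as pathnames first.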
When patterns are used to generate pathnames while expanding
commands, several other rules apply. A separate match is
made for each file name component of the pathname. Read
permission is required for any portion of the pathname that
contains any special pattern character. Search permission
is required for every component except possibly the last.

By default, file names in each directory that begin with .
are skipped when performing a match. If the pattern to be
matched starts with a leading ., then only files beginning
with a . are considered when finding matching files. If
the FIGNORE variable is set, then only files that do not
match this pattern are considered. This overrides the
special meaning of . in a pattern and in a file name.

If the markdirs option is set, each matching pathname that
is the name of a directory has a trailing / appended to the
name.

3.7 Conditional Expressions

The Bourne shell uses the test command, or the equivalent [
command, to test the attributes of files and to compare
strings or numbers. The problem with test is that the shell
has expanded the words of the test command and split them
into arguments before test begins execution. test cannot
distinguish between operators and operands. In most cases
test "$1" will test whether argument 1 is non-null.
However, if argument 1 is -f, then test will treat -f as an
operator and yield a syntax error. One of the most frequent
errors with test occurs when its operands are not within
double quotes. In this case, the argument may expand to
more than a single argument or to no argument at all. In
either case this will likely cause a syntax error. What
makes this most insidious is that these errors are
frequently data dependent. A script that appears to run
correctly may abort if given unexpected data.

To get around these problems, ksh has a compound command for
conditional expression testing as part of the language. The
reserved words [[ and ]] delimit the range of the command.
Because they are reserved words, not operator characters,
they require spaces to separate them from arguments. The
words between [[ and ]] are not processed for field
splitting or for pathname generation. In addition, since
ksh determines the operators before parameter expansion,
expansions that yield no argument cause no problem. The
operators within [[...]] are almost the same as those for
the test command. All unary operators are of the form
-letter and are followed by a single operand. Instead of -a
and -o, [[...]] uses && and || to indicate "and" and "or".
Parentheses are used without quoting for grouping.

The right hand side of the string comparison operators ==
and != is taken as a pattern, and the operator tests whether
the left hand operand matches this pattern. Quoting the
pattern results in a string comparison rather than a pattern
match. The operators < and > within [[...]] designate
lexicographical comparison.

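The points above can be checked with a few one-line tests.
This is a sketch only; the variables empty and file are
sample values.

```shell
empty=
file='foo.c'
[[ -n $empty ]]      || printf 'empty expansion causes no error\n'
[[ $file == *.c ]]   && printf 'pattern match\n'
[[ $file == "*.c" ]] || printf 'quoted: string comparison only\n'
[[ apple < banana ]] && printf 'lexicographical order\n'
[[ -n $file && ( $file == *.h || $file == *.c ) ]] &&
    printf 'grouping with unquoted parentheses\n'
```

The equivalent test -n $empty, with the operand unquoted,
would be run with no operand at all.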
In addition, there are several other new comparison
primitives. The binary operators -ot and -nt compare the
modification times of two files to see which file is older
than or newer than the other. The binary operator -ef tests
whether two files have the same device and i-node number,
i.e., whether they are links to the same file.

The unary operator -L returns true if its operand is a
symbolic link. The unary operator -O (-G) returns true if
the owner (or group) of the file operand matches that of the
caller. The unary operator -o returns true when its operand
is the name of an option that is currently on.

The following script illustrates some of the uses of
[[...]]. The reference manual contains the complete list of
operators.

for i in "${@}"
do      # execute foo for numeric directory
        if      [[ -d $i && $i == +([0-9]) ]]
        then    foo
        # otherwise if writable or executable file and not mine
        elif    [[ (-w $i||-x $i) && ! -O $i ]]
        then    bar
        fi
done

3.8 Input and Output

ksh has extended I/O capabilities to enhance the use of the
shell as a programming language. As with the Bourne shell,
you use the I/O redirection operator, <, to control where
input comes from, and the I/O redirection operator, >, to
control where output goes to. Each of these operators can be
preceded with a single digit that specifies a file unit
number to associate with the file stream. Ordinarily you
specify these I/O redirection operators with the specific
command to which they apply. However, if you specify I/O
redirections with the exec command, and don't specify
arguments to exec, then the I/O redirection applies to the
current program. For example, the command exec < foobar
opens file foobar for reading. The exec command is also
used to close files. A file descriptor unit can be opened
as a copy of an existing file descriptor unit by using
either of the <& or >& operators and putting the file
descriptor unit of the original file after the &. Thus,
2>&1 means open standard error (file descriptor 2) as a copy
of standard output (file descriptor 1). A file descriptor
value of - after the & indicates that the file should be
closed. To close file unit 5, specify exec 5<&-. There are
two additional redirection operators with ksh and the POSIX
shell that are not part of the Bourne shell. The >|
operator overrides the effect of the noclobber option
described earlier. The <> operator causes a file to be
opened for both reading and writing.

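The use of exec to open, use, and close file units can be
sketched as follows. The unit numbers 3 and 4 are arbitrary
choices, and tmp is a scratch file created with mktemp,
which is assumed to be available.

```shell
tmp=$(mktemp)                   # scratch file for the example
exec 3> "$tmp"                  # open unit 3 for writing
printf 'first\nsecond\n' >&3    # write through unit 3
exec 3>&-                       # close unit 3
exec 4< "$tmp"                  # open unit 4 for reading
read -r one <&4                 # each read continues where
read -r two <&4                 # the previous one stopped
exec 4<&-                       # close unit 4
rm -f "$tmp"
printf '%s %s\n' "$one" "$two"  # first second
```

Because the file stays open between the two read commands,
the second read sees the second line rather than the first.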
ksh recognizes certain pathnames and treats them specially.
|
|
Pathnames of the form /dev/fd/n are treated as equivalent to
|
|
the file defined by file descriptor n. These name can be
|
|
used as the script argument to ksh and in conditional
|
|
testing as described above. On underlying systems that
|
|
support /dev/fd in the file system, these names can be
|
|
passed to other commands. Pathnames of the form
|
|
/dev/tcp/hostid/port and /dev/udp/hostid/port can be used to
|
|
create tcp and udp connections to services given by the
|
|
hostid number and port number. The hostid cannot use
|
|
symbolic values. In practice these are typically generated
|
|
by command substitution. For example,
|
|
exec 5> /dev/tcp/$(service name) would open file descriptor
|
|
5 for sending messages to hostid and port number defined by
|
|
the output of service name.
|
|
|
|
The Bourne shell has a built-in command read for reading
|
|
lines from standard input (file descriptor 0) and splitting
|
|
it into fields based on the value of the IFS variable, and a
|
|
command echo to write strings to standard output. ( On some
|
|
systems, echo is not a built-in command and incurs
|
|
considerable overhead to use.) Unfortunately, neither of
|
|
these commands is able to perform some very basic tasks.
|
|
For example, with the Bourne shell, the read built-in
|
|
cannot read a single line that ends in \. With ksh the read
|
|
built-in has a -r option to remove the special meaning for \
|
|
which allows it to be treated as a regular character rather
|
|
than the line continuation character. With the Bourne
|
|
shell, there is no simple way to have more than one file
|
|
open at any time for reading. ksh has options on the read
|
|
command to specify the file descriptor for the input. The
|
|
fields that are read from a line can be stored into an
|
|
indexed array with the -A option to read. This allows a
|
|
line to be split into an arbitrary number of fields.
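The following sketch illustrates the -r and -u options
described above. The scratch directory, the file names a and
b, and the variable names are assumptions chosen for the
example; the -A option is not shown.

```shell
dir=$(mktemp -d)                       # hypothetical scratch directory
printf '%s\n' 'one\' 'two' > "$dir/a"  # first line ends in a backslash
printf '%s\n' 'three'      > "$dir/b"

read plain  < "$dir/a"    # without -r, \ joins the line with the next one
read -r raw < "$dir/a"    # with -r, \ is an ordinary character

exec 3< "$dir/a" 4< "$dir/b"   # two files open for reading at once
read -r -u3 from_a             # -u selects the descriptor to read from
read -r -u4 from_b
exec 3<&- 4<&-                 # close both descriptors
rm -rf "$dir"
```

Here plain holds the two lines joined together, while raw
holds the first line with its trailing \ intact.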
|
|
|
|
The way the Bourne shell uses the IFS variable to split
|
|
lines into fields greatly limits its utility. Often data
|
|
files consist of lines that use a character such as : to
|
|
delimit fields with two adjacent delimiters that denote a
|
|
null field. The Bourne shell treats adjacent delimiters as
|
|
a single field delimiter. With ksh, delimiters that are
|
|
considered white space characters have the behavior of the
|
|
Bourne shell, but other adjacent delimiters separate null
|
|
fields.
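A short sketch of this difference, using a made-up record in
the style of a password file (the record and the variable
names are illustrative assumptions):

```shell
# A colon-delimited record whose second field is empty.
record='root::0:0'

# IFS is set only for the duration of this read command.
IFS=: read -r user password uid gid <<EOF
$record
EOF

# Blanks are white space delimiters, so a run of them is one separator.
read -r first second <<EOF
alpha     beta
EOF
```

The two adjacent colons leave password empty instead of
shifting uid into its place.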
|
|
|
|
The read command is often used in scripts that interact with
|
|
the user by prompting the user and then requesting some
|
|
input. With the Bourne shell two commands are needed; one
|
|
to prompt the user, the other to read the reply. ksh allows
|
|
these two commands to be combined. The first argument of
|
|
the read command can be followed by a ? and a prompt string
|
|
which is used whenever the input device is a terminal.
|
|
Because the prompt is associated with the read built-in, the
|
|
built-in command line editors will be able to re-output the
|
|
prompt whenever the line needs to be refreshed when reading
|
|
from a terminal device.
|
|
|
|
With the Bourne shell, there is no way to set a time limit
|
|
for waiting for the user response to read. The -t option to
|
|
read takes a floating point argument that gives the time in
|
|
seconds, or fractions of a second, that the shell should wait
|
|
for a reply.
|
|
|
|
The version of the echo command in System V treats certain
|
|
sequences beginning with \ as control sequences. This makes
|
|
it hard to output strings without interpretation. Most BSD
|
|
derived systems do not interpret \ control sequences.
|
|
Unfortunately, the BSD version of echo accepts a -n option
|
|
to prevent a trailing new-line, but has no way to cause the
|
|
string -n to be printed. Neither of these versions is
|
|
adequate. Also, because they are incompatible, it is very
|
|
hard to write portable shell scripts using echo. The ksh
|
|
built-in, print, outputs characters to the terminal or to a
|
|
file and subsumes the functions of all versions of echo.
|
|
Ordinarily, escape sequences in arguments beginning with \
|
|
are processed the same as for the System V echo command.
|
|
However, print follows the standard conventions for options
|
|
and has options that make print very versatile. The -r
|
|
option can be used to output the arguments without any
|
|
special meaning. The -n option can be used here to suppress
|
|
the trailing new-line that is ordinarily appended. As with
|
|
read, it is possible to specify the file descriptor number
|
|
as an option to the command to avoid having to use
|
|
redirection operators with each occurrence of the command.
|
|
|
|
The IEEE POSIX shell and utilities standard committee was
|
|
unable to reconcile the differences between the System V and
|
|
BSD versions of echo. They introduced a new command named
|
|
printf which takes an ANSI-C format string and a list of
|
|
arguments and outputs them using the ANSI-C formatting
|
|
rules. Since ksh is POSIX conforming, it accepts printf.
|
|
However, there is a -f option to print that can be used to
|
|
specify a format string which processes the arguments the
|
|
same way that printf does.
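As a brief illustration of why printf avoids the problems
noted for echo, the sketch below prints the string -n and a
string containing a \ without any interpretation. The
variable names are assumptions for the example.

```shell
# echo cannot portably print the string -n; printf treats it as data.
dash_n=$(printf '%s\n' '-n')

# %s copies its argument without interpreting \ sequences.
literal=$(printf '%s' 'a\tb')

# The format is reused until all the arguments are consumed.
rows=$(printf '%s=%d\n' width 80 height 24)
```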
|
|
|
|
The format processing for print and printf has been extended
|
|
slightly. There are three additional formatting directives.
|
|
The %b format causes the \ escape sequences to be expanded
|
|
as they are with the System V echo command. The %q format
|
|
causes quotes to be placed on the output as required so that
|
|
it can be used as shell input. Special characters in the
|
|
output of most ksh built-in commands and in the output from
|
|
an execution trace are quoted in an equivalent fashion. The
|
|
%P format causes an extended regular expression string to be
|
|
converted into a shell pattern. This is useful for writing
|
|
shell applications that have to accept regular expressions as
|
|
input. Finally, the escape sequence \E which expands to the
|
|
terminal escape character (octal 033) has been added.
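The %b and %q directives can be sketched as follows. The
sample strings and variable names are assumptions; the exact
text that %q produces can differ between shells, so the
example only checks that the quoted form reads back as the
original string. %P and \E are not shown.

```shell
tab=$(printf '\t')                   # a literal tab for comparison

# %b expands \ escapes found in the argument, as System V echo does.
expanded=$(printf '%b' 'a\tb')

# %q quotes the argument so that the shell can read it back unchanged.
original="it's a \$HOME * test"
quoted=$(printf '%q' "$original")
eval "copy=$quoted"
```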
|
|
|
|
The shell is frequently used as a programming language for
|
|
interactive dialogues. The select statement has been added
|
|
to the language to make it easier to present menu selection
|
|
alternatives to the user and evaluate the reply. The list
|
|
of alternatives is numbered and put in columns. A user
|
|
settable prompt, PS3, is issued and if the answer is a
|
|
number corresponding to one of the alternatives, the select
|
|
loop variable is set to this value. In any case, the REPLY
|
|
variable is used to store the user entered reply. The shell
|
|
variables LINES and COLUMNS are used to control the layout
|
|
of select lists.
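A minimal sketch of a select loop follows. The list of
fruits and the prompt text are assumptions, and a
here-document stands in for the user's typed reply so that
the fragment can run unattended.

```shell
PS3='Pick a fruit: '          # prompt issued after the menu

# The here-document stands in for a user typing 2 at the terminal;
# the menu and prompt go to standard error and are discarded here.
select fruit in apple banana cherry
do
    break                     # leave the loop after the first reply
done 2>/dev/null <<EOF
2
EOF
```

After the loop, fruit holds the second alternative and REPLY
holds the text that was entered.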
|
|
|
|
3.9 Co-process
|
|
|
|
ksh can spawn a co-process by adding a |& after a command.
|
|
This process will be run with its standard input and its
|
|
standard output connected to the shell. The built-in
|
|
command print with the -p option will write into the
|
|
standard input of this process and the built-in command read
|
|
with the -p option will read from the output of this
|
|
process.
|
|
|
|
In addition, the I/O redirection operators <& and >& can be
|
|
used to move the input or output pipe of the co-process to a
|
|
numbered file descriptor. Use exec 3>&p to move the input
|
|
of the co-process to file descriptor 3. After you have
|
|
connected to file descriptor 3, you can direct the output of
|
|
any command to the co-process by running command >&3. Also,
|
|
by moving the input of the co-process to a numbered
|
|
descriptor, it is possible to run a second co-process. The
|
|
output of both co-processes will be the file descriptor
|
|
associated with read -p. You can use exec 4<&p to cause the
|
|
output of these co-processes to go to file descriptor 4 of
|
|
the shell. Once you have moved the pipe to descriptor 4, it
|
|
is possible to connect a server to the co-process by running
|
|
command 4<&p or to close the co-process pipe with
|
|
exec 4<&-.
|
|
|
|
3.10 Functions
|
|
|
|
Function definitions are of the form
|
|
|
|
function name
|
|
{
|
|
any shell script
|
|
}
|
|
|
|
A function whose name contains a . is called a discipline
|
|
function. The portion of the name before the last . must
|
|
refer to the name of an existing variable. Thus, if p is a
|
|
reference to PATH, then the function name p.get and PATH.get
|
|
refer to the same function.
|
|
|
|
The function is invoked either by specifying name as the
|
|
command name and optionally following it with arguments or
|
|
by using it as an argument to the . built-in command.
|
|
Positional parameters are saved before each function call
|
|
and restored when completed. The arguments that follow the
|
|
function name on the calling line become positional
|
|
parameters inside the function. The return built-in can be
|
|
used to cause the function to return to the statement
|
|
following the point of invocation.
|
|
|
|
Functions can also be defined with the System V notation,
|
|
|
|
name ()
|
|
{
|
|
any shell script
|
|
}
|
|
|
|
Functions defined with this syntax cannot be used as the
|
|
first argument to a . procedure. ksh accepts this notation
|
|
for compatibility only. There is no need to use this
|
|
notation when writing ksh scripts.
|
|
|
|
Functions defined with the function name syntax and invoked
|
|
by name are executed in the current shell environment and
|
|
can share named variables with the calling program.
|
|
Options, other than execution trace -x, set by the calling
|
|
program are passed down to a function. The options are not
|
|
shared with the function so that any options set within a
|
|
function are restored when the function exits. Traps
|
|
ignored by the caller are ignored within the function and
|
|
cannot be enabled. Traps caught by the calling program are
|
|
reset to their default action within the function. In most
|
|
instances, the default action is to cause the function to
|
|
terminate. A trap on EXIT defined within a function
|
|
executes after the function completes but before the caller
|
|
resumes. Therefore, any variable assignments and any
|
|
options set as part of a trap action will be effective after
|
|
the caller resumes.
|
|
|
|
By default, variables are inherited by the function and
|
|
shared by the calling program. However, for functions
|
|
defined with the function name syntax that are invoked by
|
|
name, environment substitutions preceding the function call
|
|
apply only to the scope of the function call. Also,
|
|
variables whose names do not contain a . that are defined
|
|
with the typeset built-in command are local to the function
|
|
that they are declared in. Thus, for the function defined
|
|
|
|
function name
|
|
{
|
|
typeset -i x=10
|
|
let z=x+y
|
|
print $z
|
|
}
|
|
|
|
invoked as y=13 name, x and y are local variables with
|
|
respect to the function name while z is global.
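The scoping rules can be checked with a variant of the
function above. The name add_ten and the three state
variables are hypothetical and serve only to record which
variables survive the call.

```shell
function add_ten
{
    typeset -i x=10     # typeset inside the function: x is local
    let z=x+y           # z is never declared, so it is global
}

unset x y z
y=13 add_ten            # y is set only for the duration of the call

# Record which of the three variables is still set after the call.
x_state=${x+set}
y_state=${y+set}
z_value=$z
```

Only z remains set once the function returns.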
|
|
|
|
Functions defined with the name() syntax, and functions
|
|
invoked as an argument to the . command, share everything
|
|
other than positional parameters with the caller.
|
|
Assignments that precede the call remain in effect after the
|
|
function completes.
|
|
|
|
Alias and function names are not passed down to shell
|
|
scripts or carried across separate invocations of ksh. The
|
|
$FPATH variable gives a colon separated list of directories
|
|
that is searched for function definitions when trying to
|
|
resolve the command name. Whenever a file name contained in
|
|
$FPATH is found, the complete file is read and all functions
|
|
contained within become defined.
|
|
|
|
Calls that reference functions can be recursive. Except for
|
|
special built-ins, function names take precedence over
|
|
built-in names and names of programs when used as command
|
|
names. To write a function to replace a built-in command or
|
|
to replace a program, you must use the command built-in
|
|
command. The arguments to command are the name and
|
|
arguments of the program you want to execute. For example,
|
|
to write a cd function which changes the directory and
|
|
prints out the directory name, you can write,
|
|
|
|
function cd
|
|
{
|
|
if command cd "$@"
|
|
then print -r -- "$PWD"
|
|
fi
|
|
}
|
|
|
|
|
|
The FPATH variable is a colon separated list that ksh uses
|
|
to search for function definitions. When ksh encounters an
|
|
autoload function, it runs the . command on the script
|
|
containing the function, and then executes the function.
|
|
|
|
Function definitions may also be placed in the ENV file.
|
|
However, this causes the shell to take longer to begin
|
|
executing.
|
|
|
|
3.11 Process Substitution
|
|
|
|
This feature is only available on versions of the UNIX
|
|
operating system which support the /dev/fd directory for
|
|
naming open files. Each command argument of the form
|
|
<(list) or >(list) will run process list asynchronously
|
|
connected to some file in the /dev/fd directory. The name
|
|
of this file will become the argument to the command. If
|
|
the form with > is selected then writing on this file will
|
|
provide input for list. If < is used, then the file passed
|
|
as an argument will contain the output of the list process.
|
|
For example,
|
|
|
|
paste <(cut -f1 file1) <(cut -f3 file2) | tee >(process1) >(process2)
|
|
|
|
extracts fields 1 and 3 from the files file1 and file2
|
|
respectively, places the results side by side, and sends it
|
|
to the processes process1 and process2, as well as putting
|
|
it onto the standard output. Note that the file which is
|
|
passed as an argument to the command is a UNIX system
|
|
pipe(2) so that the programs that expect to lseek(2) on the
|
|
file will not work.
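The input form of process substitution can be tried with the
sketch below, which assumes a system that supports it. The
scratch directory and the contents of file1 and file2 are
made up for the example; the >(list) form is not shown.

```shell
dir=$(mktemp -d)                               # hypothetical scratch directory
printf 'a1\ta2\ta3\nb1\tb2\tb3\n' > "$dir/file1"
printf 'x1\tx2\tx3\ny1\ty2\ty3\n' > "$dir/file2"

# Each <(list) is replaced by a file name from which paste reads
# the output of the corresponding cut process.
result=$(paste <(cut -f1 "$dir/file1") <(cut -f3 "$dir/file2"))

rm -rf "$dir"
```

The result pairs field 1 of each line of file1 with field 3
of the matching line of file2, separated by a tab.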
|
|
|
|
3.12 Finding Commands
|
|
|
|
The addition of aliases, functions, and more built-ins has
|
|
made it substantially more difficult to know what a given
|
|
command word really means.
|
|
|
|
There are several reasons that commands are built into the
|
|
shell rather than being separate programs. Commands that
|
|
begin with reserved words are an integral part of the shell
|
|
language itself and typically define the control flow of the
|
|
language. Some control flow commands are not reserved words
|
|
in the language but are special built-ins. Special built-
|
|
ins are built-ins that are considered a part of the language
|
|
rather than user definable commands. The best examples of
|
|
commands that fit this description are break and continue.
|
|
Because they are not reserved words, they can be the result
|
|
of shell expansions and are not affected by quoting. These
|
|
commands have the following special properties:
|
|
|
|
+ Assignments that precede them apply to the current
|
|
shell process, not just to the given command.
|
|
|
|
+ An error in the format of these commands causes a shell
|
|
script or function that contains them to abort.
|
|
|
|
+ They cannot be overridden by shell functions.
|
|
|
|
Other commands are built-in because they perform side
|
|
effects on the current environment that would be nearly
|
|
impossible to implement otherwise. Built-ins such as cd and
|
|
read are examples of such built-ins. These built-ins are
|
|
semantically equivalent to commands that are not built-in
|
|
except that they don't take a path search to locate.
|
|
|
|
A third reason to have a command built-in is so that it will
|
|
be unaffected by the setting of the PATH variable. The
|
|
print command fits this category. Scripts that use print
|
|
will be portable to all sites that run ksh.
|
|
|
|
The final reason for having a command be a built-in is for
|
|
performance. On most systems it is more than an order of
|
|
magnitude faster to initiate a command that is built-in than
|
|
to create a separate process to run the command. Examples
|
|
that fit this category are test and basename.
|
|
|
|
Given a command name, ksh decides what it means using the
|
|
following order:
|
|
|
|
+ Reserved words define commands that form part of the
|
|
shell grammar. They cannot be quoted.
|
|
|
|
+ Alias substitutions occur first as part of the reading
|
|
of commands. Using quotes in the command name will
|
|
prevent alias substitutions.
|
|
|
|
+ Special built-ins come next.
|
|
|
|
+ Functions.
|
|
|
|
+ Commands that are built-in that are not associated with
|
|
a pathname.
|
|
|
|
+ If the command name contains a /, the program or script
|
|
corresponding to the given name is executed.
|
|
|
|
+ A path search locates the pathname corresponding to the
|
|
command. If the pathname where it is found matches the
|
|
pathname associated with a built-in command, the
|
|
built-in command is executed. If the directory where
|
|
the command is found is listed in the FPATH variable,
|
|
the file is read into the shell like a dot script, and
|
|
a function by that name is invoked. Once a pathname is
|
|
found, ksh remembers its location and only checks
|
|
relative directories in PATH the next time the command
|
|
name is used. Assigning a value to PATH causes ksh to
|
|
forget the location of all command names.
|
|
|
|
+ The FPATH variable is searched and files found are
|
|
treated as described above.
|
|
|
|
The first argument of the command built-in, described
|
|
earlier, skips the checks for reserved words and for
|
|
function definitions. In all other ways, command behaves
|
|
like a built-in that is not associated with a pathname. As
|
|
a result, if the first argument of command is a special
|
|
built-in, the special properties of this built-in do not
|
|
apply. For example, whereas exec 3< foo will cause a script
|
|
containing it to abort if the open fails,
|
|
command exec 3< foo results in a non-zero exit status but
|
|
does not abort the script.
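The non-aborting form can be sketched as follows. The path
is a hypothetical one chosen so that the open fails, and the
variable names are assumptions.

```shell
missing=/nonexistent/dir/file    # hypothetical path that cannot be opened

# With command in front, a failed open only sets the exit status.
if { command exec 3< "$missing"; } 2>/dev/null
then opened=yes
else opened=no
fi

still_running=yes                # reached: the script did not abort
```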
|
|
|
|
You can get a complete list of the special built-in commands
|
|
with builtin -s. In addition, builtin without arguments gives
|
|
a list of the current built-ins and the pathname that they
|
|
are associated with. A built-in can be bound to another
|
|
pathname by giving the pathname for the built-in. The
|
|
basename of this path must be the name of an existing
|
|
built-in for this to succeed. Specifying the name of the
|
|
built-in without a pathname causes this built-in to be found
|
|
before a path search. On systems with run time loading of
|
|
libraries, built-in commands can be added with the builtin
|
|
command. Each command that is to be built-in must be
|
|
written as a C function whose name is of the form b_name,
|
|
where name is the name of the built-in that is to be added.
|
|
The function has the same argument calling convention as
|
|
main. The lower eight bits of the return value become the
|
|
exit status for this built-in. Built-ins are added by
|
|
specifying the pathname of the library as an argument to the
|
|
-f option of builtin.
|
|
|
|
The built-in command whence, used with the -v option, reports
|
|
how a command name would be interpreted. A line is printed
|
|
for each argument to whence telling what would happen if
|
|
this argument were used as a command name. It reports on
|
|
reserved words, aliases, built-ins, and functions. If the
|
|
command is none of the above, it follows the path search
|
|
rules and prints the full path-name, if any, otherwise it
|
|
prints an error message.
|
|
|
|
3.13 Symbolic Names
|
|
|
|
To avoid implementation dependencies, ksh accepts and
|
|
generates symbolic names for built-ins that use numerical
|
|
values in the Bourne shell. The -S option of the umask
|
|
built-in command accepts and displays default file creation
|
|
permissions symbolically. It uses the same symbolic
|
|
notation as the chmod command.
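For instance, the following sketch sets the mask with the
symbolic notation and displays it with -S. The particular
permissions chosen and the variable names are assumptions.

```shell
saved=$(umask)              # remember the current mask
umask u=rwx,g=rx,o=         # symbolic form, same notation as chmod
symbolic=$(umask -S)        # display the mask symbolically
umask "$saved"              # restore the original mask
```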
|
|
|
|
The trap and kill built-in commands allow the signal names
|
|
to be given symbolically. The names of signals and traps
|
|
corresponding to signals are the same as the signal name
|
|
with the SIG prefix removed. The trap 0 is named EXIT.
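A small sketch of the symbolic names follows. It assumes
that signal number 15 is SIGTERM, as it is on common UNIX
systems; the variable names are illustrative.

```shell
# kill -l translates a signal number into its symbolic name.
name=$(kill -l 15)

# EXIT is the symbolic name for trap 0; the trap runs when the
# subshell created for the command substitution finishes.
output=$(trap 'echo cleanup' EXIT; echo body)
```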
|
|
|
|
3.14 Added Traps
|
|
|
|
A new trap named ERR has been added. This trap is invoked
|
|
whenever the shell would exit if the -e option were set.
|
|
This trap is used by Fourth Generation Make[16] which runs
|
|
ksh as a co-process.
|
|
|
|
A trap named DEBUG gets executed after each command. This
|
|
trap can be used for debugging purposes. The KEYBD trap was
|
|
described earlier.
|
|
|
|
3.15 Debugging
|
|
|
|
The primary method for debugging Bourne shell scripts is to
|
|
use the -x option to enable the execution trace. After all
|
|
the expansions have been performed, but before each command
|
|
is executed, the trace writes to standard error the name and
|
|
arguments of each command preceded by a +. With ksh the PS4
|
|
variable is evaluated for parameter substitution and is
|
|
displayed before each command, instead of the +.
|
|
|
|
The LINENO variable is set to the current line number
|
|
relative to the beginning of the current script or function.
|
|
It is most useful as part of the PS4 prompt.
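The sketch below captures one line of trace output with
LINENO in the PS4 prompt. The prompt text is an assumption,
and the line number shown in the trace depends on where the
fragment appears in a script.

```shell
trace=$(
    {
        PS4='line $LINENO> '   # single quotes: expanded at trace time
        set -x                 # enable the execution trace
        : hello                # traced with the expanded PS4 in front
    } 2>&1                     # the trace is written to standard error
)
```

Because the trace is enabled inside the command
substitution, tracing does not remain on afterwards.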
|
|
|
|
The variable RANDOM produces a random number in the range 0
|
|
to 32767 each time it is referenced. Assignment to this
|
|
variable sets the seed for the random number generator.
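Seeding can be illustrated with the following sketch, in
which the seed value 42 and the variable names are arbitrary
assumptions.

```shell
RANDOM=42          # assignment seeds the generator
first=$RANDOM
RANDOM=42          # the same seed restarts the same sequence
again=$RANDOM
```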
|
|
|
|
The parameter PPID expands to the process id of
|
|
the process which invoked this shell.
|
|
|
|
3.16 Timing Commands
|
|
|
|
A reserved word time has been added to replace the time
|
|
command. Any function, command or pipeline can be preceded
|
|
by this reserved word to obtain information about the
|
|
elapsed, user, and system times. Since I/O redirections
|
|
bind to the command, not to time, parentheses should be used
|
|
to redirect the timing information which is normally printed
|
|
on file descriptor 2.
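A minimal sketch of redirecting the timing report follows.
The scratch file and the timed command, sleep 0, are
assumptions, and the exact layout of the report varies
between systems.

```shell
log=$(mktemp)                 # hypothetical file for the timing report
( time sleep 0 ) 2> "$log"    # parentheses make 2> apply to time's output
timing=$(cat "$log")
rm -f "$log"
```

Without the parentheses, the redirection would apply to
sleep itself and the report would still appear on the
terminal.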
|
|
|
|
|
|
4. SECURITY
|
|
|
|
There are several documented problems associated with the
|
|
security of shell procedures[17]. These security holes
|
|
occur primarily because a user can manipulate the
|
|
environment to subvert the intent of a setuid shell
|
|
procedure. Frequently, shell procedures are initiated from
|
|
binary programs, without the author's awareness, by library
|
|
routines which invoke shells to carry out their tasks. When
|
|
the binary program is run setuid then the shell procedure
|
|
runs with the permissions afforded to the owner of the
|
|
binary file.
|
|
|
|
In the Bourne shell, the IFS parameter is used to split each
|
|
word into separate command arguments. If a user knows that
|
|
some setuid program will run sh -c /bin/pwd (or any other
|
|
command in /bin) then the user sets and exports IFS=/.
|
|
Instead of running /bin/pwd the shell will run bin with pwd
|
|
as an argument. The user puts his or her own bin program
|
|
into the current directory. This program can create a copy
|
|
of the shell, make this shell setuid, and then run the
|
|
/bin/pwd program so that the original program continues to
|
|
run successfully. This kind of penetration is not possible
|
|
with ksh since the IFS parameter only splits arguments that
|
|
result from command or parameter substitution.
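The splitting rule can be observed directly with the sketch
below, which counts the arguments produced from a literal
word and from a parameter expansion. The variable names are
assumptions, and the fragment replaces the positional
parameters of the shell that runs it.

```shell
IFS=/                 # the value an attacker would export

set -- /bin/pwd       # a literal word is not split
literal_count=$#

path=/bin/pwd
set -- $path          # the result of parameter expansion is split
expanded_count=$#

unset IFS             # back to default field splitting
```

The literal word stays a single argument, while the expanded
value splits into an empty field, bin, and pwd.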
|
|
|
|
Some setuid programs run programs using system() without
|
|
giving the full pathname. If the user sets the PATH
|
|
variable so that the desired command will be found in his or
|
|
her local bin, then the same technique described above can
|
|
be employed to compromise the security of the system. To
|
|
close up this and other security holes, ksh resets the
|
|
effective user id to the real user id and the effective
|
|
group id to the real group id unless the privileged option
|
|
(-p) is specified at invocation. In this mode, the
|
|
privileged mode, the .profile and ENV files are not
|
|
processed. Instead, the file /etc/suid_profile is read and
|
|
executed. This gives an administrator control over the
|
|
environment to set the PATH variable or to log setuid shell
|
|
invocations. Clearly security of the system is compromised
|
|
if /etc or this file is publicly writable.
|
|
|
|
In the Berkeley version of UNIX, the operating system looks for
|
|
the characters #! as the first two characters of an
|
|
executable file. If these characters are found, then the
|
|
next word on this line is taken as the interpreter to invoke
|
|
for this command and the interpreter is execed with the name
|
|
of the script as argument zero and argument one. If the
|
|
setuid or setgid bits are on for this file, then the
|
|
interpreter is run with the effective uid and/or gid set
|
|
accordingly. This scheme has two major drawbacks. First of
|
|
all, using the #! notation forces an exec of the
|
|
interpreter even when the call is invoked from the
|
|
interpreter which it must exec. This is inefficient since
|
|
the interpreter can handle a failed exec much faster than
|
|
starting up again. More importantly, setuid and setgid
|
|
procedures provide an easy target for intrusion. By linking
|
|
a setuid or setgid procedure to a name beginning with a -
|
|
the interpreter is fooled into thinking that it is being
|
|
invoked with a command line option rather than the name of a
|
|
file. When the interpreter is the shell, the user gets a
|
|
privileged interactive shell. There is code in ksh to guard
|
|
against this simple form of intrusion.
|
|
|
|
A more reliable way to handle setuid and setgid procedures
|
|
is provided with ksh. The technique does not require any
|
|
changes to the operating system and provides better
|
|
security. Another advantage to this method is that it also
|
|
allows scripts which have execute permission but no read
|
|
permission to run. Taking away read permission makes
|
|
scripts more secure.
|
|
|
|
The method relies on a setuid root program to authenticate
|
|
the request and exec the shell with the correct mode bits to
|
|
carry out the task. This shell is invoked with the
|
|
requested file already open for reading. A script which
|
|
cannot be opened for reading or which has its setuid and/or
|
|
setgid bits turned on causes this setuid root program to get
|
|
execed. For security reasons, this program is given the
|
|
full pathname /etc/suid_exec. A description of the
|
|
implementation of the /etc/suid_exec program can be found in
|
|
a separate paper[18].
|
|
|
|
|
|
5. CODE CHANGES
|
|
|
|
ksh is written in ANSI-C as a reusable library. The code
|
|
can be compiled with C++ and older K&R C as well. The code
|
|
uses the IEEE POSIX 1003.1 and ISO 9945-1 standard[19]
|
|
wherever possible so that ksh should be able to run on any
|
|
POSIX compliant system. In addition, it is possible to
|
|
compile ksh for older systems.
|
|
|
|
Unlike earlier versions of the Bourne shell, ksh treats eight
|
|
bit characters transparently without stripping off the
|
|
leading bit. There is also a compile time switch to enable
|
|
handling multi-byte and multi-width character sets.
|
|
|
|
On systems with dynamic libraries, it is possible to add
|
|
built-in commands at run time with the built-in command
|
|
builtin described earlier. It is also possible to embed ksh
|
|
in applications in a manner analogous to tcl.
|
|
|
|
|
|
6. EXAMPLE
|
|
|
|
An example of a ksh script is included in the Appendix.
|
|
This one page program is a variant of the UNIX system
|
|
grep(1) program. Pattern matching for this version of grep
|
|
means shell patterns.
|
|
|
|
The first half uses the getopts command to find the option
|
|
flags. Nearly all options have been implemented. The
|
|
second half goes through each line of each file to look for
|
|
a pattern match.
|
|
|
|
This program is not intended to serve as a replacement for
|
|
grep which has been highly tuned for performance. It does
|
|
illustrate the programming power of ksh. Note that no
|
|
auxiliary processes are spawned by this script. It was
|
|
written and debugged in under two hours. While performance
|
|
is acceptable for small files, this program runs at only one
|
|
tenth the speed of grep for large files.
|
|
|
|
|
|
7. PERFORMANCE
|
|
|
|
ksh executes many scripts faster than the System V Bourne
|
|
shell; in some cases more than 10 times as fast. The
|
|
primary reason for this is that ksh creates fewer processes.
|
|
The time to execute a built-in command or a function is one
|
|
or two orders of magnitude faster than performing a fork()
|
|
and exec() to create a separate process. Command
|
|
substitution and commands inside parentheses are performed
|
|
without creating another process, unless necessary to
|
|
preserve correct behavior.
|
|
|
|
Another reason for improved performance is the use of the
|
|
sfio[20] library for I/O. The sfio library buffers all input
|
|
and output, and buffers are flushed only when required. The
|
|
algorithms used in sfio perform better than traditional
|
|
versions of standard I/O so that programs that spend most of
|
|
their time formatting output may actually perform better
|
|
than versions written in C.
|
|
|
|
Several of the internal algorithms have been changed so that
|
|
the number of subroutine calls has been substantially
|
|
reduced. ksh uses variable sized hash tables for variables.
|
|
Scripts that rely heavily on referencing variables execute
|
|
faster. More processing is performed while reading the
|
|
script so that execution time is saved while running loops.
|
|
These changes are not noticeable for scripts that fork() and
|
|
run processes, but they reduce the time that it takes to
|
|
interpret commands by more than a factor of two.
|
|
|
|
Most importantly, ksh provides mechanisms to write
|
|
applications that do not require as many processes. The
|
|
arithmetic provided by the shell eliminates the need for the
|
|
expr command. The pattern matching and substring
|
|
capabilities eliminate the need to use sed or awk to process
|
|
strings.
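As an illustration, the sketch below does with shell
expansions what would otherwise take separate basename, sed,
and expr processes. The pathname and variable names are
assumptions.

```shell
file=/usr/local/lib/libast.a   # hypothetical pathname

base=${file##*/}    # strip the longest prefix ending in /
stem=${base%.a}     # strip the .a suffix
answer=$((6 * 7))   # arithmetic without running expr
```

No process is created for any of the three assignments.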
|
|
|
|
The architecture of ksh makes it easy to turn commands into
built-ins without changing their semantics at all. Systems
that have run-time binding of libraries allow applications
to be sped up by supplying the critical programs as shell
built-in commands. Implementations on other systems can add
built-in commands at compile time.

8. CONCLUSION

The 1988 version of ksh has tens of thousands of regular
users and is a suitable replacement for the Bourne shell.
The 1993 version of ksh is essentially upward compatible
with both the 1988 version of ksh and the recent IEEE
POSIX and ISO shell standards. The 1993 version offers many
advantages for programming applications, and it has been
rewritten so that it can be used in embedded applications.
It also offers improved performance.

MH-11267-DGK-dgk                           David G. Korn

APPENDIX

                          References

 1. S. R. Bourne, An Introduction to the UNIX Shell, BSTJ,
    Vol. 57, No. 6, Part 2, pp. 1947-1972.

 2. POSIX - Part 2: Shell and Utilities, IEEE Std
    1003.2-1992, ISO/IEC 9945-2:1992.

 3. Al Aho, Brian Kernighan, and Peter Weinberger, The AWK
    Programming Language, Addison-Wesley, 1988.

 4. Lloyd H. Nakatani and Laurence W. Ruedisueli, The FIT
    Programming Language Primer, TN 1126-920301-03, 1992.

 5. Larry Wall, The PERL Programming Language,

 6. John K. Ousterhout, Tcl: An Embeddable Command Language,
    Proceedings of the USENIX meeting, pp. ?-?, 1990.

 7. S. R. Bourne, An Introduction to the UNIX Shell, Bell
    System Technical Journal, Vol. 57, No. 6, Part 2, pp.
    1947-1972, July 1978.

 8. W. Joy, An Introduction to the C Shell, Unix
    Programmer's Manual, Berkeley Software Distribution,
    University of California, Berkeley, 1980.

 9. Morris Bolsky and David Korn, The KornShell Command and
    Programming Language, Prentice Hall, 1989.

10. Jason Levitt, The Korn Shell: An Emerging Standard,
    UNIX/World, pp. 74-81, September 1986.

11. Rich Bilancia, Proficiency and Power are Yours With the
    Korn Shell, UNIX/World, pp. 103-107, September 1987.

12. John Sebes, Comparing UNIX Shells, UNIX Papers, Edited
    by the Waite Group, Howard W. Sams & Co., 1987.

13. T. A. Dolotta and J. R. Mashey, Using the Shell as a
    Primary Programming Tool, Proc. 2nd Int. Conf. on
    Software Engineering, pp. 169-176, 1976.

14. J. S. Pendergrast, WKSH - Korn Shell with X-Windows
    Support, USL.

15. American National Standard for Information Systems -
    Programming Language - C, ANSI X3.159-1989.

16. G. S. Fowler, The Fourth Generation Make, Proceedings
    of the Portland USENIX meeting, pp. 159-174, 1985.

17. F. T. Grampp and R. H. Morris, UNIX Operating System
    Security, AT&T Bell Labs Tech. Journal, Vol. 63, No. 8,
    Part 2, pp. 1649-1671, 1984.

18. D. G. Korn, Parlez-vous Kanji?, TM-59554-860602-03, 1986.

19. POSIX - Part 1: System Application Program Interface,
    IEEE Std 1003.1-1990, ISO/IEC 9945-1:1990.

20. David Korn and Kiem-Phong Vo, SFIO - A Safe/Fast
    Input/Output library, Proceedings of the Summer USENIX,
    pp. , 1991.