From b14e79e9d039a46f3489a24dc5903f57c8bdd0c5 Mon Sep 17 00:00:00 2001
From: Martijn Dekker
Date: Thu, 9 Jun 2022 15:51:44 +0100
Subject: [PATCH] posix: use real pipe(2) instead of socketpair(2)

The POSIX standard requires real UNIX pipes as in pipe(2). But on
systems supporting it (all modern ones), ksh uses socketpair(2)
instead to make it possible for certain commands to peek ahead
without consuming input from the pipe, which is not possible with
real pipes.

See features/poll and sh/io.c.

But this can create undesired side effects: applications connected
to a pipe may test if they are connected to a pipe, which will fail
if they are connected to a socket. Also, on Linux:

$ cat /etc/passwd | head -1 /dev/stdin
head: cannot open '/dev/stdin' for reading: No such device or address

...which happens because, unlike most systems, Linux cannot open(2)
or openat(2) a socket (a limitation that is allowed by POSIX).

Unfortunately at least two things depend on the peekahead
capability of the _pipe_socketpair feature. One is the non-blocking
behaviour of the -n option of the 'read' built-in:

-n      Causes at most n bytes to be read instead of a full line,
        but will return when reading from a slow device as soon as
        any characters have been read.

The other thing that breaks is the <#pattern and <##pattern
redirection operators that basically grep standard input, which
inherently requires peekahead.

Standard UNIX pipes always block on read and it is not possible to
peek ahead, so these features inevitably break. Which means we
cannot simply use standard pipes without breaking compatibility.

But we can at least fix it in the POSIX mode so that cross-platform
scripts work more correctly.

src/cmd/ksh93/sh/io.c: sh_pipe():
- If _pipe_socketpair is detected at compile time, then use a real
  pipe via sh_rpipe() if the POSIX mode is active. (If
  _pipe_socketpair is not detected, a real pipe is always used.)

src/cmd/ksh93/data/builtins.c:
- sh.1 documents the slow-device behaviour of -n but 'read --man'
  did not. Add that, making it conditional on _pipe_socketpair.

Resolves: https://github.com/ksh93/ksh/issues/327
---
 NEWS                            |  8 ++++++++
 src/cmd/ksh93/Mamfile           |  1 +
 src/cmd/ksh93/data/builtins.c   | 11 +++++++++--
 src/cmd/ksh93/include/version.h |  2 +-
 src/cmd/ksh93/sh.1              | 21 +++++++++++++++++++++
 src/cmd/ksh93/sh/io.c           |  4 ++++
 6 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/NEWS b/NEWS
index 6f54dff3f..b6d7f7a8c 100644
--- a/NEWS
+++ b/NEWS
@@ -3,6 +3,14 @@ For full details, see the git log at: https://github.com/ksh93/ksh/tree/1.0
 
 Any uppercase BUG_* names are modernish shell bug IDs.
 
+2022-06-09:
+
+- The POSIX mode has been amended to use a UNIX pipe(2) instead of a
+  socketpair(2) to connect commands in a pipeline, as the standard requires.
+  (When reading directly from a pipeline in posix mode, the <#pattern and
+  <##pattern redirection operators will not work and the -n option to the
+  read built-in will not return early when reading from a slow device.)
+
 2022-06-08:
 
 - If -B/--braceexpand is turned on in --posix mode, it now only allows brace
diff --git a/src/cmd/ksh93/Mamfile b/src/cmd/ksh93/Mamfile
index 614fa75a4..bbff8a017 100644
--- a/src/cmd/ksh93/Mamfile
+++ b/src/cmd/ksh93/Mamfile
@@ -1281,6 +1281,7 @@ make install
 done aliases.o generated
 make builtins.o
 make data/builtins.c
+prev FEATURE/poll implicit
 prev FEATURE/cmds implicit
 prev include/jobs.h implicit
 prev include/builtins.h implicit
diff --git a/src/cmd/ksh93/data/builtins.c b/src/cmd/ksh93/data/builtins.c
index 2fc839c4b..8f0883de2 100644
--- a/src/cmd/ksh93/data/builtins.c
+++ b/src/cmd/ksh93/data/builtins.c
@@ -29,6 +29,7 @@
 #include "builtins.h"
 #include "jobs.h"
 #include "FEATURE/cmds"
+#include "FEATURE/poll"
 #define bltin(x) (b_##x)
 /* The following is for builtins that do not accept -- options */
 #define Bltin(x) (B_##x)
@@ -1407,7 +1408,7 @@ const char sh_optpwd[] =
 ;
 
 const char sh_optread[] =
-"[-1c?\n@(#)$Id: read (ksh 93u+m) 2022-02-16 $\n]"
+"[-1c?\n@(#)$Id: read (ksh 93u+m) 2022-06-09 $\n]"
 "[--catalog?" SH_DICT "]"
 "[+NAME?read - read a line from standard input]"
 "[+DESCRIPTION?\bread\b reads a line from standard input and breaks it "
@@ -1433,7 +1434,13 @@ const char sh_optread[] =
 	"the line starting at index 0.]"
 "[C?Unset \avar\a and read \avar\a as a compound variable.]"
 "[d]:[delim?Read until delimiter \adelim\a instead of to the end of line.]"
-"[n]#[count?Read at most \acount\a characters or (for binary fields) bytes.]"
+"[n]#[count?Read at most \acount\a characters or (for binary fields) bytes."
+#if _pipe_socketpair
+	" When reading from a slow device, "
+	"will return as soon as any characters have been read, "
+	"unless the \bposix\b shell option is on."
+#endif
+	"]"
 "[N]#[count?Read exactly \acount\a characters or (for binary fields) bytes.]"
 "[p?Read from the current co-process instead of standard input. "
 	"An end-of-file causes \bread\b to disconnect the co-process "
diff --git a/src/cmd/ksh93/include/version.h b/src/cmd/ksh93/include/version.h
index dcf1e9c1f..dca29cf74 100644
--- a/src/cmd/ksh93/include/version.h
+++ b/src/cmd/ksh93/include/version.h
@@ -21,7 +21,7 @@
 
 #define SH_RELEASE_FORK "93u+m" /* only change if you develop a new ksh93 fork */
 #define SH_RELEASE_SVER "1.0.0-beta.2" /* semantic version number: https://semver.org */
-#define SH_RELEASE_DATE "2022-06-08" /* must be in this format for $((.sh.version)) */
+#define SH_RELEASE_DATE "2022-06-09" /* must be in this format for $((.sh.version)) */
 #define SH_RELEASE_CPYR "(c) 2020-2022 Contributors to ksh " SH_RELEASE_FORK
 
 /* Scripts sometimes field-split ${.sh.version}, so don't change amount of whitespace. */
diff --git a/src/cmd/ksh93/sh.1 b/src/cmd/ksh93/sh.1
index 391dd5032..7557a9481 100644
--- a/src/cmd/ksh93/sh.1
+++ b/src/cmd/ksh93/sh.1
@@ -184,6 +184,12 @@ separated by
 .BR | .
 The standard output of each command but the last
 is connected by a
+.IR socketpair (2)
+or
+(if the
+.B posix
+shell option is on)
+by a
 .IR pipe (2)
 to the standard input of the next command.
 Each command,
@@ -7723,6 +7729,21 @@ disables fast filescan loops of type
 makes the \fB<>\fR redirection operator default to redirecting standard input
 if no file descriptor number precedes it;
 .IP \[bu]
+causes the shell to use a standard UNIX
+.IR pipe (2)
+instead of a
+.IR socketpair (2)
+to connect commands in a pipeline
+(when reading directly from a pipeline, the
+.BI <# pattern
+and
+.BI <## pattern
+redirection operators will not work and the
+.B -n
+option to the
+.B read
+built-in will not return early when reading from a slow device);
+.IP \[bu]
 disables the special floating point constants \fBInf\fR and \fBNaN\fR
 in arithmetic evaluation so that, e.g., \fB$((inf))\fR and \fB$((nan))\fR
 refer to the variables by those names;
diff --git a/src/cmd/ksh93/sh/io.c b/src/cmd/ksh93/sh/io.c
index 4d26420b5..05ea03635 100644
--- a/src/cmd/ksh93/sh/io.c
+++ b/src/cmd/ksh93/sh/io.c
@@ -913,6 +913,10 @@ int sh_iomovefd(register int fdold)
 int sh_pipe(register int pv[])
 {
 	int fd[2];
+#ifdef pipe
+	if(sh_isoption(SH_POSIX))
+		return(sh_rpipe(pv));
+#endif
 	if(pipe(fd)<0 || (pv[0]=fd[0])<0 || (pv[1]=fd[1])<0)
 	{
 		errormsg(SH_DICT,ERROR_system(1),e_pipe);