Top shell Questions

List of Tags
460
Grundlefleck

What command can be used to check if a directory does or does not exist, within a shell script?

Answered By: Grundlefleck ( 779)

To check if a directory exists in a shell script you can use the following:

if [ -d "$DIRECTORY" ]; then
    # Control will enter here if $DIRECTORY exists.
fi

Or to check if a directory doesn't exist:

if [ ! -d "$DIRECTORY" ]; then
    # Control will enter here if $DIRECTORY doesn't exist.
fi

However, as Jon Ericson points out (thanks Jon), subsequent commands may not work as intended if you do not take into account that a symbolic link to a directory will also pass this check. E.g. running this:

ln -s "$ACTUAL_DIR" "$SYMLINK"
if [ -d "$SYMLINK" ]; then 
    rmdir "$SYMLINK" 
fi

Will produce the error message:

rmdir: failed to remove `symlink': Not a directory

So symbolic links may have to be treated differently, if subsequent commands expect directories:

if [ -d "$LINK_OR_DIR" ]; then 
    if [ -L "$LINK_OR_DIR" ]; then
        # It is a symlink!
        # Symbolic link specific commands go here.
        rm "$LINK_OR_DIR"
    else
        # It's a directory!
        # Directory command goes here.
        rmdir "$LINK_OR_DIR"
    fi
fi

Take particular note of the double-quotes used to wrap the variables, the reason for this is explained by 8jean in another answer.

If the variables contain spaces or other unusual characters it will probably cause the script to fail.

257
Tristan Havelick

In a unix shell, if I want to combine stderr and stdout into the stdout stream for further manipulation, I can append the following on the end of my command:

2>&1

So, if I want to use "head" on the output from g++, I can do something like this:

g++ lots_of_errors 2>&1 | head

so I can see only the first few errors.

I always have trouble remembering this, and I constantly have to go look it up, and it is mainly because I don't fully understand the syntax of this particular trick. Can someone break this up and explain character by character what "2>&1" means?

Answered By: Ayman Hourieh ( 289)

1 is stdout. 2 is stderr.

Here is one way to remember this construct (altough it is not entirely accurate): at first, 2>1 may look like a good way to redirect stderr to stdout. However, it will actually be interpreted as "redirect stderr to a file named 1". & indicates that what follows is a file descriptor and not a filename. So the construct becomes: 2>&1.

247
stefanB

How do I split a string based on a delimiter in Bash?

I have this string stored in a variable:

IN="bla@some.com;john@home.com"

Now I would like to split the strings by ';' delimiter so that I have

ADDR1="bla@some.com"
ADDR2="john@home.com"

I don't necessarily need the ADDR1 and ADDR2 variables. If they are elements of an array that's even better.

Edit: After suggestions from the answers below, I ended up with the following which is what I was after:

#!/usr/bin/env bash

IN="bla@some.com;john@home.com"

arr=$(echo $IN | tr ";" "\n")

for x in $arr
do
    echo "> [$x]"
done

output:

> [bla@some.com]
> [john@home.com]

Edit2: There was a solution involving setting Internal_field_separator (IFS) to ';'. I am not sure what happened with that answer, how do you reset IFS back to default?

Edit3: RE: IFS solution, I tried this and it works, I keep the old IFS and then restore it:

IN="bla@some.com;john@home.com"

OIFS=$IFS
IFS=';'
arr2=$IN
for x in $arr2
do
    echo "> [$x]"
done

IFS=$OIFS

BTW, when I tried

arr2=($IN) 

I only got the first string when printing it in loop, without brackets around $IN it works.

Answered By: Johannes Schaub - litb ( 128)

You can set the internal field separator (IFS) variable, and then let it parse into an array. When this happens in a command, then the assignment to IFS only takes place to that single command's environment (to read ). It then parses the input according to the IFS variable value into an array, which we can then iterate over.

IFS=';' read -ra ADDR <<< "$IN"
for i in "${ADDR[@]}"; do
    # process "$i"
done

It will parse one line of items separated by ;, pushing it into an array. Stuff for processing whole of $IN, each time one line of input separated by ;:

 while IFS=';' read -ra ADDR; do
      for i in "${ADDR[@]}"; do
          # process "$i"
      done
 done <<< "$IN"
215
Muhammad Alkarouri

What are the differences between shell languages like bash, zsh, fish and the scripting languages above that makes them more suitable for the shell?

When using the command line the shell languages seem to be much easier. It feels for me much smoother to use bash for example than to use the shell profile in ipython, despite reports to the contrary. I think most wil agree with me that a large portion of medium to large scale programming is easier in Python than in bash. I use Python as the language I am most familiar with, the same goes for Perl and Ruby.

I have tried to articulate the reason but am unable to, aside from assuming that the treatment of strings differently in both has something to do with it.

The reason of this question is that I am hoping to develop a language usable in both. If you know of such a language, please post it as well.

As S.Lott explains, the question needs some clarification. I am asking about the features of the shell language versus that of scripting languages. So the comparison is not about the characteristics of various interactive (REPL) environments such as history and command line substitution. An alternative expression for the question would be:

Can a programming language that is suitable for design of complex systems be at the same time able to express useful one-liners that can access the file system or control jobs? Can a programming language usefully scale up as well as scale down?

Answered By: J&#246;rg W Mittag ( 334)

There are a couple of differences that I can think of; just thoughtstreaming here, in no particular order:

1. Python & Co. are designed to be good at scripting. Bash & Co. are designed to be only good at scripting, with absolutely no compromise. IOW: Python is designed to be good both at scripting and non-scripting, Bash cares only about scripting.

2. Bash & Co. are untyped, Python & Co. are strongly typed, which means that the number 123, the string 123 and the file 123 are quite different. They are, however, not statically typed, which means they need to have different literals for those, in order to keep them apart. Example:

  • Ruby: 123 (number), Bash: 123
  • Ruby: '123' (string), Bash: 123
  • Ruby: /123/ (regexp), Bash: 123
  • Ruby: File.open('123') (file), Bash: 123
  • Ruby: IO.open('123') (file descriptor), Bash: 123
  • Ruby: URI.parse('123') (URI), Bash: 123
  • Ruby: `123` (command), Bash: 123

3. Python & Co. are designed to scale up to 10000, 100000, maybe even 1000000 line programs, Bash & Co. are designed to scale down to 10 character programs.

4. In Bash & Co., files, directories, file descriptors, processes are all first-class objects, in Python, only Python objects are first-class, if you want to manipulate files, directories etc., you have to wrap them in a Python object first.

5. Shell programming is basically dataflow programming. Nobody realizes that, not even the people who write shells, but it turns out that shells are quite good at that, and general-purpose languages not so much. In the general-purpose programming world, dataflow seems to be mostly viewed as a concurrency model, not so much as a programming paradigm.

I have the feeling that trying to address these points by bolting features or DSLs onto a general-purpose programming language doesn't work. At least, I have yet to see a convincing implementation of it. There is RuSH (Ruby shell), which tries to implement a shell in Ruby, there is rush, which is an internal DSL for shell programming in Ruby, there is Hotwire, which is a Python shell, but IMO none of those come even close to competing with Bash, Zsh, fish and friends.

Actually, IMHO, the best current shell is Microsoft PowerShell, which is very surprising considering that for several decades now, Microsoft has continually had the worst shells evar. I mean, COMMAND.COM? Really? (Unfortunately, they still have a crappy terminal. It's still the "command prompt" that has been around since, what? Windows 3.0?)

PowerShell was basically created by ignoring everything Microsoft has ever done (COMMAND.COM, CMD.EXE, VBScript, JScript) and instead starting from the Unix shell, then removing all backwards-compatibility cruft (like backticks for command substitution) and massaging it a bit to make it more Windows-friendly (like using the now unused backtick as an escape character instead of the backslash which is the path component separator character in Windows). After that, is when the magic happens.

They address problem 1 and 3 from above, by basically making the opposite choice compared to Python. Python cares about large programs first, scripting second. Bash cares only about scripting. PowerShell cares about scripting first, large programs second. A defining moment for me was watching a video of an interview with Jeffrey Snover (PowerShell's lead designer), when the interviewer asked him how big of a program one could write with PowerShell and Snover answered without missing a beat: "80 characters." At that moment I realized that this is finally a guy at Microsoft who "gets" shell programming (probably related to the fact that PowerShell was neither developed by Microsoft's programming language group (i.e. lambda-calculus math nerds) nor the OS group (kernel nerds) but rather the server group (i.e. sysadmins who actually use shells)), and that I should probably take a serious look at PowerShell.

Number 2 is solved by having arguments be statically typed. So, you can write just 123 and PowerShell knows whether it is a string or a number or a file, because the cmdlet (which is what shell commands are called in PowerShell) declares the types of its arguments to the shell. This has pretty deep ramifications: unlike Unix, where each command is responsible for parsing its own arguments (the shell basically passes the arguments as an array of strings), argument parsing in PowerShell is done by the shell. The cmdlets specify all their options and flags and arguments, as well as their types and names and documentation(!) to the shell, which then can perform argument parsing, tab completion, IntelliSense, inline documentation popups etc. in one centralized place. (This is not revolutionary, and the PowerShell designers acknowledge shells like the DIGITAL Command Language (DCL) and the IBM OS/400 Command Language (CL) as prior art. For anyone who has ever used an AS/400, this should sound familiar. In OS/400, you can write a shell command and if you don't know the syntax of certain arguments, you can simply leave them out and hit F4, which will bring a menu (similar to an HTML form) with labelled fields, dropdown, help texts etc. This is only possible because the OS knows about all the possible arguments and their types.) In the Unix shell, this information is often duplicated three times: in the argument parsing code in the command itself, in the bash-completion script for tab-completion and in the manpage.

Number 4 is solved by the fact that PowerShell operates on strongly typed objects, which includes stuff like files, processes, folders and so on.

Number 5 is particularly interesting, because PowerShell is the only shell I know of, where the people who wrote it were actually aware of the fact that shells are essentially dataflow engines and deliberately implemented it as a dataflow engine.

Another nice thing about PowerShell are the naming conventions: all cmdlets are named Action-Object and moreover, there are also standardized names for specific actions and specific objects. (Again, this should sound familar to OS/400 users.) For example, everything which is related to receiving some information is called Get-Foo. And everything operating on (sub-)objects is called Bar-ChildItem. So, the equivalent to ls is Get-ChildItem (although PowerShell also provides builtin aliases ls and dir – in fact, whenever it makes sense, they provide both Unix and CMD.EXE aliases as well as abbreviations (gci in this case)).

But the killer feature IMO is the strongly typed object pipelines. While PowerShell is derived from the Unix shell, there is one very important distinction: in Unix, all communication (both via pipes and redirections as well as via command arguments) is done with untyped, unstructured, ASCII strings. In PowerShell, it's all strongly typed, structured objects. This is so incredibly powerful that I seriously wonder why noone else has thought of it. (Well, they have, but they never became popular.) In my shell scripts, I estimate that up to one third of the commands is only there to act as an adapter between two other commands that don't agree on a common textual format. Many of those adapters go away in PowerShell, because the cmdlets exchange structured objects instead of unstructured text. And if you look inside the commands, then they pretty much consist of three stages: parse the textual input into an internal object representation, manipulate the objects, convert them back into text. Again, the first and third stage basically go away, because the data already comes in as objects.

However, the designers have taken great care to preserve the dynamicity and flexibility of shell scripting through what they call an Adaptive Type System.

Anyway, I don't want to turn this into a PowerShell commercial. There are plenty of things that are not so great about PowerShell, although most of those have to do either with Windows or with the specific implementation, and not so much with the concepts. (E.g. the fact that it is implemented in .NET means that the very first time you start up the shell can take up to several seconds if the .NET framework is not already in the filesystem cache due to some other application that needs it. Considering that you often use the shell for well under a second, that is completely unacceptable.)

The most important point I want to make is that if you want to look at existing work in scripting languages and shells, you shouldn't stop at Unix and the Ruby/Python/Perl/PHP family. For example, Tcl was already mentioned. Rexx would be another scripting language. Emacs Lisp would be yet another. And in the shell realm there are some of the already mentioned mainframe/midrange shells such as the OS/400 command line and DCL. Also, Plan9's rc.

205
Masi

I unsuccesfully tried:

sed 's#/\n# #g' file
sed 's#^$# #g' file

How to fix it?

Answered By: Zsolt Botykai ( 258)

Or use this solution with sed:

sed ':a;N;$!ba;s/\n/ /g'

This will read the whole file in a loop, then replaces the newline(s) with a space.

Update: explanation.

  1. create a label via :a
  2. append the current and next line to the pattern space via N
  3. if we are before the last line, branch to the created label $!ba ($! means not to do it on the last line (as there should be one final newline)).
  4. finally the substitution replaces every newline with a space on the pattern space (which is the whole file).
188
Piskvor

I'm looking for the string "foo=" (without quotes) in text files in a directory tree. It's on a common Linux machine, I have bash shell:

grep -ircl "foo=" *

In the directories are also many binary files which match "foo=". As these results are not relevant and slow down the search, I want grep to skip searching these files (mostly JPEG and PNG images): how would I do that?

I know there are the --exclude=PATTERN and --include=PATTERN options, but what is the pattern format? manpage of grep says:

--include=PATTERN     Recurse in directories only searching file matching PATTERN.
--exclude=PATTERN     Recurse in directories skip file matching PATTERN.

Searching on grep include, grep include exclude, grep exclude and variants did not find anything relevant

If there's a better way of grepping only in certain files, I'm all for it; moving the offending files is not an option, I can't search only certain directories (the directory structure is a big mess, with everything everywhere). Also, I can't install anything, so I have to do with common tools (like grep or the suggested find).

UPDATES: @Adam Rosenfield's answer is just what I was looking for:

grep -ircl --exclude=*.{png,jpg} "foo=" *

@rmeador's answer is also a good solution:

grep -Ir --exclude="*\.svn*" "pattern" *

It searches recursively, ignores binary files, and doesn't look inside Subversion hidden folders.(...)

Answered By: Adam Rosenfield ( 154)

Use the shell globbing syntax:

grep pattern -r --include=\*.{cpp,h} rootdir

The syntax for --exclude is identical.

Note that the star is escaped with a backslash to prevent it from being expanded by the shell (quoting it, such as --include="*.{cpp,h}", would work just as well). Otherwise, if you had any files in the current working directory that matched the pattern, the command line would expand to something like grep pattern -r --include=foo.cpp --include=bar.h rootdir, which would only search files named foo.cpp and bar.h, which is quite likely not what you wanted.

178
Andy White

I'm debating whether I should learn PowerShell, or just stick with Cygwin/Perl Scripts/Unix Shell scripts, etc.

The benefit of PowerShell would be that the scripts could be more easily used by teammates that don't have cygwin; however, I don't know if I'd really be writing that many general purpose scripts, or if people would even use them.

Unix scripting is so powerful, does PowerShell come close enough to warrant switching over?

EDIT: Here are some of the specific things (or equivalents) I would be looking for in PowerShell:

  • grep
  • sort
  • uniq
  • Perl (how close does PowerShell come to Perl capabilities?)
  • awk
  • sed
  • file (the command that gives file info)
  • etc.
Answered By: Jeffrey Snover - MSFT ( 382)

Tools are just tools.
They help or they don't.
You need help or you don't.

If you know Unix and those tools do what you need them to do on Windows - then you are a happy guy and there is no need to learn PowerShell (unless you want to explore).

My original intent was to include a set of Unix tools in Windows and be done with it (a number of us on the team have deep Unix backgrounds and a healthy dose of respect for that community.) What I found was that this didn't really help much. The reason for that is that awk/grep/sed don't work against COM, WMI, ADSI, the Registry, the cert store, etc, etc. In other words, UNIX is an entire ecosystem self-tuned around text files. As such, text processing tools are effectively management tools. Windows is a completely different ecosystem self-tuned around APIs and Objects. That's why we invented PowerShell.

What I think you'll find is that there will be lots of occasions when text-processing won't get you what you want on Windows. At that point, you'll want to pick up PowerShell. NOTE - it is not an all or nothing deal. Within PowerShell, you can call out to your Unix tools (and use their text process or PowerShell's text processing). Also you can call PowerShell from your Unix tools and get text.

Again - there is no religion here - our focus is on giving you the tools you need to succeed. That is why we are so passionate about feedback. Let us know where we are falling down on the job or where you don't have a tool you need and we'll put it on the list and get to it. In all honesty, we are digging ourselves out of a 30 year hole so it is going to take a while. That said, if you pick up the beta of Windows Server 2008 /R2 and/or the betas of our server products, I think you'll be shocked at how quickly that hole is getting filled.

With regard to usage - we've had > 3.5 million downloads to date. That does not include the people using it in Windows Server 2008 because it is included as an optional component and does not need a download. V2 will ship in all versions of Windows. It will be on-by-default for all editions except Server core where it is an optional component. Shortly after Windows 7/Windows Server 2008 R2 ships, we'll make V2 available on all platforms XP and above. In other words - your investment in learning will be applicable to a very large number of machines/environments.

One last comment. If/when you start to learn PowerShell, I think you'll be pretty happy. Much of the design is heavily influenced by our Unix backgrounds so while we are quite different, you'll pick it up very quickly (after you get over cussing that it isn't Unix :-) ). We know that people have a very limited budget for learning - that is why we are super hard-core about consistency. You are going to learn something and then you'll use it over and over and over again.

Experiment! Enjoy! Engage!

Jeffrey Snover [MSFT] Windows Management Partner Architect

172
gregh

How would I validate that a program exists? which would then either return an error and exit or continue with the script.

It seems like it should be easy, but it's been stumping me.

Answered By: lhunath ( 267)

Yes; avoid which. Not only is it an external process you're launching for doing very little (meaning builtins like hash, type or command are way cheaper), you can also rely on the builtins to actually do what you want, while the effects of external commands can easily vary from system to system.

Why care?

  • Many operating systems have a which that doesn't even set an exit status, meaning the if which foo won't even work there and will always report that foo exists, even if it doesn't (note that some POSIX shells appear to do this for hash too).
  • Many operating systems make which do custom and evil stuff like change the output or even hook into the package manager.

So, don't use which. Instead use one of these:

$ command -v foo >/dev/null 2>&1 || { echo >&2 "I require foo but it's not installed.  Aborting."; exit 1; }
$ type foo >/dev/null 2>&1 || { echo >&2 "I require foo but it's not installed.  Aborting."; exit 1; }
$ hash foo 2>/dev/null || { echo >&2 "I require foo but it's not installed.  Aborting."; exit 1; }

If your hash bang is /bin/sh then you should care about what POSIX says. type and hash's exit codes aren't terribly well defined by POSIX, and hash is seen to exit successfully when the command doesn't exist (haven't seen this with type yet). command's exit status is well defined by POSIX, so that one is probably the safest to use.

If your script uses bash though, POSIX' rules don't really matter anymore and both type and hash become perfectly safe to use. type now has a -P to search just the PATH and hash has the side-effect that the command's location will be hashed (for faster lookup next time you use it), which is usually a good thing since you probably check for its existence in order to actually use it.

As a simple example, here's a function that runs gdate if it exists, otherwise date:

gnudate() {
    if hash gdate 2>/dev/null; then
        gdate "$@"
    else
        date "$@"
    fi
}

Could you please suggest me how to run a shell script on remote machine?

I have ssh configured on both machine A and B. My script is on machine A which will perform a task on machine B.

Answered By: Jason R. Coombs ( 192)

If Machine A is a Windows box, you can use Plink (part of PuTTY) with the -m parameter, and it will execute the local script on the remote server.

plink root@MachineB -m local_script.sh

If Machine A is a Unix-based system, you can use:

ssh root@MachineB 'bash -s' < local_script.sh

You shouldn't have to copy the script to the remote server to run it.

155
user77413

We've got a PHP application and want to count all the lines of code under a specific directory and its subdirectories. We don't need to ignore comments, as we're just trying to get a rough idea.

wc -l *.php

That command works great within a given directory, but ignores subdirectories. I was thinking this might work, but it is returning 74, which is definitely not the case...

find . -name '*.php' | wc -l

What's the correct syntax to feed in all the files?

Answered By: Peter Elespuru ( 244)

Try

find . -name '*.php' | xargs wc -l

This may help as well

http://www.dwheeler.com/sloccount/

It'll give an accurate source lines of code count for whatever hierarchy you point it at, as well as some additional stats