Top unix Questions

List of Tags
Gabriel Isenberg

I have a C++ application I'm in the process of optimizing. What tool can I use to pinpoint my slow code?

Answered By: Mike Dunlavey ( 522)

If your goal is to use a profiler, use one of the suggested ones.

However, if you're in a hurry and you can manually interrupt your program under the debugger while it's being subjectively slow, there's a simple way to find performance problems.

Just halt it several times, and each time look at the call stack. If there is some code that is wasting some percentage of the time, 20% or 50% or whatever, that is the probability that you will catch it in the act on each sample. So that is roughly the percentage of samples on which you will see it. There is no educated guesswork required. If you do have a guess as to what the problem is, this will prove or disprove it.

You may have multiple performance problems of different sizes. If you clean out any one of them, the remaining ones will take a larger percentage, and be easier to spot, on subsequent passes.

Caveat: programmers tend to be skeptical of this technique unless they've used it themselves. They will say that profilers give you this information, but that is only true if they sample the entire call stack. Call graphs don't give you the same information, because 1) they don't summarize at the instruction level, and 2) they give confusing summaries in the presence of recursion. They will also say it only works on toy programs, when actually it works on any program, and it seems to work better on bigger programs, because they tend to have more problems to find.

P.S. This can also be done on multi-thread programs if there is a way to collect call-stack samples of the thread pool at a point in time, as there is in Java.

P.P.S As a rough generality, the more layers of abstraction you have in your software, the more likely you are to find that that is the cause of performance problems (and the opportunity to get speedup).

Added: It might not be obvious, but the stack sampling technique works equally well in the presence of recursion. The reason is that the time that would be saved by removal of an instruction is approximated by the fraction of samples containing it, regardless of the number of times it may occur within a sample.

Another objection I often hear is: "It will stop someplace random, and it will miss the real problem". This comes from having a prior concept of what the real problem is. A key property of performance problems is that they defy expectations. Sampling tells you something is a problem, and your first reaction is disbelief. That is natural, but you can be sure if it finds a problem it is real, and vice-versa.

ADDED: Let me make a Bayesian explanation of how it works. Suppose there is some instruction I (call or otherwise) which is on the call stack some fraction f of the time (and thus costs that much). For simplicity, suppose we don't know what f is, but assume it is either 0.1, 0.2, 0.3, ... 0.9, 1.0, and the prior probability of each of these possibilities is 0.1, so all of these costs are equally likely a-priori.

Then suppose we take just 2 stack samples, and we see instruction I on both samples, designated observation o=2/2. This gives us new estimates of the frequency f of I, according to this:

P(f=x) x  P(o=2/2|f=x) P(o=2/2&&f=x)  P(o=2/2&&f >= x)  P(f >= x)

0.1    1     1             0.1          0.1            0.25974026
0.1    0.9   0.81          0.081        0.181          0.47012987
0.1    0.8   0.64          0.064        0.245          0.636363636
0.1    0.7   0.49          0.049        0.294          0.763636364
0.1    0.6   0.36          0.036        0.33           0.857142857
0.1    0.5   0.25          0.025        0.355          0.922077922
0.1    0.4   0.16          0.016        0.371          0.963636364
0.1    0.3   0.09          0.009        0.38           0.987012987
0.1    0.2   0.04          0.004        0.384          0.997402597
0.1    0.1   0.01          0.001        0.385          1

                  P(o=2/2) 0.385                

The last column says that, for example, the probability that f >= 0.5 is 92%, up from the prior assumption of 60%.

Suppose the prior assumptions are different. Suppose we assume P(f=0.1) is .991 (nearly certain), and all the other possibilities are almost impossible (0.001). In other words, our prior certainty is that I is cheap. Then we get:

P(f=x) x  P(o=2/2|f=x) P(o=2/2&& f=x)  P(o=2/2&&f >= x)  P(f >= x)

0.001  1    1              0.001        0.001          0.072727273
0.001  0.9  0.81           0.00081      0.00181        0.131636364
0.001  0.8  0.64           0.00064      0.00245        0.178181818
0.001  0.7  0.49           0.00049      0.00294        0.213818182
0.001  0.6  0.36           0.00036      0.0033         0.24
0.001  0.5  0.25           0.00025      0.00355        0.258181818
0.001  0.4  0.16           0.00016      0.00371        0.269818182
0.001  0.3  0.09           0.00009      0.0038         0.276363636
0.001  0.2  0.04           0.00004      0.00384        0.279272727
0.991  0.1  0.01           0.00991      0.01375        1

                  P(o=2/2) 0.01375                

Now it says P(f >= 0.5) is 26%, up from the prior assumption of 0.6%. So Bayes allows us to update our estimate of the probable cost of I. If the amount of data is small, it doesn't tell us accurately what the cost is, only that it is big enough to be worth fixing.

Yet another way to look at it is called the Rule Of Succession. If you flip a coin 2 times, and it comes up heads both times, what does that tell you about the probable weighting of the coin? The respected way to answer is to say that it's a Beta distribution, with average value (number of hits + 1) / (number of tries + 2) = (2+1)/(2+2) = 75%.

(The key is that we see I more than once. If we only see it once, that doesn't tell us much except that f > 0.)

So, even a very small number of samples can tell us a lot about the cost of instructions that it sees. (And it will see them with a frequency, on average, proportional to their cost. If n samples are taken, and f is the cost, then I will appear on nf+/-sqrt(nf(1-f)) samples. Example, n=10, f=0.3, that is 3+/-1.4 samples.)


What command can be used to check if a directory does or does not exist, within a shell script?

Answered By: Grundlefleck ( 779)

To check if a directory exists in a shell script you can use the following:

if [ -d "$DIRECTORY" ]; then
    # Control will enter here if $DIRECTORY exists.

Or to check if a directory doesn't exist:

if [ ! -d "$DIRECTORY" ]; then
    # Control will enter here if $DIRECTORY doesn't exist.

However, as Jon Ericson points out (thanks Jon), subsequent commands may not work as intended if you do not take into account that a symbolic link to a directory will also pass this check. E.g. running this:

if [ -d "$SYMLINK" ]; then 
    rmdir "$SYMLINK" 

Will produce the error message:

rmdir: failed to remove `symlink': Not a directory

So symbolic links may have to be treated differently, if subsequent commands expect directories:

if [ -d "$LINK_OR_DIR" ]; then 
    if [ -L "$LINK_OR_DIR" ]; then
        # It is a symlink!
        # Symbolic link specific commands go here.
        rm "$LINK_OR_DIR"
        # It's a directory!
        # Directory command goes here.
        rmdir "$LINK_OR_DIR"

Take particular note of the double-quotes used to wrap the variables, the reason for this is explained by 8jean in another answer.

If the variables contain spaces or other unusual characters it will probably cause the script to fail.

Is there a (unix) shell script to format JSON in human-readable form?

Basically, I want it to transform the following:

{ foo: "lorem", bar: "ipsum" }

... into something like this:

    foo: "lorem",
    bar: "ipsum"


Answered By: B Bycroft ( 704)

With python you can just do

echo '{"foo": "lorem", "bar": "ipsum"}' | python -mjson.tool

There is a plethora of questions where people talk about common tricks, notably "Vim+ctags tips and tricks".

However, I don't refer to commonly used shortcuts that someone new to Vim would find cool. I am talking about a seasoned Unix user (be she/he a developer, administrator, both, etc.), who thinks (s)he knows something 99% of us never heard or dreamed about. Something that not only makes his/her work easier, but also is COOL and hackish. After all, Vim resides in the most dark-corner-rich OS in the world, thus it should have intricacies that only a few privileged know about and want to share with us.

Answered By: skinp ( 339)

Might not be one that 99% of Vim users don't know about, but it's something I use daily and that any Linux+Vim poweruser must know.

Basic command, yet extremely useful.

:w !sudo tee %

I often forget to sudo before editing a file I don't have write permissions on. When I come to save that file and get a permission error, I just issue that vim command in order to save the file without the need to save it to a temp file and then copy it back again.

You obviously have to be on a system with sudo installed and have sudo rights.

Tristan Havelick

In a unix shell, if I want to combine stderr and stdout into the stdout stream for further manipulation, I can append the following on the end of my command:


So, if I want to use "head" on the output from g++, I can do something like this:

g++ lots_of_errors 2>&1 | head

so I can see only the first few errors.

I always have trouble remembering this, and I constantly have to go look it up, and it is mainly because I don't fully understand the syntax of this particular trick. Can someone break this up and explain character by character what "2>&1" means?

Answered By: Ayman Hourieh ( 289)

1 is stdout. 2 is stderr.

Here is one way to remember this construct (altough it is not entirely accurate): at first, 2>1 may look like a good way to redirect stderr to stdout. However, it will actually be interpreted as "redirect stderr to a file named 1". & indicates that what follows is a file descriptor and not a filename. So the construct becomes: 2>&1.

$ time foo
real        0m0.003s
user        0m0.000s
sys         0m0.004s

What do 'real', 'user' and 'sys' mean in the output of time?

Which one is meaningful when benchmarking my app?

Answered By: ConcernedOfTunbridgeWells ( 344)

Real, User and Sys process time statistics

One of these things is not like the other. Real refers to actual elapsed time; User and Sys refer to CPU time used only by the process.

  • Real is wall clock time - time from start to finish of the call. This is all elapsed time including time slices used by other processes and time the process spends blocked (for example if it is waiting for I/O to complete).

  • User is the amount of CPU time spent in user-mode code (outside the kernel) within the process. This is only actual CPU time used in executing the process. Other processes and time the process spends blocked do not count towards this figure.

  • Sys is the amount of CPU time spent in the kernel within the process. This means executing CPU time spent in system calls within the kernel, as opposed to library code, which is still running in user-space. Like 'user', this is only CPU time used by the process. See below for a brief description of kernel mode (also known as 'supervisor' mode) and the system call mechanism.

User+Sys will tell you how much actual CPU time your process used. Note that this is across all CPUs, so if the process has multiple threads it could potentially exceed the wall clock time reported by Real. Note that in the output these figures include the User and Sys time of all child processes as well, although the underlying system calls return the statistics for the process and its children separately.

Origins of the statistics reported by time (1)

The statistics reported by time are gathered from various system calls. 'User' and 'Sys' come from wait (2) or times (2), depending on the particular system. 'Real' is calculated from a start and end time gathered from the gettimeofday (2) call. Depending on the version of the system, various other statistics such as the number of context switches may also be gathered by time.

On a multi-processor machine a multi-threaded process or a process forking children could have an elapsed time smaller than the total CPU time - as different threads or processes may run in parallel. Also, the time statistics reported come from different origins, so times recorded for very short running tasks may be subject to rounding errors, as the example given by the original poster shows.

A brief primer on Kernel vs. User mode

On unix, or any protected-memory operating system, 'Kernel' or 'Supervisor' mode refers to a privileged mode that the CPU can operate in. Certain privileged actions that could affect security or stability can only be done when the CPU is operating in this mode; these actions are not available to application code. An example of such an action might be to manipulate the MMU to gain access to the address space of another process. Normally, user-mode code cannot do this (with good reason), although it can request shared memory from the kernel, which could be read or written by more than one process. In this case, the shared memory is explicitly requested from the kernel through a secure mechansm and both processes have to explicitly attach to it in order to use it.

The privileged mode is usually referred to as 'kernel' mode because the kernel is executed by the CPU running in this mode. In order to switch to kernel mode you have to issue a specific instruction (often called a trap) that switches the CPU to running in kernel mode and runs code from a specific location held in a jump table. For security reasons, you cannot switch to kernel mode and execute arbitrary code - the traps are managed through a table of addresses that cannot be written to unless the CPU is running in supervisor mode. You trap with an explicit trap number and the address is looked up in the jump table; the kernel has a finite number of controlled entry points.

The 'system' calls in the C libary (particularly those described in Section 2 of the man pages) have a user-mode component, which is what you actually call from your C program. Behind the scenes they may issue one or more system calls to the kernel to do specific services such as I/O, but they still also have code running in user-mode. It is also quite possible to directly issue a trap to kernel mode from any user space code if desired, although you may need to write a snippet of assembly language to set up the registers correctly for the call. A page describing the system calls provided by the Linux kernel and the conventions for setting up registers can be found here.

More about 'sys'

There are things that your code cannot do from user mode - things like allocating memory or accessing hardware (HDD, network, etc.) These are under the supervision of The Kernel, and he alone can do them. Some operations that you do (like malloc or fread/fwrite) will invoke these Kernel functions and that then will count as 'sys' time. Unfortunately it's not as simple as "every call to malloc will be counted in 'sys' time". The call to malloc will do some processing of its own (still counted in 'user' time) and then somewhere along the way call the function in kernel (counted in 'sys' time). After returning from the kernel call there will be some more time in 'user' and then malloc will return to your code. When the switch happens and how much of it is spent in kernel mode - you cannot say. It depends on the implementation of the library. Also, other seemingly innocent functions might also use malloc and the like in the background, which will again have some time in 'sys' then.

Chris Serra

Is there a way to include all the jar files within a directory in the classpath?

I'm trying java -classpath lib/*.jar:. my.package.Program and it is not able to find class files that are certainly in those jars. Do I need to add each jar file to the classpath separately?

Answered By: basszero ( 192)

If using Java 6 or later, classpath wildcards are a part of the JVM.

java -cp "Test.jar;lib/*" my.package.MainClass

Key gotchas:

  1. Use quotes
  2. Use * only, not *.jar

The above example and gotchas are from other answers on this page. (Thanks davorp et al & Wim Deblauwe)

From the Classpath document section entitled, Understanding class path wildcards:

Class path entries can contain the basename wildcard character *, which is considered equivalent to specifying a list of all the files in the directory with the extension .jar or .JAR. For example, the class path entry foo/* specifies all JAR files in the directory named foo. A classpath entry consisting simply of * expands to a list of all the jar files in the current directory.

A class path entry that contains * will not match class files. To match both classes and JAR files in a single directory foo, use either foo;foo/* or foo/*;foo. The order chosen determines whether the classes and resources in foo are loaded before JAR files in foo, or vice versa.

Subdirectories are not searched recursively. For example, foo/* looks for JAR files only in foo, not in foo/bar, foo/baz, etc.

The order in which the JAR files in a directory are enumerated in the expanded class path is not specified and may vary from platform to platform and even from moment to moment on the same machine. A well-constructed application should not depend upon any particular order. If a specific order is required then the JAR files can be enumerated explicitly in the class path.

Expansion of wildcards is done early, prior to the invocation of a program's main method, rather than late, during the class-loading process itself. Each element of the input class path containing a wildcard is replaced by the (possibly empty) sequence of elements generated by enumerating the JAR files in the named directory. For example, if the directory foo contains a.jar, b.jar, and c.jar, then the class path foo/* is expanded into foo/a.jar;foo/b.jar;foo/c.jar, and that string would be the value of the system property java.class.path.

The CLASSPATH environment variable is not treated any differently from the -classpath (or -cp) command-line option. That is, wildcards are honored in all these cases. However, class path wildcards are not honored in the Class-Path jar-manifest header.

If you cannot use wildcards, bash allows the following syntax (where lib is the directory containing all the Java archive files):

java -cp $(echo lib/*.jar | tr ' ' ':')

(Note that using a classpath is incompatible with the -jar option. See also: Execute jar file with multiple classpath libraries from command prompt)


I unsuccesfully tried:

sed 's#/\n# #g' file
sed 's#^$# #g' file

How to fix it?

Answered By: Zsolt Botykai ( 258)

Or use this solution with sed:

sed ':a;N;$!ba;s/\n/ /g'

This will read the whole file in a loop, then replaces the newline(s) with a space.

Update: explanation.

  1. create a label via :a
  2. append the current and next line to the pattern space via N
  3. if we are before the last line, branch to the created label $!ba ($! means not to do it on the last line (as there should be one final newline)).
  4. finally the substitution replaces every newline with a space on the pattern space (which is the whole file).

I'm looking for the string "foo=" (without quotes) in text files in a directory tree. It's on a common Linux machine, I have bash shell:

grep -ircl "foo=" *

In the directories are also many binary files which match "foo=". As these results are not relevant and slow down the search, I want grep to skip searching these files (mostly JPEG and PNG images): how would I do that?

I know there are the --exclude=PATTERN and --include=PATTERN options, but what is the pattern format? manpage of grep says:

--include=PATTERN     Recurse in directories only searching file matching PATTERN.
--exclude=PATTERN     Recurse in directories skip file matching PATTERN.

Searching on grep include, grep include exclude, grep exclude and variants did not find anything relevant

If there's a better way of grepping only in certain files, I'm all for it; moving the offending files is not an option, I can't search only certain directories (the directory structure is a big mess, with everything everywhere). Also, I can't install anything, so I have to do with common tools (like grep or the suggested find).

UPDATES: @Adam Rosenfield's answer is just what I was looking for:

grep -ircl --exclude=*.{png,jpg} "foo=" *

@rmeador's answer is also a good solution:

grep -Ir --exclude="*\.svn*" "pattern" *

It searches recursively, ignores binary files, and doesn't look inside Subversion hidden folders.(...)

Answered By: Adam Rosenfield ( 154)

Use the shell globbing syntax:

grep pattern -r --include=\*.{cpp,h} rootdir

The syntax for --exclude is identical.

Note that the star is escaped with a backslash to prevent it from being expanded by the shell (quoting it, such as --include="*.{cpp,h}", would work just as well). Otherwise, if you had any files in the current working directory that matched the pattern, the command line would expand to something like grep pattern -r --include=foo.cpp --include=bar.h rootdir, which would only search files named foo.cpp and bar.h, which is quite likely not what you wanted.

Andy White

I'm debating whether I should learn PowerShell, or just stick with Cygwin/Perl Scripts/Unix Shell scripts, etc.

The benefit of PowerShell would be that the scripts could be more easily used by teammates that don't have cygwin; however, I don't know if I'd really be writing that many general purpose scripts, or if people would even use them.

Unix scripting is so powerful, does PowerShell come close enough to warrant switching over?

EDIT: Here are some of the specific things (or equivalents) I would be looking for in PowerShell:

  • grep
  • sort
  • uniq
  • Perl (how close does PowerShell come to Perl capabilities?)
  • awk
  • sed
  • file (the command that gives file info)
  • etc.
Answered By: Jeffrey Snover - MSFT ( 382)

Tools are just tools.
They help or they don't.
You need help or you don't.

If you know Unix and those tools do what you need them to do on Windows - then you are a happy guy and there is no need to learn PowerShell (unless you want to explore).

My original intent was to include a set of Unix tools in Windows and be done with it (a number of us on the team have deep Unix backgrounds and a healthy dose of respect for that community.) What I found was that this didn't really help much. The reason for that is that awk/grep/sed don't work against COM, WMI, ADSI, the Registry, the cert store, etc, etc. In other words, UNIX is an entire ecosystem self-tuned around text files. As such, text processing tools are effectively management tools. Windows is a completely different ecosystem self-tuned around APIs and Objects. That's why we invented PowerShell.

What I think you'll find is that there will be lots of occasions when text-processing won't get you what you want on Windows. At that point, you'll want to pick up PowerShell. NOTE - it is not an all or nothing deal. Within PowerShell, you can call out to your Unix tools (and use their text process or PowerShell's text processing). Also you can call PowerShell from your Unix tools and get text.

Again - there is no religion here - our focus is on giving you the tools you need to succeed. That is why we are so passionate about feedback. Let us know where we are falling down on the job or where you don't have a tool you need and we'll put it on the list and get to it. In all honesty, we are digging ourselves out of a 30 year hole so it is going to take a while. That said, if you pick up the beta of Windows Server 2008 /R2 and/or the betas of our server products, I think you'll be shocked at how quickly that hole is getting filled.

With regard to usage - we've had > 3.5 million downloads to date. That does not include the people using it in Windows Server 2008 because it is included as an optional component and does not need a download. V2 will ship in all versions of Windows. It will be on-by-default for all editions except Server core where it is an optional component. Shortly after Windows 7/Windows Server 2008 R2 ships, we'll make V2 available on all platforms XP and above. In other words - your investment in learning will be applicable to a very large number of machines/environments.

One last comment. If/when you start to learn PowerShell, I think you'll be pretty happy. Much of the design is heavily influenced by our Unix backgrounds so while we are quite different, you'll pick it up very quickly (after you get over cussing that it isn't Unix :-) ). We know that people have a very limited budget for learning - that is why we are super hard-core about consistency. You are going to learn something and then you'll use it over and over and over again.

Experiment! Enjoy! Engage!

Jeffrey Snover [MSFT] Windows Management Partner Architect