Top grep Questions

List of Tags
347
Mark Harrison

I would like to grep for a string, but show the preceding 5 lines and following 5 lines as well as the matched line. I'm scanning for errors in a logfile, and want to see the context.

Any clues for the clueless?

Answered By: Pat Notz ( 499)

For BSD or GNU grep you can use -B num to set how many lines before the match and -A num for the number of lines after the match.

grep -B 3 -A 2 foo README.txt

If you want the same amount of lines before and after you can use -C num.

grep -C 3 foo README.txt

This will show 3 lines before and 3 lines after.

281
tylerl

I'm trying to build a list of functions that can be used for arbitrary code execution. The purpose isn't to list functions that should be blacklisted or otherwise disallowed. Rather, I'd like to have a grep-able list of red-flag keywords handy when searching a compromised server for back-doors.

The idea is that if you want to build a multi-purpose malicious PHP script -- such as a "web shell" script like c99 or r57 -- you're going to have to use one or more of a relatively small set of functions somewhere in the file in order to allow the user to execute arbitrary code. Searching for those those functions helps you more quickly narrow down a haystack of tens-of-thousands of PHP files to a relatively small set of scripts that require closer examination.

Clearly, for example, any of the following would be considered malicious (or terrible coding):

<? eval($_GET['cmd']); ?>

<? system($_GET['cmd']); ?>

<? preg_replace('/.*/e',$_POST['code']); ?>

and so forth.

Searching through a compromised website the other day, I didn't notice a piece of malicious code because I didn't realize preg_replace could be made dangerous by the use of the /e flag (which, seriously? Why is that even there?). Are there any others that I missed?

Here's my list so far:

Shell Execute

  • system
  • exec
  • popen
  • backtick operator
  • pcntl_exec

PHP Execute

  • eval
  • preg_replace (with /e modifier)
  • create_function
  • include[_once] / require[_once] (see mario's answer for exploit details)

It might also be useful to have a list of functions that are capable of modifying files, but I imagine 99% of the time exploit code will contain at least one of the functions above. But if you have a list of all the functions capable of editing or outputting files, post it and I'll include it here. (And I'm not counting mysql_execute, since that's part of another class of exploit.)

Answered By: Rook ( 207)

To build this list I used 2 sources. A Study In Scarlet and RATS. I have also added some of my own to the mix and people on this thread have helped out.

Edit: After posting this list I contacted the founder of RIPS and as of now this tools searches PHP code for the use of every function in this list.

Most of these function calls are classified as Sinks. When a tainted variable (like $_REQUEST) is passed to a sink function, then you have a vulnerability. Programs like RATS and RIPS use grep like functionality to identify all sinks in an application. This means that programmers should take extra care when using these functions, but if they where all banned then you wouldn't be able to get much done.

"With great power comes great responsibility."

--Stan Lee

Command Execution

exec           - Returns last line of commands output
passthru       - Passes commands output directly to the browser
system         - Passes commands output directly to the browser and returns last line
shell_exec     - Returns commands output
`` (backticks) - Same as shell_exec()
popen          - Opens read or write pipe to process of a command
proc_open      - Similar to popen() but greater degree of control
pcntl_exec     - Executes a program

PHP Code Execution

Apart from eval there are other ways to execute PHP code: include/require can be used for remote code execution in the form of Local File Include and Remote File Include vulnerabilities.

eval()
assert()  - identical to eval()
preg_replace('/.*/e',...) - /e does an eval() on the match
create_function()
include()
include_once()
require()
require_once()
$_GET['func_name']($_GET['argument']);
$func = new ReflectionFunction($_GET['func_name']); $func->invoke(); or $func->invokeArgs(array());

List of functions which accept callbacks

These functions accept a string parameter which could be used to call a function of the attacker's choice. Depending on the function the attacker may or may not have the ability to pass a parameter. In that case an Information Disclosure function like phpinfo() could be used.

Function                     => Position of callback arguments
'ob_start'                   =>  0,
'array_diff_uassoc'          => -1,
'array_diff_ukey'            => -1,
'array_filter'               =>  1,
'array_intersect_uassoc'     => -1,
'array_intersect_ukey'       => -1,
'array_map'                  =>  0,
'array_reduce'               =>  1,
'array_udiff_assoc'          => -1,
'array_udiff_uassoc'         => array(-1, -2),
'array_udiff'                => -1,
'array_uintersect_assoc'     => -1,
'array_uintersect_uassoc'    => array(-1, -2),
'array_uintersect'           => -1,
'array_walk_recursive'       =>  1,
'array_walk'                 =>  1,
'assert_options'             =>  1,
'uasort'                     =>  1,
'uksort'                     =>  1,
'usort'                      =>  1,
'preg_replace_callback'      =>  1,
'spl_autoload_register'      =>  0,
'iterator_apply'             =>  1,
'call_user_func'             =>  0,
'call_user_func_array'       =>  0,
'register_shutdown_function' =>  0,
'register_tick_function'     =>  0,
'set_error_handler'          =>  0,
'set_exception_handler'      =>  0,
'session_set_save_handler'   => array(0, 1, 2, 3, 4, 5),
'sqlite_create_aggregate'    => array(2, 3),
'sqlite_create_function'     =>  2,

Information Disclosure

Most of these function calls are not sinks. But rather it maybe a vulnerability if any of the data returned is viewable to an attacker. If an attacker can see phpinfo() it is definitely a vulnerability.

phpinfo
posix_mkfifo
posix_getlogin
posix_ttyname
getenv
get_current_user
proc_get_status
get_cfg_var
disk_free_space
disk_total_space
diskfreespace
getcwd
getlastmo
getmygid
getmyinode
getmypid
getmyuid

Other

extract - Opens the door for register_globals attacks (see study in scarlet).
parse_str -  works like extract if only one argument is given.  
putenv
ini_set
mail - has CRLF injection in the 3rd parameter, opens the door for spam. 
header - on old systems CRLF injection could be used for xss or other purposes, now it is still a problem if they do a header("location: ..."); and they do not die();. The script keeps executing after a call to header(), and will still print output normally. This is nasty if you are trying to protect an administrative area. 
proc_nice
proc_terminate
proc_close
pfsockopen
fsockopen
apache_child_terminate
posix_kill
posix_mkfifo
posix_setpgid
posix_setsid
posix_setuid

Filesystem Functions

According to RATS all filesystem functions in php are nasty. Some of these don't seem very useful to the attacker. Others are more useful than you might think. For instance if allow_url_fopen=On then a url can be used as a file path, so a call to copy($_GET['s'], $_GET['d']); can be used to upload a PHP script anywhere on the system. Also if a site is vulnerable to a request send via GET everyone of those file system functions can be abused to channel and attack to another host through your server.

// open filesystem handler
fopen
tmpfile
bzopen
gzopen
SplFileObject->__construct
// write to filesystem (partially in combination with reading)
chgrp
chmod
chown
copy
file_put_contents
lchgrp
lchown
link
mkdir
move_uploaded_file
rename
rmdir
symlink
tempnam
touch
unlink
imagepng   - 2nd parameter is a path.
imagewbmp  - 2nd parameter is a path. 
image2wbmp - 2nd parameter is a path. 
imagejpeg  - 2nd parameter is a path.
imagexbm   - 2nd parameter is a path.
imagegif   - 2nd parameter is a path.
imagegd    - 2nd parameter is a path.
imagegd2   - 2nd parameter is a path.
iptcembed
ftp_get
ftp_nb_get
// read from filesystem
file_exists
file_get_contents
file
fileatime
filectime
filegroup
fileinode
filemtime
fileowner
fileperms
filesize
filetype
glob
is_dir
is_executable
is_file
is_link
is_readable
is_uploaded_file
is_writable
is_writeable
linkinfo
lstat
parse_ini_file
pathinfo
readfile
readlink
realpath
stat
gzfile
readgzfile
getimagesize
imagecreatefromgif
imagecreatefromjpeg
imagecreatefrompng
imagecreatefromwbmp
imagecreatefromxbm
imagecreatefromxpm
ftp_put
ftp_nb_put
exif_read_data
read_exif_data
exif_thumbnail
exif_imagetype
hash_file
hash_hmac_file
hash_update_file
md5_file
sha1_file
highlight_file
show_source
php_strip_whitespace
get_meta_tags
221
Ortwin Gentz

I have deleted a file or some code in a file sometime in the past. Can I grep in the content (not in the commit messages)?

A very poor solution is to grep the log:

git log -p | grep <pattern>

However this doesn't return the commit hash straight away. I played around with git grep to no avail.

Answered By: Jeet ( 310)

To search for commit content (i.e., actual lines of source, as opposed to commit messages and the like), what you need to do is:

git grep <regexp> $(git rev-list --all)

This will grep through all your commit text for regexp.

Here are some other useful ways of searching your source:

Search working tree for text matching regular expression regexp:

git grep <regexp>

Search working tree for lines of text matching regular expression regexp1 or regexp2:

git grep -e <regexp1> [--or] -e <regexp2>

Search working tree for lines of text matching regular expression regexp1 and regexp2, reporting file paths only:

git grep -e <regexp1> --and -e <regexp2>

Search working tree for files that have lines of text matching regular expression regexp1 and lines of text matching regular expression regexp2:

git grep -l --all-match -e <regexp1> -e <regexp2>

Search all revisions for text matching regular expression regexp:

git grep <regexp> $(git rev-list --all)

Search all revisions between rev1 and rev2 for text matching regular expression regexp:

git grep <regexp> $(git rev-list <rev1>..<rev2>)

If you run OS X, I have written an OS X Dashboard Widget summarizing this (and other Git commands) here.

188
Piskvor

I'm looking for the string "foo=" (without quotes) in text files in a directory tree. It's on a common Linux machine, I have bash shell:

grep -ircl "foo=" *

In the directories are also many binary files which match "foo=". As these results are not relevant and slow down the search, I want grep to skip searching these files (mostly JPEG and PNG images): how would I do that?

I know there are the --exclude=PATTERN and --include=PATTERN options, but what is the pattern format? manpage of grep says:

--include=PATTERN     Recurse in directories only searching file matching PATTERN.
--exclude=PATTERN     Recurse in directories skip file matching PATTERN.

Searching on grep include, grep include exclude, grep exclude and variants did not find anything relevant

If there's a better way of grepping only in certain files, I'm all for it; moving the offending files is not an option, I can't search only certain directories (the directory structure is a big mess, with everything everywhere). Also, I can't install anything, so I have to do with common tools (like grep or the suggested find).

UPDATES: @Adam Rosenfield's answer is just what I was looking for:

grep -ircl --exclude=*.{png,jpg} "foo=" *

@rmeador's answer is also a good solution:

grep -Ir --exclude="*\.svn*" "pattern" *

It searches recursively, ignores binary files, and doesn't look inside Subversion hidden folders.(...)

Answered By: Adam Rosenfield ( 154)

Use the shell globbing syntax:

grep pattern -r --include=\*.{cpp,h} rootdir

The syntax for --exclude is identical.

Note that the star is escaped with a backslash to prevent it from being expanded by the shell (quoting it, such as --include="*.{cpp,h}", would work just as well). Otherwise, if you had any files in the current working directory that matched the pattern, the command line would expand to something like grep pattern -r --include=foo.cpp --include=bar.h rootdir, which would only search files named foo.cpp and bar.h, which is quite likely not what you wanted.

151
Portman

Any recommendations on grep tools for Windows? Ideally ones that could leverage 64-bit OS.

I'm aware of cygwin, of course, and have also found PowerGREP, but I'm wondering if there are any hidden gems out there.

Answered By: Mark Biek ( 48)

Based on recommendations in the comments, I've started using grepWin and it's fantastic and free.


(I'm still a fan of PowerGREP but I don't use it anymore.)

I know you already mentioned it but PowerGREP is awesome.

Some of my favorite features are:

  • Right-click on a folder to run PowerGREP on it
  • Use Regular Expressions or literal text
  • Specify wildcards for files to include & exclude
  • Search & replace
    • Preview mode is nice because you can make sure you're replacing what you intend to.

Now I realize that the other grep tools can do all of the above. It's just that PowerGREP packages all of the functionality into a very easy-to-use gui.

From the same wonderful folks who brought you RegexBuddy and who I have no affiliation with beyond loving their stuff.