UNIX grep is a command-line tool for searching for text strings inside files. (Do not confuse it with find, which matches file names and properties.) This blog post gives some hints on how to use grep to search files quickly and efficiently.
Example: searching the Plone source tree for “content-core” in page template files
Some notes about grep
- Grep can search multiple files and directory trees
- Grep can be tuned to be faster
- Grep output can be friendly and colorized
As with many UNIX tools, due to legacy and backwards compatibility, grep doesn’t do all of these things out of the box and simply provides you with a plain, bare-bones interface.
1. Install GNU grep
GNU grep supports more options, such as better coloring, than BSD grep, which ships with BSD-based operating systems like OS X. You can install GNU grep from the grep package of MacPorts. See the ztanesh README for an example sudo port install command.
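To find out which grep you currently have, you can look at the first line of its version banner. This is a minimal sketch; the variable name grep_flavor is my own, and it assumes that only GNU grep prints the words "GNU grep" there.

```shell
# Inspect the first line of the version banner.
# GNU grep identifies itself as "grep (GNU grep) x.y"; anything else is treated as "other".
if grep --version 2>/dev/null | head -n 1 | grep -q 'GNU grep'; then
    grep_flavor="gnu"
else
    grep_flavor="other"
fi
echo "grep flavor: $grep_flavor"
```

If this reports "other", the GNU-specific options used later in this post may behave differently or be missing.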
2. Searching multiple files
Below is an example of a case-insensitive (-i), recursive (-R) search of a folder that only includes (--include) .py files. That is, it searches all Python files in the source tree for the string “foobar”:
grep -Ri --include="*.py" foobar ~/code/mixnap/krusovice-src
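If you want to try these options without touching a real project, here is a small self-contained sketch. It builds a throwaway directory (the file names are made up for the demo) and assumes GNU grep. I added -l so that only the names of matching files are printed.

```shell
# Build a throwaway source tree (hypothetical file names) to try the options on.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/pkg"
printf 'def FooBar():\n    pass\n' > "$demo_dir/pkg/views.py"
printf 'foobar in a text file\n' > "$demo_dir/notes.txt"

# -R: recurse, -i: ignore case, -l: print only the names of matching files.
# --include restricts the search to files whose name matches the glob,
# so notes.txt is skipped even though it contains the string.
matches=$(grep -Ril --include="*.py" foobar "$demo_dir")
echo "$matches"

rm -rf "$demo_dir"
```

Only pkg/views.py is listed: the .txt file is filtered out by --include, and -i lets “foobar” match “FooBar”.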
3. Using colors
You can colorize parts of the grep output, such as the file name, line number, highlighted match and the context lines around the result.
Below is my example of setting the GREP_OPTIONS and GREP_COLORS environment variables:
GREP_OPTIONS="--color=always" GREP_COLORS="ms=01;37:mc=01;37:sl=:cx=01;30:fn=35:ln=32:bn=32:se=36"
Note: Use GREP_COLORS, not the deprecated GREP_COLOR environment variable, as the former provides many more options.
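Recent GNU grep releases warn about or ignore GREP_OPTIONS, so it is safer to pass --color on the command line (or through an alias) and keep only GREP_COLORS in the environment. The sketch below assumes GNU grep; the variable names colored and plain exist only for the demonstration.

```shell
# Same color scheme as above; export it so every grep started from this shell sees it.
export GREP_COLORS="ms=01;37:mc=01;37:sl=:cx=01;30:fn=35:ln=32:bn=32:se=36"

# --color=always forces escape sequences even when the output is piped or captured.
colored=$(printf 'hello foobar\n' | grep --color=always foobar)
# --color=never gives the raw line, for comparison.
plain=$(printf 'hello foobar\n' | grep --color=never foobar)

printf '%s\n' "$colored"
```

In an interactive shell, alias grep='grep --color=auto' is usually a better choice than --color=always, because auto only colors output that goes to a terminal.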
4. Search as ASCII
By default, grep decodes incoming text according to the encoding set in the locale environment variables. This takes CPU cycles. If you are searching for a plain ASCII match, as with programming language source code files, you can gain much speed by disabling the decoding. Override the LC_CTYPE environment variable when running grep:
LC_CTYPE=POSIX grep....
The slowdown is a GNU grep bug, fixed in version 2.7.
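To check whether the override makes a difference on your machine, you can run the same search under both locales. This sketch generates its own sample file, and the variable names are made up for the demo. It only confirms that both runs find the same matches; prefix each grep with time in an interactive shell to compare the speed.

```shell
# Generate a sample ASCII file to search (200000 lines, each containing "foobar").
sample=$(mktemp)
seq 1 200000 | sed 's/$/ lorem ipsum foobar/' > "$sample"

# -c counts matching lines, -i ignores case.
# First run uses whatever locale the environment sets, second run forces POSIX.
count_default=$(grep -ci 'FOOBAR' "$sample")
count_posix=$(LC_CTYPE=POSIX grep -ci 'FOOBAR' "$sample")
echo "default locale: $count_default, POSIX locale: $count_posix"

rm -f "$sample"
```

Note that LC_ALL, when set, takes precedence over LC_CTYPE. In that case the override has to be LC_ALL=POSIX instead.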
5. Show lines around the match
You can specify the --before-context and --after-context options, which show the lines around the matching line. Also, --line-number is a very useful switch when dealing with source code files.
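Here is a small sketch of what the context options produce, using a temporary file with made-up content. The short forms of these options are -B, -A and -n.

```shell
# Create a five-line sample file.
src=$(mktemp)
printf 'line one\nline two\ndef foobar():\n    return 1\nline five\n' > "$src"

# Show 1 line before and 1 line after each match, with line numbers.
# Matching lines get ":" after the line number, context lines get "-".
context=$(grep --line-number --before-context=1 --after-context=1 foobar "$src")
printf '%s\n' "$context"
# Prints:
# 2-line two
# 3:def foobar():
# 4-    return 1

rm -f "$src"
```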
6. ZSH shell search alias
This wraps it all together. We define a ZSH function, search, which gives us a shortcut for searching multiple files in a folder tree:
# Search for an ASCII string in multiple files under the current working directory
# E.g.
#   search "foobar" "*.html"
#   search "foobar" "*.html" myfolder
# By default we exclude dotted files and directories (.git, .svn)
function search() {

    if [[ ! -n "$1" ]] ; then
        echo "Usage: search \"pattern\" \"*.filemask\" \"path\""
        return
    fi

    # Keep the path variable out of the global shell namespace
    local search_path

    # Did we get a path arg?
    if [[ ! -n "$3" ]] ; then
        search_path="."
    else
        search_path="$3"
    fi

    # LC_CTYPE="posix" 20x increases performance for ASCII search
    # https://twitter.com/jlaurila/status/86750682094374912

    # We use specially tuned GREP colors - make sure you have GNU grep on OSX
    # https://github.com/miohtama/ztanesh/blob/master/README.rst
    GREP_COLORS="ms=01;37:mc=01;37:sl=:cx=01;30:fn=35:ln=32:bn=32:se=36" LC_CTYPE=POSIX \
    grep -Ri "$1" --line-number --before-context=3 --after-context=3 --color=always --include="$2" --exclude=".*" "$search_path"/*
}
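The function skips dotted entries at the top level because the shell glob "$search_path"/* does not expand to them, and --exclude only filters file names. If you also want to skip version control directories deeper in the tree, GNU grep has a separate --exclude-dir option. The sketch below is my own addition, not part of the function above. It uses a made-up directory layout and assumes GNU grep.

```shell
# Throwaway tree with a .git directory and one template (hypothetical names).
tree=$(mktemp -d)
mkdir -p "$tree/.git" "$tree/templates"
printf 'foobar\n' > "$tree/.git/config"
printf '<p>foobar</p>\n' > "$tree/templates/page.html"

# Without --exclude-dir the repository metadata is searched too: two files match.
all_hits=$(grep -Rl foobar "$tree" | wc -l | tr -d ' ')

# --exclude-dir skips directories with the given names during recursion.
filtered_hits=$(grep -Rl --exclude-dir=".git" --exclude-dir=".svn" foobar "$tree")

echo "all: $all_hits"
echo "filtered: $filtered_hits"

rm -rf "$tree"
```

Naming the directories explicitly is a deliberate choice here: a catch-all pattern such as ".*" can also match the starting directory "." in some grep versions.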
This, and other ZSH goodies, are available in the ztanesh package on GitHub.
7. Turn off OS native file indexing
If you use grep as your primary search tool, I suggest you turn off your operating system's search indexing, such as OS X Spotlight. It just takes disk space and CPU cycles.
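On OS X, Spotlight indexing can be switched off from the command line with mdutil. The sketch below wraps it in a helper whose name, spotlight_off, is my own invention. It only defines the function; nothing changes until you call it yourself, and it needs administrator rights.

```shell
# Hypothetical helper name; mdutil is the OS X command-line tool for managing Spotlight indexes.
spotlight_off() {
    if ! command -v mdutil >/dev/null 2>&1; then
        echo "mdutil not found: this does not look like OS X, nothing to do" >&2
        return 1
    fi
    # -a: apply to all volumes, -i off: disable indexing.
    sudo mdutil -a -i off
}

# Run manually when you are sure:
#   spotlight_off
# Indexing can be re-enabled later with:
#   sudo mdutil -a -i on
```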