# Searching from the Terminal
Finding content is probably the most useful CLI skill to acquire. It comes in handy all the time as a developer: perhaps you're chasing a bug, or just trying to look at a foreign codebase to figure out how a new feature works.
Searching through text files generally falls into two camps: content-based, in which you're searching for specific content; and attribute-based, in which you're searching for files that meet some criteria.
# Universal Approach vs Modern Tooling
The universal approaches for content-based and attribute-based searches are `grep` and `find`, respectively. They'll do in a pinch since they're already installed in most Linux environments.
However, two newer tools are vying to usurp those classics: `ripgrep` (`rg`) and `fd`. Compared to the classic tools, they are both:
- Written in Rust, with release builds that target musl instead of glibc, which makes them fully static binaries. You can copy the same binary onto Ubuntu, CentOS, or Arch and it should just work, with no installation step or dependencies. That is handy when you lack root access.
- Highly concurrent, using all your CPU cores by default rather than just one. This means more speed.
- Aware of git repos: by default they skip your `.git/` folder and anything listed in your `.gitignore` file.
  - No more `find . -type f | egrep -v "\.git" | xargs grep -l "search_pattern"`
- Regex-driven by default (as opposed to fixed strings or globs).
- Colorized by default.
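To make the comparison concrete, here is a small sketch of the classic pipeline run against a throwaway directory. The directory layout and file names are invented for the demo, and `grep -Ev` stands in for `egrep -v`, which newer GNU grep releases flag as obsolescent.

```shell
# Build a throwaway directory tree (all names here are made up for the demo).
demo=$(mktemp -d)
mkdir -p "$demo/.git" "$demo/src"
echo 'search_pattern lives here' > "$demo/src/app.txt"
echo 'search_pattern inside git metadata' > "$demo/.git/packed.txt"
echo 'nothing to see' > "$demo/src/other.txt"

# Classic approach: list files, drop anything under .git, grep what is left.
classic=$(find "$demo" -type f | grep -Ev '/\.git/' | xargs grep -l 'search_pattern')
echo "$classic"

# With ripgrep the hidden .git directory is skipped by default, so the
# rough equivalent would be:
#   rg -l 'search_pattern' "$demo"
```

Note that the classic pipeline breaks on file names containing spaces unless you switch to `find -print0` and `xargs -0`.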
Although `fd` does not allow for the complex querying that `find` allows, it can handle the most common use cases more quickly (speed) and with less typing (ergonomics). Besides, if you really need the complexity of AND, OR, and NOT expressions, then you're probably writing a script rather than using tools ad hoc. In that case, the expressiveness of native Python or Bash scripting will probably supersede the need for `find` expressions anyway.
The tool for attribute-based searches is `fd`. The syntax is:
`fd pattern [path] [-flags and options]`
Does the filename contain this string?
The search pattern is the only argument you normally need when calling `fd`. By default, it uses regular expressions, so you can anchor a search with `'ends_with$'` or use `.` as a wildcard.
What if I'm only interested in Python files?
You can filter by files having a specific extension with the `-e` option. So if you only want Python files, use `-e py`. You could instead add the extension to the search pattern with `'(...)\.py$'`, but it's cleaner and more intuitive to keep the two separate.
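A minimal sketch of the extension filter, assuming `fd` is installed and available on your `PATH` under that name (some Debian and Ubuntu packages install it as `fdfind`). The file names are hypothetical.

```shell
demo=$(mktemp -d)
touch "$demo/test_parser.py" "$demo/parser.py" "$demo/test_notes.md"

# All Python files under the demo directory ("." is a match-anything pattern).
fd -e py . "$demo"

# Python files whose name starts with "test_": pattern and extension kept separate.
fd -e py '^test_' "$demo"

# The same filter folded into a single regex.
fd '^test_.*\.py$' "$demo"
```

The second and third commands should select the same file; the `-e` form simply keeps the regex shorter.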
What files were modified this month?
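Use the rather verbose `--changed-within` option, for example `--changed-within 1w` for files modified in the past week. You can specify time in minutes (`min`), hours (`h`), days (`d`), weeks (`w`), months (`M`), or years (`y`). You can also specify a specific timestamp in the verbose format `YYYY-MM-DD HH:MM:SS`.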
Find the old files
Use the `--changed-before 1d` option, which matches files last modified more than a day ago.
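The two time filters can be tried side by side on a scratch directory. This is a sketch that assumes `fd` is on your `PATH`; the file names are made up, and the old file is backdated with the POSIX `touch -t` timestamp format.

```shell
demo=$(mktemp -d)
touch "$demo/fresh.txt"
# Backdate a file to 1 January 2000 using the [[CC]YY]MMDDhhmm format.
touch -t 200001010000 "$demo/stale.txt"

# Modified within the last week: should list only fresh.txt.
fd --changed-within 1w . "$demo"

# Last modified more than a day ago: should list only stale.txt.
fd --changed-before 1d . "$demo"
```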
How do I avoid binaries? related: How do I find files smaller/larger than X?
Search by size using the `--size` option, or `-S` for short. Text files are usually small, so a common trick to avoid binaries is to search for files smaller than 20 KB with `-S -20k`.
Notice the negative 20? That means "smaller than". To search for files larger than a given size, use a `+`, as in `-S +20k`. Units include `b`, `k`, `m`, `g`, and `t`.
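Here is a quick sketch of the size filter in both directions, assuming `fd` is on your `PATH`. The two files are filled with zero bytes purely to give them a known size; their names are hypothetical.

```shell
demo=$(mktemp -d)
head -c 1024 /dev/zero > "$demo/small.dat"     # 1024 bytes
head -c 51200 /dev/zero > "$demo/large.dat"    # 51200 bytes

# Files smaller than 20 kilobytes: should list small.dat.
fd -S -20k . "$demo"

# Files larger than 20 kilobytes: should list large.dat.
fd -S +20k . "$demo"
```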
How do I only search for files? Directories? Pipes?
To search for files, use `-t f`, where `f` stands for files. To search for directories use `-t d`, for executables `-t x`, for empty files `-t e`, for symlinks `-t l`, for sockets `-t s`, and for named pipes `-t p`.
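The type filters are easy to confuse, so a small sandbox helps. This sketch assumes `fd` is on your `PATH` and uses invented names; note that `fd`'s help lists `l` as the letter for symlinks and `s` for sockets.

```shell
demo=$(mktemp -d)
mkdir "$demo/docs"
echo 'hello' > "$demo/notes.txt"
touch "$demo/empty.log"
ln -s "$demo/notes.txt" "$demo/link-to-notes"

fd -t f . "$demo"   # regular files: notes.txt and empty.log
fd -t d . "$demo"   # directories: docs
fd -t l . "$demo"   # symlinks: link-to-notes
fd -t e . "$demo"   # empty entries, which should include empty.log
```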
How do I exclude files that match a glob pattern?
Use the `-E glob_pattern` option; anything matching the glob is left out of the results.
The tool for content-based searches is `ripgrep`, which is shortened to `rg` in the CLI invocation. The syntax is:
`rg [-flags and options] pattern [path]`
How do I find files that contain a function name?
Just write the function name as the search pattern. By default, the search pattern uses regular expressions. If you need literal strings, for example to avoid escaping code segments, then use the `-F` option.
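The difference between regex and literal matching shows up as soon as the pattern contains parentheses. The following sketch assumes `rg` is on your `PATH`; the file and its contents are made up.

```shell
demo=$(mktemp -d)
printf 'total = price(qty) * 2\n' > "$demo/calc.py"

# Regex search: the parentheses form a group, so this looks for the text
# "priceqty" and should find nothing.
rg 'price(qty)' "$demo"

# Literal search: -F matches the characters exactly as typed.
rg -F 'price(qty)' "$demo"

# Equivalent regex with the parentheses escaped.
rg 'price\(qty\)' "$demo"
```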
How do I include/exclude files from the search?
You can use multiple `-g` options, specifying the globs to include or exclude (prefix a glob with `!` to exclude it). By default, files matched by your `.gitignore`, the `.git` folder, and hidden files/directories are not scanned.
To search among the "hidden files", use `--hidden` (in `rg`, `-H` means `--with-filename`, unlike in `fd`); to also search files that git ignores, use `--no-ignore`. If you want to specify additional files to ignore, you can use the `--ignore-file PATH` option, passing in a gitignore-compatible file.
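As a rough illustration of include and exclude globs plus the hidden-file switch, here is a sketch that assumes `rg` is on your `PATH`. The directory layout, the `load_config` name, and the `.hidden_notes` file are all hypothetical. The commands run from inside the demo directory so the globs are matched against short relative paths.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/tests"
echo 'def load_config(): pass' > "$demo/src/config.py"
echo 'load_config()' > "$demo/tests/test_config.py"
echo 'load_config is documented here' > "$demo/README.md"
echo 'load_config=1' > "$demo/.hidden_notes"

# Only Python files, and nothing under tests/.
(cd "$demo" && rg -l -g '*.py' -g '!tests/**' 'load_config')

# Hidden files such as .hidden_notes are skipped by default...
(cd "$demo" && rg -l 'load_config')

# ...and included once --hidden is given.
(cd "$demo" && rg -l --hidden 'load_config')
```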
How can I view more of the surrounding code?
Use the context option `-C n` to show `n` lines before and after each match. No context lines are shown unless you ask for them.
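A short sketch of the context flags on a five-line sample file, assuming `rg` is on your `PATH`. The file contents are invented; `-B` and `-A` are the before-only and after-only variants of `-C`.

```shell
demo=$(mktemp -d)
printf 'line one\nline two\nneedle here\nline four\nline five\n' > "$demo/sample.txt"

# One line of context on each side of the match.
rg -C 1 'needle' "$demo/sample.txt"

# Two lines before the match and none after it.
rg -B 2 -A 0 'needle' "$demo/sample.txt"
```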
When searching through a very large number of files, it is sometimes helpful to filter on both content and attributes. Start with the content-based search, using `rg`. Once you have perfected your search pattern, get rid of the context and show only the filepaths of the matching files by passing the lowercase L flag, `-l`.
Note that `fd` searches the filesystem and does not read a list of files from standard input, so piping `rg` output into `fd` does not narrow anything. Instead, let `fd` select files by attribute and hand them to `rg` with `-X`. For example:
`fd --changed-within 2d -t f -X rg -l function_name`
# I have a filelist, now what?
Once you have a list of files, you'll probably want to run some command using these files. Maybe you want to read the files with vim? Maybe you're trying to delete these files from your directory? Or copy them?
Executables either take a fixed number of parameters or an unlimited number. Executables like `cp` expect a set number of arguments: the first argument is the original filepath and the second is the new filepath. For commands like that, you'll want to use the lowercase option `-x COMMAND`. If you have 1000 results, then the command will run 1000 times, each time being fed a different file from the list.
Other executables, such as `rm` or `vim`, can be fed the entire filelist at once. Rather than launching 1000 times, they launch once and work through all 1000 files. This is more efficient, when available. To feed the entire list at once, use the uppercase option `-X COMMAND`.
Lowercase x is equivalent to:
`fd . -x rm # same as: rm A; rm B; ...; rm Z`
While uppercase X is equivalent to:
`fd . -X rm # same as: rm A B ... Z`
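To see the difference without deleting anything, you can substitute `echo` for `rm`. This sketch assumes `fd` is on your `PATH`; the file names and the `got:` label are made up.

```shell
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt" "$demo/c.txt"

# -x: one echo process per result, so three separate output lines.
fd -e txt . "$demo" -x echo "got:"

# -X: a single echo process that receives every result, so one output line.
fd -e txt . "$demo" -X echo "got:"
```

Because `-x` and `-X` treat everything after them as the command to run, they go last on the command line.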
# fd Cheatsheet
- `--changed-before [duration or timestamp]`: files older than
- `--changed-within [duration or timestamp]`: files newer than
- `-e extension`: filter by file extension
- `-E glob_pattern`: exclude files matching the glob
- `-S [-/+][n][unit]`: files smaller than/larger than n units
- `-t [f|d|x|e]`: find only files, directories, executables, empty files, ...
- `-x COMMAND`: execute once per result
- `-X COMMAND`: batch execute, once for all results
# rg Cheatsheet
- `-l` (lowercase L): only show matching files, no context
- `-v`: invert match
- `-C n`: how many lines of context to show
- `-F LITERAL_STRING`: match on a literal string
- `-g GLOB_PATTERN`: include/exclude (using `!`) files matching the glob pattern
- `--no-ignore`: don't use VCS ignore files such as `.gitignore`