
Training-modules is maintained by hbctraining.
These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. This lesson has been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC).

grep is used for simple patterns and basic regular expressions (BREs); egrep can handle extended regular expressions (EREs).

1) Use grep to find all matches in catch.txt that start with “B” and have a “T” anywhere in the string after the “B”.
2) Use grep to find all matches in catch.txt that don’t start with “C” and don’t end with “H”.
3) Use grep to find all matches in catch.txt that have at least two “A”s in them.

There are two principles that we should discuss more: the -E option and the use of quotation marks.

The -E option allows the user to use what are considered “extended regular expressions”. We won’t use too many of these types of regular expressions, and we will point them out when we need them. Making it a habit to always use the -E option when using regular expressions in grep is a bit safer.

When using grep it is usually not required to put your search term in quotes. However, for certain types of searches it is better or safer to wrap your search term in quotation marks, most likely double quotation marks. Let’s briefly discuss the differences.

No quotation: If your search contains whitespace (spaces or tabs), grep will treat the expression before the whitespace as the search term and the expression(s) after the whitespace as file(s). As a result, if your search term doesn’t have whitespace it doesn’t matter whether you quote it, but if it does, an unquoted search won’t behave the way you’d like.

The big advantage of using quotation marks, single or double, with grep is that they allow you to use search expressions with whitespace in them. However, within bash, single quotation marks (') are interpreted literally, meaning that the expression within them is interpreted by bash EXACTLY the way it is written. Notably, bash variables within single quotation marks are NOT expanded. That is, if you had a variable named at that holds AT, a single-quoted search for $at would pass the literal characters $at to grep rather than AT.

grep never “sees” the quotation marks: they are interpreted by bash first, and the result is then passed to grep.
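The quoting rules and the -E option discussed above can be tried out with a short sketch. The file sample.txt and its contents are hypothetical stand-ins (the lesson's catch.txt is not reproduced here), and the variable at follows the lesson's example of a variable holding AT. The behaviour shown assumes bash and a POSIX-style grep.

```shell
# Hypothetical sample data in a temporary directory (not the lesson's catch.txt).
workdir=$(mktemp -d)
f="$workdir/sample.txt"
printf 'CAT HAT\nBAT\n$at\nGC\n' > "$f"

at=AT

# No quotation: grep searches for CAT and treats HAT as a (missing) file name.
unquoted=$(grep CAT HAT "$f" 2>/dev/null || true)

# Quotation marks keep the whitespace inside a single search term.
quoted=$(grep "CAT HAT" "$f")

# Double quotes: bash expands $at to AT before grep runs.
double=$(grep "$at" "$f")

# Single quotes: bash passes the literal text $at to grep.
single=$(grep '$at' "$f")

# With -E, | means "or"; without it, grep looks for the literal text BAT|GC.
extended=$(grep -E 'BAT|GC' "$f")
basic_count=$(grep -c 'BAT|GC' "$f" || true)

printf '%s\n' "$unquoted" "$quoted" "$double" "$single" "$extended" "$basic_count"
rm -rf "$workdir"
```

In this sketch the double-quoted search returns the lines containing AT, while the single-quoted search returns only the line that literally contains $at.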

Here, you can see that we have a variety of case differences and misspellings. These differences are not exhaustive, but they will be helpful in exploring how regular expressions are implemented in grep.

If you want to find all words with the letters “qu” in them, search for the pattern qu. If you want to find all words containing the letter “n”, search for n; to find only words containing the pattern “nn”, search for nn. To match zero or more occurrences of any character in the file list, use the pattern .* (a period followed by an asterisk).
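The searches described above might look like the following sketch. The original commands are not shown in this text, so the patterns are assumptions that simply spell out each description, and the word list is hypothetical (one word per line, so that each matching line is one matching word).

```shell
# Hypothetical word list standing in for the file "list".
workdir=$(mktemp -d)
list="$workdir/list"
printf 'queen\nbanner\nnoon\ncat\nquit\n' > "$list"

# Words containing the letters "qu".
with_qu=$(grep 'qu' "$list")

# Words containing the letter "n".
with_n=$(grep 'n' "$list")

# Words containing the pattern "nn".
with_nn=$(grep 'nn' "$list")

# Zero or more occurrences of any character: every line matches.
any_count=$(grep -c '.*' "$list")

printf '%s\n' "$with_qu" "$with_n" "$with_nn" "$any_count"
rm -rf "$workdir"
```

With this sample, the “n” search returns queen, banner, and noon, while the “nn” search returns only banner.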

See Searching for Metacharacters for more information on escaping metacharacters.

A caret (^) metacharacter indicates the beginning of the line. For example, the pattern ^b finds any line in the file list that starts with the letter b.

A dollar sign ($) metacharacter indicates the end of the line. The pattern b$ displays any line in which b is the last character on the line, and ^b$ displays any line in the file list where b is the only character on the line.

A period (.) matches any single character. The pattern an. matches any three-character string with “an” as the first two characters, including “any,” “and,” “management,” and “plan” (because spaces count, too).

When an asterisk (*) follows a character, grep interprets the asterisk as “zero or more instances of that character.” When the asterisk follows a regular expression, grep interprets the asterisk as “zero or more instances of characters matching the pattern.” Because it includes zero occurrences, the asterisk can create confusing command output.
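The metacharacters described above can be explored with a small sketch. The file contents are hypothetical, and the -o option (print only the matched text) is an extension available in GNU and BSD grep rather than something the original text uses.

```shell
# Hypothetical sample file standing in for the file "list".
workdir=$(mktemp -d)
list="$workdir/list"
printf 'b\nbob\ncrab\nany plan works\ncan\n' > "$list"

# Caret: lines that start with b.
starts_b=$(grep '^b' "$list")

# Dollar sign: lines that end with b.
ends_b=$(grep 'b$' "$list")

# Both anchors: lines where b is the only character.
only_b=$(grep '^b$' "$list")

# Period: "an" plus any one character. The space after "plan" counts,
# but "can" has nothing after "an" and so does not match.
an_dot=$(grep -o 'an.' "$list")

# Asterisk: z* also matches zero z's, so every line counts as a match.
z_star_count=$(grep -c 'z*' "$list")

printf '%s\n' "$starts_b" "$ends_b" "$only_b" "$an_dot" "$z_star_count"
rm -rf "$workdir"
```

The last search illustrates the confusing output mentioned above: no line contains a z, yet all five lines match because zero occurrences are allowed.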

You can also use the grep command to search for targets that are defined as patterns by using regular expressions. Regular expressions consist of letters and numbers, in addition to characters with special meaning to grep. These special characters, called metacharacters, also have special meaning to the system.

When you use regular expressions with the grep command, you need to tell your system to ignore the special meaning of these metacharacters by escaping them. When you use a grep regular expression at the command prompt, surround the regular expression with quotes.
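As a rough illustration of escaping and quoting metacharacters, the sketch below searches a hypothetical file for text that contains a period, an asterisk, and a dollar sign. The file name and contents are assumptions made for the example, and the behaviour shown assumes bash and basic regular expressions.

```shell
# Hypothetical sample file containing characters that are also metacharacters.
workdir=$(mktemp -d)
f="$workdir/meta.txt"
printf 'a.b\naxb\n2*3\ncost $5\n' > "$f"

# Unescaped period: matches any single character, so both a.b and axb match.
dot_any=$(grep 'a.b' "$f")

# Escaped period inside quotes: matches only a literal period.
dot_literal=$(grep 'a\.b' "$f")

# Without quotes, bash removes the backslash before grep sees the pattern,
# so grep receives a.b and the period is a metacharacter again.
dot_unquoted=$(grep a\.b "$f")

# Escaped asterisk and dollar sign are treated as ordinary characters.
star_literal=$(grep '2\*3' "$f")
dollar_literal=$(grep '\$5' "$f")

printf '%s\n' "$dot_any" "$dot_literal" "$dot_unquoted" "$star_literal" "$dollar_literal"
rm -rf "$workdir"
```

Comparing the second and third searches shows why the quotes matter: the same backslash only reaches grep when the pattern is quoted.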
