Lecture 21: Regular expressions. Why we need ways of describing patterns of strings, and not just specific strings. The syntax and semantics of regular expressions: constants, concatenation, alternation, repetition. Back-references and capture groups. Splitting on regular expressions. grep and grepl for finding which strings contain a match. regexpr and gregexpr for locating matches within strings, regmatches for extracting the matching substrings. regexec for capture groups. Examples of multi-stage processing with regular expressions. Examples of substitutions with regmatches, sub and gsub. Things you cannot do with regular expressions.
Examples: Lincoln's 2nd inaugural address; baking brownies with Ms. Alice B. Toklas; extracting earthquake information from data files.
Readings: Matloff, chapter 11; R Cookbook, chapter 7; Spector, chapter 7. Optional readings: Bradnam and Korf, sections 4.26--4.28, 5.3, 6.1.
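For readers who want a feel for the functions named in the summary before opening the readings, here is a minimal sketch in base R. The sample strings and variable names are invented for illustration and are not the lecture's data; the patterns are just one plausible way to pick out numbers.

```r
# Hypothetical sample strings, made up for this sketch (not the lecture's data)
x <- c("Magnitude 5.2 at depth 10 km",
       "no numbers here",
       "Magnitude 6.1 at depth 33 km")
num <- "[0-9]+(\\.[0-9]+)?"   # an integer, optionally followed by a decimal part

# grep gives the indices of elements containing a match; grepl gives TRUE/FALSE
which_match <- grep("[0-9]", x)
has_match   <- grepl("[0-9]", x)

# regexpr locates the first match in each string; regmatches extracts it
# (elements with no match are dropped from the result)
first_num <- regmatches(x, regexpr(num, x))

# gregexpr locates every match; regmatches then returns one vector per string
all_nums <- regmatches(x, gregexpr(num, x))

# regexec also records capture groups: whole match first, then each group
caps <- regmatches(x, regexec("Magnitude ([0-9.]+) at depth ([0-9]+) km", x))

# sub replaces only the first match, gsub replaces all of them
one_sub <- sub("[0-9]+", "N", x[1])
all_sub <- gsub("[0-9]+", "N", x[1])

# Back-references in the replacement reuse captured groups
swapped <- gsub("([a-z]+) ([a-z]+)", "\\2 \\1", "score four")

# strsplit splits on a regular expression, here runs of spaces and commas
words <- strsplit("four score, and seven", "[ ,]+")[[1]]
```

Note that regmatches on a regexpr result silently drops non-matching strings, so first_num is shorter than x, while the gregexpr and regexec versions keep one (possibly empty) entry per input string.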
Posted at November 06, 2013 10:30