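Perl Language
Chapter 9: Processing Text with Regular Expressions

Outline
1. Substitution with s///
2. Related functions
3. m// in list context
4. More powerful regular expressions

9.1 Substitution with s///

m// works like a "search" feature; s/// works like "search and replace".

    $_ = "He's out bowling with Barney tonight.";
    s/Barney/Fred/;                     # Barney is replaced with Fred
    print "$_\n";
    s/with (\w+)/against $1's team/;
    print "$_\n";                       # "He's out bowling against Fred's team tonight."

The replacement may use capture variables, may be empty, and leaves the string unchanged when the pattern does not match:

    $_ = "green scaly dinosaur";
    s/(\w+) (\w+)/$2, $1/;     # now "scaly, green dinosaur"
    s/^/huge, /;               # now "huge, scaly, green dinosaur"
    s/,.*een//;                # empty replacement: now "huge dinosaur"
    s/green/red/;              # match fails: still "huge dinosaur"
    s/\w+$/($`!)$&/;           # now "huge (huge !)dinosaur"
    s/\s+(!\W+)/$1 /;          # now "huge (huge!) dinosaur"
    if (s/huge/gigantic/) {    # s/// returns a Boolean: true if it substituted
        print "It matched!\n"; # $_ is now "gigantic (huge!) dinosaur"
    }

Global replacement with /g

s/// performs only one substitution, even if other parts of the string could also match. The /g modifier tells s/// to replace every non-overlapping match:

    $_ = "home, sweet home!";
    s/home/cave/g;
    print "$_\n";              # "cave, sweet cave!"

Application: using s///g to collapse whitespace in a string.

    $_ = "Input  data\t may have extra whitespace.";
    s/\s+/ /g;                 # now "Input data may have extra whitespace."
    s/^\s+//;                  # remove leading whitespace
    s/\s+$//;                  # remove trailing whitespace
    s/^\s+|\s+$//g;            # remove leading and trailing whitespace in one step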
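The whitespace clean-up and the return value of s/// can be tried together in a small script. This is only a sketch: the helper name squeeze_whitespace and the sample strings are assumptions made for illustration, not part of the slides.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): collapse runs of whitespace
# and trim both ends, returning the cleaned copy.
sub squeeze_whitespace {
    my ($text) = @_;
    $text =~ s/\s+/ /g;         # every run of whitespace becomes one space
    $text =~ s/^\s+|\s+$//g;    # strip leading and trailing whitespace
    return $text;
}

my $line  = "  Input  data\t may have   extra whitespace.  \n";
my $clean = squeeze_whitespace($line);
print "[$clean]\n";             # [Input data may have extra whitespace.]

# With /g, s/// returns the number of substitutions made (false if none).
my $greeting = "home, sweet home!";
my $count    = ($greeting =~ s/home/cave/g);
print "$count replacements: $greeting\n";   # 2 replacements: cave, sweet cave!
```

Binding with =~ lets the same operators work on any variable, not only on $_.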
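Different delimiters

Just as with m// and qw//, the delimiters of s/// can be changed. Because substitution uses three delimiter positions, the rules differ slightly: a non-paired character is written three times, while paired delimiters need one pair around the pattern and another around the replacement. The two pairs do not have to be the same kind.

    s#^https://#http://#;
    s{fred}{barney};
    s[fred](barney);
    s<fred>#barney#;

Substitution modifiers

Besides /g, substitution also accepts /i, /x, and /s, the same modifiers already seen in ordinary pattern matching. They apply to the pattern part, and their order does not matter.

    s#wilma#Wilma#gi;      # replace every WiLmA (any case) with Wilma
    s{__END__.*}{}s;       # remove the __END__ marker and everything after it

Non-destructive substitution

To keep both the original string and the modified version, the traditional approach is to make a copy first and then run the substitution on the copy.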
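The alternate delimiters, the modifiers, and the copy-first idiom described above can be sketched in one short script. The URL, the date and the other sample strings are hypothetical values chosen for illustration.

```perl
use strict;
use warnings;

# Alternate delimiters: no need to escape the slashes in a URL.
my $url = "https://www.example.com/index.html";   # hypothetical URL
$url =~ s#^https://#http://#;

# /g and /i together; modifier order does not matter (gi or ig).
my $names = "wilma, WILMA and WiLmA";
$names =~ s{wilma}{Wilma}ig;

# /x allows whitespace and comments inside the pattern.
my $date = "2024-01-15";                          # hypothetical date
$date =~ s{
    (\d{4}) - (\d{2}) - (\d{2})    # year, month, day
}{$3/$2/$1}x;

# /s lets . match newlines, so .* runs to the end of the string.
my $script = "print 1;\n__END__\nnotes\nmore notes\n";
$script =~ s{__END__.*}{}s;

# Copy first, then substitute: $original stays untouched.
my $original = "Fred ate 1 rib";
my $copy     = $original;
$copy =~ s/\d+ ribs?/10 ribs/;

print "$url\n$names\n$date\n$script$original\n$copy\n";
```

Paired delimiters such as braces tend to read best when the pattern or the replacement contains slashes.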
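By default, s/// returns the number of successful substitutions. With the /r modifier (Perl 5.14 and later) it leaves the original string unchanged and returns the modified copy instead.

    my $original = 'Fred ate 1 rib';
    my $copy = $original;
    $copy =~ s/\d+ ribs?/10 ribs/;

    (my $copy = $original) =~ s/\d+ ribs?/10 ribs/;    # assign first, then substitute
    my $copy = $original =~ s/\d+ ribs?/10 ribs/r;     # substitute first, then assign

Case shifting

The \U escape forces all following characters to uppercase; the \L escape forces all following characters to lowercase. \E ends the effect.

    $_ = "I saw Barney with Fred.";
    s/(fred|barney)/\U$1/gi;                # $_ is now "I saw BARNEY with FRED."
    s/(fred|barney)/\L$1/gi;                # $_ is now "I saw barney with fred."
    s/(\w+) with (\w+)/\U$2\E with $1/i;    # $_ is now "I saw FRED with barney."

The \u escape uppercases only the next character; the \l escape lowercases only the next character.

    s/(fred|barney)/\u$1/ig;      # $_ is now "I saw FRED with Barney."
    s/(fred|barney)/\u\L$1/ig;    # $_ is now "I saw Fred with Barney."

These escapes also work in any double-quoted string:

    print "Hello, \L\u$name\E, would you like to play a game?\n";

9.2 Related functions

The split operator breaks a string apart according to a pattern:

    my @fields = split /separator/, $string;

split scans the string for the pattern. Wherever the pattern matches, the current field ends and the next one begins. The matched text itself never appears in the returned fields.

    @fields = split /:/, "abc:def:g:h";     # gives ("abc", "def", "g", "h")
    @fields = split /:/, "abc:def::g:h";    # gives ("abc", "def", "", "g", "h")

Two adjacent separators produce an empty field. Leading empty fields are returned, but trailing empty fields are discarded.

    @fields = split /:/, ":::a:b:c:::";     # gives ("", "", "", "a", "b", "c")

    my $some_input = "This  is a \t        test.\n";
    my @args = split /\s+/, $some_input;    # ("This", "is", "a", "test.")

By default, split operates on $_ and splits on whitespace:

    my @fields = split;                     # like split /\s+/, $_ except that leading whitespace is skipped
    my @fields = split //, 'abcdef';        # break the string into single characters

The join function does not use a pattern. It does the opposite of split: it glues the pieces back into a single string. The first argument is the glue, which may be any string; the remaining arguments are the pieces. join puts the glue between the pieces and returns the result:

    my $x = join ":", 4, 6, 8, 10, 12;      # $x is "4:6:8:10:12"

Glue appears only when there are at least two pieces to join:

    my $y = join "foo", "bar";              # gives just "bar"
    my @empty;                              # empty array
    my $empty = join "baz", @empty;         # no elements, so the result is the empty string

    my $x = join ":", 4, 6, 8, 10, 12;
    my @values = split /:/, $x;             # @values is (4, 6, 8, 10, 12)
    my $z = join "-", @values;              # $z is "4-6-8-10-12"

9.3 m// in list context

In list context, a successful match returns the list of captured groups:

    $_ = "Hello there, neighbor!";
    my ($first, $second, $third) = /(\S+) (\S+), (\S+)/;
    print "$second is my $third\n";

The /g modifier works with m// as well as with s///. It lets the pattern match at multiple places in the string, and in list context returns the captures from every match:

    my $data = "Barney Rubble Fred Flintstone Wilma Flintstone";
    my %last_name = ($data =~ /(\w+)\s+(\w+)/g);

9.4 More powerful regular expressions

Greedy quantifiers: *, +, ?, {3,5}
By default Perl matches the longest string it can while still letting the overall pattern succeed.

Matching "fred and barney went bowling last night" with /fred.+barney/: .+ first runs to the end of the string, then the engine backtracks one character at a time until barney can match.

Non-greedy quantifiers: *?, +?, ??, {3,5}?
These prefer the shortest string that still lets the overall pattern succeed.

Matching "fred and barney went bowling last night" with /fred.+?barney/: .+? starts with a single character and extends only as far as needed, so far less backtracking is required here.

The speed of a regular expression depends on the actual data.

The difference between greedy and non-greedy quantifiers:

    I'm talking about the cartoon with Fred and <BOLD>Wilma</BOLD>!
    s#<BOLD>(.*)</BOLD>#$1#g;     # fine here: there is only one pair of tags

    I thought you said Fred and <BOLD>Velma</BOLD>, not <BOLD>Wilma</BOLD>
    s#<BOLD>(.*?)</BOLD>#$1#g;    # non-greedy: each pair of tags is removed separately

With the greedy version, the second string would lose only its first <BOLD> and its last </BOLD>, leaving the inner tags in place.

Multi-line matching and substitution

With the /m modifier, ^ and $ also match at the start and end of each line inside a multi-line string, so a substitution can be applied to every line:

    my $filename = 'ex9.txt';
    open(FILE, '<', $filename) or die "Can't open '$filename': $!";
    my $lines = join '', <FILE>;
    $lines =~ s/^/$filename: /gm;    # prefix every line with the file name
    print $lines;

Chapter summary
Master: substitution; split; join
Be familiar with: m// in list context; non-greedy quantifiers
Be aware of: multi-line substitution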
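As a review exercise, the tools from sections 9.1 to 9.4 can be combined in one script. This is a sketch under stated assumptions: the record format, the helper name format_record and the sample data are invented for illustration, and the /r modifier requires Perl 5.14 or later.

```perl
use strict;
use warnings;

# Hypothetical colon-separated records: "first last:team:score"
my $records = "fred flintstone:bowling:225\nbarney rubble:bowling:190\n";

# Hypothetical helper: turn one record line into a display string.
sub format_record {
    my ($line) = @_;
    my ($name, $team, $score) = split /:/, $line;
    # \u\L capitalises each word; /r returns the changed copy.
    my $pretty = $name =~ s/(\w+)/\u\L$1/gr;
    return join " | ", $pretty, "\U$team\E", $score;
}

# split drops the trailing empty field left by the final newline.
my @lines = split /\n/, $records;
my @out   = map { format_record($_) } @lines;
print "$_\n" for @out;
# Fred Flintstone | BOWLING | 225
# Barney Rubble | BOWLING | 190

# m//g in list context returns all captures; pairs can fill a hash.
my %score_of = $records =~ /^(\w+) \w+:\w+:(\d+)$/gm;
print "fred scored $score_of{fred}\n";      # fred scored 225

# Greedy versus non-greedy on the same markup.
my $markup = "<b>Fred</b> and <b>Wilma</b>";
(my $greedy = $markup) =~ s{<b>(.*)</b>}{$1}g;    # "Fred</b> and <b>Wilma"
(my $lazy   = $markup) =~ s{<b>(.*?)</b>}{$1}g;   # "Fred and Wilma"
print "$greedy\n$lazy\n";

# /m makes ^ match at the start of every line.
(my $tagged = join "\n", @out) =~ s/^/ex9.txt: /gm;
print "$tagged\n";
```

Keeping each transformation in its own small substitution makes it easier to see which modifier is responsible for which effect.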