>>> import string
>>> s='hello rollen , how are you '
>>> string.capwords(s)
'Hello Rollen , How Are You'  # capitalizes the first letter of each word
>>> string.split(s)
['hello', 'rollen', ',', 'how', 'are', 'you']  # splits into a list, on whitespace by default
>>> s='1+2+3'
>>> string.split(s,'+')  # splits on '+'
['1', '2', '3']
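The maketrans() function creates a translation table that translate() can use to change a whole set of characters in one pass, which is more efficient than calling replace() repeatedly.
For example, the following changes 'a' to '1', 'b' to '2', and 'c' to '3' in the string s. Characters that are not in the table, including uppercase letters, are left unchanged: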
>>> leet=string.maketrans('abc','123')
>>> s='abcdef'
>>> s.translate(leet)
'123def'
>>> leet=string.maketrans('abc','123')
>>> s='aAaBcC'
>>> s.translate(leet)
'1A1B3C'
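The session above is Python 2, where maketrans() lives in the string module. As a minimal sketch, assuming Python 3 (where string.maketrans no longer exists), the same table can be built with str.maketrans. The variable names and the 'leet speak' sample are illustrative, not from the original post.

```python
# Python 3 sketch: str.maketrans takes the place of string.maketrans
leet = str.maketrans('abc', '123')

lower_result = 'abcdef'.translate(leet)
mixed_result = 'aAaBcC'.translate(leet)

print(lower_result)
print(mixed_result)

# A dict can also describe the table; mapping a character to None deletes it
table = str.maketrans({'a': '4', 'e': '3', ' ': None})
squeezed = 'leet speak'.translate(table)
print(squeezed)
```

Only the characters listed in the table are touched, so the uppercase letters in the second string pass through unchanged, just as in the Python 2 session.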
A small example of Template from the string module, compared with ordinary % interpolation:

import string

values = { 'var':'foo' }

t = string.Template("""
Variable : $var
Escape : $$
Variable in text: ${var}iable
""")

print 'TEMPLATE:', t.substitute(values)

s = """
Variable : %(var)s
Escape : %%
Variable in text: %(var)siable
"""

print 'INTERPOLATION:', s % values

The output of the example above is:
TEMPLATE:
Variable : foo
Escape : $
Variable in text: fooiable
INTERPOLATION:
Variable : foo
Escape : %
Variable in text: fooiable
However, substitute() raises an exception when a value needed by the template is not provided. A safer alternative is safe_substitute(), as shown here:

import string

values = { 'var':'foo' }

t = string.Template("$var is here but $missing is not provided")

try:
    print 'substitute() :', t.substitute(values)
except KeyError, err:
    print 'ERROR:', str(err)

print 'safe_substitute():', t.safe_substitute(values)

The output of the example above is:
substitute() : ERROR: 'missing'
safe_substitute(): foo is here but $missing is not provided
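The same contrast can be sketched for Python 3 (an assumption; the original code is Python 2). The sketch also shows that substitute() accepts keyword arguments in addition to the mapping, which is one way to supply a missing value. The names strict, lenient, and complete are illustrative.

```python
from string import Template

t = Template("$var is here but $missing is not provided")
values = {'var': 'foo'}

# substitute() is strict: a missing key raises KeyError
try:
    strict = t.substitute(values)
except KeyError as err:
    strict = 'ERROR: %s' % err

# safe_substitute() leaves unknown placeholders untouched
lenient = t.safe_substitute(values)

# Keyword arguments can supply individual values alongside the mapping
complete = t.substitute(values, missing='bar')

print(strict)
print(lenient)
print(complete)
```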
Now for a more advanced use of Template:

import string

template_text = '''
Delimiter : %%
Replaced : %with_underscore
Ignored : %notunderscored
'''

d = { 'with_underscore':'replaced',
      'notunderscored':'not replaced',
      }

class MyTemplate(string.Template):
    delimiter = '%'
    idpattern = '[a-z]+_[a-z]+'

t = MyTemplate(template_text)
print 'Modified ID pattern:'
print t.safe_substitute(d)

The output is:
Modified ID pattern:
Delimiter : %
Replaced : replaced
Ignored : %notunderscored
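In this example, the class attributes delimiter and idpattern define custom substitution rules. The delimiter is % instead of the dollar sign $, and a variable name must contain an underscore to be replaced, so only one of the two variables in the example is substituted.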
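One consequence of a restrictive idpattern is worth seeing in code. The following is a sketch assuming Python 3, with a shortened, hypothetical template string: safe_substitute() leaves the non-matching placeholder alone, while the strict substitute() rejects it as an invalid placeholder and raises ValueError.

```python
from string import Template

class MyTemplate(Template):
    delimiter = '%'
    idpattern = '[a-z]+_[a-z]+'  # names must contain an underscore

# Hypothetical one-line template, shorter than the one in the example above
t = MyTemplate('Replaced: %with_underscore, ignored: %notunderscored, escaped: %%')
d = {'with_underscore': 'replaced', 'notunderscored': 'not replaced'}

# Lenient: the name without an underscore is not a valid identifier, so it stays
safe_result = t.safe_substitute(d)
print(safe_result)

# Strict: the same placeholder is reported as invalid
try:
    t.substitute(d)
    strict_error = None
except ValueError as err:
    strict_error = str(err)
print(strict_error)
```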
The textwrap module formats paragraphs of text. The first example wraps a sample text with fill():

import textwrap

sample_text = '''
The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.
'''

print 'No dedent:\n'
print textwrap.fill(sample_text, width=50)

The output is:
No dedent:
The textwrap module can be used to format text
for output in situations where pretty-printing is
desired. It offers programmatic functionality
similar to the paragraph wrapping or filling
features found in many text editors.
The example above sets the width to 50. The next example uses dedent() to remove any common leading indentation from the text:

import textwrap

sample_text = '''
The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.
'''

dedented_text = textwrap.dedent(sample_text)
print 'Dedented:'
print dedented_text

The output is:
Dedented:
The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.
Here is a comparison of two widths:

import textwrap

sample_text = '''
The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.
'''

dedented_text = textwrap.dedent(sample_text).strip()
for width in [ 45, 70 ]:
    print '%d Columns:\n' % width
    print textwrap.fill(dedented_text, width=width)
    print

The output of the example above is:

45 Columns:

The textwrap module can be used to format
text for output in situations where pretty-
printing is desired. It offers programmatic
functionality similar to the paragraph
wrapping or filling features found in many
text editors.

70 Columns:

The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers programmatic
functionality similar to the paragraph wrapping or filling features
found in many text editors.
The indentation of the first line and of the remaining lines can also be set separately:

import textwrap

sample_text = '''
The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.
'''

dedented_text = textwrap.dedent(sample_text).strip()
print textwrap.fill(dedented_text,
                    initial_indent=' ',
                    subsequent_indent=' ' * 4,
                    width=50,
                    )

The output is:
 The textwrap module can be used to format text
    for output in situations where pretty-printing
    is desired. It offers programmatic
    functionality similar to the paragraph
    wrapping or filling features found in many
    text editors.
The example above indents the first line by 1 space and each remaining line by 4 spaces.
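The dedent, strip, and fill steps used in these examples can be bundled into one helper. This is only a sketch, assuming Python 3; the function name wrap_paragraph and its parameter names are hypothetical, and the sample text is indented here so that dedent() has something to remove.

```python
import textwrap

def wrap_paragraph(text, width=50, first=' ', rest='    '):
    """Dedent and strip the text, then wrap it with separate indents."""
    cleaned = textwrap.dedent(text).strip()
    return textwrap.fill(cleaned,
                         width=width,
                         initial_indent=first,
                         subsequent_indent=rest)

sample_text = '''
    The textwrap module can be used to format text for output in
    situations where pretty-printing is desired. It offers
    programmatic functionality similar to the paragraph wrapping
    or filling features found in many text editors.
    '''

wrapped = wrap_paragraph(sample_text)
print(wrapped)
```

The indent strings count toward the width, so no output line exceeds 50 characters.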
Searching for a pattern in text:

import re

pattern = 'this'
text = 'Does this text match the pattern?'

match = re.search(pattern, text)

s = match.start()
e = match.end()

print 'Found "%s"\nin "%s"\nfrom %d to %d ("%s")' % \
    (match.re.pattern, match.string, s, e, text[s:e])

The start() and end() methods return the indexes in the string where the match begins and ends.
The output is:
Found "this"
in "Does this text match the pattern?"
from 5 to 9 ("this")
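The example above assumes the pattern is found. When it is not, search() returns None, and calling start() on that result fails. A minimal sketch of a guarded version, assuming Python 3 (the helper name describe_match is hypothetical):

```python
import re

def describe_match(pattern, text):
    """Return a short description of the first match, or a no-match note."""
    match = re.search(pattern, text)
    if match is None:
        # search() found nothing, so there is no start() or end() to call
        return 'No match for "%s"' % pattern
    return 'Found "%s" from %d to %d' % (
        match.group(0), match.start(), match.end())

text = 'Does this text match the pattern?'
print(describe_match('this', text))
print(describe_match('that', text))
```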
re includes module-level functions for working with regular expressions as text strings,
but it is more efficient to compile the expressions a program uses frequently. The compile() function converts an expression string into a RegexObject.
import re

# Precompile the patterns
regexes = [ re.compile(p)
            for p in [ 'this', 'that' ]
            ]
text = 'Does this text match the pattern?'

print 'Text: %r\n' % text

for regex in regexes:
    print 'Seeking "%s" ->' % regex.pattern,
    if regex.search(text):
        print 'match!'
    else:
        print 'no match'
Text: 'Does this text match the pattern?'
Seeking "this" -> match!
Seeking "that" -> no match
The module-level functions maintain a cache of compiled expressions. However,
the size of the cache is limited, and using compiled expressions directly avoids the
cache lookup overhead. Another advantage of using compiled expressions is that by
precompiling all expressions when the module is loaded, the compilation work is shifted
to application start time, instead of to a point when the program may be responding to
a user action.
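A minimal sketch of precompiling at module load time, assuming Python 3. The constant TH_WORD_RE, its pattern, and the function count_th_words are hypothetical and chosen only for illustration.

```python
import re

# Compiled once, when the module is imported, not on every call
TH_WORD_RE = re.compile(r'\bth\w+')

def count_th_words(text):
    """Count the words that start with 'th', using the precompiled pattern."""
    return len(TH_WORD_RE.findall(text))

count = count_th_words('Does this text match the pattern?')
print(count)
```

Any error in the pattern also surfaces at import time rather than in the middle of handling a request.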
So far, the example patterns have all used search() to look for single instances of
literal text strings. The findall() function returns all substrings of the input that
match the pattern without overlapping.
import re

text = 'abbaaabbbbaaaaa'

pattern = 'ab'

for match in re.findall(pattern, text):
    print 'Found "%s"' % match
Found "ab"
Found "ab"
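Two details of findall() are easy to miss: matches never overlap, and when the pattern contains a group, the group's text is returned instead of the whole match. The sketch below assumes Python 3; the patterns other than 'ab' are illustrative, and the lookahead form is one common way to collect overlapping matches.

```python
import re

text = 'abbaaabbbbaaaaa'

# Plain pattern: the matched substrings themselves
plain = re.findall('ab', text)

# Pattern with one group: only the group's text is returned
groups = re.findall(r'a(b+)', text)

# Matches do not overlap: 'aaaaa' holds only two separate 'aa' matches
non_overlapping = re.findall('aa', 'aaaaa')

# A zero-width lookahead with a group reports every starting position
overlapping = re.findall('(?=(aa))', 'aaaaa')

print(plain)
print(groups)
print(non_overlapping)
print(overlapping)
```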
finditer() returns an iterator that produces Match instances instead of the
strings returned by findall().
import re

text = 'abbaaabbbbaaaaa'

pattern = 'ab'

for match in re.finditer(pattern, text):
    s = match.start()
    e = match.end()
    print 'Found "%s" at %d:%d' % (text[s:e], s, e)
Found "ab" at 0:2
Found "ab" at 5:7
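Because finditer() yields Match objects, the positions can be collected directly with span(), which returns the (start, end) pair in one call. A short sketch, assuming Python 3:

```python
import re

text = 'abbaaabbbbaaaaa'
pattern = 'ab'

# span() bundles start() and end() into a single tuple
spans = [match.span() for match in re.finditer(pattern, text)]
print(spans)

# The spans can be used to slice the original text
pieces = [text[s:e] for s, e in spans]
print(pieces)
```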