在微博上看到了一遍文章,然后就试着翻译了一下,有些地方看不懂,就直接贴原文了,还望能看懂的人指点一下。
# 快速的Python性能优化 #
1.%timeit (per line) and %prun (cProfile) in ipython interactive shell.
Profile your code while working on it, and try to find of where is the bottleneck. This is not contrary to the fact that premature optimization is the root of all evil. This is mean for the first level optimization and not a heavy optimization sequence.
For more on profiling python code, you should read this: http://www.huyng.com/posts/python-performance-analysis/
Another interesting library, line_profiler is for line by line profiling https://bitbucket.org/robertkern/line_profiler
2.减少函数的调用次数
如果需要对一个list进行操作,向函数传入一个list要比每个元素调用一次函数快。
3.使用xrange代替range ##
xrange是range的C实现,更高效的使用内存
4.对于大数据,使用numpy要比标准数据结构快
5."".join(string) 比 + 或者 += 好
6.while 1 比 while True 快
7.列表推导式 > for 循环 > while 循环
列表推导式比for循环快,while循环是最慢的,因为while使用外部计数器
8.使用 cProfile, cStringIO 和 cPickle
始终使用可用的C版本的模块。
9.使用局部变量 局部变量比全局变量,宏和属性查找快
10.ist and iterators versions exist - iterators are memory efficient and scalable. Use itertools
Create generators and use yeild as much as posible. They are faster compared to the normal list way of doing it.
11.在所有可能的地方,使用 Map, Reduce and Filter 代替循环. 12.对于检查'a in b'的地方, dict or set 比 list/tuple好.
13.对于大数据,尽可能的使用不可变类型,这样更快 tuples>list
14.insertion into a list in O(n) complexity.
15.如果要从首尾操作列表,使用双端队列
16.使用del删除使用后的对象
Python does it by itself. But make sure of that with the gc module or
by writing an __del__ magic function or
the simplest way, del after use.
1.time.clock()
18.GIL(http://wiki.python.org/moin/GlobalInterpreterLock) - GIL is a demon.
GIL allows only one python native thread to be run per process, preventing CPU level parallelism. Try using ctypes and native C libararies to overcome this. When even you reach the end of optimizing with python, always there exist an option of rewriting terribly slow functions in native C, and using it through python C bindings. Other libraries like gevent is also attacking the problem, and is successful to some extend.
TL,DR: While you write code, just give one round of thought on the data structures, the iteration constructs, builtins and create C extensions for tricking the GIL if need.