Psyco can compile code that uses arbitrary object types and extension modules. Operations that it does not know about will be compiled into direct calls to the C code that implements them. However, some specific operations can be optimized, and sometimes massively so -- this is the core idea around which Psyco is built, and the reason for the sometimes impressive results.
The other reason for the performance improvement is that the machine code does not have to decode the pseudo-code ("bytecode") over and over again while interpreting it. Removing this overhead is what compilers classically do. They also simplify the frame objects, making function calls more efficient. So does Psyco. But doing only this would be ineffective with Python, because each bytecode instruction still has a lot of run-time decoding to do (typically, looking up the type of the arguments in tables, invoking the corresponding operation and building a resulting Python object).
The type-based look-ups and the successive construction and destruction of objects for all intermediate values are what Psyco can most successfully eliminate, but it needs to be taught about a type and its operations before it can do so.
We list below the specifically optimized types and operations. Possible performance gains are just wild guesses; specialization is known to give often-good-but-hard-to-predict gains. Remember, all operations not listed below work well -- they just cannot be much accelerated.
A performance killer is the usage of the built-in functions map and filter. Never use them with Psyco. Replace them with list comprehensions (see 2.4). The reason is that entering code compiled by Psyco from non-Psyco-accelerated (Python or C) code is quite slow, slower than a normal Python function call. The map and filter functions will typically result in a very large number of calls from C code to a lambda expression foolishly compiled by Psyco. An exception to this rule is when using map or filter with a built-in function, when they are typically slightly faster than list comprehensions because the only difference is then that the loop is performed by C code instead of by Psyco-generated code. Still, I generally recommend that you forget about map and filter and use the "Pythonic" way.
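As a rough sketch of the rewrite recommended above (the function names are made up for illustration and are not part of Psyco; the list() calls are only there so that the code also runs on Python versions where map returns an iterator):

```python
def squares_with_map(values):
    # Discouraged under Psyco: the C loop inside map() calls back
    # into the lambda once per item.
    return list(map(lambda x: x * x, values))


def squares_with_comprehension(values):
    # Preferred: the loop and the multiplication live in one function body.
    return [x * x for x in values]


def lengths_with_builtin(words):
    # The exception mentioned above: map() applied to a built-in function.
    return list(map(len, words))
```

Both squares functions return the same list; only the way the loop is driven differs.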
Virtual-time objects are objects that, when used as intermediate values, are simply not built at run-time at all. The noted performance gains only apply if the object can actually remain virtualized. Any unsupported operation will force the involved objects to be built normally.
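To make the idea of intermediate values concrete, here is a small sketch. The function name weighted_sum is hypothetical, and whether Psyco really keeps these particular temporaries virtual is an assumption that depends on the types seen at run-time. The call to psyco.bind is skipped when Psyco is not installed.

```python
def weighted_sum(xs, ys, scale):
    total = 0
    for i in range(len(xs)):
        # (xs[i] + ys[i]) and its product with scale are intermediate
        # values: with integers they are candidates for never being
        # built as real Python objects.
        total = total + (xs[i] + ys[i]) * scale
    return total


try:
    import psyco
    psyco.bind(weighted_sum)   # compile just this function
except ImportError:
    pass                       # plain Python; the result is identical
```

Passing an intermediate value to an operation Psyco does not know about (storing it in a dictionary, say) would, by the rule above, force it to be built normally.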
Type | Operations | Notes |
---|---|---|
Any built-in type | reading members and methods | (1) |
Built-in function and method | call | (1) |
Integer | truth testing, unary + - ~ abs(), binary + - * \| & << >> ^, comparison | (2) |
Dictionary | len() | (4) |
Float | truth testing, unary + - abs(), binary + - * /, comparison | (5) |
Function | call | (6) |
Sequence iterators | for | (7) |
List | len(), item get and set, concatenation | (8) |
Long | all arithmetic operations | (9) |
Instance method | call | (1) |
String | len(), item get, slicing, concatenation | (10) |
Tuple | len(), item get, concatenation | (11) |
Type | call | |
array.array | item get, item set | (15) |
Built-in function | Notes |
---|---|
range | (8) |
xrange | (13) |
chr, ord | (10) |
id | |
type | |
len, abs, divmod | |
apply | (14) |
the whole math module | (16) |
map, filter | not supported (17) |
Notes:
(7) for loops over sequences as efficient as what you would write in C.
(8) for looping over a range is as efficient as the common C for loop. For the other cases of lists see (4).
(15) Type codes 'I' and 'L' are not supported. Type code 'f' does not support item assignment. The speed of a complex algorithm using an array as buffer (like manipulating an image pixel-by-pixel) should be very high; closer to C than plain Python.
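The array-as-buffer case could look like the sketch below. It assumes 8-bit grayscale pixels stored with type code 'B', which is not among the restricted type codes named above. The function name invert_pixels is made up for illustration.

```python
from array import array


def invert_pixels(buf):
    # buf is an array('B') of 8-bit pixels, modified in place.
    # The loop uses only operations from the tables: for over a range,
    # len(), array item get and set, and integer subtraction.
    for i in range(len(buf)):
        buf[i] = 255 - buf[i]
    return buf


pixels = array('B', [0, 64, 128, 255])
invert_pixels(pixels)
```

Keeping the whole pixel loop inside one function leaves it entirely within code that Psyco can compile.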