3.2.1 Charge profilers

The ``profilers'' invoked by the functions profile and background, described above, work on the following principles. During the regular execution of your application, Psyco measures the time spent in your functions. Individual functions are selected for compilation based on these data.

Python 2.2.2 (and up) maintains a counter of executed bytecode instructions; this number is the most accurate (well, the least inaccurate) measure I found for guessing how much a function could benefit from being run by Psyco. It counts the bytecode instructions (or elementary steps) executed during interpretation, which gives a ``time'' estimate. This ``time'' does not include, for example, time spent waiting for data to be read from or written to a file, which is a good thing because Psyco cannot optimize that anyway.

The ``time'' thus charged to various functions is accumulated until some limit is reached. To favour functions that have recently been seen running over old-timers, the charge of each function ``decays'' with time, following an exponential law, as if the amount of time were stored as a dissipating electric charge. The limit at which a function is compiled is given as a fraction of the total charge; in other words, when a function's charge reaches at least xx percent of the total current charge, it is compiled. As all the charges decay with time, reaching such a limit is easier than it may seem; for example, if the limit is set at 10%, and if the execution stays for several seconds within the same 10 functions, then at least one of them will eventually reach the limit. In practice, it seems to be a very good way to measure the charge; a few CPU-intensive functions will very quickly reach the limit, even if the program has already been running for a long time.
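The decay rule and the limit test can be tried out in a few lines of Python. The following is only a toy model under stated assumptions, not Psyco's actual implementation: the helper names decay, share and simulate are made up for this illustration, time advances in fixed 10 ms ticks, and each tick is charged entirely to one function. The values 0.09 and 0.5 stand for the limit and the half-life.

```python
def decay(charges, elapsed, halflife):
    """Scale every charge by the exponential decay over `elapsed` seconds."""
    factor = 0.5 ** (elapsed / halflife)
    for name in charges:
        charges[name] *= factor


def share(charges, name):
    """Fraction of the total current charge held by `name`."""
    total = sum(charges.values())
    return charges.get(name, 0.0) / total if total else 0.0


def simulate(limit=0.09, halflife=0.5, tick=0.01):
    """Return (largest share among old functions, seconds until 'hot' reaches the limit)."""
    charges = {}
    # Phase 1 (assumed workload): ten seconds spread evenly over 50 functions.
    for i in range(1000):
        decay(charges, tick, halflife)
        name = "old_%d" % (i % 50)
        charges[name] = charges.get(name, 0.0) + tick
    max_old_share = max(share(charges, name) for name in list(charges))
    # Phase 2: a single CPU-intensive function takes over.
    ticks = 0
    while share(charges, "hot") < limit:
        decay(charges, tick, halflife)
        charges["hot"] = charges.get("hot", 0.0) + tick
        ticks += 1
    return max_old_share, ticks * tick


max_old_share, seconds_until_hot = simulate()
```

In this model the 50 evenly used functions each stay well below the limit, while the single busy function crosses it in less than one half-life, even after ten simulated seconds of earlier activity.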

The functions profile and background take the following optional keyword arguments:

watermark: The limit, as a fraction of the total charge, between 0.0 and 1.0. The default value is 0.09, or 9%.

halflife: The time (in seconds) it takes for the charge to drop to half its initial value by decay. After two half-lives, the charge is one-fourth of the initial value; after three, one-eighth; and so on. The default value is 0.5 seconds.

pollfreq: How many times per second statistics must be collected. This parameter is central to background's operation. The default value is 100. This is a maximum for a number of operating systems, whose sleep function is quite limited in resolution.

pollfreq also applies to profile, whose main task is to do active profiling, but which collects statistics in the background too so that it can discover long-running functions that never call other functions. The default value in this case is 20.

parentframe: When a function is charged, its parent (the function that called it) is also charged a bit. The purpose is to discover functions that call several other functions, each of which does a part of the job, but none of which does enough to be individually identified and compiled. When a function is charged some amount of time, its parent is additionally charged a fraction of it. The parent of the parent is charged too, and so on recursively, although the effect becomes less and less perceptible. This parameter controls the fraction. The default value of 0.25 means that the parent is charged one-fourth of what a child is charged. Note: Do not use values larger than 0.5.
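The way a charge propagates to the callers can be sketched as follows. This is an illustration only, not Psyco's code: charge_stack and its stack argument are hypothetical names, and the call stack is simply given as a list of function names from the outermost caller to the innermost callee.

```python
def charge_stack(stack, amount, fraction=0.25):
    """Charge `amount` to the innermost function and a shrinking part to each caller.

    `stack` lists function names from the outermost caller to the innermost callee.
    """
    charges = {}
    for name in reversed(stack):
        charges[name] = charges.get(name, 0.0) + amount
        amount *= fraction  # each level up receives `fraction` of the level below
    return charges


one_call = charge_stack(["main", "driver", "helper"], 1.0)

# A driver calling three small helpers collects a part of each helper's charge.
driver_total = sum(
    charge_stack(["main", "driver", helper], 1.0)["driver"]
    for helper in ("helper_a", "helper_b", "helper_c")
)
```

With the default fraction, one unit charged to helper adds 0.25 to driver and 0.0625 to main, and a driver that calls three such helpers accumulates 0.75 without doing any work of its own. One observation, which is not a reason stated by the text above: the amounts given to all callers together form a geometric series summing to at most fraction/(1 - fraction) of the child's charge, so at 0.5 or below the callers never collectively receive more than the child itself.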

Note: All default values need tuning. Any comments about values that seem to give better results in common cases are welcome.
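As a starting point for such tuning, the documented defaults can be spelled out explicitly in the call. This is a hedged sketch: the keyword names watermark, halflife, pollfreq and parentframe are assumed here to be the ones profile accepts for the four parameters described above, the name PROFILE_SETTINGS is made up for this example, and the import is guarded so that the snippet does nothing where Psyco is not installed.

```python
try:
    import psyco
except ImportError:
    psyco = None  # skip quietly where Psyco is not installed

# Defaults for profile() as listed above; edit these values to experiment.
PROFILE_SETTINGS = dict(
    watermark=0.09,    # limit, as a fraction of the total charge
    halflife=0.5,      # seconds for a charge to drop to half its value
    pollfreq=20,       # statistics collections per second (100 for background)
    parentframe=0.25,  # fraction of a charge passed on to the caller
)

if psyco is not None:
    psyco.profile(**PROFILE_SETTINGS)
```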

Profiling data is currently not stored on disk. Profiling starts anew each time you restart your application. This is consistent with the way the ``electric charges'' approach works: any charge older than a few half-lives gets very close to zero anyway. Moreover, in the current implementation, a full reset of the charges occurs every 120 half-lives, but the effect should go unnoticed.