Posted Thursday, 03 September 2009 at 12:11 by Andrew Liu
Tagged: python | web applications | web development
Memory management with Pylons on mod_wsgi and Apache seems to be one of the major issues, especially when running multiple instances. I fall into this category, and I have found that each Pylons instance takes up a relatively large chunk of memory. When working under tight memory constraints (such as on a VPS), this can become a major problem, so minimising memory consumption becomes critical.
Searching the forums suggests the issue is related to the following:
Thread Stack Size
High virtual memory usage is caused almost entirely by the large stack that Linux allocates to each thread by default (the default Linux stack size is 8MB). It is still not clear whether this is really a problem: in typical configurations the stack appears to remain a purely virtual figure, but in some configurations it could in theory cause trouble (such as hitting VPS limits).
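To put a number on this, here is a rough sketch of how much virtual address space the thread stacks reserve. The function name is made up for illustration, and the thread count of 25 is only an example figure (it matches the ThreadsPerChild value shown later in this post). Nothing here is measured.

```python
def reserved_stack_mb(threads, stack_kb=8192):
    """Virtual address space reserved for thread stacks, in MB.

    stack_kb defaults to 8192 (the 8MB Linux default mentioned above).
    This is reserved virtual memory, not memory actually in use.
    """
    return threads * stack_kb / 1024.0


# 25 threads with the default 8MB stack
print(reserved_stack_mb(25))
# 25 threads with a 512kB stack
print(reserved_stack_mb(25, 512))
```

With these example figures the reservation drops from 200MB to 12.5MB, which is why the stack size is the first thing to look at when the virtual memory number is what trips a VPS limit.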
This can be controlled in several ways.
If the startup scripts that launch the process can be modified, the following reduces the stack size allocation. This example sets a 512kB stack (ulimit -s takes its value in kilobytes).
ulimit -s 512
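This can also be done in a Python startup script. The setting applies to threads created afterwards:
import threading  # the Python 2 "thread" module exposes the same stack_size()
threading.stack_size(512 * 1024)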
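As a quick check that the setting takes effect, the sketch below sets the stack size, starts one thread, and reads the value back. It uses the threading module (available in both Python 2.5+ and Python 3). The worker function and thread name are placeholders.

```python
import threading

STACK_BYTES = 512 * 1024  # 512kB, as in the example above

# stack_size() returns the previous setting (0 means "platform default").
previous = threading.stack_size(STACK_BYTES)


def worker(results):
    # Placeholder workload: just record which thread ran.
    results.append(threading.current_thread().name)


results = []
t = threading.Thread(target=worker, args=(results,), name="small-stack")
t.start()
t.join()

# Called with no argument, stack_size() reports the current setting.
current = threading.stack_size()
print(previous, current, results)
```

Note that threading.stack_size() can raise ValueError for a size the platform rejects, so a very small value may not be accepted everywhere.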
Tuning Pylons can also help reduce the memory consumption of its processes. Editing your development.ini or production.ini file as follows may help, although I did not do this myself:
# 10 is the default
threadpool_workers = 10
# 5 is the default
sqlalchemy.default.pool_size = 3
# pool_size + max_overflow = max simultaneous database sessions
sqlalchemy.default.max_overflow = 7
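A small sketch for sanity-checking the pool settings: it parses an ini fragment and adds pool_size and max_overflow to get the maximum number of simultaneous database sessions. The [app:main] section name and the helper function are assumptions for illustration. Comments in the fragment sit on their own lines because Python's configparser does not strip inline # comments by default.

```python
import configparser

# Assumed fragment; in a real project this would live in production.ini.
INI_FRAGMENT = """
[app:main]
# 5 is the default
sqlalchemy.default.pool_size = 3
sqlalchemy.default.max_overflow = 7
"""


def max_db_sessions(ini_text, section="app:main", prefix="sqlalchemy.default."):
    """Return pool_size + max_overflow for the given ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    pool_size = parser.getint(section, prefix + "pool_size")
    max_overflow = parser.getint(section, prefix + "max_overflow")
    return pool_size + max_overflow


print(max_db_sessions(INI_FRAGMENT))
```

With the values above this comes to 10 sessions per process, so the total across all processes is worth checking against what your database server allows.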
Because mod_wsgi uses threads, check which MPM your Apache installation is running: mpm_prefork or mpm_worker. Only one MPM is active at a time, so only the matching configuration block applies.
More information on these modules can be found on the Apache httpd site.
Once configured, try adjusting the options, as the defaults are (in my opinion) very generous. In particular, StartServers and MinSpareServers should be reduced to just 2 (unless your site is very heavily hit).
# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_prefork_module>
    StartServers 2
    MinSpareServers 2
    MaxSpareServers 10
    MaxClients 150
    MaxRequestsPerChild 0
</IfModule>

# worker MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_worker_module>
    StartServers 2
    MaxClients 150
    MinSpareThreads 5
    MaxSpareThreads 75
    ThreadsPerChild 25
    MaxRequestsPerChild 0
</IfModule>
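To see what these numbers imply, here is a back-of-the-envelope sketch. With the worker MPM, the number of child processes is capped at MaxClients divided by ThreadsPerChild; with prefork, MaxClients is itself the process cap. The 30MB per-process figure is purely a placeholder, not a measurement, and the function names are made up for illustration.

```python
def worker_max_processes(max_clients, threads_per_child):
    """Upper bound on worker MPM child processes."""
    return max_clients // threads_per_child


def worst_case_memory_mb(processes, per_process_mb):
    """Rough ceiling: every allowed process resident at once."""
    return processes * per_process_mb


ASSUMED_PER_PROCESS_MB = 30  # placeholder; measure your own processes

worker_procs = worker_max_processes(150, 25)
prefork_procs = 150  # prefork: one process per client

print(worker_procs, worst_case_memory_mb(worker_procs, ASSUMED_PER_PROCESS_MB))
print(prefork_procs, worst_case_memory_mb(prefork_procs, ASSUMED_PER_PROCESS_MB))
```

Even with a made-up per-process size, the comparison suggests why MaxClients 150 under prefork is generous for a small VPS.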
Lastly, the one change that I believe did the most for me was to adjust the WSGI configuration options within Apache.
The mod_wsgi installation and configuration posts cover what is required to make it work, but it is these options that I think help reduce memory use. My setup is:
<IfModule mod_wsgi.c>
    WSGIDaemonProcess myapplication processes=1 threads=4 maximum-requests=100 inactivity-timeout=300 display-name=myapplication
    WSGIScriptAlias /site.wsgi/ /apps/myapplication/myapplication.wsgi/
    WSGIProcessGroup myapplication
</IfModule>
The main piece of advice is to use a single process (processes=1). The number of threads could be higher, but I kept it at 4 for now. The maximum-requests value seemed to make a big difference: I originally had 5000, then reduced it to 1000, then further to 100. I think this will have to increase with the number of hits you expect to receive, but keeping it low means that a daemon process serves at most 100 requests before it is restarted. This shouldn't affect a user's experience (beyond possibly the time taken to shut down the old process and start a new one).
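When choosing a maximum-requests value, it helps to see how memory grows as requests are served. The sketch below is a hypothetical WSGI middleware (not part of Pylons or mod_wsgi) that records the process's peak resident memory on each request using the standard resource module. demo_app stands in for the real application, and the request counter is approximate under multiple threads.

```python
import resource


def peak_rss():
    # ru_maxrss is in kilobytes on Linux (bytes on macOS).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


class MemoryLoggingMiddleware(object):
    """Hypothetical middleware: record (request number, peak RSS)."""

    def __init__(self, app):
        self.app = app
        self.requests = 0
        self.log = []

    def __call__(self, environ, start_response):
        self.requests += 1
        self.log.append((self.requests, peak_rss()))
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    # Stand-in for the real WSGI application.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


application = MemoryLoggingMiddleware(demo_app)

# Simulate one request without a web server.
body = application({}, lambda status, headers: None)
print(body, application.log)
```

If the recorded figure keeps climbing with the request count, a lower maximum-requests is probably earning its keep. If it stays flat, the value can likely be raised.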
Doing some or all of the above should help you manage the memory consumption of multiple Pylons processes on mod_wsgi and Apache.