What is causing the memory spike in my Django file upload code?
Stack Overflow user
Asked 2015-04-07 23:03:02
2 answers, 1.1K views, 0 followers, score 2

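When I upload a 4.8 MB file, my system memory usage jumps from 30 MB to 300 MB+.

Basically, a user uploads (for example) a 4.8 MB JPEG, which is then sent to S3 and stored in an Amazon bucket.

One thing I think is hurting me is that I'm using easy_thumbnails to generate 3 files, which are also stored in the S3 bucket.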
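One possible reason a small JPEG can cost far more memory than its file size is that image libraries work on the decoded bitmap, not the compressed file. The sketch below is only back-of-the-envelope arithmetic: the image dimensions are hypothetical (the question does not state them), and whether several decoded copies are actually held at the same time depends on easy_thumbnails and the imaging library, which is not confirmed here.

```python
def decoded_size_bytes(width, height, channels=3, bytes_per_channel=1):
    """Size of an uncompressed bitmap held in memory, in bytes."""
    return width * height * channels * bytes_per_channel


# Hypothetical dimensions for a 4.8 MB JPEG; not taken from the question.
WIDTH, HEIGHT = 6000, 4000

one_copy = decoded_size_bytes(WIDTH, HEIGHT)
print("one decoded RGB copy: %.1f MiB" % (one_copy / 2**20))

# The model passes the same upload to three ThumbnailerImageFields, so the
# source may be decoded up to three times (an assumption, not a measurement).
print("three decoded copies: %.1f MiB" % (3 * one_copy / 2**20))
```

Under these assumed dimensions a single decoded copy is already about 69 MiB, so a few copies plus resized outputs could plausibly account for a spike of a few hundred megabytes.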

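Update 2: At this point I think my main problem is that memory spikes and is never released. Once I have tried calling gc.collect(), I'm going to start looking into how Photo.objects.create() runs. It looks like sorl-thumbnail may be a better option; it sounds like it is better suited to working with remote storage.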
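To check whether the Python objects themselves are being released, one option is to measure allocations around the suspect call. This is a minimal sketch using the standard-library `tracemalloc` module, which requires Python 3.4 or later (the question uses Python 2.7, so this assumes a newer interpreter). `fake_upload_work` is a hypothetical stand-in for `Photo.objects.create()`, and `measure_peak` is an illustrative helper, not part of Django or easy_thumbnails.

```python
import gc
import tracemalloc


def measure_peak(func):
    """Run func() and return (result, bytes still allocated, peak bytes)."""
    gc.collect()
    tracemalloc.start()
    try:
        result = func()
        gc.collect()  # collect any reference cycles left behind by func()
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, current, peak


def fake_upload_work():
    # Hypothetical stand-in for Photo.objects.create(): build a large
    # temporary buffer, then drop it before returning.
    buffer = bytearray(50 * 1024 * 1024)
    size = len(buffer)
    del buffer
    return size


if __name__ == "__main__":
    size, current, peak = measure_peak(fake_upload_work)
    print("allocated:", size, "still held:", current, "peak:", peak)
```

If "still held" is small while the process's resident memory stays high, the objects were freed at the Python level but the process has not handed the memory back to the operating system. `tracemalloc` only sees Python allocations, so resident memory has to be checked separately with a tool such as htop.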

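Update 1: I'm using django-debug-toolbar and having it intercept redirects. I'm now getting some useful data. It tells me that I'm executing 20 DB queries (eek), but worse (I believe), boto (which I use for file storage) is logging 124 messages. It looks as if each file is being sent in separate pieces. Maybe that's normal, maybe not? Either way, it seems high.

Once a file has been uploaded, the memory does not come back down unless I restart Apache.

Here is my model: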

from django.db import models
from easy_thumbnails.fields import ThumbnailerImageField
from django.conf import settings
import datetime
import os
import string
from django.core.urlresolvers import reverse

...

class Photo(models.Model):
    """
    A photo belongs to a user. A photo has a preview size and the original which
    is referenced when printing or generating the PDF for print.

    """
    def original_resolution(instance, filename):
        """
        Returns a path to upload the image to. The path is created with the
        website's slug and the current year. Using Amazon S3 for CDN storage

        """
        today = datetime.datetime.now()

        return 'uploads/{0}/{1}/{2}/{8}-{3}-{4}-{5}-{6}-{7}'.format(
            instance.owner.pk,
            today.year,
            today.month,
            today.day,
            today.hour,
            today.minute,
            today.second,
            clean_filename(filename),
            'original')

    def thumbnail_resolution(instance, filename):
        """
        Returns a path to upload the image to. The path is created with the
        website's slug and the current year. Using Amazon S3 for CDN storage

        """
        today = datetime.datetime.now()

        return 'uploads/{0}/{1}/{2}/{8}-{3}-{4}-{5}-{6}-{7}'.format(
            instance.owner.pk,
            today.year,
            today.month,
            today.day,
            today.hour,
            today.minute,
            today.second,
            clean_filename(filename),
            'thumbnail')

    def editor_resolution(instance, filename):
        """
        Returns a path to upload the image to. The path is created with the
        website's slug and the current year. Using Amazon S3 for CDN storage

        """
        today = datetime.datetime.now()
        return 'uploads/{0}/{1}/{2}/{8}-{3}-{4}-{5}-{6}-{7}'.format(
            instance.owner.pk,
            today.year,
            today.month,
            today.day,
            today.hour,
            today.minute,
            today.second,
            clean_filename(filename),
            'editor')

    height = models.PositiveIntegerField(blank=True)

    width = models.PositiveIntegerField(blank=True)

    owner = models.ForeignKey(settings.AUTH_USER_MODEL)

    original = ThumbnailerImageField(
        upload_to=original_resolution,
        resize_source=dict(size=(0, 3100), crop="scale", quality=99),
        height_field='height',
        width_field='width',
        verbose_name=u'Choose Photo')

    thumbnail = ThumbnailerImageField(
        upload_to=thumbnail_resolution,
        resize_source=dict(size=(0, 100), crop="scale"),
        blank=True,
        null=True)

    editor = ThumbnailerImageField(
        upload_to=editor_resolution,
        resize_source=dict(size=(0, 1000), crop="scale"),
        blank=True,
        null=True)

    def get_absolute_url(self):
        return reverse('photos:proxy_editor_image', kwargs={
            'pk': self.pk})

    def __unicode__(self):
        return "Photo #{}".format(self.pk)

...

Here is the view that handles the upload:

@login_required
def upload_photo(request):
    """
    Creates and saves a new photo.
    """

    if request.is_ajax():
        response = {}

        form = UploadPhotoForm(data=request.POST, files=request.FILES)
        if form.is_valid():
            new_photo = Photo.objects.create(
                original=form.cleaned_data['original'],
                thumbnail=form.cleaned_data['original'],
                editor=form.cleaned_data['original'],
                owner=form.cleaned_data['owner']
            )
            response['result'] = 'success'
            response['message'] = 'Photo successfully uploaded!'
            response['new_photo_pk'] = new_photo.pk
            response['thumbnail_path'] = new_photo.thumbnail.url
            response['editor_path'] = new_photo.editor.url
            response['original_path'] = new_photo.original.url
            response['editor_path_proxy'] = new_photo.get_absolute_url()

        else:
            response['result'] = 'fail'
            response['message'] = 'The photo failed to upload.'
            response['new_photo_pk'] = False

        return HttpResponse(
            json.dumps(response),
            content_type='application/json'
        )

    if request.method == 'POST':
        form = UploadPhotoForm(request.POST, request.FILES)

        if form.is_valid():
            Photo.objects.create(
                original=form.cleaned_data['original'],
                thumbnail=form.cleaned_data['original'],
                editor=form.cleaned_data['original'],
                owner=form.cleaned_data['owner']
            )
            messages.success(request, "Photo successfully uploaded!")
        else:
            messages.error(request, "The photo failed to upload.")

    return HttpResponseRedirect(reverse('photos:list'))

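I'm confused, and I may simply be misunderstanding something. Help?

I've been fighting this memory usage problem for months and have finally narrowed it down to this.

FWIW, I'm using Django 1.6, Python 2.7, WebFaction, and Amazon S3 buckets.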

2 Answers

Stack Overflow user

Answered 2015-04-08 06:14:36

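This is most likely related to your Apache setup (worker vs. prefork). The process, whichever it is, chews through a pile of memory and then never releases it. We ran into the same thing and ended up putting parts of the application in separate WSGI process groups, so that memory-heavy requests could not cause every WSGI process to consume a lot of memory.

I had an in-depth conversation with the creator of mod_wsgi because of the following question: wsgi and Apache worker

I used htop to diagnose some of the memory-heavy processes. Then, based on what I found, I set up multiple WSGIDaemonProcess directives in Apache, like this: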

WSGIDaemonProcess article-app processes=10 threads=5 display-name=articles user=myuser group=mygroup python-path=/home/admin/.virtualenvs/django/lib/python2.7/site-packages
WSGIDaemonProcess account-app-memory-heavy processes=2 threads=5 display-name=account user=myuser group=mygroup python-path=/home/admin/.virtualenvs/django/lib/python2.7/site-packages 

You can also add these options to the WSGIDaemonProcess directive to debug and possibly improve performance:

inactivity-timeout=300 maximum-requests=100

Then, in my VirtualHost directive:

<Location /article/>
    WSGIProcessGroup article-app
</Location>
<Location /account/>
    WSGIProcessGroup account-app-memory-heavy
</Location>

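Now you can confine memory-heavy uploads to 1 or 2 processes and have N other processes serve the rest of the application without consuming nearly as much memory. In our case, the article app consumed about 143 MB per WSGI process and the account app about 337 MB. We limited the account app to 2 processes and the article app to 10, which gives a very predictable memory footprint.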
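The predictable footprint follows from simple multiplication of process count by per-process size. A minimal sketch of that arithmetic, assuming the figures quoted above are megabytes per process (the units are an interpretation) and using the daemon process names from the WSGIDaemonProcess lines; `total_footprint_mb` is a hypothetical helper:

```python
def total_footprint_mb(groups):
    """Sum processes * per-process memory (MB) over all daemon process groups."""
    return sum(processes * per_process_mb
               for processes, per_process_mb in groups.values())


# (number of processes, approximate MB per process) for each group.
groups = {
    "article-app": (10, 143),
    "account-app-memory-heavy": (2, 337),
}

for name, (processes, per_process_mb) in groups.items():
    print("%s: %d MB" % (name, processes * per_process_mb))
print("total: %d MB" % total_footprint_mb(groups))
```

With these numbers the worst case is 1430 MB for the article group plus 674 MB for the account group, 2104 MB in total. Running all 12 processes at the heavier 337 MB size would instead approach 4 GB.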

Score: 0

Stack Overflow user

Answered 2015-04-08 14:39:00

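This may be because, as far as I know, WebFaction still uses an outdated mod_wsgi, and possibly even Apache 2.2. There is a known problem with large file uploads when they are streamed in slowly in small chunks. With a recent mod_wsgi version, and preferably Apache 2.4, which includes other fixes related to memory usage, you may see better results.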

Score: 0
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/29502907
