Q: What is causing the memory spike in my Django file upload code?

Stack Overflow user

Asked 2015-04-07 23:03:02

2 answers, 1.1K views, 0 followers, 2 votes

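When I upload a 4.8 MB file, my system memory usage jumps from 30 MB to 300 MB+.

Basically, the user uploads a (for example) 4.8 MB JPEG, which is then uploaded to S3 and stored in an Amazon bucket.

One thing I think is hurting me is that I'm using easy_thumbnails to generate three files, which are also stored in the S3 bucket.

Update 2: At this point I think my main problem is that memory spikes and is never released. After trying gc.collect(), I'm going to start looking into how I run Photo.objects.create(). It also looks like sorl-thumbnail might be a better option; it sounds better suited to working with remote storage.

Update 1: I am using django-debug-toolbar and had it intercept redirects. I'm getting some useful data now. It tells me that I executed 20 DB queries (eek), but worse (I believe), boto (which I use for file storage) is logging 124 messages. It looks like it is sending each file in separate pieces. Maybe that's normal, maybe not? Either way, it seems high.

Once the file has been uploaded, the memory never comes back down unless I restart Apache.

Here is my model: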

from django.db import models
from easy_thumbnails.fields import ThumbnailerImageField
from django.conf import settings
import datetime
import os
import string
from django.core.urlresolvers import reverse

...

class Photo(models.Model):
    """
    A photo belongs to a user. A photo has a preview size and the original which
    is referenced when printing or generating the PDF for print.

    """
    def original_resolution(instance, filename):
        """
        Returns the upload path for the original image. The path is built
        from the owner's pk and the current timestamp. Amazon S3 is the storage.

        """
        today = datetime.datetime.now()

        return 'uploads/{0}/{1}/{2}/{8}-{3}-{4}-{5}-{6}-{7}'.format(
            instance.owner.pk,
            today.year,
            today.month,
            today.day,
            today.hour,
            today.minute,
            today.second,
            clean_filename(filename),
            'original')

    def thumbnail_resolution(instance, filename):
        """
        Returns the upload path for the thumbnail image. The path is built
        from the owner's pk and the current timestamp. Amazon S3 is the storage.

        """
        today = datetime.datetime.now()

        return 'uploads/{0}/{1}/{2}/{8}-{3}-{4}-{5}-{6}-{7}'.format(
            instance.owner.pk,
            today.year,
            today.month,
            today.day,
            today.hour,
            today.minute,
            today.second,
            clean_filename(filename),
            'thumbnail')

    def editor_resolution(instance, filename):
        """
        Returns the upload path for the editor-size image. The path is built
        from the owner's pk and the current timestamp. Amazon S3 is the storage.

        """
        today = datetime.datetime.now()
        return 'uploads/{0}/{1}/{2}/{8}-{3}-{4}-{5}-{6}-{7}'.format(
            instance.owner.pk,
            today.year,
            today.month,
            today.day,
            today.hour,
            today.minute,
            today.second,
            clean_filename(filename),
            'editor')

    height = models.PositiveIntegerField(blank=True)

    width = models.PositiveIntegerField(blank=True)

    owner = models.ForeignKey(settings.AUTH_USER_MODEL)

    original = ThumbnailerImageField(
        upload_to=original_resolution,
        resize_source=dict(size=(0, 3100), crop="scale", quality=99),
        height_field='height',
        width_field='width',
        verbose_name=u'Choose Photo')

    thumbnail = ThumbnailerImageField(
        upload_to=thumbnail_resolution,
        resize_source=dict(size=(0, 100), crop="scale"),
        blank=True,
        null=True)

    editor = ThumbnailerImageField(
        upload_to=editor_resolution,
        resize_source=dict(size=(0, 1000), crop="scale"),
        blank=True,
        null=True)

    def get_absolute_url(self):
        return reverse('photos:proxy_editor_image', kwargs={
            'pk': self.pk})

    def __unicode__(self):
        return "Photo #{}".format(self.pk)

...
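
The three upload_to callables in the model differ only in the label placed in front of the timestamp. As a rough sketch of how the shared path logic could live in one helper: build_upload_path is a hypothetical name, and clean_filename here is only a stand-in, since the question elides the real one.

```python
import datetime
import re


def clean_filename(filename):
    # Hypothetical stand-in: the real clean_filename is not shown in the question.
    return re.sub(r'[^A-Za-z0-9._-]', '_', filename)


def build_upload_path(owner_pk, filename, label, now=None):
    """Build 'uploads/<pk>/<year>/<month>/<label>-<day>-<hour>-<minute>-<second>-<name>'."""
    now = now or datetime.datetime.now()
    return 'uploads/{0}/{1}/{2}/{3}-{4}-{5}-{6}-{7}-{8}'.format(
        owner_pk, now.year, now.month, label,
        now.day, now.hour, now.minute, now.second,
        clean_filename(filename))
```

Each upload_to callable could then shrink to a single line such as `return build_upload_path(instance.owner.pk, filename, 'thumbnail')`. This only tidies the model; it is not expected to change memory usage.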

Here is my view that handles the upload:

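import json

from django.contrib import messages
from django.contrib.auth.decorators import login_required
from django.core.urlresolvers import reverse
from django.http import HttpResponse, HttpResponseRedirect

# Photo and UploadPhotoForm come from the app's own models and forms
# modules (those imports are not shown in the question).

@login_required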
def upload_photo(request):
    """
    Creates and saves a new photo.
    """

    if request.is_ajax():
        response = {}

        form = UploadPhotoForm(data=request.POST, files=request.FILES)
        if form.is_valid():
            new_photo = Photo.objects.create(
                original=form.cleaned_data['original'],
                thumbnail=form.cleaned_data['original'],
                editor=form.cleaned_data['original'],
                owner=form.cleaned_data['owner']
            )
            response['result'] = 'success'
            response['message'] = 'Photo successfully uploaded!'
            response['new_photo_pk'] = new_photo.pk
            response['thumbnail_path'] = new_photo.thumbnail.url
            response['editor_path'] = new_photo.editor.url
            response['original_path'] = new_photo.original.url
            response['editor_path_proxy'] = new_photo.get_absolute_url()

        else:
            response['result'] = 'fail'
            response['message'] = 'The photo failed to upload.'
            response['new_photo_pk'] = False

        return HttpResponse(
            json.dumps(response),
            content_type='application/json'
        )

    if request.method == 'POST':
        form = UploadPhotoForm(request.POST, request.FILES)

        if form.is_valid():
            Photo.objects.create(
                original=form.cleaned_data['original'],
                thumbnail=form.cleaned_data['original'],
                editor=form.cleaned_data['original'],
                owner=form.cleaned_data['owner']
            )
            messages.success(request, "Photo successfully uploaded!")
        else:
            messages.error(request, "The photo failed to upload.")

    return HttpResponseRedirect(reverse('photos:list'))

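I'm stumped; I may just be misunderstanding something. Help?

I've been struggling with this memory usage problem for months and have finally narrowed it down to this.

FWIW, I am using Django 1.6, Python 2.7, WebFaction, and Amazon S3 buckets.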
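
To narrow down which call raises the process's memory high-water mark, one option is to wrap the suspect call and compare peak RSS before and after. The following is only a sketch under stated assumptions: measure and peak_rss_kb are hypothetical helper names, the resource module is Unix-only, and ru_maxrss is reported in kilobytes on Linux (bytes on macOS). Because it reads the peak, it shows growth but cannot show memory being handed back.

```python
import gc
import resource


def peak_rss_kb():
    # Peak resident set size of this process so far (kilobytes on Linux).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure(func, *args, **kwargs):
    """Call func and return (result, growth of peak RSS during the call)."""
    before = peak_rss_kb()
    result = func(*args, **kwargs)
    gc.collect()  # collect reference cycles before taking the second reading
    after = peak_rss_kb()
    return result, after - before
```

In the view, this could wrap the create call, for example `photo, grew = measure(Photo.objects.create, original=..., owner=...)`, with the result written to the log.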

file-upload

amazon-s3

python

django

apache

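2 Answers

Stack Overflow user

Answered 2015-04-08 06:14:36

This is most likely related to your Apache setup (worker vs. prefork). The process, whichever it is, grabs a chunk of memory and then never releases it. We ran into the same thing and ended up putting parts of the app into separate WSGI process groups, so that memory-heavy requests could no longer cause every WSGI process to consume a large amount of memory.

I had an in-depth conversation with the creator of mod_wsgi because of the following issue: wsgi and Apache worker.

I used htop to diagnose some of the memory-heavy processes. Then, based on what I found, I set up multiple WSGIDaemonProcess directives in Apache, like so: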

WSGIDaemonProcess article-app processes=10 threads=5 display-name=articles user=myuser group=mygroup python-path=/home/admin/.virtualenvs/django/lib/python2.7/site-packages
WSGIDaemonProcess account-app-memory-heavy processes=2 threads=5 display-name=account user=myuser group=mygroup python-path=/home/admin/.virtualenvs/django/lib/python2.7/site-packages

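You can also use these options to debug and possibly improve performance (inactivity-timeout restarts a daemon process after it has been idle for the given number of seconds; maximum-requests restarts a process after it has handled that many requests, which returns its memory to the operating system):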

inactivity-timeout=300 maximum-requests=100

Then in my VirtualHost directive...

<Location /article/>
    # WSGIProcessGroup must name the daemon process group, not its display-name
    WSGIProcessGroup article-app
</Location>
<Location /account/>
    WSGIProcessGroup account-app-memory-heavy
</Location>

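Now you can confine the memory-heavy uploads to 1 or 2 processes and have N other processes serve the rest of the app without consuming as much memory. In our case, the article app consumes about 143 MB per WSGI process and the account app about 337 MB. We limited the account app to 2 processes and the article app to 10, which produces a very predictable memory footprint.

Votes: 0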
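
The "predictable footprint" is simple arithmetic: processes per group times memory per process, summed over the groups. Here is a small sketch using the figures quoted in this answer; estimated_footprint_mb is a hypothetical helper, and the estimate assumes every process sits at its steady-state size.

```python
def estimated_footprint_mb(groups):
    """Sum processes * per-process MB over all daemon process groups."""
    return sum(processes * per_process_mb
               for processes, per_process_mb in groups.values())


# (processes, MB per process), taken from the numbers in this answer
groups = {
    'article-app': (10, 143),
    'account-app-memory-heavy': (2, 337),
}

total_mb = estimated_footprint_mb(groups)
print(total_mb)  # 10 * 143 + 2 * 337 = 2104
```

Changing the processes= values in the WSGIDaemonProcess lines shifts this total directly, which makes it easy to check a configuration against the memory limit of a hosting plan.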

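Stack Overflow user

Answered 2015-04-08 14:39:00

This may be because, as far as I know, WebFaction still uses an outdated mod_wsgi, and possibly even Apache 2.2. Older versions have an issue with large file uploads when the data trickles in slowly in small chunks. Use the latest mod_wsgi version, preferably with Apache 2.4, since that also includes other fixes related to memory usage, and you may see better results.

Votes: 0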
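
Separately from upgrading the server stack, application code that handles the upload itself can keep its own memory bounded by reading in fixed-size chunks instead of calling read() on the whole body. This is a minimal standard-library sketch, not a fix for the mod_wsgi issue described above; copy_in_chunks and the 64 KB chunk size are assumptions made for illustration.

```python
import io


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy src to dst holding at most chunk_size bytes at a time; return bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


source = io.BytesIO(b'x' * 200000)
target = io.BytesIO()
copied = copy_in_chunks(source, target)
```

Django's uploaded file objects offer a chunks() method that serves the same purpose when iterating over an upload.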

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/29502907
