
Software rendering is faster than hardware rendering

Stack Overflow user
Asked 2013-11-13 21:44:32
1 answer, 2.2K views, 0 followers, 3 votes

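I just discovered some strange behavior in SDL.

I wrote a simple particle renderer and, for some reason, it runs 6 times faster with the software renderer than with the hardware renderer.

Here is the source code: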

main.cpp

#define _USE_MATH_DEFINES
#include <iostream>
#include <cstdlib>
#include <cstdio>   // sprintf
#include <cstring>  // strncmp
#include <Windows.h>
#include <vector>
#include <math.h>
#include <time.h>
#include <SDL.h>

#include "Particle.h"

const int SCREEN_WIDTH = 1024;
const int SCREEN_HEIGHT = 600;
const int PARTICLE_NUMBER = 50000;
const int MAX_SPEED = 200;
const int MIN_SPEED = 5;

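// Rough millisecond counter built from the local wall-clock time.
// Months count as 30 days and years as 360 days, so the value is only
// meaningful for differences between nearby frames.
long long getMs (void) {
    SYSTEMTIME stime;
    GetLocalTime(&stime);
    // LL suffixes force 64-bit arithmetic; wDay * 86400000 can overflow a 32-bit int.
    long long ms = stime.wMilliseconds +
        stime.wSecond * 1000LL +
        stime.wMinute * 60000LL +
        stime.wHour * 3600000LL +
        stime.wDay * 86400000LL +
        stime.wMonth * 2592000000LL +
        (stime.wYear - 1970) * 31104000000LL;
    return ms;
}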

int main(int argc, char *argv[])
{
    bool hardwareAccelerated = true;

    if (argc == 2)
    {
        if (strncmp(argv[1], "-software", 9) == 0)
        {
            hardwareAccelerated = false;
        }
    }

    char title [100];
    sprintf(title, "Particles: %d - (%s)", PARTICLE_NUMBER, (hardwareAccelerated ? "HARDWARE ACCELERATED" : "SOFTWARE RENDERING"));

    Particle<double> *particles = (Particle<double>*) malloc(sizeof(Particle<double>) * PARTICLE_NUMBER);

    for (int i = 0; i < PARTICLE_NUMBER; i++)
    {
        double x = rand() % SCREEN_WIDTH;
        double y = rand() % SCREEN_HEIGHT;
        double direction = (((double) rand() / (double) RAND_MAX) - 0.5f) * 2 * M_PI;
        double speed = rand() % (MAX_SPEED - MIN_SPEED) + MIN_SPEED;
        (particles+i)->setPos(x, y);
        (particles+i)->setDirection(direction);
        (particles+i)->setSpeed(speed);
        // std::cout << (particles+i) << std::endl;
    }



    if (SDL_Init(SDL_INIT_EVERYTHING) != 0) {
        return 1;
    }

    SDL_Window *window = SDL_CreateWindow(title,
        SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED,
        SCREEN_WIDTH, SCREEN_HEIGHT, SDL_WINDOW_SHOWN);
    if (window == nullptr) {
        return 2;
    }

    SDL_RendererFlags flags = (hardwareAccelerated ? SDL_RENDERER_ACCELERATED : SDL_RENDERER_SOFTWARE);
    SDL_Renderer *renderer = SDL_CreateRenderer(window, -1,
        flags);
    if (renderer == nullptr) {
        return 3;
    }

    bool quit = false;
    SDL_Event evt;

    long long lastFrame = getMs();
    double delta = 0.f;
    while (!quit)
    {
        long long currentTime = getMs();
        delta = currentTime - lastFrame;
        lastFrame = currentTime;

        std::cout << "delta: " << delta << std::endl;

        while(SDL_PollEvent(&evt) != 0)
        {
            if (evt.type == SDL_QUIT)
            {
                quit = true;
            }
        }
        SDL_SetRenderDrawColor(renderer, 0,0,0,1);
        SDL_RenderClear(renderer);
        SDL_SetRenderDrawColor(renderer, 255,0,0,255);
        for (int i = 0; i < PARTICLE_NUMBER; i++)
        {
            (particles+i)->tick(delta);
            double *pos = (particles+i)->getPos();
            SDL_RenderDrawPoint(renderer, pos[0], pos[1]);
        }
        SDL_RenderPresent(renderer);
    }

    SDL_DestroyRenderer(renderer);
    SDL_DestroyWindow(window);
    SDL_Quit();

    return 0;
}

particle.h

#ifndef _H_PARTICLE
#define _H_PARTICLE
#include <math.h>

template <class T>
class Particle
{
public:
    Particle(void);

    void tick(double);

    void setPos(T, T);
    T* getPos(void);
    void setDirection(double);
    double getDirection(void);
    void setSpeed(T);
    T getSpeed(void);
    ~Particle(void);
private:
    T x;
    T y;
    T speed;
    double direction;
};

template <class T>
Particle<T>::Particle(void)
{
}

template <class T>
void Particle<T>::tick(double delta)
{
    double dt = delta / 1000;
    T d_speed = this->speed * dt;
    // std::cout << d_speed << std::endl;

    this->x += cos(this->direction) * d_speed;
    this->y += sin(this->direction) * d_speed;

    if (this->x > SCREEN_WIDTH) this->x = 0;
    if (this->y > SCREEN_HEIGHT) this->y = 0;
    if (this->x < 0) this->x = SCREEN_WIDTH;
    if (this->y < 0) this->y = SCREEN_HEIGHT;
}

template <class T>
void Particle<T>::setPos(T x, T y)
{
    this->x = x;
    this->y = y;
}

template <class T>
T* Particle<T>::getPos(void)
{
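    // Static storage keeps the returned pointer valid after the call;
    // returning the address of a plain local array is undefined behavior.
    // The contents are overwritten by the next getPos() call.
    static T pos[2];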
    pos[0] = this->x;
    pos[1] = this->y;
    return pos;
}

template <class T>
void Particle<T>::setDirection(double direction)
{
    this->direction = direction;
}

template <class T>
double Particle<T>::getDirection(void)
{
    return this->direction;
}

template <class T>
void Particle<T>::setSpeed(T speed)
{
    this->speed = speed;
}

template <class T>
T Particle<T>::getSpeed(void)
{
    return this->speed;
}

template <class T>
Particle<T>::~Particle(void)
{
}

#endif

Why is this happening? Shouldn't the hardware renderer be faster than the software one?


1 Answer

Stack Overflow user

Accepted answer

Posted 2013-11-13 23:00:40

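SDL_RenderDrawPoint() calls SDL_RenderDrawPoints() with a count of 1. SDL_RenderDrawPoints() calls SDL_stack_alloc() before rendering the requested number of points and SDL_stack_free() when it is done. That might be your problem: you are doing a malloc and a free for every particle in your system, every frame.

I think Retired Ninja has the right idea: use SDL_RenderDrawPoints() instead, and do the malloc and free only once per frame.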
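
A minimal sketch of that batching step, under stated assumptions: `Point` is a stand-in with the same two-int layout as `SDL_Point` so the snippet compiles without SDL, and `Pos` and `collectPoints` are hypothetical names that appear neither in SDL nor in the question's code.

```cpp
#include <vector>

// Stand-in for SDL_Point (two ints, x then y); an assumption so that the
// sketch compiles without the SDL headers.
struct Point { int x; int y; };

// Hypothetical floating-point particle position, as in the question.
struct Pos { double x; double y; };

// Convert every particle position into one contiguous array of points.
// The output vector is reused across frames, so its storage is allocated
// once and recycled instead of being set up again for every particle.
void collectPoints(const std::vector<Pos>& particles, std::vector<Point>& out)
{
    out.clear();                    // drops old points, keeps the capacity
    out.reserve(particles.size());
    for (const Pos& p : particles) {
        out.push_back(Point{static_cast<int>(p.x), static_cast<int>(p.y)});
    }
}
```

With real SDL2, `out` would hold `SDL_Point` values, and the per-particle `SDL_RenderDrawPoint` loop from the question would become a single `SDL_RenderDrawPoints(renderer, out.data(), static_cast<int>(out.size()))` call per frame.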

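Alternatively, use another paradigm. Create an SDL_Surface once. Each frame, write all the pixels you need by directly manipulating the SDL_Surface's pixel memory at the specific pixel positions. Then, when it comes time to render, convert the SDL_Surface to an SDL_Texture and present that to the renderer.

Some sample code: if Particle is a class that contains a pointer to an SDL_Surface, you could have a draw function that looks like this: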

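void Particle::draw()
{
  Uint32 x = m_position.getX();
  Uint32 y = m_position.getY();
  // m_pitch is in bytes; dividing by 4 gives the row length in 32-bit pixels.
  Uint32 * pixel = (Uint32*)m_screen->pixels+(y*(m_pitch/4))+x;

  // Read the current color of that pixel (available for blending; unused below).
  Uint8 r1 = 0;
  Uint8 g1 = 0;
  Uint8 b1 = 0;
  Uint8 a1 = 0;
  GFX_RGBA_FROM_PIXEL(*pixel, m_screen->format, &r1, &g1, &b1, &a1);

  // Overwrite the pixel with this particle's color.
  Uint32 * p = (Uint32*)m_screen->pixels+(y*(m_pitch/4))+x;
  *p = SDL_MapRGB(m_screen->format, m_r, m_g, m_b);
}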

where GFX_RGBA_FROM_PIXEL (stolen from Andreas's SDL2_gfx library) is defined as:

///////////////////////////////////////////////////////////////////
void GFX_RGBA_FROM_PIXEL(Uint32 pixel, SDL_PixelFormat * fmt, Uint8* r, Uint8* g, Uint8* b, Uint8* a)
{
  *r = ((pixel&fmt->Rmask)>>fmt->Rshift)<<fmt->Rloss;
  *g = ((pixel&fmt->Gmask)>>fmt->Gshift)<<fmt->Gloss;
  *b = ((pixel&fmt->Bmask)>>fmt->Bshift)<<fmt->Bloss;
  *a = ((pixel&fmt->Amask)>>fmt->Ashift)<<fmt->Aloss;
}
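
To make the mask, shift, and loss steps of GFX_RGBA_FROM_PIXEL concrete, here is a small worked example. It is only a sketch: `Format`, `kArgb8888`, `Rgba`, and `unpack` are hypothetical stand-ins written for illustration, with `Format` mirroring just the mask/shift/loss fields of `SDL_PixelFormat`, and the layout is assumed to be 32-bit ARGB with 8 bits per channel.

```cpp
#include <cstdint>

// Stand-in for the mask/shift/loss fields of SDL_PixelFormat (hypothetical type).
struct Format {
    std::uint32_t Rmask, Gmask, Bmask, Amask;
    std::uint8_t  Rshift, Gshift, Bshift, Ashift;
    std::uint8_t  Rloss, Gloss, Bloss, Aloss;
};

// Assumed layout: 32-bit ARGB, 8 bits per channel, so every loss is 0.
constexpr Format kArgb8888 = {
    0x00FF0000u, 0x0000FF00u, 0x000000FFu, 0xFF000000u,
    16, 8, 0, 24,
    0, 0, 0, 0
};

struct Rgba { std::uint8_t r, g, b, a; };

// Same steps as GFX_RGBA_FROM_PIXEL: mask out the channel, shift it down
// to bit 0, then shift left by the loss to scale it back to 8 bits.
Rgba unpack(std::uint32_t pixel, const Format& f)
{
    Rgba c;
    c.r = static_cast<std::uint8_t>(((pixel & f.Rmask) >> f.Rshift) << f.Rloss);
    c.g = static_cast<std::uint8_t>(((pixel & f.Gmask) >> f.Gshift) << f.Gloss);
    c.b = static_cast<std::uint8_t>(((pixel & f.Bmask) >> f.Bshift) << f.Bloss);
    c.a = static_cast<std::uint8_t>(((pixel & f.Amask) >> f.Ashift) << f.Aloss);
    return c;
}
```

For the pixel value `0xFF336699` in this layout, `unpack` yields r = `0x33`, g = `0x66`, b = `0x99`, a = `0xFF`.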

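Drawing directly into the surface like this might be faster. I haven't done any timing tests, but it may be worth trying, because you manipulate the color values in pixel memory directly and then simply blit once per frame. You aren't doing any mallocs or frees.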
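
The direct pixel write described above can be sketched without SDL as follows. This is an illustration under assumptions, not the answer's own code: `Frame`, `putPixel`, and `clearFrame` are hypothetical names, the buffer is a plain CPU-side array of 32-bit pixels, and `pitch` is the row length in bytes, as in SDL surfaces. The bounds check matters for the question's code, because `tick` can leave `x` equal to `SCREEN_WIDTH`, which is one past the last valid column.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU-side frame buffer; `pitch` is the length of one row in bytes.
struct Frame {
    int width;
    int height;
    int pitch;                          // bytes per row
    std::vector<std::uint32_t> pixels;  // one 32-bit value per pixel

    Frame(int w, int h)
        : width(w), height(h),
          pitch(w * static_cast<int>(sizeof(std::uint32_t))),
          pixels(static_cast<std::size_t>(w) * h, 0u) {}
};

// Write one pixel. Out-of-range coordinates are rejected, because an
// unchecked index would write outside the buffer.
bool putPixel(Frame& f, int x, int y, std::uint32_t color)
{
    if (x < 0 || y < 0 || x >= f.width || y >= f.height) return false;
    f.pixels[static_cast<std::size_t>(y) * (f.pitch / 4) + x] = color;
    return true;
}

// Reset the whole frame to one color at the start of each frame.
void clearFrame(Frame& f, std::uint32_t color)
{
    std::fill(f.pixels.begin(), f.pixels.end(), color);
}
```

With SDL2, one way to get such a buffer on screen is a streaming texture created once with `SDL_CreateTexture`, followed each frame by `SDL_UpdateTexture(texture, NULL, frame.pixels.data(), frame.pitch)`, `SDL_RenderCopy`, and `SDL_RenderPresent`. Note that this swaps a texture upload in for the surface-to-texture conversion the answer describes, and it has not been timed here either.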

Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/19965005
