TheGag96

Well-Known Member
Newcomer
Joined
Feb 1, 2016
Messages
45
Trophies
0
Age
27
XP
111
Country
United States

catlover007

Developer
Developer
Joined
Oct 23, 2015
Messages
722
Trophies
1
XP
3,968
Country
Germany
At around 350 sprites (I've edited the example to draw to both screens, so it would be double this limit if using only one screen), the FPS drops while the percentage readouts are:

CPU: ~8.5%
GPU: ~85%
CmdBuf: 0.26%
This is weird. I played with the example myself, looked into the citro2d source code, and measured a few things, but everything seems to work correctly. I looked back at my old homebrew game, and it easily pushes a lot more vertices at 60 FPS with citro3d.

This is relevant:
https://github.com/devkitPro/citro2d/issues/14

The issue might just be a fillrate limitation. Games usually use a depth buffer, which citro2d doesn't; depth testing would save a lot of unnecessary drawing.
 

TheGag96

I'm the one who made that issue, haha. I posted here because no one responded. Are you saying that you also see the slowdown at around 700 sprites on the unedited example? Also, I doubt it's a fill rate limitation (at least given the little I understand about that). Check out the PICA200 specs on Wikipedia. The 3DS runs the GPU at 133 MHz, so a little math says that, theoretically, it should be able to draw about 17k 16x16 tiles to both screens per frame at 60 FPS, assuming no overlap.
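For anyone who wants to check that estimate, here is a rough sketch of the arithmetic. The 4 pixels per clock figure is an assumption on my part (it is what you get from the commonly quoted 800 Mpixels/s at 200 MHz for the PICA200); the 133 MHz clock, 16x16 tiles, two screens and 60 FPS come from the post above. The function name is made up for illustration.

```c
#include <stdio.h>

/* From the post: GPU clock, target frame rate, tile size, two screens. */
#define GPU_CLOCK_HZ     133000000.0
#define TARGET_FPS       60.0
#define TILE_PIXELS      (16.0 * 16.0)
#define SCREEN_COUNT     2.0

/* ASSUMPTION (not stated in the post): 4 pixels written per clock. */
#define PIXELS_PER_CLOCK 4.0

/* Theoretical number of non-overlapping 16x16 tiles per screen per frame. */
double tiles_per_screen_per_frame(void) {
    double pixels_per_frame = GPU_CLOCK_HZ * PIXELS_PER_CLOCK / TARGET_FPS;
    return pixels_per_frame / TILE_PIXELS / SCREEN_COUNT;
}

/* Convenience printer for a quick sanity check. */
void print_fill_rate_estimate(void) {
    printf("~%.0f tiles per screen per frame\n", tiles_per_screen_per_frame());
}
```

With those numbers the result lands a bit above 17,000 tiles per screen, which is where the "about 17k" comes from. It is a best-case bound only: it ignores overdraw, blending and texture fetch costs.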
 

catlover007

Hm, I modified a citro3d example to test this: https://gist.github.com/RSDuck/707ba294d0d049f89a768fd4e4b20b57

It runs quite slowly (the GPU runs for more than 100 ms), and instead of flickering, the triangle changes color slowly.

So it doesn't seem to be related to citro2d.
 

TheGag96

Well, strangely, if I compile that example with the vshader.v.pica from the both_screens example (I assume that's what you used?) and run it on Citra, my entire computer grinds to a halt (my CPU is no slouch). Very strange, lol.

Maybe the sprite stuff (and possibly your example) is worth bringing up on the devkitPro forums. I'll make a post there too.
 

catlover007

it's based on the simple triangle example
 

atrt7

New Member
Newbie
Joined
May 1, 2020
Messages
1
Trophies
0
Age
20
XP
46
Country
United States
Hey guys, I'm having a problem with a template project: I can't compile to 3dsx, but I can compile to cia. I can't post links, but it's the Code::Blocks template project from TricksterGuy. BTW, I'm on Linux.

Output from compiler:

main.cpp
arm-none-eabi-g++ -MMD -MP -MF /home/atrt7/Documents/3dsdev/3ds-template/build/main.d -Wall -O2 -mword-relocations -fomit-frame-pointer -ffunction-sections -march=armv6k -mtune=mpcore -mfloat-abi=hard -mtp=soft -I/home/atrt7/Documents/3dsdev/3ds-template/include -I/opt/devkitpro/portlibs/3ds/include -I/opt/devkitpro/libctru/include -I/home/atrt7/Documents/3dsdev/3ds-template/build -DARM11 -D_3DS -fno-rtti -fno-exceptions -std=gnu++11 -c /home/atrt7/Documents/3dsdev/3ds-template/source/main.cpp -o main.o
linking 3ds-template.elf
built ... 3ds-template.smdh
Cannot open dir

make[1]: *** [/opt/devkitpro/devkitARM/3ds_rules:36: /home/atrt7/Documents/3dsdev/3ds-template/output/3ds-template.3dsx] Error 1
make: *** [Makefile:111: all] Error 2
 

catlover007

By example project, do you mean the template from the 3ds examples? https://github.com/devkitPro/3ds-examples/tree/master/templates/application
 

ShroomKing

Somebody
Member
Joined
Mar 3, 2017
Messages
470
Trophies
0
Age
29
Location
in bed
XP
1,969
Country
United States
Go to the 3ds-template folder and create a new folder called romfs inside it, then try again.
 

JeffRuLz

Well-Known Member
Member
Joined
Sep 14, 2018
Messages
164
Trophies
0
XP
2,561
Country
United States
Does anyone have an example of using Citro2D's C2D_SceneSize?
For the life of me I cannot get this to work.
I ran into this problem today. I decided to use Citro3D functions instead. I know this doesn't exactly answer your question but I thought it might be helpful.

C2D_SceneBegin(fb);
// Scissor top screen to 320x240 and center it.
C3D_SetViewport(0, -40, 240.0, 400.0);
C3D_SetScissor(GPU_SCISSOR_NORMAL, 0, 40, 0, 360);
 

TarableCode

Well-Known Member
Member
Joined
Mar 2, 2016
Messages
184
Trophies
0
Age
37
XP
319
Country
Canada
Maybe I'm missing something?
It's still sideways lol.

Code:
#define __STDC_FORMAT_MACROS
#include <inttypes.h>

#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <citro2d.h>
#include <3ds.h>

#include "fb_conv.h"

#include "CLUT_blues.h"
#include "CLUT_reds.h"
#include "CLUT_greens.h"
#include "fb8bpp.h"

extern const uint8_t FBImage[ ];

uint16_t ConversionTable[ 256 ];

#define MainTextureWidth 512
#define MainTextureHeight 512
#define MainTextureSize ( MainTextureWidth * MainTextureHeight * sizeof( uint16_t ) )

#define EmuScreenWidth 512
#define EmuScreenHeight 384
#define EmuScreenSize ( ( EmuScreenWidth * EmuScreenHeight ) / 2 )

#define MainScreenWidth 400
#define MainScreenHeight 240

// Cobbled together main framebuffer sprite
C2D_Sprite ScreenSprite = {
    .image = ( C2D_Image ) {
        .tex = &( C3D_Tex ) {
            // Gets initialized later
        },
        .subtex = &( const Tex3DS_SubTexture ) {
            .width = MainTextureWidth,
            .height = MainTextureHeight,
            .left = 0.0f,
            .right = 1.0f,
            .top = 0.0f,
            .bottom = 1.0f
        }
    },
    .params = ( C2D_DrawParams ) {
        .pos.x = 0.0f,
        .pos.y = 0.0f,
        .pos.w = MainTextureWidth,
        .pos.h = MainTextureHeight,
        .center.x = 0.5f,
        .center.y = 0.5f,
        .depth = 1.0f,
        .angle = 0.0f //C3D_AngleFromDegrees( 270.0f )
    }
};

float ScaleX = 1.0f; //0.78125f;
float ScaleY = 1.0f; //0.625f;

int ScrollX = 0;
int ScrollY = 0;

void SetUnscaled( void ) {
    C3D_TexSetFilter( ScreenSprite.image.tex, GPU_NEAREST, GPU_NEAREST );
    C2D_SpriteSetScale( &ScreenSprite, 1.0f, 1.0f );
}

void SetScaled( void ) {
    C3D_TexSetFilter( ScreenSprite.image.tex, GPU_LINEAR, GPU_LINEAR );
    C2D_SpriteSetScale( &ScreenSprite, ScaleX, ScaleY );
}

void Scroll( void ) {
    C2D_SpriteSetPos( &ScreenSprite, ( float ) ScrollX, ( float ) ScrollY );
}

C3D_RenderTarget* MainRenderTarget = NULL;
uint16_t* MainScreenBuffer = NULL;

bool CreateMainScreen( void ) {
    MainRenderTarget = C2D_CreateScreenTarget( GFX_TOP, GFX_LEFT );

    if ( MainRenderTarget ) {
        MainScreenBuffer = ( uint16_t* ) linearAlloc( MainTextureSize );

        if ( MainScreenBuffer ) {
            memset( MainScreenBuffer, 0, MainTextureSize );

            return C3D_TexInit( ScreenSprite.image.tex, 512, 512, GPU_RGB565 );
        }
    }

    return false;
}

void DestroyMainScreen( void ) {
    if ( MainScreenBuffer ) {
        linearFree( MainScreenBuffer );
    }

    C3D_TexDelete( ScreenSprite.image.tex );
}

void Init_8BPP( void ) {
    int r = 0;
    int g = 0;
    int b = 0;
    int i = 0;

    for ( i = 0; i < 256; i++ ) {
        r = ( ( uint16_t* ) CLUT_reds )[ i ] >> 11;
        g = ( ( uint16_t* ) CLUT_greens )[ i ] >> 10;
        b = ( ( uint16_t* ) CLUT_blues )[ i ] >> 11;

        ConversionTable[ i ] = RGB565( r, g, b );
    }
}

void Unpack8BPP( const uint8_t* PixelsIn, uint16_t* PixelsOut, size_t InputLength ) {
    while ( InputLength-- ) {
        *PixelsOut++ = ConversionTable[ *PixelsIn++ ];
    }
}

void UpdateMainScreen_8BPP( void ) {
    Unpack8BPP( ( const uint8_t* ) fb8bpp, MainScreenBuffer, 512 * 384 );

    GSPGPU_FlushDataCache( MainScreenBuffer, 512 * 384 * 2 );

    GX_DisplayTransfer( 
        ( uint32_t* ) MainScreenBuffer, 
        GX_BUFFER_DIM( MainTextureWidth, MainTextureHeight ), 
        ScreenSprite.image.tex->data, 
        GX_BUFFER_DIM( MainTextureWidth, MainTextureHeight ), 
        GX_TRANSFER_OUT_FORMAT( GX_TRANSFER_FMT_RGB565 ) |
        GX_TRANSFER_IN_FORMAT( GX_TRANSFER_FMT_RGB565 ) |
        GX_TRANSFER_OUT_TILED( 1 ) |
        GX_TRANSFER_FLIP_VERT( 0 )
    );
}

#if 0
void UpdateMainScreen_4BPP( void ) {
    const size_t Size = ( 512 * 384 ) / 2;

    Unpack4BPP( &( ( unsigned char* ) fb4bpp )[ 0 ], ( uint32_t* ) &MainScreenBuffer[ 0 ], Size );

    GSPGPU_FlushDataCache( MainScreenBuffer, Size );

    GX_DisplayTransfer( 
        ( uint32_t* ) MainScreenBuffer, 
        GX_BUFFER_DIM( MainTextureWidth, MainTextureHeight ), 
        ScreenSprite.image.tex->data, 
        GX_BUFFER_DIM( MainTextureWidth, MainTextureHeight ), 
        GX_TRANSFER_OUT_FORMAT( GX_TRANSFER_FMT_RGB565 ) |
        GX_TRANSFER_IN_FORMAT( GX_TRANSFER_FMT_RGB565 ) |
        GX_TRANSFER_OUT_TILED( 1 ) |
        GX_TRANSFER_FLIP_VERT( 0 )
    );
}

void UpdateMainScreen_1BPP( void ) {
    const size_t Size = ( 512 * 342 ) / 8;

    Unpack_1BPP( FBImage, MainScreenBuffer, Size );

    GSPGPU_FlushDataCache( MainScreenBuffer, Size );

    GX_DisplayTransfer( 
        ( uint32_t* ) MainScreenBuffer, 
        GX_BUFFER_DIM( MainTextureWidth, MainTextureHeight ), 
        ScreenSprite.image.tex->data, 
        GX_BUFFER_DIM( MainTextureWidth, MainTextureHeight ), 
        GX_TRANSFER_OUT_FORMAT( GX_TRANSFER_FMT_RGB565 ) |
        GX_TRANSFER_IN_FORMAT( GX_TRANSFER_FMT_RGB565 ) |
        GX_TRANSFER_OUT_TILED( 1 ) |
        GX_TRANSFER_FLIP_VERT( 0 )
    );
}

void UpdateMainScreenPartial_1BPP( int Top, int Bottom ) {
    Unpack_1BPP(
        &FBImage[ ( Top * EmuScreenWidth ) / 8 ],
        &MainScreenBuffer[ Top * MainTextureWidth ],
        ( ( Bottom - Top ) * EmuScreenWidth ) / 8
    );

    GSPGPU_FlushDataCache( MainScreenBuffer, MainTextureSize );

    GX_DisplayTransfer( 
        ( uint32_t* ) MainScreenBuffer, 
        GX_BUFFER_DIM( MainTextureWidth, MainTextureHeight ), 
        ScreenSprite.image.tex->data, 
        GX_BUFFER_DIM( MainTextureWidth, MainTextureHeight ), 
        GX_TRANSFER_OUT_FORMAT( GX_TRANSFER_FMT_RGB565 ) |
        GX_TRANSFER_IN_FORMAT( GX_TRANSFER_FMT_RGB565 ) |
        GX_TRANSFER_OUT_TILED( 1 ) |
        GX_TRANSFER_FLIP_VERT( 0 )
    );
}
#endif

int main( void ) {
    uint32_t Keys_Held = 0;
    uint32_t Keys_Down = 0;
    uint32_t Keys_Up = 0;
    uint64_t a = 0;
    uint64_t b = 0;
    float Time = 0.0f;

    gfxInitDefault( );
    consoleInit( GFX_BOTTOM, NULL );

    C3D_Init( C3D_DEFAULT_CMDBUF_SIZE );
    C2D_Init( C2D_DEFAULT_MAX_OBJECTS );
    C2D_Prepare( );

    if ( CreateMainScreen( ) ) {
        //Init_1BPP( );
        //Init_4BPP( );
        Init_8BPP( );

        printf( "hi!\n" );
        printf( "%zu\n", sizeof( CLUT_reds ) ); // sizeof yields size_t, so use %zu

        //APT_SetAppCpuTimeLimit( 75 );
        //osSetSpeedupEnable( true );

        a = svcGetSystemTick( );
            //UpdateMainScreenPartial_1BPP( 0, 342 );
            //UpdateMainScreen_4BPP( );
            UpdateMainScreen_8BPP( );
        b = svcGetSystemTick( );

        Time = ( float ) b - ( float ) a;
        Time = Time / CPU_TICKS_PER_MSEC;

        printf( "Full update took %.2fms\n", Time );

        //C2D_SpriteScale( &ScreenSprite, 0.78125, 0.78125 );
        //C3D_TexSetFilter( ScreenSprite.image.tex, GPU_LINEAR, GPU_LINEAR );

        SetUnscaled( );
        //SetScaled( );
        Scroll( );     

        while ( aptMainLoop( ) ) {
            hidScanInput( );

            Keys_Held = hidKeysHeld( );
            Keys_Down = hidKeysDown( );
            Keys_Up = hidKeysUp( );

            if ( Keys_Held & KEY_LEFT ) {
                ScrollX--;
            }

            if ( Keys_Held & KEY_RIGHT ) {
                ScrollX++;
            }

            if ( Keys_Held & KEY_UP ) {
                ScrollY--;
            }

            if ( Keys_Held & KEY_DOWN ) {
                ScrollY++;
            }

            Scroll( );

            C3D_FrameBegin( C3D_FRAME_SYNCDRAW );
                C2D_TargetClear( MainRenderTarget, C2D_Color32( 0, 255, 255, 255 ) );
                C2D_SceneBegin( MainRenderTarget );
                C3D_SetViewport( 0, 0, 240, 400 );
                C3D_SetScissor( GPU_SCISSOR_NORMAL, 0, 0, 240, 400 );
                C2D_DrawSprite( &ScreenSprite );
            C3D_FrameEnd( 0 );

            if ( Keys_Up & KEY_START ) {
                break;
            }

            printf( "\x1b[2J" );
            printf( "X: %d, Y: %d\n", ScrollX, ScrollY );
        }
    }

    DestroyMainScreen( );
   
    C2D_Fini( );
    C3D_Fini( );

    gfxExit( );
    return 0;
}
 

TarableCode

Here is a bit of odd behaviour I have noticed.
I moved my experimental framebuffer code from my demo project, where it took 2 ms to complete each time, into the emulator core, where it takes anywhere from 2 to 9 ms.

The same amount of data is getting converted, but how long it takes is seemingly random.
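
One way to pin down that kind of jitter is to record min/max/mean over many calls instead of looking at single readings. Below is a small sketch of such an accumulator. The TimingStats type and the timing_* functions are hypothetical helpers, not part of libctru or citro2d; the samples are assumed to be in milliseconds.

```c
#include <stddef.h>

/* Hypothetical helper (not from any library): running statistics
   over per-call timings given in milliseconds. */
typedef struct {
    double min_ms;
    double max_ms;
    double sum_ms;
    size_t count;
} TimingStats;

/* Clear all counters before a measurement run. */
void timing_reset(TimingStats *s) {
    s->min_ms = 0.0;
    s->max_ms = 0.0;
    s->sum_ms = 0.0;
    s->count = 0;
}

/* Record one sample; the first sample initialises min and max. */
void timing_add(TimingStats *s, double ms) {
    if (s->count == 0 || ms < s->min_ms) s->min_ms = ms;
    if (s->count == 0 || ms > s->max_ms) s->max_ms = ms;
    s->sum_ms += ms;
    s->count++;
}

/* Mean of all samples so far, or 0 if nothing was recorded. */
double timing_mean(const TimingStats *s) {
    return s->count ? s->sum_ms / (double)s->count : 0.0;
}
```

On the 3DS, each sample could come from the same svcGetSystemTick( ) / CPU_TICKS_PER_MSEC pattern used in the code posted earlier in the thread. Printing the three values every few hundred frames should show whether the slow calls are rare spikes or a steady spread.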
 

darkweb

Well-Known Member
Newcomer
Joined
Mar 15, 2020
Messages
45
Trophies
0
Age
39
XP
346
Country
Canada

3dsxtool gives me 'unaligned relocation!' when compiling... but no more info. What does this mean? Does anyone know? Thanks!
I know this is a reply to an old post, but it was difficult to track down how to solve this for my own homebrew as well as DevilutionX, and it may help people in the future.

The issue is most likely a #pragma pack in the code, which tells the compiler to lay out struct members without alignment padding. A global object of such a struct can then hold a pointer at an offset that is not 32-bit aligned, and 3dsxtool fails on it.
E.g. if you have something like this, myPointer won't be properly aligned and 3dsxtool will fail.
Code:
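#pragma pack(1)
struct myStruct
{
    i8 var1;          // 8-bit member, so the next member starts at offset 1
    char* myPointer;  // pointer at an unaligned offset: this trips 3dsxtool
};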

There are two ways to deal with this: either pad the struct so that the pointer starts on a 32-bit boundary, or remove the #pragma and let the compiler do its job.
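
To make the difference visible, here is a minimal sketch that compares the pointer offset in the three layouts. The struct and function names are made up for illustration, and int8_t stands in for the i8 typedef. Note that it only checks member offsets inside the struct; the object itself still has to end up at an aligned address.

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
/* Packed: no padding after var1, so the pointer sits at offset 1. */
struct PackedStruct {
    int8_t var1;
    char  *myPointer;
};

/* Still packed, but padded by hand so the pointer starts at offset 4. */
struct PaddedStruct {
    int8_t  var1;
    uint8_t pad[3];
    char   *myPointer;
};
#pragma pack(pop)

/* No pragma: the compiler inserts the padding itself. */
struct NaturalStruct {
    int8_t var1;
    char  *myPointer;
};

/* Compile-time checks of the layouts described above. */
_Static_assert(offsetof(struct PackedStruct, myPointer) == 1,
               "packed pointer is unaligned");
_Static_assert(offsetof(struct PaddedStruct, myPointer) % 4 == 0,
               "manual padding restores 32-bit alignment");
_Static_assert(offsetof(struct NaturalStruct, myPointer) % _Alignof(char *) == 0,
               "default layout aligns the pointer");

/* Runtime accessors, handy for printing the offsets. */
size_t packed_pointer_offset(void)  { return offsetof(struct PackedStruct, myPointer); }
size_t padded_pointer_offset(void)  { return offsetof(struct PaddedStruct, myPointer); }
size_t natural_pointer_offset(void) { return offsetof(struct NaturalStruct, myPointer); }
```

If the packing is only needed for a few on-disk or network structs, wrapping just those in push/pop (as above) rather than using a bare #pragma pack(1) keeps it from leaking into every struct declared afterwards.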
 
