Homebrew Homebrew Development

Badda

me too
Member
Joined
Feb 26, 2016
Messages
318
Trophies
0
Location
under the bridge
XP
2,404
Country
Tokelau
Why would it be a lot of work? The 3DS has cooperative threads btw, you must manually yield so other threads can run. If you use svcSleepThread on both, that should make the home button work.
Yes, that's another possibility, but I would need to synchronize the threads so they all sleep at the same time. Otherwise I would kill performance. BTW, performance is the reason why I don't want the call to gspWaitForVBlank() in the main thread, too...
 

MasterFeizz

Well-Known Member
Member
Joined
Oct 15, 2015
Messages
1,098
Trophies
1
Age
29
XP
3,710
Country
United States
Yes, that's another possibility, but I would need to synchronize the threads so they all sleep at the same time. Otherwise I would kill performance. BTW, performance is the reason why I don't want the call to gspWaitForVBlank() in the main thread, too...
I don't think you're understanding how it works. Why would you have 2 threads and have both of them sleep? Why are you using 2 threads anyways?
 

TarableCode

Well-Known Member
Member
Joined
Mar 2, 2016
Messages
184
Trophies
0
Age
37
XP
319
Country
Canada
Did anyone ever figure out if you could do indexed textures on the GPU without needing to convert them manually?
This is one of the biggest bottlenecks in Mini vMac right now.
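For readers unfamiliar with the conversion being discussed: doing it manually amounts to one palette lookup per pixel on the CPU. A minimal sketch in plain C, assuming an 8-bit indexed source and a 256-entry 32-bit palette. The names `expand_indexed8` and `palette` are hypothetical and are not taken from Mini vMac or from citro3d.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Expand 8-bit palette indices into 32-bit colours, one lookup per pixel.
// `palette` must hold 256 entries so that every possible index byte is valid.
void expand_indexed8(const uint8_t *src, uint32_t *dst, size_t count,
                     const uint32_t palette[256])
{
    assert(src != NULL && dst != NULL && palette != NULL);

    for (size_t i = 0; i < count; i++) {
        dst[i] = palette[src[i]];
    }
}
```

On the 3DS the expanded buffer would still have to be uploaded into a texture afterwards, which is part of why this step shows up as a bottleneck.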
 

Badda

That was your suggestion: "If you use svcSleepThread on both, that should make the home button work."
And two threads are necessary if you don't want to lose a lot of performance calling gspWaitForVBlank(). This way, you can have thread 1 rendering the scene while the other thread is waiting for the VBlank. As soon as the VBlank arrives, thread 2 transfers the texture for the most current frame and draws it to the screen. Afterwards, thread 1 can continue rendering seamlessly. This solution is useful if your rendering engine in thread 1 produces less than 50fps, so you don't lose additional time in thread 1 waiting for the VBlank.
 

MasterFeizz

I wasn't giving you a suggestion. You use svcSleepThread to yield the thread so the other one can resume. Either way, you don't need a second thread. You can omit the wait, or use gspSetEventCallback to trigger the frameswap.
 

Badda

Wow nice :-) gspSetEventCallback might be a solution to my problem - thanks for the hint!

--------------------- MERGED ---------------------------

By the way ... you say I always have to yield to let another thread run. What about threads running on another core? They should be able to run without the other threads sleeping, right?
 

MasterFeizz

If they run on another core, yes, but the o3ds only has 2 cores and 1 is mostly reserved for the system, so you will still have to yield. On n3ds you can use threads on the extra cores without yielding, but your app will need to have some sort of synchronization.
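To make the yielding point concrete, here is a toy model in plain C of a single core with cooperative scheduling. It is only an illustration and not how the real 3DS kernel works (which also takes thread priorities into account). The function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Toy model of one core with cooperative scheduling: the running task keeps
// the core until it yields, then the next task in round-robin order runs.
// yields[i] says whether task i yields after each step; steps[i] receives
// how many of the `ticks` steps task i got to run.
void simulate_cooperative(const bool *yields, unsigned *steps,
                          size_t task_count, unsigned ticks)
{
    assert(yields != NULL && steps != NULL && task_count > 0);

    for (size_t i = 0; i < task_count; i++) {
        steps[i] = 0;
    }

    size_t current = 0;
    for (unsigned t = 0; t < ticks; t++) {
        steps[current]++;
        if (yields[current]) {
            current = (current + 1) % task_count;  // hand the core over
        }
        // Otherwise the same task runs again: nothing preempts it.
    }
}
```

With two tasks where the first never yields, the second gets zero steps, which is the same shape of problem as a main thread that never sleeps and a system that never gets to handle the home button.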
 

darkweb

Well-Known Member
Newcomer
Joined
Mar 15, 2020
Messages
45
Trophies
0
Age
39
XP
346
Country
Canada
Those times are long past :-) In all homebrew that I know, the HOME button works. Apart from this, another symptom of the home button not working is that the app will not sleep when shutting the 3DS lid or, even worse, not wake up when opening the lid. The small example program above shows the issue: it will not sleep when closing the lid.
:O I'm running into that exact problem where my homebrew doesn't wake up after opening the lid!

The game that I ported has a whole bunch of SDL_PollEvent loops in it so I think it keeps getting tripped up in there. Please post your final solution as I'd love to see what you come up with to solve the problem!
 

Badda

Either way, you don't need a second thread. You can omit the wait, or use gspSetEventCallback to trigger the frameswap.
I played around a bit with the gspSetEventCallback() function and found a couple of issues:
  1. The callback does not run in the main thread but rather in the GSP thread and thus does not solve my issue with the non-working home button
  2. A call to C3D_FrameBegin() in the callback function fails - no idea why. The call did not fail in the old code where it was called directly after gspWaitForVBlank()
This is driving me crazy ... Everything is working fine in my homebrew - except the home button. And I don't really want to sacrifice performance by including sleep periods in the main thread :cry:
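Regarding issue 2, one common way to keep drawing calls on the main thread is to let the callback only record that the event happened and have the main loop pick it up. A minimal sketch in portable C11, assuming the callback has the `void (*)(void *)` shape and receives a user pointer, as the `data` argument of gspSetEventCallback() suggests. The names `on_vblank` and `take_vblank` are hypothetical, and this does not by itself address the home button problem.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

// Event callback: may run on a different thread, so it only sets a flag.
// `data` is expected to point at the atomic_bool shared with the main loop.
void on_vblank(void *data)
{
    assert(data != NULL);
    atomic_store((atomic_bool *)data, true);
}

// Called from the main loop. Returns true once per recorded event and
// clears the flag, so the caller can draw the frame on the main thread.
bool take_vblank(atomic_bool *pending)
{
    assert(pending != NULL);
    return atomic_exchange(pending, false);
}
```

The main loop would then call `take_vblank()` each iteration and only start a frame when it returns true, doing other work (or sleeping briefly) otherwise.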
 

MasterFeizz


What homebrew?
 

MasterFeizz


Badda

You're a hero, that was really simple. Just adding svcSleepThread(1) to the main thread really fixed the issue - and I think sleeping 50ns per second won't affect performance too much :P
Thanks for the help! :bow:
@darkweb : That might help fix your issue too ...

By the way - IMHO, yielding to system processes should be part of aptMainLoop()
 

TarableCode

Why does the same function call take varying amounts of time?
If I measure the time taken between setting the RenderThreadReqUpdate variable and RenderThreadBusy being false, the time can vary wildly.

Sometimes it takes ~10ms, other times ~20ms.
Nothing makes seeeeeenseeeeeeee :(

Code:
#define __STDC_FORMAT_MACROS
#include <inttypes.h>

#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <citro2d.h>
#include <3ds.h>

#include "fb_conv.h"

#include "CLUT_blues.h"
#include "CLUT_reds.h"
#include "CLUT_greens.h"
#include "fb8bpp.h"

#include "sub_ui.h"

extern const uint8_t FBImage[ ];

#define MainTextureWidth 512
#define MainTextureHeight 512
#define MainTextureSize ( MainTextureWidth * MainTextureHeight * 4 )

// Cobbled together main framebuffer sprite
C2D_Sprite ScreenSprite = {
   .image = ( C2D_Image ) {
       .tex = &( C3D_Tex ) {
           // Gets initialized later
       },
       .subtex = &( const Tex3DS_SubTexture ) {
           .width = MainTextureWidth,
           .height = MainTextureHeight,
           .left = 0.0f,
           .right = 1.0f,
           .top = 1.0f,
           .bottom = 0.0f
       }
   },
   .params = ( C2D_DrawParams ) {
       .pos.x = 0.0f,
       .pos.y = 0.0f,
       .pos.w = MainTextureWidth,
       .pos.h = MainTextureHeight,
       .center.x = 1.0f,
       .center.y = 1.0f,
       .depth = 1.0f,
       .angle = 0.0f //-C3D_AngleFromDegrees( 90.0f )
   }
};

C3D_RenderTarget* MainRenderTarget = NULL;
uint32_t* MainScreenBuffer = NULL;

bool CreateMainScreen( void ) {
   MainRenderTarget = C2D_CreateScreenTarget( GFX_TOP, GFX_LEFT );

   if ( MainRenderTarget ) {
       MainScreenBuffer = ( uint32_t* ) linearAlloc( MainTextureSize );

       if ( MainScreenBuffer ) {
           memset( MainScreenBuffer, 0, MainTextureSize );

           return C3D_TexInit( ScreenSprite.image.tex, 512, 512, GPU_RGBA8 );
       }
   }

   return false;
}

void DestroyMainScreen( void ) {
   if ( MainScreenBuffer ) {
       linearFree( MainScreenBuffer );
   }

   C3D_TexDelete( ScreenSprite.image.tex );
}

int ShowError( const char* ErrorText ) {
   errorConf e;

   errorInit( &e, ERROR_TEXT_WORD_WRAP, CFG_LANGUAGE_EN );
   errorText( &e, ErrorText );
   errorDisp( &e );

   return 1;
}

volatile bool RenderThreadRun = true;
volatile bool RenderThreadReady = false;
volatile bool RenderThreadBusy = false;
volatile bool RenderThreadReqUpdate = false;

Thread RenderThreadHandle = NULL;

void RenderThread( void* Param ) {
   gfxInitDefault( );
   consoleInit( GFX_BOTTOM, NULL );

   C3D_Init( C3D_DEFAULT_CMDBUF_SIZE );
   C2D_Init( C2D_DEFAULT_MAX_OBJECTS );
   C2D_Prepare( );

   if ( CreateMainScreen( ) ) {
       RenderThreadReady = true;

       while ( RenderThreadRun ) {
           if ( RenderThreadReqUpdate == true ) {
               RenderThreadReqUpdate = false;
               RenderThreadBusy = true;
                   Unpack8BPP( ( const uint8_t* ) fb8bpp, MainScreenBuffer, 512 * 384 );
               RenderThreadBusy = false;
           }

           C3D_FrameBegin( C3D_FRAME_SYNCDRAW );
               // Flush and transfer main screen to texture
               GSPGPU_FlushDataCache( MainScreenBuffer, MainTextureSize );
               C3D_SyncDisplayTransfer(
                   ( uint32_t* ) MainScreenBuffer,
                   GX_BUFFER_DIM( MainTextureWidth, MainTextureHeight ),
                   ScreenSprite.image.tex->data,
                   GX_BUFFER_DIM( MainTextureWidth, MainTextureHeight ),
                   GX_TRANSFER_OUT_FORMAT( GX_TRANSFER_FMT_RGBA8 ) |
                   GX_TRANSFER_IN_FORMAT( GX_TRANSFER_FMT_RGBA8 ) |
                   GX_TRANSFER_OUT_TILED( 1 ) |
                   GX_TRANSFER_FLIP_VERT( 0 )
               );

               // Draw
               C2D_TargetClear( MainRenderTarget, C2D_Color32( 0, 255, 255, 255 ) );
               C2D_SceneBegin( MainRenderTarget );
               C2D_DrawSprite( &ScreenSprite );
           C3D_FrameEnd( 0 );           
       }

       DestroyMainScreen( );
   }

   C2D_Fini( );
   C3D_Fini( );

   gfxExit( );

   threadExit( 0 );
}

int main( void ) {
   uint32_t Keys_Down = 0;

   osSetSpeedupEnable( false );
   APT_SetAppCpuTimeLimit( 75 );

   RenderThreadHandle = threadCreate( RenderThread, NULL, 4096, 0x20, 1, false );

   if ( ! RenderThreadHandle ) {
       return ShowError( "Failed to create render thread" );
   }

   // Wait for render thread to init and start running
   while ( RenderThreadReady == false )
   ;

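   // Initial render req: mark busy before raising the request, otherwise
   // the spin below can fall through before the render thread picks it up
   RenderThreadBusy = true;
   RenderThreadReqUpdate = true;

   // Spin until it finishes
   while ( RenderThreadBusy )
   ;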

   printf( "Hi!\n" );

   while ( aptMainLoop( ) ) {
       hidScanInput( );
       Keys_Down = hidKeysDown( );

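       if ( Keys_Down & KEY_A ) {
           // Set busy first: if the render thread finished the request
           // before busy was set here, the spin would never end
           RenderThreadBusy = true;
           RenderThreadReqUpdate = true;

           while ( RenderThreadBusy )
           ;
       }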

       if ( Keys_Down & KEY_START ) {
           break;
       }
   }

   RenderThreadReqUpdate = false;
   RenderThreadRun = false;

   threadJoin( RenderThreadHandle, U64_MAX );
   threadFree( RenderThreadHandle );

   return 0;
}
 

MasterFeizz


First, you are using threads. Threads on the same core share the available resources, which will cause some variance depending on the priority, especially if you are running them on the system core. You are also using C3D_FRAME_SYNCDRAW, which is probably the biggest factor here.
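Given that kind of variance, a single measurement says little. A small sketch in plain C that summarizes many timing samples instead. The `TimingStats` type and `summarize_timings` function are hypothetical, and where the samples come from (for example a tick counter read before and after the request) is left as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t min;
    uint64_t max;
    uint64_t mean;  // integer mean, rounded down
} TimingStats;

// Reduce a set of timing samples (any unit) to min, max and mean.
TimingStats summarize_timings(const uint64_t *samples, size_t count)
{
    assert(samples != NULL && count > 0);

    TimingStats stats = { samples[0], samples[0], 0 };
    uint64_t total = 0;

    for (size_t i = 0; i < count; i++) {
        if (samples[i] < stats.min) stats.min = samples[i];
        if (samples[i] > stats.max) stats.max = samples[i];
        total += samples[i];
    }

    stats.mean = total / count;
    return stats;
}
```

A wide gap between min and max with a mean near the middle would be consistent with the request sometimes landing just before and sometimes just after a blocking frame.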
 

TarableCode

Okay that makes sense, removing the drawing calls brings it down to around the ~9ms range.
Though the weird part is that the RenderThreadBusy flag is set before drawing begins, yet removing the drawing calls speeds it up.

How would I best approach this so that each frame is handled smoothly to get ~60FPS?
It looks like the conversion routines are capable of it, it's just syncing and drawing now.
 

MasterFeizz

There are many things that you need to fix. Don't keep the thread running the whole time, use signals. Use a mutex so you don't write to shared resources at the same time. Use C3D_FRAME_NONBLOCK. And APT_SetAppCpuTimeLimit above 30 only works for .3dsx
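As an illustration of "use signals" rather than a spinning thread, here is a host-side sketch using POSIX threads. It is an analogy only: libctru has its own primitives (such as LightLock and LightEvent), and the mapping to them is an assumption not shown here. The `RenderWorker` type and the `worker_*` functions are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  wake;   // requester -> worker: a request or quit is pending
    pthread_cond_t  idle;   // worker -> requester: the request was handled
    bool requested;
    bool quit;
    int  frames_done;
    pthread_t thread;
} RenderWorker;

static void *worker_main(void *arg)
{
    RenderWorker *w = arg;
    pthread_mutex_lock(&w->lock);
    for (;;) {
        // Sleep until signalled instead of spinning on a flag.
        while (!w->requested && !w->quit) {
            pthread_cond_wait(&w->wake, &w->lock);
        }
        if (!w->requested) break;   // woken only to quit
        // Stand-in for the real work. The lock is held here, so the
        // requester cannot touch the shared data in the meantime.
        w->frames_done++;
        w->requested = false;
        pthread_cond_signal(&w->idle);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

// Returns 0 on success, like pthread_create().
int worker_start(RenderWorker *w)
{
    assert(w != NULL);
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->wake, NULL);
    pthread_cond_init(&w->idle, NULL);
    w->requested = false;
    w->quit = false;
    w->frames_done = 0;
    return pthread_create(&w->thread, NULL, worker_main, w);
}

// Ask for one unit of work and block (without spinning) until it is done.
void worker_request(RenderWorker *w)
{
    pthread_mutex_lock(&w->lock);
    w->requested = true;
    pthread_cond_signal(&w->wake);
    while (w->requested) {
        pthread_cond_wait(&w->idle, &w->lock);
    }
    pthread_mutex_unlock(&w->lock);
}

// Tell the worker to exit, wait for it, then release the primitives.
void worker_stop(RenderWorker *w)
{
    pthread_mutex_lock(&w->lock);
    w->quit = true;
    pthread_cond_signal(&w->wake);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->thread, NULL);
    pthread_cond_destroy(&w->idle);
    pthread_cond_destroy(&w->wake);
    pthread_mutex_destroy(&w->lock);
}
```

The worker uses no CPU time while waiting, and the flags are only ever read or written with the lock held, which avoids the ordering problems of plain volatile flags.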
 

TarableCode

Is the threading route even worth pursuing then?
I made an attempt at optimizing the framebuffer conversion using asm but it's not a huge amount faster plus it gives the wrong colours for some reason.

I feel kinda bad when it chugs a little bit, like I haven't done enough to make it faster.
 

MasterFeizz

If you implement it correctly it could help a bit. Can't say if it will do much though, especially in the case of the o3ds.
 

TarableCode

Okay, so I would change it to use a signal when the main thread is like "hey update the display", and the mutex is like "nah im busy, wait"?

I'm still looking for a good example of mutexes, is the caller the one that makes the mutex and the thread clears it?

Kinda sucks about the 30% time limit on the syscore though, what's the point of using a CIA build then?
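On the mutex question: in the usual pattern the mutex is created once, next to the data it protects, and every thread that touches that data locks it before and unlocks it after. The thread that locks is the one that unlocks; it is not a flag one side sets and the other clears. A host-side sketch with POSIX threads, as an analogy only (libctru's own lock types are not shown, and all names here are hypothetical):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;   // created once, alongside the data it protects
    long value;
} SharedCounter;

void counter_init(SharedCounter *c)
{
    assert(c != NULL);
    pthread_mutex_init(&c->lock, NULL);
    c->value = 0;
}

// Any thread may call this. Whoever locks is also the one who unlocks.
void counter_add(SharedCounter *c, long amount)
{
    pthread_mutex_lock(&c->lock);
    c->value += amount;
    pthread_mutex_unlock(&c->lock);
}

static void *add_many(void *arg)
{
    SharedCounter *c = arg;
    for (int i = 0; i < 100000; i++) {
        counter_add(c, 1);
    }
    return NULL;
}

// Runs two threads against the same counter and returns the final value.
long counter_demo(void)
{
    SharedCounter c;
    pthread_t a, b;

    counter_init(&c);
    int ra = pthread_create(&a, NULL, add_many, &c);
    int rb = pthread_create(&b, NULL, add_many, &c);
    assert(ra == 0 && rb == 0);

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_mutex_destroy(&c.lock);
    return c.value;
}
```

Without the lock around the increment, the two threads could overwrite each other's updates and the total would come out short.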
 
