video with audio playback

hoek67
Posts: 59
Joined: Thu Mar 28, 2019 1:11 am
Location: Mount Beauty, Victoria, Australia
OS: Windows 10, Linux
IDE: STM32duino via Visual Micro in VS 2017
Core: STM official (I think)
Board: STM32F407VET6

Post by hoek67 » Sat Mar 30, 2019 3:59 am

:shock: Finally got it all working under the STM32 environment with very pleasing results. The SD card has a LOT better bandwidth (compared to the Due using SPI and DMA), which will enable larger file sizes and smoother playback.

Had problems with sync last night... then found that the file I produced had the problem, and the code was faithfully playing it back with the inherited delay.

I have a program on my website that converts AV files into a known format for smaller devices.
http://kiweed-software.000webhostapp.com/a_news.php

It can extract a variety of resolutions and pixel bit depths. If audio is extracted there are a few sample rates to choose from, and video can be re-sampled to 12, 12.5, 15, ... etc. fps or left as is.

The program will also pad the video and audio out to 512-byte blocks to make seeking and reading more efficient.

File layout is [header][padding if blocked][f0 video][f0 audio][padding if blocked][f1 video][f1 audio][padding if blocked].....
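
The blocked layout above makes the position of any frame a simple calculation. A minimal sketch, assuming every frame record has the same video and audio size and that the header is padded to a block boundary as well; the function names and the example sizes are mine, not taken from the converter:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: names and the fixed-record-size assumption are
// illustrative, not the converter's documented format.
const uint32_t kBlockSize = 512;

// Round n up to the next multiple of the block size.
uint32_t roundUpToBlock(uint32_t n)
{
	return (n + kBlockSize - 1) / kBlockSize * kBlockSize;
}

// Byte offset of frame `frame` when each [video][audio] record is padded
// out to a whole number of 512-byte blocks.
uint32_t frameOffset(uint32_t headerBytes, uint32_t videoBytes,
                     uint32_t audioBytes, uint32_t frame)
{
	assert(videoBytes + audioBytes > 0);
	return roundUpToBlock(headerBytes)
	     + frame * roundUpToBlock(videoBytes + audioBytes);
}
```

For a 128*128 4bpp frame (8192 bytes) with 459 bytes of audio, each record rounds up to 8704 bytes, so every frame starts on a 512-byte boundary.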

If there is no audio, the timer is set to frame rate * 100 and every 100th tick advances the "nominal frame" (this was done because NTSC material runs at 23.98 FPS and I wanted to avoid using float). The player has to ensure that if the hardware can't render fast enough it skips frames instead of getting behind.
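
The tick counting and frame skipping described above could look roughly like this. It is only a sketch: `FrameClock` and `nextFrameToDraw` are illustrative names, and the timer interrupt is replaced by a plain method call so the idea can be tried on a PC.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative stand-in for the timer-driven state (not the library's types).
struct FrameClock
{
	uint8_t  ticksPerFrame;    // 100 when the timer runs at fps * 100 Hz
	uint8_t  ticks = 0;        // ticks since the last nominal frame
	uint32_t nominalFrame = 0; // frame that should be on screen right now

	explicit FrameClock(uint8_t perFrame = 100) : ticksPerFrame(perFrame)
	{
		assert(perFrame > 0);
	}

	// Call once per timer interrupt.
	void onTick()
	{
		if (++ticks >= ticksPerFrame)
		{
			ticks = 0;
			nominalFrame++;
		}
	}
};

// The renderer always jumps to the nominal frame, so a slow display skips
// frames instead of drifting behind the clock.
uint32_t nextFrameToDraw(const FrameClock &clock, uint32_t lastDrawn)
{
	return clock.nominalFrame > lastDrawn ? clock.nominalFrame : lastDrawn;
}
```

If rendering one frame takes 250 ticks, the clock has moved on to frame 2 by the time it finishes, so frame 1 is simply never drawn.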

Audio is a lot different, as "audio is king". Basically I play the audio normally; as each frame has a fixed number of bytes of audio, we know that when an audio block has finished it must be time for the next frame. I have 4 audio buffers: 2 are always loaded, the 3rd may or may not be loaded, and the 4th allows a bit of room between the play position and the loading position, as it's a cyclic buffer.
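
A rough model of the 4-buffer cyclic scheme, written so it can run on a PC. `AudioRing` and its members are made-up names, the timer interrupt is replaced by a direct call, and the blocks are filled with a test value instead of being read from the SD card:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

const uint8_t  kBuffers = 4;     // must be a power of two for the index mask
const uint16_t kMaxBlock = 1024; // largest audio block per frame

struct AudioRing
{
	uint8_t  data[kBuffers][kMaxBlock];
	uint16_t blockSize;   // audio bytes per video frame
	uint16_t playPos = 0; // byte position inside the playing buffer
	uint8_t  playBuf = 0; // buffer currently being played
	uint32_t loaded = 0;  // total blocks loaded so far
	uint32_t played = 0;  // total blocks fully played (= frames now due)

	explicit AudioRing(uint16_t size) : blockSize(size)
	{
		assert(size > 0 && size <= kMaxBlock);
		std::memset(data, 0, sizeof(data));
	}

	// Main-loop side: fill the next slot (here with a test pattern).
	void loadNext(uint8_t fill)
	{
		assert(loaded - played < kBuffers); // never overwrite the playing slot
		std::memset(data[loaded & (kBuffers - 1)], fill, blockSize);
		loaded++;
	}

	// Timer side: fetch one sample. Returns true when a block just finished,
	// which is the cue to show the next frame and load another block.
	bool nextSample(uint8_t &sample)
	{
		sample = data[playBuf][playPos];
		if (++playPos < blockSize) return false;
		playPos = 0;
		playBuf = (playBuf + 1) & (kBuffers - 1);
		played++;
		return true;
	}
};
```

Pre-loading 3 of the 4 slots leaves one slot of slack, so the loader can refill behind the play position without ever touching the buffer that is currently being played.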

Audio is split evenly between all video frames, so a faster fps means smaller audio blocks per frame. The maximum is 1024 bytes of audio per frame, meaning fps and audio quality need to be adjusted to keep each block within 1024 bytes.
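
To check whether a given combination fits, the per-frame audio size can be worked out with integer maths. This is a sketch under assumptions: the helper names are hypothetical, the frame rate is passed as fps * 1000 to avoid float, and truncating the result is my guess at the rounding (the converter may round differently):

```cpp
#include <cassert>
#include <cstdint>

const uint32_t kMaxAudioBlock = 1024; // buffer size per frame in the player

// Audio bytes carried by each video frame. fpsTimes1000 is the frame rate
// in 1/1000 fps (23976 = 23.976 fps).
uint32_t audioBytesPerFrame(uint32_t sampleRateHz, uint32_t bytesPerSample,
                            uint32_t fpsTimes1000)
{
	assert(fpsTimes1000 > 0);
	uint64_t scaled = (uint64_t)sampleRateHz * bytesPerSample * 1000u;
	return (uint32_t)(scaled / fpsTimes1000); // truncates (assumption)
}

// True if one frame's worth of audio fits in a single audio buffer.
bool fitsAudioBlock(uint32_t sampleRateHz, uint32_t bytesPerSample,
                    uint32_t fpsTimes1000)
{
	return audioBytesPerFrame(sampleRateHz, bytesPerSample, fpsTimes1000)
	       <= kMaxAudioBlock;
}
```

11025 Hz 8-bit audio at 23.976 fps works out to 459 bytes per frame, which fits; 44100 Hz 16-bit audio at 24 fps works out to 3675 bytes, which does not.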

ATM playing 23.98 FPS with 11025 Hz 8-bit audio on an STM32F407VET6 using the inbuilt SDIO SD, inbuilt DAC and an SSD1327 128*128 4bpp OLED.

Bit of code below... may help anyone else interested, as video and audio sync are always a pain. It uses quite a bit of the library I've been working on for ages, but the basic flow of the program should be understandable.

Code: Select all

#include <ks_pinmap.h>
#include <stm32sd.h>
#include <ks_utils.h>
#include <video.h>
#include <new_dib_4.h>  
#include <OLED_SSD1327.h>

cOLED_SSD1327 gOLED; // OLED output

#define OLED_BPP 4

#if (OLED_BPP ==  16) 
#define RENDER_SCREEN gOLED._1351renderScreen((uint16_t *)gDIB.getBuffer()); 
cDIB_virt16bppStream gDIB; // virtual screen buffer
#elif (OLED_BPP == 1)
#define RENDER_SCREEN  gOLED.renderScreen1BPP((uint8_t *)gDIB.getBuffer()); 
cDIB_virt1bppStream gDIB; // virtual screen buffer
#elif (OLED_BPP == 4)
#define RENDER_SCREEN  gOLED.renderScreen4BPP((uint8_t *)gDIB.getBuffer()); 
cDIB_virt4bpp gDIB; // virtual screen buffer
#endif

void loop(void)
{
	showbitmap();
	delay(2000);
}
  
void setup(void)
{
	delay(500); // give other stuff time to warm up init etc
	Serial.begin(9600);
	Serial.println("Init");

#if (OLED_BPP == 16)
	gOLED.begin(OLED_init1351_128_128_15, MOSI_PIN, SCK_PIN, PIN_OLED_CS, PIN_OLED_RESET, PIN_OLED_DC);
#else
	gOLED.begin(OLED_init1327_128_128_4BPP,  PE_13, PE_9, PE_11, SPISettings(24000000));
#endif

	Serial.print("OLED DMA=");
	Serial.println(gOLED.isDMA());
	
	gDIB.begin(gOLED.getWidth(), gOLED.getHeight());

	Serial.println(gDIB.getWidth());
	Serial.println(gDIB.getHeight());
	Serial.println("Begin!");
	Serial.flush();

	gDIB.clearScreen();
	RENDER_SCREEN

	while (SD.begin(SD_DETECT_NONE) != TRUE)
	{
		 Serial.println("***FAIL*** - SD card init failed!");
		 delay(2000);		
	}

	Serial.println("SD Found!");	
} 

void showbitmap()
{
	char buff[100];
	bool mute = false;
	
	uint32_t t0 = millis();

	const char *arr_files[] = { "PWR4.LMB" };
	uint32_t ret;
	
	for (int fn = 0; fn < 1; fn++)
	{
		gDIB.Bitmap.LoadFromFile((char *)arr_files[fn]); 

		ret = streamVideo1351(0, 0, gOLED, &gDIB.Bitmap, gDIB.getBuffer(), gDIB.getBufferSize(), (int)(gDIB.Bitmap.getFPS() * 0.0f), (int)(gDIB.Bitmap.getFPS() * 95.0f), mute);
		
		uint32_t v1;
		uint32_t v2;

		streamGetDMAStats(&v1, &v2);

		if (v1)
		{
			sprintf(buff, "* ret=%lu lines=%lu %6.2f%% were still waiting on DMA transfer!!\n", (unsigned long)ret, (unsigned long)v1, ((float)v2 * 100.0f) / (float)v1);
			Serial.print(buff);
		}

		gDIB.Bitmap.Clear(); // closes files etc unloads resource
	}

	t0 = millis() - t0;

	sprintf(buff, "*\n*\n*  Timing = %6.2f seconds!!\n*\n*\n", (float)t0 / 1000.0f); // millis() counts milliseconds
	Serial.print(buff);	
}

Code that does the rendering etc..

video.h

Code: Select all

#pragma once
 
#include <new_dib.h>
#include <new_dib_1.h>
#include <new_dib_4.h>
#include <new_dib_16.h>

#include <OLED_SSD1327.h>

#define VIDEO_ERR(x) (x)

uint32_t streamVideo1351(int16_t vx, int16_t vy, cOLED_SSD1327 &dev, cMCU_Resource *res, void *buffer, uint16_t buffer_len, uint16_t fr_start, uint16_t fr_end, bool mute);

__INLINE__ uint32_t streamVideo1351(cOLED_SSD1327 &dev, cMCU_Resource *res, void *buffer, uint16_t buffer_len, uint16_t fr_start, uint16_t fr_end, bool mute)
{
	return streamVideo1351(0, 0, dev, res, buffer, buffer_len, fr_start, fr_end, mute);
}

void streamGetDMAStats(uint32_t *lines, uint32_t *waiting);


video.cpp ... still a lot of WiP to clean up. The old DMA code allowed me to call transfer and then process pixels while the transfer was happening.

Eventually the audio and video rendering will be callback functions. The video player will just do all the loading, sync etc.

It should be noted that 2 file streams are used, one for video and the other for audio. That way each seek is always going forward and not too far from the current position, which makes a big difference because of the way fatfs (and sdfat) work.

Code: Select all

#include <video.h>
#include <timers.h>

/*
VIDEO STREAMING ... oddly enough needs streaming and DMA and hopefully has ability to do work while DMA is transferring in the background
*/

#define VIDEO_MAX_AUDIO_BUFFERS 4

#define _DEBUG 

#define MAX_SCAN_WIDTH 256

#define ENABLE_SOUND_DAC  // comment out to disable sound output

//#define SCAN_DMA_LOG	 // if defined will show how much of the inter-scanline processing and looping was done while waiting for DMA

#ifdef SCAN_DMA_LOG
	uint32_t sl_lines;
	uint32_t sl_all;

	void streamGetDMAStats(uint32_t *lines, uint32_t *waiting)
	{
		*lines = sl_lines;
		*waiting = sl_all;
	}

#else
	void streamGetDMAStats(uint32_t *lines, uint32_t *waiting)
	{
		*lines = 0;
		*waiting = 0;
	}
#endif

static volatile uint16_t rev_buff_ptr; // what byte in the audio block we're up to
static volatile uint8_t rev_buff_buff; // gets incremented each time we've read a full audio block... wraps to 0... 0,1,2,3,0,1,2,3
static uint16_t rev_size; // local copy of audio block size... each frame has this number of bytes .. or 0 if no audio
static volatile uint8_t next_video_ready;

static volatile int16_t video_last;	 // nominal frame... one we should be on at certain point in time
static volatile int16_t audio_last; // audio (if any) just plays sequentially and at the end of each block it means a new video frame should be grabbed and displayed
static uint8_t fps_count; // if no audio... when this hits 100 we need next video frame

static uint8_t m_audio_buffs[VIDEO_MAX_AUDIO_BUFFERS][1024]; // eventually all these variable will be in a single struct and allocated!!!!

void __INLINE__ rev_load(cMCU_Resource *res) // load next audio block into buffer and increment the load counter
{
	uint8_t idx = (audio_last & (VIDEO_MAX_AUDIO_BUFFERS - 1));

	if (audio_last >= res->getFrameCount())
	{		
		memset((void *)m_audio_buffs[idx], 0, rev_size);
	}
	else
	{
		res->LoadAudioBlock((void *)m_audio_buffs[idx], audio_last);
	}

	audio_last++;
}

void __INLINE__ rev_buff_init(cMCU_Resource *res)
{
	rev_size = res->getAudioBlockSize(); // size

	Serial.print("Audio block size...");
	Serial.println(rev_size);

	rev_buff_ptr = 0;
	rev_buff_buff = 0; 

	for (uint8_t i = 0; i < VIDEO_MAX_AUDIO_BUFFERS - 1; i++)
	{
		rev_load(res);
	}
}

static uint8_t __INLINE__ rev_get_byte() // get the current audio sample
{
	return m_audio_buffs[rev_buff_buff][rev_buff_ptr];
}

static void __INLINE__ rev_inc_ptr() // increment the playback position if last byte in audio block... then tell video it's time to display next frame
{
	if (++rev_buff_ptr == rev_size)	// when each audio buffer completes it's time for a new video frame... and time to grab another audio buffer
	{
		rev_buff_ptr = 0;
		rev_buff_buff = (rev_buff_buff + 1) & (VIDEO_MAX_AUDIO_BUFFERS - 1);
	
		next_video_ready = 1;
		video_last++; // nominal frame
	}
}


void video_send_sound()	  // still waiting for dac
{
#ifdef ENABLE_SOUND_DAC
	DAC->DHR12R1 = (uint32_t)rev_get_byte() * 8; // 8-bit sample scaled into the 12-bit register (0..2040, about half range) ... eventually support 16 bit -> 12 bit!
	DAC->SWTRIGR |= DAC_SWTRIGR_SWTRIG1; // software trigger latches the new value
#endif
	rev_inc_ptr();	// sound has 4 cyclic buffers so usually 2-3 full ones pre-loaded
}


void video_send_no_sound()	  // no sound... each tick is 1/100th of a frame... so need to wait for 100 before next frame
{
	if (++fps_count >= 100)
	{
		fps_count = 0;
		next_video_ready = 1;
		video_last++; // nominal frame	
	}
}

uint32_t streamVideo1351(int16_t vx, int16_t vy, cOLED_SSD1327 &dev, cMCU_Resource *res, void *buffer, uint16_t buffer_len, uint16_t fr_start, uint16_t fr_end, bool mute)
{
	uint8_t test = false;

	uint8_t oled_bpp = dev.getBPP();
	uint16_t ver = dev.getVersion();
	float fps = res->getFPS();
	uint8_t pixel_mode = res->getPixelFormat();
	uint16_t vw = res->getMaxWidth();
	uint16_t vh = res->getMaxHeight();

	int16_t vscan = (vw * res->getBPP()) / 8;

	if (!dev.isDMA())
	{
		return VIDEO_ERR(1); // no DMA support
	}

	if (!res->isStreaming())  // must be open
	{
		return VIDEO_ERR(2); // not streaming
	}
	
	if (fps == 0.0f)
	{
		return VIDEO_ERR(3); // not an animation with fps
	}

	/*
	Init DAC ... stm32f407vet6 ATM
	*/

	 RCC->APB1ENR |= RCC_APB1ENR_DACEN;
     DAC->CR |= DAC_CR_TSEL1;
     DAC->CR |= DAC_CR_TEN1;
     DAC->CR |= DAC_CR_EN1;

#ifdef SCAN_DMA_LOG
	sl_lines = sl_all = 0;
#endif
	
	fps_count = 0; // used if no sound
	
	if (pixel_mode != _MCU_PIXEL_MODE_MONO &&
		pixel_mode != _MCU_PIXEL_MODE_PAL2 &&
		pixel_mode != _MCU_PIXEL_MODE_PAL4 &&
		pixel_mode != _MCU_PIXEL_MODE_PAL8 &&
		pixel_mode != _MCU_PIXEL_MODE_RGB565)
	{
		return VIDEO_ERR(4); // mode not supported
	}

	// to do ... more checks
	
	if (false && ver != 1351) // if can't support update "window" ... mucking around to put in place
	{
		vx = 0;
		vy = 0;

		if (dev.getWidth() != vw || dev.getHeight() != vh)
		{			
			return VIDEO_ERR(5); // 
		}

		// if oled and res width and height mismatch ... exit ... for now ... as can't set window
	}

	// height check

	if (vy >= dev.getHeight())
	{
		return VIDEO_ERR(6); // fully clipped 
	}

	if (vy + vh >= dev.getHeight())
	{
		vh = dev.getHeight() - vy;
	}

	// width check

	if (vx >= dev.getWidth())
	{
		return VIDEO_ERR(6); // fully clipped
	}

	if (vx + vw >= dev.getWidth())
	{
		vw = dev.getWidth() - vx;
	}

	next_video_ready = 2;

	video_last = audio_last = fr_start; // none shown
	
#ifdef _DEBUG
	Serial.print(F("\nfps="));
	Serial.print(fps);
	Serial.print(F("  ab="));
	Serial.print(res->getAudioBlockSize());
	Serial.print(F("  hz="));
	Serial.println(res->getAudioRate());
#endif
   	
	if (1) // ver == 1351) // if can be windowed
	{
		//dev._chip_enable();
		//dev._setWindow(vx, vy, vw - 1, vh - 1); // using this all the time
		//dev._chip_disable();
	}

	uint8_t hasAudio = mute ? false : res->hasAudio();
		
	if (test) // if set to true no audio is played AND frames will run as fast as possible... basically a hardware speed test
	{
		// nothing
	}
	else if (hasAudio)
	{
		rev_buff_init(res);
		Timer1.start(video_send_sound,(uint32_t) res->getAudioRate());
	}
	else
	{
		Timer1.start(video_send_no_sound, fps * 100.0f);
	}

	if (fr_end >= res->getFrameCount())
	{
		fr_end = res->getFrameCount() - 1;
	}

	while (video_last <= fr_end)
	{
		if (next_video_ready == 1) // also means next audio block needs to be loaded
		{
			if (hasAudio)
			{
				rev_load(res); // audio has its own stream so video and audio don't fluff each other up
			}
			next_video_ready = 2;
			continue;
		}
		else if (test || next_video_ready == 2)
		{
			next_video_ready = 0;
			res->PushRawFrameToBuffer(buffer, video_last); // should load asap ... so not being done AFTER it's needed

			if (test) { video_last++;  }
		}
		else
		{
			continue;
		}

		// common start of frame setup

		//if (ver == 1351)
		//{
		//	dev._chip_enable();
		//	dev._enableWriteRAM();
		//}

		// pixel mode specific stuff

		if (oled_bpp == 16)
		{
			uint16_t sbuff16[MAX_SCAN_WIDTH * 2];  // allow for 160 * 128

			if (pixel_mode == _MCU_PIXEL_MODE_PAL8)
			{
				uint16_t *pal = res->getPalettePtr16();
				uint8_t *b1 = (uint8_t *)buffer;
				uint8_t *b2 = b1;
				uint16_t disp = 0;

				for (uint16_t y = vh; y--; )
				{
					uint16_t *tb = &sbuff16[disp];

					for (uint16_t i = vw; i--;)
					{
						*tb = pal[*b1];
						tb++;
						b1++;
					}

					b1 = b2 + vscan;
					b2 = b1;

#ifdef SCAN_DMA_LOG
					sl_all += (dev._do_wait() & 1);
					sl_lines++;
#else
					dev._do_wait();
#endif

					dev._write(&sbuff16[disp], vw * 2, false); // no wait								
					disp ^= MAX_SCAN_WIDTH;
				}
			}
			else if (pixel_mode == _MCU_PIXEL_MODE_PAL4) // covers gray4 and bgr4
			{
				uint16_t *pal = res->getPalettePtr16();
				uint8_t *b1 = (uint8_t *)buffer;
				uint8_t *b2 = b1;
				uint16_t disp = 0;

				for (uint16_t y = vh; y--;)
				{
					uint16_t *tb = &sbuff16[disp];

					for (uint16_t i = vw / 2; i--;)
					{
						*tb = pal[*b1 >> 4]; tb++;
						*tb = pal[*b1 & 15]; tb++;
						b1++;
					}

					b1 = b2 + vscan;
					b2 = b1;

#ifdef SCAN_DMA_LOG
					sl_all += (dev._do_wait() & 1);
					sl_lines++;
#else
					dev._do_wait();
#endif

					dev._write(&sbuff16[disp], vw * 2, false); // no wait
					disp ^= MAX_SCAN_WIDTH;
				}
			}
			else if (pixel_mode == _MCU_PIXEL_MODE_PAL2)
			{
				uint16_t *pal = res->getPalettePtr16();
				uint8_t *b = (uint8_t *)buffer;
				uint16_t disp = 0;

				for (uint16_t y = vh; y--;)
				{
					uint16_t *tb = &sbuff16[disp];

					for (uint16_t i = vw / 4; i--;) // try and do bg processing
					{
						*tb = pal[(*b >> 6)]; tb++;
						*tb = pal[(*b >> 4) & 3]; tb++;
						*tb = pal[(*b >> 2) & 3]; tb++;
						*tb = pal[*b & 3]; tb++;
						b++;
					}

#ifdef SCAN_DMA_LOG
					sl_all += (dev._do_wait() & 1);
					sl_lines++;
#else
					dev._do_wait();
#endif

					dev._write(&sbuff16[disp], vw * 2, false); // no wait
					disp ^= MAX_SCAN_WIDTH;
				}
			}
			else if (pixel_mode == _MCU_PIXEL_MODE_MONO)
			{
				uint8_t *b = (uint8_t *)buffer;
				uint16_t disp = 0;

				for (uint16_t y = vh; y--;)
				{
					uint16_t *tb = &sbuff16[disp];

					for (uint16_t i = vw / 8; i--;) // try and do bg processing
					{
						*tb = (*b >> 7) * 0xFFFF; tb++;// 0 or 0xFFFF ... no lookup or if
						*tb = ((*b >> 6) & 1) * 0xFFFF; tb++;// 0 or 0xFFFF ... no lookup or if
						*tb = ((*b >> 5) & 1) * 0xFFFF; tb++;// 0 or 0xFFFF ... no lookup or if
						*tb = ((*b >> 4) & 1) * 0xFFFF; tb++;// 0 or 0xFFFF ... no lookup or if
						*tb = ((*b >> 3) & 1) * 0xFFFF; tb++;// 0 or 0xFFFF ... no lookup or if
						*tb = ((*b >> 2) & 1) * 0xFFFF; tb++;// 0 or 0xFFFF ... no lookup or if
						*tb = ((*b >> 1) & 1) * 0xFFFF; tb++;// 0 or 0xFFFF ... no lookup or if
						*tb = (*b & 1) * 0xFFFF; tb++;// 0 or 0xFFFF ... no lookup or if
						b++;
					}

#ifdef SCAN_DMA_LOG
					sl_all += (dev._do_wait() & 1);
					sl_lines++;
#else
					dev._do_wait();
#endif

					dev._write(&sbuff16[disp], vw * 2, false); // no wait
					disp ^= MAX_SCAN_WIDTH;
				}
			}
			else if (pixel_mode == _MCU_PIXEL_MODE_RGB565)
			{
				dev._write(buffer, vw * vh * 2);
				dev._chip_disable();
				continue;
			}
			else
			{
				continue;
			}
		}
		else if (oled_bpp == 4)
		{
			uint8_t sbuff8[MAX_SCAN_WIDTH * 2];  // allow for 160 * 128

			if (true) //pixel_mode == _MCU_PIXEL_MODE_PAL4)
			{
				//dev._writeCommand(0x15);
				//dev._write8(0x1c);
				//dev._write8(0x5b);
//
				//dev._writeCommand(0x75);
				//dev._write8(0);
				//dev._write8(63);

				//dev._enableWriteRAM();

				if (0) // vx != 0 || vy != 0 || vw != dev.getWidth() || vh != dev.getHeight())
				{
					// basically if not full @ 0,0 need to do some "stuff" and do 1 scanline at a time

					uint8_t *b = (uint8_t *)buffer;
					uint8_t *b1 = b;
					uint16_t disp = 0;

					for (uint16_t y = 0; y < dev.getHeight(); y++)
					{
						uint8_t *tb = &sbuff8[disp];	// byte pointer!!!!

						memset(tb, 0, dev.getWidth() / 2); // scanline all 0's by default

						uint8_t b_mask = 0;
						uint8_t s_mask = 0;
						
						if (y >= vy && y < vy + vh)
						{
							for (uint16_t i = 0; i < dev.getWidth(); i++) // try and do bg processing
							{
								if (i >= vx + vw)
								{
									break;
								}

								if (i >= vx) // in window
								{
									uint8_t col = b_mask ? (*b >> 4) : (*b & 0x0f);
									*tb = s_mask ? (*tb & 0x0f) | (col << 4) : (*tb & 0xf0) | (col);

									b += b_mask;
									b_mask ^= 1;
								}

								tb += s_mask;
								s_mask ^= 1;
							}
						}

#ifdef SCAN_DMA_LOG
						sl_all += (dev._do_wait() & 1);
						sl_lines++;
#else
						dev._do_wait();
#endif 
						dev._write(&sbuff8[disp], dev.getWidth() / 2, false); // no wait
						disp ^= MAX_SCAN_WIDTH;

						b1 += (res->getMaxWidth() / 2);
						b = b1;
					}
				}
				else
				{
					dev.renderScreen4BPP((uint8_t *)buffer);
				}
			}
		}

		// was valid and finished

#ifdef SCAN_DMA_LOG
		sl_all += (dev._do_wait() & 1);
		sl_lines++;
#else
		dev._do_wait();
#endif

		dev._chip_disable();		
	}

	Timer1.stop();

#ifdef ENABLE_SOUND_DAC
	if (!test && hasAudio)
	{
		//vDAC._chip_disable();
	}
#endif

	if (1 ) //dev.ver == 1351)
	{
		//dev._chip_enable();
		//dev._setWindow(0, 0, dev.getWidth() - 1, dev.getHeight() - 1); // back to full
		//dev._chip_disable();
	}

	return 0;
}

Post by hoek67 » Sat Mar 30, 2019 4:20 am

Due to my iPod being a "POS" and never being able to charge, I was unable to upload a video of it running... but it's really no different to the YouTube video.
Although that video shows it running on an Arduino Due, it looks the same on the STM32 but has better sound and uses a lot less processing power to get the frames and audio out.

I find this a very good way to "test" an MCU, as it needs good SD, SPI and DAC speed.

BennehBoy
Posts: 67
Joined: Tue Mar 05, 2019 7:43 pm
Location: Yorkshire
OS: Windows 10
IDE: 1.8.9, Sloeber
Core: Roger's & STM
Board: Blue/Blackpill, MM, HYTiny, Black407Z/VET6, DiyMroe, FK407M1

Post by BennehBoy » Sat Mar 30, 2019 8:18 am

Pretty cool, do you think the f407 can cope with decoding standard video streams or is that sheer wishful thinking? I've a range of full colour OLED displays here :D
-Ben

Post by hoek67 » Sat Mar 30, 2019 9:46 am

BennehBoy wrote:
Sat Mar 30, 2019 8:18 am
Pretty cool, do you think the f407 can cope with decoding standard video streams or is that sheer wishful thinking? I've a range of full colour OLED displays here :D

I think decoding would be wishful if you mean an MPEG stream etc. It's possible to encode each video frame as a separate JPEG, but that then bloats the memory and CPU needed.

Since the previous post I have dramatically altered the code and its size has shrunk heaps. Basically the video player accepts 2 function pointers and calls these to render a frame, or to init, play and de-init audio. It also allocates all resources in 1 hit, then cleans up after playing... much neater.

Code: Select all

uint8_t fn_render_frame() // called by video player... we told it where the buffer was and how big... now been filled
{
	gOLED.renderScreen4BPP(gDIB.getBuffer());

	return 0; // return non zero to make it stop
}

void fn_handle_audio(uint8_t flag, uint32_t val) // called by video player if sound
{
	if (flag == 1) // output dac
	{
		DAC->DHR12R1 = val * 8; // 8-bit sample scaled into the 12-bit register (0..2040, about half range) ... eventually support 16 bit -> 12 bit!
		DAC->SWTRIGR |= DAC_SWTRIGR_SWTRIG1;   
	}
	else if (flag == 0) // init DAC
	{
		RCC->APB1ENR |= RCC_APB1ENR_DACEN;
		DAC->CR |= (DAC_CR_TSEL1 | DAC_CR_TEN1 | DAC_CR_EN1);
	}
	else // de-init DAC
	{
		RCC->APB1ENR &= ~(RCC_APB1ENR_DACEN);
		DAC->CR &= ~(DAC_CR_TSEL1 | DAC_CR_TEN1 | DAC_CR_EN1);		
	}
}


...
gDIB.Bitmap.LoadFromFile((char *)arr_files[fn]); 

		ret = streamVideo(fn_render_frame, fn_handle_audio, &gDIB.Bitmap, gDIB.getBuffer(), gDIB.getBufferSize(), (int)(gDIB.Bitmap.getFPS() * 0.0f), (int)(gDIB.Bitmap.getFPS() * 95.0f), mute);
		
		gDIB.Bitmap.Clear(); // closes files etc unloads resource
...
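
For anyone wondering what the player side of such a callback interface might look like, here is a stripped-down sketch. It is not the actual `streamVideo`: `playFrames` is a hypothetical stand-in that leaves out the file handling and the timer, and only shows the order in which the two function pointers would be called (flag 0 = init, 1 = output a sample, anything else = de-init, as in `fn_handle_audio` above):

```cpp
#include <cassert>
#include <cstdint>

// Callback types matching the two functions in the post.
typedef uint8_t (*RenderFrameFn)();
typedef void (*HandleAudioFn)(uint8_t flag, uint32_t val);

// Hypothetical player core. Returns the number of frames shown.
uint32_t playFrames(RenderFrameFn render, HandleAudioFn audio,
                    uint32_t frameCount, uint32_t samplesPerFrame)
{
	assert(render != nullptr);
	uint32_t shown = 0;

	if (audio) audio(0, 0);          // flag 0: init DAC
	for (uint32_t f = 0; f < frameCount; f++)
	{
		for (uint32_t s = 0; audio && s < samplesPerFrame; s++)
		{
			audio(1, 128);           // flag 1: output one sample (silence here)
		}
		shown++;
		if (render() != 0) break;    // non-zero return stops playback
	}
	if (audio) audio(2, 0);          // any other flag: de-init DAC
	return shown;
}
```

Passing a null audio callback gives silent playback, which matches the `mute` behaviour in spirit, and the de-init call always runs even when the render callback stops playback early.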


I have had great success in playing on 128*128 SSD1351 16 bit OLED. Just because the OLED is 16 bit does not mean the input has to be 16 bit.

Playback of BGR8 and even BGR4 formats is quite decent on these. My decoder will convert to these obscure formats.

If the OLED resolution = frame buffer resolution then just dump 1:1 in a single SPI transfer.

Otherwise I'd send 1 scan line at a time after converting on the run. With DMA I was able to double buffer and be decoding 1 while SPI was transferring the previous one. I'd eventually like to get SPI.transfer(buff, len) to use DMA (maybe it does already) and not wait for the transfer to complete.
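
The double-buffered scanline idea can be modelled on a PC like this. `FakeDma` is a stand-in I made up for a DMA-capable SPI driver (it only records what would be sent), so treat this as a sketch of the buffer handling, not as driver code:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

const uint16_t kMaxScanWidth = 256;

// Stand-in for a DMA SPI driver: start() only remembers the request and
// wait() "completes" it by copying the pixels into `sent`. A real driver
// would be transferring in the background between the two calls.
struct FakeDma
{
	const uint16_t *pending = nullptr;
	uint16_t pendingLen = 0;
	std::vector<uint16_t> sent;

	void wait()
	{
		if (pending) sent.insert(sent.end(), pending, pending + pendingLen);
		pending = nullptr;
	}

	void start(const uint16_t *src, uint16_t len)
	{
		assert(pending == nullptr); // previous transfer must be finished
		pending = src;
		pendingLen = len;
	}
};

// Expand 8-bit palettised scanlines to 16-bit pixels. Line N is converted
// into one half of `line` while line N-1 is still in flight from the other.
void sendFrame(FakeDma &dma, const uint8_t *src, const uint16_t *pal,
               uint16_t width, uint16_t height)
{
	assert(width <= kMaxScanWidth);
	static uint16_t line[kMaxScanWidth * 2];
	uint16_t half = 0;

	for (uint16_t y = 0; y < height; y++)
	{
		uint16_t *dst = &line[half];
		for (uint16_t x = 0; x < width; x++) dst[x] = pal[src[y * width + x]];

		dma.wait();            // previous line must be out before its half is reused
		dma.start(dst, width); // kick off this line and return immediately
		half ^= kMaxScanWidth; // next line converts into the other half
	}
	dma.wait();                // flush the last line
}
```

The conversion of each line happens before the wait, so on real hardware that work overlaps with the previous line's transfer.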

The other good thing about the STM32 I have is the 168 MHz clock. With SPI most OLEDs run at 22 MHz. With a 168 MHz clock, 168/8 = 21, so if you ask for 22 MHz you will actually get 21 MHz. If you're unlucky with speeds and the 2/4/8/16/32 divisors you could end up with SPI running at almost 1/2 the maximum speed you ask for. 42 MHz is the next step up from 21 MHz, and from memory the OLED I have used to wig out at ~28 MHz.

I know the SPI code looks for the fastest speed <= CLOCK / DIV and tests 2... then 4... then 8 etc. Maybe it doesn't need power of 2 divisor.
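
The divisor search described above can be sketched as follows, assuming power-of-two prescalers from 2 to 256 as on the STM32 SPI peripheral. `pickSpiDivisor` is a hypothetical helper, not the core's actual code, and the 168 MHz input simply follows the numbers in this post; on real hardware the SPI is fed from a bus clock that may be lower than the core clock:

```cpp
#include <cassert>
#include <cstdint>

// Smallest power-of-two divisor (2..256) that keeps clockHz / divisor at or
// below maxHz. Falls back to 256 if even that is still too fast.
uint32_t pickSpiDivisor(uint32_t clockHz, uint32_t maxHz)
{
	assert(maxHz > 0);
	uint32_t div = 2;
	while (div < 256 && clockHz / div > maxHz)
	{
		div <<= 1; // try 2, then 4, then 8 ...
	}
	return div;
}
```

Asking for 22 MHz gives divisor 8 (21 MHz), while asking for 41 MHz also gives divisor 8, almost half of what was requested.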

ag123
Posts: 239
Joined: Thu Mar 07, 2019 6:15 am
OS: linux
IDE: eclipse, arduino 1.8.5
Core: Roger's
Board: Maple mini, Bluepill

Post by ag123 » Sat Mar 30, 2019 1:23 pm

+1 nice. it is cool! really!
it seemed mjpeg may be possible, but probably quite a challenge to set it all up :P :lol:

Post by hoek67 » Sun Mar 31, 2019 10:15 am

Looks like the MJPEG is taxing the system to the max... wonder if it's using a fixed point implementation.

Made changes to the avi decoder program last night and managed to get 16 bit audio (finally) so can give the 12 bit DAC a better workout.
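
Feeding 16-bit samples to the 12-bit DAC comes down to an offset and a shift. A minimal sketch, assuming the 16-bit audio is signed PCM (as in WAV files) and the DAC uses the right-aligned 12-bit data register; the function names are illustrative:

```cpp
#include <cassert>
#include <cstdint>

// Signed 16-bit PCM -> unsigned 12-bit DAC code (0..4095).
uint16_t pcm16ToDac12(int16_t sample)
{
	uint16_t unsignedSample = (uint16_t)((int32_t)sample + 32768); // 0..65535
	uint16_t code = unsignedSample >> 4;                           // keep top 12 bits
	assert(code <= 4095);
	return code;
}

// Unsigned 8-bit PCM -> 12-bit DAC code using the full range (0..4080).
uint16_t pcm8ToDac12(uint8_t sample)
{
	return (uint16_t)(sample << 4);
}
```

Silence (sample 0) in the 16-bit case maps to mid-scale, 2048, which is where the DAC output should rest.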

Bit of a "buzzing" sound coming from the DAC and it seems only when the OLED is on :?

Post by hoek67 » Sun Mar 31, 2019 3:26 pm

Just for reference... got SPI working with DMA using a 3rd party library. :shock:

Can confirm as working with STM32F407VET6 "black edition".

https://github.com/pichenettes/stmlib

Very nice examples and rundown here https://stm32f4-discovery.net/2015/04/l ... stm32f4xx/

Basically means I can double buffer and process pixels while the previous block of pixels is being sent. By the time the 1st buffer has finished the next one will be ready.

If going to use... check TM_SPI_PinsPack_2 etc as I was using the wrong pin combination and port. Can laugh now but the SCK was being output to a LED.

The library has a heap of stuff... so if need just the DMA and SPI stuff will have to weed all the un-needed stuff out.

Code: Select all


__INLINE__ void cDMA_spi_init() 
{
	Serial.println("INIT SPI");
		
	TM_SPI_InitFull(SPI1, TM_SPI_PinsPack_2,  TM_SPI_GetPrescalerFromMaxFrequency(SPI1, 22000000), TM_SPI_Mode_0, SPI_MODE_MASTER, SPI_FIRSTBIT_MSB); 
	TM_SPI_DMA_Init(SPI1);
}

Post by BennehBoy » Sun Mar 31, 2019 6:30 pm

Feel free to augment STM's SPI implementation ;)
-Ben

sheepdoll
Posts: 16
Joined: Wed Mar 27, 2019 12:42 am
OS: MacOS,Linux
IDE: Arduino,Eclipse
Core: STM official, HAL
Board: Nucleo, discovery

Post by sheepdoll » Sun Mar 31, 2019 7:33 pm

I have not looked into any of the details. Did however get to see a demonstration of the 'Tasmanian Devil' back in the mid 1990s. This was also sometimes referred to as the 'two ton canary.' MPEG was always designed to be compression heavy and decoder light. The 'Devil' was in Warner's video vault (next to LAX), impressive in itself, as this was where the prints were flown in from whatever salt mine and converted to video masters (and where the Pan/Scan TV network masters were edited).
The 'Devil' was a row of equipment racks, about 8 to 10 feet along one wall.

The playback device was a run-of-the-mill IBM 386 with one exception: it had a 4 GB hard drive, which had the most impact on our group.
The demos were 'The Fugitive' and 'Twister'.

There were compression artifacts, which were subtle; apart from that the picture was amazingly clear. I remember pointing out that my acquaintances in the editing suite would not accept the digital artifacts. I was told this was not for pros but for the consumer. I still sort of remember my comment as 'You are going to sell the consumer near mastering quality video?'

Later I purchased one of the first (if not the first) DVD players ever sold. There were no DVDs at the time, just VCDs, which were I think MJPEG based. I had some good titles in that format: Four Weddings and a Funeral, and Star Trek: Wrath of Khan. It was hard at first to get DVDs. They would not sell them before the Tuesday release date. I remember arguing on Monday that it was Tuesday in most of the world. I did however start buying one DVD a week. (This was still in the test market phase.) There was a competing format, DiVX, which would tether the player to the phone/BBS and charge a per-view fee or deny access to the title when on moratorium.

Whatever the case, the Warner execs came to the school where I was taking jewelry classes. They were there to look for programming talent to design the menu systems. It may also have been a critical point when I told the exec I was purchasing one DVD a week. (He whispered to me that he thought Disney would come on board the next day. And they did.)

As an early adopter I was active on the USENET DVD group. There was another girl active in the USENET group who was going to make her media career by predicting that DVD would go the way of DAT due to the media encryption, which was the same encryption used in VHS tapes (screwing up the vertical blanking field). I think the execs mixed us up, as we were both at the lecture. Still, I like to think I had a hand at a critical point. DiVX never took off past the testing stage. It still, however, remains a goal of the media providers to limit access to titles on moratorium (since they borrow money for new production against the rare, unseen titles in the library).

As I was working at Apple at the time in the printer division, I got deep into the JPEG codec. JPEG was not natively supported in the Level 1 printers. I actually wrote a decoder in PostScript, as I was interested in spatial transforms. (Wrote an FFT too.) The core of these processes is something called a DCT, or Discrete Cosine Transform. It is in effect a vector operation, which is why GPUs are what they are: designed to multiply a 1x3 vector against a 3x3 matrix. This is the heart of ray tracing, and the compression is simply a subset of this. The faster one can do the DCT, which changes time into space or vice versa, the quicker the decode; the rest is just parsing the packets.
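
For readers who have not met it, the DCT mentioned above can be sketched in a few lines. This is the textbook one-dimensional DCT-II in its naive O(N^2) form, not the PostScript decoder described in this post; JPEG applies the same transform to 8x8 blocks (rows, then columns), and real decoders use much faster factorisations. The function name is illustrative:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Naive orthonormal DCT-II of n samples: out[k] measures how much of the
// k-th cosine "frequency" is present in the input.
void dct2(const double *in, double *out, std::size_t n)
{
	assert(n > 0);
	const double pi = 3.14159265358979323846;

	for (std::size_t k = 0; k < n; k++)
	{
		double sum = 0.0;
		for (std::size_t i = 0; i < n; i++)
		{
			sum += in[i] * std::cos(pi * (i + 0.5) * k / n);
		}
		// Scale factors that make the transform orthonormal.
		double scale = (k == 0) ? std::sqrt(1.0 / n) : std::sqrt(2.0 / n);
		out[k] = sum * scale;
	}
}
```

A flat block of identical samples ends up with all of its energy in `out[0]` and zeros elsewhere, which is why smooth image areas compress so well.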

I have some 35 mm film 'trailers' from the final years of film projection. What is interesting are the audio tracks. These are literally images of the audio bits that can be processed in 1/24 of a second. There are about 6 different 'proprietary' audio tracks on them, from the 1900s optical tracks to Dolby Digital and DTS. I think they also have magnetic stripes. The digital tracks were read with simple video cameras focused on the frame. I think one of the formats worked like Warner's Vitaphone and synced to a CD. Technically the formats were supposed to be AES/EBU, but in practice they were more SPDIF, due to the noise immunity of the TOSLINK cable.

In the late 1980s/early 1990s, I found a matrix decoder chip at the well-known surplus store. Also found a fixable stereo VHS machine. So I built my own matrix surround system in my living room. They did not re-process the tracks back then, so you got the full uncompressed audio, a direct dump of the theater track. VHS video may have been crap, but the audio was equivalent, and was used for studio mastering, as the stereo tracks were on the helix. (The mono track however was on the linear part below the helix.) The first digital studio recorders were simply VHS machines that encoded the digital data into the 320x240 frame, which was the resolution of VHS. Ironically laser disc and CDs are actually an analogue format (PCM).

Back to the topic, Given the power of the STM32F series, especially the larger ones, I see no reason why a clever programmer, with the correct pipelining, could not do full motion standard def video and raw audio on this processor. It may have been part of the design parameter for the F4, to be the core of a STB, or a DVD stream decoder. It should also be noted that a lot of this decoding design would be more 'proprietary' as it would fall under the auspices of the DRM, and probably not publicly available. The trick would be engineering this outside the box -- so to speak. If a 386 with a 4GB drive could do it, I see no reason a STM32F4 is not up to the task.

Then again my F7 demo board has built in SPDIF format and header populated, so it is there for those who look hard enough.

Somewhere I have some NTSC/RGB/PAL decoders. Never had a fast enough DAT for them ...
Here is a link to that project: http://delectra.com/anavid/anavid.html I still have a half dozen or more of these viewfinder displays ...

Post by ag123 » Sun Mar 31, 2019 9:10 pm

i think stm32f4 has an ART accelerator which probably works like cpu cache lines; the stm32f103 (m3) doesn't have that, but the f407 does.
so certain code runs (very) fast, especially if it falls into sequences that can be 'ART accelerated'
nevertheless i'm noob on motion picture decoding, and a thing is i'd think most people simply program in C / C++ these days
optimizing things to use assembly is to an extent a 'lost art'
nevertheless even with C / C++ some 'speedups' may be possible, e.g. to unroll the loops
it seemed to an extent that helps
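
As an illustration of the loop unrolling mentioned above, here is a palette lookup loop written both ways. This is a generic sketch with made-up function names; whether the unrolled version is actually faster depends on the compiler and optimisation level, so it needs measuring on the target:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Straightforward version: one palette lookup per loop iteration.
void expandSimple(const uint8_t *src, uint16_t *dst, const uint16_t *pal,
                  std::size_t n)
{
	assert(src && dst && pal);
	for (std::size_t i = 0; i < n; i++) dst[i] = pal[src[i]];
}

// Manually unrolled by four: fewer loop-counter checks and branches per pixel.
void expandUnrolled(const uint8_t *src, uint16_t *dst, const uint16_t *pal,
                    std::size_t n)
{
	assert(src && dst && pal);
	std::size_t i = 0;
	for (; i + 4 <= n; i += 4)
	{
		dst[i]     = pal[src[i]];
		dst[i + 1] = pal[src[i + 1]];
		dst[i + 2] = pal[src[i + 2]];
		dst[i + 3] = pal[src[i + 3]];
	}
	for (; i < n; i++) dst[i] = pal[src[i]]; // leftover pixels
}
```

Both functions produce identical output; the second loop in the unrolled version handles widths that are not a multiple of four.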
