Circbuf

From ElphelWiki
Revision as of 11:12, 30 September 2007 by Andrey.filippov (talk | contribs)

Circular buffer for the image/video data in Elphel 353 cameras: Description and access

Overview

Elphel cameras use an FPGA to compress images and video. The compressed frames are transferred from the FPGA to the system memory through a pseudo-DMA channel implemented in the CPU (Axis ETRAX FS). It is "pseudo" because the data still physically travels over the system bus from the FPGA to the CPU chip, where it is buffered in small chunks and then sent to the system memory. On average it takes 5 bus cycles (100MHz clock) to transfer one 32-bit word (4 bytes) from the FPGA to the system memory, which gives 80MB/sec. The model 353 has hardware provisions for a new feature of the ETRAX FS (not available in the ETRAX 100LX used in the earlier Elphel models 303, 313, 323 and 333): an external device (the FPGA in our case) can become the bus master. That could raise the data rate to virtually the full 400MB/sec (in large bursts), but no FPGA code has been developed yet to support this mode.
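The transfer rates quoted above follow from simple arithmetic, which the small sketch below spells out. The function name is made up for illustration, and the 400MB/sec case assumes one 32-bit word per bus cycle, which is only my reading of "virtually full" rate.

```c
#include <assert.h>
#include <stdint.h>

/* Sustained transfer rate in bytes per second for a bus that moves
 * bytes_per_word bytes every cycles_per_word clock cycles.
 * Hypothetical helper, not part of the camera firmware. */
uint64_t bus_rate_bytes_per_sec(uint64_t clock_hz,
                                unsigned cycles_per_word,
                                unsigned bytes_per_word)
{
    assert(cycles_per_word > 0);
    /* Multiply first so that the integer division does not lose precision. */
    return clock_hz * bytes_per_word / cycles_per_word;
}
```

With a 100MHz clock, 5 cycles per 4-byte word corresponds to the pseudo-DMA case and 1 cycle per word to the assumed bus-master case.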

Circular buffer in Elphel 353 camera

The compressed frames are stored in a large circular buffer. This was done similarly in the previous cameras and with the other compressor (Ogg Theora), but the following description applies to the current state of the firmware (7.1.0.1) of the model 353 camera with the JPEG compressor (the Theora branch is not ported yet as of September 2007). Currently the buffer size, CCAM_DMA_SIZE, is ~19MB (as defined in include/asm-cris/elphel/c313a.h).

All DMA transfers are 32-byte aligned (32 bytes is the size of the internal DMA buffer in the ETRAX). The data received from the FPGA contains the JPEG-encoded bit stream: everything between the SOF and EOF tags, with no header. All 0xff data bytes are escaped by the appropriate 0x00 codes as required by the standard, so no two 0xff bytes can appear together. This fact is used to find out whether the data in the circular buffer was overwritten by newer image frames.

When the FPGA has output the whole frame (it does not know the frame size in advance, only after compression is over) it adds 12 more bytes (3 long words), aligning them to the end of a 32-byte chunk and adding zeros before them if needed. The last 4 bytes contain the 24-bit byte length of the JPEG data with the high byte set to 0xff. The 8 bytes before that (2 long words) hold the timestamp of the frame exposure start: seconds from the epoch and microseconds.
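The trailer layout just described can be sketched in C as follows. This is only an illustration, not code from the firmware: the struct and function names are hypothetical, and the caller is assumed to pass the 8 long words of the 32-byte chunk that closes a frame, already read as native 32-bit values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the 12-byte trailer; not a firmware type. */
struct frame_trailer {
    uint32_t sec;     /* exposure start, seconds since the epoch */
    uint32_t usec;    /* exposure start, microseconds to add */
    uint32_t length;  /* JPEG bitstream length in bytes (24 bits) */
};

/* chunk holds the 8 long words of the 32-byte chunk that ends a frame.
 * Returns 0 on success, -1 if the 0xff tag in the length word is missing. */
int parse_frame_trailer(const uint32_t chunk[8], struct frame_trailer *out)
{
    assert(chunk != NULL && out != NULL);
    if ((chunk[7] >> 24) != 0xffu)
        return -1;                          /* high byte must be 0xff */
    out->sec    = chunk[5];                 /* first timestamp word */
    out->usec   = chunk[6];                 /* second timestamp word */
    out->length = chunk[7] & 0x00ffffffu;   /* lower 24 bits */
    return 0;
}
```

Fed with a chunk ending in 46ff4a82 000303bd ff0378fe, it reports a length of 0x378fe bytes.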

The following listing has a sample of the buffer data (generated by /usr/local/bin/test_mmap):

4b7ff8:  000000a4  06000800  016c0120  0a3946c1  01ab010a  40404040  ffff0000  000378fe
000000: {d30b03cc} 8af1dc34  a5875e77  9d4cf334  204db683  b1533cc7  c59f949e  36b1af45
000008:  90e20102  5e9ad7f1  f68cd273  130a1dc5  7e50641c  41bd14ef  140720c5  73c0e02e
...
00de38:  20a50766  4e433fc9  7614f51c  087bd2e4  988ee64e  cdf41ca0  6af40e1c  0000003b
00de40:  00000000  00000000  00000000  00000000  00000000  46ff4a82  000303bd <ff0378fe>
00de48:  f6285e9f  aeacb0a5  00000000  00000000  00000000  46ff4a81  000a1930  ff037929
00de50: [00000000] 00000000  00000000  00000000  00000000  00000000  00000000  00000000

The buffer is circular, so before 0 there is 0x4b7fff (the last long word in the buffer). In this example the FPGA compressed a frame starting from the very beginning of the circbuf. The bit stream is 0x378fe bytes long (as written in 0x00de47, OR-ed with 0xff000000) and it ends in the long word 0xde3f: "cc 03 0b d3 ... 1c 0e f4 6a 3b 00".

The timestamp (seconds=0x46ff4a82 and microseconds=0x000303bd) is stored in 0xde45 and 0xde46, right before the bitstream length. The internal ETRAX DMA pointer (circbuf write pointer) is now equal to 0xde50, so the next frame will be acquired starting from that word. The data in the range 0xde48..0xde4f is garbage (you can see remnants of another frame acquired starting from the same address 0x0: "46ff4a81 000a1930 ff037929"). Actually the FPGA has already sent 32 bytes (8 long words) of zeros. They are now in the ETRAX DMA buffer and will be written to 0xde48..0xde4f as soon as the FPGA starts sending the next frame from address 0xde50. This 32-byte spare area makes sure that all the current frame data, including timestamp and length, is actually written to the system memory when the frame is finished, without explicit flushing of the DMA buffer. So the spare area that follows the frame (address-wise) cannot be used, but the one preceding the frame can, and this is what is now implemented in the firmware of the 353 camera.
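The address arithmetic in this example can be sketched as below. The rule used here (bitstream plus 12-byte trailer rounded up to whole 32-byte chunks, followed by one spare 32-byte chunk) is inferred from this single dump and is an assumption, not something taken from the FPGA or driver sources. The names are hypothetical, and the function works with byte offsets, while the dump above shows long word addresses (byte offset / 4).

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_BYTES   32u  /* DMA transfer granularity */
#define TRAILER_BYTES 12u  /* timestamp (8 bytes) + length (4 bytes) */

/* Hypothetical result type: byte offsets inside the circular buffer. */
struct frame_span {
    uint32_t length_word;  /* offset of the long word holding 0xff<<24 | length */
    uint32_t next_frame;   /* offset where the next frame bitstream starts */
};

/* start     - byte offset of the first bitstream byte (32-byte aligned)
 * length    - bitstream length in bytes
 * buf_bytes - total size of the circular buffer in bytes */
struct frame_span locate_frame_end(uint32_t start, uint32_t length,
                                   uint32_t buf_bytes)
{
    struct frame_span s;
    uint32_t used;

    assert(buf_bytes % CHUNK_BYTES == 0);
    assert(start % CHUNK_BYTES == 0 && start < buf_bytes);

    /* Bitstream plus trailer, rounded up to whole 32-byte chunks. */
    used = (length + TRAILER_BYTES + CHUNK_BYTES - 1) / CHUNK_BYTES * CHUNK_BYTES;

    /* The length is the last long word of the last chunk of the frame. */
    s.length_word = (start + used - 4) % buf_bytes;
    /* One spare chunk follows, then the next frame begins. */
    s.next_frame = (start + used + CHUNK_BYTES) % buf_bytes;
    return s;
}
```

For the frame in the dump (start 0, length 0x378fe, buffer of 0x4b8000 long words) this gives long word 0xde47 for the length and 0xde50 for the next frame, matching the listing.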

Frame metadata stored in the circbuf (added by the software)

The interrupt service routine arch/cris/arch-v32/drivers/elphel/cc353.c::camSeq_interrupt is responsible for

  • updating the software pointer (JPEG_wp) after each new frame is compressed, so that it matches the location where the next frame bitstream will be output,
  • copying the bitstream length to the location right before the frame itself (to simplify traversing frames) - it is now at 0x4b7fff (copied from the lower 24 bits of 0xde47),
  • writing two 0xff bytes just before the length as a marker of a valid frame header (the incoming compressed bitstream data will never have two 0xff bytes in a row, and the high byte of the microseconds, which might be written at that location, is always 0),
  • filling in the remaining 26-byte area right before the last acquired frame - it is now safe until the whole buffer is filled with image data. In the example above it starts at 0x4b7ff8.

The 26 (actually 28, as the 0xffff marker is also included in the memcpy) bytes are copied from the data structure frame_params_t defined in include/asm-cris/elphel/c313a.h:

struct frame_params_t {
/*00-03*/ unsigned long  exposure;       //! currently - exposure time measured in 100usec. Really need to change it to smaller increments?
/*04-05*/ unsigned short width;          //! frame width, pixels
/*06-07*/ unsigned short height;         //! frame height, pixels
/*08-11*/ unsigned long  colorsat;       //! matches FPGA format , for 1.0 it is DEFAULT_COLOR_SATURATION_RED<<16 + DEFAULT_COLOR_SATURATION_BLUE
/*12   */ unsigned char  color;          //! 0 - mono, 1 - color, 2 - jp4 + (0x40 - flipX) + (0x80 - flipY)
/*13   */ unsigned char  quality;        //! compression quality (%)
/*14   */ unsigned char  gamma;          //! (%), 255 - non-gamma curve, 0 - raw (16-bit data) - not yet implemented
/*15   */ unsigned char  black;          //! black level shift (255 - full scale)
/*16-17*/ unsigned short rscale;         //! 8.8 - red  relative to green - "gamma" table, not sensor gain
/*18-19*/ unsigned short bscale;         //! 8.8 - blue relative to green - "gamma" table, not sensor gain
/*20   */ unsigned char  gain_r;         //! color gain Red   (sensor specific), is set
/*21   */ unsigned char  gain_g;         //! color gain Green (sensor specific)
/*22   */ unsigned char  gain_b;         //! color gain Blue  (sensor specific)
/*23   */ unsigned char  gain_gb;        //! color gain Green in Blue line (sensor specific)
/*24   */ unsigned char  bindec_hor;     //! ((bh-1) << 4) | (dh-1) & 0xf (binning/decimation horizontal, 1..16 for each)
/*25   */ unsigned char  bindec_vert;    //! ((bv-1) << 4) | (dv-1) & 0xf (binning/decimation vertical  , 1..16 for each)
/*26-27*/ unsigned short signffff;       //! should be 0xffff - it will be a signature that JPEG data was not overwritten,
                                         //! JPEG bitstream can not have two 0xff after each other
/*28-31*/ unsigned long  timestamp_sec ; //! number of seconds since 1970 till the start of the frame exposure
/*32-35*/ unsigned long  timestamp_usec; //! number of microseconds to add
};

The last 8 bytes (timestamp_sec, timestamp_usec) are not copied, because this information is provided by the FPGA following the bitstream (see above). These members of the structure can be used in a local copy in the application to store the timestamp data. By applying this data structure to the circbuf in the example above (this is just one line)

4b7ff8:  000000a4  06000800  016c0120  0a3946c1  01ab010a  40404040  ffff0000  000378fe

you may decode that

  • exposure was [0x4b7ff8] 0xa4*100usec= 0.0164 s,
  • frame width - 0x800 (2048)
  • frame height - 0x600 (1536)
  • saturation red is 0x016c (actually it is raw FPGA data proportional to the actual saturation, which was 2.0 in that case)
  • saturation blue is 0x120 (same note as above, also 2.0 - the scale for blue is different)
  • color/mirror mode was 0xc1 (normal color, flipX, flipY)
  • JPEG quality - 0x46 (70%)
  • gamma - 0x39 - 0.57 (57%)
  • black level shift (subtracted from the pixels before gamma on 0..256 scale) - 0x0a (10)
  • rscale - 0x010a, i.e. 0x01+(0x0a/0x100) - relative red-to-green scale used to build per-color component gamma tables
  • bscale - 0x01ab, i.e. 0x01+(0xab/0x100) - relative blue-to-green scale used to build per-color component gamma tables
  • gain_r - 0x40 - analog gain setting of the sensor red pixels (hardware dependent, here 4.0x)
  • gain_g - 0x40 - same for green
  • gain_b - 0x40 - same for blue
  • gain_gb - 0x40 - same for green pixels in blue lines
  • bindec_hor - 0 - binning and decimation horizontal is 1/1 (no binning, no decimation)
  • bindec_vert - 0 - binning and decimation vertical is 1/1 (no binning, no decimation)
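The decoding above can be reproduced with a small host-side sketch. It reads the 32 header bytes one field at a time as little-endian values (which is how the sample line decodes), instead of overlaying frame_params_t, because unsigned long is not 4 bytes on every host. The struct frame_header and the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side copy of the 32 bytes preceding a frame. */
struct frame_header {
    uint32_t exposure;             /* units of 100 usec */
    uint16_t width, height;        /* pixels */
    uint32_t colorsat;             /* red << 16 | blue, raw FPGA format */
    uint8_t  color, quality, gamma, black;
    uint16_t rscale, bscale;       /* 8.8 fixed point */
    uint8_t  gain_r, gain_g, gain_b, gain_gb;
    uint8_t  bindec_hor, bindec_vert;
    uint32_t length;               /* bitstream length, lower 24 bits */
};

static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* h points to the 32 bytes right before the frame bitstream.
 * Returns 0 on success, -1 if the 0xffff signature is missing
 * (the header was probably overwritten). */
int decode_frame_header(const uint8_t h[32], struct frame_header *out)
{
    assert(h != NULL && out != NULL);
    if (le16(h + 26) != 0xffffu)
        return -1;
    out->exposure    = le32(h + 0);
    out->width       = le16(h + 4);
    out->height      = le16(h + 6);
    out->colorsat    = le32(h + 8);
    out->color       = h[12];
    out->quality     = h[13];
    out->gamma       = h[14];
    out->black       = h[15];
    out->rscale      = le16(h + 16);
    out->bscale      = le16(h + 18);
    out->gain_r      = h[20];
    out->gain_g      = h[21];
    out->gain_b      = h[22];
    out->gain_gb     = h[23];
    out->bindec_hor  = h[24];
    out->bindec_vert = h[25];
    out->length      = le32(h + 28) & 0x00ffffffu;
    return 0;
}
```

Applied to the sample line at 0x4b7ff8 it yields width 2048, height 1536, quality 70 and length 0x378fe, the same values as in the list above.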

Circbuf data access - /dev/circbuf

User access to the circular buffer is provided by the /dev/circbuf device driver. It is now in arch/cris/arch-v32/drivers/elphel/circbuf.c. The device supports the read, write and poll methods, but the most important are mmap and (overloaded) lseek. Mmap provides direct access to the circbuf data, while lseek is overloaded to include additional functions that are normally implemented in ioctl. It is a hack, of course, but it gives easy access to the buffer data (and metadata) from PHP scripts. Below are just the comments from the source code; I'll write more a little later...

/*!=============================================================================================
*! Overloading lseek with additional functionality (to avoid ioctls)
*! with orig==SEEK_END lseek will treat (offset>0) as a command
*! to manipulate frame pointer(s) or wait for the image to be ready
*! using these commands
*!  CIRCLSEEK_TORP  - set filepointer to global (shared) read pointer
*!  CIRCLSEEK_TOWP  - set filepointer to FPGA write pointer (next frame to be acquired)
*!  CIRCLSEEK_PREV  - move pointer to the previous frame, return -EOVERFLOW if there are none
*!  CIRCLSEEK_NEXT  - advance pointer to the next frame, return -EOVERFLOW if was already
*!      at the last
*!  CIRCLSEEK_LAST  - move pointer to the last acquired frame (default after open)
*!      (it is combination of 2+3)
*!  CIRCLSEEK_FIRST - move pointer to the first acquired frame, returns
*!                    total number of frames available
*!  CIRCLSEEK_SETP - save current pointer to global read pointer
*!  CIRCLSEEK_VRFY - verify frame is available at current location. Returns
*!      1 if there is, 0 if good pointer but no data yet,
*!      -1 - no frame at this location (probably buffer overrun),
*!      -2 - not 32-byte aligned
*!  CIRCLSEEK_WAIT - sleep until next frame is acquired 
*! All commands but (CIRCLSEEK_TOWP,CIRCLSEEK_LAST,CIRCLSEEK_FIRST) will return -EINVAL if read 
*! pointer is not valid  (i.e. buffer was overrun and the data pointed to is lost). On success they return
*! the current (byte *) to the start of the frame data (parameters are at
*! offset = -32 from it)
*! (0, SEEK_CUR) also verifies that the header is not overwritten. It can be used
*! after buffering frame data to verify you got it all correctly
*! SEEK_CUR also supports the circular nature of the buffer and rolls over if needed
*!=============================================================================================*/
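From user space the overloaded lseek described in these comments might be used roughly as in the sketch below. The two wrapper functions are hypothetical. The numeric values of the CIRCLSEEK_* commands are defined in the camera headers and are not reproduced here, so the command is simply passed in as a parameter.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Issue one CIRCLSEEK_* command on an open /dev/circbuf descriptor.
 * command must be one of the CIRCLSEEK_* values from the camera headers
 * (an assumption: they are not defined in this sketch).
 * Returns what lseek returns: a non-negative value on success,
 * or (off_t)-1 with errno set on failure. */
off_t circbuf_command(int fd, off_t command)
{
    /* offset > 0 together with SEEK_END selects the command mode */
    assert(command > 0);
    return lseek(fd, command, SEEK_END);
}

/* After copying frame data out of the buffer, check that it was not
 * overwritten in the meantime: (0, SEEK_CUR) re-verifies the header.
 * Returns 1 if the frame is still valid, 0 otherwise. */
int circbuf_frame_still_valid(int fd)
{
    return lseek(fd, 0, SEEK_CUR) >= 0;
}
```

A typical sequence would be to open /dev/circbuf, call circbuf_command(fd, CIRCLSEEK_LAST) to get the byte offset of the last acquired frame, read the frame through the mmap-ed buffer, and then call circbuf_frame_still_valid(fd) before trusting the copy.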

Reading JPEG headers

JPEG headers are generated separately from the bitstream; access is provided through the /dev/jpeghead driver (description to be continued).

/*!=================================================================
*! Overloading lseek with additional functionality (to avoid ioctls)
*! with orig==SEEK_END lseek will treat (offset>0) as a byte pointer
*! in (char *)ccam_dma_buf of a frame pointer and use quality,
*! width and height to regenerate header.
*! frame pointers are 32-bytes aligned, so adding 1 to offset
*! will make sure it is always >0 (as offset=0, orig=SEEK_END
*! will just move pointer to the end and return file length).
*! 
*! When called with orig==SEEK_END, offset>0 lseek will position
*! file at the very beginning and return 0 if OK, -EINVAL if
*! frame header is not found for the specified offset
*!================================================================*/
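The "+1" convention from these comments can be sketched as follows. Both function names are hypothetical, and the frame pointer is assumed to be the byte offset returned by the /dev/circbuf lseek commands.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Offset to pass to lseek(..., SEEK_END) on /dev/jpeghead for a given
 * frame pointer. Returns -1 if the pointer is not 32-byte aligned. */
off_t jpeghead_seek_arg(off_t frame_ptr)
{
    if (frame_ptr < 0 || (frame_ptr % 32) != 0)
        return -1;
    return frame_ptr + 1;   /* +1 keeps the offset > 0 even for frame 0 */
}

/* Ask an open /dev/jpeghead descriptor to regenerate the header for the
 * frame at frame_ptr. Returns 0 if the driver accepted the frame,
 * -1 otherwise (bad alignment, or lseek failed). */
int jpeghead_select_frame(int fd, off_t frame_ptr)
{
    off_t arg = jpeghead_seek_arg(frame_ptr);

    if (arg < 0)
        return -1;
    assert(arg > 0);        /* must be > 0 to trigger header regeneration */
    return lseek(fd, arg, SEEK_END) == 0 ? 0 : -1;
}
```

After a successful call the file is positioned at the very beginning, so the regenerated header can presumably be fetched with ordinary read calls on the same descriptor.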