Circbuf

From ElphelWiki
Revision as of 09:48, 6 October 2007 by Andrey.filippov (talk | contribs) (Circbuf data access - /dev/circbuf)

Circular buffer for the image/video data in Elphel 353 cameras: Description and access

Overview

Elphel cameras use an FPGA to compress images and video. The compressed frames are transferred from the FPGA to system memory through a pseudo-DMA channel implemented in the CPU (Axis ETRAX FS). It is "pseudo" because the data still travels physically over the system bus from the FPGA to the CPU chip, where it is buffered in small chunks and then sent to system memory. On average it takes 5 bus cycles (100MHz clock) to transfer one 32-bit word (4 bytes) from the FPGA to system memory, which gives 80MB/sec. The model 353 has hardware provisions for a new feature of the ETRAX FS (not available in the ETRAX 100LX used in the earlier Elphel models 303, 313, 323 and 333): an external device (the FPGA in our case) can become the bus master. That could raise the data rate to virtually the full 400MB/sec (in large bursts), but no FPGA code has been developed yet to support this mode.
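The quoted rates follow directly from the bus timing. As a worked check (taking MB as 10^6 bytes, and assuming the 400MB/sec figure corresponds to one 32-bit word per bus cycle):

```latex
\frac{4\ \text{bytes}}{5\ \text{cycles}} \times 100 \times 10^{6}\ \frac{\text{cycles}}{\text{s}}
  = 80 \times 10^{6}\ \frac{\text{bytes}}{\text{s}},
\qquad
\frac{4\ \text{bytes}}{1\ \text{cycle}} \times 100 \times 10^{6}\ \frac{\text{cycles}}{\text{s}}
  = 400 \times 10^{6}\ \frac{\text{bytes}}{\text{s}}
```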

Circular buffer in Elphel 353 camera

The compressed frames are stored in a large circular buffer. This was done in a similar way in the previous cameras and with the other compressor (Ogg Theora), but the following description applies to the current state of the firmware (7.1.0.1) of the model 353 camera with the JPEG compressor (the Theora branch is not ported yet as of September 2007). Currently the buffer size, CCAM_DMA_SIZE, is ~19MB, as defined in include/asm-cris/elphel/c313a.h.

All DMA transfers are 32-byte aligned (32 bytes is the size of the internal DMA buffer in the Etrax). The data received from the FPGA contains the JPEG-encoded bit stream: everything between the SOF and EOF tags, with no header. All 0xff data bytes are escaped by a following 0x00 byte as required by the standard, so two 0xff bytes can never appear together. This fact is used to detect whether data in the circular buffer was overwritten by newer image frames.

When the FPGA has output the whole frame (it does not know the frame size in advance, only after compression is over) it adds 12 more bytes (3 long words), aligned to the end of a 32-byte chunk, with zeros inserted before them if needed. The last 4 bytes contain the 24-bit byte length of the JPEG data with the high byte set to 0xff. The 8 bytes before that (2 long words) hold the timestamp of the frame exposure start: seconds since the epoch and microseconds.
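The trailer layout described above can be read back as in the following minimal sketch. It assumes the circbuf is viewed as an array of 32-bit words in the camera's native byte order and that the caller already knows the index of the word carrying the 0xff marker. The type frame_trailer and the function parse_frame_trailer are hypothetical names used only for illustration; they are not part of the camera sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not taken from the camera sources. */
struct frame_trailer {
    uint32_t sec;     /* exposure start, seconds since epoch */
    uint32_t usec;    /* exposure start, microseconds */
    uint32_t length;  /* JPEG bitstream length in bytes (24 bits) */
};

/*
 * Decode the 3 long words that end at word index `last`, i.e. the word
 * whose high byte is 0xff. `buf` is the circbuf as 32-bit words and
 * `nwords` its size in words (indices wrap around, the buffer is circular).
 * Returns 0 on success, -1 if the 0xff marker byte is missing.
 */
static int parse_frame_trailer(const uint32_t *buf, size_t nwords,
                               size_t last, struct frame_trailer *out)
{
    uint32_t len_word = buf[last % nwords];

    if ((len_word >> 24) != 0xff)
        return -1;                       /* not a length word */

    out->length = len_word & 0x00ffffff; /* lower 24 bits */
    out->usec   = buf[(last + nwords - 1) % nwords];
    out->sec    = buf[(last + nwords - 2) % nwords];
    return 0;
}
```

The check of the high byte is only a sanity test; it does not by itself prove that the frame has not been overwritten.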

The following listing has a sample of the buffer data (generated by the /usr/local/bin/test_mmap):

4b7ff8:  000000a4  06000800  016c0120  0a3946c1  01ab010a  40404040  ffff0000  000378fe
000000: {d30b03cc} 8af1dc34  a5875e77  9d4cf334  204db683  b1533cc7  c59f949e  36b1af45
000008:  90e20102  5e9ad7f1  f68cd273  130a1dc5  7e50641c  41bd14ef  140720c5  73c0e02e
...
00de38:  20a50766  4e433fc9  7614f51c  087bd2e4  988ee64e  cdf41ca0  6af40e1c  0000003b
00de40:  00000000  00000000  00000000  00000000  00000000  46ff4a82  000303bd <ff0378fe>
00de48:  f6285e9f  aeacb0a5  00000000  00000000  00000000  46ff4a81  000a1930  ff037929
00de50: [00000000] 00000000  00000000  00000000  00000000  00000000  00000000  00000000

The addresses in the listing count 32-bit long words. The buffer is circular, so before 0 there is 0x4b7fff, the last long word in the buffer. In this example the FPGA compressed a frame starting from the very beginning of the circbuf. The bit stream is 0x378fe bytes long (as written at 0x00de47, OR-ed with 0xff000000) and it ends in the long word 0xde3f: "cc 03 0b d3 ... 1c 0e f4 6a 3b 00"

The timestamp (seconds=0x46ff4a82 and microseconds=0x000303bd) is stored at 0xde45 and 0xde46, right before the bitstream length. The internal Etrax DMA pointer (circbuf write pointer) is now equal to 0xde50, so the next frame will be acquired starting from that word. Data in the range 0xde48..0xde4f is garbage (you can see remnants of another frame acquired earlier from the same start address 0x0: "46ff4a81 000a1930 ff037929"). Actually the FPGA has already sent 32 bytes (8 long words) of zeros. They are now in the ETRAX DMA buffer and will be written to 0xde48..0xde4f as soon as the FPGA starts sending the next frame from address 0xde50. This 32-byte spare area after (address-wise) the frame makes sure that all the current frame data, including timestamp and length, is actually written to system memory when the frame is finished, without explicit flushing of the DMA buffer. The spare area after a frame cannot be used for anything else, but the one preceding a frame can, and this is what is now implemented in the firmware of the 353 camera.
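The addresses in this example can be reproduced with a little arithmetic. The sketch below is derived only from the sample listing (bitstream, then the 12-byte trailer padded to the end of a 32-byte chunk, then one 32-byte spare chunk); it is not taken from the driver, and the case where length+12 is already a multiple of 32 is an untested assumption. The function names are hypothetical, and all offsets are in bytes (listing address times 4).

```c
#include <stdint.h>

#define CHUNK_BYTES   32u  /* Etrax DMA buffer size, from the text */
#define TRAILER_BYTES 12u  /* timestamp (8 bytes) + length (4 bytes) */

/* Round n up to the next multiple of CHUNK_BYTES. */
static uint32_t round_up_chunk(uint32_t n)
{
    return (n + CHUNK_BYTES - 1u) & ~(CHUNK_BYTES - 1u);
}

/* Byte offset of the length word, the last long word of the trailer. */
static uint32_t length_word_offset(uint32_t start, uint32_t len,
                                   uint32_t buf_bytes)
{
    return (start + round_up_chunk(len + TRAILER_BYTES) - 4u) % buf_bytes;
}

/* Byte offset where the next frame starts (after the 32-byte spare area). */
static uint32_t next_frame_start(uint32_t start, uint32_t len,
                                 uint32_t buf_bytes)
{
    uint32_t used = round_up_chunk(len + TRAILER_BYTES) + CHUNK_BYTES;
    return (start + used) % buf_bytes;
}
```

For the frame in the listing (start 0, length 0x378fe, buffer of 0x4b8000 long words) this gives the length word at byte 0x3791c (long word 0xde47) and the next frame at byte 0x37940 (long word 0xde50). The older frame of 0x37929 bytes puts its length word at long word 0xde4f, which matches the remnants seen in the garbage area.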

Frame metadata stored in the circbuf (added by the software)

The interrupt service routine arch/cris/arch-v32/drivers/elphel/cc353.c::camSeq_interrupt is responsible for

  • updating the software pointer (JPEG_wp) after each new frame is compressed, so that it matches the location where the next frame bitstream will be output,
  • copying the bitstream length to the location just before the frame itself (to simplify traversing frames): it is now at 0x4b7fff (copied from the lower 24 bits of 0xde47),
  • writing two 0xff bytes just before the length as a marker of a valid frame header (incoming compressed bitstream data never has two 0xff bytes together, and the microsecond high byte that might be written at that location is always 0),
  • filling in the remaining 26-byte area right before the last acquired frame; it is now safe until the whole buffer is filled with image data. In the example above it starts at 0x4b7ff8.

The 26 (actually 28, as 0xffff marker is also included in the memcpy) bytes are copied from the data structure frame_params_t defined in include/asm-cris/elphel/c313a.h

struct frame_params_t {
/*00-03*/ unsigned long  exposure;       //! currently - exposure time measured in 100usec. Really need to change it to smaller increments?
/*04-05*/ unsigned short width;          //! frame width, pixels
/*06-07*/ unsigned short height;         //! frame height, pixels
/*08-11*/ unsigned long  colorsat;       //! matches FPGA format , for 1.0 it is DEFAULT_COLOR_SATURATION_RED<<16 + DEFAULT_COLOR_SATURATION_BLUE
/*12   */ unsigned char  color;          //! 0 - mono, 1 - color, 2 - jp4 + (0x40 - flipX) + (0x80 - flipY)
/*13   */ unsigned char  quality;        //! compression quality (%)
/*14   */ unsigned char  gamma;          //! (%), 255 - non-gamma curve, 0 - raw (16-bit data) - not yet implemented
/*15   */ unsigned char  black;          //! black level shift (255 - full scale)
/*16-17*/ unsigned short rscale;         //! 8.8 - red  relative to green - "gamma" table, not sensor gain
/*18-19*/ unsigned short bscale;         //! 8.8 - blue relative to green - "gamma" table, not sensor gain
/*20   */ unsigned char  gain_r;         //! color gain Red   (sensor specific), is set
/*21   */ unsigned char  gain_g;         //! color gain Green (sensor specific)
/*22   */ unsigned char  gain_b;         //! color gain Blue  (sensor specific)
/*23   */ unsigned char  gain_gb;        //! color gain Green in Blue line (sensor specific)
/*24   */ unsigned char  bindec_hor;     //! ((bh-1) << 4) | (dh-1) & 0xf (binning/decimation horizontal, 1..16 for each)
/*25   */ unsigned char  bindec_vert;    //! ((bv-1) << 4) | (dv-1) & 0xf (binning/decimation vertical  , 1..16 for each)
/*26-27*/ unsigned short signffff;       //! should be 0xffff - it will be a signature that JPEG data was not overwritten,
                                         //! JPEG bitstream can not have two 0xff after each other
/*28-31*/ unsigned long  timestamp_sec ; //! number of seconds since 1970 till the start of the frame exposure
/*32-35*/ unsigned long  timestamp_usec; //! number of microseconds to add
};

The last 8 bytes (timestamp_sec, timestamp_usec) are not copied: this information is provided by the FPGA following the bitstream (see above). These members of the structure can be used in a local copy in the application to store the timestamp data. Applying this data structure to the circbuf line from the example above (it is just one line)

4b7ff8:  000000a4  06000800  016c0120  0a3946c1  01ab010a  40404040  ffff0000  000378fe

you may decode that

  • exposure was [0x4b7ff8] 0xa4*100usec= 0.0164 s,
  • frame width - 0x800 (2048)
  • frame height - 0x600 (1536)
  • saturation red is 0x16c (actually it is raw FPGA data proportional to the actual saturation, which was 2.0 in that case)
  • saturation blue is 0x120 (same note as above, also 2.0; the scale for blue is different)
  • color/mirror mode was 0xc1 (normal color, flipX, flipY)
  • JPEG quality - 0x46 (70%)
  • gamma - 0x39 - 0.57 (57%)
  • black level shift (subtracted from the pixels before gamma on 0..256 scale) - 0x0a (10)
  • rscale - 0x010a, i.e. 0x01+(0x0a/0x100), about 1.04 - relative red-to-green scale used to build per-color component gamma tables
  • bscale - 0x01ab, i.e. 0x01+(0xab/0x100), about 1.67 - relative blue-to-green scale used to build per-color component gamma tables
  • gain_r - 0x40 - analog gain setting of the sensor red pixels (hardware dependent, here 4.0x)
  • gain_g - 0x40 - same for green
  • gain_b - 0x40 - same for blue
  • gain_gb - 0x40 - same for green pixels in blue lines
  • bindec_hor - 0 - binning and decimation horizontal is 1/1 (no binning, no decimation)
  • bindec_vert - 0 - binning and decimation vertical is 1/1 (no binning, no decimation)
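The decoding above can be sketched in code. The example reads the 32 bytes preceding a frame as little-endian bytes rather than casting to frame_params_t, so it does not depend on the host's structure packing or on the size of unsigned long. The struct frame_meta and the function decode_frame_meta are hypothetical names for illustration; the offsets come from the frame_params_t comments, with the copied bitstream length occupying bytes 28-31 as in the sample line.

```c
#include <stdint.h>

/* Hypothetical host-side view of the 32 bytes preceding a frame. */
struct frame_meta {
    uint32_t exposure_100us;            /* exposure, units of 100 usec */
    uint16_t width, height;             /* pixels */
    uint16_t sat_red, sat_blue;         /* raw FPGA saturation values */
    uint8_t  color_mode, flip_x, flip_y;
    uint8_t  quality, gamma, black;
    uint16_t rscale, bscale;            /* 8.8 fixed point */
    uint8_t  gain_r, gain_g, gain_b, gain_gb;
    uint8_t  bin_h, dec_h, bin_v, dec_v; /* 1..16 each */
    uint32_t length;                    /* bitstream length, bytes */
};

static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the 0xffff signature is absent. */
static int decode_frame_meta(const uint8_t hdr[32], struct frame_meta *m)
{
    uint32_t colorsat;

    if (le16(hdr + 26) != 0xffff)
        return -1;                      /* header was overwritten */

    colorsat = le32(hdr + 8);
    m->exposure_100us = le32(hdr + 0);
    m->width      = le16(hdr + 4);
    m->height     = le16(hdr + 6);
    m->sat_red    = (uint16_t)(colorsat >> 16);
    m->sat_blue   = (uint16_t)(colorsat & 0xffff);
    m->color_mode = hdr[12] & 0x3f;     /* 0 mono, 1 color, 2 jp4 */
    m->flip_x     = (hdr[12] >> 6) & 1; /* 0x40 */
    m->flip_y     = (hdr[12] >> 7) & 1; /* 0x80 */
    m->quality    = hdr[13];
    m->gamma      = hdr[14];
    m->black      = hdr[15];
    m->rscale     = le16(hdr + 16);
    m->bscale     = le16(hdr + 18);
    m->gain_r     = hdr[20];
    m->gain_g     = hdr[21];
    m->gain_b     = hdr[22];
    m->gain_gb    = hdr[23];
    m->bin_h      = (uint8_t)((hdr[24] >> 4) + 1);
    m->dec_h      = (uint8_t)((hdr[24] & 0xf) + 1);
    m->bin_v      = (uint8_t)((hdr[25] >> 4) + 1);
    m->dec_v      = (uint8_t)((hdr[25] & 0xf) + 1);
    m->length     = le32(hdr + 28) & 0x00ffffff;
    return 0;
}
```

Fed with the bytes of the sample line at 0x4b7ff8, this sketch yields the values listed above (2048x1536, quality 70, and so on).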

Circbuf data access - /dev/circbuf

User access to the circular buffer is provided by the /dev/circbuf device driver, currently located in arch/cris/arch-v32/drivers/elphel/circbuf.c. The device supports the read, write and poll methods, but the most important ones are mmap and the (overloaded) lseek. mmap provides direct access to the circbuf data, while lseek is overloaded with additional functions that are normally implemented in ioctl. It is a hack, of course, but it gives easy access to the buffer data (and metadata) from PHP scripts.

Normally lseek (also fseek, ftell) is used to move the file pointer and to find out the current pointer position. For example, these functions can be used to determine the file size: move the file pointer to the end of the file and read its value back. The last argument of the lseek (fseek) function (called whence for historical reasons) can take one of 3 values:

  • SEEK_SET - the passed offset value (2nd argument) is used directly as the new pointer position (i.e. fseek(file,0,SEEK_SET) places the pointer at the very beginning of the file)
  • SEEK_CUR - relative movement of the file pointer. lseek(fd,0,SEEK_CUR) returns the current position without modifying it (like ftell(file))
  • SEEK_END - moves the file pointer relative to the end of the file, so any positive offset moves the file pointer beyond the end of the file. This is not considered an error: normal files can be written to beyond the current file length. That does not apply to the current implementation of the circbuf (it has a fixed size), so SEEK_END with positive (>0) offset values is used to perform several specific actions on the pointers to circbuf data and to wait for the next image to be acquired. SEEK_END with offset==0 (or negative) behaves as in normal implementations: fseek(file, 0, SEEK_END) moves the file pointer just after the last byte in the circbuf. All accesses with SEEK_CUR and SEEK_SET use the fact that the buffer is circular: the file pointer just rolls over, so fseek(file, -2, SEEK_SET) positions the pointer just in front of the last 2 bytes of the circbuf.
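The rollover used by SEEK_SET and SEEK_CUR amounts to reducing the requested position modulo the buffer size. The following is a minimal sketch of that arithmetic only, not the driver code; wrap_offset is a hypothetical name, and the buffer size used in the usage note (0x12e0000 bytes) is inferred from the 0x4b8000 long words of the sample listing.

```c
#include <stdint.h>

/* Map any signed byte offset onto [0, size), as a circular buffer does. */
static int64_t wrap_offset(int64_t offset, int64_t size)
{
    int64_t r = offset % size;   /* C remainder keeps the sign of offset */

    return r < 0 ? r + size : r;
}
```

With this, wrap_offset(-2, 0x12e0000) gives 0x12dfffe, the position just in front of the last 2 bytes of the buffer.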

The values for the symbolic constants described below are defined in the include/asm-cris/elphel/c313a.h header file.

"Current pointer" or "selected frame" in the description below refers to the file pointer of an opened file. As soon as the /dev/circbuf file is closed (this usually happens between individual accesses through the web server), the value of that pointer is lost.

  • CIRCLSEEK_TORP - set file pointer to global (shared) read pointer. This pointer is maintained by the camera driver and preserves the value even when particular file access to circbuf is closed, it is valid for individual accesses to the camera through embedded web servers. But this pointer is the shared one, so it will have predictable values only if there is a single client to manipulate it.
  • CIRCLSEEK_TOWP - set file pointer to FPGA write pointer. This is the "hardware" pointer in the circbuf (next frame to be acquired). There is no frame available at that position in the circbuf, but it is a good place to wait for the next frame to come.
  • CIRCLSEEK_PREV - move the frame pointer to the previous frame (acquired just before the currently selected one). The call returns a negative number, an error (-EOVERFLOW), if there are no frames in the buffer older than the current one. That can be the case if not enough frames have been acquired since the last circbuf reset, or if newer frames have overwritten the ones before the current frame.
  • CIRCLSEEK_NEXT - advance the frame pointer to the next acquired frame; returns -EOVERFLOW if it was already at the last one (if the current pointer equals the FPGA write pointer)
  • CIRCLSEEK_LAST - move pointer to the last acquired frame (default after open). It is similar to a combination of CIRCLSEEK_PREV after CIRCLSEEK_TOWP, but no error will be generated if there are no frames in the buffer - pointer will stay at FPGA write pointer.
  • CIRCLSEEK_FIRST - move pointer to the oldest acquired frame that still is available in the buffer. It is OK to browse through the frames in the buffer including that one when the image acquisition is stopped, but it is not safe to rely on this pointer if more frames are expected - next incoming frame can overwrite this one.
  • CIRCLSEEK_SCND - move pointer to the second oldest acquired frame. It is slightly safer to use than CIRCLSEEK_FIRST when constant acquisition is on and the sensor provides new frames: this frame will likely survive longer and give you a chance to call CIRCLSEEK_NEXT, at least to buy some more time.
  • CIRCLSEEK_SETP - save the current pointer to the global read pointer. This makes it possible to continue from the same frame the next time the circbuf is accessed, as the global read pointer survives closing the circbuf file. But it is a global variable, so if several clients use this pointer they might interfere with each other.
  • CIRCLSEEK_VALID - verify that the frame at the current location is valid (not overrun in the buffer). Returns the file pointer if valid, otherwise an error.
  • CIRCLSEEK_READY - verify that the frame at the current location is available (valid and acquired). Returns the file pointer if ready, otherwise an error. If the file pointer was set to the FPGA write pointer it will pass validation with CIRCLSEEK_VALID (the pointer is still valid and not overrun by later data), but will fail CIRCLSEEK_READY (the frame is not ready). For all other pointers CIRCLSEEK_VALID and CIRCLSEEK_READY give identical results.
  • CIRCLSEEK_WAIT - this call puts the process to sleep until an image is available at the current pointer. The actual sleep takes place only if the current pointer is equal to the FPGA write pointer; in all other cases the call returns immediately. If the pointer was invalid (i.e. overrun) the call returns with an error, as there is no sense in waiting for a frame there. If the pointer is valid but not equal to the write pointer, the frame is already available and the call returns the current pointer immediately.
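As an illustration of how these commands combine, the sketch below waits for the next frame: it moves to the FPGA write pointer and then sleeps there until a frame arrives. It is only a sketch under stated assumptions. The numeric values given to CIRCLSEEK_TOWP and CIRCLSEEK_WAIT here are placeholders, NOT the real ones (those come from include/asm-cris/elphel/c313a.h), and wait_for_next_frame is a hypothetical helper name.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * PLACEHOLDER values for illustration only. On the camera, include the
 * real definitions from include/asm-cris/elphel/c313a.h instead.
 */
#ifndef CIRCLSEEK_TOWP
#define CIRCLSEEK_TOWP 2    /* assumed value, not verified */
#endif
#ifndef CIRCLSEEK_WAIT
#define CIRCLSEEK_WAIT 11   /* assumed value, not verified */
#endif

/*
 * `fd` is an open descriptor for /dev/circbuf. Moves the file pointer to
 * the FPGA write pointer, sleeps until a frame is available there, and
 * returns the frame pointer, or -1 on any error.
 */
static off_t wait_for_next_frame(int fd)
{
    off_t fp;

    /* Positive offsets with SEEK_END select the overloaded commands. */
    if (lseek(fd, CIRCLSEEK_TOWP, SEEK_END) < 0)
        return -1;

    fp = lseek(fd, CIRCLSEEK_WAIT, SEEK_END);
    if (fp < 0)
        return -1;          /* pointer was overrun or another error */

    return fp;
}
```

The returned pointer can then be used with mmap-ed circbuf data or passed on to /dev/jpeghead.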

Reading JPEG headers

The circbuf contains only the compressed bitstream and some metadata. The JPEG headers needed for complete JPEG images are generated separately by the driver and can be read through the /dev/jpeghead device driver. This file supports the read and lseek methods among others, and SEEK_END is again overloaded with an additional function: regenerating the header (quantization tables and image size parameters) according to the metadata stored in the circbuf for the particular frame. The driver accepts a frame pointer in the circbuf (it can be obtained with the lseek/ftell calls to /dev/circbuf described in the [Circbuf data access] section). Frames are always aligned to 32-byte boundaries, and lseek on /dev/jpeghead with SEEK_END and a positive offset always zeroes out the lower 5 bits. It is therefore safer to provide offset=frame_pointer+1: an offset of zero combined with SEEK_END would yield the file size instead of recalculating the header.

This call will return an error if no valid frame exists at the current location (rounded down to the nearest 32 bytes).
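The offset handling for /dev/jpeghead can be summarized in two small helpers. This is a sketch of the arithmetic described above, with hypothetical function names; the second function only mirrors what the driver is described as doing and is not taken from its source.

```c
#include <stdint.h>

/* Offset to pass to lseek(fd, offset, SEEK_END) on /dev/jpeghead. */
static uint32_t jpeghead_seek_offset(uint32_t frame_ptr)
{
    return frame_ptr + 1u;   /* never 0, even for a frame at offset 0 */
}

/* The driver is described as dropping the lower 5 bits of the offset. */
static uint32_t jpeghead_frame_from_offset(uint32_t offset)
{
    return offset & ~0x1fu;
}
```

Because frame pointers are 32-byte aligned, adding 1 does not change which frame the driver selects.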