Difference between revisions of "SoC"

From ElphelWiki
== Design Ideas ==
[[Design_Ideas|Design Ideas]]
=== FPGA Theora Encoder for Videoconferencing ===
 
In 2005 Elphel implemented a subset of the Theora video encoder in the Xilinx® FPGA of the Elphel model 333 camera, capable of compressing 1280x1024@30fps ([http://www.xilinx.com/publications/xcellonline/xcell_53/xc_pdf/xc_video53.pdf], [http://www.linuxdevices.com/articles/AT3888835064.html]), but the CPU in the camera was not fast enough for the job even with the hard part done in hardware. In the model 333 camera, software was responsible for generating the frame headers and for the Ogg encapsulation of the Theora bitstream provided by the FPGA.
 
Knowing that [http://developer.axix.com Axis Communications AB] was going to release a new, faster processor, we decided to wait for it before proceeding with Theora in the camera, and used plain old Motion JPEG in the meantime.
 
 
 
Now we have the new [[Roadmap#Update_on_353.2F363_cameras|Model 353 camera]] tested and released to production. It has a brand new [http://en.wikipedia.org/wiki/ETRAX_CRIS#ETRAX_FS ETRAX FS] processor, more memory and a larger FPGA, and it is already tested in JPEG mode. So now is the perfect time to resurrect the Theora code in the camera and move forward.
 
 
 
The current [http://elphel.cvs.sourceforge.net/elphel/camera333/fpga/x333 FPGA implementation] supports only INTRA and INTER NOMV frames - the goal was to provide efficient compression for scenes where the camera does not move (CCTV, videoconferencing) and a large part of the frame stays the same. To reduce the bandwidth further we need to use selective block encoding, so that if the camera is looking at an empty hallway there would be no bitstream at all apart from the INTRA frames - just a header saying that no block was encoded.
 
 
 
The ability to selectively encode blocks is already in the FPGA code, but we never used it with the slow CPU: the encoded block map is part of the frame header, and the header is currently built by software before the video starts. To move further we need either to add FPGA code that generates the frame headers or to make use of the faster processor and do it in software.
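
As a rough illustration of what a per-frame encoded block map could look like if it were built in software, here is a minimal Python sketch. It is not the FPGA or driver code: the function name, the 8x8 block size, the maximum-difference criterion and the threshold value are all assumptions made for this example.

```python
def coded_block_map(prev, curr, block=8, threshold=4):
    """Return a 2D list of booleans, True where a block should be encoded.

    prev, curr: 2D lists of luma values with identical dimensions, both
    multiples of `block`. The criterion (largest absolute pixel difference
    above `threshold`) is a hypothetical choice for illustration only.
    """
    rows, cols = len(curr), len(curr[0])
    result = []
    for by in range(0, rows, block):
        row_flags = []
        for bx in range(0, cols, block):
            # Largest change of any pixel inside this block
            max_diff = max(
                abs(curr[y][x] - prev[y][x])
                for y in range(by, by + block)
                for x in range(bx, bx + block)
            )
            row_flags.append(max_diff > threshold)
        result.append(row_flags)
    return result


# Empty hallway: two identical frames, so no block needs encoding
prev = [[10] * 16 for _ in range(16)]
curr = [row[:] for row in prev]
print(coded_block_map(prev, curr))
```

With a map like this the header of an unchanged frame would simply mark every block as not encoded, which is the "empty hallway" case described above.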
 
 
 
Such a project requires both FPGA code development (we use, and the rest of the code is written in Verilog HDL) and driver/application code (usually in C). When I was writing and debugging the code for the original encoder of the 333 camera I had to do both, but it would be nice to do such development in a team.
 
=== AJAX Camera Interface with PHP/fastCGI ===
 
In March 2006 I published an article in [http://linuxdevices.com LinuxDevices] - [http://linuxdevices.com/articles/AT5951285077.html "AJAX, LAMP, and liveDVD for a Linux-based camera"]. It was the result of an interesting project to create a camera web interface with sliders, semi-transparent overlays, an embedded MPlayer video plugin and other fancy features. It would be nice if you could create something like that with regular web development tools, but in that case I had to cheat - modify (and create new) CGI programs (mostly compiled C programs, some shell scripts) running in the camera and providing the server part of AJAX.
 
 
 
In the [[Roadmap#Update_on_353.2F363_cameras|Model 353 camera]] there is more memory (64MB system, 64MB video buffer and 128MB flash) and the CPU is three times faster. This makes it possible to expand the use of mainstream web development tools in the camera - to replace the binary CGI programs running on the server (in the camera) with [http://www.php.net/ PHP] code, which seems to run nicely in [http://www.fastcgi.com fastCGI] mode in the camera with the [http://www.lighttpd.net lighttpd] web server.
 
 
 
We plan to update the driver interface to simplify hardware interfacing with the PHP code (i.e. replace IOCTL with read/writes) and unleash the creativity of web developers. Instead of taking the API to the hardware as something given (in some cases even "taken", not "given" - when the API is unpublished and has to be reverse-engineered), you'll be able to create the one of your dreams.
 
 
 
Or the one of the dreams of [http://www.dvinfo.net/conf/ Digital Video enthusiasts]? See:
 
* [[HD_cinema_camera_Development_FAQ]]
 
* [http://dvinfo.net/conf/showthread.php?t=63677 High Definition with Elphel model 333 camera]
 
* [http://dvinfo.net/conf/showthread.php?t=83044 Elphel 333 (practical thread)]
 
* [http://dvinfo.net/conf/showthread.php?t=85568 Elphel 333 HTML]
 
 
 
=== LAMP-based DVR ===
 
=== Electronic Rolling Shutter Distortion Compensation ===
 
=== Demosaic Algorithms in FPGA ===
 
What demosaicing is and why it is needed is explained in the [http://en.wikipedia.org/wiki/Demosaicing Demosaicing] article in Wikipedia.
 
In Elphel cameras Bayer-encoded pixels are processed just in front of the compressor (JPEG/MJPEG, Ogg Theora). These compressors use 16x16 blocks of pixels converted to YCbCr (intensity and 2 color components). Currently we use 4:2:0, which means that the color (chroma) components have half the spatial resolution (in each direction) of the intensity (luma). So for each 4 input pixels the compressor needs 4 Y (luma) pixels and one each of Cb and Cr (chroma).
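
The sample counts for 4:2:0 can be checked with a few lines of Python. This is only an arithmetic sketch; the function names are made up for the example, and the 8x8 block size is the DCT block size used by JPEG and Theora.

```python
def samples_420(width, height):
    """Number of Y, Cb and Cr samples for a width x height area in 4:2:0."""
    luma = width * height
    # Chroma has half the resolution in each direction
    chroma = (width // 2) * (height // 2)
    return luma, chroma, chroma


def blocks_per_macroblock(mb=16, block=8):
    """Number of block x block DCT blocks of Y, Cb and Cr in one macroblock."""
    y, cb, cr = samples_420(mb, mb)
    per_block = block * block
    return y // per_block, cb // per_block, cr // per_block


print(samples_420(2, 2))        # 4 input pixels: (4, 1, 1)
print(blocks_per_macroblock())  # one 16x16 macroblock: (4, 1, 1)
```

So one 16x16 block of pixels becomes four 8x8 luma blocks plus one 8x8 block each of Cb and Cr, the same 4:1:1 ratio as for a single group of 4 pixels.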
 
 
 
Our first cameras used a very simple algorithm to calculate YCbCr from the Bayer pixels, and for each pixel it needed just a 3x3 block of neighbors. As these neighbors are also needed for the outer pixels of the 16x16 blocks, the FPGA had to read larger (18x18) overlapping blocks from the external memory (the internal FPGA memory is much smaller and cannot hold the whole image). In the later FPGA code 20x20 blocks are read in (to make it possible to implement fancier demosaic algorithms), but those algorithms were never implemented - the outer pixels are discarded and still only the 3x3 neighborhood is used.
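
The relation between the neighborhood size and the size of the overlapping blocks is simple arithmetic, sketched below in Python. The function names are hypothetical; the 5x5 case is just the largest neighborhood that fits in a 20x20 read, not a statement about what any particular algorithm needs.

```python
def read_block_size(tile=16, kernel=3):
    """Side of the block to read so every tile pixel has a full neighborhood."""
    margin = kernel // 2          # extra pixels needed on each side
    return tile + 2 * margin


def read_overhead(tile=16, kernel=3):
    """Ratio of pixels read from memory to pixels actually produced."""
    side = read_block_size(tile, kernel)
    return (side * side) / (tile * tile)


print(read_block_size(16, 3), read_overhead(16, 3))  # 18 1.265625
print(read_block_size(16, 5), read_overhead(16, 5))  # 20 1.5625
```

In other words, reading 20x20 blocks leaves room for a 5x5 neighborhood around each pixel at the cost of reading about 56% more data than the 16x16 block that is produced.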
 
 
 
There are several algorithms that provide good results with fewer artifacts (see the Wikipedia article); detailed descriptions of two of them:
 
* [http://scien.stanford.edu/class/psych221/projects/99/tingchen/algodep/vargra.html Variable Number of Gradients]
 
* [http://web.cecs.pdx.edu/~cklin/demosaic/ Pixel Grouping]
 
 
 
So just implement one in the FPGA code of the camera? Or adapt those ideas to the 4:2:0 encoding and convert directly Bayer->YCbCr (not Bayer->RGB->YCbCr)?
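
To make the second question more concrete, here is a small Python sketch that produces 4:2:0 output straight from Bayer data: luma for every pixel, but Cb and Cr only once per 2x2 cell, so no full-resolution chroma is ever built and then thrown away. It is only an illustration under stated assumptions: an RGGB pixel layout, a crude 3x3 averaging of each color (not the camera's actual algorithm and not one of the algorithms linked above), the JFIF conversion coefficients, and even image dimensions. It still interpolates R, G and B internally for each pixel.

```python
def bayer_color(y, x):
    """Color of the sensor pixel at (y, x), assuming an RGGB layout."""
    if y % 2 == 0:
        return "R" if x % 2 == 0 else "G"
    return "G" if x % 2 == 0 else "B"


def rgb_at(raw, y, x):
    """Average each color over the 3x3 neighborhood, clipped at the edges."""
    sums = {"R": 0.0, "G": 0.0, "B": 0.0}
    counts = {"R": 0, "G": 0, "B": 0}
    for ny in range(max(0, y - 1), min(len(raw), y + 2)):
        for nx in range(max(0, x - 1), min(len(raw[0]), x + 2)):
            color = bayer_color(ny, nx)
            sums[color] += raw[ny][nx]
            counts[color] += 1
    return tuple(sums[c] / counts[c] for c in "RGB")


def bayer_to_ycbcr420(raw):
    """Return (luma, cb, cr): full-size luma, half-size chroma planes."""
    h, w = len(raw), len(raw[0])
    luma = [[0.0] * w for _ in range(h)]
    cb = [[0.0] * (w // 2) for _ in range(h // 2)]
    cr = [[0.0] * (w // 2) for _ in range(h // 2)]
    for cy in range(0, h, 2):
        for cx in range(0, w, 2):
            acc = [0.0, 0.0, 0.0]
            for y in (cy, cy + 1):
                for x in (cx, cx + 1):
                    r, g, b = rgb_at(raw, y, x)
                    # One luma sample per input pixel
                    luma[y][x] = 0.299 * r + 0.587 * g + 0.114 * b
                    acc[0] += r
                    acc[1] += g
                    acc[2] += b
            # One chroma pair per 2x2 cell, from the cell's average color
            r, g, b = (v / 4 for v in acc)
            cb[cy // 2][cx // 2] = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
            cr[cy // 2][cx // 2] = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return luma, cb, cr


# A uniform gray 4x4 frame gives luma near 100 and chroma near 128
luma, cb, cr = bayer_to_ycbcr420([[100] * 4 for _ in range(4)])
print(len(luma), len(luma[0]), len(cb), len(cb[0]))
```

Because both the interpolation and the color conversion are linear, the two steps could in principle be merged into a single set of weights applied to the raw Bayer pixels, which is the kind of shortcut an FPGA implementation might exploit.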
 

Latest revision as of 20:58, 2 May 2007
