Difference between revisions of "Poky migration from rocko to warrior"

From ElphelWiki
Revision as of 16:10, 5 August 2019

Elphel's kernel tree

.
├── arch
│   └── arm
│       └── boot
│           └── dts/ # device trees for 393 cameras, considered tested
├── drivers
│   ├── ata
│   │   ├── ahci_elphel.c # tested reading and writing from/to SSD
│   │   └── libata-eh.c
│   ├── char
│   │   └── xilinx_devcfg.c # tested bitstream loading; brought back the old character device driver, which is simpler than the new FPGA Manager that can load only .bit.bin files
│   ├── clk
│   │   └── clk-si5338.c # chip found, no errors
│   ├── elphel
│   │   ├── circbuf.c # tested via recording
│   │   ├── clock10359.c
│   │   ├── command_sequencer.c # ok
│   │   ├── cxi2c.c
│   │   ├── detect_sensors.c
│   │   ├── elphel393-init.c # ok
│   │   ├── elphel393-mem.c # ok
│   │   ├── elphel393-pwr.c # ok
│   │   ├── exif393.c
│   │   ├── fpgajtag353.c
│   │   ├── framepars.c # ok
│   │   ├── gamma_tables.c # affects images which look ok
│   │   ├── histograms.c # displayed
│   │   ├── imu_log393.c
│   │   ├── jpeghead.c
│   │   ├── klogger_393.c
│   │   ├── lepton.c
│   │   ├── mt9f002.c
│   │   ├── mt9x001.c # sensor is programmed correctly
│   │   ├── multi10359.c
│   │   ├── pgm_functions.c # parameters are getting applied correctly (mt9p006)
│   │   ├── quantization_tables.c # images not broken
│   │   ├── sensor_common.c
│   │   ├── sensor_i2c.c
│   │   ├── x393.c
│   │   ├── x393_fpga_functions.c # ok
│   │   └── x393_videomem.c # also used in circbuf => recording => works
│   ├── misc
│   │   ├── ltc3589.c
│   │   └── vsc330x.c # switching between internal and external SSD ports works
│   ├── mmc
│   │   └── host
│   │       └── sdhci.c # this needed chip detect ORed with dat3: SDHCI_ANY_PRESENT = SDHCI_CARD_PRESENT | SDHCI_DAT3_PRESENT
│   ├── mtd
│   │   └── nand # added functions to work with OTP, tested only reading
│   │       ├── nand_base.c
│   │       ├── nandchip-micron.c
│   │       └── pl35x_nand.c
│   ├── net
│   │   └── ethernet
│   │       └── cadence
│   │           └── macb_main.c # needed fixup for Atheros chip - disable SmartEEE
│   └── rtc
│       └── rtc-m41t80.c # updated to the latest version. Our changes only ignore the oscillator failure at boot in m41t80_get_datetime().
├── helpers
│   └── si5338_register_map_dts.py # test it?
├── other
│   └── mem.py
└── patches
    ├── ahci.patch
    ├── drivers-elphel.patch
    ├── garmin_usb.c.patch
    └── libahci.patch
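
The sdhci.c entry in the tree above ORs the card-detect bit with DAT3. A minimal sketch of that bit logic, for illustration only: SDHCI_CARD_PRESENT uses the mainline sdhci.h value, while the value given to SDHCI_DAT3_PRESENT (bit 23, the DAT[3] line level in the Present State register) and the helper card_present() are assumptions, not taken from the Elphel patch.

```python
# Bit masks for the SDHCI Present State register.
SDHCI_CARD_PRESENT = 0x00010000   # card inserted (mainline sdhci.h value)
SDHCI_DAT3_PRESENT = 0x00800000   # ASSUMPTION: DAT[3] line level, bit 23
SDHCI_ANY_PRESENT = SDHCI_CARD_PRESENT | SDHCI_DAT3_PRESENT


def card_present(present_state: int) -> bool:
    """Hypothetical helper: the card counts as present if either bit is set."""
    return bool(present_state & SDHCI_ANY_PRESENT)
```

With this mask a card is reported even when only the DAT3 level is high, which is the behaviour the comment describes.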

[SOLVED] Note 1: Bring back fpga char device

  • /dev/xdevcfg got retired by Xilinx. Its replacement, the FPGA Manager, is unable to load a plain *.bit file (only *.bin or *.bit.bin).
  • Solution:
Brought back the old driver (drivers/char/xilinx_devcfg.c, plus edits to Kconfig and Makefile). It works as it used to.

[SOLVED] Note 2: Build php 5.6.40

  • php 5.6.40 is EOL and won't build: mysql has supposedly moved its header files.
  • Solution:
Disabled the mysql extensions by adding the following
to meta-elphel393/recipes-devtools/php/php_5.6.%.bbappend:
    PACKAGECONFIG[mysql] = "--without-mysql --without-mysqli --without-pdo-mysql"
    CFLAGS += " -ldl"

[SOLVED] Note 3: Entropy device hwrng

  • The new rng-tools package complains: Failed to init entropy source hwrng
  • Solution:
Leave as is for now. The full log is:
Initalizing available sources
Failed to init entropy source hwrng
Enabling JITTER rng support
Initalizing entropy source jitter
  • Comments:
    • Haven't found whether Xilinx provides any driver for /dev/hwrng
    • TODO: Find out if the order of entropy sources can be changed
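
One way to check whether the kernel has any hardware RNG backend registered is to read the hw_random sysfs attributes. The sketch below assumes the standard /sys/class/misc/hw_random directory with its rng_available and rng_current files; the function name hwrng_status() is made up for this example, and nothing here has been run on the camera.

```python
from pathlib import Path


def hwrng_status(sysfs_dir="/sys/class/misc/hw_random"):
    """Return the hw_random backends known to the kernel.

    An empty 'available' list suggests that no hardware RNG driver is
    registered, which would be consistent with rngd failing to init 'hwrng'.
    """
    def read(name):
        try:
            return (Path(sysfs_dir) / name).read_text().strip()
        except OSError:
            return ""  # attribute missing or unreadable

    current = read("rng_current")
    return {
        "available": read("rng_available").split(),
        # The kernel reports "none" when no backend is selected.
        "current": None if current in ("", "none") else current,
    }


if __name__ == "__main__":
    print(hwrng_status())
```

If the list comes back empty, rngd falling back to the jitter source is the expected outcome rather than a misconfiguration.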

- Note 4: PHP causing 'unsupported FP instruction in kernel mode'

  • autocampars.php runs at boot and sometimes causes a kernel Oops:
[   35.872118] BUG: unsupported FP instruction in kernel mode
[   35.877621] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[   35.883380] Modules linked in:
[   35.886498] CPU: 1 PID: 1756 Comm: php Not tainted 4.14.0-xilinx-v2018.3 #1
[   35.893459] Hardware name: Xilinx Zynq Platform
[   35.897989] task: ee83f280 task.stack: ef1d6000
[   35.902527] PC is at vfp_reload_hw+0x30/0x44
[   35.906802] LR is at __und_usr_fault_32+0x0/0x8
[   35.911338] pc : [<c0102e10>]    lr : [<c010c280>]    psr: a0000013
[   35.917529] sp : ef1d7fb0  ip : 00000051  fp : 00000001
[   35.922813] r10: ef1d61f8  r9 : c010c308  r8 : ee9893c0
[   35.928040] r7 : 00000001  r6 : 00400100  r5 : c0138d08  r4 : ecd600f8
[   35.934569] r3 : c0c6c064  r2 : b67bde8c  r1 : ecd9a224  r0 : eeb00a40
[   35.941098] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   35.948241] Control: 18c5387d  Table: 2cda404a  DAC: 00000051
[   35.953993] Process php (pid: 1756, stack limit = 0xef1d6210)
[   35.959740] Stack: (0xef1d7fb0 to 0xef1d8000)
[   35.964020] 7fa0:                                     a5f43f50 a5f43e18 00000080 00000000
[   35.972269] 7fc0: 00000000 a5f43f4c b687b338 000000ae 00000000 bedcdfe4 00000001 a5f43ffc
[   35.980385] 7fe0: a5f43f50 a5f43d7c b676cf78 b67bde8c 60000010 ffffffff 00000000 00000000
[   35.988626] Code: 128aa080 e89a0162 e3110102 0a000003 (eee96a10) 
[   35.994724] ---[ end trace 06029778db6d2d90 ]---
[   35.999422] note: php[1756] exited with preempt_count 2

Unsupported floating point instruction in kernel?

  • Is it hardware (some faulty board? temperature based?) or kernel or php?
  • solution?:
Took arch/arm/vfp/vfpmodule.c from kernel 4.19 (the current kernel is 4.14).
It didn't help. Next step: roll back and check which php call triggers it.
It might also be a Linux driver.
  • TODO: keep an eye on this, because the root cause has not been investigated
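
A possible starting point for the investigation: the Code: line of the oops marks the faulting instruction word in parentheses (eee96a10). The sketch below decodes such a word according to the A32 VMRS/VMSR encoding. The function name and the register table are written for this example only, and the decode by itself does not explain why that instruction was reached.

```python
# System register numbers used by VMRS/VMSR (bits 19:16 of the instruction).
VFP_SYSREGS = {0: "FPSID", 1: "FPSCR", 6: "MVFR1", 7: "MVFR0",
               8: "FPEXC", 9: "FPINST", 10: "FPINST2"}


def decode_vfp_sysreg_access(word):
    """Decode an ARM (A32) VMRS/VMSR instruction word.

    Returns (mnemonic, system register, core register number), or None if
    the word is not a VMRS/VMSR encoding.
    """
    # Fixed bits: [27:21] = 1110111 and [11:0] = 0xA10.
    # The condition, L bit, system register and Rt fields are variable.
    if (word & 0x0FE00FFF) != 0x0EE00A10:
        return None
    mnemonic = "vmrs" if word & (1 << 20) else "vmsr"
    reg = (word >> 16) & 0xF
    rt = (word >> 12) & 0xF
    return mnemonic, VFP_SYSREGS.get(reg, "reg%d" % reg), rt


if __name__ == "__main__":
    print(decode_vfp_sysreg_access(0xEEE96A10))
```

For the word from the log this yields a write of r6 to FPINST, which is at least consistent with the oops being reported inside vfp_reload_hw, where the saved VFP state is reloaded.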

[SOLVED] Note 5: Bring up NAND OTP support

  • MAC is not read from NAND, displays the default: 00:0e:64:10:00:00
  • Problem?
[    3.639851] elphel393-init: Flash page read, code -95
  • Comments:
    • Look up what has changed.
  • Solution: (for xlnx_rebase_v4.14 branch of linux-xlnx):
In drivers/mtd/nand/nand_base.c, nand_scan_tail() calls nand_manufacturer_init(),
which maps to a new driver, drivers/mtd/nand/nand_micron.c.
When that call fails, the driver init fails and the mtd functions do not get assigned,
so the driver that reads from the OTP area (drivers/elphel/elphel393-init.c) returns
-95, which is EOPNOTSUPP.
The quick fix is to fall through instead of failing.
The function exits with an error because it decides that it does not support
forcefully enabled on-die ECC. This still needs to be investigated.
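
For reading such log lines quickly, the numeric code can be mapped back to its errno name. This is a small illustrative sketch: the table lists only a handful of Linux errno values (from include/uapi/asm-generic/errno-base.h and errno.h), and decode_return_code() is a made-up helper, not part of any Elphel tool.

```python
import re

# A few Linux errno values; extend as needed.
LINUX_ERRNO = {1: "EPERM", 2: "ENOENT", 5: "EIO", 12: "ENOMEM",
               19: "ENODEV", 22: "EINVAL", 95: "EOPNOTSUPP", 110: "ETIMEDOUT"}


def decode_return_code(log_line):
    """Find a 'code -N' fragment in a kernel log line and name the errno."""
    match = re.search(r"code (-?\d+)", log_line)
    if match is None:
        return None
    code = int(match.group(1))
    # Kernel functions return negative errno values, hence the sign flip.
    return code, LINUX_ERRNO.get(-code, "unknown")


if __name__ == "__main__":
    line = "[    3.639851] elphel393-init: Flash page read, code -95"
    print(decode_return_code(line))
```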

[SOLVED] Note 6: udev - unknown group 'kvm'

  • Problem:
[    5.817352] udevd[1478]: starting version 3.2.7
[    5.918028] udevd[1478]: specified group 'kvm' unknown
[    5.986364] udevd[1479]: starting eudev-3.2.7
[    6.142897] udevd[1479]: specified group 'kvm' unknown
  • Solution:
KVM is the Kernel-based Virtual Machine. Remove its rule for now (and maybe forever):
.
└── udev
    ├── eudev
    │   └── 50-udev-default.rules
    └── eudev_3.2.7.bbappend
The modified 50-udev-default.rules gets installed over the original file.
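
If the rules file has to be regenerated after a future eudev update, the edit could be scripted. The sketch below is one possible approach and not how the file in meta-elphel393 was produced; drop_group_rules() is a hypothetical helper that simply drops every non-comment line assigning the given group.

```python
def drop_group_rules(rules_text, group="kvm"):
    """Remove udev rule lines that assign the given group.

    Comment lines and unrelated rules are kept untouched.
    """
    needle = 'GROUP="%s"' % group
    kept = [line for line in rules_text.splitlines()
            if line.lstrip().startswith("#") or needle not in line]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Illustrative input only; these lines are not copied from eudev.
    sample = ('KERNEL=="kvm", GROUP="kvm", MODE="0660"\n'
              'KERNEL=="null", MODE="0666"\n')
    print(drop_group_rules(sample), end="")
```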

[SOLVED] Note 7: Add back fixup for Atheros to updated ethernet driver

  • Problem:
The Ethernet driver's structure has changed: it was split into several files.
It lives in drivers/net/ethernet/cadence/.
  • Solution:
For our Ethernet chip (Atheros 80xx), a fixup had to be added to disable SmartEEE.
It's a single function, a call to it and a couple of defines; all of it was added back to the new driver structure.

[SOLVED] Note 8: u-boot update

  • update u-boot
  • solution:
Updated to mainline u-boot 2019.07
- converted our *.h (with params used to generate SPL header) to Kconfigs
- updated driver for NAND flash - tested both boot modes - mmc and nand

[SOLVED] Note 9: test camogm

  • test camogm
/var/state/camogm_cmd accepts only the first write. Switch to polling?
With polling, the buffer overflows during recording, probably because the polling version does not work correctly.
Everything works in the version without polling after adding an EOF reset (clearerr(npipe)) right after reading from the pipe and checking feof().
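
The need for clearerr() may be related to glibc 2.28 and later treating end-of-file as a sticky condition in stdio; that link is an assumption and was not verified against camogm. The sketch below is not the camogm code. It works one level lower, on the raw descriptor, to show that the FIFO itself keeps delivering data after a writer has closed, so clearing the flag on the FILE stream is enough. The names cmd_pipe, read_available() and demo() are hypothetical, and the example needs a POSIX system.

```python
import os
import tempfile


def read_available(fd, size=4096):
    """Read whatever is pending on a non-blocking FIFO descriptor.

    Returns b"" both at end-of-file (no writer) and when a writer is
    connected but has not written anything yet.
    """
    try:
        return os.read(fd, size)
    except BlockingIOError:
        return b""


def demo():
    """Send two commands from two separate writers to one FIFO reader."""
    received = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cmd_pipe")
        os.mkfifo(path)
        reader = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        try:
            for command in (b"start\n", b"stop\n"):
                writer = os.open(path, os.O_WRONLY)
                os.write(writer, command)
                os.close(writer)  # the reader sees EOF once the data is drained
                received.append(read_available(reader))
                read_available(reader)  # hits EOF; the descriptor stays usable
        finally:
            os.close(reader)
    return received


if __name__ == "__main__":
    print(demo())
```

Both commands arrive even though the reader hit end-of-file in between, which matches the observation that only the stdio-level EOF state had to be reset.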

[SOLVED] Note 10: test streamer

  • test streamer
Streamer works

[SOLVED] Note 11: test AHCI driver

  • test ahci driver
  • results:
- SSD is detected and automounted
- write/read works

[SOLVED] Note 12: test raw recording

  • test recording on a raw partition
  • comments:
There was a typo in camogm_align.c: it was not aligning when it should have been.
CHUNK_LEADER was changed to CHUNK_HEADER in line 339:
...
if (chunks[CHUNK_HEADER].iov_len != 0){ // only if it is not TIFF
...

[SOLVED] Note 13: FLIR Lepton 3.5 sensor: NULL pointer dereference

  • Solution:
Forgot to pull the latest device tree with the Lepton description.
The old device tree didn't have an i2c configuration for the Lepton, hence something returned NULL.
  • Original log:
framepars_operations elphel393-framepars@0: Configuring compressor DMA channels
circbuf elphel393-circbuf@0: Setting i2c drive mode for port 0
circbuf elphel393-circbuf@0: register_i2c_sensor()
detect_sensors elphel393-detect_sensors@0: detect_sensors_par2addr_init(): sensorPortConfig[0].sensor[0] = 0x44
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = ecdb4000
[00000000] *pgd=00000000
Internal error: Oops - BUG: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 1 PID: 1755 Comm: php Not tainted 4.14.0-xilinx-v2018.3 #1
Hardware name: Xilinx Zynq Platform
task: ee80cd80 task.stack: ecda0000
PC is at register_i2c_sensor+0x244/0x2ac
LR is at 0x0
pc : [<c05a19e8>]    lr : [<00000000>]    psr: 60030013
sp : ecda1480  ip : ecda14a8  fp : 00000000
r10: c0ee625c  r9 : 000000fc  r8 : 00000000
r7 : 00000028  r6 : ecda14a8  r5 : c0c3ca58  r4 : 00000000
r3 : 00000000  r2 : c09b093a  r1 : ee973c91  r0 : 00000000
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 18c5387d  Table: 2cdb404a  DAC: 00000051
Process php (pid: 1755, stack limit = 0xecda0210)
Stack: (0xecda1480 to 0xecda2000)
...
---[ end Kernel panic - not syncing: Fatal exception in interrupt