An image sensor board based on Arduino Due and uCamII camera

runs on Arduino Due/MEGA and Teensy3.2 with a uCamII camera
support for IEEE 802.15.4 and long-range LoRa^TM radios

C. Pham, LIUPPA laboratory, University of Pau, France. http://cpham.perso.univ-pau.fr/

Introduction

There are a number of image sensor boards available or proposed by the very active research community on image and visual sensors: Cyclops, MeshEyes, Citric, WiCa, SeedEyes, Panoptes, CMUcam3&FireFly, CMUcam4, CMUcam5/PIXY, iMote2/IMB400, ArduCam, ... All these platforms and/or products are very good and our motivations in building our own image sensor platform for research on image sensor surveillance applications are:

Architecture and components

We tested with the Arduino Due, Arduino MEGA2560 and Teensy3.2. The Arduino Due board (left) and the Teensy3.2 (left) have enough SRAM (96kB and 64kB respectively) to store a 128x128 8-bit/pixel RAW image (16384 bytes). On the MEGA2560, which has only 8KB of SRAM, we store the captured image on an SD card (see the middle figure below for an example) and then perform the encoding process by incrementally reading small portions of the image file. The MEGA and other small-memory platforms are only for validation: they can be quite unstable. The Due and the Teensy are much more reliable.

The encoding scheme is the one described in our test-bed pages and it has been ported to the Arduino Due (and later tested on the Teensy3.2) with very few modifications. On the MEGA2560, the packetization procedure has been modified by V. Lecuire to produce packets on-the-fly, during the encoding process. With the SD card storing the captured image and the modified packetization process, the entire Arduino sketch fits in the 8KB of SRAM of the MEGA2560. The final result is shown below for the Arduino Due (left) and the MEGA (right), where we use the Wireless SD Shield from Arduino to get an embedded SD card slot.

Here is a detailed view of the connections (the image shows the Arduino Due but the connection layout is exactly the same for the MEGA).

The Teensy3.2 is a nice board in a smaller format than the Arduino Due. The LoRa module is connected to the SPI pins (see above) and the uCam is connected to UART1 (RX1: pin 0, TX1: pin 1).

Remote commands

The image sensor accepts the following ASCII commands sent wirelessly. A command string must be prefixed by "/@" and each command in it must be terminated by "#" (for instance "/@T130#").

For instance, you can send "/@Z90#Q60#T30#" to set the MSS to 90 bytes, the quality factor to 60 and start the capture, encoding and transmission of the image with an inter-packet time of 30ms. The quality factor can be set differently for each image. Here are some image samples taken with our image sensor to show the impact of the quality factor on the image size and visual quality.
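The command syntax above can be sketched with a small helper that builds and splits such strings. This is only an illustration of the "/@" prefix and "#" terminator convention: the function names are hypothetical, and only the command letters Z, Q and T that appear in the text are used.

```python
def build_command(params):
    """Build a command string such as "/@Z90#Q60#T30#" from (letter, value) pairs."""
    return "/@" + "".join(f"{letter}{value}#" for letter, value in params)


def parse_command(command):
    """Split a command string back into (letter, value) pairs (values stay strings)."""
    if not command.startswith("/@"):
        raise ValueError("command must start with '/@'")
    # Drop the prefix, then split on '#'; the trailing '#' yields an empty field.
    fields = [field for field in command[2:].split("#") if field]
    return [(field[0], field[1:]) for field in fields]


# MSS of 90 bytes, quality factor 60, capture and send with 30ms between packets.
cmd = build_command([("Z", 90), ("Q", 60), ("T", 30)])
print(cmd)
print(parse_command(cmd))
```

Keeping the values as strings in `parse_command` avoids guessing the type of each parameter, since some commands may carry non-numeric arguments such as an address.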

[Obsolete: we now use our LoRa gateway instead, see the LoRa image section] The 1-hop scenario is depicted below, where we use an XBee gateway at the receiving side (connected to a Linux machine) configured at 115200 baud. We first have to tell the image sensor the destination address. You have to start the receiver side first.

    > python SerialToStdout.py /dev/ttyUSB0 | ./display_image -vflip -timer 4 -framing 128x128-test.bmp
    set to framing mode
    Wait for image, original BMP file is 128x128-test.bmp, QualityFactor is 50
    Display timer is 4s
    Creating file tmp_1-128x128-test.bmp.Q20.dat for storing the received image data file
    Wait for image

Image encoding

The encoding and packetization process at the sender side produces packets of variable size, but the maximum size is bounded by the MSS, which is set here with the "Z90" command (MSS=90 bytes). The quality factor in this scenario is 50. Here is an example of the encoded data, split into packets (shown with different colors in the original page), that will be transmitted wirelessly.

FF 50 00 32 56 00 00 E5 49 48 74 E7 F7 9B 9C 0F 17 B7 D9 21 AB C0 0B 40 71 02 F9 A5 A4 E8 48 6C C5 97 CC A0 63 03 ED 2A 36 00 E2 83 B0 9E 46 27 1B 4E 44 A9 BC 5E 22 39 F1 19 73 2A 21 64 52 35 A3 18 64 CE 8D 7A 3B F5 91 46 A7 2E 8D E0 D2 59 98 6C BA 1B 54 A2 5C 34 18 1F 1F FF 50 01 32 52 00 0B C1 36 7F 01 C4 1C 88 BB DB 92 A7 4D 30 C9 9E 5B 17 4E CD EF E5 C8 65 6E 59 72 99 BC B0 A8 CE CC 03 A3 38 DE 9F 57 07 61 D1 4B 9C 25 0C AF BB 78 F8 F9 90 CE 75 E0 85 47 A9 BF A9 08 1D 72 B8 68 F6 3B 84 8C 81 CC 87 7E 16 C1 49 43 E2 27 53 7F FF 50 02 32 51 00 15 E8 44 11 51 CF 70 A1 63 47 DA D4 54 D9 06 FA 46 01 25 A8 23 26 D8 A2 14 70 F6 20 4E 1B 60 B3 DD C0 E8 C3 86 01 BE 8A CC C2 5C 0E E9 86 14 AD 4C 96 B7 D2 39 0A 8F 3B A4 22 35 AC 66 58 C8 C6 64 1E 1C 16 C2 6E 69 14 CD 3B E5 18 C8 28 4E 7F ...

The first 5 bytes are the framing bytes, normally defined as follows: the first 2 bytes are 0xFF followed by a value from 0x50 to 0x54 for image packets. Then come a sequence number, the quality factor (50 is 0x32) and the packet size. The 2 bytes following the framing bytes are the offset of the data in the image. Because every packet carries its own offset, the receiver can tolerate packet losses and out-of-order reception. Then come the encoded data.
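The framing described above can be sketched as a small parser. This is a minimal illustration, not the display_image implementation: the names are hypothetical, and two points are assumptions inferred from the hex dump rather than stated in the text, namely that the size byte counts the 2 offset bytes plus the encoded data, and that the offset is big-endian. The sample is the first packet of the dump above.

```python
from dataclasses import dataclass

FRAMING_LEN = 5  # 0xFF, flow id (0x50..0x54), sequence number, quality factor, size


@dataclass
class ImagePacket:
    flow_id: int
    seq: int
    quality: int
    offset: int   # offset of the data in the image (assumed big-endian)
    data: bytes   # encoded data, without the 2 offset bytes


def parse_packets(stream):
    """Extract image packets from a byte stream, skipping bytes that are not a frame start."""
    packets = []
    pos = 0
    while pos + FRAMING_LEN <= len(stream):
        if stream[pos] != 0xFF or not 0x50 <= stream[pos + 1] <= 0x54:
            pos += 1  # not a frame start, try the next byte
            continue
        flow_id, seq, quality, size = stream[pos + 1:pos + FRAMING_LEN]
        body = stream[pos + FRAMING_LEN:pos + FRAMING_LEN + size]
        if size < 2 or len(body) < size:
            break  # malformed or truncated packet: stop parsing
        offset = int.from_bytes(body[:2], "big")
        packets.append(ImagePacket(flow_id, seq, quality, offset, bytes(body[2:])))
        pos += FRAMING_LEN + size
    return packets


# First packet of the dump above (5 framing bytes followed by 0x56 = 86 bytes).
first_packet = bytes.fromhex(
    "FF 50 00 32 56 00 00 E5 49 48 74 E7 F7 9B 9C 0F 17 B7 D9 21 AB C0 0B 40 "
    "71 02 F9 A5 A4 E8 48 6C C5 97 CC A0 63 03 ED 2A 36 00 E2 83 B0 9E 46 27 "
    "1B 4E 44 A9 BC 5E 22 39 F1 19 73 2A 21 64 52 35 A3 18 64 CE 8D 7A 3B F5 "
    "91 46 A7 2E 8D E0 D2 59 98 6C BA 1B 54 A2 5C 34 18 1F 1F"
)
packets = parse_packets(first_packet)
for p in packets:
    print(p.seq, p.quality, p.offset, len(p.data))
```

Advancing by the declared size after each packet, rather than searching for the next 0xFF, avoids mistaking encoded data bytes for a frame start.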

The display_image program running at the receiver side receives the encoded image and writes it to a file. This file is then decoded into a BMP file that is displayed. See more explanations in our test-bed pages.

The encoded file therefore has the following content, where you can see that the framing bytes have been removed.

00 00 E5 49 48 74 E7 F7 9B 9C 0F 17 B7 D9 21 AB C0 0B 40 71 02 F9 A5 A4 E8 48 6C C5 97 CC A0 63 03 ED 2A 36 00 E2 83 B0 9E 46 27 1B 4E 44 A9 BC 5E 22 39 F1 19 73 2A 21 64 52 35 A3 18 64 CE 8D 7A 3B F5 91 46 A7 2E 8D E0 D2 59 98 6C BA 1B 54 A2 5C 34 18 1F 1F 00 0B C1 36 7F 01 C4 1C 88 BB DB 92 A7 4D 30 C9 9E 5B 17 4E CD EF E5 C8 65 6E 59 72 99 BC B0 A8 CE CC 03 A3 38 DE 9F 57 07 61 D1 4B 9C 25 0C AF BB 78 F8 F9 90 CE 75 E0 85 47 A9 BF A9 08 1D 72 B8 68 F6 3B 84 8C 81 CC 87 7E 16 C1 49 43 E2 27 53 7F 00 15 E8 44 11 51 CF 70 A1 63 47 DA D4 54 D9 06 FA 46 01 25 A8 23 26 D8 A2 14 70 F6 20 4E 1B 60 B3 DD C0 E8 C3 86 01 BE 8A CC C2 5C 0E E9 86 14 AD 4C 96 B7 D2 39 0A 8F 3B A4 22 35 AC 66 58 C8 C6 64 1E 1C 16 C2 6E 69 14 CD 3B E5 18 C8 28 4E 7F ...

During operation, the image sensor uses the 2 LEDs to indicate status and errors, as the image sensor can run on battery without being connected to a computer. For your first test, connect the Arduino Due to the computer and use the serial monitor.

Multi-hop image transmission

A multi-hop image transmission scenario can easily be set up using our relay nodes (see the relay node web page) and follows the example described in our test-bed pages.

Download

Simple intrusion detection application

We implemented an intrusion detection mechanism based on "simple-differencing" of pixels: each pixel of the image from the uCam is compared to the corresponding pixel of a reference image, taken previously at startup of the image sensor and stored in memory (for the Due and Teensy) or in a file on the SD card (for the MEGA2560). When the absolute difference between two pixels is greater than PIX_THRES, we increment the number of different pixels, N_DIFF. When all the pixels have been compared, if N_DIFF is greater than NB_PIX_THRES we can assume an intrusion. However, in order to take into account slight changes in luminosity due to the camera, when N_DIFF is greater than NB_PIX_THRES we additionally compute the mean luminosity difference between the captured image and the reference image, noted LUM_DIFF. Then we re-compute N_DIFF using PIX_THRES+LUM_DIFF as the new threshold. If N_DIFF is still greater than NB_PIX_THRES we conclude that there is an intrusion and trigger the transmission of the image. Additionally, if no intrusion occurs during 5 minutes, the image sensor takes a new reference image to take into account changes in light conditions.
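The two-pass test described above can be sketched as follows. This is an illustration and not the sketch's actual code: the function names and the threshold values (15 and 300) are hypothetical, and LUM_DIFF is taken here as the absolute difference between the mean pixel values of the two images, which is one possible reading of "mean luminosity difference".

```python
def count_different_pixels(image, reference, threshold):
    """N_DIFF: number of pixels whose absolute difference exceeds the threshold."""
    return sum(1 for p, r in zip(image, reference) if abs(p - r) > threshold)


def detect_intrusion(image, reference, pix_thres, nb_pix_thres):
    """Return True when the simple-differencing test concludes that there is an intrusion."""
    n_diff = count_different_pixels(image, reference, pix_thres)
    if n_diff <= nb_pix_thres:
        return False
    # Second pass: compensate for a global luminosity change.
    # Assumption: LUM_DIFF is the absolute difference of the mean pixel values.
    lum_diff = abs(sum(image) - sum(reference)) / len(reference)
    n_diff = count_different_pixels(image, reference, pix_thres + lum_diff)
    return n_diff > nb_pix_thres


# Hypothetical thresholds and synthetic 128x128 8-bit images.
PIX_THRES, NB_PIX_THRES = 15, 300
reference = bytes([100]) * 16384
brighter = bytes([120]) * 16384          # uniform luminosity change only
intruder = bytearray(reference)
intruder[:2000] = bytes([255]) * 2000    # a bright object covering 2000 pixels

print(detect_intrusion(brighter, reference, PIX_THRES, NB_PIX_THRES))
print(detect_intrusion(intruder, reference, PIX_THRES, NB_PIX_THRES))
```

With these synthetic images, the uniform brightness change fails the first pass but is absorbed by the adjusted threshold, while the localized change still exceeds it.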

In order to enable this behavior you have to compile the sketch with the following define statements uncommented:

Some energy consumption measures

This section presents some energy measurements made on the Due and MEGA platforms. We inserted additional power consumption by toggling an LED in order to better identify the various phases of the image sensor operations in the measurements. For all the energy tests, the transmitted image was encoded using a quality factor of 50 and between 45 and 49 packets were produced at the packetization stage. The objective here is not to have a complete energy map with varying quality factors and packet numbers, but to have an approximate idea of the energy consumption on both platforms. The figure below (left) shows an entire cycle of camera sync, camera config, data read, data encode and packetization with transmission on the Due. The right part shows the energy consumption during a periodic intrusion detection process. We forced the intrusion detection to return NO-INTRUSION in order to only read data from the camera and perform the comparison with a reference image. In both figures the x-axis is the time in seconds from the beginning of the energy capture process and the y-axis is the consumed energy in Joules per time interval of 2ms.

In the left figure we can compute the baseline energy consumption of the Due once the camera has switched to sleep mode (this happens after 15s of being idle; we waited long enough before starting the energy measurement process). We measured this consumption at 1.39J/s. Note that we did not implement any advanced power saving mechanisms such as putting the micro-controller in deep sleep mode or at a lower frequency, performing ADC reduction, or powering off the radio module. It is expected that the baseline consumption can be further decreased with a more advanced power management policy. After removing the energy consumed by the LED, we found that an entire cycle of image acquisition, encoding and transmission consumes about 6J. The largest part of the energy consumed on the Due comes from polling the serial line to get the image data from the uCam (through the system serial buffer). The encoding process actually consumes less than half that amount of energy.

To perform the intrusion detection, the Due consumes about the same amount of energy as when just reading the image data, which confirms that the simple-differencing mechanism introduces no noticeable additional cost. When no intrusion is detected, there is no need to encode or transmit the image; in that case we measured the energy consumption of the intrusion detection task at 3.571J. If an intrusion is detected, we just have to add the energy consumption of the encoding and transmission phases shown in the left figure.

The energy measurements also carry time information in 2ms increments. The figure below shows the detailed energy consumption along with time information for the various phases of the image sensor. The last line shows the total time and the total energy consumption in Joules after removing the hard-coded delays and the additional energy consumption introduced by the LED synchronization mechanism (values highlighted in yellow).

As the encoding time was found to be quite constant except for high values of the quality factor (see column "global encode time"), we can see that the time needed to read data from the uCam and to encode the image data is quite consistent with the measurements shown previously. For instance, if we look at column "global encode+transmit time" and at the line corresponding to 48 packets, the "global encode+transmit time" was found to be 1.088s. In the table above, if we add the encoding time (0.551s) and the transmission time (0.594s) we find 1.145s.
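The consistency check above is simple arithmetic and can be reproduced with the three durations quoted in the text (the variable names are ours):

```python
# Durations quoted in the text, in seconds.
global_encode_transmit_s = 1.088  # "global encode+transmit time" for 48 packets
encode_s = 0.551                  # encoding time from the energy table
transmit_s = 0.594                # transmission time from the energy table

summed = encode_s + transmit_s
relative_gap = (summed - global_encode_transmit_s) / global_encode_transmit_s
print(f"sum of phases: {summed:.3f}s, relative gap: {relative_gap:.1%}")
```

The two measurement campaigns therefore differ by roughly 5% on this line.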

The figure below shows the detailed measurements for the MEGA board. The baseline consumption was found to be 1.25J/s, a bit lower than on the Due. However, we can see that the MEGA board consumes much more than the Due for all operations. This is mainly due to its much slower clock frequency, which makes every process take longer. The need for external storage such as an SD card also contributes to the higher energy consumption. This result was quite surprising to us because we thought that the Due board would consume much more energy than the MEGA. Given the price of the Due compared to the MEGA, building the image sensor with the Due seems to be the best choice in terms of both performance and energy efficiency.

Building a multi-camera system

From the 1-camera system it is not difficult to build a multiple-camera system. Both the Arduino Due and the MEGA2560 have 4 UART ports. In the current configuration, UART0 is used for the connection to the computer and UART3 is used for the XBee 802.15.4. It is possible to connect the XBee to UART0 and give up the connection to the computer (not needed in a real-case scenario), which leaves 3 UARTs available (UART1 to UART3) for 3 uCamII cameras. The figure (left) below shows our Arduino Due connected to 3 uCamII cameras. The cameras are set at 120° from each other and are activated in a round-robin manner. At startup, a reference image is taken for each camera. Then intrusion detection is performed on each camera in a cyclic manner. As previously indicated, the minimum time between two snapshots from the camera is about 1712ms. Therefore, the 3-camera system can activate the cameras in turn and run one intrusion detection every 1712ms: this is the maximum performance level. The figure (right) below shows the details of the connections of the 3-camera system.
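The round-robin activation can be sketched as a simple schedule. This is a minimal illustration that assumes the cameras are read strictly one after the other, with one snapshot per 1712ms slot; the function name is hypothetical.

```python
from itertools import cycle, islice

SNAPSHOT_PERIOD_MS = 1712  # minimum time between two snapshots, from the text


def round_robin_schedule(n_cameras, n_snapshots, period_ms=SNAPSHOT_PERIOD_MS):
    """Return (start_time_ms, camera_index) pairs for a cyclic activation of the cameras."""
    cameras = islice(cycle(range(n_cameras)), n_snapshots)
    return [(slot * period_ms, cam) for slot, cam in enumerate(cameras)]


for start_ms, cam in round_robin_schedule(n_cameras=3, n_snapshots=6):
    print(f"t={start_ms}ms -> intrusion detection with ucam {cam}")
```

Under this assumption, a given camera is revisited once every 3 slots, that is every 3 x 1712 = 5136ms in the 3-camera system.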

We also have a version with dedicated LEDs for the uCams (it also works with the 1-camera system, where only cam index 0 is attached). Each time a uCam is activated (either for sync or to get image data and perform intrusion detection), the corresponding LED lights up. Note that this LED can be used to provide lighting in dark environments. For instance, the image sensor can be used for close-up surveillance (cracks, leakages,...) and placed in dark, hard-to-access areas. The figure below shows the additional LEDs. Since we need a lot of GND pins, we use a connector to gather all the GND signals (those of the additional LEDs and those of the uCams).

The uCamII is shipped with a 56° lens. 76° and 116° lenses are also available. The figure below shows the differences between the various lenses: from left to right, 56°, 76° and 116°.

With 76° lenses, the figure below compares the coverage of an 80 x 1-uCamII system (top-left, 36.3%) to an 80 x 3-uCamII system (top-right, 71.5%) and to a 240 x 1-uCamII system (bottom-left, 71.2%). The FoV in red is that of camera 0, for both 1-camera and 3-camera systems. The blue is for camera 1 and the green for camera 2, in the 3-camera system. We can see that the coverage is greatly improved, at a much lower cost than having 3 times more full sensor boards (right). Using 116° lenses for the 3 cameras can provide almost full disk coverage, as can be seen in the 80 x 3-uCamII system with 116° lenses, which provides a coverage of 91.61% in this example.

Here is a test we did in our science department hall with the 3-camera system equipped with 116° lenses. With 1 sensor node we can monitor a large portion of the hall and, in practice, detect a moving person anywhere in the hall.

Here is a sample output of the application, taken from the serial monitor log. Lines starting with # (highlighted in red in the original page) are inserted comments that explain the various steps of the application. We did not include all the output, just the parts relevant to the multi-camera mode. Here we use 2 cameras in order to connect the XBee on Serial3 and leave Serial (UART0) available for PC monitoring.

#startup
Init uCam test.
Init XBee 802.15.4
Set MM mode to 2
MAC mode is now: 2
-mac:0013A200408BC81B WAITING for command from 802.15.4 interface. XBee mac mode 2
Wait for command @D0013A20040762053#T60# to capture and send image with an inter-pkt time of 60ms to 0013A20040762053
Current destination: 0013A20040762191
Init UARTs for uCam board
#try to sync each camera, start with camera 0 on Serial1
--->>> Initializing cam 0

Attempt sync 0
Wait Ack
Camera has Acked...
Waiting for SYNC...
Receiving data. Testing to see if it is SYNC...
Camera has SYNCED...
Sending ACK for sync
Now we can take images!
#then try with camera 1 on Serial2
--->>> Initializing cam 1
Attempt sync 0
Wait Ack
Camera has Acked...
Waiting for SYNC...
Receiving data. Testing to see if it is SYNC...
Camera has SYNCED...
Sending ACK for sync
Now we can take images!
#get first image from camera 0 to serve as reference image for this camera
--->>> Get reference image from uCam 0

Initial is being sent
Wait Ack
INITIAL has been acked...
Snapshot is being sent
Wait Ack
SNAPSHOT has been acked...
Get picture is being sent
Wait Ack
GET PICTURE has been acked...
Get picture DATA
Size of the image = 16384
Time for get snapshop : 3
Time for get picture : 123
Waiting for image raw data

Total bytes read: 16384
Time to read data from uCAM: 1512
Sending ACK for end of data picture
Finish getting picture data
#we encode and we chose to transmit this reference image as well
Encoding picture data, Quality Factor is : 50
MSS for packetization is : 90
Q: 1QT ok
Time to encode : 558
Total encode time : 149
Total pkt time : 56
Compression rate (bpp) : 1.42
Packets : 37 25
Q : 50 32
H : 128 80
V : 128 80
Real encoded image file size : 2909
#get first image from camera 1 to serve as reference image for this camera
--->>> Get reference image from uCam 1

Initial is being sent
Wait Ack
INITIAL has been acked...
Snapshot is being sent
Wait Ack
SNAPSHOT has been acked...
Get picture is being sent
Wait Ack
GET PICTURE has been acked...
Get picture DATA
Size of the image = 16384
Time for get snapshop : 3
Time for get picture : 127
Waiting for image raw data

Total bytes read: 16384
Time to read data from uCAM: 1512
Sending ACK for end of data picture
Finish getting picture data
#we encode and transmit this reference image
Encoding picture data, Quality Factor is : 50
MSS for packetization is : 90
Q: 1QT ok
Time to encode : 476
Total encode time : 139
Total pkt time : 62
Compression rate (bpp) : 1.21
Packets : 31 1F
Q : 50 32
H : 128 80
V : 128 80
Real encoded image file size : 2468
#at this point we finished the initialization and we have a reference image in memory for each camera

#new periodic intrusion detection, once every 30s
START INTRUSION DETECTION
#start with camera 0
--->>> Intrusion detection with ucam 0

Initial is being sent
Wait Ack
INITIAL has been acked...
Snapshot is being sent
Wait Ack
SNAPSHOT has been acked...
Get picture is being sent
Wait Ack
GET PICTURE has been acked...
Get picture DATA
Size of the image = 16384
Time for get snapshop : 3
Time for get picture : 145
Waiting for image raw data (compare)
#here we see that we are performing comparison with reference image of that camera
Total bytes compared: 16384
Time to read and process from uCAM: 1511
Sending ACK for end of data picture
Finish getting picture data
nb diff. pixel : 3
Maybe NO intrusion
#move to camera 1
--->>> Intrusion detection with ucam 1

Initial is being sent
Wait Ack
INITIAL has been acked...
Snapshot is being sent
Wait Ack
SNAPSHOT has been acked...
Get picture is being sent
Wait Ack
GET PICTURE has been acked...
Get picture DATA
Size of the image = 16384
Time for get snapshop : 3
Time for get picture : 100
Waiting for image raw data (compare)

Total bytes compared: 16384
Time to read and process from uCAM: 1511
Sending ACK for end of data picture
Finish getting picture data
nb diff. pixel : 1
Maybe NO intrusion

. . .
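The "Compression rate (bpp)" values in the log above appear to be the encoded size in bits divided by the number of pixels. This relation is inferred from the logged numbers and is not stated by the authors; the function name is ours.

```python
def bits_per_pixel(encoded_bytes, width=128, height=128):
    """Compression rate in bits per pixel for an encoded image of the given size."""
    return encoded_bytes * 8 / (width * height)


# "Real encoded image file size" values reported in the log for the two reference images.
print(round(bits_per_pixel(2909), 2))  # uCam 0, logged as 1.42
print(round(bits_per_pixel(2468), 2))  # uCam 1, logged as 1.21
```

The second value on the "Packets", "Q", "H" and "V" lines of the log is the same quantity in hexadecimal (for instance 37 is 0x25 and 128 is 0x80).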

We have an enhanced version of the display_image tool (display_multi_image) that can collect images from several image sensor nodes for display, and that also supports several cameras per node. To do so, the framing bytes are extended to store a 16-bit address for an image node. Note that this address could be derived from the 64-bit MAC address (by keeping the last 16 bits for instance) or be hard-coded when programming the image node. The frame structure is shown below for a node with hard-coded address 0x0001. The 16-bit address is inserted right after 0xFF 0x50. Then, in order to support multiple cameras per node, we chose to use the flowid, which is coded in the 2nd byte, i.e. 0x50. Using the flowid would allow multi-path routing, as implemented by our relay nodes (see our relay node pages), according to which camera is sending. Cam id 0 would give 0x50, cam id 1 would give 0x51,... Here, node 0x0001 has only 1 camera so the cam id is 0.

FF 50 00 01 00 32 56 00 00 E5 49 48 74 E7 F7 9B 9C 0F 17 B7 D9 21 AB C0 0B 40 71 02 F9 A5 A4 E8 48 6C C5 97 CC A0 63 03 ED 2A 36 00 E2 83 B0 9E 46 27 1B 4E 44 A9 BC 5E 22 39 F1 19 73 2A 21 64 52 35 A3 18 64 CE 8D 7A 3B F5 91 46 A7 2E 8D E0 D2 59 98 6C BA 1B 54 A2 5C 34 18 1F 1F FF 50 00 01 01 32 52 00 0B C1 36 7F 01 C4 1C 88 BB DB 92 A7 4D 30 C9 9E 5B 17 4E CD EF E5 C8 65 6E 59 72 99 BC B0 A8 CE CC 03 A3 38 DE 9F 57 07 61 D1 4B 9C 25 0C AF BB 78 F8 F9 90 CE 75 E0 85 47 A9 BF A9 08 1D 72 B8 68 F6 3B 84 8C 81 CC 87 7E 16 C1 49 43 E2 27 53 7F FF 50 00 01 02 32 51 00 15 E8 44 11 51 CF 70 A1 63 47 DA D4 54 D9 06 FA 46 01 25 A8 23 26 D8 A2 14 70 F6 20 4E 1B 60 B3 DD C0 E8 C3 86 01 BE 8A CC C2 5C 0E E9 86 14 AD 4C 96 B7 D2 39 0A 8F 3B A4 22 35 AC 66 58 C8 C6 64 1E 1C 16 C2 6E 69 14 CD 3B E5 18 C8 28 4E 7F ...
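The extended framing can be decoded as sketched below. This is an illustration rather than the display_multi_image code: the names are hypothetical, the address is assumed to be big-endian (consistent with 00 01 for node 0x0001 in the dump above), and the flow id range 0x50 to 0x54 is the one given earlier for image packets.

```python
from dataclasses import dataclass

EXT_FRAMING_LEN = 7  # 0xFF, flow id, 2 address bytes, sequence number, quality, size


@dataclass
class ExtendedHeader:
    node_addr: int  # 16-bit address of the image node
    cam_id: int     # derived from the flow id: 0x50 -> 0, 0x51 -> 1, ...
    seq: int
    quality: int
    size: int


def parse_extended_header(frame):
    """Decode the 7 framing bytes of a multi-node, multi-camera image packet."""
    if len(frame) < EXT_FRAMING_LEN:
        raise ValueError("frame shorter than the extended framing bytes")
    if frame[0] != 0xFF or not 0x50 <= frame[1] <= 0x54:
        raise ValueError("not an image packet")
    return ExtendedHeader(
        node_addr=int.from_bytes(frame[2:4], "big"),
        cam_id=frame[1] - 0x50,
        seq=frame[4],
        quality=frame[5],
        size=frame[6],
    )


# Framing bytes of the second packet in the dump above.
header = parse_extended_header(bytes.fromhex("FF 50 00 01 01 32 52"))
print(f"node 0x{header.node_addr:04X}, cam {header.cam_id}, seq {header.seq}, "
      f"Q {header.quality}, size {header.size}")
```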

The screenshot below shows my office with a 1-camera and a 3-camera image sensor sending to my desktop computer. The 1-camera sensor is configured with source address 0x0001 while the 3-camera system has source address 0x0002. The display tool discovers new nodes and assigns each node a column index in increasing order. Here column index 0 (left-most) is for node 0x0001. Node 0x0002 has column index 1. As node 0x0002 has 3 cameras, the image taken by each camera appears on a different line. The top line is for camera 0. The received image packets are stored in a file, then decoded into BMP format and displayed by the display tool. In our example, the BMP filename for the last received image from node 0x0002 is tmp_22-node#0002-cam#2-128x128-test.bmp-Q50-P26-S2110.bmp. It means that it is the 22nd image sent by node 0x0002, taken by camera 2 with a quality factor of 50, for which 26 packets have been received for a total encoded size of 2110 bytes (the encoded version, not the decoded BMP version).
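The fields encoded in such a filename can be extracted with a regular expression. The pattern below is inferred from the single example filename given above, so it is an assumption about the naming scheme rather than a specification; the function and field names are ours.

```python
import re

# Pattern inferred from the example filename: image index, node address (hex),
# camera id, base BMP name, quality factor (Q), received packets (P), encoded size (S).
FILENAME_RE = re.compile(
    r"tmp_(?P<index>\d+)-node#(?P<node>[0-9A-Fa-f]{4})-cam#(?P<cam>\d+)-"
    r"(?P<base>.+)-Q(?P<quality>\d+)-P(?P<packets>\d+)-S(?P<size>\d+)\.bmp"
)


def parse_image_filename(filename):
    """Return the fields encoded in a received-image filename, or None if it does not match."""
    match = FILENAME_RE.fullmatch(filename)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "index": int(fields["index"]),
        "node": int(fields["node"], 16),
        "cam": int(fields["cam"]),
        "base": fields["base"],
        "quality": int(fields["quality"]),
        "packets": int(fields["packets"]),
        "size": int(fields["size"]),
    }


info = parse_image_filename("tmp_22-node#0002-cam#2-128x128-test.bmp-Q50-P26-S2110.bmp")
print(info)
```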