Embedded Data Storage

Filesystem

Journaling filesystems are a modern standard, and journaling is critical for any system that stores data and may lose power during a write operation.

  • Journaling Filesystem
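    • ZFS (copy-on-write rather than a classic journal, with comparable crash consistency)
    • BTRFS (copy-on-write rather than a classic journal, with comparable crash consistency)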
    • EXT3
    • EXT4
  • Non-Journaling Filesystem
    • FAT
    • EXT
    • EXT2

Data Storage

Custom flat file formats, such as plain-text INI, CSV, XML, or JSON parsed into application-specific data structures, and flat binary files both have limitations when the application writes them directly to disk.

  • Flat File
    • Although flat files are easy to create and write, even the slightest change requires rewriting the entire file, which leaves the file susceptible to data loss if the write is interrupted.
    • Repeatedly erasing and rewriting a large text file adds wear that shortens the total lifetime of flash media.
    • Because the entire file must be read into memory to search it efficiently, and rewritten in full to save changes, flat files cannot grow very large.
    • Flat files also offer little support for managing data over its life cycle, which matters more as devices and embedded systems increasingly exchange data with each other and with other systems.
  • Binary Files
    • Binary files can be divided into smaller pieces used independently, but special tools must be developed to use these files outside the application.
    • Binary files provide some protection against corruption of the entire file, but can still lose data when changes are only partially written before an unexpected power failure.
    • Binary files make it more difficult to store variable-width data.
    • If data is copied directly from memory to a binary flat file, it is not easy to open the file on another architecture.
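
The architecture problem in the last point can be illustrated with a minimal sketch, assuming a hypothetical record made of a 1-byte sensor id and a 4-byte reading. Python's struct module can mimic a raw memory dump (native byte order and padding) or use an explicitly defined layout; the record fields and function names here are illustrative only and do not come from any existing design.

```python
import struct

# Hypothetical record: a 1-byte sensor id followed by a 4-byte signed reading.
NATIVE = "@Bi"    # native byte order and alignment, like a raw memory dump
PORTABLE = "<Bi"  # explicit little-endian, standard sizes, no padding


def encode_record(sensor_id, reading):
    """Serialize one record in the fixed, platform-independent layout."""
    return struct.pack(PORTABLE, sensor_id, reading)


def decode_record(data):
    """Parse one record written by encode_record, on any architecture."""
    return struct.unpack(PORTABLE, data)


# The native size depends on the platform's alignment rules (padding is
# commonly inserted after the 1-byte field); the portable size is always 5.
native_size = struct.calcsize(NATIVE)
portable_size = struct.calcsize(PORTABLE)
```

Fixing the byte order and field sizes in the format itself is what lets another architecture read the file; the cost is that records are no longer copied straight between memory and disk.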

The areas most affected by how the database is implemented include:

  • Frequency of costly disk or flash media I/O operations.
  • Time required to recover from a crash or power loss.
  • Ability to manage large amounts of data without severe performance degradation.
  • Performance impact of sharing data between tasks and other applications.
  • Relative performance of read and write operations.
  • Portability of the storage format and application code.
  • Effort required to integrate database technology into the application.

All embedded databases serve three main purposes for data management:

  • Reliable storage
  • Efficient queries
  • Safe shared access

Balancing these three areas of functionality is difficult because strong guarantees in one area can weaken the capabilities of the other two areas. Things to look for when choosing an embedded database include:

  • Performance
  • Low-level access to the engine
  • Concurrency
  • Replication
  • Data typing and type safety
  • True in-memory storage engine

Feature                 Plain Text  Binary  Database  Notes
----------------------  ----------  ------  --------  -----
Platform independence   Yes         No      Yes       Variations in data structure alignment and padding tie flat binary formats to one platform.
Variable-width fields   Yes         No      Yes       Compact storage formats rely on dynamic buffers, not fixed-width structures.
Minimum overhead        No          Yes     No        Records in a binary file are copied directly between memory and disk.
Transaction logging     No          No      Yes       Crash recovery protects data from unexpected power loss and similar failures.
Isolated shared access  No          No      Yes       Concurrent tasks must coordinate to access data safely.
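
As a rough illustration of the transaction behaviour in the database column, the sketch below uses SQLite through Python's standard sqlite3 module. SQLite is an assumption made for the example, since these notes do not name a database, and the table and function names are hypothetical. A batch of readings is stored either completely or not at all.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a store with one hypothetical 'readings' table."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS readings (sensor TEXT, value REAL)")
    conn.commit()
    return conn


def record_batch(conn, rows):
    """Insert all rows or none of them.

    Using the connection as a context manager commits when the block
    succeeds and rolls back when it raises.
    """
    with conn:
        for sensor, value in rows:
            if value is None:
                raise ValueError("incomplete reading")
            conn.execute("INSERT INTO readings VALUES (?, ?)", (sensor, value))


def try_record_batch(conn, rows):
    """Return True if the batch was stored, False if it was rolled back."""
    try:
        record_batch(conn, rows)
        return True
    except ValueError:
        return False


def count(conn):
    """Number of readings currently stored."""
    return conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

A batch that fails partway leaves the table exactly as it was before the batch started, which is the guarantee a hand-written flat file does not give.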

For the CubeSat

Data storage should be more than sufficient, so disk compression is less of an issue than transmission. Sudden power failure is extremely unlikely: a loss of power should always be seen coming and can be prepared for, so data corruption from power loss during a write is very unlikely. Some sort of journaling would still be good to have as a safety net, though.
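
One possible shape for such a safety net is an append-only log in which every record carries its length and a CRC32 checksum, so a write cut short by power loss can be detected and discarded on the next start. This is a minimal sketch under stated assumptions, not a chosen design: the record layout and function names are hypothetical, and an in-memory bytearray stands in for the storage medium.

```python
import struct
import zlib

# Hypothetical record header: payload length and CRC32 of the payload,
# both stored as little-endian unsigned 32-bit integers.
HEADER = struct.Struct("<II")


def append_record(log, payload):
    """Append one length-prefixed, checksummed record to the log."""
    log.extend(HEADER.pack(len(payload), zlib.crc32(payload)))
    log.extend(payload)


def replay(log):
    """Return every intact payload, stopping at the first torn or corrupt record."""
    records = []
    offset = 0
    while offset + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = bytes(log[start:start + length])
        # A short payload means the write was interrupted; a CRC mismatch
        # means the stored bytes were damaged.
        if len(payload) < length or zlib.crc32(payload) != crc:
            break
        records.append(payload)
        offset = start + length
    return records
```

Because records are only ever appended, an interrupted write can damage at most the tail of the log, and everything before it remains readable.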

Unfortunately

Without an operating system there is no kernel to provide a filesystem driver, so the filesystems listed above will not be available out of the box.