DVMS DAL-A Project

This page is a discussion/log of the tasks and design needed to get DVMS to DAL-A in a multi-user environment. The DAL-D certification for the TrickyFish Program was done in a single-user context.

Merge experimental/mainline

Self-explanatory. The two codebases should already be fairly similar, if not identical, so it may be enough to delete the experimental branch.

Features for other components

  • kernel: Registry accessors or resource description available (for use during coldstart)
  • kernel: multi-core interrupts?

PCRs to implement

  • To be worked:
    • DDCI_PCR:4989 config Add temporal aspect to ACL to allow performance tuning to consider time
    • DDCI_PCR:4895 config Partition size of "remaining" causes dvmsCheckDisk failure
    • DDCI_PCR:4894 config Remove "journal" from flags description in partition
    • DDCI_PCR:4888 config Read-only mountPoint with contained read-write ACL should be an error
    • DDCI_PCR:4988 core Add support for multi-user concurrent access
    • DDCI_PCR:4899 core Partitions not sized appropriate to sectorAlignment cause dvmsCheckDisk failure
    • DDCI_PCR:4890 core O_APPEND not implemented
    • DDCI_PCR:4889 core Errors in dvmsIterateDisk not propagated
    • DDCI_PCR:4683 core Customer reported DVMS hang under high logbook load
    • DDCI_PCR:4463 core dvmsFsckVolume should be more verbose in user response
    • DDCI_PCR:4338 core Documentation improvement: filesystem interface
    • DDCI_PCR:4884 filesyst Support for [f]truncate growth is broken in journaled filesystem volumes
    • DDCI_PCR:4685 filesyst dvmsMountVolumes use in example before zeroize
    • DDCI_PCR:4885 journal Support for [f]truncate growth is broken in journaled filesystem volumes
    • DDCI_PCR:4880 journal Interrupted journal replay causes journal corruption
    • DDCI_PCR:4878 journal Support different journal locations
    • DDCI_PCR:4862 journal Consider cross-media journal enhancement
    • DDCI_PCR:4464 journal Create a tool similar to hypdump for offline journal analysis
    • DDCI_PCR:4462 journal Journal replay should be configurable
    • DDCI_PCR:4458 journal Use linguistic TLS to be ARINC653 compliant
    • DDCI_PCR:5023 media-ra Add support for multi-user concurrent access
    • DDCI_PCR:4441 media-ra Add support for specification of platform resource
  • MAL-specific - work when customer:
    • DDCI_PCR:4891 media-mm Unaligned or non-block-sized writes implementation broken
    • DDCI_PCR:4459 media-mm Use linguistic TLS to be ARINC653 compliant
    • DDCI_PCR:4659 media-no To use exFAT, need to unimplement the ioctl function
    • DDCI_PCR:4627 media-no Move source under a new subdirectory for NOR Flash.
    • DDCI_PCR:4518 media-no Initial release of the DVMS NOR Flash media component
    • DDCI_PCR:4660 media-no Feature Provider allocates more RAM than needed.
    • DDCI_PCR:4941 media-sa Fix compatibility of component -- make work on nai-ultrascale and zcu102
    • DDCI_PCR:4460 media-sa Use linguistic TLS to be ARINC653 compliant
    • DDCI_PCR:4444 media-sa dvms sata ahci check for 0 byte read and write
    • DDCI_PCR:4911 media-sa Incorrect use of HCONTROL_FORCE_OFFLINE
    • DDCI_PCR:4590 media-sa Finding - Software Life Cycle Audit - media-sata-atapi 2.0.0
    • DDCI_PCR:4555 media-sa Running vfile_demo example after vfile_logbook doesn't work.
  • Future features:
  • Not needing work or already done:
  • Lower than low priority:

Remove Limitations

  • DDCI_PCR:4890
    • Plan: Add O_APPEND support to DVMS write for filesystems.
  • DDCI_PCR:4899
    • Plan: Align-up small partition size to sector alignment in dvmsconfig, providing informational or warning message that this has been done.
  • DDCI_PCR:4884
    • Plan: Investigate whether this is OBE by adoption of TexFAT and the fact that with that, journal is not required.
  • DDCI_PCR:4880
    • Plan: The journal should not be a vast prairie into which all things are dumped.
    1. Recommend small journals (current minimum is 32K). Restrict the maximum journal size in dvmsconfig to much, much less than 2T.
    2. Add support to dvmsconfig to adjust the MALLOC_PAGES required for the number of journals configured.
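
The align-up plan for DDCI_PCR:4899 can be sketched as below. This is a minimal illustration only: align_up and checked_partition_size are hypothetical names, not existing dvmsconfig functions, and the warning text is an assumption about what the informational message might look like.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: round size up to the next multiple of alignment.
 * alignment must be non-zero; it does not have to be a power of two.
 * Overflow for sizes near UINT64_MAX is not handled in this sketch. */
static uint64_t align_up(uint64_t size, uint64_t alignment)
{
    assert(alignment != 0);
    uint64_t remainder = size % alignment;
    return (remainder == 0) ? size : size + (alignment - remainder);
}

/* Hypothetical config-time check: returns the size to use for the
 * partition and warns on stderr when the requested size was adjusted. */
static uint64_t checked_partition_size(const char *name, uint64_t requested,
                                       uint64_t sector_alignment)
{
    uint64_t aligned = align_up(requested, sector_alignment);
    if (aligned != requested) {
        fprintf(stderr,
                "warning: partition \"%s\": size %" PRIu64
                " rounded up to %" PRIu64 " (sectorAlignment %" PRIu64 ")\n",
                name, requested, aligned, sector_alignment);
    }
    return aligned;
}
```

Rounding up rather than rejecting the configuration keeps existing config files usable, while the warning makes the adjustment visible to the integrator.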

Path to Multi-user

  • Remove all System Semaphore usage.
  • Update dvmsconfig to generate client resource PIAs.

Use PIA to provide client-based queues

The current DVMS MAL strategy grew out of CFFS where, typically, a single RAM area held the hardware queue for the controller. With only that one queue, there is no way to indicate completion of requests to multiple clients independently of when each client issued its request. If PIA is used so that each client has its own RAM area in which to queue requests, the PRL can convey requests from multiple users' RAM areas into the RAM area for the controller, and back again when those requests are complete. Each user would then block only on its own requests, not on the requests of all other users.

  • Update each MAL's top layer to be client resource aware.
    • The MAL's PRL will collect from the client resources when building the HW transfer queue.
    • The MAL's PRL will manage client completion notifications.
      • MALs will no longer go directly to HW for completion confirmation. Need some structure in the client resource for completion polling.
    • What about a multi-threaded client? One resource per thread?
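
The client-resource idea above can be sketched roughly as follows. Everything here is an assumption made for illustration: the type and function names are hypothetical, plain C structs stand in for PIA-provided RAM areas, and a fixed array stands in for the controller's transfer queue. It shows only the data flow: clients queue into their own resource, the PRL collects into the HW queue while remembering the origin, and completion is routed back to the originating client.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CLIENT_SLOTS 4   /* requests each client may have outstanding */
#define HW_SLOTS     4   /* depth of the (simulated) controller queue  */

typedef enum { SLOT_FREE, SLOT_PENDING, SLOT_IN_HW, SLOT_DONE } slot_state;

typedef struct {
    slot_state state;
    uint32_t   block;            /* illustrative request payload */
} client_slot;

/* One per client; conceptually lives in that client's own RAM area. */
typedef struct { client_slot slots[CLIENT_SLOTS]; } client_resource;

typedef struct {
    bool     busy;
    size_t   client;             /* which client resource it came from */
    size_t   slot;               /* which slot within that resource    */
    uint32_t block;
} hw_entry;

typedef struct { hw_entry entries[HW_SLOTS]; } hw_queue;

/* Client side: queue a request. Returns the slot index, or -1 if full. */
static int client_submit(client_resource *c, uint32_t block)
{
    for (int s = 0; s < CLIENT_SLOTS; s++) {
        if (c->slots[s].state == SLOT_FREE) {
            c->slots[s].block = block;
            c->slots[s].state = SLOT_PENDING;
            return s;
        }
    }
    return -1;
}

static hw_entry *hw_find_free(hw_queue *hw)
{
    for (size_t i = 0; i < HW_SLOTS; i++)
        if (!hw->entries[i].busy) return &hw->entries[i];
    return NULL;
}

/* PRL side: move pending requests into free HW entries. Slot index is the
 * outer loop so each pass takes at most one request per client. */
static size_t prl_collect(client_resource *clients, size_t n_clients,
                          hw_queue *hw)
{
    size_t moved = 0;
    for (size_t s = 0; s < CLIENT_SLOTS; s++) {
        for (size_t c = 0; c < n_clients; c++) {
            client_slot *req = &clients[c].slots[s];
            if (req->state != SLOT_PENDING) continue;
            hw_entry *e = hw_find_free(hw);
            if (e == NULL) return moved;      /* HW queue is full */
            e->busy = true; e->client = c; e->slot = s; e->block = req->block;
            req->state = SLOT_IN_HW;
            moved++;
        }
    }
    return moved;
}

/* PRL side: HW entry hw_index finished; notify only the owning client. */
static void prl_complete(client_resource *clients, hw_queue *hw,
                         size_t hw_index)
{
    hw_entry *e = &hw->entries[hw_index];
    assert(e->busy);
    clients[e->client].slots[e->slot].state = SLOT_DONE;
    e->busy = false;
}

/* Client side: poll its own resource only, never the hardware. */
static bool client_poll_done(client_resource *c, int slot)
{
    assert(slot >= 0 && slot < CLIENT_SLOTS);
    if (c->slots[slot].state != SLOT_DONE) return false;
    c->slots[slot].state = SLOT_FREE;
    return true;
}
```

In this sketch a multi-threaded client would either need one client_resource per thread or its own locking around client_submit, which is the open question noted in the last bullet.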


Allow PRL to handle interrupts?

One proposal to alleviate the concern above (multiple users and how to indicate to them that their transaction is completed) is for the kernel to allow PRLs to respond to interrupts. The PRL should be the place where the user context is mapped to the HW context, so the PRL upon handling the interrupt can, without causing partitioning concerns, indicate transaction completeness to the correct user and, additionally, start new transactions to increase throughput.

  • Is this just extending the KMI concept and its concerns beyond the PAL to PRLs?
  • The Kernel UG makes it sound like the handling of timing adjustments to make the KMI transparent to the kernel might be platform-specific?
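
If PRLs were allowed to handle interrupts, the mapping from HW context back to user context might look something like the sketch below. This is not a Deos kernel or DVMS API: the controller is simulated with a single "complete" bitmask, one bit per transaction tag, and all names (prl_state, prl_start, prl_on_interrupt, user_take_completion) are hypothetical. Starting follow-on transactions from the handler is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define N_TAGS   8       /* hypothetical number of HW transaction tags */
#define N_USERS  4
#define NO_OWNER (-1)

/* Simulated controller: one "complete" bit per tag, no owner information. */
typedef struct { uint32_t complete_bits; } sim_controller;

/* PRL bookkeeping: records which user started each tag. */
typedef struct { int owner; bool done; } tag_entry;
typedef struct { tag_entry tags[N_TAGS]; } prl_state;

static void prl_init(prl_state *prl)
{
    for (int t = 0; t < N_TAGS; t++) {
        prl->tags[t].owner = NO_OWNER;
        prl->tags[t].done = false;
    }
}

/* Start a transaction on behalf of user. Returns the tag, or -1 if none. */
static int prl_start(prl_state *prl, int user)
{
    assert(user >= 0 && user < N_USERS);
    for (int t = 0; t < N_TAGS; t++) {
        if (prl->tags[t].owner == NO_OWNER) {
            prl->tags[t].owner = user;
            prl->tags[t].done = false;
            return t;
        }
    }
    return -1;
}

/* Interrupt path: acknowledge the controller and mark each completed tag
 * as done for whoever the PRL recorded as its owner. */
static void prl_on_interrupt(prl_state *prl, sim_controller *hw)
{
    uint32_t bits = hw->complete_bits;
    hw->complete_bits = 0;                       /* acknowledge */
    for (int t = 0; t < N_TAGS; t++) {
        if ((bits & (1u << t)) && prl->tags[t].owner != NO_OWNER)
            prl->tags[t].done = true;
    }
}

/* User path: a user can only consume completions for tags it owns. */
static bool user_take_completion(prl_state *prl, int user, int tag)
{
    assert(tag >= 0 && tag < N_TAGS);
    tag_entry *e = &prl->tags[tag];
    if (e->owner != user || !e->done) return false;
    e->owner = NO_OWNER;                         /* tag can be reused */
    e->done = false;
    return true;
}
```

The ownership check in user_take_completion is what keeps one user from observing or consuming another user's completion; in a real design the per-user view would presumably live in that user's own resource rather than in a shared table.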


Prior stream-of-consciousness garble

    • Avoid blocking users except where absolutely necessary.
      • Where is absolutely necessary?
        • When transactions to HW cannot be atomic, such as when a transaction is started but will take some time to complete.
    • Transactions to HW should:
    1. Poll for busy.
    2. Initiate transaction.
    3. Poll for complete.
    • How to handle completeness checking in the following scenarios:
      • In a multi-user scenario, user A has made it to step 3 but is switched out by the kernel in favor of user B, which is at step 1.
    • When the HW completes:
      • If it completes while user B is in context, how does the driver tell user A that its transaction has completed? (Is this a potential partitioning violation?)
      • If user B proceeds through steps 2 and 3, but is switched out by the kernel back to user A, how does user A know that it is now polling for user B's transaction completeness and that its transaction has already completed?
  • Typically a HW controller has "complete" bit indications in its registers, but no indication of who started the thing that just completed.
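
The interleaving described above can be replayed against a toy controller to make the hazard concrete. The controller model is an assumption, not any real device: it has one busy bit and one "complete" bit, initiating a transaction clears the complete bit, and polling for completion is read-to-clear. The context switches are simulated simply by the order of the calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical controller: status bits only, no record of who started what. */
typedef struct {
    bool busy;
    bool complete;
} sim_controller;

/* Step 1: poll for busy. */
static bool step1_not_busy(const sim_controller *hw) { return !hw->busy; }

/* Step 2: initiate a transaction (assumed to clear a stale complete bit). */
static void step2_initiate(sim_controller *hw)
{
    hw->busy = true;
    hw->complete = false;
}

/* Step 3: poll for complete (assumed read-to-clear). */
static bool step3_poll_complete(sim_controller *hw)
{
    if (!hw->complete) return false;
    hw->complete = false;
    return true;
}

/* Simulates the hardware finishing whatever transaction is in flight. */
static void hw_finishes(sim_controller *hw)
{
    hw->busy = false;
    hw->complete = true;
}

/* Replays the scenario. Returns true if user A missed its own completion
 * and was then credited with the one that belonged to user B. */
static bool replay_hazard(void)
{
    sim_controller hw = { false, false };

    /* User A: steps 1 and 2, then A is about to enter step 3. */
    if (!step1_not_busy(&hw)) return false;
    step2_initiate(&hw);

    /* Kernel switches to user B; A's transaction completes meanwhile. */
    hw_finishes(&hw);

    /* User B: steps 1 and 2. Initiating wipes A's completion indication. */
    if (!step1_not_busy(&hw)) return false;
    step2_initiate(&hw);

    /* Kernel switches back to user A, which resumes at step 3. */
    bool a_saw_own = step3_poll_complete(&hw);   /* A's completion is gone */

    /* B's transaction completes; A is still polling and consumes it. */
    hw_finishes(&hw);
    bool a_took_b = step3_poll_complete(&hw);

    /* User B eventually polls and finds nothing. */
    bool b_saw_own = step3_poll_complete(&hw);

    return !a_saw_own && a_took_b && !b_saw_own;
}
```

Under these assumptions user A waits longer than necessary and user B never sees its completion, which is why some record of ownership has to exist outside the controller's registers.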