DVMS DAL-A Project
This page is a discussion log of the tasks and design work needed to get DVMS to DAL-A in a multi-user environment. The DAL-D certification for the TrickyFish Program was done in a single-user context.
Merge experimental/mainline
Self-explanatory. The two codebases should already be fairly similar, if not identical, so perhaps just delete the experimental branch.
Features for other components
- kernel: Registry accessors or resource description available (for use during coldstart)
- kernel: multi-core interrupts?
PCRs to implement
- To be worked:
- DDCI_PCR:4989 config Add temporal aspect to ACL to allow performance tuning to consider time
- DDCI_PCR:4895 config Partition size of "remaining" causes dvmsCheckDisk failure
- DDCI_PCR:4894 config Remove "journal" from flags description in partition
- DDCI_PCR:4888 config Read-only mountPoint with contained read-write ACL should be an error
- DDCI_PCR:4988 core Add support for multi-user concurrent access
- DDCI_PCR:4899 core Partitions not sized appropriate to sectorAlignment cause dvmsCheckDisk failure
- DDCI_PCR:4890 core O_APPEND not implemented
- DDCI_PCR:4889 core Errors in dvmsIterateDisk not propagated
- DDCI_PCR:4683 core Customer reported DVMS hang under high logbook load
- DDCI_PCR:4463 core dvmsFsckVolume should be more verbose in user response
- DDCI_PCR:4338 core Documentation improvement: filesystem interface
- DDCI_PCR:4884 filesyst Support for [f]truncate growth is broken in journaled filesystem volumes
- DDCI_PCR:4685 filesyst dvmsMountVolumes use in example before zeroize
- DDCI_PCR:4885 journal Support for [f]truncate growth is broken in journaled filesystem volumes
- DDCI_PCR:4880 journal Interrupted journal replay causes journal corruption
- DDCI_PCR:4878 journal Support different journal locations
- DDCI_PCR:4862 journal Consider cross-media journal enhancement
- DDCI_PCR:4464 journal Create a tool similar to hypdump for offline journal analysis
- DDCI_PCR:4462 journal Journal replay should be configurable
- DDCI_PCR:4458 journal Use linguistic TLS to be ARINC653 compliant
- DDCI_PCR:5023 media-ra Add support for multi-user concurrent access
- DDCI_PCR:4441 media-ra Add support for specification of platform resource
- MAL-specific - work when a customer requires it:
- DDCI_PCR:4891 media-mm Unaligned or non-block-sized writes implementation broken
- DDCI_PCR:4459 media-mm Use linguistic TLS to be ARINC653 compliant
- DDCI_PCR:4659 media-no To use exFAT, need to unimplement the ioctl function
- DDCI_PCR:4627 media-no Move source under a new subdirectory for NOR Flash.
- DDCI_PCR:4518 media-no Initial release of the DVMS NOR Flash media component
- DDCI_PCR:4660 media-no Feature Provider allocates more RAM than needed.
- DDCI_PCR:4941 media-sa Fix compatibility of component -- make work on nai-ultrascale and zcu102
- DDCI_PCR:4460 media-sa Use linguistic TLS to be ARINC653 compliant
- DDCI_PCR:4444 media-sa dvms sata ahci check for 0 byte read and write
- DDCI_PCR:4911 media-sa Incorrect use of HCONTROL_FORCE_OFFLINE
- DDCI_PCR:4590 media-sa Finding - Software Life Cycle Audit - media-sata-atapi 2.0.0
- DDCI_PCR:4555 media-sa Running vfile_demo example after vfile_logbook doesn't work.
- Future features:
- DDCI_PCR:4457 journal Implement journal generation number
- DDCI_PCR:4440 journal Make journal block size configurable.
- Maybe not work:
- DDCI_PCR:4801 config Support ACL access value of "none"
- DDCI_PCR:4814 core Support ACL access value of "none"
- Not needing work or already done:
- DDCI_PCR:4992 core Merge experimental to mainline
- DDCI_PCR:5000 examples Merge experimental to mainline
- DDCI_PCR:4998 filesyst Merge experimental to mainline
- DDCI_PCR:4999 journal Merge experimental to mainline
- DDCI_PCR:5017 media-ra Merge experimental to mainline
- DDCI_PCR:4677 config Generate CCK showing configuration in human-readable read-back form
- DDCI_PCR:4993 core Usage of vfile feature in vfile featureSet is missing
- Lower than low priority:
- DDCI_PCR:5005 examples Add reference to platform-specific section
- DDCI_PCR:4860 examples Enable CRC check in throughput test.
- DDCI_PCR:4855 examples Reconsider how examples are deployed
- DDCI_PCR:4779 examples Remove Deos:pcr-required property
Remove Limitations
- DDCI_PCR:4895
- Plan: Remove 'remaining' partition size support.
- DDCI_PCR:4683
- Plan: Monitor, and create robustness tests for the high-load hang.
- DDCI_PCR:4890
- Plan: Add O_APPEND support to DVMS write for filesystems.
- DDCI_PCR:4899
- Plan: Align-up small partition size to sector alignment in dvmsconfig, providing informational or warning message that this has been done.
- DDCI_PCR:4884
- Plan: Investigate whether this is OBE (overtaken by events) given the adoption of TexFAT, with which a journal is not required.
- DDCI_PCR:4458
- Plan: Switch to linguistic TLS.
- DDCI_PCR:4880
- Plan: The journal should not be a vast prairie into which all things are dumped.
- Recommend small journals (current minimum is 32K). Restrict the maximum journal size in dvmsconfig to much less than 2T.
- Add support to dvmsconfig to adjust MALLOC_PAGES required for number of journals configured.
- DDCI_PCR:4885
- Plan: Same as PCR 4884 above.
- DDCI_PCR:4459
- Plan: Switch to linguistic TLS.
- DDCI_PCR:4891
- Plan: Simply fix.
- DDCI_PCR:4460
- Plan: Switch to linguistic TLS.
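The align-up plan for DDCI_PCR:4899 above can be illustrated with a small C sketch. This is not dvmsconfig code: the function name `align_partition_size`, its signature, and the choice of byte units are assumptions made for illustration only. It rounds a partition size up to the next multiple of the sector alignment and reports whether an adjustment was made, so the caller could emit the informational or warning message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an existing dvmsconfig function).
 * Rounds `size` up to the next multiple of `alignment`.
 * On success, stores the result in *aligned and sets *adjusted to true
 * if the size had to grow. Returns false on a zero alignment, a NULL
 * output pointer, or if rounding up would overflow 64 bits.
 */
static bool align_partition_size(uint64_t size, uint64_t alignment,
                                 uint64_t *aligned, bool *adjusted)
{
    if (alignment == 0 || aligned == NULL || adjusted == NULL) {
        return false;
    }

    uint64_t remainder = size % alignment;
    uint64_t pad = (remainder == 0) ? 0 : (alignment - remainder);

    if (size > UINT64_MAX - pad) {
        return false; /* rounding up would overflow */
    }

    *aligned = size + pad;
    *adjusted = (pad != 0);

    /* Postcondition: the result is always a multiple of the alignment. */
    assert(*aligned % alignment == 0);
    return true;
}
```

A caller would print the warning only when `*adjusted` is true, which keeps already-aligned configurations silent.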
Path to Multi-user
- Remove all System Semaphore usage.
- Update dvmsconfig to generate client resource PIAs.
Use PIA to provide client-based queues
The current DVMS MAL strategy grew from CFFS where, typically, one RAM area was used for the controller's hardware queue. With a single shared area, there is no way to tell multiple clients that their requests have completed independently of when each request was issued. If PIA is used so that each client has its own RAM area in which to queue requests, the PRL can move requests from the users' RAM areas into the controller's RAM area, and move the results back when those requests complete. A user would then block only on its own requests, not on the requests of all other users.
- Update each MAL's top layer to be client resource aware.
- The MAL's PRL will collect from the client resources when building the HW transfer queue.
- The MAL's PRL will manage client completion notifications.
- MALs will no longer go directly to HW for completion confirmation. Need some structure in the client resource for completion polling.
- What about a multi-threaded client? One resource per thread?
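The client-based queue idea above might look roughly like the following C sketch. Everything here is an assumption for illustration: the type and function names, the queue depths, and the use of plain structs to stand in for the per-client RAM areas and the controller queue. It does not use any real PIA, PRL, or kernel API. The point it shows is that the PRL keeps a private record of which client request owns each hardware slot, so a completion can be posted to the right client, and each client polls only its own resource.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CLIENT_QUEUE_DEPTH 4 /* assumed per-client queue depth */
#define HW_QUEUE_DEPTH     8 /* assumed controller queue depth */

typedef enum { REQ_FREE, REQ_PENDING, REQ_IN_FLIGHT, REQ_DONE } req_state_t;

typedef struct {
    uint64_t    lba;
    uint32_t    blocks;
    req_state_t state; /* the owning client polls this for REQ_DONE */
} client_request_t;

/* Stands in for one client's RAM area; one instance per client. */
typedef struct {
    client_request_t req[CLIENT_QUEUE_DEPTH];
} client_resource_t;

/* PRL-private bookkeeping: which client request occupies each HW slot. */
typedef struct {
    bool              busy;
    client_request_t *owner;
} hw_slot_t;

typedef struct {
    hw_slot_t slot[HW_QUEUE_DEPTH];
} prl_t;

/* Client side: queue a request in the client's own resource.
 * Returns the request index, or -1 if the client's queue is full. */
int client_submit(client_resource_t *c, uint64_t lba, uint32_t blocks)
{
    for (int i = 0; i < CLIENT_QUEUE_DEPTH; i++) {
        if (c->req[i].state == REQ_FREE) {
            c->req[i].lba = lba;
            c->req[i].blocks = blocks;
            c->req[i].state = REQ_PENDING;
            return i;
        }
    }
    return -1;
}

/* PRL side: move pending requests from every client into free HW slots.
 * Returns the number of requests moved. */
size_t prl_collect(prl_t *prl, client_resource_t *clients, size_t nclients)
{
    size_t moved = 0;
    for (size_t c = 0; c < nclients; c++) {
        for (int i = 0; i < CLIENT_QUEUE_DEPTH; i++) {
            client_request_t *r = &clients[c].req[i];
            if (r->state != REQ_PENDING) {
                continue;
            }
            for (int s = 0; s < HW_QUEUE_DEPTH; s++) {
                if (!prl->slot[s].busy) {
                    prl->slot[s].busy = true;
                    prl->slot[s].owner = r;
                    r->state = REQ_IN_FLIGHT;
                    moved++;
                    break;
                }
            }
        }
    }
    return moved;
}

/* PRL side: the controller reports slot `s` complete; notify its owner. */
void prl_complete(prl_t *prl, int s)
{
    assert(s >= 0 && s < HW_QUEUE_DEPTH);
    if (!prl->slot[s].busy) {
        return;
    }
    prl->slot[s].owner->state = REQ_DONE;
    prl->slot[s].owner = NULL;
    prl->slot[s].busy = false;
}

/* Client side: poll only this client's own request; frees it when done. */
bool client_poll_done(client_resource_t *c, int i)
{
    assert(i >= 0 && i < CLIENT_QUEUE_DEPTH);
    if (c->req[i].state != REQ_DONE) {
        return false;
    }
    c->req[i].state = REQ_FREE;
    return true;
}
```

In this sketch a multi-threaded client would share one `client_resource_t` across threads, which leaves the one-resource-per-thread question open; the sketch also ignores memory protection and synchronization between the client and the PRL.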
Allow PRL to handle interrupts?
One proposal to alleviate the concern above (multiple users and how to indicate to them that their transaction is completed) is for the kernel to allow PRLs to respond to interrupts. The PRL should be the place where the user context is mapped to the HW context, so the PRL upon handling the interrupt can, without causing partitioning concerns, indicate transaction completeness to the correct user and, additionally, start new transactions to increase throughput.
- Is this just extending the KMI concept and its concerns beyond the PAL to PRLs?
- The Kernel UG makes it sound like the handling of timing adjustments to make the KMI transparent to the kernel might be platform-specific?
Prior stream-of-consciousness garble
- Avoid blocking users except where absolutely necessary.
- Where is absolutely necessary?
- When transactions to HW cannot be atomic, such as when a transaction is started but will take some time to complete.
- Transactions to HW should:
- Poll for busy.
- Initiate transaction.
- Poll for complete.
- How to handle completeness checking in the following scenarios:
- In a multi-user scenario where user A has made it to step 3 (poll for complete) but is switched out by the kernel to user B, which is at step 1 (poll for busy).
- When the HW completes:
- If it completes while user B is in context, how does the driver tell user A that its transaction has completed? (Is this a potential partitioning violation?)
- If user B proceeds through steps 2 and 3, but is switched out by the kernel back to user A, how does user A know that it is now polling for user B's transaction completeness and that its transaction has already completed?
- Avoid blocking users except where absolutely necessary.
- Typically a HW controller has "complete" bit indications in its registers, but no indication of who started the transaction that just completed.
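The interleaving described above can be replayed in a small C simulation. This is only a sketch under stated assumptions: `sim_controller_t` and the `sim_*` functions are hypothetical, the "controller" is a plain struct with one busy bit and one complete bit, and the owner fields exist purely so the simulation can reveal whose transaction a completion belongs to (real hardware, as noted, has no such indication).

```c
#include <assert.h>
#include <stdbool.h>

enum { NO_USER = 0, USER_A = 1, USER_B = 2 };

typedef struct {
    bool busy;           /* step 1 polls this */
    bool complete;       /* step 3 polls this */
    int  current_owner;  /* simulation only: not visible on real HW */
    int  completed_owner;/* simulation only: not visible on real HW */
} sim_controller_t;

/* Step 2: initiate a transaction. Starting one clears the complete bit. */
bool sim_initiate(sim_controller_t *hw, int user)
{
    if (hw->busy) {
        return false;
    }
    hw->busy = true;
    hw->complete = false;
    hw->current_owner = user;
    return true;
}

/* The hardware finishes the transaction currently in progress. */
void sim_hw_finish(sim_controller_t *hw)
{
    assert(hw->busy);
    hw->busy = false;
    hw->complete = true;
    hw->completed_owner = hw->current_owner;
}

/* Step 3: poll for complete. Note that it carries no user identity. */
bool sim_poll_complete(const sim_controller_t *hw)
{
    return hw->complete;
}

/* Replays the scenario and returns the owner of the transaction whose
 * completion user A ends up observing. */
int sim_interleaving(void)
{
    sim_controller_t hw = { false, false, NO_USER, NO_USER };

    /* User A: steps 1 and 2, then reaches step 3 with nothing complete. */
    bool started_a = sim_initiate(&hw, USER_A);
    bool a_saw_complete = sim_poll_complete(&hw);
    assert(started_a && !a_saw_complete);

    /* Kernel switches to user B; A's transaction completes meanwhile. */
    sim_hw_finish(&hw);

    /* User B: step 1 sees not busy, step 2 clears the complete bit. */
    bool started_b = sim_initiate(&hw, USER_B);
    assert(started_b);

    /* B's transaction completes, then the kernel switches back to A. */
    sim_hw_finish(&hw);

    /* User A resumes step 3 and sees "complete", but for B's transaction. */
    if (sim_poll_complete(&hw)) {
        return hw.completed_owner;
    }
    return NO_USER;
}
```

In this replay `sim_interleaving()` returns `USER_B`: user A's poll succeeds on the completion of user B's transaction, and the evidence that A's own transaction finished was erased when B initiated.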