Modular Boot

From DDCIDeos

Description

A binary-reusable, customizable boot. There is an associated chat.

Problem statement

This project was spawned by a discussion in the BSP_Project:

  1. What Boot customization technology?
    1. Can boot be structured in phases, e.g., secure boot, customer/PBIT-boot, Deos-boot
    2. Would it be acceptable for Deos applications to do selection logic? I.e., boot into a stripped down hyperstart that contained just the drivers necessary to load images from, e.g., SATA, into RAM, then return to boot to enter the selected hyperstart+MFS configuration.
      1. Increase in boot time (small enough?)
        1. RLF has requested a benchmark for time to boot into kernel, load DVMS drivers, and shutdown. I.e., the "overhead" portion of this proposal.
      2. Eliminates/minimizes need to have boot duplicate Deos drivers.
      3. Could enable customers to more easily customize the logic, e.g., via libraries, config files, etc., that are available to Deos applications.
    3. Should we add a new BIF "file type" which contains a binary code blob that does selection logic, i.e., that returns a new index?
  2. We need a list of potential challenges for staged boot
    • Linking locations -- physical vs virtual memory (on different architectures)
    • Relocating stages
    • Make infrastructure and linker scripts
    • How large an area should we reserve for boot? It is currently very small, but if we want customers to experiment with adding code, we may want to grow it to 1 MiB? Set the imx8 family to 128 pages.

Basic idea: https://ddci.zapto.org/scm/Deos/products/bsp/dev-kit/proposals/phased-boot-proposal-02.drawio

This also provides a framework to address:

  1. Logging
    1. Customizable placement into persistent memory
    2. Where is the driver for this?
    3. How variable/complicated is it?

Rationale and Background

We've treated BSPs as customer specific and hence "custom". This has, at least, two drawbacks:

  1. Determining the customer's needs requires negotiating many items before we can even get the contract. This is time consuming and frustrating for the customers.
  2. Generating and verifying many variations of something means that it is harder to reuse artifacts.

The general thought is to try to be more "generic", especially for reference BSPs, and try to focus on architectural decisions that facilitate commonality even if some customers get features they technically don't need. I.e., just like other components we deliver.

Status

There are two BSPs that currently support phased boot:

  1. Unreleased or Build https://ddci.zapto.org/scm/Deos/products/bsp/s32g2/branches/mainline and install it in /desk
  2. Build https://ddci.zapto.org/scm/Deos/products/bsp/tiger-lake/branches/mainline -a come-ctl6-x86_64 and install it in /desk

The following components are unreleased for x86_64, arm, and aarch64:

  • boot-cpu-aarch64 1.0.0 stable
  • boot-cpu-arm 1.0.0 arm
  • boot-cpu-x86_64 1.0.0 x86_64
  • boot-multiboot-shim 1.0.0 x86_64

The following components are stable:

  • boot-cpu-shared 1.0.0 x86_64, arm, aarch64 and ppc
  • kernel-entry-loop 1.0.0 x86_64, arm, and aarch64
  • modular-boot-tools 1.0.0 x86_64, arm, aarch64 and ppc
  • module-loader 1.0.0 x86_64, arm, aarch64 and ppc
  • ansi 4.11.0

The following are in some stage of modular boot development:

There is documentation:

  1. kernel-entry-loop-user-guide.htm.
  2. modulular-boot-user-guide.htm is delivered with the modular-boot-tools.
  3. module-loader-user-guide.htm

Tasks

Products/components needed:

  1. kernel-entry-loop (mostly architecture independent) - prototype DONE.
  2. cpu-init (at least one per architecture, possibly one per sub-architecture). Plan this as the verified version (meets kernel requirements) that can modify any register. If a BSP cannot support that, it can skip this module and instead use boot-entry's simplified dev-kit cpu-init.
  3. boot-entry (currently BSP specific, maybe possible to have more generic classes, e.g., ppc-uboot, x86_64-slimboot)
  4. Module loader - prototype and initial requirements DONE
  5. provisioning data workstation tool IN PROGRESS
  6. boot elfchk workstation tool (or boot specific mods to elfchk)
  7. example boot module
  8. mbconfig boot packager workstation tool IN CDPROC IN PROGRESS

Broader tasks:

  1. Port each BSP
  2. Training material
    • Create documentation for each module
    • Create generic UG for modular boot - Template done. Need to enhance to detail how to create boot modules.
  3. Figure out how to partition up the boot requirements
  4. AL: The makeboot extension needs to be able to locate it, even if it was copied for edit.


Specific Tasks:

Boot:

  • Add the BIF Loader to the BARC - done for X86 only
  • Make the boot-entrys relocatable

PAL:

  • Create network PRLs that mux the interrupts
  • Add support for programming the trigger and sense from a bitmap in the registry
  • Add code to use the thread/window timer multiplier and divisor from the pdata if supplied. Arm can default to reading register if not defined.

Documentation updates:

    • bootModuleFile entries can be added to the openarbor.options file (.options in OA)
    • Each of the *.bm.xml files + basecon.* files are visible in OA under Complete Integration -> Config Files

basecon.pdata:

  • Add pdataVersion=!pdataVersion! (the version is automagically retrieved from the corresponding pdata.h)
  • Add pdataArchVersion=!pdataArchVersion! (the version is automagically retrieved from the corresponding pdata-arch.h)
  • Add --windowUsPerTickMultiplier=<value>
  • Add --windowUsPerTickDivisor=<value>
  • Add --threadUsPerTickMultiplier=<value>
  • Add --threadUsPerTickDivisor=<value>
  • Add --coreToBoot=<physical core number>, <processor core ID> for each core that will be enabled
    • Core is disabled by default unless there is a --coreToBoot entry for it
    • processor core ID = 0x80000000 means core is disabled
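
As a rough illustration of how boot code might consume the tick multiplier/divisor pairs and the --coreToBoot entries listed above, the following C sketch scales a tick count to microseconds and tests whether a core is enabled. Only the option names and the 0x80000000 "disabled" value come from the list above. The struct, its field names, the function names, and the relation microseconds = ticks * multiplier / divisor are assumptions made for the sketch and are not taken from pdata.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* From the notes above: a processor core ID of 0x80000000 marks a disabled core. */
#define SKETCH_CORE_ID_DISABLED 0x80000000u

/* Hypothetical container; the real pdata.h layout is not shown in these notes. */
typedef struct {
    uint32_t usPerTickMultiplier; /* e.g. from --windowUsPerTickMultiplier */
    uint32_t usPerTickDivisor;    /* e.g. from --windowUsPerTickDivisor */
} SketchTickScale;

/* Assumed relation: microseconds = ticks * multiplier / divisor.
 * The multiplication is not protected against 64-bit overflow in this sketch. */
uint64_t sketchTicksToMicroseconds(const SketchTickScale *scale, uint64_t ticks)
{
    assert(scale->usPerTickDivisor != 0u);
    return (ticks * scale->usPerTickMultiplier) / scale->usPerTickDivisor;
}

/* A core is booted only if its --coreToBoot entry supplied a real processor core ID. */
bool sketchCoreIsEnabled(uint32_t processorCoreId)
{
    return processorCoreId != SKETCH_CORE_ID_DISABLED;
}
```

For example, with a multiplier of 1 and a divisor of 24 (a 24 MHz counter), 48 ticks scale to 2 microseconds under the assumed relation.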


Notes from 2025-07-18 telecon.

- Whether the barc contains the BIF is determined by whether a bootModule lists the "darc/LFS" as a bootFile.
- The BSP uses modular boot iff a cd.xml contains a bootModuleFile element.
- Whether to run makeboot is based on basecon.makeboot and/or the makeboot extension.

What is the name and type of the BIF i.e., "darc/LFS"?
What is the makeboot command and what are the makefile list of dependents?

Proposal 1:  Add to componentDescriptor.xsd: 
  <platformProjectBIF>composite.darc
  <bootArchive>composite.darc
  Then element can be used in either cd.xml or openarbor.options.
compositeOutputFilename

<makebootOutputFile>container.bin
<makebootOutputFile>deosBoot.qemu

Has to show up as "copy for editing":
  basecon.makeboot
  basecon.pdata

OA label: "Composite Filename" needs to be ???

Specific TODOs:

  • Move the Kernel Archive Load Code to a boot-load-kfs-ram Module.
  • Upgrade the selection algorithm in boot-load-kfs-ram module to use the fallback mechanism in BSP_Project#Boot_Archive_Image_Fallback_Selection proposal:
    • Add two fields to BIFH_Subsection_t
      1. The fallback behavior (e.g. fatalError, logAndContinue, loadAlternateSelectionIndex).
      2. The alternate image index to load
      3. Possibly combine the two above fields into one 32 bit value and add a fallForward behavior and index.
    • Each BIFH_contentType_compositeArchive type specifies an alternate selection index if this index fails
    • How is the building of the composite and selection archives going to be represented in OpenArbor??? This seems complicated.
  • Get a common dprintf working
  • Develop Tests for the kernel-entry-loop. Add in testpoint code as an additional boot module immediately before the module to be tested. Start with testing:
    • Changing images in a selection archive
    • kernelModeError
    • warmstart path
  • Evaluate what it will take to keep the cache coherent in the kernel-entry-loop. E.g., celestial had to have a barrier to ensure the cache was invalidated while other cores were waiting.
  • Write some documentation for the BSP UG that can be common for loading the .darcs
  • Move kernelModeError(), logFatalError() and idleMode() to the phased boot... architecture specific DONE
  • Add 4th parameter to logFatalError() that will get loaded to PSIO->startUpError
  • Move the stack initialization code to the cpu components. Try to combine the init of the stack in the PSIO into this code.
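
The fallback TODO above (two new BIFH_Subsection_t fields, possibly combined into one 32-bit value) could be sketched in C roughly as follows. The behavior names come from the TODO. The numeric values, the 8/24 bit split, and all function names are hypothetical, and the chain walk is only one possible reading of "alternate selection index if this index fails".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Behaviors named in the TODO above; the numeric values are assumptions. */
typedef enum {
    FALLBACK_FATAL_ERROR = 0,
    FALLBACK_LOG_AND_CONTINUE = 1,
    FALLBACK_LOAD_ALTERNATE_SELECTION_INDEX = 2
} SketchFallbackBehavior;

/* Hypothetical split: top 8 bits hold the behavior, low 24 bits the alternate index. */
#define FALLBACK_INDEX_BITS 24u
#define FALLBACK_INDEX_MASK ((1u << FALLBACK_INDEX_BITS) - 1u)

uint32_t packFallback(SketchFallbackBehavior behavior, uint32_t alternateIndex)
{
    assert(alternateIndex <= FALLBACK_INDEX_MASK);
    return ((uint32_t)behavior << FALLBACK_INDEX_BITS) | alternateIndex;
}

SketchFallbackBehavior fallbackBehavior(uint32_t packed)
{
    return (SketchFallbackBehavior)(packed >> FALLBACK_INDEX_BITS);
}

uint32_t fallbackIndex(uint32_t packed)
{
    return packed & FALLBACK_INDEX_MASK;
}

/* Follow alternate indexes until an image loads. Returns the index that
 * loaded, or -1 when the chain ends without success. loads[i] stands in for
 * "image i loaded and verified". */
int32_t selectWithFallback(const uint32_t packed[], const bool loads[],
                           uint32_t count, uint32_t start)
{
    uint32_t index = start;
    /* Bound the walk by count so a cyclic chain cannot loop forever. */
    for (uint32_t hops = 0; hops < count && index < count; ++hops) {
        if (loads[index]) {
            return (int32_t)index;
        }
        if (fallbackBehavior(packed[index]) != FALLBACK_LOAD_ALTERNATE_SELECTION_INDEX) {
            break;
        }
        index = fallbackIndex(packed[index]);
    }
    return -1;
}
```

Bounding the walk by the number of entries is a design choice for the sketch: it keeps a misconfigured archive whose alternates point at each other from hanging the boot.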

Design

Decision was made to switch from binary to ELF files. See Modular Loader UG "module-loader-user-guide.htm#term-ELF-module-constraints".

Should the BIF header have a few words at the beginning to inject some jump instructions?

  1. Or should that be a makeboot thing, e.g., add a phase -1 which is just some jump instructions?
    • Useful for loading code at reset vector
    • Useful for simplifying uboot configuration (jump to where image was loaded).
    • One(all?) of the linux image formats work this way.
    • Multi-boot has a similar feature.

Notes from AL/RLR/AR telecons:

Phase 0:  slimboot, grub, reset vector
  does it initialize stack?


Phase 1 is always started without an ELF loader being present.

  A. Code that doesn't have ELF loader support
     Can't access through GOT, PLT, etc.
     must setup stack
     must do minimal CPU init (e.g., segment registers) to enable calling C code.

  B. Desire: Somehow need to be able to fixup GOT, PLT, etc, and still
     continue in phase 1.
     DONE: Call Module Loader's initModuleLoader() function as soon as stack is set up and paging (if applicable)

  C. Desire to use libraries (ANSI) while in phase 1.


Symbol resolution:  if same symbol is defined in phase 1, 2, 3
     The Highest numbered phase is used. Documented in Module Loader UG.
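
The "highest numbered phase wins" rule can be illustrated with a small C lookup. This is not the Module Loader's implementation (see the Module Loader UG for the actual behavior). The record type, the flat table, and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: one symbol exported by the module loaded in a given phase. */
typedef struct {
    const char *name;
    unsigned phase;
    uintptr_t address;
} SketchPhaseSymbol;

/* Return the definition from the highest-numbered phase, or NULL if the
 * symbol is not defined in any phase. */
const SketchPhaseSymbol *sketchResolveSymbol(const SketchPhaseSymbol *table,
                                             size_t count, const char *name)
{
    const SketchPhaseSymbol *best = NULL;
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(table[i].name, name) == 0 &&
            (best == NULL || table[i].phase > best->phase)) {
            best = &table[i];
        }
    }
    return best;
}
```

With a symbol defined in phases 1, 2, and 3, the lookup returns the phase 3 definition regardless of table order.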

TODO: UG should describe protocol for deferring code to earlier phase.

TODO: Seriously restrict what the so's init can do, e.g., can't depend on
ability to call functions in another so that might not have setup
appropriate machine state.  Eesh.

Notes from 2025-10-03 telecon

Aaron, Adina, Carlos, Kobus, Ryan

How to handle requirements for modular boot.

tracetag per register
  - what about return to boot vs startup

low level requirements?

each boot component would trace to boot srd.

individual components would trace to boot srd.

Each BSP would have trace doc showing how each module traces to
bootsrd.

historically for each BSP:
 Create boot SRD, cloned from kernel's BOOTSRD
 Special section for board initialization

High level issues:

BSP SRD has content for
 multiple architectures
 sub-architectures

Modular boot has multiple modules
  need to trace module requirements to BOOTSRD
  need completeness coverage
  modules:
     boot entry (board initialization)
     boot-cpu-ARCH (cpu init, cpu-shared(pdata struct, cpu-arch
                    interface specification, etc.))
     module-loader (only high level requirements, would not trace to BOOTSRD)
     kernel entry loop
     ansi and other infrastructure modules have their own requirements
     platform module (anticipated)
     pbit
     
Do we need a boot-PIG?
  perhaps in modular-boot-UG.

textual "requirements" that do not have shalls.

Steps to Transform Boot to a Modular Boot

Under Construction

  1. Create an experimental branch starting with a currently operational modular boot of the same architecture following CM howto.
  2. Commit experimental branch using the branch PCR, and keep open to merge experimental back to mainline, but don't commit any other changes to this PCR except for merging.
  3. Create a PCR "Create a Modular Boot"
  4. Roll the major version number in configure.ac (based on the last release from mainline)
  5. Makefile Updates
    • Search for the old BSP name and replace it with new BSP name.
    • Is there any difference between the mainline and experimental PAL? Can it be fixed with contents of pdata?
    • Update the externals to use modular-boot instead of dev-kit as much as possible. Note PSIO is formatted differently, which affects the PAL.
      • Remove externals (some may not be present):
        • ansi
        • boot/code/bootcore
        • code/cpu-core
        • pal/code/pal-wat-support
    • New Boot is linking against ansi-core. Does it need to link against ansi-dale?
      • If linking against ansi-dale (or platform-boot-module) this should be in the makefile (see example of ansi-core)
      • If ansi-dale is needed the function names also changed to have *_dale(), e.g. strtoul_dale()
    • Evaluate the code and/or UG to see if this boot is doing anything "Special". Can this behavior be put in a platform-specific boot module?
  6. Code changes:
    • Delete config.h and all references to it if possible. Anything still defined in this file is usually moved to the makefile, although PPC has a valid use for it.
    • Delete the old platinfo.h (in the shared folder or code or common). The new one is in modular-boot/code. Stop delivering a platinfo.h. If there is a custom extension on the one already in the desk/include/boot, that can be included as a separate file platinfo-custom.h with accessor functions to reach the custom fields.
    • Changes to constants.py:
      • Add a page of RAM for PSIO (likely already in arm bsps)
      • Increase the size of boot (size is TBD, currently using 128 pages)
      • The boot stack should have a page of stackOverflow allocated before it. bootStack is no longer in the linker script. The address of the stack is proposed in constants.py but may be moved by the user. basecon.pdata contains switches for bootPrimaryStackPages and bootSecondaryStackPages. BootStack should be allocated before bootRAM, and bootRAM should be last in case a big BOOT_IMAGE in the boot overlaps with the BOOT_IMAGE memory region.
    • No boot function should return an error. On failure, call logFatalError(), set the error value in PSIO->startUpError, and then call idleMode(). TODO: Put the logging of PSIO->startUpError inside logFatalError() with the startDeosReturnType as the 4th parameter to logFatalError().
    • May need to change the function that releases secondary cores, "void releaseSecondaryCoresByBitMap(uint32_t coreBitMap)". Maybe this function should be moved out of boot-entry and into a platform-specific module?
    • Change the PAL's initialization of waitForNextSystemTickCompleteTimestamp to PSIO_getcore0TimeStampsOffset(PSIO)->waitForNextSystemTickCompleteTimestamp = readTimeStampCounter();
    • If secondary threads were being released, why? We don't support that in verf so remove this option. Add -architectureSpecificConfigurationWord0 to basecon.bangvar.hyp to specify which cores can possibly be booted (i.e., skipping the threads).
    • Evaluate all of the configuration components and update variables to match the platform, specifically:
      • basecon.bangvar.hyp
      • basecon.bangvar.makeboot
      • basecon.bangvar.pdata
      • basecon.bangvar.pia.xml - compare against the mainline version
    • Ensure that all memory reserved by the boot or PAL is reserved in basecon-resources.pia.xml
  7. Update the BSP documentation:
    • Update the release notes
    • Boot sequence description with diagrams.
    • Add warnings about modular boot (See imx8qm for example)
    • Read through BSP UG for any other necessary changes.
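
The stack placement rules listed under the constants.py changes above (one stackOverflow page before the boot stack, the boot stack before bootRAM) can be expressed as a small consistency check. The sketch below is in C for illustration only. The 4 KiB page size, the region type, and the function names are assumptions, and "before" is read as "at a lower address".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* Hypothetical description of one memory region: base address plus page count. */
typedef struct {
    uint64_t base;
    uint32_t pages;
} SketchRegion;

/* First address past the end of the region. */
uint64_t sketchRegionEnd(SketchRegion region)
{
    return region.base + (uint64_t)region.pages * SKETCH_PAGE_SIZE;
}

/* True when exactly one guard page sits immediately below the boot stack
 * and the boot stack ends at or below the start of bootRAM. */
bool sketchBootLayoutIsValid(SketchRegion guard, SketchRegion stack,
                             SketchRegion bootRam)
{
    return guard.pages == 1u
        && sketchRegionEnd(guard) == stack.base
        && sketchRegionEnd(stack) <= bootRam.base;
}
```

A check of this kind could run on the workstation when the constants are generated, so a stack moved by the user is caught before the image is built.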

Changes to Other Components

BIF:

  1. We've talked about re-arranging the header fields to make integrity checking easier.
  2. Adding space at the beginning of the header for an instruction or two so the DeosBoot image could be located at the reset vector.
    1. The idea is that the instruction would be a branch to the entrypoint in phase zero.
  3. Mechanism to add "on load fail" and "on load success" attributes to archive member BIFH_Subsection_t. E.g., add a header member to specify the size of the BIFH_Subsection_t, or a special subsection_t_with_load-alternative-specification, or have the archive to specify data for each member some other way.
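
To make items 2 and 3 above concrete, here is a hedged C sketch of a header that starts with a couple of instruction words and records the size of each subsection entry, so that entries can grow (for example with "on load fail" attributes) without breaking readers that step by the recorded size. Every name and field here is hypothetical. This is not the actual BIF header or BIFH_Subsection_t layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_JUMP_WORDS 2u /* "an instruction or two" */

/* Hypothetical header: the leading words would hold a branch to the
 * phase-zero entry point so the image can sit at the reset vector. */
typedef struct {
    uint32_t jumpWords[SKETCH_JUMP_WORDS];
    uint32_t subsectionSize;  /* size in bytes of each subsection entry */
    uint32_t subsectionCount; /* number of subsection entries */
} SketchBifHeader;

/* Locate entry `index` by stepping with the size recorded in the header
 * rather than a compile-time sizeof, so larger entries remain readable. */
const uint8_t *sketchSubsectionAt(const SketchBifHeader *header,
                                  const uint8_t *subsections, uint32_t index)
{
    assert(index < header->subsectionCount);
    return subsections + (size_t)index * header->subsectionSize;
}
```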

ANSI

  1. DONE: Could there be an ANSI .so, possibly shared with the Deos kernel, that would provide a subset of ANSI functions that do not require global storage? E.g., memcpy(), memset(), etc. There are now ansi-core and ansi-dale libraries.