Jupiter kernel
Description
Project for the kernel for the Jupiter baseline. At this point the project mainly supports Durants3, but there are also tasks for Desert Eagle, Sales, and Celestial.
Useful Links
- Query of all outstanding PCRs
- Query of open Rqts/Code PCRs (PCRs with requirements/code work assigned).
- Query of open test PCRs (PCRs with test work assigned).
- mainline review status
Activities
| Activity | Assignee | Status | Remarks |
|---|---|---|---|
| Create and Initial Population of Certification Archive (If already created, just perform populate step.) | RLR/RDR | Done | |
| Requirements Development | RLR | Done | |
| Standards Change Analysis | CF/RR | Done | |
| Architecture Change Impact Analysis | RLR | Done | |
| Initialize Status Files | CF/RR | Done | |
| Requirements review | CF/RLF | Done | |
| Code Development, i.e. trace tag insertion and additional test points as determined via test development. | RR/RLF | Done | |
| Code review | TBD | Done | |
| Test Case Development | TBD | Done | |
| Test Procedure Development | TBD | Done | |
| Test Reviews. Notify SQA when this activity is complete. | TBD | Done | |
| Software life cycle audits - email team to review analysis if had kernel analysis files | JEC | Done | |
| Processor Compatibility Analysis | TBD | N/A | |
| Before the following can be done the above activities must be complete | |||
| Dynamic Dispatch aka V-Table Analysis (note: no review needed) | RLR | Done | |
| Compiler Assessment | RLR | Done | |
| Portal Analysis | RLR | Done | |
| Portal Analysis Review | RDR | Done | |
| Traceaid Qualification | RDR | Done | |
| ABC SCAT Qual | RDR | Done | |
| Integration Review | RDR/JEC | Done | |
| Before the following can be done the Requirements Coverage analysis' traceability needs to be confirmed and ABC-tool Qual needs to be completed | |||
| Formal build (including Conformity inspection) | RDR/JEC | Done | |
| Before the following can be done the Formal build needs to be completed. Each executable object code analysis has an analysis and review task. | |||
| Executable Object Code Analysis | |||
| - verify following task list matches kernel howto | RLR | Done | |
| - linker invariants Analysis | RLR | Done | |
| - Memory Barrier Analyses | RLR | Done | |
| - ARM ldrex instruction Analysis | RLR | Done | |
| Executable Object Code Analysis Review | RDR | Done | |
| B-Tree analysis | RDR | Done | |
| B-Tree analysis review | RLR | Done | |
| Requirements Coverage Analysis | RDR | Done | |
| MFS relocatable analysis | RLR | Done | |
| Before the following can be completed all of the above analyses need to be completed. The RFS was completed with the expectation that the analyses above would be successful. | |||
| Run For Score | RDR/JEC | Done | |
| Test Results Review | RDR | Done | |
| Before the following can be done the Run For Score needs to be completed | |||
| Structural coverage analysis | RDR | Done | |
| Stack analysis | RDR | Done | |
| Stack analysis review | RLR | Done | |
| Confirm/update SCORE adarts SAS/stack usage analysis once the final kernel binaries have been built | JON | Done | |
| Verification audit | JEC | Done | |
| Before the following can be completed all of the above tasks need to be completed | |||
| Processor Errata analysis | RLR | Done | |
| Populate the cert archive | KKL | Done | |
| Report Documents | |||
| - SLCECI | RLR | Done | |
| - SLCECI review | KKL | Done | |
| - SCI | RLR | Done | |
| - SCI review | KKL | Done | |
| Open Problem Reports List (after the final CCB) | RLR | Done | |
| - SAS | RLR | Done | |
| - SAS review | KKL | Done | |
| - Publish Backend Docs: (update analysisStatus.txt) | KKL | Done | |
| Software conformity audit | JEC | Done | |
| Final Steps upon Verf Complete | KKL | Done | |
Testing on Docker
Install Docker, then:
    cd ....kernel/branches/experimental
    run-docker -it --rm ubuntu-jupiter-dev
    # kernel headers changed, must rebuild and install kernel
    common/build-utils/build -q -j12 install
    sudo rm $(find $(dpkg-query -L kernel) -maxdepth 0 -type f)
    sudo cp -rvu output/desk /
    cd tests
    export DESK_IP_ADDR_qemu_arm=192.168.19.103
    ./common/test-utils/regress -j12 -q kernel exe qemu-arm build run local debug tpk253
Tasks
| Task | Priority | Assignee | Status | Remarks |
|---|---|---|---|---|
| > 32-bit physical Deos RAM support on ARM | 1-release | RLR | Done | |
| trace32-extension-2.0.x (PCR 12395) | 2-release | RLR | Done | |
| Registry in KFS other than KFS0 | 3-release | AL/RLR | Done | |
| Platform resources support Cache Partitioning Semantics | 3-release | RLR/GK | Done | |
| Qualifiable KFS (LFS and MFS) checker tool. | 4-release | CF | Done-ish | |
| support debugging files in MFSs. | 4-release | RLR/GK | Done | Kernel code/requirement updates complete. IT and OA updates needed before users can use it. |
| Multi-core crittime updates. | 4-release | RLF | Done | Done? |
| Incremental file save? | 4-release | CP | Done | |
| Add LFS support to findFirstKernelFile() and friends. | 4-release | RLR | Done | |
| Finish MFS support | 4-release? | AL/RLR | Done | |
| Support WATs of varying lengths (for Desert Eagle). | 5-release | RLF/RLR | Done | |
| Multi-core | 6-release | RDR/GK | Done | Original activity was only arm. Both arm and ppc multi-core test complete. x86 not verified for Jupiter. |
| address multicore TODOs (1 -Docs, 7 - code). | 6-release | AL/RLR | Done | Pending tests to confirm |
| E6500 and T2080 requirements for Celestial | 5-release | RLF/CF | Done | |
| Add A72 to requirements | 5-release | RLR | Done | Not needed for this cert but needed by BSP team. |
| Add TI J721 E to requirements | TBD | TBD | rejected | Not needed for this cert. |
| Add i.MX 8QuadMax to requirements | 5-release | RLF/CF | Done | |
| Add Zynq UltraScale+ to requirements | 5-release | RLF/CF | Done | |
| Add ability to inhibit platform resource proxy access | TBD | RLR | Done | |
| multi-core boot interceptor | TBD | RDR | Done | Changes to kernel proper are "done". Final testing depends on port of kernel interceptor tests to multicore |
| support 653 needs for slack disabled scheduler stack analysis | 5-release | RLF | Done | Add setExceptionMask API |
| Remove dependence on sys/types.h and stop delivering it | TBD | TBD | Done | |
| svn up kernel sources in hypstart | TBD | RLR | Done | |
Release stable/complete dates:
- 3/12/2021
- 4/7/2021 (Sales aperiodic release)
- 4/29/2021
- 8/23/2021
- 10/7/2021 (Desert Eagle release)
- 3/1/2022 (RFS)
Debugging files in MFSs
The Problem
When writeProcessMemory() is called on addresses that are mapped to files in MFSs, it writes to the file directly without first copying it to RAM, so files in MFSs will be corrupted by the debugger.
Page attributes of interest
- owned
- readOnly
- readWrite
- malleable?
- frozen
- physical address
Proposal 1
- Update writeProcessMemory() to thaw the page if its physical address is not in RAM and the access is readOnly. Or should the condition be !readWrite?
- If a process has read only access to a platform resource writeProcessMemory() will no longer be able to write to the physical device.
Proposal 2
- Have mapViewOfPlatformResource() set the frozen attribute.
- Should the name of the frozen attribute be changed? If so to what? Immutable? CopyOnWrite? Persistent? I am not convinced it is necessary but the description would need to be changed.
Proposal 3 - The Winner!
- Add an attribute to the platform resource definition in the registry indicating that it is immutable, and set the frozen attribute on attach if the immutable attribute is set.
- How does this differ from the readOnly attribute (see Proposal 1)? The attribute could be at the resource level, not the attachment level.
- This means 2 resources for runtime initialized MFSs. One for the initializing process and one for the using processes. Prone to error?
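A minimal sketch of how Proposal 3 might behave, written as a Python model rather than kernel code. Everything named here (`PlatformResource`, `Mapping`, `attach`, `write_process_memory`, the `immutable` field) is hypothetical and only mirrors the terms used above. It also assumes that "thaw" means making a private RAM copy before writing, and it copies the whole resource instead of a single page to keep the model short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlatformResource:
    """Hypothetical registry entry for a platform resource such as an MFS."""
    name: str
    data: bytearray
    immutable: bool = False  # assumed new registry attribute (Proposal 3)


@dataclass
class Mapping:
    """Hypothetical per-process view created when attaching to a resource."""
    resource: PlatformResource
    frozen: bool = False
    ram_copy: Optional[bytearray] = None  # private copy, created on thaw

    def read(self, offset, length):
        backing = self.resource.data if self.ram_copy is None else self.ram_copy
        return bytes(backing[offset:offset + length])


def attach(resource):
    # The frozen attribute is set on attach when the resource is immutable.
    return Mapping(resource=resource, frozen=resource.immutable)


def write_process_memory(mapping, offset, payload):
    # Thaw first (assumed: copy to RAM) so the backing file is never modified.
    if mapping.frozen and mapping.ram_copy is None:
        mapping.ram_copy = bytearray(mapping.resource.data)
    target = mapping.resource.data if mapping.ram_copy is None else mapping.ram_copy
    target[offset:offset + len(payload)] = payload


# A debugger patches the first bytes of a file in an immutable MFS.
mfs = PlatformResource("mfs0", bytearray(b"FILEDATA"), immutable=True)
view = attach(mfs)
write_process_memory(view, 0, b"BRK!")
```

In this model the debugger's write lands in the private copy, so `view` sees the patched bytes while `mfs.data` is unchanged. A resource without the immutable attribute is still written directly, which keeps debugger writes to physical devices working.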
Proposal 4
- Set the frozen attribute when attaching readOnly to a platform resource.
Multiple WAT / 653 Part 2 Multiple Module Schedule
The problem
- Currently all WATs must be the same duration
- MMS switches at a major frame - the standard implies each schedule can have a different major frame
- WAT switches at WAT duration (currently hyperperiod)
- currently major frame is the same as WAT duration
- 653 major frame - runtime - when aperiodic processes record worst case execution time
- 653 major frame - check 653 config tooling
- All 653 Deos threads are fastest rate, min budget, slack consumer. Single thread in scheduler.
- MMS has a partition schedule change hook
- implemented in 653 lib when window activation detects new schedule
Typical scenario
- 1st schedule - minimal set of critical partitions (more time allocated)
- 2nd schedule - all partitions (critical partitions now get their normal budget, leaving room for extra partitions)
- 3rd schedule - reconfiguration due to errors, etc
- Desert Eagle talked at one point about a scenario in which, during a power down event, they want to give the rest of the time to the file system to flush data
Proposal
- The system tick rate is constant across the system.
- The set of Deos periods is constant across the system.
- A WAT will indicate the slowest Deos period it supports.
- A WAT may have any set of Deos supported periods. The new WAT does not need to be the same or longer duration than the prior WAT.
- WAT change will continue to occur at the WAT duration for the WAT switching away from (i.e. that WAT's hyperperiod will be completed and when switching to window 0, it will be window 0 of the new WAT).
- It is not acceptable for short duration WAT to wait and switch at max WAT duration.
- WAT duration == Major Frame for 653 schedule
- Will add support for scheduler not at fastest rate. Allows 653 partitions to alternate spots in a timeline without extending tick.
- 653 config tool will pick slowest rate supporting partition period
- Scheduler will identify fastest/shortest period supported.
- IT will warn if threads are assigned to a scheduler with a longer period than a WAT to which the scheduler belongs
- This is not an error, so that critical threads can run in all WATs while other threads start in a WAT that supports their period. The network is an example: only limited threads run at startup, but it may be desirable for all threads to be in the same scheduler in the normal WAT rather than having two schedulers.
- Thread will take up some fixed budget, if configured like this.
- IT configuration
- thread template - symbolic rates (fastest) are relative to the scheduler it's assigned to
- scheduler - specification of slowest/fastest supported is explicit rate or systemSlowest/systemFastest expressions. certified apps should be explicit, but supplied apps can run based on system settings.
- IT must ensure windows occur at necessary frequency, and total budgets meet needs of scheduler
- CVT rule: the thread template rate must be within the slowest/fastest range of the scheduler.
- budgets are still normalized to fastest rate in system, not the scheduler
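The switching rule in the proposal (a requested WAT change takes effect only when the WAT being switched away from completes its own duration, and the next tick is window 0 of the new WAT) can be sketched as a small tick-level model. This is an illustration only: `Wat`, `WatTimeline`, `request_switch`, and `step` are hypothetical names, and windows within a WAT are reduced to a tick index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wat:
    name: str
    duration_ticks: int  # WAT duration == major frame, in system ticks


class WatTimeline:
    """Hypothetical model of WAT switching with per-WAT durations."""

    def __init__(self, initial):
        self.active = initial
        self.pending = None
        self.tick_in_wat = 0

    def request_switch(self, new_wat):
        # The change is only recorded here; it is applied at the WAT boundary.
        self.pending = new_wat

    def step(self):
        """Run one system tick and return (WAT name, tick index within it)."""
        executed = (self.active.name, self.tick_in_wat)
        self.tick_in_wat += 1
        if self.tick_in_wat == self.active.duration_ticks:
            # End of the current WAT's duration: start at tick 0 of the
            # pending WAT if there is one, otherwise repeat the current WAT.
            self.tick_in_wat = 0
            if self.pending is not None:
                self.active, self.pending = self.pending, None
        return executed


startup = Wat("startup", 4)
normal = Wat("normal", 2)

timeline = WatTimeline(startup)
trace = [timeline.step()]
timeline.request_switch(normal)  # requested during tick 1 of "startup"
trace += [timeline.step() for _ in range(6)]
```

Here `trace` shows "startup" running all four of its ticks before "normal" begins at tick 0, and "normal" then repeating every two ticks. The shorter WAT never waits for the longest WAT duration in the system.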
Frame sync with pending WAT change
- Add a status code to PAL. If a WAT change is pending, do not perform the frame sync and return that status instead.
- Use attribute on function to force return code to be checked.
- Alternate: frame sync wins and WAT change delayed until next WAT start.
- Should not be an issue for powerTransient (moving one tick). Could still do the new WAT start if pending.
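The first option above (refuse the frame sync and report a status while a WAT change is pending) could look roughly like the following model. The names `FrameSyncStatus`, `WAT_CHANGE_PENDING`, `Pal`, and `frame_sync` are made up for illustration and are not the real PAL interface. The alternate option, where frame sync wins, is not modeled.

```python
from enum import Enum


class FrameSyncStatus(Enum):
    SUCCESS = "success"
    WAT_CHANGE_PENDING = "watChangePending"  # hypothetical new status code


class Pal:
    """Hypothetical stand-in for the PAL side of a frame sync request."""

    def __init__(self):
        self.wat_change_pending = False
        self.frame_syncs_applied = 0

    def frame_sync(self):
        # Do not perform the sync while a WAT change is pending; the caller
        # learns about it through the returned status.
        if self.wat_change_pending:
            return FrameSyncStatus.WAT_CHANGE_PENDING
        self.frame_syncs_applied += 1
        return FrameSyncStatus.SUCCESS


pal = Pal()
first = pal.frame_sync()       # no WAT change pending: sync is applied
pal.wat_change_pending = True
second = pal.frame_sync()      # refused: sync is not applied
```

Python has no way to force a caller to inspect the result. In the C or C++ code the "attribute on function" item would more likely map to something like GCC's `warn_unused_result` attribute or C++17 `[[nodiscard]]`.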
Other issues
- There was an issue for threads CFP and running in next period (has period passed since I've been created)
- See PCR:13504. Make sure this has been addressed for scheduler switches.
- Can we blow the stack if frame sync/warmstart is raised multiple times to inactive schedulers (may be a current issue)?
Risk Factors
- schedule before - should not be since WAT switch is at hyperperiod
- idle is fastest rate in system, as opposed to scheduler. Needs change.
- logic for window boundary has been crossed. window cannot span tick boundaries. deferred TLB. Should all still be valid.
- Is there a slack accumulator issue? Replenished at every tick. Trickles down until hyperperiod. Go back up to fastest... Ensure fastest rate in this scheduler end condition.
- Since WATs may have different durations, ensure PAL logic is correct for setActiveWAT and time accumulation for subsequent time APIs
- Are all places documented where schedulers need to be activated at the fastest rate? Data structures and analytical (time availability and response time)
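To make the time accumulation risk for setActiveWAT concrete, here is a small arithmetic sketch. It is not taken from the PAL code. Both function names are hypothetical, and the "naive" variant is only one way the calculation could go wrong once WAT durations differ.

```python
def elapsed_ticks(completed_wat_durations, ticks_into_active_wat):
    """Elapsed ticks when every completed WAT contributes its own duration."""
    return sum(completed_wat_durations) + ticks_into_active_wat


def naive_elapsed_ticks(frames_completed, active_wat_duration, ticks_into_active_wat):
    """Illustrative shortcut that is only valid if all WATs share one duration."""
    return frames_completed * active_wat_duration + ticks_into_active_wat


# Two 4-tick WAT frames, then one 2-tick frame, then 1 tick into the active WAT.
history = [4, 4, 2]
correct = elapsed_ticks(history, 1)
naive = naive_elapsed_ticks(len(history), 2, 1)
```

With the history above, `correct` is 11 ticks while `naive` is 7, so any time API that derives elapsed time from a frame count and the current WAT duration would drift after a switch between WATs of different lengths.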
Future Possibilities (Not planned)
warmstart scenario
- should be separate purchase. Not promised to date that we're aware. TODO: Check with BC.
- Just CFFS needed? Any portion of network?
- Likely would require a frame sync type of approach to switch now.
- relax some constraints - may not get full budget. Exception tells you it occurred.
- Time will be discontiguous. May not be able to switch to another schedule without coldstart.
- non-trivial cost - must visit every thread (in new WAT, or all if able to continue); may want to distribute to thread to detect scheduler change, like window switch
- If raise exceptions to threads (only to destination schedulers).
- Should raise exception be more like SOW exception at thread switch to user?
- If there is a schedule switch exception, don't need SOW at the same time. It becomes implied for that window.
Eliminate RMA model for some collection of schedulers
- No guaranteed budget - 653 already using min budget, slack
- Fewer constraints for 2nd level scheduler
- No schedule befores. Eliminate need to visit threads.
- Bound worst case may not be possible
- SB is also independent topic
- Could eliminate schedule before from kernel. Make an integration issue creating more priorities.
- # priorities is O(1) for crittime
- bitarray can help with readying, but list merge with threads at all priorities is an issue
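For the readying side of the last point, a bit-array ready structure is commonly built along these lines. This is a generic sketch, not the kernel's data structure: `ReadyBitmap` and its methods are hypothetical, and it assumes that a higher number means a higher priority. It does not address the list merge issue noted above.

```python
from collections import deque


class ReadyBitmap:
    """Generic bitmap-indexed ready queues, one FIFO per priority level."""

    def __init__(self, num_priorities):
        self.bits = 0  # bit p is set while the queue for priority p is non-empty
        self.queues = [deque() for _ in range(num_priorities)]

    def make_ready(self, priority, thread):
        self.queues[priority].append(thread)
        self.bits |= 1 << priority

    def pop_highest(self):
        """Remove and return the next thread at the highest ready priority."""
        if self.bits == 0:
            return None
        # Index of the highest set bit. On hardware with a fixed-width word
        # this would typically be a single count-leading-zeros instruction.
        priority = self.bits.bit_length() - 1
        thread = self.queues[priority].popleft()
        if not self.queues[priority]:
            self.bits &= ~(1 << priority)
        return thread


ready = ReadyBitmap(8)
ready.make_ready(2, "a")
ready.make_ready(5, "b")
ready.make_ready(5, "c")
order = [ready.pop_highest() for _ in range(4)]
```

Threads come out in priority order, with FIFO order inside one priority, and `None` once nothing is ready. The cost of finding the next thread depends on the word width, not on the number of ready threads.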