CFFS Project/hosmer mal

Hosmer - NOR Flash MAL

This section represents the software component labeled "CFFS MAL" in the diagram on the CFFS project page.

Activities

Description | Assignee | Status | Remarks
Requirements Development | Sergio, Eduardo | Done |
Requirements Review | Sergio, Eduardo | Done |
Code Development | JK | Done |
Code Review | Sergio | Done |
Test Case Development | Sergio, Eduardo | Done |
Test Case Review | Sergio, Eduardo | Done |
Test Procedure Development | Sergio, Eduardo | Done |
Test Procedure Review | Sergio, Eduardo | Done |
Executable Object Code Analysis | N/A | N/A |
Software Lifecycle Audit | SQA | Done | PCRs [9040] and [9065]
Document Build and Test Environments | Jerry | Done |
Requirements Coverage Analysis, including Traceaid Qualification | Eduardo | Done |
Compiler Assessment | Jerry | Done |
ABC Tool Qualification | Jerry | Done |
Structural Coverage Analysis | JK | Done |
Conformity Inspection - SQA Build Witness | SQA | Done |
Integration Review | Jerry | Done |
Run for Score, including SQA Witnessing, and Test Results Review | Jerry, Kelly | Done |
Verification Audit | SQA | Done |
Open Problem Reports List | Jerry | Done |
Report Document Development (SLCECI, SCI, SAS) | TBD | Done |
Report Document Review | TBD | Done |
Population of Certification Archive (PCA) | Jerry | Done |
Software Conformity Audit | SQA | Done |


Tests for Coverage

Test ID Description Status Priority Assignee Misses
CovTest_01a Case when the Device Bank Switching Threshold is reached, so the MAL needs to switch to another device. First it finds that the device with the lowest erase count SDL is more than EB Age Limit below the highest erase count SDL; but it has less than 3 Clean EB's. So, at the end the MAL selects the cleanest Device Bank. Done (See tpc017 - TC 100) Medium emonge MAL_Data.cpp,541
MAL_Data.cpp,541-2
CovTest_01b We need to change the default MAL_EB_AGE_LIMIT env var.

Need testing around the EB age limit provided in the env var, could be as simple as setting the env var to a low age limit like 1 or 2, and creating fragmented/defrag opportunities. Goal is for code to defrag an EB that exceeds the age limit, even though there is a "dirtier" EB that would normally be selected.

Done (See tpc019 - TC 100 & 110) Medium emonge MAL_Data.cpp,320
MAL_Data.cpp,320-2
MAL_Data.cpp,320-3
MAL_Data.cpp,544
CovTest_02 Change the default NOR_MAL_API_RESOURCE env var. Use a platform resource name different from the default, and an invalid one.

Done (See tpc007 - TC120) Medium emonge MAL_Data.cpp,345
MAL_Data.cpp,346
MAL_Data.cpp,358


CovTest_03 May be protective code, is it possible to NOT have a current SDL, and be in stkDefrag? A test point could create this path, just not sure it's possible otherwise. stkDefrag does not make sure there is a current SDL before calling this routine, so there is no protection there. Perhaps after a LLF, where every EB fails, but only on one bank, then defrag is called on that bank will cause this.

See testPoint02

Pending Low TBD MAL_Data.cpp,854-2 [3401]
CovTest_04 1) Need 2 WLE's pointing to the same EB, finding the one with the higher chrono first.

2) Need to find 2 WLL's, 2nd one being Older than first, so 2nd one found is NOT selected as the current. Second WLL must not be full, it must have an all 0xFF entry at the end.

Test: Find a WLL, copy it to RAM so you can manipulate it, decrease the chrono and recalculate the CRC for the header and every WLE, then program the modified EB data into a blank EB. Now you will have two WLLs in which the second is older than the first, and its entries are older than the first WLL's entries.

Done (See tpc021) High ssalazar MAL_Data_Scan.cpp,130 [842]
MAL_Data_Scan.cpp,153 [964]
MAL_Data_Scan.cpp,153-2 [969]
MAL_Data_Scan.cpp,285 [1788]
MAL_Data_Scan.cpp,299 [1851]
MAL_Exec_Scan.cpp,188 [693]
MAL_Exec_Scan.cpp,191 [706]
MAL_Exec_Scan.cpp,191-2 [707]
MAL_Data_Scan.cpp,271 [1723]


CovTest_05 For some reason the compiler first checks if the wear Level Flag is below or above the Program Failed flag 0xFF07. After that it checks every flag listed in the switch one by one. The hole is that the flag is never less than 0xFF07 (Program Failed) and different from 0xFF00 (Dirty).

Code should be updated to remove this path, what if we used else-if logic instead of switch()?

Done (branch was removed) Low jKelley MAL_Data_Scan.cpp,379 [2247]
MAL_Data_Scan.cpp,379-4 [2248]
CovTest_06

Case 1: Need to start a program during maintenance, when the timer is off, a Read Request is found in the queue before DfWrite state machine is able to check for completion. Then the Read realizes the program failed.

Case 2: Similar to case 1. Start an erase during maintenance, when the timer is off, a Read Request is found in the queue before erase state machine is able to check for completion. Then the Read realizes the erase failed.

Note: The Read must be on the same device where the erase/program is in progress. testPoint03 may be good enough so we don't need to actually be erasing or programming when the Read comes.

Done (See tpc018 - TC100) High emonge MAL_Device.cpp,122
MAL_Device.cpp,130
MAL_Device.cpp,238
MAL_Device.cpp,238-2
MAL_Device.cpp,280


CovTest_07 Need the case where there is a program in progress, then a Read comes, so it waits for the program to complete. Done (See tpc018 - TC100) High emonge MAL_Device.cpp,249-2


CovTest_08 Case where during maintenance, there is no valid SDL, and the number of cleans = 1 or 0. May need to "dirty" one or more EB's before starting the MAL so numCleans is 1, but there are some dirties. Pending Low TBD MAL_Device_Maint.cpp,86-4 [149]
CovTest_09 We need to defrag the current wear level sector. It looks like the WLL has never been the dirtiest EB available, so it has never been selected for defrag. Given enough time (writes/erases, WLL entries) and allowing the MAL to clean itself, this path can be reached.

Note: Make sure the number of dirty entries in the WLL is greater than the WLD threshold.

Done (See tpc019 - TC 130) High emonge MAL_Data.cpp,837
MAL_Device_Maint_StkDefrag.cpp,82 [613]
MAL_Device_Maint_StkDefrag.cpp,83 [618]
MAL_Device_Maint_StkDefrag.cpp,109 [659]
MAL_Device_Maint_StkDefrag.cpp,112 [667]
MAL_Device_Maint_StkDefrag.cpp,116 [674]
CovTest_09b Protective code. Missing the case when an EB that was defragmented is not set as dirty by the defrag mechanism. Pending High TBD MAL_Device_Maint_StkDefrag.cpp,148 [752]
MAL_Device_Maint_StkDefrag.cpp,150 [759]


CovTest_11 Race condition, comes and goes. Requires maintenance running while writing. One bank has to be writing the same LBA as the other bank is defragging it, such that after the defrag write, before the MAP update, the other bank has updated the MAP with a fresh write. Must have gotten accidental coverage before; leave this comment until a specific test (or test point) guarantees this path is covered.

The exec halt testpoint can be used for this: write an EB full of LBA's, erase hint all but one of them (LBA-X) so we know that LBA-X will get defragged. Turn on HALT testPoint01, and turn on maintenance. Run one (or two?) loops so defrag starts, then put a write request for the same LBA-X in the queue, and run. Afterwards, make sure that the last LBA-X data is read back and doesn't match the defragged data. See testPoint01

Pending Low TBD MAL_Device_Maint_StkDefrag.cpp,302
MAL_Device_Maint_StkDefrag.cpp,307
MAL_Device_Maint_StkDefrag.cpp,307-2


CovTest_12 Need a program error during the "Finalize" step in defrag. See testPoint04 Done (See tpc022) High ssalazar MAL_Device_Maint_StkDefrag.cpp,180
MAL_Device_Maint_StkDefrag.cpp,197-2
MAL_Device_Maint_StkDefrag.cpp,366
MAL_Device_Maint_StkDefrag.cpp,384
MAL_Device_Maint_StkFinalize.cpp,121
MAL_Device_Maint_StkFinalize.cpp,131-2 [305]
MAL_Device_Maint_StkFinalize.cpp,139 [313]
MAL_Device_Maint_StkFinalize.cpp,139-2 [314]
MAL_Device_Maint_StkFinalize.cpp,143 [319]


CovTest_13 Need case where StkFullWait exits when a dirty magically appears. Done (See tpc016) Medium emonge MAL_Device_Maint_StkFullWait.cpp,35 [49]
MAL_Device_Maint_StkFullWait.cpp,35-2 [50]
MAL_Device_Maint_StkFullWait.cpp,47 [81]
MAL_Device_Maint_StkFullWait.cpp,53 [86]
MAL_Device_Maint_StkFullWait.cpp,61 [92]
MAL_Device_Maint_StkFullWait.cpp,62 [99]


CovTest_14 Need a case where there are no cleans, but there are dirties while trying to get a new SDL. Pending Low TBD MAL_Device_Maint_StkNSDL.cpp,69 [406]
MAL_Device_Maint_StkNSDL.cpp,83 [421]
MAL_Device_Maint_StkNSDL.cpp,87 [435]
MAL_Device_Maint_StkNSDL.cpp,98-2 [438]


CovTest_15 Need case where a new WLS is needed, but there are NO cleans or dirties, causing the FullWait state to be entered. This would be an error path, may need prep by test before starting MAL, or a Test Point? Pending Low TBD MAL_Device_Maint_StkNWLS.cpp,58 [297]
MAL_Device_Maint_StkNWLS.cpp,61 [303]
MAL_Device_Maint_StkNWLS.cpp,62 [316]
MAL_Device_Maint_StkNWLS.cpp,69 [326]
MAL_Device_Maint_StkNWLS.cpp,89 [329]


CovTest_16 Need case where there are no cleans, but there is a dirty that can be erased and selected as the new WLS. Pending Low TBD MAL_Device_Maint_StkNWLS.cpp,76 [331]
MAL_Device_Maint_StkNWLS.cpp,80 [345]
MAL_Device_Maint_StkNWLS.cpp,89-2 [348]


CovTest_17 Need case where writing the WLS header on the newly selected EB fails. This test is implemented with CovTest_18 Done (See tpc004, Test Case 140) High ssalazar MAL_Device_Maint_StkNWLS.cpp,134
MAL_Device_Maint_StkNWLS.cpp,141-2
MAL_Device_Maint_StkNWLS.cpp,151
MAL_Device_Maint_StkNWLS.cpp,151-2
MAL_Device_Maint_StkNWLS.cpp,155


CovTest_18 Need case where we need to write WLD, and the current index indicates the current WLS is exactly full (has 8191 entries). Afterwards we need to restart the MAL so that the full WLL can be detected during scan. Done (See tpc004, Test Case 140) High ssalazar MAL_Device_Maint_StkWWLD.cpp,57 [424]
MAL_Exec_Scan.cpp,172-4 [773]


CovTest_19 We need to suspend a bank for reading, but the bank needs to return not-ready at least once while the read state machine is checking its status. This branch would only occur if a device was commanded to suspend, and it was not suspended in the spec'd amount of time we wait. See testPoint03 Done (See tpc018 - TC100) Medium emonge MAL_Exec_Read.cpp,116 [599]
MAL_Exec_Read.cpp,126-3 [645]
MAL_Exec_Read.cpp,133 [663]


CovTest_20 Need a case where a SDL entry fails CRC and is not all FF's, this simulates power fail while writing a log entry. Done (See tpc021 - TC120) High emonge MAL_Exec_Scan.cpp,290 [988]
MAL_Exec_Scan.cpp,299 [993]


CovTest_21 Need more robustness testing around the SDL "align-up" feature. While startup is scanning a SDL EB, it runs into a spot where a log could be, but fails the CRC. It then checks to see if there is a log at the next aligned-up location (1K boundary). Need cases where this happens when the current log (the one that failed) is already on a 1K boundary, and also when the last known data pointer is at that next 1K boundary. Create these scenarios and let MAL startup hit these paths; there may be 4 cases here, but they can be done on 4 different SDL's with a single startup. Pending Low TBD MAL_Exec_Scan.cpp,368 [1246]
MAL_Exec_Scan.cpp,368-2 [1250]


CovTest_22 Startup needs a case where it finds an unfinalized SectorData512 log type. Basically an interrupted defrag where defrag wrote a SectorData512 log type. testPoint06a and testPoint06b were used to halt before and after finalization respectively, before the MAL is restarted. Done (See tpc023) Medium ssalazar MAL_Exec_Scan.cpp,396 [1335]
MAL_Exec_Scan.cpp,399 [1339]


CovTest_23 Need test where there is a FULL SDL, and the last entry in it was from defrag and was NOT finalized, similar to previous test, but requires the log to be full. I.E. The last log pointer, if incremented to the next log location, would cross into the data area. Achievable using testPoint01 to halt before finalization, and restarting MAL? Pending Medium TBD MAL_Exec_Scan.cpp,409 [1362]


CovTest_24 Need a case where the current bank is selected for the next SDL, but while acquiring that SDL, it has enough failures, that the bank is considered "dead", no cleans, no dirties. This could possibly get caught with the test that tries to fill the MAL, but needs to return to the write exec to hit this path. Pending Low TBD MAL_Device.h,244 [14]
MAL_Exec_Write.cpp,624 [1628]
MAL_Exec_Write.cpp,629 [1635]
MAL_Exec_Write.cpp,629-2 [1642]
MAL_Exec_Write.cpp,629-3 [1650]
MAL_Exec_Write.cpp,632 [1656]
MAL_Exec_Write.cpp,633 [1666]
MAL_Exec_Write.cpp,643-3 [1671]
MAL_Exec_Write.cpp,639 [1673]


CovTest_25 Robustness, this case is where we borrow/dequeue an extra request because current is odd. Borrowed request needs to NOT fit entirely in the current SDL. Try this: low level format, then enqueue 1 Write of 509 server sectors followed by a Write of >=3 server sectors, then start the transfers. Done (See tpc010 - TC100) High emonge MAL_Exec_Write.cpp,366 [1089]
MAL_Exec_Write.cpp,384 [1096]


CovTest_26 Need case where current SDL is failed or dirty while a Write Request is being processed.

Test: Low level format. Enqueue the following three requests: Write 511 server sectors, EH those sectors, Write any number of server sectors. Make sure regular maintenance is disabled before enabling the transfer. The goal is that the current SDL is Dirty when the third request wants to write it.

Pending High TBD MAL_Exec_Write.cpp,501 [1385]
MAL_Exec_Write.cpp,503 [1388]


CovTest_27 Need scenario where SDL has exactly enough room for a 1-Log, 1K write (32 Bytes in log area, 1K in Data area), then perform a write.

Try this:

1)Low Level Format.

2) Do 241 Writes of 1 server sector each, executing only one at a time so the MAL does not combine them. This will cause 240 17/32 KB to be allocated in the Log Area.

3) Do 14 Writes of 2 server sectors each, enqueue and execute all of them at once. Now the layout will look like this: Data Area = 14KB, Log Area 240 31/32 KB. Therefore the remaining space will be Data Area=1KB, Log Area = 1 entry, what we wanted.

4)Execute 1 Write of 2 server sectors, this should hit the path.

Done (See tpc020 - TC 100) High emonge MAL_Exec_Write.cpp,536-3 [1437]


CovTest_28 Analysis or test point? Reset logic, guess HW is always devReady quick enough after a reset that the loop never loops. Pending Low TBD spansion.cpp,47 [121]


Clean Function

Objective: give the user complete control over when the Maintenance features occur (i.e., an on/off switch), in order to ensure fast write performance.

  1. How to use the "on/off switch"
    1. Configuration
      1. The MAL and Client will attach to a new RAM-based Platform Resource
        to facilitate 2-way transfer of the API parameters. The client
        attaches to this resource and passes its virtual address to the
        API interface. This is identical to the cffs API interfaces.
        The default resource name will be NOR_MAL_API_Resource_RW, changeable via a new environment variable NOR_MAL_API_RESOURCE.
      2. The environment variable MAL_DF_CLEAN_LIMIT will be replaced with MAL_CLEAN_LIMIT to be
        the default desired number of clean media sectors, and its new default
        value adjusted to become "completely clean" such that upon start up the
        MAL will immediately start cleaning itself until a client lowers this
        value via the interface.
      3. The environment variable MAL_DF_DATA_BYTE_LIMIT will be removed since
        it may conflict with the desired MAL_CLEAN_LIMIT.
    2. Guidelines/Restrictions
      1. The client and MAL are expected to be of the same design assurance level,
        so limited or no parameter validation will occur. Setting the clean
        limit unreasonably high will simply cause the MAL to clean as far as it
        can. Setting the value too low may cause the MAL to shut down due to lack
        of space.
      2. The minimum floor of the MAL_CLEAN_LIMIT is 2 sectors per bank. For example, if the
        client needs 200 media sectors on a 2-Bank system, set the MAL_CLEAN_LIMIT to
        204. Then, when the cleaning is complete, set the value to 4. Setting
        the floor higher will allow more headroom.
      3. The actual clean count is already available in the malInfo structure.
      4. Write requests during the cleaning process will still be honored as before.
        The clean function only has the effect of delaying/halting the cleaning process.
      5. The "cleaning process" includes erasing dirty sectors first, followed by
        de-fragmenting and erasing if necessary to achieve the desired clean count.
      6. The client should not change the MAL_CLEAN_LIMIT until after the server's alive
        count has indicated the file system is ready for commands.
    3. Run-time use
      1. No action is required at start up. Cleaning toward the default MAL_CLEAN_LIMIT will
        begin immediately. When faster throughput is desired, and there are sufficient
        clean sectors available, the client may then change the MAL_CLEAN_LIMIT to a lower
        value (like the floor of 2 * NumBanks) to pause the cleaning process.
      2. When the client desires the cleaning process to resume, they may change the
        value to a larger number (like numNeeded + floor) to resume the cleaning process.
      3. When the MAL either reaches the limit, or cannot clean any further, it will set the
        cleanStatus flag in the interface to 1. While cleaning, this status will be zero.
      4. When the cleanStatus flag is set, there are two possible outcomes. Either the clean
        limit was reached, verifiable via the malInfo structure's numCleanSectors, or there
        were insufficient dirty/healthy media sectors available to achieve the clean limit
        (i.e., files will need to be erased to achieve a higher clean count).
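
The limit arithmetic and the two cleanStatus outcomes described above can be sketched as follows. This is only an illustration: the helper names (cleanLimitFloor, cleanLimitFor, classifyCleanStatus, CleanOutcome) are hypothetical and not part of the MAL, and comparing numCleanSectors against the requested limit is an assumed way of telling the two outcomes apart.

```cpp
#include <cassert>
#include <cstdint>

// Floor stated in the guidelines: 2 clean media sectors per bank.
const std::uint32_t kMinCleanPerBank = 2;

// Lowest sensible MAL_CLEAN_LIMIT for a system with numBanks banks (hypothetical helper).
inline std::uint32_t cleanLimitFloor(std::uint32_t numBanks)
{
    assert(numBanks > 0);
    return kMinCleanPerBank * numBanks;
}

// Value to write when the client needs numNeeded clean sectors on top of the floor.
inline std::uint32_t cleanLimitFor(std::uint32_t numNeeded, std::uint32_t numBanks)
{
    return numNeeded + cleanLimitFloor(numBanks);
}

enum class CleanOutcome { StillCleaning, LimitReached, CannotCleanFurther };

// Interprets cleanStatus together with the clean count read from malInfo.
// Assumption: "limit reached" means numCleanSectors >= the requested limit.
inline CleanOutcome classifyCleanStatus(std::uint32_t cleanStatus,
                                        std::uint32_t numCleanSectors,
                                        std::uint32_t cleanLimit)
{
    if (cleanStatus == 0) {
        return CleanOutcome::StillCleaning;
    }
    return (numCleanSectors >= cleanLimit) ? CleanOutcome::LimitReached
                                           : CleanOutcome::CannotCleanFurther;
}
```

For the 2-bank example in the guidelines, cleanLimitFor(200, 2) gives 204 and cleanLimitFloor(2) gives 4, matching the values quoted there.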

Interface

  1. No new deliverable software is required for this interface. Macros
    or inline functions can be used to pass the data. There is only
    one parameter, which affects the MAL_CLEAN_LIMIT, and a flag to indicate
    the value has changed.
  2. The client side should only change the parameter when the flag is clear, and
    set the flag when a new value has been written.
  3. The first UNSIGNED32 of the resource will be the flag, the 2nd UNSIGNED32
    will be the new MAL_CLEAN_LIMIT value. Flag = 1 indicates a new MAL_CLEAN_LIMIT value
    is present. The MAL will reset the flag to zero after it has captured the new value.
  4. The 3rd UNSIGNED32 is the real-time status of the cleaning function (cleaning = 0,
    clean complete = 1).
  5. Examples:
   typedef struct
   {
     UNSIGNED32 valueChanged;   // 1 tells the MAL there is a new clean limit; the MAL resets it to 0
     UNSIGNED32 cleanLimit;     // Clean limit to use
     UNSIGNED32 cleanStatus;    // Real-Time Status: 0 == Cleaning, 1 == Clean Complete
   } norMalResourceDef;

   // resourceVA is not const here because the function writes to the resource.
   inline void norMalSetCleanLimit(void * resourceVA, UNSIGNED32 cleanLimit)
   {
     // Precondition: the caller must make sure the flag is zero before updating
     // the limit. We don't want branches in an inline function, so that check
     // needs to be done elsewhere.
     // The resource is shared with the MAL, so it is accessed as volatile.
     volatile norMalResourceDef * res = (volatile norMalResourceDef *)resourceVA;
     res->cleanLimit = cleanLimit;   // write the new value first
     res->valueChanged = TRUE;       // then raise the flag
   }

   inline BOOL MALIsClean(const void * resourceVA)
   {
     return (((const volatile norMalResourceDef *)resourceVA)->cleanStatus == 1);
   }
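
The flag handshake in items 2 and 3 can be sketched from both sides as below. This is a minimal illustration under stated assumptions, not the MAL implementation: ResourceSketch is a stand-in for norMalResourceDef with std::uint32_t in place of UNSIGNED32, and clientTrySetCleanLimit and malCaptureCleanLimit are hypothetical names. The client-side check is placed in a regular function here, since the example above keeps branches out of the inline setter.

```cpp
#include <cstdint>

// Stand-in for norMalResourceDef so the sketch is self-contained.
struct ResourceSketch
{
    std::uint32_t valueChanged;  // 1 == new clean limit present
    std::uint32_t cleanLimit;    // requested MAL_CLEAN_LIMIT
    std::uint32_t cleanStatus;   // 0 == cleaning, 1 == clean complete
};

// Client side (hypothetical): only write when the flag is clear.
// Returns false if the MAL has not yet captured the previous value.
inline bool clientTrySetCleanLimit(volatile ResourceSketch* res, std::uint32_t newLimit)
{
    if (res->valueChanged != 0) {
        return false;
    }
    res->cleanLimit = newLimit;  // value first
    res->valueChanged = 1;       // then the flag
    return true;
}

// MAL side (hypothetical): capture a pending value and clear the flag.
// Returns true if activeLimit was updated.
inline bool malCaptureCleanLimit(volatile ResourceSketch* res, std::uint32_t* activeLimit)
{
    if (res->valueChanged == 0) {
        return false;
    }
    *activeLimit = res->cleanLimit;
    res->valueChanged = 0;
    return true;
}
```

A client would call clientTrySetCleanLimit again later if it returns false. On targets where the client and the MAL can run on different cores, volatile alone may not be sufficient to order the two writes, so a platform-specific barrier could be needed between them.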

Returned Value

None

Warnings/Restrictions

  • Levy high DAL caller restriction

Design

  • Threshold independent of bank?

Yes

  • Maintenance independent of bank?

Yes

See Also