FileName: l2test.readme

L2 CACHE test program

History:
  6/15/99   Original code from myt
  9/09/99   Code modified by mlo
  10/10/99  750 l2test cache works correctly mlo
            Use symbolic names for all L2CR settings
            Use L2CR settings as defined by the MPC750 manual
            Improve the test by following the Corrected 9.1.5.2
            Cache Testing rules in Chapter 6.
  10/20/99  7400 l2test cache works correctly mlo
  4/10/00   Added description of ibat/dbat2 for maximer
  5/17/00   Change the Anomalies section to Notes on Results.

L2 backside cache test:
Performance Monitor elucidation:

Notes: dinkusr.h and dinkusr.s are from Dink32 V12.0. See Chapter 2 below.

Table of Contents:
  1. General Description
  2. Dink32 R12.0 versus Previous Releases
  3. File Organization
  4. Makefiles
  5. General Flow of Execution.
  6. Corrected User's Manual L2 Cache Testing Rules
  7. Example Results.
  8. Notes on Results.
  9. Caveats
  Appendix A: Initial settings for dbats, ibat, msr, and hid0,
              and description of ibat2/dbat2 for yellowknife versus maximer

1. General Description

This program can test the L2 cache for both the 750 and the 7400 PowerPC
processors. It can run two types of tests, both of them performing similar
operations. The first test exercises all the types of counting available
with the Performance Monitor. The second test, using the Performance
Monitor, follows the corrected version of the User's Manual to exercise the
L2 cache, reporting L2 hits and misses.

The user is queried for the PowerPC processor, an available range of memory
addresses for the tests, and a choice of the two types of tests listed
above. The results are printed to the display.

This program and its associated makefiles are designed to compile and link,
producing an elf and an srecord file. This srecord file can be downloaded by
dink32 and run from dink32. It uses the dink32 getchar and putchar (printf)
facility.
Since the implementation of this facility changed from R11.0.2 to R12.0,
there are two makefiles, one to build for dink32 R12.0 and the other for
dink32 R11.0.2 and earlier. The files dinkusr.s and dinkusr.h are only
available for dink32 R12.0 and beyond; see Chapter 2.

2. Dink32 R12.0 Versus Previous Releases

Dink32 R12.0 is the first release that supports dynamic dink32 function
addressing. It is implemented with two new files, dinkusr.s and dinkusr.h.
The Dink32 User's Manual R12 Appendix E has been expanded to include this
l2test facility, and Appendix G describes how to use this new dynamic
addressing feature. The dink32 R11.0.2 User's Manual Appendix E describes
the older technique of static dink32 function addressing as part of the
description of the two example programs, demo and dhrystone. See Chapter 4
in this document on the makefiles for more information.

Dink32 R12 also incorporates new code to enable the duart FIFO and bank two
of memory.

The FIFO allows downloading to proceed much more rapidly and obviates the
need to set the Properties|Settings|ASCII setup Line delay and Character
delay to any value other than 0. On previous versions of dink32, the FIFO
can be temporarily enabled with the following dink32 commands:

  Yellowknife and Sandpoint platforms:
    R12.0+               no changes needed
    R11.0.2              mm -b fe0003fa | set to 03
    Previous to R11.0.2  mm fe0003f8    | set to xxxx03xx, where xx
                                          represents the previous value.
  Excimer and maximer platforms:
    mm 40500000 | set to 03000000

Enabling memory bank 2 allows one to utilize the second set of 16 MB of
memory, doubling the available memory from 16 MB to 32 MB. On previous
versions of dink32, bank 2 can be temporarily enabled with the following
dink32 commands:

  Yellowknife and Sandpoint platforms:
    R12.0+   no changes needed
    R11.0.2 and previous: set these values in the mpc106/107 registers
      80 = 30201000
      84 = 70605040
      90 = 3f2f1f0f
      94 = 7f6f5f4f
      a0 = 03
      88 = 8c = 98 = 9c = 0 (or leave as set)
    One can use the rd mpc106 and rm mpc106 commands.
    E.g. rm mpc106 80 | set to 30201000
  Excimer and Maximer: not available

3. File Organization

The code consists of C and PowerPC assembly programs. The code is delivered
in a single flat directory. The files are:

  README           - Important features of this test.
  l2test.readme    - This file.
  l2test.c         - The main program, which queries the user, performs
                     the test, displays the results, and contains several
                     helper functions that can be performed in C.
  l2test.h         - The macro definition file: definitions for the L2CR
                     register and its parameters, other registers, and
                     various parameters used in l2test.c and l2testutils.s.

  NOTE: Each build uses either l2testutils.s or l2testutils.c, but not both.

  l2testutils.s    - Contains various functions that cannot be performed
                     in C, such as setting or reading registers.
  l2testutils.c    - Contains dummy functions for all the functions
                     defined in l2testutils.s.
  localio.c        - A localized version of scanf to read characters from
                     the console, i.e. the duart.
  makefile         - The makefile used to make the test program for
                     dink32 R12.
  makefile_dink11  - The makefile used to make the test program for
                     dink32 R11.0.2.
  750_l2test.prt   - Example output from running the MPC750 with the
                     L2 cache test.
  750_monitor.prt  - Example output from running the MPC750 with the
                     Performance Monitor elucidation.
  7400_l2test.prt  - Example output from running the MPC7400 with the
                     L2 cache test.
  7400_monitor.prt - Example output from running the MPC7400 with the
                     Performance Monitor elucidation.

There are two ancillary files used for dink32 R12.0 and beyond, and
therefore the makefile is designed to be run from a subdirectory of dink32.
If the user wishes to make it in a standalone directory, change the makefile
and any files that reference these two files to point to their new
locations. If modifying this code as a standalone program, either define
another technique for printf and scanf, or remove those lines from the code.
In either case dinkusr.s and dinkusr.h will most likely not be used.

  dinkusr.s - supplied with dink32, implements the linkage to dink32
              functions such as printf, getchar, etc.
  dinkusr.h - header file for code which uses dinkusr.s.

4. Makefiles

There are three makefiles:

  1. makefile, which is used to build l2test when used with dink32 R12.0
     and beyond. This makefile includes targets to build dinkusr.o from
     dinkusr.s and dinkusr.h.
  2. makefile_dink11, which is used to build l2test when used with dink32
     R11.0.2 and earlier. This makefile does not include dinkusr.o targets.
  3. makefile_gcc, which is used to build l2test when used with dink32
     R12.0 and later.

The first two makefiles have the same targets; makefile_gcc has only the
ppc target:

  1. ppc
     This target makes l2test.src and l2test (elf) using the metaware
     compiler installed on a unix machine. The user will have to ensure
     that the paths are correct for this compiler. It generates PowerPC
     code. It uses l2testutils.s.
  2. cleanppc
     Removes only files associated with ppc.
  3. unix
     This target makes l2test.out using the unix gcc compiler. It generates
     host code (for unix) and can run on the host machine. It is used to
     display the execution flow; it does not test anything. The user will
     have to ensure that the paths are correct for this compiler. It uses
     l2testutils.c.
  4. cleanunix
     Removes only files associated with unix.
  5. ppc_pc
     This target makes l2test.src, l2test.txt, and l2testp (elf) using the
     metaware compiler installed on an NT PC desktop. The user will have to
     ensure that the paths are correct for this compiler. It generates
     PowerPC code. It uses l2testutils.s.
  6. cleanppc_pc
     Removes only files associated with ppc_pc.
  7. cleanall
     Removes all files associated with all the targets.

Both makefiles make the same executables:

  l2test, l2testp        - identical elf PowerPC executables
  l2test.out             - host executable
  l2test.src, l2test.txt - identical s record files
  xref.txt               - cross reference report for the build
The makefiles use the following user modifiable definitions.

  DEBUG = -DDEBUG
      Default is commented out. When not commented out, this generates
      more copious output indicating when each function is called and what
      values are returned. Used primarily for debugging.

  DINKR12 = -DDINKR12
      makefile:        Default is defined. It is used to generate
                       conditional code to support dink32 R12 and beyond.
                       Do not remove it.
      makefile_dink11: Default is null. It is used (by its absence) to
                       generate conditional code to support dink32 R11.0.2
                       and earlier.

Target UNIX build:

The unix build is independent of the dink32 version, therefore
makefile_dink11 explicitly uses -DDINKR12. This build is a partial
simulation. It is used primarily for debugging the C code on the host unix
machine. In this manner, printf, formats, scanf, and mathematical
manipulation can be debugged without the need to download the code to the
target. It is only a partial simulation because it does not simulate the
operation of the L2 cache; rather, it just returns values from dummy
versions of all the l2testutils.s functions, supplied in l2testutils.c.

Code and Data location memory:

The code is relocated to start at address 0x90000, because dink32 utilizes
memory from 0x0 to 0x7ffff, and the stack runs from 0x80000 to 0x8ffff.
The memory locations are shown below, as indicated in the cross reference
listing, xref.txt.

  Section   type   start address   end address   size
  .text     text   00090000        000912db      000012dc
  .rodata   lit    000912e0        00091ba3      000008c4
  .sdata    bss    00091ba8        00091ba7      00000000
  .sbss     bss    00091ba8        00091bdf      00000038
  .sdata2   bss    00091be0        00091bdf      00000000
  .data     data   00091be0        00091bf6      00000017
  .bss      bss    00091bf8        00091bfb      00000004

5. General Flow of Execution.

The program either elucidates all the Performance Monitor counters, or
tests the L2 cache. The program must run in supervisor mode.

The two interposers used to develop this test are:
  A 750 interposer with 1/2 meg L2 cache.
  A 7400 interposer with 1/2 meg L2 cache.
Some 7400 interposers have 1 meg and others have half meg L2 cache. This l2
test was developed with a 750 and a 7400, both of which have a half meg L2
cache. To run the test with an interposer with more or less L2 cache, change
the values of MPC750_DISABLE, MPC7400_DISABLE, MPC750_CACHE_SIZE, and
MPC7400_CACHE_SIZE in l2test.h.

The Flow is:

Supervisor Mode is required

  1. Set up dink transfer base for printf for R12.0 or static addresses
     for R11.0.2.
  2. Set the msr, the bats, and the caches to known values.
     These values are hex constants embedded in the code as shown in
     Appendix A. To make changes, edit l2test.c and l2testutils.s.
  3. Query the user for processor, memory address, and test type.
  4. Call prepareForTest, which does the following:
     At reset, the L2 cache is disabled. If prior to this test the L2 cache
     has been enabled, then flush it and then disable it. In any case, it
     must be disabled.
     1. Initialize the L1 and L2 cache.
     2. Globally invalidate the L1 and L2 cache.
        The 750 calls global_L2_750_invalidate and the 7400 calls
        global_L2_7400_invalidate. The major difference is that the 7400
        must kill any stream touch instructions. Both functions flush both
        caches in cacheFlush, which reads 0x200000 (2 MB) into the cache.
        Ensure that your hardware supports addresses from 0x0 to 0x200000
        inclusive.
     3. Enable the L2 cache (the L1 cache is never disabled).
     4. Fill the entire L2 cache with zeros at known addresses,
        specifically, starting at the user supplied start address.
  5. Call one of the following four tests based on the user's input from
     step 3 above.
     1. Call the 750 l2 cache test, L2cacheTest, which follows the
        corrected steps outlined in UM 9.1.5.2.
     2. Call the 7400 l2 cache test, L2cacheTest, which follows the
        corrected steps outlined in Book IV 7.9.
     3. Call the 750 monitor elucidation, monitorTest.
     4. Call the 7400 monitor elucidation, monitorTest.
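
The sizing arithmetic behind steps 3 and 4 can be sketched in C as follows.
This is only an illustration under stated assumptions, not the code in
l2test.c: the macro and function names are hypothetical, the half-megabyte
cache size and 0x20-byte cache line are the values quoted in this document,
and the 0x100000 floor and the "at least twice the L2 cache size" rule are
the constraints the program prints when it asks for the address range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; l2test.h defines its own macros for these values. */
#define SKETCH_L2_CACHE_SIZE 0x80000u   /* half megabyte, as on the test interposers */
#define SKETCH_CACHE_LINE    0x20u      /* bytes per cache line */
#define SKETCH_MIN_START     0x100000u  /* dink32 and l2test occupy memory below this */

/* Number of cache lines that must be touched to fill a cache of this size. */
uint32_t l2_line_count(uint32_t cache_size)
{
    return cache_size / SKETCH_CACHE_LINE;
}

/* Returns 1 if the user range can be used for the test, 0 otherwise.
 * The range must start at or above SKETCH_MIN_START and span at least
 * twice the L2 cache size. */
int range_is_valid(uint32_t start, uint32_t end, uint32_t cache_size)
{
    if (start < SKETCH_MIN_START || end <= start)
        return 0;
    return (end - start) >= 2u * cache_size;
}

/* Worked check using the numbers from the example run in this document. */
void sizing_self_check(void)
{
    /* 0x80000 bytes / 0x20 bytes per line = 0x4000 lines. */
    assert(l2_line_count(SKETCH_L2_CACHE_SIZE) == 0x4000u);
    /* 0x100000..0x200000 spans exactly twice a half-megabyte cache. */
    assert(range_is_valid(0x100000u, 0x200000u, SKETCH_L2_CACHE_SIZE));
}
```

For an interposer with a 1 meg L2 cache, the same functions would be called
with 0x100000 as the cache size, and the user range would then need to span
at least 0x200000 bytes.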

The original 750 User's Manual and the 7400 Book IV are incorrect. See
Chapter 6 below for the correct sequence of steps.

The l2cacheTest and monitorTest functions are very similar in operation.
The only difference is:

  l2cacheTest uses only the appropriate Performance Monitor encodings to
  monitor and record L2 cache hits and misses.
    The encoding on the MPC750 for PMC1 and PMC3 is 7, and for PMC2 and
    PMC4 is 0. PMC1 counts the L2 hits, PMC3 counts the L2 misses.
    The encoding on the MPC7400 for PMC1 is 33 and for PMC2 is 26, and for
    PMC3 and PMC4 is 0. PMC1 counts the L2 hits, PMC2 counts the L2 misses.
  The output then displays just L2 hits and misses. See Chapter 7 for some
  example results.
    See the file 750_l2test.prt for a complete run on the MPC750.
    See the file 7400_l2test.prt for a complete run on the MPC7400.

  monitorTest sets the encodings for PMC1, PMC2, PMC3, and PMC4 to all
  possible values for the appropriate PC. The output then displays all
  possible results for the PMC counters.
    See the file 750_monitor.prt for a complete run on the MPC750.
    See the file 7400_monitor.prt for a complete run on the MPC7400.
  The 750 varies the encodings from 0 - 17 and the 7400 varies it from
  0 - 48. This is controlled by the macros MPC750_MAX_ENCODE and
  MPC7400_MAX_ENCODE in l2test.h. However, since the PMC3 and PMC4
  encodings are only 5 bits, on the 7400 test, PMC1 and PMC2 vary from
  0 - 47, but PMC3 and PMC4 start with 0 again after 31. So the results
  will show 31-31-31-31 followed by 32-32-0-0 in the display. See Chapter 7
  for a discussion of the PMC1-PMC2-PMC3-PMC4 output display.

The sequence of activity for both l2cacheTest and monitorTest is:

  1. Test the external SRAM. Write/Read in L2 cache range.
     Using the same addresses that have just been stored into the L2 cache,
     write the pattern 0xaaaaaaaa, which should get all hits. Then read the
     same addresses, which should also get all hits. Then compare the read
     values with the expected values at these locations; they should be
     equal.
     Then change the pattern to the complement, which is 0x55555555, and
     run the test again. Then do a series of walking ones in the pattern
     and test for each one. See Chapters 7 and 8.
  2. Test the L2 cache Tag Memory. Write out of the L2 cache range.
     Using the previous ending address as the start address and
     incrementing the start and end addresses by 0x1000, write a pattern
     and then read the pattern. The write will get all misses, and the read
     will get all hits. See Chapters 7 and 8.
  3. Continue testing the L2 cache Tag Memory. Read out of the L2 cache
     range.
     Perform a read to the next set of addresses (current address +
     0x1000). This will get all misses. However, the values read will be
     different from the current pattern (expected values) and will all be
     registered as Mis-matched. See Chapters 7 and 8.
  4. Continue testing the L2 cache Tag Memory. Write into L2 cache range.
     Perform a write to the last set of addresses (current address). This
     will get all hits. See Chapters 7 and 8.
  5. Step four of the L2 cache test in the manual is a general memory test.
     This test does not perform the general memory test.

6. Corrected User's Manual L2 Cache Testing Rules

The corrected version of MPC750 manual 10/13/99

9.1.5.2 L2 Cache Testing

A typical test for verifying the proper operation of the MPC750's L2 cache
memory (external SRAM and tag) would perform the following steps:

  1. Initialize the L2 test sequence by
     1. Setting L2CR[DO] and L2CR[TS] and performing a global invalidation
        of the L1 data cache and the L2 cache. The L1 instruction cache can
        remain enabled to improve execution efficiency.
        Recommendation is
        a. Do not turn off the L1 data cache. Required for the Performance
           Monitor to function as described.
        b. Do not turn off translation. Required to maintain cache
           inhibited spaces such as the IO space, which printf uses;
           turning off translation makes the IO space cacheable.
     2. Globally invalidating the L1 and the L2 cache, but not turning
        them off.
  2. Test the L2 cache external SRAM by
     1.
        Enabling the L1 data cache and executing a sequence of dcbz and
        dcbf instructions to initialize the L2 cache with a desired range
        of consecutive addresses and with cache data consisting of zeros.
     2. Once the L2 cache holds a sequential range of addresses,
        a. Turn the L2CR[TS] bit off.
        b. Execute a series of single-beat load and store operations
           employing a variety of bit patterns to test for stuck bits and
           pattern sensitivities in the L2 cache SRAM.
     3. The performance monitor can be used to verify whether the number of
        L2 cache hits or misses corresponds to the tests performed.
  3. Test the L2 cache tag memory by
     1. Enabling the L1 data cache and executing a sequence of dcbz and
        dcbf instructions to initialize the L2 cache with a wide range of
        addresses and cache data.
     2. Once the L2 cache is populated with a known range of addresses and
        data,
        a. Ensure the L2CR[TS] bit is off. Be aware, however, that
           references to addresses not physically available may then cause
           60x bus errors on nonexistent hardware or nonexistent memory.
           Note that setting or leaving L2CR[TS] on inhibits the L2 cache
           misses from being forwarded to the 60x bus interface, which will
           result in no misses ever occurring, and therefore no L2 misses
           will be registered with the Performance Monitor.
        b. Execute a series of store operations to addresses not previously
           in the L2 cache. These store operations should miss in every
           case.
        c. The L2 cache then can be further verified by reading the
           previously loaded addresses and observing whether all the tags
           hit, and that the associated data compares correctly.
        d. The performance monitor can also be used to verify whether the
           proper number of L2 cache hits and misses correspond to the test
           operations performed.
  4. The entire L2 cache can be tested by
     1. Clearing L2CR[DO].
     2. Restoring the L1 and L2 caches to their normal operational state,
        and executing a comprehensive test program designed to exercise all
        the caches.
     3.
        The test program should include operations that cause L2 hit,
        reload, and castout activity that can be subsequently verified
        through the performance monitor.

The corrected version of MPC7400 Book IV 10/13/99

The L2 test method is described on page 223, section 7.9 of Book IV, and
should be similar in the User's Manual when it has been released. Book IV
is also not correct. The 7400 test methodology is the same as the 750
listed above. The only real difference between this test for the 7400
versus the 750 is the global L2 cache invalidate procedure. So the steps to
test the L2 cache on the 7400 are identical to those for the 750 elucidated
above.

7. Example Results

Full examples of the 750/7400 with Performance Monitor test/L2 cache test
are included in the files 750_monitor.prt, 750_l2test.prt,
7400_monitor.prt, and 7400_l2test.prt.

This section describes the output and gives some example result snippets.

1. Description of querying the user:
   The user is presented with four choices:

   Which PowerPC processor: 750 or 7400?
       Choose the processor.
   start_addr is:
       This is the starting address of a memory area that can be used for
       writing and reading. For dink32, it must be 0x100000 or larger.
       Dink32 uses 0x0-0x8ffff, and l2test uses 0x90000-0x9ffff.
   end_addr is:
       This is the last address of a memory area that can be used for
       writing and reading. The range from start_addr to end_addr must be
       at least twice the size of the L2 cache.
   Which test: 1. Performance Monitor test, 2. L2 cache test

2. Example output:

     Performance Monitor or L2 Cache test programs
     Initialize the bats and HID0
     Begin Test
     Which PowerPC processor: 750 or 7400?
     750   (user answer)
     You chose: 750
     The MPC750 interposer has Half Megabyte of L2 Cache
     Specify start and end address range.
     Range must be at least twice the size of the L2 cache
     start_addr is: 100000   (user answer)
     end_addr is: 200000   (user answer)
     Which test: 1. Performance Monitor test, 2. L2 cache test
     2   (user answer)
     You chose test: 2

3.
   The program will continue with no more interaction from the user.
   It lists the steps it is performing, which agree with the corrected
   version of the Manuals. It is very verbose, giving the values of the
   MMCR0, MMCR1, PMC1, PMC2, PMC3, and PMC4 for each section of the test.

   example output:

     Step 1: Initialize L2
     Step 1 part 1: Set L2CR[DO] and L2CR[TS] and global invalidate
     Step 1 part 2: L1 D cache global invalidate, but do not turn it off
     Step 2: Enable L2 cache and test
     Step 2 part 1: dcbz and dcbf to initialize L2 cache and turn off L2CR[Ts]
     Step 2 part 2: leave L1 cache on and read/write patterns to L2 cache

4. There are five sections to the test and they are displayed in the
   output.

   Section:
   1. Write and read a pattern of all a's, then the inverse, all 5's, for a
      memory area the same size as the L2 cache. This is called
        Step 2 part 3: Use performance monitor to record hits and misses
   2. Write and read a pattern of walking 1's through the 32 bit value, for
      the same memory addresses as in section 1.
   3. Write and read a series of address ranges beginning at the end of the
      previous address range and increment the test by 0x1000 bytes. This
      is called
        Step 3: L2 cache tag memory
   4. Read a series of address ranges beginning at the end of the previous
      address range and increment the test by 0x1000 bytes. This is called:
        Read outside the L2 range, we should get misses
   5. Write/read one set of address ranges corresponding to section 4.
      This is called:
        Write/read to last read address, we should get hits

   A full test of memory as discussed in step 4 of the L2 cache test is not
   performed by this test, since that is not an L2 test, but is a memory
   test.

   Preceding each test of each section is a series of numbers of the form
   a-b-c-d, as in 7-7-7-7 or 33-26-0-0. These are the current encodings in
   MMCR0 and MMCR1 for the PMC1, PMC2, PMC3, and PMC4 expressed in decimal.
   The next line is the actual hex values for MMCR0, MMCR1, PMC1, and PMC3
   before the test in this section.
   The next line is the indicator Write or Read and the address range and
   pattern. The next line is the value of PMC1, PMC2, PMC3, and PMC4 for
   this test.

   example of section 1:

     Begin Test as described on 9.1.5.2 in the 750 Users Manual
     Step 2 part 3: Use performance monitor to record hits and misses
     7-7-7-7: Values for this run will be:PMC1-PMC2-PMC3-PMC4 selector
     MMCR0 0x1c7, MMCR1 0x39c00000, PMC1 0x, PMC3 0x9
     - Write: Addr range: 0x100000..0x180000, pattern is 0xaaaaaaaa
     PMC1= 0x3f94 PMC2= 0x PMC3= 0x7b PMC4= 0x
     - Read: Addr range: 0x100000..0x180000, pattern is 0xaaaaaaaa
     PMC1= 0x3fb0 PMC2= 0x PMC3= 0x50 PMC4= 0x Mis-matched= 0x

   example of section 2:

     7-7-7-7: Values for this run will be:PMC1-PMC2-PMC3-PMC4 selector
     MMCR0 0x1c7, MMCR1 0x39c00000, PMC1 0x, PMC3 0x6
     - Write: Addr range: 0x100000..0x180000, pattern is 0x1
     PMC1= 0x3f90 PMC2= 0x PMC3= 0x78 PMC4= 0x
     - Read: Addr range: 0x100000..0x180000, pattern is 0x1
     PMC1= 0x3fb0 PMC2= 0x PMC3= 0x50 PMC4= 0x Mis-matched= 0x

   example of section 3:

     Step 3: L2 cache tag memory
     Step 3: part 1 already done above
     Step 3: part 2a,b,c already done above
     Step 3: part 2d and e: read/write outside of L2 set up range
     Write outside the L2 range, we should get misses
     7-7-7-7: Values for this run will be:PMC1-PMC2-PMC3-PMC4 selector
     MMCR0 0x1c7, MMCR1 0x39c00000, PMC1 0x, PMC3 0x5
     - Write: Addr range: 0x180000..0x181000, pattern is 0x10101010
     PMC1= 0x PMC2= 0x PMC3= 0x87 PMC4= 0x
     - Read: Addr range: 0x180000..0x181000, pattern is 0x10101010
     PMC1= 0x PMC2= 0x PMC3= 0x PMC4= 0x Mis-matched= 0x

   NOTE: This test is designed to increment the start address in a
   geometric pattern, thus it can quickly generate large addresses. On the
   Yellowknife, there are 32 megabytes of memory. On a system with a much
   smaller memory space, under 2 megabytes, this will cause an access to
   undefined memory. Thus, the user should modify the code to increase in a
   linear manner.
   So at about line 458 in l2test.c, about 18 lines past the statement:
     printf("Write outside the L2 range,...,
   change the increment line:
     start_addr=start_addr+(count*0x1000)
   to:
     start_addr=start_addr+(0x1000).

   example of section 4:

     Read outside the L2 range, we should get misses
     ... skip first read which is in range
     - Read: Addr range: 0x18b000..0x18c000, pattern is 0x10101010
     PMC1= 0x PMC2= 0x PMC3= 0x81 PMC4= 0x Mis-matched= 0x400

   example of section 5:

     Write/read to last read address, we should get hits
     7-7-7-7: Values for this run will be:PMC1-PMC2-PMC3-PMC4 selector
     MMCR0 0x1c7, MMCR1 0x39c00000, PMC1 0x2, PMC3 0x
     - Write: Addr range: 0x194000..0x195000, pattern is 0x10101010
     PMC1= 0x2 PMC2= 0x PMC3= 0x PMC4= 0x
     - Read: Addr range: 0x194000..0x195000, pattern is 0x10101010
     PMC1= 0x PMC2= 0x PMC3= 0x PMC4= 0x Mis-matched= 0x

8. Notes on Results.

1. The number of L2 hits during the first section of the test, which writes
   and reads 0x80000 bytes (0x100000..0x180000), is expected to be
   0x80000 bytes / 0x20 bytes per cache line = 0x4000 hits. These hits are
   detected by the Performance Monitor on both the 750 and the 7400 in
   PMC1. Additionally, the L2 misses are detected by the Performance
   Monitor on the 750 in PMC3 and on the 7400 in PMC2. In all cases the
   expected misses are zero. But as the results shown below indicate, there
   are 0x3f9c hits and 0x72 misses, for a total of 0x400e hits plus misses.

   The expectation of 0x4000 hits and 0x0 misses can only be realized if
   the test only accesses the memory set up in the L2 cache. However, this
   test is using printf to print out the results. The printf code is
   pushing data values into the L2 cache, which pushes out some of the test
   values, since we originally filled the L2 cache completely. Thus the
   discrepancy between the expected hits/misses and the actual hits/misses
   can be explained by the presence of the printf calls generating the
   results.

   Sample output showing this anomaly.
     Begin Test as described on 9.1.5.2 in the 750 Users Manual
     Step 2 part 3: Use performance monitor to record hits and misses
     7-7-7-7: Values for this run will be:PMC1-PMC2-PMC3-PMC4 selector
     MMCR0 0x1c7, MMCR1 0x39c00000, PMC1 0x, PMC3 0x9
     - Write: Addr range: 0x100000..0x180000, pattern is 0xaaaaaaaa
     PMC1= 0x3f9c PMC2= 0x PMC3= 0x72 PMC4= 0x

2. The number of L2 misses during the tag memory section of the test, which
   writes and reads 0x1000 bytes, is expected to be 0x1000 bytes / 0x20
   bytes per cache line = 0x80. Indeed, we get 0x80 misses for the write
   (into a new range of addresses not previously in the L2 cache). One
   might expect the subsequent read to get 0x80 L2 hits, but in fact it
   gets none. This is as expected: the write went into both the L1 and the
   L2 cache, so the read got a hit in the L1 and therefore did not need to
   hit in the L2.

   Sample output showing this anomaly.

     7-7-7-7: Values for this run will be:PMC1-PMC2-PMC3-PMC4 selector
     MMCR0 0x1c7, MMCR1 0x39c00000, PMC1 0x, PMC3 0x
     - Write: Addr range: 0x18a000..0x18b000, pattern is 0x10101010
     PMC1= 0x PMC2= 0x PMC3= 0x80 PMC4= 0x
     - Read: Addr range: 0x18a000..0x18b000, pattern is 0x10101010
     PMC1= 0x PMC2= 0x PMC3= 0x PMC4= 0x Mis-matched= 0x

3. The number of L2 misses during the read-only section of the test, which
   only reads 0x1000 bytes, is expected to be 0x1000 bytes / 0x20 bytes per
   cache line = 0x80. However, we get 0x81 misses for the read (from a new
   range of addresses not previously in the L2 cache). This is caused by a
   speculative read, since HID0[SPD] (bit 22) is set to zero. If we change
   HID0[SPD] to 1, then we always get 0x80 misses as expected.

   The Mis-matched count is expected to be 0x400 because we are reading an
   area that has not been written with our current pattern:
   0x1000 bytes / 0x4 bytes per word = 0x400.

   Sample output showing this anomaly.

     Read outside the L2 range, we should get misses
     ... skip the first read which is in L2 range ...
     - Read: Addr range: 0x18b000..0x18c000, pattern is 0x10101010
     PMC1= 0x PMC2= 0x PMC3= 0x81 PMC4= 0x Mis-matched= 0x400

4. The number of L2 hits during the last section of the test, which writes
   and reads 0x1000 bytes to the previously read addresses, might be
   expected to be 0x1000 bytes / 0x20 bytes per cache line = 0x80. However,
   we get only 0x2 hits for the write in the L2 cache. All other hits are
   to the L1, since we have a small data size of only 0x1000 bytes. The 0x2
   hits are related to speculative reads as described in number 3 above.

   Sample output showing this anomaly.

     7-7-7-7: Values for this run will be:PMC1-PMC2-PMC3-PMC4 selector
     MMCR0 0x1c7, MMCR1 0x39c00000, PMC1 0x2, PMC3 0x
     - Write: Addr range: 0x194000..0x195000, pattern is 0x10101010
     PMC1= 0x2 PMC2= 0x PMC3= 0x PMC4= 0x
     - Read: Addr range: 0x194000..0x195000, pattern is 0x10101010
     PMC1= 0x PMC2= 0x PMC3= 0x PMC4= 0x Mis-matched= 0x

5. The 7400 test somehow destroys dink's continuity and the user must reset
   after each run. Apparently, there is some problem with recovering the L2
   cache after testing. The 750 test has no problem being repeated without
   the need for a reset.

   This is related to how DINK32 is reentered after the test. The test
   concludes by issuing a blr back to dink32; however, this somehow leaves
   the L1 and L2 cache in an unstable state on the 7400. If we returned via
   an exception, such as executing an instruction of all zeros, then dink32
   would recover correctly. This will be fixed in a release of dink32
   subsequent to V12.0.

9. Caveats.

  Do not turn data translation off.
  Do not turn off the L2 data cache.
  Do turn the L2CR[TS] and L2CR[DO] bits on while filling the L2 cache
  before starting the tests.
  Do turn the L2CR[TS] bit off before starting the tests.
  It is good to also use instruction translation and icache; however, that
  is not required. These tests are run with instruction translation and
  icache on.

Appendix A: Initial settings for dbats, ibat, msr, and hid0.

All settings are the same for yellowknife/sandpoint and excimer/maximer with
the exception of the ibat2/dbat2 values. These values define the IO address
space, which is different for these two types of boards.

The corresponding dbats and ibats have the same value:

dbat0 and ibat0:
  For batu = 0xff0001ff
    BEPI Logical address is = 0xff000000
    BL Block Length is      = 0x7f   16 MB
    Range is                = 0xff000000 - 0xffffffff
    VS is                   = 0x1    Supervisor mode access
    VP is                   = 0x1    User mode access
  For batl = 0xff000012
    BRPN Physical address is = 0xff000000
    WIMG = 0x2
      W off  Not Write Through, i.e. Write back
      I off  Not Cache Inhibited, i.e. cache active
      M on   Memory Coherent
      G off  Not Guarded, i.e. unguarded
    PP Block Access Protection Control = 0x2  Read and Write

dbat1 and ibat1:
  For batu = 0xfff
    BEPI Logical address is = 0x0
    BL Block Length is      = 0x3ff  128 MB
    Range is                = 0x0 - 0x7ffffff
    VS is                   = 0x1    Supervisor mode access
    VP is                   = 0x1    User mode access
  For batl = 0x12
    BRPN Physical address is = 0x0
    WIMG = 0x2
      W off  Not Write Through, i.e. Write back
      I off  Not Cache Inhibited, i.e. cache active
      M on   Memory Coherent
      G off  Not Guarded, i.e. unguarded
    PP Block Access Protection Control = 0x2  Read and Write

dbat2 and ibat2:
  IMPORTANT NOTE: dbat2/ibat2 is the IO address space pointer. This address
  is different for Yellowknife/Sandpoint than for excimer/maximer.
  NOTE: excimer has an MPC603, so it has no L2 cache; however, maximer has
  an MPC750 or MPC7400, which does have L2 cache. So the maximer can use
  the L2 cache.
  The default code in l2testutils.s is for the yellowknife/sandpoint. The
  code for the maximer is "if 0"ed out, so swap the two to build for the
  maximer.

  ===== This configuration is for yellowknife/sandpoint =================
  For batu = 0xfe0001ff
    BEPI Logical address is = 0xfe000000
    BL Block Length is      = 0x7f   16 MB
    Range is                = 0xfe000000 - 0xfeffffff
    VS is                   = 0x1    Supervisor mode access
    VP is                   = 0x1    User mode access
  For batl = 0xfe000032
    BRPN Physical address is = 0xfe000000
    WIMG = 0x6
      W off  Not Write Through, i.e.
             Write back
      I on   Cache Inhibited
      M on   Memory Coherent
      G off  Not Guarded, i.e. unguarded
    PP Block Access Protection Control = 0x2  Read and Write

  ===== This configuration is for Maximer =================
  maximer: Decoding the bat
  For batu = 0x404001ff
    BEPI Logical address is = 0x40400000
    BL Block Length is      = 0x7f   16 MB
    Range is                = 0x40400000 - 0x413fffff
    VS is                   = 0x1    Supervisor mode access
    VP is                   = 0x1    User mode access
  For batl = 0x40400032
    BRPN Physical address is = 0x40400000
    WIMG = 0x6
      W off  Not Write Through, i.e. Write back
      I on   Cache Inhibited
      M on   Memory Coherent
      G off  Not Guarded, i.e. unguarded
    PP Block Access Protection Control = 0x2  Read and Write

dbat3 and ibat3:
  Both bats are zero, Disabled

HID0 = 0x0000cc00, which will be changed to 0x0000c000 by the hardware.
MSR  = 0x00003930  translation and floating point on.
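
The field decoding shown in this appendix can be reproduced with a short C
sketch. This is an illustration only and is not part of l2test.c or
l2testutils.s; the struct and function names are hypothetical. The masks
assume the 32-bit PowerPC BAT layout (BEPI/BRPN in the upper 15 bits, BL in
11 bits above Vs and Vp, WIMG and PP in the low byte of the lower BAT),
which agrees with every value tabulated above.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded fields of one upper/lower BAT pair (hypothetical helper type). */
struct bat_fields {
    uint32_t bepi;   /* logical (effective) base address            */
    uint32_t bl;     /* block length mask                           */
    uint32_t size;   /* block size in bytes derived from BL         */
    unsigned vs;     /* supervisor mode access valid                */
    unsigned vp;     /* user mode access valid                      */
    uint32_t brpn;   /* physical base address                       */
    unsigned wimg;   /* W, I, M, G bits; 0x2 = M only, 0x6 = I and M */
    unsigned pp;     /* protection bits; 0x2 = read and write       */
};

struct bat_fields bat_decode(uint32_t batu, uint32_t batl)
{
    struct bat_fields f;

    f.bepi = batu & 0xfffe0000u;
    f.bl   = (batu >> 2) & 0x7ffu;
    f.size = (f.bl + 1u) << 17;      /* BL = 0 means 128 KB */
    f.vs   = (batu >> 1) & 1u;
    f.vp   = batu & 1u;

    f.brpn = batl & 0xfffe0000u;
    f.wimg = (batl >> 3) & 0xfu;
    f.pp   = batl & 0x3u;
    return f;
}

/* Worked check against the dbat0/ibat0 values listed in this appendix. */
void bat_self_check(void)
{
    struct bat_fields f = bat_decode(0xff0001ffu, 0xff000012u);

    assert(f.bepi == 0xff000000u);
    assert(f.bl == 0x7fu);
    assert(f.size == 0x1000000u);    /* 16 MB */
    assert(f.vs == 1u && f.vp == 1u);
    assert(f.wimg == 0x2u);
    assert(f.pp == 0x2u);
}
```

Calling bat_decode with the yellowknife/sandpoint dbat2 pair (0xfe0001ff,
0xfe000032) gives WIMG = 0x6, the cache inhibited setting that keeps the IO
space uncached.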