...

Notice that there's no set/way operation that invalidates or cleans an entire data cache; to do that one has to loop over all sets and ways. Nor are there set/way variants that affect multiple CPUs; for that one needs to use the operations that take virtual addresses. It's not clear to me whether the SCU makes the corresponding L2 operation unnecessary after one manipulates an L1 cache line this way (no pun intended). It can't hurt to do an explicit L2 invalidation at startup after the L1 invalidation is done.
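The loop over sets and ways might look roughly like the sketch below. It assumes the L1 data cache geometry usually quoted for the Zynq-7000's Cortex-A9 (32 KB, 4 ways, 32-byte lines, hence 256 sets); a more careful version would read the geometry from CCSIDR rather than hard-coding it. The function and constant names are made up for illustration. The CP15 write is only compiled for 32-bit ARM targets, so elsewhere the loop just counts the lines it would have visited.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: Cortex-A9 L1 data cache as found on Zynq-7000:
 * 32 KB, 4 ways, 32-byte lines. Not read from CCSIDR here. */
enum {
    L1D_WAYS      = 4,
    L1D_SETS      = 256, /* 32 KB / (4 ways * 32 bytes) */
    L1D_SET_SHIFT = 5,   /* log2(line size in bytes) */
    L1D_WAY_SHIFT = 30   /* 32 - log2(number of ways) */
};

/* Build the set/way operand for the level 1 cache.
 * The level field (bits [3:1]) is 0 for L1. */
static uint32_t l1d_set_way_operand(uint32_t set, uint32_t way)
{
    assert(set < L1D_SETS && way < L1D_WAYS);
    return (way << L1D_WAY_SHIFT) | (set << L1D_SET_SHIFT);
}

/* DCISW: invalidate one data cache line by set/way.
 * Compiles to nothing when not building for 32-bit ARM. */
static void dcisw(uint32_t operand)
{
#if defined(__arm__)
    __asm__ volatile("mcr p15, 0, %0, c7, c6, 2" : : "r"(operand) : "memory");
#else
    (void)operand;
#endif
}

/* Invalidate every line of the local CPU's L1 data cache.
 * Returns the number of lines visited. */
static unsigned l1d_invalidate_all(void)
{
    unsigned lines = 0;
    for (uint32_t way = 0; way < L1D_WAYS; ++way) {
        for (uint32_t set = 0; set < L1D_SETS; ++set) {
            dcisw(l1d_set_way_operand(set, way));
            ++lines;
        }
    }
#if defined(__arm__)
    __asm__ volatile("dsb" : : : "memory"); /* wait for completion */
#endif
    return lines;
}
```

This only touches the L1 cache of the CPU that runs it; any L2 handling would be a separate step.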

By virtual address

For a virtual address that is Normal and cached, the following operations affect all CPUs in the shareability domain selected by the shareability attributes of the address, as specified in the MMU's translation tables. For a VA with the Strongly-ordered attribute, the CPUs affected are those in the Outer shareability domain that includes the local CPU.

ARM-speak:

  • Modified Virtual Address: For Zynq it's the same as the plain old Virtual Address.
  • Point of Coherency (data accesses): All the levels of the memory hierarchy, starting from the L1 data cache of the CPU making the change out to and including the PoC, must be adjusted to reflect the change in order to guarantee that all agents in the system can see the change. An agent can be a CPU, DMA engine, or whatnot. For the Zynq the PoC is main memory. Note that I said that agents can see the change, not that they will. If they have any data caches between themselves and the PoC then they will need to be notified so that they can invalidate the right entries in them, or some coherence mechanism must do it for them. On the Zynq the Snoop Control Unit will examine the attributes of the VA and invalidate data cache entries at L1 for at least some of the other CPUs and at L2 if need be:
    • Normal, cached memory: The CPUs affected will be those in the shareability domain specified for the VA.
    • Strongly-ordered, cached memory: The CPUs in the same Outer shareability domain as the CPU making the change will be affected.
    • Shared, cached device memory: The ARMv7-A Architecture Reference Manual says the behavior is implementation defined in the absence of the Large Physical Address Extension (LPAE), but the Cortex-A9 tech refs don't define it.
  • Point of Unification (instruction accesses): Entries are invalidated at every level of the memory hierarchy from the L1 instruction cache out to and including the PoU, which is the level common to the CPU's instruction fetches, data fetches, and table walk fetches. For the Zynq the PoU is the unified L2 cache. In this case the SCU won't invalidate any instruction cache entries for other CPUs. It seems that code modification, such as that performed by a dynamic linker, will have to involve inter-CPU signalling in order to get software to perform all the required instruction cache invalidations. Or perhaps we can just make a region of memory unshared and non-executable, load code into it and perform the relocations, then make the memory shared and executable again.
  • Shareability domain: A set of CPUs that share access to a given VA, according to the sharing attributes assigned to the VA by the MMU. CPUs can be partitioned into non-overlapping "inner" shareability domains while an "outer" domain can include all CPUs.
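For the code-modification case, here is a minimal sketch of what the CPU that wrote the code would do for itself: clean the data cache lines to the PoU (DCCMVAU), then invalidate the matching instruction cache lines (ICIMVAU), then flush the branch predictor and resynchronize. The function name, macros, and the 32-byte line size are my assumptions. It covers only the local CPU, so the inter-CPU signalling question raised above is left open. Off target the CP15 writes compile to nothing and the function just reports how many lines it walked.

```c
#include <stddef.h>
#include <stdint.h>

#define CODE_LINE_BYTES 32u /* ASSUMPTION: Cortex-A9 L1 cache line size */

#if defined(__arm__)
#define CP15_WRITE(regs, val) \
    __asm__ volatile("mcr p15, 0, %0, " regs : : "r"(val) : "memory")
#define BARRIER(insn) __asm__ volatile(insn : : : "memory")
#else
/* Not a 32-bit ARM build: keep the control flow, drop the instructions. */
#define CP15_WRITE(regs, val) ((void)(val))
#define BARRIER(insn) ((void)0)
#endif

/* Make freshly written code in [start, start + len) fetchable by the
 * local CPU. Returns the number of cache lines covered. */
static size_t sync_code_range_local(uintptr_t start, size_t len)
{
    uintptr_t first = start & ~(uintptr_t)(CODE_LINE_BYTES - 1u);
    uintptr_t end = start + len;
    size_t lines = 0;

    if (len == 0)
        return 0;

    /* 1. Push the new instructions out of the data cache to the PoU. */
    for (uintptr_t va = first; va < end; va += CODE_LINE_BYTES)
        CP15_WRITE("c7, c11, 1", va); /* DCCMVAU */
    BARRIER("dsb");

    /* 2. Drop any stale copies from the instruction cache. */
    for (uintptr_t va = first; va < end; va += CODE_LINE_BYTES) {
        CP15_WRITE("c7, c5, 1", va); /* ICIMVAU */
        ++lines;
    }

    /* 3. Invalidate the branch predictor, then resynchronize. */
    CP15_WRITE("c7, c5, 6", 0); /* BPIALL */
    BARRIER("dsb");
    BARRIER("isb");
    return lines;
}
```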

Operations:

  • DCIMVAC (Data Cache Invalidate by MVA to PoC)
  • DCCMVAC (like the above but cleans)
  • DCCIMVAC (like the above but cleans and invalidates)
  • DCCMVAU (Data Cache Clean by MVA to PoU)
  • ICIMVAU (Instruction Cache Invalidate by MVA to PoU)
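As an illustration of how the three PoC operations above would be applied to a buffer, for example one shared with a DMA engine, here is a hedged sketch that walks a VA range one cache line at a time. The names and the 32-byte line size are assumptions of mine, and it issues only the CP15 operations; whatever the L2 needs beyond that is not covered. As before, the instructions are only emitted for 32-bit ARM builds.

```c
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE_BYTES 32u /* ASSUMPTION: Cortex-A9 L1 cache line size */

typedef enum { DC_INVALIDATE, DC_CLEAN, DC_CLEAN_INVALIDATE } dc_op;

/* Apply one by-MVA-to-PoC operation to the line containing va. */
static void dc_line_op_poc(dc_op op, uintptr_t va)
{
#if defined(__arm__)
    switch (op) {
    case DC_INVALIDATE:       /* DCIMVAC */
        __asm__ volatile("mcr p15, 0, %0, c7, c6, 1" : : "r"(va) : "memory");
        break;
    case DC_CLEAN:            /* DCCMVAC */
        __asm__ volatile("mcr p15, 0, %0, c7, c10, 1" : : "r"(va) : "memory");
        break;
    case DC_CLEAN_INVALIDATE: /* DCCIMVAC */
        __asm__ volatile("mcr p15, 0, %0, c7, c14, 1" : : "r"(va) : "memory");
        break;
    }
#else
    (void)op;
    (void)va;
#endif
}

/* Apply op to every cache line overlapping [start, start + len).
 * Returns the number of lines touched.
 * Caution: DC_INVALIDATE on a range that is not line-aligned also
 * discards unrelated dirty data sharing the first or last line. */
static size_t dc_range_op_poc(dc_op op, uintptr_t start, size_t len)
{
    uintptr_t va = start & ~(uintptr_t)(DCACHE_LINE_BYTES - 1u);
    uintptr_t end = start + len;
    size_t lines = 0;

    if (len == 0)
        return 0;

    for (; va < end; va += DCACHE_LINE_BYTES) {
        dc_line_op_poc(op, va);
        ++lines;
    }
#if defined(__arm__)
    __asm__ volatile("dsb" : : : "memory"); /* wait for completion */
#endif
    return lines;
}
```

A typical use would be DC_CLEAN before a device reads the buffer and DC_INVALIDATE before the CPU reads what a device wrote.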

...