Handling strided memref in Affine Super Vectorize

levin · July 8, 2025, 2:11pm

Hello, I’m trying to vectorize the following IR using:
-affine-simplify-structures -affine-super-vectorize="virtual-vector-size=8 test-fastest-varying=0"

module {
  func.func @func_failing() -> i32 {
    %c0 = arith.constant 0 : index
    
    %alloc = memref.alloc() : memref<16x16xi32>
    %subview = memref.subview %alloc[0, 0] [8, 16] [1, 1] : memref<16x16xi32> to memref<8x16xi32, strided<[16, 1]>>
    
    affine.for %i = 0 to 8 {
      affine.for %j = 0 to 8 {
        %val = affine.load %subview[%i, %j] : memref<8x16xi32, strided<[16, 1]>>
        affine.store %val, %subview[%i, %j] : memref<8x16xi32, strided<[16, 1]>>
      }
    }
    
    %result = affine.load %subview[%c0, %c0] : memref<8x16xi32, strided<[16, 1]>>
    memref.dealloc %alloc : memref<16x16xi32>
    return %result : i32
  }
}

However, the vectorization is failing with the following error:

error: NYI: non-trivial layout map
        %val = affine.load %subview[%i, %j] : memref<8x16xi32, strided<[16, 1]>>
               ^

I tried modifying this check with the following code, and it resolves the issue for me:

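  // Reject only explicit affine-map layouts that are not the identity.
  // Any other layout attribute (e.g. strided<...>) falls through unchecked.
  if (auto layout = dyn_cast<AffineMapAttr>(memRefType.getLayout()))
    if (!layout.getAffineMap().isIdentity())
      return memoryOp.emitError("NYI: non-trivial layout map"), false;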

Question:
Should I be adding more checks for strided layouts to ensure that I’m not violating any vectorization constraints?
If yes, how should I handle dynamic strides from the strided layout, and dynamic sizes?

Thanks!

ftynse · July 9, 2025, 7:14am

Does it actually resolve it or does it just produce some IR that may not be doing what you expect, e.g., reading consecutive elements regardless of strides?

The affine vectorizer predates strided memrefs, so it likely does not account for anything but dense row-major storage.

levin · July 9, 2025, 1:53pm

In my case, the data is stored in dense row-major format, at least for the dimensions I’m attempting to vectorize.

Thanks for the insight.

Would it be safe to assume that if I’m vectorizing along a dimension, and that dimension is contiguous, then the vectorization will generate correct code?

For example, if my memref is of type memref<axbxcxf32, strided<[z,c,1]>> and I’m using the vectorization arguments -affine-super-vectorize="virtual-vector-size=8,8 test-fastest-varying=1,0", can I expect correct vectorized code generation?

ftynse · July 10, 2025, 5:47am

I don’t remember offhand. This is rather old and rarely used code. There is a long description in llvm-project/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp (llvm/llvm-project on GitHub, at commit 36cbd43ae8d5a5274ae3193b6383fff2ba9671f4) that you could refer to; otherwise, look at the code and inspect the generated IR. The function you pointed to has a “TODO: check access stride” above it, which suggests that implementing a check that the stride along the given dimension is 1 may make it all work together.
