This document describes a low-complexity pipelined FFT design for high-throughput applications. It proposes a feedforward FFT architecture that uses rotator allocation to reduce both the number and the complexity of rotators. The key aspects are:
1) It uses a divide-and-conquer approach to split the FFT computation into stages. In each stage, butterflies operate on pairs of data whose indexes differ only in the bit position associated with that stage.
2) It allocates the index bits into serial and parallel dimensions to optimize the distribution of rotations across stages. This aims to minimize the number of rotators and keep rotations in the same serial allocation set.
3) The proposed approach is shown to reduce the number and complexity of rotators in the FFT architecture compared
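
The stage structure in point 1 can be illustrated with a small software model. The following is a minimal sketch, assuming a radix-2 decimation-in-frequency FFT; it is not the proposed hardware architecture and does not model the serial/parallel bit allocation or the rotator allocation. The names `fft_dif` and `bit_reverse` are hypothetical and chosen only for this illustration. At each stage, the butterfly pairs two samples whose indexes differ only in bit `b`, and the rotation (twiddle multiplication) is applied to the lower butterfly output.

```python
import cmath


def bit_reverse(i, n_bits):
    """Reverse the lowest n_bits bits of the index i."""
    r = 0
    for _ in range(n_bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT of a power-of-two-length sequence."""
    N = len(x)
    n_bits = N.bit_length() - 1
    if N == 0 or (1 << n_bits) != N:
        raise ValueError("length must be a power of two")
    data = [complex(v) for v in x]

    # One stage per index bit, from the most significant bit down to bit 0.
    for b in reversed(range(n_bits)):
        mask = 1 << b
        for i in range(N):
            if i & mask:
                continue  # visit each pair once, from the index with bit b = 0
            j = i | mask  # partner index: differs from i only in bit b
            a, c = data[i], data[j]
            # Rotation exponent: low b bits of i, scaled to a power of W_N.
            e = (i & (mask - 1)) << (n_bits - 1 - b)
            data[i] = a + c
            data[j] = (a - c) * cmath.exp(-2j * cmath.pi * e / N)

    # Decimation in frequency leaves the outputs in bit-reversed order.
    return [data[bit_reverse(k, n_bits)] for k in range(N)]
```

As a usage example, `fft_dif([1, 0, 0, 0])` transforms a unit impulse, whose spectrum is flat. In this model every stage except the last applies a rotation; a hardware design would place a rotator only where the rotation is non-trivial.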