Algorithm 2 gives an overview of the 3D WMFA. The algorithm consists of four main stages: (1) the Winograd transformation of the input feature tile; (2) the Winograd transformation of the filter tile; (3) the matrix multiplication, into which the dot product of the transformed input tile and the transformed filter tile is converted; and (4) the inverse Winograd transformation of the matrix multiplication result.
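To make the four stages concrete, the following is a minimal sketch for a single tile, assuming F(2 x 2 x 2, 3 x 3 x 3), a single input channel, and a single filter. The matrices BT, G, and AT are the standard F(2, 3) transforms from the Winograd minimal filtering literature, applied along each of the three axes; the function names are hypothetical and are not taken from Algorithm 2. With one channel, stage 3 reduces to an element-wise product; with many channels and filters, the products at each transformed position are accumulated over channels, which is what allows them to be expressed as matrix multiplications. This is plain Python for illustration, not a GPU implementation.

```python
from itertools import product

# Standard F(2, 3) Winograd transform matrices (assumed tile size m = 2, r = 3).
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]  # input transform
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]     # filter transform
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]                                # inverse transform


def transform_3d(mat, cube):
    """Apply the p x q matrix `mat` along each of the three axes of a q x q x q cube."""
    p, q = len(mat), len(mat[0])
    out = [[[0.0] * p for _ in range(p)] for _ in range(p)]
    for i, j, k in product(range(p), repeat=3):
        out[i][j][k] = sum(
            mat[i][a] * mat[j][b] * mat[k][c] * cube[a][b][c]
            for a, b, c in product(range(q), repeat=3)
        )
    return out


def winograd_3d_f2_r3(d, g):
    """Hypothetical helper: 2x2x2 output tile from a 4x4x4 input tile and a 3x3x3 filter."""
    v = transform_3d(BT, d)  # stage 1: transform the input tile
    u = transform_3d(G, g)   # stage 2: transform the filter tile
    # Stage 3: element-wise product (single-channel case of the matrix multiplication).
    m = [[[u[i][j][k] * v[i][j][k] for k in range(4)]
          for j in range(4)] for i in range(4)]
    return transform_3d(AT, m)  # stage 4: inverse transform


def direct_3d(d, g):
    """Reference: direct sliding-window 3D convolution (CNN convention, no padding)."""
    return [[[sum(d[i + a][j + b][k + c] * g[a][b][c]
                  for a, b, c in product(range(3), repeat=3))
              for k in range(2)] for j in range(2)] for i in range(2)]
```

For this tile size, stage 3 needs 4^3 = 64 multiplications per tile, whereas the direct method needs 2^3 x 3^3 = 216 for the same eight outputs.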
Algorithm 2: 3D convolutional layer implemented with WMFA F(m × m × m, r × r × r).
We implement three versions of the 3D WMFA on a GPU.
We apply our 3D WMFA to v3d, a widely used 3D neural network for video classification.
We use cuDNN SGEMM and cuDNN FFT Tiling to denote the two methods called from the cuDNN library, and 3D WMFA to denote our algorithm.
Table 3 shows the effective TFLOPS of the cuDNN SGEMM and 3D WMFA methods.
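As an illustration of how such a figure is commonly computed, the sketch below assumes the usual convention for fast convolution algorithms: the operation count of the direct method is divided by the measured run time, so that algorithms performing fewer multiplications can report an effective rate above what they actually execute. The function name and parameter names are hypothetical, and the convention itself is an assumption about how Table 3 was produced.

```python
def effective_tflops(n, c, k, out_d, out_h, out_w, r, seconds):
    """Effective TFLOPS, assuming the direct-convolution operation count.

    n: batch size, c: input channels, k: filters,
    out_d, out_h, out_w: output depth, height, width,
    r: filter edge length (r x r x r), seconds: measured run time.
    """
    # Each output element needs c * r^3 multiply-add pairs, counted as 2 operations each.
    direct_flops = 2 * n * c * k * out_d * out_h * out_w * r ** 3
    return direct_flops / seconds / 1e12
```

Under this convention, the same numerator is used for every method on a given layer, so the values differ only through the measured time.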
We designed a 3D WMFA to implement the 3D convolution operation.