Attempt to integrate with OneDNN
Intel's OneDNN is a great project that provides cuDNN-like inference/training primitives for Intel GPUs.
Also, despite the name OneDNN, it should probably be called IntelDNN, since it supports only Intel GPUs and CPUs.
Bottom line: I tried to add OneDNN-based convolutions for Intel GPUs, just to discover that my simple GEMM-based convolution works better. Why? Apparently Intel's implementation is optimized for the channels-last format only.
https://github.com/oneapi-src/oneDNN/issues/1194
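To make the layout issue concrete, here is a minimal sketch (not the dlprimitives integration code) of how a convolution and its memory layout are described with OneDNN's v2.x-style C++ API; the batch size, stride, and padding are my assumptions, and the API details differ between OneDNN versions. The only change between the channels-first and channels-last cases is the format_tag passed for the source/destination descriptors.

    // Hypothetical sketch of OneDNN (v2.x-style) convolution setup.
    #include <dnnl.hpp>
    using namespace dnnl;

    int main() {
        engine eng(engine::kind::gpu, 0);

        // Shape from the benchmark below: 64 -> 64 channels, 3x3 kernel,
        // 56x56 image (batch size 1, stride 1, padding 1 are assumptions).
        memory::dims src_dims = {1, 64, 56, 56};
        memory::dims wei_dims = {64, 64, 3, 3};
        memory::dims dst_dims = {1, 64, 56, 56};
        memory::dims strides  = {1, 1}, padding = {1, 1};

        // Channels-first (NCHW) - the layout dlprimitives and most
        // frameworks use. Swapping format_tag::nchw for format_tag::nhwc
        // selects the channels-last layout instead.
        auto src_md = memory::desc(src_dims, memory::data_type::f32,
                                   memory::format_tag::nchw);
        auto wei_md = memory::desc(wei_dims, memory::data_type::f32,
                                   memory::format_tag::oihw);
        auto dst_md = memory::desc(dst_dims, memory::data_type::f32,
                                   memory::format_tag::nchw);

        auto conv_d  = convolution_forward::desc(
            prop_kind::forward_inference, algorithm::convolution_direct,
            src_md, wei_md, dst_md, strides, padding, padding);
        auto conv_pd = convolution_forward::primitive_desc(conv_d, eng);
        auto conv    = convolution_forward(conv_pd);

        // Actually executing it would also require dnnl::memory objects and
        // a dnnl::stream; omitted here for brevity.
        (void)conv;
        return 0;
    }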
A simple convolution with a 3x3 kernel, 64 input and 64 output channels, and a 56x56 image, run on an Intel HD 530 with roughly 400 GFLOPS of peak compute, gives:
- 295.6 GFLOPS for OneDNN's channels-last format
- 144.7 GFLOPS for dlprimitives' channels-first format
- 33.4(!) GFLOPS for OneDNN's channels-first format
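To put those numbers in perspective, here is a small worked computation of the arithmetic cost of that convolution and what each measured rate means in terms of peak utilization and time per image. The output size of 56x56 and batch size of 1 are assumptions on my part.

    #include <cstdio>

    int main() {
        // Assumed shape: 64 -> 64 channels, 3x3 kernel, 56x56 output, batch 1.
        const double macs  = 64.0 * 64 * 3 * 3 * 56 * 56; // multiply-accumulates
        const double flops = 2.0 * macs;                   // 1 MAC = 2 FLOPs
        printf("FLOPs per image: %.1f MFLOP\n", flops / 1e6); // ~231.2 MFLOP

        // Efficiency relative to the ~400 GFLOPS peak of the HD 530.
        const double peak    = 400.0;
        const double rates[] = {295.6, 144.7, 33.4};
        for (double r : rates)
            printf("%6.1f GFLOPS -> %4.1f%% of peak, %.2f ms/image\n",
                   r, 100.0 * r / peak, flops / (r * 1e9) * 1e3);
        return 0;
    }

So the channels-last path reaches roughly 74% of peak, dlprimitives about 36%, and OneDNN's channels-first path only about 8%.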
The problem is that channels-first is the most common format, used by PyTorch, MXNet, Caffe, and many other tools (including dlprimitives).
OK... I'll revisit this when one of two things happens:
- they fix channels-first performance, or
- I add support for the channels-last format internally.