Open
Description
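I have a small improvement to share, although it does not seem to bring any performance gain.
It does make the code a bit shorter, though, which should at least soothe the perfectionist in us.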
Before (lane comments are written low|high, with o = a|b and w = c + d*i):
__m128d wr = _mm_set_pd(cc, cc); // c|c
__m128d wi = _mm_set_pd(ss, ss); // d|d
// compute w*o
wr = _mm_mul_pd(o, wr); // ac|bc
__m128d n1 = _mm_shuffle_pd(o, o, _MM_SHUFFLE2(0,1)); // swap lanes: b|a
wi = _mm_mul_pd(n1, wi); // bd|ad
n1 = _mm_sub_pd(wr, wi); // ac-bd|x (high lane unused)
wr = _mm_add_pd(wr, wi); // x|bc+ad (low lane unused)
n1 = _mm_shuffle_pd(n1, wr, _MM_SHUFFLE2(1,0)); // select ac-bd|bc+ad
After:
__m128d wr = _mm_load1_pd(&cc); // c|c; _mm_load1_pd takes a pointer. Was: _mm_set_pd(rootOfUnity.re, rootOfUnity.re)
__m128d wi = _mm_set_pd(ss, -ss); // -d|d; note that _mm_set_pd takes its arguments high lane first
// compute w*o
wr = _mm_mul_pd(o, wr); // ac|bc
__m128d n1 = _mm_shuffle_pd(o, o, _MM_SHUFFLE2(0,1)); // swap lanes: b|a
wi = _mm_mul_pd(n1, wi); // -bd|ad
n1 = _mm_add_pd(wr, wi); // ac-bd|bc+ad; wi already carries the minus sign, so the separate add, subtract and final shuffle are no longer needed
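For anyone who wants to check that the two variants really compute the same product, here is a minimal sketch in C. The function names `cmul_sub_add`, `cmul_signed` and `variants_agree` are made up for this illustration and are not taken from the repository; it also assumes `o` holds the real part in the low lane and the imaginary part in the high lane, and that `cc` and `ss` are plain `double` values.

```c
#include <emmintrin.h>  /* SSE2 intrinsics, _MM_SHUFFLE2 */
#include <math.h>       /* fabs */

/* Hypothetical helper: two-step variant (subtract, add, then select lanes). */
static __m128d cmul_sub_add(__m128d o, double cc, double ss)
{
    __m128d wr = _mm_set1_pd(cc);                          /* c | c */
    __m128d wi = _mm_set1_pd(ss);                          /* d | d */
    __m128d sw = _mm_shuffle_pd(o, o, _MM_SHUFFLE2(0, 1)); /* b | a */
    wr = _mm_mul_pd(o, wr);                                /* ac | bc */
    wi = _mm_mul_pd(sw, wi);                               /* bd | ad */
    __m128d diff = _mm_sub_pd(wr, wi);                     /* ac-bd | bc-ad */
    __m128d sum  = _mm_add_pd(wr, wi);                     /* ac+bd | bc+ad */
    /* low lane from diff, high lane from sum */
    return _mm_shuffle_pd(diff, sum, _MM_SHUFFLE2(1, 0));
}

/* Hypothetical helper: shorter variant, sign folded into wi. */
static __m128d cmul_signed(__m128d o, double cc, double ss)
{
    __m128d wr = _mm_set1_pd(cc);                          /* c | c */
    __m128d wi = _mm_set_pd(ss, -ss);                      /* -d | d (args are high, low) */
    __m128d sw = _mm_shuffle_pd(o, o, _MM_SHUFFLE2(0, 1)); /* b | a */
    /* (ac | bc) + (-bd | ad) = ac-bd | bc+ad */
    return _mm_add_pd(_mm_mul_pd(o, wr), _mm_mul_pd(sw, wi));
}

/* Returns 1 if both variants match the scalar product (a+bi)(cc+ss*i). */
static int variants_agree(double a, double b, double cc, double ss)
{
    const double eps = 1e-12;
    double r1[2], r2[2];
    __m128d o = _mm_set_pd(b, a);                          /* low = a, high = b */

    _mm_storeu_pd(r1, cmul_sub_add(o, cc, ss));
    _mm_storeu_pd(r2, cmul_signed(o, cc, ss));

    double re = a * cc - b * ss;
    double im = b * cc + a * ss;
    return fabs(r1[0] - re) < eps && fabs(r1[1] - im) < eps &&
           fabs(r2[0] - re) < eps && fabs(r2[1] - im) < eps;
}
```

Calling `variants_agree(1.0, 2.0, 3.0, 4.0)` from a small `main` compares both paths against the scalar result for (1+2i)(3+4i). The shorter variant trades one subtract and one shuffle for a sign flip when building `wi`, which would explain why it is tidier but not noticeably faster.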