Agner's CPU blog

Agner's CPU blog • Re: Intel AVX10 & APX announcement

July 30, 2023, 2:44 pm

I don't see why AMD (assuming they adopt FP16) would choose not to support it. AVX10.2 doesn't seem to change all that much over AVX10.1, so it doesn't sound like much of a burden on AMD. Agner had...

Agner's CPU blog • Testp Question

August 25, 2023, 3:13 am

I recently used the testp tool to measure two instructions, each in code unrolled 1000 times:

rept 1000
mov (r|e)ax, 123
endm

I got the following results:

CODE: mov rax,123 mov eax,123 Clock Core cyc Clock Core...

Agner's CPU blog • Re: Testp Question

August 25, 2023, 6:22 am

An optimizing assembler should code mov rax,123 as mov eax,123 because the result is zero-extended into rax anyway. The two instructions should give identical results. Test results may vary for random...
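The point about zero extension can be illustrated with a minimal sketch. It assumes the standard x86-64 encodings of the two instructions (REX.W C7 /0 with a sign-extended 32-bit immediate for the rax form, B8+rd for the eax form). The names mov_rax_123, mov_eax_123, read_imm32 and write_eax are hypothetical and exist only for this illustration; nothing here reproduces the testp measurements.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// mov rax, 123 as REX.W + C7 /0 with a sign-extended imm32: 7 bytes.
constexpr std::array<std::uint8_t, 7> mov_rax_123 = {0x48, 0xC7, 0xC0, 0x7B, 0x00, 0x00, 0x00};

// mov eax, 123 as B8+rd with an imm32: 5 bytes.
constexpr std::array<std::uint8_t, 5> mov_eax_123 = {0xB8, 0x7B, 0x00, 0x00, 0x00};

// Read the little-endian 32-bit immediate that starts at `offset`.
inline std::uint32_t read_imm32(const std::uint8_t* code, std::size_t size, std::size_t offset) {
    assert(offset + 4 <= size);
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        value |= static_cast<std::uint32_t>(code[offset + i]) << (8 * i);
    }
    return value;
}

// Model of a 32-bit register write in 64-bit mode: the upper 32 bits of the
// full register are cleared, so the previous contents do not matter.
inline std::uint64_t write_eax(std::uint64_t old_rax, std::uint32_t value) {
    (void)old_rax;  // discarded on purpose: the write zero-extends
    return static_cast<std::uint64_t>(value);
}
```

Both encodings carry the same immediate and leave the same value in rax; the eax form is simply two bytes shorter, which is why an assembler may prefer it.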

Agner's CPU blog • Suggestion: Stop using "vector" for computer...

August 26, 2023, 2:22 pm

I suggest that we stop using "vector" for dynamic arrays and the like in computer science, as it is a highly misleading, historically incorrect term. Historically, vector is a mathematical term, it came into use to...

Agner's CPU blog • Re: Suggestion: Stop using "vector" for computer...

August 27, 2023, 5:39 am

Language evolves. I am not sure this is the right forum to discuss this.

Statistics: Posted by agner, 2023-08-27, 5:39:08

Agner's CPU blog • Re: Intel's "cripple AMD" function

September 6, 2023, 7:52 am

In many cases, there are no good alternatives to Intel's function libraries. ... There's a diff between cripple and not-use-best.

Although both AMD and Intel are descendants of Fairchild Semiconductor,...

Agner's CPU blog • Efficiency of array<Vec32uc, 8> vs....

December 23, 2023, 6:33 am

Is there an obvious performance penalty in using array<Vec32uc, 8> instead of ContainerV<Vec32uc, 8>? One reason for this choice is https://godbolt.org/ not having vector_containers.h...

Agner's CPU blog • Re: Efficiency of array<Vec32uc, 8> vs....

December 23, 2023, 3:21 pm

No, there is no performance penalty.

Statistics: Posted by agner, 2023-12-23, 15:21:10

Agner's CPU blog • Is using BSF instruction instead of using GNU C...

December 25, 2023, 7:35 am

I posted this question on Stack Overflow: https://stackoverflow.com/questions/777 ... imd-vector. The answer claims, among other things, that using "legacy BSF instruction (slow on AMD), instead of using GNU...

Agner's CPU blog • Re: Is using BSF instruction instead of using GNU C...

December 25, 2023, 8:34 am

__builtin_ctz is not portable to all compilers. I don't think there is any difference in performance. Let's keep this discussion on stackoverflow. Remember to use the tag "vector-class-library" on...
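The portability concern can be sketched as follows. This is only an illustration, not code from the vector class library: lowest_set_bit is a hypothetical helper that picks __builtin_ctz on GCC-compatible compilers, _BitScanForward on MSVC, and a plain loop elsewhere. The input is assumed to be non-zero, since the result for zero is undefined for both __builtin_ctz and BSF.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Hypothetical helper (not part of VCL): index of the lowest set bit.
// Precondition: x != 0.
inline unsigned lowest_set_bit(std::uint32_t x) {
    assert(x != 0);
#if defined(_MSC_VER)
    unsigned long index = 0;
    _BitScanForward(&index, x);           // MSVC intrinsic
    return static_cast<unsigned>(index);
#elif defined(__GNUC__) || defined(__clang__)
    return static_cast<unsigned>(__builtin_ctz(x));  // GCC/Clang builtin
#else
    unsigned n = 0;                        // generic fallback: shift until bit 0 is set
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
#endif
}
```

For example, lowest_set_bit(40u) is 3, because 40 is 101000 in binary. Which machine instruction the compiler emits for the builtin depends on the compiler and target flags.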

Agner's CPU blog • Can we get _tzcnt_u32?

December 26, 2023, 11:10 am

I'm getting a tiny performance improvement with `_tzcnt_u32`. It would help to have a VCL equivalent, if it's easy to do. Given:

Vec32uc a, b;

This line:

horizontal_find_first(a == b);

is about 3% slower...
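A scalar model may help show what such a search computes. The sketch below uses no VCL or intrinsics: equal_mask stands in for the 32-lane byte comparison, and trailing_zeros_or_32 mimics the TZCNT convention of returning 32 for a zero input. All three function names are hypothetical, and the -1 result for "no match" is an assumption about the desired behavior, not a statement about how the library is implemented.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes32 = std::array<std::uint8_t, 32>;

// Scalar stand-in for a 32-lane byte comparison: bit i is set when a[i] == b[i].
inline std::uint32_t equal_mask(const Bytes32& a, const Bytes32& b) {
    std::uint32_t mask = 0;
    for (std::size_t i = 0; i < 32; ++i) {
        if (a[i] == b[i]) {
            mask |= std::uint32_t{1} << i;
        }
    }
    return mask;
}

// TZCNT-style count of trailing zero bits: well defined for zero, where it returns 32.
inline unsigned trailing_zeros_or_32(std::uint32_t x) {
    unsigned n = 0;
    while (n < 32 && ((x >> n) & 1u) == 0) {
        ++n;
    }
    assert(n <= 32);
    return n;
}

// Index of the first equal byte pair, or -1 when no pair is equal.
inline int find_first_equal(const Bytes32& a, const Bytes32& b) {
    const unsigned n = trailing_zeros_or_32(equal_mask(a, b));
    return n == 32 ? -1 : static_cast<int>(n);
}
```

Because the count is defined for a zero mask, the "no match" case needs only one comparison against 32 after the count, rather than a separate test of the mask beforehand.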
