
I'm not sure you looked at what I posted. __builtin_ia32_psadbw is right there on the list of builtins. I've used __builtin_ia32_psadbw128 in GCC myself. It compiles directly to PSADBW instructions. Perhaps you confused what I was talking about with GCC's auto-vectorization?

edit: Just realized that you're the x264 guy and it's unlikely you misunderstood me. Still, I think my point about psadbw stands.



Those aren't GCC vectors; those are intrinsics. Vectors use something like this:

    #include <stdint.h>
    typedef uint32_t v4si __attribute__((vector_size (16)));  /* four 32-bit lanes */
    v4si v16 = {v,v,v,v};  /* broadcast the scalar v into every lane */

__builtin functions that act on __m128 values are separate from "GCC vector instructions".
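
To make the distinction concrete, here is a rough sketch of the same 16-byte sum of absolute differences written both ways. The function names sad16_vector_ext and sad16_intrinsic are made up for illustration, and I'm using the portable _mm_sad_epu8 intrinsic from <emmintrin.h> in place of calling __builtin_ia32_psadbw128 directly; that swap is my assumption, not something from the thread. The intrinsic half is guarded by __SSE2__ so the file still builds on non-x86 targets.

```c
#include <stdint.h>
#include <string.h>
#ifdef __SSE2__
#include <emmintrin.h>  /* SSE2 intrinsics, including _mm_sad_epu8 */
#endif

/* GCC/Clang vector extension: sixteen unsigned 8-bit lanes. */
typedef uint8_t v16qu __attribute__((vector_size(16)));

/* Vector-extension version: only generic operators are used, so the
   compiler picks the instructions. PSADBW is not guaranteed here. */
static unsigned sad16_vector_ext(const uint8_t *a, const uint8_t *b)
{
    v16qu va, vb;
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);

    /* Lane-wise compare yields all-ones where a > b, zero elsewhere. */
    v16qu gt = (v16qu)(va > vb);
    /* Select a - b or b - a per lane to get |a - b| without overflow. */
    v16qu d = ((va - vb) & gt) | ((vb - va) & ~gt);

    unsigned sum = 0;
    for (int i = 0; i < 16; i++)
        sum += d[i];  /* horizontal add via lane subscripting */
    return sum;
}

#ifdef __SSE2__
/* Intrinsic version: asks for the SAD operation explicitly. */
static unsigned sad16_intrinsic(const uint8_t *a, const uint8_t *b)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* Result holds two partial sums, one in the low 16 bits of each 64-bit half. */
    __m128i s = _mm_sad_epu8(va, vb);
    return (unsigned)_mm_cvtsi128_si32(s) + (unsigned)_mm_extract_epi16(s, 4);
}
#endif
```

Both functions should return the same value for the same inputs; the difference is that the first leaves instruction selection to the compiler, while the second names the operation outright.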



