
The thing I'm looking forward to most is having Flash Attention built in. Right now you have to use xformers or similar, but that dependency has been a nightmare: it breaks, it requires specific concoctions of dependencies or conda will barf, and it's impossible to pin because I have to use -dev releases, which they constantly drop from the repositories.

PyTorch 2.0 comes with a few different efficient transformer implementations built in. Unlike 1.13, they work during training and don't require specific configurations, and they seemed to work just fine in my pre-release testing. Having them built into PyTorch might also mean more pressure to keep them optimized; as-is, xformers targets the A100 primarily, with other archs as an afterthought.
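For anyone curious what "built-in" looks like in practice, here is a minimal sketch (not the commenter's code) using PyTorch 2.0's `torch.nn.functional.scaled_dot_product_attention`, which dispatches to a fused Flash-style or memory-efficient kernel when one is available and falls back to the plain math implementation otherwise; the tensor shapes are illustrative:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: (batch, heads, seq_len, head_dim)
q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

# One call replaces the usual softmax(q @ k.T / sqrt(d)) @ v pipeline;
# PyTorch picks the fastest available backend for the hardware.
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```

No xformers install is needed for this path, which is what removes the dependency-pinning pain described above.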

And, as promised, `torch.compile` worked out of the box, providing (IIRC) a nice ~20% speedup on a ViT without any other tuning.

I did have to do some dependency fiddling on the pre-release version. Been looking forward to the "stable" release before using it more extensively.

Anyone else seeing nice boosts from `torch.compile`?



I really wish compiling cuda extensions worked better out of the box. Is there a reason they can't bundle nvcc alongside pytorch outside of complexity/expense?


Legal reasons.

Filesize.

Platform compatibility.


Interesting, I had not considered these points besides file size! Do you think they could be overcome, or is the chance zero?


I work on xFormers and we definitely appreciate the candid feedback:

- We partnered with our PyTorch colleagues, and some of the PyTorch 2.0 kernels for efficient attention actually originated from xFormers, so I'm glad to read that having this built into PyTorch is something users are eager for.

- While xFormers originally targeted a pure researcher audience, we were aware of the installation problems: late last year we started gradually making the library easier to set up and use (both internally and externally). We have recently introduced non-dev conda packages and pip wheels, and are also trying to release more often.

- We very much welcome hearing about any issue with the library and would love to discuss the specifics of your experience (or that of others reading this) if you have time, maybe via our GitHub to start with. Thanks again for the feedback here!


What size of ViT? I’ve tried it with both a UNet and an LM and didn’t see any benefit with the default args (and got a CUDA error after 30 minutes of processing when trying to compile an AR generation routine with all optimizations turned on).


B/16



