danieldk HF Staff

Update tag

cdde5f4 verified about 2 months ago

---
license: bsd-3-clause
tags:
- kernels
---

# Flash Attention 3

Flash Attention is a fast and memory-efficient implementation of the
attention mechanism, designed to work with large models and long sequences.
This is a Hugging Face kernels-compliant build of Flash Attention 3.

Original code: [https://github.com/Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention).

Kernel source: https://github.com/huggingface/kernels-community/tree/main/flash-attn3
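
A kernels-compliant build is typically loaded through the Hugging Face `kernels` Python package rather than compiled locally. The following is a minimal sketch, not a documented interface of this build. It assumes that the Hub repo id is `kernels-community/flash-attn3`, that the loaded module exposes a `flash_attn_func` entry point like the upstream flash-attention package, and that a CUDA GPU supported by Flash Attention 3 is available. The helper names `load_flash_attn3` and `demo` are hypothetical.

```python
# Assumed Hub repo id for this kernel build.
KERNEL_REPO_ID = "kernels-community/flash-attn3"


def load_flash_attn3(repo_id: str = KERNEL_REPO_ID):
    """Fetch the prebuilt kernel from the Hub (or the local cache) as a module."""
    # Imported lazily so this file can be imported without `kernels` installed.
    from kernels import get_kernel

    return get_kernel(repo_id)


def demo():
    """Run causal attention on random inputs; needs `torch`, `kernels`, and a GPU."""
    import torch

    kernel = load_flash_attn3()

    # Assumed tensor layout: (batch, seqlen, num_heads, head_dim) in bfloat16.
    q = torch.randn(2, 1024, 8, 128, device="cuda", dtype=torch.bfloat16)
    k = torch.randn_like(q)
    v = torch.randn_like(q)

    # Assumption: the build exposes `flash_attn_func`, as upstream does.
    result = kernel.flash_attn_func(q, k, v, causal=True)

    # Some versions return (out, softmax_lse); handle both shapes of result.
    out = result[0] if isinstance(result, tuple) else result
    return out.shape
```

Calling `demo()` downloads the kernel on first use and returns the shape of the attention output. If the entry point has a different name in this build, inspecting `dir(load_flash_attn3())` shows what the module actually exports.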