Commit 7e4ae96f by Your Name:

- Add flash_attn extraction from global_args in _load_default_model()
- Add flash_attn extraction from global_args in _load_model_by_name()
- Now the --flash-attn flag will properly enable Flash Attention 2 when loading models
| Name | Last commit | Last update |
|---|---|---|
| .vscode | ||
| codai | ||
| .gitignore | ||
| LICENSE.md | ||
| README.md | ||
| build.sh | ||
| coder | ||
| coderai | ||
| requirements-nvidia.txt | ||
| requirements-vulkan.txt | ||
| requirements.txt | | |