Your Name authored
- Fixed streaming mode pipeline issues:
  - Fixed n-gram counting to handle partial matches correctly
  - Added per-chunk filtering to prevent duplicate n-grams across chunks
- Optimized regex patterns (~35 patterns pre-compiled):
  - Pre-compiled all regex patterns for better performance
  - Added false positive protection with length-based filtering
  - Optimized tool call parsing in parser.py
- Added grammar-guided generation (--ggg / --grammar-guided-gen):
  - New GBNF grammar file (tool_call_grammar.gbnf) for tool call parsing
  - Grammar loading utilities in models/grammar.py
  - Vulkan backend: Added GBNF grammar support via llama_generate_grammar
  - CUDA backend: Added outlines support for structured output
- Added prompt distillation (--tools-closer-prompt):
  - New CLI option --tools-closer-prompt for prompt distillation
  - Enables generating distilled tool descriptions for better accuracy
5341ee6a
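
The regex changes in this commit (pre-compilation plus length-based false positive filtering) can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the pattern names, the tag formats, the `MIN_PAYLOAD_LEN` threshold, and the `find_tool_calls` helper are hypothetical and are not taken from parser.py.

```python
import re

# Hypothetical patterns for illustration only; the repository's actual
# list (~35 patterns) is not reproduced here.
_RAW_PATTERNS = {
    "xml_tool_call": r"<tool_call>(?P<payload>.*?)</tool_call>",
    "bracket_tool_call": r"\[TOOL_CALL\](?P<payload>.*?)\[/TOOL_CALL\]",
}

# Compile once at import time so repeated parsing does not pay the
# compilation (or regex-cache lookup) cost on every call.
COMPILED = {
    label: re.compile(raw, re.DOTALL) for label, raw in _RAW_PATTERNS.items()
}

# Assumed threshold: payloads shorter than this are treated as false positives.
MIN_PAYLOAD_LEN = 8


def find_tool_calls(text, min_len=MIN_PAYLOAD_LEN):
    """Return (label, payload) pairs for every sufficiently long match."""
    hits = []
    for label, pattern in COMPILED.items():
        for match in pattern.finditer(text):
            payload = match.group("payload").strip()
            if len(payload) < min_len:
                # Length-based filter: skip matches too short to be a real call.
                continue
            hits.append((label, payload))
    return hits
```

Calling `find_tool_calls('<tool_call>x</tool_call>')` returns an empty list under these assumptions, because the one-character payload falls below the threshold.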
| Name | Last commit | Last update |
|---|---|---|
| .vscode | ||
| codai | ||
| .gitignore | ||
| LICENSE.md | ||
| README.md | ||
| build.sh | ||
| coder | ||
| coderai | ||
| requirements-nvidia.txt | ||
| requirements-vulkan.txt | ||
| requirements.txt | ||