- 16 Mar, 2026 40 commits
-
Your Name authored
- Changed model name format from openai/... to coderai/...
- Added litellm.custom_provider_map to map coderai to the openai handler
- This allows litellm to use its internal HTTP handler for custom providers
- Example: TeichAI/Qwen3-8B-... now becomes coderai/TeichAI/Qwen3-8B-...
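The renaming described in this commit can be sketched as a small pure function. This is only an illustration: the helper name `to_coderai_name` and the placeholder model ID are hypothetical, and the actual change may handle more cases.

```python
def to_coderai_name(model: str, prefix: str = "coderai") -> str:
    """Return the model ID with the custom provider prefix (hypothetical helper)."""
    # Already in the new format: leave it untouched.
    if model.startswith(prefix + "/"):
        return model
    # Drop the old openai/ prefix before adding the new one.
    if model.startswith("openai/"):
        model = model[len("openai/"):]
    return f"{prefix}/{model}"


# "ExampleOrg/example-model" is a made-up ID used only for illustration.
print(to_coderai_name("openai/ExampleOrg/example-model"))
```

Keeping the org/model part intact means the rest of the name can still be passed through unchanged to whichever handler the custom provider is mapped to.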
-
Your Name authored
- Instead of defaulting to 'huggingface' for org/model paths, now preserves the original org name as the provider
- Example: TeichAI/Qwen3-8B-... now becomes openai/TeichAI/Qwen3-8B-... instead of openai/huggingface/TeichAI/Qwen3-8B-...
-
Your Name authored
- Add logic to set api_base to server's own URL for non-Ollama models
- Extract host/port from request headers (X-Forwarded-For, Host header)
- Determine protocol (http/https) based on global_args
- Include debug output showing the determined api_base
- This ensures litellm can properly connect to local server when using litellm backend with local models
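A minimal sketch of deriving api_base from the incoming request, under several assumptions: the function name `determine_api_base`, the `use_https` flag (standing in for whatever global_args exposes), the default host/port, and the `/v1` suffix are all hypothetical. It reads only the Host header, since X-Forwarded-For normally carries the client address rather than the server's, and it does not handle IPv6 literals.

```python
from typing import Mapping


def determine_api_base(
    headers: Mapping[str, str],
    use_https: bool = False,        # assumed to come from the server's CLI args
    default_host: str = "127.0.0.1",
    default_port: int = 8000,       # hypothetical default
) -> str:
    # Header names are case-insensitive, so normalise the keys first.
    lowered = {k.lower(): v for k, v in headers.items()}
    host_header = lowered.get("host", "").strip()

    # Split "host:port"; a header without ":" carries no port.
    host, sep, port = host_header.rpartition(":")
    if not sep:
        host, port = host_header, ""

    host = host or default_host
    port = port if port.isdigit() else str(default_port)
    scheme = "https" if use_https else "http"
    return f"{scheme}://{host}:{port}/v1"  # "/v1" suffix is an assumption


print(determine_api_base({"Host": "localhost:8080"}))
```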
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
- When using HuggingFace inference endpoints, set api_key to None to avoid auth errors
-
Your Name authored
- When model starts with 'ollama:', construct api_base from request host and port
- api_base is now passed to LiteLLMBackend for local connections
-
Your Name authored
- Don't check environment for OPENAI_API_KEY
- Use fake key directly in LiteLLMBackend if no key passed
-
Your Name authored
- If no API key is provided in request, use a fake key to allow litellm to proceed
- Check both request body and Authorization header for API key
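The lookup order described here could look roughly like the following. The function name `resolve_api_key`, the `api_key` body field, and the placeholder key value are assumptions made for illustration, not the project's actual names.

```python
from typing import Any, Mapping

FAKE_API_KEY = "sk-fake-key"  # hypothetical placeholder value


def resolve_api_key(body: Mapping[str, Any], headers: Mapping[str, str]) -> str:
    # 1. Prefer a key supplied in the request body (field name is assumed).
    key = body.get("api_key")
    if key:
        return str(key)

    # 2. Otherwise look for "Authorization: Bearer <token>".
    lowered = {k.lower(): v for k, v in headers.items()}
    scheme, _, token = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()

    # 3. Fall back to a fake key so the backend call can still proceed.
    return FAKE_API_KEY


print(resolve_api_key({}, {"Authorization": "Bearer abc123"}))
```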
-
Your Name authored
- Add tool_parser parameter to litellm backend calls in coderai endpoint
- ModelParserAdapter now passed to both streaming and non-streaming calls
- Enables model-specific tool call parsing for external models via litellm
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
- Add model_manager parameter to LiteLLMBackend for alias resolution
- Add _resolve_model_alias() method to handle default, image, audio, tts aliases
- Update get_litellm_backend() to pass model_manager
- Update coderai call site to pass multi_model_manager

Now --parser litellm will resolve aliases like 'default', 'image' to actual model names before normalizing for litellm.
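Alias resolution of this kind might be sketched as below. The `DummyModelManager` class and its `get()` method are hypothetical stand-ins for the real model manager, whose interface is not shown in this log; only the alias names come from the commit message.

```python
from typing import Optional

KNOWN_ALIASES = {"default", "image", "audio", "tts"}


class DummyModelManager:
    """Hypothetical stand-in for the project's model manager."""

    def __init__(self, aliases: dict):
        self._aliases = dict(aliases)

    def get(self, alias: str) -> Optional[str]:
        return self._aliases.get(alias)


def resolve_model_alias(model: str, manager: Optional[DummyModelManager]) -> str:
    # Only known aliases are looked up; anything else passes through unchanged.
    if manager is not None and model in KNOWN_ALIASES:
        resolved = manager.get(model)
        if resolved:
            return resolved
    return model


manager = DummyModelManager({"default": "ExampleOrg/example-model"})
print(resolve_model_alias("default", manager))
```

Resolving the alias first means the later name normalization only ever sees a concrete model name.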
-
Your Name authored
- Add method to normalize model names for litellm
- Maps common model patterns to providers (gpt-* -> openai/, llama -> meta/, etc.)
- Falls back to openai/ for unknown models
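One way to picture the pattern-based normalization is a small ordered table of regexes. Only the two mappings named in the commit message are included; the function name `normalize_model_name` and the rule that names already containing a slash are left alone are assumptions.

```python
import re

# (pattern, provider) pairs, checked in order; only the two mappings
# mentioned in the commit message are reproduced here.
_PROVIDER_PATTERNS = [
    (re.compile(r"^gpt-", re.IGNORECASE), "openai"),
    (re.compile(r"llama", re.IGNORECASE), "meta"),
]


def normalize_model_name(model: str) -> str:
    # Assumption: a name that already contains "/" has a provider prefix.
    if "/" in model:
        return model
    for pattern, provider in _PROVIDER_PATTERNS:
        if pattern.search(model):
            return f"{provider}/{model}"
    # Unknown models fall back to the openai/ prefix.
    return f"openai/{model}"


print(normalize_model_name("gpt-4o"))
```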
-
Your Name authored
-
Your Name authored
Created codai/models/cache/__init__.py with:
- get_model_cache_dir()
- get_all_cache_dirs()
- get_cached_model_path()
- is_huggingface_model_id()
- download_huggingface_model()
- download_model()
- list_cached_models()
- remove_cached_model()
- remove_all_cached_models()

This extracts the cache-related functionality into a separate module.
-
Your Name authored
- Rename codai/litellm_backend.py to codai/openai/litellm.py
- Create codai/openai/__init__.py
- Update imports in coderai and codai/__init__.py
-
Your Name authored
- Add litellm to requirements.txt
- Add --parser CLI arg (auto/litellm, default auto)
- Create codai/litellm_backend.py module with:
  - LiteLLMBackend class for standardized responses
  - Rate limit headers (x-ratelimit-remaining-tokens, x-ratelimit-limit-tokens)
  - Qwen tool-call resilience (parse <tool> and <tool_call> tags)
  - Error handling with litellm exception mapping
- Update chat completions endpoint to use litellm when --parser litellm
- Update codai/__init__.py to export litellm components
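The two rate limit headers named above could be produced by a helper along these lines. The function name `rate_limit_headers` and the token-budget inputs are hypothetical; how the backend actually tracks usage is not shown in this log.

```python
def rate_limit_headers(limit_tokens: int, used_tokens: int) -> dict:
    """Build the two rate limit headers as strings (hypothetical helper)."""
    # Never report a negative remaining budget.
    remaining = max(limit_tokens - used_tokens, 0)
    return {
        "x-ratelimit-limit-tokens": str(limit_tokens),
        "x-ratelimit-remaining-tokens": str(remaining),
    }


print(rate_limit_headers(limit_tokens=1000, used_tokens=250))
```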
-
Your Name authored
- Added litellm>=1.40.0 to requirements.txt
- Added --parser argument (auto/litellm, default auto)

Note: Full litellm integration requires significant refactoring of the chat completion endpoints to use litellm.completion() for standardized responses, adding rate limit headers, and error handling.
-
Your Name authored
QwenParser:
- Add repetition guard to handle looping models
- Improve flexible tag matching for tool/tool_call/function_call
- Add JSON recovery for unclosed JSON
- Add circuit breaker after first valid call
- Support <call=name> in coder style fallback

API:
- Add repeat_penalty parameter to ChatCompletionRequest
- Add repeat_penalty parameter to CompletionRequest
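Two of the QwenParser items above, the repetition guard and the recovery of unclosed JSON, can be illustrated with generic techniques. Both functions below are sketches under assumptions: the names `is_looping` and `recover_unclosed_json`, the thresholds, and the bracket-balancing approach are not taken from the project and may differ from what the parser really does.

```python
import json
from typing import Any, Optional


def is_looping(text: str, min_period: int = 8, max_period: int = 50,
               min_repeats: int = 4) -> bool:
    """Return True if the text ends with the same chunk repeated min_repeats times."""
    for period in range(min_period, max_period + 1):
        span = period * min_repeats
        if len(text) < span:
            break
        tail = text[-span:]
        if tail == tail[:period] * min_repeats:
            return True
    return False


def recover_unclosed_json(text: str) -> Optional[Any]:
    """Parse JSON, appending a missing quote and closing brackets if truncated."""
    closers = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            closers.append("}" if ch == "{" else "]")
        elif ch in "}]" and closers:
            closers.pop()
    candidate = text + ('"' if in_string else "") + "".join(reversed(closers))
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


print(recover_unclosed_json('{"name": "get_weather", "arguments": {"city": "Paris"'))
```

The guard only inspects the tail of the output, so it can run cheaply on every streamed chunk; the thresholds would need tuning against real model output.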
-
Your Name authored
- Added pre-cleaning for thinking/special tokens
- Unified tag matching for both <tool> and <tool_call>
- Added markdown code block stripping inside tags
- Added lazy JSON parsing fallback
- Added _parse_coder_style() and _relaxed_val() helper methods
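The pre-cleaning, unified tag matching, and code block stripping listed here might combine as in the sketch below. It assumes that thinking output is wrapped in `<think>` tags and that special tokens look like `<|...|>`; neither detail is confirmed by this log, and `extract_tool_payloads` is a hypothetical name.

```python
import json
import re

_FENCE = "`" * 3  # Markdown code fence marker, built indirectly
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)   # assumed tag
_SPECIAL_RE = re.compile(r"<\|[^|>]*\|>")                  # assumed token shape
_TAG_RE = re.compile(r"<(tool_call|tool)>(.*?)</\1>", re.DOTALL)
_FENCE_RE = re.compile(rf"^{_FENCE}[a-zA-Z]*\s*(.*?)\s*{_FENCE}$", re.DOTALL)


def extract_tool_payloads(text: str) -> list:
    # Step 1: drop thinking blocks and special tokens before matching tags.
    cleaned = _SPECIAL_RE.sub("", _THINK_RE.sub("", text))
    payloads = []
    # Step 2: one pattern covers both <tool> and <tool_call>.
    for _tag, body in _TAG_RE.findall(cleaned):
        body = body.strip()
        # Step 3: unwrap a Markdown code block if the model added one.
        fenced = _FENCE_RE.match(body)
        if fenced:
            body = fenced.group(1).strip()
        try:
            payloads.append(json.loads(body))
        except json.JSONDecodeError:
            continue  # a real parser would try further fallbacks here
    return payloads


print(extract_tool_payloads('<tool>{"name": "g"}</tool>'))
```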
-
Your Name authored
- Added _clean_json_string() method to BaseParser for cleaning JSON strings
- Updated QwenParser.parse() with 3-step parsing strategy:
  1. Qwen format: <tool=func_name>...</tool>
  2. JSON format with flexible tag matching
  3. Fallback coder style with parameter tags
- Fixed syntax issues in the module
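The 3-step strategy can be sketched as an ordered chain of small parsers, each returning None when its format does not apply. The function names are hypothetical, and the sketch assumes that the body of `<tool=func_name>` is a JSON object of arguments and that parameter values are kept as plain strings; the real QwenParser.parse() may differ on both points.

```python
import json
import re
from typing import Optional


def parse_qwen_format(text: str) -> Optional[dict]:
    # Step 1: <tool=func_name>{...}</tool> (JSON body is an assumption).
    m = re.search(r"<tool=([\w.\-]+)>(.*?)</tool>", text, re.DOTALL)
    if not m:
        return None
    try:
        args = json.loads(m.group(2).strip() or "{}")
    except json.JSONDecodeError:
        return None
    return {"name": m.group(1), "arguments": args}


def parse_json_format(text: str) -> Optional[dict]:
    # Step 2: a JSON object inside <tool>, <tool_call> or <function_call>.
    m = re.search(r"<(tool_call|tool|function_call)>(.*?)</\1>", text, re.DOTALL)
    if not m:
        return None
    try:
        data = json.loads(m.group(2).strip())
    except json.JSONDecodeError:
        return None
    if isinstance(data, dict) and "name" in data:
        return {"name": data["name"], "arguments": data.get("arguments", {})}
    return None


def parse_coder_style(text: str) -> Optional[dict]:
    # Step 3: <function=name><parameter=k>v</parameter></function>.
    m = re.search(r"<function=([\w.\-]+)>(.*?)</function>", text, re.DOTALL)
    if not m:
        return None
    params = re.findall(r"<parameter=([\w.\-]+)>(.*?)</parameter>", m.group(2), re.DOTALL)
    return {"name": m.group(1), "arguments": {k: v.strip() for k, v in params}}


def parse_tool_call(text: str) -> Optional[dict]:
    # Try each format in order and stop at the first one that succeeds.
    for step in (parse_qwen_format, parse_json_format, parse_coder_style):
        result = step(text)
        if result is not None:
            return result
    return None


print(parse_tool_call('<tool=get_weather>{"city": "Paris"}</tool>'))
```

Ordering the steps from strictest to loosest keeps a well-formed call from being misread by the more permissive fallback.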
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
- Handle both dict and pydantic model formats for tools
- Add try/except around tool conversion and extraction
- More robust error handling to prevent 500 errors
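Accepting tools as either dicts or pydantic models can be done with duck typing, as in this sketch. It relies on pydantic's `model_dump()` (v2) and `dict()` (v1) methods without importing pydantic; the helper names and the choice to skip malformed entries instead of failing are assumptions.

```python
from typing import Any, Iterable, List, Optional


def tool_to_dict(tool: Any) -> dict:
    """Convert one tool definition to a plain dict (hypothetical helper)."""
    if isinstance(tool, dict):
        return tool
    # pydantic v2 exposes model_dump(); pydantic v1 exposes dict().
    for attr in ("model_dump", "dict"):
        method = getattr(tool, attr, None)
        if callable(method):
            return method()
    raise TypeError(f"Unsupported tool type: {type(tool).__name__}")


def convert_tools(tools: Optional[Iterable[Any]]) -> List[dict]:
    converted = []
    for tool in tools or []:
        try:
            converted.append(tool_to_dict(tool))
        except Exception:
            # Skip a malformed entry so the whole request does not fail.
            continue
    return converted


class FakeTool:
    """Stand-in for a pydantic model, used only for this example."""

    def model_dump(self) -> dict:
        return {"type": "function", "function": {"name": "ping"}}


print(convert_tools([{"type": "function"}, FakeTool(), 42]))
```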
-
Your Name authored
- Move model_parser.py into codai/ directory
- Add __init__.py to make it a proper Python module
- Create ModelParserAdapter class to wrap ModelParserDispatcher
- Replace ToolCallParser() with ModelParserAdapter() in 4 locations
- Update import to use 'from codai import ModelParserDispatcher'

This enables model-specific tool call parsing for Qwen, DeepSeek, Llama, Mistral, Claude, Command R, Gemma, Grok, and Phi models.
-
Your Name authored
- Add Qwen-specific tool call parsing in ToolCallParser
- Support for Instruct-style: <tool_call>{JSON}</tool_call>
- Support for Coder-style: <tool_call><function=name><parameter=k>v</parameter></function></tool_call>
- Add model_name attribute to ToolCallParser for model-specific parsing
- Update ModelManager.load_model to set model name on tool parser
- Fix duplicate method definitions in ToolCallParser class
-