06 Feb, 2026 (40 commits)
-
Stefy Lanza (nextime / spora ) authored
- Return Google's synchronous iterator directly from the provider handler
- Detect Google streaming responses by checking for __iter__ but not __aiter__
- Convert Google chunks to OpenAI format in stream_generator
- Handle both sync (Google) and async (OpenAI/Anthropic) streaming responses
- Fix "'async_generator' object is not iterable" error

This fixes streaming requests through the autoselect and rotation handlers that were failing with the "'async_generator' object is not iterable" error.
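The sync/async detection described above can be sketched as follows. The helper names are illustrative, not the project's actual code; the key idea is that Google's SDK hands back a plain synchronous iterator (has `__iter__`, lacks `__aiter__`), while the OpenAI/Anthropic SDKs return async iterators.

```python
def is_sync_stream(resp) -> bool:
    """Google-style streaming response: a plain synchronous iterator,
    i.e. it exposes __iter__ but not __aiter__."""
    return hasattr(resp, "__iter__") and not hasattr(resp, "__aiter__")


def is_async_stream(resp) -> bool:
    """OpenAI/Anthropic-style streaming response: an async iterator."""
    return hasattr(resp, "__aiter__")
```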
-
Stefy Lanza (nextime / spora ) authored
- Keep stream_generator as an async function (not sync)
- Wrap Google's synchronous iterator in an async generator
- Properly structure the if/else for the streaming vs non-streaming paths
- Fix "client has been closed" error in streaming responses

This fixes the issue where streaming requests through autoselect were failing with the "Cannot send a request, as a client has been closed" error.
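Wrapping a synchronous provider iterator in an async generator, as this commit describes, can look roughly like the minimal sketch below (names are illustrative; the real stream_generator also converts chunk formats):

```python
import asyncio


async def wrap_sync_stream(sync_iter):
    """Yield items from a synchronous iterator inside an async generator,
    handing control back to the event loop between items."""
    for chunk in sync_iter:
        yield chunk
        await asyncio.sleep(0)  # let other coroutines run between chunks


async def collect(aiter):
    """Drain an async iterator into a list (for demonstration)."""
    return [c async for c in aiter]
```

Usage: `asyncio.run(collect(wrap_sync_stream(iter([1, 2, 3]))))` yields the items in order, but through an async interface the FastAPI streaming response can consume.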
-
Stefy Lanza (nextime / spora ) authored
- Ensure the complete chunk object is yielded as a single unit
- Add logging to show the complete chunk structure
- Fix issue where a chunk was being serialized as separate fields
- Maintain the OpenAI-compatible chat.completion.chunk format

This should fix the streaming issue where chunks were being serialized as separate data: lines instead of complete JSON objects.
-
Stefy Lanza (nextime / spora ) authored
- Use generate_content_stream() for streaming requests
- Create an async generator that yields OpenAI-compatible chunks
- Extract text from each stream chunk
- Generate unique chunk IDs
- Format chunks as chat.completion.chunk objects
- Include delta content in each chunk
- Maintain non-streaming functionality for regular requests

This fixes the streaming issue where Google GenAI was returning a dict instead of an iterable, causing "JSONResponse object is not iterable" errors.
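The chunk formatting this commit describes can be sketched as below: raw text extracted from a Google stream chunk is wrapped in an OpenAI-style chat.completion.chunk dict with a unique ID. The function name is hypothetical; the field layout follows the OpenAI streaming schema.

```python
import time
import uuid


def to_openai_chunk(text, model, chunk_id=None):
    """Wrap a piece of streamed text in an OpenAI-compatible
    chat.completion.chunk dict."""
    return {
        "id": chunk_id or f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {"index": 0, "delta": {"content": text}, "finish_reason": None}
        ],
    }
```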
-
Stefy Lanza (nextime / spora ) authored
- Test non-streaming requests to the autoselect endpoint
- Test streaming requests to the autoselect endpoint
- Test listing available providers
- Test listing models for the autoselect endpoint
- Use model 'autoselect' for the autoselect endpoint
- Include jq installation instructions for formatted output

Run with: ./test_proxy.sh
-
Stefy Lanza (nextime / spora ) authored
- Remove ChatCompletionResponse validation from GoogleProviderHandler
- Remove ChatCompletionResponse validation from AnthropicProviderHandler
- Return the raw response dict directly
- Add logging to show the response dict keys
- This tests whether Pydantic validation was causing serialization issues

Testing whether removing validation fixes the client-side "Cannot read properties of undefined" errors.
-
Stefy Lanza (nextime / spora ) authored
GoogleProviderHandler:
- Wrap the validated response dict in JSONResponse before returning
- Add logging to confirm a JSONResponse is being returned
- Ensures proper JSON serialization for Google GenAI responses

AnthropicProviderHandler:
- Wrap the validated response dict in JSONResponse before returning
- Add logging to confirm a JSONResponse is being returned
- Ensures proper JSON serialization for Anthropic responses

RequestHandler:
- Remove JSONResponse wrapping (now handled by the providers)
- Update logging to detect JSONResponse vs dict responses
- OpenAI and Ollama providers return raw dicts (already compatible)

This fixes client-side "Cannot read properties of undefined" errors by ensuring Google and Anthropic responses are properly serialized as JSONResponse, while leaving OpenAI and Ollama responses as-is since they are already OpenAI-compatible.
-
Stefy Lanza (nextime / spora ) authored
- Import JSONResponse from fastapi.responses
- Explicitly wrap the response dict in a JSONResponse
- Add logging to confirm a JSONResponse is being returned
- This ensures FastAPI properly serializes the response dict
- Fixes potential serialization issues causing client-side errors
-
Stefy Lanza (nextime / spora ) authored
- Log the response type and full response object
- Log the response keys to verify structure
- Check whether the 'choices' key exists
- Verify choices is a non-empty list
- Log choices[0] content if available
- Add error logging for missing or malformed response structure

This will help identify why clients are getting "Cannot read properties of undefined (reading '0')" errors when accessing response.choices[0].
-
Stefy Lanza (nextime / spora ) authored
GoogleProviderHandler enhancements:
- Process all parts in the response content (not just the first part)
- Extract and combine all text parts
- Detect and convert Google function_call to the OpenAI tool_calls format
- Generate unique call IDs for tool calls
- Handle function responses for debugging
- Set content to None when tool_calls are present (OpenAI convention)
- Add comprehensive logging for tool call detection and conversion
- Support both text and function/tool calls in the same response
- Validate the response against the ChatCompletionResponse Pydantic model
- Add detailed response structure logging

AnthropicProviderHandler enhancements:
- Process all content blocks (not just text)
- Detect and convert Anthropic tool_use blocks to the OpenAI tool_calls format
- Generate unique call IDs for tool calls
- Combine all text parts from multiple blocks
- Set content to None when tool_calls are present (OpenAI convention)
- Add comprehensive logging for tool_use detection and conversion
- Validate the response against the ChatCompletionResponse Pydantic model
- Add detailed response structure logging

Both handlers now properly translate provider-specific function-calling formats to the OpenAI-compatible tool_calls structure, ensuring clients receive valid structured responses with proper schema validation.
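The conversion at the core of both handlers can be sketched as below: a provider-side function call (name plus an arguments dict) becomes an OpenAI-style tool_calls entry with a generated call ID and JSON-encoded arguments. The helper name and the simplified input shape are assumptions; the real SDK objects carry more structure.

```python
import json
import uuid


def function_call_to_tool_call(name, args):
    """Convert a provider function call (name + args dict) into an
    OpenAI-compatible tool_calls entry."""
    return {
        "id": f"call_{uuid.uuid4().hex[:24]}",  # unique call ID
        "type": "function",
        "function": {"name": name, "arguments": json.dumps(args)},
    }
```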
-
Stefy Lanza (nextime / spora ) authored
- Log the response type and all attributes
- Log the candidates structure and length
- Log candidate attributes and content structure
- Log the parts structure and first-part details
- Log raw text extraction and parsing steps
- Log the final extracted text and finish reason
- Add error logging with a full stack trace for exceptions

This will help identify where the parsing is failing when responses show as 'parsed=None' and clients receive nothing.
-
Stefy Lanza (nextime / spora ) authored
- Properly extract finish_reason from the candidate object and map it to the OpenAI format
- Correctly extract usage metadata from the response.usage_metadata structure
- Extract prompt_token_count, candidates_token_count, and total_token_count
- Add logging for usage metadata extraction
- Handle Google finish reasons: STOP, MAX_TOKENS, SAFETY, RECITATION, OTHER

This fixes the issue where Gemini responses were arriving corrupted at OpenAI-compatible clients due to incorrect parsing of the new Google GenAI SDK response structure.
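A finish_reason mapping of the kind this commit describes might look like the sketch below. The exact target values are assumptions (e.g. mapping SAFETY and RECITATION to content_filter follows OpenAI conventions but may differ from the project's actual table):

```python
# Google finish reason -> OpenAI finish_reason (assumed mapping)
GOOGLE_FINISH_REASONS = {
    "STOP": "stop",
    "MAX_TOKENS": "length",
    "SAFETY": "content_filter",
    "RECITATION": "content_filter",
    "OTHER": "stop",
}


def map_finish_reason(google_reason):
    """Map a Google finish reason to the OpenAI vocabulary, defaulting
    to 'stop' for anything unrecognized."""
    return GOOGLE_FINISH_REASONS.get(str(google_reason), "stop")
```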
-
Stefy Lanza (nextime / spora ) authored
- Google: Parse formatted responses with JSON structure (e.g., "assistant: [{'type': 'text', 'text': '...'}]")
- Anthropic: Extract text from content blocks and map stop_reason to finish_reason
- Both handlers now return properly formatted OpenAI-compatible responses
- Ensures clients receive correctly structured messages without malformed content

-
Stefy Lanza (nextime / spora ) authored
- Added an AISBF_DEBUG check to control verbose message content logging
- Messages are only dumped when AISBF_DEBUG=true/1/yes
- Otherwise only the message count is logged
- Applies to the Google, OpenAI, and Anthropic provider handlers
- Reduces log verbosity in production while maintaining debug capability
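A minimal sketch of the AISBF_DEBUG gate, assuming the accepted truthy values are exactly the ones the commit lists (the helper name and env-injection parameter are illustrative):

```python
import os


def debug_enabled(env=None):
    """Return True when AISBF_DEBUG is set to true/1/yes
    (case-insensitive); verbose message dumps are gated on this."""
    env = env if env is not None else os.environ
    return env.get("AISBF_DEBUG", "").strip().lower() in ("1", "true", "yes")
```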
-
Stefy Lanza (nextime / spora ) authored
- Fixed GoogleProviderHandler to properly extract text from response.candidates[0].content.parts[0].text
- Added error handling for text extraction
- Resolves the client error "Cannot read properties of undefined (reading '0')"
- The Google GenAI SDK returns a nested response structure, not a direct .text property
-
Stefy Lanza (nextime / spora ) authored
- Fixed GoogleProviderHandler to return the OpenAI-style response format
- Added tools and tool_choice parameters to OllamaProviderHandler (accepted but ignored)
- Fixed OpenAI message building to properly handle tool messages with tool_call_id
- Fixed max_tokens handling to avoid passing null values to APIs
- Converted the Ollama response to the OpenAI-style format for consistency

This fixes the following errors:
- "Cannot read properties of undefined (reading '0')" (Google response format issue)
- "OllamaProviderHandler.handle_request() got an unexpected keyword argument 'tools'"
- "for 'role:tool' the following must be satisfied[('messages.23.tool_call_id' : property 'tool_call_id' is missing)]"
- "Invalid input: expected number, received null" for the max_tokens parameter

-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Add optional tool_call_id field to the Message model
- Required for tool response messages (role:tool)
- Identifies which tool call the response is for
- Fixes 400 errors for missing tool_call_id in tool messages
-
Stefy Lanza (nextime / spora ) authored
- Pass autoselect_config to _get_model_selection in handle_autoselect_streaming_request
- Fixes "TypeError: missing 1 required positional argument"
- Ensures streaming autoselect requests use the configured selection_model
-
Stefy Lanza (nextime / spora ) authored
- Add optional tool_calls field to the Message model
- Make the content field optional with default None
- Allows assistant messages with tool_calls instead of content
- Fixes 422 validation errors for tool call messages
- Supports the OpenAI message format with function calls
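The project validates messages with a Pydantic model; this stdlib dataclass sketch only shows the field shape that this commit and the earlier tool_call_id commit describe (optional content, optional tool_calls, optional tool_call_id):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    role: str
    content: Optional[str] = None             # None allowed when tool_calls are present
    tool_calls: Optional[List[dict]] = None   # assistant messages calling tools
    tool_call_id: Optional[str] = None        # required for role == "tool" replies
```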
-
Stefy Lanza (nextime / spora ) authored
- Import RequestValidationError from fastapi.exceptions
- Update the exception handler to catch RequestValidationError instead of matching on status code
- Add console logging for immediate visibility of validation errors
- Log validation error details using the exc.errors() method
-
Stefy Lanza (nextime / spora ) authored
- Add an exception handler to catch and log validation errors
- Log the request path, method, headers, and raw body
- Log validation error details from FastAPI
- Helps diagnose why requests are failing validation
-
Stefy Lanza (nextime / spora ) authored
- Change host from 0.0.0.0 to 127.0.0.1 for improved security
- Change port from 8000 to 17765 to match the main.py default
- Ensures consistency between development and production modes
-
Stefy Lanza (nextime / spora ) authored
- Log the raw request body before validation to diagnose 422 errors
- Log request headers and path for debugging
- Make the Message content field more flexible with a List type
- Helps identify validation issues in incoming requests
-
Stefy Lanza (nextime / spora ) authored
- Set the default value for selection_model to 'general' in AutoselectConfig
- Maintains backward compatibility with existing configuration files
- Prevents 422 errors when loading configs without a selection_model field
-
Stefy Lanza (nextime / spora ) authored
- Add selection_model field to the AutoselectConfig model
- Update _get_model_selection to use autoselect_config.selection_model instead of a hardcoded 'general'
- Update handle_autoselect_request to log selection_model from the config
- Update handle_autoselect_streaming_request to log selection_model from the config
- Allows flexible configuration of which rotation to use for model selection
-
Stefy Lanza (nextime / spora ) authored
- Add selection_model field to specify which rotation to use for model selection
- Default value is the 'general' rotation
- Allows explicit control over which rotation's models are available for autoselect
- Provides flexibility in configuring autoselect behavior
-
Stefy Lanza (nextime / spora ) authored
- Changed max_retries from 2 to 5 in RotationHandler.handle_rotation_request
- Provides more opportunities to find a working model when errors occur
- Especially helpful for tool call errors and other transient failures
- Improves reliability of rotation and autoselect model selection
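The retry behavior described here can be sketched as a simple loop over the rotation's models, bounded by max_retries (names and structure are illustrative; the real handler also records failures and weights model choice):

```python
def retry_rotation(models, call, max_retries=5):
    """Try calling up to max_retries models in order; return the first
    success, or re-raise the last failure if every attempt errors."""
    last_exc = None
    for _attempt, model in zip(range(max_retries), models):
        try:
            return call(model)
        except Exception as exc:  # record failure, move on to the next model
            last_exc = exc
    raise last_exc
```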
-
Stefy Lanza (nextime / spora ) authored
- Add tools and tool_choice fields to the ChatCompletionRequest model
- Update OpenAIProviderHandler to accept and pass the tools/tool_choice parameters
- Update handlers to pass tools/tool_choice from the request to the provider
- Treat tool call errors during streaming as provider failures
- Record the failure and re-raise to trigger a retry with the next model in the rotation
- Allows proper tool/function-calling support through the proxy
- Resolves the "Tool choice is none, but model called a tool" error by retrying with another model
-
Stefy Lanza (nextime / spora ) authored
- Log the chunk type and content before the serialization attempt
- Log the chunk type and content when serialization fails
- Helps diagnose "Tool choice is none, but model called a tool" errors
- Apply debug logging to both the RequestHandler and AutoselectHandler streaming methods
-
Stefy Lanza (nextime / spora ) authored
- Add a try/except around chunk serialization in the stream_generator functions
- Skip chunks that fail to serialize (e.g., tool calls without tool_choice)
- Log warnings for chunk serialization errors
- Prevent streaming failures when models attempt tool calls without proper configuration
- Apply the fix to both the RequestHandler and AutoselectHandler streaming methods
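The defensive serialization this commit describes can be sketched as below: chunks that cannot be JSON-encoded are skipped instead of aborting the whole stream (the helper name is illustrative, and the real code logs a warning where this sketch just continues):

```python
import json


def serialize_chunks(chunks):
    """JSON-encode each chunk; silently skip any chunk that fails to
    serialize rather than killing the stream."""
    out = []
    for chunk in chunks:
        try:
            out.append(json.dumps(chunk))
        except (TypeError, ValueError):
            # e.g. SDK objects that are not plain dicts; log and skip
            continue
    return out
```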
-
Stefy Lanza (nextime / spora ) authored
- Add a Key Features section to README.md
- Describe Rotation Models with weighted load balancing and automatic failover
- Describe Autoselect Models with AI-powered content analysis
- Update Rotation Endpoints with detailed model descriptions
- Update Autoselect Endpoints with detailed model descriptions
- Add a comprehensive Rotation Models section to DOCUMENTATION.md
- Add a comprehensive Autoselect Models section to DOCUMENTATION.md
- Include example use cases for both rotation and autoselect models
- Update the overview with key features and capabilities
- Document the fallback to 'general' when autoselect cannot choose a model
-
Stefy Lanza (nextime / spora ) authored
- Update version to 0.3.0 in setup.py, pyproject.toml, and aisbf/__init__.py
-
Stefy Lanza (nextime / spora ) authored
- Update host from 0.0.0.0 to 127.0.0.1 for localhost-only access
- Update port from 8000 to 17765
- Update the log message to reflect the new address
-
Stefy Lanza (nextime / spora ) authored
- Add a prominent ABSOLUTELY CRITICAL section emphasizing the only-output requirement
- Explicitly state NO additional text, explanations, or commentary
- Add repeated warnings about outputting nothing except the single tag
- Clarify that any extra text will cause a system failure
- Add examples of what NOT to include in the response
-
Stefy Lanza (nextime / spora ) authored
- Properly serialize Stream chunks to JSON format
- Convert ChatCompletionChunk objects using model_dump()
- Apply the fix to both the RequestHandler and AutoselectHandler streaming methods
- Resolves socket.send() exceptions during streaming
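A sketch of the serialization this commit describes: objects that expose model_dump() (Pydantic-based SDK chunk models do) are converted to plain dicts before JSON encoding into an SSE data: line. The helper name is illustrative; the data:/blank-line framing follows the SSE wire format OpenAI-compatible clients expect.

```python
import json


def chunk_to_sse(chunk):
    """Serialize a stream chunk (SDK model or plain dict) into a
    server-sent-events data line."""
    payload = chunk.model_dump() if hasattr(chunk, "model_dump") else chunk
    return f"data: {json.dumps(payload)}\n\n"
```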
-
Stefy Lanza (nextime / spora ) authored
- Fixed AttributeError when stream=True is passed to the OpenAI client
- Changed the return type to Union[Dict, object] to support streaming
- Added a conditional check to return the Stream object for streaming requests
- Bumped version to 0.2.7
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-