06 Feb, 2026 · 40 commits
    • Stefy Lanza (nextime / spora )'s avatar
      Fix Google GenAI streaming response handling · 77c08ee2
      Stefy Lanza (nextime / spora ) authored
      - Return Google's synchronous iterator directly from provider handler
      - Detect Google streaming responses by checking for __iter__ but not __aiter__
      - Convert Google chunks to OpenAI format in stream_generator
      - Handle both sync (Google) and async (OpenAI/Anthropic) streaming responses
      - Fix 'async_generator object is not iterable' error
      
      This fixes streaming requests through autoselect and rotation handlers
      that were failing with 'async_generator' object is not iterable error.
      77c08ee2
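The `__iter__`-but-not-`__aiter__` detection described above can be sketched roughly as follows (names such as `iterate_any` are illustrative, not taken from the codebase): Google's SDK hands back a plain synchronous iterator, while the OpenAI and Anthropic SDKs return async streams.

```python
def is_sync_stream(obj) -> bool:
    # Google's SDK returns a plain synchronous iterator: it exposes
    # __iter__ but, unlike OpenAI/Anthropic async streams, no __aiter__.
    return hasattr(obj, "__iter__") and not hasattr(obj, "__aiter__")

async def iterate_any(stream):
    # Yield chunks uniformly from either a sync (Google) or an
    # async (OpenAI/Anthropic) streaming response.
    if is_sync_stream(stream):
        for chunk in stream:
            yield chunk
    else:
        async for chunk in stream:
            yield chunk
```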
    • Stefy Lanza (nextime / spora )'s avatar
      Fix Google GenAI streaming handler to use async generator · 81e9a8f5
      Stefy Lanza (nextime / spora ) authored
      - Keep stream_generator as async function (not sync)
      - Wrap Google's synchronous iterator in async generator
      - Properly structure if/else for streaming vs non-streaming paths
      - Fix 'client has been closed' error in streaming responses
      
      This fixes the issue where streaming requests through autoselect
      were failing with 'Cannot send a request, as a client has been closed'
      error.
      81e9a8f5
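One way to wrap a blocking iterator in an async generator without starving the event loop is to pull each item in a worker thread; whether the commit offloads to a thread or iterates inline is an assumption here, and `wrap_sync_stream` is a hypothetical name.

```python
import asyncio

async def wrap_sync_stream(sync_iter):
    # Pull from the blocking Google iterator in a worker thread so the
    # event loop (and the HTTP client behind it) stays responsive while
    # chunks are produced.
    loop = asyncio.get_running_loop()
    sentinel = object()
    it = iter(sync_iter)
    while True:
        chunk = await loop.run_in_executor(None, next, it, sentinel)
        if chunk is sentinel:
            break
        yield chunk
```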
    • Stefy Lanza (nextime / spora )'s avatar
      Fix streaming chunk structure for Google GenAI · 63268f97
      Stefy Lanza (nextime / spora ) authored
      - Ensure complete chunk object is yielded as single unit
      - Add logging to show complete chunk structure
      - Fix issue where chunk was being serialized as separate fields
      - Maintain OpenAI-compatible chat.completion.chunk format
      
      This should fix the streaming issue where chunks were being
      serialized as separate data: lines instead of complete
      JSON objects.
      63268f97
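The chunk shape and framing these two commits are after can be sketched as follows: each chunk is one complete OpenAI-style `chat.completion.chunk` object, serialized as a single `data:` line (helper names are illustrative):

```python
import json
import time
import uuid

def make_chunk(model: str, text: str) -> dict:
    # Minimal OpenAI-compatible chat.completion.chunk object with the
    # delta content for this piece of the stream.
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {"index": 0, "delta": {"content": text}, "finish_reason": None}
        ],
    }

def sse_encode(chunk: dict) -> str:
    # One complete JSON object per "data:" line; emitting fields
    # separately produces the broken framing this commit fixed.
    return f"data: {json.dumps(chunk)}\n\n"
```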
    • Stefy Lanza (nextime / spora )'s avatar
      Implement streaming support for Google GenAI provider · 8360e33b
      Stefy Lanza (nextime / spora ) authored
      - Use generate_content_stream() for streaming requests
      - Create async generator that yields OpenAI-compatible chunks
      - Extract text from each stream chunk
      - Generate unique chunk IDs
      - Format chunks as chat.completion.chunk objects
      - Include delta content in each chunk
      - Maintain non-streaming functionality for regular requests
      
      This fixes the streaming issue where Google GenAI was returning
      a dict instead of an iterable, causing 'JSONResponse object is
      not iterable' errors.
      8360e33b
    • Stefy Lanza (nextime / spora )'s avatar
      Create test script for AISBF proxy · 3c7bec4c
      Stefy Lanza (nextime / spora ) authored
      - Test non-streaming requests to autoselect endpoint
      - Test streaming requests to autoselect endpoint
      - Test listing available providers
      - Test listing models for autoselect endpoint
      - Use model 'autoselect' for autoselect endpoint
      - Include jq installation instructions for formatted output
      
      Run with: ./test_proxy.sh
      3c7bec4c
    • Stefy Lanza (nextime / spora )'s avatar
      Remove Pydantic validation to test serialization · 6a1fc753
      Stefy Lanza (nextime / spora ) authored
      - Remove ChatCompletionResponse validation from GoogleProviderHandler
      - Remove ChatCompletionResponse validation from AnthropicProviderHandler
      - Return raw response dict directly
      - Add logging to show response dict keys
      - This tests if Pydantic validation was causing serialization issues
      
      Testing if removing validation fixes client-side 'Cannot read properties
      of undefined' errors.
      6a1fc753
    • Stefy Lanza (nextime / spora )'s avatar
      Wrap Google and Anthropic provider responses in JSONResponse · 4760277f
      Stefy Lanza (nextime / spora ) authored
      GoogleProviderHandler:
      - Wrap validated response dict in JSONResponse before returning
      - Add logging to confirm JSONResponse is being returned
      - Ensures proper JSON serialization for Google GenAI responses
      
      AnthropicProviderHandler:
      - Wrap validated response dict in JSONResponse before returning
      - Add logging to confirm JSONResponse is being returned
      - Ensures proper JSON serialization for Anthropic responses
      
      RequestHandler:
      - Remove JSONResponse wrapping (now handled by providers)
      - Update logging to detect JSONResponse vs dict responses
      - OpenAI and Ollama providers return raw dicts (already compatible)
      
      This fixes client-side 'Cannot read properties of undefined' errors by ensuring
      Google and Anthropic responses are properly serialized as JSONResponse,
      while leaving OpenAI and Ollama responses as-is since they're already
      OpenAI-compatible.
      4760277f
    • Stefy Lanza (nextime / spora )'s avatar
      Wrap response in JSONResponse for proper serialization · d4a92e37
      Stefy Lanza (nextime / spora ) authored
      - Import JSONResponse from fastapi.responses
      - Explicitly wrap response dict in JSONResponse
      - Add logging to confirm JSONResponse is being returned
      - This ensures FastAPI properly serializes the response dict
      - Fixes potential serialization issues causing client-side errors
      d4a92e37
    • Stefy Lanza (nextime / spora )'s avatar
      Add response structure logging in RequestHandler · 94f17378
      Stefy Lanza (nextime / spora ) authored
      - Log response type and full response object
      - Log response keys to verify structure
      - Check if 'choices' key exists
      - Verify choices is a list and not empty
      - Log choices[0] content if available
      - Add error logging for missing or malformed response structure
      
      This will help identify why clients are getting 'Cannot read properties
      of undefined (reading '0')' errors when accessing response.choices[0].
      94f17378
    • Stefy Lanza (nextime / spora )'s avatar
      Add comprehensive tool calls support for Google and Anthropic providers · 60ca20d2
      Stefy Lanza (nextime / spora ) authored
      GoogleProviderHandler enhancements:
      - Process all parts in response content (not just first part)
      - Extract and combine all text parts
      - Detect and convert Google function_call to OpenAI tool_calls format
      - Generate unique call IDs for tool calls
      - Handle function responses for debugging
      - Set content to None when tool_calls are present (OpenAI convention)
      - Add comprehensive logging for tool call detection and conversion
      - Support both text and function/tool calls in same response
      - Validate response against ChatCompletionResponse Pydantic model
      - Add detailed response structure logging
      
      AnthropicProviderHandler enhancements:
      - Process all content blocks (not just text)
      - Detect and convert Anthropic tool_use blocks to OpenAI tool_calls format
      - Generate unique call IDs for tool calls
      - Combine all text parts from multiple blocks
      - Set content to None when tool_calls are present (OpenAI convention)
      - Add comprehensive logging for tool_use detection and conversion
      - Validate response against ChatCompletionResponse Pydantic model
      - Add detailed response structure logging
      
      Both handlers now properly translate provider-specific function calling
      formats to OpenAI-compatible tool_calls structure, ensuring clients receive
      valid structured responses with proper schema validation.
      60ca20d2
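The shared translation both handlers perform can be sketched like this, assuming only what the commit message states: a provider function call carries a name plus an arguments mapping, and content goes to None when tool_calls are present (helper names are hypothetical):

```python
import json
import uuid

def to_openai_tool_call(name: str, arguments: dict) -> dict:
    # Convert a Google function_call or Anthropic tool_use block into an
    # OpenAI-style tool_calls entry with a unique call ID.
    return {
        "id": f"call_{uuid.uuid4().hex[:24]}",
        "type": "function",
        "function": {"name": name, "arguments": json.dumps(arguments)},
    }

def build_assistant_message(text_parts, tool_calls):
    # Combine all text parts; per OpenAI convention, content is None
    # whenever tool_calls are present.
    msg = {"role": "assistant", "content": "".join(text_parts) or None}
    if tool_calls:
        msg["tool_calls"] = tool_calls
        msg["content"] = None
    return msg
```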
    • Stefy Lanza (nextime / spora )'s avatar
      Add comprehensive debug logging for Google GenAI response parsing · 627f1407
      Stefy Lanza (nextime / spora ) authored
      - Log response type and all attributes
      - Log candidates structure and length
      - Log candidate attributes and content structure
      - Log parts structure and first part details
      - Log raw text extraction and parsing steps
      - Log final extracted text and finish reason
      - Add error logging with full stack trace for exceptions
      
      This will help identify where the parsing is failing when responses
      show as 'parsed=None' and clients receive nothing.
      627f1407
    • Stefy Lanza (nextime / spora )'s avatar
      Fix Google GenAI response translation to OpenAI format · 9c5d68d9
      Stefy Lanza (nextime / spora ) authored
      - Properly extract finish_reason from candidate object and map to OpenAI format
      - Correctly extract usage metadata from response.usage_metadata structure
      - Extract prompt_token_count, candidates_token_count, and total_token_count
      - Add logging for usage metadata extraction
      - Handle Google finish reasons: STOP, MAX_TOKENS, SAFETY, RECITATION, OTHER
      
      This fixes the issue where Gemini responses reached OpenAI-compatible
      clients corrupted, caused by incorrect parsing of the new Google GenAI
      SDK response structure.
      9c5d68d9
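A minimal sketch of the two mappings this commit describes; the exact OpenAI values chosen for SAFETY/RECITATION/OTHER are inferred from the commit message, not read from the source:

```python
# Google GenAI candidate finish reasons -> OpenAI finish_reason values.
GOOGLE_FINISH_REASONS = {
    "STOP": "stop",
    "MAX_TOKENS": "length",
    "SAFETY": "content_filter",
    "RECITATION": "content_filter",
    "OTHER": "stop",
}

def map_finish_reason(reason) -> str:
    # Fall back to "stop" for any unrecognized reason.
    return GOOGLE_FINISH_REASONS.get(str(reason).upper(), "stop")

def map_usage(usage_metadata) -> dict:
    # Translate response.usage_metadata fields to OpenAI usage keys.
    return {
        "prompt_tokens": getattr(usage_metadata, "prompt_token_count", 0),
        "completion_tokens": getattr(usage_metadata, "candidates_token_count", 0),
        "total_tokens": getattr(usage_metadata, "total_token_count", 0),
    }
```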
    • Stefy Lanza (nextime / spora )'s avatar
      Fix: Translate Google and Anthropic responses to OpenAI format · 3a46b7ad
      Stefy Lanza (nextime / spora ) authored
      - Google: Parse formatted responses with JSON structure (e.g., 'assistant: [{'type': 'text', 'text': '...'}]')
      - Anthropic: Extract text from content blocks and map stop_reason to finish_reason
      - Both handlers now return properly formatted OpenAI-compatible responses
      - Ensures clients receive correctly structured messages without malformed content
      3a46b7ad
    • Stefy Lanza (nextime / spora )'s avatar
      Add AISBF_DEBUG environment variable for conditional message logging · d0cdc55a
      Stefy Lanza (nextime / spora ) authored
      - Added AISBF_DEBUG check to control verbose message content logging
      - Messages are only dumped when AISBF_DEBUG=true/1/yes
      - Otherwise only message count is logged
      - Applies to Google, OpenAI, and Anthropic provider handlers
      - Reduces log verbosity in production while maintaining debug capability
      d0cdc55a
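The environment check is likely just a truthy-string comparison along these lines (the accepted values true/1/yes come from the commit message; the helper name is illustrative):

```python
import os

def debug_enabled() -> bool:
    # AISBF_DEBUG=true/1/yes enables verbose message-content dumps;
    # anything else (or unset) logs only the message count.
    return os.getenv("AISBF_DEBUG", "").strip().lower() in {"true", "1", "yes"}
```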
    • Stefy Lanza (nextime / spora )'s avatar
      Fix: Extract text from nested Google GenAI response structure · 631d4c4d
      Stefy Lanza (nextime / spora ) authored
      - Fixed GoogleProviderHandler to properly extract text from response.candidates[0].content.parts[0].text
      - Added error handling for text extraction
      - Resolves client error 'Cannot read properties of undefined (reading '0')'
      - The Google GenAI SDK returns a nested response structure, not a direct .text property
      631d4c4d
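The guarded extraction described above amounts to reaching through the nested path with error handling, roughly:

```python
def extract_text(response) -> str:
    # The new Google GenAI SDK nests the text at
    # response.candidates[0].content.parts[0].text (no top-level .text).
    try:
        return response.candidates[0].content.parts[0].text
    except (AttributeError, IndexError, TypeError):
        # Empty candidates/parts or a missing attribute yields no text
        # instead of crashing the handler.
        return ""
```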
    • Stefy Lanza (nextime / spora )'s avatar
      Fix provider handler errors · 871fcdaf
      Stefy Lanza (nextime / spora ) authored
      - Fixed GoogleProviderHandler to return OpenAI-style response format
      - Added tools and tool_choice parameters to OllamaProviderHandler (accepted but ignored)
      - Fixed OpenAI message building to properly handle tool messages with tool_call_id
      - Fixed max_tokens handling to avoid passing null values to APIs
      - Converted Ollama response to OpenAI-style format for consistency
      
      This fixes the following errors:
      - 'Cannot read properties of undefined (reading '0')' - Google response format issue
      - 'OllamaProviderHandler.handle_request() got an unexpected keyword argument 'tools''
      - 'for 'role:tool' the following must be satisfied[('messages.23.tool_call_id' : property 'tool_call_id' is missing)]'
      - 'Invalid input: expected number, received null' for max_tokens parameter
      871fcdaf
    • Stefy Lanza (nextime / spora )'s avatar
      Try to fix... · ef580a2b
      Stefy Lanza (nextime / spora ) authored
      ef580a2b
    • Stefy Lanza (nextime / spora )'s avatar
      Add tool_call_id field to Message model · 38a58f51
      Stefy Lanza (nextime / spora ) authored
      - Add optional tool_call_id field to Message model
      - Required for tool response messages (role:tool)
      - Identifies which tool call the response is for
      - Fixes 400 errors for missing tool_call_id in tool messages
      38a58f51
    • Stefy Lanza (nextime / spora )'s avatar
      Fix missing autoselect_config parameter in streaming request · 948f7b63
      Stefy Lanza (nextime / spora ) authored
      - Pass autoselect_config to _get_model_selection in handle_autoselect_streaming_request
      - Fixes TypeError: missing 1 required positional argument
      - Ensures streaming autoselect requests use the configured selection_model
      948f7b63
    • Stefy Lanza (nextime / spora )'s avatar
      Add tool_calls field and make content optional in Message model · 11082190
      Stefy Lanza (nextime / spora ) authored
      - Add optional tool_calls field to Message model
      - Make content field optional with default None
      - Allows assistant messages with tool_calls instead of content
      - Fixes 422 validation errors for tool call messages
      - Supports OpenAI message format with function calls
      11082190
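Taken together with the earlier tool_call_id commit, the relaxed Message model probably looks something like this sketch (field names are from the commit messages; the actual model may carry more fields):

```python
from typing import Any, Dict, List, Optional
from pydantic import BaseModel

class Message(BaseModel):
    role: str
    # Optional: assistant messages may carry tool_calls instead of text.
    content: Optional[str] = None
    # Assistant-side function calls in OpenAI format.
    tool_calls: Optional[List[Dict[str, Any]]] = None
    # Required by providers on role:"tool" replies to link the response
    # back to the originating call.
    tool_call_id: Optional[str] = None
```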
    • Stefy Lanza (nextime / spora )'s avatar
      Fix exception handler to use RequestValidationError · bd5b2939
      Stefy Lanza (nextime / spora ) authored
      - Import RequestValidationError from fastapi.exceptions
      - Update exception handler to catch RequestValidationError instead of status code
      - Add console logging for immediate visibility of validation errors
      - Log validation error details using exc.errors() method
      bd5b2939
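The handler's core is catching RequestValidationError and logging `exc.errors()`; the log-line formatting can be sketched framework-free (the FastAPI `@app.exception_handler` wiring is omitted, and the helper name is hypothetical):

```python
def summarize_validation_errors(errors) -> str:
    # Turn FastAPI's exc.errors() list, e.g.
    # [{"loc": ("body", "messages", 0, "role"), "msg": "field required", ...}],
    # into one readable log line.
    parts = []
    for err in errors:
        loc = ".".join(str(p) for p in err.get("loc", ()))
        parts.append(f"{loc}: {err.get('msg', 'invalid')}")
    return "; ".join(parts)
```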
    • Stefy Lanza (nextime / spora )'s avatar
      Add exception handler for 422 validation errors · dce9a861
      Stefy Lanza (nextime / spora ) authored
      - Add exception handler to catch and log validation errors
      - Log request path, method, headers, and raw body
      - Log validation error details from FastAPI
      - Helps diagnose why requests are failing validation
      dce9a861
    • Stefy Lanza (nextime / spora )'s avatar
      Update start_proxy.sh to use 127.0.0.1:17765 by default · 9fea17b2
      Stefy Lanza (nextime / spora ) authored
      - Change host from 0.0.0.0 to 127.0.0.1 for improved security
      - Change port from 8000 to 17765 to match main.py default
      - Ensures consistency between development and production modes
      9fea17b2
    • Stefy Lanza (nextime / spora )'s avatar
      Add debug logging for autoselect request validation · 78bd7ea5
      Stefy Lanza (nextime / spora ) authored
      - Log raw request body before validation to diagnose 422 errors
      - Log request headers and path for debugging
      - Make Message content field more flexible with List type
      - Helps identify validation issues in incoming requests
      78bd7ea5
    • Stefy Lanza (nextime / spora )'s avatar
      Make selection_model field optional with default value · b560e363
      Stefy Lanza (nextime / spora ) authored
      - Set default value for selection_model to 'general' in AutoselectConfig
      - Maintains backward compatibility with existing configuration files
      - Prevents 422 errors when loading configs without selection_model field
      b560e363
    • Stefy Lanza (nextime / spora )'s avatar
      Use selection_model field from autoselect configuration · 3b6feed8
      Stefy Lanza (nextime / spora ) authored
      - Add selection_model field to AutoselectConfig model
      - Update _get_model_selection to use autoselect_config.selection_model instead of hardcoded 'general'
      - Update handle_autoselect_request to log selection_model from config
      - Update handle_autoselect_streaming_request to log selection_model from config
      - Allows flexible configuration of which rotation to use for model selection
      3b6feed8
    • Stefy Lanza (nextime / spora )'s avatar
      Add selection_model field to autoselect configuration · dde30272
      Stefy Lanza (nextime / spora ) authored
      - Add selection_model field to specify which rotation to use for model selection
      - Default value is 'general' rotation
      - Allows explicit control over which rotation models are available for autoselect
      - Provides flexibility in configuring autoselect behavior
      dde30272
    • Stefy Lanza (nextime / spora )'s avatar
      Increase max retries for rotation and autoselect models from 2 to 5 · 7fcfacfe
      Stefy Lanza (nextime / spora ) authored
      - Changed max_retries from 2 to 5 in RotationHandler.handle_rotation_request
      - Provides more opportunities to find a working model when errors occur
      - Especially helpful for tool call errors and other transient failures
      - Improves reliability of rotation and autoselect model selection
      7fcfacfe
    • Stefy Lanza (nextime / spora )'s avatar
      Add support for tools and tool_choice with retry on tool call errors · e4148fcf
      Stefy Lanza (nextime / spora ) authored
      - Add tools and tool_choice fields to ChatCompletionRequest model
      - Update OpenAIProviderHandler to accept and pass tools/tool_choice parameters
      - Update handlers to pass tools/tool_choice from request to provider
      - Treat tool call errors during streaming as provider failures
      - Record failure and re-raise to trigger retry with next model in rotation
      - Allows proper tool/function calling support through the proxy
      - Resolves 'Tool choice is none, but model called a tool' error by retrying with another model
      e4148fcf
    • Stefy Lanza (nextime / spora )'s avatar
      Add debug logging for streaming chunk serialization errors · 9840590a
      Stefy Lanza (nextime / spora ) authored
      - Log chunk type and content before serialization attempt
      - Log chunk type and content when serialization fails
      - Helps diagnose 'Tool choice is none, but model called a tool' errors
      - Apply debug logging to both RequestHandler and AutoselectHandler streaming methods
      9840590a
    • Stefy Lanza (nextime / spora )'s avatar
      Handle tool call errors during streaming response serialization · fccf6bca
      Stefy Lanza (nextime / spora ) authored
      - Add try-catch around chunk serialization in stream_generator functions
      - Skip chunks that fail to serialize (e.g., tool calls without tool_choice)
      - Log warnings for chunk serialization errors
      - Prevent streaming failures when models attempt tool calls without proper configuration
      - Apply fix to both RequestHandler and AutoselectHandler streaming methods
      fccf6bca
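The skip-on-failure wrapping this commit (and the later model_dump() fix) describe can be sketched as one guarded serializer, assuming Pydantic-style chunks expose model_dump():

```python
import json
import logging

logger = logging.getLogger("aisbf.streaming")

def safe_serialize(chunk):
    # Emit a chunk as one SSE line; if serialization fails (e.g. a
    # tool-call chunk produced without a matching tool_choice), log a
    # warning and skip it rather than abort the whole stream.
    try:
        payload = chunk.model_dump() if hasattr(chunk, "model_dump") else chunk
        return f"data: {json.dumps(payload)}\n\n"
    except (TypeError, ValueError) as exc:
        logger.warning("skipping unserializable chunk %s: %s", type(chunk), exc)
        return None
```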
    • Stefy Lanza (nextime / spora )'s avatar
      Update documentation with detailed descriptions of rotations and autoselect models · 08361f16
      Stefy Lanza (nextime / spora ) authored
      - Add Key Features section to README.md
      - Describe Rotation Models with weighted load balancing and automatic failover
      - Describe Autoselect Models with AI-powered content analysis
      - Update Rotation Endpoints with detailed model descriptions
      - Update Autoselect Endpoints with detailed model descriptions
      - Add comprehensive Rotation Models section to DOCUMENTATION.md
      - Add comprehensive Autoselect Models section to DOCUMENTATION.md
      - Include example use cases for both rotation and autoselect models
      - Update overview with key features and capabilities
      - Document fallback behavior to 'general' when autoselect can't choose a model
      08361f16
    • Stefy Lanza (nextime / spora )'s avatar
      Bump version to 0.3.0 · 7f71e9d7
      Stefy Lanza (nextime / spora ) authored
      - Update version to 0.3.0 in setup.py, pyproject.toml, and aisbf/__init__.py
      7f71e9d7
    • Stefy Lanza (nextime / spora )'s avatar
      Change default listening address to 127.0.0.1:17765 · 0fb18e5c
      Stefy Lanza (nextime / spora ) authored
      - Update host from 0.0.0.0 to 127.0.0.1 for localhost-only access
      - Update port from 8000 to 17765
      - Update log message to reflect new address
      0fb18e5c
    • Stefy Lanza (nextime / spora )'s avatar
      Make autoselect skill file more explicit about model selection output · bec2198c
      Stefy Lanza (nextime / spora ) authored
      - Add prominent ABSOLUTELY CRITICAL section emphasizing ONLY output requirement
      - Explicitly state NO additional text, explanations, or commentary
      - Add repeated warnings about outputting nothing except the single tag
      - Clarify that any extra text will cause system failure
      - Add examples of what NOT to include in response
      bec2198c
    • Stefy Lanza (nextime / spora )'s avatar
      Fix streaming response serialization in handlers · 218a35ee
      Stefy Lanza (nextime / spora ) authored
      - Properly serialize Stream chunks to JSON format
      - Convert ChatCompletionChunk objects using model_dump()
      - Apply fix to both RequestHandler and AutoselectHandler streaming methods
      - Resolves socket.send() exceptions during streaming
      218a35ee
    • Stefy Lanza (nextime / spora )'s avatar
      Fix streaming response error in OpenAIProviderHandler · 029c0668
      Stefy Lanza (nextime / spora ) authored
      - Fixed AttributeError when stream=True is passed to OpenAI client
      - Changed return type to Union[Dict, object] to support streaming
      - Added conditional check to return Stream object for streaming requests
      - Bumped version to 0.2.7
      029c0668