06 Feb, 2026 (40 commits)
-
Stefy Lanza (nextime / spora ) authored
- Return Google's synchronous iterator directly from the provider handler
- Detect Google streaming responses by checking for __iter__ but not __aiter__
- Convert Google chunks to OpenAI format in stream_generator
- Handle both sync (Google) and async (OpenAI/Anthropic) streaming responses
- Fix "'async_generator' object is not iterable" error

This fixes streaming requests through the autoselect and rotation handlers that were failing with the "'async_generator' object is not iterable" error.
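The sync/async detection described above can be sketched as follows. The helper names are illustrative, not the project's actual code; the key idea is that Google's SDK hands back a plain synchronous iterator (has `__iter__`, lacks `__aiter__`), while the OpenAI/Anthropic SDKs return async iterators.

```python
def is_sync_stream(resp) -> bool:
    """Google-style streaming response: a plain synchronous iterator,
    i.e. it exposes __iter__ but not __aiter__."""
    return hasattr(resp, "__iter__") and not hasattr(resp, "__aiter__")


def is_async_stream(resp) -> bool:
    """OpenAI/Anthropic-style streaming response: an async iterator."""
    return hasattr(resp, "__aiter__")
```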
-
Stefy Lanza (nextime / spora ) authored
- Keep stream_generator as an async function (not sync)
- Wrap Google's synchronous iterator in an async generator
- Properly structure the if/else for the streaming vs non-streaming paths
- Fix "client has been closed" error in streaming responses

This fixes the issue where streaming requests through autoselect were failing with the "Cannot send a request, as a client has been closed" error.
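Wrapping a synchronous provider iterator in an async generator, as this commit describes, can look roughly like the minimal sketch below (names are illustrative; the real stream_generator also converts chunk formats):

```python
import asyncio


async def wrap_sync_stream(sync_iter):
    """Yield items from a synchronous iterator inside an async generator,
    handing control back to the event loop between items."""
    for chunk in sync_iter:
        yield chunk
        await asyncio.sleep(0)  # let other coroutines run between chunks


async def collect(aiter):
    """Drain an async iterator into a list (for demonstration)."""
    return [c async for c in aiter]
```

Usage: `asyncio.run(collect(wrap_sync_stream(iter([1, 2, 3]))))` yields the items in order, but through an async interface the FastAPI streaming response can consume.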
-
Stefy Lanza (nextime / spora ) authored
- Ensure the complete chunk object is yielded as a single unit
- Add logging to show the complete chunk structure
- Fix issue where a chunk was being serialized as separate fields
- Maintain the OpenAI-compatible chat.completion.chunk format

This should fix the streaming issue where chunks were being serialized as separate data: lines instead of complete JSON objects.
-
Stefy Lanza (nextime / spora ) authored
- Use generate_content_stream() for streaming requests
- Create an async generator that yields OpenAI-compatible chunks
- Extract text from each stream chunk
- Generate unique chunk IDs
- Format chunks as chat.completion.chunk objects
- Include delta content in each chunk
- Maintain non-streaming functionality for regular requests

This fixes the streaming issue where Google GenAI was returning a dict instead of an iterable, causing "JSONResponse object is not iterable" errors.
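The chunk formatting this commit describes can be sketched as below: raw text extracted from a Google stream chunk is wrapped in an OpenAI-style chat.completion.chunk dict with a unique ID. The function name is hypothetical; the field layout follows the OpenAI streaming schema.

```python
import time
import uuid


def to_openai_chunk(text, model, chunk_id=None):
    """Wrap a piece of streamed text in an OpenAI-compatible
    chat.completion.chunk dict."""
    return {
        "id": chunk_id or f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {"index": 0, "delta": {"content": text}, "finish_reason": None}
        ],
    }
```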
-
Stefy Lanza (nextime / spora ) authored
- Test non-streaming requests to the autoselect endpoint
- Test streaming requests to the autoselect endpoint
- Test listing available providers
- Test listing models for the autoselect endpoint
- Use model 'autoselect' for the autoselect endpoint
- Include jq installation instructions for formatted output

Run with: ./test_proxy.sh
-
Stefy Lanza (nextime / spora ) authored
- Remove ChatCompletionResponse validation from GoogleProviderHandler
- Remove ChatCompletionResponse validation from AnthropicProviderHandler
- Return the raw response dict directly
- Add logging to show the response dict keys
- This tests whether Pydantic validation was causing serialization issues

Testing whether removing validation fixes the client-side "Cannot read properties of undefined" errors.
-
Stefy Lanza (nextime / spora ) authored
GoogleProviderHandler:
- Wrap the validated response dict in JSONResponse before returning
- Add logging to confirm a JSONResponse is being returned
- Ensures proper JSON serialization for Google GenAI responses

AnthropicProviderHandler:
- Wrap the validated response dict in JSONResponse before returning
- Add logging to confirm a JSONResponse is being returned
- Ensures proper JSON serialization for Anthropic responses

RequestHandler:
- Remove JSONResponse wrapping (now handled by the providers)
- Update logging to detect JSONResponse vs dict responses
- OpenAI and Ollama providers return raw dicts (already compatible)

This fixes client-side "Cannot read properties of undefined" errors by ensuring Google and Anthropic responses are properly serialized as JSONResponse, while leaving OpenAI and Ollama responses as-is since they are already OpenAI-compatible.
-
Stefy Lanza (nextime / spora ) authored
- Import JSONResponse from fastapi.responses
- Explicitly wrap the response dict in a JSONResponse
- Add logging to confirm a JSONResponse is being returned
- This ensures FastAPI properly serializes the response dict
- Fixes potential serialization issues causing client-side errors
-
Stefy Lanza (nextime / spora ) authored
- Log the response type and full response object
- Log the response keys to verify structure
- Check whether the 'choices' key exists
- Verify choices is a non-empty list
- Log choices[0] content if available
- Add error logging for missing or malformed response structure

This will help identify why clients are getting "Cannot read properties of undefined (reading '0')" errors when accessing response.choices[0].
-
Stefy Lanza (nextime / spora ) authored
GoogleProviderHandler enhancements:
- Process all parts in the response content (not just the first part)
- Extract and combine all text parts
- Detect and convert Google function_call to the OpenAI tool_calls format
- Generate unique call IDs for tool calls
- Handle function responses for debugging
- Set content to None when tool_calls are present (OpenAI convention)
- Add comprehensive logging for tool call detection and conversion
- Support both text and function/tool calls in the same response
- Validate the response against the ChatCompletionResponse Pydantic model
- Add detailed response structure logging

AnthropicProviderHandler enhancements:
- Process all content blocks (not just text)
- Detect and convert Anthropic tool_use blocks to the OpenAI tool_calls format
- Generate unique call IDs for tool calls
- Combine all text parts from multiple blocks
- Set content to None when tool_calls are present (OpenAI convention)
- Add comprehensive logging for tool_use detection and conversion
- Validate the response against the ChatCompletionResponse Pydantic model
- Add detailed response structure logging

Both handlers now properly translate provider-specific function-calling formats to the OpenAI-compatible tool_calls structure, ensuring clients receive valid structured responses with proper schema validation.
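The conversion at the core of both handlers can be sketched as below: a provider-side function call (name plus an arguments dict) becomes an OpenAI-style tool_calls entry with a generated call ID and JSON-encoded arguments. The helper name and the simplified input shape are assumptions; the real SDK objects carry more structure.

```python
import json
import uuid


def function_call_to_tool_call(name, args):
    """Convert a provider function call (name + args dict) into an
    OpenAI-compatible tool_calls entry."""
    return {
        "id": f"call_{uuid.uuid4().hex[:24]}",  # unique call ID
        "type": "function",
        "function": {"name": name, "arguments": json.dumps(args)},
    }
```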
-
Stefy Lanza (nextime / spora ) authored
- Log the response type and all attributes
- Log the candidates structure and length
- Log candidate attributes and content structure
- Log the parts structure and first-part details
- Log raw text extraction and parsing steps
- Log the final extracted text and finish reason
- Add error logging with a full stack trace for exceptions

This will help identify where the parsing is failing when responses show as 'parsed=None' and clients receive nothing.
-
Stefy Lanza (nextime / spora ) authored
- Properly extract finish_reason from the candidate object and map it to the OpenAI format
- Correctly extract usage metadata from the response.usage_metadata structure
- Extract prompt_token_count, candidates_token_count, and total_token_count
- Add logging for usage metadata extraction
- Handle Google finish reasons: STOP, MAX_TOKENS, SAFETY, RECITATION, OTHER

This fixes the issue where Gemini responses were arriving corrupted at OpenAI-compatible clients due to incorrect parsing of the new Google GenAI SDK response structure.
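A finish_reason mapping of the kind this commit describes might look like the sketch below. The exact target values are assumptions (e.g. mapping SAFETY and RECITATION to content_filter follows OpenAI conventions but may differ from the project's actual table):

```python
# Google finish reason -> OpenAI finish_reason (assumed mapping)
GOOGLE_FINISH_REASONS = {
    "STOP": "stop",
    "MAX_TOKENS": "length",
    "SAFETY": "content_filter",
    "RECITATION": "content_filter",
    "OTHER": "stop",
}


def map_finish_reason(google_reason):
    """Map a Google finish reason to the OpenAI vocabulary, defaulting
    to 'stop' for anything unrecognized."""
    return GOOGLE_FINISH_REASONS.get(str(google_reason), "stop")
```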
-
Stefy Lanza (nextime / spora ) authored
- Google: Parse formatted responses with JSON structure (e.g., "assistant: [{'type': 'text', 'text': '...'}]")
- Anthropic: Extract text from content blocks and map stop_reason to finish_reason
- Both handlers now return properly formatted OpenAI-compatible responses
- Ensures clients receive correctly structured messages without malformed content

-
Stefy Lanza (nextime / spora ) authored
- Added an AISBF_DEBUG check to control verbose message content logging
- Messages are only dumped when AISBF_DEBUG=true/1/yes
- Otherwise only the message count is logged
- Applies to the Google, OpenAI, and Anthropic provider handlers
- Reduces log verbosity in production while maintaining debug capability
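A minimal sketch of the AISBF_DEBUG gate, assuming the accepted truthy values are exactly the ones the commit lists (the helper name and env-injection parameter are illustrative):

```python
import os


def debug_enabled(env=None):
    """Return True when AISBF_DEBUG is set to true/1/yes
    (case-insensitive); verbose message dumps are gated on this."""
    env = env if env is not None else os.environ
    return env.get("AISBF_DEBUG", "").strip().lower() in ("1", "true", "yes")
```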
-
Stefy Lanza (nextime / spora ) authored
- Fixed GoogleProviderHandler to properly extract text from response.candidates[0].content.parts[0].text
- Added error handling for text extraction
- Resolves the client error "Cannot read properties of undefined (reading '0')"
- The Google GenAI SDK returns a nested response structure, not a direct .text property
-
Stefy Lanza (nextime / spora ) authored
- Fixed GoogleProviderHandler to return the OpenAI-style response format
- Added tools and tool_choice parameters to OllamaProviderHandler (accepted but ignored)
- Fixed OpenAI message building to properly handle tool messages with tool_call_id
- Fixed max_tokens handling to avoid passing null values to APIs
- Converted the Ollama response to the OpenAI-style format for consistency

This fixes the following errors:
- "Cannot read properties of undefined (reading '0')" (Google response format issue)
- "OllamaProviderHandler.handle_request() got an unexpected keyword argument 'tools'"
- "for 'role:tool' the following must be satisfied[('messages.23.tool_call_id' : property 'tool_call_id' is missing)]"
- "Invalid input: expected number, received null" for the max_tokens parameter

-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Add optional tool_call_id field to the Message model
- Required for tool response messages (role:tool)
- Identifies which tool call the response is for
- Fixes 400 errors for missing tool_call_id in tool messages
-
Stefy Lanza (nextime / spora ) authored
- Pass autoselect_config to _get_model_selection in handle_autoselect_streaming_request
- Fixes "TypeError: missing 1 required positional argument"
- Ensures streaming autoselect requests use the configured selection_model
-
Stefy Lanza (nextime / spora ) authored
- Add optional tool_calls field to the Message model
- Make the content field optional with default None
- Allows assistant messages with tool_calls instead of content
- Fixes 422 validation errors for tool call messages
- Supports the OpenAI message format with function calls
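The project validates messages with a Pydantic model; this stdlib dataclass sketch only shows the field shape that this commit and the earlier tool_call_id commit describe (optional content, optional tool_calls, optional tool_call_id):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    role: str
    content: Optional[str] = None             # None allowed when tool_calls are present
    tool_calls: Optional[List[dict]] = None   # assistant messages calling tools
    tool_call_id: Optional[str] = None        # required for role == "tool" replies
```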
-
Stefy Lanza (nextime / spora ) authored
- Import RequestValidationError from fastapi.exceptions
- Update the exception handler to catch RequestValidationError instead of matching on status code
- Add console logging for immediate visibility of validation errors
- Log validation error details using the exc.errors() method
-
Stefy Lanza (nextime / spora ) authored
- Add an exception handler to catch and log validation errors
- Log the request path, method, headers, and raw body
- Log validation error details from FastAPI
- Helps diagnose why requests are failing validation
-
Stefy Lanza (nextime / spora ) authored
- Change host from 0.0.0.0 to 127.0.0.1 for improved security
- Change port from 8000 to 17765 to match the main.py default
- Ensures consistency between development and production modes
-
Stefy Lanza (nextime / spora ) authored
- Log the raw request body before validation to diagnose 422 errors
- Log request headers and path for debugging
- Make the Message content field more flexible with a List type
- Helps identify validation issues in incoming requests
-
Stefy Lanza (nextime / spora ) authored
- Set the default value for selection_model to 'general' in AutoselectConfig
- Maintains backward compatibility with existing configuration files
- Prevents 422 errors when loading configs without a selection_model field
-
Stefy Lanza (nextime / spora ) authored
- Add selection_model field to the AutoselectConfig model
- Update _get_model_selection to use autoselect_config.selection_model instead of a hardcoded 'general'
- Update handle_autoselect_request to log selection_model from the config
- Update handle_autoselect_streaming_request to log selection_model from the config
- Allows flexible configuration of which rotation to use for model selection
-
Stefy Lanza (nextime / spora ) authored
- Add selection_model field to specify which rotation to use for model selection
- Default value is the 'general' rotation
- Allows explicit control over which rotation's models are available for autoselect
- Provides flexibility in configuring autoselect behavior
-
Stefy Lanza (nextime / spora ) authored
- Changed max_retries from 2 to 5 in RotationHandler.handle_rotation_request
- Provides more opportunities to find a working model when errors occur
- Especially helpful for tool call errors and other transient failures
- Improves reliability of rotation and autoselect model selection
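The retry behavior described here can be sketched as a simple loop over the rotation's models, bounded by max_retries (names and structure are illustrative; the real handler also records failures and weights model choice):

```python
def retry_rotation(models, call, max_retries=5):
    """Try calling up to max_retries models in order; return the first
    success, or re-raise the last failure if every attempt errors."""
    last_exc = None
    for _attempt, model in zip(range(max_retries), models):
        try:
            return call(model)
        except Exception as exc:  # record failure, move on to the next model
            last_exc = exc
    raise last_exc
```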
-
Stefy Lanza (nextime / spora ) authored
- Add tools and tool_choice fields to the ChatCompletionRequest model
- Update OpenAIProviderHandler to accept and pass the tools/tool_choice parameters
- Update handlers to pass tools/tool_choice from the request to the provider
- Treat tool call errors during streaming as provider failures
- Record the failure and re-raise to trigger a retry with the next model in the rotation
- Allows proper tool/function-calling support through the proxy
- Resolves the "Tool choice is none, but model called a tool" error by retrying with another model
-
Stefy Lanza (nextime / spora ) authored
- Log the chunk type and content before the serialization attempt
- Log the chunk type and content when serialization fails
- Helps diagnose "Tool choice is none, but model called a tool" errors
- Apply debug logging to both the RequestHandler and AutoselectHandler streaming methods
-
Stefy Lanza (nextime / spora ) authored
- Add a try/except around chunk serialization in the stream_generator functions
- Skip chunks that fail to serialize (e.g., tool calls without tool_choice)
- Log warnings for chunk serialization errors
- Prevent streaming failures when models attempt tool calls without proper configuration
- Apply the fix to both the RequestHandler and AutoselectHandler streaming methods
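The defensive serialization this commit describes can be sketched as below: chunks that cannot be JSON-encoded are skipped instead of aborting the whole stream (the helper name is illustrative, and the real code logs a warning where this sketch just continues):

```python
import json


def serialize_chunks(chunks):
    """JSON-encode each chunk; silently skip any chunk that fails to
    serialize rather than killing the stream."""
    out = []
    for chunk in chunks:
        try:
            out.append(json.dumps(chunk))
        except (TypeError, ValueError):
            # e.g. SDK objects that are not plain dicts; log and skip
            continue
    return out
```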
-
Stefy Lanza (nextime / spora ) authored
- Add a Key Features section to README.md
- Describe Rotation Models with weighted load balancing and automatic failover
- Describe Autoselect Models with AI-powered content analysis
- Update Rotation Endpoints with detailed model descriptions
- Update Autoselect Endpoints with detailed model descriptions
- Add a comprehensive Rotation Models section to DOCUMENTATION.md
- Add a comprehensive Autoselect Models section to DOCUMENTATION.md
- Include example use cases for both rotation and autoselect models
- Update the overview with key features and capabilities
- Document the fallback to 'general' when autoselect cannot choose a model
-
Stefy Lanza (nextime / spora ) authored
- Update version to 0.3.0 in setup.py, pyproject.toml, and aisbf/__init__.py
-
Stefy Lanza (nextime / spora ) authored
- Update host from 0.0.0.0 to 127.0.0.1 for localhost-only access
- Update port from 8000 to 17765
- Update the log message to reflect the new address
-
Stefy Lanza (nextime / spora ) authored
- Add a prominent ABSOLUTELY CRITICAL section emphasizing the only-output requirement
- Explicitly state NO additional text, explanations, or commentary
- Add repeated warnings about outputting nothing except the single tag
- Clarify that any extra text will cause a system failure
- Add examples of what NOT to include in the response
-
Stefy Lanza (nextime / spora ) authored
- Properly serialize Stream chunks to JSON format
- Convert ChatCompletionChunk objects using model_dump()
- Apply the fix to both the RequestHandler and AutoselectHandler streaming methods
- Resolves socket.send() exceptions during streaming
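A sketch of the serialization this commit describes: objects that expose model_dump() (Pydantic-based SDK chunk models do) are converted to plain dicts before JSON encoding into an SSE data: line. The helper name is illustrative; the data:/blank-line framing follows the SSE wire format OpenAI-compatible clients expect.

```python
import json


def chunk_to_sse(chunk):
    """Serialize a stream chunk (SDK model or plain dict) into a
    server-sent-events data line."""
    payload = chunk.model_dump() if hasattr(chunk, "model_dump") else chunk
    return f"data: {json.dumps(payload)}\n\n"
```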
-
Stefy Lanza (nextime / spora ) authored
- Fixed AttributeError when stream=True is passed to the OpenAI client
- Changed the return type to Union[Dict, object] to support streaming
- Added a conditional check to return the Stream object for streaming requests
- Bumped version to 0.2.7
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-