- 08 Feb, 2026 11 commits
-
-
Stefy Lanza (nextime / spora ) authored
- Add logic to detect and parse tool calls from text content
- Some models return tool calls as JSON in text instead of using the function_call attribute
- Handles both Google-style (action) and OpenAI-style (function/name) tool calls
- Clears response_text when tool_calls are detected
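The detection described above can be sketched roughly as follows. This is a minimal illustration, not the commit's actual code: the function name `parse_tool_calls_from_text` and the exact payload shapes (`action`/`action_input` for Google-style, `function`/`name`/`arguments` for OpenAI-style) are assumptions drawn from the commit message.

```python
import json
import uuid

def parse_tool_calls_from_text(text):
    """Try to interpret a model's text reply as a JSON tool call.

    Returns a list of OpenAI-style tool_calls, or None if the text
    is not a recognizable tool-call payload.
    """
    try:
        data = json.loads(text.strip())
    except (json.JSONDecodeError, AttributeError):
        return None
    if not isinstance(data, dict):
        return None

    # Google-style: {"action": "tool_name", "action_input": {...}}
    if "action" in data:
        name = data["action"]
        args = data.get("action_input", {})
    # OpenAI-style: {"function": {"name": ..., "arguments": ...}}
    elif "function" in data and isinstance(data["function"], dict):
        name = data["function"].get("name")
        args = data["function"].get("arguments", {})
    # Bare OpenAI-style: {"name": ..., "arguments": ...}
    elif "name" in data:
        name = data["name"]
        args = data.get("arguments", {})
    else:
        return None

    if not name:
        return None
    return [{
        "id": f"call_{uuid.uuid4().hex[:24]}",
        "type": "function",
        "function": {
            "name": name,
            # OpenAI clients expect arguments as a JSON string
            "arguments": args if isinstance(args, str) else json.dumps(args),
        },
    }]
```

Returning `None` for anything that is not a well-formed tool-call dict lets the caller keep the original `response_text` for plain prose replies.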
-
Stefy Lanza (nextime / spora ) authored
- Changed all references from 'notify_errors' to 'notifyerrors' to match RotationConfig model
- Fixes issue where notifyerrors setting was not being properly detected
-
Stefy Lanza (nextime / spora ) authored
- Add notifyerrors field with default value False
- Fixes issue where notifyerrors was always detected as False
- Allows rotation to return error as normal message instead of HTTP 503
-
Stefy Lanza (nextime / spora ) authored
- Get stream parameter from request_data to determine response type
- Return StreamingResponse if original request was streaming
- Return dict if original request was non-streaming
- Fixes notifyerrors not working for streaming requests in retry exhausted case
-
Stefy Lanza (nextime / spora ) authored
- Add stream parameter to handle_rotation_request()
- When notifyerrors is enabled, return StreamingResponse if original request was streaming
- Return dict if original request was non-streaming
- Fixes issue where autoselect handler expects StreamingResponse but was getting dict
-
Stefy Lanza (nextime / spora ) authored
- RotationConfig is a Pydantic model, not a dictionary
- Use getattr() to safely access notifyerrors attribute
- Fixes AttributeError when accessing rotation_config attributes
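The fix above amounts to attribute access with a default instead of dict subscripting. A minimal sketch, with a plain class standing in for the project's Pydantic `RotationConfig` (fields on a Pydantic model are likewise accessed as attributes):

```python
# Stand-in for the project's Pydantic RotationConfig model; what matters
# is that fields are attributes, so cfg["notifyerrors"] does not work.
class RotationConfig:
    def __init__(self, notifyerrors=False):
        self.notifyerrors = notifyerrors

def should_notify(rotation_config):
    # getattr with a default tolerates config objects created before
    # the field existed, instead of raising AttributeError.
    return getattr(rotation_config, "notifyerrors", False)
```

The default of `False` matches the field's documented default, so missing and unset configurations behave identically.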
-
Stefy Lanza (nextime / spora ) authored
- Add 'notifyerrors' field to rotation configuration (default: false)
- When enabled, return errors as normal messages instead of HTTP 503
- Allows clients to consume error messages normally without HTTP errors
- Update handlers.py to check notifyerrors setting and return appropriate response
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Add detailed error information including provider status
- Show cooldown remaining time for rate-limited providers
- Display failure counts for each provider
- Provide structured error response with rotation_id, attempted models, and details
-
Stefy Lanza (nextime / spora ) authored
- Move RotationHandler import inside method to avoid circular dependency
- Import RotationHandler only when needed in ContextManager.__init__
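The deferred-import pattern used here can be sketched as below; the module path `aisbf.rotation` and the `rotation_id` parameter are illustrative guesses, not the commit's actual code:

```python
class ContextManager:
    def __init__(self, rotation_id=None):
        self.rotation_handler = None
        if rotation_id is not None:
            # Local import: resolved at call time, after both modules have
            # finished loading, so a module-level import cycle between the
            # context and rotation modules never triggers.
            from aisbf.rotation import RotationHandler  # hypothetical path
            self.rotation_handler = RotationHandler(rotation_id)
```

A top-level `from aisbf.rotation import RotationHandler` would execute while the context module is still being imported, which is what produces the circular-import error.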
-
Stefy Lanza (nextime / spora ) authored
- Add 'condensation' section to providers.json for dedicated provider/model
- Support rotation-based condensation by specifying rotation ID in model field
- Update ContextManager to use dedicated condensation handler
- Update handlers to pass condensation configuration
- Bump version to 0.3.2
-
- 07 Feb, 2026 13 commits
-
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Pass effective_context as parameter to stream_generator functions
- Update _create_streaming_response signature to accept effective_context
- Update all calls to _create_streaming_response to pass effective_context
- Track accumulated response text for token counting in streaming
- Calculate completion tokens for Google responses (since Google doesn't provide them)
- Calculate completion tokens for non-Google providers when they don't provide token counts
- Include prompt_tokens, completion_tokens, total_tokens, and effective_context in final chunk
- Fixes "name 'effective_context' is not defined" error in streaming responses
- Fixes issue where streaming responses had null token counts
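The fallback token calculation might look like the sketch below. The ~4 characters per token heuristic is an assumption of this example (a common rule of thumb for English text), not necessarily what the commit implements; `final_usage_chunk` is an invented helper name.

```python
def estimate_tokens(text):
    # Rough fallback when the provider reports no token counts:
    # ~4 characters per token is a common English-text heuristic.
    return max(1, len(text) // 4) if text else 0

def final_usage_chunk(prompt_tokens, accumulated_text, effective_context):
    # Built once, at the end of the stream, from the text accumulated
    # across all chunks -- this is where null token counts get filled in.
    completion_tokens = estimate_tokens(accumulated_text)
    return {
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        },
        "effective_context": effective_context,
    }
```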
-
Stefy Lanza (nextime / spora ) authored
- Enable PRAGMA journal_mode=WAL for better concurrent access
- Set PRAGMA busy_timeout=5000 (5 seconds) for concurrent access
- WAL mode allows multiple readers and one writer simultaneously
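These two pragmas are applied per connection; a minimal sketch with Python's stdlib `sqlite3` (the `open_db` helper and file path are illustrative):

```python
import os
import sqlite3
import tempfile

def open_db(path):
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while one writer is active, instead of
    # the default rollback journal's writer-blocks-readers behavior.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to 5 s for a lock rather than failing immediately
    # with "database is locked".
    conn.execute("PRAGMA busy_timeout=5000")
    return conn

path = os.path.join(tempfile.mkdtemp(), "aisbf.db")
conn = open_db(path)
mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
```

Note that `journal_mode=WAL` is persistent in the database file, but `busy_timeout` is per connection and must be set each time.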
-
Stefy Lanza (nextime / spora ) authored
- Import initialize_database from aisbf.database
- Call initialize_database() in main() to create/recreate database
- Clean up old token usage records to prevent database bloat
-
Stefy Lanza (nextime / spora ) authored
- Create aisbf/database.py with DatabaseManager class
- Track context dimensions (context_size, condense_context, condense_method, effective_context)
- Track token usage for rate limiting (TPM, TPH, TPD)
- Auto-create database at ~/.aisbf/aisbf.db if it doesn't exist
- Clean up old token usage records to prevent database bloat
- Export database module in __init__.py
- Update setup.py to include database.py in package data
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Add context_size, condense_context, and condense_method fields to Model class
- Create new context.py module with ContextManager and condensation methods
- Implement hierarchical, conversational, semantic, and algorithmic condensation
- Calculate and report effective_context for all requests
- Update handlers.py to apply context condensation when configured
- Update providers.json and rotations.json with example context configurations
- Update README.md and DOCUMENTATION.md with context management documentation
- Export context module and utilities in __init__.py
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
- 06 Feb, 2026 16 commits
-
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Return Google's synchronous iterator directly from provider handler
- Detect Google streaming responses by checking for __iter__ but not __aiter__
- Convert Google chunks to OpenAI format in stream_generator
- Handle both sync (Google) and async (OpenAI/Anthropic) streaming responses
- Fix 'async_generator object is not iterable' error

This fixes streaming requests through autoselect and rotation handlers that were failing with an "'async_generator' object is not iterable" error.
-
Stefy Lanza (nextime / spora ) authored
- Keep stream_generator as async function (not sync)
- Wrap Google's synchronous iterator in async generator
- Properly structure if/else for streaming vs non-streaming paths
- Fix 'client has been closed' error in streaming responses

This fixes the issue where streaming requests through autoselect were failing with 'Cannot send a request, as a client has been closed' error.
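Wrapping a synchronous iterator in an async generator, as these two commits describe, can be sketched as follows (the `google_chunks` stand-in replaces the real SDK iterator):

```python
import asyncio

def google_chunks():
    # Stand-in for Google's synchronous stream iterator.
    yield from ["Hel", "lo"]

async def stream_generator(sync_iter):
    # Google's SDK returns a plain (sync) iterator, while the streaming
    # response path expects an async generator -- so iterate the sync
    # source inside an async function and yield each chunk.
    for chunk in sync_iter:
        yield chunk
        # Yield control to the event loop between chunks so one slow
        # sync iterator can't starve other tasks.
        await asyncio.sleep(0)

async def collect():
    return [c async for c in stream_generator(google_chunks())]

parts = asyncio.run(collect())
```

The detection mentioned in the previous commit follows the same distinction: a sync iterable has `__iter__` but not `__aiter__`, while async generators have `__aiter__`.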
-
Stefy Lanza (nextime / spora ) authored
- Ensure complete chunk object is yielded as single unit
- Add logging to show complete chunk structure
- Fix issue where chunk was being serialized as separate fields
- Maintain OpenAI-compatible chat.completion.chunk format

This should fix the streaming issue where chunks were being serialized as separate data: lines instead of complete JSON objects.
-
Stefy Lanza (nextime / spora ) authored
- Use generate_content_stream() for streaming requests
- Create async generator that yields OpenAI-compatible chunks
- Extract text from each stream chunk
- Generate unique chunk IDs
- Format chunks as chat.completion.chunk objects
- Include delta content in each chunk
- Maintain non-streaming functionality for regular requests

This fixes the streaming issue where Google GenAI was returning a dict instead of an iterable, causing 'JSONResponse object is not iterable' errors.
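The chunk format these commits target is the standard OpenAI `chat.completion.chunk` shape; a sketch of building one chunk and its SSE line (helper names `make_chunk` and `sse_line` are invented for illustration):

```python
import json
import time
import uuid

def make_chunk(model, text, chunk_id=None):
    # One OpenAI-compatible chat.completion.chunk per piece of streamed text.
    return {
        "id": chunk_id or f"chatcmpl-{uuid.uuid4().hex[:24]}",
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {"index": 0, "delta": {"content": text}, "finish_reason": None}
        ],
    }

def sse_line(chunk):
    # Each chunk must be serialized as ONE complete JSON object on a
    # single "data:" line -- emitting fields separately is exactly the
    # bug the earlier commit describes.
    return f"data: {json.dumps(chunk)}\n\n"
```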
-
Stefy Lanza (nextime / spora ) authored
- Test non-streaming requests to autoselect endpoint
- Test streaming requests to autoselect endpoint
- Test listing available providers
- Test listing models for autoselect endpoint
- Use model 'autoselect' for autoselect endpoint
- Include jq installation instructions for formatted output

Run with: ./test_proxy.sh
-
Stefy Lanza (nextime / spora ) authored
- Remove ChatCompletionResponse validation from GoogleProviderHandler
- Remove ChatCompletionResponse validation from AnthropicProviderHandler
- Return raw response dict directly
- Add logging to show response dict keys
- This tests if Pydantic validation was causing serialization issues

Testing if removing validation fixes client-side 'Cannot read properties of undefined' errors.
-
Stefy Lanza (nextime / spora ) authored
GoogleProviderHandler:
- Wrap validated response dict in JSONResponse before returning
- Add logging to confirm JSONResponse is being returned
- Ensures proper JSON serialization for Google GenAI responses

AnthropicProviderHandler:
- Wrap validated response dict in JSONResponse before returning
- Add logging to confirm JSONResponse is being returned
- Ensures proper JSON serialization for Anthropic responses

RequestHandler:
- Remove JSONResponse wrapping (now handled by providers)
- Update logging to detect JSONResponse vs dict responses
- OpenAI and Ollama providers return raw dicts (already compatible)

This fixes client-side 'Cannot read properties of undefined' errors by ensuring Google and Anthropic responses are properly serialized as JSONResponse, while leaving OpenAI and Ollama responses as-is since they're already OpenAI-compatible.
-
Stefy Lanza (nextime / spora ) authored
- Import JSONResponse from fastapi.responses
- Explicitly wrap response dict in JSONResponse
- Add logging to confirm JSONResponse is being returned
- This ensures FastAPI properly serializes the response dict
- Fixes potential serialization issues causing client-side errors
-
Stefy Lanza (nextime / spora ) authored
- Log response type and full response object
- Log response keys to verify structure
- Check if 'choices' key exists
- Verify choices is a list and not empty
- Log choices[0] content if available
- Add error logging for missing or malformed response structure

This will help identify why clients are getting "Cannot read properties of undefined (reading '0')" errors when accessing response.choices[0].
-
Stefy Lanza (nextime / spora ) authored
GoogleProviderHandler enhancements:
- Process all parts in response content (not just first part)
- Extract and combine all text parts
- Detect and convert Google function_call to OpenAI tool_calls format
- Generate unique call IDs for tool calls
- Handle function responses for debugging
- Set content to None when tool_calls are present (OpenAI convention)
- Add comprehensive logging for tool call detection and conversion
- Support both text and function/tool calls in same response
- Validate response against ChatCompletionResponse Pydantic model
- Add detailed response structure logging

AnthropicProviderHandler enhancements:
- Process all content blocks (not just text)
- Detect and convert Anthropic tool_use blocks to OpenAI tool_calls format
- Generate unique call IDs for tool calls
- Combine all text parts from multiple blocks
- Set content to None when tool_calls are present (OpenAI convention)
- Add comprehensive logging for tool_use detection and conversion
- Validate response against ChatCompletionResponse Pydantic model
- Add detailed response structure logging

Both handlers now properly translate provider-specific function calling formats to OpenAI-compatible tool_calls structure, ensuring clients receive valid structured responses with proper schema validation.
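The Google-side conversion can be sketched as below. Parts are represented as plain dicts for illustration; the real SDK exposes objects with `.text` and `.function_call` attributes, and `convert_parts` is an invented helper name.

```python
import json
import uuid

def convert_parts(parts):
    """Fold a Google response's content parts into OpenAI message fields.

    Walks ALL parts (not just the first), collecting text and converting
    each function_call into an OpenAI-style tool_calls entry.
    """
    texts, tool_calls = [], []
    for part in parts:
        if part.get("text"):
            texts.append(part["text"])
        if part.get("function_call"):
            fc = part["function_call"]
            tool_calls.append({
                "id": f"call_{uuid.uuid4().hex[:24]}",
                "type": "function",
                "function": {
                    "name": fc["name"],
                    "arguments": json.dumps(fc.get("args", {})),
                },
            })
    message = {
        "role": "assistant",
        # OpenAI convention: content is None when tool_calls are present.
        "content": None if tool_calls else "".join(texts),
    }
    if tool_calls:
        message["tool_calls"] = tool_calls
    return message
```

The Anthropic path is analogous, with `tool_use` content blocks in place of `function_call` parts.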
-
Stefy Lanza (nextime / spora ) authored
- Log response type and all attributes
- Log candidates structure and length
- Log candidate attributes and content structure
- Log parts structure and first part details
- Log raw text extraction and parsing steps
- Log final extracted text and finish reason
- Add error logging with full stack trace for exceptions

This will help identify where the parsing is failing when responses show as 'parsed=None' and clients receive nothing.
-
Stefy Lanza (nextime / spora ) authored
- Properly extract finish_reason from candidate object and map to OpenAI format
- Correctly extract usage metadata from response.usage_metadata structure
- Extract prompt_token_count, candidates_token_count, and total_token_count
- Add logging for usage metadata extraction
- Handle Google finish reasons: STOP, MAX_TOKENS, SAFETY, RECITATION, OTHER

This fixes the issue where Gemini responses were arriving corrupted to OpenAI-compatible clients due to incorrect parsing of the new Google GenAI SDK response structure.
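A plausible shape for the two mappings above is sketched here. The exact target values (e.g. SAFETY and RECITATION both mapping to OpenAI's `content_filter`, unknown reasons falling back to `stop`) are this sketch's assumptions, not confirmed from the commit; `usage_metadata` is modeled as a dict mirroring the SDK's field names.

```python
# Assumed mapping from Google GenAI finish reasons to OpenAI ones.
GOOGLE_TO_OPENAI_FINISH = {
    "STOP": "stop",
    "MAX_TOKENS": "length",
    "SAFETY": "content_filter",
    "RECITATION": "content_filter",
    "OTHER": "stop",
}

def map_finish_reason(google_reason):
    # Fall back to "stop" for anything unrecognized.
    return GOOGLE_TO_OPENAI_FINISH.get(str(google_reason), "stop")

def map_usage(usage_metadata):
    # Field names mirror response.usage_metadata in the Google GenAI SDK.
    return {
        "prompt_tokens": usage_metadata.get("prompt_token_count", 0),
        "completion_tokens": usage_metadata.get("candidates_token_count", 0),
        "total_tokens": usage_metadata.get("total_token_count", 0),
    }
```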
-