- 08 Feb, 2026 11 commits
-
-
Stefy Lanza (nextime / spora ) authored
- Add logic to detect and parse tool calls from text content
- Some models return tool calls as JSON in text instead of using the function_call attribute
- Handles both Google-style (action) and OpenAI-style (function/name) tool calls
- Clears response_text when tool_calls are detected
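The detection described above can be sketched roughly as follows. This is a minimal illustration, not the commit's actual code: the function name `parse_tool_calls_from_text` and the exact payload shapes (`action`/`action_input` for Google-style, `function`/`name`/`arguments` for OpenAI-style) are assumptions drawn from the commit message.

```python
import json
import uuid

def parse_tool_calls_from_text(text):
    """Try to interpret a model's text reply as a JSON tool call.

    Returns a list of OpenAI-style tool_calls, or None if the text
    is not a recognizable tool-call payload.
    """
    try:
        data = json.loads(text.strip())
    except (json.JSONDecodeError, AttributeError):
        return None
    if not isinstance(data, dict):
        return None

    # Google-style: {"action": "tool_name", "action_input": {...}}
    if "action" in data:
        name = data["action"]
        args = data.get("action_input", {})
    # OpenAI-style: {"function": {"name": ..., "arguments": ...}}
    elif "function" in data and isinstance(data["function"], dict):
        name = data["function"].get("name")
        args = data["function"].get("arguments", {})
    # Bare OpenAI-style: {"name": ..., "arguments": ...}
    elif "name" in data:
        name = data["name"]
        args = data.get("arguments", {})
    else:
        return None

    if not name:
        return None
    return [{
        "id": f"call_{uuid.uuid4().hex[:24]}",
        "type": "function",
        "function": {
            "name": name,
            # OpenAI clients expect arguments as a JSON string
            "arguments": args if isinstance(args, str) else json.dumps(args),
        },
    }]
```

Returning `None` for anything that is not a well-formed tool-call dict lets the caller keep the original `response_text` for plain prose replies.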
-
Stefy Lanza (nextime / spora ) authored
- Changed all references from 'notify_errors' to 'notifyerrors' to match RotationConfig model
- Fixes issue where notifyerrors setting was not being properly detected
-
Stefy Lanza (nextime / spora ) authored
- Add notifyerrors field with default value False
- Fixes issue where notifyerrors was always detected as False
- Allows rotation to return error as normal message instead of HTTP 503
-
Stefy Lanza (nextime / spora ) authored
- Get stream parameter from request_data to determine response type
- Return StreamingResponse if original request was streaming
- Return dict if original request was non-streaming
- Fixes notifyerrors not working for streaming requests in retry exhausted case
-
Stefy Lanza (nextime / spora ) authored
- Add stream parameter to handle_rotation_request()
- When notifyerrors is enabled, return StreamingResponse if original request was streaming
- Return dict if original request was non-streaming
- Fixes issue where autoselect handler expects StreamingResponse but was getting dict
-
Stefy Lanza (nextime / spora ) authored
- RotationConfig is a Pydantic model, not a dictionary
- Use getattr() to safely access notifyerrors attribute
- Fixes AttributeError when accessing rotation_config attributes
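The fix above amounts to attribute access with a default instead of dict subscripting. A minimal sketch, with a plain class standing in for the project's Pydantic `RotationConfig` (fields on a Pydantic model are likewise accessed as attributes):

```python
# Stand-in for the project's Pydantic RotationConfig model; what matters
# is that fields are attributes, so cfg["notifyerrors"] does not work.
class RotationConfig:
    def __init__(self, notifyerrors=False):
        self.notifyerrors = notifyerrors

def should_notify(rotation_config):
    # getattr with a default tolerates config objects created before
    # the field existed, instead of raising AttributeError.
    return getattr(rotation_config, "notifyerrors", False)
```

The default of `False` matches the field's documented default, so missing and unset configurations behave identically.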
-
Stefy Lanza (nextime / spora ) authored
- Add 'notifyerrors' field to rotation configuration (default: false)
- When enabled, return errors as normal messages instead of HTTP 503
- Allows clients to consume error messages normally without HTTP errors
- Update handlers.py to check notifyerrors setting and return appropriate response
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Add detailed error information including provider status
- Show cooldown remaining time for rate-limited providers
- Display failure counts for each provider
- Provide structured error response with rotation_id, attempted models, and details
-
Stefy Lanza (nextime / spora ) authored
- Move RotationHandler import inside method to avoid circular dependency
- Import RotationHandler only when needed in ContextManager.__init__
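The deferred-import pattern used here can be sketched as below; the module path `aisbf.rotation` and the `rotation_id` parameter are illustrative guesses, not the commit's actual code:

```python
class ContextManager:
    def __init__(self, rotation_id=None):
        self.rotation_handler = None
        if rotation_id is not None:
            # Local import: resolved at call time, after both modules have
            # finished loading, so a module-level import cycle between the
            # context and rotation modules never triggers.
            from aisbf.rotation import RotationHandler  # hypothetical path
            self.rotation_handler = RotationHandler(rotation_id)
```

A top-level `from aisbf.rotation import RotationHandler` would execute while the context module is still being imported, which is what produces the circular-import error.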
-
Stefy Lanza (nextime / spora ) authored
- Add 'condensation' section to providers.json for dedicated provider/model
- Support rotation-based condensation by specifying rotation ID in model field
- Update ContextManager to use dedicated condensation handler
- Update handlers to pass condensation configuration
- Bump version to 0.3.2
-
- 07 Feb, 2026 13 commits
-
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Pass effective_context as parameter to stream_generator functions
- Update _create_streaming_response signature to accept effective_context
- Update all calls to _create_streaming_response to pass effective_context
- Track accumulated response text for token counting in streaming
- Calculate completion tokens for Google responses (since Google doesn't provide them)
- Calculate completion tokens for non-Google providers when they don't provide token counts
- Include prompt_tokens, completion_tokens, total_tokens, and effective_context in final chunk
- Fixes "name 'effective_context' is not defined" error in streaming responses
- Fixes issue where streaming responses had null token counts
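The fallback token calculation might look like the sketch below. The ~4 characters per token heuristic is an assumption of this example (a common rule of thumb for English text), not necessarily what the commit implements; `final_usage_chunk` is an invented helper name.

```python
def estimate_tokens(text):
    # Rough fallback when the provider reports no token counts:
    # ~4 characters per token is a common English-text heuristic.
    return max(1, len(text) // 4) if text else 0

def final_usage_chunk(prompt_tokens, accumulated_text, effective_context):
    # Built once, at the end of the stream, from the text accumulated
    # across all chunks -- this is where null token counts get filled in.
    completion_tokens = estimate_tokens(accumulated_text)
    return {
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        },
        "effective_context": effective_context,
    }
```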
-
Stefy Lanza (nextime / spora ) authored
- Enable PRAGMA journal_mode=WAL for better concurrent access
- Set PRAGMA busy_timeout=5000 (5 seconds) for concurrent access
- WAL mode allows multiple readers and one writer simultaneously
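These two pragmas are applied per connection; a minimal sketch with Python's stdlib `sqlite3` (the `open_db` helper and file path are illustrative):

```python
import os
import sqlite3
import tempfile

def open_db(path):
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while one writer is active, instead of
    # the default rollback journal's writer-blocks-readers behavior.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to 5 s for a lock rather than failing immediately
    # with "database is locked".
    conn.execute("PRAGMA busy_timeout=5000")
    return conn

path = os.path.join(tempfile.mkdtemp(), "aisbf.db")
conn = open_db(path)
mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
```

Note that `journal_mode=WAL` is persistent in the database file, but `busy_timeout` is per connection and must be set each time.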
-
Stefy Lanza (nextime / spora ) authored
- Import initialize_database from aisbf.database
- Call initialize_database() in main() to create/recreate database
- Clean up old token usage records to prevent database bloat
-
Stefy Lanza (nextime / spora ) authored
- Create aisbf/database.py with DatabaseManager class
- Track context dimensions (context_size, condense_context, condense_method, effective_context)
- Track token usage for rate limiting (TPM, TPH, TPD)
- Auto-create database at ~/.aisbf/aisbf.db if it doesn't exist
- Clean up old token usage records to prevent database bloat
- Export database module in __init__.py
- Update setup.py to include database.py in package data
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Add context_size, condense_context, and condense_method fields to Model class
- Create new context.py module with ContextManager and condensation methods
- Implement hierarchical, conversational, semantic, and algorithmic condensation
- Calculate and report effective_context for all requests
- Update handlers.py to apply context condensation when configured
- Update providers.json and rotations.json with example context configurations
- Update README.md and DOCUMENTATION.md with context management documentation
- Export context module and utilities in __init__.py
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
- 06 Feb, 2026 16 commits
-
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Return Google's synchronous iterator directly from provider handler
- Detect Google streaming responses by checking for __iter__ but not __aiter__
- Convert Google chunks to OpenAI format in stream_generator
- Handle both sync (Google) and async (OpenAI/Anthropic) streaming responses
- Fix 'async_generator object is not iterable' error

This fixes streaming requests through autoselect and rotation handlers that were failing with an "'async_generator' object is not iterable" error.
-
Stefy Lanza (nextime / spora ) authored
- Keep stream_generator as async function (not sync)
- Wrap Google's synchronous iterator in async generator
- Properly structure if/else for streaming vs non-streaming paths
- Fix 'client has been closed' error in streaming responses

This fixes the issue where streaming requests through autoselect were failing with 'Cannot send a request, as a client has been closed' error.
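Wrapping a synchronous iterator in an async generator, as these two commits describe, can be sketched as follows (the `google_chunks` stand-in replaces the real SDK iterator):

```python
import asyncio

def google_chunks():
    # Stand-in for Google's synchronous stream iterator.
    yield from ["Hel", "lo"]

async def stream_generator(sync_iter):
    # Google's SDK returns a plain (sync) iterator, while the streaming
    # response path expects an async generator -- so iterate the sync
    # source inside an async function and yield each chunk.
    for chunk in sync_iter:
        yield chunk
        # Yield control to the event loop between chunks so one slow
        # sync iterator can't starve other tasks.
        await asyncio.sleep(0)

async def collect():
    return [c async for c in stream_generator(google_chunks())]

parts = asyncio.run(collect())
```

The detection mentioned in the previous commit follows the same distinction: a sync iterable has `__iter__` but not `__aiter__`, while async generators have `__aiter__`.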
-
Stefy Lanza (nextime / spora ) authored
- Ensure complete chunk object is yielded as single unit
- Add logging to show complete chunk structure
- Fix issue where chunk was being serialized as separate fields
- Maintain OpenAI-compatible chat.completion.chunk format

This should fix the streaming issue where chunks were being serialized as separate data: lines instead of complete JSON objects.
-
Stefy Lanza (nextime / spora ) authored
- Use generate_content_stream() for streaming requests
- Create async generator that yields OpenAI-compatible chunks
- Extract text from each stream chunk
- Generate unique chunk IDs
- Format chunks as chat.completion.chunk objects
- Include delta content in each chunk
- Maintain non-streaming functionality for regular requests

This fixes the streaming issue where Google GenAI was returning a dict instead of an iterable, causing 'JSONResponse object is not iterable' errors.
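The chunk format these commits target is the standard OpenAI `chat.completion.chunk` shape; a sketch of building one chunk and its SSE line (helper names `make_chunk` and `sse_line` are invented for illustration):

```python
import json
import time
import uuid

def make_chunk(model, text, chunk_id=None):
    # One OpenAI-compatible chat.completion.chunk per piece of streamed text.
    return {
        "id": chunk_id or f"chatcmpl-{uuid.uuid4().hex[:24]}",
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {"index": 0, "delta": {"content": text}, "finish_reason": None}
        ],
    }

def sse_line(chunk):
    # Each chunk must be serialized as ONE complete JSON object on a
    # single "data:" line -- emitting fields separately is exactly the
    # bug the earlier commit describes.
    return f"data: {json.dumps(chunk)}\n\n"
```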
-
Stefy Lanza (nextime / spora ) authored
- Test non-streaming requests to autoselect endpoint
- Test streaming requests to autoselect endpoint
- Test listing available providers
- Test listing models for autoselect endpoint
- Use model 'autoselect' for autoselect endpoint
- Include jq installation instructions for formatted output

Run with: ./test_proxy.sh
-
Stefy Lanza (nextime / spora ) authored
- Remove ChatCompletionResponse validation from GoogleProviderHandler
- Remove ChatCompletionResponse validation from AnthropicProviderHandler
- Return raw response dict directly
- Add logging to show response dict keys
- This tests if Pydantic validation was causing serialization issues

Testing if removing validation fixes client-side 'Cannot read properties of undefined' errors.
-
Stefy Lanza (nextime / spora ) authored
GoogleProviderHandler:
- Wrap validated response dict in JSONResponse before returning
- Add logging to confirm JSONResponse is being returned
- Ensures proper JSON serialization for Google GenAI responses

AnthropicProviderHandler:
- Wrap validated response dict in JSONResponse before returning
- Add logging to confirm JSONResponse is being returned
- Ensures proper JSON serialization for Anthropic responses

RequestHandler:
- Remove JSONResponse wrapping (now handled by providers)
- Update logging to detect JSONResponse vs dict responses
- OpenAI and Ollama providers return raw dicts (already compatible)

This fixes client-side 'Cannot read properties of undefined' errors by ensuring Google and Anthropic responses are properly serialized as JSONResponse, while leaving OpenAI and Ollama responses as-is since they're already OpenAI-compatible.
-
Stefy Lanza (nextime / spora ) authored
- Import JSONResponse from fastapi.responses
- Explicitly wrap response dict in JSONResponse
- Add logging to confirm JSONResponse is being returned
- This ensures FastAPI properly serializes the response dict
- Fixes potential serialization issues causing client-side errors
-
Stefy Lanza (nextime / spora ) authored
- Log response type and full response object
- Log response keys to verify structure
- Check if 'choices' key exists
- Verify choices is a list and not empty
- Log choices[0] content if available
- Add error logging for missing or malformed response structure

This will help identify why clients are getting "Cannot read properties of undefined (reading '0')" errors when accessing response.choices[0].
-
Stefy Lanza (nextime / spora ) authored
GoogleProviderHandler enhancements:
- Process all parts in response content (not just first part)
- Extract and combine all text parts
- Detect and convert Google function_call to OpenAI tool_calls format
- Generate unique call IDs for tool calls
- Handle function responses for debugging
- Set content to None when tool_calls are present (OpenAI convention)
- Add comprehensive logging for tool call detection and conversion
- Support both text and function/tool calls in same response
- Validate response against ChatCompletionResponse Pydantic model
- Add detailed response structure logging

AnthropicProviderHandler enhancements:
- Process all content blocks (not just text)
- Detect and convert Anthropic tool_use blocks to OpenAI tool_calls format
- Generate unique call IDs for tool calls
- Combine all text parts from multiple blocks
- Set content to None when tool_calls are present (OpenAI convention)
- Add comprehensive logging for tool_use detection and conversion
- Validate response against ChatCompletionResponse Pydantic model
- Add detailed response structure logging

Both handlers now properly translate provider-specific function calling formats to OpenAI-compatible tool_calls structure, ensuring clients receive valid structured responses with proper schema validation.
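The Google-side conversion can be sketched as below. Parts are represented as plain dicts for illustration; the real SDK exposes objects with `.text` and `.function_call` attributes, and `convert_parts` is an invented helper name.

```python
import json
import uuid

def convert_parts(parts):
    """Fold a Google response's content parts into OpenAI message fields.

    Walks ALL parts (not just the first), collecting text and converting
    each function_call into an OpenAI-style tool_calls entry.
    """
    texts, tool_calls = [], []
    for part in parts:
        if part.get("text"):
            texts.append(part["text"])
        if part.get("function_call"):
            fc = part["function_call"]
            tool_calls.append({
                "id": f"call_{uuid.uuid4().hex[:24]}",
                "type": "function",
                "function": {
                    "name": fc["name"],
                    "arguments": json.dumps(fc.get("args", {})),
                },
            })
    message = {
        "role": "assistant",
        # OpenAI convention: content is None when tool_calls are present.
        "content": None if tool_calls else "".join(texts),
    }
    if tool_calls:
        message["tool_calls"] = tool_calls
    return message
```

The Anthropic path is analogous, with `tool_use` content blocks in place of `function_call` parts.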
-
Stefy Lanza (nextime / spora ) authored
- Log response type and all attributes
- Log candidates structure and length
- Log candidate attributes and content structure
- Log parts structure and first part details
- Log raw text extraction and parsing steps
- Log final extracted text and finish reason
- Add error logging with full stack trace for exceptions

This will help identify where the parsing is failing when responses show as 'parsed=None' and clients receive nothing.
-
Stefy Lanza (nextime / spora ) authored
- Properly extract finish_reason from candidate object and map to OpenAI format
- Correctly extract usage metadata from response.usage_metadata structure
- Extract prompt_token_count, candidates_token_count, and total_token_count
- Add logging for usage metadata extraction
- Handle Google finish reasons: STOP, MAX_TOKENS, SAFETY, RECITATION, OTHER

This fixes the issue where Gemini responses were arriving corrupted to OpenAI-compatible clients due to incorrect parsing of the new Google GenAI SDK response structure.
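A plausible shape for the two mappings above is sketched here. The exact target values (e.g. SAFETY and RECITATION both mapping to OpenAI's `content_filter`, unknown reasons falling back to `stop`) are this sketch's assumptions, not confirmed from the commit; `usage_metadata` is modeled as a dict mirroring the SDK's field names.

```python
# Assumed mapping from Google GenAI finish reasons to OpenAI ones.
GOOGLE_TO_OPENAI_FINISH = {
    "STOP": "stop",
    "MAX_TOKENS": "length",
    "SAFETY": "content_filter",
    "RECITATION": "content_filter",
    "OTHER": "stop",
}

def map_finish_reason(google_reason):
    # Fall back to "stop" for anything unrecognized.
    return GOOGLE_TO_OPENAI_FINISH.get(str(google_reason), "stop")

def map_usage(usage_metadata):
    # Field names mirror response.usage_metadata in the Google GenAI SDK.
    return {
        "prompt_tokens": usage_metadata.get("prompt_token_count", 0),
        "completion_tokens": usage_metadata.get("candidates_token_count", 0),
        "total_tokens": usage_metadata.get("total_token_count", 0),
    }
```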
-