- 08 Feb, 2026 (35 commits)
Stefy Lanza (nextime / spora ) authored
Improve error message formatting by replacing semicolon separators with newlines for better readability
Stefy Lanza (nextime / spora ) authored
- Detect and unwrap responses wrapped in 'assistant: [{"type": "text", "text": "..."}]' format
- Use extracted text for response content instead of raw accumulated text
- Fix variable scoping issue with tool_match variable
- Update token counting to use final_text when available
Stefy Lanza (nextime / spora ) authored
Instead of collecting all chunks and sending a modified response:
- Stream chunks normally as they come (with deltas like before)
- Only at the END, if a tool call pattern is detected, send an additional chunk with tool_calls
- Then send a final chunk with usage statistics
This preserves the original streaming behavior while adding tool call detection.
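The ordering described above can be sketched as a generator. The dict chunk shapes below are illustrative, not the project's exact schema, and `detect_tool_call` is a hypothetical caller-supplied detector:

```python
def stream_with_tool_detection(chunks, detect_tool_call):
    """Yield text deltas as they arrive; only after the stream ends, emit an
    extra tool_calls chunk (if a pattern was detected) and a final usage chunk."""
    accumulated = []
    for chunk in chunks:
        accumulated.append(chunk)
        # pass-through delta, unchanged from the original streaming behavior
        yield {"choices": [{"delta": {"content": chunk}}]}
    full_text = "".join(accumulated)
    tool_call = detect_tool_call(full_text)  # hypothetical detector
    if tool_call is not None:
        yield {"choices": [{"delta": {"tool_calls": [tool_call]},
                            "finish_reason": "tool_calls"}]}
    # final chunk with (approximate, whitespace-token) usage statistics
    yield {"usage": {"completion_tokens": len(full_text.split())}}
```

The key property is that the delta chunks are byte-identical to a plain passthrough; tool detection only ever appends chunks at the end.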
-
Stefy Lanza (nextime / spora ) authored
When the model returns a response in the format assistant: [{'type': 'text', 'text': '...'}] but without a tool call, extract just the text content instead of returning the raw wrapper format.
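A minimal sketch of this unwrapping, assuming the wrapper uses Python-repr style single quotes as shown above (function name and regex are illustrative, not the project's actual code):

```python
import ast
import re

# Matches the whole response when it is wrapped as: assistant: [ ... ]
WRAPPER_RE = re.compile(r"^\s*assistant:\s*(\[.*\])\s*$", re.DOTALL)

def unwrap_assistant_text(raw: str) -> str:
    """Return the concatenated text parts if raw is a wrapped response,
    otherwise return raw unchanged."""
    match = WRAPPER_RE.match(raw)
    if not match:
        return raw
    try:
        # literal_eval handles the single-quoted, Python-repr style payload
        parts = ast.literal_eval(match.group(1))
    except (ValueError, SyntaxError):
        return raw
    texts = [p.get("text", "") for p in parts
             if isinstance(p, dict) and p.get("type") == "text"]
    return "".join(texts) if texts else raw
```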
Stefy Lanza (nextime / spora ) authored
The model returns literal \n (backslash-n) instead of actual newlines. This breaks JSON parsing because {\n is not valid JSON syntax. Use codecs.decode with 'unicode_escape' to convert escape sequences to actual characters before parsing.
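A sketch of this decode-then-parse step. One caveat worth noting: 'unicode_escape' treats input as latin-1-style bytes and can mangle non-ASCII text, so the real code may need extra care:

```python
import codecs
import json

def parse_escaped_json(raw: str) -> dict:
    """Parse JSON whose escape sequences arrived as literal characters.

    Some model outputs contain the two characters '\\' and 'n' instead of a
    real newline, so a string like '{\\n "a": 1}' is not valid JSON until the
    escapes are decoded.
    """
    decoded = codecs.decode(raw, "unicode_escape")
    return json.loads(decoded)
```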
Stefy Lanza (nextime / spora ) authored
- Log accumulated response text (first 500 and last 200 chars)
- Log extracted tool JSON with length and byte details
- Log ASCII codes for first 20 chars to detect encoding issues
- Log JSON parse errors with position details
- Log success/failure of JSON parsing attempts
-
Stefy Lanza (nextime / spora ) authored
- Use brace counting for robust JSON extraction
- Try JSON first, then fix common issues (single quotes, trailing commas)
- Extract final assistant text using regex after tool JSON
- Remove complex nested parsing that was failing with escaped quotes
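Brace counting for JSON extraction might look like this sketch (an illustrative helper, not the project's actual code; it also tracks string state so braces inside string values don't skew the count):

```python
from typing import Optional

def extract_first_json_object(text: str) -> Optional[str]:
    """Return the first balanced {...} block in text, or None if absent."""
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    in_string = False
    escaped = False
    for i in range(start, len(text)):
        ch = text[i]
        if escaped:
            escaped = False          # char after a backslash is literal
        elif ch == "\\":
            escaped = True
        elif ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    return text[start:i + 1]
    return None  # unbalanced or truncated object
```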
-
Stefy Lanza (nextime / spora ) authored
- Models may return single quotes instead of double quotes
- Fall back to ast.literal_eval when JSON parsing fails
- Handle both JSON and Python-style literals in streaming responses
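A minimal sketch of the JSON-then-literal_eval fallback (illustrative function name):

```python
import ast
import json

def parse_tool_payload(text: str) -> dict:
    """Parse a payload that may be JSON or a Python-literal dict.

    Models sometimes emit single-quoted dicts like {'action': 'write'},
    which json.loads rejects; ast.literal_eval evaluates such literals
    safely, without executing arbitrary code.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return ast.literal_eval(text)
```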
-
Stefy Lanza (nextime / spora ) authored
- Detect tool calls in accumulated streaming text after all chunks received
- Parse nested 'assistant: [...]' format with tool calls inside
- Parse simple 'tool: {...}' format
- Convert detected tool calls to OpenAI-compatible format
- Send tool_calls in first chunk, then final assistant text
- Proper handling of finish_reason in final chunk
Stefy Lanza (nextime / spora ) authored
- Detect when entire response is wrapped in 'assistant: [...]'
- Parse nested 'tool: {...}' inside the assistant text
- Extract final assistant text from nested structure
- Handle multi-line JSON content with proper brace counting
- More robust parsing for complex nested formats
Stefy Lanza (nextime / spora ) authored
- Detect '"content": "..." } assistant: [...]' pattern
- Extract tool content and convert to write action
- Extract assistant text from JSON array
- Handle multi-line content with newlines
- More robust tool call detection for various text formats
-
Stefy Lanza (nextime / spora ) authored
- Detect 'tool: {...}' pattern in Google model text responses
- Parse and convert to OpenAI-compatible tool_calls format
- Extract assistant text from 'assistant: [...]' format if present
- Handle both 'action' and 'name' fields for tool identification
- Convert arguments to JSON string for OpenAI compatibility
This fixes issues where models return tool calls as text instead of using proper function_call attributes.
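The text-to-tool_calls conversion could be sketched as follows. Only the 'action'/'name' field handling and the JSON-string arguments come from the commit description; the id scheme and the treatment of remaining keys as arguments are assumptions:

```python
import json
import uuid

def to_openai_tool_call(payload: dict) -> dict:
    """Convert a parsed tool payload into an OpenAI-style tool_calls entry."""
    # Prefer OpenAI-style 'name', fall back to Google-style 'action'
    name = payload.get("name") or payload.get("action")
    args = payload.get("arguments")
    if args is None:
        # Assumption: any remaining keys are the call's arguments
        args = {k: v for k, v in payload.items() if k not in ("name", "action")}
    return {
        "id": f"call_{uuid.uuid4().hex[:12]}",  # assumed id scheme
        "type": "function",
        # OpenAI expects arguments as a JSON string, not a dict
        "function": {"name": name, "arguments": json.dumps(args)},
    }
```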
Stefy Lanza (nextime / spora ) authored
- Google provider now yields raw chunk objects instead of pre-formatted SSE bytes
- handlers.py handles the conversion to OpenAI-compatible format
- This fixes the issue where clients weren't receiving streaming responses
Note: Server must be restarted to pick up this change
-
Stefy Lanza (nextime / spora ) authored
- Import count_messages_tokens from utils module
- Fixes "name 'count_messages_tokens' is not defined" error in Google streaming handler
-
Stefy Lanza (nextime / spora ) authored
- Add 'condensation' section to providers.json for specifying dedicated provider/model
- Add CondensationConfig model to config.py
- Add _load_condensation() and get_condensation() methods
- Update ContextManager to use dedicated condensation handler when configured
- Update handlers to pass condensation config to ContextManager
- Allows using smaller/faster model for context condensation operations
This addresses the issue where conversational and semantic condensation methods were using the same model as the main request, which was inefficient. Now users can configure a dedicated provider and model for condensation operations, typically using a smaller/faster model to reduce costs and improve performance.
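An illustrative shape for the new providers.json section (only the 'condensation' key and the provider/model fields are named by the commit; the values here are made up):

```json
{
  "condensation": {
    "provider": "some-provider-id",
    "model": "some-small-fast-model"
  }
}
```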
-
Stefy Lanza (nextime / spora ) authored
- Accumulate all streaming chunks before parsing
- Parse complete response at end of stream
- Detect and convert tool calls from accumulated text content
- Fixes issue where tool calls were returned as text instead of tool_calls structure
-
Stefy Lanza (nextime / spora ) authored
- Add logic to detect and parse tool calls from text content
- Some models return tool calls as JSON in text instead of using function_call attribute
- Handles both Google-style (action) and OpenAI-style (function/name) tool calls
- Clears response_text when tool_calls are detected
-
Stefy Lanza (nextime / spora ) authored
- Changed all references from 'notify_errors' to 'notifyerrors' to match RotationConfig model
- Fixes issue where notifyerrors setting was not being properly detected
-
Stefy Lanza (nextime / spora ) authored
- Add notifyerrors field with default value False
- Fixes issue where notifyerrors was always detected as False
- Allows rotation to return error as normal message instead of HTTP 503
-
Stefy Lanza (nextime / spora ) authored
- Get stream parameter from request_data to determine response type
- Return StreamingResponse if original request was streaming
- Return dict if original request was non-streaming
- Fixes notifyerrors not working for streaming requests in retry exhausted case
-
Stefy Lanza (nextime / spora ) authored
- Add stream parameter to handle_rotation_request()
- When notifyerrors is enabled, return StreamingResponse if original request was streaming
- Return dict if original request was non-streaming
- Fixes issue where autoselect handler expects StreamingResponse but was getting dict
-
Stefy Lanza (nextime / spora ) authored
- RotationConfig is a Pydantic model, not a dictionary
- Use getattr() to safely access notifyerrors attribute
- Fixes AttributeError when accessing rotation_config attributes
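The difference in access style can be illustrated with a stand-in object (SimpleNamespace mimics attribute-only access here; the real RotationConfig is a Pydantic model):

```python
from types import SimpleNamespace

# Stand-in for a Pydantic model instance: attributes, no __getitem__
rotation_config = SimpleNamespace(notifyerrors=True)

# Wrong: rotation_config["notifyerrors"]  -> TypeError, models aren't dicts
# Right: attribute access, with getattr() supplying a safe default
notify = getattr(rotation_config, "notifyerrors", False)
```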
-
Stefy Lanza (nextime / spora ) authored
- Add 'notifyerrors' field to rotation configuration (default: false)
- When enabled, return errors as normal messages instead of HTTP 503
- Allows clients to consume error messages normally without HTTP errors
- Update handlers.py to check notifyerrors setting and return appropriate response
-
Stefy Lanza (nextime / spora ) authored
- Add detailed error information including provider status
- Show cooldown remaining time for rate-limited providers
- Display failure counts for each provider
- Provide structured error response with rotation_id, attempted models, and details
-
Stefy Lanza (nextime / spora ) authored
- Move RotationHandler import inside method to avoid circular dependency
- Import RotationHandler only when needed in ContextManager.__init__
-
Stefy Lanza (nextime / spora ) authored
- Add 'condensation' section to providers.json for dedicated provider/model
- Support rotation-based condensation by specifying rotation ID in model field
- Update ContextManager to use dedicated condensation handler
- Update handlers to pass condensation configuration
- Bump version to 0.3.2
-
- 07 Feb, 2026 (5 commits)
Stefy Lanza (nextime / spora ) authored
- Pass effective_context as parameter to stream_generator functions
- Update _create_streaming_response signature to accept effective_context
- Update all calls to _create_streaming_response to pass effective_context
- Track accumulated response text for token counting in streaming
- Calculate completion tokens for Google responses (since Google doesn't provide them)
- Calculate completion tokens for non-Google providers when they don't provide token counts
- Include prompt_tokens, completion_tokens, total_tokens, and effective_context in final chunk
- Fixes "name 'effective_context' is not defined" error in streaming responses
- Fixes issue where streaming responses had null token counts
-
Stefy Lanza (nextime / spora ) authored
- Enable PRAGMA journal_mode=WAL for better concurrent access
- Set PRAGMA busy_timeout=5000 (5 seconds) for concurrent access
- WAL mode allows multiple readers and one writer simultaneously
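The two pragmas together might be applied like this (an illustrative helper; the project's actual connection setup may differ):

```python
import sqlite3

def connect(path: str) -> sqlite3.Connection:
    """Open a SQLite connection tuned for concurrent access."""
    conn = sqlite3.connect(path, timeout=5.0)
    # WAL allows multiple readers alongside a single writer
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to 5000 ms for a lock instead of failing immediately
    conn.execute("PRAGMA busy_timeout=5000")
    return conn
```

Note that WAL is a property of the database file and persists across connections, while busy_timeout must be set per connection.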
-
Stefy Lanza (nextime / spora ) authored
- Import initialize_database from aisbf.database
- Call initialize_database() in main() to create/recreate database
- Clean up old token usage records to prevent database bloat
-
Stefy Lanza (nextime / spora ) authored
- Create aisbf/database.py with DatabaseManager class
- Track context dimensions (context_size, condense_context, condense_method, effective_context)
- Track token usage for rate limiting (TPM, TPH, TPD)
- Auto-create database at ~/.aisbf/aisbf.db if it doesn't exist
- Clean up old token usage records to prevent database bloat
- Export database module in __init__.py
- Update setup.py to include database.py in package data
-