- 01 Apr, 2026 1 commit
Your Name authored
Add three key improvements to ClaudeProviderHandler:

1. Thinking Block Support (Phase 2.1):
   - Extract thinking/reasoning content from Claude API responses
   - Handle both 'thinking' and 'redacted_thinking' block types
   - Store thinking content in provider_options for downstream access
   - Reference: vendors/kilocode thinking support via AI SDK

2. Tool Call Streaming (Phase 2.2):
   - Parse content_block_start events for tool_use blocks
   - Stream tool call arguments via input_json_delta events
   - Emit tool calls in OpenAI streaming format on content_block_stop
   - Reference: fine-grained-tool-streaming-2025-05-14 beta feature

3. Detailed Usage Metadata (Phase 2.3):
   - Extract cache_read_input_tokens from the API response
   - Extract cache_creation_input_tokens from the API response
   - Add prompt_tokens_details and completion_tokens_details to usage
   - Log cache usage for analytics
   - Reference: vendors/kilocode session/index.ts usage extraction

All methods are integrated into _convert_to_openai_format() and _handle_streaming_request() for automatic application.
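The thinking-block extraction described above can be sketched as follows, assuming the content block shapes from the public Anthropic Messages API; the function name and the tuple return shape are illustrative, not the actual handler code:

```python
# Sketch: pull thinking/redacted_thinking blocks out of a Claude Messages API
# response so they can be stored alongside the visible text (the commit puts
# them in provider_options). Block shapes follow the Anthropic Messages API.

def extract_thinking(response: dict) -> tuple[str, list[dict]]:
    """Return (visible_text, thinking_blocks) from a response's content list."""
    text_parts: list[str] = []
    thinking_blocks: list[dict] = []
    for block in response.get("content", []):
        btype = block.get("type")
        if btype == "text":
            text_parts.append(block.get("text", ""))
        elif btype == "thinking":
            thinking_blocks.append(
                {"type": "thinking", "thinking": block.get("thinking", "")}
            )
        elif btype == "redacted_thinking":
            # Redacted blocks carry opaque data that must be preserved verbatim
            thinking_blocks.append(
                {"type": "redacted_thinking", "data": block.get("data", "")}
            )
    return "".join(text_parts), thinking_blocks


response = {
    "content": [
        {"type": "thinking", "thinking": "Let me check the dates."},
        {"type": "text", "text": "The answer is 42."},
    ]
}
text, thinking = extract_thinking(response)
# text == "The answer is 42.", with one thinking block captured separately
```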
-
- 31 Mar, 2026 7 commits
Your Name authored
Add three key improvements to ClaudeProviderHandler based on comparison with the vendors/kilocode implementation:

1. Tool Call ID Sanitization (_sanitize_tool_call_id):
   - Replace invalid characters in tool call IDs with underscores
   - The Claude API allows alphanumeric characters, underscores, and hyphens only
   - Reference: vendors/kilocode normalizeMessages() sanitization

2. Empty Content Filtering (_filter_empty_content):
   - Filter out empty string messages and empty text parts
   - The Claude API rejects messages with empty content
   - Reference: vendors/kilocode normalizeMessages() filtering

3. Prompt Caching (_apply_cache_control):
   - Apply ephemeral cache_control to the last 2 messages
   - Enables Anthropic's prompt caching feature for cost savings
   - Reference: vendors/kilocode applyCaching()

All methods are integrated into _convert_messages_to_anthropic() for automatic application during message conversion.
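A minimal sketch of the three normalizations described above; the function names mirror the commit message, but the internal details (regex, copying strategy) are assumptions rather than the actual AISBF code:

```python
import re

def sanitize_tool_call_id(tool_call_id: str) -> str:
    """Claude accepts only alphanumerics, underscores, and hyphens in IDs."""
    return re.sub(r"[^a-zA-Z0-9_-]", "_", tool_call_id)

def filter_empty_content(messages: list[dict]) -> list[dict]:
    """Drop messages whose content is an empty string or an empty part list."""
    kept = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, str) and not content.strip():
            continue
        if isinstance(content, list):
            parts = [p for p in content
                     if not (p.get("type") == "text"
                             and not p.get("text", "").strip())]
            if not parts:
                continue
            msg = {**msg, "content": parts}
        kept.append(msg)
    return kept

def apply_cache_control(messages: list[dict]) -> list[dict]:
    """Mark the last part of the last two messages as an ephemeral cache point."""
    out = [dict(m) for m in messages]
    for msg in out[-2:]:
        content = msg.get("content")
        if isinstance(content, list) and content:
            new_content = list(content)  # avoid mutating the caller's list
            new_content[-1] = {**new_content[-1],
                               "cache_control": {"type": "ephemeral"}}
            msg["content"] = new_content
    return out
```

The `cache_control: {"type": "ephemeral"}` marker is the documented Anthropic prompt-caching annotation; applying it to the last two messages matches the strategy the commit attributes to vendors/kilocode.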
-
Your Name authored
Create docs/claude_provider_improvement_plan.md with a detailed implementation plan for the AISBF ClaudeProviderHandler improvements identified in the provider comparison analysis.

The plan covers 10 improvements across 4 phases:
- Phase 1 (Quick Wins): Tool call ID sanitization, empty content filtering, prompt caching
- Phase 2 (Core): Thinking block support, tool call streaming, usage metadata
- Phase 3 (Robustness): Message validation, tool result size limits, fallback
- Phase 4 (Advanced): Image/multimodal support

Each improvement includes a problem statement, reference implementation, detailed implementation steps, files to modify, and an effort estimate. Total estimated effort: 24-37 hours across 4 weeks.
-
Your Name authored
The document now compares only the three Claude provider implementations:
- AISBF (aisbf/providers.py) - direct HTTP with OAuth2
- vendors/kilocode (vendors/kilocode/packages/opencode/src/provider/) - AI SDK
- vendors/claude (vendors/claude/src/) - original Claude Code

All tables and references now use these three sources exclusively. Removed all Kiro Gateway content, which was unrelated to Claude.
-
Your Name authored
Kiro Gateway is an Amazon Q Developer implementation using the AWS CodeWhisperer API, not a Claude provider. The comparison now focuses on actual Claude implementations:
- AISBF Claude provider (direct HTTP with OAuth2)
- Original Claude Code (TypeScript/React from Anthropic)
- KiloCode (TypeScript using @ai-sdk/anthropic)

Removed all Kiro-related sections, including:
- Kiro Gateway architecture comparison
- Kiro message conversion and tool handling
- Kiro streaming (AWS Event Stream)
- Kiro model name normalization
- Kiro-exclusive features (thinking injection, truncation recovery, etc.)

The document now cleanly compares three Claude provider implementations.
-
Your Name authored
- Add KiloCode implementation analysis (vendors/kilocode/packages/opencode/src/provider/)
- Compare KiloCode's AI SDK approach (@ai-sdk/anthropic) vs direct HTTP
- Document KiloCode's features: automatic prompt caching, thinking support, message validation, reasoning variants, model management
- Add comparison tables for architecture, message conversion, streaming, headers, model resolution, reasoning/thinking support, and prompt caching
- Document KiloCode-exclusive features: empty content filtering, tool call ID sanitization, duplicate reasoning fix, provider option remapping, Gemini schema sanitization, unsupported part handling
- Update the summary with KiloCode strengths and additional improvement areas
-
Your Name authored
- Add comprehensive Kiro Gateway analysis alongside the Claude Code comparison
- Document Kiro's unified intermediate message format approach
- Compare streaming implementations (SSE vs AWS Event Stream)
- Document Kiro's advanced features: thinking injection, tool content stripping, image extraction, truncation recovery, model name normalization
- Add comparison tables for architecture, message handling, tools, streaming
- Identify patterns from Kiro that could improve AISBF (unified format, message validation, multimodal support)
-
Your Name authored
- Add comprehensive comparison of the AISBF Claude provider vs the original Claude Code source
- Document message conversion, tool handling, streaming, and response parsing differences
- Identify areas for improvement: thinking blocks, tool call streaming, usage metadata
- Include all other pending changes across the codebase
-
- 30 Mar, 2026 5 commits
Your Name authored
- Updated CHANGELOG.md with the complete feature list, including:
  * Claude OAuth2 provider with PKCE flow and automatic token refresh
  * Response caching with semantic deduplication (Memory/Redis/SQLite/MySQL)
  * Model embeddings cache with multiple backends
  * User-specific API endpoints and MCP enhancements
  * Adaptive rate limiting and token usage analytics
  * Smart request batching and streaming optimization
  * All performance features and bug fixes
- Enhanced README.md with:
  * Claude OAuth2 authentication section with setup guide
  * Response caching details with all backends and deduplication
  * Flexible caching system with Redis/MySQL/SQLite/File/Memory
  * Updated key features with expanded descriptions
  * Configuration examples for all caching systems
- Updated DOCUMENTATION.md with:
  * Claude Code provider in the Provider Support section
  * Enhanced provider descriptions with caching capabilities
  * Reference to the Claude OAuth2 setup documentation
- Enhanced CLAUDE_OAUTH2_SETUP.md with a key features list
- Added clarifying comments to aisbf/claude_auth.py

All documentation now accurately reflects the codebase, with complete coverage of the caching systems (response cache and model embeddings cache), request deduplication via SHA256, and all implemented features.
-
Your Name authored
- Document user-specific API endpoints: /api/user/models, /api/user/providers, /api/user/rotations, /api/user/autoselects, /api/user/chat/completions
- Document user MCP tools: list_user_models, list_user_providers, set_user_provider, delete_user_provider, list_user_rotations, set_user_rotation, delete_user_rotation, list_user_autoselects, set_user_autoselect, delete_user_autoselect, user_chat_completion
- Update the user dashboard with clear endpoint documentation
- Add enhanced analytics for user token usage tracking
- Add database improvements for user token management
-
Your Name authored
- Added /api/user/* endpoints for authenticated users to access their own configurations
- Admin users get access to global + user configs; regular users get user-only access
- Global tokens from aisbf.json have full access to all configurations
- Enhanced MCP with user-specific tools for authenticated users
- Updated the user dashboard with comprehensive API endpoint documentation
- Updated README.md and DOCUMENTATION.md with the new endpoint documentation
- Updated CHANGELOG.md with the new features
- Bumped version to 0.9.1
-
Your Name authored
Add pricing extraction (rate_multiplier, rate_unit, prompt/completion tokens) and auto-configure rate limits on 429.

- Parse rate_multiplier and rate_unit from the nexlab API as pricing
- Parse promptTokenPrice and completionTokenPrice from the AWS Q API
- Extract pricing from OpenRouter-style API responses for the OpenAI provider
- Add _auto_configure_rate_limits to extract X-RateLimit-* headers
- Update parse_429_response to capture rate limit headers
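The header extraction behind `_auto_configure_rate_limits` might look like the sketch below. The `X-RateLimit-*` names follow the common convention the commit references; the returned dict shape and the function name are assumptions:

```python
# Sketch: read rate-limit hints from a 429 response's headers so the client
# can configure its own limits. Header lookup is case-insensitive because
# proxies often change header casing.

def parse_rate_limit_headers(headers: dict) -> dict:
    lowered = {k.lower(): v for k, v in headers.items()}
    mapping = {
        "x-ratelimit-limit": "limit",
        "x-ratelimit-remaining": "remaining",
        "x-ratelimit-reset": "reset",
        "retry-after": "retry_after",
    }
    result = {}
    for header, key in mapping.items():
        if header in lowered:
            try:
                result[key] = float(lowered[header])
            except ValueError:
                pass  # some APIs send an HTTP date here; ignored in this sketch
    return result
```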
-
Your Name authored
Add model metadata fields (top_provider, pricing, description, supported_parameters, architecture) and a dashboard Get Models button.

- Update providers.py to extract all fields from provider API responses
- Add max_input_tokens support for the Claude provider context size
- Add top_provider, pricing, description, supported_parameters, and architecture fields
- Update cache functions to save/load the new metadata fields
- Update handlers.py to expose the new fields in the model list response
- Add a Get Models button to the dashboard
-
- 27 Mar, 2026 5 commits
Your Name authored
- Added filter parameters to the analytics route in main.py
- Updated get_model_performance() to support filtering by provider, model, rotation, and autoselect
- Added get_rotations_stats() and get_autoselects_stats() methods
- Added filter UI to analytics.html with dropdowns for filtering
- Updated the Model Performance table to show the type (Provider/Rotation/Autoselect)
-
Your Name authored
- Added a new dashboard template (templates/dashboard/users.html) for managing users
- Added routes in main.py: GET /dashboard/users, POST /dashboard/users/add, POST /dashboard/users/{id}/edit, POST /dashboard/users/{id}/toggle, POST /dashboard/users/{id}/delete
- Added a 'Users' link to the navigation menu (visible only to admin users)
- Added an update_user method to database.py for editing user details

Features:
- Add new users with username, password, and role (user/admin)
- Edit existing user details
- Toggle user active/inactive status
- Delete users
Your Name authored
- Add AdaptiveRateLimiter class in aisbf/providers.py for per-provider adaptive rate limiting
- Enhance 429 handling with exponential backoff and jitter
- Track 429 patterns per provider with a configurable history window
- Implement dynamic rate limit adjustment that learns from 429 responses
- Add rate limit headroom (stays 10% below learned limits)
- Add gradual recovery after consecutive successful requests
- Add AdaptiveRateLimitingConfig in aisbf/config.py
- Add adaptive_rate_limiting configuration to config/aisbf.json
- Add dashboard UI at /dashboard/rate-limits
- Add dashboard API endpoints for stats and reset functionality
- Update TODO.md to mark item #8 as completed
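The backoff and headroom behaviors described above can be sketched as two small helpers; the full-jitter strategy and the default constants are assumptions, not the actual AdaptiveRateLimiter internals:

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Full-jitter exponential backoff: uniform over [0, min(cap, base * 2^attempt)].

    Jitter spreads retries out so many clients hitting 429 at once do not
    all retry in lockstep.
    """
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))

def with_headroom(learned_limit: float, headroom: float = 0.10) -> float:
    """Stay a fraction below a learned rate limit (10% headroom by default)."""
    return learned_limit * (1.0 - headroom)
```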
-
Your Name authored
- Add aisbf/analytics.py module with an Analytics class for tracking token usage, request counts, latency, error rates, and cost estimation per provider
- Add templates/dashboard/analytics.html with a comprehensive dashboard page
- Integrate analytics recording into RequestHandler, RotationHandler, and AutoselectHandler
- Add /dashboard/analytics route in main.py
- Add an Analytics link to the base.html navigation
- Update CHANGELOG.md with the new feature documentation

Features:
- Token usage tracking with database persistence
- Real-time request counts and latency tracking
- Error rate and error type tracking
- Cost estimation per provider (Anthropic, OpenAI, Google, Kiro, OpenRouter)
- Model performance comparison
- Token usage over time visualization (1h, 6h, 24h, 7d)
- Optimization recommendations
- Export functionality (JSON, CSV)
- Integration with all request handlers
- Support for rotation_id and autoselect_id tracking
-
Your Name authored
- Add aisbf/streaming_optimization.py module with:
  - StreamingConfig: configuration dataclass for optimization settings
  - ChunkPool: memory-efficient chunk object reuse pool
  - BackpressureController: flow control to prevent overwhelming consumers
  - StreamingOptimizer: main coordinator combining all optimizations
  - KiroSSEParser: optimized SSE parser for Kiro streaming
  - OptimizedTextAccumulator: memory-efficient text accumulation
  - calculate_google_delta(): incremental delta calculation
- Update aisbf/handlers.py to integrate the streaming optimizations:
  - Use chunk pooling for Google streaming
  - Use OptimizedTextAccumulator for memory efficiency
  - Add delta-based streaming for the Google provider
  - Integrate KiroSSEParser for the Kiro provider
- Update setup.py to include streaming_optimization.py
- Update pyproject.toml with package data
- Update TODO.md with completed status
- Update README.md with the new feature description
- Update CHANGELOG.md with streaming optimization details

Expected benefits:
- 10-20% memory reduction in streaming responses
- Better flow control with backpressure handling
- Optimized Google and Kiro streaming with delta calculation
- Configurable optimization via StreamingConfig
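A chunk object pool of the kind listed above can be sketched in a few lines; the interface (acquire/release, max pool size) is an assumption based on the commit message, not the actual ChunkPool API:

```python
# Sketch: reuse dict objects for streaming chunks instead of allocating a
# fresh one per SSE event, trading a small free-list for less GC pressure.

class ChunkPool:
    def __init__(self, max_size: int = 64):
        self._free: list[dict] = []
        self._max_size = max_size

    def acquire(self) -> dict:
        """Hand out a recycled chunk dict, or a new one if the pool is empty."""
        return self._free.pop() if self._free else {}

    def release(self, chunk: dict) -> None:
        """Return a chunk for reuse; wipe stale fields and cap the pool size."""
        chunk.clear()
        if len(self._free) < self._max_size:
            self._free.append(chunk)
```

Pooling only pays off when chunks are produced and consumed at high rate on one stream; releasing a chunk that a consumer still holds would corrupt it, so the caller must release only after the chunk has been serialized.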
-
- 26 Mar, 2026 8 commits
Your Name authored
- Add aisbf/batching.py module with a RequestBatcher class
- Implement time-based (100ms window) and size-based batching
- Add provider-specific batching configurations (OpenAI: 10, Anthropic: 5)
- Integrate batching with BaseProviderHandler
- Add batching configuration to config/aisbf.json
- Initialize the batching system in main.py startup
- Update version to 0.8.0 in setup.py and pyproject.toml
- Add batching.py to setup.py data_files
- Update README.md and TODO.md documentation
- Expected benefit: 15-25% latency reduction

Features:
- Automatic batch formation and processing
- Response splitting and distribution
- Statistics tracking (batches formed, requests batched, average batch size)
- Graceful error handling and fallback
- Non-blocking async queue management
- Streaming request bypass (batching disabled for streams)
-
Your Name authored
-
Your Name authored
- Optimized the existing condensation methods (hierarchical, conversational, semantic, algorithmic)
- Added 4 new condensation methods (sliding_window, importance_based, entity_aware, code_aware)
- Fixed critical bugs in the conversational and semantic methods (undefined variables)
- Added internal model warm-up functionality for faster first inference
- Implemented condensation analytics (effectiveness %, latency tracking)
- Added similarity detection in the algorithmic method using difflib
- Support for condensation method chaining
- Per-model condensation thresholds
- Adaptive condensation based on context size
- Updated README, TODO, DOCUMENTATION, and CHANGELOG
-
Your Name authored
- Add ResponseCache class with multiple backend support (memory, Redis, SQLite, MySQL)
- Implement LRU eviction for the memory backend with a configurable max size
- Add SHA256-based cache key generation for request deduplication
- Implement TTL-based expiration (default: 600 seconds)
- Add cache statistics tracking (hits, misses, hit rate, evictions)
- Integrate caching into RequestHandler, RotationHandler, and AutoselectHandler
- Add granular cache control at the model, provider, rotation, and autoselect levels
- Implement hierarchical configuration: Model > Provider > Rotation > Autoselect > Global
- Add dashboard endpoints for cache statistics (/dashboard/response-cache/stats) and clearing (/dashboard/response-cache/clear)
- Add response cache initialization in the main.py startup event
- Skip caching for streaming requests
- Add a comprehensive test suite (test_response_cache.py) with 6 test scenarios
- Update configuration models with enable_response_cache fields
- Update CHANGELOG.md with the response caching features
- Update TODO.md to mark Response Caching as completed

Files created:
- aisbf/response_cache.py (740+ lines)
- test_response_cache.py (comprehensive test suite)

Files modified:
- aisbf/handlers.py (cache integration and _should_cache_response helper)
- aisbf/config.py (ResponseCacheConfig and enable_response_cache fields)
- config/aisbf.json (response_cache configuration section)
- main.py (response cache initialization)
- TODO.md (mark task as completed)
- CHANGELOG.md (document new features)
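SHA256 key generation and TTL expiry for the memory backend might look like the sketch below; hashing the canonical JSON form is the natural way to deduplicate requests, but the exact fields hashed and the class shape are assumptions:

```python
import hashlib
import json
import time

def cache_key(payload: dict) -> str:
    """Deterministic key: SHA256 of the canonical JSON form of the request.

    sort_keys makes the key independent of dict insertion order, so two
    identical requests always deduplicate to the same entry.
    """
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

class TTLCache:
    """Minimal memory backend with TTL-based expiration (no LRU here)."""

    def __init__(self, ttl: float = 600.0):
        self.ttl = ttl
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazily evict on access
            return None
        return value

    def set(self, key: str, value) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)
```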
-
Your Name authored
- Implement Anthropic cache_control support for 50-70% cost reduction
- Add Google Context Caching API framework with TTL configuration
- Add provider-level caching configuration (enable_native_caching, cache_ttl, min_cacheable_tokens)
- Update the dashboard UI with caching settings
- Update the documentation with a detailed caching guide and examples
- Mark system messages and conversation prefixes as cacheable automatically
- Test Python compilation and validate the implementation
-
Your Name authored
- Implement user-specific configuration isolation with a SQLite database
- Add user management, authentication, and role-based access control
- Create user-specific providers, rotations, and autoselect configurations
- Add API token management and usage tracking per user
- Update handlers to support user-specific configs with fallback to global
- Add MCP support for user-specific configurations
- Update the documentation and README with the multi-user features
- Add user dashboard templates for configuration management
-
Your Name authored
- Integrate the existing SQLite database module with full functionality
- Add persistent token usage tracking across application restarts
- Implement context dimension tracking and effective context updates
- Add automatic database cleanup on startup (records 7+ days old)
- Implement multi-user authentication with role-based access control
- Add user management with isolated configurations (providers, rotations, autoselects)
- Enable user-specific API token management and usage tracking
- Update the dashboard with role-based access (admin vs user dashboards)
- Add database-first authentication with config admin fallback
- Update README, TODO, and documentation with the database features
- Cache model embeddings for semantic classification performance
-
Your Name authored
- Add NSFW/privacy boolean fields to models (providers.json, rotations.json, autoselect.json)
- Implement content classification using the last 3 messages for performance
- Add semantic classification with hybrid BM25 + sentence-transformer re-ranking
- Update the autoselect handler to support the classify_semantic flag
- Add new semantic_classifier.py module with hybrid search capabilities
- Update dashboard templates to manage the new configuration fields
- Update documentation (README.md, DOCUMENTATION.md) with the new features
- Bump version to 0.6.0 in pyproject.toml and setup.py
- Add new dependencies: sentence-transformers, rank-bm25
- Update package configuration for PyPI distribution
-
- 23 Mar, 2026 14 commits
Your Name authored
- Added Kiro AWS Event Stream parsing and converters
- Added TOR hidden service support
- Added MCP server endpoint
- Added credential validation for kiro/kiro-cli
- Added various Python 3.13 compatibility fixes
- Added intelligent 429 rate limit handling
- Updated venv handling and auto-update features
-
Your Name authored
-
Your Name authored
- Added an Unreleased section for the OpenRouter fields implementation
- Added web dashboard documentation updates
- Documented the Model class fixes for API compatibility
-
Your Name authored
- Added screenshot.png reference at the top of the README
- Added a Web Dashboard section describing its features
- Includes dashboard access information and default credentials
-
Your Name authored
- Added description, context_length, architecture, pricing, top_provider, supported_parameters, and default_parameters fields
- Fixes a crash when returning model data through the API with OpenRouter metadata
- Aligns the Model class with ProviderModelConfig, RotationConfig, and AutoselectConfig
-
Your Name authored
- Add aisbf/kiro_parsers.py: AWS Event Stream parser for Kiro API responses
- Update kiro_converters_openai.py: add build_kiro_payload_from_dict function
- Update kiro_converters.py: minor fixes
- Update kiro_auth.py: add AWS SSO OIDC authentication support
- Update handlers.py: enhance streaming and error handling
- Update main.py: add proxy headers middleware and configuration
- Update setup.py: version bump
- Add TODO.md: comprehensive roadmap for caching and performance improvements

Features:
- Kiro AWS Event Stream parsing for non-streaming responses
- OpenAI-to-Kiro payload conversion
- AWS SSO OIDC authentication for Kiro
- Proxy headers middleware for reverse proxy support
- TODO roadmap with prioritized items for future development
-
Your Name authored
- Avoid calling parser.get_tool_calls() during the streaming loop
- Only call get_tool_calls() after all chunks are processed
- Send tool calls in a separate chunk after content streaming completes
- Prevents the empty-arguments issue caused by premature finalization
-
Your Name authored
- Updated validate_kiro_credentials() to handle both dict and object config access
- Added a get_config_value() helper function for flexible config access
- Now correctly detects kiro-cli credentials from the SQLite database
- Validation now works out of the box when kiro-cli is installed
-
Your Name authored
- Added validate_kiro_credentials() function to check credential availability
- Validates kiro IDE credentials from ~/.config/Code/User/globalStorage/amazon.q/credentials.json
- Validates kiro-cli credentials from ~/.local/share/kiro-cli/data.sqlite3
- Integrated validation into get_provider_models() to exclude providers without credentials
- Added validation checks to all request endpoints (chat, audio, images, embeddings)
- Providers only appear in model listings and accept requests when credentials are valid
- Returns HTTP 403 when credentials are missing or invalid
-
Your Name authored
- Updated base.html to use request.session instead of session
- Resolves jinja2.exceptions.UndefinedError: 'session' is undefined
- Completes the Python 3.13 compatibility fix
-
Your Name authored
- Create the venv with the --system-site-packages flag to access the system aisbf
- Remove the redundant aisbf package installation in the venv
- The venv now only contains requirements.txt dependencies
- On upgrade detection, only update requirements.txt dependencies
- The system-installed aisbf package is automatically accessible in the venv
-
Your Name authored
- Removed the redundant 'session' parameter from all template responses
- request.session is already accessible in templates via the request object
- Fixes TypeError: unhashable type: 'dict' in the Jinja2 cache with Python 3.13
-
Your Name authored
- Added check_package_upgrade() function to detect when the aisbf package is upgraded
- Modified ensure_venv() to automatically update the venv when a package upgrade is detected
- Version tracking using a .aisbf_version file in the venv directory
- Ensures the venv stays in sync with the pip-installed package version
-
Your Name authored
-