Initial commit: Add CoderAI OpenAI-compatible API server
- Add main server script with FastAPI and memory-aware model loading
- Add requirements.txt with dependencies and platform-specific PyTorch options
- Add comprehensive README.md with installation, usage, and troubleshooting
- Add LICENSE.md with GPLv3 license
Files added (new, mode 100644):

- LICENSE.md
- README.md
- coderai
- requirements.txt
requirements.txt:

# FastAPI and server dependencies
fastapi>=0.104.0
uvicorn[standard]>=0.24.0
pydantic>=2.5.0

# PyTorch - Uncomment the appropriate version for your system:
# For NVIDIA (CUDA):
# torch>=2.0.0
# torchvision>=0.15.0
# torchaudio>=2.0.0

# For AMD (ROCm):
# --index-url https://download.pytorch.org/whl/rocm5.4.2
# torch>=2.0.0
# torchvision>=0.15.0
# torchaudio>=2.0.0

# For CPU only:
torch>=2.0.0

# ML dependencies
transformers>=4.35.0
accelerate>=0.24.0

# System resource detection
psutil>=5.9.0
procname>=0.3.0

# Optional: for better performance
# bitsandbytes>=0.41.0   # for 4-bit/8-bit quantization
# sentencepiece>=0.1.99  # for some tokenizers
# protobuf>=3.20.0       # for some models

# Optional: Flash Attention 2 for faster inference on supported GPUs
# Requires specific CUDA/ROCm versions and may need manual installation
# Install with: pip install flash-attn --no-build-isolation
# flash-attn>=2.5.0

# Installation instructions:
# 1. For NVIDIA GPUs: pip install torch torchvision torchaudio
# 2. For AMD GPUs: pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
# 3. For CPU only: pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
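The commit message mentions memory-aware model loading, with psutil listed under "System resource detection". A minimal sketch of what such a selection step might look like follows; the model names and memory thresholds here are hypothetical, not taken from the actual server script:

```python
def pick_model(available_gb: float) -> str:
    """Pick the largest model tier that fits in the given available memory.

    Tiers are ordered largest-first; the memory figure would come from
    psutil at startup (hypothetical tiers, for illustration only).
    """
    tiers = [
        (24.0, "codellama-13b"),   # needs roughly 24 GB free
        (12.0, "codellama-7b"),    # needs roughly 12 GB free
        (6.0, "tinyllama-1.1b"),   # smallest tier
    ]
    for min_gb, name in tiers:
        if available_gb >= min_gb:
            return name
    # Fall back to the smallest model if nothing fits comfortably.
    return tiers[-1][1]


# In the server, the available-memory figure could be obtained via psutil
# (as listed in requirements.txt):
#   import psutil
#   available_gb = psutil.virtual_memory().available / 1024**3
```

Keeping the threshold check separate from the psutil call makes the selection logic easy to test without touching real system memory.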