nexlab / coderai
Commit 10d10573, authored Mar 01, 2026 by Stefy Lanza (nextime / spora)
Disable bitsandbytes quantization for Qwen3.5-A3B/MoE models which don't support it
Parent: 8665016a
Changes: 1
Showing 1 changed file with 13 additions and 7 deletions.

coderai (+13 -7)
@@ -665,14 +665,20 @@ class NvidiaBackend(ModelBackend):
         # Prepare model loading arguments
         load_kwargs = {'trust_remote_code': True}
 
+        # Check if model supports quantization
         if load_in_4bit or load_in_8bit:
-            try:
-                import bitsandbytes as bnb
-                print(f"Using {4 if load_in_4bit else 8}-bit quantization")
-                load_kwargs['load_in_4bit'] = load_in_4bit
-                load_kwargs['load_in_8bit'] = load_in_8bit
-            except ImportError:
-                print("Warning: bitsandbytes not installed. Quantization disabled.")
+            # Qwen3.5-A3B/MoE models don't support bitsandbytes quantization
+            if 'qwen3.5' in model_name.lower() and ('a3b' in model_name.lower() or 'moe' in model_name.lower()):
+                print(f"Warning: {model_name} does not support bitsandbytes quantization (load_in_4bit/load_in_8bit)")
+                print("Quantization disabled for this model")
+            else:
+                try:
+                    import bitsandbytes as bnb
+                    print(f"Using {4 if load_in_4bit else 8}-bit quantization")
+                    load_kwargs['load_in_4bit'] = load_in_4bit
+                    load_kwargs['load_in_8bit'] = load_in_8bit
+                except ImportError:
+                    print("Warning: bitsandbytes not installed. Quantization disabled.")
 
         # Set dtype
         if self.device == "cuda":
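The name check and the quantization fallback in this change can be exercised on their own. The following is a minimal sketch, not code from the commit: the helper names `supports_bnb_quantization` and `build_load_kwargs`, the `bnb_available` parameter, and the model-name strings in the usage note are all hypothetical and exist only for illustration. The substring test itself mirrors the condition in the diff.

```python
import importlib.util
from typing import Optional


def supports_bnb_quantization(model_name: str) -> bool:
    """Hypothetical helper: False when the name matches the Qwen3.5-A3B/MoE
    substring pattern used in the commit, True otherwise."""
    name = model_name.lower()
    is_qwen35_moe = 'qwen3.5' in name and ('a3b' in name or 'moe' in name)
    return not is_qwen35_moe


def build_load_kwargs(model_name: str,
                      load_in_4bit: bool = False,
                      load_in_8bit: bool = False,
                      bnb_available: Optional[bool] = None) -> dict:
    """Hypothetical helper assembling load_kwargs the way the diff does.

    bnb_available is an assumption added for testability; when left as None
    the function checks whether bitsandbytes is importable.
    """
    load_kwargs = {'trust_remote_code': True}
    if not (load_in_4bit or load_in_8bit):
        return load_kwargs

    if not supports_bnb_quantization(model_name):
        print(f"Warning: {model_name} does not support bitsandbytes quantization")
        return load_kwargs

    if bnb_available is None:
        # find_spec returns None when the top-level package is not installed
        bnb_available = importlib.util.find_spec("bitsandbytes") is not None

    if bnb_available:
        load_kwargs['load_in_4bit'] = load_in_4bit
        load_kwargs['load_in_8bit'] = load_in_8bit
    else:
        print("Warning: bitsandbytes not installed. Quantization disabled.")
    return load_kwargs
```

For example, `build_load_kwargs("Qwen/Qwen3.5-35B-A3B", load_in_4bit=True, bnb_available=True)` leaves the quantization keys out, while a name without the pattern keeps them. Because the check is a plain substring match, any name containing both "qwen3.5" and either "a3b" or "moe" is treated as unsupported, regardless of what else the name contains.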