adjusted layer estimation
LostRuins committed Jul 24, 2024
1 parent cca2fa9 commit d1f7832
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion koboldcpp.py
@@ -612,7 +612,7 @@ def autoset_gpu_layers(filepath,ctxsize,gpumem): #shitty algo to determine how m
headcount = ggufmeta[1]
headkvlen = (ggufmeta[2] if ggufmeta[2] > 0 else 128)
ratio = mem/(fsize*csmul*1.5)
- computemem = layers*4*headkvlen*cs*4*1.35 # For now the first 4 is the hardcoded result for a blasbatchsize of 512. Ideally we automatically calculate blasbatchsize / 4 but I couldn't easily grab the value yet - Henk
+ computemem = layers*4*headkvlen*cs*4*1.4 # For now the first 4 is the hardcoded result for a blasbatchsize of 512. Ideally we automatically calculate blasbatchsize / 4 but I couldn't easily grab the value yet - Henk
contextmem = layers*headcount*headkvlen*cs*4
reservedmem = 1.5*1024*1024*1024 # Users often don't have their GPU's VRAM worth of memory, we assume 500MB to avoid driver swapping + 500MB for the OS + 500MB for background apps / browser - Henk
if headcount > 0:
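
To see what the changed constant does in isolation, here is a minimal sketch that repeats the three buffer assignments from the hunk in a standalone helper. The function name `estimate_buffers`, the `compute_factor` parameter, and the example model shape are hypothetical and not part of koboldcpp.py; `cs` is assumed to be the context size in tokens. How the surrounding function combines these values is not visible in the hunk, so that part is not reproduced.

```python
def estimate_buffers(layers, headcount, headkvlen, cs, compute_factor=1.4):
    """Return (computemem, contextmem, reservedmem) in bytes.

    Mirrors the three assignments shown in the diff; compute_factor is the
    trailing multiplier that this commit raises from 1.35 to 1.4.
    """
    # The first 4 is the hardcoded value for a blasbatchsize of 512 (per the source comment).
    computemem = layers * 4 * headkvlen * cs * 4 * compute_factor
    contextmem = layers * headcount * headkvlen * cs * 4
    # 1.5 GiB held back, as in the source.
    reservedmem = 1.5 * 1024 * 1024 * 1024
    return computemem, contextmem, reservedmem


# Hypothetical model shape, chosen only for illustration.
shape = dict(layers=32, headcount=32, headkvlen=128, cs=4096)

old_compute, _, _ = estimate_buffers(**shape, compute_factor=1.35)
new_compute, contextmem, reservedmem = estimate_buffers(**shape, compute_factor=1.4)

mib = 1024 * 1024
print(f"compute estimate: {old_compute / mib:.1f} MiB -> {new_compute / mib:.1f} MiB")
print(f"relative increase: {(new_compute / old_compute - 1) * 100:.2f}%")
```

Because the multiplier is a plain scale factor, the compute estimate grows by the same proportion (1.4 / 1.35, roughly 3.7%) for any model shape, while `contextmem` and `reservedmem` are unaffected by this commit.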
