GGUF Chat
Select a Model
Model                  Quantization  Params  Size    Notes
Llama 2 7B Chat        Q4_K_M        7B      3.8 GB  Recommended
Mistral 7B Instruct    Q4_K_M        7B      4.1 GB
Llama 2 13B Chat       Q4_K_M        13B     7.2 GB
CodeLlama 7B Instruct  Q4_K_M        7B      3.9 GB
Load Custom GGUF Model
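The built-in model list above amounts to a small catalog data structure. A minimal sketch in Python, assuming the app keeps its models as a list of dictionaries; the field names are hypothetical, not the app's actual schema:

```python
# Illustrative catalog of the built-in models shown above; field names
# are assumptions for this sketch, not the app's real schema.
MODEL_CATALOG = [
    {"name": "Llama 2 7B Chat",       "quant": "Q4_K_M", "params": "7B",  "size_gb": 3.8, "recommended": True},
    {"name": "Mistral 7B Instruct",   "quant": "Q4_K_M", "params": "7B",  "size_gb": 4.1, "recommended": False},
    {"name": "Llama 2 13B Chat",      "quant": "Q4_K_M", "params": "13B", "size_gb": 7.2, "recommended": False},
    {"name": "CodeLlama 7B Instruct", "quant": "Q4_K_M", "params": "7B",  "size_gb": 3.9, "recommended": False},
]
```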
Loading Model
llama-2-7b-chat.Q4_K_M.gguf
0% • 0 MB / 3.8 GB
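The progress readout above (percent plus downloaded/total size) maps naturally onto a streamed HTTP download. A minimal sketch using the Python requests library; the download URL is a placeholder, and the real app may fetch models differently:

```python
import requests

def download_gguf(url: str, dest_path: str) -> None:
    """Stream a GGUF file to disk, reporting progress like the UI above."""
    with requests.get(url, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        total = int(resp.headers.get("content-length", 0))
        done = 0
        with open(dest_path, "wb") as f:
            for chunk in resp.iter_content(chunk_size=1 << 20):  # 1 MiB chunks
                f.write(chunk)
                done += len(chunk)
                if total:
                    pct = done * 100 // total
                    print(f"\r{pct}% • {done / 1e6:.0f} MB / {total / 1e9:.1f} GB",
                          end="", flush=True)
    print()

# Placeholder URL: substitute the real download location of the chosen model.
# download_gguf("https://example.com/llama-2-7b-chat.Q4_K_M.gguf",
#               "llama-2-7b-chat.Q4_K_M.gguf")
```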
Model: llama-2-7b-chat.Q4_K_M.gguf
Switch Model
Settings
Model Configuration
Context Length: 2048 / 4096 / 8192 tokens
GPU Layers: 0 (CPU only) / 30 layers / 100 (max)
Use 4-bit quantization
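These options correspond directly to llama.cpp load-time parameters. A minimal sketch, assuming the backend is llama-cpp-python; note that for GGUF files the quantization level is fixed when the file is created, so a Q4_K_M file is already 4-bit regardless of any runtime toggle:

```python
from llama_cpp import Llama

# Assumed mapping of the settings panel onto llama-cpp-python arguments.
llm = Llama(
    model_path="llama-2-7b-chat.Q4_K_M.gguf",
    n_ctx=4096,       # Context Length: 2048 / 4096 / 8192 tokens
    n_gpu_layers=30,  # GPU Layers: 0 keeps everything on the CPU
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=128,
)
print(reply["choices"][0]["message"]["content"])
```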
Application Settings
Dark Mode
Save Chat History
Save Settings
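The Save Settings action and the two application toggles imply a small persistence layer. A minimal sketch, assuming settings are stored as JSON; the file path and default values are assumptions for illustration:

```python
import json
from pathlib import Path

SETTINGS_PATH = Path("settings.json")  # hypothetical location

DEFAULTS = {
    "dark_mode": True,
    "save_chat_history": True,
    "context_length": 4096,
    "gpu_layers": 0,
}

def save_settings(settings: dict) -> None:
    """Persist the settings panel (what 'Save Settings' would trigger)."""
    SETTINGS_PATH.write_text(json.dumps(settings, indent=2))

def load_settings() -> dict:
    """Load saved settings, falling back to the assumed defaults."""
    if SETTINGS_PATH.exists():
        return {**DEFAULTS, **json.loads(SETTINGS_PATH.read_text())}
    return dict(DEFAULTS)
```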