News

I want to evaluate the improvement from chunked prefill by comparing V0 and V1, but I think this is impossible. In the V1 architecture the scheduled requests are not differentiated in any way, which means that the ...
PyTorch version: 2.6.0+cu124
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
...