Comparing changes

* refine docs for multi-backend alpha release * docs: further tweaks to multi-backend alpha docs * docs: further tweaks to multi-backend alpha docs * docs: further tweaks to multi-backend alpha docs * docs: add multi-backend feedback links * docs: add request for contributions * docs: small fixes * docs: small fixes * docs: add info about `main` continuous build * docs: further tweaks to multi-backend alpha docs * docs: further tweaks to multi-backend alpha docs

…1385)

* Add build job for rocm * Add rocm build script * Copy shared obj file into output_dir * upload build artifacts and enable wheels build * Remove cuda build temporarily * Add ROCm version to .so filename * Add rocm_version to whls build * Revert "Remove cuda build temporarily" This reverts commit 1413c5f. * Add rocm_version env var * Remove thrush header files * Print node info * print cuda node info * Revert "print cuda node info" This reverts commit cdb209a. * Revert "Print node info" This reverts commit 7e9a65c. * Add rocm arch to compile command * Rename .so files to rocm * Update default gpu arch * Skip cpu based igemmlt int tests on ROCm * Update Documentation * Update upstream repo name * Update docs * Update string format Co-authored-by: Aarni Koskela <[email protected]> * Remove pre-release option for torch install * Update pytorch install path Co-authored-by: Titus <[email protected]> * Add messages for Heuristics error * Remove toolcache for disk space * print disk usage * Clean disk space for linux * Fix for ubuntu * Add sudo for apt clean * Update clean up disk list * remove disk usage print * Add BNB_BACKEND variable * Update diagnostic functions for ROCm * Fix tuple error * Fix library detection bug for recursive and symlink cases * fix pre-commit errors * Remove recursive path lib search * Create function for runtime lib patterns * Update logger format Co-authored-by: Aarni Koskela <[email protected]> * Update error reporting Co-authored-by: Aarni Koskela <[email protected]> * Remove commented code Co-authored-by: Aarni Koskela <[email protected]> * Update error reporting Co-authored-by: Aarni Koskela <[email protected]> * Update error reporting * Create hip diagnostics functions * Fix Typo * Fix pre-commit checks * Enable 6.2 build * Skip gemv 4 bit cpu test * Update documentation for 6.2.0 pip install * Update README for default branch change * Fix typo * Sync README with upstream * Remove depth --------- Co-authored-by: Aarni Koskela <[email protected]> Co-authored-by: Titus <[email protected]> Co-authored-by: Aswin John Mathews <[email protected]> Co-authored-by: root <[email protected]>

* Update pre-commit tools * Fix typos

* Fix syntax warning * Update the ruff rules to check for warnings

* [Build] Add CUDA 12.6.2 build; update 12.5.0 to 12.5.1 * bump cuda-toolkit action version * Update docs for cuda versions

* Start of int8 refactor: remove col32/col_ampere/col_turing transforms in new igemmlt implementation * Fix unintended change * New naive mm_dequant kernel for row-major; cleanup * fix * int8 refactor: initial sparse decomp, cleanup * Int8 refactoring: remove separate NO_CUBLASLT build; more cleanup * int8: inference optimizations, some cleanup * int8: more tests passing, cleanup * int8 - more cleanup, most tests passing * int8: specify CUDA stream for int8 ops * perf: reduce overhead from getting cudaStream ptr * Mark some functions for deprecation. * int8 sparse decomp: small perf improvement * update setup.py * Update bitsandbytes/autograd/_functions.py Co-authored-by: Aarni Koskela <[email protected]> * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <[email protected]> * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <[email protected]> * Update bitsandbytes/research/autograd/_functions.py Co-authored-by: Aarni Koskela <[email protected]> * int8 - perf improvement for sparse decomposition inference; deprecate get_tensor_stream() in favor of new private fn * int8 cleanup * Ignore ruff rule ISC001 (incompatible with formatter) * add comment * int8 more cleanup * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <[email protected]> * int8: rename / deprecate old fn signatures * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <[email protected]> * type annotation * format update * Update bitsandbytes/research/autograd/_functions.py Co-authored-by: Aarni Koskela <[email protected]> * cleanup * Add comment to explain division optimization * more cleanup * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <[email protected]> * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <[email protected]> * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <[email protected]> * cleanup * Type annotations, cleanup * remove unused kernels; improved type annotations * small perf optimization for single-GPU systems * small perf optimization for single-GPU systems * update docstrings * Improve docs and tests * Update docstring * Update test * add benchmarking script * test cleanup: add deprecated marker, move benchmarks out * Add int8 dequant function; misc improvements * int8 matmul fallback for inner dims not divisible by 4 * improve register usage of kInt8VectorQuant - especially for A100/H100 * disable fail-fast for package build * maxwell compat * ptxas verbose * docs update * doc update * backward fix * Bugfix sparse decomp * Int8 fix for PEFT OLoRA init * Fix test for deprecated spmm_coo * test improvement * doc update * typo * doc cleanup * docs * add inference benchmark script * Add benchmarks, doc update --------- Co-authored-by: Aarni Koskela <[email protected]>

Commits on Nov 14, 2024

Fix Windows build.

matthewdouglas authored Nov 14, 2024

1 Configuration menu

View commit details

Copy full SHA for 9264f02

Browse repository at this point

Copy the full SHA

9264f02 View commit details

Browse the repository at this point in the history

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Comparing changes

Open a pull request

Commits on Sep 30, 2024

Commits on Oct 1, 2024

Commits on Oct 14, 2024

Commits on Oct 16, 2024

Commits on Oct 23, 2024

Commits on Nov 14, 2024

Commits on Nov 19, 2024

Commits on Dec 2, 2024

Commits on Dec 5, 2024

This comparison is taking too long to generate.

Uh oh!