Closed
Labels
bug: Something isn't working
Description
Describe the bug
The model loads successfully, but generation produces no output.
Is there an existing issue for this?
- I have searched the existing issues
Reproduction
Loaded llama-7b-4bit (4-bit quantized LLaMA 7B) and attempted to generate text in chat mode.
Logs
Starting the web UI...
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: Loading binary C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cudaall.dll...
Loading llama-7b-4bit...
Loading model ...
C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\safetensors\torch.py:99: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
with safe_open(filename, framework="pt", device=device) as f:
C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\storage.py:899: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = cls(wrap_storage=untyped_storage)
Done.
Loaded the model in 5.45 seconds.
Loading the extension "gallery"... Ok.
C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\gradio\deprecation.py:40: UserWarning: The 'type' parameter has been deprecated. Use the Number component instead.
warnings.warn(value)
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
File "C:\Users\Robert\Desktop\oobabooga-windows\text-generation-webui\modules\callbacks.py", line 64, in gentask
ret = self.mfunc(callback=_callback, **self.kwargs)
File "C:\Users\Robert\Desktop\oobabooga-windows\text-generation-webui\modules\text_generation.py", line 222, in generate_with_callback
shared.model.generate(**kwargs)
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\generation\utils.py", line 1485, in generate
return self.sample(
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\generation\utils.py", line 2524, in sample
outputs = self(
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 689, in forward
outputs = self.model(
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 577, in forward
layer_outputs = decoder_layer(
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 292, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 196, in forward
query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Robert\Desktop\oobabooga-windows\text-generation-webui\repositories\GPTQ-for-LLaMa\quant.py", line 426, in forward
quant_cuda.vecquant4matmul(x, self.qweight, y, self.scales, self.qzeros, self.groupsize)
TypeError: vecquant4matmul(): incompatible function arguments. The following argument types are supported:
1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: torch.Tensor, arg3: torch.Tensor, arg4: torch.Tensor, arg5: torch.Tensor) -> None
Invoked with: tensor([[ 0.0436, -0.0149, 0.0150, ..., 0.0267, 0.0112, -0.0011],
[-0.0108, 0.0345, -0.0282, ..., 0.0073, -0.0096, 0.0204],
[-0.0362, -0.0222, -0.0107, ..., -0.0035, -0.0108, 0.0189],
...,
[ 0.0324, 0.0055, 0.0122, ..., 0.0099, -0.0175, 0.0141],
[ 0.0160, -0.0103, -0.0197, ..., 0.0249, -0.0164, 0.0180],
[-0.0431, -0.0260, 0.0012, ..., 0.0075, -0.0076, -0.0037]],
device='cuda:0'), tensor([[ 2004248678, 2020046951, 1735952023, ..., -1738970729,
-1771669913, 1988708744],
[ 1752594295, 1985447527, 1719101559, ..., 1737979784,
1735882872, 1988584549],
[ 2003277431, -2038925705, 2003200134, ..., -1752671846,
-2055710840, -1418419097],
...,
[ 1987475319, -2021226904, 1719236470, ..., 1985514391,
1734904166, -1485412727],
[ 1988585302, 2004387686, 2020181895, ..., 947288215,
1701270918, 2019850854],
[ 1736935542, 2022213477, -2038995336, ..., -1484101769,
1718053495, 1151894375]], device='cuda:0', dtype=torch.int32), tensor([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]], device='cuda:0'), tensor([[0.0318, 0.0154, 0.0123, ..., 0.0191, 0.0206, 0.0137]],
device='cuda:0'), tensor([[ 1987540582, 1734764150, 1735812471, 1987540855, 1986492263,
2004248424, 1735878007, 2004313703, 1719035767, 1985439591,
1735821174, 1735817046, 2003265382, 1717991287, 1736931174,
-2023262601, 1752594023, 1735882598, 1986422375, 1450600038,
2004318070, 1987540599, 1988519783, 1719039591, 2004248182,
1986492038, 1719105398, 1734768246, -2038991258, 1752651622,
1181120102, 2003269237, 1735816808, 1736857191, 2004182904,
1736996453, -2041153432, 1718118021, 1719043958, 1717986935,
2003269735, 1734768503, 1735882343, 1736931174, 2003203975,
1753704296, 2003203702, 1718052728, 1986492006, 1719101046,
1987536758, 1718974055, 1735882599, 1734899575, 1735882599,
2003269479, 2003269239, 1718117990, 1467381350, 1718052486,
-2025429625, 2020042359, 1734768487, 1734829942, 1449617527,
2005297015, -1989913240, 2020104069, 1733715560, 1719105366,
-1771673769, 1467373158, 1719035767, 1734903431, 1986427223,
-2037942458, 1701209958, 1718904406, -2055833993, 1718965876,
-2038012041, 1736922759, 2004247893, -2008508537, 2003134327,
2003265381, 1719031639, 1969649238, 1735882597, 1719096677,
1720993382, 2004313734, 1702262901, 2020038503, 2003134312,
2003203958, 1987475079, 1972725351, 2019980646, 1986426742,
1701209974, 1733785206, 1735878503, 1735882598, 1987471221,
1953916516, -2039056537, -2055772569, 2004252022, 1986422151,
1717995127, 1735882615, 2019980903, -2023266714, 1467250774,
-2037881240, 1970685527, 2004383351, 1466332519, 1701336952,
2003269479, -2038007962, 1985312646, -2040047786, 1987475558,
1735944038, 1703302759, 1733855078, 1736865381, 2019915622,
-2039061129, -2023398026, 1182296151, 1703368567, -2041157496,
1751610757, 1718052454, 2003203943, 2021025382, 1987540838,
2003199864, 1734763895, 1448441429, -2025363337, 2002212726,
2002150998, 1718056583, -2056821131, 1719039319, 1986623350,
1719101046, 2003203958, -2038995099, 2003265111, 2004248438,
1451722359, 1988519557, -2040039561, 2004383604, 1717991286,
1971808103, 1716942951, 1734764422, 1766282614, 1969718917,
2019981141, 1751545685, 1987471223, 1701279591, 2020042376,
1969645160, 1467377527, 1450666105, 1970763607, 1986487639,
1467376998, 2003142518, 1468426103, 1987471175, 1450603879,
2003203942, 1986361191, 1720153735, 1986422392, 1734764664,
1734628966, -2024376202, 2004248439, 2004313958, 1718052726,
1467381367, 1701279606, 1750558357, 2002151319, 1969710711,
1971808103, 1735882104, 1718121845, 1483114102, 1735878230,
1700034391, 1450600055, 2004317814, 1717987175, 1719105399,
1986418791, 1969718886, 1702393702, 1986487927, -1737001371,
1450665830, 1735874166, 1987471207, 1986492006, 2003199847,
1987470950, -2056820873, -2054785162, 1719039863, 1719101014,
1735816807, 1718052471, 1735878262, 1986426470, -2041223578,
-2039126154, -2023319946, 1717987174, -2023274618, 1735817079,
1717008230, 1734833765, 1449682535, 1734908021, 2004252293,
-2006685577, 2003138198, 1467446873, 1717855863, 1987405670,
-2021231018, 1718052488, 1734767991, 1986491750, 2004318086,
1719039863, 1751672438, 1719048038, 1987413863, 1702458966,
1736927078, 2003265127, 2004248167, -1753847977, 1701275238,
1734829702, 2006410887, 1967683432, 1986484055, 1685550422,
1735878487, 1450604151, 1701209958, 1733715847, 1970681718,
-2038999449, 1751479959, -2023331995, 1702201160, 1988523862,
2004178790, 1702389606, 1969710965, 1986492295, 1702192246,
1768257686, 1734760312, 1736930918, 1719105398, 2004313719,
-2005506714, 1735878249, 1986422390, 1718122342, -2023331977,
1987536470, 1987475303, 2004313702, 1450600311, 2003269480,
1986422631, 1735812726, 2003199862, 1987475047, 1735878774,
1987475303, 1986430583, 1986422631, 1985377895, 1735878263,
-2040035449, 2003199623, 1719040119, 1987471223, 2020112246,
1718052198, 1718048391, -2038986890, 1701410439, -2022274698,
1717139317, 1736927079, 1734768247, 2005366391, 1718974310,
-2039056794, 2003199591, 1734833543, 1734829927, 1987475336,
2003277686, 1985377910, -2041162122, 1717987174, -2022348968,
1719105398, 1450665607, 1752589911, 1717991286, 1987535975,
1685485414, 1970771846, -1771604106, 1717987191, 1735820919,
1718056824, 1985443926, -2021300137, 1735874423, 1970825078,
1734764133, 1736862054, 1719105126, 1183213175, 1499883111,
1718060903, 1716938581, 1987471478, 1751606902, 1987540104,
1736865894, -1988659612, 1734764392, 1988585351, -2019068313,
1717995399, 1986426728, 1752655461, 1986553447, 1733716328,
2003330950, 1735878535, 1686595447, -2024249719, 1199015511,
1769367174, 1987475046, 2004318070, 1987467112, 1970694263,
1986422391, 2004252279, 1718113893, 2004317559, 1449617255,
1718122358, -2007541897, 1483167304, -2006485146, -2040175226,
2003199607, 2021091190, 1988523606, 1986483830, 2004186742,
1751607159, 1968597110, 1752598631, 2021029512, 1987536486,
1987471191, 1719109750, 1986426743, 1987475302, 1719101031,
-2024437913, 1969715031, 1736931192, 1466328678, 1735878520,
1719035767, 1466390375, 2004182390, 1701209974, 2004318054,
2003269511, 1734829415, 2003268726, 2004252773, 2004317799,
1718056839, 1735878519, -2021235353, 1987536487, 1449555815,
1969719142, 1769437031, 1733789544, 1985382246, 1718056566,
2004378983, 1717921143, -2039060889, 1986483831, 2003334504,
-2055838361, 1448510854, 1734760567, -2022217867, 1702324069,
1734895734, 1449616998, 1986426487, 1987409798, 1450669703,
1720158038, 2003200104, 2006414935, 1970689895, 1719035495,
1466390135, 2004248439, 1734833509, 1988585320, -2023331961,
1987536501, 1702323559, 2003265142, 1719039607, 1433888120,
1467447143, 2004313702, 1717999719, 1986357111, 1984390775,
1987540836, 2003265398, 1702324070, 1968662103, 2021155942,
1734763879, 1987475047, 1987541111, 1685481318, 1720150135,
1200126038, 2005436247, 1736992598, 1213618295, 1988524166,
1197893767, 1987475303, 1717007974, 1734768247, -2023266968,
2003334757, 1986483833, 1987544166, -2023266938, 1987466838,
1986487910, 1717982838, 1734768519, 2003261048, 1987540855,
2004182646, 2002220645, 2003138166, 1736865398, 1751541367,
2005362294, 2003265399, 1986488166, 1987475062, 1986487911,
1987540853, 2003264871, 1734764407, 2004313718, 2004318055,
1181120630, 1734764137, 1718056550, 1734838119, 1987540598,
1467381110, 1735882360, 2004248183, 1719101302, -2022283673,
1988523895, 1719100807]], device='cuda:0', dtype=torch.int32), 4096
Output generated in 1.17 seconds (0.00 tokens/s, 0 tokens, context 36)
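The TypeError in the traceback above is the usual symptom of a version mismatch between repositories/GPTQ-for-LLaMa/quant.py and the previously compiled quant_cuda extension: the Python side passes the integer self.groupsize (4096) as the sixth argument, while the compiled binding only accepts six tensors. A minimal sketch of that failure mode (the FakeTensor class and the function body below are illustrative stand-ins, not the real extension):

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; only used for type checks."""


def vecquant4matmul(x, qweight, y, scales, qzeros, sixth):
    # Emulates the stale compiled binding: it only accepts tensor arguments,
    # so passing the int groupsize as the sixth argument is rejected with
    # the same class of error as in the log above.
    for arg in (x, qweight, y, scales, qzeros, sixth):
        if not isinstance(arg, FakeTensor):
            raise TypeError(
                "vecquant4matmul(): incompatible function arguments. "
                "The following argument types are supported: six tensors"
            )


t = FakeTensor()
try:
    # Newer quant.py passes groupsize (a plain int) -> signature mismatch.
    vecquant4matmul(t, t, t, t, t, 4096)
except TypeError as exc:
    print(exc)
```

If this diagnosis is right, rebuilding the quant_cuda extension against the currently checked-out GPTQ-for-LLaMa revision (or checking out the revision the binary was built from) should resolve the mismatch.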
Traceback (most recent call last):
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\gradio\routes.py", line 393, in run_predict
output = await app.get_blocks().process_api(
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 1108, in process_api
result = await self.call_function(
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 929, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\gradio\utils.py", line 490, in async_iteration
return next(iterator)
File "C:\Users\Robert\Desktop\oobabooga-windows\text-generation-webui\modules\chat.py", line 173, in cai_chatbot_wrapper
for _history in chatbot_wrapper(text, max_new_tokens, do_sample, temperature, top_p, typical_p, repetition_penalty, encoder_repetition_penalty, top_k, min_length, no_repeat_ngram_size, num_beams, penalty_alpha, length_penalty, early_stopping, seed, name1, name2, context, check, chat_prompt_size, chat_generation_attempts):
File "C:\Users\Robert\Desktop\oobabooga-windows\text-generation-webui\modules\chat.py", line 144, in chatbot_wrapper
cumulative_reply = reply
UnboundLocalError: local variable 'reply' referenced before assignment
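The UnboundLocalError is a downstream effect of the first failure, not a separate bug: because generation crashed before yielding anything, the loop in chatbot_wrapper never bound reply, and the later read of it raises. A simplified sketch (hypothetical function names, not the actual modules/chat.py code):

```python
def chatbot_wrapper_sketch(generate_replies):
    # Mirrors the failure mode at modules/chat.py line 144: if the generator
    # yields nothing (e.g. the CUDA kernel call failed), 'reply' is never
    # assigned before it is read.
    for reply in generate_replies():
        pass
    cumulative_reply = reply  # UnboundLocalError when no reply was produced
    return cumulative_reply


def chatbot_wrapper_guarded(generate_replies):
    # One possible guard: bind a default before the loop so a failed
    # generation returns an empty reply instead of raising a second,
    # misleading exception.
    reply = ""
    for reply in generate_replies():
        pass
    return reply
```

The guard only hides the secondary error; the TypeError above is the actual cause of the empty output.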
System Info
GTX 1070, i7-9700K, and 16 GB of RAM.