4bit model loads but doesn't output. UnboundLocalError: local variable 'reply' referenced before assignment #681

@Roerib

Description

Describe the bug

The model loads, but generation produces no output. The logs show a `TypeError` from `quant_cuda.vecquant4matmul`, after which the UI raises `UnboundLocalError: local variable 'reply' referenced before assignment`.

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

Loaded LLaMA 7B in 4-bit mode (`llama-7b-4bit`) and attempted to generate a chat reply.

Screenshot

(screenshot attached)

Logs

Starting the web UI...

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: Loading binary C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cudaall.dll...
Loading llama-7b-4bit...
Loading model ...
C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\safetensors\torch.py:99: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  with safe_open(filename, framework="pt", device=device) as f:
C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\storage.py:899: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  storage = cls(wrap_storage=untyped_storage)
Done.
Loaded the model in 5.45 seconds.
Loading the extension "gallery"... Ok.
C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\gradio\deprecation.py:40: UserWarning: The 'type' parameter has been deprecated. Use the Number component instead.
  warnings.warn(value)
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
  File "C:\Users\Robert\Desktop\oobabooga-windows\text-generation-webui\modules\callbacks.py", line 64, in gentask
    ret = self.mfunc(callback=_callback, **self.kwargs)
  File "C:\Users\Robert\Desktop\oobabooga-windows\text-generation-webui\modules\text_generation.py", line 222, in generate_with_callback
    shared.model.generate(**kwargs)
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\generation\utils.py", line 1485, in generate
    return self.sample(
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\generation\utils.py", line 2524, in sample
    outputs = self(
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 689, in forward
    outputs = self.model(
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 577, in forward
    layer_outputs = decoder_layer(
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 292, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 196, in forward
    query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Robert\Desktop\oobabooga-windows\text-generation-webui\repositories\GPTQ-for-LLaMa\quant.py", line 426, in forward
    quant_cuda.vecquant4matmul(x, self.qweight, y, self.scales, self.qzeros, self.groupsize)
TypeError: vecquant4matmul(): incompatible function arguments. The following argument types are supported:
    1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: torch.Tensor, arg3: torch.Tensor, arg4: torch.Tensor, arg5: torch.Tensor) -> None

Invoked with: tensor([[ 0.0436, -0.0149,  0.0150,  ...,  0.0267,  0.0112, -0.0011],
        [-0.0108,  0.0345, -0.0282,  ...,  0.0073, -0.0096,  0.0204],
        [-0.0362, -0.0222, -0.0107,  ..., -0.0035, -0.0108,  0.0189],
        ...,
        [ 0.0324,  0.0055,  0.0122,  ...,  0.0099, -0.0175,  0.0141],
        [ 0.0160, -0.0103, -0.0197,  ...,  0.0249, -0.0164,  0.0180],
        [-0.0431, -0.0260,  0.0012,  ...,  0.0075, -0.0076, -0.0037]],
       device='cuda:0'), tensor([[ 2004248678,  2020046951,  1735952023,  ..., -1738970729,
         -1771669913,  1988708744],
        [ 1752594295,  1985447527,  1719101559,  ...,  1737979784,
          1735882872,  1988584549],
        [ 2003277431, -2038925705,  2003200134,  ..., -1752671846,
         -2055710840, -1418419097],
        ...,
        [ 1987475319, -2021226904,  1719236470,  ...,  1985514391,
          1734904166, -1485412727],
        [ 1988585302,  2004387686,  2020181895,  ...,   947288215,
          1701270918,  2019850854],
        [ 1736935542,  2022213477, -2038995336,  ..., -1484101769,
          1718053495,  1151894375]], device='cuda:0', dtype=torch.int32), tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]], device='cuda:0'), tensor([[0.0318, 0.0154, 0.0123,  ..., 0.0191, 0.0206, 0.0137]],
       device='cuda:0'), tensor([[ 1987540582,  1734764150,  1735812471,  1987540855,  1986492263,
          2004248424,  1735878007,  2004313703,  1719035767,  1985439591,
          1735821174,  1735817046,  2003265382,  1717991287,  1736931174,
         -2023262601,  1752594023,  1735882598,  1986422375,  1450600038,
          2004318070,  1987540599,  1988519783,  1719039591,  2004248182,
          1986492038,  1719105398,  1734768246, -2038991258,  1752651622,
          1181120102,  2003269237,  1735816808,  1736857191,  2004182904,
          1736996453, -2041153432,  1718118021,  1719043958,  1717986935,
          2003269735,  1734768503,  1735882343,  1736931174,  2003203975,
          1753704296,  2003203702,  1718052728,  1986492006,  1719101046,
          1987536758,  1718974055,  1735882599,  1734899575,  1735882599,
          2003269479,  2003269239,  1718117990,  1467381350,  1718052486,
         -2025429625,  2020042359,  1734768487,  1734829942,  1449617527,
          2005297015, -1989913240,  2020104069,  1733715560,  1719105366,
         -1771673769,  1467373158,  1719035767,  1734903431,  1986427223,
         -2037942458,  1701209958,  1718904406, -2055833993,  1718965876,
         -2038012041,  1736922759,  2004247893, -2008508537,  2003134327,
          2003265381,  1719031639,  1969649238,  1735882597,  1719096677,
          1720993382,  2004313734,  1702262901,  2020038503,  2003134312,
          2003203958,  1987475079,  1972725351,  2019980646,  1986426742,
          1701209974,  1733785206,  1735878503,  1735882598,  1987471221,
          1953916516, -2039056537, -2055772569,  2004252022,  1986422151,
          1717995127,  1735882615,  2019980903, -2023266714,  1467250774,
         -2037881240,  1970685527,  2004383351,  1466332519,  1701336952,
          2003269479, -2038007962,  1985312646, -2040047786,  1987475558,
          1735944038,  1703302759,  1733855078,  1736865381,  2019915622,
         -2039061129, -2023398026,  1182296151,  1703368567, -2041157496,
          1751610757,  1718052454,  2003203943,  2021025382,  1987540838,
          2003199864,  1734763895,  1448441429, -2025363337,  2002212726,
          2002150998,  1718056583, -2056821131,  1719039319,  1986623350,
          1719101046,  2003203958, -2038995099,  2003265111,  2004248438,
          1451722359,  1988519557, -2040039561,  2004383604,  1717991286,
          1971808103,  1716942951,  1734764422,  1766282614,  1969718917,
          2019981141,  1751545685,  1987471223,  1701279591,  2020042376,
          1969645160,  1467377527,  1450666105,  1970763607,  1986487639,
          1467376998,  2003142518,  1468426103,  1987471175,  1450603879,
          2003203942,  1986361191,  1720153735,  1986422392,  1734764664,
          1734628966, -2024376202,  2004248439,  2004313958,  1718052726,
          1467381367,  1701279606,  1750558357,  2002151319,  1969710711,
          1971808103,  1735882104,  1718121845,  1483114102,  1735878230,
          1700034391,  1450600055,  2004317814,  1717987175,  1719105399,
          1986418791,  1969718886,  1702393702,  1986487927, -1737001371,
          1450665830,  1735874166,  1987471207,  1986492006,  2003199847,
          1987470950, -2056820873, -2054785162,  1719039863,  1719101014,
          1735816807,  1718052471,  1735878262,  1986426470, -2041223578,
         -2039126154, -2023319946,  1717987174, -2023274618,  1735817079,
          1717008230,  1734833765,  1449682535,  1734908021,  2004252293,
         -2006685577,  2003138198,  1467446873,  1717855863,  1987405670,
         -2021231018,  1718052488,  1734767991,  1986491750,  2004318086,
          1719039863,  1751672438,  1719048038,  1987413863,  1702458966,
          1736927078,  2003265127,  2004248167, -1753847977,  1701275238,
          1734829702,  2006410887,  1967683432,  1986484055,  1685550422,
          1735878487,  1450604151,  1701209958,  1733715847,  1970681718,
         -2038999449,  1751479959, -2023331995,  1702201160,  1988523862,
          2004178790,  1702389606,  1969710965,  1986492295,  1702192246,
          1768257686,  1734760312,  1736930918,  1719105398,  2004313719,
         -2005506714,  1735878249,  1986422390,  1718122342, -2023331977,
          1987536470,  1987475303,  2004313702,  1450600311,  2003269480,
          1986422631,  1735812726,  2003199862,  1987475047,  1735878774,
          1987475303,  1986430583,  1986422631,  1985377895,  1735878263,
         -2040035449,  2003199623,  1719040119,  1987471223,  2020112246,
          1718052198,  1718048391, -2038986890,  1701410439, -2022274698,
          1717139317,  1736927079,  1734768247,  2005366391,  1718974310,
         -2039056794,  2003199591,  1734833543,  1734829927,  1987475336,
          2003277686,  1985377910, -2041162122,  1717987174, -2022348968,
          1719105398,  1450665607,  1752589911,  1717991286,  1987535975,
          1685485414,  1970771846, -1771604106,  1717987191,  1735820919,
          1718056824,  1985443926, -2021300137,  1735874423,  1970825078,
          1734764133,  1736862054,  1719105126,  1183213175,  1499883111,
          1718060903,  1716938581,  1987471478,  1751606902,  1987540104,
          1736865894, -1988659612,  1734764392,  1988585351, -2019068313,
          1717995399,  1986426728,  1752655461,  1986553447,  1733716328,
          2003330950,  1735878535,  1686595447, -2024249719,  1199015511,
          1769367174,  1987475046,  2004318070,  1987467112,  1970694263,
          1986422391,  2004252279,  1718113893,  2004317559,  1449617255,
          1718122358, -2007541897,  1483167304, -2006485146, -2040175226,
          2003199607,  2021091190,  1988523606,  1986483830,  2004186742,
          1751607159,  1968597110,  1752598631,  2021029512,  1987536486,
          1987471191,  1719109750,  1986426743,  1987475302,  1719101031,
         -2024437913,  1969715031,  1736931192,  1466328678,  1735878520,
          1719035767,  1466390375,  2004182390,  1701209974,  2004318054,
          2003269511,  1734829415,  2003268726,  2004252773,  2004317799,
          1718056839,  1735878519, -2021235353,  1987536487,  1449555815,
          1969719142,  1769437031,  1733789544,  1985382246,  1718056566,
          2004378983,  1717921143, -2039060889,  1986483831,  2003334504,
         -2055838361,  1448510854,  1734760567, -2022217867,  1702324069,
          1734895734,  1449616998,  1986426487,  1987409798,  1450669703,
          1720158038,  2003200104,  2006414935,  1970689895,  1719035495,
          1466390135,  2004248439,  1734833509,  1988585320, -2023331961,
          1987536501,  1702323559,  2003265142,  1719039607,  1433888120,
          1467447143,  2004313702,  1717999719,  1986357111,  1984390775,
          1987540836,  2003265398,  1702324070,  1968662103,  2021155942,
          1734763879,  1987475047,  1987541111,  1685481318,  1720150135,
          1200126038,  2005436247,  1736992598,  1213618295,  1988524166,
          1197893767,  1987475303,  1717007974,  1734768247, -2023266968,
          2003334757,  1986483833,  1987544166, -2023266938,  1987466838,
          1986487910,  1717982838,  1734768519,  2003261048,  1987540855,
          2004182646,  2002220645,  2003138166,  1736865398,  1751541367,
          2005362294,  2003265399,  1986488166,  1987475062,  1986487911,
          1987540853,  2003264871,  1734764407,  2004313718,  2004318055,
          1181120630,  1734764137,  1718056550,  1734838119,  1987540598,
          1467381110,  1735882360,  2004248183,  1719101302, -2022283673,
          1988523895,  1719100807]], device='cuda:0', dtype=torch.int32), 4096
Output generated in 1.17 seconds (0.00 tokens/s, 0 tokens, context 36)
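The pybind11 signature in the error above is the key detail: this build of `quant_cuda.vecquant4matmul` accepts only six `torch.Tensor` arguments, but `quant.py` passes the integer `self.groupsize` (invoked with `4096` as the last argument), which suggests a version mismatch between the compiled `quant_cuda` extension and the GPTQ-for-LLaMa branch checked out in `repositories/`. A minimal stdlib-only sketch of the same class of failure (the `Tensor` class and the stub function are hypothetical stand-ins, not the real binding):

```python
class Tensor:
    """Hypothetical stand-in for torch.Tensor; the real pybind11 binding
    type-checks each argument against the tensor type."""

def vecquant4matmul_stub(*args):
    """Mimic the pybind11 overload check: exactly six tensor arguments."""
    if len(args) != 6 or not all(isinstance(a, Tensor) for a in args):
        raise TypeError("vecquant4matmul(): incompatible function arguments.")

t = Tensor()
vecquant4matmul_stub(t, t, t, t, t, t)         # six tensors: matches the overload
try:
    vecquant4matmul_stub(t, t, t, t, t, 4096)  # int groupsize, as in quant.py
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```

The usual fix for this class of mismatch is to rebuild the CUDA kernel from the same GPTQ-for-LLaMa commit as `quant.py`, or to use a model quantized for the branch actually installed.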
Traceback (most recent call last):
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\gradio\routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 1108, in process_api
    result = await self.call_function(
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 929, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "C:\Users\Robert\Desktop\oobabooga-windows\installer_files\env\lib\site-packages\gradio\utils.py", line 490, in async_iteration
    return next(iterator)
  File "C:\Users\Robert\Desktop\oobabooga-windows\text-generation-webui\modules\chat.py", line 173, in cai_chatbot_wrapper
    for _history in chatbot_wrapper(text, max_new_tokens, do_sample, temperature, top_p, typical_p, repetition_penalty, encoder_repetition_penalty, top_k, min_length, no_repeat_ngram_size, num_beams, penalty_alpha, length_penalty, early_stopping, seed, name1, name2, context, check, chat_prompt_size, chat_generation_attempts):
  File "C:\Users\Robert\Desktop\oobabooga-windows\text-generation-webui\modules\chat.py", line 144, in chatbot_wrapper
    cumulative_reply = reply
UnboundLocalError: local variable 'reply' referenced before assignment
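The `UnboundLocalError` is a secondary symptom: the kernel raises in a worker thread, so the generation stream yields nothing, the `for reply in …` loop body in `chatbot_wrapper` never runs, and the local `reply` is never bound when `cumulative_reply = reply` executes. A minimal sketch of the pattern (`broken_generator` and `chatbot_wrapper` here are hypothetical simplifications of the real `modules/chat.py` code):

```python
def broken_generator():
    """Stand-in for the generation stream: the CUDA kernel fails in a worker
    thread, so the generator finishes without yielding anything."""
    return
    yield  # unreachable; present only to make this a generator function

def chatbot_wrapper():
    for reply in broken_generator():
        pass  # body never runs, so the local 'reply' is never bound
    cumulative_reply = reply  # raises UnboundLocalError, as at chat.py line 144
    return cumulative_reply

try:
    chatbot_wrapper()
except UnboundLocalError as exc:
    print(type(exc).__name__)  # UnboundLocalError
```

Guarding the assignment (initializing `reply` before the loop, or re-raising the worker-thread exception) would surface the underlying `TypeError` instead of this misleading error.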

System Info

GTX 1070, i7-9700K, and 16 GB of RAM.

Metadata


    Labels: bug (Something isn't working)
