[python-package] UserWarning: Found `...` in params #6324

memeplex · 2024-02-16T18:30:49Z

Description

This and many other similar issues have been reported over the years, yet I cannot find a reasonable solution. lightgbm constantly throws uninteresting warnings about aliases. They are not even easy to suppress locally once you start running subprocesses (n_jobs = -1 or > 1) and have to deal with PYTHONWARNINGS in the environment. In such cases there is a long stream of warnings, coming for example from a grid search. The verbose argument does little to silence lightgbm since the warnings are raised on the Python side.

It was suggested to use a parameter's name instead of its alias, but take for example num_iterations, which is documented in https://lightgbm.readthedocs.io/en/latest/Parameters.html as a name, not an alias: it nevertheless throws a warning, and you have to try every other alias in order to discover that n_estimators is the right one.

I think some action has to be taken here.

Reproducible example

import numpy as np
import pandas as pd
import lightgbm as lgb

X = pd.DataFrame(dict(x1=np.linspace(0, 1, 100), x2=np.linspace(0, 1, 100)))
y = 2 * X.x1 + 3 * X.x2
lgb.LGBMRegressor(verbose=-1, num_iterations=10).fit(X, y)

Environment info

LightGBM version or commit hash: 4.3.0

Command(s) you used to install LightGBM:

pip install lightgbm

macOS Sonoma
Python 3.11.7

memeplex · 2024-02-16T18:51:06Z

Just for reference, currently I'm doing this first thing after launching a kernel:

    import os
    import warnings

    warnings.filterwarnings(
        "ignore", category=UserWarning, module="lightgbm.engine", lineno=172
    )
    # Propagate the same filter to subprocesses through the environment.
    os.environ["PYTHONWARNINGS"] = "ignore::UserWarning:lightgbm.engine:172"

I would prefer to match the message using a regex but PYTHONWARNINGS doesn't support that and I need it to keep subprocesses silent. So in order not to suppress potentially interesting warnings I match the line number, which is quite fragile.
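For the in-process half of that workaround, a message-based filter avoids the line number. The following is a minimal sketch, assuming the warning is an ordinary UserWarning whose text starts with the message quoted in this thread; `emit_alias_warning` is a hypothetical stand-in so the snippet runs without lightgbm, and the filter is not inherited by subprocesses.

```python
import warnings

# Assumed pattern, based on the warning text quoted in this thread.
ALIAS_WARNING_PATTERN = r"Found `.*` in params"


def emit_alias_warning(alias: str) -> None:
    """Hypothetical stand-in for the LightGBM warning, so this runs without lightgbm."""
    warnings.warn(f"Found `{alias}` in params. Will use it instead of argument", UserWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The regex is matched against the start of the warning message.
    warnings.filterwarnings("ignore", message=ALIAS_WARNING_PATTERN, category=UserWarning)
    emit_alias_warning("num_iterations")               # suppressed by the filter
    warnings.warn("some other warning", UserWarning)   # still recorded

remaining = [str(w.message) for w in caught]
print(remaining)
```

Only the alias warning is filtered here; unrelated UserWarnings still get through, which is the property the line-number match was trying to approximate.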

jmoralez · 2024-02-16T19:11:38Z

Hey @memeplex, thanks for using LightGBM. The arguments for the scikit-learn API are a bit different, to be consistent with other scikit-learn estimators; you can see here that the expected argument for the number of iterations is n_estimators.

I believe this issue is specific to that argument and that line, since other arguments don't issue alias warnings with verbosity<0. I believe we can check if verbosity<0 there to be consistent with the behavior for the other arguments.

jameslamb · 2024-02-16T19:12:21Z

Sure, we can reconsider this.

Linking the relevant prior discussions:

Here is where the specific warnings you're talking about come from:

LightGBM/python-package/lightgbm/engine.py

Lines 169 to 172 in 6330d62

    for alias in _ConfigAliases.get("num_iterations"):
        if alias in params:
            num_boost_round = params.pop(alias)
            _log_warning(f"Found `{alias}` in params. Will use it instead of argument")

LightGBM/python-package/lightgbm/engine.py

Lines 683 to 686 in 6330d62

    for alias in _ConfigAliases.get("num_iterations"):
        if alias in params:
            _log_warning(f"Found '{alias}' in params. Will use it instead of 'num_boost_round' argument")
            num_boost_round = params.pop(alias)

I think some action has to be taken here.

I'll be more specific.

In my opinion, this warning causes more confusion than it prevents, and should just be completely removed.

I think it'd be a better state for lightgbm (the Python package) to have the following characteristics:

- no warning raised when parameters from params override keyword arguments
- well-documented rules for the order of precedence in resolving parameters
  - e.g., constructor keyword arguments in scikit-learn interface, train() / cv() keyword arguments, parameters stored in an existing Booster if loading with init_model or similar, parameters passed through via params dictionary (with main params preferred to aliases)
- tests confirming that that order is respected (many of these already exist)
- documentation on how to determine which parameters were actually used during training (the recent work to store all params in the model file should help with this)

LightGBM's interface (especially in the Python and R packages) allows a lot of different ways to route configuration to training, and I think it'd be valuable to devote some time to elevating the rigor applied to documenting and testing those mechanisms.

@jmoralez @borchero @shiyu1994 @guolinke what do you think?

jameslamb · 2024-02-16T19:13:09Z

Sorry @jmoralez , I didn't see your response before posting mine. I still think we should consider removing this warning.

jmoralez · 2024-02-16T19:19:11Z

I agree on not issuing a warning if there's a single alias, but I think we should check whether there are several definitions for the iterations and issue warnings if there are, as we do here:

LightGBM/src/io/config.cpp

Line 63 in 6330d62

Log::Warning("%s is set=%s, %s=%s will be ignored. Current value: %s=%s",

jameslamb · 2024-02-16T19:39:28Z

I agree with that. Doing something like this:

lgb.train(
    params={
        "num_iterations": 10,
        "n_iter": 500,
    },
    # ...
)

I'd want to be warned that I've provided multiple aliases via the same mechanism that mean the same thing and have different values.

But that should be accomplished on the C++ side, via the code in that link you shared.

jmoralez · 2024-02-16T19:44:07Z

But for train and cv we pick a single value for that

LightGBM/python-package/lightgbm/engine.py

Lines 169 to 173 in 6330d62

    for alias in _ConfigAliases.get("num_iterations"):
        if alias in params:
            num_boost_round = params.pop(alias)
            _log_warning(f"Found `{alias}` in params. Will use it instead of argument")
    params["num_iterations"] = num_boost_round

LightGBM/python-package/lightgbm/engine.py

Lines 683 to 687 in 6330d62

    for alias in _ConfigAliases.get("num_iterations"):
        if alias in params:
            _log_warning(f"Found '{alias}' in params. Will use it instead of 'num_boost_round' argument")
            num_boost_round = params.pop(alias)
    params["num_iterations"] = num_boost_round

so that would be the only place to warn (I believe that's why that warning is there)
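To make the behavior of the loop quoted above concrete, here is a standalone simulation. It is only a sketch: `NUM_ITERATIONS_ALIASES` is an illustrative subset standing in for `_ConfigAliases.get("num_iterations")`, `resolve_num_boost_round` is a hypothetical helper, and `warnings.warn` stands in for `_log_warning`. It shows that a single alias in params is enough to trigger the warning.

```python
import warnings

# Illustrative subset of aliases; the real set comes from LightGBM's _ConfigAliases.
NUM_ITERATIONS_ALIASES = ("num_iterations", "n_iter", "n_estimators")


def resolve_num_boost_round(params: dict, num_boost_round: int = 100) -> dict:
    """Hypothetical stand-in for the quoted loop: any alias overrides the argument and warns."""
    params = dict(params)  # work on a copy so the caller's dict is untouched
    for alias in NUM_ITERATIONS_ALIASES:
        if alias in params:
            num_boost_round = params.pop(alias)
            warnings.warn(f"Found `{alias}` in params. Will use it instead of argument")
    params["num_iterations"] = num_boost_round
    return params


with warnings.catch_warnings(record=True) as single_alias_warnings:
    warnings.simplefilter("always")
    resolved = resolve_num_boost_round({"n_iter": 5}, num_boost_round=10)

print(resolved, len(single_alias_warnings))
```

With several aliases present, this version keeps whichever alias the loop visits last; the tuple used here makes that order fixed, which may not hold for the real alias collection.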

jameslamb · 2024-02-16T20:11:38Z

ohhhh I missed that this already was specific to only num_iterations!!!

When I saw this in the description:

take for example num_iterations

the phrase "for example" made me think it was happening for all or at least many more parameters. @memeplex do you see this warning for any other parameters? If so can you share a reproducible example of that?

would be the only place to warn

Even in that case, I think it'd be preferable for LightGBM to just resolve num_boost_round + things passed through params on the Python side, and to only issue a warning in the following case:

- multiple aliases for num_iterations passed through params dictionary
- they have different values

# Collect every alias of num_iterations that was passed through params.
num_iteration_configs_provided = {
    alias: params[alias]
    for alias in _ConfigAliases.get("num_iterations")
    if alias in params
}
multiple_values_provided = len(num_iteration_configs_provided) > 1
# The values conflict only if at least two of them differ.
values_conflict = len(set(num_iteration_configs_provided.values())) > 1

if multiple_values_provided and values_conflict:
    value_string = ", ".join(f"{alias}={val}" for alias, val in num_iteration_configs_provided.items())
    _log_warning(
        f"Found conflicting values for num_iterations provided via 'params': {value_string}. "
        "To be confident in the maximum number of boosting rounds LightGBM will perform and to "
        "suppress this warning, modify 'params' so that only one of those is present."
    )

params = _choose_param_value(
    main_param_name='num_iterations',
    params=params,
    default_value=num_boost_round
)
num_boost_round = params["num_iterations"]

And for it to never warn about the num_boost_round keyword argument having a different value than one passed through params, since "pass in params to override other configuration" is the approach we promote in as many other places as possible in the library.

So:

# no warning
lgb.train(
   params={"n_iter": 5},
   num_boost_round=10,
   ...
)

# no warning
lgb.train(
   params={"n_iter": 10},
   num_boost_round=10,
   ...
)

# no warning
lgb.train(
   params={"n_iter": 10, "num_iterations": 10},
   num_boost_round=10,
   ...
)

# no warning
lgb.train(
   params={"n_iter": 5, "num_iterations": 5},
   num_boost_round=10,
   ...
)

# warning
lgb.train(
   params={"n_iter": 5, "num_iterations": 75},
   num_boost_round=10,
   ...
)
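The five cases above can be checked against a standalone sketch of the proposed rule. This is an assumption-laden illustration, not LightGBM code: `NUM_ITERATIONS_ALIASES` is an illustrative subset standing in for `_ConfigAliases.get("num_iterations")`, and `has_conflicting_num_iterations` is a hypothetical helper. The num_boost_round keyword argument is deliberately left out, since under the proposal it never contributes to the warning.

```python
# Illustrative subset; hypothetical stand-in for _ConfigAliases.get("num_iterations").
NUM_ITERATIONS_ALIASES = ("num_iterations", "n_iter", "n_estimators")


def has_conflicting_num_iterations(params: dict) -> bool:
    """Return True only if params holds several aliases with differing values."""
    provided = {alias: params[alias] for alias in NUM_ITERATIONS_ALIASES if alias in params}
    return len(set(provided.values())) > 1


# (params, warning expected?) in the same order as the five cases listed above.
cases = [
    ({"n_iter": 5}, False),
    ({"n_iter": 10}, False),
    ({"n_iter": 10, "num_iterations": 10}, False),
    ({"n_iter": 5, "num_iterations": 5}, False),
    ({"n_iter": 5, "num_iterations": 75}, True),
]
results = [has_conflicting_num_iterations(params) for params, _ in cases]
print(results)
```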

memeplex · 2024-02-16T20:20:15Z

the phrase "for example" made me think it was happening for all or at least many more parameters.

You are right, in the past I've seen this happening with almost every parameter, but checking it again with ~20 parameters and aliases, I can only reproduce it with num_iterations variations. Much better, but still noisy since the number of trees is like the first parameter one sets.

jameslamb · 2024-02-16T20:24:00Z

I can only reproduce it with num_iterations variations

Excellent!

Narrowing it down like that helps a lot.

jameslamb changed the title ~~UserWarning: Found ... in params~~ [python-package] UserWarning: Found ... in params Feb 16, 2024

jameslamb added the question label Feb 16, 2024

jameslamb mentioned this issue Mar 26, 2024

[python-package] UserWarning with num_iterations #6385

Closed

jameslamb mentioned this issue Jul 16, 2024

[python-package] No warning if single alias of num_boost_round is passed #6548

Closed

jameslamb added a commit that referenced this issue Jul 30, 2024

[python-package] limit when num_boost_round warnings are emitted (fixes #6324)

5fc42f6

jameslamb mentioned this issue Jul 30, 2024

[python-package] limit when num_boost_round warnings are emitted (fixes #6324) #6579

Merged

jameslamb closed this as completed in #6579 Sep 3, 2024

jameslamb closed this as completed in 3ccdea1 Sep 3, 2024
