Allow wildcards inside of configuration section names #5120

Merged
merged 7 commits into from
May 30, 2018
22 changes: 19 additions & 3 deletions docs/source/config_file.rst
@@ -31,15 +31,31 @@ characters.

 - Additional sections named ``[mypy-PATTERN1,PATTERN2,...]`` may be
   present, where ``PATTERN1``, ``PATTERN2``, etc., are comma-separated
-  patterns of the form ``dotted_module_name`` or ``dotted_module_name.*``.
+  patterns of fully-qualified module names, with some components optionally
+  replaced by `*`s (e.g. ``foo.bar``, ``foo.bar.*``, ``foo.*.baz``).
   These sections specify additional flags that only apply to *modules*
   whose name matches at least one of the patterns.

-  A pattern of the form ``dotted_module_name`` matches only the named module,
-  while ``dotted_module_name.*`` matches ``dotted_module_name`` and any
+  A pattern of the form ``qualified_module_name`` matches only the named module,
+  while ``qualified_module_name.*`` matches ``dotted_module_name`` and any
   submodules (so ``foo.bar.*`` would match all of ``foo.bar``,
   ``foo.bar.baz``, and ``foo.bar.baz.quux``).

+  Patterns may also be "unstructured" wildcards, in which ``*``s may
+  appear in the middle of a name (e.g
+  ``site.*.migrations.*``). Internal ``*``s match one or more module
+  component.
Member
componentS. Also I'd write "stars" instead of "*s" (both places).


+  When options conflict, the precedence order for the configuration sections is:
+    1. Sections with concrete module names (``foo.bar``)
+    2. Sections with "unstructured" wildcard patterns (``foo.*.baz``),
+       with sections later in the configuration file overriding
+       sections earlier.
+    3. Sections with "well-structured" wildcard patterns
+       (``foo.bar.*``), with more specific overriding more general.
+    4. Command line options.
+    5. Top-level configuration file options.

 .. note::

    The ``warn_unused_configs`` flag may be useful to debug misspelled
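
The three kinds of section named in the precedence list differ only in where the star sits. A minimal sketch of that distinction, assuming the same tests on the section name that this PR applies in `build_per_module_cache` (`'*' in k[:-1]` and `k.endswith('.*')`); `classify_section` is a hypothetical helper and not part of mypy:

```python
def classify_section(name: str) -> str:
    """Hypothetical helper: return which documented kind of section `name` is."""
    # A star anywhere except the final character makes the pattern "unstructured".
    if '*' in name[:-1]:
        return 'unstructured'
    # Otherwise a trailing '.*' is the "well-structured" package wildcard.
    if name.endswith('.*'):
        return 'well-structured'
    # No star at all: a concrete module name.
    return 'concrete'


for section in ['foo.bar', 'foo.bar.*', 'foo.*.baz', 'site.*.migrations.*']:
    print(section, '->', classify_section(section))
```

Note that `site.*.migrations.*` counts as unstructured even though it also ends in `.*`, because the first test already fires on the internal star.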
5 changes: 3 additions & 2 deletions mypy/main.py
@@ -746,8 +746,9 @@ def parse_config_file(options: Options, filename: Optional[str]) -> None:
                     glob = glob.replace(os.altsep, '.')

                 if (any(c in glob for c in '?[]!') or
-                        ('*' in glob and (not glob.endswith('.*') or '*' in glob[:-2]))):
-                    print("%s: Invalid pattern. Patterns must be 'module_name' or 'module_name.*'"
+                        any('*' in x and x != '*' for x in glob.split('.'))):
+                    print("%s: Patterns must be fully-qualified module names, optionally "
+                          "with '*' in some components (e.g spam.*.eggs.*)'"
                           % prefix,
                           file=sys.stderr)
                 else:
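
To see which section names the new condition accepts, the check can be pulled out into a standalone function. This is only a sketch: `is_valid_pattern` is a hypothetical name, and its body simply inverts the condition shown in the hunk above.

```python
def is_valid_pattern(glob: str) -> bool:
    """Hypothetical helper: True if `glob` passes the new check."""
    # Shell-style metacharacters other than '*' are rejected outright.
    has_bad_chars = any(c in glob for c in '?[]!')
    # A star must be a whole dotted component, never part of one (e.g. 'ba*').
    has_partial_star = any('*' in x and x != '*' for x in glob.split('.'))
    return not (has_bad_chars or has_partial_star)


for candidate in ['foo.bar', 'spam.*.eggs.*', '*x*', 'foo.ba*', 'foo.b?r']:
    print(candidate, '->', 'ok' if is_valid_pattern(candidate) else 'rejected')
```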
74 changes: 59 additions & 15 deletions mypy/options.py
@@ -1,8 +1,10 @@
 from collections import OrderedDict
+import fnmatch
+import re
 import pprint
 import sys

-from typing import Dict, List, Mapping, MutableMapping, Optional, Set, Tuple
+from typing import Dict, List, Mapping, MutableMapping, Optional, Pattern, Set, Tuple

 from mypy import defaults

@@ -167,8 +169,8 @@ def __init__(self) -> None:
         self.plugins = []  # type: List[str]

         # Per-module options (raw)
-        pm_opts = OrderedDict()  # type: OrderedDict[str, Dict[str, object]]
-        self.per_module_options = pm_opts
+        self.per_module_options = OrderedDict()  # type: OrderedDict[str, Dict[str, object]]
+        self.glob_options = []  # type: List[Tuple[str, Pattern[str]]]
         self.unused_configs = set()  # type: Set[str]

         # -- development options --
@@ -208,27 +210,54 @@ def __ne__(self, other: object) -> bool:
     def __repr__(self) -> str:
         return 'Options({})'.format(pprint.pformat(self.snapshot()))

+    def apply_changes(self, changes: Dict[str, object]) -> 'Options':
+        new_options = Options()
+        new_options.__dict__.update(self.__dict__)
+        new_options.__dict__.update(changes)
+        return new_options
+
     def build_per_module_cache(self) -> None:
         self.per_module_cache = {}
-        # Since configs inherit from glob configs above them in the hierarchy,
+
+        # Config precedence is as follows:
+        # 1. Concrete section names: foo.bar.baz
+        # 2. "Unstructured" glob patterns: foo.*.baz, in the order they appear in the file
+        # 3. "Well-structured" wildcard patterns: foo.bar.*, in specificity order.
Member
This is a bit head-exploding, since conflict resolution for (2) is based on file order while for (3) it is based on pattern length.

Plus I don't have a good intuition whether "file order" means "first in file wins" or "last in file wins".


+        # Since structured configs inherit from glob configs above them in the hierarchy,
Member
You don't explain "glob config" -- presumably this is the same as structured config? I.e. this implements the "in specificity order" of point 3 above.

         # we need to process per-module configs in a careful order.
-        # We have to process foo.* before foo.bar.* before foo.bar.
-        # To do this, process all glob configs before non-glob configs and
+        # We have to process foo.* before foo.bar.* before foo.bar,
+        # and we need to apply *.bar to foo.bar but not to foo.bar.*.
+        # To do this, process all well-structured glob configs before non-glob configs and
         # exploit the fact that foo.* sorts earlier ASCIIbetically (unicodebetically?)
         # than foo.bar.*.
Member
Maybe also add (redundantly?) that "processed last" translates into "wins"?

-        keys = (sorted(k for k in self.per_module_options.keys() if k.endswith('.*')) +
-                [k for k in self.per_module_options.keys() if not k.endswith('.*')])
-        for key in keys:
+        # Unstructured glob configs are stored and are all checked for each module.
+        unstructured_glob_keys = [k for k in self.per_module_options.keys()
+                                  if '*' in k[:-1]]
+        structured_keys = [k for k in self.per_module_options.keys()
+                           if '*' not in k[:-1]]
+        wildcards = sorted(k for k in structured_keys if k.endswith('.*'))
+        concrete = [k for k in structured_keys if not k.endswith('.*')]
+
+        for glob in unstructured_glob_keys:
+            self.glob_options.append((glob, re.compile(fnmatch.translate(glob))))
Member
IIUC this means that module foo.bar does not match the pattern foo.*.bar. And that's surprising since foo.bar does match foo.bar.*.


+        # We (for ease of implementation), treat unstructured glob
Member
No comma.

+        # sections as used if any real modules use them or if any
+        # concrete config sections use them. This means we need to
Member
Oh, this is a little shady, and explains why a.*.b is not an error in the test, right?

Collaborator Author
Yeah I'll add a comment to the test.

+        # track which get used while constructing.
+        self.unused_configs = set(unstructured_glob_keys)
+
+        for key in wildcards + concrete:
             # Find what the options for this key would be, just based
             # on inheriting from parent configs.
             options = self.clone_for_module(key)
             # And then update it with its per-module options.
-            new_options = Options()
-            new_options.__dict__.update(options.__dict__)
-            new_options.__dict__.update(self.per_module_options[key])
+            new_options = options.apply_changes(self.per_module_options[key])
             self.per_module_cache[key] = new_options
Member
Perhaps combine with previous line? (Unless it doesn't fit.)


-        self.unused_configs = set(keys)
+        # Add the more structured sections into unused configs .
Member
Space before period

+        self.unused_configs.update(structured_keys)

     def clone_for_module(self, module: str) -> 'Options':
         """Create an Options object that incorporates per-module options.
@@ -250,18 +279,33 @@ def clone_for_module(self, module: str) -> 'Options':
         # in that order, looking for an entry.
         # This is technically quadratic in the length of the path, but module paths
         # don't actually get all that long.
+        options = self
         path = module.split('.')
         for i in range(len(path), 0, -1):
             key = '.'.join(path[:i] + ['*'])
             if key in self.per_module_cache:
                 self.unused_configs.discard(key)
-                return self.per_module_cache[key]
+                options = self.per_module_cache[key]
+                break
+
+        # OK and *now* we need to look for glob matches
+        if not module.endswith('.*'):
Member
When would module ever end in a star?

+            for key, pattern in self.glob_options:
+                if self.module_matches_pattern(module, pattern):
+                    self.unused_configs.discard(key)
+                    options = options.apply_changes(self.per_module_options[key])

         # We could update the cache to directly point to modules once
         # they have been looked up, but in testing this made things
         # slower and not faster, so we don't bother.

-        return self
+        return options
+
+    def module_matches_pattern(self, module: str, pattern: Pattern[str]) -> bool:
+        # If the pattern is 'mod.*', we want 'mod' to match that too.
+        # (That's so that a pattern specifying a package also matches
+        # that package's __init__.)
+        return pattern.match(module) is not None or pattern.match(module + '.') is not None

     def select_options_affecting_cache(self) -> Mapping[str, object]:
         return {opt: getattr(self, opt) for opt in self.OPTIONS_AFFECTING_CACHE}
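
The matching behaviour raised in the review (whether `foo.bar` matches `foo.*.bar`) can be checked directly. A minimal sketch, assuming the same `fnmatch.translate` plus `re.compile` pipeline as the diff; the free function below stands in for the `module_matches_pattern` method and `compile_glob` is a hypothetical convenience wrapper:

```python
import fnmatch
import re
from typing import Pattern


def compile_glob(glob: str) -> Pattern[str]:
    """Hypothetical wrapper: compile a section glob the way the diff does."""
    return re.compile(fnmatch.translate(glob))


def module_matches_pattern(module: str, pattern: Pattern[str]) -> bool:
    # Trying module + '.' lets a pattern ending in '.*' match the package itself.
    return pattern.match(module) is not None or pattern.match(module + '.') is not None


internal = compile_glob('foo.*.bar')
trailing = compile_glob('site.*.migrations.*')

print(module_matches_pattern('foo.x.bar', internal))               # one component
print(module_matches_pattern('foo.x.y.bar', internal))             # several components
print(module_matches_pattern('foo.bar', internal))                 # zero components
print(module_matches_pattern('site.app.migrations', trailing))     # the package itself
print(module_matches_pattern('site.app.migrations.m1', trailing))  # a submodule
```

Because the `*` of `fnmatch` is not aware of dots, an internal star can span several components, while `foo.bar` itself does not match `foo.*.bar`.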
35 changes: 31 additions & 4 deletions test-data/unit/cmdline.test
@@ -203,8 +203,8 @@ def g(a: int) -> int: return f(a)
 def f(a): pass
 def g(a: int) -> int: return f(a)
 [out]
-mypy.ini: [mypy-*x*]: Invalid pattern. Patterns must be 'module_name' or 'module_name.*'
-mypy.ini: [mypy-*y*]: Invalid pattern. Patterns must be 'module_name' or 'module_name.*'
+mypy.ini: [mypy-*x*]: Patterns must be fully-qualified module names, optionally with '*' in some components (e.g spam.*.eggs.*)
+mypy.ini: [mypy-*y*]: Patterns must be fully-qualified module names, optionally with '*' in some components (e.g spam.*.eggs.*)
 == Return code: 0

 [case testMultipleGlobConfigSection]
@@ -268,7 +268,7 @@ mypy.ini: [mypy]: ignore_missing_imports: Not a boolean: nah
 python_version = 3.4
 [out]
 mypy.ini: [mypy-*]: Per-module sections should only specify per-module flags (python_version)
-mypy.ini: [mypy-*]: Invalid pattern. Patterns must be 'module_name' or 'module_name.*'
+mypy.ini: [mypy-*]: Patterns must be fully-qualified module names, optionally with '*' in some components (e.g spam.*.eggs.*)
 == Return code: 0

 [case testConfigMypyPath]
@@ -1179,10 +1179,37 @@ warn_unused_configs = True
 [[mypy-spam.eggs]
 [[mypy-emarg.*]
 [[mypy-emarg.hatch]
+[[mypy-a.*.b]
Member
Why isn't a.*.b considered unused?

+[[mypy-a.*.c]
+[[mypy-a.x.b]
 [file foo.py]
 [file quux.py]
 [file spam/__init__.py]
 [file spam/eggs.py]
 [out]
-Warning: unused section(s) in mypy.ini: [mypy-bar], [mypy-baz.*], [mypy-emarg.*], [mypy-emarg.hatch]
+Warning: unused section(s) in mypy.ini: [mypy-bar], [mypy-baz.*], [mypy-emarg.*], [mypy-emarg.hatch], [mypy-a.*.c], [mypy-a.x.b]
 == Return code: 0
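
Why `[mypy-a.*.b]` is absent from the warning can be reproduced in isolation. A sketch under the assumption that no checked module lives under package `a`, so only the concrete-section pass of `build_per_module_cache` touches the unused set; it uses just the three `a.` sections visible in this hunk:

```python
import fnmatch
import re

glob_keys = ['a.*.b', 'a.*.c']  # unstructured sections from the test
concrete_keys = ['a.x.b']       # concrete section from the test

# Every unstructured section starts out as unused.
unused = set(glob_keys)

# Building the cache entry for a concrete section looks up the globs that
# apply to it, which marks those globs as used.
for key in concrete_keys:
    for glob in glob_keys:
        pattern = re.compile(fnmatch.translate(glob))
        if pattern.match(key) or pattern.match(key + '.'):
            unused.discard(glob)

# Structured sections are added afterwards; they stay unused until a real
# module is looked up, which never happens here (assumption stated above).
unused.update(concrete_keys)

print(sorted(unused))
```

The section `a.x.b` matches the glob `a.*.b` and so keeps it out of the warning, even though `a.x.b` itself is reported as unused.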

+[case testConfigNonsense]
+# cmd: mypy emarg
+[file mypy.ini]
+[[mypy]
+ignore_errors = true
+[[mypy-emarg.*]
+ignore_errors = false
+[[mypy-emarg.*.vilip.*]
+ignore_errors = true
+[[mypy-emarg.hatch.vilip.mankangulisk]
+ignore_errors = false
+[file emarg/__init__.py]
+[file emarg/foo.py]
+fail
+[file emarg/hatch/__init__.py]
+[file emarg/hatch/vilip/__init__.py]
+[file emarg/hatch/vilip/nus.py]
+fail
+[file emarg/hatch/vilip/mankangulisk.py]
+fail
+[out]
+emarg/foo.py:1: error: Name 'fail' is not defined
+emarg/hatch/vilip/mankangulisk.py:1: error: Name 'fail' is not defined
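
The expected output of testConfigNonsense follows from the precedence rules. The sketch below is a simplified re-implementation, not mypy's code: `resolve` is a hypothetical helper, the empty-string key stands in for the top-level `[mypy]` section, and caching and unused-section tracking are left out.

```python
import fnmatch
import re
from typing import Dict, List, Tuple

# Sections of testConfigNonsense in file order. Using '' for the top-level
# [mypy] section is an assumption of this sketch.
SECTIONS = [
    ('', {'ignore_errors': True}),
    ('emarg.*', {'ignore_errors': False}),
    ('emarg.*.vilip.*', {'ignore_errors': True}),
    ('emarg.hatch.vilip.mankangulisk', {'ignore_errors': False}),
]  # type: List[Tuple[str, Dict[str, object]]]


def resolve(module: str) -> Dict[str, object]:
    """Hypothetical helper: effective flags for `module`.

    Layers are applied from lowest to highest precedence, so a later
    update() wins.
    """
    per_module = dict(SECTIONS[1:])
    options = dict(SECTIONS[0][1])  # start from the top-level options

    # Well-structured wildcards, general before specific (foo.* then foo.bar.*).
    parts = module.split('.')
    for i in range(1, len(parts) + 1):
        key = '.'.join(parts[:i] + ['*'])
        if key in per_module:
            options.update(per_module[key])

    # Unstructured globs in file order, so later sections override earlier ones.
    for key, changes in SECTIONS[1:]:
        if '*' in key[:-1]:
            pattern = re.compile(fnmatch.translate(key))
            if pattern.match(module) or pattern.match(module + '.'):
                options.update(changes)

    # A concrete section for exactly this module has the final say.
    if module in per_module:
        options.update(per_module[module])
    return options


for mod in ['emarg.foo', 'emarg.hatch.vilip.nus', 'emarg.hatch.vilip.mankangulisk']:
    print(mod, resolve(mod))
```

Under these rules `emarg.foo` and `emarg.hatch.vilip.mankangulisk` end up with `ignore_errors` false and `emarg.hatch.vilip.nus` with true, which agrees with the two errors in the expected output.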