-
-
Notifications
You must be signed in to change notification settings - Fork 19.3k
Closed
Labels
BuildLibrary building on various platformsLibrary building on various platformsDependenciesRequired and optional dependenciesRequired and optional dependenciesRegressionFunctionality that used to work in a prior pandas versionFunctionality that used to work in a prior pandas version
Milestone
Description
Code Sample, a copy-pastable example if possible
Easiest way to reproduce this is in a fresh docker container, as below
docker run --rm -it alpine:latest
$ apk update
$ apk add build-base python3 python3-dev py3-pip
$ pip3 install pandas
Collecting pandas
Downloading https://files.pythonhosted.org/packages/81/fd/b1f17f7dc914047cd1df9d6813b944ee446973baafe8106e4458bfb68884/pandas-0.24.1.tar.gz (11.8MB)
100% |████████████████████████████████| 11.8MB 4.2MB/s
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 346, in get_provider
module = sys.modules[moduleOrReq]
KeyError: 'numpy'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-y3ff_xpr/pandas/setup.py", line 732, in <module>
ext_modules=maybe_cythonize(extensions, compiler_directives=directives),
File "/tmp/pip-install-y3ff_xpr/pandas/setup.py", line 475, in maybe_cythonize
numpy_incl = pkg_resources.resource_filename('numpy', 'core/include')
File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1136, in resource_filename
return get_provider(package_or_requirement).get_resource_filename(
File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 348, in get_provider
__import__(moduleOrReq)
ModuleNotFoundError: No module named 'numpy'Problem description
Note that pip installing numpy and then pandas does work
pip3 install numpy
pip3 install pandasBut the standalone command is broken.
pip3 install pandasUnfortunately, this means that a requirements.txt file is insufficient to for setting up a new environment with pandas installed (like in a docker container).
booleangate, DusanMadar, luipillm, tblazina, bleuknight and 2 more
Metadata
Metadata
Assignees
Labels
BuildLibrary building on various platformsLibrary building on various platformsDependenciesRequired and optional dependenciesRequired and optional dependenciesRegressionFunctionality that used to work in a prior pandas versionFunctionality that used to work in a prior pandas version
Type
Projects
Relationships
Development
Select code repository
Activity
mgoldey commentedon Feb 6, 2019
I believe this to be a function of recent changes because
pandas==0.23.1does install numpy correctly.gfyoung commentedon Feb 7, 2019
cc @TomAugspurger
This is really weird, as doing
pip install pandas(withoutnumpyinstalled) seems to be just fine...jorisvandenbossche commentedon Feb 7, 2019
@gfyoung I suppose you tried with binary wheels?
I can reproduce this with, installing the following in a fresh environment with just python and pip:
For pandas 0.23.4 it indeed works.
jorisvandenbossche commentedon Feb 7, 2019
From a quick look, this might be caused by the changes in #21879 (which I don't fully understand though), as it is caused by
maybe_cythonizebeing ran insetup.pybefore setuptools is able to check and install dependencies.cc @jbrockmendel
[-]pip installation has broken dependency handling for numpy[/-][+]Pip install of version 0.24 is broken for platforms without wheels[/+]snordhausen commentedon Feb 7, 2019
@jorisvandenbossche You are right, this is the change that broke things. I checked with this Dockerfile:
jbrockmendel commentedon Feb 7, 2019
maybe_cythonizewraps cython'scythonize, so we may need to get a fix upstream. Shorter-term, I suppose there is something that happens withinsetuptools.setupthat installs numpy (?) that we could shoe-horn into the top of maybe_cythonizejorisvandenbossche commentedon Feb 7, 2019
What I understand from the diff in #21879 is that before, we just passed
extensionsto thesetup(ext_modules=...), and now we already callcythonize(or the maybe wrapper) on them inside setup.py.So my question then: what is exactly the reason that we started to use
cythonize? Or how do other projects deal with it?jbrockmendel commentedon Feb 7, 2019
cythonize is recommended way to compile cython modules. A lot of the dependency tracking we used to do manually in setup.py cythonize handles automatically. cythonize also takes care of figuring out which files need to be re-compiled vs are re-usable (I know statsmodels used to do this manually, not sure about pandas)
11 remaining items
jbrockmendel commentedon Feb 11, 2019
@jorisvandenbossche sure. Is the desired behavior just to check if numpy is installed and if not raise?
jorisvandenbossche commentedon Feb 11, 2019
Not sure if it should raise. I think you should be able to call
python setup.py ..without having numpy installed. So eg doing apython setup.py egg_infoshould not call cythonize / require numpy or cython to be installed (as it was before #21879)jbrockmendel commentedon Feb 12, 2019
OK. The fix that comes to mind is a check inside
maybe_cythonizeto skip cythonizing based on the command line arguments. At the moment we only skip forclean, but that could be extended to egg_info or whatever else (or the check could be reversed so cythonize is only run for build_ext or whatever)jorisvandenbossche commentedon Feb 13, 2019
I don't know all the possible commands of setuptools enough to know if specifically checking for that would be enough. Best might be to simply test it out.
In any case, it seems to me that there is some duplication now in the setup.py, as we have both functionality to cythonize in the
build_extclass and in themaybe_cythonizeyou added. But not familiar enough with it to know how it interacts with the coverage that you were trying to solve.jbrockmendel commentedon Feb 13, 2019
IIRC the coverage implementation isn't quite orthogonal to the usage of
cythonizebut close. I'm fairly confident that if we had to remove the usage of cythonize, we could do it without removing the coverage implementation.Some of that code is shared.
maybe_cythonizecallsbuild_ext.render_templates. I guess the code addingnumpy_inclto eachext.include_dirscould be shared. Do you see anything else that could be de-duplicated?This is partially an upstream problem with
cythonize. I implementedmaybe_cythonizeto wrapcythonizebecausepython setup.py cleanwould incorrectly callcythonizeif we didn't do a check. Unless you want to revert the use ofcythonize(I really hope that's not on the table), I'm pretty sure the place to fix this issue are on line 476 inmaybe_cythonizewhere it currently checks for "clean". We should also push the fix upstream (cython/cython#1495).I also don't know the setuptools API. It shouldn't be that hard to find a complete-ish list of the possible commands and sort them by whether cythonize should be called or not, should it?
jorisvandenbossche commentedon Feb 15, 2019
Well, I don't care that much how it is done, I mainly care about it being fixed for 0.24.2. I only know that before #21879 it was working, and that I don't have the time to dive into setuptools inner details right now, so if we don't find another solution, reverting it is an option for me.
In any case, the
maybe_cythonizeis not essential to get a build_ext working. I just removed it, and doing a dev build on master still works fine (but I assume the need for the maybe_cythonized is somehow related to the coverage?).But the suggestion about checking which setup command is ran, might make sense (at least, eg scipy does things like that: https://github.com/scipy/scipy/blob/master/setup.py)
Or you could also say that their recommendation to use
cythonizeis not a very good one, as mentioned in the last comment on the issue you linked (in their docs, they actually also mention some ways to avoid calling cythonize in certain cases). We already have all the custom build classes (CleanCommand,build_ext, ..) that deal with this, and thecythonize/maybe_cythonizeseems to be duplicating that partly (but again, I don't have much knowledge of setuptools/cython interaction)jbrockmendel commentedon Feb 15, 2019
I maintain this is the thing to do. Go for it.
cythonizetakes care of figuring out the dependencies that were previously explicitly (and error-proned-ly) listed in setup.py. Using it simplifies that part of setup quite a bit.jorisvandenbossche commentedon Feb 15, 2019
I currently don't have time to look more into it.
TomAugspurger commentedon Mar 6, 2019
I'll look into this today, as I think this is a blocker for 0.24.2?