ENH: Join - Add a parameter to check for duplicates #46622
Comments
A workaround at the moment would be to use pandas merge and use the |
.join calls merge under the hood, so we would accept a PR to add a validate kwarg. I believe there is another issue about this, please check.
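For reference, a minimal sketch of the merge-based workaround discussed in the comments above, assuming the validate argument of DataFrame.merge is the check being referred to. The sample frames and the helper name merge_checked are made up for illustration and are not part of pandas.

```python
import pandas as pd
from pandas.errors import MergeError

# Illustrative data: the right frame has a duplicated index label "y".
left = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"])
right_dup = pd.DataFrame({"b": [10, 20, 30]}, index=["x", "y", "y"])
right_ok = pd.DataFrame({"b": [10, 20, 30]}, index=["x", "y", "z"])


def merge_checked(left, right):
    """Index-on-index left merge that fails if keys are not unique on both sides."""
    return left.merge(
        right,
        left_index=True,
        right_index=True,
        how="left",
        validate="one_to_one",  # existing merge option; raises MergeError on duplicates
    )


# Unique keys on both sides: behaves like a plain join.
joined = merge_checked(left, right_ok)

# Duplicate keys on the right: merge raises instead of silently adding rows.
try:
    merge_checked(left, right_dup)
    raised = False
except MergeError:
    raised = True
```

Other values of validate ("one_to_many", "many_to_one") relax the check on one side if only one of the frames is expected to have unique keys.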
Hi @jreback, I submitted a PR. Would you please take a look?
Is your feature request related to a problem?
It is arguably more common than not to join dataframes on a unique index. An optional uniqueness check would help prevent hidden errors and reduce time spent debugging.
Describe the solution you'd like
DataFrame.join should get a new parameter on_unique that would, if set to True, check whether the index/columns on which the join is performed contain duplicates and raise an error if they do. The default should be False to keep backward compatibility.
API breaking implications
Default set to False ensures no issues with compatibility.
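To illustrate the requested behavior, here is a rough sketch written as a standalone wrapper. The function join_checked and its on_unique argument are hypothetical and only mimic the proposal; they are not an existing pandas API. For brevity the sketch assumes an index-on-index join and does not handle joining on columns via on=.

```python
import pandas as pd


def join_checked(left, right, on_unique=False, **join_kwargs):
    """Hypothetical wrapper around DataFrame.join (not part of pandas).

    With on_unique=True, raise if either index contains duplicate labels.
    With the default on_unique=False, behave exactly like left.join(right).
    """
    if on_unique:
        for name, frame in (("left", left), ("right", right)):
            if not frame.index.is_unique:
                dupes = frame.index[frame.index.duplicated()].unique().tolist()
                raise ValueError(f"{name} index has duplicate keys: {dupes}")
    return left.join(right, **join_kwargs)


# Illustrative data: "y" appears twice in the right index.
left = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"])
right = pd.DataFrame({"b": [10, 20, 30]}, index=["x", "y", "y"])

# Default behavior is unchanged: the duplicate key silently yields an extra row.
unchecked = join_checked(left, right)

# Opting in turns the silent row duplication into an explicit error.
try:
    join_checked(left, right, on_unique=True)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Because the check is skipped unless on_unique=True is passed, existing calls keep their current results, which is the compatibility argument made above.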