ENH: Join - Add a parameter to check for duplicates #46622

Closed
jiri-kulik opened this issue Apr 3, 2022 · 3 comments · Fixed by #46740
Labels
Enhancement Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Milestone

Comments

@jiri-kulik

Is your feature request related to a problem?

It is arguably more common than not to join DataFrames on a unique index. An optional check for uniqueness would help prevent hidden errors and save time spent debugging.

Describe the solution you'd like

DataFrame.join should get a new parameter, on_unique, that, if set to True, checks whether the index/columns used for the join contain duplicates and raises an error if they do. The default should be False to preserve backward compatibility.
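To illustrate the requested behavior, here is a minimal sketch of a wrapper around DataFrame.join. The helper name join_checked is hypothetical and not part of pandas, on_unique is the parameter name proposed in this issue, and the sketch only covers index-on-index joins:

```python
import pandas as pd


def join_checked(left, right, on_unique=False, **join_kwargs):
    """Hypothetical helper mimicking the proposed on_unique flag.

    Only index-on-index joins are checked in this sketch.
    """
    if on_unique:
        if not left.index.is_unique:
            raise ValueError("left index contains duplicates")
        if not right.index.is_unique:
            raise ValueError("right index contains duplicates")
    return left.join(right, **join_kwargs)


left = pd.DataFrame({"a": [1, 2]}, index=["x", "y"])
right_dup = pd.DataFrame({"b": [10, 20, 30]}, index=["x", "x", "y"])

# Without the check, the duplicated key "x" silently adds a row.
unchecked = join_checked(left, right_dup)

# With the check, the duplicate is reported instead.
try:
    join_checked(left, right_dup, on_unique=True)
    raised = False
except ValueError:
    raised = True
```

With on_unique left at its default of False, the helper behaves exactly like a plain join, which is the backward-compatibility property the proposal relies on.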

API breaking implications

Defaulting to False ensures there are no compatibility issues.

@jiri-kulik added the Enhancement and Needs Triage labels Apr 3, 2022
@samukweku
Contributor

A workaround for now is to use pandas merge with its validate argument to check for uniqueness, passing left_index=True and right_index=True. It would be nice, though, to have the validate argument in join as well.
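The workaround described above might look like the following sketch. The frames and variable names are made up for illustration, and how="left" is chosen here only to mirror the default behavior of join:

```python
import pandas as pd
from pandas.errors import MergeError

left = pd.DataFrame({"a": [1, 2]}, index=["x", "y"])
right = pd.DataFrame({"b": [10, 20]}, index=["x", "y"])
right_dup = pd.DataFrame({"b": [10, 20, 30]}, index=["x", "x", "y"])

# Unique keys on both sides: the merge passes validation.
ok = pd.merge(
    left, right,
    left_index=True, right_index=True,
    how="left", validate="one_to_one",
)

# Duplicate key "x" on the right: validation raises MergeError.
try:
    pd.merge(
        left, right_dup,
        left_index=True, right_index=True,
        how="left", validate="one_to_one",
    )
    merge_failed = False
except MergeError:
    merge_failed = True
```

Besides "one_to_one", validate also accepts "one_to_many", "many_to_one" and "many_to_many", so the check can be restricted to one side of the merge.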

@jreback
Contributor

jreback commented Apr 3, 2022

.join calls merge under the hood

so would accept a PR to add validate kwarg

i believe there is another issue about this - pls check
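For reference, a sketch of what the suggested validate kwarg on join looks like in use. This assumes pandas 1.5 or later, the milestone this issue was assigned to; on older versions join does not accept the keyword. The frames are made up for illustration:

```python
import pandas as pd
from pandas.errors import MergeError

left = pd.DataFrame({"a": [1, 2]}, index=["x", "y"])
right = pd.DataFrame({"b": [10, 20]}, index=["x", "y"])
right_dup = pd.DataFrame({"b": [10, 20, 30]}, index=["x", "x", "y"])

# Assumes pandas >= 1.5, where join accepts validate.
joined = left.join(right, validate="one_to_one")

# The duplicated key "x" in right_dup fails the one-to-one check.
try:
    left.join(right_dup, validate="one_to_one")
    join_failed = False
except MergeError:
    join_failed = True
```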

@simonjayhawkins removed the Needs Triage label Apr 3, 2022
@simonjayhawkins added this to the Contributions Welcome milestone Apr 3, 2022
@mroeschke added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) label Apr 3, 2022
gaotian98 added a commit to gaotian98/pandas that referenced this issue Apr 11, 2022
gaotian98 added a commit to gaotian98/pandas that referenced this issue Apr 12, 2022
@gaotian98
Contributor

Hi @jreback, I submitted a PR. Would you please take a look?

@jreback jreback modified the milestones: Contributions Welcome, 1.5 Apr 26, 2022

6 participants