Skip to content

Implement 'intersections', and add an 'Intersection' newtype along with a 'Semigroup' instance #756

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

TOTBWF
Copy link
Contributor

@TOTBWF TOTBWF commented Dec 6, 2020

Patch Description

This PR adds the following function for computing the intersection of a series of sets:

intersections :: Ord a => NonEmpty (Set a) -> Set a
intersections (s :| ss) = Foldable.foldl' intersection s ss

It also adds a newtype around Set dubbed Intersection, which provides a Semigroup instance that uses intersections.

Mailing list discussion can be found here: https://mail.haskell.org/pipermail/libraries/2020-December/030951.html

@ChrisPenner
Copy link

I went looking for this function today as well and was a little surprised to see it was missing.

@treeowl
Copy link
Contributor

treeowl commented Dec 6, 2020

The implementation should short-circuit. If the accumulator ever becomes empty, then there's no need to continue.

@treeowl
Copy link
Contributor

treeowl commented Dec 6, 2020

No, that last comment was wrong.

@treeowl
Copy link
Contributor

treeowl commented Dec 6, 2020

Corrected version: something like this:

intersections (s0 :| ss) = foldr go id ss s0
  where
    go s r acc
      | null acc = empty
      | otherwise = r (intersection acc s)

@TOTBWF
Copy link
Contributor Author

TOTBWF commented Dec 6, 2020 via email

@TOTBWF
Copy link
Contributor Author

TOTBWF commented Dec 6, 2020 via email

@treeowl
Copy link
Contributor

treeowl commented Dec 6, 2020

@TOTBWF, foldl' won't short-circuit regardless. In principle, you could use a naive foldr (with no accumulator), but that will be much worst than what I did in the non-short-circuiting case.

@treeowl
Copy link
Contributor

treeowl commented Dec 6, 2020

Please be sure to document that the intersections are taken from left to right. Even aside from short-cutting, the smaller the accumulator the better.

@treeowl
Copy link
Contributor

treeowl commented Dec 6, 2020

Still need to quit using foldl'. If you don't believe me, try an infinite list of empty.

@TOTBWF
Copy link
Contributor Author

TOTBWF commented Dec 6, 2020

Sorry about that, had a laziness brainfart! Will make the requested changes.

Copy link
Member

@sjakobi sjakobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TOTBWF could you summarise the feedback you got on the mailing list?


instance (Ord a) => Semigroup (Intersection a) where
(Intersection a) <> (Intersection b) = Intersection $ intersection a b
stimes = stimesIdempotent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
stimes = stimesIdempotent
sconcat = coerce intersections
stimes = stimesIdempotent

@@ -897,6 +905,24 @@ intersection t1@(Bin _ x l1 r1) t2
{-# INLINABLE intersection #-}
#endif

#if (MIN_VERSION_base(4,9,0))
-- | The intersection of a series of sets. Intersections are performed left-to-right.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doctests would be nice.

Comment on lines +917 to +918
-- | Sets form a 'Semigroup' under 'intersection'.
newtype Intersection a = Intersection { getIntersection :: Set a }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to reference this newtype from the documentation for the union-based Semigroup Set instance.

| null acc = empty
| otherwise = r (intersection acc s)

-- | Sets form a 'Semigroup' under 'intersection'.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs should mention that this type is only included for base >= 4.9.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better, I think: include the type and let it be useless without the Semigroup instance.

@sjakobi
Copy link
Member

sjakobi commented Mar 19, 2021

If we add an Intersection newtype for Set, I think we should also add one for IntSet.

@sjakobi
Copy link
Member

sjakobi commented Mar 19, 2021

If we add an Intersection newtype for Set, I think we should also add one for IntSet.

And intersections too of course.

@TOTBWF
Copy link
Contributor Author

TOTBWF commented Mar 19, 2021

I don't think there was consensus. There were some +1's, but a lot of arguing over whether or not intersections [] = empty or not. I'm of the opinion that it shouldn't be the case, but this isn't universally held.

If we decide that this is worth adding, then I'll make the requested changes.

@sjakobi
Copy link
Member

sjakobi commented Mar 19, 2021

I don't think there was consensus. There were some +1's, but a lot of arguing over whether or not intersections [] = empty or not. I'm of the opinion that it shouldn't be the case, but this isn't universally held.

Yes, intersections [] = empty would suggest that Intersection has a Monoid instance which however wouldn't satisfy the monoid laws. It's good that intersections avoids this misunderstanding by taking a NonEmpty.

@dfeuer How do you feel about this PR?

@Boarders
Copy link
Contributor

Would be nice to get this merged!

@treeowl
Copy link
Contributor

treeowl commented Sep 18, 2021

Fell off my radar. Oops.

@isovector
Copy link

Can we get this merged? It's a great addition.

@treeowl
Copy link
Contributor

treeowl commented Oct 31, 2021

There seem to be a few unresolved discussions, especially regarding documentation

@treeowl
Copy link
Contributor

treeowl commented Jul 22, 2022

Merged in 89ded1c, 3fecaba, eb6cf52, and d0c9173.

@meooow25
Copy link
Contributor

intersections and the Intersection newtype was never exported from Data.Set. Seems like an oversight?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants