Skip to content

Conversation

@joshlemer
Copy link
Contributor

@joshlemer joshlemer commented Aug 23, 2018

This closes scala/bug#11098

This adds tapEach method to IterableOps which allows you to write:

View(1,2,3)
  .tapEach(println)
  .tapEach(println)
  .to(Seq)
// prints: 
1
1
2
2
3
3

Or

List(1,2,3)
  .tapEach(println)
  .tapEach(println)
// prints: 
1
2
3
1
2
3

* @tparam U the return type of f
* @return The same logical collection as this
*/
def tapEach[U](f: A => U): C = fromSpecific(new View.TapEach(this, f))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could potentially optimise this for strict collections by doing

  def tapEach[U](f: A => U): C = knownSize match {
    case 0          => this
    case n if n < 0 => fromSpecific(new View.TapEach(this, f))
    case n if n > 0 =>
      foreach(f)
      this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not perfect (e.g. it misses List), but it might be better than nothing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this size optimization could result in surprising behaviour if the underlying collection is mutated but lazy. In that case, when the empty collection has elements inserted, and then the lazy collection forces, f will not be called. I do not know if there's actually any collections like this however...

* @tparam U the return type of f
* @return The same logical collection as this
*/
def tapEach[U](f: A => U): C = fromSpecific(new View.TapEach(this, f))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure if we really need a View.TapEach, or can just do map(a => { f(a); a }) instead of fromSpecific(new View.TapEach(this, f))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I'll change this

@joshlemer
Copy link
Contributor Author

@NthPortal ready for re-review

@javax-swing
Copy link

I think this will need to be specialised for LazyList, since LazyList overrides map, I don’t think it plays nicely with views

@joshlemer
Copy link
Contributor Author

@javax-swing Added some tests here for LazyList#tapEach

@NthPortal
Copy link
Contributor

The fact that you need a specialization for LazyList is why I think it would be better to implement this as a call to map, rather than as fromSpecific(View.Map(...)); once you have map doing the right thing, tapEach will be correct for free

@NthPortal
Copy link
Contributor

I'm wondering if it would be a good idea to have a StrictTapEach trait to mix into strict collections; e.g.

trait StrictTapEach[A] {
  self: IterableOps[A, AnyConstr, _] =>
  override def tapEach[U](f: A => U): this.type = {
    foreach(f)
    this
  }
}

(if this.type isn't valid for for some reason, you could add a C type parameter to the self type)

@joshlemer
Copy link
Contributor Author

joshlemer commented Aug 26, 2018

I'm wondering if it would be a good idea to have a StrictTapEach trait to mix into strict collections;
@NthPortal

Actually I just put the implementation in StrictOptimizedIterableOps which all the strict iterables extend

// Optimization avoids creation of second collection
override def tapEach[U](f: A => U): C = {
foreach(f)
coll
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't return this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, StrictOptimizedSeqOps[..., C] does not conform to C. I could play around with introducing a self-type to it, self: C => and see if there are any issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, It seems that this solution does solve it, as long as you also enforce the self: C => type on all extending StrictOptimized{Seq,Set,Map,SortedMap, SortedSet}Ops. I'd like to get @julienrf 's opinion though if he forsees any issues with that constraint, maybe it's intended to be extendable by non-C things too

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exceptions used to be StringOps and ArrayOps. All regular collection types extend their C parameter. Can we make any further simplifications if we always enforce this? I don't think it's worth making this change specifically for tapEach. There are plenty of other methods that return C. In most cases you abstract over consuming a collection where this is not an issue. If you also need to produce a more specific collection type you need to use the proper abstraction anyway (unless we can change it in a more generic way).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@szeiger I can't think of any other improvements or simplifications that could be made if it had { self: C => unfortunately.

// Optimization avoids creation of second collection
override def tapEach[U](f: A => U): self.type = {
foreach(f)
this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be simpler to return coll here (and keep using C as the return type).

@joshlemer joshlemer force-pushed the tap-each branch 3 times, most recently from 2e778c1 to fc4b230 Compare September 5, 2018 20:02
assertEquals(Vector(1, 2), ll.take(2).to(Vector))
assertEquals(ListBuffer(1, 2), lb)
assertEquals(4, ll(3))
assertEquals(ListBuffer(1, 2, 3, 4), lb) // this should change to Listbuffer(1,2,4) if LazyList becomes independently lazy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can make the LazyList independently lazy by doing LazyList.tabulate(5)(_ + 1)

@joshlemer
Copy link
Contributor Author

@NthPortal your suggestions are implemented now

@joshlemer joshlemer force-pushed the tap-each branch 2 times, most recently from febe3e4 to 562131c Compare September 13, 2018 23:42
Copy link
Contributor

@NthPortal NthPortal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@NthPortal
Copy link
Contributor

IMO, this is ready to merge (once the conflict is resolved). @julienrf do you agree?

@joshlemer
Copy link
Contributor Author

joshlemer commented Sep 16, 2018 via email

@NthPortal
Copy link
Contributor

@joshlemer I'm not sure if I'm behind increasing the maintenance burden by adding a whole bunch of overrides that just call super. If there's a way to do it generically so that you only need at most a few overrides, that sounds good to me, but I'm not sure it's worth it to do it for every strict collection.

// Optimization avoids creation of second collection
override def tapEach[U](f: A => U): C = {
foreach(f)
coll
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exceptions used to be StringOps and ArrayOps. All regular collection types extend their C parameter. Can we make any further simplifications if we always enforce this? I don't think it's worth making this change specifically for tapEach. There are plenty of other methods that return C. In most cases you abstract over consuming a collection where this is not an issue. If you also need to produce a more specific collection type you need to use the proper abstraction anyway (unless we can change it in a more generic way).

@adriaanm
Copy link
Contributor

adriaanm commented Oct 3, 2018

@szeiger do I read your review correctly as "not convinced yet?"

@NthPortal
Copy link
Contributor

Is there a reason for this not to be merged?

@joshlemer
Copy link
Contributor Author

@NthPortal I don't believe so

@adriaanm
Copy link
Contributor

ping @szeiger

Copy link
Contributor

@szeiger szeiger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the missing knownSize, this looks good.

*/
def inits: Iterator[C] = iterateUntilEmpty(_.init)

override def tapEach[U](f: A => U): C = fromSpecific(new View.Map(this, { a: A => f(a); a }))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I get others' opinions on whether or not this should be implemented as a call to map instead of fromSpecific(new View.Map)? I think that the former simplifies/reduces the work for those implementing collections which need to override map, but I am curious to hear others' thoughts.

I don't think the decision should hold up the PR - it's an easy change to make in its own PR if needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using map is better, indeed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, calling map { a => f(a); a } will only return a CC[A], rather than the more specific C.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is interesting and unfortunate. I'm torn on whether or not it would still be better to use map and cast the result. I'm leaning towards not.

Copy link
Member

@SethTisue SethTisue Nov 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked @szeiger to look at this and explain it to me.

Josh's point is that we can't typecast our way out of this. CC and C[A] can be different collection types entirely:

scala 2.13.0-M5> (Map(1 -> "foo"): Iterable[(Int, String)]).map(x => x)
res5: Iterable[(Int, String)] = List((1,foo))

scala 2.13.0-M5> (collection.immutable.BitSet(0, 5, 9): Iterable[Int]).map(_ + 1)
res0: Iterable[Int] = Set(1, 6, 10)

(in an earlier edit, I'd included a String example where I "upcast" it to Iterable[Char], but that's not really the same, because it's an implicit conversion, not an upcast)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SethTisue Wow! Very sneaky! Some downsides (which may have workarounds):

  • filter can (as far as the compiler is concerned) change the length of the Iterable. So the consequences of this are that, perhaps non-strict Seq's like SeqView and LazyList will have to then go through each element in order to fulfill a call to apply(100), whereas before the filter call, they may have been able to independently evaluate/access element 100 immediately.
  • The above point means that IndexedSeqView#filter cannot return an IndexedSeqView

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add on one more point, filter also destroys independent laziness of elements (currently only a thing for LazyList, but could potentially be a thing for some lazy IndexedSeq in the future).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshlemer okay, sounds like it’s better as-is then

@NthPortal I don’t follow?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SethTisue

scala> def list = LazyList.tabulate(10)(i => i)
list: scala.collection.immutable.LazyList[Int]

scala> def f(i: Int) = println(i)
f: (i: Int)Unit

scala> list.map(i => {f(i); i})(5)
5
res0: Int = 5

scala> list.filter(i => {f(i); true})(5)
0
1
2
3
4
5
res1: Int = 5

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I retract all complaints, and I'm happy with fromSpecific(new View.Map(...))

@lrytz
Copy link
Member

lrytz commented Nov 2, 2018

ping @joshlemer

@szeiger
Copy link
Contributor

szeiger commented Nov 2, 2018

Strange compiler failure in the Travis build: https://travis-ci.org/scala/scala/jobs/449897364. I've triggered a new build.

@lrytz
Copy link
Member

lrytz commented Nov 2, 2018

Travis should be ignored for the time being, scala/scala-dev#570

@szeiger
Copy link
Contributor

szeiger commented Nov 5, 2018

I see. In this case it just needs to be squashed because the first commit on its own failed.

@SethTisue SethTisue added the library:collections PRs involving changes to the standard collection library label Nov 8, 2018
@SethTisue
Copy link
Member

SethTisue commented Nov 28, 2018

@joshlemer rebase and squash?

@joshlemer
Copy link
Contributor Author

Done! @SethTisue

@SethTisue
Copy link
Member

gah, sorry Josh, needs one more rebase.

I tried to use this the other day and was surprised to find it hadn't been merged yet :-)

Better unit test for Vector.tapEach

tapeach returns specific iterable type

Add comments in vectortest

Fix ViewTest.tapEach

Add tapEach to iterableOnce, optimizations for strict collections

Add tests for LazyList.tapeach

Remove git merge artifacts

Fix LazyList.tapEach test

Fix LazyList#tapEach, add more and better tests

Better tests in IteratorTest.tapEach

Use 'locally' in IteratorTest#tapEach

Miscellaneous cleanup related to IterableOnce#tapEach

Remove copy of scaladoc for tapEach

Remove '(typically)' from tapEach scalado

override knownSize in IterableOnce#tapEach
@joshlemer
Copy link
Contributor Author

🤞 @SethTisue

@SethTisue SethTisue added the release-notes worth highlighting in next release notes label Dec 5, 2018
@SethTisue SethTisue merged commit 402c93f into scala:2.13.x Dec 5, 2018
@joshlemer joshlemer deleted the tap-each branch December 5, 2018 01:09
@SethTisue SethTisue changed the title [11098] Add IterableOps#tapEach method Add IterableOps#tapEach method (scala/bug#11098) Apr 4, 2019
@SethTisue SethTisue changed the title Add IterableOps#tapEach method (scala/bug#11098) Add IterableOps#tapEach method Apr 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

library:collections PRs involving changes to the standard collection library release-notes worth highlighting in next release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add method View#peek

10 participants