Functional Stream
Processing
with Scala
Adil Akhter
1
The next 45 minutes
... is about Scalaz-Stream:
• Objectives.
• Design Philosophy.
• Core Building Blocks & their Semantics.
2
Scalaz-Stream
3
scalaz.concurrent.Task
4
Task[A]
• Purely functional & compositional.
• Task[A] is a wrapper around Future[Throwable \/ A].
• Asynchronous: uses scalaz.concurrent.Future[A].
• Handles exceptions with scalaz.\/ (disjunction), scalaz's counterpart to Either.
5
Task[A]
• Task separates what to do, described in the monadic context, from when to do it.
scala> Task{ println("Hello World!") }
res0: scalaz.concurrent.Task[Unit] = scalaz.concurrent.Task@2435b6e8
scala> res0.run
Hello World!
scala> res0.run
Hello World!
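A minimal sketch of the same point, assuming scalaz 7.1.x (where Task exposes run, as in the REPL session above). The counter and the object name are hypothetical and exist only to make the effect observable:

```scala
import scalaz.concurrent.Task

object TaskIsADescription {
  // Returns (counter before any run, result of 1st run, result of 2nd run).
  def demo(): (Int, Int, Int) = {
    var counter = 0 // hypothetical state, used only to observe when the effect runs
    val increment: Task[Int] = Task.delay { counter += 1; counter }
    val beforeRun = counter        // constructing the Task executed nothing
    val firstRun  = increment.run  // runs the effect
    val secondRun = increment.run  // runs the same description again
    (beforeRun, firstRun, secondRun)
  }
}
```

Each call to run re-executes the description, which is why "Hello World!" is printed twice above.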
7
Task[A]
• Example: Lifting computation into a Task:
import twitter4j._
val twitterClient = new TwitterFactory(twitterConfig).getInstance()
def statuses(query: Query): Task[List[Status]] =
  Task {
    twitterClient.search(query).getTweets.toList
  }
8
Task Constructors
• def now[A](a: A): Task[A]
• def fail(e: Throwable): Task[Nothing]
• def fork[A](a: => Task[A]): Task[A]
• ...
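The constructors listed above can be exercised as follows. This is a sketch assuming scalaz 7.1.x; the object name and the sample values are illustrative only:

```scala
import scalaz.\/
import scalaz.concurrent.Task

object TaskConstructors {
  // Returns (value of now, message of the failure, value of the forked task).
  def demo(): (Int, String, Int) = {
    val immediate: Task[Int] = Task.now(42)                               // already-computed value
    val failed: Task[Int]    = Task.fail(new IllegalStateException("boom")) // failed task
    val forked: Task[Int]    = Task.fork(Task.delay(1 + 1))               // evaluated on the default thread pool

    // attempt surfaces the failure as a value instead of throwing on run.
    val failure: Throwable \/ Int = failed.attempt.run
    (immediate.run, failure.fold(_.getMessage, _.toString), forked.run)
  }
}
```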
9
Scalaz-Stream
10
Objectives
• Incremental IO & Stream Processing.
• Modularity & Composability.
• Resource-safety.
• Performance.
11
Canonical Example
• File IO using Scalaz-Stream [1]:
val streamConverter: Task[Unit] =
  io.linesR("testdata/fahrenheit.txt")
    .filter(s => !s.trim.isEmpty && !s.startsWith("//"))
    .map(line => fahrenheitToCelsius(line.toDouble).toString)
    .intersperse("\n")
    .pipe(text.utf8Encode)
    .to(io.fileChunkW("testdata/celsius.txt"))
    .run
streamConverter.run
[1] Source: https://github.com/functional-streams-for-scala/fs2/blob/topic/redesign/README.md
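The pipeline calls fahrenheitToCelsius without showing its body. A plausible definition is sketched below as an assumption, together with the same filter and map steps applied to an in-memory sample (convertLines is a hypothetical helper, not part of the library):

```scala
object TemperatureConversion {
  // Assumed helper: the slide uses it but never defines it.
  def fahrenheitToCelsius(f: Double): Double = (f - 32.0) * (5.0 / 9.0)

  // Same per-line logic as the stream: skip blank lines and // comments, then convert.
  def convertLines(lines: Seq[String]): Seq[String] =
    lines
      .filter(s => !s.trim.isEmpty && !s.startsWith("//"))
      .map(line => fahrenheitToCelsius(line.toDouble).toString)
}
```

The streaming version applies this logic line by line, so the whole file never has to be held in memory.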
12
13
14
15
Scalaz-Stream
... provides an abstraction to declaratively specify how to obtain
a stream of data.
16
Scalaz-Stream
... centers around one core type: Process.
17
Process[F, O]
... represents a stream of O values
which can interleave external requests
to evaluate expressions of form F[_].
18
Process[F[_], O]
• F[_]: a context/container (e.g., scalaz.concurrent.Task)
• effectful or not,
• in which a computation runs.
• A stream of Os is obtained from the context F.
• Process can accept different input types (e.g., F[_]).
21
Execution Model
scala> :paste
// Entering paste mode (ctrl-D to finish)
import scalaz.stream._
import scalaz.concurrent.Task
import scala.concurrent.duration._
// Exiting paste mode, now interpreting.
22
Process
Process = Halt | Emit | Await
trait Process[F[_], O]
23
Halt
case class Halt(cause: Cause)
extends Process[Nothing, Nothing]
scala> Process.halt
res19: scalaz.stream.Process0[Nothing] = Halt(End)
24
Emit[F[_],O]
case class Emit[F[_],O](
  head: Seq[O],
  tail: Process[F, O]) extends Process[F, O]
25
Emit[F[_],O]
scala> import Process._
import scalaz.stream.Process._
scala> emit(1)
res1: scalaz.stream.Process0[Int] = Emit(Vector(1))
scala> emit(1).toSource
res2: scalaz.stream.Process[scalaz.concurrent.Task,Int] = Emit(Vector(1))
26
Emit[F[_],O]
scala> val p: Process[Task, Int] = emitAll(Seq(1,5,10,20))
p: scalaz.stream.Process[scalaz.concurrent.Task,Int] = Emit(List(1, 5, 10, 20))
27
Await[F[_], I, O]
case class Await[F[_], I, O](
  req: F[I],
  recv: I => Process[F, O],
  fallback: Process[F, O] = halt,
  cleanUp: Process[F, O] = halt) extends Process[F, O]
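To see how Halt, Emit and Await fit together, here is a deliberately simplified model in plain Scala. It is not the library's implementation: Proc, step, runLog and countTo are hypothetical names, a thunk `() => I` stands in for F[I], and fallback/cleanUp are left out.

```scala
object ProcessModel {
  sealed trait Proc[O]
  final case class Halt[O]() extends Proc[O]
  final case class Emit[O](head: Seq[O], tail: Proc[O]) extends Proc[O]
  final case class Await[I, O](req: () => I, recv: I => Proc[O]) extends Proc[O] {
    // Evaluate the request, then hand its result to the continuation.
    def step(): Proc[O] = recv(req())
  }

  // A driver in the spirit of runLog: interpret the structure and collect every emitted value.
  def runLog[O](p: Proc[O]): Vector[O] = {
    @annotation.tailrec
    def go(cur: Proc[O], acc: Vector[O]): Vector[O] = cur match {
      case Halt()          => acc
      case Emit(h, t)      => go(t, acc ++ h)
      case a @ Await(_, _) => go(a.step(), acc)
    }
    go(p, Vector.empty)
  }

  // Await a value, emit it, then continue with the next one until n is passed.
  def countTo(n: Int): Proc[Int] = {
    def next(i: Int): Proc[Int] =
      if (i > n) Halt[Int]()
      else Await[Int, Int](() => i, v => Emit[Int](Seq(v), next(v + 1)))
    next(1)
  }
}
```

Nothing is evaluated while the structure is built; values appear only when a driver such as runLog interprets it step by step.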
28
Await[F[_], I, O]
• Example: build a Twitter search Process that fetches tweets tagged #spark.
• Defined earlier:
val twitterClient: Twitter = ???
def statuses(query: Query): Task[List[Status]] = ???
29
Await[F[_], I, O]
// Builds a Twitter search Process for a given Query
def buildSearchProcess(query: Query): Process[Task, Status] = {
  val queryTask: Task[List[Status]] = statuses(query)
  await(queryTask) { statusList =>
    Process.emitAll(statusList)
  }
}
// In essence, it builds a Process of the form: Await (_ => Emit => Halt)
val akkaSearchProcess: Process[Task, Status] =
  buildSearchProcess(new Query("#spark"))
30
Await[F[_], I, O]
• Example: Build a stream of Positive Integers.
scala> import Process._
import Process._
scala> def integerStream: Process[Task, Int] = {
|   def next(i: Int): Process[Task, Int] = await(Task(i)) { i =>
|     emit(i) ++ next(i + 1)
|   }
|   next(1)
| }
integerStream: scalaz.stream.Process[scalaz.concurrent.Task,Int]
31
Notable Approaches to Construct Process
• Process.eval
• Process.repeatEval
• scalaz.stream.io and scalaz.stream.nio
• scalaz.stream.tcp
• scalaz.stream.async
• ...
32
Running a Process: run
• def run: F[Unit]: constructs the machinery that will execute the Process.
• Reduces the sequence of operations to a single operation, in the context of Task. Given:
scala> val intsP = integerStream.take(10)
intsP: scalaz.stream.Process[scalaz.concurrent.Task,Int] = ...
scala> val task = intsP.run
task: scalaz.concurrent.Task[Unit] = scalaz.concurrent.Task@fbce6f9
scala> task.run
// Prints nothing: run executes the Process for its effects and discards the emitted values.
33
Running a Process: runLog
• def runLog: F[Vector[O]]: collects all emitted values.
scala> intsP.runLog.run
res4: Vector[Int] = Vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
34
Running a Process: runLog
• What happens when we execute integerStream.runLog.run?
def integerStream: Process[Task, Int] = {
  def next(i: Int): Process[Task, Int] = await(Task(i)) { i =>
    emit(i) ++ next(i + 1)
  }
  next(1)
}
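Because runLog accumulates every emitted value in memory, running it on this unbounded stream never completes and eventually exhausts the heap. Bounding the stream first makes it safe. A sketch assuming scalaz-stream 0.8.x on scalaz 7.1.x; the object name is hypothetical and Task.delay replaces Task(i) only to stay on the calling thread:

```scala
import scalaz.concurrent.Task
import scalaz.stream.Process
import scalaz.stream.Process._

object BoundedRun {
  def integerStream: Process[Task, Int] = {
    def next(i: Int): Process[Task, Int] =
      await(Task.delay(i))(n => emit(n) ++ next(n + 1))
    next(1)
  }

  // Returns (first three values, all values below five).
  def demo(): (Vector[Int], Vector[Int]) = {
    val firstThree = integerStream.take(3).runLog.run
    val belowFive  = integerStream.takeWhile(_ < 5).runLog.run
    (firstThree, belowFive)
  }
}
```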
35
Other Approaches
• def runLast: F[Option[O]]
• def runFoldMap[B](f: O => B): F[B] (requires a Monoid[B])
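Both runners reduce the stream to a single value. A sketch assuming scalaz-stream 0.8.x on scalaz 7.1.x; the Monoid[Int] instance is taken from scalaz.std.anyVal, and the object name is hypothetical:

```scala
import scalaz.concurrent.Task
import scalaz.std.anyVal._ // provides Monoid[Int] (addition)
import scalaz.stream.Process

object OtherRunners {
  // Returns (last emitted value, sum of all emitted values).
  def demo(): (Option[Int], Int) = {
    val oneToTen: Process[Task, Int] = Process.emitAll(1 to 10).toSource
    val last = oneToTen.runLast.run
    val sum  = oneToTen.runFoldMap(i => i).run
    (last, sum)
  }
}
```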
36
Transformations
37
Transformations
// map
scala> integerStream.map(_ * 100).take(5).runLog.run
res5: Vector[Int] = Vector(100, 200, 300, 400, 500)
// flatMap
scala> integerStream.flatMap(i => emit(-i) ++ emit(-i-1)).take(5).runLog.run
res6: Vector[Int] = Vector(-1, -2, -2, -3, -3)
// zip
scala> val zippedP: Process[Task, (Int, String)]
|   = integerStream.take(2) zip emitAll(Seq("A", "B"))
scala> zippedP.runLog.run
res2: Vector[(Int, String)] = Vector((1,A), (2,B))
38
Composability
39
Process1[-I,+O]
• a.k.a. Pipe
• Pure Transformations
• Process[F, A].pipe(Process1[A, B]) => Process[F, B]
scala> import process1._
scala> integerStream |> filter(i => i > 5) |> exists(_ == 10)
//res8: scalaz.stream.Process[Task,Boolean] = ...
40
Process1[-I,+O]
• type Process1[-I,+O] = Process[Env[I,Any]#Is, O].
• The process1 object provides filter, drop, take and so on.
41
Channel[+F[_],-I,O]
• type Channel[+F[_],-I,O] = Process[F, I => F[O]].
• Process[F, A].through(Channel[Task, A, B]) => Process[Task, B].
42
Channel[+F[_],-I,O]
scala> val multiply10Channel: Channel[Task, Int, Int] =
| Process.constant { (i: Int) =>
| Task.now(i*10)
| }
scala> integerStream.take(5) through multiply10Channel
scala> res40.runLog.run
res41: Vector[Int] = Vector(10, 20, 30, 40, 50)
43
Sink[+F[_],-O]
• Process[F, A].to(Sink[Task, A]) => Process[Task, Unit]
• type Sink[+F[_],-O] = Process[F, O => F[Unit]]
44
Sink[+F[_],-O]
scala> val printInts: Sink[Task, Int] =
|   Process.constant { (i: Int) =>
|     Task.delay { println(i) }
|   }
scala> val intPwithSink: Process[Task, Unit]
|   = integerStream.take(5) to printInts
scala> intPwithSink.run.run
1
2
3
4
5
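Pipe, Channel and Sink compose into one pipeline. The sketch below assumes scalaz-stream 0.8.x on scalaz 7.1.x; the names are hypothetical, and the sink appends to a buffer instead of printing so the result can be inspected:

```scala
import scala.collection.mutable.ListBuffer
import scalaz.concurrent.Task
import scalaz.stream.{Channel, Process, Sink, process1}

object PipelineSketch {
  // Returns what the sink received, in order.
  def demo(): List[Int] = {
    val seen = ListBuffer.empty[Int]

    val source: Process[Task, Int] = Process.emitAll(1 to 5).toSource
    // Channel: an effectful function applied to every element.
    val times10: Channel[Task, Int, Int] =
      Process.constant((i: Int) => Task.now(i * 10))
    // Sink: an effectful consumer applied to every element.
    val collect: Sink[Task, Int] =
      Process.constant((i: Int) => Task.delay { seen += i; () })

    source
      .pipe(process1.filter[Int](_ % 2 == 1)) // pure transformation (Process1)
      .through(times10)
      .to(collect)
      .run // Process[Task, Unit] => Task[Unit]
      .run // execute the Task

    seen.toList
  }
}
```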
45
Example
Sentiment Analysis of Tweets
46
Source
• Building a source that queries Twitter every 5 seconds.
def buildTwitterQuery(query: Query): Process1[Any, Query] =
  process1 lift { _ => query }
val source = awakeEvery(5 seconds) |> buildTwitterQuery(query)
47
Channel
• Building the Twitter Query Channel.
// defined earlier
def statuses(query: Query): Task[List[Status]] = ???
// Lift the Task to construct a channel
val queryChannel: Channel[Task, Query, List[Status]]
= channel lift statuses
48
Channel
• Building the Sentiment Analysis Channel.
import SentimentAnalyzer._
def analyze(t: Tweet): Task[EnrichedTweet] =
  Task {
    EnrichedTweet(t.author, t.body, t.retweetCount, sentiment(t.body))
  }
val analysisChannel: Channel[Task, Tweet, EnrichedTweet] =
channel lift analyze
49
Source |> Channel
def tweetsP(query: Query): Process[Task, Status] =
  source through queryChannel flatMap {
    Process emitAll _
  }
// source = awakeEvery(5 seconds) |> buildTwitterQuery(query)
50
Source |> Channel
def twitterSource(query: Query): Process[Task, String] =
  tweetsP(query)
    .map(status => Tweet(status))
    .through(analysisChannel) // ... to analysis channel
    .map(_.toString)
51
Source |> Channel |> Sink
// http4s Websocket Route
case r @ GET -> Root / "websocket" =>
  val query = new Query("#spark")
  val src1: Process[Task, Text] = twitterSource(query)
    .observe(io.stdOutLines)
    .map(Text(_))
  WS(Exchange(src1, Process.halt))
52
53
Example
Sentiment Analysis of Tweets
using Twitter's Streaming API
54
async.boundedQueue
val twitterStream: TwitterStream = ???
val tweetsQueue: Queue[Status] =
  async.boundedQueue[Status](size)
def stream(q: Queue[Status]) = Task {
  twitterStream.addListener(twitterStatusListener(q))
  twitterStream.sample()
}
55
async.boundedQueue
• Producer
Process.eval_(stream(tweetsQueue))
  .run
  .runAsync { _ => println("All input data was written") }
56
async.boundedQueue
• Consumer
tweetsQueue.dequeue.map(status =>
  Tweet(
    author = Author(status.getUser.getScreenName),
    retweetCount = status.getRetweetCount,
    body = status.getText)
) through analysisChannel map (_.toString)
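The producer/consumer hand-off can be tried without Twitter. A sketch assuming scalaz-stream 0.8.x on scalaz 7.1.x, with integers standing in for tweets and a hypothetical object name:

```scala
import scalaz.stream.async

object QueueSketch {
  // Returns everything the consumer side dequeued.
  def demo(): Vector[Int] = {
    val queue = async.boundedQueue[Int](10)

    // Producer side: enqueue a few elements, then signal that no more will come.
    queue.enqueueAll(Seq(1, 2, 3)).run
    queue.close.run

    // Consumer side: dequeue is a Process[Task, Int] that halts once the
    // closed queue has been drained.
    queue.dequeue.runLog.run
  }
}
```

Without the call to close, dequeue would keep waiting for further elements and runLog would not return.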
57
58
Notable Constructs
59
Tee
• type Tee[-I,-I2,+O] = Process[Env[I,I2]#T, O]
• Deterministic
• Left and then Right and then Left ...
scala> import tee._
scala> import Process._
scala> val p1: Process[Task, Int] = integerStream
scala> val p2: Process[Task, String] = emitAll(Seq("A", "B", "C"))
scala> val teeExample1: Process[Task, Any] = (p1 tee p2)(interleave)
scala> teeExample1.runLog.run
res2: Vector[Any] = Vector(1, A, 2, B, 3, C)
scala> val teeExample2: Process[Task, Int] = (p1 tee p2)(passL)
scala> teeExample2.take(5).runLog.run
res3: Vector[Int] = Vector(1, 2, 3, 4, 5)
60
Wye
• type Wye[-I,-I2,+O] = Process[Env[I,I2]#Y, O]
• Non-deterministic
• Left, right or both
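A Wye-based merge reads from whichever side is ready first. The sketch below assumes scalaz-stream 0.8.x on scalaz 7.1.x and uses the merge combinator with the default Strategy. Since the interleaving is not guaranteed, the result is compared as a set:

```scala
import scalaz.concurrent.Task
import scalaz.stream.Process

object WyeSketch {
  // Returns the merged elements as a set, because their order is non-deterministic.
  def demo(): Set[Int] = {
    val left: Process[Task, Int]  = Process.emitAll(Seq(1, 2, 3)).toSource
    val right: Process[Task, Int] = Process.emitAll(Seq(10, 20, 30)).toSource
    left.merge(right).runLog.run.toSet
  }
}
```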
61
Next Version: FS2
• Functional Streams for Scala
• New algebra.
• Improved abstraction.
• Better performance.
62
“There are two types of
libraries: the ones people hate
and the ones nobody uses.”
— Unknown
63
Thank You
64
• Credits: Edward Kmett, Rúnar Bjarnason & Paul Chiusano et al.
• Slides and code samples will be posted at http://github.com/adilakhter/scalaitaly-functional-stream-processing.
65
