Notes on Virtual Threads and Clojure
Note: this article discusses a preview version of the software. Take it as inspiration, not as something set in stone!
Intro to Project Loom and Virtual Threads
Virtual Threads are the most significant feature of the so-called Project Loom.
Project Loom was launched in 2017 by Ron Pressler and his team at Oracle. The main goal of the project was to extend the capabilities of the Java Virtual Machine to address the complexity of writing highly concurrent and scalable software.
There is more to Project Loom than just Virtual Threads. The project wiki specifically mentions delimited continuations and tail-call elimination. But it's fair to say that Virtual Threads are the most significant addition to the Java platform in terms of user experience and productivity.
To stay focused on the most practical matters, I won't dive deeper into delimited continuations and tail-call elimination. It is fair to point out, though, that delimited continuations seem to be quite important for bringing Virtual Threads to the Java platform.
So what are Virtual Threads, and why are they so groundbreaking that they deserve a post of their own?
Traditionally, JVM threads were built around OS threads. This fact also determines their major properties:
- A single JVM thread was mapped to a single OS thread
- Blocking (waiting) on a thread effectively wasted that thread for other tasks
- Managing threads on the JVM was costly. Each thread can easily use additional megabytes of memory, so spawning many of them is not wise.
These limitations are mitigated by the introduction of Virtual Threads. They no longer map one-to-one to OS threads. A single OS thread can host many thousands of Virtual Threads, or more, without worrying about blocking or excessive memory demands. This requires changes to the implementation of the JVM and the standard library to allow Virtual Threads to be scheduled effectively.
Virtual Threads also improve the situation where the limitations of OS threads were addressed with more or less sophisticated thread pools. Experienced developers know that thread pools (of OS threads) also have significant downsides if not constrained properly.
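To get a feel for how cheap blocking becomes, here is a minimal sketch that starts a large number of Virtual Threads, lets each of them sleep, and waits for all of them. It assumes a JDK where Virtual Threads are available (JDK 21+, or an earlier preview build); the function name spawn-sleepers and the numbers are made up for illustration.

```clojure
(import '(java.util.concurrent.atomic AtomicLong))

(defn spawn-sleepers
  "Starts n Virtual Threads that each sleep briefly, waits for all of them,
  and returns how many finished. Hypothetical helper for illustration."
  [n]
  (let [completed (AtomicLong. 0)
        threads   (mapv (fn [_]
                          (Thread/startVirtualThread
                            (fn []
                              ;; Sleeping parks only this Virtual Thread
                              (Thread/sleep 100)
                              (.incrementAndGet completed))))
                        (range n))]
    ;; Wait until every Virtual Thread has terminated
    (doseq [^Thread t threads]
      (.join t))
    (.get completed)))

(spawn-sleepers 10000)
```

Trying the same with ten thousand platform threads would typically be far more expensive, which is the limitation described in the list above.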
A Virtual Thread is represented by the class java.lang.VirtualThread, which extends java.lang.Thread. This follows the Liskov substitution principle and allows us to easily introduce Virtual Threads into our existing codebases.
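A quick way to see this substitutability from the REPL is to treat a Virtual Thread as a plain Thread and ask it what it is. This is only a sketch, assuming a JDK where Thread has the isVirtual method (JDK 21+ or a preview build); describe-thread is a hypothetical helper name.

```clojure
(defn describe-thread
  "Returns a small map describing thread t. Hypothetical helper."
  [^Thread t]
  {:thread?  (instance? Thread t)
   :virtual? (.isVirtual t)})

;; A Virtual Thread is still a java.lang.Thread ...
(def virtual-info
  (describe-thread (Thread/startVirtualThread (fn [] nil))))

;; ... and a classic platform thread reports that it is not virtual.
(def platform-info
  (describe-thread (Thread. ^Runnable (fn [] nil))))
```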
Clojure and Threads
Clojure is explicitly designed to work well with the Java thread system. Clojure function instances even implement java.util.concurrent.Callable etc., so they naturally work with the Executor framework.
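As a small sketch of what that means in practice, a plain Clojure function can be handed straight to an ExecutorService. The helper name call-on-executor is made up for this example, and the fixed thread pool is just a convenient stand-in for any executor.

```clojure
(import '(java.util.concurrent Callable Executors ExecutorService))

(defn call-on-executor
  "Submits f (a plain Clojure function) as a Callable and blocks for its
  result. Hypothetical helper for illustration."
  [^ExecutorService executor f]
  (.get (.submit executor ^Callable f)))

(def result
  (let [executor (Executors/newFixedThreadPool 2)]
    (try
      (call-on-executor executor (fn [] (+ 1 2)))
      ;; Always release the pool's threads
      (finally (.shutdown executor)))))
```

The ^Callable hint only tells the compiler which overload of submit to pick; no wrapper object is needed.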
The most primitive way to do something is to launch it in a new thread like this:
(.start (Thread. #(println "Hello world!")))
Unsurprisingly, there is also an API call for launching a Virtual Thread with a preview JDK (or Loom):
(Thread/startVirtualThread #(println "Hello world!"))
Nice! However, this is barely useful. We want concurrent processes to compose and coordinate. Clojure concurrency offers two essential mechanisms:
- Agents
- Futures
Let's revisit those in detail and see how we can spice them up with Loom's Virtual Threads.
Agents manage independent state. Their state can be changed only by submitting an action. Actions are ordinary functions that take the current state as a parameter and return a new state. Actions are dispatched using send, send-off, or send-via, and these calls return immediately without waiting for completion. The action occurs asynchronously on thread-pool threads. Only one action per agent happens at a time.
Agents are nice because they come with the following properties:
- their state is always available to a reader without blocking, by dereferencing with deref or @
- they can be coordinated using
- any dispatches made during the action are held until after the state of the agent has changed
- agents coordinate with transactions - any dispatches made during a transaction are held until it commits
;; construct new agent
(def a-counter (agent 0))
;; send it a function
(send a-counter inc)
;; wait for the delivery
(await a-counter)
;; reveal the state
@a-counter
Spicing up Agents
Agent's dispatching functions send and send-off use default implementations of executors for submitted tasks.
These executors live by default inside clojure.lang.Agent:
- Dispatching function send uses pooledExecutor
- Dispatching function send-off uses soloExecutor
Both executors work by default with heavy OS threads. Even though they are good defaults, we can sneak in some goodies. Loom comes with a new executor service which you can easily create using a static method on the Executors class. This new executor is represented by the ThreadPerTaskExecutor class. We can replace the default pooledExecutor with this new one.
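;; The namespace name is a placeholder
(ns loom.example
  (:import (java.util.concurrent Executors)))
;; Let's first define a factory that helps with spawning new Virtual Threads
(defn thread-factory [name]
  (-> (Thread/ofVirtual)
      (.name name 0)
      (.factory)))
;; Let's swap the default executor with the new one
(set-agent-send-executor!
  (Executors/newThreadPerTaskExecutor (thread-factory "clojure-agent-send-pool-")))
;; This code is going to be executed using Virtual Threads under the hood
(def a-counter (agent 0))
(send a-counter inc)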
The same applies to the executor for the send-off dispatching function.
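For send-off, a minimal sketch could look like the following. It assumes clojure.core's set-agent-send-off-executor! and, for brevity, the JDK's built-in Executors/newVirtualThreadPerTaskExecutor instead of a custom thread factory; the agent name thread-kind is made up for this example.

```clojure
(import '(java.util.concurrent Executors))

;; Route send-off actions to a fresh Virtual Thread per task
(set-agent-send-off-executor! (Executors/newVirtualThreadPerTaskExecutor))

(def thread-kind (agent nil))

;; The action records whether it ran on a Virtual Thread
(send-off thread-kind (fn [_] (.isVirtual (Thread/currentThread))))

;; Wait for the action to finish, then read the state
(await thread-kind)
@thread-kind
```

Note that await itself dispatches through send, so it still goes through the send executor unless that one has been swapped as well.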
If you want to retain more control, just use send-via, where the executor can be specified as a parameter:
;; Define an executor which just produces a new virtual thread for every task
(def unbounded-executor (Executors/newThreadPerTaskExecutor (thread-factory "unbounded-pool-")))
(send-via unbounded-executor a-counter dec)
This is all you need to transparently work with Agents under the new concurrency model. Clojure seems to be well prepared for the future! Futures...
A future represents a value that is going to be available at an indeterminate time in the future. It can be captured and passed around as you want. In Java, futures are represented by objects implementing the Future<V> interface from the java.util.concurrent package. The brief evolution of implementations of this interface can be traced through Java's standard library:
- Java 1.5 introduced
- Java 1.7 introduced
- As of Java 1.8 there is
Clojure contains a bunch of functions in its core library to work with futures. This is the most basic example that can demonstrate how to utilize futures in Clojure programs:
@(future (println "Before")
         (Thread/sleep 2000)
         (println "After 2000 ms"))
As we can see, Clojure futures are nice. Just dereference them, similarly to agents or atoms, with (deref a-future) or the shortcut @a-future. Dereferencing causes execution to block until the future's value is resolved and thus available. Unfortunately, that means that the whole OS thread is blocked.
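If blocking indefinitely is not acceptable, deref also has an arity that takes a timeout in milliseconds and a fallback value. The sketch below uses made-up names (slow-answer, early, final-value) and arbitrary timings.

```clojure
;; A future that needs about half a second to produce its value
(def slow-answer
  (future
    (Thread/sleep 500)
    42))

;; Wait at most 50 ms, then fall back to :pending instead of blocking further
(def early (deref slow-answer 50 :pending))

;; A plain deref blocks until the value is ready
(def final-value @slow-answer)
```

The calling thread is still blocked for up to the timeout, so this limits the wait rather than removing it.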
So what can we do to make it cheaper? Of course, Loom has our back covered with much cheaper Virtual Threads. The future macro uses the future-call function under the hood. This function references clojure.lang.Agent/soloExecutor. This means that replacing this executor, as we did for send-off above, is all we need to do.
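A minimal sketch of that idea follows, assuming clojure.core's set-agent-send-off-executor! as the way to swap soloExecutor. The thread-name prefix "clojure-future-" and the var name future-thread are made up for illustration.

```clojure
(import '(java.util.concurrent Executors))

;; Factory producing named Virtual Threads
(defn thread-factory [^String name]
  (-> (Thread/ofVirtual)
      (.name name 0)
      (.factory)))

;; future-call submits to soloExecutor, so swapping it affects futures too
(set-agent-send-off-executor!
  (Executors/newThreadPerTaskExecutor (thread-factory "clojure-future-")))

;; Capture which thread actually ran the body of the future
(def future-thread
  @(future
     (let [t (Thread/currentThread)]
       {:name (.getName t) :virtual? (.isVirtual t)})))
```

Inspecting future-thread afterwards should show a Virtual Thread carrying the configured name prefix.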
There is also the Promesa library, which contains constructs for dealing with futures that go way beyond the simplistic use of futures in the Clojure core library. Some functions from the Promesa library have arities that take an executor as a parameter and use that executor to schedule the computation. Passing the ThreadPerTaskExecutor executor mitigates the trouble mentioned under the Promesa execution model.
Introducing Structured Concurrency
Structured concurrency is a concurrent programming model that can be summed up in the following line:
When a flow of execution splits into multiple concurrent flows, they rejoin in the same code block
That means we have to be able to bind a thread's lifetime to a scope. Such scopes should naturally form parent-child relationships, and there have to be programming constructs around the hierarchy.
Let's examine this simplistic example:
(defn run-concurrently []
  (let [executor (Executors/newThreadPerTaskExecutor (thread-factory "perfectly-scoped-pool-"))]
    (try
      (.submit executor ^Callable #(identity 2000))
      (.submit executor ^Callable #(prn "Starting a long running operation"))
      (.submit executor ^Callable #(Thread/sleep 1000))
      (.submit executor ^Callable #(prn "Done."))
      ;; close waits for the submitted tasks and cleans up after them
      (finally (.close executor)))))
Here the scope is a function with a defined executor against which tasks are submitted. None of the Virtual Threads outlives the scope of the function. The reason is that the ThreadPerTaskExecutor.close method joins the threads and cleans up after them. The caller does not need to know anything about the level of concurrency of such a function. This also composes recursively (parent-child relationship), as other functions following the same structure can be called inside the body. It's deterministic and transparent.
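The same shape can also hand results back to the caller. The following is only a sketch under the same assumptions as run-concurrently: fetch-all is a hypothetical name, and it uses the JDK's built-in Executors/newVirtualThreadPerTaskExecutor rather than a custom factory.

```clojure
(import '(java.util.concurrent Callable Executors Future))

(defn fetch-all
  "Runs every task (a zero-argument function) on its own Virtual Thread and
  returns the results in order. The executor is closed before returning, so
  no task outlives this call. Hypothetical helper for illustration."
  [tasks]
  (let [executor (Executors/newVirtualThreadPerTaskExecutor)]
    (try
      ;; Fork: submit everything first so the tasks run concurrently
      (let [futures (mapv (fn [task] (.submit executor ^Callable task)) tasks)]
        ;; Join: collect the results inside the same scope
        (mapv (fn [^Future f] (.get f)) futures))
      (finally (.close executor)))))

(fetch-all [#(+ 1 2)
            #(do (Thread/sleep 100) :slow)
            #(str "a" "b")])
```

Because the fork and the join sit in the same function body, a caller sees an ordinary function that returns a vector, regardless of how many threads were used inside.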
The following recommendations are less relevant to Clojure developers, as most of us do not work at such a low level, but I'd like to mention them anyway.
InheritableThreadLocal. They are supported, but they defeat the cost advantages that come with Virtual Threads
- Avoid thread pools to control access to expensive resources. Use
Clojure itself contains very few instances of
Are they a problem? Probably not. My personal recommendation is to use a structured concurrency approach similar to run-concurrently above, so that Virtual Threads do not live long and unused resources are garbage collected as soon as possible.
At some point the JDK may also receive Scoped Variables, which could be a substitute for expensive ThreadLocals. But that is still a long way off.
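On the point about not using thread pools to guard expensive resources: the text above does not show which construct it recommends, so the following is purely an assumption on my side. One common option is java.util.concurrent.Semaphore, which bounds how many Virtual Threads enter a section at once. The name run-with-limit and all numbers are hypothetical.

```clojure
(import '(java.util.concurrent Callable Executors Semaphore))

(defn run-with-limit
  "Runs n tasks on Virtual Threads while a Semaphore lets at most `limit`
  of them into the guarded section at once. Returns the highest number of
  tasks observed inside the section at the same time."
  [n limit]
  (let [permits  (Semaphore. (int limit))
        inside   (atom 0)
        max-seen (atom 0)
        executor (Executors/newVirtualThreadPerTaskExecutor)]
    (try
      (dotimes [_ n]
        (.submit executor
                 ^Callable
                 (fn []
                   ;; Blocks cheaply until a permit is free
                   (.acquire permits)
                   (try
                     (swap! max-seen max (swap! inside inc))
                     ;; Stand-in for work on the expensive resource
                     (Thread/sleep 10)
                     (finally
                       (swap! inside dec)
                       (.release permits))))))
      ;; close waits for all submitted tasks before returning
      (finally (.close executor)))
    @max-seen))

(run-with-limit 20 3)
```

The number of threads stays unbounded here; only the guarded section is limited, which is the separation the recommendation above is after.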
- Virtual Threads are an important and extremely useful addition to the Java platform
- Clojure concurrency mechanisms can be set up to use Virtual Threads effectively today! No modifications to the Clojure codebase appear to be necessary
- Structured concurrency will become a more important mechanism for dealing with concurrent processes once Virtual Threads are released
- Not everything is set in stone. Some mechanisms may be revisited or adjusted
I hope this article sparked some intellectual curiosity and provided you with interesting information.