A Pattern For Component Based Program Architecture In Rust

tl;dr - I explore the component pattern and how I’ve gone about implementing it in rust, starting with the basic concept of the Component trait and going through to thread-per-component for parallel operation, and message-passing for communication. Skip to a full working example @ the rust-component-pattern-example repo.

UPDATE (07/23/2018)

After some great reddit feedback on some bits of the code that were confusing, I've added a section on how I addressed some of the issues along with committing some code to the example repo, please check it out!

I recently became a Rustacean (what people who use and like the Rust programming language call themselves) while working on an open source project of mine called postmgr, which serves as a management layer on top of Postfix.

It’s been a very long time since I’ve done low-level “systems” programming, but I’ve been enjoying my foray into rust. Most of my recent projects have been in Haskell or Golang, the latter of which is close but not quite a “systems” language, due to the fact that it packs a runtime – I would characterize Go as an excellent language for writing systems (especially networked ones), not necessarily a systems language. Granted, a systems language is almost certainly overkill for what I want to do (manage postfix); other similar projects like Mailu or Mailcow use Python and PHP (respectively), but I was really champing at the bit to get a chance to use rust in a project.

Love for rust aside, one of the best organizational patterns I’ve been introduced to and adopted for my programs is the high-level “Component” pattern – it’s basically table stakes for many Java (+/- Spring) driven projects, which initially left a bad taste in my mouth, but after seeing the pattern distilled by projects like stuartsierra/component, it’s become my standard approach once I’m familiar with a language and want to build something with it.

The component pattern is especially wonderful in languages with good support for interfaces – traits in rust, interfaces in Golang, typeclasses in Haskell. I think in Java-land I was put off by instances where I would see the pattern implemented with class hierarchies instead of the somewhat conceptually simpler interfaces/trait based approach.

In postmgr I’ve gotten to the point where I wanted to recreate this approach, which led to a bunch of questions that my as-of-now very new rustacean mind needed to grapple with. In particular the concurrent + parallel interaction between components was difficult for me to reason about/design, since in languages with runtimes (ex. Haskell, Golang), you’re somewhat allowed to pass around global (hopefully immutable) pointers very easily. This means it’s very easy to have a singleton “component container” somewhere that some functions can take and use to request the component they need to do work. In a language like Haskell, which runs techjobs.tokyo, my most recent project, immutability along with software transactional memory support makes concurrency and parallelism so easy that I didn’t really have to think about it too much. To be fair, the structure I was aiming for is a little bit different there: since the Haskell app is primarily an API server, I stuff everything into one “App” component, whereas postmgr has 2 main components – the Postfix component which manages postfix-related interactions, and the WebAdmin component which manages the control exerted from the bundled web app.

I posted about this design question once in a weekly rust thread but didn’t really get an answer so I took this as an opportunity to wade in myself – be ye warned, there might be much better solutions out there than what I get into in this post.

Writing the Component trait

The component trait is pretty simple – the most basic trait I could think of was the following:

#[derive(Debug)]
pub enum ComponentError {
    NotStarted,
    InvalidConfiguration
}

pub trait Component {
    fn get_name(&self) -> &str;
    fn start(&mut self) -> Result<(), ComponentError>;
    fn stop(&mut self) -> Result<(), ComponentError>;
}

While not strictly required, I’ve included get_name() as a basic way to differentiate/identify components in a sufficiently dynamic context. For example, if I’m making a status page of components and I have a list of trait objects (ex. a Vec<Box<dyn Component>>), I’d much rather call get_name() if I can get away with it without pattern matching on every type. This also enables having a structure like a HashMap<String, Vec<Box<dyn Component>>> for example.
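
To make the status-page idea a bit more concrete, here is a minimal sketch of listing components through trait objects. The Clock and Mailer types and the status_lines helper are hypothetical stand-ins, and the trait is trimmed down to get_name() for brevity:

```rust
// Trimmed version of the trait above: only the naming method is needed here.
pub trait Component {
    fn get_name(&self) -> &str;
}

// Hypothetical components, used only for illustration.
struct Clock;
struct Mailer;

impl Component for Clock {
    fn get_name(&self) -> &str { "clock" }
}

impl Component for Mailer {
    fn get_name(&self) -> &str { "mailer" }
}

// Builds one status line per component without knowing any concrete type.
fn status_lines(components: &[Box<dyn Component>]) -> Vec<String> {
    components
        .iter()
        .map(|c| format!("{}: registered", c.get_name()))
        .collect()
}
```

Trait objects are unsized, so they have to live behind a pointer such as Box before they can be stored in a Vec.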

As you can see, the Component trait basically only covers the lifecycle of this thing I’m calling a “component”. There’s one big question – if we’re managing lifecycles, does this mean components run independently? Well the short answer is “yes”, the longer answer needs a bit more explaining.

Component concurrency/parallelism

Concurrency and parallelism are distinct. I mention them together in this instance not because they’re interchangeable but because I’m dealing with them in the same way – I’m planning to introduce thread-based concurrency AND parallelism – by introducing multiple threads even when only one core is being used, and sprinkling a little cooperative multi-tasking where possible.

At a higher level, one of the very basic next questions one might think of is how start() should behave. Primarily, should start() for (all?) components block? While there are possibly some components that don’t need to keep running in the background, I think it’s reasonable to assume that the majority of components get started and keep running.

Since I’m baking the assumption that components start and keep running into the Component trait, let’s enshrine it by changing the signature of start(&mut self) to return rust’s primitive never type, !, which represents computations that never return.

pub trait Component {
    ...
    fn start(&mut self) -> !;
    ...
}
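
As a rough sketch of what implementing that signature might look like: the Ticker type and its out channel below are hypothetical, and the trait is trimmed to the two methods that matter here.

```rust
use std::sync::mpsc::Sender;
use std::thread;
use std::time::Duration;

pub trait Component {
    fn get_name(&self) -> &str;
    // `!` tells callers that this call never returns.
    fn start(&mut self) -> !;
}

// Hypothetical component that reports a tick count forever.
pub struct Ticker {
    pub ticks: u64,
    pub out: Sender<u64>,
}

impl Component for Ticker {
    fn get_name(&self) -> &str { "ticker" }

    fn start(&mut self) -> ! {
        // An endless `loop` without `break` has type `!`, so it satisfies the signature.
        loop {
            self.ticks += 1;
            // Send errors are ignored: there is nowhere to return to if the receiver is gone.
            let _ = self.out.send(self.ticks);
            thread::sleep(Duration::from_millis(1));
        }
    }
}
```

The compiler enforces the contract: a body that could fall through or return normally will not type-check against !.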

Running multiple components at the same time

Since rust does not impose a runtime, it does not support things like green threads or coroutines in the standard library, which would likely necessitate runtime support for system thread multiplexing, etc. Support for these patterns is provided in crates like tokio-rs or the more experimental coroutine-rs. Since Components block when start()ed, implementers are free to choose their preferred concurrency/parallelism organizational tool (ex. actors, regular threads, coroutines, whatever).

The simplest of these is just starting a thread (which in rust is the same as a system thread), which can be captured in a trait:

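use std::thread::JoinHandle;

// Sized is needed because JoinHandle<Self> holds the component by value
trait ThreadedComponent where Self: Component + Sized {
    // Implementations are free to declare the parameter as `mut self`
    fn spawn(self) -> Result<JoinHandle<Self>, ComponentError>;
}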

NOTE - spawn takes the component by value rather than by reference, since the spawned thread needs to own it; declaring the parameter as mut self in the implementation is likely necessary, just in case the thread that takes ownership needs to do things that modify the component itself – YMMV. You can avoid this by ensuring that the malleable internals of the component are protected with std::sync::Mutex<T>s.
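
Here is a small sketch of an implementation, assuming a start() that eventually returns (as in the first version of the trait) so that joining the thread has something to hand back. The Counter component is hypothetical:

```rust
use std::thread::{self, JoinHandle};

#[derive(Debug)]
pub enum ComponentError {
    NotStarted,
    InvalidConfiguration,
}

pub trait Component {
    fn get_name(&self) -> &str;
    fn start(&mut self) -> Result<(), ComponentError>;
    fn stop(&mut self) -> Result<(), ComponentError>;
}

pub trait ThreadedComponent: Component + Sized {
    fn spawn(self) -> Result<JoinHandle<Self>, ComponentError>;
}

// Hypothetical component that does a fixed amount of work and then returns.
pub struct Counter {
    pub count: u32,
}

impl Component for Counter {
    fn get_name(&self) -> &str { "counter" }
    fn start(&mut self) -> Result<(), ComponentError> {
        for _ in 0..3 {
            self.count += 1;
        }
        Ok(())
    }
    fn stop(&mut self) -> Result<(), ComponentError> { Ok(()) }
}

impl ThreadedComponent for Counter {
    // `mut self` is fine here because the method has a body.
    fn spawn(mut self) -> Result<JoinHandle<Self>, ComponentError> {
        Ok(thread::spawn(move || {
            // Errors from start() are ignored in this sketch.
            let _ = self.start();
            // Hand the component back to whoever joins the thread.
            self
        }))
    }
}
```

Since thread::spawn requires the closure and its return value to be Send + 'static, this only works for components that are themselves Send.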

This is all pretty straightforward on the face of it, but a subtle-but-pervasive set of issues has arisen – now that components are in different threads, how do they share state and/or request work of one another? There are lots of ways to share state, and the rust book has a good section on “fearless concurrency”, but it’s not necessarily clear which one of the approaches one should take.

As far as I could tell, the choices are actually limited more than you might expect – just passing around references to the components themselves didn’t quite work the first time I tried – IIRC the problem I ran into was that std::sync::Arc<T>s (“Atomically Reference Counted”) smart pointers require that their contents (the T in Arc<T>) implement the Send and Sync traits. Theoretically an object that lives on the heap, like a Box<T> should trivially pass this, but I’ve had difficulties getting Arc<Box<T>> type objects to be sendable across threads, when T did not implement Send and/or Sync. I’m a little fuzzy in my understanding here – one side of me says this makes sense given rust’s lifetimes/borrowing system, but on the other hand I’m pretty sure if I’m dealing only in pointers to an object on the heap I shouldn’t have to worry about the difficulty of getting a pointer from one thread to another.

Every attempt I made at trying to get what I expected should work didn’t, for one reason or another, or required T to be Send/Sync capable – I tried things like trying to pass an Arc<Mutex<T>> and Arc<Box<Mutex<T>>> but none of it seemed to work quite as I expected. There’s certainly more I should do here and as I become a more experienced rust developer I will likely revisit this, but for now there’s an easier (and simpler) paradigm out there – sharing memory by passing messages (covered in the rust book as well as being a pretty prominent philosophy of Golang).
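
For reference, here is a sketch of the shared-state route in the case where it does work: when T is Send, Arc<Mutex<T>> can be cloned into as many threads as you like. The usual way to trip the bound is a T that contains something not thread-safe, such as an Rc<T>. SharedState and bump_from_threads are hypothetical names:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical shared state; all of its fields are Send, so it is Send too.
#[derive(Default)]
struct SharedState {
    hits: u32,
}

// Spawns `workers` threads that each bump the shared counter once.
fn bump_from_threads(workers: usize) -> u32 {
    let state = Arc::new(Mutex::new(SharedState::default()));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own reference-counted pointer to the same state.
            let state = Arc::clone(&state);
            thread::spawn(move || {
                state.lock().expect("mutex poisoned").hits += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker panicked");
    }

    let hits = state.lock().expect("mutex poisoned").hits;
    hits
}
```

This is not what the rest of the post uses; it is only here to show where the Send requirement comes from.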

Communication between thread-driven-components - message passing

What I’m basically building at this point is a program-internal RPC system across multiple threads. Rust’s standard library does come with a module for thread-safe communication over channels, and it’s called std::sync::mpsc (multi-producer, single-consumer) – this would be a good time to browse it a little if you’re not familiar.
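
If you haven’t used the module before, this is roughly what the multi-producer, single-consumer shape looks like in isolation. The collect_ids function is a made-up example, not something from postmgr:

```rust
use std::sync::mpsc::channel;
use std::thread;

// Spawns `producers` threads that each send their id, then collects everything.
fn collect_ids(producers: u32) -> Vec<u32> {
    let (tx, rx) = channel();

    for id in 0..producers {
        // "Multiple producer": every thread owns its own clone of the Sender.
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(id).expect("receiver hung up");
        });
    }
    // Drop the original Sender so the receiver can see the channel close.
    drop(tx);

    // "Single consumer": iteration ends once every Sender has been dropped.
    let mut ids: Vec<u32> = rx.iter().collect();
    ids.sort();
    ids
}
```

Arrival order depends on thread scheduling, which is why the ids are sorted before being returned.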

To showcase this concept I’m going to use the somewhat common tradition of building some machinery to deliver the current time. Let’s start with the messages that you might send to a Clock component.

// These are general messages you might send to any Component
pub enum ComponentLifecycleRequest {
    Shutdown
}

// Clock specific requests, note the embedding of ComponentLifecycleRequests in the enum of requests that Clock components take
pub enum ClockRequest {
    Lifecycle(ComponentLifecycleRequest),
    GetCurrentTime
    // More complicated serialized function calls as request might look like:
    // ChangeCurrentOffset(ClockOffset)
}

Since it’s usually not enough to just send requests to other components, let’s devise a structure for actually being able to send back responses to requests. What we’ll do is embed the request along with a channel for sending a response in a structure called a RequestResponse<RQ>.

use std::any::Any;
use std::sync::mpsc::Sender;

#[derive(Debug)]
struct RequestResponse<RQ> {
    req: RQ,
    // Responses are type-erased; the receiver downcasts to the type it expects
    resp_chan: Sender<Option<Box<dyn Any + Send>>>
}

NOTE This abstraction can cover the eventuality of 0, 1 or more responses with the use of Option and how you use the Sender you give it. Ideally in actual use you would distinguish between these subcases at the type level (whether with the type of resp_chan, or RequestResponse itself), more specifically differentiating between one-time-use channels, streams, etc.
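
Here is a hedged sketch of a full round trip with this kind of envelope. The ask_for_time helper and the stand-in clock thread that answers with a fixed value are hypothetical; the point is only to show the response channel travelling with the request and the caller downcasting the type-erased reply:

```rust
use std::any::Any;
use std::sync::mpsc::{channel, Sender};
use std::thread;

pub enum ClockRequest {
    GetCurrentTime,
}

pub struct RequestResponse<RQ> {
    pub req: RQ,
    pub resp_chan: Sender<Option<Box<dyn Any + Send>>>,
}

// Hypothetical helper: sends one request to a clock thread and waits for the reply.
fn ask_for_time(fixed_time: u64) -> Option<u64> {
    let (clock_tx, clock_rx) = channel::<RequestResponse<ClockRequest>>();

    // Stand-in clock that answers a single request with a fixed value.
    let clock = thread::spawn(move || {
        if let Ok(envelope) = clock_rx.recv() {
            match envelope.req {
                ClockRequest::GetCurrentTime => {
                    let reply: Box<dyn Any + Send> = Box::new(fixed_time);
                    let _ = envelope.resp_chan.send(Some(reply));
                }
            }
        }
    });

    // The caller creates a fresh channel per request and ships the Sender along.
    let (resp_tx, resp_rx) = channel();
    clock_tx
        .send(RequestResponse { req: ClockRequest::GetCurrentTime, resp_chan: resp_tx })
        .ok()?;

    let reply = resp_rx.recv().ok()??;
    clock.join().ok()?;

    // The response is type-erased, so the caller downcasts to the type it expects.
    reply.downcast::<u64>().ok().map(|boxed| *boxed)
}
```

The downcast is the price of using Any: a mismatch between what the component sends and what the caller expects only shows up at runtime, which is one more reason to eventually move this distinction into the types.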

With this general idea of the request/response envelope in place, let’s add a trait that represents Components that can take and process simple request/response envelopes:

pub trait HandlesSimpleRequests<RQ> {
    // Takes &mut self so that handling a request can update the component's state
    fn handle_simple_req(&mut self, envelope: RequestResponse<RQ>) -> Result<(), Error>;
}

Now, all we have to do is implement this trait for any components that handle requests and we’ve got a decent generic interface for all components.

It’s a bit of a jump, but now that we have a simple structure that could work for message passing between components, let’s see what the loop looks like for dealing with them.

Message handling in a tight control loop

Message handling is the next concern – time to write a relatively simple control loop to use in spawn() for a thread:

use std::thread::{self, JoinHandle};

// Sketch: assumes the component exposes the receiving end of its command bus as `cmd_bus`
pub fn spawn<C, RQ>(mut c: C) -> Result<JoinHandle<C>, Error>
where
    C: Component + Send + HandlesSimpleRequests<RQ> + 'static,
{
    Ok(thread::spawn(move || {
        loop {
            match c.cmd_bus.recv() {
                Ok(envelope) => {
                    // Handler errors are ignored in this sketch
                    let _ = c.handle_simple_req(envelope);
                    thread::yield_now();
                },
                // Eventually some request will need to trigger a "break" to escape this endless control loop

                // panic might be a little bit extreme here, but then again, so is losing the command bus.
                Err(e) => panic!("clock command bus failed: {:?}", e)
            }
        }

        c
    }))
}

NOTE I included thread::yield_now() after every handled message to introduce a little bit of hybrid cooperative multi-tasking. This would likely help in the multi-component (which means multi-thread) single core case, but I’m not 1000% sure it’s a great idea.
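
For a version of the loop that actually terminates, here is a self-contained sketch with an explicit Shutdown message. The Counter component and the flat Request enum are hypothetical simplifications (no response channel) so that the control flow stays visible:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread::{self, JoinHandle};

pub enum Request {
    Shutdown,
    Increment,
}

// Hypothetical component that counts the requests it has handled.
pub struct Counter {
    pub handled: u32,
    cmd_bus: Receiver<Request>,
}

impl Counter {
    // Returns the component together with the Sender used to reach it.
    pub fn new() -> (Counter, Sender<Request>) {
        let (tx, rx) = channel();
        (Counter { handled: 0, cmd_bus: rx }, tx)
    }
}

pub fn spawn(mut c: Counter) -> JoinHandle<Counter> {
    thread::spawn(move || {
        loop {
            match c.cmd_bus.recv() {
                Ok(Request::Increment) => {
                    c.handled += 1;
                    // Give other threads a chance to run between messages.
                    thread::yield_now();
                }
                // Stop on an explicit shutdown, or when every Sender is gone.
                Ok(Request::Shutdown) | Err(_) => break,
            }
        }
        // Hand the component back to whoever joins the thread.
        c
    })
}
```

Because recv() blocks while the channel is empty, the thread is already parked whenever there is nothing to do; yield_now() only matters when messages arrive back to back.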

Note that there’s been a bit of a design decision here – I could have made start() be the threaded implementation; since its return type is !, it could spawn the necessary threads and immediately go into the control loop. I think this way is better because it allows for more fine-grained control, given the return of a JoinHandle.

This code is good, but it’s a little inflexible because it requires your component to be Sendable. Check out the completed code to see what this looks like without that requirement.

Some examples of other extensions building on Component

Here’s an example of extending a component to identify it as configurable:

Configurable<C> components

pub trait Configurable<C>: Component {
    fn update_config(&mut self, updated: Option<C>) -> Result<(), Error>;
    fn get_config(&self) -> &C;
}

This is obviously pretty simple – you could also choose to internalize the type by using rust’s associated type feature, but I personally like having to express the configuration that something takes in the type itself.
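
For comparison, here is a sketch of what the associated-type variant might look like. The trait name ConfigurableAssoc, the ClockCfg fields, and the "None resets to default" behaviour are all assumptions made for the example, and the Component supertrait and Result return are dropped to keep it short:

```rust
// Same idea as Configurable<C>, with the config type chosen by the implementer.
pub trait ConfigurableAssoc {
    type Config;

    fn update_config(&mut self, updated: Option<Self::Config>);
    fn get_config(&self) -> &Self::Config;
}

// Hypothetical configuration for a clock component.
#[derive(Debug, Default, PartialEq)]
pub struct ClockCfg {
    pub utc_offset_hours: i8,
}

#[derive(Default)]
pub struct Clock {
    cfg: ClockCfg,
}

impl ConfigurableAssoc for Clock {
    type Config = ClockCfg;

    fn update_config(&mut self, updated: Option<ClockCfg>) {
        // Assumption for this sketch: None resets to the default configuration.
        self.cfg = updated.unwrap_or_default();
    }

    fn get_config(&self) -> &ClockCfg {
        &self.cfg
    }
}
```

The practical difference is that a type can implement Configurable<A> and Configurable<B> side by side, while an associated type pins each component to exactly one configuration type.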

FileConfigurable<C> components

Here’s a further specialization of a Configurable<C> component that specifically introduces the concept of a component that is configurable by one or more files:

use std::path::Path;

pub trait FileConfigurable<C>: Configurable<C> {
    // it's probably a good idea to ensure that `dir` is an absolute path if you're gonna do stuff with paths....
    // (Path is unsized, so it is passed by reference)
    fn generate_config_in_dir(&self, dir: Option<&Path>) -> Result<(), Error>;
    fn install_config(&self) -> Result<(), Error>;
    fn config_dir_path(&self) -> String;
}

ChildProcessManager components

While working on postmgr I found myself writing this trait to enshrine what a component that is managing a child process looks like:

pub trait ChildProcessManager where Self: Component {
    fn spawn_child_process(&mut self) -> Result<&Option<ChildProcess>, Error>;
    fn get_pid(&self) -> Option<u32>;
    fn get_command(&self) -> &Option<Command>;
    fn get_process(&self) -> &Option<ChildProcess>;
}

Pretty simple stuff here; here’s what this looked like for postfix (this is straight from the postmgr codebase):

impl ChildProcessManager for Postfix {
    fn spawn_child_process(&mut self) -> Result<&Option<ChildProcess>, Error> {
        debug!("spawning postfix process...");
        self.command = Some(Command::new(&self.cfg.bin_path));

        let child = self.command.as_mut().unwrap().arg("start").arg("-v").spawn()?;
        self.pid = Some(child.id());
        self.process = Some(child);

        // Setup signal handlers with a channel
        let postfix_tx = self.get_cmd_bus_tx()?;
        simple_signal::set_handler(&[Signal::Int, Signal::Term], move |_signals| {
            info!("Received SIGINT/SIGTERM, forwarding to all components");
            postfix_tx.send(ReqResp{
                msg: PostfixCmd::Lifecycle(ComponentCmd::Shutdown),
                response_chan: None
            }).expect("failed to send shutdown command to postfix component");
        });

        Ok(&self.process)
    }

    fn get_pid(&self) -> Option<u32> { self.pid }
    fn get_command(&self) -> &Option<Command> { &self.command }
    fn get_process(&self) -> &Option<ChildProcess> { &self.process }
}

BONUS: Signal handling

If you looked closely, you saw that I set up handlers for SIGINT/SIGTERM inside the spawn_child_process function. One of the things I discovered when implementing this strategy in postmgr was that rust’s signal managing features are not quite there yet. There are some excellent projects like simple_signal and ctrlc that do their best to bridge the gap, but the wider signal handling story for rust is still in RFC (at the time this post was written).

One thing this architecture offers is an excellent chance for every component to handle shutting itself down and doing any cleanup it needs to do in response to a ComponentLifecycleRequest. Trying to implement this without the message passing style leaves you in a weird spot due to the confluence of borrowing, closures and lifetimes in rust – the closure you pass to a library like simple_signal needs to (likely mutably) stop a bunch of components that have likely long since moved out of main() or need to be used later.

However, thanks to this additional RPC mechanism, you’ve got a relatively clean way of telling these threaded components to stop – just clone a component’s command bus channel’s Sender and send a ComponentLifecycleRequest from the relevant signal function handler.
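
The shape of that trick can be sketched with the standard library alone. In the example below the handler closure is invoked by hand as a stand-in for the OS delivering a signal; in real code it would be registered with a signal crate instead. make_shutdown_handler and run_until_signalled are hypothetical names:

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

pub enum ComponentLifecycleRequest {
    Shutdown,
}

// Builds the closure a signal library would call; it owns its own Sender clone,
// so it does not borrow the component or anything living in main().
fn make_shutdown_handler(tx: &Sender<ComponentLifecycleRequest>) -> impl Fn() + Send + 'static {
    let tx = tx.clone();
    move || {
        // A failed send just means the component is already gone.
        let _ = tx.send(ComponentLifecycleRequest::Shutdown);
    }
}

// Runs a component thread until the handler fires; returns true on a clean shutdown.
fn run_until_signalled() -> bool {
    let (tx, rx) = channel();
    let handler = make_shutdown_handler(&tx);

    let component = thread::spawn(move || match rx.recv() {
        // Any cleanup the component needs would go here.
        Ok(ComponentLifecycleRequest::Shutdown) => true,
        Err(_) => false,
    });

    // Stand-in for the OS delivering SIGINT/SIGTERM.
    handler();
    component.join().unwrap_or(false)
}
```

The closure only captures a Sender, which is cheap to clone and safe to move, so the borrowing and lifetime problems described above never come up.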

Putting it all together - an example repo

Most of what was discussed in this post are the broad strokes of what you’d need to do to create this architecture. To see all of this working together, check out the rust-component-pattern-example that I’ve put together. The repo explores a fictional Clock component and most if not all of the techniques discussed in this post.

ADDENDUM: Approach/code clarification

/u/diwik on reddit pointed out that the Component trait seemed weird, given that start() actually borrows mutably for life and never returns. /u/diwik absolutely has a point – it is weird, but still feels semantically correct to me – it feels like the only way to “start” a component without prescribing concurrency/parallelism primitives. If I have start() return a Result<JoinHandle<T>> then it seems like I will have committed somewhat early to threads, where someone might come along and want to run a component as part of some other framework (actix, let’s say).

An important bit I left out is that the communication mechanism with the component must be known/set up before starting it:

impl Clock {
    pub fn new() -> Clock {
        let (cmd_bus_tx, cmd_bus_rx) = channel();
        let clock_cfg = ClockCfg::new();
        Clock {
            clock_cfg,
            cmd_bus_tx,
            cmd_bus_rx
        }
    }
}

This is the only way start could run, never return, and still do useful work. Looking at the working example code, I think I can make this a bit clearer by doing the following:

  1. Moving the message processing loop code I had previously in spawn to start since that’s where it was supposed to be to begin with (there was a TODO)
  2. Choosing the semantically less meaningful Result<(), ComponentError> (IMHO) as the return of start rather than !.
  3. Finding a way to make it clearer that a Component HandlesSimpleRequests<M>, to at least leave an inkling of how you’re supposed to contact the thing once it takes off

As for #1, here’s the code in spawn as it stands now:

pub fn spawn<F: 'static>(constructor: F) -> Result<(JoinHandle<Result<(), ComponentError>>, Sender<ComponentRequest<ClockRequest>>), ComponentError>
where
    F: Fn() -> Clock + Send,
{

    let (tx, rx) = channel();
    let cloned_tx = tx.clone();

    let handle = thread::spawn(move || {
        let mut c = constructor();
        c.cmd_bus_tx = cloned_tx;
        c.cmd_bus_rx = rx;

        // Set up SIGINT handling for this clock component in particular
        let cloned_tx = c.cmd_bus_tx.clone();
        set_signal_handler(&[Signal::Int, Signal::Term], move |_signals| {
            println!("clock component received SIGINT/SIGTERM!");
            cloned_tx.send(ComponentRequest::Lifecycle(ComponentLifecycleRequest::Shutdown)).expect("failed to send shutdown");
        });

        // Control loop
        loop {
            let msg = c.recv_request()?;

            match msg {
                // Lifecycle handling
                ComponentRequest::Lifecycle(req) => {
                    match req {
                        ComponentLifecycleRequest::Shutdown => { break }
                    }
                },

                // Simple request-response handling
                ComponentRequest::Operation(req) => {
                    // Maybe we should do something with the result below...
                    let _ = c.handle_simple_req(req)?;
                    thread::yield_now();
                }
            }
        }

        Ok(())
    });

    Ok((handle, tx))
}

The reason this code can’t just wholesale be the start method is its prescription of a concurrency/parallelism mechanism (std::thread). I can do better by moving the control loop code to start, however.

#2 is more of a concession, as I think ! does more for the user in signaling immediately that the method will block infinitely, but then again it’s probably not that big a deal, especially if I want to do something like return an exit code.

As for #3, this is to make it a little clearer that Components must be able to handle simple requests of some message type M (hence HandlesSimpleRequests<M>), which should make it a little more clear how you have to interact with the component. I got this working by using rust’s associated types more heavily:

pub trait HandlesSimpleRequests {
    type Request;

    fn recv_request(&self) -> Result<ComponentRequest<Self::Request>, ComponentError>;
    fn get_request_chan(&self) -> Result<Sender<ComponentRequest<Self::Request>>, ComponentError>;
    fn handle_simple_req(&mut self, envelope: RequestResponse<Self::Request>) -> Result<(), ComponentError>;
}

And making sure HandlesSimpleRequests is a supertrait of the Component trait:

pub trait Component: HandlesSimpleRequests {
    fn get_name(&self) -> &str;
    fn start(&mut self) -> Result<(), ComponentError>;
    fn stop(&self) -> Result<(), ComponentError>;
}
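
To show how the associated type plays out for an implementer, here is a trimmed-down sketch of a Clock implementing HandlesSimpleRequests. Several things are assumptions made for brevity: the response channel carries a plain u64 rather than a boxed Any, the ChannelClosed error variant is invented for the example, and Clock::new takes a fixed now value instead of reading a real clock:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

#[derive(Debug)]
pub enum ComponentError {
    ChannelClosed, // hypothetical variant, not part of the enum shown earlier
}

pub enum ComponentLifecycleRequest {
    Shutdown,
}

pub enum ComponentRequest<RQ> {
    Lifecycle(ComponentLifecycleRequest),
    Operation(RequestResponse<RQ>),
}

pub struct RequestResponse<RQ> {
    pub req: RQ,
    pub resp_chan: Sender<u64>, // simplified: a concrete response type
}

pub trait HandlesSimpleRequests {
    type Request;

    fn recv_request(&self) -> Result<ComponentRequest<Self::Request>, ComponentError>;
    fn get_request_chan(&self) -> Result<Sender<ComponentRequest<Self::Request>>, ComponentError>;
    fn handle_simple_req(&mut self, envelope: RequestResponse<Self::Request>) -> Result<(), ComponentError>;
}

pub enum ClockRequest {
    GetCurrentTime,
}

pub struct Clock {
    now: u64,
    cmd_bus_tx: Sender<ComponentRequest<ClockRequest>>,
    cmd_bus_rx: Receiver<ComponentRequest<ClockRequest>>,
}

impl Clock {
    pub fn new(now: u64) -> Clock {
        let (cmd_bus_tx, cmd_bus_rx) = channel();
        Clock { now, cmd_bus_tx, cmd_bus_rx }
    }
}

impl HandlesSimpleRequests for Clock {
    // The component, not the caller, decides which messages it understands.
    type Request = ClockRequest;

    fn recv_request(&self) -> Result<ComponentRequest<ClockRequest>, ComponentError> {
        self.cmd_bus_rx.recv().map_err(|_| ComponentError::ChannelClosed)
    }

    fn get_request_chan(&self) -> Result<Sender<ComponentRequest<ClockRequest>>, ComponentError> {
        Ok(self.cmd_bus_tx.clone())
    }

    fn handle_simple_req(&mut self, envelope: RequestResponse<ClockRequest>) -> Result<(), ComponentError> {
        match envelope.req {
            ClockRequest::GetCurrentTime => envelope
                .resp_chan
                .send(self.now)
                .map_err(|_| ComponentError::ChannelClosed),
        }
    }
}
```

With this in place, anyone holding a Clock can call get_request_chan() before the component is started and keep the resulting Sender as their way of reaching it afterwards.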

I’ve updated the code in the example repo to reflect this – click here to see the commit directly.

Wrap-up

This was really fun for me to work through – I learned a lot more about rust on the way, so I wanted to share what I arrived at. A confluence of rust features – concurrency, traits, generics – has come together to create a pattern that I think expresses very well the intent of the code and dictates a good/principled structure.

While I didn’t go as far as to make this a library, since I’m not 1000% sure I’ve created an architecture/API surface worthy of sharing and replicating, I’m happy to leave it for people to gawk at and criticize online. I’d love to hear any suggestions on how this could be improved or done better, feel free to reach out.

UPDATE As always, thanks to folks for asking questions and giving feedback on reddit!