SharedMessenger is a message-oriented framework for performing long-running tasks (lasting hours or days) in a way that allows the program to be shut down, even abnormally, and then resumed. It is suitable for tasks that can be segmented into smaller sub-tasks that need no ongoing communication with one another (much as a parallel job is split into independent pieces).
It also includes provisions for managing the use of external tools so that they are not overused. A long-running task may need to query a search engine or website, or access a database, a very large number of times, and the program should use these shared resources in a non-disruptive way.
Message — A block of data that is the input or output of a sub-task. It's the main type of data that's persisted when the program isn't running.
Worker — A piece of code that performs a single type of sub-task. When it runs, it consumes a single message, does some work on it, and typically outputs one or more messages (with a message-type indicating they should be sent to other workers). A worker may need data from an external resource to do its job.
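The relationship between messages and workers can be sketched as follows. This is only an illustration, not SharedMessenger's actual API: the `Message` fields, the `worker` decorator, the `WORKERS` registry, and the `run` function are all hypothetical names, and the queue here lives in memory rather than being persisted.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Message:
    msg_type: str   # selects which worker receives the message
    payload: str    # the input (or output) data of a sub-task


# A worker consumes one message and returns zero or more new messages.
Worker = Callable[[Message], List[Message]]

WORKERS: Dict[str, Worker] = {}   # assumed: one worker per message type


def worker(msg_type: str):
    """Decorator that registers a function as the worker for msg_type."""
    def register(fn: Worker) -> Worker:
        WORKERS[msg_type] = fn
        return fn
    return register


@worker("split")
def split_text(msg: Message) -> List[Message]:
    # One input message fans out into one "count" message per word.
    return [Message("count", word) for word in msg.payload.split()]


@worker("count")
def count_letters(msg: Message) -> List[Message]:
    return [Message("result", f"{msg.payload}={len(msg.payload)}")]


def run(initial: Message) -> List[Message]:
    """Process messages until none remain; return those with no worker."""
    queue = deque([initial])
    finished = []
    while queue:
        msg = queue.popleft()
        handler = WORKERS.get(msg.msg_type)
        if handler is None:
            finished.append(msg)      # nothing consumes it: final output
        else:
            queue.extend(handler(msg))
    return finished
```

Calling `run(Message("split", "ab cde"))` routes the text through both workers and returns the two `result` messages, which no worker consumes.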
Resource — An external tool that performs useful work for us. Often it runs slowly, and it is something that we don't want to overuse or that we only want to use at night (typically because it's shared with other users).
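One way to keep a resource from being overused is to gate every call behind a minimum interval and an allowed time-of-day window. The sketch below assumes exactly that policy. `ThrottledResource` and `try_acquire` are hypothetical names, and the caller passes in the current time so the behaviour is easy to check.

```python
from datetime import datetime, time, timedelta
from typing import Optional


class ThrottledResource:
    """Gate for a shared external tool (hypothetical helper)."""

    def __init__(self, min_interval: timedelta,
                 window_start: time = time(0, 0),
                 window_end: time = time(23, 59, 59)):
        self.min_interval = min_interval      # shortest gap between uses
        self.window_start = window_start      # earliest allowed time of day
        self.window_end = window_end          # latest allowed time of day
        self.last_used: Optional[datetime] = None

    def _in_window(self, now: datetime) -> bool:
        t = now.time()
        if self.window_start <= self.window_end:
            return self.window_start <= t <= self.window_end
        # The window wraps past midnight, e.g. 22:00 to 06:00.
        return t >= self.window_start or t <= self.window_end

    def try_acquire(self, now: datetime) -> bool:
        """Return True and record the use if the resource may be used now."""
        if not self._in_window(now):
            return False
        if self.last_used is not None and now - self.last_used < self.min_interval:
            return False
        self.last_used = now
        return True
```

A worker that needs the resource would call `try_acquire(datetime.now())` and, on `False`, leave its message in the queue to be retried later rather than blocking.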
Message filter — A piece of code (specific to a message type) that decides whether to keep a message, and does so quickly (e.g. in seconds or minutes, not hours). Typically it's used to remove duplicate messages so that work isn't performed twice. It runs as soon as a new message is created, so that the queue doesn't have to store excess data. Unlike workers, multiple filters can be attached to a given message type, and all of them must say "keep" for the message to be kept.
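The "every filter must say keep" rule can be sketched as below. The tuple-based `Message`, the `FILTERS` registry, and the `accept` function are assumptions made for illustration. The duplicate check is held in memory here, whereas a real implementation would need it to survive restarts.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Set, Tuple

Message = Tuple[str, str]            # (message type, payload); simplified
Filter = Callable[[Message], bool]   # True means "keep"

# Several filters may be attached to the same message type.
FILTERS: DefaultDict[str, List[Filter]] = defaultdict(list)


def make_dedupe_filter() -> Filter:
    """Build a filter that keeps only the first copy of each payload."""
    seen: Set[str] = set()

    def dedupe(msg: Message) -> bool:
        if msg[1] in seen:
            return False
        seen.add(msg[1])
        return True
    return dedupe


def not_empty(msg: Message) -> bool:
    return bool(msg[1].strip())


# Cheap check first, so rejected messages never reach the "seen" set.
FILTERS["fetch"].append(not_empty)
FILTERS["fetch"].append(make_dedupe_filter())


def accept(msg: Message) -> bool:
    """Run when a message is created; every filter must vote to keep it."""
    return all(f(msg) for f in FILTERS[msg[0]])
```

Because `all()` stops at the first `False`, filter order matters: putting inexpensive filters first avoids running the costlier ones on messages that will be dropped anyway.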
Workgroup — Something that identifies several sub-tasks as being part of a larger task (e.g. identifies a message as originating from a common parent message). Workgroups are purely optional, because messages often don't care whether they came from a common parent. A message can also belong to multiple (nested) workgroups.
Once all messages in a workgroup are processed, a workgroup handler (in effect, a destructor) can perform some final processing on the output data, for example to summarize the data or to tell the user that a larger task has been completed.
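A minimal sketch of how a completion handler might be triggered, assuming each workgroup simply counts its unprocessed messages and that counts propagate to enclosing (nested) workgroups. The `Workgroup` class and its method names are hypothetical.

```python
from typing import Callable, Iterator, List, Optional


class Workgroup:
    """Tracks outstanding messages for one larger task (hypothetical API)."""

    def __init__(self, name: str, parent: Optional["Workgroup"] = None,
                 on_complete: Optional[Callable[["Workgroup"], None]] = None):
        self.name = name
        self.parent = parent
        self.on_complete = on_complete
        self.pending = 0                 # unprocessed messages, nested ones included
        self.results: List[str] = []     # output data seen by the handler

    def _chain(self) -> Iterator["Workgroup"]:
        # This workgroup first, then each enclosing workgroup up to the root.
        group: Optional["Workgroup"] = self
        while group is not None:
            yield group
            group = group.parent

    def message_created(self) -> None:
        for group in self._chain():
            group.pending += 1

    def message_processed(self, output: str) -> None:
        for group in self._chain():
            group.results.append(output)
            group.pending -= 1
            if group.pending == 0 and group.on_complete:
                group.on_complete(group)   # the "destructor" step
```

With this counting scheme, a worker's output messages must be registered with `message_created()` before the message it consumed is marked processed. Otherwise the count could reach zero, and the handler fire, while work is still outstanding.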
The only two things that contain persistent data are messages and workgroups. So that global data storage is always available, a default (root) workgroup is created as the parent of all messages.
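To make the persistence model concrete, here is one possible on-disk layout using Python's built-in `sqlite3` module. The storage backend, table names, and columns are all assumptions; the text does not specify how SharedMessenger actually stores its data. A message is deleted in the same transaction that stores its outputs, so a crash during processing leaves the message in place to be retried after a restart.

```python
import sqlite3

ROOT = 1   # id of the default (root) workgroup


def open_store(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS workgroups (
            id INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES workgroups(id),
            data TEXT NOT NULL DEFAULT ''
        );
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY,
            workgroup_id INTEGER NOT NULL REFERENCES workgroups(id),
            msg_type TEXT NOT NULL,
            payload TEXT NOT NULL
        );
        -- Row 1 is the root workgroup, the ancestor of everything else.
        INSERT OR IGNORE INTO workgroups (id, parent_id) VALUES (1, NULL);
    """)
    conn.commit()
    return conn


def enqueue(conn, msg_type, payload, workgroup_id=ROOT):
    with conn:   # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (workgroup_id, msg_type, payload) VALUES (?, ?, ?)",
            (workgroup_id, msg_type, payload))


def process_next(conn, handle) -> bool:
    """Handle the oldest message; return False when the queue is empty."""
    row = conn.execute(
        "SELECT id, workgroup_id, msg_type, payload FROM messages "
        "ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return False
    msg_id, group_id, msg_type, payload = row
    outputs = handle(msg_type, payload)   # list of (msg_type, payload) pairs
    with conn:   # delete the input and store the outputs atomically
        conn.execute("DELETE FROM messages WHERE id = ?", (msg_id,))
        conn.executemany(
            "INSERT INTO messages (workgroup_id, msg_type, payload) VALUES (?, ?, ?)",
            [(group_id, t, p) for t, p in outputs])
    return True
```

The trade-off in this design is at-least-once processing: a sub-task interrupted part-way through will run again on resume, so workers should tolerate being repeated.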
Filters can use workgroups to remove duplicate sub-tasks within a single larger task while still allowing a sub-task to be worked on again if the parent task is intentionally repeated later. (Yes, workgroups are intended to be an analogue of objects.)
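Scoping duplicate detection to a workgroup might look like the sketch below, which keeps a separate "seen" set per workgroup. `WorkgroupDedupe` and the string workgroup ids are hypothetical, and the state is held in memory purely for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class WorkgroupDedupe:
    """Drops repeats inside one workgroup only (hypothetical helper)."""

    def __init__(self) -> None:
        # One independent "seen" set per workgroup id.
        self._seen: DefaultDict[str, Set[str]] = defaultdict(set)

    def keep(self, workgroup_id: str, payload: str) -> bool:
        seen = self._seen[workgroup_id]
        if payload in seen:
            return False
        seen.add(payload)
        return True

    def close(self, workgroup_id: str) -> None:
        """Forget a finished workgroup so its state can be reclaimed."""
        self._seen.pop(workgroup_id, None)
```

Here the same payload is rejected the second time it appears within one workgroup, but is accepted again under a different workgroup id, which is what lets a repeated parent task redo its sub-tasks.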
The main message loop:
Parallelizable. Because work is broken out into separate sub-tasks, and each sub-task is performed independently of the others (with no ongoing communication between them), the framework can be run on a networked cluster, on a multiprocessor computer, or in a multithreaded environment (which may be useful if we spend large amounts of time waiting for external resources to do work for us).
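For the multithreaded case, a batch of independent messages could be handed to a thread pool from Python's standard `concurrent.futures` module. This is a sketch under the assumption that messages are plain strings and that a worker is any function returning a list of output messages. `process_batch` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def process_batch(messages: List[str],
                  work: Callable[[str], List[str]],
                  max_workers: int = 4) -> List[str]:
    """Run one worker call per message in parallel; flatten the outputs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order even if calls finish out of order.
        per_message = list(pool.map(work, messages))
    return [out for outputs in per_message for out in outputs]
```

Threads suit sub-tasks that mostly wait on external resources. For CPU-bound sub-tasks, a process pool or separate machines would be the more natural fit.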