Unlike the rest of the curation documentation, this page is written from a purely technical perspective and is intended for a developer audience familiar with both ReDBox/Mint and the Fascinator platform on which they are built. For this reason there is a lot less 'hand holding', and certain assumptions are made about system knowledge and terminology without linking off to additional documentation. Having said that, if you are new to developing on this application and require any clarification, just ask on the mailing list. The brevity here is not an attempt to be dismissive of newcomers, just an attempt to avoid over-explanation for an audience of developers.

The Way Things Were

The traditional Fascinator tool chain is implemented as a series of queue consumers in the Core library with an intrinsic knowledge of each other and a hardcoded framework for routing work based on the type of objects being ingested. It was also trying to solve a very different problem from the one ReDBox and Mint are presented with. To summarise the setup:
In earlier ReDBox work we had already run into issues with this tool chain when implementing VITAL integration. We wanted our workflow save events to trigger a trip through the tool chain (to get new RIF-CS templates) and then send our VITAL subscriber a message. However, the hardcoded messages didn't contain the specific data we needed, and there wasn't an easy method of detecting that the tool chain had completed so that we could then send the message ourselves. The end result was a very light, hacked-together version of the tool chain running at a couple of points in the ReDBox front-end. It wasn't particularly elegant, but it worked.

Design Groundwork

The new solution we thrashed around and put forward in v1.2 was to start by adding a new queue consumer to the Fascinator core, called the TransactionManagerQueueConsumer. Its order types are declared as:

    public static enum OrderType {INDEXER, MESSAGE, SUBSCRIBER, TRANSFORMER}

The exact behaviour of each is outlined below, and they are all JSON objects:
In all of the cases above you will note the careful use of the phrase "the indicated object", as opposed to "this object". This is because each order could (in theory) relate to a different object. This is at odds with the typical thinking of the tool chain, where one object is considered to pass 'through' the tool chain, somewhat like you would imagine a physical object passing through a factory. The new setup, however, allows for the possibility that a

At the moment, the Curation Manager (next section) in ReDBox does not make use of this functionality, but when designing a distributed curation process across two systems and many objects we wanted to allow for this flexibility. Developers should consider this option carefully in practice, since you want to balance flexibility with code intuitiveness and ease of maintenance. Sending a message relating to Object A into a queue consumer and finding that it takes action against Object B is non-intuitive in the context of historic Fascinator behaviour.

Curation Manager

After perusing the rest of the curation documentation you would no doubt be aware that the primary building block of this whole process has been dubbed the '

The most important factor in both systems' design, however, is that after each discrete event has been processed the entire system must be ready for a brand-new event with a possibly different context. So we don't have things sitting around in memory waiting for follow-up events in the curation process; it all must be stored for later access. If you are ever planning on modifying this class, keep that aspect of the design in mind.

There isn't actually a lot more to say about the design at this point. The general documentation covers the idea of the

Finer Details

Just a couple of finer points you might note in the code.
These details aren't included in the more general documentation since users don't really need to know about them.

Relationship Authority: Each relationship found in the metadata will either be stored with a flag of 'authority' = 'true' or the flag will be absent (it isn't typically stored as 'false', but that would effectively be the same thing). The flag should only be set on one side of the relationship, and this is typically ReDBox (although Mint Parties consider themselves to be the authority on their Group relations). The object that is considered the authority (the parent, for lack of a better term) is the one where the 'curation-request' tasks should come from, and the non-authoritative record (the 'child') should only send 'curation-query' requests. This is designed primarily to avoid infinite loops on events, since the 'curation-query' is a very lightweight task that does not propagate out through the network like a 'curation-request' does.

A much clearer demonstration is the 'publish' task, where only parent records will send out 'publish' tasks to their children after receiving one themselves, avoiding a really basic infinite loop that would occur if children sent a 'publish' task straight back to the parent. Instead, the child records should only ever send 'curation-query' tasks to their parents, which basically say "I need to know your curation details, and if they change please let me know, but don't take any further actions in response to this message".

The typical setup would be for a ReDBox Collection to be the authority over relationships built from its form data, such as a Mint Party record listed as the curator.
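The authority rule described above can be illustrated with a minimal sketch. It does not come from the ReDBox or Mint code base: the `AuthoritySketch` class, the `Relationship` type, and both method names are hypothetical, and the only things taken from this page are the 'authority' flag semantics and the task names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in ReDBox/Mint.
class AuthoritySketch {

    /** Stand-in for one relationship found in a record's metadata. */
    static final class Relationship {
        final String relatedId;
        final boolean authority;

        Relationship(String relatedId, Boolean authorityFlag) {
            this.relatedId = relatedId;
            // An absent flag (null) is treated the same as 'false'
            this.authority = Boolean.TRUE.equals(authorityFlag);
        }
    }

    /** Authorities send 'curation-request'; everyone else only queries. */
    static String curationTaskFor(Relationship rel) {
        return rel.authority ? "curation-request" : "curation-query";
    }

    /** A received 'publish' is only passed on to relations we are the authority over. */
    static List<String> publishTargets(List<Relationship> relations) {
        List<String> targets = new ArrayList<>();
        for (Relationship rel : relations) {
            if (rel.authority) {
                targets.add(rel.relatedId);
            }
        }
        return targets;
    }
}
```

Because a child never appears in `publishTargets` for its own parent, a 'publish' task cannot bounce straight back, which is the loop the flag is meant to prevent.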
The Mint Party will never try to publish or curate the ReDBox collection since it is not the authority, but it will try to do so for the associated Mint Group (the creator's faculty, for example) over which it does have an authority relationship, creating a network of linked data where messages always flow outwards from the Collection.

Additional Reharvest Tasks: The general documentation maps out the basic flow of curation tasks, but there are some additional minor steps that occur in between many events that weren't noted there. These are simply operational chores related to ensuring that the metadata templates and the Solr Index are kept up-to-date throughout the process and that appropriate entries are added to the audit log for an object. So don't be surprised if you are looking at the code and find extra calls to the harvester, indexer or subscriber.

Future Works

Some thoughts/ideas/notes on possible future expansions and work in this area. This includes ideas that may never go anywhere, areas where more polish could be used, and danger areas that might want some more robust error handling in future:
Some Utility Methods

In writing the

Creating a basic order:

    private JsonObject createNewOrder(JsonSimple response, String type) {
        // -1 appends a new entry to the list found at "orders"
        JsonObject order = response.writeObject("orders", -1);
        order.put("type", type);
        return order;
    }

The '-1' parameter is important, since Fascinator's JSON Library will append to a list it finds at that path. It will create the list first if it isn't there already.

Creating a specific type of order:

    private JsonObject newMessage(JsonSimple response, String target) {
        JsonObject order = createNewOrder(response,
                TransactionManagerQueueConsumer.OrderType.MESSAGE.toString());
        order.put("target", target);
        order.put("message", new JsonObject());
        return order;
    }
Obviously this builds on the existence of the first method, and there are a number of these for different types of orders. The example shows messaging since it relates to 'tasks'.

Creating a task:

    private JsonObject createTask(JsonSimple response, String broker,
            String oid, String task) {
        JsonObject object = newMessage(response,
                TransactionManagerQueueConsumer.LISTENER_ID);
        // The broker is optional and only set when one is supplied
        if (broker != null) {
            object.put("broker", broker);
        }
        JsonObject message = (JsonObject) object.get("message");
        message.put("task", task);
        message.put("oid", oid);
        // Note: returns the inner message, not the order that wraps it
        return message;
    }
So here we've added another layer on top of the earlier methods, as well as wrapping up the syntax for:
Breaking the rules:

    JsonObject task = createTask(response, broker, relatedOid, "curation-query");
    // We won't know OIDs for remote systems
    task.remove("oid");
    task.put("identifier", relatedId);
So building on all of these we see some usage in context, where we know that we are not going to know the internal OIDs on a remote platform, but we do know one of its known identifiers and can supply that instead. We do this by modifying the task's message once it has been returned to us.

A different example:

    private void email(JsonSimple response, String oid, String text) {
        JsonObject object = newMessage(response,
                EmailNotificationConsumer.LISTENER_ID);
        JsonObject message = (JsonObject) object.get("message");
        // emailAddress is not a parameter; it comes from the enclosing class
        message.put("to", emailAddress);
        message.put("body", text);
        message.put("oid", oid);
    }
Finally, this example is different again, in that it leverages the newMessage() method and then adds additional information. The wrapper makes sending emails via the relevant message queue quite trivial.
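To make the layering of the utility methods above easier to picture, here is a rough, self-contained sketch of the structure they build up. It deliberately does not use Fascinator's JSON library: plain `java.util` maps and lists stand in for `JsonSimple` and `JsonObject`, the `OrderShapeSketch` class is hypothetical, the listener target is passed in as a parameter because the real `LISTENER_ID` values are not shown on this page, and the order type is assumed to serialise as the string "MESSAGE" (the default enum `toString()`).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in using plain collections, not Fascinator's JSON classes.
class OrderShapeSketch {

    /** Append a new order to response["orders"], creating the list on first use. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> createNewOrder(Map<String, Object> response, String type) {
        List<Map<String, Object>> orders = (List<Map<String, Object>>)
                response.computeIfAbsent("orders", k -> new ArrayList<Map<String, Object>>());
        Map<String, Object> order = new LinkedHashMap<>();
        order.put("type", type);
        orders.add(order);
        return order;
    }

    /** A MESSAGE order carries a target queue and an (initially empty) message body. */
    static Map<String, Object> newMessage(Map<String, Object> response, String target) {
        Map<String, Object> order = createNewOrder(response, "MESSAGE");
        order.put("target", target);
        order.put("message", new LinkedHashMap<String, Object>());
        return order;
    }

    /** A task is a MESSAGE order whose body names the task and the object it is for. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> createTask(Map<String, Object> response, String target,
            String broker, String oid, String task) {
        Map<String, Object> order = newMessage(response, target);
        if (broker != null) {
            order.put("broker", broker);
        }
        Map<String, Object> message = (Map<String, Object>) order.get("message");
        message.put("task", task);
        message.put("oid", oid);
        return message; // the inner message, as in the method shown earlier
    }

    public static void main(String[] args) {
        Map<String, Object> response = new LinkedHashMap<>();
        createTask(response, "hypothetical-listener", null, "oid-1", "curation-request");
        createTask(response, "hypothetical-listener", "hypothetical-broker", "oid-2", "publish");
        // Two orders now sit in the same "orders" list on the response
        System.out.println(response);
    }
}
```

Each call appends one more entry to the same "orders" list, so a single response can carry several orders, each with its own target and its own object.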