## page was renamed from Cytoscape_3.0/EventHandling
## page was renamed from RFC Template
## This template should be used for creating new RFC's (Request for comments) for Cytoscape development
||'''RFC Name''' : Event Handling ||'''Editor(s)''': BrianTurner DanielAbel ||'''Date''': Nov 21 2008 ||'''Status''': Draft ||

== Proposal ==
The event handling framework in Cytoscape 2.x relies on a lot of hacks. We don't want 3.0 to look like that, so the complete framework for firing, listening to and handling events needs to be rethought.

=== Background ===
Many parts of Cytoscape (both 2.6 and 3.0) use the [[http://en.wikipedia.org/wiki/Observer_pattern|Observer pattern]]. There is a consensus that a naive implementation of the observer pattern would not be good enough, for two main reasons:

 * performance:
  . Since Cytoscape has to be able to handle very large networks, firing a separate event for every single node of a network would cause significant performance problems.
 * extra features:
  . Since all of Cytoscape would be using the same framework for handling events, and events would strongly correspond to state modification, there is a general consensus that we should try to stuff extra functionality (history, partial support for provenance tracking, etc.) into the event framework. (The consensus is on ''trying'' to do this; no one knows whether we will be happy with the result.)

== Terminology used on this page ==
from Effective Java:

 * ''core code'' is the code that provides a given Cytoscape API.
 * ''client code'' is the code that uses a given Cytoscape API.

MichaelCreech: Actually, Effective Java uses the terms 'client' and 'user'. It seems like much of Cytoscape would be both core and client code, given this definition. For example, if CyNetwork.removeNode() uses other parts of the Cytoscape API, such as removeEdge(), then it is both client and core. Don't we really want to distinguish between the code that defines Cytoscape ''event handling'' and the code that uses it?

For example, CyNetworkView is 'core code' when talking about the viewmodel API, and 'client code' when talking about the model API. When talking about the model API, CyNetworkView is just something that uses that API, just as a network analysis plugin would. In that context CyNetworkView should not be in a special position (beyond being an important usecase; it is certainly not the only usecase).

We should make the following distinction:

 * ''firing an event'' is the act of creating the event object and handing it over to the event framework
 * ''triggering an event'' is calling a method that results in firing an event

For example, in the current svn trunk, calling CyNetwork.addNode() triggers an AddedNode event, but the firing is done in core code (in the implementation of addNode()).

from [[http://en.wikipedia.org/wiki/Observer_pattern|Observer pattern]] terminology:

 * ''Subject'' is the object whose state change the event is about.
 * ''Observer'' is the object that listens to the event.

We also need to talk about:

 * ''Actor'' is the object that triggers the event (for example, a plugin that adds some nodes).
 * ''Event Framework'' is the code that handles dispatching the events. I.e. Subjects don't call Observers directly; they tell the Event Framework to fire an event and the framework takes care of the rest.
 * ''changeset'' is the data contained in the event object.
 * ''event correctness'' is ensuring that every event is fired (i.e. that events are not dropped or ignored).

== Decisions so far ==
On the following points a decision has basically been made, since a consensus emerged:

=== general framework ===
The framework needs to be general enough to handle events defined by plugins, and plugins should use the Cytoscape-wide framework for firing and handling their internal events. (Events documented in the public API of a plugin will be used by other plugins, so these events must be able to pass between plugins.)
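
To make the terminology and the 'general framework' decision concrete, here is a minimal sketch, not the Cytoscape 3.0 API. All class and method names in it ({{{EventFramework}}}, {{{AddedNodeEvent}}}, {{{ClusteringDoneEvent}}}, {{{register}}}, {{{fire}}}) are assumptions made for illustration, and a plain in-memory registry stands in for whatever lookup mechanism the real framework uses. It shows a Subject firing a core event when an Actor triggers it, and a plugin-defined event passing through the same framework.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Sketch only: every name here is hypothetical, not the Cytoscape 3.0 API. */
class GeneralFrameworkSketch {

    /** Event Framework: dispatches events of any class, so plugins can define their own. */
    static class EventFramework {
        private final Map<Class<?>, List<Consumer<Object>>> observers = new HashMap<>();

        /** Per-class registration: the Observer receives every event of the given type. */
        <E> void register(Class<E> eventType, Consumer<E> observer) {
            observers.computeIfAbsent(eventType, k -> new ArrayList<>())
                     .add(event -> observer.accept(eventType.cast(event)));
        }

        /** Firing: the event object is handed to the framework, which calls the Observers. */
        void fire(Object event) {
            List<Consumer<Object>> targets =
                    observers.getOrDefault(event.getClass(), Collections.emptyList());
            for (Consumer<Object> observer : targets) {
                observer.accept(event);
            }
        }
    }

    /** A core event (hypothetical). */
    static class AddedNodeEvent {
        final String nodeName;
        AddedNodeEvent(String nodeName) { this.nodeName = nodeName; }
    }

    /** An event defined by a plugin (hypothetical). */
    static class ClusteringDoneEvent {
        final int clusterCount;
        ClusteringDoneEvent(int clusterCount) { this.clusterCount = clusterCount; }
    }

    /** Subject: an Actor triggers the event by calling addNode(); the firing happens here. */
    static class Network {
        private final EventFramework events;
        private final List<String> nodes = new ArrayList<>();
        Network(EventFramework events) { this.events = events; }

        void addNode(String name) {
            nodes.add(name);
            events.fire(new AddedNodeEvent(name));
        }
    }

    /** Actor code: registers two Observers, triggers a core event, fires a plugin event. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        EventFramework events = new EventFramework();
        events.register(AddedNodeEvent.class, e -> log.add("added " + e.nodeName));
        events.register(ClusteringDoneEvent.class, e -> log.add("clusters " + e.clusterCount));

        Network network = new Network(events);
        network.addNode("a");                     // triggers AddedNodeEvent
        events.fire(new ClusteringDoneEvent(3));  // plugin fires its own event type
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Keying the registry on the event class is what lets plugin-defined events reuse the framework without any change to core code; it also corresponds to 'per-class' registration, which is only one of the options still under discussion.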

=== push changeset design ===
The event object should contain enough information so that for most usecases the Observer does not need to query the Subject to figure out what changed. The event object will contain, at least, flags that indicate what parts of the Subject changed.

There is no consensus yet on whether the changeset should also contain the description of the new state. For many event types, specifying ''what'' changed also specifies ''how'' it changed (for example, this is the case for the AddedNodes event, where both would be described by passing the new CyNode). Thus, the event could carry 'full changeset' information, so that the Observer does not need to query the Subject to handle the event. Since in theory Observers can do anything they want in the event-handling code, we can't guarantee that everything needed will be in the changeset, but it should be complete enough for the common usecases.

=== event framework uses OSGi internally ===
Listeners register via OSGi and the event framework will use OSGi to look them up, i.e. a whiteboard pattern is used. See [[http://www.osgi.org/wiki/uploads/Links/whiteboard.pdf|OSGI White Boarding]]

=== raw toolkit events are turned into semantic 'action' events ===
The Presentation layer turns swing (or other toolkit) events into high-level abstract events. For a list of these events, see 'Presentation Events' in [[/PlannedCoreEventsIn3.0]]. Plugins will listen to these events, not to raw toolkit events.

This mapping from raw toolkit events to high-level abstract events will be made by an EventBroker, which will allow the user to modify this mapping, and provide some support for handling the case when plugins want to exclusively handle certain events, i.e. where Observers are not independent. (I.e.
ask the user what to do; see [[Cytoscape_3.0/MiniRetreatTwo]] for details)

(Note that some think that drag/drop may have to be a special case; the following is from an earlier version of this page, and I am not sure what it means exactly:

 . "What layer of ''semantics'' would Cytoscape want to provide above the Java paradigm of Transferables and Data''''''Flavors?"

Is this settled yet? -- DanielAbel)

 . (AllanKuchinsky: Right now drag/drop functionality in Cytoscape works directly with Java UI objects, much in the same way that node selection code works directly with MouseListeners and MouseMotionListeners. This means that plugins also have direct access to CytoscapeDesktop and InnerCanvas objects. It is easy for different plugins' drag/drop functionality to clobber each other, particularly because it is possible to set the unique Transferable object for InnerCanvas or CytoscapeDesktop. In the same way that we are isolating the plugin developer from low-level mouse events via higher-level semantic events, should we do the same for the drag/drop framework? That is, have a higher-level set of semantics that doesn't, for example, give the plugin developer direct access to InnerCanvas or CytoscapeDesktop. This is not a foregone conclusion, since the drag/drop framework in Java is already a higher-level abstraction and deals with more complex operations than, for example, simple mouse event handling).

== Open questions ==
These are issues where no decision has been made yet.

=== representing events that belong together ===
This is the main topic of discussion currently. In general, events will not be completely independent: a single semantic operation by an Actor will usually result in triggering multiple events. For example, Actors will remove multiple nodes at once, replace nodes with others, or open a network. The events that will be fired will generally represent lower-level operations.
This means that it would make sense to handle a bunch of events together, not only for performance reasons, but also because other parts of the framework (undo/redo, etc.) could make better use of the events.

Ideas:

 1. '''Batching''': The case where many events (like the creation of many nodes) could be grouped together as opposed to handling each one discretely.
  . What does batching mean? In Eclipse a batch is defined as a set of nested events: ''It is important to note that the broadcast does not necessarily occur immediately after the method completes. This is because a resource changing operation may be nested inside of another operation. In this case, notification only occurs after the top-level operation completes. For example, calling IFile.move may trigger calls to IFile.create to create the new file, and then IFile.delete to remove the old file. Since the creation and deletion operations are nested inside the move operation, there will only be one notification.''
 1. '''Transactions''': Actors should leave the model in a valid state, and conflicting changes by different Actors should be prevented. However, not everything is editing. Some plugins may be passive, providing a view like an outline, in which case they should not have to concern themselves with the added complexity of participating in a transaction. In Eclipse, a contributor to a perspective extends/implements either IViewPart or IEditorPart to make this distinction in roles.

=== per-instance or per-class firing/listening? ===
The current implementation in core3 in svn trunk allows 'per-class' registration for a given event, i.e. the Observer gets all events of that type. The Observer is responsible for filtering out which events were fired for the instance it is interested in.

The alternative is to allow 'per-instance' registration, where one registers as an Observer for the events of some Subject instances.
In this case, the event framework would filter events to some extent, and each Observer would receive fewer events that it is not interested in.

=== what an Event object looks like ===
Depending on what overall design we select for the event framework, the event objects would look significantly different. To a large extent the details of what an event object looks like are defined by the overall design. Since the overall design is not decided yet, this is largely an 'implementation detail' for now.

=== Dependent design choices ===
How we design certain parts of 3.0 will be influenced by the design of the event-handling framework. Proposals should consider these as usecases and discuss whether they provide partial solutions for implementing these features. Even if these features are not provided by the event framework 'for free', we should consider how much an approach helps or hinders implementing them:

 * history
 * undo / redo
 * provenance tracking

== Particularly difficult cases ==
The following highlight usage patterns whose correct handling complicates certain designs.

=== pre-events ===
Certain events, in particular AboutToRemove events, are 'pre-events': they are sent before an operation is done. The client code has to be able to access the object about to be removed when handling this event. This means that these events cannot be fired after the operation in question is done.

Similar to pre-events are Observers that require immediate delivery of events because they are time-critical. A common use case is Observers wishing to show "progress bars". -- MichaelCreech

I don't think we can allow such 'time-critical Observers'. I don't see any way to optimize or batch events if the contract made by the event handling API contains any such guarantees. (If we allow a 'this is time-critical' flag when registering for events, too many people will use it, even if they don't need to.)
-- DanielAbel

=== cross-object consistency ===
Certain parts of Cytoscape will be using events to synchronise objects. For example, a viewmodel implementation will listen to model events to keep in sync (for example, add View objects as GraphObjects are added). This is an expected way of using the framework. Handling this becomes tricky when the Actor contains code like:

{{{
for (String nodeName : nodeNames) {
    // addNode() triggers an AddedNode event
    node = network.addNode();
    // only works if networkView has already handled that event
    nodeView = networkView.getCyNodeView(node);
    nodeView.setVisualProperty(.....);
}
}}}

Here networkView must already have handled the event before the {{{networkView.getCyNodeView(node)}}} call.

=== deletion ===
When deleting a node (especially now that nodes exist only with respect to a model, and not independently of it), listeners need to know about the node AND its context before the node is actually deleted. Thus, in a batch of events, with a naive implementation, the node would already be gone by the time the batch was received. To circumvent this you might flag the node as being "marked for deletion", but this means that, for the duration of the batch, every piece of code that examines the graph and its nodes needs to be sensitive to this state. (Updates can present similar issues, but deletion is the best example.)

MichaelCreech: This is a special case of the Out of Sync Observer-System State problem.

=== threads ===
MichaelCreech: This is a special case of the Out of Sync Observer-System State problem. It happens when event delivery is performed in a separate thread from the thread performing the actions that fire events (the action thread). In this case, the action thread may have deleted or grossly changed objects that the Observer will use later, after event delivery. This could happen when using Core3's eventHelper.fireAsychronousEvent(), or BAM's DeliveryTiming.ASYCHRONOUS.
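
The cross-object consistency problem described above can be reproduced in a few lines. The following is a sketch under assumptions: {{{EventFramework}}}, {{{fireNodeAdded}}}, {{{flush}}} and the string-based nodes are hypothetical stand-ins, not core3 code. The same Actor code is run against a framework that delivers events immediately and against one that queues them until {{{flush()}}} is called.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Sketch only: hypothetical names, nodes are plain strings. */
class ConsistencySketch {

    /** Event framework that delivers either immediately or only on flush(). */
    static class EventFramework {
        private final List<Consumer<String>> observers = new ArrayList<>();
        private final Deque<String> pending = new ArrayDeque<>();
        private final boolean deferred;

        EventFramework(boolean deferred) { this.deferred = deferred; }

        void register(Consumer<String> observer) { observers.add(observer); }

        void fireNodeAdded(String node) {
            if (deferred) {
                pending.add(node);   // delivery is delayed relative to the state change
            } else {
                deliver(node);
            }
        }

        /** Delivers all queued events in firing order. */
        void flush() {
            while (!pending.isEmpty()) {
                deliver(pending.poll());
            }
        }

        private void deliver(String node) {
            for (Consumer<String> observer : observers) {
                observer.accept(node);
            }
        }
    }

    /** Subject: the model. */
    static class Network {
        private final EventFramework events;
        Network(EventFramework events) { this.events = events; }

        String addNode(String name) {
            events.fireNodeAdded(name);
            return name;
        }
    }

    /** Observer: the view model keeps itself in sync by listening to model events. */
    static class NetworkView {
        private final Map<String, String> nodeViews = new HashMap<>();

        NetworkView(EventFramework events) {
            events.register(node -> nodeViews.put(node, "view of " + node));
        }

        String getNodeView(String node) { return nodeViews.get(node); }
    }

    /** Actor code: reports whether the node view exists right after addNode() returns. */
    static boolean viewAvailableImmediately(boolean deferred) {
        EventFramework events = new EventFramework(deferred);
        Network network = new Network(events);
        NetworkView view = new NetworkView(events);
        String node = network.addNode("n1");
        return view.getNodeView(node) != null;
    }

    public static void main(String[] args) {
        System.out.println("immediate delivery: " + viewAvailableImmediately(false));
        System.out.println("deferred delivery:  " + viewAvailableImmediately(true));
    }
}
```

With deferred delivery the Actor gets no node view back, which is the situation the loop above would run into: any design that delays delivery has to give the Actor a way to force the view to catch up.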

=== out of sync Observer-System state ===
MichaelCreech: This problem raises difficulties for solutions that allow batches of event-firing actions to execute while delaying Observer notification of these events. Here's a simple example:

{{{
CyNode node1 = ...;
node1.getAttrs().set("canonicalName", "jojo");   // fire attribute changed
// ...
node1.getAttrs().set("canonicalName", "node1");  // fire attribute changed
}}}

If the above code is executed before the Observers are notified, then every notification delivered to an Observer of this CyRow would see the final value "node1" as the current value of the "canonicalName" attribute. This may not seem like a big deal, but the Observer would get a different result when the code runs as a batch than when it does not. It also shows how allowing actions to be performed before their corresponding notifications can affect the API in terms of the information needed by an Observer. In this case, you could change the API to include the old value and the new value in the event, thereby avoiding the need to read the actual value of "canonicalName" from the CyRow. But you'd need to know that your solution has this problem.

Here's a nastier example. Assume we have an Observer of a create node event. It needs to know the node created and the network the node belongs to. Then, we have some code that performs the following:

{{{
CyNetwork net1 = ...;
CyNode node1 = net1.addNode();  // fire node added event
// ...
net1.removeNode(node1);         // fire node removed event
// ...
// not sure how networks are removed in core3...
removeNetwork(net1);            // fire network removed event
}}}

Assuming some sort of delay, be it by batching, threads, or something else, if all these actions are executed before the add node event is delivered, then an add node Observer could be given completely bogus information: pointers to both a non-existent node1 and a non-existent net1.
Notice that this example doesn't directly deal with deletion from the Observer's perspective: it is a creation event.

I don't think the framework has to deliver an event for every operation. If we batch events, and the results of some operations in the middle of the batch are overwritten by other operations, then I think it is perfectly acceptable to fire events only for the newer operations. Since the results of the first operation have been overwritten, they shouldn't matter. -- DanielAbel

== Proposals ==
The following are the different alternatives that have been brought up so far. Note that these are not mutually exclusive, and the final design might use several of them. Where the following mentions the ''current situation'', it refers to Cytoscape 3.0's current implementation, in the core3 directory in the svn repository.

=== Do nothing, optimize event firing ===
Do nothing, but optimize event firing so that firing one event for each node in a network is feasible.

pro:

 * reasonably simple

con:

 * highly dubious that it could work
 * could result in a plethora of ''ad hoc'' solutions by plugins

=== optimize Observer ===
This is more like a usage pattern for Observers: the event handler only notes that it has to be updated, and only actually processes the event when it is needed. For example, a NetworkRenderer could simply set a flag when receiving an event from a CyNetworkView, and only redraw/rerender when the GUI framework calls .repaint() or similar.

pro:

 * simple

con:

 * doesn't save event firing overhead, so it won't solve anything

=== Actor fires events ===
(Note: this is only mentioned for completeness; the consensus is that this pattern is too flawed to use.)

Instead of the Subject object's method firing the event after an operation, have the Actor do the event firing. (Either in every case, or by having a 'dont_fire' flag on every method in Subject that would fire an event.)

pro:

 * reasonably simple, Actor knows best when a batch ends

con:

 * can't ensure correctness
 * almost impossible to get minimal changesets
 * client code 'locks' the object until it fires the event (since it has already changed it, but didn't yet fire events); for example, cross-object consistency (see 'cross-object consistency' above) will be a problem.

=== Bulk methods ===
In addition to having methods that operate on a single object, like .addNode() or .removeNode(), add bulk methods to the API, like .removeNodes(Collection list) etc. The implementations of these methods will fire a single 'bulk' RemovedNodesEvent.

From an Observer perspective, you don't want these methods "in addition to" but instead of the single object methods. The reason is that if you have both, then you force every Observer to handle both the single object form of the event and the bulk form, even though the Observer only cares that an object has changed, not whether it was a bulk change or not. -- MichaelCreech

No. We can have a removeNode() method that fires AboutToRemoveNodes etc. bulk events. What methods Subject has should be determined by how Actors want to use it, not by what Observers want to listen to. We don't need to have a 1:1 mapping between methods and events. -- DanielAbel

This is basically a cleaner way to do 'Actor fires events'.

pro:

 * ensures correctness

con:

 * Actor code might have to be restructured to first collect the objects to operate on and then call the bulk method.

=== Event batching in event framework ===
The event framework, when it is passed an event to fire, does not dispatch it immediately, instead waiting for similar events to arrive. It only dispatches events later (when explicitly told to, or in an idle callback).

pro:

 * Observer, Subject, Actor code unchanged

MichaelCreech: Actually this could affect the API needed for the Observer and Subject, as shown in the first example under "out of sync Observer-System state".

con:

 * pretty much impossible to implement (''AllanKuchinsky: provide more details on WHY it's impossible to implement?'')

MichaelCreech: Allan, the "out of sync Observer-System state" section may help explain why it is hard to do.

=== lazy Subject; pull event model ===
The Subject implementation doesn't actually perform the transformations requested from it until it needs to. For example, CyNetwork.addNode() wouldn't actually add the node internally, but just add the request to a 'pendingEditsQueue'. The transformation (adding the node, removing it, etc.) is only actually done when it would affect the result of some other method. I.e. CyNetwork.getNodeCount() would apply all pending node addition/removal edits and return the result.

Since transformations are applied lazily, they can be collected, and bulk events fired.

pro:

 * Subject knows best which operations can be interchanged (which operations commute), so it can optimize the most

con:

 * have to transition to a 'pull event model': to handle the 'cross-object consistency' problem, Observers would have to explicitly ask the Subject to apply pending edits whenever an operation depends on the Observer's state.

=== Edit Views (ephemeral objects for transactions) ===
Extend the API to allow creation of 'ephemeral objects', which provide an 'Edit View' on the object: they provide the same API as the underlying Subject, but don't fire events, and only store the difference compared to the given Subject instance. Provide a method to merge the changes in the edit view back into the original object.

Actor code would create such an object, modify it, then merge those changes. The merge would be one atomic operation, thus it would automatically define a transaction.
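
A minimal sketch of the Edit View idea follows, under stated assumptions: {{{Network}}}, {{{EditView}}}, {{{createEditView()}}} and {{{merge()}}} are hypothetical names, nodes are plain strings, and a list of strings stands in for the event framework. The edit view records only the difference from the Subject and fires nothing; {{{merge()}}} applies the difference in one step and fires a single event.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Sketch only: hypothetical names, not a proposed API. */
class EditViewSketch {

    /** Subject: every direct modification fires an event (recorded in a list here). */
    static class Network {
        private final Set<String> nodes = new LinkedHashSet<>();
        final List<String> firedEvents = new ArrayList<>();

        void addNode(String node) {
            if (nodes.add(node)) firedEvents.add("AddedNodes [" + node + "]");
        }

        void removeNode(String node) {
            if (nodes.remove(node)) firedEvents.add("RemovedNodes [" + node + "]");
        }

        Set<String> getNodes() { return Collections.unmodifiableSet(nodes); }

        EditView createEditView() { return new EditView(this); }

        /** Applies a whole diff at once and fires one event for it. */
        private void applyDiff(Set<String> added, Set<String> removed) {
            nodes.removeAll(removed);
            nodes.addAll(added);
            firedEvents.add("NetworkEdited added=" + added + " removed=" + removed);
        }
    }

    /** Ephemeral object: same node API as Network, stores only the difference, fires nothing. */
    static class EditView {
        private final Network target;
        private final Set<String> added = new LinkedHashSet<>();
        private final Set<String> removed = new LinkedHashSet<>();

        EditView(Network target) { this.target = target; }

        void addNode(String node) {
            // Re-adding a node that this view removed just cancels the removal.
            if (!removed.remove(node) && !target.getNodes().contains(node)) {
                added.add(node);
            }
        }

        void removeNode(String node) {
            // Removing a node that this view added just cancels the addition.
            if (!added.remove(node) && target.getNodes().contains(node)) {
                removed.add(node);
            }
        }

        /** The node set as it would look after merging. */
        Set<String> getNodes() {
            Set<String> result = new LinkedHashSet<>(target.getNodes());
            result.removeAll(removed);
            result.addAll(added);
            return result;
        }

        /** Merges the recorded difference into the Subject as one operation. */
        void merge() {
            if (added.isEmpty() && removed.isEmpty()) return;
            target.applyDiff(added, removed);
            added.clear();
            removed.clear();
        }
    }

    public static void main(String[] args) {
        Network net = new Network();
        net.addNode("a");
        EditView view = net.createEditView();
        view.addNode("b");
        view.addNode("c");
        view.removeNode("a");
        view.merge();
        System.out.println(net.getNodes() + " " + net.firedEvents);
    }
}
```

The sketch does not address what happens when another Actor modifies the Subject while an edit view is open; a real design would have to decide how such conflicts are detected or prevented.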

pro:

 * reasonably simple way to express transactions

con:

 * some added complexity in Subject
 * minor changes in Actor code patterns needed

=== NetworkBuilder ===
This proposal would only handle network creation, which is the usecase where event performance will be the worst, since the current design would fire a lot of events for not-fully-initialized networks.

As a restricted version of 'Edit Views', provide a proto-network object that is exactly the same as a network, except that it doesn't fire events. Provide a way to turn this proto-network into a real network.

pro:

 * reasonably simple

con:

 * some API bloat, since all network, subnetwork, etc. objects need Builders
 * doesn't handle similar usecases, for example merging networks, or rewiring a network (used for randomizing it)

== Table summarizing proposals ==
The following is a quick summary of what parts of the code each proposal would modify. It also summarizes whether the state of the Subject is modified immediately or lazily by a call from the Actor, and whether the event firing is delayed compared to the state modification.

||Proposal ||Adds complexity to ||state modification ||event firing compared to state modification ||
||Optimize observer ||Observer ||immediate ||immediate ||
||Actor fires events ||Actor ||immediate ||delayed ||
||Bulk methods ||Actor ||semi-delayed ||immediate ||
||event batching ||event framework (a bit to Actor) ||immediate ||delayed ||
||lazy Subject ||Subject (a bit to Observer) ||delayed (looks immediate to everyone) ||immediate ||
||edit views ||Subject (a bit to Actor) ||delayed (looks immediate to Actor) ||immediate ||
||NetworkBuilder ||Subject (a bit to Actor) ||delayed (looks immediate to Actor) ||immediate ||

Note that delaying the event firing compared to the state modification will cause consistency problems; handling those might be too difficult.
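
As an illustration of the 'Bulk methods' row, where event firing is immediate relative to the state modification, here is a minimal sketch. The names {{{Network}}}, {{{RemovedNodesEvent}}} and the {{{fired}}} list are assumptions for illustration, not the core3 API. It follows the suggestion made in the discussion that the single-object method can fire the same bulk event type, so Observers only ever handle one form of the event.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Sketch only: hypothetical names, nodes are plain strings. */
class BulkMethodsSketch {

    /** Bulk event: its changeset lists every node removed by one call. */
    static class RemovedNodesEvent {
        final List<String> nodes;
        RemovedNodesEvent(List<String> nodes) {
            this.nodes = Collections.unmodifiableList(nodes);
        }
    }

    /** Subject offering both a single-object method and a bulk method. */
    static class Network {
        private final Set<String> nodes = new LinkedHashSet<>();
        /** Stands in for the event framework: one entry per fired event. */
        final List<RemovedNodesEvent> fired = new ArrayList<>();

        Network(Collection<String> initialNodes) { nodes.addAll(initialNodes); }

        /** Single-object method: delegates, so it fires the same bulk event type. */
        void removeNode(String node) {
            removeNodes(Collections.singletonList(node));
        }

        /** Bulk method: modifies the state, then fires one event for the whole call. */
        void removeNodes(Collection<String> toRemove) {
            List<String> actuallyRemoved = new ArrayList<>();
            for (String node : toRemove) {
                if (nodes.remove(node)) {
                    actuallyRemoved.add(node);
                }
            }
            if (!actuallyRemoved.isEmpty()) {
                fired.add(new RemovedNodesEvent(actuallyRemoved));
            }
        }

        int getNodeCount() { return nodes.size(); }
    }

    public static void main(String[] args) {
        List<String> initial = new ArrayList<>();
        Collections.addAll(initial, "a", "b", "c", "d");
        Network network = new Network(initial);

        List<String> batch = new ArrayList<>();
        Collections.addAll(batch, "a", "b", "c");
        network.removeNodes(batch);   // one event for three nodes
        network.removeNode("d");      // one event for one node
        System.out.println(network.fired.size() + " events, "
                + network.getNodeCount() + " nodes left");
    }
}
```

The changeset contains only the nodes that were actually removed, so an Observer does not have to query the Subject to find out what changed. The sketch fires after the removal; a pre-event such as AboutToRemoveNodes would have to be fired before it.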

== Related projects ==
 * [[http://www.eclipse.org/articles/Article-Resource-deltas/resource-deltas.html|How you've changed!]]
 * [[http://cytoscape.org/cgi-bin/moin.cgi/Cytoscape_3.0/ModelDiscussions#head-981c3691f1f46b4149fe9f22ab58a2a310b0e787|Previous Discussion on Event Handling]]
 * [[http://wiki.c2b2.columbia.edu/workbench/index.php/Framework|geWorkBench framework for event handling]]

== Prototypes ==
 1. Svn+ssh://Grenache.ucsd.edu/cellar/common/svn/csplugins/trunk/Agilent/creech/BAMEventModel
 2.
  * http://wodaklab.org/cytoscapeEventAnnotations/xref/index.html (source code)
  * http://wodaklab.org/cytoscapeEventAnnotations (maven site)
  * http://wodaklab.org/viewvc/svn/cytoscapeEventAnnotations/trunk (browse svn)
  * svn co https://wodaklab.org/svn/public/cytoscapeEventAnnotations/trunk (checkout)

== Event Examples ==
 * [[/EventsIn2.6]]
 * [[/PlannedCoreEventsIn3.0]]

== Comments ==
Everything above should be either facts or consensus (unless clearly marked as being a non-consensus opinion); put comments here:

=== from a previous version of this page: ===
The problem with batching can be imagined in this way: When deleting a node (especially now that nodes exist only with respect to a model, and not independently of it), listeners need to know about the node AND its context before the node is actually deleted. Thus, in a batch of events, with a naive implementation, the node would already be gone by the time the batch was received. To circumvent this you might flag the node as being "marked for deletion", but this means that, for the duration of the batch, every piece of code that examines the graph and its nodes needs to be sensitive to this state. (Updates can present similar issues, but deletion is the best example.)

Why did we want batching? Originally to address the issue of excess "noise". It should be noted that there are ways of reducing noise without batching all events. You could, for instance, just batch similar event types.
However, if a batch cannot be applied to any and all events, then concepts like transactions, if they are to be arbitrarily defined as a 'unit of work' independently of the code being batched, are not possible.

Why not deal with batching later? Because batching (in a transactional sense, or any other higher-level abstraction that groups events) does not appear to be implementable without making changes to the API, something we wish to avoid in future releases. This is an assumption, and it is possible someone might think of a way to introduce batching or transactions later on that doesn't affect the API, but after building the two prototypes, this does not seem likely.

Do we care about transactions? Isn't it easier to leave them out? It is, as we discovered; however, that was not our original intuition. We all thought batching would be easy. As well, introducing higher-level abstractions about events, like "Undo/Redo" and "Transactions", seems like a natural progression in event handling, like hyper edges and group nodes in graphs. While the need for these things may not exist now, it doesn't seem unreasonable to imagine that we might, at some future point, wish we had these capabilities, and it is likely that other people will think as we did, that batching is a natural extension to event handling. But of course, allowing for it comes with a price. Batching may come at such a price that the trade-offs don't make it worth it.

=== some questions from cyto-staff ===
 * Can a plugin both consume and produce events from within its own generated threads?
  * Event granularity may affect deadlock problems with the event model.
 * How many times will consumers be called with "false alarm" events?
  * Another way to think of this is how hard does the consumer have to work to determine if an event is of interest?
 * How many classes need to be written to both produce and consume a new event?
 * Does a consumer have control over when they subscribe/unsubscribe to events?
 * Is there a way to have queued and non-queued events and how do we separate them?
 * If all events are queued, how do I show a progress bar?
 * If events are queued by the producer, what if they make a mistake and forget to stop queueing?
 * What are the efficiency bottlenecks of the event model?
 * Does the event model exacerbate efficiency issues for consumers (see question 1 & 4)?
 * How do multiple consumers share (or not share) events?
  * Should this be under central control?