Remote Job API | Scooter Morris, Barry Demchak | 2015-06-04 | Initial
Remote Job API | Kei Ono, Barry Demchak | 2015-06-05 | Comments
Proposal
Beginning in Cytoscape 3.3, we hope to enable the execution of long-running jobs remote from Cytoscape. As with PSICQUIC, a connection to an external service requires custom Cytoscape code (either core or app, but called app for the purpose of this discussion) to manage the overall workflow and the dataflow between the service and the Cytoscape model. In this RFC, we propose mechanisms for assisting the dataflow, while leaving specific handshaking and interfacing to the app and service themselves.
Specifically, this proposal intends to be agnostic as to the specific protocols for interfacing with remote computations. (At this time, we foresee interfacing with both Opal-based services and CI-based services, both of which share basic protocol philosophy, but differ in many details. We intend to accommodate services whose interface details we don't know about yet, too.) It focuses on generic Cytoscape infrastructure that assists app code in interfacing with the Cytoscape model and remote computation.
Background
Provide a brief description of the background of project.
Sample Use Case
As a base case, suppose a remote execution platform (aka service) that is capable of accepting a number of computation parameters (possibly including one or more networks), returning a token identifying the computation, returning status for the computation, and returning a result (possibly including one or more networks or other data). The computation would execute detached from Cytoscape until it completes or until the computation is aborted.
Once Cytoscape initiates the computation, it would poll status until the result is ready, and would then download the result and process it.
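The base case above can be sketched as a small polling loop. This is only an illustration under stated assumptions: `PollingSketch`, `RemoteService`, `RemoteState`, and `pollForResult` are hypothetical names invented for this example and are not part of the proposed API, and the service is an in-memory fake rather than a real remote platform.

```java
import java.util.Optional;

// Hypothetical names throughout; none of these types are part of the proposed API.
final class PollingSketch {

    enum RemoteState { RUNNING, FINISHED, FAILED }

    /** Minimal view of a remote service: status and result lookup by job token. */
    interface RemoteService {
        RemoteState status(String token);
        String fetchResult(String token);
    }

    /**
     * Polls until the job leaves RUNNING or maxPolls is reached.
     * Returns the result if the job finished, otherwise an empty Optional.
     */
    static Optional<String> pollForResult(RemoteService service, String token,
                                          int maxPolls, long delayMillis) {
        for (int i = 0; i < maxPolls; i++) {
            RemoteState state = service.status(token);
            if (state == RemoteState.FINISHED) {
                return Optional.of(service.fetchResult(token));
            }
            if (state == RemoteState.FAILED) {
                return Optional.empty();
            }
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                return Optional.empty();
            }
        }
        return Optional.empty(); // still running; the caller may poll again later
    }

    /** Fake service that reports RUNNING twice, then FINISHED. */
    static RemoteService fakeService() {
        return new RemoteService() {
            private int calls = 0;
            public RemoteState status(String token) {
                return ++calls < 3 ? RemoteState.RUNNING : RemoteState.FINISHED;
            }
            public String fetchResult(String token) {
                return "result-for-" + token;
            }
        };
    }

    public static void main(String[] args) {
        Optional<String> result = pollForResult(fakeService(), "job-42", 10, 10L);
        System.out.println(result.orElse("not ready"));
    }
}
```

Because the loop gives up after a bounded number of polls, an app could call it again in a later Cytoscape run using the same stored token.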
Variations of this scenario may include (for example):
- the computation not returning a token, but executing until completion and then returning the result immediately
- the computation providing management facilities for users to inspect and manipulate job queues independent of Cytoscape
- the computation providing facilities for custom Cytoscape code to manipulate job queues from within Cytoscape
- the computation requiring user credentials, which the custom Cytoscape code could procure, or which may be procured via other means
- the computation being able to send e-mail to notify the user when the computation is complete
Whichever variation the computation presents, we assume custom Cytoscape code will be built accordingly.
In a typical scenario:
- User executes an app that communicates with the remote computation
- App displays dialog box that gets execution parameters from user
- As Tunables or custom UI
- If the external service requires authentication, user credentials should be entered in this step
- App combines parameters and a network/table residing in the Cytoscape model
- App initiates computation, passing parameters and network/table
- Computation returns job token to Cytoscape
- Q. Does the job token include user information?
- Cytoscape stores job token as part of the Cytoscape session
- Q. Does Cytoscape automatically create a new file to save a snapshot of the state?
- If exactly the same state is required for merging the result, the session may need a unique ID, such as an MD5 hash.
- User saves Cytoscape session, including the job token
- User terminates Cytoscape
- User starts Cytoscape some time later
- User initiates app to begin polling for computation completion
- Upon completion, app downloads result
- The state may differ if the session is not a snapshot of the state at the time the user started the job.
- If the state differs from the original, the app may need to display a warning
- App integrates result into Cytoscape session (as a new network, a merged network, a new display, or in some other way)
- App purges result
Other Scenario (Stateless Tasks):
- (Optional) User enters an ID and password for a service. These are saved as encrypted data in the CytoscapeConfig directory
- Select a menu item to start a service caller
- Enter parameters for the job
- Submit a job to external service
- Job IDs are saved in the CytoscapeConfig directory
- User does whatever other work he/she wants
- Quit Cytoscape
- User starts Cytoscape again
- Cytoscape loads the job list from CytoscapeConfig
- If the list is not empty, check the status of each job
- Once a job is finished, ask the user whether to get or discard the result
- If the user wants the result, fetch it regardless of the current session state
- Remove the Job ID from the list
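The job list in this stateless scenario could be kept as a plain text file in the configuration directory. The sketch below is a minimal illustration, not a proposed design: `JobListStore`, its methods, and the file name `remote-jobs.txt` are assumptions made for this example, and a temporary directory stands in for CytoscapeConfig.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the class, method, and file names are illustrative only.
final class JobListStore {
    private final Path file;

    JobListStore(Path configDir) {
        this.file = configDir.resolve("remote-jobs.txt"); // assumed file name
    }

    /** Creates a store in a fresh temporary directory (stand-in for CytoscapeConfig). */
    static JobListStore inTempDir() {
        try {
            return new JobListStore(Files.createTempDirectory("cyconfig"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads all stored job IDs, one per line; returns an empty list if no file exists. */
    List<String> load() {
        List<String> ids = new ArrayList<>();
        if (!Files.exists(file)) {
            return ids;
        }
        try {
            for (String line : Files.readAllLines(file)) {
                String id = line.trim();
                if (!id.isEmpty()) {
                    ids.add(id);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ids;
    }

    /** Records a submitted job ID, ignoring duplicates. */
    void add(String jobId) {
        List<String> ids = load();
        if (!ids.contains(jobId)) {
            ids.add(jobId);
        }
        save(ids);
    }

    /** Removes a job ID once its result has been fetched or discarded. */
    void remove(String jobId) {
        List<String> ids = load();
        ids.remove(jobId);
        save(ids);
    }

    private void save(List<String> ids) {
        try {
            Files.write(file, ids);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        JobListStore store = inTempDir();
        store.add("job-1");
        store.add("job-2");
        store.remove("job-1");
        System.out.println(store.load());
    }
}
```

On startup an app would call `load()`, check each ID against its service, and call `remove()` after the user fetches or discards the result. Credential storage and encryption are deliberately left out of this sketch.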
Note: There are many use cases for this scenario. In general, if an external service generates new networks or tables, its result can be merged into any session.
Technical Proposal
Generally, executing a remote computation from Cytoscape involves supporting the following steps (called the Cytoscape internal workflow):
- marshalling model data and execution parameters
- executing the remote job
- tracking job status
- retrieving job results
- unmarshalling result data into the model or elsewhere
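The five steps above can be sketched end to end against an in-memory stand-in for a remote service. Everything here is an assumption made for illustration: `WorkflowSketch`, `FakeService`, and the `parameter|a,b` payload format are invented for this example, and the fake service completes immediately, whereas a real job would run asynchronously.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// All names here are hypothetical and exist only for this illustration.
final class WorkflowSketch {

    /** In-memory stand-in for a remote service; it upper-cases its input. */
    static final class FakeService {
        private final Map<String, String> finished = new HashMap<>();
        private int nextId = 0;

        String submit(String payload) {
            String token = "job-" + (++nextId);
            finished.put(token, payload.toUpperCase(Locale.ROOT));
            return token;
        }

        boolean isDone(String token) {
            return finished.containsKey(token);
        }

        String fetch(String token) {
            return finished.get(token);
        }
    }

    /** Step 1: marshal model data (node names) and one parameter into a payload. */
    static String marshal(List<String> nodeNames, String parameter) {
        return parameter + "|" + String.join(",", nodeNames);
    }

    /** Step 5: unmarshal a payload back into node names, dropping the parameter. */
    static List<String> unmarshal(String payload) {
        String body = payload.substring(payload.indexOf('|') + 1);
        if (body.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(body.split(",")));
    }

    static List<String> run(FakeService service, List<String> nodeNames, String parameter) {
        String payload = marshal(nodeNames, parameter);   // 1. marshal
        String token = service.submit(payload);           // 2. execute the remote job
        if (!service.isDone(token)) {                     // 3. track job status
            throw new IllegalStateException("job not finished: " + token);
        }
        String result = service.fetch(token);             // 4. retrieve job results
        return unmarshal(result);                         // 5. unmarshal into the model
    }

    public static void main(String[] args) {
        System.out.println(run(new FakeService(), Arrays.asList("a", "b"), "layout"));
    }
}
```

Keeping each step as a separate method mirrors the intent that individual steps can be swapped or reused across different remote computations.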
We propose an API that can be used by app code to implement a link between Cytoscape and a remote computation. The general idea behind the API is that each step of the Cytoscape internal workflow could be implemented differently for different remote computations (which implicitly means possibly different job execution environments), but sometimes the implementations should be shared or reused. So, in the new API, we have the following objects:
CyJob A CyJob is an instance of an external execution that will (possibly) receive data from Cytoscape, perform some external task, and (possibly) return data to Cytoscape. In general, a CyJob is returned from a CyJobExecutor.
CyJobData contains the results from a data marshaller or fetcher to be handed to a CyJobExecutor. There are two types of CyJobData objects: CyJobBinaryData and CyJobStringData.
CyJobViewMarshaller, CyJobNetworkMarshaller, and CyJobTableMarshaller will implement the actual marshalling of data. Note that to marshall a series of network views with their underlying models, including tables, you would only use the CyJobViewMarshaller, which extends the CyJobNetworkMarshaller. The CyJobNetworkMarshaller extends the CyJobTableMarshaller -- the idea being that a view marshaller will need to marshall the tables, etc.
CyJobExecutor will implement the necessary exchange with the remote environment to submit or execute the job. This could logically be run as part of a task so users can select the parameters and objects for the execution. However, once the CyJobExecutor completes, it is assumed that the task will complete and that the job runs asynchronously.
CyJobStatusChecker checks on the status of the job.
CyJobFetcher fetches the results of a job.
CyJobUnmarshaller unmarshalls the data and (possibly) registers the results with the appropriate managers.
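One possible shape for the data and marshaller objects is sketched below. The interface names come from this proposal, but every method signature is an assumption made for illustration and should not be read as the actual API. `TableStub`, `NetworkStub`, `ViewStub`, and `TextViewMarshaller` are hypothetical stand-ins so the example compiles without Cytoscape.

```java
import java.util.Arrays;
import java.util.List;

// Signatures are assumed for illustration; only the interface names come from the RFC.
final class ApiShapeSketch {

    // Hypothetical stand-ins for Cytoscape model and view types.
    static final class TableStub {
        final String name;
        TableStub(String name) { this.name = name; }
    }

    static final class NetworkStub {
        final String name;
        final List<TableStub> tables;
        NetworkStub(String name, List<TableStub> tables) { this.name = name; this.tables = tables; }
    }

    static final class ViewStub {
        final NetworkStub network;
        ViewStub(NetworkStub network) { this.network = network; }
    }

    /** Marker for marshalled or fetched data handed to a CyJobExecutor. */
    interface CyJobData { }

    static final class CyJobStringData implements CyJobData {
        final String value;
        CyJobStringData(String value) { this.value = value; }
    }

    static final class CyJobBinaryData implements CyJobData {
        final byte[] bytes;
        CyJobBinaryData(byte[] bytes) { this.bytes = bytes; }
    }

    // View marshaller extends network marshaller, which extends table marshaller.
    interface CyJobTableMarshaller {
        CyJobData marshalTable(TableStub table);
    }

    interface CyJobNetworkMarshaller extends CyJobTableMarshaller {
        CyJobData marshalNetwork(NetworkStub network);
    }

    interface CyJobViewMarshaller extends CyJobNetworkMarshaller {
        CyJobData marshalView(ViewStub view);
    }

    /** Toy implementation that renders everything as text. */
    static final class TextViewMarshaller implements CyJobViewMarshaller {
        public CyJobStringData marshalTable(TableStub table) {
            return new CyJobStringData("table:" + table.name);
        }

        public CyJobStringData marshalNetwork(NetworkStub network) {
            StringBuilder sb = new StringBuilder("network:" + network.name);
            for (TableStub table : network.tables) {
                sb.append(';').append(marshalTable(table).value); // reuse the table step
            }
            return new CyJobStringData(sb.toString());
        }

        public CyJobStringData marshalView(ViewStub view) {
            return new CyJobStringData("view[" + marshalNetwork(view.network).value + "]");
        }
    }

    public static void main(String[] args) {
        NetworkStub net = new NetworkStub("net1",
                Arrays.asList(new TableStub("nodes"), new TableStub("edges")));
        System.out.println(new TextViewMarshaller().marshalView(new ViewStub(net)).value);
    }
}
```

The inheritance chain means a single view marshaller instance can serve wherever a network or table marshaller is expected, which is the reuse the paragraph on marshallers describes.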
Q. Do we need a ServiceUser interface to access user credentials if authentication is required? Or is it part of CyJob?
Implementation Plan
Outline and describe the process and major issues related to implementing this proposal. Illustrate your plan when possible. Try this free online tool for making diagrams -> Best4c (draw; save; then insert hyperlink into this page)
Project Management
Project Timeline
Provide a timeline for implementation. Insert a graphic if you can. Try this free online tool for making project timelines -> Help-u-Plan (create a new chart; modify; right-click to save gif; then attach to this page)
Tasks and Milestones
Outline the major milestones and tasks involved in implementation.
Milestone 1: …
- Task 1: ...
- Task 2: ...
Milestone 2: …
Project Dependencies
Outline any projects that depend on this project, link to relevant RFCs, and note at what point dependent projects could be started.
Related RFCs
Link to other related RFCs
Issues
List any issues, conflicts, or dependencies raised by this proposal
Comments
[Scooter] I think this covers pretty much all of the cases I can think of for the various kinds of job execution environments. I'm pretty sure we can even write a CyJobExecutor that will launch binaries on the local machine, monitor for completion, and then unmarshall the resulting data. I'm still not happy with the marshalling step. An alternative would be to have a more generic CyJobMarshaller interface that took views, networks, and tables in a single input statement and then just "did the best that it could". Anyone have a better approach? At any rate, I can land this in develop so people can take a look at it before I start adding any implementation. From the core perspective, the only implementation will be an implementation of the CyJobManager, which monitors and manages all of the jobs. When a job completes, the CyJobManager will execute a task that will ask the user if they want to fetch the data (or dispense with it), fetch the data, and unmarshall it.
[Barry] I'm concerned about maintaining the linkage between a job token and the actual running job. Job tokens are too easy to lose, and once lost, the remote job is essentially marooned, with nothing to kill it and nothing to receive its result. While this linkage is really a matter between the app code and the remote service itself, fragile job tokens imply an out-of-band means for maintaining the service's job queue. Again, this, too, is the business of the service. For CI, we're considering having the job queue queryable, which then allows the job token to be soft state. The user would be able to recover the job token by enabling the app to query the queue, and then allowing the user to select a job of interest. This assumes the user can supply credentials that lead to a validated identity, and the CI will provide this feature.
[Barry] Scooter has said that he expects that a service will return a standalone network or some other autonomous entity. This enables Cytoscape to provide meaningful context for the computation result robustly. He also allows for the possibility that a smart unmarshaller could weave a computation result into an existing Cytoscape state that's consistent with the result. This also makes sense, and would need to be implemented as a special unmarshaller associated with the app. For example, a layout service may return only coordinates, which can be fused with an existing network if the network can be determined to match the layout.
[Barry] Scooter proposes a Status class that enumerates a reasonable set of remote calculation statuses. Given an arbitrary remote service, I can see the wisdom in having apps map remote status to Scooter's Status, but ultimately, this choice is a matter for the app, as it is responsible for presenting status to the user or processing the result. My judgment would be against conventionalizing Status across all apps. Scooter's class is:

    public enum Status {
        SUBMITTED("Submitted"),
        QUEUED("In queue"),
        RUNNING("Running"),
        TERMINATED("Terminated"),
        FINISHED("Successfully finished"),
        FAILED("Failed"),
        ERROR("Finished with errors or warnings");

        private final String name;

        Status(String n) { this.name = n; }

        public String toString() { return name; }
    }

We could certainly add a PURGED value.
[Scooter] As outlined, the API returns CyNetworks, CyNetworkView, or CyTables. None of those would necessarily depend on any given Cytoscape state. Of course, a remote layout App might want to marshal a CyNetworkView and send it off to a server to process, and the result would be a simple CyTable that the App would need to merge in with the network. Oops -- that brings up some missing pieces, which is that we'll need to add some events so that Apps can get the results of CyJobs.
Add comment here…
How to Comment
Edit the page and add your comments under the provided header. By adding your ideas to the Wiki directly, we can more easily organize everyone's ideas, and keep clear records. Be sure to include today's date and your name for each comment. Try to keep your comments as concrete and constructive as possible. For example, if you find a part of the RFC makes no sense, please say so, but don't stop there. Take the extra step and propose alternatives.