This is just a brainstorming session right now. We can prioritize discussion and ideas for projects later.
Discuss the difference between node hide and delete concepts -> Goal: get conceptual clarity and a plan to document/implement any necessary changes based on this. Also select/deselect vs. flag/unflag
HACKATHON NOTES: Delete
Concepts
- GINY is the API for all graph-related stuff (view and model). It is independent of Cytoscape. It contains:
RootGraph: contains nodes/edges
GraphPerspective: contains nodes/edges
- FING implements GINY. PHOEBE implements the view part of GINY. But, this is hidden from users and Cytoscape.
- Cytoscape builds on GINY by extending or implementing.
- Cytoscape components:
CyNetwork extends GraphPerspective
CyNetworkView extends GraphView
CyNode implements cytoscape.giny.Node extends giny.model.node
CyAttributes - we are happy with this
- cytoscape.visual: Mapping, Visual Style, Calculators, Manager, etc.
- Filters API
How things are right now with delete stuff
- Cytoscape API: removeNode/removeEdge with a "permanent" argument
- Cytoscape API: addNode/addEdge with a "create" argument
Cytoscape API: createNode, but not add it to any CyNetwork. We like this.
GINY API: hide, not in GP(GraphPerspective) but still in RG (RootGraph)
- GINY API: restore, add it back to the GP
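The difference between hiding and permanently removing a node, as described above, can be sketched as follows. This is only an illustration: `SketchRootGraph` and `SketchPerspective` are hypothetical stand-ins for RootGraph and GraphPerspective, and their method signatures are assumptions, not the GINY or Cytoscape API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for RootGraph: the global pool of node indices.
class SketchRootGraph {
    private final Set<Integer> nodes = new HashSet<>();
    private int nextIndex = 1;

    // Create a node in the root graph only; it belongs to no perspective yet.
    int createNode() {
        int index = nextIndex++;
        nodes.add(index);
        return index;
    }

    boolean contains(int node) { return nodes.contains(node); }

    // Permanent removal: the node no longer exists anywhere.
    void remove(int node) { nodes.remove(node); }
}

// Hypothetical stand-in for GraphPerspective: a subset of the root graph.
class SketchPerspective {
    private final SketchRootGraph root;
    private final Set<Integer> members = new HashSet<>();

    SketchPerspective(SketchRootGraph root) { this.root = root; }

    // restore: add a node back to this perspective. Returns false if the node
    // was permanently removed from the root graph or is already a member.
    boolean restore(int node) {
        return root.contains(node) && members.add(node);
    }

    // hide: drop the node from this perspective only; the root graph keeps it.
    boolean hide(int node) { return members.remove(node); }

    // Remove locally and, if "permanent" is set, from the root graph as well.
    void remove(int node, boolean permanent) {
        members.remove(node);
        if (permanent) root.remove(node);
    }

    boolean contains(int node) { return members.contains(node) && root.contains(node); }
}
```

In this sketch a hidden node can always be restored, while a node removed with `permanent` set to true cannot.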
Meeting proposals about delete
- Proposal, Global remove: Cytoscape.remove()
Proposal, Local remove: CyNetwork.remove()
- Encourage coders to use Cytoscape methods, not GINY methods? CONTROVERSIAL...
- Do we hide GINY from coders?
(not related to delete) Proposal for a GraphObject class that Edge and Node extend. This would simplify the API by not having methods like this: removeNode, removeEdge, addNode, addEdge, etc. High priority for 2.3!
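A minimal sketch of the GraphObject idea, to show how a single add/remove pair could replace the per-type methods. All class names here (`GraphObject`, `SketchNode`, `SketchEdge`, `SketchNetwork`) are hypothetical; this is not the agreed design, and endpoint consistency checks are deliberately left out.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical common supertype for nodes and edges.
abstract class GraphObject {
    private final String id;

    GraphObject(String id) { this.id = id; }

    String getId() { return id; }
}

class SketchNode extends GraphObject {
    SketchNode(String id) { super(id); }
}

class SketchEdge extends GraphObject {
    private final SketchNode source;
    private final SketchNode target;

    SketchEdge(String id, SketchNode source, SketchNode target) {
        super(id);
        this.source = source;
        this.target = target;
    }

    SketchNode getSource() { return source; }

    SketchNode getTarget() { return target; }
}

// One add/remove pair covers both nodes and edges.
class SketchNetwork {
    private final Set<GraphObject> members = new LinkedHashSet<>();

    boolean add(GraphObject object) { return members.add(object); }

    boolean remove(GraphObject object) { return members.remove(object); }

    boolean contains(GraphObject object) { return members.contains(object); }

    int size() { return members.size(); }
}
```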
Final conclusion: OK, yes the OO design of CyNetwork is ugly, but, we are not going to fix it now. We are going to document.
Delete and attributes. Local and global attributes? Tag attributes into types (user loaded or computed)? It is clear that we want local network attributes. If a network is destroyed, then its local attributes would also be deleted. We could have a UI to separate attributes that users want to keep, and attributes that users do not want to keep (when nodes/edges/nets are deleted).
Conclusion on delete and attributes: Local (CyNetwork), global (Cytoscape). Seems to solve most of the problems in that you can choose whether attributes are local or global.
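One way to picture the local/global split is the sketch below. `ScopedAttributes` is a hypothetical class, not CyAttributes, and the rule that a local value shadows a global one on lookup is an assumption made for the example; the notes do not specify it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical attribute store with a global scope and one local scope per network.
class ScopedAttributes {
    // Key is the pair (graph object ID, attribute name).
    private final Map<List<String>, Object> global = new HashMap<>();
    private final Map<String, Map<List<String>, Object>> local = new HashMap<>();

    void setGlobal(String objectId, String name, Object value) {
        global.put(Arrays.asList(objectId, name), value);
    }

    void setLocal(String networkId, String objectId, String name, Object value) {
        local.computeIfAbsent(networkId, k -> new HashMap<>())
             .put(Arrays.asList(objectId, name), value);
    }

    // Assumption: a local value, if present, takes precedence over a global one.
    Object get(String networkId, String objectId, String name) {
        List<String> key = Arrays.asList(objectId, name);
        Map<List<String>, Object> scoped = local.get(networkId);
        if (scoped != null && scoped.containsKey(key)) {
            return scoped.get(key);
        }
        return global.get(key);
    }

    // Destroying a network deletes its local attributes; global ones survive.
    void destroyNetwork(String networkId) {
        local.remove(networkId);
    }
}
```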
Separate hairy issue: deleting and plugins. A plugin could delete attributes or graph objects that other plugins need. Plugins should never try to talk to each other directly; if they do, there is a lot of potential for bugs. We need Plugin "Good Citizen" guidelines. Plugins that do not follow these rules risk not being used. This is a documentation issue.
Concepts on flagging
Why does the view have select methods and the model has flag methods? This is because originally, all graphs had views. But in 2.1, this is not true. Some graphs may not have views. Graphs without views need to be able to have selected nodes/edges, hence, flag methods in CyNetwork.
- Flagging is a way by which plugins talk to each other. For example, filters flag nodes that pass a filter. Then, a separate tool can act on the selected nodes (like create subgraph).
- Clipboard could be helpful, but flagging is pretty much the same thing.
Final conclusion on selection vs. flag: only have "select" method at the CyNetwork level. Deprecate CyNetwork.flag and CyNetworkView.select.
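The conclusion above (selection state lives in the network model, so it works with or without a view) could look roughly like this. `SelectableNetwork` and its methods are hypothetical names chosen for the sketch, and nodes are reduced to String IDs to keep it short.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical network model that owns the selection state; no view is required.
class SelectableNetwork {
    private final Set<String> nodes = new LinkedHashSet<>();
    private final Set<String> selected = new LinkedHashSet<>();

    void addNode(String id) { nodes.add(id); }

    // A filter selects every node that passes; returns how many were newly selected.
    int select(Predicate<String> filter) {
        int count = 0;
        for (String node : nodes) {
            if (filter.test(node) && selected.add(node)) {
                count++;
            }
        }
        return count;
    }

    void deselectAll() { selected.clear(); }

    // A separate tool (for example "create subgraph") reads the selection.
    Set<String> getSelected() { return new LinkedHashSet<>(selected); }
}
```

A filter plugin would call `select(...)`, and an unrelated tool would later call `getSelected()`, which is the indirect communication between plugins that flagging provides today.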
- Improve conceptual clarity and better document the core and graph model - Is the root graph a multigraph or a graph? Is the root graph directed, undirected or mixed? How about graph perspective? Can nodes or edges be duplicated? etc.
HACKATHON NOTES: SEE METANODES NOTES.
- Discuss switching to a numbered node system instead of the current node ID as a string system
- This may be required for some graph editing use cases (having nodes with no name yet) and two nodes with the same name.
HACKATHON NOTES: Switch to number node IDs
Summary of final conclusions
- Use a unique String ID that is generated by Cytoscape (maybe Strings parsable as numbers).
- These IDs are NOT visible to users (for example in the attribute browser).
- CANONICAL_NAME, COMMON_NAME should go away, instead use:
- LABEL attribute.
- SIF reader has to ensure that there are no duplicate labels. SIF reuses a label if it can find one. We are going to write SIF files.
- Many implementation issues, the persons who deal with this will have to think about importing and exporting. Import: OK to use old attributes file. Export: Not expected to use old attribute file format.
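A rough sketch of the conclusions above: IDs are generated Strings that parse as numbers, labels need not be unique, and a SIF-style import reuses an existing node when it finds a matching label. `NodeIdRegistry` and its method names are hypothetical, and the "first node with that label wins" rule for reuse is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical registry that separates unique IDs from (non-unique) labels.
class NodeIdRegistry {
    private final AtomicLong counter = new AtomicLong();
    private final Map<String, String> labelById = new HashMap<>();
    private final Map<String, String> firstIdByLabel = new HashMap<>();

    // Always creates a new node, even if the label is already in use.
    String createNode(String label) {
        String id = Long.toString(counter.incrementAndGet());
        labelById.put(id, label);
        firstIdByLabel.putIfAbsent(label, id);
        return id;
    }

    // SIF-style import: reuse the first node carrying this label, else create one.
    String findOrCreate(String label) {
        String existing = firstIdByLabel.get(label);
        return existing != null ? existing : createNode(label);
    }

    String getLabel(String id) { return labelById.get(id); }
}
```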
Brainstorming ideas, not final conclusions!
- Problem: canonical name is being used as unique id, label, and DB key. We need to separate these concepts. Use numeric IDs (Strings)!
- BUT: Currently, node and edge indices are not persistent across Cytoscape sessions. If you save a SIF, this does not matter. For GML it might.
- BUT: Currently, node and edge indices are reused if nodes or edges are deleted. ???
- Proposal: use unique numerical ids for nodes, each node has a label, which is not necessarily unique. Separate discussion: get rid of CANONICAL_NAME, use LABEL. Numerical String IDs are used as keys for attributes. So they are available to programmers, but hidden to users.
- API proposal: Only use graph objects on methods. Not IDs. For example, a lot of methods look like: int [] getAdjacentEdges (int nodeIndex);
- Use case: nodes with the same label, but different molecule type. SIF cannot handle this. GML can. Maybe we would need to handle this for SIF.
Cytoscape uses attributes to set labels. It does not use CyNode.label. We should only store the label either in attributes or in CyNode.label. Which one should it be? Big discussion follows.
Currently, labels get stored in an attribute. There is some agreement that the label should be stored in CyNode.label. But this presents problems for the attribute browser, because users like to view the label in a column, and attributes can exist separately from the graph. To solve this, the attribute browser would display CyNode.label as a column. This means that the browser can only exist if there is a non-empty root graph (ugly?). In terms of usage, it means that you cannot view expression data that you loaded if there is no graph loaded.
- SIF. We need a new more informative SIF format. We still want to support the old SIF. The mechanism to read this old SIF should be the same as it is currently. If a GML is loaded first, and then a SIF, then nodes would not be duplicated (just as it is now!).
- Discussion of undo manager
HACKATHON NOTES: undo
Summary
- Allan described his undo data structure (a stack).
- Complications: multiple networks, global operations AND local operations.
- Proposal: Visual undo manager that shows operations so that users can select actions to undo. Have a global stack and a stack per network.
- Suggestion: only make certain actions undoable, not all. We need to find use cases.
- Currently undone: attribute browser cell editing, restore deleted nodes.
- Undo layouts???
We need requests about what actions should be undoable!!! (cytostaff list)
- add/remove nodes and/or edges
- edit attribute value
- destroy network
- destroy nodes/edges
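The "global stack plus one stack per network" proposal can be sketched as below. `UndoableAction` and `SketchUndoManager` are hypothetical names, not Allan's data structure, and the choice to keep global and local operations on separate stacks is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical unit of undoable work.
interface UndoableAction {
    String describe();
    void undo();
}

// Hypothetical manager with one global stack and one stack per network.
class SketchUndoManager {
    private final Deque<UndoableAction> global = new ArrayDeque<>();
    private final Map<String, Deque<UndoableAction>> perNetwork = new HashMap<>();

    // A null networkId records a global operation.
    void record(String networkId, UndoableAction action) {
        stackFor(networkId).push(action);
    }

    // Undo the most recent action in the given scope; false if there is none.
    boolean undoLast(String networkId) {
        Deque<UndoableAction> stack = stackFor(networkId);
        if (stack.isEmpty()) {
            return false;
        }
        stack.pop().undo();
        return true;
    }

    // Descriptions, most recent first, for a visual undo manager to display.
    List<String> history(String networkId) {
        List<String> descriptions = new ArrayList<>();
        for (UndoableAction action : stackFor(networkId)) {
            descriptions.add(action.describe());
        }
        return descriptions;
    }

    private Deque<UndoableAction> stackFor(String networkId) {
        if (networkId == null) {
            return global;
        }
        return perNetwork.computeIfAbsent(networkId, k -> new ArrayDeque<>());
    }
}
```

Only actions that are explicitly recorded become undoable, which matches the suggestion to support a chosen set of actions rather than everything.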
- Prioritize subsystems for refactoring PROPOSAL FOR RETREAT
- Discuss new documentation options e.g. wiki based, which can be translated to PDF and Java help. DISCUSS HT
HACKATHON NOTES: Documentation
- Wiki based
- Table of contents with links to chapters/sections
HTML -> PDF tools could be used.
- Seems like Wiki is enough for aesthetic purposes.
Mike suggested using DocBook, but it is not as easy to use as Wiki. On the other hand, DocBook has a lot of file type conversion capabilities, including to Java Doc type.
- It would be useful to have at the end of each page a discuss link that allows users to enter their own advice, documentation, comments, etc.
Ideal solution would be to convert Wiki to DocBook using a script. Googling resulted in several possibilities of converters.
- Plan core code cleanup - removing old libraries, old classes, clean up of package structure.
- How can we package code so it is easier for developers to load - we have a problem with core plugin and library code being in too many places.
- Clarify and document the difference between core and non-core and decide on policies for making future decisions on this.
HACKATHON NOTES: What is core, and what is plugin?
Action items:
We need to come up with a basic review process by which plugins become core plugins: not biological, peer reviewed (not buggy, stable), useful to users. End point of review process would be to decide where to include the new plugin/library. We will come up with a formal proposal.
- We need to have one single Java Doc that contains everything that coders need.
- Wiki that contains links to Java Docs and downloading sites of all libraries that Cytoscape uses.
We do not HAVE TO reorganize the core (right now). But, looking forward, we should try to keep things more organized.
General discussion
- We need to know where new code fits in: core, plugins, core plugins, library???
- Library vs. plugin. Plugins that are used as libraries should be very modular (API separate from the plugin class). Plugins that make use of these libraries, include the library jar in their paths.
- lib/ : what should it include?
- what is core???
- Most of the group agrees that all source that we own should be included in the core. ant can have separate targets for each component. But, then anyone can change code that someone else wrote.
- (SIDENOTE) Have a controlled vocabulary area for bio-semantics (for example, PROTEIN, DNA, etc). This should probably be a "semantics plugin".
Review community development process and core coding conventions, etc -> goal is to increase quality of our codebase and application. ETHAN RELATED, RETREAT
- Discuss user interface standardization issues (Benno brought this up 2 years ago).
HACKATHON NOTES:
- All agree that this is a good goal. No time to do it right now. We have bigger problems.
MetaNodes
Discussion of how to handle "compound" node types -- e.g. complexes, families (sets) (related to MetaNodes above...) (Note: I'd like to make this a high priority for discussion. Alex Pico of the GenMAPP project will be participating in the Hackathon and can go into considerable depth on the requirements for these constructs. It would be very useful to have this discussion while we have the opportunity to pick Alex's brain).
HACKATHON NOTES: Metanodes and hyperedges.
Alex Pico could not get on a plane, so we had the discussion without him.
Summary
- We agreed that we are no longer dealing with a 'traditional' graph (directed or undirected, one level). We have mixed types of edges (directed, undirected) and metanodes.
- New terminology for our complex graphs (meta-mixed-networks?):
- multi-edges between nodes
- mixed edge direction
- metanodes
- Work group should do some research to see what's out there, what has been done, etc.
Have 'renderers'/'modelers'/'converters'/'mappers' (we did not agree on a term) that depict meta-networks in different ways. Each interpretation of a meta-network would be a GraphPerspective. The GraphPerspectives need to be connected to each other so that they correctly reflect changes to the model. This plugin/code would be a layer on top of Nerius' renderer.
- This is a very BIG job. We need to find use cases and take care of those.
Need a work group!!!! Show of hands: Allan, Iliana and Melissa want to get really involved in this. Gary wants to review progress and have some input.
- Discuss new rendering engine that Nerius is building
HACKATHON NOTES: New rendering
Summary
- Nerius showed us his new rendering tool. There was general agreement that it was impressive and fast.
- His API only contains static methods.
- The static render method takes objects that specify node and edge details (positions, colors, etc). This facilitates communication to databases that contain network information.
- Nerius says that we will use this new renderer for 2.3! Cool.
- Simplifying file formats and implementing the save session feature. Does .sif need to have two ways to specify interactions? Do we need GML if we have a save session file?
HACKATHON NOTES: File formats
Current types of file formats in Cytoscape:
- sif, network, i/o
- gml, network, i/o
- noa, attr, i/o
- eda, attr, i/o
- expression, TP, i/o
.onto, .anno, .syno, .obo (BioDataServer)
- GO .onto, .anno, .syno, (Kei)
Issues on each file format:
- expression file: should be a noa file
- GO: information gets loaded into runtime memory, which is not reasonable for big species (different topic: databases)
- GML: We are not respecting the format. Node integer IDs are ignored.
Action items:
- Test Rowan's file format for attributes
Other items under general heading: Questions/Issues/TODO on saving state below
Related topic:
- State saving
- Kei is working on this for 2.3. Summary of his strategy:
- zip contains all needed files
- network files: XGMML
- cysession.xml: tree structure of loaded nets, cyto-panels state
- vizmap.props
- cytoscape.props
- Uses JAXB (Sun library for serializing)
- schema files are used to define contents of XGMML and XML files: XGMML.xsd, cysession.xsd
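As an illustration of the "one zip that contains all needed files" strategy, here is a small sketch using the standard `java.util.zip` classes. `SessionZipSketch` is a hypothetical helper, not Kei's implementation; it works on in-memory byte arrays so that it stays self-contained, and the entry names passed in are up to the caller.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper that packs session files into a single zip archive.
class SessionZipSketch {

    // Write one zip entry per (file name, contents) pair.
    static byte[] pack(Map<String, byte[]> files) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                zip.write(file.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // List the entry names in an archive, in the order they were written.
    static List<String> listEntries(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

A caller would pass entries such as `cysession.xml`, `vizmap.props`, `cytoscape.props` and one XGMML file per network; the XGMML file name is whatever the session writer chooses.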
Questions/Issues/TODO on saving state
- how do we package? zip? Group agreed to use zip.
- save sif files or XML files? Maybe an exporter between the two formats would be a good solution.
- Package all files inside a directory that then gets zipped (so that when it gets unzipped, all files are within a single directory)
- Time-stamp the zip
- Naming issues: .cys or .cyto file extension
Plugins should be able to save their state. A listener for "state saving" would be a good strategy. Plugin coders would be responsible for saving their own state (writing and reading). Plugin API could have two methods to write and read. Details are somewhat complicated, but we all agree that it is doable and needed. This is not planned for 2.3, but we can discuss it.
- Optional interface to select files that should be included in the zip? Maybe just add option to delete attributes or other types of data that user does not want in the state. This probably belongs to the deletion item in this list.
- Other ideas raised during the meeting: a filter used to specify what data to save. The default filter saves everything.
Avoid duplication: only save the RootGraph, and a separate file would list for each GraphPerspective what nodes and edges it contains?
Big question: Should the RootGraph be saved, with a separate file listing which nodes/edges are in each GraphPerspective? Or do we only save GraphPerspectives? This is related to WHAT DELETE MEANS!!! So maybe we leave this point for later. Use cases that need the RootGraph saved: hyperedges (or meta-nodes), and undo.
Other big questions: Should the RootGraph be visible to users?
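The plugin state-saving idea from the list above (a "state saving" listener, with each plugin writing and reading its own state through two methods) might be sketched as follows. Nothing here is a planned API: `PluginStateHandler`, `SessionStateBroker` and `LayoutPrefsPlugin` are hypothetical names, and state is modelled as a byte array per plugin purely for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical contract: each plugin writes and reads its own state.
interface PluginStateHandler {
    String pluginName();
    byte[] writeState();
    void readState(byte[] saved);
}

// Hypothetical broker that notifies registered plugins on save and restore.
class SessionStateBroker {
    private final Map<String, PluginStateHandler> handlers = new LinkedHashMap<>();

    void register(PluginStateHandler handler) {
        handlers.put(handler.pluginName(), handler);
    }

    // On "save session": collect one blob per plugin, keyed by plugin name.
    Map<String, byte[]> saveAll() {
        Map<String, byte[]> saved = new LinkedHashMap<>();
        for (PluginStateHandler handler : handlers.values()) {
            saved.put(handler.pluginName(), handler.writeState());
        }
        return saved;
    }

    // On "load session": hand each plugin its own blob, if one was saved.
    void restoreAll(Map<String, byte[]> saved) {
        for (PluginStateHandler handler : handlers.values()) {
            byte[] blob = saved.get(handler.pluginName());
            if (blob != null) {
                handler.readState(blob);
            }
        }
    }
}

// Example plugin whose whole state is a single preference string.
class LayoutPrefsPlugin implements PluginStateHandler {
    String lastLayout = "grid";

    public String pluginName() { return "layout-prefs"; }

    public byte[] writeState() { return lastLayout.getBytes(StandardCharsets.UTF_8); }

    public void readState(byte[] saved) { lastLayout = new String(saved, StandardCharsets.UTF_8); }
}
```

Each blob returned by `saveAll()` could then be stored as one more file in the session zip.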
- Do we need to standardize context menus for nodes and edges or is the current model fine? It seems to be working fine, but will it scale? One reason for standardizing in the core is that the current model uses Piccolo and that may be replaced.
HACKATHON NOTES: context menus
- Current code and API need to be cleaned up
- Does not affect Nerius's rendering library.
- We need a public API for context menus.
- For plugins, we need a menu API.
- Cytoscape and databases. For example, GO structure is too big, and should not be loaded into runtime memory. Would be nice to have a database API.
HACKATHON NOTES: Databases
- We do not intend to come up with a general solution. Each data source is different:
- GO
- Networks: to fully use Nerius's renderer
- Attributes
- Expression data
EVERYONE: When implementing new plugins or features, take this new direction into account. Ask yourself: How would this new feature fit in with a database strategy???
- ID mapper services. Maybe includes cross-species services. Hard bioinformatics problem. Very much needed.
HACKATHON NOTES
- Should Cytoscape deal with identifiers?
- Use cases:
- 1. Convert one id to another (for example, id for a nucleotide sequence to amino acid sequence)
- 2. Given an id tell me everything you know about that id
- 3. Semantically equivalent ids
- 4. Gene name lookup
Gary: could convert CPATH API to Cytoscape. Architecture is biological semantics free. This would replace BioDataServer.
- Could be in 2.3.
- Event model. A couple of participants are interested in reviewing the existing event model. This part of Cytoscape needs refactoring and a new architecture. GROUP PROJECT, PROPOSAL BY PARTICIPANTS.
- Please add your discussion points here.