Java HDF Viewer Spreadsheet Package Requirements
(revised 7/8/98)
Robert E. McGrath (mcgrath@ncsa.uiuc.edu)
Introduction
The Java HDF Viewer provides a "spreadsheet" feature for viewing and manipulating
arrays and tables of numbers. The goal of this subproject is to
reimplement the spreadsheet, keeping the current features, adding additional
features, and creating a more general, flexible, and extensible package.
This document describes the desired spreadsheet features and other technical
requirements.
1. General Requirements
The fundamental goal of this work is to create a Java package that replaces
the current "spreadsheet" in the JHV. This package should meet the
following technical requirements:
- 
The spreadsheet must be a separate package, which can be compiled independently
of the rest of the HDF Java products
- 
The spreadsheet should not directly access HDF interfaces or files
- 
The spreadsheet should not use classes from other JHV packages without
a prior design review of that dependency
- 
The spreadsheet must support similar features across different types of
HDF objects, including at least:
- 
1, 2 and 3 dimensional arrays of numbers (e.g., from HDF SDS)
- 
8-bit and 24-bit images (same look and feel as 2D SDS)
- 
tables of numbers, arrays, and strings (e.g., from HDF Vdata)
- 
The spreadsheet must be able to support data from formats other than HDF
- 
The new spreadsheet must support all the user visible features of the current
JHV spreadsheet (see section 2 below for a more detailed
discussion of this requirement)
- 
The spreadsheet will fix all outstanding bugs reported against the current
spreadsheet (see section 1.2 below for a list of
specific bugs)
- 
The spreadsheet must not require that the entire data object (array, table,
etc.) reside in memory. (See section 3 below
for a more detailed discussion of this requirement)
- 
Large objects should be "paged" in a generic way
- 
It should be possible to navigate and scroll through the data efficiently
- 
The spreadsheet should provide a general and flexible package that can
be easily extended.
- 
cells, rows, and columns should be objects that can be manipulated
- 
The spreadsheet must support
creation and modification of data. (See section
4 below for a more detailed discussion of this requirement.)
- 
The spreadsheet must be able to function as an applet in a Java-enabled
Web browser
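One way to read the decoupling requirements above (no direct access to HDF interfaces, support for formats other than HDF) is that the spreadsheet sees its data only through a small interface. The following is a minimal sketch under that assumption, not a proposed design; the names TableSource and ArrayTableSource are hypothetical and do not exist in the JHV.

```java
// Hypothetical interface: the spreadsheet's only view of its data.
// Nothing here refers to HDF; an HDF adapter would live in another package.
interface TableSource {
    int rowCount();
    int columnCount();
    String columnLabel(int column);
    // Returns rows [firstRow, firstRow + count) as Objects, so that
    // numbers, strings, and arrays can all appear in the same table.
    Object[][] readRows(int firstRow, int count);
}

// A trivial in-memory implementation, standing in for a non-HDF format.
class ArrayTableSource implements TableSource {
    private final String[] labels;
    private final Object[][] rows;

    ArrayTableSource(String[] labels, Object[][] rows) {
        this.labels = labels;
        this.rows = rows;
    }

    public int rowCount() { return rows.length; }
    public int columnCount() { return labels.length; }
    public String columnLabel(int column) { return labels[column]; }

    public Object[][] readRows(int firstRow, int count) {
        if (firstRow < 0 || count < 0 || firstRow + count > rows.length) {
            throw new IndexOutOfBoundsException("rows " + firstRow + "+" + count);
        }
        Object[][] out = new Object[count][];
        for (int i = 0; i < count; i++) {
            out[i] = rows[firstRow + i].clone(); // copy so callers cannot alter the source
        }
        return out;
    }
}
```

Because readRows takes a starting row and a count, the same interface also fits the requirement that the whole object need not reside in memory.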
1.1 Optional Requirements
The spreadsheet package may ideally, but is not required to:
- 
support the full range of user visible features seen in commercial "spreadsheets",
such as complex dependencies, formulae, and report generation
1.2 Bugs and Enhancements Addressed
The spreadsheet must (attempt to) close the following bugs and enhancement
requests, and other bugs which may be discovered:
- 
Bug 202
- 
Bug 206
- 
Bug 207
- 
Bug 208
The spreadsheet must address memory usage and performance problems as they
apply to the spreadsheet, including but not limited to:
- 
Bug 203
- 
Bug 218
The spreadsheet may address some of the general display issues, including
but not limited to:
- 
Bug 204
- 
Bug 210
- 
Bug 211
- 
Bug 212
2. User visible features
Overall, the new spreadsheet package is required to support all the user
features currently provided by the JHV spreadsheet. Since the redesigned
spreadsheet will have a completely new (and better) look and feel, some
old features may not make sense in the new spreadsheet. Therefore, the
general requirement will be applied with judgment.
The spreadsheet package should support the following user visible
features:
- 
viewing the data of
- 
an array or
- 
a table,
with appropriate numerical formats:
- 
 integer 32-, 16-bit
- 
 float
- 
 character, string (in tables)
- 
 arrays (in tables)
- 
display 2-D arrays and tables
- 
display higher dimensional structures as 2-D planes
- 
sliders, arrows, and menus to navigate higher dimensional data
- 
selecting cells, blocks of cells, columns, rows, and ranges of columns
and rows, and passing the selected data to other functions. Examples
of the receiving functions might be:
- 
save selected data as a new HDF or ASCII (HAIF)
file
- 
create GIF image file from selected data
- 
create X-Y plots from selected columns of a table
- 
display of row and column labels and axes labels in appropriate formats
- 
toggle between index and scale values
- 
select row or column label to select whole row or column
- 
display summary information for each dimension (e.g., the maximum value)
- 
adjustable display, with stretchable columns, so that data values can be
shown in a user-selected area of the screen
- 
editing of data and label values in the spreadsheet (See section 4 below)
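As an illustration of displaying higher dimensional structures as 2-D planes, the sketch below copies one plane out of a 3-D array. It assumes the array is stored flat in row-major order with dimensions [depth][rows][cols]; the class and method names are hypothetical. A slider or arrow control would simply change the plane index k.

```java
class PlaneSlicer {
    // Copies plane k (index along the first dimension) of a row-major
    // [depth][rows][cols] array that is stored flat in `data`.
    static float[][] plane(float[] data, int depth, int rows, int cols, int k) {
        if (data.length != depth * rows * cols) {
            throw new IllegalArgumentException("size mismatch");
        }
        if (k < 0 || k >= depth) {
            throw new IndexOutOfBoundsException("plane " + k);
        }
        float[][] out = new float[rows][cols];
        int base = k * rows * cols; // offset of the first element of plane k
        for (int r = 0; r < rows; r++) {
            // Each row of the plane is contiguous in row-major storage.
            System.arraycopy(data, base + r * cols, out[r], 0, cols);
        }
        return out;
    }
}
```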
3. Handling Large Data Objects
One of the most important requirements for the new spreadsheet package
is that it must efficiently support very large data objects, specifically,
tables and arrays which cannot fit completely into memory.
The main problem to solve is how to manage the memory used in a way
that is transparent to the user, and works the same way across different
platforms and versions of Java.
It should be noted that our software must not only manage the memory
needed for data, we must also be careful about the data structures of our
program. E.g., if the objects that manage the spreadsheet take up many
times the memory of the data itself (a very realistic possibility), this
will greatly exacerbate the memory limitations.
The spreadsheet should be designed to create data structures in memory
only when needed, and to request data only when needed. The spreadsheet
should assume the existence of a general "demand paging" or buffer scheme,
which will supply data from disk (e.g., parts of an HDF file) on demand.
It should be recognized that the spreadsheet package itself should not
be responsible for implementing such a "paging" scheme. This should
be done for the whole JHV package; indeed, it should probably be part
of the JHI low level interface itself.
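To make the assumed "demand paging" scheme concrete, here is a minimal sketch of a buffer that keeps a bounded number of pages in memory and evicts the least recently used one. It is an illustration only, and it would live outside the spreadsheet package as described above. PageLoader and PagedRows are hypothetical names; the fixed page size, the double[][] page layout, and the least-recently-used policy are all assumptions, not part of the requirements.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical loader: fetches one fixed-size page of rows from backing storage.
interface PageLoader {
    double[][] loadPage(int pageIndex);
}

class PagedRows {
    private final PageLoader loader;
    private final int rowsPerPage;
    private final Map<Integer, double[][]> cache;
    int loads = 0; // number of loader calls, exposed only for illustration

    PagedRows(PageLoader loader, int rowsPerPage, final int maxPages) {
        this.loader = loader;
        this.rowsPerPage = rowsPerPage;
        // accessOrder = true keeps entries ordered least recently used first.
        this.cache = new LinkedHashMap<Integer, double[][]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, double[][]> eldest) {
                return size() > maxPages; // evict once the page budget is exceeded
            }
        };
    }

    // Returns one row, loading its page only if it is not already buffered.
    double[] row(int rowIndex) {
        int page = rowIndex / rowsPerPage;
        double[][] rows = cache.get(page);
        if (rows == null) {
            rows = loader.loadPage(page);
            loads++;
            cache.put(page, rows);
        }
        return rows[rowIndex % rowsPerPage];
    }
}
```

With such a buffer, the spreadsheet would request only the rows it is about to display, and scrolling within an already loaded page would cost no additional reads.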
4. Editing Data
"Modification of data" is one of the most requested
features from JHV users. This takes two forms:
- 
Ability to modify single values by hand. For
instance, modify a few parameter values in a Vdata.
- 
Ability to perform element-by-element arithmetic
operations on equal-sized objects of the same type. E.g. given three 100x200
arrays X, Y and Z, perform the operation W = X+Y - Z.
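For reference, the element-by-element operation W = X+Y - Z can be sketched as follows. This is an illustration with a hypothetical class name, using in-memory double arrays; a real implementation would also have to work with paged data and other number types.

```java
class ElementWise {
    // Computes w[i][j] = x[i][j] + y[i][j] - z[i][j] for equal-sized arrays.
    static double[][] addSub(double[][] x, double[][] y, double[][] z) {
        if (y.length != x.length || z.length != x.length) {
            throw new IllegalArgumentException("row counts differ");
        }
        double[][] w = new double[x.length][];
        for (int i = 0; i < x.length; i++) {
            int n = x[i].length;
            if (y[i].length != n || z[i].length != n) {
                throw new IllegalArgumentException("column counts differ in row " + i);
            }
            w[i] = new double[n];
            for (int j = 0; j < n; j++) {
                w[i][j] = x[i][j] + y[i][j] - z[i][j];
            }
        }
        return w;
    }
}
```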
The first form is a lot easier than the second, and is
probably more often requested. The second was supported in earlier NCSA tools.
I'd vote strongly for supporting the first.
Implementation of this apparently
simple feature has several components:
- 
a GUI interface which allows selection
and typing of new values
- 
"active" components which capture and
retain new values
- 
a mechanism for permanently storing
the changed values. Here there are numerous variants of vastly different
implications for implementation, for example:
- 
changes only affect the screen, and
cannot be saved permanently
- 
changed data can be saved to a new
file, no update to existing files allowed
- 
changed data is updated in the original
file
- 
a model of the semantics of user interaction,
i.e., what items can be changed, what are the semantics of "save", what
are the semantics of "undo", what kinds of insertion and deletion are allowed,
etc.
In addition, editing data in a spreadsheet
should be considered in the context of an overall "editing" capability
for the JHV. This would include consistent editing features across
other types of objects and of metadata. A list of features to consider
might include:
- 
manipulating image data
- 
editing and/or ingesting palettes
- 
editing annotations, attributes, labels,
scales, and so on
- 
converting data
- 
editing the structure of the file
- 
deleting whole objects
- 
inserting whole objects from external
data
- 
creating an "empty" object, and ingesting
external data into it
- 
manipulating Vgroups
It would not be wise for the spreadsheet
to implement too many editing features until a coherent overall plan for
"editing" features is considered.
For this reason, the key requirement
for the spreadsheet is to provide the GUI which allows data to be changed,
and retains the changes. The spreadsheet must be able to "write out"
all data modified since the last write. The mechanism for saving
data, and the semantics of when data is saved and how, are beyond the scope
of the spreadsheet.
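The requirement to "write out" all data modified since the last write could be met by tracking changed cells, as in the sketch below. It is a minimal illustration that assumes a rectangular grid of doubles; EditableGrid and CellWriter are hypothetical names. The writer is supplied by the caller, so the mechanism and semantics of saving stay outside the spreadsheet.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sink: whoever owns the "save" semantics implements this.
interface CellWriter {
    void write(int row, int column, double value);
}

class EditableGrid {
    private final double[][] values;
    private final int columns;
    // Cells changed since the last write, keyed by row * columns + column.
    private final Map<Integer, Double> pending = new TreeMap<>();

    EditableGrid(double[][] initial) {
        this.columns = initial.length == 0 ? 0 : initial[0].length;
        this.values = new double[initial.length][];
        for (int i = 0; i < initial.length; i++) {
            this.values[i] = initial[i].clone();
        }
    }

    double get(int row, int column) { return values[row][column]; }

    // Retains the new value and remembers that the cell is modified.
    void set(int row, int column, double value) {
        values[row][column] = value;
        pending.put(row * columns + column, value);
    }

    // Hands every modified cell to the writer, then clears the list.
    // Returns the number of cells written.
    int writeModified(CellWriter writer) {
        int n = pending.size();
        for (Map.Entry<Integer, Double> e : pending.entrySet()) {
            writer.write(e.getKey() / columns, e.getKey() % columns, e.getValue());
        }
        pending.clear();
        return n;
    }
}
```

A cell edited several times before a write is reported once, with its latest value.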
NCSA HDF Java Team 7/8/98