Java HDF Viewer Spreadsheet Package Requirements
(revised 7/8/98)

Robert E. McGrath (mcgrath@ncsa.uiuc.edu)

Introduction

The Java HDF Viewer provides a "spreadsheet" feature for viewing and manipulating arrays and tables of numbers. The goal of this subproject is to reimplement the spreadsheet, keeping the current features, adding new features, and creating a more general, flexible, and extensible package. This document describes the desired spreadsheet features and other technical requirements.

1. General Requirements

The fundamental goal of this work is to create a Java package that replaces the current "spreadsheet" in the JHV. This package should meet the following technical requirements:
  1. The spreadsheet must be a separate package, which can be compiled independently of the rest of the HDF Java products
    1. The spreadsheet should not directly access HDF interfaces or files
    2. The spreadsheet should not use classes from other JHV packages without a prior design review of that dependency
  2. The spreadsheet must support similar features across different types of HDF objects, including at least:
    1. 1, 2 and 3 dimensional arrays of numbers (e.g., from HDF SDS)
    2. 8-bit and 24-bit images (same look and feel as 2D SDS)
    3. tables of numbers, arrays, and strings (e.g., from HDF Vdata)
  3. The spreadsheet must be able to support data from formats other than HDF
  4. The new spreadsheet must support all the user visible features of the current JHV spreadsheet (see section 2 below for a more detailed discussion of this requirement)
  5. The spreadsheet will fix all outstanding bugs reported against the current spreadsheet (see section 1.2 below for a list of specific bugs)
  6. The spreadsheet must not require that the entire data object (array, table, etc.) reside in memory. (See section 3 below for a more detailed discussion of this requirement)
    1. Large objects should be "paged" in a generic way
    2. It should be possible to navigate and scroll through the data efficiently
  7. The spreadsheet should provide a general and flexible package that can be easily extended.
    1. cells, rows, and columns should be objects that can be manipulated
  8. The spreadsheet must support creation and modification of data. (See section 4 below for a more detailed discussion of this requirement.)
  9. The spreadsheet must be able to function as an applet in a Java-enabled Web browser
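
As an illustration of requirements 1 and 3 (no direct access to HDF, support for other formats), the spreadsheet could read its data through a small format-neutral interface, with one adapter per data format. The following is only a sketch: the names GridSource and ArrayGridSource are hypothetical and are not part of the JHV or of any HDF library.

```java
// Hypothetical names throughout; nothing here is an existing JHV or HDF API.

/** A format-neutral, read-only view of a 2-D grid of cell values. */
interface GridSource {
    int rowCount();
    int columnCount();
    /** Returns the value at the given cell (a Number, a String, etc.). */
    Object valueAt(int row, int column);
}

/** Example adapter: exposes an in-memory double[][] through GridSource. */
final class ArrayGridSource implements GridSource {
    private final double[][] data;

    ArrayGridSource(double[][] data) {
        this.data = data;
    }

    public int rowCount() {
        return data.length;
    }

    public int columnCount() {
        return data.length == 0 ? 0 : data[0].length;
    }

    public Object valueAt(int row, int column) {
        return Double.valueOf(data[row][column]);
    }
}
```

An adapter for HDF SDS or Vdata objects would live outside the spreadsheet package and implement the same interface, so the spreadsheet itself would compile without any HDF classes.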

1.1 Optional Requirements

The spreadsheet package may, but is not required to:

  1. support the full range of user visible features seen in commercial "spreadsheets", such as complex dependencies, formulae, and report generation

1.2 Bugs and Enhancements Addressed

The spreadsheet must (attempt to) close the following bugs and enhancement requests, and other bugs which may be discovered:

  1. Bug 202
  2. Bug 206
  3. Bug 207
  4. Bug 208
The spreadsheet must address the memory usage and performance problems that apply to it, including but not limited to:
  1. Bug 203
  2. Bug 218
The spreadsheet may address some of the general display issues, including but not limited to:
  1. Bug 204
  2. Bug 210
  3. Bug 211
  4. Bug 212

2. User visible features

Overall, the new spreadsheet package is required to support all the user features currently provided by the JHV spreadsheet. Since the redesigned spreadsheet will have a completely new (and better) look and feel, some old features may not make sense in the new spreadsheet. Therefore, the general requirement will be applied with judgment.

The spreadsheet package should support the following user visible features:

  1. viewing the data of an array or a table, with appropriate numerical formats:
    1. integer (32-bit and 16-bit)
    2. float
    3. character, string (in tables)
    4. arrays (in tables)
  2. display 2-D arrays and tables
    1. display higher dimensional structures as 2-D planes
    2. sliders, arrows, and menus to navigate higher dimensional data
  3. selecting cells, blocks of cells, columns, rows, and ranges of columns and rows, and passing the selected data to other functions. Examples of the receiving functions might be:
    1. save selected data as a new HDF or ASCII (HAIF) file
    2. create GIF image file from selected data
    3. create X-Y plots from selected columns of a table
  4. display of row and column labels and axes labels in appropriate formats
    1. toggle between index and scale values
    2. select row or column label to select whole row or column
    3. display summary information for each dimension, e.g., max, etc.
  5. adjustable display, with stretchable columns, so that data values fit a user-selected area of the screen
  6. editing of data and label values in the spreadsheet (See section 4 below)
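
Feature 2 (showing higher dimensional data as 2-D planes) can be sketched as an index calculation over a flat buffer. The example below assumes a 3-D array stored in row-major order as planes x rows x columns; that layout, and the class name PlaneView, are assumptions made for illustration only.

```java
/**
 * Hypothetical sketch: views one 2-D plane of a 3-D array at a time.
 * Assumes the data is a flat, row-major buffer of planes * rows * cols values.
 */
final class PlaneView {
    private final double[] data;
    private final int planes;
    private final int rows;
    private final int cols;
    private int currentPlane = 0;

    PlaneView(double[] data, int planes, int rows, int cols) {
        if (planes < 1 || rows < 1 || cols < 1
                || data.length != (long) planes * rows * cols) {
            throw new IllegalArgumentException("buffer length does not match dimensions");
        }
        this.data = data;
        this.planes = planes;
        this.rows = rows;
        this.cols = cols;
    }

    /** Called by a slider, arrow, or menu to move to another plane. */
    void selectPlane(int plane) {
        if (plane < 0 || plane >= planes) {
            throw new IndexOutOfBoundsException("plane " + plane);
        }
        currentPlane = plane;
    }

    int rowCount() {
        return rows;
    }

    int columnCount() {
        return cols;
    }

    /** Returns the value at (row, col) of the currently selected plane. */
    double valueAt(int row, int col) {
        if (row < 0 || row >= rows || col < 0 || col >= cols) {
            throw new IndexOutOfBoundsException("cell " + row + "," + col);
        }
        return data[(currentPlane * rows + row) * cols + col];
    }
}
```

Because the view only computes offsets, changing planes does not copy any data; the displayed grid simply reads from a different region of the buffer.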

3. Handling Large Data Objects

One of the most important requirements for the new spreadsheet package is that it must efficiently support very large data objects, specifically, tables and arrays which cannot fit completely into memory.

The main problem to solve is how to manage the memory used in a way that is transparent to the user, and works the same way across different platforms and versions of Java.

Note that our software must manage not only the memory needed for data, but also the memory used by the program's own data structures. E.g., if the objects that manage the spreadsheet take up many times the memory of the data itself (a very realistic possibility), this will greatly exacerbate the memory limitations.

The spreadsheet should be designed to create data structures in memory only when needed, and to request data only when needed. The spreadsheet should assume the existence of a general "demand paging" or buffer scheme, which will supply data from disk (e.g., parts of an HDF file) on demand. The spreadsheet package itself should not be responsible for implementing such a "paging" scheme. This should be done for the whole JHV package; indeed, it should probably be part of the JHI low level interface itself.
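
To make the idea of "demand paging" concrete, here is a minimal sketch of a buffer that loads fixed-size pages on first access and discards the least recently used page when a limit is reached. It is not a proposed design: PageLoader and PagedColumn are hypothetical names, and the loader stands in for whatever lower layer actually reads the file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical callback that reads one page of values from storage. */
interface PageLoader {
    double[] loadPage(int pageIndex);
}

/** Hypothetical sketch: a 1-D sequence of values backed by a small LRU page cache. */
final class PagedColumn {
    private final PageLoader loader;
    private final int pageSize;
    private final LinkedHashMap<Integer, double[]> cache;
    private int loads = 0;

    PagedColumn(PageLoader loader, int pageSize, final int maxPages) {
        this.loader = loader;
        this.pageSize = pageSize;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<Integer, double[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, double[]> eldest) {
                return size() > maxPages; // evict the least recently used page
            }
        };
    }

    /** Returns the value at the given index, loading its page only if needed. */
    double get(long index) {
        int page = (int) (index / pageSize);
        double[] values = cache.get(page);
        if (values == null) {
            values = loader.loadPage(page);
            loads++;
            cache.put(page, values);
        }
        return values[(int) (index % pageSize)];
    }

    /** Number of pages read from the loader so far. */
    int loadCount() {
        return loads;
    }
}
```

With this shape, the memory held for data is bounded by maxPages * pageSize values regardless of the size of the underlying object, and scrolling within an already loaded page costs no further reads.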

4. Editing Data

"Modification of data" is one of the most requested features from JHV users. This takes two forms:
  1. Ability to modify single values by hand. For instance, modify a few parameter values in a Vdata.
  2. Ability to perform element-by-element arithmetic operations on equal-sized objects of the same type. E.g., given three 100x200 arrays X, Y and Z, perform the operation W = X + Y - Z.
#1 is a lot easier than #2, and is probably more often requested. #2 was supported in earlier NCSA tools. I'd vote strongly for supporting #1.
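
For reference, the second form amounts to the following for 2-D arrays of doubles. This is only a sketch of the operation itself, with a hypothetical class name; it ignores paging, other number types, and the user interface.

```java
/** Hypothetical sketch of element-by-element arithmetic on equal-sized 2-D arrays. */
final class ElementWise {
    private ElementWise() {
    }

    /** Returns W where W[r][c] = X[r][c] + Y[r][c] - Z[r][c]. */
    static double[][] addSubtract(double[][] x, double[][] y, double[][] z) {
        if (y.length != x.length || z.length != x.length) {
            throw new IllegalArgumentException("row counts differ");
        }
        double[][] w = new double[x.length][];
        for (int r = 0; r < x.length; r++) {
            int n = x[r].length;
            if (y[r].length != n || z[r].length != n) {
                throw new IllegalArgumentException("lengths of row " + r + " differ");
            }
            w[r] = new double[n];
            for (int c = 0; c < n; c++) {
                w[r][c] = x[r][c] + y[r][c] - z[r][c];
            }
        }
        return w;
    }
}
```

Even in this simple form, the operation needs all three operands and the result at once, which is part of why #2 interacts with the large-object requirements of section 3 more than #1 does.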

Implementation of this apparently simple feature has several components:

  1. a GUI which allows selection and typing of new values
  2. "active" components which capture and retain new values
  3. a mechanism for permanently storing the changed values. Here there are numerous variants with vastly different implications for implementation, for example:
    1. changes only affect the screen, and cannot be saved permanently
    2. changed data can be saved to a new file, no update to existing files allowed
    3. changed data is updated in the original file
  4. a model of the semantics of user interaction, i.e., what items can be changed, what are the semantics of "save", what are the semantics of "undo", what kinds of insertion and deletion are allowed, etc.
In addition, editing data in a spreadsheet should be considered in the context of an overall "editing" capability for the JHV. This would include consistent editing features across other types of objects and of metadata. A list of features to consider might include:
  1. manipulating image data
  2. editing and/or ingesting palettes
  3. editing annotations, attributes, labels, scales, and so on
  4. converting data
  5. editing the structure of the file
    1. deleting whole objects
    2. inserting whole objects from external data
    3. creating an "empty" object, and ingesting external data into it
    4. manipulating Vgroups
It would not be wise for the spreadsheet to implement too many editing features until a coherent overall plan for "editing" features is considered.

For this reason, the key requirement for the spreadsheet is to provide the GUI which allows data to be changed, and retains the changes. The spreadsheet must be able to "write out" all data modified since the last write. The mechanism for saving data, and the semantics of when data is saved and how, are beyond the scope of the spreadsheet.
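
One way to meet this requirement is for the spreadsheet to keep an in-memory record of edited cells and hand them to an externally supplied writer on request. The sketch below assumes numeric cells addressed by non-negative row and column indices; EditBuffer and CellWriter are hypothetical names, and what the writer does with the values (new file, update in place, nothing) is deliberately left outside the spreadsheet.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sink supplied by the caller; decides how and where values are saved. */
interface CellWriter {
    void write(int row, int column, double value);
}

/** Hypothetical sketch: retains edits made since the last write. */
final class EditBuffer {
    // Key packs (row, column) into one long; value is the latest edited value.
    private final Map<Long, Double> pending = new LinkedHashMap<Long, Double>();

    /** Records a new value for a cell, replacing any earlier unsaved edit. */
    void set(int row, int column, double value) {
        pending.put(key(row, column), value);
    }

    /** Returns the unsaved value for a cell, or null if the cell is unmodified. */
    Double pendingValue(int row, int column) {
        return pending.get(key(row, column));
    }

    /** Writes out every cell modified since the last flush; returns how many. */
    int flush(CellWriter writer) {
        int count = pending.size();
        for (Map.Entry<Long, Double> e : pending.entrySet()) {
            long k = e.getKey();
            writer.write((int) (k >>> 32), (int) k, e.getValue());
        }
        pending.clear();
        return count;
    }

    // Assumes row and column are non-negative.
    private static long key(int row, int column) {
        return ((long) row << 32) | (column & 0xFFFFFFFFL);
    }
}
```

Keeping only the modified cells, rather than a modified copy of the whole object, is also consistent with the memory requirements of section 3.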



NCSA HDF Java Team 7/8/98