Class Dataset

  • All Implemented Interfaces:
    java.io.Serializable, DataFormat
    Direct Known Subclasses:
    CompoundDS, ScalarDS

    public abstract class Dataset
    extends HObject
    This abstract class provides general APIs to create and manipulate dataset objects and to retrieve dataset properties such as datatype and dimension sizes.

    This class provides two convenience functions, read() and write(), to read and write data values. Reading or writing data may take many library calls if the library APIs are used directly. The read() and write() functions hide the details of these calls from users.

    Version:
    1.1 9/4/2007
    Author:
    Peter X. Cao
    See Also:
    ScalarDS, CompoundDS, Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected long[] chunkSize
      The array of dimension sizes for a chunk.
      protected java.lang.String compression
      The compression information.
      static java.lang.String compression_gzip_txt  
      protected boolean convertByteToString
      Flag to indicate if the byte[] array is converted to strings
      protected java.lang.Object convertedBuf
      The array that holds the converted data of unsigned C-type integers.
      protected java.lang.Object data
      The memory buffer that holds the raw data of the dataset.
      protected Datatype datatype
      The datatype object of the dataset.
      protected java.lang.String[] dimNames
      Array of strings that represent the dimension names.
      protected long[] dims
      The current dimension sizes of the dataset
      protected boolean enumConverted
      Flag to indicate if the enum data is converted to strings.
      protected java.lang.String filters
      The filters information.
      protected boolean isDataLoaded
      Flag to indicate if data values are loaded into memory.
      protected long[] maxDims
      The max dimension sizes of the dataset
      protected long nPoints
      The number of data points in the memory buffer.
      protected java.lang.Object originalBuf
      The data buffer that contains the raw data directly reading from file (before any data conversion).
      protected int rank
      The number of dimensions of the dataset.
      protected long[] selectedDims
      Array that contains the number of data points selected (for read/write) in each dimension.
      protected int[] selectedIndex
      Array that contains the indices of the dimensions selected for display.
      protected long[] selectedStride
      The number of elements to move from the start location in each dimension.
      protected long[] startDims
      The starting position of each dimension of a selected subset.
      protected java.lang.String storage
      The storage information.
    • Constructor Summary

      Constructors 
      Constructor Description
      Dataset​(FileFormat theFile, java.lang.String name, java.lang.String path)
      Constructs a Dataset object with a given file, name and path.
      Dataset​(FileFormat theFile, java.lang.String name, java.lang.String path, long[] oid)
      Deprecated.
      Not for public use in the future.
    • Method Summary

      Modifier and Type Method Description
      static java.lang.String[] byteToString​(byte[] bytes, int length)
      Converts an array of bytes into an array of Strings for a fixed string dataset.
      void clear()
      Clears memory held by the dataset, such as data buffer.
      void clearData()
      Clears the data buffer in memory and forces the next read() to load data from file.
      static java.lang.Object convertFromUnsignedC​(java.lang.Object data_in)
      Deprecated.
      Not for public use in the future.
      static java.lang.Object convertFromUnsignedC​(java.lang.Object data_in, java.lang.Object data_out)
      Converts one-dimension array of unsigned C-type integers to a new array of appropriate Java integer in memory.
      static java.lang.Object convertToUnsignedC​(java.lang.Object data_in)
      Deprecated.
      Not for public use in the future.
      static java.lang.Object convertToUnsignedC​(java.lang.Object data_in, java.lang.Object data_out)
      Converts the array of converted unsigned integer back to unsigned C-type integer data in memory.
      abstract Dataset copy​(Group pgroup, java.lang.String name, long[] dims, java.lang.Object data)
      Creates a new dataset and writes the data buffer to the new dataset.
      long[] getChunkSize()
      Returns the array that contains the dimension sizes of the chunk of the dataset.
      java.lang.String getCompression()
      Returns the string representation of compression information.
      boolean getConvertByteToString()
      Returns the flag that indicates if a byte array is converted to a string array.
      java.lang.Object getData()
      Returns the data buffer of the dataset in memory.
      abstract Datatype getDatatype()
      Returns the datatype object of the dataset.
      java.lang.String[] getDimNames()
      Returns the array of strings that represent the dimension names.
      long[] getDims()
      Returns the array that contains the dimension sizes of the dataset.
      java.lang.String getFilters()
      Returns the string representation of filter information.
      int getHeight()
      Returns the dimension size of the vertical axis.
      long[] getMaxDims()
      Returns the array that contains the max dimension sizes of the dataset.
      java.lang.Class getOriginalClass()
      Returns the class of the original data buffer if the data was converted.
      int getRank()
      Returns the rank (number of dimensions) of the dataset.
      long[] getSelectedDims()
      Returns the dimension sizes of the selected subset.
      int[] getSelectedIndex()
      Returns the indices of display order.
      int getSize​(int tid)
      Returns the size in bytes of a given datatype.
      long[] getStartDims()
      Returns the starting position of a selected subset.
      java.lang.String getStorage()
      Returns the string representation of storage information.
      long[] getStride()
      Returns the selectedStride of the selected dataset.
      int getWidth()
      Returns the size of dimension of the horizontal axis.
      abstract void init()
      Retrieves datatype and dataspace information from file and sets the dataset in memory.
      boolean isEnumConverted()
      Gets the flag that indicates if enum data is converted to strings.
      boolean isString​(int tid)
      Checks if a given datatype is a string.
      abstract java.lang.Object read()
      Reads the data from file.
      abstract byte[] readBytes()
      Reads the raw data of the dataset from file to a byte array.
      void setConvertByteToString​(boolean b)
      Sets the flag that indicates if a byte array is converted to a string array.
      void setData​(java.lang.Object d)
      Deprecated.
      Not for public use in the future.
      void setEnumConverted​(boolean b)
      Sets the flag that indicates if enum data is converted to strings.
      static byte[] stringToByte​(java.lang.String[] strings, int length)
      Converts a string array into an array of bytes for a fixed string dataset.
      void write()
      Writes the memory buffer of this dataset to file.
      abstract void write​(java.lang.Object buf)
      Writes a memory buffer to the dataset in file.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • data

        protected java.lang.Object data
        The memory buffer that holds the raw data of the dataset.
      • rank

        protected int rank
        The number of dimensions of the dataset.
      • dims

        protected long[] dims
        The current dimension sizes of the dataset
      • maxDims

        protected long[] maxDims
        The max dimension sizes of the dataset
      • selectedDims

        protected long[] selectedDims
        Array that contains the number of data points selected (for read/write) in each dimension.

        The selected size must be less than or equal to the current dimension size. A rectangular subset is defined by the starting position and the selected sizes.

        For example, a 4 X 5 dataset

             0,  1,  2,  3,  4
            10, 11, 12, 13, 14
            20, 21, 22, 23, 24
            30, 31, 32, 33, 34
         long[] dims = {4, 5};
         long[] startDims = {1, 2};
         long[] selectedDims = {3, 3};
         then the following subset is selected by the startDims and selectedDims above
             12, 13, 14
             22, 23, 24
             32, 33, 34
      • startDims

        protected long[] startDims
        The starting position of each dimension of a selected subset. With both the starting position and selected sizes, the subset of a rectangle selection is fully defined.
      • selectedIndex

        protected final int[] selectedIndex
        Array that contains the indices of the dimensions selected for display.

        selectedIndex[] is provided for two purposes:

        1. selectedIndex[] indicates the order of dimensions for display, i.e. selectedIndex[0] = row, selectedIndex[1] = column and selectedIndex[2] = depth. For example, for a four-dimensional dataset, if selectedIndex[] is {1, 2, 3}, then dim[1] is selected as row index, dim[2] is selected as column index and dim[3] is selected as depth index.
        2. selectedIndex[] also selects which dimensions are displayed for datasets with three or more dimensions. We assume that applications such as HDFView can only display data up to three dimensions (a 2D spreadsheet/image with a third dimension that the 2D spreadsheet/image is cut from). For a dataset with more than three dimensions, selectedIndex[] stores which three dimensions are chosen for display. For example, for a four-dimensional dataset, if selectedIndex[] = {1, 2, 3}, then dim[1] is selected as row index, dim[2] is selected as column index and dim[3] is selected as depth index. dim[0] is not selected. Its location is fixed at 0 by default.
      • selectedStride

        protected long[] selectedStride
        The number of elements to move from the start location in each dimension. For example, if selectedStride[0] = 2, every other data point is selected along dim[0].
      • chunkSize

        protected long[] chunkSize
        The array of dimension sizes for a chunk.
      • compression

        protected java.lang.String compression
        The compression information.
      • compression_gzip_txt

        public static final java.lang.String compression_gzip_txt
        See Also:
        Constant Field Values
      • filters

        protected java.lang.String filters
        The filters information.
      • storage

        protected java.lang.String storage
        The storage information.
      • datatype

        protected Datatype datatype
        The datatype object of the dataset.
      • dimNames

        protected java.lang.String[] dimNames
        Array of strings that represent the dimension names. It is null if dimension names do not exist.
      • convertByteToString

        protected boolean convertByteToString
        Flag to indicate if the byte[] array is converted to strings
      • isDataLoaded

        protected boolean isDataLoaded
        Flag to indicate if data values are loaded into memory.
      • nPoints

        protected long nPoints
        The number of data points in the memory buffer.
      • originalBuf

        protected java.lang.Object originalBuf
        The data buffer that contains the raw data directly reading from file (before any data conversion).
      • convertedBuf

        protected java.lang.Object convertedBuf
        The array that holds the converted data of unsigned C-type integers.

        For example, suppose that the original data is an array of unsigned 16-bit short integers. Since Java does not support unsigned integers, the data is converted to an array of 32-bit signed integers. In that case, the converted buffer is the array of 32-bit signed integers.
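The widening described above can be sketched in plain Java. This is only an illustration of the idea, not the body of convertFromUnsignedC(); the class and method names are hypothetical.

```java
import java.util.Arrays;

class UnsignedShortSketch {
    // Widens each unsigned 16-bit value (held in a Java short) to a non-negative int.
    static int[] widenUnsigned(short[] raw) {
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = Short.toUnsignedInt(raw[i]); // equivalent to raw[i] & 0xFFFF
        }
        return out;
    }

    public static void main(String[] args) {
        // (short) 0xFFFF is -1 as a signed short but 65535 as an unsigned value.
        short[] raw = {0, 1, (short) 0x8000, (short) 0xFFFF};
        System.out.println(Arrays.toString(widenUnsigned(raw))); // [0, 1, 32768, 65535]
    }
}
```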

      • enumConverted

        protected boolean enumConverted
        Flag to indicate if the enum data is converted to strings.
    • Constructor Detail

      • Dataset

        public Dataset​(FileFormat theFile,
                       java.lang.String name,
                       java.lang.String path)
        Constructs a Dataset object with a given file, name and path.

        Parameters:
        theFile - the file that contains the dataset.
        name - the name of the Dataset, e.g. "dset1".
        path - the full group path of this Dataset, e.g. "/arrays/".
    • Method Detail

      • clear

        public void clear()
        Clears memory held by the dataset, such as data buffer.
      • init

        public abstract void init()
        Retrieves datatype and dataspace information from file and sets the dataset in memory.

        init() is designed to support lazy operation in a dataset object. When a data object is retrieved from file, the datatype, dataspace and raw data are not loaded into memory. When the raw data is requested, init() is first called to get the datatype and dataspace information, and then the raw data is loaded from file.

        init() is also used to reset the selection of a dataset (start, stride and count) to the default, which is the entire dataset for 1D or 2D datasets. In the following example, init() at step 1) retrieves datatype and dataspace information from file. getData() at step 3) reads only one data point. init() at step 4) resets the selection to the whole dataset. getData() at step 6) reads the values of the whole dataset into memory.

         dset = (Dataset) file.get(NAME_DATASET);
         
         // 1) get datatype and dataspace information from file
         dset.init();
         rank = dset.getRank(); // rank = 2, a 2D dataset
         count = dset.getSelectedDims();
         start = dset.getStartDims();
         dims = dset.getDims();
         
         // 2) select only one data point
         for (int i = 0; i < rank; i++) {
             start[i] = 0;
             count[i] = 1;
         }
         
         // 3) read one data point
         data = dset.getData();
         
         // 4) reset to select the whole dataset
         dset.init();
         
         // 5) clean the memory data buffer
         dset.clearData();
         
         // 6) Read the whole dataset
         data = dset.getData();
         
      • getRank

        public final int getRank()
        Returns the rank (number of dimensions) of the dataset.
        Returns:
        the number of dimensions of the dataset.
      • getDims

        public final long[] getDims()
        Returns the array that contains the dimension sizes of the dataset.
        Returns:
        the dimension sizes of the dataset.
      • getMaxDims

        public final long[] getMaxDims()
        Returns the array that contains the max dimension sizes of the dataset.
        Returns:
        the max dimension sizes of the dataset.
      • getSelectedDims

        public final long[] getSelectedDims()
        Returns the dimension sizes of the selected subset.

        selectedDims holds the number of data points of the selected subset in each dimension. Applications can use this array to change the size of the selected subset. The selected size must be less than or equal to the current dimension size. Together with the starting position and stride, the selected sizes fully define a rectangular subset.

        For example, a 4 X 5 dataset

             0,  1,  2,  3,  4
            10, 11, 12, 13, 14
            20, 21, 22, 23, 24
            30, 31, 32, 33, 34
         long[] dims = {4, 5};
         long[] startDims = {1, 2};
         long[] selectedDims = {3, 3};
         long[] selectedStride = {1, 1};
         then the following subset is selected by the startDims and selectedDims
             12, 13, 14
             22, 23, 24
             32, 33, 34
         
        Returns:
        the dimension sizes of the selected subset.
      • getStartDims

        public final long[] getStartDims()
        Returns the starting position of a selected subset.

        Applications can use this array to change the starting position of a selection. Together with the selected sizes and stride, the starting position fully defines a rectangular subset.

        For example, a 4 X 5 dataset

             0,  1,  2,  3,  4
            10, 11, 12, 13, 14
            20, 21, 22, 23, 24
            30, 31, 32, 33, 34
         long[] dims = {4, 5};
         long[] startDims = {1, 2};
         long[] selectedDims = {3, 3};
         long[] selectedStride = {1, 1};
         then the following subset is selected by the startDims and selectedDims
             12, 13, 14
             22, 23, 24
             32, 33, 34
         
        Returns:
        the starting position of a selected subset.
      • getStride

        public final long[] getStride()
        Returns the selectedStride of the selected dataset.

        Applications can use this array to change how many elements to move in each dimension. Together with the starting position and selected sizes, the stride fully defines a rectangular subset.

        For example, a 4 X 5 dataset

             0,  1,  2,  3,  4
            10, 11, 12, 13, 14
            20, 21, 22, 23, 24
            30, 31, 32, 33, 34
         long[] dims = {4, 5};
         long[] startDims = {0, 0};
         long[] selectedDims = {2, 2};
         long[] selectedStride = {2, 3};
         then the following subset is selected by the startDims, selectedDims and selectedStride
             0,   3
             20, 23
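
The start, count and stride arithmetic in the example above can be checked with a small stand-alone sketch. It works on an in-memory, row-major 2D array and does not call the HDF library; the class and method names are hypothetical.

```java
import java.util.Arrays;

class SelectionSketch {
    // Copies count[0] x count[1] values out of a row-major 2D array stored flat,
    // beginning at start and stepping by stride along each dimension.
    static long[] select(long[] flat, long[] dims, long[] start, long[] count, long[] stride) {
        long[] out = new long[(int) (count[0] * count[1])];
        int k = 0;
        for (long r = 0; r < count[0]; r++) {
            long row = start[0] + r * stride[0];
            for (long c = 0; c < count[1]; c++) {
                long col = start[1] + c * stride[1];
                out[k++] = flat[(int) (row * dims[1] + col)];
            }
        }
        return out;
    }

    // Builds the 4 x 5 example dataset: value = 10 * row + column.
    static long[] sample() {
        long[] flat = new long[20];
        for (int r = 0; r < 4; r++) {
            for (int c = 0; c < 5; c++) {
                flat[r * 5 + c] = 10L * r + c;
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        long[] dims = {4, 5};
        long[] strided = select(sample(), dims,
                new long[] {0, 0}, new long[] {2, 2}, new long[] {2, 3});
        System.out.println(Arrays.toString(strided)); // [0, 3, 20, 23]
    }
}
```

With startDims = {1, 2}, selectedDims = {3, 3} and a stride of {1, 1}, the same method yields the 3 x 3 block 12..34 shown for getSelectedDims().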
         
      • setConvertByteToString

        public final void setConvertByteToString​(boolean b)
        Sets the flag that indicates if a byte array is converted to a string array.

        In a string dataset, the raw data from file is stored in a byte array. By default, this byte array is converted to an array of strings. For a large dataset (e.g. more than one million strings), the conversion takes a long time and requires a lot of memory to store the strings. In some applications, such a conversion can be delayed. For example, a GUI application may convert only the strings that are visible to the user, not the entire data array.

        setConvertByteToString(boolean b) sets the flag so that applications can choose whether to perform the byte-to-string conversion. If the flag is set to false, getData() returns an array of bytes instead of an array of strings.

        Parameters:
        b - convert bytes to strings if b is true; otherwise, if false, do not convert bytes to strings.
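
For illustration, a delayed byte-to-string conversion over fixed-length records might look like the sketch below. It is not the implementation of byteToString(); the NUL padding and the ASCII charset are assumptions, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FixedStringSketch {
    // Splits a buffer of fixed-length records into strings,
    // cutting each record at its first NUL byte (assumed padding).
    static String[] toStrings(byte[] bytes, int length) {
        int n = bytes.length / length;
        String[] out = new String[n];
        for (int i = 0; i < n; i++) {
            int offset = i * length;
            int end = 0;
            while (end < length && bytes[offset + end] != 0) {
                end++;
            }
            out[i] = new String(bytes, offset, end, StandardCharsets.US_ASCII);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two records of length 4: "ab" padded with NULs, then "wxyz".
        byte[] raw = {'a', 'b', 0, 0, 'w', 'x', 'y', 'z'};
        System.out.println(Arrays.toString(toStrings(raw, 4))); // [ab, wxyz]
    }
}
```

A GUI could call such a helper on only the slice of the byte buffer that is currently visible, which is the use case the flag is meant to support.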
      • getConvertByteToString

        public final boolean getConvertByteToString()
        Returns the flag that indicates if a byte array is converted to a string array.
        Returns:
        true if byte array is converted to string; otherwise, returns false if there is no conversion.
      • read

        public abstract java.lang.Object read()
                                       throws java.lang.Exception,
                                              java.lang.OutOfMemoryError
        Reads the data from file.

        read() reads the data from file to a memory buffer and returns the memory buffer. The dataset object does not hold the memory buffer. To store the memory buffer in the dataset object, one must call getData().

        By default, the whole dataset is read into memory. Users can also select a subset to read. Subsetting is done implicitly, by changing the selection arrays described below.

        How to Select a Subset

        A selection is specified by three arrays: start, stride and count.

        1. start: the offset of the selection
        2. stride: the number of elements to move in each dimension
        3. count: the number of elements to select in each dimension
        getStartDims(), getStride() and getSelectedDims() return the start, stride and count arrays respectively. Applications can make a selection by changing the values of the arrays.

        The following example shows how to make a subset. In the example, the dataset is a 4-dimensional array of [200][100][50][10], i.e. dims[0]=200; dims[1]=100; dims[2]=50; dims[3]=10;
        We want to select every other data point in dims[1] and dims[2]

         int rank = dataset.getRank(); // number of dimensions of the dataset
         long[] dims = dataset.getDims(); // the dimension sizes of the dataset
         long[] selected = dataset.getSelectedDims(); // the selected size of the dataset
         long[] start = dataset.getStartDims(); // the offset of the selection
         long[] stride = dataset.getStride(); // the stride of the dataset
         int[] selectedIndex = dataset.getSelectedIndex(); // the selected dimensions for
                                                           // display
         
         // select dim1 and dim2 as 2D data for display, and slice through dim0
         selectedIndex[0] = 1;
         selectedIndex[1] = 2;
         selectedIndex[2] = 0;
         
         // reset the selection arrays
         for (int i = 0; i < rank; i++) {
             start[i] = 0;
             selected[i] = 1;
             stride[i] = 1;
         }
         
         // set stride to 2 on dim1 and dim2 so that every other data point is
         // selected.
         stride[1] = 2;
         stride[2] = 2;
         
         // set the selection size of dim1 and dim2
         selected[1] = dims[1] / stride[1];
         selected[2] = dims[2] / stride[2];
         
         // when dataset.getData() is called, the selection above will be used since
         // the dimension arrays are passed by reference. Changes of these arrays
         // outside the dataset object directly change the values of these arrays
         // in the dataset object.
         

        For ScalarDS, the memory data buffer is a one-dimensional array of byte, short, int, float, double or String type, depending on the datatype of the dataset.

        For CompoundDS, the memory data object is a java.util.List object. Each element of the list is a data array that corresponds to a compound field.

        For example, if compound dataset "comp" has the following nested structure and member datatypes

         comp --> m01 (int)
         comp --> m02 (float)
         comp --> nest1 --> m11 (char)
         comp --> nest1 --> m12 (String)
         comp --> nest1 --> nest2 --> m21 (long)
         comp --> nest1 --> nest2 --> m22 (double)
         
        getData() returns a list of six arrays: {int[], float[], char[], String[], long[] and double[]}.
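
The ordering of the six arrays above is consistent with a depth-first walk over the nested members. The sketch below shows that flattening on a toy tree; the Member type is hypothetical and is not part of the library, and the depth-first order is an assumption read off the list above.

```java
import java.util.ArrayList;
import java.util.List;

class CompoundFlattenSketch {
    // Hypothetical member node: a leaf carries a type name, a nested compound carries children.
    static final class Member {
        final String name;
        final String type;
        final List<Member> children = new ArrayList<>();

        Member(String name, String type) {
            this.name = name;
            this.type = type;
        }

        Member add(Member child) {
            children.add(child);
            return this;
        }
    }

    // Visits members depth-first and records one array type per leaf member.
    static void flatten(Member m, List<String> out) {
        if (m.children.isEmpty()) {
            out.add(m.type + "[]");
            return;
        }
        for (Member c : m.children) {
            flatten(c, out);
        }
    }

    // Rebuilds the "comp" structure from the text and flattens it.
    static List<String> example() {
        Member nest2 = new Member("nest2", null)
                .add(new Member("m21", "long")).add(new Member("m22", "double"));
        Member nest1 = new Member("nest1", null)
                .add(new Member("m11", "char")).add(new Member("m12", "String")).add(nest2);
        Member comp = new Member("comp", null)
                .add(new Member("m01", "int")).add(new Member("m02", "float")).add(nest1);
        List<String> out = new ArrayList<>();
        flatten(comp, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(example()); // [int[], float[], char[], String[], long[], double[]]
    }
}
```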
        Returns:
        the data read from file.
        Throws:
        java.lang.Exception
        java.lang.OutOfMemoryError
        See Also:
        getData()
      • readBytes

        public abstract byte[] readBytes()
                                  throws java.lang.Exception
        Reads the raw data of the dataset from file to a byte array.

        readBytes() reads raw data into an array of bytes instead of an array of its datatype. For example, for a one-dimensional 32-bit integer dataset of size 5, readBytes() returns a byte array of size 20 instead of an int array of size 5.

        readBytes() can be used to copy data from one dataset to another efficiently. Because the raw data is not converted to its native type, it saves memory space and CPU time.
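
The size relationship (5 ints occupy 20 bytes) and the cost of converting back can be seen in a plain-Java sketch. It does not call readBytes(); the little-endian byte order is an assumption chosen for the example, and the names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class RawBytesSketch {
    // Packs 32-bit integers into a raw byte array using the given byte order.
    static byte[] toBytes(int[] values, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES).order(order);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    // Decodes a raw byte array back into 32-bit integers using the same byte order.
    static int[] toInts(byte[] raw, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(order);
        int[] out = new int[raw.length / Integer.BYTES];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getInt();
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = toBytes(new int[] {1, 2, 3, 4, 5}, ByteOrder.LITTLE_ENDIAN);
        System.out.println(raw.length); // 20
        System.out.println(Arrays.toString(toInts(raw, ByteOrder.LITTLE_ENDIAN))); // [1, 2, 3, 4, 5]
    }
}
```

Copying the raw bytes from one dataset to another skips the decoding loop entirely, which is where the savings come from.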

        Returns:
        the byte array of the raw data.
        Throws:
        java.lang.Exception
      • write

        public abstract void write​(java.lang.Object buf)
                            throws java.lang.Exception
        Writes a memory buffer to the dataset in file.
        Parameters:
        buf - the data to write
        Throws:
        java.lang.Exception
      • write

        public final void write()
                         throws java.lang.Exception
        Writes the memory buffer of this dataset to file.
        Throws:
        java.lang.Exception
      • copy

        public abstract Dataset copy​(Group pgroup,
                                     java.lang.String name,
                                     long[] dims,
                                     java.lang.Object data)
                              throws java.lang.Exception
        Creates a new dataset and writes the data buffer to the new dataset.

        This function allows applications to create a new dataset for a given data buffer. For example, users can select a specific interesting part from a large image and create a new image with the selection.

        The new dataset retains the datatype and dataset creation properties of this dataset.

        Parameters:
        pgroup - the group which the dataset is copied to.
        name - the name of the new dataset.
        dims - the dimension sizes of the new dataset.
        data - the data values of the subset to be copied.
        Returns:
        the new dataset.
        Throws:
        java.lang.Exception
      • getDatatype

        public abstract Datatype getDatatype()
        Returns the datatype object of the dataset.
        Returns:
        the datatype object of the dataset.
      • getData

        public final java.lang.Object getData()
                                       throws java.lang.Exception,
                                              java.lang.OutOfMemoryError
        Returns the data buffer of the dataset in memory.

        If data is already loaded into memory, returns the data; otherwise, calls read() to read data from file into a memory buffer and returns the memory buffer.

        By default, the whole dataset is read into memory. Users can also select a subset to read. Subsetting is done implicitly, by changing the selection arrays described below.

        How to Select a Subset

        A selection is specified by three arrays: start, stride and count.

        1. start: the offset of the selection
        2. stride: the number of elements to move in each dimension
        3. count: the number of elements to select in each dimension
        getStartDims(), getStride() and getSelectedDims() return the start, stride and count arrays respectively. Applications can make a selection by changing the values of the arrays.

        The following example shows how to make a subset. In the example, the dataset is a 4-dimensional array of [200][100][50][10], i.e. dims[0]=200; dims[1]=100; dims[2]=50; dims[3]=10;
        We want to select every other data point in dims[1] and dims[2]

         int rank = dataset.getRank(); // number of dimensions of the dataset
         long[] dims = dataset.getDims(); // the dimension sizes of the dataset
         long[] selected = dataset.getSelectedDims(); // the selected size of the dataset
         long[] start = dataset.getStartDims(); // the offset of the selection
         long[] stride = dataset.getStride(); // the stride of the dataset
         int[] selectedIndex = dataset.getSelectedIndex(); // the selected dimensions for
                                                           // display
         
         // select dim1 and dim2 as 2D data for display, and slice through dim0
         selectedIndex[0] = 1;
         selectedIndex[1] = 2;
         selectedIndex[2] = 0;
         
         // reset the selection arrays
         for (int i = 0; i < rank; i++) {
             start[i] = 0;
             selected[i] = 1;
             stride[i] = 1;
         }
         
         // set stride to 2 on dim1 and dim2 so that every other data point is
         // selected.
         stride[1] = 2;
         stride[2] = 2;
         
         // set the selection size of dim1 and dim2
         selected[1] = dims[1] / stride[1];
         selected[2] = dims[2] / stride[2];
         
         // when dataset.getData() is called, the selection above will be used since
         // the dimension arrays are passed by reference. Changes of these arrays
         // outside the dataset object directly change the values of these arrays
         // in the dataset object.
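
The example above sizes the selection by dividing a dimension size by its stride, which is exact when the size is a multiple of the stride and the start is 0. As a hedged sketch of the general arithmetic, the helper below (hypothetical, not part of the library) computes the largest count that keeps every selected index inside the dimension.

```java
class StrideCountSketch {
    // Number of indices start, start + stride, start + 2 * stride, ... that stay below dim.
    static long count(long dim, long start, long stride) {
        if (start >= dim) {
            return 0;
        }
        return (dim - start + stride - 1) / stride; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(count(100, 0, 2)); // 50, same as 100 / 2
        System.out.println(count(50, 0, 2));  // 25, same as 50 / 2
        System.out.println(count(5, 0, 2));   // 3 (indices 0, 2, 4), whereas 5 / 2 gives 2
    }
}
```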
         

        For ScalarDS, the memory data buffer is a one-dimensional array of byte, short, int, float, double or String type, depending on the datatype of the dataset.

        For CompoundDS, the memory data object is a java.util.List object. Each element of the list is a data array that corresponds to a compound field.

        For example, if compound dataset "comp" has the following nested structure and member datatypes

         comp --> m01 (int)
         comp --> m02 (float)
         comp --> nest1 --> m11 (char)
         comp --> nest1 --> m12 (String)
         comp --> nest1 --> nest2 --> m21 (long)
         comp --> nest1 --> nest2 --> m22 (double)
         
        getData() returns a list of six arrays: {int[], float[], char[], String[], long[] and double[]}.
        Returns:
        the memory buffer of the dataset.
        Throws:
        java.lang.Exception
        java.lang.OutOfMemoryError
      • setData

        @Deprecated
        public final void setData​(java.lang.Object d)
        Deprecated.
        Not for public use in the future.

        setData() is not safe to use because it changes the memory buffer of the dataset object. Dataset operations such as write/read will fail if the buffer type or size is changed.

      • clearData

        public void clearData()
        Clears the data buffer in memory and forces the next read() to load data from file.

        The function read() loads data from file into memory only if the data has not already been read. If data is already in memory, read() just returns the memory buffer. Sometimes we want to force read() to re-read data from file. For example, when the selection is changed, we need to re-read the data. clearData() clears the current memory buffer and forces read() to load the data from file.

        See Also:
        getData(), read()
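
The load-once, clear-to-reload behaviour can be sketched with a small stand-in class. It is not the Dataset implementation; the Supplier stands in for the file read, and the `loaded` flag plays the role that isDataLoaded has in the field summary.

```java
import java.util.function.Supplier;

class LazyBufferSketch {
    private final Supplier<Object> reader; // stands in for reading from file
    private Object data;
    private boolean loaded = false;
    int reads = 0; // counts how often the reader was actually invoked

    LazyBufferSketch(Supplier<Object> reader) {
        this.reader = reader;
    }

    // Returns the cached buffer, reading it first if nothing is loaded.
    Object getData() {
        if (!loaded) {
            data = reader.get();
            loaded = true;
            reads++;
        }
        return data;
    }

    // Drops the cached buffer so the next getData() reads again.
    void clearData() {
        data = null;
        loaded = false;
    }

    public static void main(String[] args) {
        LazyBufferSketch ds = new LazyBufferSketch(() -> new int[] {1, 2, 3});
        ds.getData();
        ds.getData();   // served from memory, no second read
        ds.clearData();
        ds.getData();   // read again after the buffer was cleared
        System.out.println(ds.reads); // 2
    }
}
```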
      • getHeight

        public final int getHeight()
        Returns the dimension size of the vertical axis.

        This function is used by GUI applications such as HDFView. GUI applications display a dataset in a 2D table or 2D image. The display order is specified by the index array selectedIndex as follows:

        selectedIndex[0] -- height
        The vertical axis
        selectedIndex[1] -- width
        The horizontal axis
        selectedIndex[2] -- depth
        The depth axis is used for 3 or more dimensional datasets.
        Applications can use getSelectedIndex() to access and change the display order. For example, in a 2D dataset of 200x50 (dim0=200, dim1=50), the following code will set the height=200 and width=50.
         int[] selectedIndex = dataset.getSelectedIndex();
         selectedIndex[0] = 0;
         selectedIndex[1] = 1;
         
        Returns:
        the size of dimension of the vertical axis.
        See Also:
        getSelectedIndex(), getWidth()
      • getWidth

        public final int getWidth()
        Returns the size of dimension of the horizontal axis.

        This function is used by GUI applications such as HDFView. GUI applications display a dataset in a 2D table or 2D image. The display order is specified by the index array selectedIndex as follows:

        selectedIndex[0] -- height
        The vertical axis
        selectedIndex[1] -- width
        The horizontal axis
        selectedIndex[2] -- depth
        The depth axis, which is used for 3 or more dimension datasets.
        Applications can use getSelectedIndex() to access and change the display order. For example, in a 2D dataset of 200x50 (dim0=200, dim1=50), the following code will set the height=200 and width=50.
         int[] selectedIndex = dataset.getSelectedIndex();
         selectedIndex[0] = 0;
         selectedIndex[1] = 1;
         
        Returns:
        the size of dimension of the horizontal axis.
        See Also:
        getSelectedIndex(), getHeight()
      • getSelectedIndex

        public final int[] getSelectedIndex()
        Returns the indices of display order.

        selectedIndex[] is provided for two purposes:

        1. selectedIndex[] is used to indicate the order of dimensions for display. selectedIndex[0] is for the row, selectedIndex[1] is for the column and selectedIndex[2] for the depth.

          For example, for a four-dimensional dataset, if selectedIndex[] = {1, 2, 3}, then dim[1] is selected as the row index, dim[2] as the column index and dim[3] as the depth index.

        2. selectedIndex[] is also used to select the dimensions to display for datasets with three or more dimensions. We assume that applications such as HDFView can display data values in at most three dimensions (a 2D spreadsheet/image plus a third dimension from which the 2D spreadsheet/image is selected). For datasets with more than three dimensions, selectedIndex[] tells applications which three dimensions are chosen for display.
          For example, for a four-dimensional dataset, if selectedIndex[] = {1, 2, 3}, then dim[1] is selected as the row index, dim[2] as the column index and dim[3] as the depth index. dim[0] is not selected. Its location is fixed at 0 by default.
        Returns:
        the array of the indices of display order.
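
        The second purpose can be sketched by working out which dimensions are left out of the display. SelectedIndexSketch and hiddenDims are hypothetical names used only for this illustration, and the sketch assumes selectedIndex holds distinct, valid dimension indices.

```java
/** Hypothetical helper that lists the dimensions not chosen for display. */
final class SelectedIndexSketch {
    /** Returns, in ascending order, every dimension index absent from selectedIndex. */
    static int[] hiddenDims(int rank, int[] selectedIndex) {
        boolean[] shown = new boolean[rank];
        for (int idx : selectedIndex) {
            shown[idx] = true;
        }
        int[] hidden = new int[rank - selectedIndex.length];
        int n = 0;
        for (int d = 0; d < rank; d++) {
            if (!shown[d]) {
                hidden[n++] = d;
            }
        }
        return hidden;
    }

    public static void main(String[] args) {
        // Four-dimensional dataset: row = dim 1, column = dim 2, depth = dim 3.
        int[] selectedIndex = {1, 2, 3};
        for (int d : hiddenDims(4, selectedIndex)) {
            System.out.println("dim " + d + " is not displayed");
        }
    }
}
```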
      • getCompression

        public final java.lang.String getCompression()
        Returns the string representation of compression information.

        For example, "SZIP: Pixels per block = 8: H5Z_FILTER_CONFIG_DECODE_ENABLED".

        Returns:
        the string representation of compression information.
      • getFilters

        public final java.lang.String getFilters()
        Returns the string representation of filter information.
        Returns:
        the string representation of filter information.
      • getStorage

        public final java.lang.String getStorage()
        Returns the string representation of storage information.
        Returns:
        the string representation of storage information.
      • getChunkSize

        public final long[] getChunkSize()
        Returns the array that contains the dimension sizes of the chunk of the dataset. Returns null if the dataset is not chunked.
        Returns:
        the array of chunk sizes or returns null if the dataset is not chunked.
      • convertFromUnsignedC

        @Deprecated
        public static java.lang.Object convertFromUnsignedC​(java.lang.Object data_in)
        Deprecated.
        Not for public use in the future.
        Use convertFromUnsignedC(Object, Object) instead.
      • convertFromUnsignedC

        public static java.lang.Object convertFromUnsignedC​(java.lang.Object data_in,
                                                            java.lang.Object data_out)
        Converts a one-dimensional array of unsigned C-type integers to a new array of the appropriate Java integer type in memory.

        Since Java does not support unsigned integers, values of unsigned C-type integers must be converted into an appropriate Java integer type. Otherwise, the data values will not be displayed correctly. For example, if an unsigned C byte, x = 200, is stored into a Java byte y, y will be -56 instead of the correct value of 200.

        Unsigned C integers are upgraded to Java integers according to the following table:

        Mapping Unsigned C Integers to Java Integers
        Unsigned C Integer    Java Integer
        unsigned byte         signed short
        unsigned short        signed int
        unsigned int          signed long
        unsigned long         signed long
        NOTE: this conversion cannot deal with unsigned 64-bit integers. Therefore, the values of an unsigned 64-bit dataset may be wrong in a Java application.

        If memory data of unsigned integers is converted by convertFromUnsignedC(), convertToUnsignedC() must be called to convert the data back to unsigned C before data is written into file.

        Parameters:
        data_in - the input 1D array of the unsigned C-type integers.
        data_out - the output converted (or upgraded) 1D array of Java integers.
        Returns:
        the upgraded 1D array of Java integers.
        See Also:
        convertToUnsignedC(Object, Object)
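
        The byte row of the table above can be sketched with plain Java arithmetic. UnsignedSketch and its methods are hypothetical names for illustration. They cover only the byte-to-short case and are not the library's convertFromUnsignedC() or convertToUnsignedC().

```java
/** Hypothetical helper for the "unsigned byte -> signed short" row of the table. */
final class UnsignedSketch {
    /** Widens bytes holding unsigned C values (0..255) into Java shorts. */
    static short[] widen(byte[] in) {
        short[] out = new short[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = (short) (in[i] & 0xFF); // mask off the sign extension
        }
        return out;
    }

    /** Narrows the shorts back to the raw byte representation before writing. */
    static byte[] narrow(short[] in) {
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = (byte) in[i]; // keeps the low 8 bits
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 200};      // unsigned C byte 200 as stored in a Java byte
        System.out.println(raw[0]);     // prints -56
        short[] wide = widen(raw);
        System.out.println(wide[0]);    // prints 200
        System.out.println(narrow(wide)[0] == raw[0]); // prints true
    }
}
```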
      • convertToUnsignedC

        @Deprecated
        public static java.lang.Object convertToUnsignedC​(java.lang.Object data_in)
        Deprecated.
        Not for public use in the future.
        Use convertToUnsignedC(Object, Object) instead.
      • convertToUnsignedC

        public static java.lang.Object convertToUnsignedC​(java.lang.Object data_in,
                                                          java.lang.Object data_out)
        Converts an array of upgraded Java integers back to unsigned C-type integer data in memory.

        If memory data of unsigned integers is converted by convertFromUnsignedC(), convertToUnsignedC() must be called to convert the data back to unsigned C before data is written into file.

        Parameters:
        data_in - the input array of Java integers.
        data_out - the output array of unsigned C-type integers.
        Returns:
        the converted array of unsigned C-type integers.
        See Also:
        convertFromUnsignedC(Object, Object)
      • byteToString

        public static final java.lang.String[] byteToString​(byte[] bytes,
                                                            int length)
        Converts an array of bytes into an array of Strings for a fixed string dataset.

        A C string is an array of chars, while a Java String is an object. When a string dataset is read into a Java application, the data is stored in an array of Java bytes. byteToString() is used to convert the array of bytes into an array of Java Strings so that applications can display and modify the data content.

        For example, the content of a two-element C string dataset is {"ABC", "abc"}. Java applications will read the data into a byte array of {65, 66, 67, 97, 98, 99}. byteToString(bytes, 3) returns an array of Java Strings with strs[0]="ABC" and strs[1]="abc".

        If memory data of strings is converted to Java Strings, stringToByte() must be called to convert the memory data back to byte array before data is written to file.

        Parameters:
        bytes - the array of bytes to convert.
        length - the length of string.
        Returns:
        the array of Java String.
        See Also:
        stringToByte(String[], int)
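
        The example above can be reproduced with a minimal splitter. FixedStringSplitSketch is a hypothetical class, not the library's byteToString(). It assumes ASCII content, assumes the byte array length is a multiple of the string length, and does no special handling of null terminators or padding.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that cuts a byte buffer into fixed-length strings. */
final class FixedStringSplitSketch {
    /** Splits bytes into strings of exactly 'length' bytes each (ASCII assumed). */
    static String[] split(byte[] bytes, int length) {
        int count = bytes.length / length;
        String[] out = new String[count];
        for (int i = 0; i < count; i++) {
            out[i] = new String(bytes, i * length, length, StandardCharsets.US_ASCII);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = {65, 66, 67, 97, 98, 99};
        String[] strs = split(bytes, 3);
        System.out.println(strs[0]); // prints ABC
        System.out.println(strs[1]); // prints abc
    }
}
```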
      • stringToByte

        public static final byte[] stringToByte​(java.lang.String[] strings,
                                                int length)
        Converts a string array into an array of bytes for a fixed string dataset.

        If memory data of strings is converted to Java Strings, stringToByte() must be called to convert the memory data back to byte array before data is written to file.

        Parameters:
        strings - the array of strings.
        length - the length of each string.
        Returns:
        the array of bytes.
        See Also:
        byteToString(byte[] bytes, int length)
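
        The reverse direction can be sketched in the same way. FixedStringJoinSketch is a hypothetical class, not the library's stringToByte(). Zero-padding short strings and truncating long ones are assumptions made for this illustration, as is the ASCII encoding.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that packs strings into one fixed-length byte buffer. */
final class FixedStringJoinSketch {
    /** Packs each string into a slot of 'length' bytes (ASCII assumed). */
    static byte[] join(String[] strings, int length) {
        byte[] out = new byte[strings.length * length]; // slots start zero-filled
        for (int i = 0; i < strings.length; i++) {
            byte[] src = strings[i].getBytes(StandardCharsets.US_ASCII);
            // Assumption: longer strings are truncated, shorter ones stay zero-padded.
            System.arraycopy(src, 0, out, i * length, Math.min(src.length, length));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = join(new String[] {"ABC", "abc"}, 3);
        System.out.println(java.util.Arrays.toString(bytes));
    }
}
```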
      • getDimNames

        public final java.lang.String[] getDimNames()
        Returns the array of strings that represent the dimension names. Returns null if there is no dimension name.

        Some datasets have pre-defined names for each dimension such as "Latitude" and "Longitude". getDimNames() returns these pre-defined names.

        Returns:
        the names of dimensions, or null if there is no dimension name.
      • isString

        public boolean isString​(int tid)
        Checks if a given datatype is a string. Sub-classes must replace this default implementation.
        Parameters:
        tid - The data type identifier.
        Returns:
        true if the datatype is a string; otherwise returns false.
      • getSize

        public int getSize​(int tid)
        Returns the size in bytes of a given datatype. Sub-classes must replace this default implementation.
        Parameters:
        tid - The data type identifier.
        Returns:
        The size of the datatype
      • isEnumConverted

        public boolean isEnumConverted()
        Gets the flag that indicates whether enum data is converted to strings.
        Returns:
        the enumConverted
      • setEnumConverted

        public void setEnumConverted​(boolean b)
        Sets the flag that indicates whether enum data is converted to strings.
        Parameters:
        b - the enumConverted to set
      • getOriginalClass

        public final java.lang.Class getOriginalClass()
        Get Class of the original data buffer if converted.
        Returns:
        the Class of originalBuf