CS 222 Spring 2016

Coding Challenge

Due: March 30, Wednesday, 10:30 am.

Language: Java

Submission:

  • Check your file encoding. It should be UTF-8
  • Checkout the project inClass_coding from SVN to get started
  • Commit your code to SVN at the end of the class

Assignment Explanation

We will do some coding in this assignment. You will be given a code and will be asked to complete a task using that code. You will be working on sparse matrix vectors. The code provided to you simply reads matrix data from file and stores it in coordinate list format (COO). Please read the explanation about sparse matrix representations provided below.

Sparse matrix representation

Matrix operations are widely used in linear algebra libraries that are crucial for countless areas in engineering and science, such as nuclear physics, microchip design, finite element analysis, high performance computing, etc. Dense matrices are typically and not surprisingly stored in two-dimensional arrays, (double[][] matrix). However, when the matrix is sparse, a two-dimensional array is a terrible waste of space, because the majority of the elements will be simply zeros. Considering that real-world matrices can be very large (a few thousands to millions of elements), efficient representations for sparse matrices are crucial.

Below, we give two different ways to represent a sparse matrix. To provide an example, we will use the following 10x10 matrix:

    0    1.1  2.2  0   0    0   0  0   0    0
    0    0    3.3  4.4 0    0   0  0   0    0
    0    0    0    0   0    0   0  5.5 0    0 
    0    0    6.6  0   0    7.7 0  8.8 9.9  0
    10.0 0    0    0   0    0   0  0   0    11.1
    0    0    0    0   12.2 0   0  0   0    0
    0    0    0    0   13.3 0   0  0   0    0 
    0    14.4 15.5 0   0    0   0  0   16.6 0
    0    0    0    0   0    0   0  0   17.7 0
    0    0    0    0   0    0   0  0   0    18.8

Assuming that the matrix elements are double values, the matrix above would require a 10x10 array of double values (i.e. a minimum of 100 double elements) allocated in the memory if it’s represented using a two-dimensional array.

Coordinate format (COO)

In this format, each non-zero value of the matrix is associated with its row and column index. Assuming that indexing starts from 0, the matrix above could be represented as follows, where on each line the first value is the row index, the second is the column index, and the third is the value of the non-zero element.

    0 1 1.1
    0 2 2.2
    1 2 3.3
    1 3 4.4
    2 7 5.5
    3 2 6.6
    3 5 7.7
    3 7 8.8
    3 8 9.9
    4 0 10.0
    4 9 11.1
    5 4 12.2
    6 4 13.3
    7 1 14.4
    7 2 15.5
    7 8 16.6
    8 8 17.7
    9 9 18.8

Note that the index values are integers whereas the matrix values are doubles. This representation would require 18+18 = 36 int values, plus 18 double values. This is a clear improvement over the naive, two-dimensional array representation.

Also get familiar with sorting routines available in the Java library.

Task 1

Your task in this problem is to convert the matrix from COO to another format provided in the assignment. And print the matrix in this new format to the console. The outputs will be obtained by calling the toString() method of your new format.

Task 2

Your other task is to write clean code. In addition to appropriate naming, clean and short and well-defined functions, apply what you have learnt from formatting lecture. Ask yourself these questions: Do I need to define a class? Am I making proper usage of static keyword? Make sure you have no code smells in the code you write.

Remarks

  • Think about the data structure you will use to store the matrix contents when you convert the matrix. This data structure must be space-wise efficient enough. You canNOT use a two-dimensional array; that’s against the whole idea of reducing the space requirements for the input matrix.

The code provided to you will read the matrix from a .mtx file that will be provided in your svn folder. There are four things you need to watch out about these files:

  • The first few lines of the matrix contain comments. These are the lines that start with the % sign.
  • Following the comments, the first non-comment line gives the number of rows, columns and the number of entries.
  • The entries may be given in any order – do not assume any particular order.
  • The entries may contain zero values. You will need to filter them out.
  • Indexing starts from 1, NOT 0.

For instance, the fidap005 matrix has the following contents:

%%MatrixMarket matrix coordinate real general
27 27 279
1 1  1.0370389925925e+06
2 1  2.5925905925925e+05
10 1 -1.1851862518518e+06
... removed for space concerns
27 26  2.5925905925895e+05
8 27  3.7037148148387e+04
9 27  1.4814814814901e+05
17 27 -2.9629665185188e+05
18 27 -1.1851862518519e+06
26 27  2.5925905925895e+05
27 27  1.0370389925918e+06

As you can see, the first non-comment line says that this matrix is 27x27, and there are entries for 279 elements in this file. Note that the contents are NOT sorted in row-major order.

Test cases

You will be provided the outputs of some matrices in your SVN folder. These outputs are the matrices in the desired format.Compare your outputs to these to find out if your conversion was successful or not.

Grading

  • Functional Requirements
    • Converting COO to format desired: 7
    • Displaying the matrix decently on the command line : 3
  • Style
    • Adequate and meaningful comments: 1
    • Modularization of code into methods and classes: 3
    • Proper names: 3
    • Short functions that are at the right level of abstraction: 3