Software and Data from Projects


Generalized Optimal Response Time Retrieval of Replicated Data from Storage Arrays

Paper

[TOS '13] PDF, BibTex




Disk Parameters

The speed of a disk is defined as the average time it takes to retrieve a single data block from that disk. Our experiments use the disk specifications given in Table 1.


Table 1: Disk Specifications
Producer  Model      Type  RPM   Speed (ms)
Seagate   Barracuda  HDD   7.2K  13.2
WD        Raptor     HDD   10K   8.3
Seagate   Cheetah    HDD   15K   6.1
OCZ       Vertex     SSD   -     0.5
Intel     X25-E      SSD   -     0.2

Experiments

All experiments are summarized in Table 2. Delay and initial load values are given in milliseconds. R(2,10,2) means that a number is chosen randomly from the set {2, 4, 6, 8, 10}. If the system is homogeneous, the properties of the Cheetah disk are used for all the disks in the system. If the system is heterogeneous, the disks are chosen randomly from the disk group indicated in the table. Disk groups can be HDDs, SSDs, or HDDs+SSDs.
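The sampling rules above can be sketched in a few lines of Python. This is only an illustration, not the code used in the experiments: the names `sample_r`, `pick_disks`, and `num_disks` are hypothetical, the number of disks per site is not stated here, and uniform sampling is an assumption (the text only says "chosen randomly").

```python
import random

# Average block retrieval time in ms, copied from Table 1.
DISK_SPEED_MS = {
    "Barracuda": 13.2,
    "Raptor": 8.3,
    "Cheetah": 6.1,
    "Vertex": 0.5,
    "X25-E": 0.2,
}

# Disk groups as named in Table 2; "cheetah" is the homogeneous case.
DISK_GROUPS = {
    "cheetah": ["Cheetah"],
    "hdd": ["Barracuda", "Raptor", "Cheetah"],
    "ssd": ["Vertex", "X25-E"],
}
DISK_GROUPS["ssd+hdd"] = DISK_GROUPS["ssd"] + DISK_GROUPS["hdd"]


def sample_r(low, high, step, rng):
    """R(low, high, step): one value from low, low+step, ..., high (assumed uniform)."""
    return rng.choice(range(low, high + 1, step))


def pick_disks(group, num_disks, rng):
    """Choose num_disks disks at random from a group; returns (model, speed_ms) pairs."""
    models = [rng.choice(DISK_GROUPS[group]) for _ in range(num_disks)]
    return [(model, DISK_SPEED_MS[model]) for model in models]


rng = random.Random(42)              # fixed seed so a run can be repeated
print(sample_r(2, 10, 2, rng))       # one of 2, 4, 6, 8, 10
print(pick_disks("ssd+hdd", 4, rng)) # four disks drawn from all five models
```

Passing an explicit `random.Random` instance keeps a generated configuration reproducible, which is convenient when an experiment has to be re-run.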


Table 2: Experiment Parameters
                                    Site 1                         Site 2
Experiment  Sites  Disk properties  Disks    Delays     Loads      Disks    Delays     Loads
1           1      homogeneous      cheetah  0          0          -        -          -
2           1      heterogeneous    ssd      0          0          -        -          -
3           1      heterogeneous    hdd      0          0          -        -          -
4           1      heterogeneous    ssd+hdd  0          0          -        -          -
5           1      heterogeneous    ssd+hdd  R(2,10,2)  R(2,10,2)  -        -          -
6           2      homogeneous      cheetah  0          0          cheetah  0          0
7           2      homogeneous      cheetah  0          0          cheetah  0          20
8           2      homogeneous      cheetah  0          5          cheetah  0          15
9           2      homogeneous      cheetah  0          10         cheetah  0          10
10          2      homogeneous      cheetah  0          15         cheetah  0          5
11          2      homogeneous      cheetah  0          20         cheetah  0          0
12          2      homogeneous      cheetah  0          0          cheetah  20         0
13          2      homogeneous      cheetah  5          0          cheetah  15         0
14          2      homogeneous      cheetah  10         0          cheetah  10         0
15          2      homogeneous      cheetah  15         0          cheetah  5          0
16          2      homogeneous      cheetah  20         0          cheetah  0          0
17          2      heterogeneous    ssd      0          0          hdd      0          0
18          2      heterogeneous    hdd      0          0          ssd      0          0
19          2      heterogeneous    ssd+hdd  0          0          ssd+hdd  0          0
20          2      heterogeneous    ssd+hdd  R(2,10,2)  R(2,10,2)  ssd+hdd  R(2,10,2)  R(2,10,2)

Results

The results of the experiments defined in Table 2 are provided in Table 3.


Table 3: Experimental Results
Experiment     All Results
Experiment 1   PDF
Experiment 2   PDF
Experiment 3   PDF
Experiment 4   PDF
Experiment 5   PDF
Experiment 6   PDF
Experiment 7   PDF
Experiment 8   PDF
Experiment 9   PDF
Experiment 10  PDF
Experiment 11  PDF
Experiment 12  PDF
Experiment 13  PDF
Experiment 14  PDF
Experiment 15  PDF
Experiment 16  PDF
Experiment 17  PDF
Experiment 18  PDF
Experiment 19  PDF
Experiment 20  PDF

Contact

You can send an e-mail to this address for any questions.

Integrated Maximum Flow Algorithm for Optimal Response Time Retrieval of Replicated Data

Paper

[ICPP '12] PDF, BibTex




Disk Parameters

The speed of a disk is defined as the average time it takes to retrieve a single data block from that disk. Our experiments use the disk specifications given in Table 1.


Table 1: Disk Specifications
Producer  Model      Type  RPM   Speed (ms)
Seagate   Barracuda  HDD   7.2K  13.2
WD        Raptor     HDD   10K   8.3
Seagate   Cheetah    HDD   15K   6.1
OCZ       Vertex     SSD   -     0.5
Intel     X25-E      SSD   -     0.2

Experiments

All experiments are summarized in Table 2. Delay and initial load values are given in milliseconds. R(2,10,2) means that a number is chosen randomly from the set {2, 4, 6, 8, 10}. If the system is homogeneous, the properties of the Cheetah disk are used for all the disks in the system. If the system is heterogeneous, the disks are chosen randomly from the disk group indicated in the table. Disk groups can be HDDs, SSDs, or HDDs+SSDs.


Table 2: Experiment Parameters
                                    Site 1                         Site 2
Experiment  Sites  Disk properties  Disks    Delays     Loads      Disks    Delays     Loads
1           2      homogeneous      cheetah  0          0          cheetah  0          0
2           2      heterogeneous    ssd      0          0          hdd      0          0
3           2      heterogeneous    hdd      0          0          ssd      0          0
4           2      heterogeneous    ssd+hdd  0          0          ssd+hdd  0          0
5           2      heterogeneous    ssd+hdd  R(2,10,2)  R(2,10,2)  ssd+hdd  R(2,10,2)  R(2,10,2)

Results

The results of the experiments defined in Table 2 are provided in Table 3.


Table 3: Experimental Results
Experiment    All Results
Experiment 1  PDF
Experiment 2  PDF
Experiment 3  PDF
Experiment 4  PDF
Experiment 5  PDF

Contact

You can send an e-mail to this address for any questions.

Equivalent Disk Allocations

Paper

[TPDS '12] PDF, Supplementary File, BibTex




Periodic Disk Allocations with Best Additive Error and Threshold

   Efficient retrieval of a range query is challenging. Multi-disk architectures offer the opportunity to exploit I/O parallelism during retrieval. A common approach for efficient parallel I/O is to partition the data space into disjoint regions and allocate the data to multiple disks. When users issue a query, the data falling into disjoint partitions is retrieved in parallel from multiple disks. This technique is referred to as declustering: distributing data across multiple I/O devices so that a query can be served by many of them at once.

   The additive error of a range query is the difference between its actual and its optimal retrieval cost. The additive error of a declustering scheme is the maximum additive error over all queries. The threshold of a declustering scheme is k if all spatial range queries with at most k buckets can be retrieved optimally. It is desirable to find declustering schemes with low additive error and high threshold. Periodic disk allocations yield good results; however, the number of periodic disk allocations is large, and finding the ones with the best additive error and threshold is not easy.
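These two measures can be made concrete with a small brute-force sketch. It is not the search procedure behind the published results. It assumes the usual convention in the declustering literature that a query covering b buckets on N disks has optimal cost ceil(b / N) and actual cost equal to the number of buckets on its busiest disk. It also assumes a two-dimensional grid with rectangular, non-wrapping range queries. All function names are hypothetical.

```python
from collections import Counter
from math import ceil


def worst_error_by_size(disk_of, n_disks, grid):
    """Map bucket count -> largest additive error among range queries of that size.

    disk_of(i, j) returns the disk of bucket (i, j); queries are all
    axis-aligned rectangles inside a grid x grid bucket array.
    """
    worst = {}
    for top in range(grid):
        for left in range(grid):
            for height in range(1, grid - top + 1):
                for width in range(1, grid - left + 1):
                    counts = Counter(
                        disk_of(i, j)
                        for i in range(top, top + height)
                        for j in range(left, left + width)
                    )
                    buckets = height * width
                    actual = max(counts.values())      # load of the busiest disk
                    optimal = ceil(buckets / n_disks)  # perfectly balanced load
                    error = actual - optimal
                    worst[buckets] = max(worst.get(buckets, 0), error)
    return worst


def additive_error(worst):
    """Additive error of the scheme: maximum over all examined queries."""
    return max(worst.values())


def threshold(worst):
    """Largest k such that every examined query with at most k buckets is optimal."""
    suboptimal_sizes = [size for size, err in worst.items() if err > 0]
    if not suboptimal_sizes:
        return max(worst)  # every examined query is optimal
    return min(suboptimal_sizes) - 1


# Two example allocations on 5 disks and a 5 x 5 grid (illustrative only).
good = worst_error_by_size(lambda i, j: (i + 2 * j) % 5, 5, 5)
poor = worst_error_by_size(lambda i, j: (i + j) % 5, 5, 5)
print(additive_error(good), additive_error(poor))  # 0 1
print(threshold(poor))                             # 3
```

The second allocation already fails on a 2 x 2 query, where two of the four buckets land on the same disk, so its threshold is 3. The enumeration grows quickly with the grid size, so this sketch is only practical for small N.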

   Here, we share our recent research findings by providing the periodic disk allocations that give the best additive error and threshold for 2-, 3-, and 4-dimensional databases.

Format of the Files

  • The first column is N, the number of disks in the system.
  • The second column is the best additive error or threshold for the number of disks N given in the first column.
  • The third column is the allocation that yields the best additive error or threshold.
  • We use the notation (a1, a2, ..., ad) for the d-dimensional periodic disk allocation that assigns bucket (i1, i2, ..., id) to disk (a1∗i1 + a2∗i2 + ... + ad∗id) mod N.
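As a quick illustration of the notation in the last item, the following sketch maps a bucket to its disk. The function name `periodic_disk` and the sample allocation (1, 2) with N = 5 are chosen only for illustration and are not taken from the result files.

```python
def periodic_disk(coefficients, bucket, n_disks):
    """Disk of bucket (i1, ..., id) under the periodic allocation (a1, ..., ad)."""
    if len(coefficients) != len(bucket):
        raise ValueError("allocation and bucket must have the same dimension")
    return sum(a * i for a, i in zip(coefficients, bucket)) % n_disks


# Print the disk assigned to every bucket of a 5 x 5 grid for allocation (1, 2).
for i1 in range(5):
    print([periodic_disk((1, 2), (i1, i2), 5) for i2 in range(5)])
```

The same function covers the 3- and 4-dimensional cases, because the allocation tuple and the bucket coordinates only need to have equal length.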

Results

Dimensionality  Additive Error  Threshold
2 Dimensions    txt             txt
3 Dimensions    txt             txt
4 Dimensions    txt             txt

Contact

You can send an e-mail to this address for any questions.