Milestone Report

Progress So Far

We familiarized ourselves with the Adaptive Radix Tree data structure and ways to parallelize it by reading the two papers. However, we encountered some problems implementing the data structure in Rust. The most ergonomic way of implementing the various ART node types is to use an enum (tagged union), which has the size of the largest variant. Under this layout, we would lose the advantage of having smaller node types. If we want to modify the memory layout, we need to specify our memory allocation layout using unsafe Rust, which we don't feel comfortable doing. Also, the reference sequential implementation uses a lot of memory copying operations instead of swapping the pointer. Thus, we decided to pivot back to using C++, a language both of us are more familiar with. It also gives us more flexibility in controlling the internal memory layout and working with raw pointers.

New Goals and Deliverables

Compared to the goals listed in the proposal, we decided not to include the comparison between our implementation and other concurrent data structures due to time constraint. We will still have thorough scalablity and contention analysis.

75% Goals

Related works reading. Understand the ART data structure and the synchronization approaches described in the papers.
Sequential implementation of ART. Implement a version of ART with no synchronization. This version will help us understand the data structure better while serving as a reference to compare against.
Concurrent implementation of ART using lock coupling (LC). As shown in the paper, the code structure for lock coupling and optimistic lock coupling are very similar. The LC implementation of ART will also serve as an alternative synchronization approach to compare against with.
Extensive testing on implemented sequential and concurrent ART implementations.

100% Goals

75% Goals, plus:

Scalability analysis. We will run microbenchmarks to investigate the scaliability of each finished implementations under low contention.
Contention analysis. We will run microbenchmarks to measure the performance of each finished implementations when there are simulantaneous read and write operations.
Concurrent implementation of ART using optimistic lock coupling (OLC). Modify the LC version to do OLC.

125% Goals

100% Goals, plus:

Concurrent implementation of ART using read-optimized write exclusion (ROWEX). The ROWEX synchronization approach should further improve read performance since a read will not wait or restart in this scheme.
Run YCSB benchmark. In addtion to the microbenchmarks, it might be beneficial to evaluate the performance on a synthetic workload. YCSB is common set of workloads for evaluating the performance of different key-value stores, which fits our scope.
SIMD optimizations. SIMD-based parallel comparison can be used to further improve lookup performance.

Demo

A test workload will be ran on the sequential and parallel versions of our ART implementation, and we will specifically be looking at the speedup of the parallel versions for an increasing thread count. Our poster will also show speedup graphs on our ART implementation.

Analysis

We will be comparing speedup of our parallel implementation to the our sequential implementation. We are also interested in seeing the performance difference between tradational lock coupling, and optimistic lock coupling. If time permits, we will be adding in read-optimized write exclusion for comparison between the parallel implementations.

New Schedule

Week	Work Item	Work split
04/16 - 04/21	Sequential implementation - insert, get	Yuchen
04/16 - 04/21	Sequential implementation - remove	Hang
04/22 - 04/24	Garbage collection and lock implementation	Yuchen
04/22 - 04/24	Benchmark setup	Hang
04/25 - 04/30	LC and OLC implementation - insert, get	Yuchen
04/25 - 04/30	LC and OLC implementation - remove	Hang
05/01 - 05/05	Final report/result collection and analysis	Together
05/06	Poster Session	Together

Concerns

One minor concern that we have is the time constraint on implementing both LC and OLC. The sequential version of ART will already take some time implement, so there's a chance that we may run out of time to implement both lock-coupling and optimistic lock-coupling, in which case, we will likely have to pick only one to implement. However, this isn't a huge concern, as the implementation difference between LC and OLC should not be too big.

Building a Concurrent Adaptive Radix Tree