280 lines
8.1 KiB
Plaintext
280 lines
8.1 KiB
Plaintext
To build the software:
|
|
|
|
$ make
|
|
|
|
To install it:
|
|
|
|
$ sudo make install
|
|
|
|
This will put the executables in /usr/local/bin
|
|
|
|
To initialize or re-create the database files:
|
|
|
|
$ sudo /usr/local/bin/dbr_ctl initdb [force]
|
|
|
|
To start the db:
|
|
|
|
$ sudo /usr/local/bin/dbr_ctl start
|
|
|
|
Look at the top of the dbr_ctl script for some initialization variables you
|
|
might want to change.
|
|
|
|
--------------
|
|
Precis
|
|
--------------
|
|
Roxanne is a very simple database server that allows a client to store and
|
|
retrieve values by key. Keys are stored in a hash map of 64K buckets. Hash
|
|
collisions are resolved by separate chaining onto linked lists at the end
|
|
of the index file. The default location for the index file is in
|
|
/var/roxanne/idx
|
|
|
|
In addition to value lookups by key, the database provides a way to group
|
|
keys in a hierarchical directory structure. See Composite Keys below.
|
|
|
|
The values for the given keys are stored in contiguous 4KB blocks in the
|
|
database file (/var/roxanne/db). A file called block_bitmap tracks the
|
|
free/busy blocks in the db file. The dbr processes memory-map this file for
|
|
very fast access. The db file starts out with zero blocks. Blocks are only
|
|
added to the db file as needed to accomodate new records. As typically
|
|
built, the database can accomodate about a billion blocks.
|
|
|
|
--------------
|
|
Composite Keys
|
|
--------------
|
|
The database supports the notion of a composite key. That is, a key
|
|
that is subdivided into a hierarchy that all keys participate in. The key-
|
|
space is similar to a typical Unix filesystem organized into directories.
|
|
|
|
To work with hierarchical keys, divide them with spaces ' '. The last
|
|
element becomes the value. The 'value' does not participate in the
|
|
key-space.
|
|
|
|
The commands
|
|
|
|
create finance accounting payroll employees 50
|
|
create finance accounting receivables employees 70
|
|
|
|
Creates a 4-level hierarchy:
|
|
|
|
1 2 3 4
|
|
finance
|
|
accounting
|
|
payroll
|
|
employees
|
|
receivables
|
|
employees
|
|
|
|
|
|
The key-space then becomes a kind of database on its own. A client
|
|
can query the database for all the subkeys of a path. This gives clients
|
|
building blocks for range queries and ordered (sorted) results.
|
|
|
|
Nodes in the key-space are reference-counted so that repeating groups
|
|
in the hierarchy don't waste space.
|
|
|
|
All lookups of values are still done via hashmap of the entire key. In
|
|
other words, values can only be fetched by providing THE ENTIRE KEY.
|
|
In this case, the hash-lookup key is 'finance accounting payroll employees'.
|
|
It points to a starting block in the db that contains '50'.
|
|
|
|
This means that point-lookups of records will always be very fast, but that
|
|
the operator can use the key-space to organize and sort results.
|
|
|
|
Results from the keys command return in ascending sort order (strcmp).
|
|
|
|
--------------
|
|
Usage Example
|
|
--------------
|
|
madison:Roxanne rothrock$ sudo dbr_ctl start
|
|
Started listening.
|
|
carp:Roxanne rothrock$ telnet localhost 4080
|
|
Trying 127.0.0.1...
|
|
Connected to localhost.
|
|
Escape character is '^]'.
|
|
create alpha record_1
|
|
STATUS: OK
|
|
SIZE: 9
|
|
Write OK.
|
|
|
|
create alpha beta record_2
|
|
STATUS: OK
|
|
SIZE: 9
|
|
Write OK.
|
|
|
|
read alpha
|
|
STATUS: OK
|
|
SIZE: 8
|
|
record_1
|
|
|
|
read alpha beta
|
|
STATUS: OK
|
|
SIZE: 8
|
|
record_2
|
|
|
|
...Now a demonstration of the 'keys' command:
|
|
|
|
create alpha gamma record_3
|
|
STATUS: OK
|
|
SIZE: 9
|
|
Write OK.
|
|
|
|
create alpha delta record_4
|
|
STATUS: OK
|
|
SIZE: 9
|
|
Write OK.
|
|
|
|
keys alpha
|
|
STATUS: OK
|
|
SIZE: 16
|
|
beta delta gamma
|
|
|
|
|
|
--------------
|
|
Protocol
|
|
--------------
|
|
|
|
Data manipulation commands:
|
|
|
|
create key [key]...value<newline>
|
|
|
|
read key [key]...<newline>
|
|
|
|
delete key [key]...<newline>
|
|
|
|
keys [key] [key]....<newline>
|
|
|
|
|
|
Other commands:
|
|
|
|
quit
|
|
|
|
Response:
|
|
"STATUS: <status string>\nSIZE: <integer>\n<integer> long string of bytes\n\n"
|
|
|
|
--------------
|
|
On Disk
|
|
--------------
|
|
|
|
The initial and maximum sizes are hard-coded in the application along
|
|
with the units of growth and storage. Eventually, these will become
|
|
configurable options. For now, it's just simpler this way.
|
|
|
|
Roxanne's persistent storage lives in /var/roxanne by default.
|
|
|
|
$ du -skh /var/roxanne/*
|
|
128M /var/roxanne/block_bitmap
|
|
64K /var/roxanne/db
|
|
64M /var/roxanne/idx
|
|
24K /var/roxanne/keydb
|
|
0B /var/roxanne/keydb_freelist
|
|
|
|
|
|
### The Block Bitmap (/var/roxanne/block_bitmap)
|
|
|
|
Each _bit_ in this file corresponds to a 4 KB block in the 'db' file.
|
|
Roxanne processes memory-map this file and use it keep track of used
|
|
and free blocks. Only one process may update the block bitmap at any
|
|
time. The block bitmap file provides a way to very quickly find and
|
|
reserve a contiguous set of blocks to store values.The block_bitmap
|
|
file is regularly flushed to disk with msync() after each reservation
|
|
request. The size of the block bitmap file is fixed at 128MB which is
|
|
enough space to keep track of 1,073,741,824 blocks.
|
|
|
|
### The database (/var/roxanne/db)
|
|
|
|
All values are stored in the db file. The unit of storage in the 'db'
|
|
file is the block, and all blocks are 4KB.
|
|
|
|
The db file starts out small and grows as more values are added to the
|
|
the database. All values inserted into the database are stored in
|
|
contiguous blocks. This simplifies the code, and provides for fast access,
|
|
but some fragmentation will occur over time if the database serves lots
|
|
of reads and writes of varying size.
|
|
|
|
#### The hash index (/var/roxanne/idx)
|
|
|
|
Initially sized at 64 megabytes, the hash index stores the full, composite
|
|
keys for values in the db file. An entry in the index is comprised of a
|
|
key stored as a string, an integer that represents a byte-offset in the
|
|
db file, an integer representing the length in bytes of the value, and
|
|
a third integer that represents a byte-offset in the idx file for
|
|
chaining additional keys.
|
|
|
|
Hash collisions are resolved by the separate chaining method. Each
|
|
index entry is 1024 bytes, so the hash table has 65,536 slots. When a
|
|
collision occurs, the colliding key is appended to the end if the index
|
|
file and the key in the slot where the collision occurred has its
|
|
'next' pointer updated to be the byte-offest in the index file where
|
|
the append occurred.
|
|
|
|
#### The keydb (/var/roxanne/keydb)
|
|
|
|
Initially sized at 0 bytes, the keydb stores the composite key hierarchy.
|
|
Each level of the hierarchy is stored as a binary tree.
|
|
|
|
Each node in the hierarchy has: a key-part, a left pointer (less-than),
|
|
a right pointer (greater-than), and a next pointer that points to the
|
|
next level in the hierarchy.
|
|
|
|
Nodes have a reference count so that duplicate key-parts don't require
|
|
additional space. Unfortunately, the database does not yet support
|
|
reclamation of keydb nodes with a reference count of 0.
|
|
|
|
##### Example
|
|
|
|
Consider the following two composite keys (values left off):
|
|
|
|
foo bar toast
|
|
foo bar jam
|
|
|
|
This set of two composite keys comprises 4 nodes in the keydb, stored
|
|
like so:
|
|
|
|
key refcount next-pointer left-pointer right-pointer
|
|
------------------------------------------------------------
|
|
foo 2 bar NULL NULL
|
|
bar 2 toast NULL NULL
|
|
toast 1 NULL jam NULL
|
|
jam 1 NULL NULL NULL
|
|
|
|
|
|
Next, add these composite keys (again, values left off).
|
|
foo bar whiskey
|
|
zen
|
|
zoo
|
|
egg
|
|
|
|
The table now looks like this:
|
|
|
|
key refcount next-pointer left-pointer right-pointer
|
|
------------------------------------------------------------
|
|
foo 3 bar egg zen
|
|
bar 3 toast NULL NULL
|
|
toast 1 NULL jam whiskey
|
|
jam 1 NULL NULL NULL
|
|
whiskey 1 NULL NULL NULL
|
|
zen 1 NULL NULL zoo
|
|
zoo 1 NULL NULL NULL
|
|
egg 1 NULL NULL NULL
|
|
|
|
|
|
Here is the composite-key space represented graphically:
|
|
|
|
+---+foo+----+
|
|
| + |
|
|
| | |
|
|
| | |
|
|
v v v
|
|
egg bar zen+--+
|
|
+ |
|
|
| v
|
|
| v
|
|
v zoo
|
|
+--+toast+--+
|
|
| |
|
|
| |
|
|
| |
|
|
v v
|
|
jam whiskey
|
|
|