

sled - it's all downhill from here!!!

A (beta) modern embedded database. Doesn't your data deserve a (beta) beautiful new home?

```rust
use sled::Db;

let tree = Db::open(path)?;

// insert and get, similar to std's BTreeMap
tree.insert(k, v1)?;
assert_eq!(tree.get(&k), Ok(Some(v1)));

// range queries
for kv in tree.range(k..) {}

// deletion
tree.remove(&k)?;

// atomic compare and swap
tree.compare_and_swap(k, Some(v1), Some(v2))?;

// block until all operations are stable on disk
// (flush_async also available to get a Future)
tree.flush()?;
```

performance

  • 2 million sustained writes per second with 8 threads, 1,000 8-byte keys, 10-byte values, intel 9900k, nvme
  • 8.5 million sustained reads per second with 16 threads, 1,000 8-byte keys, 10-byte values, intel 9900k, nvme

what's the trade-off? sled uses too much disk space sometimes. this will improve significantly before 1.0.

features

a note on lexicographic ordering and endianness

If you want to store numerical keys in a way that plays nicely with sled's iterators and ordered operations, remember to serialize them in big-endian form. Little-endian encoding (the default in many serializers) will often appear to work correctly until you store more than 256 items (more than 1 byte), at which point the lexicographic ordering of the serialized bytes diverges from the numerical ordering of the deserialized values.

  • Rust integral types have built-in to_be_bytes and from_be_bytes methods.
  • bincode can be configured to store integral types in big-endian form.

interaction with async

If your dataset resides entirely in cache (achievable at startup by setting the cache to a large enough value and performing a full iteration) then all reads and writes are non-blocking and async-friendly, without needing to use Futures or an async runtime.

To suspend an async task until writes are durable, use the flush_async method, which returns a Future that your tasks can await if they require strong durability guarantees and are willing to pay the latency cost of fsync. Note that sled automatically tries to sync all data to disk several times per second in the background, without blocking user threads.

architecture

lock-free tree on a lock-free pagecache on a lock-free log. the pagecache scatters partial page fragments across the log, rather than rewriting entire pages at a time as B+ trees for spinning disks historically have. on page reads, we concurrently scatter-gather reads across the log to materialize the page from its fragments. check out the architectural outlook for a more detailed overview of where we're at and where we see things going!

goals

  1. don't make the user think. the interface should be obvious.
  2. don't surprise users with performance traps.
  3. don't wake up operators. bring reliability techniques from academia into real-world practice.
  4. don't use so much electricity. our data structures should play to modern hardware's strengths.

known issues, warnings

  • if reliability is your primary constraint, use SQLite. sled is beta.
  • if storage price performance is your primary constraint, use RocksDB. sled uses too much space sometimes.
  • quite young, should be considered unstable for the time being.
  • the on-disk format is going to change in ways that require manual migrations before the 1.0.0 release!
  • until 1.0.0, sled targets the current stable version of rust. after 1.0.0, we will aim to trail current by at least one version. If this is an issue for your business, please consider helping us reach 1.0.0 sooner by financially supporting our efforts to get there.

plans

  • Typed Trees that support working directly with serde-friendly types instead of raw bytes, and also allow the deserialized form to be stored in the shared cache for speedy access.
  • LSM tree-like write performance with traditional B+ tree-like read performance
  • MVCC and snapshots
  • forward-compatible binary format
  • concurrent snapshot delta generation and recovery
  • consensus protocol for PC/EC systems
  • pluggable conflict detection and resolution strategies for gossip + CRDT-based PA/EL systems
  • first-class programmatic access to replication stream

fund feature development

Want to support development? Help us out via Open Collective!

special thanks

Ferrous Systems provided a huge amount of engineer time for sled in 2018 and 2019. They are the world's leading Rust education and embedded consulting company. Get in touch!

Special thanks to Meili for providing engineering effort and other support to the sled project. They are building an event store backed by sled, and they offer a full-text search system which has been a valuable case study helping to focus the sled roadmap for the future.

Additional thanks to Arm, Works on Arm and Packet, who have generously donated a 96 core monster machine to assist with intensive concurrency testing of sled. Each second that sled does not crash while running your critical stateful workloads, you are encouraged to thank these wonderful organizations. Each time sled does crash and lose your data, blame Intel.

Finally, thanks to everyone who helps out by contributing on Open Collective!

contribution welcome!

want to help advance the state of the art in open source embedded databases? check out CONTRIBUTING.md!

references