The Raft Consensus Algorithm

Diego Ongaro and John Ousterhout

November 2015

Source code available at https://github.com/ongardie/raft-talk.
Unless otherwise noted, this work is:
© 2012-2015 Diego Ongaro, © 2012-2014 John Ousterhout.
Licensed under the Creative Commons Attribution 4.0 International License.
Creative Commons License

Paxos Protocol

  • Leslie Lamport, 1989
  • Nearly synonymous with consensus
“The dirty little secret of the NSDI community is that at most five people really, truly understand every part of Paxos ;-).”
—NSDI reviewer
There are significant gaps between the description of the Paxos algorithm and the needs of a real-world system...the final system will be based on an unproven protocol.
—Chubby authors

How Is Consensus Used?

Top-level system configuration

Replicate entire database state

Raft's Design for Understandability

We wanted an algorithm optimized for building real systems

  • Must be correct, complete, and perform well
  • Must also be understandable

“What would be easier to understand or explain?”

  • Fundamentally different decomposition than Paxos
  • Less complexity in state space
  • Less mechanism

Raft User Study

Raft Implementations

Name Primary Authors Language License
RethinkDB/clustering C++ AGPL
etcd/raft Blake Mizerany, Xiang Li and Yicheng Qin Go Apache 2.0
LogCabin Diego Ongaro (Stanford) C++ ISC
go-raft Ben Johnson (Sky) and Xiang Li (CMU, CoreOS) Go MIT
hashicorp/raft Armon Dadgar (hashicorp) Go MPL-2.0
hoverbear/raft Andrew Hobden, Dan Burkert Rust MIT
ckite Pablo Medina Scala Apache2
verdi/raft James Wilcox, Doug Woos, Pavel Panchekha, Zach Tatlock, Xi Wang, Mike Ernst, and Tom Anderson (University of Washington) Coq BSD
OpenDaylight Moiz Raja, Kamal Rameshan, Robert Varga (Cisco), Tom Pantelis (Brocade) Java Eclipse
zraft_lib Gunin Alexander Erlang Apache2
kanaka/raft.js Joel Martin Javascript MPL-2.0
akka-raft Konrad Malawski Scala Apache2
rafter Andrew Stone (Basho) Erlang Apache2
floss Alexander Flatter Ruby MIT
willemt/raft Willem-Hendrik Thiart C BSD
... ... ... ...

Copied from Raft website, probably stale.

LogCabin

  • Started as research platform for Raft at Stanford
  • Developed into production system at Scale Computing
  • Network service running Raft replicated state machine
  • Data model: hierarchical key-value store, kept in memory
  • Written in C++ (gcc 4.4's C++0x)