- Berkeley DB Reference Guide:
- Berkeley DB Transactional Data Store Applications
Transaction tuning
There are a few different issues to consider when tuning the performance
of Berkeley DB transactional applications. First, you should review
Access method tuning, as the
tuning issues for access method applications are applicable to
transactional applications as well. The following are additional tuning
issues for Berkeley DB transactional applications:
- access method
- Highly concurrent applications should use the Queue access method where
possible, as it provides finer-grained locking than the other
access methods. Otherwise, applications usually see better concurrency
when using the Btree access method than when using either the Hash or
Recno access methods.
- record numbers
- Using record numbers outside of the Queue access method often slows
down concurrent applications, because record numbers limit the degree
of concurrency available in the database. This applies both to the
Recno access method and to the Btree access method when retrieval by
record number is configured.
- Btree database size
- When using the Btree access method, applications supporting concurrent
access may see excessive numbers of deadlocks in small databases. There
are two different approaches to resolving this problem. First, as the
Btree access method uses page-level locking, decreasing the database
page size can result in fewer lock conflicts. Second, in the case of
databases that are cyclically growing and shrinking, turning off reverse
splits can leave the database with enough pages that there will be fewer
lock conflicts.
- transactionally protected read operations
- Most applications do not need repeatable reads. Performing all read
operations outside of transactions can often significantly increase
application throughput. In addition, limiting the lifetime of
non-transactional cursors will reduce the length of time locks are
held, thereby improving concurrency.
- DB_DIRTY_READ
- Consider using the DB_DIRTY_READ flag for transactions, cursors,
or individual read operations. This flag allows read operations to
return data that has been modified but not yet committed, and can
significantly increase throughput in applications that do not require
the data they read to be guaranteed permanent in the database.
- DB_RMW
- Consider using the DB_RMW flag to immediately acquire write locks
when reading data items that will subsequently be modified. Although
this flag may increase contention (because write locks are held longer
than they would otherwise be), it may decrease the number of deadlocks
that occur.
- DB_TXN_NOSYNC
- By default, transactional commit in Berkeley DB implies durability; that is,
all committed operations will be present in the database after
recovery from any application or system failure. For applications not
requiring that level of certainty, specifying the DB_TXN_NOSYNC
flag will often provide a significant performance improvement. In this
case, the database will still be fully recoverable, but some number of
committed transactions might be lost after system failure.
- large key/data items
- Transactional protections in Berkeley DB are guaranteed by before and after
physical image logging. This means applications modifying large
key/data items also write large log records, and, in the case of the
default transaction commit, threads of control must wait until those
log records have been flushed to disk. Applications supporting
concurrent access should try to keep key/data items small wherever
possible.
- log buffer size
- Berkeley DB internally maintains a buffer of log writes. By default, the
buffer is written to disk at transaction commit or whenever it is
filled. If it is consistently being filled before transaction
commit, it will be written multiple times per transaction, costing
application performance. In these cases, increasing the size of the
log buffer can increase application throughput.
- trickle write
- In some applications, the cache is sufficiently active and dirty that
readers frequently need to write a dirty page in order to have space in
which to read a new page from the backing database file. You can use
the db_stat utility (or the statistics returned by the
memp_stat function) to see how often this is happening in your
application's cache. In this case, using a separate thread of control
and the memp_trickle interface to trickle-write pages can often
increase the overall throughput of the application.
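
Two of the items above, log buffer size and Btree database size, lend
themselves to back-of-the-envelope arithmetic. The following is a minimal
sketch in Python, not Berkeley DB code. The function names
log_writes_per_txn and page_conflict_probability are hypothetical, and
both models are deliberate simplifications: the first assumes the log
buffer is empty when a transaction begins and that only one transaction
writes to it at a time; the second assumes two transactions each lock the
same number of distinct pages chosen uniformly at random, and it ignores
internal Btree pages and the difference between read and write locks.

```python
from math import comb


def log_writes_per_txn(txn_log_bytes: int, buffer_bytes: int) -> int:
    """Estimate how many times the log buffer is written for one transaction.

    Simplified model (an assumption, not Berkeley DB's actual algorithm):
    the buffer is written each time it fills, and once more at commit if
    anything remains in it.
    """
    if buffer_bytes <= 0:
        raise ValueError("buffer_bytes must be positive")
    if txn_log_bytes < 0:
        raise ValueError("txn_log_bytes must not be negative")
    # Ceiling division: full buffers, plus one partial buffer at commit.
    return -(-txn_log_bytes // buffer_bytes)


def page_conflict_probability(n_pages: int, pages_per_txn: int) -> float:
    """Probability that two transactions lock at least one page in common.

    Each transaction is assumed to lock pages_per_txn distinct pages drawn
    uniformly at random from n_pages. A conflict is not a deadlock, but
    fewer conflicts leave fewer opportunities for one.
    """
    if not 0 <= pages_per_txn <= n_pages:
        raise ValueError("pages_per_txn must be between 0 and n_pages")
    # Chance that the second transaction avoids every page held by the first.
    no_overlap = comb(n_pages - pages_per_txn, pages_per_txn) / comb(
        n_pages, pages_per_txn
    )
    return 1.0 - no_overlap


if __name__ == "__main__":
    # A transaction generating 100 KB of log records, with two buffer sizes.
    for buf_kb in (32, 128):
        writes = log_writes_per_txn(100 * 1024, buf_kb * 1024)
        print(f"{buf_kb} KB log buffer: {writes} write(s) per transaction")

    # The same 64 KB of leaf data stored with two different page sizes.
    for page_size in (8192, 512):
        n_pages = (64 * 1024) // page_size
        p = page_conflict_probability(n_pages, 2)
        print(f"{page_size}-byte pages ({n_pages} pages): conflict p = {p:.3f}")
```

Under these assumptions, a log buffer larger than the log output of a
typical transaction reduces the estimate to a single write at commit, and
spreading the same data over more pages lowers the chance that two
transactions contend for the same page. Real workloads are rarely
uniform, so treat the numbers as a guide to the direction of the effect
rather than its size.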
Copyright Sleepycat Software