The BeeDict module provides two high-level on-disk
dictionary implementations. The first (BeeDict) can
work with arbitrary hashable key objects, while the second
(BeeStringDict) uses strings of limited size as keys,
providing slightly better performance. Both variants need
pickleable Python objects as keys and values.
Data transfer to and from the dictionaries is done in the
same way as for in-memory dictionaries, e.g.
d['key'] = 1; print d['key']; del d['key'], so usage should be
transparent to Python programs using either in-memory or
on-disk dictionaries. Not all dictionary methods are
supported, though.
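The following minimal sketch illustrates what "transparent" means in practice. It restricts itself to the three operations named above (item assignment, item lookup and item deletion), so the same function should work with any object offering them. The helper name roundtrip is hypothetical and not part of the module; a plain in-memory dictionary is used here so that the example runs without any on-disk files.

```python
def roundtrip(d, key, value):
    """Store, read back and delete one item using only item access."""
    d[key] = value      # store the item
    result = d[key]     # read it back
    del d[key]          # remove it again
    return result


# An in-memory dictionary is used here; an on-disk dictionary that
# supports the same three operations could be passed in instead.
memory_dict = {}
print(roundtrip(memory_dict, 'key', 1))
```

Code written this way avoids relying on dictionary methods that the on-disk variants may not provide.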
BeeDict Objects
BeeDict objects are on-disk dictionaries which use a
hash-to-address index. Both keys and values must be
pickleable and can have arbitrary size (keys shouldn't be
too long though); keys have to be hashable.
Hash collisions are treated by sequential reads of all
records with the same hash value and testing for equality of
keys. This can be expensive!
BeeDicts use a BeeStorage.BeeKeyValueStorage instance as
storage object and a BeeIndex.BeeIntegerIndex instance as
index object.
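To make the collision handling described above concrete, here is a small in-memory model of a hash-to-address index. It is a sketch of the general technique only and does not reproduce the BeeStorage or BeeIndex code: the class name HashIndexSketch, the hashfunc hook and the reads counter are all hypothetical, a Python list stands in for the data file, and a list position stands in for a record address.

```python
import pickle


class HashIndexSketch(object):
    """Toy model of a hash-to-address index (not the BeeDict code)."""

    def __init__(self, hashfunc=hash):
        self.hashfunc = hashfunc   # hypothetical hook, lets us force collisions
        self.records = []          # stands in for the data file
        self.index = {}            # hash value -> list of record addresses
        self.reads = 0             # number of records read during lookups

    def _find(self, key):
        """Return the address of the record holding key, or None."""
        for address in self.index.get(self.hashfunc(key), []):
            # Sequential read of every record sharing this hash value.
            self.reads += 1
            stored_key, _ = pickle.loads(self.records[address])
            if stored_key == key:
                return address
        return None

    def __setitem__(self, key, value):
        record = pickle.dumps((key, value))
        address = self._find(key)
        if address is None:
            # New key: append the record and register its address.
            self.records.append(record)
            self.index.setdefault(self.hashfunc(key), []).append(
                len(self.records) - 1)
        else:
            # Existing key: overwrite the record in place.
            self.records[address] = record

    def __getitem__(self, key):
        address = self._find(key)
        if address is None:
            raise KeyError(key)
        return pickle.loads(self.records[address])[1]


# Force every key onto the same hash value to exercise the collision path.
d = HashIndexSketch(hashfunc=lambda key: 0)
d['a'] = 1
d['b'] = 2
print(d['b'])    # the record for 'a' is read and rejected first
print(d.reads)   # grows with the number of colliding records
```

Each lookup has to unpickle every record that shares the hash value until the keys compare equal, which is why many collisions make access expensive.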
BeeDict objects are constructed using:
BeeDict(name,min_recordsize=0,readonly=0,recover=0,autocommit=0,validate=0)
Create an instance using name as basename for
the data and index files. Two files will be created:
name.dat and name.idx.
min_recordsize is passed to the BeeStorage
as indicator of the minimum size for data
records. readonly can be set to true to
open the files in read-only mode, preventing any disk
modifications.
To open the dictionary in recovery mode, pass a keyword
recover=1. Then run .recover()
and reopen using the normal settings. The
AutoRecover() wrapper can take care of this
action for you automatically.
If autocommit is true the cache control
will do an automatic .commit() whenever the
transaction log overflows.
If validate is true, the dictionary will
run a validation check after having successfully opened
storage and index. RecreateIndexError or
RecoverError exceptions could be raised in
case inconsistencies are found.
BeeDict Instance Methods
BeeDict Instance Attributes
BeeDictCursor Objects
BeeDictCursor objects are intended to iterate over the
database one item at a time without the need to read all
keys. You can then read/write to the current cursor position
and thus modify the dictionary in place.
Note that modifying the targeted dictionary while using a
cursor can cause the cursor to skip new entries or fail due
to deleted items. In particular, deleting the key to which
the cursor currently points can cause errors to be raised.
In all other cases, the cursor will be repositioned.
BeeDictCursor objects are constructed using the BeeDict
.cursor() method.
BeeDictCursor Instance Methods
BeeDictCursor Instance Attributes
BeeDictCursor objects don't have any useful attributes. Use the instance
methods to query the key and value objects.
BeeStringDict Objects
BeeStringDict objects are on-disk dictionaries which use a
string-to-address index with strings of limited size as
keys. Values must be pickleable and can have arbitrary size.
Since hash collisions cannot occur, this dictionary type may
have some performance advantages over the standard BeeDict
dictionary.
BeeStringDict objects are constructed using:
BeeStringDict(name,keysize=10,min_recordsize=0,readonly=0,recover=0,autocommit=0,validate=0)
Create an instance using name as basename for
the data and index files. Two files will be created:
name.dat and name.idx.
keysize gives the maximal size of the
strings used as index keys. min_recordsize
is passed to the BeeStorage as indicator of the minimum
size for data records. readonly can be set
to true to open the files in read-only mode, preventing
any disk modifications.
To open the dictionary in recovery mode, pass a keyword
recover=1. Then run .recover()
and reopen using the normal settings. The
AutoRecover() wrapper can take care of this
action for you automatically.
If autocommit is true the cache control
will do an automatic .commit() whenever the
transaction log overflows.
If validate is true, the dictionary will
run a validation check after having successfully opened
storage and index. RecreateIndexError or
RecoverError exceptions could be raised in
case inconsistencies are found.
Note that the keysize is currently not stored in the
dictionary itself -- you'll have to store this
information in some other form. This may change in
future versions.
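Because the keysize is not stored in the dictionary files, one possible way to keep it "in some other form" is a small sidecar file next to the data and index files. The sketch below is only an illustration of that idea: the helper names save_keysize and load_keysize and the .meta file extension are assumptions and not part of the module. The default of 10 mirrors the keysize default shown in the constructor signature.

```python
import json
import os
import tempfile


def save_keysize(name, keysize):
    """Record keysize in a hypothetical sidecar file <name>.meta."""
    with open(name + '.meta', 'w') as f:
        json.dump({'keysize': keysize}, f)


def load_keysize(name, default=10):
    """Return the recorded keysize, or default if no sidecar file exists."""
    path = name + '.meta'
    if not os.path.exists(path):
        return default
    with open(path) as f:
        return json.load(f)['keysize']


# Use a temporary directory so the example leaves no files behind in
# the working directory.
basename = os.path.join(tempfile.mkdtemp(), 'mydict')
save_keysize(basename, 20)
print(load_keysize(basename))
```

The value returned by load_keysize would then be passed as the keysize argument when the dictionary is reopened.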
BeeStringDict Instance Methods
BeeStringDict Instance Attributes
BeeStringDictCursor Objects
BeeStringDictCursor objects are intended to iterate over the
database one item at a time without the need to read all
keys. You can then read/write to the current cursor position
and thus modify the dictionary in place.
Note that modifying the targeted dictionary while using a
cursor can cause the cursor to skip new entries or fail due
to deleted items. In particular, deleting the key to which
the cursor currently points can cause errors to be raised.
In all other cases, the cursor will be repositioned.
BeeStringDictCursor objects are constructed using the BeeStringDict
.cursor() method.
BeeStringDictCursor Instance Methods
BeeStringDictCursor Instance Attributes
Functions
Constants