pycassa.batch – Batch Operations
The batch interface allows insert, update, and remove operations to be performed in batches. This provides a convenient mechanism for streaming updates or performing a large number of operations while reducing the number of RPC round trips.
Batch mutator objects are synchronized and can be safely passed around threads.
>>> b = cf.batch(queue_size=10)
>>> b.insert('key1', {'col1':'value11', 'col2':'value21'})
>>> b.insert('key2', {'col1':'value12', 'col2':'value22'}, ttl=15)
>>> b.remove('key1', ['col2'])
>>> b.remove('key2')
>>> b.send()
One can use the queue_size argument to control how many mutations will be queued before an automatic send() is performed. This allows simple streaming of updates. If set to None, automatic checkpoints are disabled. Default is 100.
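The checkpointing behavior described above can be illustrated without a live cluster. The following is a minimal sketch of the queueing semantics only; SketchMutator is a hypothetical stand-in for illustration, not part of pycassa:

```python
class SketchMutator:
    """Toy model of the batch queueing semantics (not the real Mutator)."""

    def __init__(self, queue_size=100):
        self.queue_size = queue_size
        self._queue = []
        self.sent_batches = []  # record of what each send() flushed

    def insert(self, key, columns):
        self._queue.append(('insert', key, columns))
        self._maybe_send()
        return self

    def remove(self, key, columns=None):
        self._queue.append(('remove', key, columns))
        self._maybe_send()
        return self

    def _maybe_send(self):
        # With queue_size=None (or 0), automatic checkpoints are disabled.
        if self.queue_size and len(self._queue) >= self.queue_size:
            self.send()

    def send(self):
        if self._queue:
            self.sent_batches.append(list(self._queue))
            del self._queue[:]
        return self

b = SketchMutator(queue_size=2)
b.insert('key1', {'col1': 'value11'})  # queued, not yet sent
b.insert('key2', {'col1': 'value12'})  # queue reaches queue_size -> automatic send
b.insert('key3', {'col1': 'value13'})  # starts a new batch
b.send()                               # explicit flush of the remainder
```

Here the first two inserts are flushed automatically as one batch, and the third only when send() is called explicitly.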
Supercolumns are supported:
>>> b = scf.batch()
>>> b.insert('key1', {'supercol1': {'colA':'value1a', 'colB':'value1b'},
...                   'supercol2': {'colA':'value2a', 'colB':'value2b'}})
>>> b.remove('key1', ['colA'], 'supercol1')
>>> b.send()
You may also create a Mutator directly, allowing operations on multiple column families:
>>> b = Mutator(pool)
>>> b.insert(cf, 'key1', {'col1':'value1', 'col2':'value2'})
>>> b.insert(supercf, 'key1', {'subkey1': {'col1':'value1', 'col2':'value2'}})
>>> b.send()
Note
This interface does not implement atomic operations across column families. All the limitations of the batch_mutate Thrift API call apply. Remember, a mutation in Cassandra is always atomic per key per column family only.
Note
If a single operation in a batch fails, the whole batch fails.
In addition, mutators can be used as context managers, where an implicit send() will be called upon exit.
>>> with cf.batch() as b:
... b.insert('key1', {'col1':'value11', 'col2':'value21'})
... b.insert('key2', {'col1':'value12', 'col2':'value22'})
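The context-manager behavior amounts to calling send() when the block exits. A minimal sketch of that mechanic, using a hypothetical SketchBatch class rather than pycassa's actual implementation:

```python
class SketchBatch:
    """Toy batch that flushes on context exit (illustration only)."""

    def __init__(self):
        self._queue = []
        self.sent = []

    def insert(self, key, columns):
        self._queue.append((key, columns))
        return self

    def send(self):
        self.sent.extend(self._queue)
        del self._queue[:]
        return self

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Implicit send() on exit, mirroring cf.batch() used with "with".
        self.send()
        return False  # do not swallow exceptions

b = SketchBatch()
with b:
    b.insert('key1', {'col1': 'value11'})
    b.insert('key2', {'col1': 'value12'})
# Both inserts have now been flushed by the implicit send().
```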
Calls to insert() and remove() can also be chained:
>>> cf.batch().remove('foo').remove('bar').send()
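Chaining works because each operation returns the mutator itself. A minimal sketch of that pattern (ChainingBatch is a hypothetical illustration, not pycassa code):

```python
class ChainingBatch:
    """Illustrates method chaining: each operation returns the batch itself."""

    def __init__(self):
        self.ops = []

    def remove(self, key, columns=None):
        self.ops.append(('remove', key, columns))
        return self  # returning self is what makes chaining possible

    def send(self):
        # Flush and return the operations that were queued.
        sent, self.ops = self.ops, []
        return sent

sent = ChainingBatch().remove('foo').remove('bar').send()
```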
To use atomic batches (supported in Cassandra 1.2 and later), pass atomic=True when creating the batch:
>>> cf.batch(atomic=True)
or when sending it:
>>> b = cf.batch()
>>> b.insert('key1', {'col1':'val2'})
>>> b.insert('key2', {'col1':'val2'})
>>> b.send(atomic=True)
class pycassa.batch.Mutator(pool, queue_size=100, write_consistency_level=None, allow_retries=True, atomic=False)
Batch update convenience mechanism. Queues insert/update/remove operations and executes them when the queue is full or send is called explicitly.
pool is the ConnectionPool that will be used for operations.
After queue_size operations, send() will be executed automatically. Use 0 to disable automatic sends.
insert(column_family, key, columns[, timestamp][, ttl])
Adds a single row insert to the batch. column_family is the ColumnFamily that the insert will be executed on. If this is used on a counter column family, integers may be used for column values, and they will be taken as counter adjustments.
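The counter-adjustment semantics can be sketched in isolation: each integer value adds to the current counter rather than replacing it. CounterSketch below is a hypothetical toy model, not pycassa's counter implementation:

```python
from collections import defaultdict

class CounterSketch:
    """Toy model of counter columns: integer values are adjustments."""

    def __init__(self):
        self.counters = defaultdict(int)
        self._queue = []

    def insert(self, key, columns):
        # For a counter column family, each integer adds to the current value.
        self._queue.append((key, columns))
        return self

    def send(self):
        for key, columns in self._queue:
            for col, delta in columns.items():
                self.counters[(key, col)] += delta
        del self._queue[:]
        return self

b = CounterSketch()
b.insert('key1', {'hits': 1})
b.insert('key1', {'hits': 3})
b.send()
```

After the send, the 'hits' counter for 'key1' holds the sum of both adjustments.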
remove(column_family, key[, columns][, super_column][, timestamp])
Adds a single row remove to the batch. column_family is the ColumnFamily that the remove will be executed on.
send([write_consistency_level])
Sends all operations currently in the batch and clears the batch.
class pycassa.batch.CfMutator(column_family, queue_size=100, write_consistency_level=None, allow_retries=True, atomic=False)
A Mutator that deals only with one column family. column_family is the ColumnFamily that all operations will be executed on.
insert(key, cols[, timestamp][, ttl])
Adds a single row insert to the batch.
remove(key[, columns][, super_column][, timestamp])
Adds a single row remove to the batch.