Code Documentation

class dgim.Dgim(N, error_rate=0.5)

An implementation of the DGIM algorithm. It estimates the number of “True” present the last N elements of a boolean stream. The datastructure it uses is very compact and has a memory complexity of O(log(N)^2).

The algorithm is described in: Datar, Mayur, et al. “Maintaining stream statistics over sliding windows.” SIAM Journal on Computing 31.6 (2002): 1794-1813.

An explanation of the algorithm can also be found here: http://infolab.stanford.edu/~ullman/mmds/ch4.pdf

__init__(N, error_rate=0.5)

Constructor

Parameters:
  • N (int) – sliding window width. The algorithm will return an estimate of the number of “True” in the last N elements of the stream.
  • error_rate – the maximum error made by the algorithm.

The error rate is in ]0, 1] Let c be the true result and e the estimate returned by the dgim algorithm. abs(c-e) < error_rate * c The lower the error, the higher the object footprint. :type error_rate: float

get_count()

Returns an estimate of the number of “True” in the last N elements of the stream.

Return type:int
update(elt)

Update the stream with one element.

Parameters:elt (bool) – the latest element of the stream