Code Documentation¶
- class dgim.Dgim(N, error_rate=0.5)¶
An implementation of the DGIM algorithm. It estimates the number of “True” present the last N elements of a boolean stream. The datastructure it uses is very compact and has a memory complexity of O(log(N)^2).
The algorithm is described in: Datar, Mayur, et al. “Maintaining stream statistics over sliding windows.” SIAM Journal on Computing 31.6 (2002): 1794-1813.
An explanation of the algorithm can also be found here: http://infolab.stanford.edu/~ullman/mmds/ch4.pdf
- __init__(N, error_rate=0.5)¶
Constructor
Parameters: - N (int) – sliding window width. The algorithm will return an estimate of the number of “True” in the last N elements of the stream.
- error_rate – the maximum error made by the algorithm.
The error rate is in ]0, 1] Let c be the true result and e the estimate returned by the dgim algorithm. abs(c-e) < error_rate * c The lower the error, the higher the object footprint. :type error_rate: float
- get_count()¶
Returns an estimate of the number of “True” in the last N elements of the stream.
Return type: int
- update(elt)¶
Update the stream with one element.
Parameters: elt (bool) – the latest element of the stream