snowshu.samplings.samplings package
Submodules
snowshu.samplings.samplings.brute_force_sampling module
class snowshu.samplings.samplings.brute_force_sampling.BruteForceSampling(probability: float = 0.1, min_sample_size: int = 1000, max_allowed_rows: int = 1000000)
Bases: snowshu.core.samplings.bases.base_sampling.BaseSampling
Heuristic sampling that uses a raw percentage of the population as the sample size, combined with Bernoulli sampling.
- Parameters
probability – The desired sample size as a decimal fraction of the population, from 0.01 to 0.99. Default 0.1 (10%).
min_sample_size – The minimum number of records to retrieve from the population. Default 1000.
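As a rough illustration of the two ideas named above (a raw-percentage sample size with a floor, and Bernoulli sampling), here is a minimal sketch. It is not SnowShu's implementation: the helper names `raw_percentage_sample_size` and `bernoulli_sample` are hypothetical, and the way `min_sample_size` is combined with `probability` is an assumption made for the example.

```python
import math
import random


def raw_percentage_sample_size(population: int,
                               probability: float = 0.1,
                               min_sample_size: int = 1000) -> int:
    """Hypothetical helper: target row count from a raw percentage.

    Assumption: the floor (min_sample_size) wins when the percentage
    yields fewer rows, and the result never exceeds the population.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    target = max(math.ceil(population * probability), min_sample_size)
    return min(target, population)


def bernoulli_sample(rows, probability: float, seed=None) -> list:
    """Hypothetical helper: keep each row independently with the given probability.

    This is what makes the sampling "Bernoulli": the number of rows
    returned is random, and only its expected value is
    probability * len(rows).
    """
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < probability]


# Usage: a 50,000-row relation sampled at 50%.
target = raw_percentage_sample_size(50_000, probability=0.5)
sample = bernoulli_sample(range(50_000), probability=0.5, seed=7)
```

Because each row is kept independently, the realized sample size will usually differ slightly from the target, which is one reason a minimum sample size is useful for small relations.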
prepare(relation: Relation, source_adapter: BaseSourceAdapter) → None
Runs all necessary pre-activities and instantiates the sample method.
Prepare is called before the primary query is compiled, so it can be used for any necessary pre-compile activities (such as collecting a histogram from the relation).
- Parameters
relation – The Relation object to prepare.
source_adapter – The source adapter instance to use for executing prepare queries.
size: int = None
snowshu.samplings.samplings.default_sampling module
class snowshu.samplings.samplings.default_sampling.DefaultSampling(margin_of_error: float = 0.02, confidence: float = 0.99, min_sample_size: int = 1000, max_allowed_rows: int = 1000000)
Bases: snowshu.core.samplings.bases.base_sampling.BaseSampling
Basic sampling using Cochran's formula for sample size and Bernoulli sampling. This default sampling assumes high volatility in the population.
- Parameters
margin_of_error – The acceptable error, expressed as a decimal from 0.01 to 0.10 (1% to 10%). Default 0.02 (2%). https://en.wikipedia.org/wiki/Margin_of_error
confidence – The confidence level to be observed for the sample, expressed as a decimal from 0.01 to 0.99 (1% to 99%). Default 0.99 (99%). http://www.stat.yale.edu/Courses/1997-98/101/confint.htm
min_sample_size – The minimum number of records to retrieve from the population. Default 1000.
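To make the role of margin_of_error and confidence concrete, the sketch below computes a sample size with the textbook form of Cochran's formula, n0 = z² · p · (1 − p) / e². It is an illustration only, not SnowShu's code: the function names are hypothetical, "high volatility" is interpreted here as the worst-case proportion p = 0.5, and whether the library applies the finite-population correction is an assumption.

```python
import math
from statistics import NormalDist


def cochran_sample_size(margin_of_error: float = 0.02,
                        confidence: float = 0.99,
                        proportion: float = 0.5) -> float:
    """Hypothetical helper: Cochran's formula for a large population.

    proportion=0.5 maximizes p * (1 - p), i.e. it assumes the most
    variable population and therefore the largest sample.
    """
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2


def finite_population_correction(n0: float, population: int) -> int:
    """Hypothetical helper: shrink n0 for a population of known, finite size."""
    return math.ceil(n0 / (1 + (n0 - 1) / population))


# Usage with the documented defaults (2% margin of error, 99% confidence).
n0 = cochran_sample_size()
n_small = finite_population_correction(n0, population=10_000)
```

With the defaults, n0 comes out a little above 4,100 rows regardless of how large the relation is, which is why this approach scales well to very large tables; the correction only matters when the population is not much larger than n0.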
prepare(relation: Relation, source_adapter: BaseSourceAdapter) → None
Runs all necessary pre-activities and instantiates the sample method.
Prepare is called before the primary query is compiled, so it can be used for any necessary pre-compile activities (such as collecting a histogram from the relation).
- Parameters
relation – The Relation object to prepare.
source_adapter – The source adapter instance to use for executing prepare queries.
size: int = None