Learn Sampling with MongoDB
Sample
Gets a random sample from the collection, by a specified number of docs (n).
Useful to...
- do initial analysis on a dataset
- do sampling on result set
- fetch docs for random-user-searching
- seed some random obj for computation
Conditions
When... n < 5% of the collection AND
100 docs in the collection AND
sample
is the first stage, a pseud-random cursor selects docs for returning.
When... those conditions are NOT 100% met, an in-memory random sort happens. This has the 100mb restriction.
example
db.nycFacilities.aggregate([
{
$sample: { size: 200 }
}
])
- collection has > 100 docs
- sample size is > 5% of docs
- sample stage is FIRST of the pipeline
- ... pseudo-random operation will apply
Page Tags:
database
mongodb
sampling