dask.bag.Bag.distinct
- Bag.distinct(key=None)
Distinct elements of collection
Unordered without repeats.
- Parameters
- key: {callable, str}
Defines uniqueness of items in the bag by calling key on each item. If a string is passed, key is considered to be lambda x: x[key].
Examples
>>> import dask.bag as db
>>> b = db.from_sequence(['Alice', 'Bob', 'Alice'])
>>> sorted(b.distinct())
['Alice', 'Bob']
>>> b = db.from_sequence([{'name': 'Alice'}, {'name': 'Bob'}, {'name': 'Alice'}])
>>> b.distinct(key=lambda x: x['name']).compute()
[{'name': 'Alice'}, {'name': 'Bob'}]
>>> b.distinct(key='name').compute()
[{'name': 'Alice'}, {'name': 'Bob'}]
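To make the role of key concrete, the following plain-Python sketch mimics key-based uniqueness on an ordinary list. It is an illustration only, not dask's implementation: the helper name distinct_by_key and the extra 'id' field are hypothetical, and keeping the first item seen for each key is an assumption of this sketch (the description above only promises an unordered result without repeats).

```python
def distinct_by_key(items, key=None):
    """Illustrative stand-in for key-based uniqueness (hypothetical helper, not dask code)."""
    # Mirror the documented shorthand: a string key means lambda x: x[key].
    if isinstance(key, str):
        field = key
        key = lambda x: x[field]

    seen = set()   # markers already encountered; they must be hashable
    result = []
    for item in items:
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            result.append(item)  # assumption: keep the first item per marker
    return result


records = [
    {'name': 'Alice', 'id': 1},
    {'name': 'Bob', 'id': 2},
    {'name': 'Alice', 'id': 3},
]

by_name = distinct_by_key(records, key='name')
names = sorted(r['name'] for r in by_name)  # sort, since order is not guaranteed in general
print(names)
```

Two items count as duplicates when key returns equal values for them, even if the items differ elsewhere, so only one of the two 'Alice' records survives. Because a bag's result is unordered, sorting before comparing (as the first example does with sorted) is the safer habit.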