Talk Python To Me: a Python podcast
I just found this Python podcast: http://www.talkpythontome.com/. In the latest episode you can find an interview with “Jesse Davis from MongoDB. Jesse is the maintainer for a number of popular open-source projects including the Python MongoDB driver known as PyMongo and Mongo C (for C/C++ developers, yes you read right! C developers). Jesse discusses how interesting it is to write both Python and C code and how it reawakens part of the brain.” (source)
[mongodb] get a random document from a collection
Problem
From a MongoDB collection, you want to get a random document.
Solution
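import random

def get_random_doc():
    # coll refers to your collection
    # (PyMongo 3.7+ deprecates count(); use coll.count_documents({}) there)
    count = coll.count()
    # note: random.randrange(0) raises ValueError if the collection is empty
    return coll.find()[random.randrange(count)]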
Indexing a cursor like this is described in the PyMongo documentation on cursors: here.
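On newer servers, the random pick can also be left to MongoDB itself. The following is a minimal sketch, not the method used above: it assumes a server that supports the $sample aggregation stage (MongoDB 3.2 or later) and takes the collection as a parameter, which is my own choice for illustration.

```python
def get_random_doc(coll):
    """Return one random document from coll, or None if coll is empty.

    Assumption: coll is a pymongo Collection and the server supports
    the $sample aggregation stage (MongoDB 3.2 or later).
    """
    # Ask the server for a random sample of size 1.
    cursor = coll.aggregate([{"$sample": {"size": 1}}])
    # The cursor yields at most one document; fall back to None.
    return next(iter(cursor), None)
```

Unlike the count-and-index version, this one does not raise an error on an empty collection; it returns None instead.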
Generate random hash
Problem
MongoDB generates 96-bit values (ObjectIds, usually shown as 24 hex digits) that are used as primary keys. In a project of mine I also needed randomly generated primary keys, so I decided to go the MongoDB way. So the question is: how do you generate random 96-bit hex values with Python?
Solution
#!/usr/bin/env python

import random

def my_hash(bits=96):
    assert bits % 8 == 0
    required_length = bits // 8 * 2   # two hex digits per byte
    # hex() adds a '0x' prefix and (on Python 2) may add an 'L' suffix
    s = hex(random.getrandbits(bits)).lstrip('0x').rstrip('L')
    if len(s) < required_length:
        # leading zeros were stripped, so draw again
        return my_hash(bits)
    else:
        return s

def main():
    for _ in range(3):
        print(my_hash())

#########################################################

if __name__ == "__main__":
    main()
Sample output:
f4bf4a4c949d7beee38d84a3
457ef2f29f462a4f1e54b61e
dc921ad1e6c32bc8ce8503c8
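A side note on my_hash: because it draws again whenever the hex string comes out short, the first digit of the result is never 0. A variant that pads with zeros instead of retrying could look like the sketch below. The name my_hash_padded is made up for this example. Like the original, it uses the random module, which is not meant for security-sensitive values.

```python
import random

def my_hash_padded(bits=96):
    """Return a random value of `bits` bits as a zero-padded hex string."""
    assert bits % 8 == 0
    width = bits // 4  # one hex digit encodes 4 bits
    # The nested format spec pads with leading zeros up to `width` digits,
    # so every value is accepted and no retry is needed.
    return "{:0{w}x}".format(random.getrandbits(bits), w=width)
```

With the default of 96 bits the result is always 24 hex digits long.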
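Another solution, about 3.5 times slower than the random.getrandbits version:
import os

def my_hash(bits=96):
    assert bits % 8 == 0
    # Python 2 only: the 'hex' codec via .encode() does not exist in Python 3
    return os.urandom(bits // 8).encode('hex')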
urandom needs the number of bytes as its parameter.
Tips from here.
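For Python 3, the os.urandom approach needs a small change, since bytes objects have no encode method there. Here is a hedged sketch of two equivalents. The function names are invented for this example; bytes.hex() needs Python 3.5 or later, and the secrets module needs Python 3.6 or later.

```python
import os
import secrets

def my_hash_urandom(bits=96):
    """Hex string built from OS-provided random bytes."""
    assert bits % 8 == 0
    # os.urandom takes a byte count; bytes.hex() gives two digits per byte.
    return os.urandom(bits // 8).hex()

def my_hash_secrets(bits=96):
    """Same idea, using the standard library's secrets module."""
    assert bits % 8 == 0
    # token_hex(n) returns a hex string made from n random bytes.
    return secrets.token_hex(bits // 8)
```

Both return 24 hex digits for the default of 96 bits, and neither needs the retry loop, because leading zeros are kept.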
Update (20130813)
I found a related project called SimpleFlake. SimpleFlake generates 64-bit IDs, where the ID is prefixed with a millisecond timestamp and the remaining bits are completely random. It has the advantage that sorting the IDs reflects the chronological order in which they were created.
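To make the idea concrete, here is a rough sketch of a timestamp-prefixed 64-bit ID. It is not SimpleFlake's actual code: the 41/23 bit split, the epoch and the name flake_id are assumptions picked for illustration.

```python
import random
import time

TIMESTAMP_BITS = 41   # assumption: high bits hold the millisecond timestamp
RANDOM_BITS = 23      # assumption: low bits are random (41 + 23 = 64)
EPOCH = 946684800     # assumption: 2000-01-01 00:00:00 UTC, in seconds

def flake_id(now=None):
    """Return a 64-bit integer ID: timestamp prefix plus random bits."""
    if now is None:
        now = time.time()
    # Milliseconds elapsed since the custom epoch.
    millis = int((now - EPOCH) * 1000)
    assert 0 <= millis < (1 << TIMESTAMP_BITS)
    # Shift the timestamp into the high bits and fill the rest randomly.
    return (millis << RANDOM_BITS) | random.getrandbits(RANDOM_BITS)
```

IDs created in different milliseconds compare in creation order; two IDs from the same millisecond differ only in their random part, so their relative order is arbitrary.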
