MongoDB With Python For Beginners Complete Guide

Uploaded by

Lina Zapata
Copyright
© All Rights Reserved

About the Authors

Ona Prado received a degree in Computer Science Engineering from the American University. He has
programmed computers for 10 years. Much of his experience relates to text processing, database
systems, and natural language processing (NLP). He currently consults on database applications for
companies in the financial and publishing industries.

Table of Contents
Contents
About the Authors
Table of Contents
MongoDB with Python for Beginners
PART 1: Introduction
CHAPTER 1: MongoDB: An introduction
CHAPTER 2: MongoDB and Python
CHAPTER 3: Installing MongoDB on Windows with Python
PART 2: Getting Started
CHAPTER 1: Document Databases Work
CHAPTER 2: What is a PyMongo Cursor
What is a Cursor?
PyMongo Cursor:
Output:
CHAPTER 3: Create a database in MongoDB using Python
Creating a database using Python in MongoDB
PART 3: MongoDB Queries
CHAPTER 1: Python MongoDB – Query
What is a MongoDB Query?
CHAPTER 2: MongoDB Python | Insert and Update Data
CHAPTER 3: Python MongoDB – insert_one Query
insert_one() Method
CHAPTER 4: Python MongoDB – insert_many Query
insert_many()
CHAPTER 5: Difference Between insert(), insertOne(), and insertMany() in Pymongo
CHAPTER 6: Python MongoDB – Update_one()
updateOne()
CHAPTER 7: Python MongoDB – Update_many Query
Update_many()
CHAPTER 8: MongoDB Python – Insert and Replace Operations
CHAPTER 9: MongoDB python | Delete Data and Drop Collection
CHAPTER 10: Python Mongodb – Delete_one()
Connecting to a Database
Deleting document from Collection or Database
CHAPTER 11: Python Mongodb – Delete_many()
Delete_many()
CHAPTER 12: Python MongoDB – Find
Finding data from the collection or the database
Find_one() Method
Find()
CHAPTER 13: Python MongoDB – find_one Query
CHAPTER 14: Python MongoDB – find_one_and_update Query
CHAPTER 15: Python MongoDB – find_one_and_delete query
find_one_and_delete()
CHAPTER 16: Python MongoDB – find_one_and_replace Query
CHAPTER 17: Python MongoDB – Sort
Sorting the MongoDB documents
CHAPTER 18: Python MongoDB – distinct()
distinct()
CHAPTER 19: Python MongoDB- rename()
rename()
CHAPTER 20: Python MongoDB – bulk_write()
bulk_write()
CHAPTER 21: Python MongoDB – $group (aggregation)
$group operation
CHAPTER 22: Nested Queries in PyMongo
Nested Queries in PyMongo
Query operators in PyMongo
PART 4: Working with Collections and documents in MongoDB
CHAPTER 1: Access a collection in MongoDB using Python?
Accessing a Collection
CHAPTER 2: Get the Names of all Collections using PyMongo
CHAPTER 3: Drop Collection if already exists in MongoDB using Python
CHAPTER 4: Update data in a Collection using Python
Updating Data in MongoDB
update_one()
Python3
Method: update_many()
Python3
CHAPTER 5: Get all the Documents of the Collection using PyMongo
CHAPTER 6: Count the number of Documents in MongoDB using Python
Count the number of Documents using Python
CHAPTER 7: Update all Documents in a Collection using PyMongo
Updating all Documents in a Collection
CHAPTER 8: Aggregation in MongoDB using Python
Aggregation in MongoDB
PART 5: Conversion between MongoDB data and Structured data
CHAPTER 1: Import JSON File in MongoDB using Python
Importing JSON file in MongoDB
CHAPTER 2: Convert PyMongo Cursor to JSON
CHAPTER 3: Convert PyMongo Cursor to Dataframe

MongoDB with Python for Beginners
PART 1: Introduction
CHAPTER 1: MongoDB: An introduction
MongoDB, one of the most popular NoSQL databases, is an open-source,
document-oriented database. The term ‘NoSQL’ means ‘non-relational’:
MongoDB is not based on the table-like structure of relational databases
but provides a different mechanism for storing and retrieving data.
Documents are stored in a format called BSON (a binary format similar to
JSON).
A simple MongoDB document structure:
{
    title: 'Geeksforgeeks',
    by: 'Harshit Gupta',
    url: 'https://www.geeksforgeeks.org',
    type: 'NoSQL'
}

SQL databases store data in tabular format, in a predefined data model that
is not very flexible for today’s fast-growing real-world applications.
Modern applications are more networked, social, and interactive than ever.
They store more and more data and access it at higher rates.
A Relational Database Management System (RDBMS) is often not the best
choice for handling big data, because a traditional RDBMS is designed to
scale up on one machine rather than out across many. If the database runs
on a single server, it will eventually reach a scaling limit. NoSQL
databases are designed to scale out and can perform better for such
workloads. MongoDB is one such NoSQL database: it scales by adding more
servers, and its flexible document model can increase developer
productivity.
RDBMS vs MongoDB:
An RDBMS has a fixed schema design that defines the tables and the
relationships between them, whereas MongoDB is document-oriented and
does not enforce a schema or relationships by default.
MongoDB has no direct equivalent of complex SQL joins, so data that
would be joined in an RDBMS is usually embedded in a single document.
The $lookup aggregation stage provides a limited form of join, and
MongoDB 4.0 and later support multi-document transactions.
MongoDB allows a highly flexible and scalable document structure.
For example, one document of a collection in MongoDB can have two
fields whereas another document in the same collection can have four.
MongoDB can be faster than an RDBMS for some workloads, because
related data is stored together and can be indexed efficiently.
Several terms map between the two kinds of database. A Table in an
RDBMS is called a Collection in MongoDB. Similarly, a Row is called a
Document and a Column is called a Field. MongoDB provides a default
‘_id’ field (if none is provided explicitly): a 12-byte ObjectId,
usually displayed as a 24-character hexadecimal string, that ensures
the uniqueness of every document in a collection. It is similar to
the primary key in an RDBMS.
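
To make the 12-byte ‘_id’ concrete, the sketch below assembles a value with the same layout as an ObjectId (a 4-byte timestamp, a 5-byte random value, and a 3-byte counter) using only the Python standard library. The function name make_object_id_bytes is hypothetical and exists only for this illustration; in practice the driver generates the value for you.

```python
import os
import time


def make_object_id_bytes(counter):
    """Build 12 bytes laid out like an ObjectId (illustrative only)."""
    timestamp = int(time.time()).to_bytes(4, "big")           # seconds since the epoch
    random_part = os.urandom(5)                                # random value
    counter_part = (counter % 0x1000000).to_bytes(3, "big")   # wraps at 3 bytes
    return timestamp + random_part + counter_part


oid = make_object_id_bytes(counter=1)
oid_hex = oid.hex()
print(len(oid), "bytes ->", len(oid_hex), "hex characters:", oid_hex)
```

The 12 raw bytes always print as 24 hexadecimal characters, which is the form shown in the ObjectId('...') values later in this book.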

Features of MongoDB:

Document Oriented: MongoDB stores the main subject in a minimal number
of documents instead of breaking it up into multiple relational
structures as an RDBMS does. For example, it stores all the
information about a computer in a single document called Computer
rather than in distinct relational structures like CPU, RAM, Hard
disk, etc.
Indexing: Without indexing, a database would have to scan every
document of a collection to select those that match the query, which
would be inefficient. Indexing is therefore essential for efficient
searching, and MongoDB uses it to process huge volumes of data in
very little time.
Scalability: MongoDB scales horizontally using sharding
(partitioning data across various servers). Data is partitioned into
chunks using the shard key, and these chunks are distributed evenly
across shards that reside on many physical servers. New machines can
also be added to a running database.
Replication and High Availability: MongoDB increases data
availability by keeping multiple copies of the data on different
servers. This redundancy protects the database from hardware
failures. If one server goes down, the data can be retrieved from
the other active servers that also store it.
Aggregation: Aggregation operations process data records and return
computed results, similar to the GROUP BY clause in SQL. A few
aggregation expressions are sum, avg, min, max, etc.
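
The idea behind sharding can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not MongoDB's actual chunking or balancing logic: the function shard_for, the field user_id, and the choice of three shards are all made up for the illustration. It only shows that the shard key alone decides where a document lives.

```python
import hashlib


def shard_for(shard_key_value, num_shards):
    """Pick a shard index by hashing the shard key (toy scheme)."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


NUM_SHARDS = 3
documents = [{"user_id": i, "name": f"user{i}"} for i in range(12)]

# Route every document to exactly one shard based on its shard key
shards = {index: [] for index in range(NUM_SHARDS)}
for doc in documents:
    shards[shard_for(doc["user_id"], NUM_SHARDS)].append(doc)

for index, docs in shards.items():
    print("shard", index, "holds user_ids", [d["user_id"] for d in docs])
```

Because the same key always hashes to the same shard, a lookup by shard key only needs to visit one server.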

Where do we use MongoDB?

MongoDB is preferred over an RDBMS in the following scenarios:

Big Data: If you have a huge amount of data to store, consider
MongoDB before an RDBMS. MongoDB has a built-in solution for
partitioning and sharding your database.
Unstable Schema: Adding a new column in an RDBMS is hard, whereas
MongoDB is schema-less. Adding a new field is easy and does not
affect old documents.
Distributed data: Since multiple copies of data are stored across
different servers, data can be recovered quickly and safely even if
there is a hardware failure.

Language Support by MongoDB:

Drivers for MongoDB, either official or community-maintained, are
available for popular programming languages such as C, C++, Rust, C#,
Java, Node.js, Perl, PHP, Python, Ruby, Scala, Go, and Erlang.

Installing MongoDB:
Just go to http://www.mongodb.org/downloads and select your operating
system from Windows, Linux, Mac OS X, and Solaris. A detailed
explanation of the installation is given on the MongoDB site.
For Windows, a few options for 64-bit operating systems drop down.
If you are running Windows 7, 8, or a newer version, select Windows
64-bit 2008 R2+. If you are using Windows XP or Vista, select
Windows 64-bit 2008 R2+ legacy.
Who’s using MongoDB?

MongoDB has been adopted as backend software by a number of major
websites and services including EA, Cisco, Shutterfly, Adobe, Ericsson,
Craigslist, eBay, and Foursquare.
CHAPTER 2: MongoDB and Python
Prerequisite: MongoDB: An introduction

MongoDB is a cross-platform, document-oriented database that works on the concept of
collections and documents. MongoDB offers high speed, high availability, and high
scalability.
The next question that arises is “Why MongoDB?”

Reasons to opt for MongoDB:

1. It supports hierarchical data structures (please refer to the docs for details).
2. It supports associative arrays, like dictionaries in Python.
3. Python drivers such as PyMongo connect a Python application to the database.
4. It is designed for Big Data.
5. Deployment of MongoDB is very easy.

MongoDB vs RDBMS
MongoDB and PyMongo Installation Guide
1. First start MongoDB from the command prompt using:
Method 1:

mongod

or
Method 2:

net start MongoDB

By default, the server listens on port 27017.
PyMongo is the Python driver for MongoDB. It is a separate package, not part of the
Python standard library. To use it, import MongoClient:

from pymongo import MongoClient

2. Create a connection: The first step after importing the module is to create a
MongoClient.

from pymongo import MongoClient

client = MongoClient()

3. With no arguments, MongoClient connects to the default host and port. The host and
port can also be given explicitly. The following command connects the MongoClient to
localhost on port 27017.

client = MongoClient('host', port_number)

Example: client = MongoClient('localhost', 27017)

4. It can also be done using a connection URI:

client = MongoClient("mongodb://localhost:27017/")

5. Access database objects: To create a database or switch to an existing database
we use:
Method 1: Dictionary-style

mydatabase = client['name_of_the_database']

6. Method 2: Attribute-style

mydatabase = client.name_of_the_database

7. If there is no previously created database with this name, MongoDB will implicitly
create one when data is first stored in it.
Note: Attribute-style access does not work for names that contain a dash (-). An
expression like client.my-Table is parsed by Python as a subtraction and raises an
error, so use underscores in such names or use dictionary-style access.
8. Accessing the collection: Collections are equivalent to tables in an RDBMS. We
access a collection in PyMongo in the same way we accessed the database. To access a
collection named “myTable” in the database “mydatabase”:
Method 1:

mycollection = mydatabase['myTable']

9. Method 2:

mycollection = mydatabase.myTable

10. MongoDB stores data as documents, which PyMongo represents as Python
dictionaries:

record = {
    'title': 'MongoDB and Python',
    'description': 'MongoDB is no SQL database',
    'tags': ['mongodb', 'database', 'NoSQL'],
    'viewers': 104
}

‘_id’ is a special key that is added automatically if the programmer does not
supply it explicitly. By default it is a 12-byte ObjectId, displayed as a
hexadecimal string, that ensures the uniqueness of every inserted document.

11. Insert the data inside a collection:
Methods used:

insert_one() or insert_many()

We normally use the insert_one() method to insert a document into a collection. Say
we wish to insert the data named record into ‘myTable’ of ‘mydatabase’:

rec = mycollection.insert_one(record)

The whole code looks like this when implemented.

# importing module
from pymongo import MongoClient

# creation of MongoClient, connecting with the host and port number
client = MongoClient("mongodb://localhost:27017/")

# Access database
mydatabase = client['name_of_the_database']

# Access collection of the database
mycollection = mydatabase['myTable']

# dictionary to be added in the database
record = {
    'title': 'MongoDB and Python',
    'description': 'MongoDB is no SQL database',
    'tags': ['mongodb', 'database', 'NoSQL'],
    'viewers': 104
}

# inserting the data in the database
rec = mycollection.insert_one(record)

12. Querying in MongoDB: There are certain query functions which are used to
filter the data in the database. The two most commonly used functions are:

find()
find() is used to get more than one document as the result of a query.

for i in mydatabase.myTable.find({'title': 'MongoDB and Python'}):
    print(i)

This will output all the documents in the myTable collection of mydatabase whose
title is ‘MongoDB and Python’.

count_documents()
count_documents() is used to get the number of documents that match the filter
passed as a parameter. (Older PyMongo versions offered count() for this; it was
removed in PyMongo 4.)

print(mydatabase.myTable.count_documents({'title': 'MongoDB and Python'}))

This will output the number of documents in the myTable collection of mydatabase
whose title is ‘MongoDB and Python’.

13. In older PyMongo versions the two functions could be chained as
find(...).count() to count the filtered result. Cursor.count() was also removed in
PyMongo 4, so pass the same filter to count_documents() instead.

14. To print all the documents/entries inside ‘myTable’ of database
‘mydatabase’, use the following code:

from pymongo import MongoClient

try:
    conn = MongoClient()
    # MongoClient connects lazily, so ping the server to verify the connection
    conn.admin.command("ping")
    print("Connected successfully!!!")
except Exception:
    print("Could not connect to MongoDB")

# database name: mydatabase
db = conn.mydatabase

# Created or Switched to collection names: myTable
collection = db.myTable

# To find() all the entries inside collection name 'myTable'
cursor = collection.find()
for record in cursor:
    print(record)
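
The host/port form and the URI form of the connection described in this chapter carry the same information. The sketch below shows this with the standard library only, so it runs without a MongoDB server or PyMongo installed; the helper name mongo_uri is hypothetical.

```python
from urllib.parse import urlsplit


def mongo_uri(host="localhost", port=27017):
    """Build the URI form of a host/port pair."""
    return f"mongodb://{host}:{port}/"


uri = mongo_uri()
parts = urlsplit(uri)

print(uri)
print(parts.scheme, parts.hostname, parts.port)
```

Passing either ('localhost', 27017) or this URI string to MongoClient addresses the same server.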
CHAPTER 3: Installing MongoDB on Windows with Python
We will explain the installation of MongoDB in steps. Before you install,
we suggest using the Spyder IDE from Anaconda.

Step 1 -> Install the Community Edition (Installation Link)

Step 2 -> Run the MongoDB Windows installer package that you just
downloaded. MongoDB gets installed here ->
C:\Program Files\MongoDB\Server\3.4\
Step 3 -> Let’s set up the MongoDB environment

(a) Create the data directory where all data is stored. On the C: drive,
create a folder named data, and inside it create a folder named db, or run

md C:\data\db

(b) To start MongoDB, run ->

"C:\Program Files\MongoDB\Server\3.4\bin\mongod.exe"

Wait until the connection message appears.

(c) Verify the environment path, or set it if it is not set correctly. Open
the environment variables dialog (you can find it with Windows search).
Under the System variables section, open Path and add the path of the bin
folder.

(d) To connect to MongoDB, open another command prompt and run ->

"C:\Program Files\MongoDB\Server\3.4\bin\mongo.exe"

Step 4 -> MongoDB is ready. Open Command Prompt (Admin mode) and type ->

mongod

NOTE: Up to step 4, MongoDB works only while the Command Prompt is open and
listening. The following steps install it as a Windows service instead.
Steps 5 to 8 are optional.

Step 5 -> Open a command prompt and run:

mkdir c:\data\db
mkdir c:\data\log

Step 6 -> Create a configuration file at
C:\Program Files\MongoDB\Server\3.4\mongod.cfg (the file name is mongod.cfg):

systemLog:
    destination: file
    path: c:\data\log\mongod.log
storage:
    dbPath: c:\data\db

This file can be created and saved in the Admin mode of Notepad, Notepad++, or
any other editor. To run Notepad in Admin mode, press Ctrl + Shift + Enter
when launching it. Admin mode lets you create mongod.cfg and save the above
text in it.
Step 7 -> Install the MongoDB service by starting mongod.exe with the
--install option and the --config option to specify the previously created
configuration file. Run this command on the command prompt:

"C:\Program Files\MongoDB\Server\3.4\bin\mongod.exe" --config "C:\Program Files\MongoDB\Server\3.4\mongod.cfg" --install

Step 8 -> To start and stop MongoDB, run:
To start:

net start MongoDB

To stop:

net stop MongoDB

NOTE: All commands are run in Command Prompt Admin mode. To open the command
prompt in Admin mode, either launch the normal command prompt with
Ctrl+Shift+Enter or right-click the Windows Start button and choose it from
the options shown.
Step 9 -> Open the Anaconda Command Prompt.
Step 10 -> Install the package needed to use MongoDB. To install it with
conda, run:
conda install -c anaconda pymongo
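
Once the steps above are done, a quick way to check whether mongod is listening is to try opening a TCP connection to its port. The sketch below uses only the standard library and assumes the default host and port (localhost, 27017); the helper name is_port_open is hypothetical. It reports whether something accepts connections on that port, not that the listener is actually MongoDB.

```python
import socket


def is_port_open(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures
        return False


reachable = is_port_open()
print("mongod reachable on localhost:27017:", reachable)
```

If this prints False, start the service with net start MongoDB (or run mongod) and try again.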
PART 2: Getting Started
CHAPTER 1: Document Databases Work

A document database stores and retrieves information in the form of
documents; in other words, it is a semi-structured database. Since document
databases are non-relational, they are often referred to as NoSQL databases.
A document database stores data as key-value pairs in which the values are
called documents. A document can be thought of as a complex data structure.
It can hold text, arrays, strings, JSON, XML, or any similar format, and
nested documents are also very common. This model is effective because much
of the data created today is in JSON form and is semi-structured.

The same data can be stored in both a relational database and a document
database.
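
As a small illustration of that difference, the sketch below holds the same information first as two flat relational-style tables and then as one nested document. The data (a customer named Asha with two orders) and the helper to_document are hypothetical, chosen only for this example.

```python
import json

# Relational layout: two flat "tables" linked by customer_id (hypothetical data)
customers = [{"customer_id": 1, "name": "Asha"}]
orders = [
    {"order_id": 10, "customer_id": 1, "item": "keyboard"},
    {"order_id": 11, "customer_id": 1, "item": "mouse"},
]


def to_document(customer, all_orders):
    """Embed a customer's orders inside one nested document."""
    embedded = [
        {"order_id": o["order_id"], "item": o["item"]}
        for o in all_orders
        if o["customer_id"] == customer["customer_id"]
    ]
    return {"_id": customer["customer_id"], "name": customer["name"], "orders": embedded}


document = to_document(customers[0], orders)
print(json.dumps(document, indent=2))
```

In the document form, the link column customer_id disappears because the orders live inside the customer they belong to.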
CHAPTER 2: What is a PyMongo Cursor
MongoDB is an open-source NoSQL database management system used to store large amounts
of data. MongoDB uses collections and documents instead of tables like traditional
relational databases. MongoDB documents are similar to JSON objects but use a variant
called Binary JSON (BSON) that accommodates more data types.

What is a Cursor?
When you use db.collection.find() to search for documents in a collection, it returns a
pointer to the result set. That pointer is known as a cursor. For example, if the
collection holds 2 documents, the cursor initially points to the first document and then
iterates through all the documents in the result.

PyMongo Cursor:
As discussed above, a cursor is a tool for iterating over MongoDB query result sets. In
PyMongo, the find() method returns a Cursor instance. Consider the example below.
Example:
Python3

from pymongo import MongoClient

# Connecting to mongodb
client = MongoClient('mongodb://localhost:27017/')
with client:
    db = client.GFG
    lectures = db.lecture.find()

    print(lectures.next())
    print(lectures.next())
    print(lectures.next())

    print("\nRemaining Lectures\n")
    print(list(lectures))

In this example, the find() method returns the cursor object.

lectures = db.lecture.find()

With the next() method we get the next document in the result set.

lectures.next()

With the list() function, we can transform the remaining documents of the cursor into a
Python list.
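
The way next() and list() interact can be tried without a database, because in this respect a cursor behaves like any forward-only Python iterator. The sketch below uses a plain iterator over five hypothetical documents as a stand-in for the result of db.lecture.find().

```python
# Stand-in for the result set of db.lecture.find() (hypothetical documents)
result_set = [{"_id": n, "title": f"Lecture {n}"} for n in range(1, 6)]
lectures = iter(result_set)

first_three = [next(lectures) for _ in range(3)]   # consume three documents
remaining = list(lectures)                         # only what is left over
exhausted = list(lectures)                         # nothing more to yield

print(first_three)
print(remaining)
print(exhausted)
```

Documents already fetched with next() do not appear again in the list, and a fully consumed iterator yields nothing further. A PyMongo Cursor additionally offers rewind() to start over.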
CHAPTER 3: Create a database in MongoDB using Python
MongoDB is a general-purpose, document-based, distributed database built for modern
application developers and the cloud. It is a document database, which means it stores data
in JSON-like documents. This is a natural way to think about data and can be more
expressive than the traditional table model. In the MongoDB shell there is no separate
command to create a database. Instead, the use command switches to the specified database,
and if that database does not exist, it is created once data is first stored in it.

Creating a database using Python in MongoDB

To use MongoDB from Python, we import PyMongo. From it we import MongoClient, which is
used to create a client for the database server. Using the client, a new database can be
created.
Example:

Python3

# import MongoClient
from pymongo import MongoClient

# Creating a client
client = MongoClient('localhost', 27017)

# Creating a database name GFG
db = client['GFG']
print("Database is created !!")

Output:
Database is created !!

In the above example, the local host and its port number, which is 27017 here, are passed
to the MongoClient when the client is created. Then, using the client, a database named
‘GFG’ is referenced. Note that MongoDB creates a database lazily: ‘GFG’ does not appear in
the list of databases until at least one document has been inserted into one of its
collections. Once that has happened, we can check that the database is present in the list
of databases using the following code:

Python3

list_of_db = client.list_database_names()
if "GFG" in list_of_db:
    print("Exists !!")

Output:
Exists !!

PART 3: MongoDB Queries
CHAPTER 1: Python MongoDB – Query
MongoDB is a cross-platform, document-oriented, non-relational (i.e. NoSQL) database
program. It is an open-source document database that stores data in the form of
key-value pairs.

What is a MongoDB Query?

A MongoDB query specifies a selection filter, built with query operators, that is applied
when retrieving data from a collection with the find() method. To filter a collection,
pass a query document stating the conditions the required documents must meet as the first
argument to find(). This argument is optional; without it, find() returns every document
in the collection.
Query Selectors: Following is the list of some operators used in the queries in MongoDB.

Operation              Syntax                        Description
Equality               {"key": "value"}              Matches values that are equal to a specified value.
Less Than              {"key": {"$lt": "value"}}     Matches values that are less than a specified value.
Greater Than           {"key": {"$gt": "value"}}     Matches values that are greater than a specified value.
Less Than Equal to     {"key": {"$lte": "value"}}    Matches values that are less than or equal to a specified value.
Greater Than Equal to  {"key": {"$gte": "value"}}    Matches values that are greater than or equal to a specified value.
Not Equal to           {"key": {"$ne": "value"}}     Matches all values that are not equal to a specified value.

Operation    Syntax                                      Description
Logical AND  {"$and": [{exp1}, {exp2}, …, {expN}]}       Joins query clauses with a logical AND; returns all documents that match the conditions of all clauses.
Logical OR   {"$or": [{exp1}, {exp2}, …, {expN}]}        Joins query clauses with a logical OR; returns all documents that match the conditions of any clause.
Logical NOT  {"key": {"$not": {operator-expression}}}    Inverts the effect of a query expression and returns documents that do not match the query expression.
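
To see what these operators mean without a running server, the sketch below evaluates a small subset of them ($lt, $lte, $gt, $gte, $ne, $and, $or, and plain equality) against ordinary Python dictionaries. It is a toy matcher written for this illustration, not PyMongo code: the function matches, the helper find, and the sample stock documents are all hypothetical, and real MongoDB matching has many more rules (arrays, nested fields, type ordering).

```python
import operator

COMPARISONS = {
    "$lt": operator.lt, "$lte": operator.le,
    "$gt": operator.gt, "$gte": operator.ge,
    "$ne": operator.ne,
}


def matches(document, query):
    """Return True if document satisfies query (toy subset of the operators)."""
    for key, condition in query.items():
        if key == "$and":
            if not all(matches(document, clause) for clause in condition):
                return False
        elif key == "$or":
            if not any(matches(document, clause) for clause in condition):
                return False
        elif isinstance(condition, dict):
            for op, operand in condition.items():
                if key not in document:
                    # A missing field only satisfies "$ne"
                    if op != "$ne":
                        return False
                elif not COMPARISONS[op](document[key], operand):
                    return False
        elif document.get(key) != condition:
            return False
    return True


stock = [
    {"item": "pen", "Quantity": 30},
    {"item": "ink", "Quantity": 45},
    {"item": "pad", "Quantity": 60},
]


def find(query):
    """Return the item names of all stock documents matching query."""
    return [doc["item"] for doc in stock if matches(doc, query)]


print(find({"Quantity": {"$gt": 40}}))
print(find({"$and": [{"Quantity": {"$gt": 40}}, {"Quantity": {"$lt": 50}}]}))
print(find({"$or": [{"Quantity": {"$lt": 40}}, {"Quantity": {"$gt": 50}}]}))
```

The same query dictionaries can be passed unchanged to a PyMongo collection's find() method.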

Example 1:

Python3

# importing Mongoclient from pymongo
from pymongo import MongoClient

# Making Connection
myclient = MongoClient("mongodb://localhost:27017/")

# database
db = myclient["mydatabase"]

# Created or Switched to collection
# names: GeeksForGeeks
Collection = db["GeeksForGeeks"]

# Filtering the Quantities greater
# than 40 using query.
cursor = Collection.find({"Quantity": {"$gt": 40}})

# Printing the filtered data.
print("The data having Quantity greater than 40 is:")
for record in cursor:
    print(record)

# Filtering the Quantities less
# than 40 using query.
cursor = Collection.find({"Quantity": {"$lt": 40}})

# Printing the filtered data.
print("\nThe data having Quantity less than 40 is:")
for record in cursor:
    print(record)
Example 2:

Python3

# importing Mongoclient from pymongo
from pymongo import MongoClient

# Making Connection
myclient = MongoClient("mongodb://localhost:27017/")

# database
db = myclient["mydatabase"]

# Created or Switched to collection
# names: GeeksForGeeks
Collection = db["GeeksForGeeks"]

# Filtering the (Quantities greater than
# 40 AND greater than 50) using AND query.
cursor = Collection.find({"$and": [{"Quantity": {"$gt": 40}},
                                   {"Quantity": {"$gt": 50}}]})

# Printing the filtered data.
print("Quantities greater than 40 AND Quantities greater than 50:")
for record in cursor:
    print(record)

# Filtering the (Quantities greater than
# 40 OR greater than 50) using OR query.
cursor = Collection.find({"$or": [{"Quantity": {"$gt": 40}},
                                  {"Quantity": {"$gt": 50}}]})

# Printing the filtered data.
print()
print("Quantities greater than 40 OR Quantities greater than 50:")
for record in cursor:
    print(record)
CHAPTER 2: MongoDB Python | Insert and Update Data
Prerequisites: MongoDB Python Basics
We will first see how to insert a document/entry in a collection of a database. Then we
will see how to update an existing document in MongoDB using the pymongo library in
Python. The update commands let us modify data already inserted in a MongoDB collection.

Insert data
We first insert data in MongoDB.

Step 1 – Establishing Connection: Default port number: 27017

conn = MongoClient('localhost', port_number)

If using the default port number, i.e. 27017, an alternate connection method is:

conn = MongoClient()

Step 2 – Create Database or Switch to Existing Database:

db = conn.database_name

Create a collection or switch to an existing collection:

collection = db.collection_name

Step 3 – Insert: To insert data, create a dictionary object and insert it into the
database. Methods used to insert data:

insert_one() or insert_many()

After inserting, we use the find() command to retrieve the documents inside a collection.
The find() method issues a query to retrieve data from a collection in MongoDB. All
queries in MongoDB have the scope of a single collection.
Note: ‘_id’ is different for every entry in a collection.
Let us understand data insertion with the help of code:

Python3

# Python code to illustrate
# inserting data in MongoDB
from pymongo import MongoClient

try:
    conn = MongoClient()
    print("Connected successfully!!!")
except Exception:
    print("Could not connect to MongoDB")

# database
db = conn.database

# Created or Switched to collection names: my_gfg_collection
collection = db.my_gfg_collection

emp_rec1 = {
    "name": "Mr.Geek",
    "eid": 24,
    "location": "delhi"
}

emp_rec2 = {
    "name": "Mr.Shaurya",
    "eid": 14,
    "location": "delhi"
}

# Insert Data
rec_id1 = collection.insert_one(emp_rec1)
rec_id2 = collection.insert_one(emp_rec2)

# inserted_id holds the _id assigned to each new document
print("Data inserted with record ids", rec_id1.inserted_id, " ", rec_id2.inserted_id)

# Printing the data inserted
cursor = collection.find()
for record in cursor:
    print(record)

Output:
Connected successfully!!!
Data inserted with record ids
{'_id': ObjectId('5a02227b37b8552becf5ed2a'), 'name': 'Mr.Geek', 'eid': 24, 'location': 'delhi'}
{'_id': ObjectId('5a02227c37b8552becf5ed2b'), 'name': 'Mr.Shaurya', 'eid': 14, 'location': 'delhi'}

Updating data in MongoDB

Methods used: update_one() and update_many()
Parameters passed:
+ a filter document to match the documents to update
+ an update document to specify the modification to perform
+ an optional upsert parameter
After inserting data in MongoDB, let’s update the data of the employee with eid 24.

Python3

# Python code to illustrate
# updating data in MongoDB
# with Data of employee with id:24
from pymongo import MongoClient

try:
    conn = MongoClient()
    print("Connected successfully!!!")
except Exception:
    print("Could not connect to MongoDB")

# database
db = conn.database

# Created or Switched to collection names: my_gfg_collection
collection = db.my_gfg_collection

# update all the employee data whose eid is 24
result = collection.update_many(
    {"eid": 24},
    {
        "$set": {
            "name": "Mr.Geeksforgeeks"
        },
        "$currentDate": {"lastModified": True}
    }
)

print("Data updated with id", result)

# Print the new record
cursor = collection.find()
for record in cursor:
    print(record)
Output:

Connected successfully!!!
Data updated with id
{'_id': ObjectId('5a02227b37b8552becf5ed2a'), 'name': 'Mr.Geeksforgeeks', 'eid': 24, 'location': 'delhi', 'lastModified': datetime.datetime(2017, 11, 7, 21, 19, 9, 698000)}
{'_id': ObjectId('5a02227c37b8552becf5ed2b'), 'name': 'Mr.Shaurya', 'eid': 14, 'location': 'delhi'}

To find the number of documents in the collection that matched the filter, use:

print(result.matched_count)

Here the output would be 1. The number of documents actually changed is available as
result.modified_count.
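
The difference between the matched and modified counts is easiest to see when the same update runs twice. The sketch below is a toy model over a plain Python list, not PyMongo itself: the function update_many here is a hypothetical stand-in that applies only a "$set"-style change and leaves out "$currentDate" (which, in the real example above, would change the document on every run).

```python
def update_many(documents, filter_doc, set_fields):
    """Apply a "$set"-style change; return (matched, modified) counts."""
    matched = modified = 0
    for doc in documents:
        if all(doc.get(k) == v for k, v in filter_doc.items()):
            matched += 1
            # Count the document as modified only if a value really changes
            if any(doc.get(k) != v for k, v in set_fields.items()):
                doc.update(set_fields)
                modified += 1
    return matched, modified


employees = [
    {"name": "Mr.Geek", "eid": 24, "location": "delhi"},
    {"name": "Mr.Shaurya", "eid": 14, "location": "delhi"},
]

first = update_many(employees, {"eid": 24}, {"name": "Mr.Geeksforgeeks"})
second = update_many(employees, {"eid": 24}, {"name": "Mr.Geeksforgeeks"})
print(first, second)
```

On the second run the filter still matches one employee, but nothing changes, so the modified count drops to zero.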
CHAPTER 3: Python MongoDB – insert_one Query
MongoDB is a cross-platform, document-oriented, non-relational (i.e. NoSQL) database
program. It is an open-source document database that stores data in the form of
key-value pairs. MongoDB is developed by MongoDB Inc. and was initially released on 11
February 2009. It is written in C++, Go, JavaScript, and Python. MongoDB offers
high speed, high availability, and high scalability.

insert_one() Method
This method inserts a single document into a collection in MongoDB. If the collection does
not exist, this method creates it and inserts the data into it. It takes a dictionary as a
parameter containing the name and value of each field in the document you want to insert.
This method returns an instance of the class “~pymongo.results.InsertOneResult”, whose
“inserted_id” attribute holds the id of the inserted document. If the document does not
specify an “_id” field, an “_id” field with a unique ObjectId is added to the document
before it is inserted.

Syntax:

collection.insert_one(document, bypass_document_validation=False, session=None, comment=None)

Parameters:

‘document’: The document to insert. Must be a mutable mapping type. If the
document does not have an _id field, one will be added automatically.
‘bypass_document_validation’ (optional): If “True”, allows the write to opt out of
document level validation. Default is “False”.
‘session’ (optional): An instance of the class ‘~pymongo.client_session.ClientSession’.
‘comment’ (optional): A user-provided comment to attach to this command.
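
Two behaviours described above are worth seeing in action: an ‘_id’ is added to the dictionary you pass in when it is missing, and a second insert with the same ‘_id’ is rejected. The sketch below models both with a plain dict as the "collection". It is a stand-in written for this illustration, not PyMongo: the counter replaces ObjectId generation, and KeyError replaces the DuplicateKeyError that PyMongo raises.

```python
import itertools

_id_source = itertools.count(1)   # stand-in for ObjectId generation


def insert_one(store, document):
    """Insert into a dict keyed by _id, adding an _id when it is missing."""
    if "_id" not in document:
        document["_id"] = next(_id_source)   # the caller's dict is modified
    if document["_id"] in store:
        raise KeyError(f"duplicate _id: {document['_id']!r}")
    store[document["_id"]] = document
    return document["_id"]


students = {}
record = {"name": "Raju", "Roll No": "1005", "Branch": "CSE"}
inserted_id = insert_one(students, record)
print(inserted_id, record)

try:
    insert_one(students, {"_id": inserted_id, "name": "Copy"})
    duplicate_rejected = False
except KeyError:
    duplicate_rejected = True
print("duplicate rejected:", duplicate_rejected)
```

Because the original dictionary now carries an ‘_id’, inserting the very same dictionary object a second time would also be treated as a duplicate.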

Example 1:

Python3

# importing Mongoclient from pymongo
from pymongo import MongoClient

# Making Connection
myclient = MongoClient("mongodb://localhost:27017/")

# database
db = myclient["GFG"]

# Created or Switched to collection
# names: Student
collection = db["Student"]

# Creating Dictionary of the record to be
# inserted
record = {"_id": 5,
          "name": "Raju",
          "Roll No": "1005",
          "Branch": "CSE"}

# Inserting the record in the collection
# by using collection.insert_one()
rec_id1 = collection.insert_one(record)

Example 2: Inserting multiple values

To insert multiple values, 2 methods can be followed:

#1: Naive Method: Using a for loop and insert_one

Python3

# importing Mongoclient from pymongo
from pymongo import MongoClient

# Making Connection
myclient = MongoClient("mongodb://localhost:27017/")

# database
db = myclient["GFG"]

# Created or Switched to collection
# names: Student
collection = db["Student"]

# Creating Dictionary of records to be
# inserted
records = {
    "record1": {"_id": 6,
                "name": "Anshul",
                "Roll No": "1006",
                "Branch": "CSE"},
    "record2": {"_id": 7,
                "name": "Abhinav",
                "Roll No": "1007",
                "Branch": "ME"}
}

# Inserting the records in the collection
# by using collection.insert_one()
for record in records.values():
    collection.insert_one(record)

#2: Using the insert_many method: This method can be used to insert multiple documents in a
collection in MongoDB.
CHAPTER 4: Python MongoDB – insert_many Query
MongoDB is a cross-platform, document-oriented, non-relational (i.e. NoSQL) database
program. It is an open-source document database that stores data in the form of key-value
pairs. MongoDB is developed by MongoDB Inc. and was initially released on 11
February 2009. It is written in C++, Go, JavaScript, and Python. MongoDB offers
high speed, high availability, and high scalability.
insert_many()
This method is used to insert multiple documents into a collection in MongoDB.
Its first parameter is a list of dictionaries, one for each document that we want to
insert in the collection.
This method returns an instance of class “~pymongo.results.InsertManyResult”, whose
“inserted_ids” attribute holds the _id values of the inserted documents. If a document does
not specify an “_id” field, one is added automatically and assigned a unique ObjectId
before the document is inserted.

Syntax:
collection.insert_many(documents, ordered=True, bypass_document_validation=False, session=None, comment=None)

Parameters:

‘documents’ : An iterable of documents to insert.
‘ordered’ (optional): If “True” (the default) documents will be inserted on the
server serially, in the order provided. If an error occurs all remaining inserts are
aborted. If “False”, documents will be inserted on the server in arbitrary order,
possibly in parallel, and all document inserts will be attempted.
‘bypass_document_validation’ (optional) : If “True”, allows the write to opt out of
document level validation. Default is “False”.
‘session’ (optional): An instance of ‘~pymongo.client_session.ClientSession’.
‘comment’ (optional): A user-provided comment to attach to this command
(added in PyMongo version 4.1).
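The parameters above can be exercised roughly as follows. This is a minimal sketch, assuming collection is a PyMongo collection obtained as in the examples of this chapter; the helper name insert_batch is hypothetical. It forwards the ordered flag and returns the inserted_ids attribute of the InsertManyResult.

```python
def insert_batch(collection, documents, ordered=True):
    """Insert documents with insert_many() and return their _id values.

    `collection` is assumed to be a pymongo Collection object.
    With ordered=False the server attempts every insert even if one
    of them fails; PyMongo reports failures with BulkWriteError.
    """
    result = collection.insert_many(documents, ordered=ordered)
    return result.inserted_ids
```

With a server running, a call such as insert_batch(db["Student"], mylist, ordered=False) would return the list of _id values, whether they were supplied in the documents or generated automatically.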

Example 1: In this example _id is provided.

Python3
# importing MongoClient from pymongo
from pymongo import MongoClient

myclient = MongoClient("mongodb://localhost:27017/")

# database
db = myclient["GFG"]

# Created or switched to collection named: Student
collection = db["Student"]

# Creating a list of records which we
# insert in the collection using the
# insert_many() method.
mylist = [
    {"_id": 1, "name": "Vishwash", "Roll No": "1001", "Branch": "CSE"},
    {"_id": 2, "name": "Vishesh", "Roll No": "1002", "Branch": "IT"},
    {"_id": 3, "name": "Shivam", "Roll No": "1003", "Branch": "ME"},
    {"_id": 4, "name": "Yash", "Roll No": "1004", "Branch": "ECE"},
]

# In the above list the _id field is provided, so each document
# is inserted in the collection with the _id specified.

# Inserting the entire list in the collection
collection.insert_many(mylist)

Output:

Example 2: In this example _id is not provided, it is allocated automatically by MongoDB.

Python3

# importing MongoClient from pymongo
from pymongo import MongoClient

myclient = MongoClient("mongodb://localhost:27017/")

# database
db = myclient["GFG"]

# Created or switched to collection named: Geeks
collection = db["Geeks"]

# Creating a list of records which we
# insert in the collection using the
# insert_many() method.
mylist = [
    {"Manufacturer": "Honda", "Model": "City", "Color": "Black"},
    {"Manufacturer": "Tata", "Model": "Altroz", "Color": "Golden"},
    {"Manufacturer": "Honda", "Model": "Civic", "Color": "Red"},
    {"Manufacturer": "Hyundai", "Model": "i20", "Color": "white"},
    {"Manufacturer": "Maruti", "Model": "Swift", "Color": "Blue"},
]

# In the above list we do not specify the _id, so a unique id
# is assigned to each record in the list by default.

# Inserting the entire list in the collection
collection.insert_many(mylist)

Output:
CHAPTER 5: Difference Between insert(), insertOne(), and insertMany() in Pymongo
MongoDB is a NoSQL database that can be used to store data required by different
applications. Python can be used to access MongoDB databases, but it requires a driver to
do so. The pymongo package acts as a native Python driver for MongoDB and provides
commands that Python applications can use to perform the required actions on the
database. The MongoDB shell offers three methods to insert records or documents into the
database, which are as follows:

insert() : Used to insert a document or documents into a collection. If the collection
does not exist, then insert() will create the collection and then insert the specified
documents.

Syntax

db.collection.insert(<document or array of documents>,


{
writeConcern: <document>,
ordered: <boolean>
}
)

Parameter

<document>: The document or record that is to be stored in the database


writeConcern: Optional.
ordered: Optional. Can be set to true or false.
Example:

Python3

# importing MongoClient from pymongo
from pymongo import MongoClient

myclient = MongoClient("mongodb://localhost:27017/")

# database
db = myclient["GFG"]

# Created or switched to collection named: College
collection = db["College"]

mylist = [
    {"_id": 1, "name": "Vishwash", "Roll No": "1001", "Branch": "CSE"},
    {"_id": 2, "name": "Vishesh", "Roll No": "1002", "Branch": "IT"},
    {"_id": 3, "name": "Shivam", "Roll No": "1003", "Branch": "ME"},
    {"_id": 4, "name": "Yash", "Roll No": "1004", "Branch": "ECE"},
]

# Inserting the entire list in the collection.
# Note: Collection.insert() was deprecated in PyMongo 3.0 and removed in
# PyMongo 4.0, so this call only works with older versions of the driver.
collection.insert(mylist)

Output:

insertOne() : Used to insert a single document or record into the database. If the
collection does not exist, then insertOne() method creates the collection first and
then inserts the specified document.

Syntax

db.collection.insertOne(<document>,
{
writeConcern: <document>
}
)

Parameter

<document> The document or record that is to be stored in the database


writeConcern: Optional.

Return Value: It returns the _id of the document inserted into the database.
Note: The Pymongo command for insertOne() is insert_one()
Example:

Python3

# importing Mongoclient from pymongo

from pymongo import MongoClient

# Making Connection

myclient = MongoClient("mongodb://localhost:27017/")

# database

db = myclient["GFG"]

# Created or Switched to collection

# names: GeeksForGeeks

collection = db["Student"]

# Creating Dictionary of records to be

# inserted

record = { "_id": 5,

"name": "Raju",

"Roll No": "1005",


"Branch": "CSE"}

# Inserting the record1 in the collection

# by using collection.insert_one()

rec_id1 = collection.insert_one(record)

Output:

insertMany()

Syntax
db.collection.insertMany([ <document 1>, <document 2>, … ],
{
writeConcern: <document>,
ordered: <boolean>
}
)

Parameter

<documents> The document or record that is to be stored in the database


writeConcern: Optional.
ordered: Optional. Can be set to true or false.

Return Value: It returns the _ids of the documents inserted into the database.
Note: The Pymongo command for insertMany() is insert_many()
Example:

Python3

# importing MongoClient from pymongo
from pymongo import MongoClient

myclient = MongoClient("mongodb://localhost:27017/")

# database
db = myclient["GFG"]

# Created or switched to collection named: College
collection = db["College"]

mylist = [
    {"_id": 6, "name": "Deepanshu", "Roll No": "1006", "Branch": "CSE"},
    {"_id": 7, "name": "Anshul", "Roll No": "1007", "Branch": "IT"}
]

# Inserting the entire list in the collection
collection.insert_many(mylist)

Output:


| insert() | insertOne() | insertMany() |
|---|---|---|
| Pymongo equivalent command is insert() | Pymongo equivalent command is insert_one() | Pymongo equivalent command is insert_many() |
| Deprecated in newer versions of mongo engine | Used in newer versions of mongo engine | Used in newer versions of mongo engine |
| throws WriteResult.writeConcernError and WriteResult.writeError for write and non-write concern errors respectively | throws either a writeError or writeConcernError exception. | throws a BulkWriteError exception. |
| compatible with db.collection.explain() | not compatible with db.collection.explain() | not compatible with db.collection.explain() |
| If ordered is set to true and any document reports an error then the remaining documents are not inserted. If ordered is set to false then remaining documents are inserted even if an error occurs. | If error is reported for the document it is not inserted into the database | If ordered is set to true and any document reports an error then the remaining documents are not inserted. If ordered is set to false then remaining documents are inserted even if an error occurs. |
| returns an object that contains the status of the operation. | returns the insert_id of the document inserted | returns the insert_ids of the documents inserted |
CHAPTER 6: Python MongoDB – Update_one()
MongoDB is a cross-platform, document-oriented, non-relational (i.e. NoSQL) database
program. It is an open-source document database that stores data in the form of key-value
pairs.
First create a database on which we perform the update_one() operation:

Python3

# importing Mongoclient from pymongo

from pymongo import MongoClient

try:
    conn = MongoClient()  # Making connection
except Exception:
    print("Could not connect to MongoDB")

# database

db = conn.database

# Created or Switched to collection


# names: GeeksForGeeks

collection = db.GeeksForGeeks

# Creating Records:

record1 = { "appliance":"fan",

"quantity":10,

"rating":"3 stars",

"company":"havells"}

record2 = { "appliance":"cooler",

"quantity":15,

"rating":"4 stars",

"company":"symphony"}

record3 = { "appliance":"ac",

"quantity":20,

"rating":"5 stars",

"company":"voltas"}

record4 = { "appliance":"tv",

"quantity":12,

"rating":"3 stars",

"company":"samsung"}

# Inserting the Data

rec_id1 = collection.insert_one(record1)

rec_id2 = collection.insert_one(record2)

rec_id3 = collection.insert_one(record3)

rec_id4 = collection.insert_one(record4)
# Printing the data inserted

print("The data in the database is:")

cursor = collection.find()

for record in cursor:
    print(record)

Output :

MongoDB Shell:

updateOne()
update_one() updates a single document in a MongoDB collection. It takes two required
arguments: a query (i.e. filter) object that defines which document to update, and an
object that defines the new values of the document (i.e. new_values), written with update
operators such as “$set”. The remaining arguments are optional and are covered in the
syntax section. The method finds the first document that matches the query and applies the
new values to it, i.e. it updates a single document within the collection based on the filter.
Syntax:

collection.update_one(filter, new_values, upsert=False, bypass_document_validation=False, collation=None, array_filters=None, session=None)

Parameters:

‘filter’ : A query that matches the document to update.
‘new_values’ : The modifications to apply.
‘upsert’ (optional): If “True”, perform an insert if no documents match the filter.
‘bypass_document_validation’ (optional) : If “True”, allows the write to opt out of
document level validation. Default is “False”.
‘collation’ (optional) : An instance of class ‘~pymongo.collation.Collation’. This
option is only supported on MongoDB 3.4 and above.
‘array_filters’ (optional) : A list of filters specifying which array elements an
update should apply to. Requires MongoDB 3.6+.
‘session’ (optional) : An instance of ‘~pymongo.client_session.ClientSession’.
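To see what update_one() reports back, the sketch below wraps the call and returns two attributes of the UpdateResult it produces. It assumes collection is a PyMongo collection holding the appliance records used in this chapter; the helper name set_quantity is an assumption made for illustration.

```python
def set_quantity(collection, appliance, quantity, upsert=False):
    """Set `quantity` on the first document whose appliance matches.

    Returns (matched_count, modified_count) from the UpdateResult.
    """
    result = collection.update_one(
        {"appliance": appliance},          # filter
        {"$set": {"quantity": quantity}},  # new_values
        upsert=upsert,
    )
    return result.matched_count, result.modified_count
```

A matched_count of 0 means no document matched the filter, while a match with modified_count 0 means the document already held the requested value.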

Example 1: In this example, we are going to update the fan quantity from 10 to 25.

Python3

# importing Mongoclient from pymongo

from pymongo import MongoClient

conn = MongoClient('localhost', 27017)

# database

db = conn.database

# Created or Switched to collection

# names: GeeksForGeeks

collection = db.GeeksForGeeks

# Updating fan quantity from 10 to 25.

filter = { 'appliance': 'fan' }

# Values to be updated.

newvalues = { "$set": { 'quantity': 25 } }

# Using update_one() method for single


# updation.

collection.update_one(filter, newvalues)

# Printing the updated content of the

# database

cursor = collection.find()

for record in cursor:
    print(record)

Output :

MongoDB Shell:

Example 2: In this example we are changing the tv company name from ‘samsung’ to
‘sony’ by using update_one():

Python3

# importing Mongoclient from pymongo

from pymongo import MongoClient

conn = MongoClient('localhost', 27017)


# database

db = conn.database

# Created or Switched to collection

# names: GeeksForGeeks

collection = db.GeeksForGeeks

# Updating the tv company name from

# 'samsung' to 'sony'.

filter = { 'appliance': 'tv' }

# Values to be updated.

newvalues = { "$set": { 'company': "sony" } }

# Using update_one() method for single updation.

collection.update_one(filter, newvalues)

# Printing the updated content of the database

cursor = collection.find()

for record in cursor:
    print(record)

Output :

MongoDB Shell:
NOTE :The “$set” operator replaces the value of a field with the specified value. If the field
does not exist, “$set” will add a new field with the specified value, provided that the new
field does not violate a type constraint.
CHAPTER 7: Python MongoDB – Update_many Query
MongoDB is a NoSQL database management system. Unlike MySQL, the data in
MongoDB is not stored as relations or tables. Data in MongoDB is stored as documents,
which are JavaScript/JSON-like objects. More formally, documents in MongoDB use
BSON. PyMongo is a MongoDB API for Python. It allows reading and writing data in a
MongoDB database using a Python script. It needs both Python and MongoDB to be installed
on the system.

Update_many()

The update() function has been deprecated in newer versions of MongoDB (3.xx and above);
in PyMongo, Collection.update() was deprecated in version 3.0 and removed in version 4.0.
Earlier, update() could be used for both single and multiple updates by passing “multi =
true”. Newer versions recommend update_many() and update_one() instead.
The major difference is that the user needs to decide in advance whether the query should
update a single document or multiple documents.

Syntax:

db.collection.updateMany(
   <filter>,
   <update>,
   {
     upsert: <boolean>,
     writeConcern: <document>,
     collation: <document>,
     arrayFilters: [ <filterdocument1>, ... ],
     hint: <document|string>
   }
)

Update Operators in MongoDB


Setting Values:

$set: Sets the value of a field.
$setOnInsert: Sets the value of a field only when the update inserts a new document.
$unset: Removes the field and its value.

Numeric Operators:

$inc: Increases the value by a given amount.
$min/$max: Updates the field only if the given value is less than ($min) or greater than
($max) the current value.
$mul: Multiplies the value by a given amount.

Miscellaneous Operators:

$currentDate: Sets the value of a field to the current date.
$rename: Renames a field.
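Several of these operators can be combined in one update document. The sketch below builds such a document as a plain Python dictionary, so it runs without a server; the helper name build_update and the field name last_modified are assumptions made for illustration.

```python
def build_update(set_fields=None, inc_fields=None, touch_field=None):
    """Combine $set, $inc and $currentDate into one update document."""
    update = {}
    if set_fields:
        update["$set"] = dict(set_fields)
    if inc_fields:
        update["$inc"] = dict(inc_fields)
    if touch_field:
        # True asks the server to store the current date in this field.
        update["$currentDate"] = {touch_field: True}
    return update


update = build_update(
    set_fields={"passed": True},
    inc_fields={"marks": 5},
    touch_field="last_modified",
)
print(update)

# With a live connection the document is passed as the second argument:
# collection.update_many({"marks": {"$gt": 35}}, update)
```

Each operator appears once as a key, and its value lists the fields that operator acts on.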

Sample Database:

Two use cases covered in this article where updating many records is useful:
1. Changing or incrementing several elements based on a condition.
2. Inserting a new field into multiple or all documents.
Example 1: All the students with marks greater than 35 are marked as passed.

Python3

from pymongo import MongoClient

# Creating an instance of MongoClient
# on default localhost
client = MongoClient('mongodb://localhost:27017')

# Accessing desired database and collection
db = client.gfg
collection = db["classroom"]

# Update passed field to be True for all
# students with marks greater than 35
# (assumes marks are stored as numbers)
collection.update_many(
    {"marks": {"$gt": 35}},
    {"$set": {"passed": True}}
)

Database After Query:


Example 2: New field called address added to all documents

Python

from pymongo import MongoClient

# Creating an instance of MongoClient
# on default localhost
client = MongoClient('mongodb://localhost:27017')

# Accessing desired database and collection
db = client.gfg
collection = db["classroom"]

# Address field to be added to all documents
collection.update_many(
    {},
    {"$set": {"Address": "value"}},
    # don't insert if no document found
    upsert=False,
    array_filters=None
)

Database After query:


CHAPTER 8: MongoDB Python – Insert and Replace Operations

This article focuses on how to replace a document (entry) inside a collection. We can only
replace data that has already been inserted in the database.
Prerequisites : MongoDB Python Basics
Method used: replace_one()
Aim: Replace the entire data of an old document with a new document

Insertion In MongoDB

We would first insert data in MongoDB.

Python3

# Python code to illustrate
# Insert in MongoDB
from pymongo import MongoClient

try:
    conn = MongoClient()
    print("Connected successfully!!!")
except Exception:
    print("Could not connect to MongoDB")

# database
db = conn.database

# Created or switched to collection named: my_gfg_collection
collection = db.my_gfg_collection

emp_rec1 = {
    "name": "Mr.Geek",
    "eid": 24,
    "location": "delhi"
}
emp_rec2 = {
    "name": "Mr.Shaurya",
    "eid": 14,
    "location": "delhi"
}
emp_rec3 = {
    "name": "Mr.Coder",
    "eid": 14,
    "location": "gurugram"
}

# Insert Data
rec_id1 = collection.insert_one(emp_rec1)
rec_id2 = collection.insert_one(emp_rec2)
rec_id3 = collection.insert_one(emp_rec3)
print("Data inserted with record ids", rec_id1, " ", rec_id2, rec_id3)

# Printing the data inserted
cursor = collection.find()
for record in cursor:
    print(record)

Output:

Connected successfully!!!
Data inserted with record ids
{'_id': ObjectId('5a02227b37b8552becf5ed2a'), 'name':
'Mr.Geek', 'eid': 24, 'location': 'delhi'}
{'_id': ObjectId('5a02227c37b8552becf5ed2b'), 'name':
'Mr.Shaurya', 'eid': 14, 'location': 'delhi'}
{'_id': ObjectId('5a02227c37b8552becf5ed2c'), 'name':
'Mr.Coder', 'eid': 14, 'location': 'gurugram'}

Replace_one()

After inserting the data, let’s replace the data of the employee whose name is Mr.Shaurya.
Python3

# Python code to illustrate
# replace_one() in MongoDB
from pymongo import MongoClient

try:
    conn = MongoClient()
    print("Connected successfully!!!")
except Exception:
    print("Could not connect to MongoDB")

# database
db = conn.database

# Created or switched to collection named: my_gfg_collection
collection = db.my_gfg_collection

# Replace the data of the employee whose name is Mr.Shaurya
result = collection.replace_one(
    {"name": "Mr.Shaurya"},
    {
        "name": "Mr.GfG",
        "eid": 45,
        "location": "noida"
    }
)
print("Data replaced with id", result)

# Print the new records
cursor = collection.find()
for record in cursor:
    print(record)

Output:
Connected successfully!!!
Data replaced with id
{'_id': ObjectId('5a02227b37b8552becf5ed2a'), 'name':
'Mr.Geek', 'eid': 24, 'location': 'delhi'}
{'_id': ObjectId('5a02227c37b8552becf5ed2b'), 'name':
'Mr.GfG', 'eid': 45, 'location': 'noida'}
{'_id': ObjectId('5a02227c37b8552becf5ed2c'), 'name':
'Mr.Coder', 'eid': 14, 'location': 'gurugram'}

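We have successfully replaced the entire document of the employee named ‘Mr.Shaurya’
with a new one, named ‘Mr.GfG’.
replace_one() replaces only the first matching document. To modify multiple documents,
use update_many() with update operators such as “$set”; setting upsert to True only
inserts a new document when nothing matches the filter.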
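The replace step can be wrapped in a small function that reports whether anything was replaced. This is a sketch under the assumption that collection is a PyMongo collection such as my_gfg_collection; the helper name replace_employee is hypothetical.

```python
def replace_employee(collection, name, new_document, upsert=False):
    """Replace the first document whose name matches with new_document.

    new_document must be a plain document without update operators
    such as "$set". Returns True if a document matched the filter.
    """
    result = collection.replace_one({"name": name}, new_document, upsert=upsert)
    return result.matched_count == 1
```

With a server running, replace_employee(collection, "Mr.Shaurya", {"name": "Mr.GfG", "eid": 45, "location": "noida"}) would perform the same replacement as the listing above and return True when the old document exists.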
CHAPTER 9: MongoDB python | Delete Data and Drop Collection
Prerequisite : MongoDB Basics, Insert and Update
Aim : To delete entries/documents of a collection in a database. Assume the collection is
named ‘my_collection’.
Method used : delete_one() or delete_many()

Remove All Documents That Match a Condition : The following operation
removes all documents that match the specified condition.

result = my_collection.delete_many({"name": "Mr.Geek"})

To see the number of documents deleted :

print(result.deleted_count)

Remove All Documents :

Method 1 : Remove all documents using delete_many()
result = my_collection.delete_many({})

Method 2 : Delete all documents using collection.remove()
result = my_collection.remove()
Note: remove() was deprecated in PyMongo 3.0 and removed in PyMongo 4.0, so Method 2
works only with older versions of the driver.

The best way to remove everything is to drop the collection, so that its indexes are also
removed, and then create a new collection and insert data into it.

To Drop a Collection :

db.my_collection.drop()

We first insert documents into the collection and then delete documents as per the query.

# Python program to illustrate
# delete, drop and remove
from pymongo import MongoClient

try:
    conn = MongoClient()
    print("Connected successfully!!!")
except Exception:
    print("Could not connect to MongoDB")

# database
db = conn.database

# Created or switched to collection named: my_gfg_collection
collection = db.my_gfg_collection

emp_rec1 = {
    "name": "Mr.Geek",
    "eid": 24,
    "location": "delhi"
}
emp_rec2 = {
    "name": "Mr.Shaurya",
    "eid": 14,
    "location": "delhi"
}
emp_rec3 = {
    "name": "Mr.Coder",
    "eid": 14,
    "location": "gurugram"
}

# Insert Data
rec_id1 = collection.insert_one(emp_rec1)
rec_id2 = collection.insert_one(emp_rec2)
rec_id3 = collection.insert_one(emp_rec3)
print("Data inserted with record ids", rec_id1, " ", rec_id2, rec_id3)

# Printing the documents before deletion
cursor = collection.find()
for record in cursor:
    print(record)

# Delete the document with name: Mr.Coder
result = collection.delete_one({"name": "Mr.Coder"})

# To delete all entries with eid: 14 instead, use this:
# result = collection.delete_many({"eid": 14})

# Printing the documents after deletion
cursor = collection.find()
for record in cursor:
    print(record)

OUTPUT (comment line denoted by #)

Connected successfully!!!
Data inserted with record ids

#Data INSERT
{'_id': ObjectId('5a02227c37b8552becf5ed2b'), 'name':

'Mr.GfG', 'eid': 45, 'location': 'noida'}


{'_id': ObjectId('5a0c734937b8551c1cd03349'), 'name':
'Mr.Shaurya', 'eid': 14, 'location': 'delhi'}
{'_id': ObjectId('5a0c734937b8551c1cd0334a'), 'name':
'Mr.Coder', 'eid': 14, 'location': 'gurugram'}

#Mr.Coder is deleted
{'_id': ObjectId('5a02227c37b8552becf5ed2b'), 'name':
'Mr.GfG', 'eid': 45, 'location': 'noida'}
{'_id': ObjectId('5a0c734937b8551c1cd03349'), 'name':
'Mr.Shaurya', 'eid': 14, 'location': 'delhi'}
CHAPTER 10: Python Mongodb – Delete_one()
MongoDB is a very popular cross-platform, document-oriented NoSQL (often read as “not only
SQL”) database program, written in C++. It stores data as JSON-like documents (key-value
pairs), which makes it easy to use. MongoDB can run over multiple servers, balancing the
load or duplicating data to keep the system up and running in case of hardware failure.

Connecting to a Database
Step 1 – Establishing Connection: Port number Default: 27017
conn = MongoClient('localhost', port_number)

If using the default port number, i.e. 27017, an alternate connection method is:
conn = MongoClient()

Step 2 – Create Database or Switch to Existing Database:
db = conn.database_name

Create a collection or switch to an existing collection:
collection = db.collection_name

Deleting document from Collection or Database

In MongoDB, a single document can be deleted with the delete_one() method. The first
parameter of the method is a query object that defines the document to be deleted.
If multiple documents match the filter query, only the first matching document
is deleted.
Note: Deleting a document is the same as deleting a record in the case of SQL.
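delete_one() returns a DeleteResult whose deleted_count attribute is 1 when a document was removed and 0 when nothing matched. A minimal sketch of checking it, assuming collection is a PyMongo collection; the helper name delete_first is an assumption made for illustration:

```python
def delete_first(collection, query):
    """Delete the first document matching `query`.

    Returns True if a document was removed, False if nothing matched.
    """
    result = collection.delete_one(query)
    return result.deleted_count == 1
```

For the sample data of this chapter, delete_first(coll, {'Class': '2'}) would remove at most one document even when several share that class.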
Consider the sample database:
Examples:

Python

# Python program to demonstrate

# delete_one

import pymongo

# creating Mongoclient object to

# create database with the specified

# connection URL

students = pymongo.MongoClient('localhost', 27017)

# connecting to a database with

# name GFG

Db = students["GFG"]

# connecting to a collection with

# name Geeks

coll = Db["Geeks"]

# creating query object


myQuery ={'Class':'2'}

coll.delete_one(myQuery)

# print collection after deletion:

for x in coll.find():
    print(x)

Output :
{'_id': 2.0, 'Name': 'Golu', 'Class': '3'}

{'_id': 3.0, 'Name': 'Raja', 'Class': '4'}

{'_id': 4.0, 'Name': 'Moni', 'Class': '5'}

MongoDB Shell:
CHAPTER 11: Python Mongodb – Delete_many()
MongoDB is a general-purpose, document-based, distributed database built for modern
application developers and the cloud. It is a document database, which means it stores data
in JSON-like documents. This is an efficient way to think about data and is more expressive
and powerful than the traditional table model.

Delete_many()
delete_many() is used when one needs to delete more than one document. A query object
that defines which documents are to be deleted is created and passed as the first
parameter to delete_many().

Syntax:
collection.delete_many(filter, collation=None, hint=None, session=None)

Parameters:
‘filter’ : A query that matches the documents to delete.
‘collation’ (optional) : An instance of class ‘~pymongo.collation.Collation’. This
option is only supported on MongoDB 3.4 and above.
‘hint’ (optional) : An index to use to support the query predicate. This parameter was
added in PyMongo 3.11 and is only supported on MongoDB 4.4 and above.
‘session’ (optional) : An instance of ‘~pymongo.client_session.ClientSession’.
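The filter passed to delete_many() is an ordinary dictionary, so it can be built and inspected before anything is deleted. The sketch below builds a "starts with" filter using the $regex operator; the helper name starts_with is hypothetical, and escaping the prefix with Python's re.escape is an assumption that suits simple prefixes.

```python
import re


def starts_with(field, prefix):
    """Build a filter matching documents whose `field` starts with `prefix`."""
    # re.escape keeps characters such as "." or "+" in the prefix literal.
    return {field: {"$regex": "^" + re.escape(prefix)}}


query = starts_with("Name", "A")
print(query)  # {'Name': {'$regex': '^A'}}

# With a live connection and a collection object `col`:
# deleted = col.delete_many(query).deleted_count
```

Printing the filter first is a cheap safeguard, because delete_many() removes every document that matches it.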

Sample Database:

Example 1: Deleting all the documents where the name starts with ‘A’.

Python3
import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017/")

# Connecting to the database

mydb = client["GFG"]

# Connecting the to collection

col = mydb["Geeks"]

query = {"Name": {"$regex": "^A"}}

d = col.delete_many(query)

print(d.deleted_count, " documents deleted !!")

Output:
2 documents deleted !!

MongoDB Shell:

Example 2:

Python3

import pymongo
client = pymongo.MongoClient("mongodb://localhost:27017/")

# Connecting to the database

mydb = client["GFG"]

# Connecting the to collection

col = mydb["Geeks"]

query = {"Class": '3'}

d = col.delete_many(query)

print(d.deleted_count, " documents deleted !!")

Output:
1 documents deleted !!

MongoDB Shell:
CHAPTER 12: Python MongoDB – Find
MongoDB is a cross-platform document-oriented database program and the most popular
NoSQL database program. The term NoSQL means non-relational. MongoDB stores the
data in the form of key-value pairs. It is an Open Source, Document Database which
provides high performance and scalability along with data modeling and data management of
huge sets of data in an enterprise application. MongoDB also provides the feature of Auto-
Scaling. It uses JSON-like documents, which makes the database very flexible and scalable.

Finding data from the collection or the database

In MongoDB, there are 2 functions that are used to find the data from the collection or the
database.

find_one()
find()

Find_one() Method
In MongoDB, to select a single document from the collection we use the find_one() method.
It accepts an optional filter parameter that specifies the query to be performed and
returns the first matching document, or None if nothing matches.
Example 1: Find the first document in the students collection. Let’s suppose
the database looks as follows:

Python3

# Python program to demonstrate

# find_one()
import pymongo

mystudent = pymongo.MongoClient('localhost', 27017)

# Name of the database

mydb = mystudent["gfg"]

# Name of the collection

mycol = mydb["names"]

x = mycol.find_one()

print(x)

Output :

Find()
The find() method is used to select data from the collection. It returns a cursor over all
the matching documents stored in the collection. It takes two main parameters. The first
parameter of the find() method is a query object. In the example below we use an empty
query object, which selects all documents in the collection. Note: It works the same
as SELECT * without a WHERE clause.

Example:

Python3

import pymongo
# establishing connection

# to the database

my_client = pymongo.MongoClient('localhost', 27017)

# Name of the database

mydb = my_client["gfg"]

# Name of the collection

mynew = mydb["names"]

# Printing every document in the collection
for x in mynew.find():
    print(x)

Output :

The second parameter of the find() method specifies which fields to include in the
result. It is an object (dictionary) describing the fields, and it is optional: if it is
omitted, all the fields of each document are returned. To include a field in the result,
set its value to 1; if the value is 0, the field is excluded from the result. Apart from
“_id”, inclusions and exclusions generally cannot be mixed in the same projection.

Example: Return only the names and address, not the id:
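A possible version of this example is sketched below. The field names name and address follow the sentence above, and the helper names build_projection and find_names_and_addresses are assumptions made for illustration. The projection is a plain dictionary, so it can be built and printed without a server.

```python
def build_projection(fields, include_id=False):
    """Return a projection that includes `fields` and optionally _id."""
    projection = {name: 1 for name in fields}
    if not include_id:
        # _id is returned by default, so it has to be switched off explicitly.
        projection["_id"] = 0
    return projection


def find_names_and_addresses(collection):
    """Return every document with only its name and address fields."""
    return list(collection.find({}, build_projection(["name", "address"])))


print(build_projection(["name", "address"]))
# {'name': 1, 'address': 1, '_id': 0}
```

With a server running, find_names_and_addresses(mydb["names"]) would return the documents without their _id values.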
CHAPTER 13: Python MongoDB – find_one Query
This article focuses on the find_one() method of the PyMongo library. find_one() is used to
retrieve a single document from MongoDB.
Prerequisites: MongoDB Python Basics
Let’s begin with the find_one() method:
Importing PyMongo Module: Import the PyMongo module using the command:
from pymongo import MongoClient

If MongoDB is not already installed on your machine, refer to the guide on
how to Install MongoDB with Python.
Creating a Connection: Now that we have imported the module, it’s time to
establish a connection to the MongoDB server, presumably running on
localhost (host name) at port 27017 (port number).

client = MongoClient('localhost', 27017)

Accessing the Database: Once the connection to the MongoDB server is established, we
can create a new database or use an existing one.
mydatabase = client.name_of_the_database

Accessing the Collection: We now select the collection from the database using the
following syntax:
collection_name = mydatabase.name_of_collection

Finding in the collection: Now we search the collection using the find_one() function.
This function returns a single document if matching data is found in the collection;
otherwise it returns None. It is ideal for situations where we need only one document.
Syntax:
find_one(filter=None, *args, **kwargs)
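Because find_one() returns None when nothing matches, the result should be checked before its fields are read. A minimal sketch, assuming collection is a PyMongo collection shaped like the Student collection used in this book; the helper name branch_of is hypothetical:

```python
def branch_of(collection, student_name, default="unknown"):
    """Return the Branch of the named student, or `default` if absent."""
    document = collection.find_one({"name": student_name})
    if document is None:
        # find_one() returns None rather than raising when nothing matches.
        return default
    return document.get("Branch", default)
```

Indexing the result directly, as in find_one(...)["Branch"], would raise a TypeError for a missing student, which the None check avoids.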
Example 1: Sample Database:

Python3

# Python program to demonstrate

# find_one() method

# Importing Library

from pymongo import MongoClient

# Connecting to MongoDB server

# client = MongoClient('host_name','port_number')

client = MongoClient('localhost', 27017)

# Connecting to the database named

# GFG

mydatabase = client.GFG

# Accessing the collection named

# gfg_collection

mycollection = mydatabase.Student

# Searching through the database


# using find_one method.

result = mycollection.find_one({'Branch': 'CSE'})

print(result)

Output:
{'_id': 1, 'name': 'Vishwash', 'Roll No': '1001', 'Branch': 'CSE'}

Example 2:

Python3

# Python program to demonstrate

# find_one() method

# Importing Library

from pymongo import MongoClient

# Connecting to MongoDB server

# client = MongoClient('host_name','port_number')

client = MongoClient('localhost', 27017)

# Connecting to the database named

# GFG

mydatabase = client.GFG

# Accessing the collection named

# gfg_collection

mycollection = mydatabase.Student
# Searching through the database

# using find_one method.

result = mycollection.find_one({'Branch': 'CSE'},

{'_id': 0, 'name': 1, 'Roll No': 1})

print(result)

Output:
{'name': 'Vishwash', 'Roll No': '1001'}
CHAPTER 14: Python MongoDB – find_one_and_update Query
The function find_one_and_update() finds a single MongoDB document and updates it.
By default this function returns the document in its original form; to get the
updated document back, pass return_document=ReturnDocument.AFTER.

Syntax:
coll.find_one_and_update(filter, update, options)

Parameters:
coll- collection in MongoDB
filter- criteria to find the document which needs to be updated
update- the update operations to apply to the document
options- keyword options such as projection, upsert or return_document
projection- a mapping that states which fields are included and excluded;
use 1/True to include a field and 0/False to exclude it
upsert- if True, a new document is inserted when no document matches the
mentioned criteria
return_document- If ReturnDocument.BEFORE (the default), returns the original
document before it was updated, or None if no document matches. If
ReturnDocument.AFTER, returns the updated or inserted document.

Example 1: Sample Database:

Python3
from pymongo import MongoClient

from pymongo import ReturnDocument

# Create a pymongo client

client = MongoClient('localhost', 27017)

# Get the database instance

db = client['GFG']

# Create a collection

doc = db['Student']

print(doc.find_one_and_update({'name':"Raju"},

{ '$set': { "Branch" : 'ECE'} },

return_document = ReturnDocument.AFTER))

Output:
{'_id': 5, 'name': 'Raju', 'Roll No': '1005', 'Branch': 'ECE'}

Example 2:

Python3

from pymongo import MongoClient

from pymongo import ReturnDocument

# Create a pymongo client

client = MongoClient('localhost', 27017)


# Get the database instance

db = client['GFG']

# Create a collection

doc = db['Student']

# Setting the Branch of Raju to CSE and returning

# only the name and Branch fields (plus _id)

print(doc.find_one_and_update({'name': "Raju"},
                              {'$set': {"Branch": 'CSE'}},
                              projection={"name": 1, "Branch": 1},
                              return_document=ReturnDocument.AFTER))

Output:
{'_id': 5, 'name': 'Raju', 'Branch': 'CSE'}
CHAPTER 15: Python MongoDB – find_one_and_delete query
MongoDB is a cross-platform, document-oriented, non-relational (NoSQL) database program. It is an open-source document database that stores data in the form of key-value pairs.

find_one_and_delete()
This method finds the first document that matches the filter we pass, deletes it from the collection, and returns the deleted document.

Syntax:
Collection.find_one_and_delete(filter, projection=None, sort=None, session=None, **kwargs)

Parameters:
filter: A query that matches the document to delete.
projection (optional): A list of field names that should be returned in the result document, or a mapping specifying the fields to include or exclude. If projection is a list, "_id" will always be returned. Use a mapping to exclude fields from the result (e.g. projection={'_id': False}).
sort (optional): A list of (key, direction) pairs specifying the sort order for the query. If multiple documents match the query, they are sorted and the first is deleted.
session (optional): A pymongo.client_session.ClientSession.
**kwargs (optional): Additional command arguments can be passed as keyword arguments (for example maxTimeMS can be used with recent server versions).

[Figure: the sample database on which we operate.]

Example 1:

Python3

# importing Mongoclient from pymongo

from pymongo import MongoClient

# Making Connection

myclient = MongoClient("mongodb://localhost:27017/")

# database

db = myclient["mydatabase"]

# Created or Switched to collection

# names: GeeksForGeeks

Collection = db["GeeksForGeeks"]

# Defining the filter that we want to use.

Filter ={'Manufacturer': 'Apple'}

# Using find_one_and_delete() function.

print("The returned document is:")

print(Collection.find_one_and_delete(Filter))

# Printing the data in the collection

# after find_one_and_delete() operation.


print("\nThe data after find_one_and_delete() operation is:")

for data in Collection.find():
    print(data)

Output :

Example 2:

In this example we delete the Redmi data from the database using the find_one_and_delete()
method:

Python3

# importing Mongoclient from pymongo

from pymongo import MongoClient

# Making Connection

myclient = MongoClient("mongodb://localhost:27017/")

# database

db = myclient["mydatabase"]

# Created or Switched to collection

# names: GeeksForGeeks

Collection = db["GeeksForGeeks"]
# Defining the filter that we want to use.

Filter ={'Manufacturer': 'Redmi'}

# Using find_one_and_delete() function.

print("The returned document is:")

print(Collection.find_one_and_delete(Filter))

# Printing the data in the collection

# after find_one_and_delete() operation.

print("\nThe data after find_one_and_delete() operation is:")

for data in Collection.find():
    print(data)

Output :
CHAPTER 16: Python MongoDB – find_one_and_replace Query
The find_one_and_replace() method searches for a single document that matches the filter and, if it finds one, replaces it with the document given as the second parameter. It differs from find_one_and_update() in that it replaces the whole matched document (apart from _id) rather than modifying fields of the existing document.
Syntax:
find_one_and_replace(filter, replacement, projection=None, sort=None,
return_document=ReturnDocument.BEFORE, session=None, **kwargs)
Parameters:
filter: A query that matches the document to replace.
replacement: The replacement document.
projection (optional): A list of field names that should be returned in the result.
sort (optional): A list of (key, direction) pairs giving the sort order of the query.
return_document: ReturnDocument.BEFORE (the default) returns the original document as it was before the replacement. ReturnDocument.AFTER returns the replaced or inserted document.
**kwargs: Additional command arguments.
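To make the contrast between updating and replacing concrete, here is a small in-memory sketch. It is not PyMongo code: apply_set and apply_replace are hypothetical helpers, and the course document (including its price) is made up for the illustration.

```python
def apply_set(doc, fields):
    """Model of {'$set': fields}: fields not mentioned are kept."""
    updated = dict(doc)
    updated.update(fields)
    return updated

def apply_replace(doc, replacement):
    """Model of a replacement: only _id survives from the old document."""
    replaced = {"_id": doc["_id"]}
    replaced.update(replacement)
    return replaced

course = {"_id": 1, "coursename": "SYSTEM DESIGN", "price": 9999}
print(apply_set(course, {"coursename": "PHP"}))
print(apply_replace(course, {"coursename": "PHP"}))
```

The first result keeps the price field, while the second contains only _id and coursename. This is why a projection that asks for price after a replacement may find that the field is gone.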

Sample database used in all the below examples:

Example 1:

Python3

import pymongo
# establishing connection

# to the database

client = pymongo.MongoClient("mongodb://localhost:27017/")

# Database name

db = client["mydatabase"]

# Collection name

col = db["gfg"]

# replace with the help of

# find_one_and_replace()

col.find_one_and_replace({'coursename': 'SYSTEM DESIGN'},

{'coursename': 'PHP'})

# print the document after replacement

for x in col.find({}, {"_id": 0, "coursename": 1, "price": 1}):
    print(x)

Output:

Example 2:

Python3

import pymongo
# establishing connection

# to the database

client = pymongo.MongoClient("mongodb://localhost:27017/")

# Database name

db = client["mydatabase"]

# Collection name

col = db["gfg"]

# replace with the help of

# find_one_and_replace()

col.find_one_and_replace({'price': 9999}, {'price': 19999})

# print the document after replacement

for x in col.find({}, {"_id": 0, "coursename": 1, "price": 1}):
    print(x)

Output:
CHAPTER 17: Python MongoDB – Sort
MongoDB is a cross-platform, document-oriented database program and one of the most popular NoSQL databases. The term NoSQL means non-relational. MongoDB stores data in the form of key-value pairs, using JSON-like documents, which makes the database very flexible and scalable. It is an open-source document database that provides high performance and scalability, along with data modeling and data management for huge sets of data in an enterprise application. MongoDB also provides auto-scaling.
Note: For more information, refer to the chapter MongoDB and Python.

Sorting the MongoDB documents


The sort() method orders the documents returned by a query. It accepts two parameters: the first is the field name and the second is the direction to sort in. By default it sorts in ascending order.
Syntax:
sort(key_or_list, direction)

key_or_list: a single key or a list of (key, direction) pairs specifying the keys to sort on
direction (optional): only used if key_or_list is a single key; if not given, ASCENDING is assumed
Note: a direction of 1 sorts in ascending order and a direction of -1 sorts in descending order.
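The list form of key_or_list is easiest to understand by seeing what ordering it asks for. The following is a minimal pure-Python sketch of that ordering, assuming plain dictionaries in place of documents. sort_like_cursor and the people list are invented for the illustration, and the constants simply mirror the values 1 and -1 described above.

```python
ASCENDING, DESCENDING = 1, -1   # the same numeric values described above

def sort_like_cursor(docs, key_or_list, direction=ASCENDING):
    """Order dicts the way a single key or a (key, direction) list asks for."""
    spec = [(key_or_list, direction)] if isinstance(key_or_list, str) else list(key_or_list)
    result = list(docs)
    # Python's sort is stable, so sorting by the last key first
    # leaves the earlier keys as the primary ordering.
    for key, order in reversed(spec):
        result.sort(key=lambda d: d[key], reverse=(order == DESCENDING))
    return result

people = [{"name": "Ravi", "age": 30}, {"name": "Amit", "age": 25}, {"name": "Amit", "age": 40}]
by_name = sort_like_cursor(people, "name")
by_name_then_age_desc = sort_like_cursor(people, [("name", ASCENDING), ("age", DESCENDING)])
print(by_name_then_age_desc)
```

With a real collection, the same request would be written as find().sort([("name", 1), ("age", -1)]), and the server would do the ordering.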
Example 1: Using sort() function to sort the result alphabetically by name. Let’s suppose the
database looks like this:

Python3

# python code to sort elements

# alphabetically in ascending order

import pymongo
# establishing connection

# to the database

my_client = pymongo.MongoClient('localhost', 27017)

# Name of the database

mydb = my_client["gfg"]

# Name of the collection

mynew = mydb["names"]

# sorting function

mydoc = mynew.find().sort("name")

for x in mydoc:
    print(x)

Output :

Example 2: Sorting in descending order


Python3

import pymongo

# establishing connection

# to the database

my_client = pymongo.MongoClient('localhost', 27017)

# Name of the database

mydb = my_client["gfg"]

# Name of the collection

mynew = mydb["names"]

# sorting function with -1

# as direction

mydoc = mynew.find().sort("name", -1)

for x in mydoc:
    print(x)
Output :
CHAPTER 18: Python MongoDB – distinct()
MongoDB is a cross-platform, document-oriented database that works on the concept of
collections and documents. It stores data in the form of key-value pairs and is a NoSQL
database program. The term NoSQL means non-relational. Refer to MongoDB and Python for
an in-depth introduction to the topic. Now let’s understand the use of distinct() function in
PyMongo.

distinct()
PyMongo's distinct() method finds the distinct values of a specified field across a single collection and returns them as a list.
Syntax : distinct(key, filter = None, session = None, **kwargs)

Parameters :

key : field name for which the distinct values need to be found.
filter : (Optional) A query document that specifies the documents from which to
retrieve the distinct values.
session : (Optional) a ClientSession.

Let’s create a sample collection :

# importing the module

from pymongo import MongoClient

# creating a MongoClient object, connecting

# with the port number and host

client = MongoClient("mongodb://localhost:27017/")

# accessing the database

mydatabase = client['database']

# access collection of the database

mycollection = mydatabase['myTable']

documents = [{"_id": 1, "dept": "A",

"item": {"code": "012", "color": "red"},

"sizes": ["S", "L"]},

{"_id": 2, "dept": "A",

"item": {"code": "012", "color": "blue"},

"sizes": ["M", "S"]},

{"_id": 3, "dept": "B",

"item": {"code": "101", "color": "blue"},

"sizes": "L"},

{"_id": 4, "dept": "A",

"item": {"code": "679", "color": "black"},

"sizes": ["M"]}]

mycollection.insert_many(documents)

for doc in mycollection.find({}):
    print(doc)

Output :
{'_id': 1, 'dept': 'A', 'item': {'code': '012', 'color': 'red'}, 'sizes': ['S', 'L']}

{'_id': 2, 'dept': 'A', 'item': {'code': '012', 'color': 'blue'}, 'sizes': ['M', 'S']}
{'_id': 3, 'dept': 'B', 'item': {'code': '101', 'color': 'blue'}, 'sizes': 'L'}

{'_id': 4, 'dept': 'A', 'item': {'code': '679', 'color': 'black'}, 'sizes': ['M']}
Now we will use the distinct() method to:

Return distinct values for a field
Return distinct values for an embedded field
Return distinct values for an array field
Return distinct values for the documents that match a query

# distinct() function returns the distinct values for the

# field dept from all documents in the mycollection collection

print(mycollection.distinct('dept'))

# distinct values for the field color,

# embedded in the field item, from all documents

# in the mycollection collection

print(mycollection.distinct('item.color'))

# returns the distinct values for the field sizes

# from all documents in the mycollection collection

print(mycollection.distinct("sizes"))

# distinct values for the field code,

# embedded in the field item, from the documents

# in mycollection collection whose dept is equal to B.

print(mycollection.distinct("item.code", {"dept" : "B"}))

Output :

['A', 'B']
['red', 'blue', 'black']

['L', 'S', 'M']

['101']
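Two details of the output above are worth spelling out: a dotted key such as item.color reaches into an embedded document, and an array field such as sizes contributes each of its elements separately. The sketch below models just those two rules in plain Python. distinct_values is a hypothetical helper, it does not handle arrays of embedded documents, and the order of its result is not meant to match the server's.

```python
def distinct_values(docs, dotted_key):
    """Collect the distinct values of dotted_key, unpacking array values."""
    seen = []
    for doc in docs:
        value = doc
        for part in dotted_key.split("."):      # walk 'item.color' one level at a time
            value = value.get(part) if isinstance(value, dict) else None
        if value is None:
            continue                            # field missing in this document
        items = value if isinstance(value, list) else [value]   # arrays count per element
        for item in items:
            if item not in seen:
                seen.append(item)
    return seen

docs = [
    {"dept": "A", "item": {"color": "red"}, "sizes": ["S", "L"]},
    {"dept": "A", "item": {"color": "blue"}, "sizes": ["M", "S"]},
    {"dept": "B", "item": {"color": "blue"}, "sizes": "L"},
]
print(distinct_values(docs, "item.color"))
print(distinct_values(docs, "sizes"))
```

Note that the third document stores sizes as a plain string rather than a list, and its value is still merged with the elements taken from the arrays.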
CHAPTER 19: Python MongoDB- rename()
MongoDB is a cross-platform, document-oriented database that works on the concept of
collections and documents. It stores data in the form of key-value pairs and is a NoSQL
database program. The term NoSQL means non-relational. Refer to MongoDB and Python for
an in-depth introduction to the topic. Now let’s understand the use of rename() function in
PyMongo.

rename()
The PyMongo method rename() is used to rename a collection. The rename operation fails if the new name is not a string or if it is an invalid collection name.
Syntax : rename(new_name, session = None, **kwargs)
Parameters :

new_name : The new name of the collection.


session : (Optional) a ClientSession.
**kwargs : (Optional) additional arguments to the rename command may be passed as keyword arguments to this helper method (e.g. dropTarget = True).

Example 1 : In this example we create a collection and rename it. The rename() call renames the collection from myTable to collec. dropTarget is set to True, which means that if a collection named collec already exists, it is dropped and replaced by the renamed collection.

# importing the module

from pymongo import MongoClient

# creating a MongoClient object

client = MongoClient()

# connecting with the portnumber and host

client = MongoClient("mongodb://localhost:27017/")
# accessing the database

database = client['database']

# access collection of the database

collection = database['myTable']

docs = [{"id":1, "name":"Drew"},

{"id":3, "name":"Cody"}]

collection.insert_many(docs)

# renaming the collection

collection.rename('collec', dropTarget = True)

# collection_names() was removed in PyMongo 4.0

result = database.list_collection_names()

for collect in result:
    print(collect)

Output-
collec

Example 2 : In this example, dropTarget is set to False, so the new collection name must not already be in use. Since a collection named collec already exists in the database, the call raises an error.

# importing the module

from pymongo import MongoClient

# creating a MongoClient object

client = MongoClient()

# connecting with the portnumber and host


client = MongoClient("mongodb://localhost:27017/")

# accessing the database

database = client['database']

# access collection of the database

mycollection = database['myTable']

docs = [{"id":1, "name":"Drew"},

{"id":3, "name":"Cody"}]

mycollection.insert_many(docs)

# renaming the collection

mycollection.rename('collec', dropTarget = False)

# collection_names() was removed in PyMongo 4.0

result = database.list_collection_names()

for collect in result:
    print(collect)

Output :
pymongo.errors.OperationFailure: target namespace exists
CHAPTER 20: Python MongoDB – bulk_write()
MongoDB is an open-source document-oriented database. MongoDB stores data in the form of
key-value pairs and is a NoSQL database program. The term NoSQL means non-relational.
Refer to MongoDB: An Introduction for a much more detailed introduction on MongoDB.
Now let’s understand the Bulk Write operation in MongoDB using Python.

bulk_write()
The PyMongo function bulk_write() sends a batch of write operations to the server. Executing
write operations in batches increases the write throughput.
Syntax : bulk_write(requests, ordered = True, bypass_document_validation = False, session =
None)
Parameters :

requests : Requests are passed as a list of write operation instances.


ordered : (Optional) A boolean specifying whether the operation executions are
performed in an ordered or unordered manner. By default, it is set to True.
bypass_document_validation : (Optional) A value indicating whether to bypass
document validation.
session : (Optional) a ClientSession.
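bulk_write() returns a BulkWriteResult object whose counters show how many documents each kind of request affected. A small helper along the following lines can turn it into a dictionary for printing or logging. summarize_bulk_result is a hypothetical name. The attributes it reads are the counters PyMongo exposes on BulkWriteResult.

```python
def summarize_bulk_result(result):
    """Pull the per-operation counters off a bulk_write() result."""
    return {
        "inserted": result.inserted_count,   # documents added by InsertOne
        "matched": result.matched_count,     # documents matched by updates/replaces
        "modified": result.modified_count,   # documents actually changed
        "deleted": result.deleted_count,     # documents removed by DeleteOne/DeleteMany
        "upserted": result.upserted_count,   # documents created through upsert
    }
```

Assuming a collection and a list of requests like those in the examples of this chapter, it would be called as summarize_bulk_result(mycollection.bulk_write(requests)).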

Example 1 : Executing multiple requests using bulk_write().

# importing the module

from pymongo import MongoClient, InsertOne, DeleteOne, ReplaceOne

# creating a MongoClient object, connecting

# with the port number and host

client = MongoClient("mongodb://localhost:27017/")

# accessing the database

mydatabase = client['database']

# access collection of the database

mycollection = mydatabase['myTable']

# defining the requests

requests = [InsertOne({"Student name": "Cody"}),

InsertOne({ "Student name": "Drew"}),

DeleteOne({"Student name": "Cody"}),

ReplaceOne({"Student name": "Drew"},

{ "Student name": "Andrew"}, upsert = True)]

# executing the requests

result = mycollection.bulk_write(requests)

for doc in mycollection.find({}):
    print(doc)

Here, the first two documents are inserted with InsertOne. DeleteOne then deletes the document whose Student name is Cody, and ReplaceOne replaces the document whose Student name is Drew with one whose Student name is Andrew. The requests are executed in order, and we get the following output:

Output :
{'_id': ObjectId('5f060aa5a9666ecd86f5b6bd'), 'Student name': 'Andrew'}
Example 2 :

# importing the modules

from pymongo import MongoClient, InsertOne, DeleteOne, ReplaceOne, UpdateOne

# creating a MongoClient object, connecting

# with the port number and host

client = MongoClient("mongodb://localhost:27017/")

# accessing the database

mydatabase = client['database']

# access collection of the database

mycollection = mydatabase['myTable']

# defining the requests

requests = [InsertOne({ "x": 5}),

InsertOne({ "y": 2}),

UpdateOne({'x': 5}, {'$inc': {'x': 3}}),

DeleteOne({ "y": 2})]

# executing the requests

result = mycollection.bulk_write(requests)

for doc in mycollection.find({}):
    print(doc)
Here two documents are inserted first using the InsertOne command. Then the document with
value of x equal to 5 is updated and its value is incremented by 3. At last, the document
containing y is deleted with the command DeleteOne.
Output :
{'_id': ObjectId('5f060cd7358fae75aad1ae94'), 'x': 8}
CHAPTER 21: Python MongoDB – $group (aggregation)
MongoDB is an open-source document-oriented database. MongoDB stores data in the form of key-value pairs and is a NoSQL database program. The term NoSQL means non-relational. In this chapter, we will see the use of $group in MongoDB using Python.
$group operation
In PyMongo, the aggregate() method processes data records from multiple documents and returns the computed result. It is based on a data processing pipeline made up of multiple stages, and the aggregated result comes out of the last stage. One of the stages available to aggregate() is $group. It groups the input documents of the collection by an identifier expression that the user specifies, applies the accumulator expressions to each group, and then produces one output document per group.
$group includes the following:
1. _id: The documents are grouped according to the given _id expression.
2. field (optional): An accumulator expression that is applied to the documents in each group.
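The two parts of $group can be modelled in a few lines of plain Python, which may help before reading the real pipelines. This is only a sketch of the semantics under simplifying assumptions: group_sum is a hypothetical helper, the output field is always called total, and only the $sum accumulator is covered.

```python
from collections import defaultdict

def group_sum(docs, group_field, sum_field=None):
    """Model of {'$group': {'_id': '$<group_field>', 'total': {'$sum': ...}}}."""
    totals = defaultdict(int)
    for doc in docs:
        # {'$sum': 1} counts documents; {'$sum': '$<sum_field>'} adds up that field
        totals[doc[group_field]] += 1 if sum_field is None else doc[sum_field]
    return [{"_id": key, "total": total} for key, total in totals.items()]

profiles = [
    {"user": "Amit", "comments": 5},
    {"user": "Drew", "comments": 15},
    {"user": "Amit", "comments": 6},
]
print(group_sum(profiles, "user"))              # one output document per user
print(group_sum(profiles, "user", "comments"))  # total comments per user
```

The first call corresponds to an accumulator of {"$sum": 1} and the second to {"$sum": "$comments"}. On the server, the order of the output documents is not guaranteed unless a $sort stage follows.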

Let’s understand this through some examples.


Example 1:

Python3

from pymongo import MongoClient

# creation of MongoClient

client=MongoClient()

# Connect with the portnumber and host

client = MongoClient("mongodb://localhost:27017/")
# Access database

mydatabase = client['database']

# Access collection of the database

mycollection=mydatabase['myTable']

writer_profiles = [

{"_id":1, "user":"Amit", "title":"Python", "comments":5},

{"_id":2, "user":"Drew", "title":"JavaScript", "comments":15},

{"_id":3, "user":"Amit", "title":"C++", "comments":6},

{"_id":4, "user":"Drew", "title":"MongoDB", "comments":2},

{"_id":5, "user":"Cody", "title":"Perl", "comments":9}]

mycollection.insert_many(writer_profiles)

agg_result= mycollection.aggregate(

[{

"$group" :

{"_id" : "$user",

"num_tutorial" : {"$sum" : 1}

}}

])

for i in agg_result:
    print(i)

Output:
{'_id': 'Cody', 'num_tutorial': 1}

{'_id': 'Drew', 'num_tutorial': 2}

{'_id': 'Amit', 'num_tutorial': 2}


In the above example, the documents are grouped on the basis of expression $user, and then
the field num_tutorial includes the accumulator operator $sum that calculates the number
of tutorials of each user.
Example 2:

Python3

from pymongo import MongoClient

# creation of MongoClient

client=MongoClient()

# Connect with the portnumber and host

client = MongoClient("mongodb://localhost:27017/")

# Access database

mydatabase = client['database4']

# Access collection of the database

mycollection=mydatabase['myTable']

writer_profiles = [

{"_id":1, "user":"Amit", "title":"Python", "comments":8},

{"_id":2, "user":"Drew", "title":"JavaScript", "comments":15},

{"_id":3, "user":"Amit", "title":"C++", "comments":6},

{"_id":4, "user":"Drew", "title":"MongoDB", "comments":2},

{"_id":5, "user":"Cody", "title":"MongoDB", "comments":16}]

mycollection.insert_many(writer_profiles)

agg_result= mycollection.aggregate(
[{

"$group" :

{"_id" : "$title",

"total" : {"$sum" : 1}

}}

])

for i in agg_result:
    print(i)

Output:
{'_id': 'MongoDB', 'total': 2}

{'_id': 'C++', 'total': 1}

{'_id': 'JavaScript', 'total': 1}

{'_id': 'Python', 'total': 1}

In this example, the documents are grouped by the expression $title, and the
field total includes the accumulator operator $sum that calculates the number of articles of
each title.
CHAPTER 22: Nested Queries in PyMongo
MongoDB is a NoSQL, document-oriented database. It is schema-free and places little emphasis on relations between records.
PyMongo is a Python module used to connect Python applications to a MongoDB database. The data exchanged between the Python application and the database is in Binary JSON (BSON) format.

Nested Queries in PyMongo

To fetch a particular record from the MongoDB document, querying plays an important role.
Getting the right data as soon as possible to make the right decision is necessary. Here are
some of the multiple query request techniques.

Query operators in PyMongo


To use the $and, $or and $nor operators, the operator must be the key of the outer dictionary, and its value must be a Python list containing the condition dictionaries to combine. ($not works differently: it is applied to a single field and takes one operator expression rather than a list.)
Syntax :
query = {
'$and' : [
{ operand_query_1},
{ operand_query_2}
]
}
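Because these queries are ordinary Python dictionaries, they can be assembled with small helper functions and nested as deeply as needed. The sketch below assumes nothing beyond the operator names above. all_of and any_of are hypothetical helper names, and the field names are taken from the lecturer examples in this chapter.

```python
def all_of(*conditions):
    """Combine condition dicts so that every one of them must match."""
    return {"$and": list(conditions)}

def any_of(*conditions):
    """Combine condition dicts so that at least one of them must match."""
    return {"$or": list(conditions)}

# Department 1 AND (salary below 40000 OR salary above 60000)
query = all_of(
    {"d_id": 1},
    any_of({"salary": {"$lt": 40000}}, {"salary": {"$gt": 60000}}),
)
print(query)
```

The resulting dictionary can be passed directly to find(), for example lecturers.find(query).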

Example 1 : Create a collection called lecturers and retrieve using find().

import pprint

from pymongo import MongoClient


client = MongoClient()

Database = client["GFG"]

lecturers = Database["lecture"]

lecturers.insert_many([

{"l_id":56, "d_id":1, "salary":50000},

{"l_id":57, "d_id":3, "salary":40000},

{"l_id":58, "d_id":4, "salary":90000},

{"l_id":59, "d_id":2, "salary":50000},

{"l_id":52, "d_id":1, "salary":70000},

{"l_id":53, "d_id":5, "salary":30000}

])

# retrieving the documents

for x in lecturers.find():
    pprint.pprint(x)

Output :
Query 1 : Display the lecturer records with salary less than 50000 and arrange in ascending
order.

# lecturer records with salary less

# than 50000 and arrange in ascending order.

pprint.pprint(list(lecturers.find({"salary":

{'$lt':50000}}).sort('salary', 1)))

Output :

Query 2 : Display lecturer records with salary greater than 40000 in department_id 1 and sort
according to their salary in descending order.

# lecturer records with salary greater than 40000

# in department_id 1 and sort according to their

# salary in descending order.

pprint.pprint(list(lecturers.find({'$and':
                                   [{"d_id":1},
                                    {"salary":
                                     {'$gt':40000}}]}).sort("salary", -1)))

Output :

Example 2 : Create a collection called books and retrieve using find().


import pprint

from pymongo import MongoClient

import datetime

client = MongoClient()

Database = client["GFG"]

books = Database["book"]

books.insert_many([

{"author":"Samsuel", "book_id":54, "ratings":3,

"publish":datetime.datetime(1999, 12, 6)},

{"author":"Thomson", "book_id":84, "ratings":4,

"publish":datetime.datetime(1996, 7, 12)},

{"author":"Piyush Agarwal", "book_id":34, "ratings":1,

"publish":datetime.datetime(2000, 9, 6)},

{"author":"Shreya Mathur", "book_id":12, "ratings":2,

"publish":datetime.datetime(2017, 8, 8)},

{"author":"Antony Sharma", "book_id":98, "ratings":4,

"publish":datetime.datetime(2003, 11, 5)},

])

# retrieving the documents

for x in books.find():
    pprint.pprint(x)

Output :

Query 1 : Display the record of books with ratings greater than 3 published after 2000.

# books with ratings greater than 3 published after 2000

pprint.pprint(list(books.find({'$and':
                               [{"ratings":
                                 {'$gt':3}},
                                {"publish":
                                 {'$gt':datetime.datetime(2000, 12, 31)}}]})))

Output :
Query 2 : Display the record of the books with ratings greater than 1 and published between
the year 1999 and 2016, sort in decreasing order.

# between 1999-2016

query ={'$and':

[{"publish":

{'$gte':datetime.datetime(1999, 1, 1)}},

{"publish":

{'$lte':datetime.datetime(2016, 12, 31)}}]}

# books with ratings greater than 1

# and published between the year

# 1999-2016, sort in decreasing order.

pprint.pprint(list(books.find({'$and':

[{"ratings":

{'$gt':1}},

query]}).sort("publish", -1)))

Output :
PART 4: Working with Collections and documents in MongoDB
CHAPTER 1: Access a collection in MongoDB using Python
MongoDB is a cross-platform, document-oriented database that works on the concept of collections and documents. MongoDB offers high speed, high availability, and high scalability.
Accessing a Collection
1) Getting a list of collections: To get a list of the collections in a MongoDB database, use the list_collection_names() method. This method returns a list of collection names.
Syntax:
list_collection_names()

Example:

Sample Database:

Python3

from pymongo import MongoClient

# create a client instance of the

# MongoClient class

mo_c = MongoClient()

# create an instance of 'some_database'

db = mo_c.GFG
# get a list of a MongoDB database's

# collections

collections = db.list_collection_names()

print ("collections:", collections, "\n")

Output:
collections: ['Geeks']

2) Check whether the collection exists: hasattr(db, 'collectionname') is not a reliable test. A PyMongo database object returns a Collection object for any attribute name, so hasattr() returns True even when no such collection exists on the server. To check whether a collection really exists, test whether its name is in the list returned by list_collection_names().
Syntax: 'collectionname' in db.list_collection_names()
Parameters:
db: It is the database object.
collectionname: It is the name of the collection.

Example:

Python3

from pymongo import MongoClient

# create a client instance of

# the MongoClient class

mo_c = MongoClient()

# create an instance of 'some_database'

db = mo_c.GFG

# check whether the collection exists

print('Geeks' in db.list_collection_names())

Output:
True

3) Accessing a Collection: To access a MongoDB collection, use either of the forms below.

Syntax:
database_object.Collectionname

or

database_object["Collectionname"]

Note: database_object["Collectionname"] is useful when the name of the collection contains a space, as in database_object["Collection name"].

Example:

Python3

from pymongo import MongoClient

# create a client instance of

# the MongoClient class

mo_c = MongoClient()

# create an instance of 'some_database'

db = mo_c.GFG

col1 = db["gfg"]
print ("Collection:", col1)

Output:
Collection: Collection(Database(MongoClient(host=['localhost:27017'],
document_class=dict, tz_aware=False, connect=True), 'GFG'), 'gfg')
CHAPTER 2: Get the Names of all Collections using PyMongo
PyMongo is the module used to connect to MongoDB from Python and to perform operations such as insertion, deletion and updating. PyMongo is the recommended way to work with MongoDB from Python.
Note: For detailed information about Python and MongoDB, see the chapter MongoDB and Python.
Let's begin getting the names of all collections using PyMongo.
Importing the PyMongo module: Import the PyMongo module using the command:
from pymongo import MongoClient

If MongoDB is not already installed on your machine, you can refer to the guide: Guide to Install MongoDB with Python
Creating a Connection: Now that we have imported the module, it is time to establish a connection to the MongoDB server, which is presumably running on localhost (host name) at port 27017 (port number).
client = MongoClient('localhost', 27017)

Accessing the Database: Now that the connection to the MongoDB server is established, we can create a database or use an existing one.
mydatabase = client.name_of_the_database

In our case the name of the database is GeeksForGeeks


mydatabase = client.GeeksForGeeks

List the names of all the Collections in the Database: Older code lists the names of all the collections in the database with:
mydatabase.collection_names()

collection_names() was deprecated in PyMongo 3.7.0 and removed in PyMongo 4.0. Use this instead:

mydatabase.list_collection_names()

This method returns a list of the collection names in the database.
Example: Sample Database:
Python3

# Python Program to demonstrate

# List name of all collections using PyMongo

# Importing required libraries

from pymongo import MongoClient

# Connecting to MongoDB server

# client = MongoClient('host_name', port_number)

client = MongoClient('localhost', 27017)

# Connecting to the database named

# GeeksForGeeks

mydatabase = client.GeeksForGeeks

# Getting the names of all the collections

# in GeeksForGeeks Database.

collections = mydatabase.list_collection_names()

# Printing the name of the collections to the console.

print(collections)

Output:
['Geeks']
CHAPTER 3: Drop Collection if already exists in MongoDB using Python
The drop() method drops a collection. In PyMongo, drop() returns None whether or not the collection existed, and dropping a collection that does not exist is not an error. To find out whether the collection was present, check list_collection_names() before dropping it.
Syntax:
drop()

Example 1:
The sample database is as follows:

Python3

import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017/")

# Database name

db = client["mydatabase"]

# Collection name

col = db["gfg"]

# drop collection col; drop() returns None

print(col.drop())

Output:

Example 2: If collection does not exist.

Python3

import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017/")

# Database name

db = client["mydatabase"]

# Collection name

col = db["gfg"]

# drop the collection only if it is present

if "gfg" in db.list_collection_names():
    col.drop()
    print('Deleted')
else:
    print('Not Present')

Output:
Not Present
CHAPTER 4: Update data in a Collection using Python
MongoDB is a cross-platform, document-oriented database that works on the concept of
collections and documents. MongoDB offers high speed, high availability, and high scalability.

Updating Data in MongoDB


We can update data in a collection using update_one() method and update_many() method.

update_one()
The update_one() method updates the first document that matches the query filter, if one is found.
Syntax: update_one(filter, update, upsert=False, bypass_document_validation=False,
collation=None, array_filters=None, hint=None, session=None)

Parameters:
filter : A query that matches the document to update.
update : The modifications to apply.
upsert (optional): If True, perform an insert if no documents match the filter.
bypass_document_validation (optional) : If True, allows the write to opt out of document-level validation. Default is False.
collation (optional) : An instance of pymongo.collation.Collation. This option is only supported on MongoDB 3.4 and above.
array_filters (optional) : A list of filters specifying which array elements an update should apply to. Requires MongoDB 3.6+.
hint (optional): An index to use to support the query predicate specified. This option is only supported on MongoDB 4.2 and above.
session (optional) : A pymongo.client_session.ClientSession.
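Neither update method prints anything by itself. Both return an UpdateResult object, and its matched_count, modified_count and upserted_id attributes tell you what happened. The helper below is a minimal sketch of how those attributes might be reported. describe_update is a hypothetical name, not part of PyMongo.

```python
def describe_update(result):
    """Summarise the UpdateResult returned by update_one() or update_many()."""
    if result.upserted_id is not None:
        # upsert=True and nothing matched: a new document was inserted
        return f"inserted new document {result.upserted_id!r}"
    return f"matched {result.matched_count}, modified {result.modified_count}"
```

With the variables used in the examples of this chapter, it would be called as print(describe_update(col.update_one(query, newvalue))). A result of "matched 1, modified 0" means the document was found but already had the requested values.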

Example:
Sample database is as follows:
Python3

import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017/")

# Database name

db = client["GFG"]

# Collection name

col = db["gfg"]

# Query to be updated

query = {"coursename": "SYSTEM DESIGN"}

# New value

newvalue = {"$set": {"coursename": "Computer network"}}

# Update the value

col.update_one(query, newvalue)

Output:

Method: update_many()
The update_many() method updates all the documents that match the query filter.

Syntax:
update_many(filter, update, upsert=False, bypass_document_validation=False,
collation=None, array_filters=None, session=None)

Parameters:
filter : A query that matches the documents to update.
update : The modifications to apply.
upsert (optional): If True, perform an insert if no documents match the filter.
bypass_document_validation (optional) : If True, allows the write to opt out of document-level validation. Default is False.
collation (optional) : An instance of pymongo.collation.Collation. This option is only supported on MongoDB 3.4 and above.
array_filters (optional) : A list of filters specifying which array elements an update should apply to. Requires MongoDB 3.6+.
session (optional) : A pymongo.client_session.ClientSession.

Example:
Python3

import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017/")

# Database name

db = client["GFG"]

# Collection name

col = db["gfg"]

# Query to be updated

query = {"coursename": "SYSTEM DESIGN"}


# New value

newvalue = {"$set": {"coursename": "Computer network"}}

# Update the value

col.update_many(query, newvalue)

Output:
CHAPTER 5: Get all the Documents of the Collection using PyMongo
To get all the documents of a collection, use the find() method. find() takes a query object as its first parameter; to find all documents, pass an empty query ({}) or no argument at all. The optional second parameter is a projection: a field given the value 1 is included in the result, and a field given the value 0 is excluded. Note: With no parameters, find() works like SELECT * in MySQL.
Let’s suppose the database looks like this

Example 1:

Python3

import pymongo

# establishing connection

# to the database

client = pymongo.MongoClient("mongodb://localhost:27017/")

# Database name

db = client["mydatabase"]
# Collection name

col = db["gfg"]

# if we don't want to print id then pass _id:0

for x in col.find({}, {"_id": 0, "coursename": 1, "price": 1}):

    print(x)

Output:

Example 2:

Python3

import pymongo

# establishing connection

# to the database

client = pymongo.MongoClient("mongodb://localhost:27017/")

# Database name

db = client["mydatabase"]

# Collection name

col = db["gfg"]

# _id is returned by default; pass "_id": 0 to leave it out

for x in col.find({}, {"coursename": 1}):

    print(x)

Output:
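
To make the projection rules concrete, here is a small pure-Python imitation of what the second argument of find() does to a single document. It only illustrates the behaviour for top-level fields and is not how PyMongo or the server implements it; the function name project, the sample document and its price value are made up.

```python
def project(document, projection):
    """Imitate a find() projection on one document (top-level fields only)."""
    # Fields other than _id decide whether this is an inclusion or an exclusion
    flags = {k: v for k, v in projection.items() if k != "_id"}
    keep_id = projection.get("_id", 1) != 0

    if any(flags.values()):
        # Inclusion: keep only the fields marked 1
        result = {k: v for k, v in document.items() if flags.get(k)}
    else:
        # Exclusion: keep everything except the fields marked 0
        result = {k: v for k, v in document.items()
                  if k not in flags and k != "_id"}

    # _id is returned unless it was switched off explicitly
    if keep_id and "_id" in document:
        result = {"_id": document["_id"], **result}
    return result


doc = {"_id": 1, "coursename": "SYSTEM DESIGN", "price": 1499}
print(project(doc, {"_id": 0, "coursename": 1, "price": 1}))
print(project(doc, {"coursename": 1}))
print(project(doc, {"price": 0}))
```

Apart from _id, MongoDB does not allow a single projection to mix the values 0 and 1, which is why the sketch treats a projection as either an inclusion or an exclusion.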
CHAPTER 6: Count the number of Documents in MongoDB using Python
MongoDB is a document-oriented, non-relational (NoSQL) database. It is schema-free and
stores data in a binary JSON format (BSON). Data is organized as documents (comparable to
rows in an RDBMS) grouped into collections (comparable to tables in an RDBMS). PyMongo is
one of the MongoDB drivers, or client libraries. Using the PyMongo module, we can send
requests to and receive responses from the database.

Count the number of Documents using Python


Method 1: Using count() The total number of documents present in the collection can be
retrieved by using the count() method. This method was deprecated in PyMongo 3.7 and
removed in PyMongo 4.0, so it only works with older versions of the driver.
Syntax :
db.collection.count()

Example : Count the number of documents (my_data) in the collection using count().
Sample Database:

Python3

from pymongo import MongoClient

myclient = MongoClient('localhost', 27017)

my_database = myclient["GFG"]
my_collection = my_database["Student"]

# number of documents in the collection
# (Cursor.count() only exists in PyMongo versions before 4.0)
mydoc = my_collection.find().count()

print("The number of documents in collection :", mydoc)

Output :
The number of documents in collection : 8

Method 2: count_documents() Alternatively, you can use the count_documents() method
in PyMongo to count the number of documents present in the collection. It requires a
filter; pass an empty document ({}) to count every document.
Syntax :
db.collection.count_documents(filter, **kwargs)

Example: Count the documents present in the collection using count_documents().

Python3

from pymongo import MongoClient

myclient = MongoClient('localhost', 27017)

my_database = myclient["GFG"]
my_collection = my_database["Student"]

# number of documents in the collection
total_count = my_collection.count_documents({})

print("Total number of documents :", total_count)

Output:
Total number of documents : 8
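
count_documents() also accepts a non-empty filter, and PyMongo offers estimated_document_count() for a fast approximate total based on collection metadata. The sketch below assumes the documents have a Branch field (a field of that name is used in a later chapter, but its presence here is an assumption); the helper name branch_summary is illustrative.

```python
def branch_summary(collection, branch):
    """Return (documents in the given branch, approximate total)."""
    in_branch = collection.count_documents({"Branch": branch})
    # estimated_document_count() takes no filter; it reads collection metadata
    approx_total = collection.estimated_document_count()
    return in_branch, approx_total


# Usage with my_collection from the example (needs a running server):
# print(branch_summary(my_collection, "CSE"))
```

The estimate is usually much cheaper on large collections, while count_documents() gives an exact count for the filter it is given.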
CHAPTER 7: Update all Documents in a Collection using PyMongo
MongoDB is an open-source, document-oriented NoSQL database that stores data as
documents made up of key-value pairs. The term NoSQL means non-relational.
PyMongo contains tools for interacting with a MongoDB database. Now let’s
see how to update all the documents in a collection.

Updating all Documents in a Collection


PyMongo includes an update_many() method that updates all the documents that
satisfy the given query.
update_many() accepts the following parameters –

1. filter – The first parameter: the criteria that select which documents
are updated.
2. update – The second parameter: a document of update operators describing the
changes to apply to the matching documents.
3. upsert – (Optional) A boolean. If set to True and no document matches
the filter, a new document is created.
4. array_filters – (Optional) A list of filter documents that determine which array
elements to modify for an update operation on an array field.
5. bypass_document_validation – (Optional) A boolean that skips document
validation when set to True.
6. collation – (Optional) Specifies the language-specific rules for the operation.
7. session – (Optional) A ClientSession.
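
As a sketch of the upsert parameter, the helper below sets the marks on every document for a given student name and, if no such document exists, lets MongoDB insert one. The field names name, marks and passed and the helper name record_marks are assumptions for illustration.

```python
def record_marks(collection, name, marks):
    """Set marks for all documents with this name; insert one if none match."""
    result = collection.update_many(
        {"name": name},
        {
            "$set": {"marks": marks},
            # only written when the upsert actually inserts a new document
            "$setOnInsert": {"passed": False},
        },
        upsert=True,
    )
    # upserted_id is None unless a new document was inserted
    return result.matched_count, result.upserted_id
```

With upsert left at its default of False, the same call would simply match nothing and change nothing when the name is unknown.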

Update Operators in MongoDB


Setting Values:
$set: Sets a field’s value.
$setOnInsert: Sets a field’s value only when the operation inserts a new document (an upsert).
$unset: Removes the field and its value.

Numeric Operators:
$inc: Increases the value by a given amount.
$min/$max: Updates the field only if the given value is less than ($min) or greater than ($max) the current value.
$mul: Multiplies the value by a given amount.

Miscellaneous Operators:
$currentDate: Sets the value of a field to the current date.
$rename: Renames a field.
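
Several operators can be combined in one update document. The following sketch builds such a document; the field names marks, name, full_name and last_modified are placeholders and are not taken from the sample database.

```python
def build_update(bonus):
    """Build an update document that uses several operators at once."""
    return {
        "$inc": {"marks": bonus},                 # add bonus to marks
        "$rename": {"name": "full_name"},         # rename field name to full_name
        "$currentDate": {"last_modified": True},  # set to the server's current date
    }


update = build_update(5)
print(update)

# Applying it to every document (needs a running server):
# collection.update_many({}, update)
```

Each operator appears once as a top-level key, and the fields it should act on are listed inside it.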

Now let’s understand through some examples.


Sample Database:

This chapter looks at two use cases where updating many records is useful:

1. Changing or incrementing several documents based on a condition.
2. Inserting a new field into multiple or all documents.

Example 1: Mark all the students with marks greater than 35 as passed.

Python3
from pymongo import MongoClient

# Creating an instance of MongoClient

# on default localhost

client = MongoClient('mongodb://localhost:27017')

# Accessing desired database and collection

db = client.gfg

collection = db["classroom"]

# Update passed field to be true for all

# students with marks greater than 35

collection.update_many(

    # assumes "marks" is stored as a number; a quoted "35" would

    # only be compared against string values

    {"marks": {"$gt": 35}},

    {"$set": {"passed": True}}

)

Database After Query:


Example 2: New field called address added to all documents

Python

from pymongo import MongoClient

# Creating an instance of MongoClient

# on default localhost

client = MongoClient('mongodb://localhost:27017')

# Accessing desired database and collection

db = client.gfg

collection = db["classroom"]

# Address field to be added to all documents

collection.update_many(

    {},

    {"$set": {"Address": "value"}},

    # don't insert if no document found

    upsert=False,

    array_filters=None

)

Database After query:


CHAPTER 8: Aggregation in MongoDB using Python
MongoDB is a free, open-source, cross-platform, document-oriented database management
system (DBMS). It is a NoSQL database that stores data on disk in BSON format. BSON is a
binary encoding of JSON-like documents that can represent simple data structures,
associative arrays, and the various data types used in MongoDB. NoSQL databases provide a
mechanism for storing and retrieving data that is not modeled as tables and rows: instead of
the tables and rows of relational databases, the MongoDB architecture is made up of
collections and documents.

Aggregation in MongoDB

An aggregation operation groups the values from multiple documents (rows, in SQL terms),
performs operations on the grouped data, and returns a single result for each group.

Syntax:
db.collection_name.aggregate(pipeline)

Here pipeline is a list of aggregation stages.

Sample Database used in all the below examples:

Example 1:

Python3

from pymongo import MongoClient

my_client = MongoClient('localhost', 27017)


db = my_client["GFG"]

coll = db["Student"]

# Aggregation

cursor = coll.aggregate([

    {"$group": {

        "_id": "$Branch",

        "similar_branches": {"$sum": 1}

    }}

])

for document in cursor:

    print(document)

Output:

Here, the “$group” stage groups the documents. “_id”: “$Branch” makes the value of each
document’s Branch field the grouping key, so there is one result per branch.
“similar_branches” is the name of the output field that holds the number of documents in
each branch; any name can be used here. “$sum”: 1 acts as a counter: it adds 1 for every
document in the group.
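
For intuition, the grouping performed by this pipeline can be reproduced in plain Python. The sketch below counts documents per Branch value on a small made-up list of dictionaries. It mirrors the shape of the $group result but does not talk to MongoDB, and the function name group_count is illustrative.

```python
from collections import Counter

# Made-up stand-ins for documents in the Student collection
students = [
    {"name": "A", "Branch": "CSE"},
    {"name": "B", "Branch": "IT"},
    {"name": "C", "Branch": "CSE"},
]


def group_count(documents, field):
    """Count documents per value of field, like $group with {"$sum": 1}."""
    counts = Counter(doc.get(field) for doc in documents)
    return [{"_id": key, "similar_branches": n} for key, n in counts.items()]


for row in group_count(students, "Branch"):
    print(row)
```

Each distinct Branch value becomes one result whose _id is that value, just as each group does in the aggregation output.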

Example 2: We can also use an aggregation query to count the number of documents in
the collection.

Python3

from pymongo import MongoClient

my_client = MongoClient('localhost', 27017)


db = my_client["GFG"]

coll = db["Student"]

# Aggregation

cursor = coll.aggregate([

    {"$group": {

        # None puts every document into a single group

        "_id": None,

        "total collections": {"$sum": 1}

    }}

])

for document in cursor:

    print(document)

Output:
{'_id': None, 'total collections': 8}
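
MongoDB 3.4 and later also provide a dedicated $count stage for this. A minimal sketch, with the output field name total chosen arbitrarily and the helper name made up:

```python
def count_with_aggregation(collection):
    """Count all documents using the $count aggregation stage."""
    rows = list(collection.aggregate([{"$count": "total"}]))
    # $count emits no document at all when the collection is empty
    return rows[0]["total"] if rows else 0


# Usage with coll from the example (needs a running server):
# print(count_with_aggregation(coll))
```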
PART 5: Conversion between MongoDB data and Structured data
CHAPTER 1: Import JSON File in MongoDB using Python
Prerequisites: MongoDB and Python, Working With JSON Data in Python
MongoDB is a cross-platform, document-oriented, non-relational (i.e., NoSQL) database
program. It is an open-source document database that stores data in the form of key-value
pairs. JSON stands for JavaScript Object Notation. It is an open standard file format and
data interchange format, with the extension “.json”, that uses human-readable text to store
and transmit data objects consisting of attribute-value pairs and arrays.

Importing JSON file in MongoDB


To import a JSON file into MongoDB, first load the file and then insert its contents into a
collection. To load the file, import the json module, open the file, and parse it with
json.load(). Once the data is loaded, it can be inserted into the collection and queried
like any other document. Let’s see an example for better understanding.
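
The loading step can be tried on its own, without a database. The sketch below reads a JSON file and always returns a list of documents, whether the file holds a single object or an array. The helper name load_documents is made up, and the commented usage assumes a PyMongo collection named collection.

```python
import json


def load_documents(path):
    """Read a JSON file and always return a list of documents."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    # A top-level array holds many documents; a top-level object is one document
    return data if isinstance(data, list) else [data]


# With a PyMongo collection (needs a running server):
# docs = load_documents("data.json")
# if docs:                          # insert_many() rejects an empty list
#     collection.insert_many(docs)
```

Normalising to a list means insert_many() can be used in both cases, at the cost of an extra check for an empty file.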
Example :
Sample JSON used:
Python3

import json

from pymongo import MongoClient

# Making Connection
myclient = MongoClient("mongodb://localhost:27017/")

# database

db = myclient["GFG"]

# Created or Switched to collection

# named: data

Collection = db["data"]

# Loading or Opening the json file

with open('data.json') as file:

    file_data = json.load(file)

# Inserting the loaded data in the Collection

# if the JSON contains more than one entry

# insert_many is used, else insert_one is used

if isinstance(file_data, list):

    Collection.insert_many(file_data)

else:

    Collection.insert_one(file_data)

Output:
CHAPTER 2: Convert PyMongo Cursor to JSON
Prerequisites: MongoDB Python Basics
This chapter is about converting a PyMongo Cursor to JSON. The find() method returns a
Cursor instance, while find_one() returns a single document (or None if nothing matches).
Let’s begin:

1. Importing Required Modules: Import the required modules using the commands:

from pymongo import MongoClient
from bson.json_util import dumps

If MongoDB is not already installed on your machine, you can refer to the guide: Guide to
Install MongoDB with Python

2. Creating a Connection: Now that we have imported the modules, it is time to
establish a connection to the MongoDB server, presumably running on
localhost (host name) at port 27017 (port number).

client = MongoClient('localhost', 27017)

3. Accessing the Database: Now that the connection to the MongoDB server is established,
we can create a database or use an existing one.

mydatabase = client.name_of_the_database

4. Accessing the Collection: We now select the collection from the database using the
following syntax:

collection_name = mydatabase.name_of_collection

5. Getting the documents: Get all the documents from the collection using the find()
method. It returns a Cursor instance.

cursor = collection_name.find()

6. Converting the Cursor to JSON:

First, convert the Cursor to a list of dictionaries.

list_cur = list(cursor)

Then convert list_cur to JSON using the dumps() method from bson.json_util.

json_data = dumps(list_cur)

You can now save it to a file, or use it in the program by parsing it back with the loads() function.
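
bson.json_util is used because documents usually contain values, such as ObjectId and datetime, that the standard json module cannot serialise. When an exact round trip is not required, a lossy standard-library alternative is to pass default=str to json.dumps(), which turns unsupported values into strings. The sketch below shows this on a made-up document; unlike dumps() from bson.json_util, the original types cannot be recovered afterwards.

```python
import json
from datetime import datetime

# Made-up document; a real one would typically carry an ObjectId in _id
documents = [{"_id": 1, "name": "Vishwash", "joined": datetime(2021, 1, 15)}]

# default=str is called for every value json cannot encode natively
json_data = json.dumps(documents, default=str, indent=2)
print(json_data)

# Reading it back yields plain strings, not datetime objects
restored = json.loads(json_data)
print(type(restored[0]["joined"]))
```

This is convenient for logs or quick exports; for data that has to be loaded back into MongoDB unchanged, dumps() and loads() from bson.json_util remain the safer choice.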
Below is the implementation.

# Python Program for

# demonstrating the

# PyMongo Cursor to JSON

# Importing required modules

from pymongo import MongoClient

from bson.json_util import dumps, loads

# Connecting to MongoDB server

# client = MongoClient('host_name',

# 'port_number')

client = MongoClient('localhost', 27017)

# Connecting to the database named

# GFG

mydatabase = client.GFG

# Accessing the collection named

# College

mycollection = mydatabase.College

# Now creating a Cursor instance

# using find() function


cursor = mycollection.find()

# Converting cursor to the list

# of dictionaries

list_cur = list(cursor)

# Converting to the JSON

json_data = dumps(list_cur, indent = 2)

# Writing data to file data.json

with open('data.json', 'w') as file:

file.write(json_data)

Output:
CHAPTER 3: Convert PyMongo Cursor to Dataframe
Prerequisites: MongoDB Python Basics

This chapter is about converting a PyMongo Cursor to a Pandas Dataframe. The find() method
returns a Cursor instance, while find_one() returns a single document (or None if nothing
matches).

Let’s begin:

1. Importing Required Modules: Import the required modules using the commands:

from pymongo import MongoClient
from pandas import DataFrame

If MongoDB is not already installed on your machine, you can refer to the guide: Guide to
Install MongoDB with Python
If pandas is not installed, you can install it using pip (on some systems the Python 3
version of pip is called pip3):
pip install pandas

2. Creating a Connection: Now that we have imported the modules, it is time to
establish a connection to the MongoDB server, presumably running on
localhost (host name) at port 27017 (port number).

client = MongoClient('localhost', 27017)

3. Accessing the Database: Now that the connection to the MongoDB server is established,
we can create a database or use an existing one.

mydatabase = client.name_of_the_database

4. Accessing the Collection: We now select the collection from the database using the
following syntax:

collection_name = mydatabase.name_of_collection

5. Getting the documents: Get all the documents from the collection using the find()
method. It returns a Cursor instance.

cursor = collection_name.find()

6. Converting the Cursor to Dataframe:

First, convert the cursor to a list of dictionaries.

list_cur = list(cursor)

Then convert the list to a DataFrame.

df = DataFrame(list_cur)

Below is the implementation.


Sample Database:

# Python Program for demonstrating the

# PyMongo Cursor to Pandas DataFrame

# Importing required modules

from pymongo import MongoClient

from pandas import DataFrame

# Connecting to MongoDB server

# client = MongoClient('host_name',

# 'port_number')

client = MongoClient('localhost', 27017)

# Connecting to the database named

# GFG

mydatabase = client.GFG

# Accessing the collection named

# College

mycollection = mydatabase.College
# Now creating a Cursor instance

# using find() function

cursor = mycollection.find()

print('Type of cursor:',type(cursor))

# Converting cursor to the list of

# dictionaries

list_cur = list(cursor)

# Converting to the DataFrame

df = DataFrame(list_cur)

print('Type of df:',type(df))

# Printing the df to console

print()

print(df.head())

Output:
Type of cursor: <class 'pymongo.cursor.Cursor'>
Type of df: <class 'pandas.core.frame.DataFrame'>

   _id      name  Roll No  Branch
0    1  Vishwash     1001     CSE
1    2   Vishesh     1002      IT
2    3    Shivam     1003      ME
3    4      Yash     1004     ECE
4    5      Raju     1005     CSE

Output Explanation:

As seen above, when no argument is provided, head() prints only the first 5 records
(numbered 0 to 4). If you pass a positive integer n to head(), it returns the first n
records.
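
The _id column is often not wanted in the DataFrame. One way to leave it out, sketched below, is to strip it from each document before building the frame. The helper names are made up, and pandas is imported inside the second function so that the first helper can be tried without it.

```python
def cursor_to_records(cursor, drop_id=True):
    """Turn a cursor (or any iterable of dicts) into a list of plain dicts."""
    records = []
    for doc in cursor:
        if drop_id:
            doc = {key: value for key, value in doc.items() if key != "_id"}
        records.append(doc)
    return records


def cursor_to_dataframe(cursor, drop_id=True):
    """Build a pandas DataFrame from a cursor."""
    from pandas import DataFrame  # requires pandas to be installed
    return DataFrame(cursor_to_records(cursor, drop_id))


# Usage with the collection above (needs a running server and pandas):
# df = cursor_to_dataframe(mycollection.find())
# print(df.head(3))   # first three rows
```

An alternative with the same effect is to exclude the field on the server side by passing a projection such as {"_id": 0} to find().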
