Mongodb (Cont.) : Excerpts From "The Little Mongodb Book" Karl Seguin

The document provides guidance on modeling data in MongoDB and when it may be preferable to other database technologies. It discusses several MongoDB concepts including embedding documents and arrays to model relationships, denormalization of data to optimize reads, and flexibility in schema. The author notes MongoDB is an alternative to relational databases rather than a replacement, and emphasizes experimenting with different data modeling approaches for one's specific use cases.

Uploaded by

jorge alvarez

MongoDB (cont.)
Excerpts from
“The Little MongoDB Book”
Karl Seguin
Accessing array elements
• $slice: takes the form field: {$slice: [skip, limit]}, where the first value
indicates the number of items in the array to skip and the second
value indicates the number of items to return:
• db.unicorns.find({}, {loves: {$slice: [0, 1]}})
• db.unicorns.find({}, {loves: {$slice: [1, 1]}})
• db.unicorns.find({}, {loves: {$slice: [0, 2]}})
• db.unicorns.find({}, {loves: {$slice: 2}}) (here skip is 0)

• db.unicorns.find({},{loves: {$slice: -1}})
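
The $slice forms above can be modeled in plain JavaScript, without a database. This is only a sketch of the projection's behavior as described on this slide; sliceProjection is a hypothetical helper name, not a MongoDB function, and the sample array is made up.

```javascript
// Plain-JavaScript model of the $slice projection (no database needed).
// sliceProjection is a hypothetical helper, not part of MongoDB.
function sliceProjection(arr, spec) {
  if (typeof spec === 'number') {
    // {$slice: n}: first n items when n >= 0, last |n| items when n < 0
    return spec >= 0 ? arr.slice(0, spec) : arr.slice(spec);
  }
  const [skip, limit] = spec;
  // {$slice: [skip, limit]}: a negative skip counts from the end of the array
  const start = skip >= 0 ? skip : Math.max(arr.length + skip, 0);
  return arr.slice(start, start + limit);
}

const loves = ['apple', 'carrot', 'grape'];
console.log(sliceProjection(loves, [0, 1])); // [ 'apple' ]
console.log(sliceProjection(loves, [1, 1])); // [ 'carrot' ]
console.log(sliceProjection(loves, [0, 2])); // [ 'apple', 'carrot' ]
console.log(sliceProjection(loves, 2));      // [ 'apple', 'carrot' ]
console.log(sliceProjection(loves, -1));     // [ 'grape' ]
```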

MongoDB is rich in operators for dealing with
arrays…you are encouraged to try them…
• What do these queries do?

db.unicorns.find({"loves.0": 'grape'})
db.unicorns.find({"loves.1": 'grape'})
db.unicorns.find({loves: {$size: 3}})
Data modeling
• “Having a conversation about modeling with a new paradigm is not as
easy.” (Karl Seguin)
• “The truth is that most of us are still finding out what works and what
doesn’t when it comes to modeling with these new technologies.”
(Karl Seguin)
• Out of all NoSQL databases, document-oriented databases are
probably the most similar to relational databases – at least when it
comes to modeling. However, the differences that exist are important.
No Joins:
• Some NoSQL systems do not have joins.*
• To live in a join-less world, we have to do joins ourselves within our
application’s code.
• Essentially we need to issue a second query to find the relevant data
in a second collection. Setting our data up is not any different than
declaring a foreign key in a relational database.

* MongoDB does have joins through the $lookup aggregation stage; an example is shown later. See
https://docs.mongodb.com/manual/reference/operator/aggregation/lookup
• The first thing we will do is create an employee (we use an explicit
_id so that we can build coherent examples):
db.employees.insert({_id: ObjectId("4d85c7039ab0fd70a117d730"),
                     name: 'Leto'})

• Now let us add a couple of employees and set their manager to Leto:
db.employees.insert({_id: ObjectId("4d85c7039ab0fd70a117d731"),
                     name: 'Duncan',
                     manager: ObjectId("4d85c7039ab0fd70a117d730")});

db.employees.insert({_id: ObjectId("4d85c7039ab0fd70a117d732"),
                     name: 'Moneo',
                     manager: ObjectId("4d85c7039ab0fd70a117d730")});
• So to find all of Leto’s employees, one simply executes:

db.employees.find({manager: ObjectId("4d85c7039ab0fd70a117d730")})

• In this example, the lack of a join merely requires an extra query.
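
The "extra query" pattern can be sketched in plain JavaScript over an in-memory array. This is an illustration only: the short string ids stand in for the ObjectId values, and findReports is a hypothetical helper, not a MongoDB API. The first lookup plays the role of the query that finds the manager, the second that of the query on the manager field.

```javascript
// In-memory stand-in for the employees collection; string ids replace ObjectId
const employees = [
  {_id: 'd730', name: 'Leto'},
  {_id: 'd731', name: 'Duncan', manager: 'd730'},
  {_id: 'd732', name: 'Moneo', manager: 'd730'}
];

// Hypothetical helper: "query" 1 finds the manager, "query" 2 finds the reports
function findReports(collection, managerName) {
  const manager = collection.find(e => e.name === managerName);
  if (!manager) return [];
  return collection.filter(e => e.manager === manager._id);
}

const reportNames = findReports(employees, 'Leto').map(e => e.name);
console.log(reportNames); // [ 'Duncan', 'Moneo' ]
```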
A simple example with $lookup
db.user.remove({})
db.user.insert({code:1, name:'George', gender: 'male'})
db.user.insert({code:2, name:'Saffron', gender: 'female'})
db.user.insert({code:3, name:'Tini', gender: 'female'})

db.account.remove({})
db.account.insert({userCode:1, account:'Ventas'})
db.account.insert({userCode:2, account:'Compras'})
db.account.insert({userCode:1, account:'Cocina'})
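// Left outer join: every user gets an "accounts" array (empty if nothing matches)
db.user.aggregate([
  {
    $lookup: {
      from: "account",
      localField: "code",
      foreignField: "userCode",
      as: "accounts"
    }
  }
]).pretty();

// Same join, keeping only users with at least one account and hiding _id
db.user.aggregate([
  {
    $lookup: {
      from: "account",
      localField: "code",
      foreignField: "userCode",
      as: "accounts"
    }
  },
  {$match: {accounts: {$ne: []}}},
  {$project: {_id: 0}}
]).pretty();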
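
To see what the two pipelines compute, the same data can be joined by hand in plain JavaScript. This is a rough model of an equality-match $lookup as a left outer join, not the server's implementation; the lookup function is a hypothetical helper, and details such as null or missing fields are ignored.

```javascript
const users = [
  {code: 1, name: 'George', gender: 'male'},
  {code: 2, name: 'Saffron', gender: 'female'},
  {code: 3, name: 'Tini', gender: 'female'}
];
const accounts = [
  {userCode: 1, account: 'Ventas'},
  {userCode: 2, account: 'Compras'},
  {userCode: 1, account: 'Cocina'}
];

// Hypothetical helper modeling an equality-match $lookup as a left outer join
function lookup(local, foreign, localField, foreignField, as) {
  return local.map(doc => ({
    ...doc,
    [as]: foreign.filter(f => f[foreignField] === doc[localField])
  }));
}

const joined = lookup(users, accounts, 'code', 'userCode', 'accounts');
// Analogue of {$match: {accounts: {$ne: []}}}: drop users with no accounts
const withAccounts = joined.filter(u => u.accounts.length > 0);

console.log(joined.map(u => [u.name, u.accounts.length]));
// [ [ 'George', 2 ], [ 'Saffron', 1 ], [ 'Tini', 0 ] ]
console.log(withAccounts.map(u => u.name)); // [ 'George', 'Saffron' ]
```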
• Arrays and Embedded Documents:
• Remember that MongoDB supports arrays as first class objects of a
document
• It turns out that this is incredibly handy when dealing with many-to-
one or many-to-many relationships.
• As a simple example, if an employee could have two managers, we
could simply store these in an array:
db.employees.insert({
  _id: ObjectId("4d85c7039ab0fd70a117d733"),
  name: 'Siona',
  manager: [
    ObjectId("4d85c7039ab0fd70a117d730"),
    ObjectId("4d85c7039ab0fd70a117d732")
  ]
})
• Of particular interest is that, for some documents, manager can be a
scalar value, while for others it can be an array!
• Our previous find query will work for both:

db.employees.find({manager: ObjectId("4d85c7039ab0fd70a117d730")})
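
Why the same filter works for both shapes can be sketched in plain JavaScript. The model below covers only the case of a scalar search value; matchesEquality is a hypothetical helper and the short string ids stand in for the ObjectId values.

```javascript
// Hypothetical helper modeling how {field: value} matches for a scalar value:
// a scalar field must equal it; an array field matches if any element equals it
function matchesEquality(fieldValue, target) {
  return Array.isArray(fieldValue)
    ? fieldValue.includes(target)
    : fieldValue === target;
}

const staff = [
  {name: 'Duncan', manager: 'd730'},           // scalar manager
  {name: 'Siona', manager: ['d730', 'd732']},  // array of managers
  {name: 'Leto'}                               // no manager field
];

const managedByLeto = staff
  .filter(e => matchesEquality(e.manager, 'd730'))
  .map(e => e.name);
console.log(managedByLeto); // [ 'Duncan', 'Siona' ]
```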
• Besides arrays, MongoDB also supports embedded documents. Go
ahead and try inserting a document with a nested document, such as:
db.employees.insert({
  _id: ObjectId("4d85c7039ab0fd70a117d734"),
  name: 'Ghanima',
  family: {
    mother: 'Chani',
    father: 'Paul',
    brother: ObjectId("4d85c7039ab0fd70a117d730")
  }
})
• Embedded documents can be queried using a dot-notation:
db.employees.find({'family.mother': 'Chani'})
• Combining the two concepts, we can even embed arrays of documents:
db.employees.insert({
  _id: ObjectId("4d85c7039ab0fd70a117d735"),
  name: 'Chani',
  family: [
    {relation: 'mother', name: 'Ann'},
    {relation: 'father', name: 'Paul'},
    {relation: 'brother', name: 'Duncan'}
  ]
})
• Denormalization
• “Denormalization refers to the process of optimizing the read performance of a database by
adding redundant data or by grouping data.”*
• This process may be accomplished by duplicating data in multiple tables or by grouping data for
queries.
• With the ever-growing popularity of NoSQL systems, many of which do not have joins,
denormalization as part of normal modeling is becoming common.
This does not mean you should duplicate every piece of
data in every document.

* https://quizlet.com/145056951/cassandra-flash-cards
• Consider modeling your data based on what information belongs to
what document.
• For example, say you are writing a forum application. The traditional
way to associate a specific user with a post is via a userid column
within posts.
• With such a model, you cannot display posts without retrieving
(joining to) users.
• A possible alternative is simply to store the name as well as the userid
with each post.
• Of course, if you let users change their name, you may have to update
each document (which is one multi-update). But users do not
change their name very often.
• Adjusting to this kind of approach will not come easy to some.
• Do not be afraid to experiment with this approach, though: it can be
suitable in some circumstances.
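
The forum trade-off can be sketched in plain JavaScript: reads need no second lookup because the name travels with each post, while a rename has to touch every post by that user. The post data, the username field, and the renameUser helper are all hypothetical. In the shell, the rename would presumably be a single updateMany call with a $set on a posts collection.

```javascript
// In-memory stand-in for a posts collection with a denormalized username
const posts = [
  {_id: 1, userid: 'u1', username: 'leto', title: 'Spice'},
  {_id: 2, userid: 'u2', username: 'duncan', title: 'Swords'},
  {_id: 3, userid: 'u1', username: 'leto', title: 'Worms'}
];

// Reads need no join: the author name is stored with the post
const titlesWithAuthors = posts.map(p => `${p.title} by ${p.username}`);
console.log(titlesWithAuthors[0]); // Spice by leto

// Hypothetical helper: in-memory analogue of one multi-document update
function renameUser(collection, userid, newName) {
  let modified = 0;
  for (const post of collection) {
    if (post.userid === userid) {
      post.username = newName;
      modified++;
    }
  }
  return modified;
}

const modifiedCount = renameUser(posts, 'u1', 'leto2');
console.log(modifiedCount); // 2
```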
• Some alternatives
• Arrays of ids can be a useful strategy when dealing with one-to-many
or many-to-many scenarios. But more commonly, developers are left
deciding between using embedded documents versus doing “manual”
referencing.
• Embedded documents are frequently taken advantage of*, but mostly
for smaller pieces of data which we want to always pull with the
parent document.
• A real-world example may be to store address documents with
each user, something like:
* In the original the author uses “leveraged”; however, see
http://this.isfluent.com/2010/1/are-you-stupid-enough-to-use-leverage-as-a-verb
db.employees.insert(
{name: 'leto',
email: '[email protected]',
addresses: [
{street: "229 W. 43rd St", city: "New York", state:"NY",zip:"10036"},
{street: "555 University", city: "Palo Alto", state:"CA",zip:"94107"}]
})
• This does not mean you should underestimate the power of
embedded documents or write them off as something of minor utility.
• Having your data model map directly to your objects makes things a
lot simpler and often removes the need to join.
• This is especially true when you consider that MongoDB lets you
query and index fields of embedded documents and arrays.
• Few or Many Collections
• Given that collections do not enforce any schema, it is entirely
possible to build a system using a single collection with a mishmash of
documents!!! But it would be a very bad idea.
• The conversation gets even more interesting when you consider
embedded documents.
• The example that frequently comes up is a blog. Should you have a
posts collection and a comments collection, or should each post have
an array of comments embedded within it?
• Setting aside the document size limit for the time being*, most
developers should prefer to separate things out. It is simply cleaner,
gives you better performance, and is more explicit.
• MongoDB’s flexible schema allows you to combine the two
approaches by keeping comments in their own collection but
embedding a few comments (maybe the first few) in the blog post to
be able to display them with the post.
This follows the principle of keeping together data
that you want to get back in one query.
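
One way to picture the combined approach is the plain-JavaScript sketch below. It is only an illustration under assumptions: the preview size of 3, the firstComments field, and the addComment helper are hypothetical, and arrays stand in for the two collections.

```javascript
const PREVIEW_SIZE = 3; // assumed number of comments embedded in the post
const post = {_id: 'p1', title: 'Hello', firstComments: []};
const comments = []; // stand-in for a separate comments collection

// Hypothetical helper: every comment goes to the comments collection,
// and only the first few are also embedded in the post document
function addComment(post, comments, text) {
  comments.push({postId: post._id, text});
  if (post.firstComments.length < PREVIEW_SIZE) {
    post.firstComments.push({text});
  }
}

['a', 'b', 'c', 'd', 'e'].forEach(t => addComment(post, comments, t));
console.log(post.firstComments.length, comments.length); // 3 5
```

Displaying the post with its first comments then needs only the post document; the full thread is fetched from the separate collection when required.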

* 16 MB per document in MongoDB
• There is no hard rule.
• Play with different approaches and you will get a sense of what does
and does not feel right.
When To Use MongoDB?
• There are enough new and competing storage technologies that it is
easy to get overwhelmed by all of the choices.
• Only you know whether the benefits of introducing a new solution
outweigh the costs.
• MongoDB (and NoSQL databases in general) should be seen as a
direct alternative to relational databases.
Notice that we did not call MongoDB a replacement for relational
databases, but rather an alternative.
• It is a tool that can do what a lot of other tools can do. Some of it
MongoDB does better, some of it MongoDB does worse. Let us
dissect things a little further.
Flexible Schema
• An oft-touted* benefit of document-oriented databases is that they do
not enforce a fixed schema.
• This makes them much more flexible than traditional database tables.

* Oft-touted: heavily promoted
• People talk about schema-less as though you will suddenly start
storing a crazy mishmash of data.
• “There are domains and data sets which can really be a pain to model
using relational databases, but I see those as edge cases.” (Karl
Seguin)
• Schema-less is cool, but most of your data is going to be highly
structured.
• The occasional irregularity is probably nothing a nullable column
would not solve just as well.
A lot of features…
Writes
• MongoDB has something called a capped* collection.
• We can create a capped collection by using the db.createCollection
command and flagging it as capped:
// limit our capped collection to 1 megabyte
db.createCollection('logs', {capped: true, size: 1048576})

* Capped: having an upper limit

• When our capped collection reaches its 1MB limit, old documents are
automatically purged.
• A limit on the number of documents, rather than the size, can be set
using max.
• If you want to “expire” your data based on time rather than overall
collection size, you can use TTL indexes where TTL stands for “time-
to-live”. See: https://docs.mongodb.com/manual/core/index-ttl
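
The purge behavior of a capped collection limited by document count can be modeled in a few lines of plain JavaScript. This is a toy sketch, not how the server stores data; CappedLog is a hypothetical class and the cap of 3 is arbitrary.

```javascript
// Hypothetical class modeling a capped collection limited by document count
class CappedLog {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }
  insert(doc) {
    this.docs.push(doc);
    // Once the cap is exceeded, the oldest document is dropped first
    if (this.docs.length > this.max) this.docs.shift();
  }
}

const logs = new CappedLog(3);
for (let i = 1; i <= 5; i++) logs.insert({n: i});
console.log(logs.docs.map(d => d.n)); // [ 3, 4, 5 ]
```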
A lot of features…
Full Text Search
• MongoDB includes text search capabilities. See:
https://docs.mongodb.com/manual/reference/operator/query/text
• It supports fifteen languages with stemming and stop words.
• With MongoDB’s support for arrays and text search you will only need
to look to other solutions if you need a more powerful and full-
featured text search engine.

Utilities: mongoimport and mongoexport (JSON and CSV files)

A lot of features…
Data Processing
• Before version 2.2 MongoDB relied on MapReduce for most data
processing jobs.
• Version 2.2 added a powerful feature called the aggregation
framework* (or pipeline), so you will only need to use MapReduce in
rare cases where you need complex functions for aggregations that
are not yet supported in the pipeline.
• For parallel processing of very large data, you may need to rely on
something else, such as Hadoop.

* Similar to GROUP BY in SQL; you are encouraged to try it. A basic example follows.
• A basic aggregation example: What does this code do?

db.unicorns.aggregate([
{ $match: { } },
{ $group: { _id: "$gender", total: { $sum: 1 } } }
])

$match is similar to WHERE in SQL; here it is empty, so it can be removed.

See also:
https://docs.mongodb.com/manual/reference/method/db.collection.aggregate
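
As a hint for the question above, the $group stage with $sum: 1 can be mimicked in plain JavaScript as a count per distinct field value. The sample documents and the groupCount helper are hypothetical; the order of the groups returned by the real pipeline is not guaranteed, whereas this sketch keeps first-seen order.

```javascript
const unicorns = [
  {name: 'Horny', gender: 'm'},
  {name: 'Aurora', gender: 'f'},
  {name: 'Unicrom', gender: 'm'}
];

// Hypothetical helper: counts documents per distinct value of a field,
// like {$group: {_id: "$gender", total: {$sum: 1}}}
function groupCount(docs, field) {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc[field], (totals.get(doc[field]) || 0) + 1);
  }
  return [...totals].map(([_id, total]) => ({_id, total}));
}

console.log(groupCount(unicorns, 'gender'));
// [ { _id: 'm', total: 2 }, { _id: 'f', total: 1 } ]
```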
A lot of features…
Geospatial
• A particularly powerful feature of MongoDB is its support for geospatial
indexes. This allows you to store GeoJSON objects or x and y coordinates
within documents, along with other geospatial data.
See: https://docs.mongodb.com/manual/reference/geojson
Parallel and distributed execution across sharded nodes
• Replicas

Many, many more features…

Very briefly: Cursors
• The db.collection.find() method returns a cursor.
• In the mongo shell, a cursor that is not assigned to a variable is
iterated automatically (by default up to 20 times) to print the first
results of the query.
• You can also manually iterate a cursor: In the mongo shell, when you
assign the cursor returned from the find() method to a variable using
the var keyword, the cursor does not automatically iterate.
• Cursors are rich in methods, see
https://docs.mongodb.com/manual/reference/method/js-cursor
Example 1
var myCursor = db.unicorns.find({});
while (myCursor.hasNext()) {
  print(tojson(myCursor.next()));
}
• As an alternative, consider the printjson() method to replace print(tojson()):
var myCursor = db.unicorns.find({});
while (myCursor.hasNext()) {
  printjson(myCursor.next());
}
Example 2: What does this example do?
var micursor = db.unicorns.find().sort({weight: 1});
var i = 0;
while (micursor.hasNext()) {
  if (i % 2 == 1) printjson(micursor.next());
  else micursor.next();
  i++;
}
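
The control flow of Example 2 can be traced without a database using a tiny array-backed cursor. This is a sketch under assumptions: makeCursor is a hypothetical helper that imitates only hasNext() and next(), and the weights are made-up values already in ascending order.

```javascript
// Hypothetical minimal cursor over an array, exposing hasNext() and next()
function makeCursor(docs) {
  let pos = 0;
  return {
    hasNext: () => pos < docs.length,
    next: () => docs[pos++]
  };
}

// Stand-in for the result of find().sort({weight: 1})
const byWeight = [{weight: 421}, {weight: 450}, {weight: 540}, {weight: 575}, {weight: 600}];
const cursor = makeCursor(byWeight);
const printed = [];
let i = 0;
while (cursor.hasNext()) {
  const doc = cursor.next();                 // the cursor advances on every pass
  if (i % 2 === 1) printed.push(doc.weight); // only odd positions are kept
  i++;
}
console.log(printed); // [ 450, 575 ]
```

In other words, the loop visits every document in weight order but outputs only every second one, starting with the second.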
