mongoDB indexing topic - 1

mongoDB的index和我们常见的RDBMS的index差不多,目的都是为了提高查询性能。
在mongoDB中,除了常见的B-Tree索引,还有Geospatial索引(一般用于定位搜索)。
B-Tree索引支持在key,内嵌key,内嵌文档,array等上面建索引,支持复合索引(compound index),唯一索引;在1.7.4以后还引入了稀疏索引的概念.
范例 : 
# 新建测试collection
> for (i=0;i<10000;i++) {                                                             
... db.userinfo.insert({"firstname" : "zhou","lastname" : "digoal" + "." + i,"age" : i})
... }
> db.userinfo.find().limit(10)
{ "_id" : ObjectId("4d23fdc69ae655253f811078"), "firstname" : "zhou", "lastname" : "digoal.0", "age" : 0 }
{ "_id" : ObjectId("4d23fdc69ae655253f811079"), "firstname" : "zhou", "lastname" : "digoal.1", "age" : 1 }
{ "_id" : ObjectId("4d23fdc69ae655253f81107a"), "firstname" : "zhou", "lastname" : "digoal.2", "age" : 2 }
{ "_id" : ObjectId("4d23fdc69ae655253f81107b"), "firstname" : "zhou", "lastname" : "digoal.3", "age" : 3 }
{ "_id" : ObjectId("4d23fdc69ae655253f81107c"), "firstname" : "zhou", "lastname" : "digoal.4", "age" : 4 }
{ "_id" : ObjectId("4d23fdc69ae655253f81107d"), "firstname" : "zhou", "lastname" : "digoal.5", "age" : 5 }
{ "_id" : ObjectId("4d23fdc69ae655253f81107e"), "firstname" : "zhou", "lastname" : "digoal.6", "age" : 6 }
{ "_id" : ObjectId("4d23fdc69ae655253f81107f"), "firstname" : "zhou", "lastname" : "digoal.7", "age" : 7 }
{ "_id" : ObjectId("4d23fdc69ae655253f811080"), "firstname" : "zhou", "lastname" : "digoal.8", "age" : 8 }
{ "_id" : ObjectId("4d23fdc69ae655253f811081"), "firstname" : "zhou", "lastname" : "digoal.9", "age" : 9 }
# 当前只有_id的默认索引
> db.system.indexes.find({"ns" : "test.userinfo"})
{ "name" : "_id_", "ns" : "test.userinfo", "key" : { "_id" : 1 } }

1. basic usage
# 全表扫描
> db.userinfo.find({"lastname" : "digoal.0"})
{ "_id" : ObjectId("4d23fdc69ae655253f811078"), "firstname" : "zhou", "lastname" : "digoal.0", "age" : 0 }
> db.userinfo.find({"lastname" : "digoal.0"}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 10000,
        "nscannedObjects" : 10000,
        "n" : 1,
        "millis" : 5,
        "indexBounds" : {

        }
}
# nscanned 扫描了多少条记录,n返回多少条记录,cursor扫描方法.
# create index on lastname key
# 正向索引
> db.userinfo.ensureIndex({"lastname" : 1})
> db.system.indexes.find({"ns" : "test.userinfo"})     
{ "name" : "_id_", "ns" : "test.userinfo", "key" : { "_id" : 1 } }
{ "_id" : ObjectId("4d2402a39ae655253f813789"), "ns" : "test.userinfo", "key" : { "lastname" : 1 }, "name" : "lastname_1" }
# 反向索引
> db.userinfo.ensureIndex({"lastname" : -1})      
> db.system.indexes.find({"ns" : "test.userinfo"})
{ "name" : "_id_", "ns" : "test.userinfo", "key" : { "_id" : 1 } }
{ "_id" : ObjectId("4d2402a39ae655253f813789"), "ns" : "test.userinfo", "key" : { "lastname" : 1 }, "name" : "lastname_1" }
{ "_id" : ObjectId("4d2402ad9ae655253f81378a"), "ns" : "test.userinfo", "key" : { "lastname" : -1 }, "name" : "lastname_-1" }
# 重复建同样的索引,mongoDB不会干任何事情
> db.userinfo.ensureIndex({"lastname" : -1})      
> db.system.indexes.find({"ns" : "test.userinfo"})
{ "name" : "_id_", "ns" : "test.userinfo", "key" : { "_id" : 1 } }
{ "_id" : ObjectId("4d2402a39ae655253f813789"), "ns" : "test.userinfo", "key" : { "lastname" : 1 }, "name" : "lastname_1" }
{ "_id" : ObjectId("4d2402ad9ae655253f81378a"), "ns" : "test.userinfo", "key" : { "lastname" : -1 }, "name" : "lastname_-1" }
> db.getLastError()
null
# but that's ok(驱动列一样的两个索引)
> db.system.indexes.find({"ns" : "test.userinfo"})   
{ "name" : "_id_", "ns" : "test.userinfo", "key" : { "_id" : 1 } }
{ "_id" : ObjectId("4d24050e9ae655253f813790"), "ns" : "test.userinfo", "key" : { "lastname" : 1 }, "name" : "lastname_1" }
{ "_id" : ObjectId("4d24051a9ae655253f813791"), "ns" : "test.userinfo", "key" : { "lastname" : 1, "age" : 1 }, "name" : "lastname_1_age_1" }
# 执行计划测试
# 单个lastname条件时走lastname_1索引
> db.userinfo.find({"lastname" : "digoal.0"}).explain()
{
        "cursor" : "BtreeCursor lastname_1",
        "nscanned" : 1,
        "nscannedObjects" : 1,
        "n" : 1,
        "millis" : 0,
        "indexBounds" : {
                "lastname" : [
                        [
                                "digoal.0",
                                "digoal.0"
                        ]
                ]
        }
}
# 复合条件"lastname" : "digoal.0","age" : 0走lastname_1_age_1索引
> db.userinfo.find({"lastname" : "digoal.0","age" : 0}).explain()
{
        "cursor" : "BtreeCursor lastname_1_age_1",
        "nscanned" : 1,
        "nscannedObjects" : 1,
        "n" : 1,
        "millis" : 0,
        "indexBounds" : {
                "lastname" : [
                        [
                                "digoal.0",
                                "digoal.0"
                        ]
                ],
                "age" : [
                        [
                                0,
                                0
                        ]
                ]
        }
}
# 非驱动列查询条件不走索引
> db.userinfo.find({"age" : 0}).explain()                        
{
        "cursor" : "BasicCursor",
        "nscanned" : 10000,
        "nscannedObjects" : 10000,
        "n" : 1,
        "millis" : 5,
        "indexBounds" : {

        }
}
> db.userinfo.find({"firstname" : "zhou"}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 10000,
        "nscannedObjects" : 10000,
        "n" : 10000,
        "millis" : 5,
        "indexBounds" : {

        }
}
# 排序与索引,单KEY索引正向反向排序输出都可以走索引,复合索引需要注意,索引KEY必须同时被正或被反,否则不可以走索引.
> db.userinfo.find().sort({"lastname" : 1}).explain() 
{
        "cursor" : "BtreeCursor lastname_1",
        "nscanned" : 10000,
        "nscannedObjects" : 10000,
        "n" : 10000,
        "millis" : 6,
        "indexBounds" : {
                "lastname" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ]
        }
}
> db.userinfo.find().sort({"lastname" : -1}).explain()          
{
        "cursor" : "BtreeCursor lastname_1 reverse",
        "nscanned" : 10000,
        "nscannedObjects" : 10000,
        "n" : 10000,
        "millis" : 6,
        "indexBounds" : {
                "lastname" : [
                        [
                                {
                                        "$maxElement" : 1
                                },
                                {
                                        "$minElement" : 1
                                }
                        ]
                ]
        }
}
> db.userinfo.find().sort({"lastname" : 1,"age" : 1}).explain()
{
        "cursor" : "BtreeCursor lastname_1_age_1",
        "nscanned" : 10000,
        "nscannedObjects" : 10000,
        "n" : 10000,
        "millis" : 8,
        "indexBounds" : {
                "lastname" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ],
                "age" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ]
        }
}
> db.userinfo.find().sort({"lastname" : -1,"age" : -1}).explain()
{
        "cursor" : "BtreeCursor lastname_1_age_1 reverse",
        "nscanned" : 10000,
        "nscannedObjects" : 10000,
        "n" : 10000,
        "millis" : 8,
        "indexBounds" : {
                "lastname" : [
                        [
                                {
                                        "$maxElement" : 1
                                },
                                {
                                        "$minElement" : 1
                                }
                        ]
                ],
                "age" : [
                        [
                                {
                                        "$maxElement" : 1
                                },
                                {
                                        "$minElement" : 1
                                }
                        ]
                ]
        }
}
> db.userinfo.find().sort({"lastname" : -1,"age" : 1}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 10000,
        "nscannedObjects" : 10000,
        "n" : 10000,
        "scanAndOrder" : true,
        "millis" : 51,
        "indexBounds" : {

        }
}
> db.userinfo.find().sort({"lastname" : 1,"age" : -1}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 10000,
        "nscannedObjects" : 10000,
        "n" : 10000,
        "scanAndOrder" : true,
        "millis" : 51,
        "indexBounds" : {

        }
}

2. 内嵌KEY索引
# 新建测试数据
for (i=0;i<1000;i++) {
db.userinfo1.insert({"firstname" : "zhou","lastname" : "digoal","age" : 27,"other" : {"corp" : "sky-mobi","city" : "hangzhou","phone" : i}})
}
# 内嵌KEY索引
> db.userinfo1.ensureIndex({"other.phone" : 1})
> db.userinfo1.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo1",
                "key" : {
                        "_id" : 1
                }
        },
        {
                "_id" : ObjectId("4d2408e59ae655253f813b7c"),
                "ns" : "test.userinfo1",
                "key" : {
                        "other.phone" : 1
                },
                "name" : "other.phone_1"
        }
]
# 执行计划,走索引other.phone_1,注意到索引的长度是内嵌文档+.$key_正或反,所以索引的名字
> db.userinfo1.find({"other.phone" : 0}).explain()
{
        "cursor" : "BtreeCursor other.phone_1",
        "nscanned" : 1,
        "nscannedObjects" : 1,
        "n" : 1,
        "millis" : 0,
        "indexBounds" : {
                "other.phone" : [
                        [
                                0,
                                0
                        ]
                ]
        }
}
# 使用自定义索引名字:phone,由于mongoDB中namespace的长度是受限制的127,所以选择适当的名字是非常有必要的
> db.userinfo1.dropIndex({"other.phone" : 1})     
{ "nIndexesWas" : 2, "ok" : 1 }
> db.userinfo1.ensureIndex({"other.phone" : 1},{"name" : "phone"})
> db.userinfo1.find({"other.phone" : 0}).explain()                
{
        "cursor" : "BtreeCursor phone",
        "nscanned" : 1,
        "nscannedObjects" : 1,
        "n" : 1,
        "millis" : 0,
        "indexBounds" : {
                "other.phone" : [
                        [
                                0,
                                0
                        ]
                ]
        }
}
# 内嵌KEY索引也可以做复合索引

3. 内嵌文档索引
# 拿上面的表继续测试
> db.userinfo1.ensureIndex({"other" : 1},{"name" : "ebd"})        
> db.userinfo1.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo1",
                "key" : {
                        "_id" : 1
                }
        },
        {
                "_id" : ObjectId("4d2409989ae655253f813b7d"),
                "ns" : "test.userinfo1",
                "key" : {
                        "other.phone" : 1
                },
                "name" : "phone"
        },
        {
                "_id" : ObjectId("4d240a0f9ae655253f813b7e"),
                "ns" : "test.userinfo1",
                "key" : {
                        "other" : 1
                },
                "name" : "ebd"
        }
]
# 测试文档索引的使用,只有当使用文档类型时才走索引,直接引用内嵌KEY是无法走内嵌文档索引的.
# 这点和array的索引不太一样,array的索引是建立在array的每个值上面的.
> db.userinfo1.find({"other.corp" : "sky-mobi"}).explain() 
{
        "cursor" : "BasicCursor",
        "nscanned" : 1001,
        "nscannedObjects" : 1001,
        "n" : 1001,
        "millis" : 1,
        "indexBounds" : {

        }
}
> db.userinfo1.find({"other" : {"corp" : "sky-mobi"}}).explain()
{
        "cursor" : "BtreeCursor ebd",
        "nscanned" : 0,
        "nscannedObjects" : 0,
        "n" : 0,
        "millis" : 0,
        "indexBounds" : {
                "other" : [
                        [
                                {
                                        "corp" : "sky-mobi"
                                },
                                {
                                        "corp" : "sky-mobi"
                                }
                        ]
                ]
        }
}
# 内嵌文档索引也可以被拆成复合内嵌KEY索引,这样的话引导KEY可以被执行计划使用。

3. array索引
# 参考我之前写的《mongoDB's Multikeys feature》

4. 复合索引
# 如在a,b,c上建索引
可以被条件
a
a,b
a,c
a,b,c使用
# 测试
> db.userinfo.drop()
true
> for (i=0;i<1000;i++) {
... db.userinfo.insert({"firstname" : "zhou","lastname" : "digoal" + "." + i,"age" : i})
... }
> db.userinfo.ensureIndex({"firstname" :1,"lastname" :1,"age" :1})
> db.userinfo.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo",
                "key" : {
                        "_id" : 1
                }
        },
        {
                "_id" : ObjectId("4d240cbf9ae655253f813f67"),
                "ns" : "test.userinfo",
                "key" : {
                        "firstname" : 1,
                        "lastname" : 1,
                        "age" : 1
                },
                "name" : "firstname_1_lastname_1_age_1"
        }
]
> db.userinfo.find({"firstname" : "zhou"}).explain()
{
        "cursor" : "BtreeCursor firstname_1_lastname_1_age_1",
        "nscanned" : 1000,
        "nscannedObjects" : 1000,
        "n" : 1000,
        "millis" : 1,
        "indexBounds" : {
                "firstname" : [
                        [
                                "zhou",
                                "zhou"
                        ]
                ],
                "lastname" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ],
                "age" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ]
        }
}
> db.userinfo.find({"firstname" : "zhou","lastname" : "digoal.0"}).explain()
{
        "cursor" : "BtreeCursor firstname_1_lastname_1_age_1",
        "nscanned" : 1,
        "nscannedObjects" : 1,
        "n" : 1,
        "millis" : 0,
        "indexBounds" : {
                "firstname" : [
                        [
                                "zhou",
                                "zhou"
                        ]
                ],
                "lastname" : [
                        [
                                "digoal.0",
                                "digoal.0"
                        ]
                ],
                "age" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ]
        }
}
> db.userinfo.find({"firstname" : "zhou","lastname" : "digoal.0","age" : 0}).explain()
{
        "cursor" : "BtreeCursor firstname_1_lastname_1_age_1",
        "nscanned" : 1,
        "nscannedObjects" : 1,
        "n" : 1,
        "millis" : 0,
        "indexBounds" : {
                "firstname" : [
                        [
                                "zhou",
                                "zhou"
                        ]
                ],
                "lastname" : [
                        [
                                "digoal.0",
                                "digoal.0"
                        ]
                ],
                "age" : [
                        [
                                0,
                                0
                        ]
                ]
        }
}
> db.userinfo.find({"firstname" : "zhou","age" : 0}).explain()                        
{
        "cursor" : "BtreeCursor firstname_1_lastname_1_age_1",
        "nscanned" : 1,
        "nscannedObjects" : 1,
        "n" : 1,
        "millis" : 2,
        "indexBounds" : {
                "firstname" : [
                        [
                                "zhou",
                                "zhou"
                        ]
                ],
                "lastname" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ],
                "age" : [
                        [
                                0,
                                0
                        ]
                ]
        }
}
> db.userinfo.find({"lastname" : "digoal.0","age" : 0}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 1000,
        "nscannedObjects" : 1000,
        "n" : 1,
        "millis" : 0,
        "indexBounds" : {

        }
}
# 当然,如果你愿意的话使用hint强制索引也行
> db.userinfo.find({"lastname" : "digoal.0","age" : 0}).hint({"firstname" :1,"lastname" :1,"age" :1}).explain()
{
        "cursor" : "BtreeCursor firstname_1_lastname_1_age_1",
        "nscanned" : 1,
        "nscannedObjects" : 1,
        "n" : 1,
        "millis" : 0,
        "indexBounds" : {
                "firstname" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ],
                "lastname" : [
                        [
                                "digoal.0",
                                "digoal.0"
                        ]
                ],
                "age" : [
                        [
                                0,
                                0
                        ]
                ]
        }
}

5. 松散索引,New in 1.7.4,以下来自官方网站介绍
A "sparse index" is an index that only includes documents with the indexed field.
Any document that is missing the sparsely indexed field will not be stored in the index; the index will therefor be sparse because of the missing documents when values are missing.

Sparse indexes, by definition, are not complete (for the collection) and behave differently than complete indexes. When using a "sparse index" for sorting (or possibly just filtering) some documents in the collection may not be returned. This is because only documents in the index will be returned
# 范例
db.people.ensureIndex({title : 1}, {sparse : true})
db.people.save({name:"Jim"})
db.people.save({name:"Sarah", title:"Princess"})
db.people.find({title:{$ne:null}}).sort({title:1})        // returns only Sarah
You can combine sparse with unique to produce a unique constraint that ignores documents with missing fields

时间: 2024-09-21 18:19:26

mongoDB indexing topic - 1的相关文章

mongoDB indexing topic - 2

6. 唯一索引# 范例> db.userinfo.drop()true> db.userinfo.insert({"firstname" : "zhou","lastname" : "digoal","age" : 27})> db.userinfo.ensureIndex({"lastname" : 1},{"unique" : true})>

mongoDB Indexing Advice

mongoDB建索引的几个建议:1. 建立索引的前提是查询条件,建立完之后explain查看有索引和无索引的开销对比.能建立唯一的索引尽量唯一.如:> for (i=0;i<10000;i++) {... db.userinfo.insert({"firstname" : "zhou" + i,"lastname" : "digoal" + i,"age" : i,"city"

mongoDB&#039;s Geospatial Indexing

mongoDB支持二维空间索引,使用空间索引,mongoDB支持一种特殊查询,如某地图网站上可以查找离你最近的咖啡厅,银行等信息.这个使用mongoDB的空间索引结合特殊的查询方法很容易实现.前提条件:建立空间索引的key可以使用array或内嵌文档存储,但是前两个elements必须存储固定的一对空间位置数值.如 { loc : [ 50 , 30 ] } { loc : { x : 50 , y : 30 } } { loc : { foo : 50 , y : 30 } } { loc :

Discuz!NT千万级数据量上的两驾马车--TokyoCabinet,MongoDB

在Discuz!NT的企业版设计过程中,处理大数据表一直是一个让人头疼的问题,特别是像主题表 (topic),用户表(user)等,因为对于一个流量和发帖量都很大的论坛而言,在运行几年之后,这两 个表的数据量可能会破千万(注:因为帖子表采用分表机制,所以这里暂未涉及,但出于性能考虑,也提 供了本文中类似的解决方案).当时考虑的架构设计中有两种思路来解决这种问题: 一种是采用类似MYSPACE的方式,即按一定记录KEY值(比如用户表的UID)来对大数据表中的记录进行 分割,比如前200万用户(即:

TokuMX, MongoDB and InnoDB versus the insert benchmark with disks

I used the insert benchmark on servers that use disks in my quest to learn more about MongoDB internals. The insert benchmark is interesting for a few reasons. First while inserting a lot of data isn't something I do all of the time it is something f

mongoose-100分悬赏,Mongodb使用_auth参数启动后,Mongoose应该怎么连接

问题描述 100分悬赏,Mongodb使用_auth参数启动后,Mongoose应该怎么连接 大家好,请教下,当Mongodb使用_auth参数启动后,Mongoose应该怎么连接才行?看了Mongoose官网文档,尝试了以下连接方法:1.mongoose.connect(""mongodb://admin:111111@localhost/root""); 2.var options = { db: { native_parser: true } server:

How Should I Backup MongoDB Databases?

Abstract: MongoDB replica set is composed of a group of MongoDB instances (processes), including one primary node and multiple secondary nodes. All the data on the MongoDB Driver (client) is written to the primary node, and the secondary node synchro

MongoDB World 2017 参会全记录

作者简介 张友东,阿里云数据库技术专家,Mongoing中文社区联合发起人,主要关注分布式存储.NoSQL数据库等技术领域,目前主要从事MongoDB云数据库的研发工作,致力于让开发者用上最好的MongoDB云服务,本次给大家分享上周(6.19-6.21)在芝加哥参加 MongoDB Wolrd 2017 的一些收获. MongoDB云数据库包年5折,包月6折活动火热进行中,猛击了解详情 云数据库 MongoDB 版 基于飞天分布式系统和高性能存储,提供三节点副本集的高可用架构,容灾切换,故障迁

Using mongoDB&#039;s Profiler analyze the performance of database operations

慢查询在数据库分析中是一个非常重要的参考.我记得在PostgreSQL Conference 2009 中曾经就有人专门讲解了Hit the mole.(90%的数据库性能问题往往出现在10%的SQL上面.)在MySQL,PostgreSQL数据库中,用日志来记录慢查询.(比如为数据库设置一下阀值100MS,运行时间超过100MS的SQL将记录到日志中.)在mongoDB中也比较类似,使用profile 来配置慢查询的阀值.已经是否开启记录慢查询的功能.mongoDB的慢查询被记录到system