mongoDB indexing topic

6. 唯一索引
# 范例
> db.userinfo.drop()
true
> db.userinfo.insert({"firstname" : "zhou","lastname" : "digoal","age" : 27})
> db.userinfo.ensureIndex({"lastname" : 1},{"unique" : true})
> db.userinfo.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo",
                "key" : {
                        "_id" : 1
                }
        },
        {
                "_id" : ObjectId("4d240fa49ae655253f813f69"),
                "ns" : "test.userinfo",
                "key" : {
                        "lastname" : 1
                },
                "name" : "lastname_1",
                "unique" : true
        }
]
> db.userinfo.insert({"firstname" : "zhou","lastname" : "digoal","age" : 27})
E11000 duplicate key error index: test.userinfo.$lastname_1 dup key: { : "digoal" }

# null值测试,唯一索引中null的值不能出现两次,这个和一般的RDBMS不太一样.
> db.userinfo.insert({"firstname" : "zhou","age" : 27})
> db.userinfo.insert({"firstname" : "zhou","age" : 27})
E11000 duplicate key error index: test.userinfo.$lastname_1 dup key: { : null }

# 测试自动删除重复记录
> db.userinfo.drop()
true
> db.userinfo.insert({"firstname" : "zhou","lastname" : "digoal","age" : 27})
> db.userinfo.insert({"firstname" : "zhou","lastname" : "digoal1","age" : 28})
> db.userinfo.insert({"firstname" : "zhou","lastname" : "digoal2","age" : 29})
> db.userinfo.insert({"firstname" : "zhou1","lastname" : "digoal","age" : 28})
> db.userinfo.insert({"firstname" : "zhou2","lastname" : "digoal","age" : 28})
> db.userinfo.insert({"firstname" : "zhou2","lastname" : "digoal","age" : 28})
> db.userinfo.find()
{ "_id" : ObjectId("4d2410609ae655253f813f6d"), "firstname" : "zhou", "lastname" : "digoal", "age" : 27 }
{ "_id" : ObjectId("4d24107c9ae655253f813f6e"), "firstname" : "zhou", "lastname" : "digoal1", "age" : 28 }
{ "_id" : ObjectId("4d2410809ae655253f813f6f"), "firstname" : "zhou", "lastname" : "digoal2", "age" : 29 }
{ "_id" : ObjectId("4d2410949ae655253f813f70"), "firstname" : "zhou1", "lastname" : "digoal", "age" : 28 }
{ "_id" : ObjectId("4d2410979ae655253f813f71"), "firstname" : "zhou2", "lastname" : "digoal", "age" : 28 }
{ "_id" : ObjectId("4d24109b9ae655253f813f72"), "firstname" : "zhou2", "lastname" : "digoal", "age" : 28 }
# 使用"dropDups" : true选项,重复记录被删除,删除原则是保留第一次搜索到的KEY,后面找到的都被删除.
> db.userinfo.ensureIndex({"lastname" : 1},{"unique" : true,"dropDups" : true})
E11000 duplicate key error index: test.userinfo.$lastname_1 dup key: { : "digoal" }
> db.userinfo.find()
{ "_id" : ObjectId("4d2410609ae655253f813f6d"), "firstname" : "zhou", "lastname" : "digoal", "age" : 27 }
{ "_id" : ObjectId("4d24107c9ae655253f813f6e"), "firstname" : "zhou", "lastname" : "digoal1", "age" : 28 }
{ "_id" : ObjectId("4d2410809ae655253f813f6f"), "firstname" : "zhou", "lastname" : "digoal2", "age" : 29 }

7. 后台建立索引
# 默认情况下,建索引是前台建的,建索引期间阻止读写操作.这是非常暴力的操作,所以自从1.3.2以后mongoDB允许在后台建立索引,
后台建立索引的做法是增量的做法，因此不中断数据库的读写操作。
#　尽管建立索引是后台操作的,但是必须等建完才返回shell.如果在建立索引期间有其他的数据库操作需要另外开启shell
# 在建索引期间,可以从system.indexes中查看到索引的存在,但是知道建完才可以使用这个索引.
# 随时可以使用db.currentOp() 和 db.killOp()来终结这个过程.
# 范例
> db.userinfo.drop()
true
for (i=0;i<1000000;i++) {
db.userinfo.insert({"firstname" : "zhou","lastname" : "digoal" + "." + i,"age" : i,"other" :{"corp" : "sky-mobi" , "city" : "hangzhou"}})
}
> db.userinfo.ensureIndex({"lastname" : 1,"age" : 1},{"background" :true})
# 查看当前进程1645749就是以上执行的操作, "msg" : "bg index build 272821/1000000 27%"显示操作进度
> db.currentOp()
{
        "inprog" : [
                {
                        "opid" : 1645750,
                        "active" : true,
                        "lockType" : "read",
                        "waitingForLock" : false,
                        "secs_running" : 2,
                        "op" : "getmore",
                        "ns" : "local.oplog.$main",
                        "client" : "172.16.3.33:4158",
                        "desc" : "conn"
                },
                {
                        "opid" : 1645749,
                        "active" : true,
                        "lockType" : "write",
                        "waitingForLock" : false,
                        "secs_running" : 3,
                        "op" : "insert",
                        "ns" : "test.system.indexes",
                        "client" : "127.0.0.1:49250",
                        "desc" : "conn",
                        "msg" : "bg index build 272821/1000000 27%"
                }
        ]
}

# 杀op测试
> db.currentOp()
{
        "inprog" : [
                {
                        "opid" : 1645906,
                        "active" : true,
                        "lockType" : "read",
                        "waitingForLock" : false,
                        "secs_running" : 0,
                        "op" : "getmore",
                        "ns" : "local.oplog.$main",
                        "client" : "172.16.3.33:4158",
                        "desc" : "conn"
                },
                {
                        "opid" : 1645905,
                        "active" : true,
                        "lockType" : "write",
                        "waitingForLock" : false,
                        "secs_running" : 1,
                        "op" : "insert",
                        "ns" : "test.system.indexes",
                        "client" : "127.0.0.1:49250",
                        "desc" : "conn",
                        "msg" : "bg index build 145412/1000000 14%"
                }
        ]
}
> db.killOp(1645905)
{ "info" : "attempting to kill op" }
# 被中断
> db.userinfo.ensureIndex({"lastname" : 1,"age" : 1,"other.corp":1},{"background" :true})
interrupted
> db.userinfo.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo",
                "key" : {
                        "_id" : 1
                }
        }
]

8. 删除索引
# 范例
# 删除collection下除_id以外的所有索引.
> db.userinfo.dropIndexes();
{
        "nIndexesWas" : 3,
        "msg" : "non-_id indexes dropped for collection",
        "ok" : 1
}
> db.userinfo.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo",
                "key" : {
                        "_id" : 1
                }
        }
]
# 删除指定索引
> db.userinfo.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo",
                "key" : {
                        "_id" : 1
                }
        },
        {
                "_id" : ObjectId("4d2416ac9ae655253f9081ba"),
                "ns" : "test.userinfo",
                "key" : {
                        "lastname" : 1,
                        "age" : 1,
                        "other.corp" : 1
                },
                "name" : "lastname_1_age_1_other.corp_1",
                "background" : true
        }
]
> db.userinfo.dropIndex({"lastname" : 1,"age" : 1,"other.corp":1})
{ "nIndexesWas" : 2, "ok" : 1 }
> db.userinfo.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo",
                "key" : {
                        "_id" : 1
                }
        }
]
# 使用runCommand删除索引
> db.userinfo.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo",
                "key" : {
                        "_id" : 1
                }
        },
        {
                "_id" : ObjectId("4d2416f49ae655253f9081bb"),
                "ns" : "test.userinfo",
                "key" : {
                        "lastname" : 1,
                        "age" : 1,
                        "other.corp" : 1
                },
                "name" : "lastname_1_age_1_other.corp_1",
                "background" : false
        }
]
> db.runCommand({"dropIndexes" : "userinfo", "index" : {"lastname" : 1,"age" : 1,"other.corp":1}})
{ "nIndexesWas" : 2, "ok" : 1 }
> db.userinfo.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo",
                "key" : {
                        "_id" : 1
                }
        }
]

# 使用$cmd.findOne删除索引
> db.userinfo.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo",
                "key" : {
                        "_id" : 1
                }
        },
        {
                "_id" : ObjectId("4d24176e9ae655253f9081bc"),
                "ns" : "test.userinfo",
                "key" : {
                        "lastname" : 1,
                        "age" : 1,
                        "other.corp" : 1
                },
                "name" : "lastname_1_age_1_other.corp_1",
                "background" : false
        }
]
> db.$cmd.findOne({"dropIndexes" : "userinfo", "index" : {"lastname" : 1,"age" : 1,"other.corp":1}})
{ "nIndexesWas" : 2, "ok" : 1 }
> db.userinfo.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo",
                "key" : {
                        "_id" : 1
                }
        }
]

# 使用runCommand删除除_id以外的索引
> db.runCommand({"dropIndexes" : "userinfo", "index" : "*"})
{
        "nIndexesWas" : 2,
        "msg" : "non-_id indexes dropped for collection",
        "ok" : 1
}
> db.userinfo.getIndexes()
[
        {
                "name" : "_id_",
                "ns" : "test.userinfo",
                "key" : {
                        "_id" : 1
                }
        }
]

9. reIndex
# 使用reIndex重建collection上的所有索引包含_id的索引,重建索引是不建议的操作,建议删除后后台重建
> db.userinfo.reIndex()
{
        "nIndexesWas" : 2,
        "msg" : "indexes dropped for collection",
        "ok" : 1,
        "nIndexes" : 2,
        "indexes" : [
                {
                        "name" : "_id_",
                        "ns" : "test.userinfo",
                        "key" : {
                                "_id" : 1
                        }
                },
                {
                        "_id" : ObjectId("4d24190b9ae655253f9081be"),
                        "ns" : "test.userinfo",
                        "key" : {
                                "lastname" : 1,
                                "age" : 1,
                                "other.corp" : 1
                        },
                        "name" : "lastname_1_age_1_other.corp_1",
                        "background" : false
                }
        ],
        "ok" : 1
}
# 重currentOp可以看出是前台操作,所以是不建议的
> db.currentOp()
{
        "inprog" : [
                {
                        "opid" : 1646209,
                        "active" : true,
                        "lockType" : "read",
                        "waitingForLock" : true,
                        "secs_running" : 15,
                        "op" : "getmore",
                        "ns" : "local.oplog.$main",
                        "client" : "172.16.3.33:4158",
                        "desc" : "conn"
                },
                {
                        "opid" : 1646210,
                        "active" : true,
                        "lockType" : "write",
                        "waitingForLock" : false,
                        "secs_running" : 13,
                        "op" : "query",
                        "ns" : "test",
                        "query" : {
                                "reIndex" : "userinfo"
                        },
                        "client" : "127.0.0.1:49250",
                        "desc" : "conn",
                        "msg" : "index: (2/3) btree bottom up 122806/1000000 12%"
                }
        ]
}

10. Geospatial 空间索引
# 专题介绍,建立在2d基础之上,如查找最近的餐厅,等等.

其他:
1. mongoDB中的执行计划刷新特征.
2. mongoDB中不允许单个KEY建立同样的索引.但是允许有不同的索引.
3. 排序要走复合索引必须与创始时的顺序完全相同或完全相反.
4. When you update an object, if the object fits in its previous allocation area, only those indexes whose keys have changed are updated. This improves performance. Note that if the object has grown and must move, all index keys must then update, which is slower.
5. 关于执行计划缓存,mongodb是这么做的,不过周期没有指出.Hinting is usually unnecessary. MongoDB has a query optimizer and is very clever about choosing which index to use. When you first do a query, the query optimizer tries out a number of query plans concurrently. The first one to finish will be used, and the rest of the query executions are terminated. That query plan will be remembered for future queries on the same keys. The query optimizer periodically retries other plans, in case you’ve added new data and the previously chosen plan is no longer best. The only part you should need to worry about is giving the query optimizer useful indexes to choose from.

注意事项:
1. 不能滥用索引,索引过多会影响写入性能。
2. 使用合适的索引名称长度,在删除索引前最好查询system.indexes确认索引名。
3. 小心使用唯一索引的去重功能。
4. 小心使用前台建索引的场景。
5. 小心使用reIndex().
6. 建好索引后一定要使用explain查看一下是否生效.
7. 唯一索引中null的值不能出现两次,这个和一般的RDBMS不太一样.
8. You may use sort() to return data in order without an index if the data set to be returned is small (less than four megabytes). For these cases it is best to use limit() and sort() together.

时间： 2024-10-26 19:14:05

mongoDB indexing topic - 2

mongoDB indexing topic - 2的相关文章

mongoDB indexing topic - 1

mongoDB Indexing Advice

mongoDB's Geospatial Indexing

Discuz!NT千万级数据量上的两驾马车--TokyoCabinet,MongoDB

TokuMX, MongoDB and InnoDB versus the insert benchmark with disks

mongoose-100分悬赏，Mongodb使用_auth参数启动后，Mongoose应该怎么连接

How Should I Backup MongoDB Databases?

MongoDB World 2017 参会全记录

Using mongoDB's Profiler analyze the performance of database operations