Self-Hosted MongoDB in Practice: MongoDB Sharded Clusters

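    This series of articles introduces MongoDB, a NoSQL product commonly used in day-to-day work.

    The series covers:

    MongoDB installation and basic usage

    MongoDB document queries

    MongoDB replica sets

    Introduction to and deployment of MongoDB sharded clusters

    MongoDB backup and restore

    MongoDB security and encryption

    Introduction to MongoDB Change Streams, with code demos

    Other MongoDB topics

    Across eight articles and nearly fifty thousand words, we walk through the practical, day-to-day side of operating MongoDB.

    You are reading the fourth article in the series: "Introduction to and Deployment of MongoDB Sharded Clusters".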

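    Generally speaking, there are two ways to improve the performance of a database or computing system. The first is simply to replace the current server with a more powerful one, which is usually called vertical scaling (scaling up).

    ►The main drawback of vertical scaling is that it has a ceiling: it cannot grow indefinitely. For example, the hardware may already be at its physical limit, or the cloud provider may not offer a more powerful server.

    The second way is to keep servers of the same capacity and add more of them, which is usually called horizontal scaling (scaling out).

    When the data set is large, we split the data into shards that run on different machines to relieve pressure on CPU, memory, and I/O. Sharding is the database technique for doing this.

    MongoDB sharding is similar in spirit to partitioning in MySQL. The two ways of scaling a database compare as follows:

    • Vertical scaling: add more CPU, memory, disk space, and so on to a single server.
    • Horizontal partitioning: split the data into shards and have the cluster serve them as one logical database.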

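    A MongoDB sharded cluster consists of the following components:

    • shard: each shard holds a subset of the sharded data. Each shard is deployed as a replica set.
    • mongos: mongos acts as the query router, providing the interface between client applications and the sharded cluster. Starting in MongoDB 4.4, mongos supports hedged reads to minimize latency.
    • config servers: the config servers store the cluster's metadata and configuration settings.

    MongoDB shards data at the collection level, distributing a collection's data across the shards in the cluster.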
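
    To make the division of labour between these components concrete, here is a minimal sketch (not MongoDB's actual implementation) of what a router does with chunk metadata: it finds the chunk whose key range contains the shard key value and forwards the operation to the shard that owns that chunk. The chunk table, the numeric shard key, and the shard names shardA and shardB are hypothetical.

```javascript
// Toy model of mongos routing. All names and ranges are illustrative
// assumptions; in a real cluster the chunk metadata lives on the config servers.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 5000, shard: "shardB" },
  { min: 5000, max: Infinity, shard: "shardA" },
];

// Return the shard whose chunk range [min, max) contains the shard key value.
function routeByShardKey(chunkTable, keyValue) {
  const chunk = chunkTable.find((c) => keyValue >= c.min && keyValue < c.max);
  if (!chunk) {
    throw new Error("no chunk covers key " + keyValue);
  }
  return chunk.shard;
}

console.log(routeByShardKey(chunks, 42));    // shardA
console.log(routeByShardKey(chunks, 1000));  // shardB
console.log(routeByShardKey(chunks, 99999)); // shardA
```

    Ranges are half-open, so a boundary value such as 1000 belongs to the chunk that starts at it, which is also how the chunk ranges printed by sh.status() later in this article read.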

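    In production, make sure the data is redundant and the system is highly available. For a production-grade sharded cluster, consider the following:

    • Deploy the config servers as a 3-member replica set
    • Deploy each shard as a 3-member replica set
    • Deploy one or more mongos routers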
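
    Why 3 members? A replica set needs a majority of its voting members to elect a primary. The arithmetic can be sketched as follows; the function names are ours, chosen for illustration. For 3 members the result agrees with the "majorityVoteCount" : 2 and "votingMembersCount" : 3 fields in the rs.status() output shown later in this article.

```javascript
// Votes needed to elect a primary in a replica set with n voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be lost while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [1, 2, 3, 4, 5]) {
  console.log(`members=${n} majority=${majority(n)} tolerates=${faultTolerance(n)}`);
}
```

    Two members tolerate no failure at all, and a fourth member adds nothing over three, which is why odd member counts are the usual choice.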

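    Preparing the Environment

    • Starting the demo:

    If you do not have enough machines while following along, you can run several roles on one machine (on different ports); there is no need for one machine per role.

    • Environment topology:

    We use static name resolution here; if you can, configure the names in a DNS service instead. The /etc/hosts file is as follows: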

    10.20.20.19 mongo01.tyun.cn cfg1.tyun.cn mongos1.tyun.cn
    10.20.20.11 mongo02.tyun.cn cfg2.tyun.cn mongos2.tyun.cn
    10.20.20.41 mongo03.tyun.cn cfg3.tyun.cn mongos3.tyun.cn
    10.20.20.14 mongo04.tyun.cn
    10.20.20.53 mongo05.tyun.cn
    10.20.20.61 mongo06.tyun.cn
    10.20.20.62 mongo07.tyun.cn
    10.20.20.89 mongo08.tyun.cn
    10.20.20.99 mongo09.tyun.cn
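
    Note that the first three entries give one machine several names. The following sketch, which assumes nothing beyond the hosts format shown above, parses a few of these lines and shows that the cfgN and mongosN aliases resolve to the same address as mongoNN, meaning that the config server and mongos roles share those machines. The parseHosts helper is hypothetical.

```javascript
// Parse /etc/hosts-style text into a name -> IP map.
const hostsText = `
10.20.20.19 mongo01.tyun.cn cfg1.tyun.cn mongos1.tyun.cn
10.20.20.11 mongo02.tyun.cn cfg2.tyun.cn mongos2.tyun.cn
10.20.20.41 mongo03.tyun.cn cfg3.tyun.cn mongos3.tyun.cn
10.20.20.14 mongo04.tyun.cn
`;

function parseHosts(text) {
  const table = new Map();
  for (const rawLine of text.split("\n")) {
    const line = rawLine.split("#")[0].trim(); // drop comments and blank lines
    if (line === "") continue;
    const [ip, ...names] = line.split(/\s+/);
    for (const name of names) table.set(name, ip);
  }
  return table;
}

const hosts = parseHosts(hostsText);
console.log(hosts.get("cfg1.tyun.cn"));    // 10.20.20.19
console.log(hosts.get("mongos1.tyun.cn")); // 10.20.20.19
console.log(hosts.get("mongo04.tyun.cn")); // 10.20.20.14
```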

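    Configuring the Config Servers

    01 Prepare the configuration files

    Create the configuration file /etc/mongo-cfg.conf on each of the 3 config nodes with the following content:

    # Configuration file for cfg1.tyun.cn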
    (venv36) [root@mongo01 ~]# cat /etc/mongo-cfg.conf 
    systemLog:
      destination: file
      logAppend: true
      path: /var/log/mongodb/mongo-cfg.log
    storage:
      dbPath: /var/lib/mongocfg
      journal:
        enabled: true
      wiredTiger:
        engineConfig:
          cacheSizeGB: 1
    processManagement:
      fork: true  # fork and run in background
      pidFilePath: /var/run/mongodb/mongo-cfg.pid  # location of pidfile
      timeZoneInfo: /usr/share/zoneinfo
    net:
      port: 27000
      bindIp: cfg1.tyun.cn
    sharding:
      clusterRole: configsvr
    replication:
      replSetName: config
    
    
    # Configuration file for cfg2.tyun.cn
    (venv36) [root@mongo02 ~]# cat /etc/mongo-cfg.conf 
    systemLog:
      destination: file
      logAppend: true
      path: /var/log/mongodb/mongo-cfg.log
    storage:
      dbPath: /var/lib/mongocfg
      journal:
        enabled: true
      wiredTiger:
        engineConfig:
          cacheSizeGB: 1
    processManagement:
      fork: true  # fork and run in background
      pidFilePath: /var/run/mongodb/mongo-cfg.pid  # location of pidfile
      timeZoneInfo: /usr/share/zoneinfo
    net:
      port: 27000
      bindIp: cfg2.tyun.cn
    sharding:
      clusterRole: configsvr
    replication:
      replSetName: config
    
    
    # Configuration file for cfg3.tyun.cn
    (venv36) [root@mongo03 ~]# cat /etc/mongo-cfg.conf 
    systemLog:
      destination: file
      logAppend: true
      path: /var/log/mongodb/mongo-cfg.log
    storage:
      dbPath: /var/lib/mongocfg
      journal:
        enabled: true
      wiredTiger:
        engineConfig:
          cacheSizeGB: 1
    processManagement:
      fork: true  # fork and run in background
      pidFilePath: /var/run/mongodb/mongo-cfg.pid  # location of pidfile
      timeZoneInfo: /usr/share/zoneinfo
    net:
      port: 27000
      bindIp: cfg3.tyun.cn
    sharding:
      clusterRole: configsvr
    replication:
      replSetName: config
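
    The three files above differ only in net.bindIp. As a sketch, the files could therefore be rendered from a single template; the renderCfgConf helper below is hypothetical, and every value in it other than bindIp is copied from the files above.

```javascript
// Render the config-server YAML for one node. Only bindIp varies per node.
function renderCfgConf(bindIp) {
  return [
    "systemLog:",
    "  destination: file",
    "  logAppend: true",
    "  path: /var/log/mongodb/mongo-cfg.log",
    "storage:",
    "  dbPath: /var/lib/mongocfg",
    "  journal:",
    "    enabled: true",
    "  wiredTiger:",
    "    engineConfig:",
    "      cacheSizeGB: 1",
    "processManagement:",
    "  fork: true",
    "  pidFilePath: /var/run/mongodb/mongo-cfg.pid",
    "  timeZoneInfo: /usr/share/zoneinfo",
    "net:",
    "  port: 27000",
    `  bindIp: ${bindIp}`,
    "sharding:",
    "  clusterRole: configsvr",
    "replication:",
    "  replSetName: config",
  ].join("\n") + "\n";
}

for (const n of [1, 2, 3]) {
  const host = `cfg${n}.tyun.cn`;
  console.log(`# ---- /etc/mongo-cfg.conf for ${host} ----`);
  console.log(renderCfgConf(host));
}
```

    Generating the files this way keeps replSetName and clusterRole identical on all three nodes, which the replica set initialization in the next steps depends on.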

    02 Start the Config Servers

    Run the following command on each of the 3 config nodes:

    [root@mongo01 ~]# systemctl start mongocfg
    
    
    [root@mongo02 ~]# systemctl start mongocfg
    
    
    [root@mongo03 ~]# systemctl start mongocfg

    Check that the processes have started successfully:

    (venv36) [root@mongo01 ~]# ansible -i hosts 'cfg' -m shell -a "systemctl status mongocfg" |grep "Active: active (running)"
       Active: active (running) since Fri 2022-08-05 05:24:56 UTC; 1min 4s ago
       Active: active (running) since Fri 2022-08-05 05:25:25 UTC; 35s ago
       Active: active (running) since Fri 2022-08-05 05:25:36 UTC; 24s ago

    03 Initialize the Config Servers

    Log in to the first node. No user or password has been created yet, so you can log in without specifying a password.

    (venv36) [root@mongo01 ~]# mongo cfg1.tyun.cn:27000
    MongoDB shell version v4.4.15
    connecting to: mongodb://cfg1.tyun.cn:27000/test?compressors=disabled&gssapiServiceName=mongodb
    Implicit session: session { "id" : UUID("651fb6a5-9e7e-43f9-91ee-1ae6a2b3365f") }
    MongoDB server version: 4.4.15
    > 
    > show dbs
    
    
    > use test
    switched to db test
    
    
    > db.test.insert({a: 1})
    WriteCommandError({
        "ok" : 0,
        "errmsg" : "command insert requires authentication",
        "code" : 13,
        "codeName" : "Unauthorized"
    })

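    Apart from creating a user, nothing else is permitted. So the first step is to create a user and password (in the admin database, which is the database the next login authenticates against):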

    > db.createUser({user: "root", pwd: "root123", roles: [{role: "root", db: "admin" }]})

    Next, initialize the Config Server replica set:

    (venv36) [root@mongo01 ~]# mongo -u root -p  --host cfg1.tyun.cn:27000 --authenticationDatabase admin
    MongoDB shell version v4.4.15
    Enter password: 
    connecting to: mongodb://cfg1.tyun.cn:27000/?authSource=admin&compressors=disabled&gssapiServiceName=mongodb
    Implicit session: session { "id" : UUID("a1827479-b741-4f8b-be49-5ca0be4852aa") }
    MongoDB server version: 4.4.15
    ---
    The server generated these startup warnings when booting: 
            2022-08-05T06:30:24.135+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
    ---
    ---
            Enable MongoDB's free cloud-based monitoring service, which will then receive and display
            metrics about your deployment (disk utilization, CPU, operation statistics, etc).
    
    
            The monitoring data will be available on a MongoDB website with a unique URL accessible to you
            and anyone you share the URL with. MongoDB may use this information to make product
            improvements and to suggest MongoDB products and deployment options to you.
    
    
            To enable free monitoring, run the following command: db.enableFreeMonitoring()
            To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
    ---
    >
    > rs.initiate({
        _id: "config",
        "members" : [
            {
                "_id": 0,
                "host" : "cfg1.tyun.cn:27000"
            },
            {
                "_id": 1,
                "host" : "cfg2.tyun.cn:27000"
            },
            {
                "_id": 2,
                "host" : "cfg3.tyun.cn:27000"
            }
        ]
    });
    { "ok" : 1 }
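
    The document passed to rs.initiate() has a regular shape: the set name plus one numbered member per host. A small sketch of building it from a host list follows; buildReplSetConfig is a hypothetical helper, and in the mongo shell its result would be passed to rs.initiate().

```javascript
// Build the configuration document for rs.initiate() from a set name and hosts.
function buildReplSetConfig(setName, hosts) {
  if (hosts.length === 0) {
    throw new Error("at least one member is required");
  }
  return {
    _id: setName,
    // Member _id values are simply the position of each host in the list.
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const cfgReplSet = buildReplSetConfig("config", [
  "cfg1.tyun.cn:27000",
  "cfg2.tyun.cn:27000",
  "cfg3.tyun.cn:27000",
]);
console.log(JSON.stringify(cfgReplSet, null, 2));
```

    The set name must match replication.replSetName in the configuration files, which is config here.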

    Wait about 10 seconds while the 3 Config Servers elect a primary.

    Note how the prompt changes:

    config:SECONDARY> 
    config:PRIMARY> 
    ......
    config:PRIMARY> show dbs
    admin   0.000GB
    config  0.000GB
    local   0.000GB
    
    
    config:PRIMARY> use admin
    switched to db admin
    
    
    config:PRIMARY> show users
    {
        "_id" : "admin.admin",
        "userId" : UUID("0c0d5bc1-062c-4204-963f-bba842ffda7d"),
        "user" : "admin",
        "db" : "admin",
        "roles" : [
            {
                "role" : "dbAdminAnyDatabase",
                "db" : "admin"
            },
            {
                "role" : "userAdminAnyDatabase",
                "db" : "admin"
            }
        ],
        "mechanisms" : [
            "SCRAM-SHA-1",
            "SCRAM-SHA-256"
        ]
    }
    {
        "_id" : "admin.root",
        "userId" : UUID("aa54a433-e9a2-452b-bd1d-d6ef54f4a46e"),
        "user" : "root",
        "db" : "admin",
        "roles" : [
            {
                "role" : "root",
                "db" : "admin"
            }
        ],
        "mechanisms" : [
            "SCRAM-SHA-1",
            "SCRAM-SHA-256"
        ]
    }

    At this point, the Config Server setup is complete.

    Configuring the Replica Sets

    For how to configure a replica set, see the Replica Set chapter. The members of shard 1 are:

    • mongo04.tyun.cn:27010
    • mongo05.tyun.cn:27010
    • mongo06.tyun.cn:27010

    Configuring mongos

    01 Prepare the mongos configuration file

    sharding:
      configDB: <configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019
    net:
      bindIp: localhost,<hostname(s)|ip address(es)>

    A more complete configuration file (using mongos1 as the example):

    [root@mongo01 ~]# cat /etc/mongos.conf 
    systemLog:
      destination: file
      logAppend: true
      path: /var/log/mongodb/mongos.log
    processManagement:
      fork: true
      pidFilePath: /var/run/mongodb/mongos.pid
      timeZoneInfo: /usr/share/zoneinfo
    net:
      port: 27017
      bindIp: mongos1.tyun.cn
    # security:
    #   authorization: enabled
    #   keyFile: /etc/mongod.keyfile
    sharding:
      configDB: config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000

    02 Start mongos

    [root@mongo01 ~]# mongos \
      --bind_ip mongos1.tyun.cn \
      --port 27017 \
      --logpath /var/log/mongodb/mongos.log \
      --configdb config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000 \
      --fork
    
    
    [root@mongo02 ~]# mongos \
      --bind_ip mongos2.tyun.cn \
      --port 27017 \
      --logpath /var/log/mongodb/mongos.log \
      --configdb config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000 \
      --fork
    
    
    [root@mongo03 ~]# mongos \
      --bind_ip mongos3.tyun.cn \
      --port 27017 \
      --logpath /var/log/mongodb/mongos.log \
      --configdb config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000 \
      --fork

    mongos can also be started from a configuration file:

    [root@mongo01 ~]# cat /etc/mongos.conf 
    systemLog:
      destination: file
      logAppend: true
      path: /var/log/mongodb/mongos.log
    processManagement:
      fork: true  # fork and run in background
      pidFilePath: /var/run/mongodb/mongos.pid  # location of pidfile
      timeZoneInfo: /usr/share/zoneinfo
    net:
      port: 27017
      bindIp: mongos1.tyun.cn
    security:
      # authorization: enabled
      keyFile: /etc/mongo.keyfile
    sharding:
      configDB: config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000

    The start command is:

    [root@mongo01 ~]# mongos -f /etc/mongos.conf

    Adding shard1 to the Cluster

    Add the first shard, shard1, to the cluster:

    [root@mongo01 ~]# mongo --host mongos1.tyun.cn:27017
    MongoDB shell version v4.4.15
    connecting to: mongodb://mongos1.tyun.cn:27017/?compressors=disabled&gssapiServiceName=mongodb
    Implicit session: session { "id" : UUID("f1ade2c4-c071-4e8a-9fbb-f1093e9d9753") }
    MongoDB server version: 4.4.15
    ---
    The server generated these startup warnings when booting: 
            2022-08-05T09:17:59.537+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
            2022-08-05T09:17:59.537+00:00: You are running this process as the root user, which is not recommended
    ---
    mongos> show dbs
    admin   0.000GB
    config  0.000GB
    mongos> 
    mongos> sh.addShard("shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010");
    {
        "shardAdded" : "shard1",
        "ok" : 1,
        "operationTime" : Timestamp(1659691403, 8),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1659691403, 8),
            "signature" : {
                "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                "keyId" : NumberLong(0)
            }
        }
    }
    
    
    mongos> sh.status()
    --- Sharding Status --- 
      sharding version: {
          "_id" : 1,
          "minCompatibleVersion" : 5,
          "currentVersion" : 6,
          "clusterId" : ObjectId("62ecc377dc19b0487fcd62e6")
      }
      shards:
            {  "_id" : "shard1",  "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",  "state" : 1 }
      active mongoses:
            "4.4.15" : 3
      autosplit:
            Currently enabled: yes
      balancer:
            Currently enabled:  yes
            Currently running:  no
            Failed balancer rounds in last 5 attempts:  0
            Migration Results for the last 24 hours: 
                    No recent migrations
      databases:
            {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
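
    Both sh.addShard() and the configDB setting take the same kind of string: a replica set name, a slash, and a comma-separated list of host:port members. The sketch below splits such a string into its parts to make the structure explicit; parseSeedString is a hypothetical helper, not a MongoDB API.

```javascript
// Split "replSetName/host1:port,host2:port" into a set name and member list.
function parseSeedString(seed) {
  const slash = seed.indexOf("/");
  if (slash <= 0) {
    throw new Error("expected <replSetName>/<host:port,...>");
  }
  const setName = seed.slice(0, slash);
  const members = seed
    .slice(slash + 1)
    .split(",")
    .map((entry) => {
      const [host, port] = entry.split(":");
      return { host, port: Number(port) };
    });
  return { setName, members };
}

const shard1Seed = parseSeedString(
  "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010"
);
console.log(shard1Seed.setName);        // shard1
console.log(shard1Seed.members.length); // 3
console.log(shard1Seed.members[0].host, shard1Seed.members[0].port);
```

    The part before the slash has to be the actual replica set name of the shard, which is why the shard appears as shard1 in the sh.status() output above.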

    Creating a Sharded Collection

    Next we create a test database named test, enable sharding on it, and shard the collection shard in that database.

    mongos> sh.enableSharding("test");
    {
        "ok" : 1,
        "operationTime" : Timestamp(1659755432, 7),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1659755432, 7),
            "signature" : {
                "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                "keyId" : NumberLong(0)
            }
        }
    }
    
    
    mongos> sh.shardCollection("test.shard", {_id: 'hashed'});
    {
        "collectionsharded" : "test.shard",
        "collectionUUID" : UUID("329f4308-bff9-453a-bec2-7f3a757d95dd"),
        "ok" : 1,
        "operationTime" : Timestamp(1659755452, 13),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1659755452, 13),
            "signature" : {
                "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                "keyId" : NumberLong(0)
            }
        }
    }
    
    
    mongos> sh.status()
    --- Sharding Status --- 
      sharding version: {
          "_id" : 1,
          "minCompatibleVersion" : 5,
          "currentVersion" : 6,
          "clusterId" : ObjectId("62ecc377dc19b0487fcd62e6")
      }
      shards:
            {  "_id" : "shard1",  "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",  "state" : 1 }
      active mongoses:
            "4.4.15" : 3
      autosplit:
            Currently enabled: yes
      balancer:
            Currently enabled:  yes
            Currently running:  no
            Failed balancer rounds in last 5 attempts:  0
            Migration Results for the last 24 hours: 
                    No recent migrations
      databases:
            {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                    config.system.sessions
                            shard key: { "_id" : 1 }
                            unique: false
                            balancing: true
                            chunks:
                                    shard1    1024
                            too many chunks to print, use verbose if you want to force print
            {  "_id" : "test",  "primary" : "shard1",  "partitioned" : true,  "version" : {  "uuid" : UUID("8c333889-11b2-4de0-9f54-f0c56b622124"),  "lastMod" : 1 } }
                    test.shard
                            shard key: { "_id" : "hashed" }
                            unique: false
                            balancing: true
                            chunks:
                                    shard1    2 // note this output
                            { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong(0) } on : shard1 Timestamp(1, 0) 
                            { "_id" : NumberLong(0) } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(1, 1)

    We can see that shard1 holds 2 chunks.
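
    The two chunks split the signed 64-bit hash space at 0: one covers [MinKey, 0) and the other [0, MaxKey). The sketch below illustrates how hashed keys spread over these two ranges. It is an illustration only: mix64 is a splitmix64-style stand-in, not the hash function MongoDB uses for hashed indexes, and the integer inputs stand in for the ObjectId values of _id.

```javascript
const MASK64 = (1n << 64n) - 1n;

// Stand-in 64-bit mixer (assumption: NOT MongoDB's hashed-index function).
function mix64(n) {
  let z = (BigInt(n) + 0x9e3779b97f4a7c15n) & MASK64;
  z = ((z ^ (z >> 30n)) * 0xbf58476d1ce4e5b9n) & MASK64;
  z = ((z ^ (z >> 27n)) * 0x94d049bb133111ebn) & MASK64;
  z = z ^ (z >> 31n);
  return BigInt.asIntN(64, z); // reinterpret as signed, like NumberLong
}

// The two initial chunks printed by sh.status().
const hashedChunks = [
  { name: "[MinKey, 0)", contains: (h) => h < 0n },
  { name: "[0, MaxKey)", contains: (h) => h >= 0n },
];

function chunkForHash(h) {
  return hashedChunks.find((c) => c.contains(h)).name;
}

const perChunk = { "[MinKey, 0)": 0, "[0, MaxKey)": 0 };
for (let i = 0; i < 100000; i++) {
  perChunk[chunkForHash(mix64(i))] += 1;
}
console.log(perChunk); // the two counts should be close to each other
```

    Because the hash scatters even sequential keys across the whole range, monotonically increasing values such as ObjectIds do not pile up in a single chunk.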

    Insert test data:

    mongos> use test
    switched to db test
    
    
    mongos> for (var i = 0; i < 100000; i++) {
        db.shard.insert({i: i});
    }
    
    
    mongos> db.shard.find().limit(10)
    { "_id" : ObjectId("62eddc26f659b8344f42c837"), "i" : 0 }
    { "_id" : ObjectId("62eddc26f659b8344f42c838"), "i" : 1 }
    { "_id" : ObjectId("62eddc26f659b8344f42c839"), "i" : 2 }
    { "_id" : ObjectId("62eddc26f659b8344f42c83a"), "i" : 3 }
    { "_id" : ObjectId("62eddc26f659b8344f42c83b"), "i" : 4 }
    { "_id" : ObjectId("62eddc26f659b8344f42c83c"), "i" : 5 }
    { "_id" : ObjectId("62eddc26f659b8344f42c83d"), "i" : 6 }
    { "_id" : ObjectId("62eddc26f659b8344f42c83e"), "i" : 7 }
    { "_id" : ObjectId("62eddc26f659b8344f42c83f"), "i" : 8 }
    { "_id" : ObjectId("62eddc26f659b8344f42c840"), "i" : 9 }

    At this point we can also log in to the shard1 replica set and look at the data (find the primary and log in to it):

    [root@mongo01 ~]# mongo --host mongo05.tyun.cn:27010
    MongoDB shell version v4.4.15
    connecting to: mongodb://mongo05.tyun.cn:27010/?compressors=disabled&gssapiServiceName=mongodb
    Implicit session: session { "id" : UUID("b14a4b9b-f6b9-48d5-980a-a7fd3bbf2d73") }
    MongoDB server version: 4.4.15
    ---
    shard1:PRIMARY> show dbs
    admin   0.000GB
    config  0.000GB
    local   0.004GB
    test    0.006GB
    shard1:PRIMARY> use test
    switched to db test
    shard1:PRIMARY> db.shard
    db.shard
    shard1:PRIMARY> db.shard.find().limit(6)
    { "_id" : ObjectId("62eddc26f659b8344f42c837"), "i" : 0 }
    { "_id" : ObjectId("62eddc26f659b8344f42c838"), "i" : 1 }
    { "_id" : ObjectId("62eddc26f659b8344f42c839"), "i" : 2 }
    { "_id" : ObjectId("62eddc26f659b8344f42c83a"), "i" : 3 }
    { "_id" : ObjectId("62eddc26f659b8344f42c83b"), "i" : 4 }
    { "_id" : ObjectId("62eddc26f659b8344f42c83c"), "i" : 5 }
    shard1:PRIMARY>

    Adding shard2 to the Cluster

    For how to configure a replica set, see the Replica Set chapter. The members of shard 2 are:

    • mongo07.tyun.cn:27010
    • mongo08.tyun.cn:27010
    • mongo09.tyun.cn:27010

    Verify the shard2 replica set:

    [root@mongo01 ~]# mongo --host mongo07.tyun.cn:27010
    
    
    shard2:PRIMARY> rs.status()
    {
        "set" : "shard2",
        "date" : ISODate("2022-08-06T03:31:26.564Z"),
        "myState" : 1,
        "term" : NumberLong(1),
        "syncSourceHost" : "",
        "syncSourceId" : -1,
        "heartbeatIntervalMillis" : NumberLong(2000),
        "majorityVoteCount" : 2,
        "writeMajorityCount" : 2,
        "votingMembersCount" : 3,
        "writableVotingMembersCount" : 3,
        "optimes" : {
            "lastCommittedOpTime" : {
                "ts" : Timestamp(1659756685, 1),
                "t" : NumberLong(1)
            },
            "lastCommittedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
            "readConcernMajorityOpTime" : {
                "ts" : Timestamp(1659756685, 1),
                "t" : NumberLong(1)
            },
            "readConcernMajorityWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
            "appliedOpTime" : {
                "ts" : Timestamp(1659756685, 1),
                "t" : NumberLong(1)
            },
            "durableOpTime" : {
                "ts" : Timestamp(1659756685, 1),
                "t" : NumberLong(1)
            },
            "lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
            "lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z")
        },
        "lastStableRecoveryTimestamp" : Timestamp(1659756625, 4),
        "electionCandidateMetrics" : {
            "lastElectionReason" : "electionTimeout",
            "lastElectionDate" : ISODate("2022-08-06T03:30:25.877Z"),
            "electionTerm" : NumberLong(1),
            "lastCommittedOpTimeAtElection" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "lastSeenOpTimeAtElection" : {
                "ts" : Timestamp(1659756615, 1),
                "t" : NumberLong(-1)
            },
            "numVotesNeeded" : 2,
            "priorityAtElection" : 1,
            "electionTimeoutMillis" : NumberLong(10000),
            "numCatchUpOps" : NumberLong(0),
            "newTermStartDate" : ISODate("2022-08-06T03:30:25.915Z"),
            "wMajorityWriteAvailabilityDate" : ISODate("2022-08-06T03:30:26.890Z")
        },
        "members" : [
            {
                "_id" : 0,
                "name" : "mongo07.tyun.cn:27010",
                "health" : 1,
                "state" : 1,
                "stateStr" : "PRIMARY",
                "uptime" : 213,
                "optime" : {
                    "ts" : Timestamp(1659756685, 1),
                    "t" : NumberLong(1)
                },
                "optimeDate" : ISODate("2022-08-06T03:31:25Z"),
                "lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
                "lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
                "syncSourceHost" : "",
                "syncSourceId" : -1,
                "infoMessage" : "",
                "electionTime" : Timestamp(1659756625, 1),
                "electionDate" : ISODate("2022-08-06T03:30:25Z"),
                "configVersion" : 1,
                "configTerm" : -1,
                "self" : true,
                "lastHeartbeatMessage" : ""
            },
            {
                "_id" : 1,
                "name" : "mongo08.tyun.cn:27010",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 71,
                "optime" : {
                    "ts" : Timestamp(1659756675, 1),
                    "t" : NumberLong(1)
                },
                "optimeDurable" : {
                    "ts" : Timestamp(1659756675, 1),
                    "t" : NumberLong(1)
                },
                "optimeDate" : ISODate("2022-08-06T03:31:15Z"),
                "optimeDurableDate" : ISODate("2022-08-06T03:31:15Z"),
                "lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
                "lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
                "lastHeartbeat" : ISODate("2022-08-06T03:31:25.890Z"),
                "lastHeartbeatRecv" : ISODate("2022-08-06T03:31:24.933Z"),
                "pingMs" : NumberLong(0),
                "lastHeartbeatMessage" : "",
                "syncSourceHost" : "mongo07.tyun.cn:27010",
                "syncSourceId" : 0,
                "infoMessage" : "",
                "configVersion" : 1,
                "configTerm" : -1
            },
            {
                "_id" : 2,
                "name" : "mongo09.tyun.cn:27010",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 71,
                "optime" : {
                    "ts" : Timestamp(1659756675, 1),
                    "t" : NumberLong(1)
                },
                "optimeDurable" : {
                    "ts" : Timestamp(1659756675, 1),
                    "t" : NumberLong(1)
                },
                "optimeDate" : ISODate("2022-08-06T03:31:15Z"),
                "optimeDurableDate" : ISODate("2022-08-06T03:31:15Z"),
                "lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
                "lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
                "lastHeartbeat" : ISODate("2022-08-06T03:31:25.890Z"),
                "lastHeartbeatRecv" : ISODate("2022-08-06T03:31:24.872Z"),
                "pingMs" : NumberLong(0),
                "lastHeartbeatMessage" : "",
                "syncSourceHost" : "mongo07.tyun.cn:27010",
                "syncSourceId" : 0,
                "infoMessage" : "",
                "configVersion" : 1,
                "configTerm" : -1
            }
        ],
        "ok" : 1,
        "$clusterTime" : {
            "clusterTime" : Timestamp(1659756685, 1),
            "signature" : {
                "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                "keyId" : NumberLong(0)
            }
        },
        "operationTime" : Timestamp(1659756685, 1)
    }
    shard2:PRIMARY>

    Next, add shard2 to the cluster (connect to any one of the mongos instances):

    [root@mongo01 ~]# mongo --host mongos1.tyun.cn:27017
    MongoDB shell version v4.4.15
    connecting to: mongodb://mongos1.tyun.cn:27017/?compressors=disabled&gssapiServiceName=mongodb
    Implicit session: session { "id" : UUID("1bb0a6ed-dad1-4440-95cb-2f60e0be506f") }
    MongoDB server version: 4.4.15
    ---
    The server generated these startup warnings when booting: 
            2022-08-05T09:17:59.537+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
            2022-08-05T09:17:59.537+00:00: You are running this process as the root user, which is not recommended
    ---
    mongos> 
    mongos> sh.addShard("shard2/mongo07.tyun.cn:27010,mongo08.tyun.cn:27010,mongo09.tyun.cn:27010");
    {
        "shardAdded" : "shard2",
        "ok" : 1,
        "operationTime" : Timestamp(1659756859, 4),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1659756859, 4),
            "signature" : {
                "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                "keyId" : NumberLong(0)
            }
        }
    }
    
    
    mongos> sh.status()
    --- Sharding Status --- 
      sharding version: {
          "_id" : 1,
          "minCompatibleVersion" : 5,
          "currentVersion" : 6,
          "clusterId" : ObjectId("62ecc377dc19b0487fcd62e6")
      }
      shards:
            {  "_id" : "shard1",  "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",  "state" : 1 }
            {  "_id" : "shard2",  "host" : "shard2/mongo07.tyun.cn:27010,mongo08.tyun.cn:27010,mongo09.tyun.cn:27010",  "state" : 1 }
      active mongoses:
            "4.4.15" : 3
      autosplit:
            Currently enabled: yes
      balancer:
            Currently enabled:  yes
            Currently running:  no
            Failed balancer rounds in last 5 attempts:  0
            Migration Results for the last 24 hours: 
                    31 : Success
      databases:
            {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                    config.system.sessions
                            shard key: { "_id" : 1 }
                            unique: false
                            balancing: true
                            chunks:
                                    shard1    994
                                    shard2    30
                            too many chunks to print, use verbose if you want to force print
            {  "_id" : "test",  "primary" : "shard1",  "partitioned" : true,  "version" : {  "uuid" : UUID("8c333889-11b2-4de0-9f54-f0c56b622124"),  "lastMod" : 1 } }
                    test.shard
                            shard key: { "_id" : "hashed" }
                            unique: false
                            balancing: true
                            chunks:
                                    shard1    1
                                    shard2    1
                            { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong(0) } on : shard2 Timestamp(2, 0) 
                            { "_id" : NumberLong(0) } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(2, 1) 
    mongos>

    The output shows that one of the 2 chunks of test.shard that were on shard1 has been migrated to shard2. This is MongoDB's automatic balancer at work.
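
    A simplified model of the balancer explains both chunk counts in the output. It rests on two assumptions that should be checked against the documentation for your version: a balancing round starts only when the chunk-count difference between shards reaches a threshold that depends on the total chunk count (2, 4, or 8 in releases before 6.0), and once started it continues until the difference is below 2. All function names are hypothetical.

```javascript
// Assumed migration thresholds for MongoDB releases before 6.0.
function migrationThreshold(totalChunks) {
  if (totalChunks < 20) return 2;
  if (totalChunks < 80) return 4;
  return 8;
}

// Find the fullest and emptiest shard in a { shardName: chunkCount } map.
function extremes(chunksPerShard) {
  const names = Object.keys(chunksPerShard);
  const most = names.reduce((a, b) => (chunksPerShard[b] > chunksPerShard[a] ? b : a));
  const least = names.reduce((a, b) => (chunksPerShard[b] < chunksPerShard[a] ? b : a));
  return { most, least };
}

// Balance one collection in place; returns the number of chunk migrations.
function balance(chunksPerShard) {
  const total = Object.values(chunksPerShard).reduce((a, b) => a + b, 0);
  let { most, least } = extremes(chunksPerShard);
  if (chunksPerShard[most] - chunksPerShard[least] < migrationThreshold(total)) {
    return 0; // imbalance too small to start a round
  }
  let migrations = 0;
  while (chunksPerShard[most] - chunksPerShard[least] >= 2) {
    chunksPerShard[most] -= 1;
    chunksPerShard[least] += 1;
    migrations += 1;
    ({ most, least } = extremes(chunksPerShard));
  }
  return migrations;
}

const testShardChunks = { shard1: 2, shard2: 0 };
console.log(balance(testShardChunks), testShardChunks);

const sessionChunks = { shard1: 1024, shard2: 0 };
console.log(balance(sessionChunks), sessionChunks);
```

    In the sh.status() output above, config.system.sessions is still mid-way through this process (994 and 30 chunks), and the "31 : Success" line is consistent with 30 migrated session chunks plus the 1 chunk of test.shard.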

    How many documents does each shard hold?

    mongos> status = db.shard.stats()
    
    
    // Document count on shard1
    mongos> status.shards.shard1.count
    50184
    
    
    // Document count on shard2
    mongos> status.shards.shard2.count
    49816
    
    
    // Difference between the document counts of the two shards
    mongos> status.shards.shard1.count - status.shards.shard2.count
    368

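    Judging by the document counts on the two shards, the data is distributed fairly evenly: the difference of 368 is under 0.4% of the 100,000 documents.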

    mongos> use admin
    switched to db admin
    mongos> db.runCommand({listShards: 1})
    {
        "shards" : [
            {
                "_id" : "shard1",
                "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",
                "state" : 1
            },
            {
                "_id" : "shard2",
                "host" : "shard2/mongo07.tyun.cn:27010,mongo08.tyun.cn:27010,mongo09.tyun.cn:27010",
                "state" : 1
            }
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1660384940, 3),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1660384940, 3),
            "signature" : {
                "hash" : BinData(0,"kAzOU7gYu5MWoNSYPEZanw1KYd4="),
                "keyId" : NumberLong("7128287226089177110")
            }
        }
    }

    Removing a shard:

    mongos> db.adminCommand( { removeShard: "shard2" } )
    {
        "msg" : "draining started successfully",
        "state" : "started",
        "shard" : "shard2",
        "note" : "you need to drop or movePrimary these databases",
        "dbsToMove" : [
            "testdb"
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1660384982, 2),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1660384982, 2),
            "signature" : {
                "hash" : BinData(0,"ToGrQJZSWqSfiFwe/Hop2eykOAM="),
                "keyId" : NumberLong("7128287226089177110")
            }
        }
    }

    Check the draining status:

    mongos> db.adminCommand( { removeShard: "shard2" } )
    {
        "msg" : "draining ongoing",
        "state" : "ongoing", // in progress
        "remaining" : {
            "chunks" : NumberLong(406),  // chunks still to be migrated
            "dbs" : NumberLong(1),
            "jumboChunks" : NumberLong(0)
        },
        "note" : "you need to drop or movePrimary these databases",
        "dbsToMove" : [
            "testdb"
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1660385198, 21),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1660385198, 21),
            "signature" : {
                "hash" : BinData(0,"HVDmppA+MhUor9a72JKDjWErLKo="),
                "keyId" : NumberLong("7128287226089177110")
            }
        }
    }
    
    
    // Check again
    mongos> db.adminCommand( { removeShard: "shard2" } )
    {
        "msg" : "draining ongoing",
        "state" : "ongoing",
        "remaining" : {
            "chunks" : NumberLong(345),  // fewer chunks remain
            "dbs" : NumberLong(1),
            "jumboChunks" : NumberLong(0)
        },
        "note" : "you need to drop or movePrimary these databases",
        "dbsToMove" : [
            "testdb"
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1660385328, 3),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1660385328, 3),
            "signature" : {
                "hash" : BinData(0,"Wi6BxDNErUjsHYTdVpvbiEyGUrw="),
                "keyId" : NumberLong("7128287226089177110")
            }
        }
    }
    
    
    // Check again after a while
    mongos> db.adminCommand( { removeShard: "shard2" } )
    {
        "msg" : "draining ongoing",
        "state" : "ongoing",
        "remaining" : {
            "chunks" : NumberLong(87),  // fewer still
            "dbs" : NumberLong(1),
            "jumboChunks" : NumberLong(0)
        },
        "note" : "you need to drop or movePrimary these databases",
        "dbsToMove" : [
            "testdb"
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1660385870, 3),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1660385870, 6),
            "signature" : {
                "hash" : BinData(0,"R5LJzYTNv+s+aJaiJZVZ9arr+84="),
                "keyId" : NumberLong("7128287226089177110")
            }
        }
    }

    Move the database's primary shard to another shard (in this transcript, testdb is moved to shard0):

    mongos> db.adminCommand( { movePrimary: "testdb", to: "shard0" })
    {
        "ok" : 1,
        "operationTime" : Timestamp(1660386323, 42852),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1660386323, 42852),
            "signature" : {
                "hash" : BinData(0,"wpJWCc5pzEghDEgRjXl9NiA9Gxs="),
                "keyId" : NumberLong("7128287226089177110")
            }
        }
    }
    
    
    // Check the status again
    mongos> db.adminCommand( { removeShard: "shard2" } )
    {
        "msg" : "removeshard completed successfully",
        "state" : "completed",
        "shard" : "shard2",
        "ok" : 1,
        "operationTime" : Timestamp(1660386353, 3),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1660386353, 3),
            "signature" : {
                "hash" : BinData(0,"EoqSZ6a4MbSrQcBHH6rVAI1DtyA="),
                "keyId" : NumberLong("7128287226089177110")
            }
        }
    }
    
    
    mongos> db.runCommand({listShards: 1})
    {
        "shards" : [
            {
                "_id" : "shard1",
                "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",
                "state" : 1
            },
            {
                "_id" : "shard0",
                "host" : "shard0/mongo01.tyun.cn:27010,mongo02.tyun.cn:27010,mongo03.tyun.cn:27010",
                "state" : 1
            }
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1660386367, 23),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1660386367, 23),
            "signature" : {
                "hash" : BinData(0,"yBy7UjBzOh1RIbm4fj/q+Docptg="),
                "keyId" : NumberLong("7128287226089177110")
            }
        }
    }
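
    The removal procedure above amounts to a small state machine driven by the replies to removeShard. The sketch below decides the next step from one reply; it uses only the fields visible in the transcripts (ok, state, remaining.chunks, dbsToMove), with plain numbers standing in for NumberLong. The nextStep helper and its return strings are hypothetical, and waiting for the chunk count to reach 0 before running movePrimary is our reading of the recommended order.

```javascript
// Decide what to do next from a removeShard reply.
function nextStep(reply) {
  if (reply.ok !== 1) return "error: inspect the reply";
  if (reply.state === "completed") return "done";
  const chunksLeft = reply.remaining ? Number(reply.remaining.chunks) : null;
  const dbs = reply.dbsToMove || [];
  if (reply.state === "ongoing" && chunksLeft === 0 && dbs.length > 0) {
    // All chunks are drained; only databases whose primary is this shard remain.
    return "run movePrimary for: " + dbs.join(", ");
  }
  return "wait, then run removeShard again";
}

const replies = [
  { ok: 1, state: "started", dbsToMove: ["testdb"] },
  { ok: 1, state: "ongoing", remaining: { chunks: 406, dbs: 1 }, dbsToMove: ["testdb"] },
  { ok: 1, state: "ongoing", remaining: { chunks: 0, dbs: 1 }, dbsToMove: ["testdb"] },
  { ok: 1, state: "completed" },
];
for (const reply of replies) {
  console.log(reply.state, "->", nextStep(reply));
}
```

    The key point is that removeShard is both the trigger and the status query: the same command is re-run until it reports completed.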

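    Summary

    Sharding offers a great deal of flexibility.

    However, some operations are subject to restrictions in a sharded cluster.

    The most important ones are highlighted in the list below: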

    01

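    The group() command does not work on sharded collections. Use aggregate() and the aggregation framework, or mapReduce(), instead. (group() was removed altogether in MongoDB 4.2.)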

    02

    The db.eval() command does not work; for security reasons it should be disabled in most cases anyway. (It was removed in MongoDB 4.2.)

    03

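    The $isolated option of update operations does not work; this feature is missing in sharded environments. The $isolated option of update() guarantees that, when several documents are updated at once, other readers and writers never see some documents with the new values while others still have the old ones. In an unsharded environment this is implemented by holding a global write lock and/or serializing operations onto a single thread, so that no other thread or operation can access the documents affected by the update(). Such an implementation is not performant and allows no concurrency, which rules out the $isolated operator in a sharded environment. (The operator was removed in MongoDB 4.0.)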

    04

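    The $snapshot query operator is not supported. In a find() cursor, $snapshot prevents a document from appearing more than once in the results because an update moved it to a different location on disk. The operator is expensive and is usually not a hard requirement. The alternative is to query through an index on a field whose key does not change for the duration of the query.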

    05

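    If the index used by a query does not contain the shard key, the index cannot cover the query: in a sharded environment the results will be read from disk rather than from the index alone. The only exception is a query that filters only on the built-in _id field and returns only the _id field; in that case MongoDB can still cover the query with the built-in index.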

    06

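    update() and remove() behave differently. In a sharded environment these operations should include the _id or the shard key of the documents to be affected (the single-document forms, multi: false and justOne, are required to); otherwise the mongos router cannot target a single shard and has to send the operation to every shard, which is expensive.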
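
    The difference between a targeted and a broadcast operation can be sketched with a simple check on the filter. This is a deliberate simplification: it only tests whether every shard key field is present in the filter, whereas a real router also looks at the kind of condition. The isTargeted helper is hypothetical; the shard key ["_id"] is the one used for test.shard in this article.

```javascript
// True when the filter mentions every shard key field, so that a router could
// in principle narrow the operation down to specific shards.
function isTargeted(filter, shardKeyFields) {
  return shardKeyFields.every((field) =>
    Object.prototype.hasOwnProperty.call(filter, field)
  );
}

function describe(filter, shardKeyFields) {
  return isTargeted(filter, shardKeyFields)
    ? "targeted: can be routed by shard key"
    : "broadcast: must be sent to every shard";
}

const shardKey = ["_id"];
console.log(describe({ _id: "62eddc26f659b8344f42c837" }, shardKey));
console.log(describe({ i: 5 }, shardKey));
```

    For the test.shard collection, a filter on i alone tells the router nothing about where the document lives, so every shard has to be asked.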

    07

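    A unique index across shards must contain the shard key as a prefix of the index. In other words, to enforce uniqueness across shards, the index has to follow the same key that MongoDB uses to distribute the data.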
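
    The prefix rule is easy to state as code. In the sketch below, keys are given as ordered lists of field names; the field names region and email are hypothetical examples, and shardKeyIsPrefix is our own helper rather than a MongoDB API.

```javascript
// True when the shard key fields are the leading fields of the index,
// in the same order.
function shardKeyIsPrefix(shardKeyFields, indexFields) {
  if (shardKeyFields.length > indexFields.length) return false;
  return shardKeyFields.every((field, i) => indexFields[i] === field);
}

console.log(shardKeyIsPrefix(["region"], ["region", "email"])); // true
console.log(shardKeyIsPrefix(["region"], ["email", "region"])); // false
console.log(shardKeyIsPrefix(["region"], ["email"]));           // false
```

    With the shard key in front, all documents that could collide on the unique index are guaranteed to live on the same shard, so a single shard can enforce the constraint locally.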

    08

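    In MongoDB 4.2 and earlier, a shard key cannot exceed 512 bytes; MongoDB 4.4, the version used in this article, removes that limit. The shard key index must be an ascending index on the shard key fields, optionally followed by other fields, or a hashed index on the shard key.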

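    Finally, running a MongoDB sharded cluster in a public cloud is very expensive; we gave a rough cost estimate in the first article of this series. The next article covers MongoDB backup and restore, a topic that operations engineers care a great deal about.

    Stay tuned, and thank you all.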

Serving every customer with expertise, so that enterprise IT pays only for results and security
