MariaDB backup tool Mariabackup failing with error
We recently upgraded from MariaDB 5.5 to 10.2 and switched from innobackupex to Mariabackup (a fork of xtrabackup). Every attempt at a full backup fails.
I am running the backup with:
sudo mariabackup --backup --target-dir /mnt/database_backups/test --user backups --password REDACTED
The command output is as follows:
180501 11:53:30 Connecting to MySQL server host: localhost, user: backups, password: set, port: 3306, socket: /var/run/mysqld/mysqld.sock
Using server version 10.2.14-MariaDB-10.2.14+maria~trusty-log
mariabackup based on MariaDB server 10.2.14-MariaDB debian-linux-gnu (x86_64)
mariabackup: uses posix_fadvise().
mariabackup: cd to /var/lib/mysql/
mariabackup: open files limit requested 0, set to 1024
mariabackup: using the following InnoDB configuration:
mariabackup: innodb_data_home_dir = .
mariabackup: innodb_data_file_path = ibdata1:12M:autoextend
mariabackup: innodb_log_group_home_dir = ./
mariabackup: using O_DIRECT
2018-05-01 11:53:30 140057835345792 [Note] InnoDB: Number of pools: 1
mariabackup: Generating a list of tablespaces
2018-05-01 11:53:30 140057835345792 [Warning] InnoDB: Allocated tablespace ID 2997 for warehouse/warehouses, old maximum was 0
180501 11:53:34 >> log scanned up to (2154583932391)
180501 11:53:34 [01] Copying ./ibdata1 to /mnt/database_backups/test/ibdata1
180501 11:53:35 >> log scanned up to (2154583953963)
180501 11:53:35 [01] ...done
180501 11:53:35 [01] Copying ./warehouse/warehouses.ibd to /mnt/database_backups/test/warehouse/warehouses.ibd
180501 11:53:35 [01] ...done
-- MORE Copying... ...done lines
180501 12:09:59 [01] Copying ./vioadmin/amazon__product_blacklist.ibd to /mnt/database_backups/test/vioadmin/amazon__product_blacklist.ibd
180501 12:09:59 [01] ...done
180501 12:09:59 Executing FLUSH NO_WRITE_TO_BINLOG TABLES...
180501 12:09:59 Executing FLUSH TABLES WITH READ LOCK...
180501 12:09:59 Starting to backup non-InnoDB tables and files
180501 12:09:59 [01] Copying ./warehouse/warehouse_actions_archive.frm to /mnt/database_backups/test/warehouse/warehouse_actions_archive.frm
180501 12:09:59 [01] ...done
-- MORE Copying... ...done lines
180501 12:10:01 Finished backing up non-InnoDB tables and files
180501 12:10:01 [01] Copying aria_log_control to /mnt/database_backups/test/aria_log_control
180501 12:10:01 [01] ...done
180501 12:10:01 [01] Copying aria_log.00000001 to /mnt/database_backups/test/aria_log.00000001
180501 12:10:01 [01] ...done
180501 12:10:01 [00] Writing xtrabackup_binlog_info
180501 12:10:01 [00] ...done
180501 12:10:01 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
mariabackup: The latest check point (for incremental): '2154617240383'
mariabackup: Stopping log copying thread.
2018-05-01 12:10:01 140057835345792 [Note] InnoDB: Read redo log up to LSN=2154583953920
180501 12:10:01 >> log scanned up to (2154583953920)
180501 12:10:01 Executing UNLOCK TABLES
180501 12:10:01 All tables unlocked
180501 12:10:01 [00] Copying ib_buffer_pool to /mnt/database_backups/test/ib_buffer_pool
180501 12:10:01 [00] ...done
180501 12:10:01 Backup created in directory '/mnt/database_backups/test/'
MySQL binlog position: filename 'mariadb-bin.005485', position '28883329', GTID of the last change '0-1-241386'
180501 12:10:01 [00] Writing backup-my.cnf
180501 12:10:01 [00] ...done
180501 12:10:01 [00] Writing xtrabackup_info
180501 12:10:01 [00] ...done
mariabackup: Redo log (from LSN 2154583538384 to 2154583953920) was copied.
mariabackup: error: failed to copy enough redo log (LSN=2154583953920; checkpoint LSN=2154617240383).
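Can anyone shed light on what is going wrong? Our data directory is about 94G, so the database is fairly large, and as you can see the backup takes about 17 minutes. That is similar to what we saw previously with innobackupex.
As you can see from the log above, some lines during the backup start with Executing FLUSH NO_WRITE_TO_BINLOG TABLES. I am not sure whether these are normal or a sign of a problem, but it does look a bit odd to find them scattered among hundreds of Copying lines. The tables listed after that point are in fact all InnoDB, even though the log says Starting to backup non-InnoDB tables and files.
Thanks for your help.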
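I created Mariabackup 10.2. Its redo log parsing code differs from that of Percona xtrabackup and Mariabackup 10.1.
Could you share the full log so that we can find the exact cause of the failure? It would be most convenient for us if you filed a new MDEV bug at https://jira.mariadb.org/, shared the details there, and posted a link to the issue here.
I have two hypotheses. Either the Copy_online phase of the redo log stopped too early because of a bug, or a lot of InnoDB background activity caused redo log to be written after FLUSH TABLES WITH READ LOCK was issued near the end of the backup.
Either way, the Copy_last phase apparently could not copy the remaining log, because the circular redo log files had been overwritten in the meantime.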
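To put a number on how much redo log was missed, the two LSNs in the final error line can be subtracted. The following is only a rough sketch: redo_log_gap is a hypothetical helper written for this post and is not part of Mariabackup, and it assumes the error line has exactly the format shown in the question. Since an InnoDB LSN is a byte offset into the redo log stream, the difference is the amount of log that was not copied.

```python
import re

# Final error line copied verbatim from the log in the question.
ERROR_LINE = (
    "mariabackup: error: failed to copy enough redo log "
    "(LSN=2154583953920; checkpoint LSN=2154617240383)."
)


def redo_log_gap(line):
    """Hypothetical helper (not part of Mariabackup).

    Returns (copied_lsn, checkpoint_lsn, gap_in_bytes), or None if the
    line does not match the assumed error format.
    """
    match = re.search(r"\(LSN=(\d+); checkpoint LSN=(\d+)\)", line)
    if match is None:
        return None
    copied = int(match.group(1))
    checkpoint = int(match.group(2))
    return copied, checkpoint, checkpoint - copied


copied, checkpoint, gap = redo_log_gap(ERROR_LINE)
print(f"redo log copied up to LSN {copied}")
print(f"latest checkpoint at LSN   {checkpoint}")
print(f"redo log not copied: {gap} bytes ({gap / 2**20:.1f} MiB)")
```

For the log in the question, the copied redo log ends 33286463 bytes (about 31.7 MiB) short of the latest checkpoint.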
Edit: Someone else has filed https://jira.mariadb.org/browse/MDEV-16367 for this issue. If you have more information, please add it there.