在 Azure 上使用 Blob 存储配置 hadoop
Configuring hadoop with Blob storage on Azure
我们正在 linux 上的 Azure v2 云 运行 上测试 hadoop HA 集群。我们正在尝试切换到 Azure BLOB 存储。我们不确定应该如何使用 Blob 存储配置名称节点。我们收到以下错误:
2015-12-22 13:05:50,193 INFO ha.StandbyCheckpointer (StandbyCheckpointer.java:start(129)) - Starting standby checkpoint thread...
Checkpointing active NN at http://bd-azure-qa-nn2:50070
Serving checkpoints at http://bd-azure-qa-nn1:50070
2015-12-22 13:07:50,240 INFO ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(269)) - Triggering log roll on remote NameNode bd-azure-qa-nn2/10.0.0.7:8020
2015-12-22 13:07:51,387 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:52,391 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:53,400 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:54,416 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:55,425 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:56,450 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:57,456 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:58,462 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:59,473 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:00,478 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:01,482 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:02,490 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:03,501 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:04,515 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:05,520 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:05,966 WARN ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(274)) - Unable to trigger a roll of the active NN
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category JOURNAL is not supported in state standby
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1719)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1352)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:6339)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:933)
at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:139)
at org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService.callBlockingMethod(NamenodeProtocolProtos.java:11214)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy14.rollEditLog(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:145)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access0(EditLogTailer.java:61)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:313)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access0(EditLogTailer.java:282)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:299)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:412)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
我们不太确定如何使用 Azure Blob 配置名称节点,因为 Blob 有效地移交了 HDFS 功能。
我们在 hdfs-site.xml 中有标准的 HA 名称节点配置,并使用以下属性修改了 core-site.xml 以启用 BLOB 存储。
<property>
<name>fs.AbstractFileSystem.wasb.impl</name>
<value>org.apache.hadoop.fs.azure.Wasb</value>
</property>
<property>
<name>fs.azure.account.key.OUR_STORAGE_ACCOUNT.blob.core.windows.net</name>
<value>"OUR_KEY"</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>wasb://blob-hdfs@OUR_STORAGE_ACCOUNT.blob.core.windows.net</value>
<final>true</final>
</property>
<property>
<name>fs.azure.page.blob.dir</name>
<value>/datadir</value>
</property>
<property>
<name>fs.azure.selfthrottling.read.factor</name>
<value>1.000000</value>
</property>
<property>
<name>fs.azure.selfthrottling.write.factor</name>
<value>1.000000</value>
<property>
我们在原始集群中的 HA 名称是:
<!-- <property>
<name>fs.defaultFS</name>
<value>hdfs://namenodeha</value>
</property> -->
hdfs-site.xml我们根本没碰过。
我们不确定名称节点设置。原始设置中的两个名称节点可能有点矫枉过正,因为底层 BLOB 应该处理所有复制等。
有人可以澄清一下吗?
我们在 hdfs-site.xml 中缺少以下属性以使其正常工作。
<property>
<name>dfs.datanode.https.address</name>
<value>0.0.0.0:50475</value>
</property>
<property>
<name>dfs.namenode.https-address.namenodeha.nn1</name>
<value>bd-azure-qa-nn1:50470</value>
</property>
<property>
<name>dfs.namenode.https-address.namenodeha.nn2</name>
<value>bd-azure-qa-nn2:50470</value>
</property>
<property>
<name>dfs.journalnode.https-address</name>
<value>0.0.0.0:8481</value> :
</property>
肯定有适当的主机名。
我们正在 linux 上的 Azure v2 云 运行 上测试 hadoop HA 集群。我们正在尝试切换到 Azure BLOB 存储。我们不确定应该如何使用 Blob 存储配置名称节点。我们收到以下错误:
2015-12-22 13:05:50,193 INFO ha.StandbyCheckpointer (StandbyCheckpointer.java:start(129)) - Starting standby checkpoint thread...
Checkpointing active NN at http://bd-azure-qa-nn2:50070
Serving checkpoints at http://bd-azure-qa-nn1:50070
2015-12-22 13:07:50,240 INFO ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(269)) - Triggering log roll on remote NameNode bd-azure-qa-nn2/10.0.0.7:8020
2015-12-22 13:07:51,387 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:52,391 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:53,400 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:54,416 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:55,425 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:56,450 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:57,456 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:58,462 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:59,473 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:00,478 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:01,482 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:02,490 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:03,501 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:04,515 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:05,520 INFO ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:05,966 WARN ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(274)) - Unable to trigger a roll of the active NN
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category JOURNAL is not supported in state standby
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1719)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1352)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:6339)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:933)
at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:139)
at org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService.callBlockingMethod(NamenodeProtocolProtos.java:11214)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy14.rollEditLog(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:145)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access0(EditLogTailer.java:61)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:313)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access0(EditLogTailer.java:282)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:299)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:412)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
我们不太确定如何使用 Azure Blob 配置名称节点,因为 Blob 有效地移交了 HDFS 功能。 我们在 hdfs-site.xml 中有标准的 HA 名称节点配置,并使用以下属性修改了 core-site.xml 以启用 BLOB 存储。
<property>
<name>fs.AbstractFileSystem.wasb.impl</name>
<value>org.apache.hadoop.fs.azure.Wasb</value>
</property>
<property>
<name>fs.azure.account.key.OUR_STORAGE_ACCOUNT.blob.core.windows.net</name>
<value>"OUR_KEY"</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>wasb://blob-hdfs@OUR_STORAGE_ACCOUNT.blob.core.windows.net</value>
<final>true</final>
</property>
<property>
<name>fs.azure.page.blob.dir</name>
<value>/datadir</value>
</property>
<property>
<name>fs.azure.selfthrottling.read.factor</name>
<value>1.000000</value>
</property>
<property>
<name>fs.azure.selfthrottling.write.factor</name>
<value>1.000000</value>
<property>
我们在原始集群中的 HA 名称是:
<!-- <property>
<name>fs.defaultFS</name>
<value>hdfs://namenodeha</value>
</property> -->
hdfs-site.xml我们根本没碰过。
我们不确定名称节点设置。原始设置中的两个名称节点可能有点矫枉过正,因为底层 BLOB 应该处理所有复制等。
有人可以澄清一下吗?
我们在 hdfs-site.xml 中缺少以下属性以使其正常工作。
<property>
<name>dfs.datanode.https.address</name>
<value>0.0.0.0:50475</value>
</property>
<property>
<name>dfs.namenode.https-address.namenodeha.nn1</name>
<value>bd-azure-qa-nn1:50470</value>
</property>
<property>
<name>dfs.namenode.https-address.namenodeha.nn2</name>
<value>bd-azure-qa-nn2:50470</value>
</property>
<property>
<name>dfs.journalnode.https-address</name>
<value>0.0.0.0:8481</value> :
</property>
肯定有适当的主机名。