Input / Output error when using HDFS NFS Gateway

Question

I am getting an "Input / output error" when trying to use files in a mounted HDFS NFS gateway, despite having set dfs.namenode.accesstime.precision=3600000 in Ambari. For example, doing something like...

$ hdfs dfs -cat /hdfs/path/to/some/tsv/file | sed -e "s/$NULL_WITH_TAB/$TAB/g" | hadoop fs -put -f - /hdfs/path/to/some/tsv/file
$ echo -e "Lines containing null (expect zero): $(grep -c "\tnull\t" /nfs/hdfs/path/to/some/tsv/file)"

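The error is thrown when I remove nulls from a tsv and then check that same tsv for nulls through the NFS location, but I have seen it in many other places as well (again, with dfs.namenode.accesstime.precision=3600000 already set). Does anyone know why this might be happening, or have any debugging suggestions? Can anyone explain what exactly "access time" means in this context?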

Answer 1

From a discussion on the Apache Hadoop mailing list:

I think access time refers to the POSIX atime attribute for files, the “time of last access” as described here for instance (https://www.unixtutorial.org/atime-ctime-mtime-in-unix-filesystems). While HDFS keeps a correct modification time (mtime), which is important, easy and cheap, it only keeps a very low-resolution sense of last access time, which is less important, and expensive to monitor and record, as described here (https://issues.apache.org/jira/browse/HADOOP-1869) and here (https://superuser.com/questions/464290/why-is-cat-not-changing-the-access-time).

However, to have a conforming NFS API, you must present atime, and so the HDFS NFS implementation does. But first you have to configure it on. [...] many sites have been advised to turn it off entirely by setting it to zero, to improve HDFS overall performance. See for example here (https://community.hortonworks.com/articles/43861/scaling-the-hdfs-namenode-part-4-avoiding-performa.html, section “Don’t let Reads become Writes”). So if your site has turned off atime in HDFS, you will need to turn it back on to fully enable NFS. Alternatively, you can maintain optimum efficiency by mounting NFS with the “noatime” option, as described in the document you reference.
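As a rough illustration of the two options above, the following sketch checks whether atime appears to be enabled and shows a noatime mount. The helper name `atime_status`, the host `nfs_gateway_host`, and the mount point `/mnt/hdfs` are placeholders I made up for illustration. Note that `hdfs getconf` reads the configuration visible on the machine where it runs, which is assumed here to match what the NameNode is actually using.

```shell
#!/bin/sh
# Hypothetical helper: interpret a dfs.namenode.accesstime.precision value
# (milliseconds). A value of 0 means HDFS does not record access times.
atime_status() {
  case "$1" in
    ''|*[!0-9]*) echo "unknown" ;;   # empty or non-numeric input
    0)           echo "disabled" ;;
    *)           echo "enabled" ;;
  esac
}

# Query the local configuration only if the hdfs CLI is on the PATH.
if command -v hdfs >/dev/null 2>&1; then
  precision=$(hdfs getconf -confKey dfs.namenode.accesstime.precision)
  echo "atime precision: ${precision} ms ($(atime_status "$precision"))"
fi

# Alternative from the answer: mount the gateway with noatime (needs root).
# nfs_gateway_host and /mnt/hdfs are placeholders, so the line is left commented out.
# mount -t nfs -o vers=3,proto=tcp,nolock,noatime nfs_gateway_host:/ /mnt/hdfs
```

If the reported status is "disabled", the first option in the answer would apply (set a non-zero precision and restart the NameNode); the commented-out mount line corresponds to the second option.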

[...] check under /var/log, e.g. with find /var/log -name '*nfs3*' -print
