Moving files after comparing filenames and recreating source directories

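I'm learning shell scripting and trying to stay as POSIX-compliant as possible while keeping the code readable. The goal is to read a list of files from directory A, find their matches in directory B, and recreate part of B's parent directory structure in directory C, which is where the files from directory A should be moved. After the matches have been moved from directory A to directory C, all matched files in directory B should be deleted, and any directories in B that are left empty should be deleted as well.

All files in directory A are always unique among themselves. Directory B always contains one or more matches. Directory C never contains a match, but subdirectories matching those in directory B may already exist in directory C. The extensions differ because the files are processed separately, but the filenames (without extension) match exactly. Filenames can contain spaces and periods, and they are not always the same length. There are two levels of subdirectories in the output and archive directories.

Here is what I have so far. I'm stuck on writing the for loop that does the dirty work. I'd prefer not to stray far beyond find, printf, awk, grep, for, and if.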

#!/bin/sh
execHome="intendedMachine"
baseDir="/home/library/projects"
folderNew="output"
folderOld="working"
folderArchive="archive"
workingTypes=("jpg", "svg", "bmp", "tiff", "psd")

$folderNew="$baseDir/$folderNew"
$folderOld="$baseDir/$folderOld"
folderArchive="$baseDir/$folderArchive"

if [ "$(uname -n)" = "$execHome" ]
then

  count=$(find $folderNew -type f |grep -v "DS_Store" |awk -F "/" '{print $NF}'|wc -l)

  printf "\nFound/processing %s files in the %s folder\n\n" "$count" "$folderNew"

  find $folderNew -type f |grep -v "DS_Store" |awk -F "/" '{print $NF}'

else
  printf "Executed from %s; Run from %s for proper execution.\n" "$(uname -n)" "$execHome"
fi

Example:

Directory A

/home/library/projects/output/projectOne 1.a.png
/home/library/projects/output/projectOne 1.b.png
/home/library/projects/output/projectOne 1.c.png
/home/library/projects/output/projectThree 3.m.png
/home/library/projects/output/projectThree 3.o.png
/home/library/projects/output/projectFour 4.t.png
/home/library/projects/output/projectFour 4.u.png

Directory B

/home/library/projects/working/House/2018 01/projectOne 1.a.jpg
/home/library/projects/working/House/2018 01/projectOne 1.a.svg
/home/library/projects/working/House/2018 01/projectOne 1.b.jpg
/home/library/projects/working/House/2018 01/projectOne 1.b.svg
/home/library/projects/working/House/2018 01/projectOne 1.c.jpg
/home/library/projects/working/House/2018 02/projectTwo 2.g.jpg
/home/library/projects/working/House/2018 02/projectTwo 2.g.svg
/home/library/projects/working/House/2018 02/projectTwo 2.h.jpg
/home/library/projects/working/House/2018 02/projectTwo 2.h.svg
/home/library/projects/working/House/2018 02/projectTwo 2.i.jpg
/home/library/projects/working/Car/2018 03/projectThree 3.m.jpg
/home/library/projects/working/Car/2018 03/projectThree 3.n.jpg
/home/library/projects/working/Car/2018 03/projectThree 3.o.jpg
/home/library/projects/working/Car/2018 03/projectThree 3.o.svg
/home/library/projects/working/Car/2018 04/projectFour 4.s.jpg
/home/library/projects/working/Car/2018 04/projectFour 4.t.jpg
/home/library/projects/working/Car/2018 04/projectFour 4.u.jpg

Directory C

/home/library/projects/archive/House/2018 01/projectOne 1.d.png
/home/library/projects/archive/House/2018 01/projectOne 1.e.png
/home/library/projects/archive/House/2018 01/projectOne 1.f.png
/home/library/projects/archive/Car/2018 03/projectThree 3.p.png
/home/library/projects/archive/Car/2018 03/projectThree 3.q.png
/home/library/projects/archive/Car/2018 03/projectThree 3.r.png

Desired result:

Directory A: its files have been moved to directory C

/home/library/projects/output/

Directory B: the files matching directory A have been deleted, and empty folders removed

/home/library/projects/working/House/2018 02/projectTwo 2.g.jpg
/home/library/projects/working/House/2018 02/projectTwo 2.g.svg
/home/library/projects/working/House/2018 02/projectTwo 2.h.jpg
/home/library/projects/working/House/2018 02/projectTwo 2.h.svg
/home/library/projects/working/House/2018 02/projectTwo 2.i.jpg
/home/library/projects/working/Car/2018 03/projectThree 3.n.jpg
/home/library/projects/working/Car/2018 04/projectFour 4.s.jpg

Directory C: contains the old archive plus the new output files, now archived

/home/library/projects/archive/House/2018 01/projectOne 1.a.png
/home/library/projects/archive/House/2018 01/projectOne 1.b.png
/home/library/projects/archive/House/2018 01/projectOne 1.c.png
/home/library/projects/archive/House/2018 01/projectOne 1.d.png
/home/library/projects/archive/House/2018 01/projectOne 1.e.png
/home/library/projects/archive/House/2018 01/projectOne 1.f.png
/home/library/projects/archive/Car/2018 03/projectThree 3.m.png
/home/library/projects/archive/Car/2018 03/projectThree 3.o.png
/home/library/projects/archive/Car/2018 03/projectThree 3.p.png
/home/library/projects/archive/Car/2018 03/projectThree 3.q.png
/home/library/projects/archive/Car/2018 03/projectThree 3.r.png
/home/library/projects/archive/Car/2018 04/projectFour 4.t.png
/home/library/projects/archive/Car/2018 04/projectFour 4.u.png

I ran the code anyway on a machine with bash 4.4.19 to see how it works, but it did not behave as I expected. This is the resulting output:

Found/processing 4 files in the /home/library/projects/output folder

./auto-archive.sh: line 34: hash["$proj"]: bad array subscript
parent of /home/library/projects/output/.temp/projectThree 3.m.png not found
parent of /home/library/projects/output/projectOne 1.a.png not found
parent of /home/library/projects/output/.temp/projectThree 3.0.png not found
parent of /home/library/projects/output/projectFour 4.t.png not found

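Apologies. I also failed to mention earlier that directory B should not be scanned recursively; in the real use case a recursive scan picks up temporary files that are still being written and may not be ready to be moved. Also, for testing purposes, only the four files listed above are actually in directory A, not all of the files originally listed. In addition, after recreating the suggested test structure, your code appears to execute perfectly; the results just do not match with my actual file structure. I'm afraid I may have missed some key element when describing my actual file structure/naming conventions, and I'm reviewing the differences now. Sorry for taking up your time, but your accuracy has certainly impressed me. It feels like we're getting closer, but this definitely needs to run on an earlier version of bash.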

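The task breaks down into three steps:

  1. Create a map that associates each filename (project name) with its parent directory name in C. This is done as a preparation phase by analyzing the pathnames in B. We will use an associative array, so the bash version must be 4.2 or later.

  2. Loop over the files in A. Using the map created in step 1, construct the destination pathname in C, move the file there, and remove the matching files in B.

  3. As a clean-up phase, remove the empty directories in B (if any).

Then: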

#!/bin/bash

execHome="intendedMachine"
baseDir="/home/library/projects"
folderNew="output"
folderOld="working"
folderArchive="archive"
workingTypes=("jpg" "svg" "bmp" "tiff" "psd")
declare -A hash

folderNew="$baseDir/$folderNew"
folderOld="$baseDir/$folderOld"
folderArchive="$baseDir/$folderArchive"

if [ "$(uname -n)" != "$execHome" ]; then
    printf "Executed from %s; Run from %s for proper execution.\n" "$(uname -n)" "$execHome"
    exit
fi

count=$(find "$folderNew" -type f |grep -v "DS_Store" |awk -F "/" '{print $NF}'|wc -l)
printf "\nFound/processing %s files in the %s folder\n\n" "$count" "$folderNew"

# determine parent directory name for each project name and create a map for them
while IFS= read -r -d $'\0' f; do
    proj="${f##*/}"                 # remove dirname
    proj="${proj%.*}"               # remove extension
    parent="${f##*$baseDir/}"       # remove pathname until $baseDir
    parent="${parent#*/}"   # strip pathname one-level deeper
    parent="${parent%/*}"   # remove filename
    # now we're mapping "projectOne 1.a" => "House/2018 01" e.g.
#   echo "$proj" "=>" "$parent"     # just for debugging
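    # a dotfile such as .DS_Store leaves an empty name here, and an empty
    # key is a "bad array subscript" for an associative array, so skip it
    [ -n "$proj" ] || continue
    hash["$proj"]="$parent"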
done < <(find "$folderOld" -type f -print0) # directory B

# iterate over files in A; move to archive directory C and remove files in B
while IFS= read -r -d $'\0' f; do
    proj="${f##*/}"
    proj="${proj%.*}"
    [ -n "$proj" ] || continue      # skip dotfiles such as .DS_Store (empty key)
    parent="${hash[$proj]}"
    if [[ "$parent" = "" ]]; then
        echo "parent of $f not found"   # may not occur but just in case ..
    else
        # move from A to C
        destdir="$folderArchive/$parent"
        mkdir -p -- "$destdir"
        mv -- "$f" "$destdir"

        # remove relevant file(s) in B
        for ext in "${workingTypes[@]}"; do
            oldfile="$folderOld/$parent/$proj.${ext}"
            [ -f "$oldfile" ] && rm -f -- "$oldfile"
        done
    fi
done < <(find "$folderNew" -type f -print0) # directory A

# clean-up: remove empty dirs in B
find "$folderOld" -type d -empty -print0 | xargs -r -0 rmdir --
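
The follow-up in the question asks that a directory be scanned at its top level only. Here is a minimal sketch of one way to do that, assuming a plain glob is acceptable: `*` does not descend into subdirectories and also skips dot-entries such as .temp and .DS_Store. The helper name list_top_level and the throwaway demo tree are hypothetical and not part of the script above.

```shell
#!/bin/sh
# list_top_level DIR: print the regular files directly inside DIR, one per line.
# Hypothetical helper, for illustration only.
list_top_level() {
    for entry in "$1"/*; do
        if [ -f "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}

# Demo on a throwaway directory tree.
demo=$(mktemp -d)
mkdir "$demo/.temp" "$demo/sub"
: > "$demo/projectOne 1.a.png"
: > "$demo/.temp/projectThree 3.m.png"
: > "$demo/sub/nested.png"

found=$(list_top_level "$demo")
printf '%s\n' "$found"      # only the top-level file is listed
rm -rf "$demo"
```

In a real script it is probably safer to do the work inside the for loop itself rather than printing names, since that keeps odd filenames intact.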

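Explanation:

  • You do not need commas to separate the elements of an array.
  • You should not put $ in front of the variable name on the left-hand side of an assignment.
  • The while IFS= ... done < <(find ...) syntax is the idiomatic way to loop over the output of find.
  • Syntax of the form ${parameter#word} is parameter expansion, used here to extract substrings from the path.
  • The associative array hash maps each project name (e.g. "projectOne 1.a") to its parent directory name (e.g. "House/2018 01").
  • The -- in some of the commands guards against filenames that may start with -. (This precaution may look paranoid...)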
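
To make the parameter-expansion point concrete, here is a small sketch that applies the same expansions to a single sample path taken from the question's directory B listing. The variable names mirror the script; the sample path is there only for illustration.

```shell
#!/bin/sh
# Sample values taken from the question's directory B listing.
baseDir="/home/library/projects"
f="$baseDir/working/House/2018 01/projectOne 1.a.jpg"

proj="${f##*/}"            # strip everything up to the last "/": "projectOne 1.a.jpg"
proj="${proj%.*}"          # strip the last "." and what follows: "projectOne 1.a"

parent="${f##*$baseDir/}"  # "working/House/2018 01/projectOne 1.a.jpg"
parent="${parent#*/}"      # drop the first path component: "House/2018 01/projectOne 1.a.jpg"
parent="${parent%/*}"      # drop the filename: "House/2018 01"

printf '%s => %s\n' "$proj" "$parent"
```

Because `%` removes the shortest matching suffix, only the final extension is stripped, which is why periods inside the project name survive.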

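If your bash is older than 4.2, please let me know. Then we'll have to work out another way.

Edit
Here is a POSIX-compliant version as an alternative:
(Obviously, the script will not work if a filename contains a newline or the escape character \x1b.)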
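
If you are not sure what your bash supports, a quick probe along these lines may help. It is only a sketch: it asks whichever bash is first on PATH to declare an associative array and reports whether that succeeded. The variable name has_assoc is made up for this example.

```shell
#!/bin/sh
# Probe whether the bash found on PATH accepts "declare -A".
# If bash is missing or too old, the command fails and we report "no".
if bash -c 'declare -A probe' 2>/dev/null; then
    has_assoc=yes
else
    has_assoc=no
fi
printf 'associative arrays supported: %s\n' "$has_assoc"
```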

#!/bin/sh

execHome="intendedMachine"
baseDir="/home/library/projects"
folderNew="output"
folderOld="working"
folderArchive="archive"
workingTypes="jpg
svg
bmp
tiff
psd"

folderNew="$baseDir/$folderNew"
folderOld="$baseDir/$folderOld"
folderArchive="$baseDir/$folderArchive"
nl="
"                   # set to newline character
esc=$(printf '\033')    # set to escape character; printf is more portable than echo -ne
#esc=":"            # if \033 does not work well, try another character

# substitute for reading a hash: print the value stored for the key in $1
# it relies on the context that IFS is set to $nl
# (note: "local" is not in POSIX, although most sh implementations accept it)
read_lut() {
    local i
    local key
    local val
    local ret=""
    for i in $lut; do
        key="${i%${esc}*}"
        val="${i#*${esc}}"
        if [ "$key" = "$1" ]; then
            # loop until the end and use the last value
            ret="$val"
        fi
    done
    echo "$ret"
}

# substitute for writing to a hash: append a "key<esc>value" line to $lut
# ($1 is the key, $2 is the value)
write_lut() {
    lut=$(printf "%s\n%s%c%s" "$lut" "$1" "$esc" "$2")
}

if [ "$(uname -n)" != "$execHome" ]; then
    printf "Executed from %s; Run from %s for proper execution.\n" "$(uname -n)" "$execHome"
    exit
fi

count=$(find "$folderNew" -type f |grep -v "DS_Store" |awk -F "/" '{print $NF}'|wc -l)
printf "\nFound/processing %s files in the %s folder\n\n" "$count" "$folderNew"

# determine parent directory name for each project name and create a map for them
ifs_bak="$IFS"
IFS="$nl"
for f in $(find "$folderOld" -type f); do
    proj="${f##*/}"         # remove dirname
    proj="${proj%.*}"               # remove extension
    parent="${f##*$baseDir/}"       # remove pathname until $baseDir
    parent="${parent#*/}"   # strip pathname one-level deeper
    parent="${parent%/*}"   # remove filename
    # now we're mapping "projectOne 1.a" => "House/2018 01" e.g.
#   echo "$proj" "=>" "$parent"     # just for debugging
    write_lut "$proj" "$parent"
done

# iterate over files in A; move to archive directory C and remove files in B
for f in $(find "$folderNew" -type f); do
    proj="${f##*/}"
    proj="${proj%.*}"
    parent=$(read_lut "$proj")
    if [ "$parent" = "" ]; then
        echo "parent of $f not found"   # may not occur but just in case ..
    else
        # move from A to C
        destdir="$folderArchive/$parent"
        mkdir -p -- "$destdir"
        mv -- "$f" "$destdir"

        # remove relevant file(s) in B
        for ext in $workingTypes; do
            oldfile="$folderOld/$parent/$proj.${ext}"
            [ -f "$oldfile" ] && rm -f -- "$oldfile"
        done
    fi
done

# clean-up: remove empty dirs in B
find "$folderOld" -type d -empty -print0 | xargs -r -0 rmdir --

# restore IFS
IFS="$ifs_bak"
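
As a sanity check of the lookup-table idea, here is a stripped-down sketch of the two helpers exercised on sample names from the question. It assumes the key arrives as $1 and the value as $2, avoids local so that it stays within plain POSIX sh, and uses underscore-prefixed variable names that are specific to this example.

```shell
#!/bin/sh
nl="
"
esc=$(printf '\033')    # separator between key and value
IFS="$nl"               # read_lut relies on splitting $lut at newlines only
lut=""

# append a "key<ESC>value" line to the table
write_lut() {
    lut=$(printf '%s\n%s%s%s' "$lut" "$1" "$esc" "$2")
}

# print the value stored for key $1 (the last entry wins); empty if absent
read_lut() {
    _ret=""
    for _line in $lut; do
        _key="${_line%${esc}*}"
        _val="${_line#*${esc}}"
        if [ "$_key" = "$1" ]; then
            _ret="$_val"
        fi
    done
    printf '%s\n' "$_ret"
}

write_lut "projectOne 1.a" "House/2018 01"
write_lut "projectThree 3.m" "Car/2018 03"

one=$(read_lut "projectOne 1.a")
missing=$(read_lut "projectTwo 2.g")
printf '%s\n' "$one"
```

Each lookup walks the whole table, so this is linear per file; for a few hundred files that is unlikely to matter. Note that the unquoted $lut is also subject to pathname expansion, so names containing * or ? would additionally need set -f.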