Shell | Removing repeated lines

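I need to write a Bash script that removes similar lines from an output file. The output file always has the same structure.

Lines 1 and 2 should be kept; any later lines that look like these two need to be removed.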

1:  </UsageData><?xml version="1.0" encoding="UTF-8"?>
2:  <UsageData broadcastday="2016-03-16">

The date varies.

The last line should also be kept, for example:

</UsageData>

I am new to shell programming and don't know how to do this.

Here is my sample XML:

<?xml version="1.0" encoding="UTF-8"?> 
<UsageData broadcastday="2016-03-16"> 
    <Hh hhID="48800301"> 
        <Inst instID="000002B9"/> 
        <Live> 
            <Station>516</Station> 
            <From>Wed Mar 16 2016 09:52:47 GMT+0000 (UTC)</From> 
            <DurSec>58077</DurSec> 
            <Viewer> 
                <HhMem>569de65c9c3ab0cf7bfa2df2</HhMem> 
            </Viewer> 
        </Live> 
    </Hh> 
    <Hh hhID="46920403"> 
        <Inst instID="000002A8"/> 
        <Live> 
            <Station>5000</Station> 
            <From>Wed Mar 16 2016 12:42:17 GMT+0000 (UTC)</From> 
            <DurSec>47908</DurSec> 
            <Viewer> 
                <HhMem>56caee95f915e09335fd976f</HhMem> 
            </Viewer> 
        </Live> 
    </Hh> 
</UsageData><?xml version="1.0" encoding="UTF-8"?> 
<UsageData broadcastday="2016-03-16"> 
    <Hh hhID="15260304"> 
        <Inst instID="000000A5"/> 
        <Live> 
            <Station>5000</Station> 
            <From>Wed Mar 16 2016 12:57:48 GMT+0000 (UTC)</From> 
            <DurSec>28814</DurSec> 
            <Viewer> 
                <HhMem>565f181dd830d3cc7057c0b9</HhMem> 
            </Viewer> 
        </Live> 
    </Hh> 
</UsageData><?xml version="1.0" encoding="UTF-8"?> 
<UsageData broadcastday="2016-03-16"> 
    <Hh hhID="50100501"> 
        <Inst instID="0000022D"/> 
        <Live> 
            <Station>560</Station> 
            <From>Wed Mar 16 2016 14:21:19 GMT+0000 (UTC)</From> 
            <DurSec>41967</DurSec> 
            <Viewer> 
                <HhMem>56c4412de6a8ff4da18fd4ae</HhMem> 
                <HhMem>56c4412de6a8ff4da18fd4cb</HhMem> 
            </Viewer> 
        </Live> 
    </Hh> 
</UsageData><?xml version="1.0" encoding="UTF-8"?> 
<UsageData broadcastday="2016-03-16"> 
    <Hh hhID="36110404"> 
        <Inst instID="00000104"/> 
        <Live> 
            <Station>545</Station> 
            <From>Wed Mar 16 2016 15:01:04 GMT+0000 (UTC)</From> 
            <DurSec>671</DurSec> 
            <Viewer> 
                <HhMem>568ce8acbd0e486a951d41ce</HhMem> 
                <HhMem>568ce8acbd0e486a951d41dc</HhMem> 
                <HhMem>568ce8acbd0e486a951d41c5</HhMem> 
            </Viewer> 
        </Live> 
    </Hh> 
</UsageData>

I solved my problem in a very simple way.

awk '/<\/UsageData><\?xml version="1.0" encoding="UTF-8"\?>/ {getline; next} 1' file
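For anyone who wants to try the idea on a small input, here is a minimal sketch of the same approach: drop every line where `</UsageData>` is fused with a new XML declaration, then drop the `<UsageData ...>` root tag that follows it. The function name `strip_repeated_headers` and the shortened sample records are made up for illustration; it uses a flag instead of `getline` and matches `<UsageData ` without the date, on the assumption that the repeated root tag always comes directly after the fused line.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name is an assumption): removes the repeated XML
# declaration and root tag from concatenated UsageData documents.
# Reads the named files, or stdin when no file is given.
strip_repeated_headers() {
    awk '
        # A line that closes one document and immediately starts the next
        # XML declaration: drop it and remember that a root tag should follow.
        /<\/UsageData><[?]xml / { skip = 1; next }

        # The <UsageData ...> root tag right after such a line is dropped too.
        # The date attribute is not matched, so it may vary.
        skip && /<UsageData / { skip = 0; next }

        # Everything else is printed unchanged.
        { skip = 0; print }
    ' "$@"
}

# Demo on a shortened, made-up sample with two concatenated documents.
printf '%s\n' \
    '<?xml version="1.0" encoding="UTF-8"?>' \
    '<UsageData broadcastday="2016-03-16">' \
    '    <Hh hhID="48800301"/>' \
    '</UsageData><?xml version="1.0" encoding="UTF-8"?>' \
    '<UsageData broadcastday="2016-03-17">' \
    '    <Hh hhID="15260304"/>' \
    '</UsageData>' | strip_repeated_headers
```

On a real file this would be called as `strip_repeated_headers file > cleaned.xml`. The first declaration, the first root tag, and the final `</UsageData>` are left alone because none of them match the fused pattern.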