浙大數(shù)據(jù)結(jié)構(gòu)第五周之05-樹(shù)9 Huffman Codes

這篇具有很好參考價(jià)值的文章主要介紹了浙大數(shù)據(jù)結(jié)構(gòu)第五周之05-樹(shù)9 Huffman Codes。希望對(duì)大家有所幫助。如果存在錯(cuò)誤或未考慮完全的地方，請(qǐng)大家不吝賜教，您也可以點(diǎn)擊"舉報(bào)違法"按鈕提交疑問(wèn)。

題目詳情：

In 1953, David A. Huffman published his paper "A Method for the Construction of Minimum-Redundancy Codes", and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string "aaaxuaxz", we can observe that the frequencies of the characters 'a', 'x', 'u' and 'z' are 4, 2, 1 and 1, respectively. We may either encode the symbols as {'a'=0, 'x'=10, 'u'=110, 'z'=111}, or in another way as {'a'=1, 'x'=01, 'u'=001, 'z'=000}, both compress the string into 14 bits. Another set of code can be given as {'a'=0, 'x'=11, 'u'=100, 'z'=101}, but {'a'=0, 'x'=01, 'u'=011, 'z'=001} is NOT correct since "aaaxuaxz" and "aazuaxax" can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.

Input Specification:

Each input file contains one test case. For each case, the first line gives an integer?N?(2≤N≤63), then followed by a line that contains all the?N?distinct characters and their frequencies in the following format:

c[1] f[1] c[2] f[2] ... c[N] f[N]

where?c[i]?is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and?f[i]?is the frequency of?c[i]?and is an integer no more than 1000. The next line gives a positive integer?M?(≤1000), then followed by?M?student submissions. Each student submission consists of?N?lines, each in the format:

c[i] code[i]

where?c[i]?is the?i-th character and?code[i]?is an non-empty string of no more than 63 '0's and '1's.

Output Specification:

For each test case, print in each line either "Yes" if the student's submission is correct, or "No" if not.

Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct.

Sample Input:

7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 11

Sample Output:

Yes
Yes
No
No

簡(jiǎn)單翻譯：

依據(jù)給出字母的權(quán)重進(jìn)行哈夫曼編碼，再判斷每個(gè)學(xué)生對(duì)每個(gè)字母的編碼是否正確

主要思路：

分為兩部分

（一）實(shí)現(xiàn)哈夫曼編碼

（1）建立一個(gè)小頂堆，注意小頂堆里放的是指向結(jié)構(gòu)體數(shù)組的指針

（2）然后依據(jù)每個(gè)字母的權(quán)重構(gòu)造小頂堆，構(gòu)造過(guò)程其實(shí)是插入，插入過(guò)程是上濾

（3）接著每次從小頂堆里彈出最小和次小，注意彈出的過(guò)程不止是刪除，還要重構(gòu)小頂堆，重構(gòu)過(guò)程是下濾，構(gòu)成一棵二叉樹(shù)，再把這棵二叉樹(shù)放進(jìn)小頂堆，如此循環(huán)直到小頂堆里只有一個(gè)指針，那個(gè)指針即是哈夫曼樹(shù)根節(jié)點(diǎn)

（二）檢查學(xué)生答案

這里要注意學(xué)生可以不用哈夫曼編碼得到與哈夫曼編碼一樣的答案

檢查分兩部分

（1）WPL是否相同

哈夫曼的WPL計(jì)算：遞歸

學(xué)生給的答案的WPL的計(jì)算：利用權(quán)重 * 每個(gè)編碼長(zhǎng)度再求和

（2）是否會(huì)出現(xiàn)前綴碼
就用學(xué)生給的答案建樹(shù)，如果是0就向左邊建樹(shù)；如果是1就向右邊建樹(shù)，每個(gè)字母編碼建樹(shù)完畢將當(dāng)前current指針指向位置的權(quán)重設(shè)置為-1，這樣在以后建樹(shù)過(guò)程中如果發(fā)現(xiàn)下一層的權(quán)重是-1，說(shuō)明之前一個(gè)字母的編碼是當(dāng)前字母編碼的前綴，就不滿足；另外如果建完樹(shù)發(fā)現(xiàn)current不是葉子結(jié)點(diǎn)，也不滿足

第一次寫(xiě)錯(cuò)誤：

（1）建立小頂堆時(shí)的插入操作要注意比較的是weight，不是直接比較data

（2）檢查學(xué)生輸入時(shí)，檢測(cè)的是左/右孩子的weight，不是current的weight

（3）每申請(qǐng)一個(gè)新節(jié)點(diǎn)，都要將指針初始化為NULL，否則會(huì)出現(xiàn)空指針錯(cuò)誤

（4）檢測(cè)學(xué)生作業(yè)時(shí)，如果檢查到有編碼不對(duì)，還要繼續(xù)往下讀取，不然會(huì)影響之后的判斷，只是可以不用在這輪循環(huán)里判斷了文章來(lái)源地址http://www.zghlxwxcb.cn/news/detail-605850.html

代碼實(shí)現(xiàn)：

/*
建立哈夫曼樹(shù):
    (1)建立小頂堆，小頂堆按權(quán)值存放
    (2)由小頂堆構(gòu)建哈夫曼樹(shù)
        ①?gòu)男№敹牙飶棾鲎钚『痛涡?，組成一棵新的二叉樹(shù)
        ②將這棵二叉樹(shù)重新插入小頂堆
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_SIZE 1005
#define EMPTYCHARACTER '-'

typedef struct HuffmanNode HuffmanNode;
typedef HuffmanNode* Position;
typedef Position HuffmanTree;
struct HuffmanNode {
    char Character;
    int Weight;
    Position LeftChild;
    Position RightChild;
};

/*小頂堆數(shù)據(jù)結(jié)構(gòu)*/
typedef struct MinHeap MinHeap;
struct MinHeap {
    Position Data[MAX_SIZE];
    int Size;
};
MinHeap Heap;
/*初始化小頂堆*/
void Init(int nodeNum) {
    Heap.Data[0] = (Position)malloc(sizeof(HuffmanNode));
    Heap.Data[0]->Character = EMPTYCHARACTER;
    Heap.Data[0]->LeftChild = NULL;
    Heap.Data[0]->RightChild = NULL;
    Heap.Data[0]->Weight = 0;
    for(int i = 1; i <= nodeNum; i++) {     //小頂堆下標(biāo)有效數(shù)據(jù)范圍是1~nodeNum,0是哨兵
        Heap.Data[i] = NULL;
    }
    Heap.Size = 0;
}
/*小頂堆插入,插入標(biāo)準(zhǔn)是看權(quán)值，插入的節(jié)點(diǎn)是指針*/
void Insert(Position newNode) {
    int i = ++Heap.Size;
    for(; Heap.Data[i/2]->Weight > newNode->Weight; i /= 2) {
        Heap.Data[i] = Heap.Data[i/2];
    }
    Heap.Data[i] = newNode;
    return;
}
/*從最小堆中彈出最小的節(jié)點(diǎn)，彈出后還要重新調(diào)整最小堆
最小堆刪除操作：
返回的是存放在下標(biāo)為1的最小值
然后指向末尾最后一個(gè)節(jié)點(diǎn)值，并且把堆的大小減1
利用最后一個(gè)節(jié)點(diǎn)下濾，首先選擇左右孩子中較小的
然后將末尾最后一個(gè)節(jié)點(diǎn)值與較小值比較
如果小，說(shuō)明找到了合適位置，break出循環(huán)，將這個(gè)值賦給parent所指位置
如果大，說(shuō)明還要下濾*/
Position PopMin() {
    Position min = Heap.Data[1];

    Position x = Heap.Data[Heap.Size--];
    int parent = 1;
    int child = parent * 2;
    for(; parent * 2 <= Heap.Size; parent = child) {
        child = parent * 2;
        //注意，比較的是權(quán)值，不是直接比較，賦值是直接賦值
        if(child != Heap.Size && Heap.Data[child]->Weight > Heap.Data[child + 1]->Weight) {
            child++;
        }

        if(Heap.Data[child]->Weight >= x->Weight) {
            break;
        }
        else {
            Heap.Data[parent] = Heap.Data[child];
        }
    }

    Heap.Data[parent] = x;

    /*自己額外加的，不加也可以，不過(guò)debug時(shí)好看一點(diǎn)*/
    Heap.Data[Heap.Size + 1] = NULL;
    
    return min;
}

/*建立哈夫曼樹(shù)*/
/*先用另一個(gè)全局?jǐn)?shù)組保存每次讀入的權(quán)值*/
int Weight[MAX_SIZE];
HuffmanTree BuildHuffmanTree(int N) {
    /*先搞個(gè)小頂堆出來(lái)*/   
    /*一開(kāi)始問(wèn)題出在這里，申請(qǐng)一個(gè)新節(jié)點(diǎn)內(nèi)存時(shí)沒(méi)有把leftChild和rightChild置為NULL*/
    Init(N);
    for(int i = 1; i <= N; i++) {
        Position newNode = (Position)malloc(sizeof(HuffmanNode));
        scanf("%c ", &(newNode->Character));
        scanf("%d", &(newNode->Weight));
        getchar();
        Weight[i] = newNode->Weight;
        newNode->LeftChild = NULL;
        newNode->RightChild = NULL;
        Insert(newNode);
    }

    /*利用小頂堆建立哈夫曼樹(shù)
    每次從小頂堆中彈出最小值和次小值，然后計(jì)算他們的權(quán)值組成一棵新的二叉樹(shù)再插入小頂堆中
    最后返回小頂堆下標(biāo)1的指針?biāo)讣仁枪蚵鼧?shù)*/
    
    for(int i = 1; i < N; i++) {    //合并只需要N - 1次
        HuffmanTree newNode = (HuffmanTree)malloc(sizeof(HuffmanNode));
        newNode->Character = EMPTYCHARACTER;
        newNode->LeftChild = PopMin();
        newNode->RightChild = PopMin();
        newNode->Weight = newNode->LeftChild->Weight + newNode->RightChild->Weight;
        Insert(newNode);
    }   
    return PopMin();
}

/*刪除樹(shù)*/
void DeleteTree(HuffmanTree root) {
    if(root == NULL) {
        return;
    }   

    DeleteTree(root->LeftChild);
    DeleteTree(root->RightChild);
    free(root);
    return;
}
int countWPL(HuffmanTree root, int depth) {
    if(root->LeftChild == NULL && root->RightChild == NULL) {
        return root->Weight * depth;
    }

    int leftSubtree = countWPL(root->LeftChild, depth + 1);
    int rightSubtree = countWPL(root->RightChild, depth + 1);
    return leftSubtree + rightSubtree;
}
/*按照學(xué)生寫(xiě)的建立二叉樹(shù)*/
Position CreateNode() {
    Position newNode = (Position)malloc(sizeof(HuffmanNode));
    newNode->Weight = 0;
    newNode->LeftChild = NULL;
    newNode->RightChild = NULL;
    return newNode;
}
int checkSubmit(Position current, char* code, int stringLen) {
    for(int i = 0; i < stringLen; i++) {
        if(code[i] == '0') {
            if(current->LeftChild == NULL) {
                current->LeftChild = CreateNode();
            }
            if(current->LeftChild->Weight == -1) {
                return 0;
            }
            current = current->LeftChild;
        }
        else if(code[i] == '1') {
            if(current->RightChild == NULL) {
                current->RightChild = CreateNode();
            }
            if(current->RightChild->Weight == -1) {
                return 0;
            }
            current = current->RightChild;
        }
    }

    if(current->LeftChild == NULL && current->RightChild == NULL) {
        current->Weight = -1;
        return 1;
    }
    else {
        return 0;
    }
}
int main() {
    int characterNum;
    scanf("%d", &characterNum);
    getchar();
    HuffmanTree tree = BuildHuffmanTree(characterNum);
    int WPL = countWPL(tree, 0);
    int studentNum;
    scanf("%d", &studentNum);
    getchar();
    for(int i = 1; i <= studentNum; i++) {
        char c;
        char code[MAX_SIZE];
        int flag = 1;
        int result = 1;
        int codeLen = 0;    
        HuffmanTree studentTree = CreateNode();
        for(int j = 1; j <= characterNum; j++) {
            Position current = studentTree;
            scanf("%c ", &c);
            scanf("%s", code);
            getchar();
            codeLen += strlen(code) * Weight[j];
            /*一開(kāi)始設(shè)計(jì)判斷正誤的有問(wèn)題，雖然只要找到一個(gè)字符編碼有誤就行了，但依然
            要繼續(xù)往下讀，不然會(huì)影響之后的判斷，只是不用進(jìn)行檢測(cè)了*/
            if(flag == 1) {
                result = checkSubmit(current, code, strlen(code));
                if(result == 0) flag = 0;
            }
        }
        if(result == 1 && (codeLen == WPL)) {
            printf("Yes\n");
        }
        else {
            printf("No\n");
        }
        DeleteTree(studentTree);
    }
    DeleteTree(tree);
    return 0;
}

到了這里，關(guān)于浙大數(shù)據(jù)結(jié)構(gòu)第五周之05-樹(shù)9 Huffman Codes的文章就介紹完了。如果您還想了解更多內(nèi)容，請(qǐng)?jiān)谟疑辖撬阉鱐OY模板網(wǎng)以前的文章或繼續(xù)瀏覽下面的相關(guān)文章，希望大家以后多多支持TOY模板網(wǎng)！