題目詳情:
In 1953, David A. Huffman published his paper "A Method for the Construction of Minimum-Redundancy Codes", and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string "aaaxuaxz", we can observe that the frequencies of the characters 'a', 'x', 'u' and 'z' are 4, 2, 1 and 1, respectively. We may either encode the symbols as {'a'=0, 'x'=10, 'u'=110, 'z'=111}, or in another way as {'a'=1, 'x'=01, 'u'=001, 'z'=000}, both compress the string into 14 bits. Another set of code can be given as {'a'=0, 'x'=11, 'u'=100, 'z'=101}, but {'a'=0, 'x'=01, 'u'=011, 'z'=001} is NOT correct since "aaaxuaxz" and "aazuaxax" can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.
Input Specification:
Each input file contains one test case. For each case, the first line gives an integer?N?(2≤N≤63), then followed by a line that contains all the?N?distinct characters and their frequencies in the following format:
c[1] f[1] c[2] f[2] ... c[N] f[N]
where?c[i]
?is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and?f[i]
?is the frequency of?c[i]
?and is an integer no more than 1000. The next line gives a positive integer?M?(≤1000), then followed by?M?student submissions. Each student submission consists of?N?lines, each in the format:
c[i] code[i]
where?c[i]
?is the?i
-th character and?code[i]
?is an non-empty string of no more than 63 '0's and '1's.
Output Specification:
For each test case, print in each line either "Yes" if the student's submission is correct, or "No" if not.
Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct.
Sample Input:
7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 11
Sample Output:
Yes
Yes
No
No
簡(jiǎn)單翻譯:
依據(jù)給出字母的權(quán)重進(jìn)行哈夫曼編碼,再判斷每個(gè)學(xué)生對(duì)每個(gè)字母的編碼是否正確
主要思路:
分為兩部分
(一)實(shí)現(xiàn)哈夫曼編碼
(1)建立一個(gè)小頂堆,注意小頂堆里放的是指向結(jié)構(gòu)體數(shù)組的指針
(2)然后依據(jù)每個(gè)字母的權(quán)重構(gòu)造小頂堆,構(gòu)造過(guò)程其實(shí)是插入,插入過(guò)程是上濾
(3)接著每次從小頂堆里彈出最小和次小,注意彈出的過(guò)程不止是刪除,還要重構(gòu)小頂堆,重構(gòu)過(guò)程是下濾,構(gòu)成一棵二叉樹(shù),再把這棵二叉樹(shù)放進(jìn)小頂堆,如此循環(huán)直到小頂堆里只有一個(gè)指針,那個(gè)指針即是哈夫曼樹(shù)根節(jié)點(diǎn)
(二)檢查學(xué)生答案
這里要注意學(xué)生可以不用哈夫曼編碼得到與哈夫曼編碼一樣的答案
檢查分兩部分
(1)WPL是否相同
哈夫曼的WPL計(jì)算:遞歸
學(xué)生給的答案的WPL的計(jì)算:利用權(quán)重 * 每個(gè)編碼長(zhǎng)度再求和
(2)是否會(huì)出現(xiàn)前綴碼
就用學(xué)生給的答案建樹(shù),如果是0就向左邊建樹(shù);如果是1就向右邊建樹(shù),每個(gè)字母編碼建樹(shù)完畢將當(dāng)前current指針指向位置的權(quán)重設(shè)置為-1,這樣在以后建樹(shù)過(guò)程中如果發(fā)現(xiàn)下一層的權(quán)重是-1,說(shuō)明之前一個(gè)字母的編碼是當(dāng)前字母編碼的前綴,就不滿足;另外如果建完樹(shù)發(fā)現(xiàn)current不是葉子結(jié)點(diǎn),也不滿足
第一次寫(xiě)錯(cuò)誤:
(1)建立小頂堆時(shí)的插入操作要注意比較的是weight,不是直接比較data
(2)檢查學(xué)生輸入時(shí),檢測(cè)的是左/右孩子的weight,不是current的weight
(3)每申請(qǐng)一個(gè)新節(jié)點(diǎn),都要將指針初始化為NULL,否則會(huì)出現(xiàn)空指針錯(cuò)誤文章來(lái)源:http://www.zghlxwxcb.cn/news/detail-605850.html
(4)檢測(cè)學(xué)生作業(yè)時(shí),如果檢查到有編碼不對(duì),還要繼續(xù)往下讀取,不然會(huì)影響之后的判斷,只是可以不用在這輪循環(huán)里判斷了文章來(lái)源地址http://www.zghlxwxcb.cn/news/detail-605850.html
代碼實(shí)現(xiàn):
/*
建立哈夫曼樹(shù):
(1)建立小頂堆,小頂堆按權(quán)值存放
(2)由小頂堆構(gòu)建哈夫曼樹(shù)
①?gòu)男№敹牙飶棾鲎钚『痛涡?,組成一棵新的二叉樹(shù)
②將這棵二叉樹(shù)重新插入小頂堆
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_SIZE 1005
#define EMPTYCHARACTER '-'
typedef struct HuffmanNode HuffmanNode;
typedef HuffmanNode* Position;
typedef Position HuffmanTree;
struct HuffmanNode {
char Character;
int Weight;
Position LeftChild;
Position RightChild;
};
/*小頂堆數(shù)據(jù)結(jié)構(gòu)*/
typedef struct MinHeap MinHeap;
struct MinHeap {
Position Data[MAX_SIZE];
int Size;
};
MinHeap Heap;
/*初始化小頂堆*/
void Init(int nodeNum) {
Heap.Data[0] = (Position)malloc(sizeof(HuffmanNode));
Heap.Data[0]->Character = EMPTYCHARACTER;
Heap.Data[0]->LeftChild = NULL;
Heap.Data[0]->RightChild = NULL;
Heap.Data[0]->Weight = 0;
for(int i = 1; i <= nodeNum; i++) { //小頂堆下標(biāo)有效數(shù)據(jù)范圍是1~nodeNum,0是哨兵
Heap.Data[i] = NULL;
}
Heap.Size = 0;
}
/*小頂堆插入,插入標(biāo)準(zhǔn)是看權(quán)值,插入的節(jié)點(diǎn)是指針*/
void Insert(Position newNode) {
int i = ++Heap.Size;
for(; Heap.Data[i/2]->Weight > newNode->Weight; i /= 2) {
Heap.Data[i] = Heap.Data[i/2];
}
Heap.Data[i] = newNode;
return;
}
/*從最小堆中彈出最小的節(jié)點(diǎn),彈出后還要重新調(diào)整最小堆
最小堆刪除操作:
返回的是存放在下標(biāo)為1的最小值
然后指向末尾最后一個(gè)節(jié)點(diǎn)值,并且把堆的大小減1
利用最后一個(gè)節(jié)點(diǎn)下濾,首先選擇左右孩子中較小的
然后將末尾最后一個(gè)節(jié)點(diǎn)值與較小值比較
如果小,說(shuō)明找到了合適位置,break出循環(huán),將這個(gè)值賦給parent所指位置
如果大,說(shuō)明還要下濾*/
Position PopMin() {
Position min = Heap.Data[1];
Position x = Heap.Data[Heap.Size--];
int parent = 1;
int child = parent * 2;
for(; parent * 2 <= Heap.Size; parent = child) {
child = parent * 2;
//注意,比較的是權(quán)值,不是直接比較,賦值是直接賦值
if(child != Heap.Size && Heap.Data[child]->Weight > Heap.Data[child + 1]->Weight) {
child++;
}
if(Heap.Data[child]->Weight >= x->Weight) {
break;
}
else {
Heap.Data[parent] = Heap.Data[child];
}
}
Heap.Data[parent] = x;
/*自己額外加的,不加也可以,不過(guò)debug時(shí)好看一點(diǎn)*/
Heap.Data[Heap.Size + 1] = NULL;
return min;
}
/*建立哈夫曼樹(shù)*/
/*先用另一個(gè)全局?jǐn)?shù)組保存每次讀入的權(quán)值*/
int Weight[MAX_SIZE];
HuffmanTree BuildHuffmanTree(int N) {
/*先搞個(gè)小頂堆出來(lái)*/
/*一開(kāi)始問(wèn)題出在這里,申請(qǐng)一個(gè)新節(jié)點(diǎn)內(nèi)存時(shí)沒(méi)有把leftChild和rightChild置為NULL*/
Init(N);
for(int i = 1; i <= N; i++) {
Position newNode = (Position)malloc(sizeof(HuffmanNode));
scanf("%c ", &(newNode->Character));
scanf("%d", &(newNode->Weight));
getchar();
Weight[i] = newNode->Weight;
newNode->LeftChild = NULL;
newNode->RightChild = NULL;
Insert(newNode);
}
/*利用小頂堆建立哈夫曼樹(shù)
每次從小頂堆中彈出最小值和次小值,然后計(jì)算他們的權(quán)值組成一棵新的二叉樹(shù)再插入小頂堆中
最后返回小頂堆下標(biāo)1的指針?biāo)讣仁枪蚵鼧?shù)*/
for(int i = 1; i < N; i++) { //合并只需要N - 1次
HuffmanTree newNode = (HuffmanTree)malloc(sizeof(HuffmanNode));
newNode->Character = EMPTYCHARACTER;
newNode->LeftChild = PopMin();
newNode->RightChild = PopMin();
newNode->Weight = newNode->LeftChild->Weight + newNode->RightChild->Weight;
Insert(newNode);
}
return PopMin();
}
/*刪除樹(shù)*/
void DeleteTree(HuffmanTree root) {
if(root == NULL) {
return;
}
DeleteTree(root->LeftChild);
DeleteTree(root->RightChild);
free(root);
return;
}
int countWPL(HuffmanTree root, int depth) {
if(root->LeftChild == NULL && root->RightChild == NULL) {
return root->Weight * depth;
}
int leftSubtree = countWPL(root->LeftChild, depth + 1);
int rightSubtree = countWPL(root->RightChild, depth + 1);
return leftSubtree + rightSubtree;
}
/*按照學(xué)生寫(xiě)的建立二叉樹(shù)*/
Position CreateNode() {
Position newNode = (Position)malloc(sizeof(HuffmanNode));
newNode->Weight = 0;
newNode->LeftChild = NULL;
newNode->RightChild = NULL;
return newNode;
}
int checkSubmit(Position current, char* code, int stringLen) {
for(int i = 0; i < stringLen; i++) {
if(code[i] == '0') {
if(current->LeftChild == NULL) {
current->LeftChild = CreateNode();
}
if(current->LeftChild->Weight == -1) {
return 0;
}
current = current->LeftChild;
}
else if(code[i] == '1') {
if(current->RightChild == NULL) {
current->RightChild = CreateNode();
}
if(current->RightChild->Weight == -1) {
return 0;
}
current = current->RightChild;
}
}
if(current->LeftChild == NULL && current->RightChild == NULL) {
current->Weight = -1;
return 1;
}
else {
return 0;
}
}
int main() {
int characterNum;
scanf("%d", &characterNum);
getchar();
HuffmanTree tree = BuildHuffmanTree(characterNum);
int WPL = countWPL(tree, 0);
int studentNum;
scanf("%d", &studentNum);
getchar();
for(int i = 1; i <= studentNum; i++) {
char c;
char code[MAX_SIZE];
int flag = 1;
int result = 1;
int codeLen = 0;
HuffmanTree studentTree = CreateNode();
for(int j = 1; j <= characterNum; j++) {
Position current = studentTree;
scanf("%c ", &c);
scanf("%s", code);
getchar();
codeLen += strlen(code) * Weight[j];
/*一開(kāi)始設(shè)計(jì)判斷正誤的有問(wèn)題,雖然只要找到一個(gè)字符編碼有誤就行了,但依然
要繼續(xù)往下讀,不然會(huì)影響之后的判斷,只是不用進(jìn)行檢測(cè)了*/
if(flag == 1) {
result = checkSubmit(current, code, strlen(code));
if(result == 0) flag = 0;
}
}
if(result == 1 && (codeLen == WPL)) {
printf("Yes\n");
}
else {
printf("No\n");
}
DeleteTree(studentTree);
}
DeleteTree(tree);
return 0;
}
到了這里,關(guān)于浙大數(shù)據(jù)結(jié)構(gòu)第五周之05-樹(shù)9 Huffman Codes的文章就介紹完了。如果您還想了解更多內(nèi)容,請(qǐng)?jiān)谟疑辖撬阉鱐OY模板網(wǎng)以前的文章或繼續(xù)瀏覽下面的相關(guān)文章,希望大家以后多多支持TOY模板網(wǎng)!