《高級(jí)計(jì)算機(jī)視覺》期末樣題匯總
說明:電子科技大學(xué)2022年研究生課程《高級(jí)計(jì)算機(jī)視覺》期末樣題。
1. 游程編碼
給出下列數(shù)據(jù),寫出按照行的方向的游程長(zhǎng)度編碼。
答:
(1,2),(2,4),(3,2)
(1,2),(2,4),(3,2)
(3,2),(5,6)
(3,2),(5,6)
(2,4),(7,4)
(2,4),(7,4)
(2,4),(6,4)
(2,4),(6,2),(2,2)
2. Shannon-Fano coding
(a)Complete the following table using the Shannon-Fano Algorithm.
(b) What is the entropy of this source, and in what units? Compare to the above result.
3. 計(jì)算
Suppose an alphabet consists of 6 symbols {a, b, c, d, e, f}, and the probability for each of the symbol is 1/6 (note, log2(3) = 1.585).
- What is the entropy for this set?
- Draw the Shannon-Fano tree for this set. What is the average bitrate?
- Draw the Huffman tree for this set. What is the average bitrate?
- How many bits would we need without compression, assuming fixed-length codewords? What is the compression ratio, compared to the Huffman tree?
答:
- Shannon-Fano tree:
average bitrate=(2*2bit+4*3bit)/6=2.667bit.
- Huffman tree:
average bitrate=(2*2bit+4*3bit)/6=2.667bit.
- 假設(shè)碼字長(zhǎng)度固定,如果不進(jìn)行壓縮,我們需要6*3bit=18bit。與哈夫曼樹相比,壓縮比是18bit/16bit=1.125。
4. “checkerboard” image
Calculate the entropy of a “checkerboard” image in which half of the pixels are BLACK and half of them are WHITE.
5. Huffman Coding
(a) Construct a binary Huffman code for a source S with three symbols A, B and C, having probabilities 0.6, 0.3, and 0.1, respectively. What is its average codeword length, in bits per symbol? What is the entropy of this source?
(b) Let’s now extend this code by grouping symbols into 2-character groups. Compare the performance now, in bits per original source symbol, with the best possible.
答:
(a)
average codeword length=(2*2bit+1*1bit)/3=1.667bit.
(b)
2-character groups | probability |
---|---|
AA | 0.36 |
AB | 0.18 |
AC | 0.06 |
BA | 0.18 |
BB | 0.09 |
BC | 0.03 |
CA | 0.06 |
CB | 0.03 |
CC | 0.01 |
Huffman tree:
average codeword length=(1*1bit+2*3bit+3*4bit+1*5bit+2*6bit)/9=4bit
the best possible average codeword length is 2.5909bit, and the average codeword length of 2-character groups with Huffman coding is 4bit.
6. Adaptive Huffman Coding
a) What are the advantages of Adaptive Huffman Coding compared to the original Huffman Coding algorithm?
- 原始的Huffman算法給出了一種靜態(tài)的編碼樹構(gòu)造方案,要求在實(shí)際編碼之前統(tǒng)計(jì)被編碼對(duì)象中符號(hào)出現(xiàn)的幾率,并據(jù)此進(jìn)行編碼樹的構(gòu)造。所以應(yīng)用此方案時(shí)必須對(duì)輸入符號(hào)流進(jìn)行兩遍掃描,而在大多數(shù)多媒體應(yīng)用中數(shù)據(jù)分布的先前統(tǒng)計(jì)數(shù)據(jù)是不可行的。
- 另外,靜態(tài)編碼樹構(gòu)造方案不能對(duì)符號(hào)流的局部統(tǒng)計(jì)規(guī)律變化做出反應(yīng),因?yàn)樗鼜氖贾两K都使用完全不變的編碼樹。而自適應(yīng)Huffman編碼不需要事先構(gòu)造Huffman樹,而是隨著編碼的進(jìn)行,逐步構(gòu)造Huffman樹。同時(shí),這種編碼方案對(duì)符號(hào)的統(tǒng)計(jì)也動(dòng)態(tài)進(jìn)行,隨著程序的運(yùn)行,同一個(gè)符號(hào)的編碼可能發(fā)生改變(變得更長(zhǎng)或更短)。
- 再者就靜態(tài)編碼在儲(chǔ)存或傳輸Huffman編碼結(jié)果之前,還必須先儲(chǔ)存或傳輸Huffman編碼樹,自適應(yīng)霍夫曼編碼則不需要,這大大節(jié)省了內(nèi)存開銷。
b) Assume that the Adaptive Huffman Coding is used to code an information source S with a vocabulary of four letters (a, b, c, d). Before any transmission, the initial coding is a = 00, b =01, c = 10, d = 11. As in the example illustrated in the following figure, a special symbol NEW will be sent before any letter if it is to be sent the first time.
Adaptive Huffman Tree e after sending letters aabb
After that, the additional bitstream received by the decoder for the next few letters is 01010010101.
- What are the additional letters received?
- Draw the adaptive Huffman trees after each of the additional letters is received.
答:
接收到的后續(xù)的幾個(gè)字母分別是:b(01),a(01),c(00 10),c(101)。
從圖定位到01為b,然后b的權(quán)值+1,此時(shí)b的節(jié)點(diǎn)權(quán)值變?yōu)?>a(2),b與a交換位置。
從上圖定位到01為a,然后a權(quán)值+1,此時(shí)a的節(jié)點(diǎn)權(quán)值變?yōu)?=b(3),樹的節(jié)點(diǎn)不做變動(dòng)。
根據(jù)上圖定位到00是new,意味著有新字符的加入,然后根據(jù)下面的10知道新加入的字符是c,然后用包含c和new的子樹替換舊的new節(jié)點(diǎn),然后將a的父節(jié)點(diǎn)的權(quán)值+1變?yōu)?>b(3),與b交換位置。
根據(jù)上圖定位到c,然后將相應(yīng)的節(jié)點(diǎn)權(quán)值分別+1,發(fā)現(xiàn)沒有需要置換的節(jié)點(diǎn)和子樹。
7. LZW
Consider the dictionary-based LZW compression algorithm. Suppose the alphabet is the set of symbols {0,1}. Show the dictionary (symbol sets plus associated codes) and output for LZW compression of the input 0110011.
答:
對(duì)于字符串0110011。初始字典為{0, 1}。
步驟 | 前綴 | 后綴 | 詞 | 存在對(duì)應(yīng)碼 | 輸出 | 碼 |
---|---|---|---|---|---|---|
1 | 0 | (, 0) | ||||
2 | 0 | 1 | (0, 1) | no | 0 | 2 |
3 | 1 | 1 | (1, 1) | no | 1 | 3 |
4 | 1 | 0 | (1, 0) | no | 1 | 4 |
5 | 0 0 (0, 0) no 0 5 | |||||
6 | 0 | 1 | (0, 1) | yes | ||
7 | 2 | 1 | (2, 1) | no | 2 | 6 |
8 | 1 | 1 | (1, 1) | yes |
輸出:0,1,1,0,2,1。對(duì)應(yīng)生成的碼表:
2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|
(0, 1) | (1, 1) | (1, 0) | (0, 0) | (2, 1) |
8. LZW
Suppose we have a small 8-bit grayscale image, with all pixels equal to the same pixel value, say 113. Consider the performance of an LZW compression scheme. First initialize codes in the dictionary with pixel values, 0…255. Use 9-bit codes.
For a 4×4 uniform image made of pixel values which are all 113, how many bits will LZW (or WINZIP) use for a compressed version of the image? Explain in detail, using an LZW table. What is the compression ratio?
Hint: recall that the LZW coding algorithm is
答:
Thus we end up with 5 codes output (at 9 bits each), as opposed to 16 uncompressed 8-bit pixels. Therefore the compression ratio is 128/45 = 2.8444.
9. Huffman coding and Arithmetic coding
Suppose we wish to transmit the 10-character string “MULTIMEDIA”. The characters in the string are chosen from a finite alphabet of 8 characters.
(a) What are the probabilities of each character in the source string?
(b) Compute the entropy of the source string.
(c) If the source string is encoded using Huffman coding, draw the encoding tree and compute the average number of bits needed.
(d) If the source string MULTIMEDIA is now encoded using Arithmetic coding, what is the codeword in fractional decimal representation? How many bits are needed for coding in binary format? How does this compare to the entropy?
答:
(a)
P(A)=P(D)=P(E)=P(L)=P(T)=P(U)=1/10, P(I)=P(M)=1/5.
(b)
(c) Huffman tree:
average bitrate=(1*2bit+5*3bit+2*4bit)/8=3.125bit
(d)
初始化:
A:[0,0.1), D:[0.1,0.2), E:[0.2,0.3), I:[0.3,0.5), L:[0.5,0.6), M:[0.6,0.8), T:[0.8,0.9), U:[0.9,1).
M:
A:[0.6,0.62), D:[0.62,0.64), E:[0.64,0.66), I:[0.66,0.7), L:[0.7,0.72), M:[0.72,0.76), T:[0.76,0.78), U:[0.78,0.8).
U:
A:[0.78,0.782), D:[0.782,0.784), E:[0.784,0.786), I:[0.786,0.79), L:[0.79,0.792), M:[0.792,0.796), T:[0.796,0.798), U:[0.798,0.8).
L:
A:[0.79,0.7902), D:[0.7902,0.7904), E:[0.7904,0.7906), I:[0.7906,0.791), L:[0.791,0.7912), M:[0.7912,0.7916), T:[0.7916,0.7918), U:[0.7918,0.792).
T:
A:[0.7916,0.79162), D:[0.79162,0.79164), E:[0.79164,0.79166), I:[0.79166,0.7917), L:[0.7917,0.79172), M:[0.79172,0.79176), T:[0.79176,0.79178), U:[0.79178,0.7918).
I:
A:[0.79166,0.791664), D:[0.791664,0.791668), E:[0.791668,0.791672), I:[0.791672,0.79168), L:[0.79168,0.791684), M:[0.791684,0.791692), T:[0.791692,0.791696), U:[0.791696,0.7917).
M:
A:[0.791684,0.7916848), D:[0.7916848,0.7916856), E:[0.7916856,0.7916864), I:[0.7916864,0.791688), L:[0.791688,0.7916888), M:[0.7916888,0.7916904), T:[0.7916904,0.7916912), U:[0.7916912,0.791692).
E:
A:[0.7916856,0.79168568), D:[0.79168568,0.79168576), E:[0.79168576,0.79168584), I:[0.79168584,0.791686), L:[0.791686,0.79168608), M:[0.79168608,0.79168624), T:[0.79168624,0.79168632), U:[0.79168632,0.7916864).
D:
A:[0.79168568,0.791685688), D:[0.791685688,0.791685696), E:[0.791685696,0.791685704), I:[0.791685704,0.79168572), L:[0.79168572,0.791685728), M:[0.791685728,0.791685744), T:[0.791685744,0.791685752), U:[0.791685752,0.79168576).
I:
A:[0.791685704,0.7916857056), D:[0.7916857056,0.7916857072), E:[0.7916857072,0.7916857088), I:[0.7916857088,0.791685712), L:[0.791685712,0.7916857136), M:[0.7916857136,0.7916857168), T:[0.7916857168,0.7916857184), U:[0.7916857184,0.79168572).
A:
A:[0.791685704,0.79168570416), D:[0.79168570416,0.79168570432), E:[0.79168570432,0.79168570448), I:[0.79168570448,0.7916857048), L:[0.7916857048,0.79168570496), M:[0.79168570496,0.79168570528), T:[0.79168570528,0.79168570544), U:[0.79168570544,0.7916857056).
最終的目標(biāo)區(qū)間為:[0.791685704,0.7916857056),我們?cè)谶@個(gè)區(qū)間內(nèi),任意選一個(gè)小數(shù),便可以作為最終的編碼小數(shù)。但是計(jì)算機(jī)只能識(shí)別0和1,所以我們?cè)賹⑿?shù)轉(zhuǎn)成二進(jìn)制。我們的訴求是進(jìn)行最短壓縮,所以我們從[0.791685704,0.7916857056)選一個(gè)二進(jìn)制表示最短的小數(shù)。這里我們選定0.791685705073178,二進(jìn)制為:0.110010101010101111101010000101,去掉整數(shù)位0以及小數(shù)點(diǎn)后,最終的二進(jìn)制編碼為110010101010101111101010000101,長(zhǎng)度為30位,平均比特率為3,十分接近熵(2.9219)。
10. Huffman coding and Arithmetic coding
Suppose the alphabet is [A;B;C], and the known probability distribution is PA = 0.5; PB =0.4; PC = 0.1. For simplicity, let’s also assume that both encoder and decoder know that the length of the messages is always 3, so there is no need for a terminator. How many bits are needed to encode the message BBB by Huffman coding? How many bits are needed to encode the message BBB by arithmetic coding?
Huffman tree:
BBB需要6bit。
Arithmetic coding:
初始化:A:[0,0.5), B:[0.5,0.9), C:[0.9,1).
B:A:[0.5,0.7), B:[0.7,0.86), C:[0.86,0.9).
B:A:[0.7,0.78), B:[0.78,0.844), C:[0.844,0.86).
B:A:[0.78,0.812), B:[0.812,0.8376), C:[0.8376,0.844).
最終的目標(biāo)區(qū)間為[0.78,0.844),在這個(gè)區(qū)間的任意一個(gè)小數(shù)都能作為BBB的編碼,我們的訴求是進(jìn)行最短壓縮,所以我們從[0.78,0.844)選一個(gè)二進(jìn)制表示最短的小數(shù)。這里我們選定0.8125,二進(jìn)制為:0.1101,去掉整數(shù)位0以及小數(shù)點(diǎn)后,最終的二進(jìn)制編碼為1101,長(zhǎng)度為4bit,比哈夫曼編碼少2bit。
11. Arithmetic coding
Assume there is an information source with four characters and their frequencies as follows: A:(10%), B:(40%), C:(20%), and D:(30%). When an encoded message 0.514 is received, what the original string (3 characters) should be?
答:
初始化:
A:[0,0.1), B:[0.1,0.5), C:[0.5,0.7), D:[0.7,1).
因?yàn)?.514在E的區(qū)間,所以E是第一個(gè)字符,再對(duì)E的區(qū)間按比例劃分:
A:[0.5,0.52), B:[0.52,0.6), C:[0.6,0.64), D:[0.64,0.7).
因?yàn)?.514在A的區(qū)間,所以A是第二個(gè)字符,再對(duì)A的區(qū)間按比例劃分:
A:[0.5,0.502), B:[0.502,0.51), C:[0.51,0.514), D:[0.514,0.52).
因?yàn)?.514在D的區(qū)間,所以D是第三個(gè)字符。故原字符串為EAD。
12. 查詢
假設(shè)一個(gè)圖像檢索系統(tǒng)數(shù)據(jù)庫(kù)中有10個(gè)圖像,其中包含“貓”的圖像有5個(gè)。對(duì)于一個(gè)輸入“貓”的查詢圖像Q,圖像檢索模型的目的是檢索到數(shù)據(jù)庫(kù)中所有的5個(gè)“貓”的圖像。假設(shè)針對(duì)查詢圖像Q,小李設(shè)計(jì)了模型M1返回的10個(gè)圖像的排序?yàn)閧+,+,-,+,+,-,-,-,-,+}, 小張?jiān)O(shè)計(jì)了模型M2返回的10個(gè)圖像的排序?yàn)閧+,-,+,+,+,+,-,-,-,-},[注:+表示是“貓”的圖片,-表示不是“貓”的圖片]。試畫出兩位同學(xué)設(shè)計(jì)的檢索模型的準(zhǔn)確率-召回率曲線(Precision-Recall Curve),并判斷哪位同學(xué)設(shè)計(jì)的模型更準(zhǔn)確一些并說明原因。
[注: 準(zhǔn)確率與召回率的定義分別為:Precision = #relevant / #returned; Recall = #relevant / #total relevant]
M2的PR曲線完全包住了M1的PR曲線,則可斷言M2的性能優(yōu)于M1。
13. 查詢
分別用“√ ”和“×”代表與查詢相關(guān)和不相關(guān)的文檔:
(a) 假設(shè)針對(duì)查詢1,有5個(gè)相關(guān)的文檔,搜索引擎檢索對(duì)查詢1的排序(Ranking #1)結(jié)果為√×√××√××√√,求其平均準(zhǔn)確率(Average Precision);
(b) 假設(shè)針對(duì)查詢2,有3個(gè)相關(guān)的文檔,搜索引擎對(duì)查詢1的排序(Ranking #2)結(jié)果為×√××√×√×××,求其平均準(zhǔn)確率(Average Precision);
(c) 求該搜索引擎對(duì)兩次查詢的均值平均準(zhǔn)確率(Mean Average Precision)。
(a)
查詢序號(hào) | 準(zhǔn)確率(P) |
---|---|
1 | 1 |
2 | 1/2 |
3 | 2/3 |
4 | 1/2 |
5 | 2/5 |
6 | 1/2 |
7 | 3/7 |
8 | 3/8 |
9 | 4/9 |
10 | 1/2 |
(b)
查詢序號(hào) | 準(zhǔn)確率(P) |
---|---|
1 | 0 |
2 | 1/2 |
3 | 1/3 |
4 | 1/4 |
5 | 2/5 |
6 | 1/3 |
7 | 3/7 |
8 | 3/8 |
9 | 3/9 |
10 | 3/10 |
(c)
Mean Average Precision=1/2 (Average Precision 1+Average Precision 2)=0.4284.
14. 傅立葉變換
傅立葉變換是對(duì)圖像的頻率域信息進(jìn)行操作的一種重要方法,在對(duì)圖像進(jìn)行傅立葉變換后,幅度譜和相位譜分別包含圖像的什么信息?如果舍棄相位譜會(huì)對(duì)圖像造成什么影響?如果將幅度譜均勻衰減到原幅度譜的一半,并和相位譜一起重建圖像,和原圖像相比會(huì)有什么變化?
答:
f(x)為連續(xù)可積函數(shù),其傅立葉變換定義為:
通常f(x)的傅立葉變換為復(fù)數(shù),可有通用表示式為:F(u)=R(u)+jI(u),R(u)、I(u)分別稱為傅立葉變換F(u)的實(shí)部和虛部。
其中,|F(u)|=√(R^2 (u)+I^2 (u))稱為 f(x)的幅度譜,Φ(u)=arctan (I(u))/R(u) 稱為 f(x)的相位譜。
圖像的幅度譜代表的是圖像各像素點(diǎn)的亮度信息,即該像素應(yīng)該顯示什么顏色。幅度譜雖然存儲(chǔ)了各個(gè)像素點(diǎn)的幅值信息,但是原像素點(diǎn)的位置已經(jīng)被打亂,所以僅憑幅度譜是沒有辦法重構(gòu)原圖像的。幅度譜的中心是低頻部分,越亮的地方代表的幅度越大。
相位譜記錄的是所有點(diǎn)的相位信息,它保留了圖像的邊緣以及整體結(jié)構(gòu)的信息,沒有它將無法從品頻譜還原出原圖像。
幅度譜只包含圖像的灰度信息,對(duì)圖像內(nèi)容起決定性作用的是圖像的相位譜。利用相位譜記錄的位置信息和幅度譜記錄的亮度信息,就可以用雙譜重構(gòu)的方法恢復(fù)出原圖像。
舍棄相位譜,只用幅度譜逆傅里葉變換出來的圖像只有一個(gè)亮點(diǎn),無法恢復(fù)出原圖像。
如果將幅度譜均勻衰減到原幅度譜的一半,并和相位譜一起重建圖像,和原圖像相比,亮度為原圖像的一半。
15. 積分圖像
下圖展示了5×5大小的圖像像素矩陣。
(1)請(qǐng)用文字解釋積分圖像的定義,并說明其與傳統(tǒng)圖像表示的區(qū)別與特點(diǎn);
(2)計(jì)算該圖像中陰影3×3區(qū)域?qū)?yīng)的積分圖。
答:
對(duì)于一幅灰度的圖像,積分圖像中的任意一點(diǎn)(x,y)的值是指從圖像的左上角到這個(gè)點(diǎn)的所構(gòu)成的矩形區(qū)域內(nèi)所有的點(diǎn)的灰度值之和,表示如下:
傳統(tǒng)圖像的每個(gè)點(diǎn)的灰度值為該像素本身灰度i(x,y),而積分圖的每個(gè)點(diǎn)的灰度值為ii(x,y)。積分圖算法由Crow在1984年首次提出,是為了在多尺度透視投影中提高渲染速度。積分圖算法是一種快速計(jì)算圖像區(qū)域和以及圖像區(qū)域平方和的算法。它的核心思想就是對(duì)每一個(gè)圖像建立起自己的積分圖查找表,在圖像處理的階段就可以根據(jù)預(yù)先建立積分圖查找表直接查找從而實(shí)現(xiàn)對(duì)均值卷積的線性時(shí)間計(jì)算。做到了卷積執(zhí)行的時(shí)間與窗口大小無關(guān)。
該圖像中陰影3×3區(qū)域?qū)?yīng)的積分圖:文章來源:http://www.zghlxwxcb.cn/news/detail-759307.html
文章來源地址http://www.zghlxwxcb.cn/news/detail-759307.html
到了這里,關(guān)于《高級(jí)計(jì)算機(jī)視覺》期末樣題匯總的文章就介紹完了。如果您還想了解更多內(nèi)容,請(qǐng)?jiān)谟疑辖撬阉鱐OY模板網(wǎng)以前的文章或繼續(xù)瀏覽下面的相關(guān)文章,希望大家以后多多支持TOY模板網(wǎng)!