
Converting HTML to PDF in Java with Jsoup and iText

Posted 2 years ago. Author: 李楷杰. Category: Toy Blog.

This article shows how to convert HTML to PDF in Java using Jsoup and iText. I hope it is a useful reference. If you find mistakes or cases I have not fully considered, please point them out.

1. Add the required dependencies to build.gradle:

implementation group: 'com.itextpdf', name: 'itextpdf', version: '5.5.13'
implementation group: 'com.itextpdf.tool', name: 'xmlworker', version: '5.5.13'
implementation group: 'org.jsoup', name: 'jsoup', version: '1.15.3'

2. Create a utility class that implements the conversion method:

    /**
     * Converts an HTML file to a PDF file.
     *
     * @param oldFilePath path of the source HTML file
     * @param newFilePath path of the PDF file to write
     */
    public void htmlToPdf(String oldFilePath, String newFilePath) throws IOException, com.itextpdf.text.DocumentException {
        Document doc = Jsoup.parse(new File(oldFilePath), "UTF-8");
        // Have jsoup normalize the markup to XML syntax so that every tag is closed
        doc.outputSettings().syntax(org.jsoup.nodes.Document.OutputSettings.Syntax.xml);
        doc.outputSettings().escapeMode(Entities.EscapeMode.xhtml);

        com.itextpdf.text.Document document = new com.itextpdf.text.Document(PageSize.A4, 36, 36, 36, 36);
        PdfWriter pdfWriter = PdfWriter.getInstance(document, new FileOutputStream(newFilePath));
        document.open();
        // Replace the default <img> processor with one that supports base64 images (step 3)
        final TagProcessorFactory tagProcessorFactory = Tags.getHtmlTagProcessorFactory();
        tagProcessorFactory.removeProcessor(HTML.Tag.IMG);
        tagProcessorFactory.addProcessor(new ImageTagRefreshFilter(), HTML.Tag.IMG);

        // Set up a font provider that can render Chinese (step 4).
        // asianFontRefreshFilter is assumed to be a field of this class
        // holding the AsianFontRefreshFilter bean.
        final CssFilesImpl cssFiles = new CssFilesImpl();
        cssFiles.add(XMLWorkerHelper.getInstance().getDefaultCSS());
        final StyleAttrCSSResolver cssResolver = new StyleAttrCSSResolver(cssFiles);
        final HtmlPipelineContext hpc = new HtmlPipelineContext(new CssAppliersImpl(asianFontRefreshFilter));
        hpc.setAcceptUnknown(true).autoBookmark(true).setTagFactory(tagProcessorFactory);
        final HtmlPipeline htmlPipeline = new HtmlPipeline(hpc, new PdfWriterPipeline(document, pdfWriter));
        final Pipeline<?> pipeline = new CssResolverPipeline(cssResolver, htmlPipeline);

        final XMLWorker worker = new XMLWorker(pipeline, true);
        final Charset charset = StandardCharsets.UTF_8;
        final XMLParser xmlParser = new XMLParser(true, worker, charset);
        // Encode the HTML with the same charset the parser is told to use
        InputStream inputStream = new ByteArrayInputStream(doc.html().getBytes(charset));
        xmlParser.parse(inputStream, charset);
//        XMLWorkerHelper.getInstance().parseXHtml(pdfWriter, document, inputStream, Charset.forName("UTF-8"));
        document.close();
    }
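The method above declares UTF-8 as the parser charset, so the HTML string should be turned into bytes with that same charset rather than the platform default. The following is a minimal sketch using only the JDK (no iText or jsoup); the class name `CharsetRoundTrip` and its `roundTrip` helper are hypothetical names made up for this illustration. It shows that Chinese text survives when the encoding and decoding charsets match, and is corrupted when they differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Encodes text with one charset and decodes the resulting bytes with another.
    public static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the word for "Chinese", written with escapes so the
        // source file encoding does not matter.
        String html = "<p>\u4e2d\u6587</p>";

        String same = roundTrip(html, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mixed = roundTrip(html, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println(html.equals(same));   // true: charsets match
        System.out.println(html.equals(mixed));  // false: bytes were misread
    }
}
```

On a machine whose default charset is not UTF-8, calling `getBytes()` without an argument corresponds to the mismatched case, which is one possible cause of garbled Chinese in the generated PDF.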

3. Base64 image filter class:

import com.itextpdf.text.Chunk;
import com.itextpdf.text.Element;
import com.itextpdf.text.Image;
import com.itextpdf.text.pdf.codec.Base64;
import com.itextpdf.tool.xml.NoCustomContextException;
import com.itextpdf.tool.xml.Tag;
import com.itextpdf.tool.xml.WorkerContext;
import com.itextpdf.tool.xml.exceptions.RuntimeWorkerException;
import com.itextpdf.tool.xml.html.HTML;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ImageTagRefreshFilter extends com.itextpdf.tool.xml.html.Image {

    /**
     * html to pdf, base64 image support.
     * */

    @Override
    public List<Element> end(final WorkerContext ctx, final Tag tag, final List<Element> currentContent) {
        final Map<String, String> attributes = tag.getAttributes();
        String src = attributes.get(HTML.Attribute.SRC);
        List<Element> elements = new ArrayList<Element>(1);
        if (null != src && src.length() > 0) {
            Image img = null;
            if (src.startsWith("data:image/")) {
                final String base64Data = src.substring(src.indexOf(",") + 1);
                try {
                    img = Image.getInstance(Base64.decode(base64Data));
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
                if (img != null) {
                    try {
                        final HtmlPipelineContext htmlPipelineContext = getHtmlPipelineContext(ctx);
                        elements.add(getCssAppliers().apply(new Chunk((com.itextpdf.text.Image) getCssAppliers().apply(img, tag, htmlPipelineContext), 0, 0, true), tag,
                                htmlPipelineContext));
                    } catch (NoCustomContextException e) {
                        throw new RuntimeWorkerException(e);
                    }
                }
            }

            if (img == null) {
                elements = super.end(ctx, tag, currentContent);
            }
        }
        return elements;
    }
}
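The key step in the filter above is cutting the Base64 payload out of a `data:image/...` URI and decoding it into image bytes. Below is a small standalone sketch of just that step. It swaps in the JDK's `java.util.Base64` in place of the iText `Base64` class used above, so it runs without iText; the class name `DataUriDecoder` and the method `decodeImageDataUri` are hypothetical, and the extra `;base64` check is an assumption added here, not something the original filter does.

```java
import java.util.Base64;

public class DataUriDecoder {

    // Returns the decoded bytes of a "data:image/...;base64,..." URI,
    // or null when src is not a Base64-encoded image data URI.
    public static byte[] decodeImageDataUri(String src) {
        if (src == null || !src.startsWith("data:image/")) {
            return null;
        }
        int comma = src.indexOf(',');
        // The header before the comma must end with ";base64".
        if (comma < 0 || !src.substring(0, comma).endsWith(";base64")) {
            return null;
        }
        // Everything after the first comma is the Base64 payload.
        return Base64.getDecoder().decode(src.substring(comma + 1));
    }

    public static void main(String[] args) {
        // "iVBORw0KGgo=" is the Base64 form of the 8-byte PNG file signature.
        byte[] bytes = decodeImageDataUri("data:image/png;base64,iVBORw0KGgo=");
        System.out.println(bytes.length); // 8
    }
}
```

Note that the basic JDK decoder rejects payloads containing line breaks; if the HTML wraps long Base64 strings, `Base64.getMimeDecoder()` may be a better fit.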

4. Font provider class. Windows users can find the font they need under C:\Windows\Fonts\. I use SimHei (黑體) here:

simhei.ttf

import com.itextpdf.text.Font;
import com.itextpdf.text.pdf.BaseFont;
import com.itextpdf.tool.xml.XMLWorkerFontProvider;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;


@Component
public class AsianFontRefreshFilter extends XMLWorkerFontProvider {
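    // Absolute path to the font file, e.g. C:/Windows/Fonts/simhei.ttf.
    // Assign it here or inject it with @Value; while it is unset, getFont falls back to the default font.
    private String fontPath;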

    @Override
    public Font getFont(String fontname, String encoding, float size, final int style) {
        try {
            // Load the font from its absolute path and embed it in the PDF
            BaseFont bfChinese = BaseFont.createFont(fontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
            return new Font(bfChinese, size, style);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return super.getFont(fontname, encoding, size, style);
    }
}
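Because `getFont` above catches every exception and falls back to the default font, a wrong or missing font path fails silently, and the fallback font may not be able to render Chinese. One way to surface the problem early is to validate the path once at startup. The following is a minimal sketch using only the JDK; the class `FontPathCheck` and the method `requireReadableFont` are hypothetical helpers for illustration, not part of iText or of the original code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FontPathCheck {

    // Returns the absolute path when the font file is a readable regular file,
    // and throws IllegalArgumentException otherwise.
    public static String requireReadableFont(String fontPath) {
        if (fontPath == null || fontPath.isEmpty()) {
            throw new IllegalArgumentException("font path is not set");
        }
        Path path = Paths.get(fontPath).toAbsolutePath();
        if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
            throw new IllegalArgumentException("font file is not readable: " + path);
        }
        return path.toString();
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for a real font such as simhei.ttf.
        Path tmp = Files.createTempFile("demo-font", ".ttf");
        try {
            System.out.println(requireReadableFont(tmp.toString()));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

In the font provider, such a check could run where `fontPath` is assigned, so that a misconfigured path shows up as an error at startup instead of as missing text in the PDF.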

The result looks like this:

HTML page preview:

[Image: HTML page preview]

PDF page preview:

[Image: PDF page preview]
