How to remove HTML tags in Java

2 years ago · Author: 10000guo · Category: Toy Blog

This article explains how to remove HTML tags in Java. Corrections and suggestions are welcome if you spot mistakes or cases it does not cover.

Ways to remove HTML tags in Java:
1. Use plain regular expressions;
2. Use javax.swing.text.html.HTMLEditorKit;
3. Use the Jsoup library, among others.

I. Plain regular expressions

import java.util.regex.Matcher; 
import java.util.regex.Pattern; 
 
public class HTMLSpirit{ 
    public static String delHTMLTag(String htmlStr){ 
        String regEx_script="<script[^>]*?>[\\s\\S]*?<\\/script>"; // regex for <script> blocks, including their content
        String regEx_style="<style[^>]*?>[\\s\\S]*?<\\/style>"; // regex for <style> blocks, including their content
        String regEx_html="<[^>]+>"; // regex for any remaining HTML tag

        Pattern p_script=Pattern.compile(regEx_script,Pattern.CASE_INSENSITIVE);
        Matcher m_script=p_script.matcher(htmlStr);
        htmlStr=m_script.replaceAll(""); // strip script blocks

        Pattern p_style=Pattern.compile(regEx_style,Pattern.CASE_INSENSITIVE);
        Matcher m_style=p_style.matcher(htmlStr);
        htmlStr=m_style.replaceAll(""); // strip style blocks

        Pattern p_html=Pattern.compile(regEx_html,Pattern.CASE_INSENSITIVE);
        Matcher m_html=p_html.matcher(htmlStr);
        htmlStr=m_html.replaceAll(""); // strip the remaining tags

        return htmlStr.trim(); // return the plain text
    } 
}
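
The regex approach can be tried on an in-memory string as follows. This is a minimal sketch, not the author's code: the class name RegexStripSketch and the sample HTML are made up for illustration, and the three patterns mirror the ones in the method above.

```java
import java.util.regex.Pattern;

final class RegexStripSketch {
    // Compiled once and reused; same three patterns as the article's method.
    private static final Pattern SCRIPT =
            Pattern.compile("<script[^>]*?>[\\s\\S]*?</script>", Pattern.CASE_INSENSITIVE);
    private static final Pattern STYLE =
            Pattern.compile("<style[^>]*?>[\\s\\S]*?</style>", Pattern.CASE_INSENSITIVE);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    static String strip(String html) {
        String s = SCRIPT.matcher(html).replaceAll(""); // drop script blocks with their content
        s = STYLE.matcher(s).replaceAll("");            // drop style blocks with their content
        s = TAG.matcher(s).replaceAll("");              // drop every remaining tag
        return s.trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p{color:red}</style></head>"
                + "<body><p>Hello, <b>world</b> &amp; friends</p>"
                + "<script>alert('x')</script></body></html>";
        // prints: Hello, world &amp; friends
        System.out.println(strip(html));
    }
}
```

Two limits of this approach show up quickly: character entities such as &amp; are left undecoded, and a ">" inside an attribute value ends the tag match early, so part of the attribute leaks into the output. A real HTML parser avoids both.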

II. Using javax.swing.text.html.HTMLEditorKit

import java.io.IOException;
import java.io.FileReader;
import java.io.Reader;
import java.util.List;
import java.util.ArrayList;
 
import javax.swing.text.html.parser.ParserDelegator;
import javax.swing.text.html.HTMLEditorKit.ParserCallback;
import javax.swing.text.html.HTML.Tag;
import javax.swing.text.MutableAttributeSet;
 
public class HTMLUtils {
  private HTMLUtils() {}
 
  public static List<String> extractText(Reader reader) throws IOException {
    final ArrayList<String> list = new ArrayList<String>();
 
    ParserDelegator parserDelegator = new ParserDelegator();
    ParserCallback parserCallback = new ParserCallback() {
      public void handleText(final char[] data, final int pos) {
        list.add(new String(data));
      }
      public void handleStartTag(Tag tag, MutableAttributeSet attribute, int pos) { }
      public void handleEndTag(Tag t, final int pos) {  }
      public void handleSimpleTag(Tag t, MutableAttributeSet a, final int pos) { }
      public void handleComment(final char[] data, final int pos) { }
      public void handleError(final java.lang.String errMsg, final int pos) { }
    };
    parserDelegator.parse(reader, parserCallback, true);
    return list;
  }
 
  public final static void main(String[] args) throws Exception{
    // try-with-resources closes the file even if parsing fails
    try (FileReader reader = new FileReader("java-new.html")) {
      List<String> lines = HTMLUtils.extractText(reader);
      for (String line : lines) {
        System.out.println(line);
      }
    }
  }
}
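
The same parser can also be fed a string instead of a file. The sketch below assumes the HTML is already in memory. The class name SwingStripSketch and the choice to join text chunks with a single space are illustrative, not part of the original. Only handleText is overridden, because the other ParserCallback methods are no-ops by default.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

import javax.swing.text.html.HTMLEditorKit.ParserCallback;
import javax.swing.text.html.parser.ParserDelegator;

final class SwingStripSketch {
    static String strip(String html) {
        final StringBuilder out = new StringBuilder();
        ParserCallback callback = new ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Separate text chunks with one space (an arbitrary choice for this sketch).
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(data);
            }
        };
        try {
            // The third argument (ignoreCharSet) is true, as in the article's example.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            // A StringReader should not fail, but parse() declares IOException.
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("<html><body><p>Hello <b>world</b></p></body></html>"));
    }
}
```

How the parser splits text into chunks and treats whitespace is up to the Swing implementation, so the exact spacing of the result may differ from what a browser would show.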

III. Using the Jsoup library

import java.io.IOException;
import java.io.FileReader;
import java.io.Reader;
import java.io.BufferedReader;
import org.jsoup.Jsoup;
 
public class HTMLUtils {
  private HTMLUtils() {}
 
  public static String extractText(Reader reader) throws IOException {
    StringBuilder sb = new StringBuilder();
    BufferedReader br = new BufferedReader(reader);
    String line;
    while ( (line=br.readLine()) != null) {
      // readLine() drops the line terminator; re-add it so words on adjacent lines do not merge
      sb.append(line).append('\n');
    }
    String textOnly = Jsoup.parse(sb.toString()).text();
    return textOnly;
  }
 
  public final static void main(String[] args) throws Exception{
    // try-with-resources closes the file once the text has been extracted
    try (FileReader reader = new FileReader
          ("C:/RealHowTo/topics/java-language.html")) {
      System.out.println(HTMLUtils.extractText(reader));
    }
  }
}

IV. Using Apache Tika

import java.io.FileInputStream;
import java.io.InputStream;
 
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.WriteOutContentHandler;
 
public class ParseHTMLWithTika {
  public static void main(String args[]) throws Exception {
 
    InputStream is = null;
    try {
 
      is = new FileInputStream("C:/Temp/java-x.html");
      // the constructor argument is the write limit: the maximum number of characters to collect
      WriteOutContentHandler contenthandler = new WriteOutContentHandler(100000000);
      Metadata metadata = new Metadata();
      Parser parser = new AutoDetectParser();
      parser.parse(is, contenthandler, metadata, new ParseContext());
      System.out.println(contenthandler.toString());
    }
    catch (Exception e) {
      e.printStackTrace();
    }
    finally {
        if (is != null) is.close();
    }
  }
}

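Note a small pitfall I ran into while testing: the WriteOutContentHandler constructor argument is the maximum number of characters to collect. If it is not set, the default limit is 100,000 characters, and parsing a larger document throws an exception. Passing -1 disables the limit.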

For the specific jar dependencies, look up the coordinates in the central repository:

https://search.maven.org/ and https://mvnrepository.com/

Utility class

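import java.io.File;
import java.io.FileReader;
import java.io.IOException;

import org.apache.commons.io.IOUtils; // assumption: IOUtils is the Apache Commons IO class

public class ResourceUtil {
    /**
     * Reads a file under the resources folder into a String, resolved relative to the given class.
     *
     * @param currentClass e.g. getClass() or ResourceUtil.class
     * @param resourcePath e.g. /data/json/xxx.json (relative to the resources folder)
     */
    public static String resource2String(Class<?> currentClass, String resourcePath) throws IOException {
        // try-with-resources closes the reader once the content has been read
        try (FileReader reader = new FileReader(new File(currentClass.getResource(resourcePath).getFile()))) {
            return IOUtils.toString(reader);
        }
    }
 
}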
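
Resolving a resource through getResource(...).getFile() works while the resources sit on disk, but it typically fails once they are packaged inside a JAR, because the URL no longer points at a real file. A possible alternative that uses only the standard library is sketched below. It assumes Java 9 or later (for InputStream.readAllBytes) and UTF-8 content, and the class name ResourceReadSketch is made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ResourceReadSketch {
    // Reads a classpath resource as UTF-8 text; works for files on disk and inside JARs.
    static String read(Class<?> owner, String resourcePath) {
        try (InputStream in = owner.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found: " + resourcePath);
            }
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        try {
            // The path is the placeholder from the Javadoc above, not a real file.
            System.out.println(read(ResourceReadSketch.class, "/data/json/xxx.json"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```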
