Problem description
Suppose I have fetched a page's HTML from its URL, and I now want to extract the value 6.89 from between the tags of <span class="value" id="sku-discount-price" itemprop="price">6.89</span>. How should I write this in Java, and what is the most reasonable approach? I have written a first attempt myself and would appreciate feedback. If there is a better solution, please share it.

The relevant part of the page:

    <div class="inf-pnl-price-detail">
      <dl>
        <dt>Price:</dt>
        <dd>
          <div class="price price-highlight">
            <del class="original-price">US $ <span class="" id="sku-price">7.66</span> <span class="separator">/</span> <span class="unit">piece</span> </del>
          </div>
        </dd>
        <dt>Discount Price:</dt>
        <dd>
          <div class="price price-highlight">
            <span class="currency" itemprop="priceCurrency" content="USD">US $</span>
            <span class="value" id="sku-discount-price" itemprop="price">6.89</span>
            <span class="separator">/</span><span class="unit"> piece </span>
            <span class="time-left">(7 days left )</span>
          </div>
        </dd>
      </dl>
    </div>

The code I tried:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class TestUrl {
        public static void main(String[] args) {
            Long l1 = System.currentTimeMillis();
            String string = "http://www.aliexpress.com/item/10pcs-lot-New-arrival-Hot-sale-fashion-hoomia-jonadab-magicpencil-magic-pencil-earphones-in-earfree-shipping/848760252.html";
            String str3 = "";
            String str[] = new String[750];
            String str2 = "";
            int i = 0;
            try {
                URL readSource = new URL(string);
                BufferedReader input = new BufferedReader(new InputStreamReader(readSource.openStream()));
                // Skip a fixed number of characters to get close to the price block
                input.skip(15555);
                // Read the rest of the page line by line into the array
                while ((str2 = input.readLine()) != null) {
                    str[i] = str2;
                    i++;
                }
                // Join the lines expected to contain the price
                str3 = str[1] + str[2] + str[3] + str[4] + str[5] + str[6] + str[7];
                System.out.println("1====================>" + str3);
            } catch (Exception e) {
                e.printStackTrace();
            }
            // Strip everything up to the price tag, then everything after the closing tag
            String tempStr2 = str3.replaceAll(".*itemprop=\"price\">", "");
            String tempStr3 = tempStr2.replaceAll("</span>.*", "");
            System.out.println("tempStr2:" + tempStr3);
            Long l2 = System.currentTimeMillis();
            System.out.println("time:" + (l2 - l1));
        }
    }
Solution
Use jsoup directly and pick the element with its CSS selector syntax.
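A minimal sketch of what the jsoup approach could look like, assuming the jsoup library is on the classpath. The class name `JsoupPriceExample` and the helper `extractDiscountPrice` are made up for illustration, and the HTML string is a shortened copy of the fragment from the question rather than the live page.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupPriceExample {

    // Hypothetical helper: returns the text of the element whose id is
    // "sku-discount-price", or null when no such element exists.
    public static String extractDiscountPrice(String html) {
        Document doc = Jsoup.parse(html);
        // "#sku-discount-price" is a CSS id selector
        Element price = doc.selectFirst("#sku-discount-price");
        return price == null ? null : price.text().trim();
    }

    public static void main(String[] args) {
        // Shortened copy of the markup from the question
        String html = "<div class=\"price price-highlight\">"
                + "<span class=\"currency\" itemprop=\"priceCurrency\" content=\"USD\">US $</span> "
                + "<span class=\"value\" id=\"sku-discount-price\" itemprop=\"price\">6.89</span>"
                + "</div>";
        System.out.println(extractDiscountPrice(html)); // prints 6.89
    }
}
```

To work on the live page instead of a string, `Jsoup.connect(url).get()` returns a `Document` that can be queried the same way. Selecting by id avoids having to skip a fixed number of characters or count lines, so it should keep working if the surrounding markup shifts.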
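Solution 2:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String content = "your page content";
    // Match the span that carries sku-discount-price and capture the number inside it
    Pattern p = Pattern.compile("<span[^>]+sku-discount-price[^>]+>([0-9.]+)</span>");
    Matcher m = p.matcher(content);
    if (m.find()) {
        String str = m.group(1); // the result you want
        System.out.println(str);
    }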
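As a usage check, the regex idea can be run against the fragment from the question. The sketch below uses only the standard library. The class name `RegexPriceExample` and the helper are hypothetical, and the pattern is a slightly stricter variant of the one above (an assumption, not the answerer's exact pattern): it anchors on the `id` attribute and tolerates whitespace around the number.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexPriceExample {

    // Matches a <span> whose attributes include id="sku-discount-price"
    // and captures the decimal number between the tags.
    private static final Pattern PRICE = Pattern.compile(
            "<span[^>]*\\bid=\"sku-discount-price\"[^>]*>\\s*([0-9]+(?:\\.[0-9]+)?)\\s*</span>");

    // Hypothetical helper: returns the captured price text, or empty if not found.
    public static Optional<String> extractDiscountPrice(String html) {
        Matcher m = PRICE.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Both price spans from the question; only the discount price should match
        String html = "<span class=\"\" id=\"sku-price\">7.66</span> "
                + "<span class=\"value\" id=\"sku-discount-price\" itemprop=\"price\">6.89</span>";
        System.out.println(extractDiscountPrice(html).orElse("not found")); // prints 6.89
    }
}
```

A regex like this depends on the exact shape of the markup (for example double-quoted attributes), so it is best suited to a single known page. An HTML parser is usually the safer choice when the page layout may change.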