Problem description
For example, this is the page source returned for http://bbs.csdn.net/home:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html class='csdn-bbs'>
    <head>
    <script id="allmobilize" charset="utf-8" src="http://a.yunshipei.com/1327c36bdd7197e30fd9f4b48d1a5bcc/allmobilize.min.js"></script>
    <meta http-equiv="Cache-Control" content="no-transform" />
    <link rel="alternate" media="handheld" href="#" />
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>CSDN论坛首页-CSDN.NET-CSDN论坛-CSDN.NET-中国最大的IT技术社区</title>
    <link href="/assets/index-d4be409d127e37381f8dd32f3df7a29d.css" media="screen" rel="stylesheet" type="text/css" />
    <link href="//static.csdn.net/public/themes/default/css/btn.css" media="screen" rel="stylesheet" type="text/css" />
    <script src="/assets/application-99904a9de34bdfafccfcb75f0ffbe29d.js" type="text/javascript"></script>
    <link href="/assets/main-2b48a4310d11c3a55009510efa912fed.css" media="screen" rel="stylesheet" type="text/css" />
    <script src="http://counter.csdn.net/a/js/AreaCounter.js" type="text/javascript" charset="utf-8"></script>
    <link href="http://c.csdnimg.cn/public/favicon.ico" rel="SHORTCUT ICON">
    <link rel="stylesheet" href="http://static.csdn.net/public/common/toolbar/css/index.css">
    ... (about 1,300 more lines omitted)
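I only need to read as far as <title>CSDN论坛首页-CSDN.NET-CSDN论坛-CSDN.NET-中国最大的IT技术社区</title> or <script src="/assets/application-99904a9de34bdfafccfcb75f0ffbe29d.js" type="text/javascript"></script>, and then stop immediately without reading the remaining 1,300-plus lines. A complete fetch takes only about 0.5 seconds, but multiplied by ten thousand or a hundred thousand pages the total time grows enormously, which is too inefficient.
Note: I already know how to fetch the whole source, as shown here. What I need is a way to stop once a specified string or line count has been reached.
    Dim XmlHttp, Html
    XmlHttp = CreateObject("MSxml2.XMLHTTP")
    XmlHttp.Open("GET", "http://bbs.csdn.net/home", False)
    XmlHttp.Send()
    Html = XmlHttp.responseText
Could the experts here post the declarations and procedure that achieve this, along with any controls or other components required? The more detail the better (ideally the result comes back as a String). PS: The question is fairly niche, so the bounty is small; if someone posts a working solution, I can raise it to 100 points.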
Solutions
Solution 2:
Reference: How to read the response stream before the Http response completes
Solution 3:
Reference (essentially the same approach as the one above): Using .NET HttpClient to capture partial Responses
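Both references come down to the same idea: ask HttpClient to return as soon as the response headers arrive, then pull the body from the stream yourself and stop when you have seen enough. A minimal VB.NET sketch of that idea follows, assuming .NET Framework 4.5 or later. The module name, `ReadUntilAsync`, and the `marker` and `maxLines` parameters are illustrative names and do not come from the referenced articles.

```vbnet
Imports System
Imports System.IO
Imports System.Net.Http
Imports System.Text
Imports System.Threading.Tasks

Module ReadUntilMarker

    ' One shared client so connections can be reused across requests.
    Private ReadOnly Client As New HttpClient()

    ' Reads the body line by line and stops after the first line that
    ' contains `marker`, or after `maxLines` lines, whichever comes first.
    ' Returns everything read up to that point.
    Async Function ReadUntilAsync(url As String, marker As String, maxLines As Integer) As Task(Of String)
        ' ResponseHeadersRead: do not buffer the whole body before returning.
        Using response = Await Client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead)
            response.EnsureSuccessStatusCode()
            Using stream = Await response.Content.ReadAsStreamAsync()
                ' Assumes the page is UTF-8, as the sample page declares.
                Using reader As New StreamReader(stream, Encoding.UTF8)
                    Dim result As New StringBuilder()
                    For i = 1 To maxLines
                        Dim line = Await reader.ReadLineAsync()
                        If line Is Nothing Then Exit For ' end of body
                        result.AppendLine(line)
                        If line.Contains(marker) Then Exit For
                    Next
                    Return result.ToString()
                End Using
            End Using
        End Using
    End Function

    Sub Main()
        Dim head = ReadUntilAsync("http://bbs.csdn.net/home", "</title>", 50).GetAwaiter().GetResult()
        Console.WriteLine(head)
    End Sub

End Module
```

Disposing the response before the body is fully read abandons the rest of the download. How much is saved depends on how much the server had already sent and on whether the page really is split into lines; a page delivered as one long line would still be read in full by `ReadLineAsync`.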
Solution 4:
Quoting findcaiyzh's reply in post #2:
Reference (essentially the same approach as the one above): Using .NET HttpClient to capture partial Responses
Is this the snippet you mean?
    [TestMethod]
    public async Task HttpClientGetStreamTest()
    {
        string url = "http://west-wind.com/presentations/DotnetWebRequest/DotNetWebREquest.htm";
        int size = 1000;
        using (var httpclient = new HttpClient())
        {
            httpclient.DefaultRequestHeaders.Range = new RangeHeaderValue(0, size);
            var response = await httpclient.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);
            using (var stream = await response.Content.ReadAsStreamAsync())
            {
                var bytes = new byte[size];
                var bytesread = stream.Read(bytes, 0, bytes.Length);
                stream.Close();
            }
        }
    }
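Could you help me convert it to VB.NET? A conversion tool gave me the following, which is unusable:
    [TestMethodUnknown
    public
    Dim Task As async
    HttpClientGetStreamTest
    {
    Dim url As String = "http://west-wind.com/presentations/DotnetWebRequest/DotNetWebREquest.htm"
    Dim size As Integer = 1000
    Imports (
    Dim httpclient As var = New HttpClient
    Dim response As var = await
    Imports (
    Dim stream As var = await
    Dim bytesread As var = stream.Read(bytes, 0, bytes.Length)
    stream.Close
    Unknown
    Unknown
    Unknown
It would be even better if it actually runs. Please post the complete code and I will award extra points!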
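For reference, a hand translation of the C# snippet into VB.NET might look like the sketch below. It is written as a console module instead of a unit test, returns the bytes as a String as the original question asked, and assumes .NET Framework 4.5 or later with UTF-8 content. `GetPageStartAsync` is an illustrative name. Two details differ from the C# on purpose: the Range end is `size - 1` because HTTP byte ranges are inclusive, and the read is looped because a single `Read` call may return fewer bytes than requested.

```vbnet
Imports System
Imports System.Net.Http
Imports System.Net.Http.Headers
Imports System.Text
Imports System.Threading.Tasks

Module PartialDownload

    ' Fetches at most the first `size` bytes of `url` and returns them as text.
    Async Function GetPageStartAsync(url As String, size As Integer) As Task(Of String)
        Using client As New HttpClient()
            ' Ask the server for bytes 0..size-1 only. Servers are free to
            ' ignore the Range header and send the whole page anyway.
            client.DefaultRequestHeaders.Range = New RangeHeaderValue(0, size - 1)

            ' Return once the headers arrive instead of buffering the body.
            Using response = Await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead)
                Using stream = Await response.Content.ReadAsStreamAsync()
                    Dim buffer(size - 1) As Byte
                    Dim total As Integer = 0
                    ' Keep reading until the buffer is full or the body ends.
                    While total < size
                        Dim n = Await stream.ReadAsync(buffer, total, size - total)
                        If n = 0 Then Exit While
                        total += n
                    End While
                    ' Assumes UTF-8; a multi-byte character cut off at the
                    ' buffer boundary decodes as a replacement character.
                    Return Encoding.UTF8.GetString(buffer, 0, total)
                End Using
            End Using
        End Using
    End Function

    Sub Main()
        Dim html = GetPageStartAsync("http://bbs.csdn.net/home", 1000).GetAwaiter().GetResult()
        Console.WriteLine(html)
    End Sub

End Module
```

Because the function stops reading after `size` bytes regardless of what the server sends, it still limits the download when the Range header is ignored.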
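Solution 5:
Throughput matters too: ideally about 100 URLs per second, taking only the first few lines of each, of course. For comparison, the following method, which downloads the complete source, manages about 4 pages per second:
    Dim MyClient As Net.WebClient = New Net.WebClient
    Dim MyReader As New System.IO.StreamReader(MyClient.OpenRead("http://bbs.csdn.net/home"), System.Text.Encoding.UTF8)
    Dim MyWebCode As String = MyReader.ReadToEnd
    MyReader.Close()
whereas this one manages 2 or 3 per second:
    Dim XmlHttp, Html
    XmlHttp = CreateObject("MSxml2.XMLHTTP")
    XmlHttp.Open("GET", "http://bbs.csdn.net/home", False)
    XmlHttp.Send()
    Html = XmlHttp.responseText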
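Both snippets above fetch one page at a time, so each request waits for the previous one to finish and the rate is bounded by network latency. One common way to raise throughput, sketched below under assumptions, is to issue many partial reads concurrently and cap how many are in flight. `ReadFirstLinesAsync`, `FetchThrottledAsync`, `FetchAllAsync`, and the limit of 20 are illustrative choices, not measured settings, and nothing here guarantees 100 URLs per second.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
Imports System.Net
Imports System.Net.Http
Imports System.Text
Imports System.Threading
Imports System.Threading.Tasks

Module ParallelFetch

    Private ReadOnly Client As New HttpClient()

    ' Reads only the first `lineCount` lines of the page at `url`.
    Async Function ReadFirstLinesAsync(url As String, lineCount As Integer) As Task(Of String)
        Using response = Await Client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead)
            Using stream = Await response.Content.ReadAsStreamAsync()
                Using reader As New StreamReader(stream, Encoding.UTF8)
                    Dim sb As New StringBuilder()
                    For i = 1 To lineCount
                        Dim line = Await reader.ReadLineAsync()
                        If line Is Nothing Then Exit For
                        sb.AppendLine(line)
                    Next
                    Return sb.ToString()
                End Using
            End Using
        End Using
    End Function

    ' Waits for a free slot, fetches, and always gives the slot back.
    Async Function FetchThrottledAsync(url As String, gate As SemaphoreSlim) As Task(Of String)
        Await gate.WaitAsync()
        Try
            Return Await ReadFirstLinesAsync(url, 10)
        Catch ex As HttpRequestException
            Return String.Empty ' a failed page yields an empty result
        Finally
            gate.Release()
        End Try
    End Function

    ' Starts all fetches, with at most `maxParallel` running at once.
    Async Function FetchAllAsync(urls As IEnumerable(Of String), maxParallel As Integer) As Task(Of String())
        Using gate As New SemaphoreSlim(maxParallel)
            Dim tasks = urls.Select(Function(u) FetchThrottledAsync(u, gate)).ToList()
            Return Await Task.WhenAll(tasks)
        End Using
    End Function

    Sub Main()
        ' On .NET Framework the per-host connection limit defaults to 2
        ' for client applications, which would serialize most requests.
        ServicePointManager.DefaultConnectionLimit = 20

        Dim urls = Enumerable.Repeat("http://bbs.csdn.net/home", 20)
        Dim pages = FetchAllAsync(urls, 20).GetAwaiter().GetResult()
        Console.WriteLine("{0} pages fetched", pages.Length)
    End Sub

End Module
```

The achievable rate still depends on the remote servers and the network, and sending many simultaneous requests to a single site may get the client throttled or blocked.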