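I'm trying to perform a GET request to https://sede.educacion.gob.es/publiventa/catalogo.action?cod=E. With the cod=E parameter, the browser opens a menu under "Materias de educación", but when I perform the request from C# this menu is not loaded, and I need it. This is the code I use to read the HTML as a string (readHtml) so that I can parse it later with HtmlAgilityPack.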
private string readHtml(string urlAddress)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);
    request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0";
    request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    request.AutomaticDecompression = DecompressionMethods.GZip;
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    if (response.StatusCode == HttpStatusCode.OK)
    {
        Stream receiveStream = response.GetResponseStream();
        StreamReader readStream = null;
        if (response.CharacterSet == null)
        {
            readStream = new StreamReader(receiveStream);
        }
        else
        {
            readStream = new StreamReader(receiveStream, Encoding.GetEncoding(response.CharacterSet));
        }
        string data = readStream.ReadToEnd();
        response.Close();
        readStream.Close();
        return data;
    }
    return null;
}
Posted on 2018-03-16 02:28:52
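The Uri you posted (https://sede.educacion.gob.es/publiventa/catalogo.action?cod=E) uses a JavaScript switch to show the menu content.
When you connect to that Uri (without clicking the menu link), the site serves three different versions of the page:
1) A page with the menu closed and a proposed new version
2) A page with the menu closed and the search engine fields
3) A page with the menu open and the selection of menu content
This switch is based on an internal procedure that tracks the current session. Unless you click the menu link (which is wired to an event listener), the JavaScript procedure presents the page in a different state on each request.
I took a look at it. The scripts are fairly long (a complete multi-purpose library) and I didn't have time to parse all of them (maybe you can) to find out which parameters the event listener passes.
However, the three-state version switch is constant.
What I mean is that you can call that page three times, preserving the cookie container: the third time you connect, it streams out the whole menu content along with its links.
If you request the same page three times, the HTML of the third response will contain all the Materias de educación links: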
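public async Task SomeMethodAsync()
{
    // [URI] stands for the page Uri. The first two responses are discarded;
    // only the third one contains the open menu.
    string HtmlPage = await GetHttpStream([URI]);
    HtmlPage = await GetHttpStream([URI]);
    HtmlPage = await GetHttpStream([URI]);
}
More or less, this is what I used to get that page: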
// Requires: using System; using System.IO; using System.IO.Compression;
//           using System.Net; using System.Text; using System.Threading.Tasks;

// Shared across calls so that the session cookies survive between requests.
CookieContainer CookieJar = new CookieContainer();

public async Task<string> GetHttpStream(Uri HtmlPage)
{
    HttpWebRequest httpRequest;
    string Payload = string.Empty;
    httpRequest = WebRequest.CreateHttp(HtmlPage);
    try
    {
        httpRequest.CookieContainer = CookieJar;
        httpRequest.KeepAlive = true;
        httpRequest.ConnectionGroupName = Guid.NewGuid().ToString();
        httpRequest.AllowAutoRedirect = true;
        httpRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        httpRequest.ServicePoint.MaxIdleTime = 30000;
        httpRequest.ServicePoint.Expect100Continue = false;
        httpRequest.UserAgent = "Mozilla/5.0 (Windows NT 10; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0";
        httpRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        httpRequest.Headers.Add(HttpRequestHeader.AcceptLanguage, "es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3");
        httpRequest.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip, deflate;q=0.8");
        httpRequest.Headers.Add(HttpRequestHeader.CacheControl, "no-cache");
        using (HttpWebResponse httpResponse = (HttpWebResponse)await httpRequest.GetResponseAsync())
        {
            Stream ResponseStream = httpResponse.GetResponseStream();
            if (httpResponse.StatusCode == HttpStatusCode.OK)
            {
                try
                {
                    //ResponseStream.Position = 0;
                    Encoding encoding = Encoding.GetEncoding(httpResponse.CharacterSet);
                    using (MemoryStream _memStream = new MemoryStream())
                    {
                        // With AutomaticDecompression set, the stream normally arrives
                        // already decompressed; the two branches below are a fallback.
                        if (httpResponse.ContentEncoding.Contains("gzip"))
                        {
                            using (GZipStream _gzipStream = new GZipStream(ResponseStream, System.IO.Compression.CompressionMode.Decompress))
                            {
                                _gzipStream.CopyTo(_memStream);
                            }
                        }
                        else if (httpResponse.ContentEncoding.Contains("deflate"))
                        {
                            using (DeflateStream _deflStream = new DeflateStream(ResponseStream, System.IO.Compression.CompressionMode.Decompress))
                            {
                                _deflStream.CopyTo(_memStream);
                            }
                        }
                        else
                        {
                            ResponseStream.CopyTo(_memStream);
                        }
                        _memStream.Position = 0;
                        using (StreamReader _reader = new StreamReader(_memStream, encoding))
                        {
                            Payload = _reader.ReadToEnd().Trim();
                        }
                    }
                }
                catch (Exception)
                {
                    // Unknown charset or unreadable stream: return an empty payload.
                    Payload = string.Empty;
                }
            }
        }
    }
    catch (WebException exW)
    {
        if (exW.Response != null)
        {
            //Handle WebException
        }
    }
    catch (System.Exception)
    {
        //Handle System.Exception
    }
    CookieJar = httpRequest.CookieContainer;
    return Payload;
}
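As a more compact alternative, here is a minimal sketch of the same three-request idea using HttpClient with a cookie container that is shared across the requests. The names MenuPageFetcher, FetchRepeatedlyAsync and GetMenuPageAsync are made up for illustration, and the count of three comes only from the behaviour described above, so treat it as an assumption that the site may change.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class MenuPageFetcher
{
    // Calls fetch() the given number of times, one after the other,
    // and returns only the result of the last call.
    public static async Task<string> FetchRepeatedlyAsync(Func<Task<string>> fetch, int times)
    {
        if (times < 1) throw new ArgumentOutOfRangeException(nameof(times));
        string last = string.Empty;
        for (int i = 0; i < times; i++)
        {
            last = await fetch();
        }
        return last;
    }

    // Requests the page three times over one handler, so the session
    // cookies set by earlier responses are sent with the later requests.
    public static async Task<string> GetMenuPageAsync(Uri uri)
    {
        var handler = new HttpClientHandler
        {
            CookieContainer = new CookieContainer(),
            UseCookies = true,
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };
        using (var client = new HttpClient(handler))
        {
            client.DefaultRequestHeaders.TryAddWithoutValidation("User-Agent",
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0");
            client.DefaultRequestHeaders.TryAddWithoutValidation("Accept-Language",
                "es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3");
            // Assumption: the third response is the one with the open menu.
            return await FetchRepeatedlyAsync(() => client.GetStringAsync(uri), 3);
        }
    }
}
```

Calling `await MenuPageFetcher.GetMenuPageAsync(new Uri("https://sede.educacion.gob.es/publiventa/catalogo.action?cod=E"))` should then return the HTML to hand over to HtmlAgilityPack. Keeping the repeat logic in FetchRepeatedlyAsync makes it easy to change the count if the site's behaviour turns out to be different.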