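I need to log in to a website before I can attempt to download data from it. I am trying to do this with a PHP script.
The login page is at:
http://abc.example.com/login.aspx
Its form looks like this (irrelevant elements removed):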
<form name="Form1" method="post" action="login.aspx" id="Form1">
<input type="hidden" name="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" value="dDw1MTU3NTkxNTI7O2w8Y2hrYm94Oz4+08TlRVm+gb75yz3dIctChP3qf/E=" />
<script language="javascript" type="text/javascript">
<!--
function __POSTBack(eventTarget, eventArgument) {
var thisform;
if (window.navigator.appName.toLowerCase().indexOf("microsoft") > -1) {
thisform = document.Form1;
}
else {
thisform = document.forms["Form1"];
}
thisform.eventTarget.value = eventTarget.split("$").join(":");
thisform.eventArgument.value = eventArgument;
thisform.submit();
}
// -->
</script>
<input name="text_userid" id="text_userid" type="text" />
<input name="text_password" type="password" maxlength="30" id="text_password" />
<input name="chkbox" id="chkbox" type="checkbox" value="checkbox" />
<a id="submit" href="javascript:__POSTBack('submit','')"><img src="images/login.jpg"></a>
</form>
I am trying to log in to it through PHP, but I am not sure what I am doing wrong. This is what I am doing:
<?php
$pages = array(
'login_pre' => 'http://abc.example.com/login.aspx',
'login' => 'http://abc.example.com/login.aspx');
$ch = curl_init();
//Set options for curl session
$options = array(CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6',
CURLOPT_HEADER => TRUE,
CURLOPT_RETURNTRANSFER => TRUE,
CURLOPT_COOKIEFILE => 'cookie.txt',
CURLOPT_COOKIEJAR => 'cookies.txt');
$options[CURLOPT_URL] = $pages['login_pre'];
curl_setopt_array($ch, $options);
$login_pre_content = curl_exec($ch);
preg_match('/__VIEWSTATE" value="(.*)"/', $login_pre_content, $matches);
$VIEWSTATE = $matches[1];
//Login
$options[CURLOPT_URL] = $pages['login'];
$options[CURLOPT_POST] = TRUE;
$options[CURLOPT_POSTFIELDS] = '__EVENTTARGET=submit&__EVENTARGUMENT=&__VIEWSTATE='.$VIEWSTATE.'&text_userid=MYEMAIL&text_password=MYPASSWORD&chkbox=on';
$options[CURLOPT_FOLLOWLOCATION] = TRUE;
curl_setopt_array($ch, $options);
$login_post_content = curl_exec($ch);
echo $login_post_content;
//Close curl session
curl_close($ch);
?>
If I run the above, I get the following error message (in the web page returned in $login_post_content):
Invalid length for a Base-64 char array.
The stack trace is as follows:
[FormatException: Invalid length for a Base-64 char array.]
System.Convert.FromBase64String(String s) +0
System.Web.UI.LosFormatter.Deserialize(String input) +24
System.Web.UI.Page.LoadPageStateFromPersistenceMedium() +101
[HttpException (0x80004005): Invalid_Viewstate
Client IP: MYIPADDRESS
Port: 24885
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6
ViewState: dDw1MTU3NTkxNTI7O2w8Y2hrYm94Oz4 08TlRVm gb75yz3dIctChP3qf/E=
Http-Referer:
Path: /login.aspx.]
System.Web.UI.Page.LoadPageStateFromPersistenceMedium() +442
System.Web.UI.Page.LoadPageViewState() +18
System.Web.UI.Page.ProcessRequestMain() +441
However, if I modify the code as follows:
$options[CURLOPT_POSTFIELDS] = '...&__VIEWSTATE='.base64_encode($VIEWSTATE).'...';
I get the following error message:
Unable to validate data.
The stack trace is as follows:
[HttpException (0x80004005): Unable to validate data.]
System.Web.Configuration.MachineKey.GetDecodedData(Byte[] buf, Byte[] modifier, Int32 start, Int32 length, Int32& dataLength) +195
System.Web.UI.LosFormatter.Deserialize(String input) +59
[HttpException (0x80004005): Authentication of viewstate failed. 1) If this is a cluster, edit <machineKey> configuration so all servers use the same validationKey and validation algorithm. AutoGenerate cannot be used in a cluster. 2) Viewstate can only be posted back to the same page. 3) The viewstate for this page might be corrupted.]
System.Web.UI.LosFormatter.Deserialize(String input) +117
System.Web.UI.Page.LoadPageStateFromPersistenceMedium() +101
[HttpException (0x80004005): Invalid_Viewstate
Client IP: MYIPADDRESS
Port: 25029
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6
ViewState: ZER3MU1UVTNOVGt4TlRJN08ydzhZMmhyWW05NE96NCswOFRsUlZtK2diNzV5ejNkSWN0Q2hQM3FmL0U9
Http-Referer:
Path: /login.aspx.]
System.Web.UI.Page.LoadPageStateFromPersistenceMedium() +442
System.Web.UI.Page.LoadPageViewState() +18
System.Web.UI.Page.ProcessRequestMain() +441
Answered 2011-02-07 14:35:56
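The form field values need to be URL-encoded. In particular, the ViewState value must be encoded properly: it contains + and = characters, which need to be escaped. An unescaped + in a form-encoded body is decoded by the server as a space, which is why the ViewState shown in the first error has spaces where the original value has + characters, and why it is no longer valid Base64. Wrapping the value in base64_encode() does not help, because the server then receives a different string that fails ViewState validation.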
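A minimal sketch of that fix, assuming the field names from the form in the question and keeping MYEMAIL and MYPASSWORD as placeholders. The variable names $fields, $postfields, $encoded, and $as_server_saw_it are illustrative and not part of the original script. It builds the POST body with http_build_query(), which URL-encodes each value, and shows what happens to the raw value when it is sent unencoded:

```php
<?php
// ViewState value taken from the login form shown in the question.
$viewstate = 'dDw1MTU3NTkxNTI7O2w8Y2hrYm94Oz4+08TlRVm+gb75yz3dIctChP3qf/E=';

// Field names come from the form; the credentials are placeholders.
$fields = array(
    '__EVENTTARGET'   => 'submit',
    '__EVENTARGUMENT' => '',
    '__VIEWSTATE'     => $viewstate,
    'text_userid'     => 'MYEMAIL',
    'text_password'   => 'MYPASSWORD',
    'chkbox'          => 'on',
);

// http_build_query() URL-encodes every key and value, so "+" becomes %2B,
// "/" becomes %2F and "=" becomes %3D inside the ViewState.
$postfields = http_build_query($fields, '', '&');

// The same encoding for a single value, if the string is built by hand.
$encoded = urlencode($viewstate);

// What the server sees when the raw value is sent without encoding:
// form decoding turns each "+" into a space.
$as_server_saw_it = urldecode($viewstate);

echo $postfields, "\n";
echo $as_server_saw_it, "\n";

// In the question's script this would replace the hand-built string:
// $options[CURLOPT_POSTFIELDS] = $postfields;
```

Passing the string from http_build_query() keeps the request as application/x-www-form-urlencoded. Passing the array itself to CURLOPT_POSTFIELDS would make cURL send multipart/form-data instead, which may or may not be acceptable to the login page.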
https://stackoverflow.com/questions/4918639