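I am working on a pure C99 practice project that simulates a login to our school's CAS login system.
I am now trying to use the Gumbo parser to parse our school's login page. Before sending the POST request that submits the form, I need to extract the login ticket from the form section shown further down: the hidden input element named "lt", i.e. the line <input type="hidden" name="lt" value="LT-000000-b4LktCXyzXyzXyzXyzXyzXyz" />.
I wrote some code, but it can't seem to find this input element. Here is the function from my C program: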
const char * parse_login_ticket_old(char * raw_html)
{
    // Parse HTML into Gumbo memory structure
    GumboOutput * gumbo_output = gumbo_parse(raw_html);

    // Prepare the node
    GumboNode * gumbo_root = gumbo_output->root;
    assert(gumbo_root->type == GUMBO_NODE_ELEMENT);
    assert(gumbo_root->v.element.children.length >= 2);

    const GumboVector* root_children = &gumbo_root->v.element.children;
    GumboNode* page_body = NULL;
    for (int i = 0; i < root_children->length; ++i)
    {
        GumboNode* child = root_children->data[i];
        if (child->type == GUMBO_NODE_ELEMENT && child->v.element.tag == GUMBO_TAG_BODY)
        {
            page_body = child;
            break;
        }
    }
    assert(page_body != NULL);

    GumboVector* page_body_children = &page_body->v.element.children;
    for (int i = 0; i < page_body_children->length; ++i)
    {
        GumboNode* child = page_body_children->data[i];
        GumboAttribute * input_name_attr = gumbo_get_attribute(&child->v.element.attributes, "name");
        if (child->type == GUMBO_NODE_ELEMENT && child->v.element.tag == GUMBO_TAG_INPUT && strcmp(input_name_attr->value, "lt") == 0)
        {
            GumboAttribute * input_value_attr = gumbo_get_attribute(&child->v.element.attributes, "value");
            return input_name_attr->value;
        }
    }
    return NULL;
}

In case anyone wants to debug this, here is a sample of our school's login page. Potentially sensitive data has been removed.
<body>
  <div id="wrapper">
    <div id="contentArea" role="main">
      <div class="form login" role="form">
        <h2 class="hidden">Login</h2>
        <form id="fm1" class="fm-v clearfix" action="/schoolcas/login?jsessionid=1234567890" method="post">
          <div class="formRow">
            <label for="username" class="label">Student ID</label>
            <div class="textBox">
              <input id="username" name="username" class="schoolcas text" aria-required="true" type="text" value="" size="25" maxlength="25"/>
            </div>
          </div>
          <div class="formRow">
            <label for="password" class="label">Password</label>
            <div class="textBox">
              <input id="password" name="password" class="schoolcas text" aria-required="true" type="password" value="" size="25" autocomplete="off"/>
            </div>
          </div>
          <div class="formRow">
            <input type="hidden" name="lt" value="LT-000000-b4LktCXyzXyzXyzXyzXyzXyz" />
            <input type="hidden" name="execution" value="e2s1" />
            <input type="hidden" name="_eventId" value="submit" />
            <input class="button grey submit" name="submit" value="Login" type="submit" />
          </div>
        </form>
      </div>
    </div>
  </div>
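</body>

In any case, my program seems to stay at the top level of the body element and then returns NULL.
So I would like to know how to search properly and find the input element I need.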
Posted on 2017-05-15 10:17:56
I have solved this myself, based on Google's sample code (links.cc).
Here is the code. It's ugly, but it works anyway.
#include <string.h>
#include <gumbo.h>

const char * find_attribute(GumboNode * current_node, GumboTag element_tag_type,
                            const char * element_term_key, const char * element_term_value,
                            const char * desired_result_key)
{
    // Only element nodes carry attributes and children; skip text, whitespace, comments, etc.
    if (current_node->type != GUMBO_NODE_ELEMENT)
    {
        return NULL;
    }

    // Set the element's term key,
    // e.g. if we need to find something like <input name="foobar"> then element search term key is "name",
    // and element search value is "foobar"
    GumboAttribute* lt_attr = gumbo_get_attribute(&current_node->v.element.attributes, element_term_key);
    if (lt_attr != NULL && current_node->v.element.tag == element_tag_type && (strcmp(lt_attr->value, element_term_value) == 0))
    {
        // Guard against a matching element that lacks the desired attribute
        GumboAttribute* result_attr = gumbo_get_attribute(&current_node->v.element.attributes, desired_result_key);
        if (result_attr != NULL)
        {
            return result_attr->value;
        }
    }

    // Recurse into every child so that deeply nested elements are reached
    GumboVector* children = &current_node->v.element.children;
    for (unsigned int i = 0; i < children->length; ++i)
    {
        const char * lt_token = find_attribute(children->data[i], element_tag_type,
                                               element_term_key, element_term_value, desired_result_key);
        // Force stop and return if it gets a non-null result.
        if (lt_token != NULL)
        {
            return lt_token;
        }
    }

    // Nothing matched in this subtree
    return NULL;
}
https://stackoverflow.com/questions/43973535
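The reason the question's loop came back empty is that it only examined the direct children of <body>, while the "lt" input sits several levels deeper (div > div > div > form > div > input). The following is a minimal sketch of that difference, not Gumbo code: the Node struct, the sample_* data, find_value_shallow and find_value_deep are all hypothetical names made up for illustration, and the tree is a shortened stand-in for the sample page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parsed element; these are NOT Gumbo types. */
typedef struct Node {
    const char *tag;              /* element name, e.g. "input" */
    const char *name;             /* value of the name attribute, or NULL */
    const char *value;            /* value of the value attribute, or NULL */
    const struct Node *children;  /* contiguous array of child nodes, or NULL */
    size_t child_count;
} Node;

/* True when the node has the given tag and name attribute. */
static int matches(const Node *n, const char *tag, const char *name)
{
    return n->name != NULL && strcmp(n->tag, tag) == 0 && strcmp(n->name, name) == 0;
}

/* Shallow search: inspects only the direct children, like the question's loop. */
const char *find_value_shallow(const Node *parent, const char *tag, const char *name)
{
    assert(parent != NULL);
    for (size_t i = 0; i < parent->child_count; ++i) {
        const Node *child = &parent->children[i];
        if (matches(child, tag, name)) {
            return child->value;
        }
    }
    return NULL;
}

/* Depth-first search: checks the node itself, then recurses into each child. */
const char *find_value_deep(const Node *node, const char *tag, const char *name)
{
    assert(node != NULL);
    if (matches(node, tag, name)) {
        return node->value;
    }
    for (size_t i = 0; i < node->child_count; ++i) {
        const char *found = find_value_deep(&node->children[i], tag, name);
        if (found != NULL) {
            return found;  /* stop at the first match */
        }
    }
    return NULL;
}

/* Shortened sample tree: body > div > form > input[name=lt], input[name=execution] */
const Node sample_inputs[] = {
    {"input", "lt", "LT-000000-b4LktCXyzXyzXyzXyzXyzXyz", NULL, 0},
    {"input", "execution", "e2s1", NULL, 0},
};
const Node sample_form[] = { {"form", NULL, NULL, sample_inputs, 2} };
const Node sample_div[] = { {"div", NULL, NULL, sample_form, 1} };
const Node sample_body = {"body", NULL, NULL, sample_div, 1};
```

Calling find_value_shallow(&sample_body, "input", "lt") yields NULL because the only direct child of the body is a div, whereas find_value_deep(&sample_body, "input", "lt") walks down to the input and returns its value. With Gumbo itself, note that the string returned by find_attribute points into the parse tree, so it should be copied before gumbo_destroy_output is called.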