I found that I can generate an XDocument object from HTML by using SgmlReader.SL (https://bitbucket.org/neuecc/sgmlreader.sl/).
The code looks like this.
public XDocument Html(TextReader reader)
{
    XDocument xml;
    using (var sgmlReader = new SgmlReader { DocType = "HTML", CaseFolding = CaseFolding.ToLower, InputStream = reader })
    {
        xml = XDocument.Load(sgmlReader);
    }
    return xml;
}

In addition, we can get the src attributes of the img tags from the XDocument object.
var ns = xml.Root.Name.Namespace;
var imgQuery = xml.Root.Descendants(ns + "img")
    .Select(e => new
    {
        Link = e.Attribute("src").Value
    });

We can also download an image and convert its stream data to a Base64 string.
public static string base64String;

WebClient wc = new WebClient();
// Attach the handler before starting the request so the completion event cannot be missed.
wc.OpenReadCompleted += new OpenReadCompletedEventHandler(wc_OpenReadCompleted);
wc.OpenReadAsync(new Uri(url)); // image URL from the src attribute

void wc_OpenReadCompleted(object sender, OpenReadCompletedEventArgs e)
{
    using (MemoryStream ms = new MemoryStream())
    {
        byte[] buf = new byte[32768]; // one buffer, reused for every read
        int read;
        while ((read = e.Result.Read(buf, 0, buf.Length)) > 0)
        {
            ms.Write(buf, 0, read);
        }
        byte[] imageBytes = ms.ToArray();
        base64String = Convert.ToBase64String(imageBytes);
    }
}

So, what I want to do is the steps below. I would like to do them in a single method chain, for example with LINQ or Reactive Extensions.
The simplest source and output are shown here.
Does anyone know a solution to this?
I would like to ask the experts.
Posted on 2012-01-19 06:01:29
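Both LINQ and Rx are designed to promote transformations that produce new objects rather than ones that modify existing objects, but this is still doable. You have already done the first step, which is breaking the task into parts. The next step is to write composable functions that implement those steps.
1) You already have this one, but we should probably keep the elements themselves around so they can be updated later.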
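public IEnumerable<XElement> GetImages(XDocument document)
{
    var ns = document.Root.Name.Namespace;
    return document.Root.Descendants(ns + "img");
}

2) From a composability standpoint, this seems to be where you hit the wall. First, let's make a FromEventAsyncPattern observable generator. Rx already has generators for the Begin/End async pattern and for standard events, so this one sits somewhere between the two.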
public IObservable<TEventArgs> FromEventAsyncPattern<TDelegate, TEventArgs>
    (Action method, Action<TDelegate> addHandler, Action<TDelegate> removeHandler
    ) where TEventArgs : EventArgs
{
    return Observable.Create<TEventArgs>(
        obs =>
        {
            //subscribe to the handler before starting the method
            var ret = Observable.FromEventPattern<TDelegate, TEventArgs>(addHandler, removeHandler)
                .Select(ep => ep.EventArgs)
                .Take(1) //do this so the observable completes
                .Subscribe(obs);
            method(); //start the async operation
            return ret;
        }
    );
}

Now we can use this method to turn the downloads into observables. Based on your usage, I think you could also use DownloadDataAsync on the WebClient.
public IObservable<byte[]> DownloadAsync(Uri address)
{
    return Observable.Using(
        () => new System.Net.WebClient(),
        wc =>
        {
            return FromEventAsyncPattern<System.Net.DownloadDataCompletedEventHandler,
                                         System.Net.DownloadDataCompletedEventArgs>
                (() => wc.DownloadDataAsync(address),
                 h => wc.DownloadDataCompleted += h,
                 h => wc.DownloadDataCompleted -= h
                )
                .Select(e => e.Result);
            //for robustness, you should probably check the error and cancelled
            //properties instead of assuming it finished like I am here.
        });
}

Edit: Based on your comment, you appear to be using Silverlight, where WebClient is not IDisposable and does not have the method I was using. To work around that, try something like this:
public IObservable<byte[]> DownloadAsync(Uri address)
{
    var wc = new System.Net.WebClient();
    var eap = FromEventAsyncPattern<OpenReadCompletedEventHandler,
                                    OpenReadCompletedEventArgs>(
        () => wc.OpenReadAsync(address),
        h => wc.OpenReadCompleted += h,
        h => wc.OpenReadCompleted -= h);
    return from e in eap
           from b in e.Result.ReadAsync()
           select b;
}

You will need to find an implementation of ReadAsync to read the stream. You should be able to find one fairly easily, and this post was already long enough, so I left it out.
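For readers who want something to start from, here is one possible ReadAsync, offered as a rough sketch rather than the implementation the answer had in mind. It assumes that Rx's Observable.Start is available on your platform, and it simply runs a blocking read loop on a background thread instead of using the Begin/End pattern. The class name StreamExtensions is made up for illustration.

```csharp
using System;
using System.IO;
using System.Reactive.Linq;

// Hypothetical helper class; the name is an assumption, not part of Rx.
public static class StreamExtensions
{
    // Reads the stream to its end and yields the bytes as a single value.
    // Observable.Start schedules the work right away (by default on the
    // thread pool) and replays the result to whoever subscribes.
    public static IObservable<byte[]> ReadAsync(this Stream stream)
    {
        return Observable.Start(() =>
        {
            using (stream)                      // close the source when done
            using (var ms = new MemoryStream())
            {
                var buf = new byte[32768];
                int read;
                while ((read = stream.Read(buf, 0, buf.Length)) > 0)
                {
                    ms.Write(buf, 0, read);     // copy only the bytes actually read
                }
                return ms.ToArray();
            }
        });
    }
}
```

With this in scope, the call e.Result.ReadAsync() in the Silverlight version of DownloadAsync resolves to the extension method. An exception thrown while reading surfaces through the observable's OnError.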
3 & 4) Now we are ready to put it all together and update the elements. Since step 3 is so simple, I will just merge it into step 4.
public IObservable<Unit> ReplaceImageLinks(XDocument document)
{
    return (from element in GetImages(document)
            let address = new Uri(element.Attribute("src").Value)
            select (from data in DownloadAsync(address)
                    select Convert.ToBase64String(data)
                   ).Do(base64 => element.Attribute("src").Value = base64)
           ).Merge()
            .IgnoreElements()
            .Select(s => Unit.Default);
    //select doesn't really do anything as IgnoreElements eats all
    //the values, but it is needed to change the type of the observable.
    //Task may be more appropriate here.
}

https://stackoverflow.com/questions/8897769