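I'm writing a pyDatalog program to analyse weather data from Weather Underground (for now just as a demo for myself and others at my company). I've written a custom predicate resolver that returns the readings between a start time and an end time: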
# class for the reading table.
class Reading(Base):
    __table__ = Table('reading', Base.metadata, autoload = True, autoload_with = engine)
    def __repr__(self):
        return str(self.Time)
    # predicate to resolve 'timeBetween(X, Y, Z)' statements
    # matches items as X where the time of day is between Y and Z (inclusive).
    # if Y is later than Z, it returns the items not between Z and Y (exclusive).
    # TODO - make it work where t1 and t2 are not bound.
    # somehow needs to tell the engine to try somewhere else first.
    @classmethod
    def _pyD_timeBetween3(cls, dt, t1, t2):
        if dt.is_const():
            # dt is already known
            if t1.is_const() and t2.is_const():
                if (dt.id.Time.time() >= makeTime(t1.id)) and (dt.id.Time.time() <= makeTime(t2.id)):
                    yield (dt.id, t1.id, t2.id)
        else:
            # dt is an unbound variable
            if t1.is_const() and t2.is_const():
                if makeTime(t2.id) > makeTime(t1.id):
                    op = 'and'
                else:
                    op = 'or'
                sqlWhere = "time(Time) >= '%s' %s time(Time) <= '%s'" % (t1.id, op, t2.id)
                for instance in cls.session.query(cls).filter(sqlWhere):
                    yield(instance, t1.id, t2.id)

This works fine when t1 and t2 are bound to specific values:
:> easterly(X) <= (Reading.WindDirection[X] == 'East')
:> + rideAfter('11:00:00')
:> + rideBefore('15:00:00')
:> goodTime(X) <= rideAfter(Y) & rideBefore(Z) & Reading.timeBetween(X, Y, Z)
:> goodTime(X)
[(2013-02-19 11:25:00,), (2013-02-19 12:45:00,), (2013-02-19 12:50:00,), (2013-02-19 13:25:00,), (2013-02-19 14:30:00,), (2013-02-19 15:00:00,), (2013-02-19 13:35:00,), (2013-02-19 13:50:00,), (2013-02-19 12:20:00,), (2013-02-19 12:35:00,), (2013-02-19 14:05:00,), (2013-02-19 11:20:00,), (2013-02-19 11:50:00,), (2013-02-19 13:15:00,), (2013-02-19 14:55:00,), (2013-02-19 12:00:00,), (2013-02-19 13:00:00,), (2013-02-19 14:20:00,), (2013-02-19 14:15:00,), (2013-02-19 13:10:00,), (2013-02-19 12:10:00,), (2013-02-19 14:45:00,), (2013-02-19 14:35:00,), (2013-02-19 13:20:00,), (2013-02-19 11:10:00,), (2013-02-19 13:05:00,), (2013-02-19 12:55:00,), (2013-02-19 14:10:00,), (2013-02-19 13:45:00,), (2013-02-19 13:55:00,), (2013-02-19 11:05:00,), (2013-02-19 12:25:00,), (2013-02-19 14:00:00,), (2013-02-19 12:05:00,), (2013-02-19 12:40:00,), (2013-02-19 14:40:00,), (2013-02-19 11:00:00,), (2013-02-19 11:15:00,), (2013-02-19 11:30:00,), (2013-02-19 11:45:00,), (2013-02-19 13:40:00,), (2013-02-19 11:55:00,), (2013-02-19 14:25:00,), (2013-02-19 13:30:00,), (2013-02-19 12:30:00,), (2013-02-19 12:15:00,), (2013-02-19 11:40:00,), (2013-02-19 14:50:00,), (2013-02-19 11:35:00,)]

However, if I declare the goodTime rule with the conditions in a different order (so that Y and Z are still unbound at the point where it tries to resolve timeBetween), it returns an empty set:
:> atoms('niceTime')
:> niceTime(X) <= Reading.timeBetween(X, Y, Z) & rideAfter(Y) & rideBefore(Z)
<pyDatalog.pyEngine.Clause object at 0x0adfa510>
:> niceTime(X)
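[]

This seems wrong: the two queries should return the same result set.
My question is: is there a way to handle this situation in pyDatalog? I think what needs to happen is that the timeBetween predicate should somehow be able to tell the engine to back off and try resolving the other literals before it attempts this one, but I can't find any reference to this in the documentation.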
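For reference, the time-of-day test that the resolver's comments describe can be sketched in plain Python. This is only an illustration of the intended window semantics: `time_between` is a hypothetical helper, not part of pyDatalog or of the code above, and it mirrors the 'and'/'or' choice made in the SQL branch of the resolver.

```python
from datetime import time


def time_between(t, start, end):
    """Hypothetical helper: is time-of-day t inside the window start..end?

    Mirrors the SQL branch of the resolver in the question:
    - if end is later than start, both bounds must hold ('and');
    - otherwise the window wraps past midnight, so either bound
      is enough ('or').
    """
    if end > start:
        return t >= start and t <= end
    return t >= start or t <= end


# Ordinary daytime window, 11:00 to 15:00.
print(time_between(time(12, 30), time(11, 0), time(15, 0)))
# Window that wraps past midnight, 22:00 to 06:00.
print(time_between(time(23, 30), time(22, 0), time(6, 0)))
print(time_between(time(12, 0), time(22, 0), time(6, 0)))
```

Note that this check needs both bounds to be known before it can say anything, which is exactly the case the resolver handles and the unbound case it does not.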
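Posted on 2013-03-04 20:02:41
The pyDatalog reference says: "although the order of pyDatalog statements is indifferent, the order of literals within a body is significant". pyDatalog does indeed resolve the predicates in a body in the order in which they are stated.
That being said, pyDatalog could be improved to resolve predicates with bound variables first, but I'm not sure why it would matter.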
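To see why the literal order changes the result, here is a toy model in plain Python of a rule body evaluated strictly left to right. It is a sketch under stated assumptions, not pyDatalog's actual engine: `solve`, `ride_after`, `ride_before`, `time_between` and the `READINGS` list are all hypothetical names and made-up sample data, and the `time_between` literal, like the resolver in the question, yields nothing unless Y and Z are already bound.

```python
# Made-up sample readings (times of day as zero-padded strings,
# so plain string comparison orders them correctly).
READINGS = ["10:30:00", "11:25:00", "14:55:00", "15:30:00"]


def ride_after(binding):
    # Stands in for the fact rideAfter('11:00:00'): binds Y.
    yield {**binding, "Y": "11:00:00"}


def ride_before(binding):
    # Stands in for the fact rideBefore('15:00:00'): binds Z.
    yield {**binding, "Z": "15:00:00"}


def time_between(binding):
    # Like the resolver in the question, this literal only produces
    # answers when both Y and Z are already bound.
    if "Y" not in binding or "Z" not in binding:
        return
    for reading in READINGS:
        if binding["Y"] <= reading <= binding["Z"]:
            yield {**binding, "X": reading}


def solve(body):
    """Evaluate the body literals strictly left to right."""
    bindings = [{}]
    for literal in body:
        bindings = [new for b in bindings for new in literal(b)]
    return [b["X"] for b in bindings]


# Binding literals first, as in goodTime: Y and Z are known in time.
good_time = solve([ride_after, ride_before, time_between])
# timeBetween first, as in niceTime: Y and Z are still unbound.
nice_time = solve([time_between, ride_after, ride_before])

print(good_time)  # ['11:25:00', '14:55:00']
print(nice_time)  # []
```

In this model the first literal of niceTime produces no bindings, so nothing is left for the later literals to extend. Writing the literals that bind Y and Z before timeBetween, as the goodTime rule does, avoids the problem.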
https://stackoverflow.com/questions/15199636