
Unable to read a column family with pycassa

Stack Overflow user
Asked 2014-02-04 05:49:26

I have only just started using pycassa, so apologies if this is a silly question.

I have a column family with the following schema:

create column family MyColumnFamilyTest
  with column_type = 'Standard'
  and comparator = 'CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.TimeUUIDType)'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'UTF8Type'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and populate_io_cache_on_flush = false
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};

When I try a get() with a valid key (one that works fine in cassandra-cli), I get:

Traceback (most recent call last):
  File "<pyshell#19>", line 1, in <module>
    cf.get('mykey',column_count=3)
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/columnfamily.py", line 664, in get
    return self._cosc_to_dict(list_col_or_super, include_timestamp, include_ttl)
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/columnfamily.py", line 368, in _cosc_to_dict
    ret[self._unpack_name(col.name)] = self._col_to_dict(col, include_timestamp, include_ttl)
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/columnfamily.py", line 444, in _unpack_name
    return self._name_unpacker(b)
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/marshal.py", line 140, in unpack_composite
    components.append(unpacker(bytestr[2:2 + length]))
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/marshal.py", line 374, in <lambda>
    return lambda v: uuid.UUID(bytes=v)
  File "/usr/lib/python2.7/uuid.py", line 144, in __init__
    raise ValueError('bytes is not a 16-char string')
ValueError: bytes is not a 16-char string

Here is some more information I have found:

Using cassandra-cli, I can see the data like this:

% cassandra-cli -h 10.249.238.131

Connected to: "LocalDB" on 10.249.238.131/9160
Welcome to Cassandra CLI version 1.2.10-SNAPSHOT

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default@unknown] use Keyspace;
[default@Keyspace] list ColumnFamily;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: urn:keyspace:ColumnFamily:a36e8ab1-7032-4e4c-a53d-e3317f63a640:
=> (name=autoZoning:::, value=01, timestamp=1391298393966000)
=> (name=creationTime:::, value=00000143efd8b76e, timestamp=1391298393966000)
=> (name=inactive:::14fe78e0-8b9b-11e3-b171-005056b700bb, value=00, timestamp=1391298393966000)
=> (name=label:::14fe78e0-8b9b-11e3-b171-005056b700bb, value=726a6d2d766e782d76613031, timestamp=1391298393966000)

1 Row Returned.
Elapsed time: 16 msec(s).

Since it was not clear what was causing the exception, I added a print just before the 'return self._name_unpacker(b)' line in columnfamily.py, and saw:

>>> cf.get(dict(cf.get_range(column_count=0,filter_empty=False)).keys()[0])
Attempting to unpack: <00>\rautoZoning<00><00><00><00><00><00><00><00><00><00>

Traceback (most recent call last):
  File "<pyshell#172>", line 1, in <module>
    cf.get(dict(cf.get_range(column_count=0,filter_empty=False)).keys()[0])
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/columnfamily.py", line 665, in get
    return self._cosc_to_dict(list_col_or_super, include_timestamp, include_ttl)
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/columnfamily.py", line 368, in _cosc_to_dict
    ret[self._unpack_name(col.name)] = self._col_to_dict(col, include_timestamp, include_ttl)
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/columnfamily.py", line 445, in _unpack_name
    return self._name_unpacker(b)
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/marshal.py", line 140, in unpack_composite
    components.append(unpacker(bytestr[2:2 + length]))
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/marshal.py", line 374, in <lambda>
    return lambda v: uuid.UUID(bytes=v)
  File "/usr/lib/python2.7/uuid.py", line 144, in __init__
    raise ValueError('bytes is not a 16-char string')
ValueError: bytes is not a 16-char string

I have no idea where the extra characters around the column name come from. That made me curious, so I added another print in _cosc_to_dict in columnfamily.py, and saw:

    >>> cf.get(dict(cf.get_range(column_count=0,filter_empty=False)).keys()[0])
    list_col_or_super is: []
    list_col_or_super is: [ColumnOrSuperColumn(column=Column(timestamp=1391298393966000, 
name='\x00\rautoZoning\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', value='\x01', ttl=None), 
counter_super_column=None, super_column=None, counter_column=None), 
ColumnOrSuperColumn(column=Column(timestamp=1391298393966000, 
name='\x00\x0ccreationTime\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 
value='\x00\x00\x01C\xef\xd8\xb7n', ttl=None), counter_super_column=None, super_column=None, 
counter_column=None), ColumnOrSuperColumn(column=Column(timestamp=1391298393966000, 
name='\x00\x08inactive\x00\x00\x00\x00\x00\x00\x00\x00\x10\x14\xfex\xe0\x8b\x9b\x11\xe3\xb1q\x00PV\xb7\x00\xbb\x00', value='\x00', ttl=None), counter_super_column=None, super_column=None, 
counter_column=None), ColumnOrSuperColumn(column=Column(timestamp=1391298393966000, 
name='\x00\x05label\x00\x00\x00\x00\x00\x00\x00\x00\x10\x14\xfex\xe0\x8b\x9b\x11\xe3\xb1q\x00PV\xb7\x00\xbb\x00', value='thisIsATest', ttl=None), counter_super_column=None, super_column=None, counter_column=None)]
    autoZoning unpack: 
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib64/python2.6/site-packages/pycassa-1.11.0-py2.6.egg/pycassa/columnfamily.py", line 666, in get
        return self._cosc_to_dict(list_col_or_super, include_timestamp, include_ttl)
      File "/usr/local/lib64/python2.6/site-packages/pycassa-1.11.0-py2.6.egg/pycassa/columnfamily.py", line 369, in _cosc_to_dict
        ret[self._unpack_name(col.name)] = self._col_to_dict(col, include_timestamp, include_ttl)
      File "/usr/local/lib64/python2.6/site-packages/pycassa-1.11.0-py2.6.egg/pycassa/columnfamily.py", line 446, in _unpack_name
        return self._name_unpacker(b)
      File "/usr/local/lib64/python2.6/site-packages/pycassa-1.11.0-py2.6.egg/pycassa/marshal.py", line 140, in unpack_composite
        components.append(unpacker(bytestr[2:2 + length]))
      File "/usr/local/lib64/python2.6/site-packages/pycassa-1.11.0-py2.6.egg/pycassa/marshal.py", line 374, in <lambda>
        return lambda v: uuid.UUID(bytes=v)
      File "/usr/lib64/python2.6/uuid.py", line 144, in __init__
        raise ValueError('bytes is not a 16-char string')
    ValueError: bytes is not a 16-char string

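I assume the extra characters around the column name are what cause the 'ValueError: bytes is not a 16-char string' exception. Is that right?

Also, if I try to select a column by name, I get: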

>>> cf.get(u'urn:keyspace:ColumnFamily:a36e8ab1-7032-4e4c-a53d-e3317f63a640:',columns=['autoZoning:::'])

Traceback (most recent call last):
  File "<pyshell#184>", line 1, in <module>
    cf.get(u'urn:keyspace:ColumnFamily:a36e8ab1-7032-4e4c-a53d-e3317f63a640:',columns=['autoZoning:::'])
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/columnfamily.py", line 651, in get
    cp = self._column_path(super_column, column)
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/columnfamily.py", line 383, in _column_path
    self._pack_name(column, False))
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/columnfamily.py", line 426, in _pack_name
    return self._name_packer(value, slice_start)
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/marshal.py", line 115, in pack_composite
    packed = packer(item)
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/marshal.py", line 298, in pack_uuid
    randomize=True)
  File "/usr/local/lib/python2.7/dist-packages/pycassa-1.11.0-py2.7.egg/pycassa/util.py", line 75, in convert_time_to_uuid
    'neither a UUID, a datetime, or a number')
ValueError: Argument for a v1 UUID column name or value was neither a UUID, a datetime, or a number

Any further ideas?

Thanks,

Rob


1 Answer

Stack Overflow user

Answered 2014-02-06 00:13:38

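It turned out that the problem was not the key. It was caused by a bug in pycassa, which does not handle an empty (null) string in the UUID component of a column name. A short-term workaround is given in this Google Groups answer: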

https://groups.google.com/d/msg/pycassa-discuss/Vf_bSgDIi9M/KTA1kbE9IXAJ
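
To illustrate where the "extra characters" come from, here is a minimal sketch that decodes the raw column names from the debug output in the question without using pycassa. It assumes the usual CompositeType layout (a 2-byte big-endian length, the component bytes, then one end-of-component byte); the helper names split_composite and decode_name are made up for this example and are not part of pycassa.

```python
import struct
import uuid


def split_composite(raw):
    """Split a raw composite column name into its component byte strings.

    Assumed layout per component: 2-byte big-endian length, the component
    bytes, then a single end-of-component byte.
    """
    components = []
    pos = 0
    while pos < len(raw):
        (length,) = struct.unpack_from(">H", raw, pos)
        pos += 2
        components.append(raw[pos:pos + length])
        pos += length + 1  # skip the component and its end-of-component byte
    return components


def decode_name(raw):
    """Decode a (UTF8, UTF8, UTF8, TimeUUID) name, tolerating an empty UUID."""
    parts = split_composite(raw)
    texts = tuple(p.decode("utf-8") for p in parts[:3])
    # An empty last component is what trips uuid.UUID(bytes=...) in the traceback.
    time_uuid = uuid.UUID(bytes=parts[3]) if parts[3] else None
    return texts + (time_uuid,)


# Raw names copied from the list_col_or_super output in the question.
creation_time = b"\x00\x0ccreationTime\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
label = (b"\x00\x05label\x00\x00\x00\x00\x00\x00\x00\x00\x10"
         b"\x14\xfex\xe0\x8b\x9b\x11\xe3\xb1q\x00PV\xb7\x00\xbb\x00")

print(decode_name(creation_time))
print(decode_name(label))
```

Under that assumption, the leading and trailing bytes are just length prefixes and component terminators. The names without a UUID end in a zero-length fourth component, and passing those zero bytes to uuid.UUID(bytes=...) raises the ValueError shown in the question.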

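The other part of the answer is to fetch columns by passing a tuple (with the UUID as a UUID object rather than a str) instead of a string with ':' separators, which I found out is purely a cassandra-cli convention.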
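
As a rough sketch of that second point, the helper below turns a name as displayed by cassandra-cli into the kind of tuple described above. The function name cli_name_to_tuple is hypothetical, it assumes none of the text components contain ':' themselves, and returning None for a missing UUID is only a placeholder of this sketch (such names still run into the bug mentioned above).

```python
import uuid


def cli_name_to_tuple(cli_name):
    """Convert a cassandra-cli style 'a:b:c:uuid' name into a 4-tuple."""
    parts = cli_name.split(":")
    if len(parts) != 4:
        raise ValueError("expected 4 ':'-separated components, got %d" % len(parts))
    text_parts, uuid_text = parts[:3], parts[3]
    # Only the last component is a TimeUUID; keep it as a uuid.UUID, not a str.
    time_uuid = uuid.UUID(uuid_text) if uuid_text else None
    return tuple(text_parts) + (time_uuid,)


column = cli_name_to_tuple("label:::14fe78e0-8b9b-11e3-b171-005056b700bb")
print(column)

# With a live pycassa ColumnFamily `cf` and a row key `row_key` (neither is
# created here), the lookup would pass the tuple instead of the ':' string:
# cf.get(row_key, columns=[column])
```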

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/21538545
