Cassandra read taking longer than expected


Question

I am using Cassandra 1.2 with CQL3. I have three column families in my keyspace. When I query one of them (phones), it takes a long time to retrieve the result. Here is my query:

**select * from phones where phone_no in ('9038487582');**

Here is the trace output of the query:

activity                                        | timestamp    | source      | source_elapsed
-------------------------------------------------+--------------+-------------+----------------
                              execute_cql3_query | 16:35:47,675 | 10.1.26.155 |              0
                               Parsing statement | 16:35:47,675 | 10.1.26.155 |             58
                              Peparing statement | 16:35:47,675 | 10.1.26.155 |            335
      Executing single-partition query on phones | 16:35:47,676 | 10.1.26.155 |           1069
                    Acquiring sstable references | 16:35:47,676 | 10.1.26.155 |           1097
                       Merging memtable contents | 16:35:47,676 | 10.1.26.155 |           1143
 Partition index lookup complete for sstable 822 | 16:35:47,676 | 10.1.26.155 |           1376
 Partition index lookup complete for sstable 533 | 16:35:47,686 | 10.1.26.155 |          10659
      Merging data from memtables and 2 sstables | 16:35:47,704 | 10.1.26.155 |          29192
              Read 1 live cells and 0 tombstoned | 16:35:47,704 | 10.1.26.155 |          29332
                                Request complete | 16:35:47,704 | 10.1.26.155 |          29601

My keyspace has a replication factor of 1, and the cluster has 3 nodes. The phones column family has around 40 million rows, with just two columns in each row. The query comes back in 29 ms, 15 ms, 8 ms, 5 ms, or 3 ms, so the latency is not consistent. Can you suggest what I might be doing wrong? My use case will have an extremely low cache hit rate, so key caching is not a solution for me. Here is my column family definition:

CREATE TABLE phones (
  phone_no text PRIMARY KEY,
  ypids set<int>
) WITH
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};


Recommended answer

The index lookups are reasonably fast (probably the index file is being cached by the OS since it is accessed frequently); where you are losing all the time is between the index lookups and the "merging data" step. What happens in between is the actual seek to the data location in the sstable. (I've added a new trace entry for 1.2.6 to make that clear.)

This explains why the query is sometimes fast and sometimes not: if your seek is uncontended, or better yet cached, the query will be fast. Otherwise it will be slower.
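To see where the time goes, it can help to turn the cumulative source_elapsed column of the trace into per-step gaps. The following is a minimal sketch, assuming the values are copied by hand from the trace in the question; the helper name `step_deltas` is made up for illustration and is not part of any Cassandra tooling.

```python
# source_elapsed values (microseconds), copied from the trace in the question.
trace = [
    ("execute_cql3_query", 0),
    ("Parsing statement", 58),
    ("Peparing statement", 335),
    ("Executing single-partition query on phones", 1069),
    ("Acquiring sstable references", 1097),
    ("Merging memtable contents", 1143),
    ("Partition index lookup complete for sstable 822", 1376),
    ("Partition index lookup complete for sstable 533", 10659),
    ("Merging data from memtables and 2 sstables", 29192),
    ("Read 1 live cells and 0 tombstoned", 29332),
    ("Request complete", 29601),
]


def step_deltas(events):
    """Return (activity, microseconds since the previous event) pairs."""
    return [
        (name, elapsed - prev_elapsed)
        for (_, prev_elapsed), (name, elapsed) in zip(events, events[1:])
    ]


deltas = step_deltas(trace)

# Print the three largest gaps, biggest first.
for name, delta in sorted(deltas, key=lambda d: d[1], reverse=True)[:3]:
    print(f"{delta:>6} us  {name}")
```

For this trace, the largest gap (18533 microseconds) ends at the "Merging data" event, that is, it falls between the last index lookup and the merge, which is the interval attributed above to seeking in the sstable.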

I see several options that could help:


  1. Switch to Leveled compaction (http://www.datastax.com/dev/blog/when-to-use-leveled-compaction)
  2. Add more machines to get more IOPS by brute force
  3. Switch to SSDs to get more IOPS from better hardware
  4. Add more RAM to make caching more effective at covering up the lack of IOPS

You'll note that only the first option doesn't involve more or different hardware, so that's what I'd evaluate first. But the upside is limited: at best you'll reduce the number of sstables read per query to 1.
