Friday, September 29, 2023

Cassandra: How to tell which nodes your data resides on

 


cqlsh:linkedin> select * from location;

 vehicle_id | date       | time                            | latitude | longtitude
------------+------------+---------------------------------+----------+------------
   ME100AAS | 2014-05-19 | 2014-05-19 08:50:00.000000+0000 | 44.74941 |   -67.2507
   ME100AAS | 2014-05-19 | 2014-05-19 08:40:00.000000+0000 | 44.74648 |  -67.26444
   ME100AAS | 2014-05-19 | 2014-05-19 08:30:00.000000+0000 | 44.74258 |  -67.34272
   ME100AAS | 2014-05-19 | 2014-05-19 08:20:00.000000+0000 | 44.72795 |  -67.40177
   ME100AAS | 2014-05-19 | 2014-05-19 08:10:00.000000+0000 | 44.69965 |  -67.47043
   ME100AAS | 2014-05-19 | 2014-05-19 08:00:00.000000+0000 | 44.61909 |   -67.8462
   WA063JXD | 2014-05-19 | 2014-05-19 08:50:00.000000+0000 | 47.70144 | -117.01791
   WA063JXD | 2014-05-19 | 2014-05-19 08:40:00.000000+0000 | 47.69589 | -117.04126
   WA063JXD | 2014-05-19 | 2014-05-19 08:30:00.000000+0000 | 47.68711 |  -117.0701
   WA063JXD | 2014-05-19 | 2014-05-19 08:20:00.000000+0000 | 47.68017 | -117.08932
   WA063JXD | 2014-05-19 | 2014-05-19 08:10:00.000000+0000 | 47.67093 | -117.10924
   WA063JXD | 2014-05-19 | 2014-05-19 08:00:00.000000+0000 | 47.67547 | -117.23619
  CA6AFL218 | 2014-05-19 | 2014-05-19 08:50:00.000000+0000 | 36.11959 | -115.17258
  CA6AFL218 | 2014-05-19 | 2014-05-19 08:40:00.000000+0000 | 36.04423 | -115.18112
  CA6AFL218 | 2014-05-19 | 2014-05-19 08:30:00.000000+0000 | 35.91153 | -115.20631
  CA6AFL218 | 2014-05-19 | 2014-05-19 08:20:00.000000+0000 | 35.88326 | -115.22528
  CA6AFL218 | 2014-05-19 | 2014-05-19 08:10:00.000000+0000 | 35.76301 | -115.33514
  CA6AFL218 | 2014-05-19 | 2014-05-19 08:00:00.000000+0000 | 35.69166 |  -115.3681


cqlsh:linkedin> desc table location

CREATE TABLE linkedin.location (
vehicle_id text,
date text,
time timestamp,
latitude double,
longtitude double,
PRIMARY KEY ((vehicle_id, date), time)
) WITH CLUSTERING ORDER BY (time DESC)
AND additional_write_policy = '99p'
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND cdc = false
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND memtable = 'default'
AND crc_check_chance = 1.0
AND default_time_to_live = 0
AND extensions = {}
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair = 'BLOCKING'
AND speculative_retry = '99p';


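Pass the partition key columns to the token() function. Cassandra hashes the composite partition key (vehicle_id, date) into a single token, and that token determines which nodes store the partition. All rows that share a partition key therefore return the same token.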


cqlsh:linkedin> select token(vehicle_id,date) from location;

system.token(vehicle_id, date)
--------------------------------
-7657837382140274291
-7657837382140274291
-7657837382140274291
-7657837382140274291
-7657837382140274291
-7657837382140274291
7624412873298128873
7624412873298128873
7624412873298128873
7624412873298128873
7624412873298128873
7624412873298128873
8294848196898204914
8294848196898204914
8294848196898204914
8294848196898204914
8294848196898204914
8294848196898204914

(18 rows)
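
How a token maps to a node can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ring below has three made-up nodes (node-a, node-b, node-c) with one invented token each, whereas a real cluster usually assigns many virtual-node tokens per host and also places extra replicas according to the keyspace's replication settings. The three partition tokens are the ones returned by the query above.

```python
from bisect import bisect_left

# Hypothetical ring: node names and node tokens are invented for illustration.
# They are NOT the tokens of the cluster shown in this post.
RING = [
    (-3000000000000000000, "node-a"),
    (7800000000000000000, "node-b"),
    (9000000000000000000, "node-c"),
]


def primary_owner(token, ring=RING):
    """Return the node whose token is the first one >= `token`.

    A node owns the range (previous node token, its own token], so a
    partition token larger than every node token wraps to the first node.
    """
    ordered = sorted(ring)
    node_tokens = [t for t, _ in ordered]
    idx = bisect_left(node_tokens, token)
    return ordered[idx % len(ordered)][1]


# Partition tokens taken from the cqlsh output above.
partition_tokens = [
    -7657837382140274291,
    7624412873298128873,
    8294848196898204914,
]

for t in partition_tokens:
    print(t, "->", primary_owner(t))
```

With this invented ring each of the three partitions lands on a different node. On a real cluster the assignment depends on the actual token ranges, which nodetool ring displays.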


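Use nodetool getendpoints to find out which nodes hold a given partition. Note that getendpoints is documented as taking the partition key value as its last argument (for a composite key, the components joined with a colon, such as 'ME100AAS:2014-05-19'), so passing the key itself is the safer way to confirm placement. The commands below pass the token strings returned by the query above.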


cassandra@cassandra02:/var/log/cassandra$ nodetool getendpoints linkedin location '8294848196898204914'
192.168.1.35
cassandra@cassandra02:/var/log/cassandra$ nodetool getendpoints linkedin location '-7657837382140274291'
192.168.1.32
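
To look up endpoints by partition key rather than by token string, the command line can be assembled with a small helper. This is a minimal sketch: the function name getendpoints_command is hypothetical, and it assumes that nodetool getendpoints accepts the components of a composite partition key joined with a colon. The keyspace, table, and key values come from the examples above.

```python
import shlex


def getendpoints_command(keyspace, table, *partition_key_parts):
    """Build the argument list for `nodetool getendpoints`.

    Assumption: components of a composite partition key are joined with ':'.
    """
    key = ":".join(partition_key_parts)
    return ["nodetool", "getendpoints", keyspace, table, key]


# Partition (vehicle_id, date) = ('ME100AAS', '2014-05-19') from the table above.
argv = getendpoints_command("linkedin", "location", "ME100AAS", "2014-05-19")

# Print the command so it can be pasted into a shell on a Cassandra node.
print(shlex.join(argv))
```

On a host where nodetool is installed, the same list could be passed to subprocess.run(argv, capture_output=True, text=True) to collect the endpoint addresses programmatically.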
