Developing a High-Performance Data Model for Cassandra

DataStax works on building high-performance data models for Apache Cassandra. Artyom Chebotko, Solutions Architect at DataStax, explained what this work involves and how to do it properly at the Cassandra Day Russia 2021 conference.







Image







Apache Cassandra. DataStax. use cases, . .

. , Cassandra , , . . 3 , . , , .







Cassandra



Cassandra , , KEYSPACE — . . , replication strategy, - replication factors .







Image







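In this example, the DC-WEST data center has a replication factor of 3 and DC-EAST has a replication factor of 5. These settings belong to the KEYSPACE: when you create a KEYSPACE, you specify its replication strategy.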







KEYSPACE . Create Table — .







Image







The syntax looks like SQL: the table has 4 columns. The primary key consists of 2 columns. The first is year, which is the partition key. The second is name, which is the clustering key.







Image







Partition key YEAR , . . YEAR partition key. partition. , 2015 partition, 2015 partition. - .
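As a rough illustration of how a partition key groups rows, the sketch below models a table in plain Python. Only the key roles come from the text above (year as the partition key, name as the clustering key); the rows, the `city` column, and the helper `build_partitions` are hypothetical. Real Cassandra storage is more involved than a dictionary of lists.

```python
from collections import defaultdict

# Hypothetical rows. Only the key roles (year = partition key,
# name = clustering key) come from the article.
rows = [
    {"year": 2015, "name": "Carol", "city": "Austin"},
    {"year": 2016, "name": "Bob", "city": "Boston"},
    {"year": 2015, "name": "Alice", "city": "Denver"},
]


def build_partitions(rows, partition_key, clustering_key):
    """Group rows by partition key and sort each group by clustering key."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)
    for group in partitions.values():
        group.sort(key=lambda r: r[clustering_key])
    return dict(partitions)


partitions = build_partitions(rows, "year", "name")
for year, group in sorted(partitions.items()):
    print(year, [r["name"] for r in group])
```

Both rows with year 2015 land in the same partition, and within it they are ordered by name.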







Image







— Cassandra , , , replication factor. , partition — - 3 , - 5 . 1- partition 3 . partition key Cassandra , , , .
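The idea that a partition key determines which nodes hold a partition can be sketched with a toy token ring. This is a simplified model under stated assumptions, not Cassandra's actual placement logic: MD5 stands in for the Murmur3 partitioner, each node owns a single token, and racks are ignored. The node names and the helpers `token`, `make_ring`, and `replicas_for` are hypothetical; the replication factors 3 and 5 follow the DC-WEST and DC-EAST example.

```python
import hashlib
from bisect import bisect_right


def token(value: str) -> int:
    """Stand-in hash. Cassandra's default partitioner uses Murmur3, not MD5."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def make_ring(node_names):
    """Give every node one token and sort the ring by token."""
    return sorted((token(name), name) for name in node_names)


def replicas_for(partition_key: str, ring, replication_factor: int):
    """Walk the ring clockwise from the key's token and take RF distinct nodes."""
    tokens = [t for t, _ in ring]
    start = bisect_right(tokens, token(partition_key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical clusters with six nodes per data center.
west = make_ring([f"west-{i}" for i in range(6)])
east = make_ring([f"east-{i}" for i in range(6)])

placement = {
    "DC-WEST": replicas_for("2015", west, 3),
    "DC-EAST": replicas_for("2015", east, 5),
}
print(placement)
```

Because the replica set is computed from the partition key alone, any node can work out where a partition lives without consulting a central directory.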







Image







To create a KEYSPACE we use CQL, the Cassandra Query Language, which resembles Structured Query Language, SQL.







, Create Table, .







Image







partition key, , primary key partition key , , clustering key. , clustering key.







, . , . , , - , . partition, partition.







clustering order by — , partition, . , , clustering key. Cassandra , . , , , .







Image







, partitions. , primary key. primary key ID, partition key. partition . . « , » — Single-Row Partitions. , Cassandra. partitions , 1. Multi-Row Partitions.







Image







, partition key, clustering key, Cassandra, . . . . 10 , . partition partition - .







partition key. Venue year — «» «». DataStax Accelerate. partition key . , — - . title, — . .







Country , partition, . , .







. . ? , 5 , , K — partition key, — clustering key, — ascending descending, , — . S — .







Image







Queries are written in CQL. The clauses look much like SQL: SELECT, FROM, WHERE, GROUP BY, ORDER BY, LIMIT. ALLOW FILTERING is the one clause that is specific to CQL.







Image







Select — , from — . Cassandra . . , join — , union — , intersection — . , 2 , . , , , join, , join.







where — , primary key. partition key — . — — clustering key, , /. . use cases, , .
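The WHERE rules can be approximated by a small checker. This is a deliberately simplified sketch, not the real CQL validator: it only knows equality and range restrictions, and it treats IN, secondary indexes, and ALLOW FILTERING as out of scope. The function `check_where` is hypothetical, and so are the clustering columns `title` and `author` in the usage lines; `venue` and `year` as the partition key follow the article's example table.

```python
def check_where(partition_key, clustering_key, restrictions):
    """Return (ok, reason) for a simplified version of the CQL WHERE rules.

    restrictions maps a column name to "eq" or "range".
    """
    for col in restrictions:
        if col not in partition_key and col not in clustering_key:
            return False, f"{col} is not a primary key column"

    # Every partition key column must be pinned by an equality.
    for col in partition_key:
        if restrictions.get(col) != "eq":
            return False, f"partition key column {col} needs an equality"

    gap = False         # an earlier clustering column was left unrestricted
    range_seen = False  # an earlier clustering column used a range
    for col in clustering_key:
        op = restrictions.get(col)
        if op is None:
            gap = True
            continue
        if gap:
            return False, f"{col} is restricted but an earlier clustering column is not"
        if range_seen:
            return False, f"{col} follows a range restriction"
        if op == "range":
            range_seen = True
    return True, "ok"


pk = ("venue", "year")
ck = ("title", "author")  # hypothetical clustering columns

print(check_where(pk, ck, {"venue": "eq", "year": "eq"}))
print(check_where(pk, ck, {"venue": "eq"}))
print(check_where(pk, ck, {"venue": "eq", "year": "eq", "title": "range"}))
```

The first and third queries pass. The second fails because `year` is part of the partition key and is left unrestricted.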







Group by primary key , .

Order by — . Cassandra , . , . , . . .







Limit — .







Allow filtering — , . , . , , , , .







, artefacts_by_venue.







Image







artefacts, venue - , year - , partition key. partition key clustering key — . clustering key. : partition key clustering key.







, .







Image







, venue. partition key, Cassandra , , . partition key, clustering key.







venue, year — partition key, title , primary key, . Country. . , , .







Image







Primary key , . -, , partition key, partition , , partition. .







clustering key ( ). , join, - , , . , , , . .









— . , , . .







Image







— . . — . , — , . , . — , ( ). , .







, , . , — access patterns . . , , , , . . , , .







- — — .







, Cassandra , (consistency) , , . — join . , .







, — , , , . , .







Image







4 :







  1. .
  2. , , .
  3. , .
  4. .


:







  1. Conceptual Data Model.
  2. Application Workflow Model.
  3. Logical Data Model.
  4. Physical Data Model.


Each model has its own visualization: an Entity-Relationship Diagram (ER diagram), an Application Workflow Diagram, a Chebotko Diagram, and a Chebotko Diagram & CQL.







. — .







, : « — Conceptual Data Model Application Workflow Model»? . , , . , . , , .







: ? consistency level , ?



: , . . , . ? partition key, Cassandra- , . 100 , replication factor 3, partition key , 3 — . secondary index partition key, 100 , .



?

  1. partition key
  2. . , OLTP-, , . Cassandra, -. . - Cassandra — Spark, - . - -, , , .




consistency level . , . .



, , .




DataStax Academy , 2. , . , : , .









— Internet of Things . ? - , , . , , , , . - , , , . .







Image







.







, . , ?







Image







, - . - — .







, , . , . , , - .







, . — , . — , . , . ID — . , — . — — , , : , , . , .







, , , — . , . ID timestamp - . — timestamp — .







, Entity-Relationship (-), . , . , .







Image







Application Workflow Model — . : , .







Application Workflow . . . : - — , . , - , . . - data access pattern. , , batch.







4 , 4 4 . , , 1 — . ?







  1. .
  2. . ? . , . . .
  3. : .
  4. : .


. , . ? : . — : /. clustering key, partition key. , . , , ID .







Image







, . , , Application Workflow. — . — , . , , DataStax Academy.







sensors_bynetwork — . Network — partition key, partition. Temperatures by_sensor — , timestamp. , + . timestamp clustering key, . , . .







Image







, ? , . — . 3 . — . bucket — partition key, name — clustering key. partition . partition. Bucket — , , partition.







: networks — . , partition.







? week — . . partition key. . partition , partition . ? — , . , , . , .







, , 100 000 100 . . , 5 , - 100 . 100 000 - — 10 . - 100 000 — 1 . .







, ? , , — 24 . , . 1 000 — 24 * 1 000 = 24 000 . , , . , . . .
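The partition-size arithmetic can be written as a tiny estimator. This sketch rests on assumptions: it reads the calculation above as 24 readings per sensor per day across 1 000 sensors, and it treats 100 000 rows per partition as a soft guideline rather than a hard limit. The helper names are hypothetical.

```python
ROW_GUIDELINE = 100_000  # assumed soft limit on rows per partition


def rows_per_partition(readings_per_sensor_per_day, sensors, days_per_partition=1):
    """Estimate how many rows end up in one partition."""
    return readings_per_sensor_per_day * sensors * days_per_partition


def within_guideline(rows, limit=ROW_GUIDELINE):
    """Check an estimate against the rows-per-partition guideline."""
    return rows <= limit


daily_rows = rows_per_partition(24, 1_000)  # 24 * 1 000 = 24 000
print(daily_rows, within_guideline(daily_rows))
```

Changing `days_per_partition` shows how the choice of time bucket in the partition key drives the estimate up or down.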







— . — . timestamp — .







: , like - , ?



secondary indexes, , , secondary indexes . , , Cassandra . , , , . , — solar indexes, Cassandra, .

, — . , CQL. . . , KEYSPACE, . , , , , , partition key, clustering key — . — CQL , , Stargate API — .







Image







2 : , . , , . , partition, .. bucket = all. , , , partition.







. forest-net, , . : network = forest-net, -. - . . .







, , ? ? 2 partition, 2 . , . 2 : , . . , in, . in, , 2 . , .







, , , .









. , . .







Image







, . — . , . , — «» «». - . , mutual funds ( ), ETF (Exchange-traded fund). . , .







. keys, username, , , — . . , . , . -, , : , . , .







Image







Workflow — 3 . . , , . — . , . . 5 . , 5 , . , . — . — : . — + + + . — + + . .







, ?







Image







4 3- . 3.1 3.2. , , , . Trade_id — id . , : . partition — , trade_id.







, . ? . — . — . , .







, trades_by_a_d ? ? , — . , . , , 100 000 — . — — . , , , 100 000 .







Image







, — trade_id . Trade_id — TIMEUUID. UUID — . timestamp, . , .







, - . .







Image







? , TIMEUUID? TIMEUUID timestamp .







Image







, , , . TIMEUUID — , .







, — TIMEUUID, . trade_id > maxTIMEUUID — , , . , timestamp. timestamp . .
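A TIMEUUID is a version 1 UUID, so it carries a timestamp, and that is why a range over `trade_id` can act as a time range. The sketch below shows the idea with Python's standard `uuid` module. It is an illustration only: the helpers `datetime_to_timeuuid` and `timeuuid_to_datetime` are hypothetical, and they do not reproduce the exact boundary values that CQL's `minTimeuuid` and `maxTimeuuid` functions generate.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100-nanosecond ticks between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def datetime_to_timeuuid(dt, clock_seq=0, node=0):
    """Build a version 1 UUID whose timestamp field encodes dt (must be tz-aware)."""
    delta = dt - UNIX_EPOCH
    ticks = (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10
    ticks += UUID_EPOCH_OFFSET
    fields = (
        ticks & 0xFFFFFFFF,                 # time_low
        (ticks >> 32) & 0xFFFF,             # time_mid
        ((ticks >> 48) & 0x0FFF) | 0x1000,  # time_hi with version 1
        ((clock_seq >> 8) & 0x3F) | 0x80,   # clock_seq_hi with RFC 4122 variant
        clock_seq & 0xFF,                   # clock_seq_low
        node,
    )
    return uuid.UUID(fields=fields)


def timeuuid_to_datetime(u):
    """Recover the timestamp embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    ticks = u.time - UUID_EPOCH_OFFSET
    return UNIX_EPOCH + timedelta(microseconds=ticks // 10)


# Hypothetical trades on three January days; keep those after January 8.
trade_ids = [
    datetime_to_timeuuid(datetime(2021, 1, day, tzinfo=timezone.utc))
    for day in (5, 10, 20)
]
start = datetime_to_timeuuid(datetime(2021, 1, 8, tzinfo=timezone.utc))
recent = [t for t in trade_ids if t.time > start.time]
print([timeuuid_to_datetime(t).date().isoformat() for t in recent])
```

The comparison uses the `time` field because Python orders UUID objects by their raw integer value, which does not follow the embedded timestamp.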







: . ?



: ? — update insert . , . : trades — 4 , , -. -. ? baches, . baches , , baches, partition, . .



partition , . insert application retry, - . - — - , - , . Spark , , . join Spark, .


