Twitter architecture

target TPS: read 1000 tps, write 100 tps

data model

user
- id
- name
- email
tweet
- id
- content
- created
- type: retweet|reply|
follower N:M mapping
- Twitter reportedly used a graph DB called Flock (FlockDB) for this. It stores following, follower, block, and similar relationships.
- uid
- follow_uid
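
A minimal sketch of the data model above as Python dataclasses, assuming that `uid` is the follower and `follow_uid` is the user being followed (the notes do not say which direction the edge points). `TweetType`, `Follow`, `followers_of`, and `following_of` are hypothetical names used only for illustration, and the in-memory set stands in for the graph DB.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class TweetType(Enum):
    # Only the two kinds named in the notes; a real system may have more.
    RETWEET = "retweet"
    REPLY = "reply"


@dataclass
class User:
    id: int
    name: str
    email: str


@dataclass
class Tweet:
    id: int
    content: str
    created: datetime
    type: Optional[TweetType] = None  # assumption: None means a plain tweet


@dataclass(frozen=True)
class Follow:
    uid: int         # assumption: the follower
    follow_uid: int  # assumption: the user being followed


# The N:M mapping is just a set of (uid, follow_uid) edges.
follows = {Follow(1, 2), Follow(1, 3), Follow(2, 3)}


def followers_of(uid: int) -> list:
    """Users who follow `uid`."""
    return sorted(f.uid for f in follows if f.follow_uid == uid)


def following_of(uid: int) -> list:
    """Users that `uid` follows."""
    return sorted(f.follow_uid for f in follows if f.uid == uid)
```

With the sample edges, user 3 is followed by users 1 and 2, and user 1 follows users 2 and 3. Both directions of the lookup come from the same edge set, which is why a graph store fits this mapping.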

For the DB, use NoSQL (DynamoDB, MongoDB) or sharded MySQL. Shard by timestamp.
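
One way timestamp sharding could look, as a rough sketch: each tweet is routed to a shard named after the month it was created. The monthly window and the `tweets_YYYY_MM` naming are assumptions for illustration; the notes only say that sharding is by timestamp.

```python
from datetime import datetime, timezone


def shard_name(created: datetime) -> str:
    """Map a tweet's creation time to a hypothetical monthly shard name.

    Expects a timezone-aware datetime; it is normalized to UTC so that
    every server picks the same shard for the same instant.
    """
    created = created.astimezone(timezone.utc)
    return f"tweets_{created.year:04d}_{created.month:02d}"


# Example: a tweet created on 2012-03-15 UTC goes to shard "tweets_2012_03".
example = shard_name(datetime(2012, 3, 15, tzinfo=timezone.utc))
```

A trade-off of this scheme is that all new writes land on the most recent shard, while older shards serve mostly reads.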
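Twitter gets far more read requests than writes, so the design should be optimized for reads.
- Cache timelines in a large Redis cluster.
- When a user posts a tweet, push it to each follower's Redis cache. For celebrities with many followers, such as Lady Gaga or Obama, this can take a considerable amount of time.
Up to 800 tweets of each home timeline are stored in Redis.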
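
The push-on-write flow above can be sketched in memory as follows. This is not Twitter's actual code: plain dicts stand in for the graph DB and the Redis cluster, and `follow`, `post_tweet`, and `read_home_timeline` are hypothetical names. In real Redis the push-and-cap step would likely be an `LPUSH` followed by `LTRIM key 0 799`, which is an assumption and not something the notes state.

```python
from collections import defaultdict, deque

MAX_HOME_TIMELINE = 800  # cap taken from the notes

# author uid -> set of follower uids (stand-in for the graph DB)
followers = defaultdict(set)

# uid -> newest-first tweet ids (stand-in for the Redis timeline cache)
home_timeline = defaultdict(lambda: deque(maxlen=MAX_HOME_TIMELINE))


def follow(uid: int, follow_uid: int) -> None:
    """Record that `uid` follows `follow_uid`."""
    followers[follow_uid].add(uid)


def post_tweet(author_uid: int, tweet_id: int) -> int:
    """Fan out on write: push the tweet id into every follower's cache.

    Returns the number of caches written, which is the cost that grows
    with follower count.
    """
    fanout = followers[author_uid]
    for uid in fanout:
        # appendleft on a full deque drops the oldest id from the right.
        home_timeline[uid].appendleft(tweet_id)
    return len(fanout)


def read_home_timeline(uid: int, count: int = 20) -> list:
    """Reads are a single cache lookup with no joins at read time."""
    return list(home_timeline[uid])[:count]
```

The write cost is proportional to the number of followers, which is why an account with millions of followers makes `post_tweet` slow, while every read stays a single lookup.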