Overview
This class covers database design:
- design data with classes and inheritance
- design a user system (Netflix 2015)
- design a payment system (Yelp, BigCommerce 2015)
Question 1
Design an account (login/logout) system for our radio app.
Step One, Scenario
- register, update, remove account
- login/out
- user balance, VIP services
Step Two, Necessary
Ask
- total users: 100 million
- daily users: 1 million
Predict
- daily active users in 3 months: 10 million
- registration percentage: 1%
- daily new registrations: 10 million * 1% = 100 thousand
Predict further
- login percentage: 15%
- average attempts per login: 1.2 (people enter a wrong password about 20% of the time)
- daily login attempts: 10 million * 15% * 1.2 = 1.8 million
- average login frequency: 1.8 million / 86,400 sec ≈ 21 logins/sec
- normal login frequency: 21 * 2 = 42 logins/sec
- peak login frequency: 42 * 3 = 126 logins/sec
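The estimate above can be reproduced in a few lines of code. This is a minimal sketch that only restates the figures from the list; the class and method names are made up for illustration.

```java
// Back-of-envelope login traffic estimate from the figures listed above.
final class LoginEstimate {
    static final long DAILY_ACTIVE_USERS = 10_000_000L; // predicted daily active users in 3 months
    static final double LOGIN_RATIO = 0.15;             // share of daily users who log in
    static final double ATTEMPTS_PER_LOGIN = 1.2;       // about 20% of logins need a retry
    static final long SECONDS_PER_DAY = 24L * 60 * 60;  // 86,400

    // Total login attempts per day.
    static double dailyAttempts() {
        return DAILY_ACTIVE_USERS * LOGIN_RATIO * ATTEMPTS_PER_LOGIN;
    }

    // Average rate, rounded to whole logins per second as in the notes.
    static long averagePerSecond() {
        return Math.round(dailyAttempts() / SECONDS_PER_DAY);
    }

    // The notes double the average for normal load and triple that for peak load.
    static long normalPerSecond() { return averagePerSecond() * 2; }

    static long peakPerSecond() { return normalPerSecond() * 3; }

    public static void main(String[] args) {
        System.out.printf("daily attempts: %.0f%n", dailyAttempts());
        System.out.println("average logins/sec: " + averagePerSecond());
        System.out.println("normal logins/sec:  " + normalPerSecond());
        System.out.println("peak logins/sec:    " + peakPerSecond());
    }
}
```

The exact average is about 20.83 logins/sec, which the notes round up to 21 before applying the multipliers.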
Step Three, Application
Step Four, Kilobit
Data - User table should contain name and password. What else?
class User {
    int userId;      // primary key
    String name;
    String password;
}
Data - User Table
class UserTable {
    List<User> table;
    public void insert(User user) {}
    public void delete(User user) {}
    public void update(User user) {}
    public User select(int userId) { return null; } // placeholder
}
CRUD (create, read, update, delete; sometimes called SCRUD, with an “S” for search) names the four basic functions of persistent storage.
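To make the four operations concrete, here is a minimal in-memory sketch. It assumes the table is backed by a HashMap keyed by userId rather than a list, and the wrapper class CrudSketch and the boolean return values are illustrative choices, not part of the original design.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class CrudSketch {
    static final class User {
        final int userId;      // primary key
        final String name;
        final String password;

        User(int userId, String name, String password) {
            this.userId = userId;
            this.name = name;
            this.password = password;
        }
    }

    static final class UserTable {
        // Assumption: rows are indexed by primary key for direct lookup.
        private final Map<Integer, User> rows = new HashMap<>();

        // Create: fails if the primary key is already taken.
        boolean insert(User user) {
            return rows.putIfAbsent(user.userId, user) == null;
        }

        // Read: empty if no such user exists.
        Optional<User> select(int userId) {
            return Optional.ofNullable(rows.get(userId));
        }

        // Update: replaces the row only if the user already exists.
        boolean update(User user) {
            return rows.replace(user.userId, user) != null;
        }

        // Delete: returns whether a row was actually removed.
        boolean delete(int userId) {
            return rows.remove(userId) != null;
        }
    }
}
```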
Question 2
Verification and banned accounts
We need the concept of an account state lifecycle graph:
ban: fake users, advertising users… banned by the management.
inactive: the user chooses to suspend their own account voluntarily.
delete account: normally we won’t remove all related data (we just mark the userId as “deleted”). Otherwise a lot of data that references the user would be left inconsistent. All your chat history actually remains.
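The lifecycle can be sketched as an enum with a table of allowed transitions. The state names follow the list above, but the specific transitions (for example, that a banned account can be reinstated and that DELETED is terminal) are assumptions made for illustration; the notes do not spell out the graph.

```java
import java.util.EnumSet;
import java.util.Set;

final class AccountLifecycle {
    enum State { ACTIVE, INACTIVE, BANNED, DELETED }

    // Assumed transition graph; adjust to the real product rules.
    static Set<State> allowedNext(State from) {
        switch (from) {
            case ACTIVE:
                return EnumSet.of(State.INACTIVE, State.BANNED, State.DELETED);
            case INACTIVE:
                // The user may come back or delete the account.
                return EnumSet.of(State.ACTIVE, State.DELETED);
            case BANNED:
                // Management may lift the ban or delete the account.
                return EnumSet.of(State.ACTIVE, State.DELETED);
            default:
                // DELETED is a soft delete: the row stays, but nothing follows it.
                return EnumSet.noneOf(State.class);
        }
    }

    static boolean canTransition(State from, State to) {
        return allowedNext(from).contains(to);
    }
}
```

Because deletion is only a state change, rows in other tables that reference the userId (such as chat history) stay valid.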
redesign User Table
Old User table:
class User {
    int userId;      // primary key
    String name;
    String password;
}
Modified User table:
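class User {
    int userId;                            // primary key
    char[] name = new char[10];            // fixed-size username
    char[] hiddenPassword = new char[10];  // protected password, never plaintext
    int state;                             // account lifecycle state
}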
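We added state to support the account lifecycle.
We changed the username to a fixed size for better performance when searching and storing. A bounded length can also prevent certain attacks.
We store the password only in protected form (in practice a salted hash rather than reversible encryption), never as plaintext.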
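One common way to protect a stored password is a salted, deliberately slow hash. The sketch below uses PBKDF2 from the Java standard library; the iteration count, key length, and the PasswordHasher class itself are illustrative assumptions, not something the original design specifies. Note that a 32-byte hash plus its salt would not fit in the 10-character hiddenPassword field shown above, so that field would need to grow.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class PasswordHasher {
    private static final int ITERATIONS = 100_000; // illustrative work factor
    private static final int KEY_BITS = 256;       // 32-byte hash
    private static final int SALT_BYTES = 16;

    // A fresh random salt per user, stored next to the hash.
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derives the value to store instead of the plaintext password.
    static byte[] hash(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        } finally {
            spec.clearPassword(); // wipe the spec's internal copy
        }
    }

    // Re-hashes the attempt and compares in constant time.
    static boolean verify(char[] attempt, byte[] salt, byte[] expected) {
        return MessageDigest.isEqual(hash(attempt, salt), expected);
    }
}
```

At login, the server looks up the user's salt and stored hash, calls verify with the submitted password, and never needs to recover the original password.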
Question 3
Design the login/logout process
- A user account is automatically logged out after a certain period of time.
- One account can be logged in from multiple devices at the same time.
Session
A session is a conversation between a user and the server.
- A user can have more than one session if they log in from different devices.
- Every session must be verified, so we have to keep a sessionId.
Session status: “iPad online”, “PC online”…
Modify User table:
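class User {
    int userId;                            // primary key
    char[] name = new char[10];            // fixed-size username
    char[] hiddenPassword = new char[10];  // protected password, never plaintext
    int state;                             // account lifecycle state
    List<Session> sessionList;             // one entry per logged-in device
}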
Important fields in the Session table:
- device ID
- time-out period
class Session {
    private String sessionId; // type assumed; the original leaves it unspecified
    int userId;
    int deviceCode;
    long timeOut;
}
User table would include a session list.
further improvement: session
- The sessionList is updated very frequently.
- The size of the sessionList is dynamic.
Solution: move sessions into a separate table.
Question: When should we clean up session data, given the large volume of data and the cost of checking it frequently?
Answer: At the end of every day. Alternatively, store sessions in memory, so that all session data is lost when the machine restarts (gaming platforms do this). Or clean up one user’s session list whenever that user logs in again.
We do not remove each session the moment it expires; that would take too much computation.
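The last option, cleaning up a user's sessions lazily at their next login, can be sketched as follows. This is an in-memory illustration under several assumptions: sessions live in a separate store keyed by userId, the caller passes in the current time, and the counter-based sessionId is a placeholder (a real system would use random, unguessable IDs). The SessionStore class and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SessionStore {
    static final class Session {
        final String sessionId;
        final int userId;
        final int deviceCode;
        final long expiresAtMillis;

        Session(String sessionId, int userId, int deviceCode, long expiresAtMillis) {
            this.sessionId = sessionId;
            this.userId = userId;
            this.deviceCode = deviceCode;
            this.expiresAtMillis = expiresAtMillis;
        }

        boolean isExpired(long nowMillis) {
            return nowMillis >= expiresAtMillis;
        }
    }

    // Sessions are kept apart from the User table, one list per user.
    private final Map<Integer, List<Session>> byUser = new HashMap<>();
    private long nextId = 1; // placeholder ID source, not secure

    // Creates a session and drops this user's expired sessions on the way.
    Session login(int userId, int deviceCode, long nowMillis, long ttlMillis) {
        List<Session> sessions = byUser.computeIfAbsent(userId, k -> new ArrayList<>());
        sessions.removeIf(s -> s.isExpired(nowMillis)); // lazy cleanup
        Session created = new Session("s" + nextId++, userId, deviceCode, nowMillis + ttlMillis);
        sessions.add(created);
        return created;
    }

    // Expired sessions may still be stored, so validity is checked on read.
    boolean isValid(int userId, String sessionId, long nowMillis) {
        for (Session s : byUser.getOrDefault(userId, Collections.emptyList())) {
            if (s.sessionId.equals(sessionId)) {
                return !s.isExpired(nowMillis);
            }
        }
        return false;
    }

    int storedCount(int userId) {
        return byUser.getOrDefault(userId, Collections.emptyList()).size();
    }
}
```

The trade-off is that expired sessions linger until the user's next login, so every read has to check the expiry time instead of trusting that a stored session is still live.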
further improvement: inheritance
Apply inheritance to UserTable and SessionTable:
class Table {
    List<Row> table;
    public void insert(Row row) {}
    public void delete(Row row) {}
    public void update(Row row) {}
    public Row select(int id) { return null; } // placeholder
}
class UserTable extends Table {
}
class SessionTable extends Table {
}
As for the Row class:
class Row {
List<Attributes> row;
}
class User extends Row {
}
class Session extends Row {
}
Question 4
design search algorithm
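- find a user by userId
- find all users in a userId range
Solution 1: add a HashMap to the table. Lookup by userId is O(1) on average, but a HashMap cannot answer range queries.
Solution 2: use a binary search tree (BST). If the tree is kept balanced, both a single lookup and finding the start of a range take O(log_2 n) time.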
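The difference between the first two solutions shows up directly in Java's standard collections. This sketch keeps both a HashMap and a TreeMap (a red-black tree, that is, a balanced BST) over the same data; the UserIndexSketch class and storing only the user name as the value are simplifications for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class UserIndexSketch {
    // Solution 1: hash index, average O(1) lookup, no key ordering.
    private final Map<Integer, String> hashIndex = new HashMap<>();
    // Solution 2: balanced BST index, O(log n) lookup, keys kept sorted.
    private final TreeMap<Integer, String> treeIndex = new TreeMap<>();

    void add(int userId, String name) {
        hashIndex.put(userId, name);
        treeIndex.put(userId, name);
    }

    // Point lookup by primary key; returns null if the user is absent.
    String findById(int userId) {
        return hashIndex.get(userId);
    }

    // Range lookup, both ends inclusive, results in ascending userId order.
    // Requires from <= to; TreeMap.subMap rejects an inverted range.
    List<String> findRange(int from, int to) {
        return new ArrayList<>(treeIndex.subMap(from, true, to, true).values());
    }
}
```

Answering findRange with only the HashMap would mean scanning every entry, which is why a sorted structure is needed for the second requirement.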
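Best solution: B+ tree
B+ tree: search, insert, and delete all take O(log_b n), where b is the branching factor. Because b is large, the tree is very shallow, so this is close to constant time in practice.
Plus, a B+ tree is hard-disk friendly. More on that in a future post.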
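To see why O(log_b n) is close to constant, compare tree depths for the 100 million users from the estimate earlier. The sketch below counts how many levels are needed before the tree can reach n keys; the branching factor of 100 is an assumed round number (real values depend on page and key size), and the calculation ignores partially filled nodes.

```java
final class BPlusTreeHeight {
    // Smallest number of levels such that fanout^levels >= keys.
    static int levelsNeeded(long keys, int fanout) {
        if (keys < 1 || fanout < 2) {
            throw new IllegalArgumentException("need keys >= 1 and fanout >= 2");
        }
        int levels = 0;
        long reachable = 1; // keys reachable with the current number of levels
        while (reachable < keys) {
            reachable *= fanout;
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        long users = 100_000_000L; // total users from the estimate
        System.out.println("binary tree levels:   " + levelsNeeded(users, 2));
        System.out.println("fanout-100 levels:    " + levelsNeeded(users, 100));
    }
}
```

Under these assumptions a balanced binary tree needs 27 levels while a tree with fanout 100 needs only 4, and each level of a B+ tree corresponds to roughly one disk page read.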