Overview

This class covers database design:

  1. design data with class and inheritance
  2. design a user system (Netflix 2015)
  3. design a payment system (Yelp, BigCommerce 2015)

Question 1

design account (login/out) system for our radio app.

Step one, scenario

  1. register, update, remove account
  2. login/out
  3. user balance, VIP services

Step Two, necessary

  1. Ask

    1. total user: 100 million
    2. daily user: 1 million
  2. predict

    1. daily active user in 3 month: 10 million
    2. register percentage: 1%
    3. daily new register: 100 thousand
  3. more predict

    1. login percentage: 15%
    2. average login frequency: 1.2 (ppl may input wrong password 20% of time)
    3. daily login attempts: 10 million _ 15% _ 1.2 = 1.8 million
    4. average login frequency: 1.8 million / 24hr = 21 login/sec
    5. normal login frequency: 21 * 2 = 42 login/sec
    6. peak login frequency: 42 * 3 = 126 login/sec

Step Three, Application

Step Four, Kilobit

Data - User table should contain name and password. What else?

class User {
    int userId; (primary key)
    String name;
    String password;
}

Data - User Table

class UserTable {
    list<User> table;

    public insert(){}
    public delete(User){}
    public update(User){}
    public User select(){}
}

CRUD, (Sometimes called SCRUD with an “S” for Search) are the four basic functions of persistent storage.

Question 2

verification and forbidden accounts

We have to know the concept of Account State Lifecycle Graph:

  1. ban: fake user, advertising users… bannned by the management

  2. inactive: user choose to suspend his own account, voluntarily.

  3. delete account: normally we won’t remove all related data (just make userId as “deleted”). Otherwise a lot of data can be violated. All your chatting history actually remains.

redesign User Table

Old User table:

class User {
    int userId; (primary key)
    String name;
    String password;
}

Modified User table:

class User {
    int userId;
    char name[10];
    char hiddenPassword[10];
    int state;
}
  1. We added state, to support Account life cycle.

  2. We changed username to fixed size, for better performance on searching and storing. Can prevent certain attacks, too.

  3. save encrypted password.

Question 3

design login/out process

  1. User account auto logged out after a certain period of time.
  2. multiple account logged in at same time.

Session

Session is a conversation between user and server.

  1. User can have >1 session, if he log in from different devices.
  2. Session must be verified, thus we have to keep sessionId.

Session status: “iPad online”, “PC online”…

Modify User table:

class User {
    int userId;
    char name[10];
    char hiddenPassword[10];
    int state;
    List<session> sessionList;
}

Important in Session table:

  1. device ID

  2. time-out period

    class Session {
    private sessionId;
    int userId;

     int deviceCode;
     long timeOut;
    

    }

User table would include a session list.

further improvement: session

  1. we update sessionList very frequently.
  2. size of sessionList is dynamic.

Solution: seperate the table.

Question: When to clean up the session data (considering huge amount of data and frequent calculation)?

Answer: every end of day. Or store sessions in-memory, so it lose all the data when machine restarts (it is used in Gaming platforms). Or we can clean up one user’s session list whenever the user did a new log-in.

We do not remove session whenever it expires. It’s too much calculation.

further improvement: inheritance

Apply inheritance to UserTable and SessionTable:

class Table {
    list<Row*> table;

    public insert(){}
    public delete(){}
    public update(){}
    public User select(){}
}

class UserTable extends Table {
}

class SessionTable extends Table {
}

As for the Row class:

class Row {
    List<Attributes> row;
}

class User extends Row {
}

class Session extends Row {
}

Question 4

design search algorithm

  1. find my userId
  2. find my userId range

Solution 1: add HashMap in the table. Can do search in O(1), but can’t find range.

Solution 2: BST data structure. Can do search range and search in O(log2 n) time.

Best solution: B+ tree

B+ tree - everything in O(logb n) which is close to constant time.

Plus, B+ tree is hard disk friendly. Read more on a future post.