Relational schema design for LinkedIn
Today, we are going to discuss an important topic that is ignored most of the time by beginners and NoSQL background devs while developing any application or system.
As we know the database is the main part of any system if it’s not designed well then the survival of our system is not going for a long time or we have to do a lot of code or investment to make it work.
So first let’s understand the term relational schema, Relation schema defines the design and structure of the relationship like it consists of the relation name, set of attributes/field names/column names. every attribute would have an associated domain.
To do a design schema for a relational data store first we need to follow these steps:-
List out all the features
Identify entities related to the feature.
So let’s understand in a better way by doing relation schema design for LinkedIn.
Note: I highlighted entities in the features list and we can do it in the same way.
Features of LinkedIn(Basic flow)
The User should be able to view or edit profile information.
On the top left the user should have a profile photo
On the right side of the profile pic, the name and contact information should be displayed.
Users can store or manage Skills, Education, Companies.
A user may send a Connection request to other users. unless the recipient approves the connection request, the sender will just be a follower. once the connection request is accepted, both sender and recipient become followers of each other.
A user should be able to write a Post.
A user should be able to Comment on a post or reply to a comment.
A user should be able to Like a post or a comment.
We can manage user basic details in the user's entity with columns id, name, email, profile_pic, and then we converted features 1 ,2 & 3.
As we can see in the 4th point, there are 3 functionalities are required so we have to create 3 more entities to develop these features. so we can create these entities with one too many relationships with users entity because users can have more than one skill, company, or education.
But there is a problem as we can understand different users may have some skill, education, and company so in this case data repetition will happen in these 3 entities and it’s not good practice for any DB schema.
To resolve this problem we will create master entities for each feature like skills will be a master entity and user_skills another entity to store skills for users(Many to Many) and the same as for education & company.
And in the above scenario, there will be a relationship many to many where a single user can have multiple records(skills, companies, education) and a record(skill, company, education) can belong to multiple users.
Let’s move on to point 5th where we need a connection feature. so to implement that simply we can create an entity connection.
And connections entity contains four columns id, sender_user_id, receiver_user_id, is_accepted(bool).
where we store the request sender in sender_user_id and receiver in receiver_user_id and by default the value of is_accepted will be false and it will change as the receiver accepts the request otherwise sender will be a follower only not a connection.
For feature 6th where we need to maintain user posts so simply, we can create entity posts and it will contain these columns id, user_id, post_text and it will be one to many relationships in between users and posts entities because a single user can have multiple posts.
For feature 7th where we need to develop a feature for comments on a post and to do that we will create entity comments and these are columns id, user_id, post_id, comment_text and it will be one to many relationship in between users and posts.
And for the last feature where user can like a comment or post, we can do it by adding a column in related entities and do increment as someone like that specific post or comment. But it’s not a good approach while we are developing a normalized DB schema. so to handle that we will create a separate entity likes and will manage our likes there we need four columns there and these are id, user_id, entity_id, entity_type and simply we can do an entry in this entity as someone likes any comment or post and we can fetch likes by using aggregate function count with where condition.
Before you go… Thanks for reading the article! If you enjoyed it, please don’t forget to show your appreciation by sharing this blog.