Distriartistas!

martes, 14 de septiembre de 2021

DynamoDB - Data Modeling

Steps of the process

  • Draw an entity diagram
  • Identify the relationships between entities (1:1, 1:N, N:M)
  • List down all the access pattern for each entity
  • Identity the primary key ( Hash + Sort) for each entity
  • Identify the secondary indexes for additional access patterns if required

Example | Project Management Tool

  • SAAS Tool
  • Multi-Tenant Support
  • Manage Projects and Employees

Step 01 - Draw and ERD

Diagram ERD

Step 02 - Identify Relationships

Relationships

Step 03 - Identify Access Patterns

Organization

  • Organization CRUD operations
  • Find all the projects of an organization
  • Find all the employees of and organization
  • Find all projects and employees of an organization
  • Find an organization by name

Project

  • Project CRUD operations
  • Find a project by name
  • Find the employees assigned to a project
  • Find agile projects
  • Find fixed-bid projects
  • Find on hold projects

Employee

  • Employee CRUD operations
  • Find all the projects an employee is part of
  • Find an employee by name

Project-Employee

  • Assign an employee to a project

Step 04 - Identify Primary Key

Identify primary key

  • Pk must be unique

Example Queries

Organization (org-id=1234)

Organization

  • Organization CRUD operations - PK=ORG#1234, SK=#METADATA·1234
  • Find all the projects of an organization - PK=ORG#1234, SK begins_with(PRO#)
  • Find all the employees of an organization - PK=ORG#1234, SK begins_with(EMP#)
  • Find both employees and projects - PK=ORG#1234
  • Find organization by name - Not satisfied yet

Project (org-id=1234, project-id=100)

Project

  • Project CRUD operations - PK= ORG#1234, SK=PRO#agile#100
  • Find a project by name - Not yet satisfied
  • Find the employees assigned to a project - PK=ORG#1234#PRO#100
  • Find agile projects - PK=ORG#1234, SK begins_with(PRO#agile)
  • Find fixed-bid projects - PK=ORG#1234, SK begins_with(PRO#fixed-bid)
  • Find on-hold projects - Not yet satisfied

Employee(org-id=1234, emp-id=300)

  • Employee CRUD operations - PK= ORG#1234, SK EMP#300
  • Find all projects an employee is part of - No yet satisfied
  • Find all employees by name - No yet satisfied

Step 05 - Identify Secondary Indexes

  • Find all projects an employee is part of - Use an inverted index

Project

Query on GSI - PK = ORG#1234#EMP#300

  • Find all organizations, projects, employees by name - GSI Overloading

Filter by name

Find by org name - PK=ORG#1234, filterName=ORG#HappyInc

Find by emp name - PK=ORG#1234, filterName=EMP#Manoj

  • Find On-hold projects - Use a Sparse Index

On hold project

You can Query or scan the GSI to find all the on-hold projects

Using Filter Conditions

  • These are used after the run queries and is necessary to add an extra filter with fields that aren’t indexes nor keys.
  • A filter query is applied after the query finished, but before the results are returned. A query consumes the same amount of read capacity, it not been affected if a filter condition exist
  • Use filter conditions if the secondary indexes cost more than the filter conditions due to low query velocity or frequency.

Inverted Index

Inverted index

Query on GSI - PK=ORG#1234#EMP#300

The reason to use an inverted index is because of the value to filter that I need is the SK value, so, I can’t use a query method without specified a PK, in this way we can swap the PK and SK to become SK in PK and now is possible to do the query.

Note: It’s necessary create an index table

GSI

GSI

It’s used when you need to make a search using a field different to SK to complement the search keys, so, you can create a GSI to make possible it query, in above example, the GSI is created for enable the filter by name query using PK + a new field called Data

  • Find by org name - PK=ORG1234, Data=ORG#HappyInc
  • Find by emp name - PK=ORG#1234, Data=EMP#Manoj

Note: It’s necessary create an index table

Converted from markdown using https://stackedit.io/app#