Data Patterns

June 7, 2007


The purpose of this post is to help you find the right pattern for your data handling.


Data personality is the set of qualities we decide our data has. In other words, we make assumptions about our data, and those assumptions lead us to the correct pattern. To understand the “data personality” we should ask several questions.

The questions fall into 3 groups:

1. The Database Sharing:

A. Is it a Private Database? – “My Database”
Only one application can access that data.

B. Is it a Shared Database? – “My Shared Database”
Many users can access this data concurrently.

C. Is it a database shared by a particular group of users? – “Our Database”
Only a limited group of users accesses this data concurrently.

2. Timeliness and change rate (read only vs. read write, quick vs. slow change rate):

A. Is it “Read Only” data? – “Referenced Database”

B. Is it “Read Write” data? – “Changing Database”

a. Does it change quickly?

b. Do you need a fresh view? – “Fresh Database”
Whenever you look, you get the exact, current data.

c. Is the data stale, and how stale is it? – “Stale Database”

C. Does it always contain true, consistent data? – “Correct Database”

3. Locality and the amount of data (is it a huge database, that is, do you have a lot of data?):

A. Is this data distributed? Can it be distributed?

a. Is your data scattered across different databases? – “Distributed Database”

b. Do you have different types of data in different places? – “My Distributed Data”

c. Is your data distributed across geographically different places?

B. What view of data do you need?

a. Does this view exist in any one database?

C. How many users use the data?

a. Are there any scalability issues?
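
One way to keep the answers to these questions in one place is to record them as a small data structure. The sketch below is only an illustration: the class, enum, and field names are hypothetical, and the quoted labels are the ones used in the questions above.

```python
from dataclasses import dataclass
from enum import Enum


class Sharing(Enum):
    PRIVATE = "My Database"          # only one application accesses the data
    SHARED = "My Shared Database"    # many concurrent users
    GROUP = "Our Database"           # a particular group of users


class Timeliness(Enum):
    REFERENCED = "Referenced Database"  # read only
    FRESH = "Fresh Database"            # read write, readers need current data
    STALE = "Stale Database"            # read write, readers tolerate old data


@dataclass(frozen=True)
class DataPersonality:
    """Hypothetical record of the assumptions we make about a data set."""
    sharing: Sharing
    timeliness: Timeliness
    distributed: bool

    def describe(self) -> str:
        location = "distributed" if self.distributed else "single-site"
        return f"{self.sharing.value} / {self.timeliness.value} / {location}"


# Example: a product catalog that many users read and nobody edits online.
catalog = DataPersonality(Sharing.SHARED, Timeliness.REFERENCED, distributed=True)
print(catalog.describe())
```

Writing the personality down this way makes the assumptions explicit before a pattern is chosen.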

Patterns


Patterns can also be divided into 3 groups of considerations.
Ask yourself about each of the following considerations:

1. Access:

1. Direct Access – “My Database”

a. Single-user, private, local database

2. Remote Access (Client Server) – The database is shared

a. Two tier, small number of users.

b. Simple concurrency

3. Intermediated Access

a. You need to handle problems in a layer between the client and the database.

b. Pooling (Scalability – number of locks and connections)

c. Security (The database and the user live in different security realms)

d. Validation & Error handling

e. Data Transformation before the database

f. Code Manageability
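
To make intermediated access more concrete, here is a minimal sketch of a layer that sits between the client and the database and takes care of pooling, validation, and data transformation. It is an illustration under assumptions, not a reference design: the `CustomerGateway` class, the `customers` table, and the email rule are all hypothetical, and SQLite from the Python standard library stands in for the real database.

```python
import os
import queue
import sqlite3
import tempfile


class CustomerGateway:
    """Hypothetical intermediary between clients and the database."""

    def __init__(self, db_path: str, pool_size: int = 2):
        # Pooling: a fixed set of connections is shared by all callers.
        self._pool = queue.Queue()
        for _ in range(pool_size):
            self._pool.put(sqlite3.connect(db_path, check_same_thread=False))
        conn = self._pool.get()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "CREATE TABLE IF NOT EXISTS customers "
                    "(email TEXT PRIMARY KEY, name TEXT)"
                )
        finally:
            self._pool.put(conn)

    def add_customer(self, name: str, email: str) -> None:
        # Validation: reject bad input before it reaches the database.
        if "@" not in email:
            raise ValueError(f"invalid email: {email!r}")
        # Transformation: normalize the data into its stored form.
        row = (email.strip().lower(), name.strip())
        conn = self._pool.get()
        try:
            with conn:
                conn.execute("INSERT INTO customers VALUES (?, ?)", row)
        finally:
            self._pool.put(conn)  # always return the connection to the pool

    def find_name(self, email: str):
        conn = self._pool.get()
        try:
            rows = conn.execute(
                "SELECT name FROM customers WHERE email = ?",
                (email.strip().lower(),),
            ).fetchall()
            return rows[0][0] if rows else None
        finally:
            self._pool.put(conn)


db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
gateway = CustomerGateway(db_path)
gateway.add_customer(" Dana ", "Dana@Example.com ")
print(gateway.find_name("dana@example.com"))
```

The client never sees a connection or a SQL statement, so pooling limits, validation rules, and the stored format can change in one place.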

2. Error handling or Concurrency

1. ACID Transaction

a. Use a distributed transaction when possible

2. Accounting

a. List the problems and deal with them one by one

3. Compensation

a. Create a proper error path
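
The compensation idea can be sketched as a list of steps, each paired with an action that undoes it. This is a minimal illustration that assumes every step has a meaningful undo. The helper `run_with_compensation` and the order-processing functions are hypothetical names invented for the example.

```python
def run_with_compensation(steps):
    """Run (action, undo) pairs in order.

    If an action fails, the undo functions of the steps that already
    completed are called in reverse order, then the error is re-raised.
    """
    completed = []
    try:
        for action, undo in steps:
            action()
            completed.append(undo)
    except Exception:
        for undo in reversed(completed):
            undo()
        raise


inventory = {"widget": 5}
refunds = []


def reserve_stock():
    inventory["widget"] -= 1


def release_stock():
    inventory["widget"] += 1


def charge_card():
    # Simulated failure in the second step.
    raise RuntimeError("payment service unavailable")


def refund_card():
    refunds.append("refund")


try:
    run_with_compensation([
        (reserve_stock, release_stock),
        (charge_card, refund_card),
    ])
except RuntimeError as error:
    print("order failed:", error)

print(inventory)
```

Because the charge never succeeded, only the stock reservation is compensated, and the inventory ends up where it started.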

3. Distribution

1. Caching & Snapshotting

a. Transformation from one set of “timeliness and change” assumptions to another (example: Read Write to Stale).

2. Federation

a. Create a view of data from many different sources that does not exist in any one of them. Build another layer in front of your databases, with full CRUD capabilities, that exposes the exact view you need. The data is scattered across many databases, but your app does not know this.

3. Replication

a. You replicate data in different locations for scalability

b. For “Read Only” data / Very slow changing data

4. Distribution

a. You distribute data to different locations for scalability & simplicity

b. This is Replication but for Read Write data

c. Splitting the writable data on many machines

d. A router or broker sends you to the right database.

5. Reporting

a. “Read Only” and “Read Write” at the same time.

b. Copy the data you need for the report to another DB, and continue the read write work on the original DB
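
As an illustration of the Distribution pattern in item 4 above, the sketch below splits writable data across several stores and lets a router pick the right one for each key. The post does not say how the router decides, so hashing the key is an assumption made for this example. The `ShardRouter` class is hypothetical, and plain dictionaries stand in for the databases.

```python
import hashlib


class ShardRouter:
    """Hypothetical router that sends each key to one of several stores."""

    def __init__(self, shard_count: int):
        # Each dict stands in for a separate writable database.
        self._shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, key: str) -> dict:
        # A stable hash, so the same key always lands on the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._shards)
        return self._shards[index]

    def put(self, key: str, value) -> None:
        self._shard_for(key)[key] = value

    def get(self, key: str):
        return self._shard_for(key).get(key)

    def shard_sizes(self):
        return [len(shard) for shard in self._shards]


router = ShardRouter(shard_count=3)
router.put("user:1", {"name": "Dana"})
router.put("user:2", {"name": "Noa"})
router.put("user:3", {"name": "Avi"})

print(router.get("user:1"))
print(router.shard_sizes())
```

The application only talks to the router, so it does not need to know how many databases exist or where a given record lives.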


Enjoy


Manu Cohen-Yashar


 
