Saturday, 7 September 2013

Persistence ( in reference to databases usage in programming )

 When we go through with persistence we have these most important topics to know about.

  •  MySql
  • SqlServer
  • MySql Work bench 
  • Query optimization
  • No sql

Mysql workbench has three very important portions
  1. SQL Development
  2. Data Modeling : Complete visual design and modeling
  3. Database Administration : GUI to start stop server, User accounts, Manage configuration files
Query Optemization

  • Why 
          - Large databases
          - Complex queries to reterieve data
          - More chances of table lock and data corruption
  •  How
          - Understanding of how database is executing our queries i.e.EXPLAIN keyword
            is used to see the query plan ( in Mysql).
          - Retrieve as little data as possible  ( do not use select * )
          - Store intermediate results i.e.User index, server tuning, an aggregate table,
            vertical partitioning ( partition the table by column decreases the amount of data
            to be process ),Horizontal partitioning ( partition the table by data value),
            De-normalization ( avoid joins).

  • Non relational databases
  • Designed for distributed data stores to store large scale of data ( google, facebook )
  • No fixed schema, scale horizontally
  • Properties
            Stands for not only SQL
            No declarative query language
            No predefined schema
            Key value pair storage,column store,document store,graph store
            Unstructured and unpredictable data
            High performance , high availability and scalability

  • CAP Theory
          1)  'C' for  Consistency : After each operation all users must see the same data.
          2) 'A' for Availability : No down time.
          3) 'P' for  Partition tolerance : System must continues to function even
           communication among the servers is unreliable.
         4) not possible to provide all 3 at same time, all available databases of the time are
            providing   the 2 of 3 above described i.e.
             I- CA
            II- CP
           III- AP

  •  Data storage formats
         - Key-value pairs ( Cassandra )
         - Document ( MongoDB)
         - XML format ( BaseX)
         - Graph format ( Neo4j)
  • Disadvantages
         - No standards
         - Limited query capabilities