NoSQL and Web applications

Wednesday, March 24th, 2010

If I’m asked to draw an “easy to understand” diagram about the next-generation architecture for Web-Applications, one of my sketches would look like this:

SQL And NoSQL

An obvious question about this picture is:

Why are two different Database-Systems necessary?

My artless answer is as follows:

In some crucial cases a NoSQL Database is considerably less expensive than a SQL Database, but a NoSQL Database cannot completely replace a SQL Database.

For example:

  • Try to store millions of Media Files or other large Documents on a distributed RDBMS. Each document has to be quickly accessible and updatable.
  • Try to create a very large dynamic key/value table as a replacement for an overgrown in-memory Hashmap.

Of course you can also achieve this with a relational database, but you will face the following problems:

  • RDBMS BLOBs are slow. You can compensate for this problem with database clustering, but the typical workaround is to store a link or file path and use the file system as data storage. In this case you always have to check for inconsistencies.
  • You can create a key/value store in your SQL database, but each access incurs heavy processing overhead (transaction handling, logging and versioning).

NoSQL-Databases are optimized and designed to:

  • Store and access large “binary” objects very fast.
  • Accomplish fast data replication and use table separation (sharding) between different storages.
  • Provide very fast key-value stores.
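
As a rough illustration of the sharding idea in the list above, the following sketch maps a key to one of several storage nodes by hashing. It is a generic hash-partitioning scheme chosen for brevity, not the strategy of MongoDB or any other particular product; `ShardRouter` and the node names are hypothetical.

```scala
object ShardRouter {
  // Picks the node responsible for a key: hash the key, then take the
  // non-negative remainder so that every key maps to exactly one node.
  // (Assumption for this sketch: the list of nodes is fixed.)
  def nodeFor(key: String, nodes: Vector[String]): String = {
    require(nodes.nonEmpty, "at least one storage node is required")
    nodes(java.lang.Math.floorMod(key.hashCode, nodes.size))
  }
}
```

A call such as `ShardRouter.nodeFor("user:42", Vector("node-a", "node-b", "node-c"))` always returns the same node for the same key, which is what lets reads find the data that an earlier write stored.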

(Easy schema modification is often cited as a NoSQL advantage, but I think this is also a trivial task with DDL. Just don’t forget to set the initial value of the new column and to enable the trigger after altering the table.)

In contrast, there are tasks for which you should use a SQL database:

  • Data Presentation: In most cases visual entities (for example table rows and columns) aren’t 1:1 representations of database objects. They are sorted, mapped, reduced (filtered) and often consist of joins between different tables. Map-reduce and sorting can be done efficiently with NoSQL, but joins aren’t supported. Of course you can rewrite the JOIN operation in your application. For example, a simplified hash-map JOIN could look like this:
    // Build a lookup map from table B, keyed by its join attribute.
    val tableBMap = tableB.map(line => line.joinAttributB -> line).toMap
    // Probe the map once for every line of table A.
    tableA.map(line => (line, tableBMap.get(line.joinAttributA)))

    The disadvantage is that you have to materialize the complete object / table line.

  • NoSQL = No Transaction. You cannot use NoSQL for reliable database transactions (reliable as in ACID). This is a serious restriction for massively parallel database updates. (Hence, you should avoid programming the popular Bank Account example with a NoSQL database.)
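
The hash-map join from the Data Presentation item above can be written out as a small self-contained sketch. The case classes `Author` and `Article` and their fields are made up for illustration; only the build-then-probe pattern comes from the text. The sketch assumes that the join attribute is unique in the second table, because `toMap` keeps only the last row for a duplicated key.

```scala
// Hypothetical row types; the names are assumptions for this sketch.
final case class Author(id: Int, name: String)
final case class Article(title: String, authorId: Int)

object HashJoin {
  // Build phase: index the rows of table B by their join attribute.
  // Probe phase: look up each row of table A in that index.
  // Rows of A without a partner are kept with None (a left outer join).
  def leftJoin[A, B, K](tableA: Seq[A], tableB: Seq[B])(
      keyA: A => K,
      keyB: B => K
  ): Seq[(A, Option[B])] = {
    val index: Map[K, B] = tableB.map(row => keyB(row) -> row).toMap
    tableA.map(row => (row, index.get(keyA(row))))
  }
}
```

With a list of articles and a list of authors, `HashJoin.leftJoin(articles, authors)(_.authorId, _.id)` pairs every article with its author, or with `None` if the author is missing. Both tables have to be held in memory completely, which is the disadvantage mentioned above.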

Why I prefer MongoDB as NoSQL Database System:

  • Very Fast: Sequential write and random read operations are done very fast on an “average” server. (You need a 64-bit OS if your database is larger than 2 GB.)
  • Scala (Java) Support: Several drivers are available: scamongo, mongo-scala-driver or akka-persistence.
  • Easy to use: There is no setup, parameter or table type “magic”. In less than an hour you can set up a secure and robust database server. The client uses JS/JSON.
  • A nice community: The forum is very active. Questions are usually answered within a couple of hours.

Addendum to my recent Post: FSM stands for “Flying Spaghetti Monster”

In my recent post I used the abbreviation FSM for “Finite State Machine”.

I was informed that FSM is the common abbreviation for the Flying Spaghetti Monster.

I searched for this term in the Kungle News document-storage and found some evidence for this claim:

Small Screendump

Here is the complete list of references:

Query:

Start Main

New Hist Defined with:

Primary: List(Flying Spaghetti Monster, Pastafarianism, Pastafarian)

Secondary in: List(FSM)

Secondary out: List()

Calculated Interval: { “publishingDate” : { “$gt” : “2010-01-10T23:00:00.000Z” , “$lte” : “2010-03-22T23:00:00.000Z”} , “originalLanguage” : “ENGLISH”}

Title | Published / Recorded | Publisher | Citation
Mississippi dips its toe into antireality | Tue Jan 19 15:00:00 CET 2010 | Discover Magazine | Finally they will spread the message of how we were all created by the Flying Spaghetti Monster
Senator Webb (D) shows fear: "Suspend All Votes On Health Care Bill Until Senator-Elect Brown Is Seated" | Thu Jan 21 00:00:28 CET 2010 | Crooks and Liars | it'll be approximately Dec 2012, slightly before she takes office, so I figure what better time for Jebus, Buddah, Allah, Ra, Flying Spaghetti Monster, the Mayans, The 'V' lizard people, the Vogons, Daleks, V'Ger, etc to arrive?
FSM protect us! | Tue Jan 26 19:40:17 CET 2010 | Discover Magazine | Some people say the Church of the Flying Spaghetti Monster was a joke made ...
Video – Who Knew He Could Be A Swedish Hero? | Mon Feb 01 16:00:11 CET 2010 | Dvorak Uncensored | I wonder if it will work with the Flying Spaghetti Monster?
Minority Contractors Receive Just 2 Percent of Highway Stimulus Cash | Thu Feb 04 20:31:08 CET 2010 | Infrastructurist | What if only 2% of all infrastructure construction companies are women or minority owned? Setting a goal of having 10% of all contract recipients be minority or women owned is about as useful as setting a goal of having 10% of all contract recipients go to Wild Hildabeest and Flying Spaghetti Monster owned businesses.
When Did Jesus Become a Republican (or, for that matter, a Democrat? | Mon Feb 15 00:44:01 CET 2010 | Care2 News | I'm a Pastafarian. To me it's the only pure religion.
Do we really need a religious bill of rights? | Mon Feb 15 15:27:14 CET 2010 | Discover Magazine | If the act passes, we need a Pastafarian as an agent provocateur.
Church of the Flying Spaghetti Monster FSM Store | Fri Feb 19 02:33:10 CET 2010 | Suite101 | Church of the Flying Spaghetti Monster FSM Store
Iraq still embracing the magic | Wed Feb 24 02:02:34 CET 2010 | Discover Magazine | No, just kidding, it’s “For Flying Spaghetti Monster’s Sake”
Miss Beverly Hills tries to one-up Carrie Prejean, says it’s divine law that gays be put to death. | Wed Feb 24 15:55:45 CET 2010 | Think Progress | The Flying Spaghetti Monster offers more in life than any pagan based worship.
Video: Republican legislator says disabled children are 'God's punishment' for abortion | Wed Feb 24 19:00:09 CET 2010 | Crooks and Liars | I think my Pastafarianism makes me less able to understand why so many people think a superior being gives a flying noodle what they do or say?
South Dakota legislators tell schools to teach ‘astrological’ explanation for global warming. | Thu Feb 25 19:49:40 CET 2010 | Think Progress | All hail the Flying Spaghetti Monster!
End of an Era: "Lasts" for Shuttle Program | Fri Feb 26 18:32:36 CET 2010 | Universe Today | Some spectacular pictures from the final SRB test. FSM-17, (that's flight support motor, not Flying Spaghetti Monster) burned for approximately 123 seconds — the same time each reusable solid rocket motor burns during an actual space shuttle launch.
Atheist Groups Visit The White House Causing A Right Wing Tizzy | Mon Mar 01 15:45:41 CET 2010 | Dvorak Uncensored | (Before the religious start jumping up and down “See, atheists ARE a religion”, the whole thing is a joke, like the Flying Spaghetti Monster)
Creationists And Climate Deniers Take On Teaching Climate Science In Schools | Thu Mar 04 17:20:14 CET 2010 | Huffingtonpost | I think we can all look forward to the time when these three theories are given equal time in our science classrooms across the country, and eventually the world; One third time for Intelligent Design, one third time for Flying Spaghetti Monsterism (Pastafarianism), and one third time for logical conjecture based on overwhelming observable evidence.
Massa Will Resign Monday | Fri Mar 05 20:43:18 CET 2010 | Talking Points Memo | There is no doubt that there is a Flying Spaghetti Monster. The question is just how it flies, and what kind of sauce it's covered in.
ARD TV drama sparks Scientology's ire | Mon Mar 08 11:34:00 CET 2010 | The Local Germany | What Would the Flying Spaghetti Monster Do?
Christian leaders urge Congress to ignore misinformation on abortion provisions and pass health reform. | Sat Mar 13 18:17:21 CET 2010 | Think Progress | Since a Pastafarian, I will say RAmen. ;)
Kreutz Comet VIDEO: WATCH Newly-Discovered Comet's Collision Course With The Sun | Sun Mar 14 15:36:13 CET 2010 | Huffingtonpost | It's the great noodly appendage of the Flying Spaghetti Monster.
To The 9th Circuit Court Of Appeals, God Is "Patriotic" And No Longer "Religious" | Sun Mar 14 16:00:51 CET 2010 | Crooks and Liars | I'd tell him to substitute Flying Spaghetti Monster where appropriate.
Boehner Claims Student Loan Reform Will ‘Eliminate Every Bank In The Country’ | Fri Mar 19 23:42:19 CET 2010 | Think Progress | The universe could have been created by a Flying Spaghetti Monster, or it could have always existed.

2; 0; 19

Topic Connections:

(fsm,1)(monster,1)(store,1)(flying,1)(church,1)(spaghetti,1)

Emotions:

(love,14)(hope,13) (+)/(-) (hate,7)(fear,7)

Public Tendency:

10; 5; 6

Country:

gb; us; no; cn; jp; in; se; au; ru; de; fr; ie; gr; nz; ca;

0; 20; 0; 0; 0; 0; 1; 0; 0; 0; 0; 0; 0; 0; 0;

Publisher Tendency:

(Discover Magazine,3)(Huffingtonpost,2)(The Local Germany,1)(Think Progress,0)(Talking Points Memo,-1)(Crooks and Liars,-3)

Calculation Done

This Week on Kungle.de: Nobel Prizes, Riots in Pakistan and the “Balloon Boy”

Friday, October 16th, 2009

Three issues, each with hundreds of similar news publications, blocked the front page of Kungle.de. Each publication is interesting and informative by itself, but together they hide other noteworthy information.

I concluded that it was about time to build a new subsystem to reduce the amount of identical information. You can still find all articles via the new “related link”.

The new subsystem “IssueMerger” now merges news with similar content. The older news entries are, the more likely they are to be consolidated into one issue.

For this, I defined a function to calculate the proximity of two entries. (The result is 1 if two news entries are identical and 0 if they are completely different.)

It is necessary to build a complete “News Topology” (a matrix with up to 1.5 million elements) which defines the proximities of all entry combinations.

The calculation for all topics requires up to 40 hours. The algorithm itself was coded in 80 lines of Scala.
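
The proximity function itself is not shown here, so the following is only a sketch of one function with the stated property (1 for identical entries, 0 for entries with nothing in common). It uses the Jaccard index over the word sets of two texts, which is an assumption and not necessarily what IssueMerger computes; `Proximity`, `words` and `topology` are illustrative names.

```scala
object Proximity {
  // Lower-cased word set of a text; splitting on non-letters is a simplification.
  def words(text: String): Set[String] =
    text.toLowerCase.split("[^\\p{L}]+").filter(_.nonEmpty).toSet

  // Jaccard index: number of shared words divided by number of all words.
  // Two texts without any words are treated as identical.
  def proximity(a: String, b: String): Double = {
    val wa = words(a)
    val wb = words(b)
    val all = wa | wb
    if (all.isEmpty) 1.0 else (wa & wb).size.toDouble / all.size
  }

  // Pairwise proximities for all combinations i < j (the matrix is symmetric).
  def topology(entries: IndexedSeq[String]): Map[(Int, Int), Double] =
    (for {
      i <- entries.indices
      j <- (i + 1) until entries.size
    } yield (i, j) -> proximity(entries(i), entries(j))).toMap
}
```

Because every pair of entries is compared, the number of proximity values grows quadratically with the number of entries.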

You can find a calculated result here:
http://www.kungle.de/Trend/entry/220033

Update 1: For comparison, this merge was made by hand:

http://www.kungle.de/Trend/entry/225189

Configure the Apache Derby DB Network/Client Mode in 5 Minutes

Sunday, May 24th, 2009

It is very easy to use Apache Derby with the embedded JDBC driver. The disadvantage of this solution is that only one JVM, and therefore only one application, can open the database at a time. If you are developing an application with data persistence, it is often helpful to monitor the database from a different application (e.g. the Data Source Explorer from Eclipse). If you are not interested in reading the administration guide (http://db.apache.org/derby/docs/dev/adminguide/), you can follow these simple steps to configure Derby for Network/Client Mode:

Step 1: Download Apache Derby

Download the bin distribution from: http://db.apache.org/derby/derby_downloads.html

Step 2: Install

Unpack the archive in an appropriate application directory.

Step 3: Create a database directory

For example: mkdir db_storage

Step 4: Set environment variables:

Set DERBY_HOME and add its bin directory to your system path. Also extend your CLASSPATH with the Derby jars.

For example, on a Mac:

export DERBY_HOME=/Users/.../Programme/db-derby-10.5.1.1-bin

export PATH=$PATH:$DERBY_HOME/bin

export CLASSPATH="$DERBY_HOME/lib/derbynet.jar:$DERBY_HOME/lib/derbytools.jar:$DERBY_HOME/lib/derbyclient.jar:$DERBY_HOME/lib/derby.jar:$CLASSPATH"

On Windows systems:

SET CLASSPATH="%DERBY_HOME%/lib/derbynet.jar;%DERBY_HOME%/lib/derbytools.jar;%DERBY_HOME%/lib/derbyclient.jar;%DERBY_HOME%/lib/derby.jar;%DERBY_HOME%;%CLASSPATH%"

(You probably want to put it in a script or batch.)

Step 5: Create the database

Open a terminal/shell/command prompt, change into the database directory (Step 3) and start ij.

Type:

connect 'jdbc:derby:simple;create=true';
exit;

Step 6: Database Configuration

In your database directory create the file derby.properties.

Add the following lines:

derby.connection.requireAuthentication=TRUE
derby.authentication.provider=BUILTIN
derby.user.alice=mypass

Step 7: Start Derby

Type:

startNetworkServer

on your Mac or

startNetworkServer.bat

on your Windows system.

Step 8: Create a user schema and add a table

Open a new terminal/shell/command prompt, change into the database directory and start ij.

Type:

connect 'jdbc:derby://localhost:1527/simple' user 'alice' password 'mypass';
create schema alice;

create table dummy (id int primary key, text varchar(20));

insert into dummy values (1, 'eins');

insert into dummy values (10, 'zehn');

insert into dummy values (100, 'einhundert');

exit;

Step 9: Connect from your IDE:

You can now use the Derby JDBC network client driver to connect to your database.
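
The same connection also works from your own code. The sketch below reads the `dummy` table created in Step 8 through plain JDBC; it assumes that the server from Step 7 is running on localhost:1527 and that derbyclient.jar is on the classpath. `DerbyClientDemo` and its method names are made up for this example.

```scala
import java.sql.DriverManager

object DerbyClientDemo {
  // Builds the network client URL in the form used in Step 8.
  def url(host: String, port: Int, database: String): String =
    s"jdbc:derby://$host:$port/$database"

  // Reads all rows of the dummy table as (id, text) pairs.
  // Assumption: the Derby network server is reachable on localhost:1527.
  def readDummy(user: String, password: String): List[(Int, String)] = {
    val connection =
      DriverManager.getConnection(url("localhost", 1527, "simple"), user, password)
    try {
      val statement = connection.createStatement()
      val rows = statement.executeQuery("select id, text from dummy order by id")
      val result = List.newBuilder[(Int, String)]
      while (rows.next()) result += ((rows.getInt("id"), rows.getString("text")))
      result.result()
    } finally connection.close() // closing the connection also releases the statement
  }
}
```

Calling `DerbyClientDemo.readDummy("alice", "mypass")` should then return the rows inserted in Step 8.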