Archive for the ‘english’ Category

NoSQL and Web applications

Wednesday, March 24th, 2010

If I were asked to draw an “easy to understand” diagram of the next-generation architecture for web applications, one of my sketches would look like this:

[Image: SQL And NoSQL]

An obvious question about this picture is:

Why are two different Database-Systems necessary?

My artless answer is as follows:

In some crucial cases a NoSQL Database is considerably less expensive than a SQL Database, but a NoSQL Database cannot completely replace a SQL Database.

For example:

  • Try to store millions of Media Files or other large Documents on a distributed RDBMS. Each document has to be quickly accessible and updatable.
  • Try to create a very large dynamic key/value table as a replacement for an overgrown in-memory Hashmap.

Of course you can also achieve this with a relational database, but you will be faced with the following problems:

  • RDBMS BLOBs are slow. You can compensate for this problem with database clustering, but the typical workaround is to store a link or file path and use the file system as data storage. In this case you always have to check for inconsistencies.
  • You can create a key/value store in your SQL database, but each access incurs a heavy processing overhead (transaction handling, logging and versioning).

NoSQL-Databases are optimized and designed to:

  • Store and access large “binary” objects very fast.
  • Accomplish fast data replication and use table separation (sharding) between different storages.
  • Serve as very fast key-value stores.

(Easy schema modification is often cited as a NoSQL advantage, but I think this is also a trivial task with DDL. Just don’t forget to set the initial value of the new column and to enable the trigger after altering the table.)

In contrast, there are tasks for which you should use a SQL database:

  • Data Presentation: In most cases visual entities (for example table lines and columns) aren’t 1:1 representations of database objects. They are sorted, mapped, reduced (filtered) and often consist of joins between different tables. Map-Reduce and sorting can be done efficiently with NoSQL, but joins aren’t supported. Of course you can rewrite the JOIN operation in your application. For example, a simplified HashMap JOIN could look like this:
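    // Build a lookup table for table B, keyed by its join attribute ...
    val tableBMap = tableB.map(line => line.joinAttributB -> line).toMap
    // ... then pair each line of table A with its match in B (None if there is no match).
    tableA.map(line => (line, tableBMap.get(line.joinAttributA)))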

    The disadvantage is that you have to materialize the complete object / table line.

  • NoSQL = No Transaction. You cannot use NoSQL for reliable database transactions (reliable as in ACID). This is a serious restriction for massively parallel database updates. (Hence, you should avoid programming the popular Bank Account example with a NoSQL database.)
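The application-side HashMap JOIN mentioned under Data Presentation can be tried out with a small self-contained sketch. The row types Customer and Order, their fields, and the sample data are hypothetical and only serve as an illustration; the sketch builds a lookup table from one "table" and probes it once per line of the other.

```scala
// Hypothetical row types; names and fields are illustrative only.
final case class Customer(id: Int, name: String)
final case class Order(orderId: Int, customerId: Int, total: BigDecimal)

object HashJoinSketch {
  // Left outer hash join: every order is kept; the customer is None if no match exists.
  def join(orders: Seq[Order], customers: Seq[Customer]): Seq[(Order, Option[Customer])] = {
    // Build phase: index one table by its join attribute.
    // (If ids were not unique, toMap would keep only the last line per id.)
    val customersById: Map[Int, Customer] = customers.map(c => c.id -> c).toMap
    // Probe phase: one hash lookup per line of the other table.
    orders.map(order => (order, customersById.get(order.customerId)))
  }

  def main(args: Array[String]): Unit = {
    val customers = Seq(Customer(1, "Alice"), Customer(2, "Bob"))
    val orders = Seq(Order(100, 1, BigDecimal("9.99")), Order(101, 3, BigDecimal("5.00")))
    join(orders, customers).foreach { case (order, customer) =>
      val name = customer.map(_.name).getOrElse("<no customer>")
      println(s"order ${order.orderId} -> $name")
    }
  }
}
```

Note that both inputs are fully loaded into memory here, which is exactly the materialization cost described above.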

Why I prefer MongoDB as NoSQL Database System:

  • Very fast: Sequential write and random read operations are very fast on an “average” server. (You need a 64-bit OS if your database is larger than 2 GB.)
  • Scala (Java) support: Several drivers are available: scamongo, mongo-scala-driver or akka-persistence.
  • Easy to use: There is no setup, parameter or table type “magic”. In less than an hour you can set up a secure and robust database server. The client uses JS/JSON.
  • A nice community: The forum is very active. Questions are usually answered within a couple of hours.

Addendum to my recent Post: FSM stands for “Flying Spaghetti Monster”

In my recent post I used the abbreviation FSM for “Finite State Machine”.

I was informed that FSM is the common abbreviation for the Flying Spaghetti Monster.

I searched for this term in the Kungle News document-storage and found some evidence for this claim:

[Image: Small Screendump]

Here is the complete list of references:

Query:

Start Main

New Hist Defined with:

Primary: List(Flying Spaghetti Monster, Pastafarianism, Pastafarian)

Secondary in: List(FSM)

Secondary out: List()

Calculated Interval: { "publishingDate" : { "$gt" : "2010-01-10T23:00:00.000Z" , "$lte" : "2010-03-22T23:00:00.000Z"} , "originalLanguage" : "ENGLISH"}

Title | Published / Recorded | Publisher | Citation
Mississippi dips its toe into antireality | Tue Jan 19 15:00:00 CET 2010 | Discover Magazine | Finally they will spread the message of how we were all created by the Flying Spaghetti Monster
Senator Webb (D) shows fear: "Suspend All Votes On Health Care Bill Until Senator-Elect Brown Is Seated" | Thu Jan 21 00:00:28 CET 2010 | Crooks and Liars | it'll be approximately Dec 2012, slightly before she takes office, so I figure what better time for Jebus, Buddah, Allah, Ra, Flying Spaghetti Monster, the Mayans, The 'V' lizard people, the Vogons, Daleks, V'Ger, etc to arrive?
FSM protect us! | Tue Jan 26 19:40:17 CET 2010 | Discover Magazine | Some people say the Church of the Flying Spaghetti Monster was a joke made ...
Video – Who Knew He Could Be A Swedish Hero? | Mon Feb 01 16:00:11 CET 2010 | Dvorak Uncensored | I wonder if it will work with the Flying Spaghetti Monster?
Minority Contractors Receive Just 2 Percent of Highway Stimulus Cash | Thu Feb 04 20:31:08 CET 2010 | Infrastructurist | What if only 2% of all infrastructure construction companies are women or minority owned? Setting a goal of having 10% of all contract recipients be minority or women owned is about as useful as setting a goal of having 10% of all contract recipients go to Wild Hildabeest and Flying Spaghetti Monster owned businesses.
When Did Jesus Become a Republican (or, for that matter, a Democrat? | Mon Feb 15 00:44:01 CET 2010 | Care2 News | I'm a Pastafarian. To me it's the only pure religion.
Do we really need a religious bill of rights? | Mon Feb 15 15:27:14 CET 2010 | Discover Magazine | If the act passes, we need a Pastafarian as an agent provocateur.
Church of the Flying Spaghetti Monster FSM Store | Fri Feb 19 02:33:10 CET 2010 | Suite101 | Church of the Flying Spaghetti Monster FSM Store
Iraq still embracing the magic | Wed Feb 24 02:02:34 CET 2010 | Discover Magazine | No, just kidding, it’s “For Flying Spaghetti Monster’s Sake”
Miss Beverly Hills tries to one-up Carrie Prejean, says it’s divine law that gays be put to death. | Wed Feb 24 15:55:45 CET 2010 | Think Progress | The Flying Spaghetti Monster offers more in life than any pagan based worship.
Video: Republican legislator says disabled children are 'God's punishment' for abortion | Wed Feb 24 19:00:09 CET 2010 | Crooks and Liars | I think my Pastafarianism makes me less able to understand why so many people think a superior being gives a flying noodle what they do or say?
South Dakota legislators tell schools to teach ‘astrological’ explanation for global warming. | Thu Feb 25 19:49:40 CET 2010 | Think Progress | All hail the Flying Spaghetti Monster!
End of an Era: "Lasts" for Shuttle Program | Fri Feb 26 18:32:36 CET 2010 | Universe Today | Some spectacular pictures from the final SRB test. FSM-17, (that's flight support motor, not Flying Spaghetti Monster) burned for approximately 123 seconds — the same time each reusable solid rocket motor burns during an actual space shuttle launch.
Atheist Groups Visit The White House Causing A Right Wing Tizzy | Mon Mar 01 15:45:41 CET 2010 | Dvorak Uncensored | (Before the religious start jumping up and down “See, atheists ARE a religion”, the whole thing is a joke, like the Flying Spaghetti Monster)
Creationists And Climate Deniers Take On Teaching Climate Science In Schools | Thu Mar 04 17:20:14 CET 2010 | Huffingtonpost | I think we can all look forward to the time when these three theories are given equal time in our science classrooms across the country, and eventually the world; One third time for Intelligent Design, one third time for Flying Spaghetti Monsterism (Pastafarianism), and one third time for logical conjecture based on overwhelming observable evidence.
Massa Will Resign Monday | Fri Mar 05 20:43:18 CET 2010 | Talking Points Memo | There is no doubt that there is a Flying Spaghetti Monster. The question is just how it flies, and what kind of sauce it's covered in.
ARD TV drama sparks Scientology's ire | Mon Mar 08 11:34:00 CET 2010 | The Local Germany | What Would the Flying Spaghetti Monster Do?
Christian leaders urge Congress to ignore misinformation on abortion provisions and pass health reform. | Sat Mar 13 18:17:21 CET 2010 | Think Progress | Since a Pastafarian, I will say RAmen. ;)
Kreutz Comet VIDEO: WATCH Newly-Discovered Comet's Collision Course With The Sun | Sun Mar 14 15:36:13 CET 2010 | Huffingtonpost | It's the great noodly appendage of the Flying Spaghetti Monster.
To The 9th Circuit Court Of Appeals, God Is "Patriotic" And No Longer "Religious" | Sun Mar 14 16:00:51 CET 2010 | Crooks and Liars | I'd tell him to substitute Flying Spaghetti Monster where appropriate.
Boehner Claims Student Loan Reform Will ‘Eliminate Every Bank In The Country’ | Fri Mar 19 23:42:19 CET 2010 | Think Progress | The universe could have been created by a Flying Spaghetti Monster, or it could have always existed.

2; 0; 19

Topic Connections:

(fsm,1)(monster,1)(store,1)(flying,1)(church,1)(spaghetti,1)

Emotions:

(love,14)(hope,13) (+)/(-) (hate,7)(fear,7)

Public Tendency:

10; 5; 6

Country:

gb; us; no; cn; jp; in; se; au; ru; de; fr; ie; gr; nz; ca;

0; 20; 0; 0; 0; 0; 1; 0; 0; 0; 0; 0; 0; 0; 0;

Publisher Tendency:

(Discover Magazine,3)(Huffingtonpost,2)(The Local Germany,1)(Think Progress,0)(Talking Points Memo,-1)(Crooks and Liars,-3)

Calculation Done

Trend – Visualization

Monday, March 22nd, 2010

The new “Corpus Engine” has been in development since January and has recorded about 302,000 articles. All in all, 1,130,000 “news headlines” and summaries have been stored since Kungle.de went online.

New algorithms have now been developed to:

  • Identify the public opinion about political and economic topics.
  • Follow the image status of brands, corporations or companies.
  • Track public feelings and emotions about current events.

The Challenge

The current trend calculation, based on static dictionaries, isn’t able to identify new events like an ‘earthquake’ or a ‘political reform’. The “topic-tagging” is static and limited to 9 topics: “Science, Economy, Politic, Technology, Entertainment, Sport, Boulevard, Adult and Religion”.

It would be an exhausting task to code every new subtopic or event in an FSM (Finite State Machine).

Therefore the new engine identifies topics by itself. Not only the trend is calculated dynamically; the topic classification is “calculated” as well.

How is this done?

A simplified breakdown:

NLP (Natural Language Processing) is based on two strategies for text analysis: Tagging via Dictionaries and word / N-gram frequency analysis.

For Example:

This is an animation of a small section of the Kungle English dictionary (about 300,000 words), covering the period since January. The daily word count is represented by the column height (one frame = one hour). A column changes color from green to red if the word occurred in more than 10 percent of all articles. The overall word frequency decreases from left to right.
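A minimal sketch of the word-frequency side of this analysis, not the actual Kungle implementation: it counts words over a handful of articles and finds the words that occur in more than a given fraction of all articles (the 10 percent rule that colors a column red). The tokenizer, the object name and the function names are assumptions made for this illustration.

```scala
object WordFrequencySketch {
  // Very naive tokenizer (an assumption): lower-case, split on anything that is not a letter.
  def tokenize(text: String): Seq[String] =
    text.toLowerCase.split("[^\\p{L}]+").filter(_.nonEmpty).toSeq

  // Total number of occurrences of each word over all articles.
  def wordCounts(articles: Seq[String]): Map[String, Int] =
    articles.flatMap(tokenize).groupBy(identity).map { case (word, ws) => word -> ws.size }

  // Words that occur in more than `threshold` (a fraction) of all articles.
  def frequentWords(articles: Seq[String], threshold: Double = 0.10): Set[String] = {
    // Document frequency: count each word at most once per article.
    val documentFrequency = articles
      .flatMap(article => tokenize(article).distinct)
      .groupBy(identity)
      .map { case (word, ws) => word -> ws.size }
    documentFrequency.collect {
      case (word, n) if n.toDouble / articles.size > threshold => word
    }.toSet
  }

  def main(args: Array[String]): Unit = {
    val articles = Seq("The monster flies", "Spaghetti monster", "Nothing here")
    println(wordCounts(articles))
    println(frequentWords(articles, threshold = 0.5))
  }
}
```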

Bigrams:

This is an even smaller section of the weighted bigram matrix (about 100,000 x 100,000 words) over the same timeframe. Although this animation is compressed, you can identify some horizontal and vertical lines. These lines occur when a topic is heavily discussed.
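A bigram matrix of this kind can be sketched as a sparse map from word pairs to counts. This is only an illustration: the weighting used by the real engine is not described here, so the sketch uses raw counts, and the tokenizer and all names are assumptions. A word that suddenly appears next to many other words produces a large row or column total, which corresponds to a visible line in the matrix.

```scala
object BigramSketch {
  // Very naive tokenizer (an assumption): lower-case, split on anything that is not a letter.
  def tokenize(text: String): Seq[String] =
    text.toLowerCase.split("[^\\p{L}]+").filter(_.nonEmpty).toSeq

  // Sparse bigram "matrix": (row word, column word) -> number of occurrences.
  // Bigrams never span two articles.
  def bigramCounts(articles: Seq[String]): Map[(String, String), Int] =
    articles
      .flatMap { article =>
        tokenize(article).sliding(2).collect { case Seq(first, second) => (first, second) }.toSeq
      }
      .groupBy(identity)
      .map { case (pair, occurrences) => pair -> occurrences.size }

  // Sum of one matrix row: how often `word` is followed by any other word.
  def rowTotal(counts: Map[(String, String), Int], word: String): Int =
    counts.collect { case ((first, _), n) if first == word => n }.sum

  def main(args: Array[String]): Unit = {
    val counts = bigramCounts(Seq("flying spaghetti monster", "Flying spaghetti"))
    counts.foreach { case ((first, second), n) => println(s"$first $second: $n") }
  }
}
```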

Kungle.de Sitemap update

Thursday, August 13th, 2009

I have updated the kungle.de sitemap without announcement.
I apologize for the trouble caused. Please update your bookmarks.

Scala’s simple strategy to reduce debugging costs: “Some” and “None”

Sunday, August 9th, 2009

Null Pointer Exceptions (NPEs) are among the most common runtime errors in Java (more than 500,000 results on Google). Most Java libraries are interspersed with methods that return null if the computation could not be finished. Invoking a method on null raises a Null Pointer Exception.

The best practice in Java to avoid such errors is a test against null.

if(aObject == null) {...

or pass the null test to another function.

For example, instead of writing:

if(aResultString.equals("literal")) {
...

you write:

if("literal".equals(aResultString)) {
...

More than eleven years ago, around the beginning of the Java era, a data type already existed that avoids the NPE problem:

data Maybe a = Nothing | Just a deriving (Eq, Ord, Read, Show)

A Haskell one-liner ported to Java could have prevented thousands of errors.

Why?

You cannot forget to test against None (unlike the null check), because using the wrapped value without handling None raises a compile-time error. Based on Haskell’s Maybe type, Scala defines the Some and None types with:

final case class Some[+A](x: A) extends Option[A]
...

and

case object None extends Option[Nothing]
...

To handle the None case of an Option you write:

val result = a getOrElse b

If a is Some(x), then result is x. If a is None, then result is b. The value of result is therefore always defined!
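As a small, self-contained illustration of getOrElse and pattern matching on an Option: the lookup function capitalOf and its data are hypothetical and exist only for this sketch.

```scala
object OptionSketch {
  // Hypothetical data for the illustration.
  private val capitals = Map("Germany" -> "Berlin", "France" -> "Paris")

  // Instead of returning null for an unknown country, the lookup returns an Option.
  def capitalOf(country: String): Option[String] = capitals.get(country)

  // Pattern matching handles both cases explicitly.
  def describe(country: String): String =
    capitalOf(country) match {
      case Some(city) => s"The capital of $country is $city"
      case None       => s"No capital known for $country"
    }

  def main(args: Array[String]): Unit = {
    // capitalOf("Germany").length would not compile: Option[String] has no length method.
    val known = capitalOf("Germany") getOrElse "unknown"
    val missing = capitalOf("Atlantis") getOrElse "unknown"
    println(known)
    println(missing)
    println(describe("France"))
    // map transforms the value only if it is present.
    println(capitalOf("France").map(_.length))
  }
}
```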

Runtime Errors on my News Page Kungle.de so far

To keep it short:
None

Configure the Apache Derby DB Network/Client Mode in 5 Minutes

Sunday, May 24th, 2009

It is very easy to use Apache Derby with the embedded JDBC driver. The disadvantage of this solution is that only one application (one JVM) can open the database at a time. If you are developing an application with data persistence, it is often helpful to monitor the database from a different application (e.g. the Data Source Explorer from Eclipse). If you are not interested in reading the administration guide (http://db.apache.org/derby/docs/dev/adminguide/) you can follow these simple steps to configure Derby for Network/Client Mode:

Step 1: Download Apache Derby

Download the bin distribution from: http://db.apache.org/derby/derby_downloads.html

Step 2: Install

Unpack the archive in an appropriate application directory.

Step 3: Create a database directory

For example: mkdir db_storage

Step 4: Set environment variables:

Set DERBY_HOME and add its bin directory to your system path. Also extend your CLASSPATH with the Derby jars.

For example on your Mac:

export DERBY_HOME=/Users/.../Programme/db-derby-10.5.1.1-bin

export PATH=$PATH:$DERBY_HOME/bin

export CLASSPATH="$DERBY_HOME/lib/derbynet.jar:$DERBY_HOME/lib/derbytools.jar:$DERBY_HOME/lib/derbyclient.jar:$DERBY_HOME/lib/derby.jar:$CLASSPATH"

On Windows systems:

SET CLASSPATH="%DERBY_HOME%/lib/derbynet.jar;%DERBY_HOME%/lib/derbytools.jar;%DERBY_HOME%/lib/derbyclient.jar;%DERBY_HOME%/lib/derby.jar;%DERBY_HOME%;%CLASSPATH%"

(You probably want to put it in a script or batch.)

Step 5: Create the database

Open a terminal/shell/command prompt, change into the database directory (Step 3) and start ij.

Type:

connect 'jdbc:derby:simple;create=true';
exit;

Step 6: Database Configuration

In your database directory create the file derby.properties.

Add the following lines:

derby.connection.requireAuthentication=TRUE
derby.authentication.provider=BUILTIN
derby.user.alice=mypass

Step 7: Start Derby

Type:

startNetworkServer

on your Mac or

startNetworkServer.bat

on your Windows system.

Step 8: Create a user schema and add a table

Open a new terminal/shell/command prompt, change into the database directory  and start ij.

Type:

connect 'jdbc:derby://localhost:1527/simple' user 'alice' password 'mypass';
create schema alice;
create table dummy (id int primary key, text varchar(20));
insert into dummy values (1, 'eins');
insert into dummy values (10, 'zehn');
insert into dummy values (100, 'einhundert');
exit;

Step 9: Connect from your IDE:

You can now use the Derby JDBC network client driver to connect to your database.
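Outside of an IDE, the same connection can be opened from your own code through plain JDBC. The following is a minimal sketch, assuming the network server from Step 7 is running on localhost:1527, derbyclient.jar is on the classpath, and the user, password and table from the steps above exist. The object name and the jdbcUrl helper are made up for this example.

```scala
import java.sql.{Connection, DriverManager}

object DerbyClientSketch {
  // Builds the network client URL used in Step 8.
  def jdbcUrl(host: String, port: Int, database: String): String =
    s"jdbc:derby://$host:$port/$database"

  def main(args: Array[String]): Unit = {
    // Assumption: the Derby network server is running and derbyclient.jar is on the classpath.
    val connection: Connection =
      DriverManager.getConnection(jdbcUrl("localhost", 1527, "simple"), "alice", "mypass")
    try {
      val statement = connection.createStatement()
      val rows = statement.executeQuery("select id, text from alice.dummy order by id")
      while (rows.next()) {
        val id = rows.getInt(1)
        val text = rows.getString(2)
        println(s"$id -> $text")
      }
    } finally {
      // Closing the connection also releases its statements and result sets.
      connection.close()
    }
  }
}
```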

Wecker (Engl.: alarm clock) on Ubuntu

Monday, April 20th, 2009

You can use my small Java application “wecker” on Ubuntu.

[Image: Wecker with Ubuntu]

You can find it here: http://yousry.de/AppWecker.html.

Memory-Game on Ubuntu

Sunday, April 19th, 2009

You can now play my 3D memory-game on Ubuntu/Linux.

[Image: Memory game on Ubuntu]

Make sure you have installed Java 6 from Sun.

You can find it here: http://www.yousry.de/Memory-Game.html

Update: Memory Game

Wednesday, April 15th, 2009

A small bugfix release of the memory game is available. The reason was a newly discovered VM core dump under Vista (OpenGL library conflict).

Update: Verschlüsselung (eng.: Encryption)

Saturday, April 11th, 2009

First results show processing speeds comparable to commercial or proprietary products. A backport to Java 5 / Mac was successful. The application may be used on all current desktop operating systems.

Algorithm Windows Mac
SHA1AndDESede
MD5AndTripleDES
SHA1AndRC2_40
MD5AndDES

In subsequent versions, major functional enhancements are planned. The complex user interface will be simplified.

Tasks until alpha release:

  • The user interface must be completed.
  • Software testing.

Tasks until beta release:

  • Add new encryption algorithms.
  • Add simplified user interface.
  • Add server functionality.
  • Batch processing.
  • Create API documentation.