NoSQL Databases
NoSQL DATABASES
SUBMITTED BY : GAURAV ARORA, 2CML1 101410018
SUBMITTED TO : Ms. Harikiran Kaur
Introduction
Methodology
NoSQL Classification
Choosing NoSQL database
New Research : CAP THEOREM
Applications
References
Introduction
A NoSQL (originally referring to "non SQL" or "non relational") database provides a mechanism for storage and retrieval of data that is modeled by means other than the tabular relations used in relational databases. Such databases have existed since the late 1960s, but did not obtain the "NoSQL" moniker until a surge of popularity in the early twenty-first century, triggered by the needs of Web 2.0 companies such as Facebook, Google and Amazon.com.
Over the last few years we have seen the rise of a new type of database, known as NoSQL databases, that is challenging the dominance of relational databases. Relational databases have dominated the software industry for a long time, providing persistent data storage, concurrency control, transactions, mostly standard interfaces, and mechanisms for integrating application data and reporting. That dominance, however, is cracking.
Motivations for the NoSQL approach include simplicity of design, simpler "horizontal" scaling to clusters of machines (which is a problem for relational databases),[2] and finer control over availability. The data structures used by NoSQL databases (e.g. key-value, graph, or document) differ from those used by default in relational databases, making some operations faster in NoSQL and others faster in relational databases. The particular suitability of a given NoSQL database depends on the problem it must solve. Sometimes the data structures used by NoSQL databases are also viewed as "more flexible" than relational database tables.
NoSQL databases are increasingly used in big data and real-time web applications. NoSQL systems are also sometimes called "Not only SQL" to emphasize that they may support SQL-like query languages.
Methodology
Application developers have been frustrated with the impedance mismatch between the relational data structures and the in-memory data structures of the application. Using NoSQL databases allows developers to develop without having to convert in-memory structures to relational structures.
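The impedance mismatch described above can be illustrated with a minimal Python sketch. The `Order` and `LineItem` classes and the two-table layout are hypothetical names chosen for illustration, not taken from any particular application: the nested in-memory object has to be flattened into rows for a relational schema, whereas a document-style store can keep it as one aggregate.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class LineItem:
    product: str
    quantity: int


@dataclass
class Order:
    order_id: int
    customer: str
    items: list = field(default_factory=list)


def to_relational_rows(order):
    """Flatten the nested object into rows for two hypothetical tables."""
    order_row = (order.order_id, order.customer)
    item_rows = [(order.order_id, i.product, i.quantity) for i in order.items]
    return order_row, item_rows


def to_document(order):
    """Serialize the same object as a single aggregate, keeping its nesting."""
    return json.dumps(asdict(order))


order = Order(1, "alice", [LineItem("book", 2), LineItem("pen", 5)])
order_row, item_rows = to_relational_rows(order)
document = to_document(order)
```

Reading the order back from the relational form would need a join across both tables, while the document form is loaded in one step with `json.loads(document)`.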
There is also a movement away from using databases as integration points, in favour of encapsulating databases within applications and integrating through services. The rise of the web as a platform also brought a vital change in data storage: the need to support large volumes of data by running on clusters. Relational databases were not designed to run efficiently on clusters. The data storage needs of an ERP application are very different from the data storage needs of a Facebook or an Etsy, for example.
NoSQL Classification
NoSQL databases can be broken into four categories: Key Value Stores, Big Table, Document Databases and Graph Databases. Each category has its own strengths in dealing with data size and complexity.
Key Value Stores
In the key-value data model, each value corresponds to a key. Although the structure is simpler than that of a relational database, queries are faster, and the model supports mass storage and high concurrency. Data is queried and modified through the primary key [3]. Each key identifies a bucket of data. For example, in the shopping cart case shown in Figure 3, each shopping cart is stored in its own bucket and identified by a key, which could be the user id. The values can be serialized using either Java serialization or XML. Storing data this way is very fast, as the store just writes bits to disk. Some of the key-value stores available in the market are Berkeley DB, Tokyo Tyrant, Voldemort and Cassandra.
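The shopping cart example can be sketched in a few lines of Python. This is an illustrative in-memory model, not the API of any of the products named above; the `KeyValueStore` class is hypothetical, and Python's `pickle` module stands in for the Java serialization or XML mentioned in the text.

```python
import pickle


class KeyValueStore:
    """Toy key-value store: every operation goes through the key."""

    def __init__(self):
        self._buckets = {}  # key -> serialized value (opaque bytes)

    def put(self, key, value):
        # The store does not look inside the value; it only keeps the bytes.
        self._buckets[key] = pickle.dumps(value)

    def get(self, key, default=None):
        raw = self._buckets.get(key)
        return default if raw is None else pickle.loads(raw)

    def delete(self, key):
        self._buckets.pop(key, None)


store = KeyValueStore()
store.put("user:42", {"items": ["book", "pen"]})  # cart keyed by user id

cart = store.get("user:42")
cart["items"].append("lamp")
store.put("user:42", cart)  # an update rewrites the whole bucket
```

Because the value is opaque to the store, there is no way to ask for "all carts containing a lamp" without reading every bucket, which is the main limitation of this model.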
Big Table
The search engine company Zvents developed Hypertable, an open source distributed data storage system, by drawing on BigTable. A BigTable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, a column key, and a timestamp, and each value is an uninterpreted array of bytes. BigTable stores structured data, and applications can store any type of data, from text to serialized objects. It does not impose any size constraint on a value, and a table may have an unlimited number of columns. Data is indexed using row and column names that can be arbitrary strings.
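The map described above, indexed by row key, column key and timestamp with byte-array values, can be modeled as a small Python class. This is a single-process sketch under simplifying assumptions: the `SparseTable` name and its methods are hypothetical, nothing is distributed or persisted, and sorting happens only when a row is scanned.

```python
class SparseTable:
    """Toy model of a (row key, column key, timestamp) -> bytes map."""

    def __init__(self):
        # (row_key, column_key) -> {timestamp: value_bytes}
        # Only cells that were written exist, which is what makes the map sparse.
        self._cells = {}

    def put(self, row, column, timestamp, value):
        if not isinstance(value, bytes):
            raise TypeError("values are stored as uninterpreted bytes")
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        versions = self._cells.get((row, column))
        if not versions:
            return None
        if timestamp is None:
            return versions[max(versions)]  # latest version
        return versions.get(timestamp)

    def scan_row(self, row):
        """Return (column, latest value) pairs for one row, sorted by column key."""
        found = [
            (col, versions[max(versions)])
            for (r, col), versions in self._cells.items()
            if r == row
        ]
        return sorted(found)


table = SparseTable()
table.put("com.example", "contents:html", 1, b"<html>v1</html>")
table.put("com.example", "contents:html", 2, b"<html>v2</html>")
table.put("com.example", "anchor:home", 1, b"Example")
```

Each row may carry a different set of columns, and older versions of a cell stay readable by passing an explicit timestamp to `get`.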
Graph Databases
Graph databases allow you to store entities and the relationships between these entities. Entities are also known as nodes, which have properties. Think of a node as an instance of an object in the application. Relationships are known as edges, which can also have properties. Edges have directional significance; nodes are organized by relationships, which allow you to find interesting patterns between the nodes. The organization of the graph lets the data be stored once and then interpreted in different ways based on relationships.
Usually, when we store a graph-like structure in an RDBMS, it's for a single type of relationship ("who is my manager" is a common example). Adding another relationship to the mix usually means a lot of schema changes and data movement, which is not the case when we are using graph databases. Similarly, in relational databases we model the graph beforehand based on the traversal we want; if the traversal changes, the data will have to change. In graph databases, traversing the joins or relationships is very fast. The relationship between nodes is not calculated at query time but is actually persisted as a relationship. Traversing persisted relationships is faster than calculating them for every query.
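A minimal Python sketch of these ideas follows. The `Graph` class, the relationship names and the sample people are all hypothetical; real graph databases add persistence, indexing and a query language on top. The point is that edges are stored explicitly, so a traversal only follows stored links, and a second relationship type can be added without changing any schema.

```python
from collections import deque


class Graph:
    """Toy property graph with directed, typed, persisted edges."""

    def __init__(self):
        self.nodes = {}   # node id -> properties
        self._out = {}    # node id -> [(relationship, target id, properties)]

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, relationship, target, **properties):
        self._out.setdefault(source, []).append((relationship, target, properties))

    def neighbors(self, node_id, relationship):
        return [t for rel, t, _ in self._out.get(node_id, []) if rel == relationship]

    def reachable(self, start, relationship):
        """Breadth-first traversal along one relationship type."""
        seen, order, queue = {start}, [], deque([start])
        while queue:
            current = queue.popleft()
            for nxt in self.neighbors(current, relationship):
                if nxt not in seen:
                    seen.add(nxt)
                    order.append(nxt)
                    queue.append(nxt)
        return order


g = Graph()
g.add_node("anna", role="engineer")
g.add_node("bob", role="manager")
g.add_node("carol", role="director")
g.add_edge("anna", "REPORTS_TO", "bob", since=2015)
g.add_edge("bob", "REPORTS_TO", "carol")
g.add_edge("anna", "FRIEND_OF", "carol")  # a second relationship type, no schema change
```

Here `g.reachable("anna", "REPORTS_TO")` walks the management chain, while the same nodes can be interpreted differently by traversing `FRIEND_OF` instead.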
Document Databases
Documents are the main concept in document databases. The database stores and retrieves documents, which can be XML, JSON, BSON, and so on. These documents are self-describing, hierarchical tree data structures which can consist of maps, collections, and scalar values. The documents stored are similar to each other but do not have to be exactly the same. Document databases store documents in the value part of the key-value store; think of document databases as key-value stores where the value is examinable. Document databases such as MongoDB provide a rich query language and constructs such as databases and indexes, allowing for an easier transition from relational databases.
Some of the popular document databases we have seen are MongoDB, CouchDB, Terrastore, OrientDB, RavenDB, and of course the well-known and often reviled Lotus Notes, which uses document storage.
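The idea of a key-value store whose values are examinable can be sketched as follows. The `DocumentCollection` class and its `find` method are hypothetical and far simpler than the query language of MongoDB or any other product listed above; the sketch only matches on top-level fields.

```python
class DocumentCollection:
    """Toy document store: documents are dicts that can be queried by field."""

    def __init__(self):
        self._docs = {}  # document id -> document (nested dict)

    def insert(self, doc_id, document):
        self._docs[doc_id] = document

    def get(self, doc_id):
        return self._docs.get(doc_id)

    def find(self, **criteria):
        """Return ids of documents whose top-level fields match all criteria."""
        return sorted(
            doc_id
            for doc_id, doc in self._docs.items()
            if all(doc.get(name) == value for name, value in criteria.items())
        )


posts = DocumentCollection()
# Documents are similar to each other but need not share the same fields.
posts.insert("p1", {"author": "anna", "title": "Hello", "tags": ["intro"]})
posts.insert("p2", {"author": "bob", "title": "CAP",
                    "comments": [{"by": "anna", "text": "nice"}]})
posts.insert("p3", {"author": "anna", "title": "Graphs"})

by_anna = posts.find(author="anna")
```

Unlike a plain key-value store, the query looks inside the stored values, and documents that lack a queried field are simply not matched.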
Choosing NoSQL database
Given so much choice, how do we choose a NoSQL database? As described, much depends on the system requirements, but here are some general guidelines:
Key-value databases are generally useful for storing session information, user profiles, preferences and shopping cart data. We would avoid using key-value databases when we need to query by data, have relationships between the data being stored, or need to operate on multiple keys at the same time.
Document databases are generally useful for content management systems, blogging platforms, web analytics, real-time analytics and e-commerce applications. We would avoid using document databases for systems that need complex transactions spanning multiple operations or queries against varying aggregate structures.
Column family databases are generally useful for content management systems, blogging platforms, maintaining counters, expiring usage, and heavy write volume such as log aggregation. We would avoid using column family databases for systems that are in early development or have changing query patterns.
Graph databases are very well suited to problem spaces where we have connected data, such as social networks, spatial data, routing information for goods and money, and recommendation engines.
New Research : CAP THEOREM
Given the different behaviours of database systems when they have to deal with partitions, the following is a popular classification of systems through the lens of CAP.
CP systems: database systems that adhere to ACID properties focus on CAP consistency first and then availability. They forfeit availability for consistency in case of partitions.
AP systems: NoSQL systems designed to support applications that need to be highly available forfeit consistency in case of partitions, thus falling into the AP category. The case of PNUTS, the NoSQL system from Yahoo, seems not to fit this definition. PNUTS relaxes consistency by only guaranteeing "timeline consistency", where replicas may not be consistent with each other but updates are guaranteed to be applied in the same order at all replicas. It also gives up availability: if the master replica for a particular data item is unreachable, that item becomes unavailable for updates.
CA systems: systems that are not tolerant to network partitions; traditional RDBMS fall into this category. But what if a partition happens? They lose availability, thus falling into the same group as CP systems.
Tuneable consistency: allows the user to decide the level of consistency they want. The consistency level is a setting that clients must specify on every operation (insert, update, read); it determines how many replicas in the cluster must acknowledge a write operation or respond to a read operation for the operation to be considered successful. Because many NoSQL systems can tune the consistency level in case of partitions, some of them fall into both the AP and CP categories.
The table gives a high-level summary of the features these systems provide.
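The tuneable consistency setting described above can be sketched in Python. Everything here is an illustrative assumption: the `Replica` class, the `write` function and the level names `ONE`, `QUORUM` and `ALL` are simplified stand-ins, and the sketch models a partition simply by marking replicas unreachable.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.reachable = True  # set to False to simulate a network partition
        self.data = {}


def required_acks(level, replica_count):
    """Number of replicas that must acknowledge for the given level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replica_count // 2 + 1
    if level == "ALL":
        return replica_count
    raise ValueError(f"unknown consistency level: {level}")


def write(replicas, key, value, level):
    """Send the write to every reachable replica; report success only if
    enough of them acknowledged. Writes already applied are not rolled back
    in this sketch."""
    acks = 0
    for replica in replicas:
        if replica.reachable:
            replica.data[key] = value
            acks += 1
    return acks >= required_acks(level, len(replicas))


replicas = [Replica("a"), Replica("b"), Replica("c")]
ok_all_up = write(replicas, "k", "v1", "QUORUM")

# Simulate a partition that cuts off two of the three replicas.
replicas[1].reachable = False
replicas[2].reachable = False
ok_one = write(replicas, "k", "v2", "ONE")
ok_quorum = write(replicas, "k", "v3", "QUORUM")
```

During the simulated partition the same cluster accepts the `ONE` write (favouring availability, as an AP system would) and rejects the `QUORUM` write (favouring consistency, as a CP system would), which is why a tuneable system can appear in both categories.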
Applications
Social Gaming
Social games are data-intensive applications that can explode from zero to millions of players literally overnight. That kind of rapid growth, both in terms of data volume and number of users, necessitates the right class of database to store all that information and scale to a growing user base. NoSQL provides scalability, consistently high performance, always-on 24x365 operations and a flexible data model. Some of the most popular social and mobile games come from the likes of Zynga, Electronic Arts, Tencent and Shuffle Master, which are all powered by NoSQL.
Ad Targeting
Selecting an ad to display or an offer to present on a Web page is a choice with direct revenue impact. To decide where to place such ads and what groups to target, ad platforms collect behavioural, demographic and psychographic characteristics of users, and they have at most about 40 milliseconds to do so. A NoSQL database enables ad platforms to track user attributes and also access ads to place extremely quickly, increasing the probability of a click. Examples of ad targeting platforms utilizing NoSQL include those from AOL, Mediamind and PayPal.
Session Store
Managing session information using relational technology has been a pain point for many Web application developers, especially as applications have grown in scale. In those cases, a global session store (i.e., one that manages session information for each user who visits the Website) is the right approach, and NoSQL has emerged as one of the best options for storing Web app session information. This is due in part to the key value storing properties of NoSQL databases: the unstructured nature of session data is easier to store in a schema-less document than in a structured (and more rigid) RDBMS record. In addition, low-latency access to session data is critical for ensuring a great user experience.
User Profile Store
All Web applications require user profiles and the ability to log in. A global user profile store is another example of where the key value characteristics of NoSQL come into play. A NoSQL database can store the user IDs, user preferences, multiple ID mappings and additional user information so that the app can quickly look up a user and authenticate access. Given the importance of this functionality to any Web app, the "always on" and scale-out characteristics of NoSQL are essential. TuneWiki recently drafted a blog post about how it uses NoSQL as a user profile store.
Mobile Applications
App developers' ability to update and enhance mobile apps, quickly and without service disruption, is critical to user adoption and loyalty. Because NoSQL databases can store user information and application content in a schema-less format, developers can quickly modify apps without major database infrastructure changes. That means users experience no interruption to application uptime. Some popular companies that take advantage of NoSQL for their mobile apps are Kobo and Playtika, both of which serve millions of users across the globe.
Globally Distributed Data Repository
Organizations are generating enormous volumes of data spread across different systems. Using NoSQL as a data repository allows users to not only bring this information together but to better understand and use the information. With their real-time access, scalability and flexible data model that accommodates a wide variety of data types, NoSQL document databases can be a great fit to build such platforms.
E-Commerce
E-commerce companies live and die by seasonal swings. Come Christmastime, users are scrambling to purchase last-minute gifts online or through mobile purchasing apps, creating a massive spike in usage. The ability to handle these spikes, without overinvesting in infrastructure, is critical to ensuring a pleasing shopper experience and minimizing abandoned purchase transactions (and lost revenue). NoSQL is a good fit for this use pattern because of its dynamic scalability (the ability to scale up to accommodate increased user activity and to scale down as user activity subsides). Companies such as The Hut Group depend on NoSQL to get them through the holiday rush.
REFERENCES
Christof Strauch, NoSQL Databases, Hochschule der Medien, Stuttgart (Stuttgart Media University)
Clarence J M Tauro, Aravindh S, Shreeharsha A.B, Comparative Study of the New Generation, Agile, Scalable, High Performance NOSQL Databases
Rilinda Lamllari, Extending a Methodology for Migration of the Database Layer to the Cloud Considering Relational Database Schema Migration to NoSQL, Master Thesis No. 3460
www.infoworld.com
www.thoughtworks.com
www.wikipedia.org
IMAGES SOURCE : Internet