DB2 RELATIONAL DATABASE MANAGEMENT SYSTEM
COURSE OBJECTIVES
After completing this course you should be able to:
• List and describe the major functions, components and data management techniques of DB2.
• Describe DB2's SQL and its efficient use with 3GL languages like COBOL.
• Use DB2 associated facilities like SPUFI and DCLGEN.
• Use DB2 utilities like LOAD and RUNSTATS.
COURSE PLAN
• Introduction to Database Management System.
• DB2 Overview.
• DB2 Data Objects.
• SQL - Data Definition Language.
• SQL - Data Manipulation Language.
• SQL - Data Control Language.
• DB2 - Program Preparation.
• DB2 - Application Programming.
• DB2 Features and Utilities.
SESSION 1
INTRODUCTION TO DBMS
SESSION CONTENTS:
• Data and Information.
• What is a Database?
• Database Management Systems.
• Hierarchical DBMS.
• Network DBMS.
• Relational DBMS.
INFORMATION:
Information is refined data: data that has been put into a meaningful and useful context and communicated to a recipient who uses it to make decisions.
DATABASE:
A database is a collection of interrelated data stored together with controlled redundancy to serve one or more applications in an optimal fashion. The data is stored so that it is independent of the programs or people using it.
NEED FOR A DATABASE:
• Easy retrieval and update of data.
• Avoid inconsistency in data.
• Avoid multiple copies of the same data.
DBMS CHARACTERISTICS
• Centralised controls.
• Inconsistency elimination.
• Data can be shared.
• Standards.
• Controlled redundancy.
• Authorised access.
• Data integrity.
• Data independence.
• Performance and efficiency.
DBMS MODEL EVOLUTION
• File Management System
• Hierarchical Database Management System
• Network Database Management System
• Relational Database Management System
FILE MANAGEMENT SYSTEM
This was the first method used to store data on computers. Data was stored on and retrieved from the disk sequentially.
LIMITATIONS:
• Relationships between data items unknown.
• Code is dependent on data.
• Slow search process.
• Sorting records is difficult.
• Error prone.
• Interpretation of fields is left to the accessing programs.
• Data inconsistency.
• Data redundancy.
• Changes to data structure are cumbersome.
HIERARCHICAL DBMS
REPRESENTATION:
• Tree structure originating from a root.
• Record types at different levels.
• Parent/Child relationship.
• Successor/Predecessor relationship.
• Node/Leaf.
FEATURES
• Unique parent for each child.
• 1:M relationship between parent and child.
• Every node except the root is accessed through the root and its parent nodes.
HIERARCHICAL MODEL EXAMPLE
[Figure: tree with Department at the root. Nodes: Department manager, Project, Project manager, Non-project employee, Project employee.]
DISADVANTAGES OF HIERARCHICAL DBMS
• Many-to-many relationships not possible.
• Cross relationships not possible.
• Structural changes (adding or deleting a level) are cumbersome.
• Limited flexibility in accessing lower nodes.
NETWORK DBMS
The Network DBMS was proposed by the CODASYL (Conference on Data Systems Languages) Database Task Group in 1971.
REPRESENTATION
• Using Sets and Links.
FEATURES
• Many-to-many relationships possible.
• Same data at multiple levels.
NETWORK MODEL EXAMPLE
[Figure: network model. Record types: Dept-mgr, Department, Project, Proj-mgr, Employee, Assignment. Sets: Managed by, Has employee, Has project, Manages, Project ass., Employee ass.]
NETWORK MODEL LIMITATIONS
• All interrelationships are difficult to map.
• Code traces out different paths.
• Complexity with a high number of operators.
• Reorganisation is complex and has a wide impact on the system.
• Requires expertise on the part of the user.
RELATIONAL DBMS
• Conceptualised by Dr. E.F. Codd at IBM in 1969.
• Built on a sound mathematical foundation.
REPRESENTATION:
• Relation - Table.
• Tuple - Record/Row.
• Attribute - Field/Column.
• Domain - Set of valid values of an attribute.
• Degree - Number of columns in a table.
• Cardinality - Number of rows in a table.
RELATIONAL MODEL EXAMPLE
RELATION (the table), TUPLES (the rows), ATTRIBUTES (the columns)

EMP#   NAME     AGE   DEPT#
2000   AMIT     23    101
3000   SANJAY   23    102
4000   ARVIND   23    101
5000   SATISH   24    105
KEYS
PRIMARY KEY
A column or set of columns that can uniquely identify every row in a table is termed a CANDIDATE key. Every candidate key satisfies two properties:
• Uniqueness - At no time do two rows have the same value for the column or set of columns.
• Minimality - None of the columns can be removed from the key without violating the uniqueness property.
For a given table, one candidate key is designated as the PRIMARY key and all the others are designated as ALTERNATE keys.
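The two candidate-key properties can be checked mechanically. The following is a minimal sketch in Python (not DB2 code), using the four sample rows of the relational model example; the helper names `is_unique` and `candidate_keys` are hypothetical and exist only for this illustration.

```python
from itertools import combinations

# Sample relation (rows from the relational model example).
COLUMNS = ("EMP#", "NAME", "AGE", "DEPT#")
ROWS = [
    (2000, "AMIT", 23, 101),
    (3000, "SANJAY", 23, 102),
    (4000, "ARVIND", 23, 101),
    (5000, "SATISH", 24, 105),
]

def is_unique(cols):
    """Uniqueness: no two rows share the same values for these columns."""
    idx = [COLUMNS.index(c) for c in cols]
    seen = {tuple(row[i] for i in idx) for row in ROWS}
    return len(seen) == len(ROWS)

def candidate_keys():
    """Return the column sets that are unique and minimal for the sample rows."""
    keys = []
    for size in range(1, len(COLUMNS) + 1):
        for cols in combinations(COLUMNS, size):
            # Minimality: skip any superset of a key already found.
            if any(set(k) <= set(cols) for k in keys):
                continue
            if is_unique(cols):
                keys.append(cols)
    return keys

print(candidate_keys())
```

On these four rows both EMP# and NAME qualify. A real key must hold for every permissible state of the table, not just one sample, which is why EMP# rather than NAME would be designated the primary key.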
KEYS
FOREIGN KEY
It is possible for one table to contain a column, or a set of columns, whose values are drawn from the same domains as the columns that form the primary key of some other table. This column or set of columns is called the FOREIGN KEY.
EXAMPLE:

EMPLOYEE
EMP#   NAME     DEPT#
2000   AMIT     101
3000   SANJAY   102
4000   ARVIND   101
5000   SATISH   105

DEPARTMENT
DEPT#   NAME
101     MARKETING
102     PERSONNEL
103     ADMIN
104     TRAVEL
105     FINANCE

The department number column of the EMPLOYEE table draws its values from the department number column of the DEPARTMENT table. DEPT# is the primary key of the DEPARTMENT table and a foreign key of the EMPLOYEE table.
RELATIONAL MODEL FEATURES
• Reorganisation of one table does not affect others (dynamic connections).
• No pre-defined connections.
• Non-procedural data manipulation.
• Abandons the parent/child relationship.
• Data arranged in logical mathematical datasets.
• Based on mathematical concepts of relational sets.
• Each row identified by a unique set of attributes.
• Same column name used to relate different tables.
RELATIONAL MODEL ADVANTAGES
• Flexible, simple and easy to use.
• 1:1, 1:M and M:N relationships easily represented.
• Ternary and higher-order relationships are easy to represent.
• Structural changes simple to make.
• Data integrity maintained.
• Code does not trace the path of data.
• Flexible querying.
INTEGRITY CONSTRAINTS
• ENTITY INTEGRITY
The entity integrity rule states that no column that is part of a primary key can have a null value; otherwise it would be impossible to identify a row uniquely. A primary key with a NULL value is a contradiction in terms: in effect, it would be saying that there is some entity that has no 'identity'.
• REFERENTIAL INTEGRITY
The referential integrity rule states that every foreign key must either match a primary key value in its associated table or be wholly NULL.
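The referential integrity rule can be expressed as a simple check. The sketch below is an illustration in Python, not something DB2 exposes; it uses the DEPT# values of the EMPLOYEE/DEPARTMENT example, and the function name `violations` and the extra rows 6000 and 7000 are made up for the demonstration.

```python
# DEPT# values present in the DEPARTMENT table (its primary key).
DEPARTMENT_KEYS = {101, 102, 103, 104, 105}

# (EMP#, DEPT#) pairs; None stands for a NULL foreign key.
EMPLOYEE_ROWS = [(2000, 101), (3000, 102), (4000, 101), (5000, 105)]

def violations(rows, parent_keys):
    """Return the rows whose foreign key is neither NULL nor a parent key."""
    return [r for r in rows if r[1] is not None and r[1] not in parent_keys]

# The sample data satisfies the rule.
print(violations(EMPLOYEE_ROWS, DEPARTMENT_KEYS))

# A NULL foreign key is allowed; an unmatched value (999) is not.
extra = EMPLOYEE_ROWS + [(6000, 999), (7000, None)]
print(violations(extra, DEPARTMENT_KEYS))
```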
SESSION 2
DB2 OVERVIEW AND DATA OBJECTS
26
DB2 OVERVIEW
SESSION CONTENTS:
• DB2 History.
• DB2 and MVS Relationship.
• DB2 Objects
  > Storage Groups.
  > Tablespaces.
  > Tables.
  > Indexes.
  > Bufferpools.
• DB2 Data Types.
DB2 OVERVIEW
This course is all about DB2, IBM’s flagship relational database management system. DB2 is available on multiple platforms but we will concentrate on DB2 for MVS/ESA in this course.
DB2 – SOME HISTORY
♦ The foundations of relational database technology were laid by Dr. E.F. Codd, whose paper 'A Relational Model of Data for Large Shared Data Banks' set out the basic principles of the RDBMS.
♦ IBM built a research prototype called System R, which resulted in two commercial releases:
  > SQL/DS for VM in 1982 and
  > DB2 for MVS in 1983.
♦ DB2 is available on other platforms too:
  > DB2/2 for OS/2
  > DB2/6000 for AIX
  > DB2 for Windows NT.
DB2 ENVIRONMENT
[Figure: DB2 environment under MVS. Labels: IMS online, IMS batch, TSO online, TSO batch, CICS online, CICS batch; DB2; DB2 database, IMS databases and VSAM/DAM files on DASD.]
DB2 ARCHITECTURE
[Figure: DB2 architecture. SQL statements arrive from IMS/DB/DC, CICS, TSO and the DB2 utilities. Components shown: Locking Services (IRLM), System Services and Data Base Services (Relational Data System, Data Manager, Buffer Manager, other components), together with the DB2 databases and the DB2 log.]
DB2 has three major components:
• IRLM - IMS Resource Lock Manager
IRLM provides the concurrency control mechanism, called locking, which is required to isolate different users from each other and to maintain integrity.
• System Services
System services control the overall DB2 execution environment. This includes managing the log datasets, gathering statistics for performance monitoring, and handling system startup and shutdown.
• Data Base Services
Data base services support the functions of the SQL language, i.e. definition, access control, retrieval and update of user and system data. This component has several subcomponents, among them the relational data system (RDS), the data manager and the buffer manager.
DB2 DATA OBJECTS
♦ DATABASE
♦ STORAGE GROUP
♦ TABLESPACES
♦ TABLES
♦ INDEXES
♦ BUFFERPOOLS
♦ VIEWS
[Figure: DB2 data objects. Tablespaces containing tables, indexes, and the storage groups that hold them.]
STORAGE GROUPS
A Storage Group is a named collection of direct access volumes, all of the same device type. A volume can appear in more than one storage group, and a storage group can contain more than one volume.
Each tablespace and index is associated with a Storage Group.
DATABASE
• A database is a logical grouping of related tablespaces, tables and indexes for administrative purposes.
• There is no restriction on accessing data from more than one table in more than one database.
• The only restriction is that an index must be placed in the same database as the tablespace containing its table.
• There is no one-to-one mapping between databases and storage groups.
DB2 TABLESPACES
♦ A tablespace can be considered a logical address space on secondary storage that is used to hold one or more stored tables.
♦ One tablespace can hold up to approximately 64 billion bytes, and there is effectively no limit to the number of tablespaces in a database.
♦ DB2 provides three types of tablespaces:
  > Simple
  > Segmented
  > Partitioned
PAGES
[Figure: a PAGE, consisting of a header, data records and a footer, within a SPACE.]
Space: A 'space' is a collection of one or more VSAM linear datasets that are logically concatenated to form a linear addressing range.
Pages: The datasets in a space contain pages. The pages of a tablespace can be either 4KB or 32KB. All pages use a control interval (CI) of 4KB; therefore, when a 32KB page is needed, 8 CIs are assigned.
A page is the unit of transfer between secondary and primary storage. Even to access one byte of data in a page, the complete page is brought into main memory. A page can hold up to 127 data records. With the compression techniques introduced in DB2 V3, up to 255 data records can be placed in a page. Each record is held completely within a page; that is, records do not span pages.
SIMPLE TABLESPACES
[Figure: a simple tablespace whose two pages (PAGE 1 and PAGE 2) hold rows 1 to 4 of Table1 and rows 1 to 4 of Table2.]
SIMPLE TABLESPACE
A simple tablespace can contain more than one stored table. Rows from different tables can be placed in the same page. If a table in this tablespace is dropped, its rows are not deleted; the space they occupy does not become free until the tablespace is reorganised.
• Advantage: faster access to data from related tables.
• Disadvantage: if a query requires data from only one table, DB2 still has to interrogate all pages, resulting in additional I/O activity.
SEGMENTED TABLESPACE
[Figure: a segmented tablespace divided into SEGMENT1, SEGMENT2 and SEGMENT3, holding TABLE1, TABLE2 and TABLE3.]
A segmented tablespace is intended to store more than one table. The space within the tablespace is divided into segments, where a segment consists of a logically contiguous set of 'n' pages (n is a multiple of 4 between 4 and 64) and n is the same for all segments in the tablespace. A segmented tablespace can have between 1 and 32 VSAM linear datasets. The maximum size of a linear dataset in a segmented tablespace is 2 GB, so the maximum size of a segmented tablespace is 64 GB.
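The limits quoted above follow from simple arithmetic. The sketch below restates them in Python as a worked check; the constant and function names are hypothetical, and the figures are those given in the text (32 datasets of 2 GB, segment sizes from 4 to 64 pages in steps of 4).

```python
GB = 1024 ** 3

MAX_DATASETS = 32          # VSAM linear datasets per segmented tablespace
MAX_DATASET_SIZE = 2 * GB  # maximum size of one linear dataset

# Permitted segment sizes: multiples of 4 pages, from 4 to 64.
VALID_SEGSIZES = list(range(4, 65, 4))

def max_tablespace_bytes():
    """Maximum size of a segmented tablespace in bytes."""
    return MAX_DATASETS * MAX_DATASET_SIZE

def segment_bytes(segsize_pages, page_kb=4):
    """Size of one segment for a given segment size (pages) and page size (KB)."""
    if segsize_pages not in VALID_SEGSIZES:
        raise ValueError("segment size must be a multiple of 4 between 4 and 64")
    return segsize_pages * page_kb * 1024

print(max_tablespace_bytes() // GB)  # 64
print(len(VALID_SEGSIZES))           # 16
print(segment_bytes(32))             # 131072
```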
• Advantages: To search all rows of a table, it is not necessary to scan the whole tablespace, only the segments that contain the table. If a table in a segmented tablespace is dropped, its segments become immediately reusable.
PARTITIONED TABLESPACE
[Figure: a partitioned tablespace divided into Part 1, Part 2, Part 3 and Part 4.]
Partitioned tablespaces are intended for tables so large that they are operationally difficult to deal with as a single unit. The table is partitioned according to value ranges of the partitioning column.
• Advantages:
  - improved data availability
  - each part can be placed on a different DASD volume, thereby spreading the tablespace I/O load.
INDEX

Index                  Table
Key    Position        Position   Key
101    1               1          101
102    3               2          104
103    4               3          102
104    2               4          103

An index is an ordered set of pointers to the data rows of a table. The contents of the index are sorted on one or more specified columns. Indexes are maintained by DB2 once they are created. Any number of indexes can be defined on a particular table.
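The idea of "an ordered set of pointers" can be sketched in a few lines. This is a toy model in Python, not DB2's actual index structure: the index is a list of (key, position) pairs kept sorted on the key, and a lookup binary-searches it instead of scanning the table. The names `TABLE`, `INDEX` and `lookup` are hypothetical.

```python
from bisect import bisect_left

# Table rows in stored order; position 1 is the first row.
TABLE = [101, 104, 102, 103]

# The index: (key, position) pairs kept sorted on the key.
INDEX = sorted((key, pos) for pos, key in enumerate(TABLE, start=1))
KEYS = [key for key, _ in INDEX]

def lookup(key):
    """Binary-search the index and return the row position, or None."""
    i = bisect_left(KEYS, key)
    if i < len(KEYS) and KEYS[i] == key:
        return INDEX[i][1]
    return None

print(INDEX)
print(lookup(104))
```

The table itself stays in its stored order (101, 104, 102, 103); only the index is ordered, which is why every insert or update must also maintain the index.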
INDEXES
• Disadvantages of a large number of indexes:
  - DB2 has to update both the table and its indexes, which leads to slower processing of requests.
  - More storage space is required.
INDEXES – Terminology
♦ Indexing Keys - the columns of the table on which the index is defined.
♦ Unique Index - ensures that the values of the indexed column(s) are not duplicated.
♦ Primary Index - is defined on the primary key of the table and is always unique.
♦ Partitioning Index - The column(s) on which the partitioning is done is called the partitioning key. The partitioning index must be defined on the partitioning key, specifying the partitioning values.
♦ Clustering Index - determines the order in which records of the base table are stored. Each table can have only one clustering index. Adding a clustering index after loading the data does not reorganize the data.
BUFFERPOOLS
The bufferpools in DB2 consist of 4KB slots in memory. After being read from DASD, data and index pages stay in the slots until the buffer manager decides to use the slots for other pages. The idea is that the data gets a chance to be reused, thus minimizing I/O. DB2 maintains logical chains of pages to be written (because they have been updated) and waits as long as is reasonable before writing them. Page writes are performed asynchronously and therefore do not affect response times.
There are 50 4KB bufferpools: BP0 through BP49.
There are 10 32KB bufferpools: BP32K0 through BP32K9.
VIEWS
A VIEW is another way to represent data, a different way to look at it. Views are derived from base tables or from other views. Unlike base tables, which represent physically stored data, views are virtual tables that have no associated physical storage.
Advantages:
• A view may be used as part of a security mechanism that allows a user to access only a portion of a table.
• Complicated queries can be stored as a view.
• Views can also minimize the program modification that may be required when a base table changes.
VIEWS
Example:

Base table
Cust_no   Cust_name   Cust_dob   Cust_branch   Cust_rating
111111    MACK        19501110   12            4
222222    JACK        19770921   156           6
333333    PACK        19800202   8             3

View
Cust_no   Cust_name   Cust_branch
111111    MACK        12
222222    JACK        156
333333    PACK        8
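The base table and view of this example can be reproduced with a short script. The sketch below uses Python's built-in sqlite3 module as a stand-in for DB2, so storage clauses and DB2 data types are omitted; the table name CUSTOMER and the view name CUST_BRANCH_V are assumptions, since the slide does not name them.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE CUSTOMER (
    CUST_NO INTEGER PRIMARY KEY, CUST_NAME TEXT, CUST_DOB TEXT,
    CUST_BRANCH INTEGER, CUST_RATING INTEGER)""")
con.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?, ?, ?)", [
    (111111, "MACK", "19501110", 12, 4),
    (222222, "JACK", "19770921", 156, 6),
    (333333, "PACK", "19800202", 8, 3),
])

# The view stores only the query, not a copy of the rows.
con.execute("""CREATE VIEW CUST_BRANCH_V AS
    SELECT CUST_NO, CUST_NAME, CUST_BRANCH FROM CUSTOMER""")

view_rows = con.execute(
    "SELECT * FROM CUST_BRANCH_V ORDER BY CUST_NO").fetchall()
print(view_rows)

# A change to the base table is visible through the view immediately.
con.execute("UPDATE CUSTOMER SET CUST_BRANCH = 20 WHERE CUST_NO = 333333")
updated = con.execute(
    "SELECT CUST_BRANCH FROM CUST_BRANCH_V WHERE CUST_NO = 333333").fetchone()
print(updated)
```

Because the view exposes neither Cust_dob nor Cust_rating, granting access to the view alone is one way to hide those columns from a user.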
DB2 - directory & Catalog
Information about the DB2 system is maintained in the DB2 directory and catalog.
Directory: kept solely for DB2's internal use.
Catalog: contains descriptive information about the plans and other DB2 objects; it may be accessed by DB2 and its users. DB2 uses it to determine access paths and manage system resources. The catalog contains approximately 30 tables, which are central to DB2's functioning.
Example:
SYSIBM.SYSTABLES
SYSIBM.SYSCOLUMNS
SYSIBM.SYSINDEXES
DATA TYPES: STRINGS
STRING
  CHARACTER: Fixed length, Varying length
  GRAPHIC: Fixed length, Varying length
Strings can be divided into character and graphic strings.
CHARACTER
Character data type fields are used to store alphanumeric data items.
• Fixed-length character string: CHAR(x) or CHARACTER(x)
A fixed-length character string (CHAR) must have its length specified. Every value in the column has this length; a shorter value is padded with trailing blanks. The length x must be greater than 0 and less than 255; the column occupies x bytes.
• Varying-length character string: VARCHAR(x) and LONG VARCHAR
VARCHAR is a variable-length character string with a maximum length greater than 0 and less than the page size. If the maximum length is greater than 254, it is treated as a LONG VARCHAR. It occupies up to x+2 bytes (the data plus a two-byte length field).
GRAPHIC
Graphic strings are similar to character strings. The difference is that they use two bytes per character instead of one.
DATA TYPES : DATETIME
DATETIME
  DATE
  TIME
  TIMESTAMP
DATE
A date is represented as a sequence of eight unsigned packed decimal digits occupying four bytes; permitted values are legal dates in the range January 1st, 1 A.D. to December 31st, 9999 A.D. Internal format: YYYYMMDD.
TIME
A time is represented as a sequence of six unsigned packed decimal digits occupying three bytes; permitted values are legal times in the range midnight to midnight, i.e. 000000 to 240000. Internal format: HHMMSS.
TIMESTAMP
A timestamp is represented as a sequence of 20 unsigned packed decimal digits occupying ten bytes; permitted values are legal timestamps in the range 00010101000000000000 to 99991231240000000000.
Internal format: YYYYMMDDHHMMSSnnnnnn.
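The byte counts follow from packing two decimal digits into each byte. The following Python sketch illustrates that packing only; it is not DB2's storage code, and the function names `pack_digits` and `unpack_digits` are hypothetical.

```python
def pack_digits(digits):
    """Pack an even-length string of decimal digits, two digits per byte."""
    if len(digits) % 2 or not all(c in "0123456789" for c in digits):
        raise ValueError("expected an even number of decimal digits")
    return bytes((int(digits[i]) << 4) | int(digits[i + 1])
                 for i in range(0, len(digits), 2))

def unpack_digits(data):
    """Reverse of pack_digits: recover the digit string from the bytes."""
    return "".join(f"{b >> 4}{b & 0x0F}" for b in data)

packed_date = pack_digits("19770921")              # YYYYMMDD
packed_time = pack_digits("235959")                # HHMMSS
packed_ts = pack_digits("19770921235959000000")    # YYYYMMDDHHMMSSnnnnnn

print(len(packed_date), packed_date.hex())
print(len(packed_time), packed_time.hex())
print(len(packed_ts))
```

Eight digits give four bytes, six digits give three and twenty digits give ten, matching the sizes stated for DATE, TIME and TIMESTAMP.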
DATA TYPES : NUMERIC
NUMERIC
  Binary integer: Small, Large
  Decimal
  Floating point: Small, Large
INTEGER
A 4-byte binary integer used to store whole numbers: 31 bits for the number and the 32nd bit for the sign. Range: -2,147,483,648 to 2,147,483,647.
SMALLINT
A 2-byte binary integer: 15 bits for the number and the 16th bit for the sign. Range: -32,768 to 32,767.
DATA TYPES : NUMERIC
DECIMAL(x,y)
A packed decimal number with precision of ‘x’ (ranging from 1 to 31) and a scale of ‘y’ (ranging from 1 to less than the precision value).
64
DATA TYPES : NUMERIC FLOAT(p)
Floating point number n, represented by a binary fraction f of p binary digits precision (-1
65
SQL
STRUCTURED QUERY LANGUAGE
66
STRUCTURED QUERY LANGUAGE
• SQL - DATA DEFINITION LANGUAGE. • SQL - DATA MANIPULATION LANGUAGE . • SQL - DATA CONTROL LANGUAGE .
Data Definition Language (DDL) consists of the statements used to create and maintain DB2 objects.
Data Manipulation Language (DML) consists of the statements used to access and modify the data held in tables.
Data Control Language (DCL) consists of the control statements that govern data security.
SQL – DDL
DATA DEFINITION LANGUAGE
70
DDL Operations
♦ CREATE - Defines a new object
♦ ALTER - Modifies an object
♦ DROP - Deletes a defined object
♦ Entered interactively or embedded in application programs.
DDL – OBJECTS vs OPERATIONS
[Table: applicability of the CREATE, ALTER and DROP operations to the objects Storage Group, Database, Tablespace, Table, Index, Synonym and View.]
DDL – CREATE Storage Group
Syntax:
CREATE STOGROUP stogroup-name
    VOLUMES (vol1, vol2, ...)
    VCAT catalog-name
    [ PASSWORD password ]
Example:
CREATE STOGROUP TRG1T01
    VOLUMES (DBPK01, DBPK02)
    VCAT DB220TRG
CREATE STOGROUP defines a set of DASD volumes controlled by a VSAM catalog.
DDL – ALTER Storage Group
Syntax:
ALTER STOGROUP stogroup-name
    ADD VOLUMES (vol1, vol2, ...)
    REMOVE VOLUMES (vol1, vol2, ...)
    [ PASSWORD password ]
Example:
ALTER STOGROUP TRG1T01
    ADD VOLUMES (DBPK03)
    REMOVE VOLUMES (DBPK01)
The ALTER STOGROUP statement can be used to add or remove DASD volumes associated with the Storage Group.
DDL – CREATE Database
Syntax:
CREATE DATABASE database-name
    [ STOGROUP stogroup-name ]
    [ BUFFERPOOL bufferpool-name ]
Example:
CREATE DATABASE TRG1T01
    STOGROUP TRG1T01
    BUFFERPOOL BP0
The CREATE DATABASE statement defines a database that uses stogroup-name as its storage group, which supports the DASD space requirements of the tablespaces and indexes within the database.
DDL – CREATE TABLESPACE
Syntax:
CREATE TABLESPACE tablespace-name
    IN database-name
    USING STOGROUP stogroup-name
        PRIQTY qty
        SECQTY qty
        ERASE YES / NO
    LOCKSIZE ANY / PAGE / TABLESPACE / TABLE
    BUFFERPOOL bufferpool-name
    CLOSE YES / NO
    FREEPAGE amount
    PCTFREE amount
DDL -Tablespace Parameters
PRIQTY - amount of physical storage allocated when the tablespace is created.
SECQTY - secondary allocation of space as the amount of data in the tablespace grows.
ERASE - indicates whether the DB2-defined data sets are to be erased when the tablespace is dropped.
LOCKSIZE - indicates the type of locking (Page / Table / Tablespace / decided by DB2).
BUFFERPOOL - bufferpool to be associated with the tablespace.
CLOSE - indicates whether the data sets associated with the tablespace should be closed when there are no current users of the tablespace.
FREEPAGE - number of pages after which an empty page is left.
DDL – CREATE TABLE
CREATE TABLE table-name
    ( col-name1 col-type1 [ NOT NULL / NULL / NOT NULL WITH DEFAULT ]
      [, col-name2 col-type2 ... ]
      [ PRIMARY KEY (col-name1, col-name2 ...) ]
      [ FOREIGN KEY [constraint-name] (col-name1, col-name2 ...)
            REFERENCES base-table
            [ ON DELETE RESTRICT / CASCADE / SET NULL ] ] )
    [ IN database-name.tablespace-name / IN DATABASE database-name ]
DDL – CREATE TABLE
Example:
CREATE TABLE EMP
    ( EMP#    CHAR(5)      NOT NULL,
      ENAME   VARCHAR(2)   NOT NULL,
      SAL     DECIMAL      NOT NULL WITH DEFAULT,
      PRIMARY KEY (EMP#) )
DDL – CREATE TABLE Like Existing Table
Syntax:
CREATE TABLE table-name LIKE existing-table-name
This format allows the user to create a table table-name with the same column descriptions as an existing table existing-table-name. The new table does not inherit any primary or foreign key definitions from existing-table-name.
DDL – ALTER TABLE
Syntax:
ALTER TABLE table-name
    ADD column-definition
    PRIMARY KEY primary-key-definition
    DROP PRIMARY KEY
    DROP FOREIGN KEY constraint-name
This statement is used to alter the columns, keys and other specifications of a previously defined table.
DDL – CREATE INDEX
Syntax:
CREATE [ UNIQUE ] INDEX index-name
    ON table-name ( col-name [ ASC / DESC ], ... )
    [ USING STOGROUP stogroup-name
          PRIQTY qty
          SECQTY qty
          ERASE YES / NO ]
    [ CLUSTER ]
    [ BUFFERPOOL bufferpool-name ]
    [ CLOSE YES / NO ]
    [ PCTFREE amount ]
    [ FREEPAGE amount ]
This statement defines an index on a previously defined table. The relevant constraints (for example, uniqueness for a UNIQUE index) are checked during CREATE INDEX; if any constraint is not met, the index is not created.
DDL - CREATE INDEX
EXAMPLE:
CREATE UNIQUE INDEX XS
    ON SUPPLIER (S#)
    USING STOGROUP TRG1T01
        PRIQTY 16
        SECQTY 4
        ERASE NO
DDL – CREATE VIEW
Syntax:
CREATE VIEW view-name (column-name, ...)
    AS ( SELECT col-name1, col-name2 ...
         FROM table-name )
    WITH CHECK OPTION
This statement creates a view on one or more tables or views. The column-name list names the columns of the view. If the column names are not specified, the view inherits the names of the columns used in the subselect.
DDL – DROP Statement
Syntax:
DROP object-type object-name
Example:
DROP DATABASE database-name
DROP TABLE table-name
The DROP statement deletes an object. Any objects that are directly or indirectly dependent on that object are also deleted. Object types can be: TABLE, VIEW, INDEX, SYNONYM, STOGROUP, DATABASE, TABLESPACE.
DDL – DROP Dependencies
[Figure: DROP dependencies. A TABLESPACE containing two TABLEs, on which VIEW1, VIEW2 and VIEW3 depend.]
SQL – DML
DATA MANIPULATION LANGUAGE
88
DML – Operations
♦ SELECT - Retrieves data from the table
♦ UPDATE - Changes values of columns / rows
♦ DELETE - Deletes row(s)
♦ INSERT - Inserts a new row
DML – EXAMPLE
EMP
EMP#   ENAME    DEPT#   SAL       MGR#
1000   Arun     10      8000.00   1002
1001   Ramesh   20      9000.00   1001
1002   Rahul    10      8500.00   1001
1003   Rohit    10      7500.00   1002

DEPT
DEPT#   DNAME
10      Finance
20      Admin
30      Sales
40      Personnel
SELECT – SIMPLE QUERY
♦ SELECT EMP#, SAL FROM EMP WHERE EMP# = 1000
Output –
EMP#   SAL
1000   8000.00
♦ SELECT * FROM EMP WHERE EMP# = 1000
Output –
EMP#   ENAME   DEPT#   SAL       MGR#
1000   Arun    10      8000.00   1002
DATA COMPARISON
• OPERATORS: > , < , = , >= , <= , <>
• BOOLEAN: NOT, AND, OR
• PARTIAL VALUES: LIKE with % and _
• MISC.: IN, BETWEEN
DML – Retrieving With Ordering
SELECT * FROM EMP ORDER BY DEPT#
Output –
EMP#   ENAME    DEPT#   SAL       MGR#
1000   Arun     10      8000.00   1002
1002   Rahul    10      8500.00   1001
1003   Rohit    10      7500.00   1002
1001   Ramesh   20      9000.00   1001
DML - JOIN Queries
Simple Equijoin
SELECT EMP#, ENAME, EMP.DEPT#, DNAME
FROM EMP, DEPT
WHERE EMP.DEPT# = DEPT.DEPT#
Output –
EMP#   ENAME    DEPT#   DNAME
1000   Arun     10      Finance
1001   Ramesh   20      Admin
1002   Rahul    10      Finance
1003   Rohit    10      Finance
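The equijoin can be tried out without a DB2 subsystem. The sketch below runs it in SQLite through Python's sqlite3 module, as a stand-in for DB2; because `#` is not valid in an unquoted SQLite identifier, the columns are renamed EMPNO, DEPTNO and MGRNO, which is an assumption of this example and not DB2 naming.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE DEPT (DEPTNO INTEGER PRIMARY KEY, DNAME TEXT);
CREATE TABLE EMP  (EMPNO INTEGER PRIMARY KEY, ENAME TEXT,
                   DEPTNO INTEGER, SAL REAL, MGRNO INTEGER);
INSERT INTO DEPT VALUES (10,'Finance'),(20,'Admin'),(30,'Sales'),(40,'Personnel');
INSERT INTO EMP VALUES (1000,'Arun',10,8000,1002),(1001,'Ramesh',20,9000,1001),
                       (1002,'Rahul',10,8500,1001),(1003,'Rohit',10,7500,1002);
""")

# DEPTNO exists in both tables, so it is qualified with the table name.
rows = con.execute("""
    SELECT EMP.EMPNO, EMP.ENAME, EMP.DEPTNO, DEPT.DNAME
    FROM EMP, DEPT
    WHERE EMP.DEPTNO = DEPT.DEPTNO
    ORDER BY EMP.EMPNO
""").fetchall()

for row in rows:
    print(row)
```

Departments 30 and 40 have no employees, so they do not appear in the result: an equijoin returns only the rows that find a match on both sides.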
DML – JOIN Queries
Self-Join
♦ Join of a table with itself
SELECT E.EMP#, M.MGR#
FROM EMP E, EMP M
WHERE E.EMP# = M.MGR#
DML – Subqueries OR Nested queries
♦ Simple Subquery
List of employees whose salary is greater than the salary of employee number 1000:
SELECT EMP#
FROM EMP
WHERE SAL > ( SELECT SAL FROM EMP WHERE EMP# = 1000 )
DML - Correlated Subquery
A correlated subquery provides a further level of flexibility by permitting the nested SELECT statement to refer back to columns of the outer SELECT statement. Correlated subqueries differ from ordinary subqueries in that the nested SELECT statement refers back to the table in the first SELECT statement, so conceptually it is evaluated once for each row of that table.
DML - Correlated Subqueries
EXAMPLE:
SELECT DNAME
FROM DEPT
WHERE 'Arun' IN ( SELECT ENAME
                  FROM EMP
                  WHERE EMP.DEPT# = DEPT.DEPT# )
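Qualifying both sides of the correlation matters, as the following sketch shows. It uses SQLite through Python's sqlite3 module as a stand-in for DB2, with the `#` columns renamed to EMPNO and DEPTNO (an assumption of this example). The second query leaves DEPTNO unqualified, so inside the subquery it resolves to EMP.DEPTNO and the link to the outer DEPT row is lost.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE DEPT (DEPTNO INTEGER PRIMARY KEY, DNAME TEXT);
CREATE TABLE EMP  (EMPNO INTEGER PRIMARY KEY, ENAME TEXT, DEPTNO INTEGER);
INSERT INTO DEPT VALUES (10,'Finance'),(20,'Admin'),(30,'Sales'),(40,'Personnel');
INSERT INTO EMP  VALUES (1000,'Arun',10),(1001,'Ramesh',20),
                        (1002,'Rahul',10),(1003,'Rohit',10);
""")

# Correlated: the inner query is tied to the DEPT row of the outer query.
correlated = con.execute("""
    SELECT DNAME FROM DEPT
    WHERE 'Arun' IN (SELECT ENAME FROM EMP WHERE EMP.DEPTNO = DEPT.DEPTNO)
""").fetchall()

# Unqualified DEPTNO resolves to EMP.DEPTNO inside the subquery, so the
# condition is always true and every department is returned.
uncorrelated = con.execute("""
    SELECT DNAME FROM DEPT
    WHERE 'Arun' IN (SELECT ENAME FROM EMP WHERE DEPTNO = EMP.DEPTNO)
""").fetchall()

print(correlated)
print(len(uncorrelated))
```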
DML – Column Functions
Column functions operate on the collection of values in a column.
♦ COUNT - number of values in the column
♦ SUM - sum of values in the column
♦ AVG - average of values in the column
♦ MAX - largest value in the column
♦ MIN - smallest value in the column
Example:
♦ SELECT COUNT(*) FROM EMP
  - Gives the number of employees
♦ SELECT MAX(SAL) FROM EMP
Output –
MAX(SAL)
9000
A column function cannot be mixed with a plain column such as EMP# in the select list unless the query groups by that column.
DML – SELECT Statement
Group By Clause
SELECT DEPT#, SUM(SAL)
FROM EMP
GROUP BY DEPT#
Output –
DEPT#   SUM(SAL)
10      24000
20      9000

Group By – Having Clause
SELECT DEPT#
FROM EMP
GROUP BY DEPT#
HAVING COUNT(*) > 1
Output –
DEPT#
10
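Both grouping queries can be checked against the sample EMP rows. The sketch below runs them in SQLite via Python's sqlite3 module as a stand-in for DB2; the column names EMPNO and DEPTNO replace EMP# and DEPT#, and SAL is stored as an integer, both assumptions made to keep the example short.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, DEPTNO INTEGER, SAL INTEGER)")
con.executemany("INSERT INTO EMP VALUES (?, ?, ?)", [
    (1000, 10, 8000), (1001, 20, 9000), (1002, 10, 8500), (1003, 10, 7500)])

# GROUP BY: one result row per department, with the salary total.
totals = con.execute(
    "SELECT DEPTNO, SUM(SAL) FROM EMP GROUP BY DEPTNO ORDER BY DEPTNO"
).fetchall()

# HAVING: keep only the groups that contain more than one employee.
busy = con.execute(
    "SELECT DEPTNO FROM EMP GROUP BY DEPTNO HAVING COUNT(*) > 1"
).fetchall()

print(totals)  # [(10, 24000), (20, 9000)]
print(busy)    # [(10,)]
```

WHERE filters rows before they are grouped, whereas HAVING filters the groups afterwards, which is why a condition on COUNT(*) has to go in HAVING.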
DML – INSERT Statement
Syntax:
INSERT INTO table-name
    VALUES ( literal1 [, literal2, ... ] )
Example:
INSERT INTO EMP
    VALUES ( 1004, 'Sameer', 10, 7500, 1002 )
DML – UPDATE Statement
Syntax:
UPDATE table-name
    SET col-name1 = expression [, col-name2 = expression, ... ]
    [ WHERE search-condition ]
Example:
UPDATE EMP
    SET DEPT# = 20,
        SAL = SAL + 100
    WHERE EMP# = 1004
DML – DELETE Statement
Syntax:
DELETE FROM table-name
    [ WHERE search-condition ]
Example:
DELETE FROM EMP
WHERE EMP# = 1004
When the table is the parent in a referential constraint, the outcome depends on the DELETE RULE defined on the foreign key: CASCADE, RESTRICT or SET NULL.
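The effect of the three delete rules can be seen side by side. The sketch below is an illustration using SQLite through Python's sqlite3 module, not DB2; it deletes a parent DEPT row that still has a dependent EMP row. The function name `delete_parent` and the column names EMPNO and DEPTNO are assumptions of this example.

```python
import sqlite3

def delete_parent(rule):
    """Delete DEPT 10 under the given delete rule and report what happens."""
    con = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    con.execute("PRAGMA foreign_keys = ON")
    con.execute("CREATE TABLE DEPT (DEPTNO INTEGER PRIMARY KEY, DNAME TEXT)")
    con.execute(f"""CREATE TABLE EMP (
        EMPNO INTEGER PRIMARY KEY, ENAME TEXT,
        DEPTNO INTEGER REFERENCES DEPT (DEPTNO) ON DELETE {rule})""")
    con.execute("INSERT INTO DEPT VALUES (10, 'Finance')")
    con.execute("INSERT INTO EMP VALUES (1000, 'Arun', 10)")
    try:
        con.execute("DELETE FROM DEPT WHERE DEPTNO = 10")
    except sqlite3.IntegrityError:
        return "delete rejected"
    # Show what is left in the dependent table after the delete.
    return con.execute("SELECT EMPNO, DEPTNO FROM EMP").fetchall()

for rule in ("RESTRICT", "CASCADE", "SET NULL"):
    print(rule, "->", delete_parent(rule))
```

RESTRICT refuses the delete while dependents exist, CASCADE removes the dependent rows as well, and SET NULL keeps them but nulls their foreign key.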
PROGRAM PREPARATION
106
DB2 PROGRAM PREPARATION
• Execution cycle of a DB2 program.
• Definitions
  > DBRM.
  > Bind.
  > Plans.
  > Packages.
PROGRAM PREPARATION
[Figure: program preparation flow. Boxes: COBOL program with embedded SQL; DB2 precompiler; modified COBOL program; DBRMs; compile and link-edit; load module; BIND; DB2 catalog; package.]
EXECUTION CYCLE OF DB2 PROGRAM
• Precompilation
Precompilation separates the SQL statements from the non-SQL statements. From this step onwards, processing follows two separate paths.
NON-SQL PATH:
• Compilation and Linking
The non-SQL part of the COBOL program goes through compilation and linking after all the SQL statements have been commented out. This results in a LOAD module.
SQL PATH:
• Bind
The extracted SQL part of the COBOL program, called the Data Base Request Module (DBRM), goes through an analogous process called BIND to produce an executable PLAN/PACKAGE.
RUNNING THE PROGRAM
After these steps, two separate physical components have been produced:
• PLAN - containing the access path specifications for the SQL statements in the program.
• LOAD MODULE - containing the executable machine instructions for the COBOL statements in the program.
The program can now be executed in a TSO batch process.
WHAT IS ...?
• DBRM
A DBRM is a module containing the SQL statements extracted from the source program by the DB2 precompiler. It is stored as a member of a partitioned dataset; it is not stored in the DB2 catalog or directory.
• BIND
Bind is a DB2 routine that analyses each SQL statement and determines the most efficient access path to the data. It also checks for errors, accesses the DB2 catalog to check that the resources named in each SQL statement actually exist, and verifies that the binder is authorised to perform each statement in the program.
• PLAN
A plan is an executable module containing the access path logic produced by the DB2 optimizer. It can be composed of one or more DBRMs and packages. Plans are created by the BIND command.
• PACKAGE
A package is a single bound DBRM with optimized access paths. Before DB2 V2.3 the only bind option was at the plan level. With packages, the table access logic is packaged at a lower level of granularity: the package, or program, level. To execute a package, it must first be included in the package list of a plan. Packages can never be executed directly.
• COLLECTION
A collection is a user-defined name (1 to 18 characters) that the programmer must specify for every package. A collection is not an actual, physical database object; it is a grouping of DB2 packages. By specifying a different collection identifier for each package, the same DBRM can be bound into different packages. This lets programmers use the same DBRM for different packages, enabling easy access to tables that have the same structure but different owners.
DB2 - APPLICATION PROGRAMMING
Using Embedded SQL
116
HOST LANGUAGES
♦ COBOL
♦ PL/I
♦ C
♦ ASSEMBLER
♦ FORTRAN
117
EMBEDDED SQL
Steps for coding SQL in a program:
♦ Delimit all SQL statements.
♦ Describe host variables.
♦ Declare a communication area.
♦ Code SQL statements to access data.
♦ Handle exceptional conditions.
118
STATIC SQL
Statement:
♦ Coded in the program.
♦ Same function applied to the same tables and columns.
Bind: on all SQL statements
♦ Before program execution.
♦ Access strategy is permanent.
SQL Delimiters in COBOL
EXEC SQL
    SQL statement
END-EXEC
♦ EXEC SQL and the statement must be coded within columns 12 through 72.
♦ No COBOL statements are allowed between the delimiters.
COBOL Host Variables
Fields in the program's WORKING-STORAGE SECTION or LINKAGE SECTION that are referenced in SQL statements are called host variables.
♦ All COBOL host variables must be declared in the DATA DIVISION.
♦ A colon (:) must precede every host variable in an SQL statement.
COBOL Host Variable
Example:
EXEC SQL
    SELECT ENAME
    INTO :WS-ENM
    FROM EMP
    WHERE EMP# = 1000
END-EXEC.
SQLCA
The SQLCA contains a set of fields that DB2 updates after each SQL statement is executed; it indicates the result of executing the statement. In a COBOL program the fields are a combination of binary and alphanumeric items. The most important field in the SQLCA is SQLCODE, a binary fullword field that is updated with an SQL return code after each SQL statement.
INCLUDE
EXEC SQL
    INCLUDE SQLCA | SQLDA | member-name
END-EXEC.
member-name: names a member of a partitioned dataset.
The INCLUDE statement is used to include SQL statements or COBOL host variable declarations from a member of a partitioned dataset.
ERROR HANDLING
♦ After the execution of every SQL statement, DB2 sets the SQLCODE value.
SQLCODE = 0   : execution was successful
SQLCODE > 0   : execution was successful with a warning
SQLCODE < 0   : execution was not successful
SQLCODE = 100 : no data found
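The SQLCODE ranges above can be tested straight after a statement. The following is a minimal sketch, not a definitive implementation: it reuses the EMP table and the WS-ENM host variable from the earlier example, while the program name, the paragraph name and the PIC clause are assumptions made for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SQLCHK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * SQLCA supplies SQLCODE; the PIC clause below is assumed.
           EXEC SQL INCLUDE SQLCA END-EXEC.
       01  WS-ENM                 PIC X(20).

       PROCEDURE DIVISION.
       1000-READ-EMP.
           EXEC SQL
               SELECT ENAME
                 INTO :WS-ENM
                 FROM EMP
                WHERE EMP# = 1000
           END-EXEC.
      * Test 100 before the general > 0 case.
           EVALUATE TRUE
               WHEN SQLCODE = 0
                   DISPLAY 'EMPLOYEE NAME: ' WS-ENM
               WHEN SQLCODE = 100
                   DISPLAY 'NO EMPLOYEE FOUND'
               WHEN SQLCODE > 0
                   DISPLAY 'WARNING, SQLCODE: ' SQLCODE
               WHEN OTHER
                   DISPLAY 'ERROR, SQLCODE: ' SQLCODE
           END-EVALUATE.
           STOP RUN.
```

Because 100 is itself a positive value, the sketch checks it before the general warning case; otherwise "no data found" would be reported as a warning.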
ERROR HANDLING
WHENEVER :
EXEC SQL WHENEVER <condition> <action> END-EXEC.
Condition : SQLWARNING | SQLERROR | NOT FOUND
Action    : CONTINUE | GO TO para-name
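A hedged sketch of WHENEVER in use, using the GO TO action. It assumes the SQLCA has been included and that WS-ENM is declared as a host variable; the paragraph names are hypothetical. Note that a WHENEVER applies to the SQL statements that follow it in the source listing, not to those that follow it in execution order.

```cobol
       PROCEDURE DIVISION.
       0000-MAIN.
      * Applies to every SQL statement that follows in the source.
           EXEC SQL WHENEVER SQLERROR GO TO 9000-SQL-ERROR END-EXEC.
           EXEC SQL WHENEVER NOT FOUND CONTINUE END-EXEC.

           EXEC SQL
               UPDATE EMP
                  SET ENAME = :WS-ENM
                WHERE EMP# = 1000
           END-EXEC.
      * NOT FOUND is set to CONTINUE, so SQLCODE is tested here.
           IF SQLCODE = 100
               DISPLAY 'NO ROW UPDATED'
           END-IF.
           STOP RUN.

       9000-SQL-ERROR.
           DISPLAY 'SQL ERROR, SQLCODE: ' SQLCODE.
           STOP RUN.
```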
PROCESSING MULTIPLE ROWS
Multi-Row SELECT
[Figure: a multi-row SELECT on a TABLE produces a RESULT TABLE.]
DECLARE CURSOR
♦ Retrieves one row at a time.
EXEC SQL
    DECLARE CUREMP CURSOR FOR
        SELECT EMP#, ENAME
          FROM EMP
         WHERE DEPT# = 10
END-EXEC.
OPEN CURSOR
♦ The Cursor must be opened before any rows are retrieved.
EXEC SQL OPEN cursor-name END-EXEC.
FETCH CURSOR
♦ The FETCH statement is used to move the contents of the currently selected row into host variables.
EXEC SQL FETCH CUREMP INTO :WS-ENO, :WS-ENM END-EXEC.
End-Of-Data Processing
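TABLE
1000   Arun
1002   Rahul
1003   Rohit

FETCH   HOST VARIABLES   SQLCODE
1st     1000   Arun      0
2nd     1002   Rahul     0
3rd     1003   Rohit     0
4th     1003   Rohit     100

On the fourth FETCH no row is returned: SQLCODE is 100 and the host variables keep their previous values.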
CLOSE CURSOR
The Cursor has to be closed after processing the rows of the result table.
EXEC SQL CLOSE CUREMP END-EXEC.
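The DECLARE, OPEN, FETCH and CLOSE steps can be put together roughly as follows. This is a sketch under stated assumptions: the program name, paragraph names and PIC clauses are hypothetical, and the column formats of EMP are not given in the course material.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CURDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
           EXEC SQL INCLUDE SQLCA END-EXEC.
      * Host variables; the PIC clauses are assumed column formats.
       01  WS-ENO                 PIC S9(9) COMP.
       01  WS-ENM                 PIC X(20).

           EXEC SQL
               DECLARE CUREMP CURSOR FOR
                   SELECT EMP#, ENAME
                     FROM EMP
                    WHERE DEPT# = 10
           END-EXEC.

       PROCEDURE DIVISION.
       0000-MAIN.
           EXEC SQL OPEN CUREMP END-EXEC.
           IF SQLCODE NOT = 0
               DISPLAY 'OPEN FAILED, SQLCODE: ' SQLCODE
               STOP RUN
           END-IF.
           PERFORM 1000-FETCH-ROW.
      * Loop ends at end of data (100) or on any other non-zero code.
           PERFORM UNTIL SQLCODE NOT = 0
               DISPLAY WS-ENO ' ' WS-ENM
               PERFORM 1000-FETCH-ROW
           END-PERFORM.
           IF SQLCODE NOT = 100
               DISPLAY 'FETCH FAILED, SQLCODE: ' SQLCODE
           END-IF.
           EXEC SQL CLOSE CUREMP END-EXEC.
           STOP RUN.

       1000-FETCH-ROW.
           EXEC SQL
               FETCH CUREMP INTO :WS-ENO, :WS-ENM
           END-EXEC.
```

The first FETCH is issued before the loop so that an empty result table (SQLCODE 100 on the first FETCH) skips the loop body entirely.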
DYNAMIC SQL
Statement :
♦ Acquired during execution.
♦ Function can vary and can be applied to different tables and columns.
Bind : On a single statement
♦ At statement execution.
♦ Access strategy is not saved.
Executing a program using Dynamic SQL
[Figure: flow diagram with boxes labelled: Program; Precompile, Link, Bind; Capture, format DSQL; Translate, precompile DSQL; Bind DSQL; Process DSQL; Process Static SQL.]
STATIC vs DYNAMIC
             Static                    Dynamic
Statement :  Object and Action known   Object or Action not known
Bind      :  Bound once                Bound every time
TYPES OF DYNAMIC SQL
♦ EXECUTE IMMEDIATE
♦ NON-SELECT DYNAMIC SQL
♦ FIXED LIST SELECT
♦ VARYING LIST SELECT
EXECUTE IMMEDIATE
Implicitly prepares and executes a complete SQL statement held in a host variable. It cannot be used for statements that retrieve data (SELECT). The SQL portion of the program consists of two parts :
- moving the complete text of the statement to be executed into the host variable.
- issuing an EXECUTE IMMEDIATE statement.
EXECUTE IMMEDIATE
Syntax :
EXEC SQL EXECUTE IMMEDIATE :host-variable END-EXEC.
Example :
WORKING-STORAGE SECTION.
01  WS-HOSTVAR.
    49  WS-HOSTVAR-LEN  PIC S9(4) COMP.
    49  WS-HOSTVAR-TXT  PIC X(50).
EXECUTE IMMEDIATE
Example (Contd.)
PROCEDURE DIVISION.
    MOVE 32 TO WS-HOSTVAR-LEN.
    MOVE "DELETE FROM EMP WHERE DEPT# = 10" TO WS-HOSTVAR-TXT.
    EXEC SQL
        EXECUTE IMMEDIATE :WS-HOSTVAR
    END-EXEC.
NON-SELECT Dynamic SQL
• This class uses PREPARE and EXECUTE to run the SQL statement in an application program.
• Cannot be used for SELECT statements.
• Host variables cannot be used in the prepared statement.
• Syntax :
EXEC SQL PREPARE statement-name FROM :host-variable END-EXEC.
EXEC SQL EXECUTE statement-name END-EXEC.
Parameter Markers
• Since host variables cannot be included in the statement string, a similar feature is provided, called the Parameter Marker.
• It is a question mark ( ? ) which is included in the statement.
• DB2 substitutes the values for the parameter markers before it executes the SQL statement.
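A minimal sketch of a parameter marker in non-SELECT dynamic SQL. The value for the marker is supplied through the USING clause of EXECUTE. The statement name DELSTMT, the variable names, the program name and the SMALLINT format assumed for DEPT# are all hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PARMDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
           EXEC SQL INCLUDE SQLCA END-EXEC.
      * Varying-length string: 2-byte length followed by the text.
       01  WS-STMT.
           49  WS-STMT-LEN        PIC S9(4) COMP.
           49  WS-STMT-TXT        PIC X(50).
      * Assumed format for the DEPT# column.
       01  WS-DNO                 PIC S9(4) COMP.

       PROCEDURE DIVISION.
       0000-MAIN.
      * The ? stands in for a value supplied at EXECUTE time.
           MOVE 'DELETE FROM EMP WHERE DEPT# = ?' TO WS-STMT-TXT.
           MOVE 31 TO WS-STMT-LEN.
           EXEC SQL PREPARE DELSTMT FROM :WS-STMT END-EXEC.
           IF SQLCODE < 0
               DISPLAY 'PREPARE FAILED, SQLCODE: ' SQLCODE
               STOP RUN
           END-IF.
           MOVE 10 TO WS-DNO.
           EXEC SQL EXECUTE DELSTMT USING :WS-DNO END-EXEC.
           IF SQLCODE < 0
               DISPLAY 'DELETE FAILED, SQLCODE: ' SQLCODE
           END-IF.
           STOP RUN.
```

Once prepared, the same statement can be executed again with a different value in WS-DNO without another PREPARE.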
FIXED LIST SELECT
• This is used when the structure of the result table is known when the program is coded.
• To process a Fixed List Select Dynamic SQL, a program uses five different SQL statements :
  - Declare Cursor
  - Prepare Statement
  - Open Cursor
  - Fetch
  - Close Cursor
FIXED LIST SELECT Dynamic SQL
Example :
SQL to execute :
    SELECT EMP#, ENAME FROM EMP WHERE DEPT# = ?
Move the 'SQL to execute' to WS-HOSTVAR-TXT.
EXEC SQL DECLARE EMPCUR CURSOR FOR FLSQL END-EXEC.
EXEC SQL PREPARE FLSQL FROM :WS-HOSTVAR-TXT END-EXEC.
FIXED LIST SELECT Dynamic SQL
Example (Contd.)
Move the required value to DNO.
EXEC SQL OPEN EMPCUR USING :DNO END-EXEC.
Loop until there are no more rows to fetch.
EXEC SQL FETCH EMPCUR INTO :ENO, :ENM END-EXEC.
EXEC SQL CLOSE EMPCUR END-EXEC.
VARYING LIST SELECT Dynamic SQL
• Provides the facility to change columns and tables during execution.
• Thus more flexible than any other class of Dynamic SQL.
• Since the number and type of host variables cannot be known beforehand, this class is the most complicated of the Dynamic SQL classes.
DB2 FEATURES
• Transactions and their properties.
• Recovery utilities.
• Security.
• DB2 privileges.
• DB2 Utilities.
Transactions : ACID Properties
Atomicity   : means either all the work of the transaction is applied or none of it is.
Consistency : means that the database is in a consistent state after the execution of the transaction.
Isolation   : requires that a transaction not be influenced by changes made by other concurrently executing transactions.
Durability  : means that the work associated with a successfully executed transaction is permanently applied to the database.
CONCURRENCY
To provide concurrent access, the database manager uses a software mechanism called locking.
Attributes of locks :
• Object   : The resource being locked.
• Duration : How long the lock is held.
• Mode     : The type of access allowed for the lock owner, as well as the type of access permitted for concurrent users of the locked object.
CONCURRENCY : DeadLock
[Figure: User 1 holds a lock on Object 1 and User 2 holds a lock on Object 2; each requests a lock on the object held by the other, so neither can proceed.]
NEED FOR BACKUP
FAILURES :
• Hardware Failure
• Program Failure
• Natural Calamity
RECOVERY UTILITIES
• BACKUP
  - IMAGE COPY ( FULL, INCREMENTAL )
  - MERGECOPY
• RECOVER
  - RECOVER
  - QUIESCE
  - REPORT
  - CHECK
  - REPAIR
RECOVERY UTILITIES
COPY :
• Prepares a backup of pages.
• Copies pages of a tablespace into a sequential dataset.
• DB2 lets users take a full backup, called a FULL image copy, or a backup of only the changes since the last backup, called an INCREMENTAL image copy.
MERGECOPY :
• Merges all the incremental copies to produce a single incremental copy, or merges the full image copy with all the incremental copies to produce a new full image copy.
RECOVERY UTILITIES
QUIESCE :
• Records the point of consistency for related tablespaces.
• Ensures that all the tablespaces in the scope of QUIESCE are referentially intact.
CHECK :
• Checks referential integrity of related tables.
• Checks consistency of indexes with the data.
RECOVERY UTILITIES
REPORT :
• The input is a single tablespace and the output is a report containing information about related tables and tablespaces.
• Provides information necessary for the recovery of a tablespace.
REPAIR :
• This utility is designed to modify DB2 data and associated data structures when there is an error or problem.
SECURITY
What can be protected ?
• Access to the DB2 Subsystem.
• Datasets used by DB2.
• DB2 Objects.
How Protection Is Achieved ?
RACF : Security managers like RACF ( Resource Access Control Facility ) are used to protect DB2 resources such as the DB2 Subsystem and DB2 Objects.
• Protection of DB2 objects is done within DB2.
• Each access to a DB2 object is validated against a set of privileges associated with the process issuing the request.
DB2 Privileges
• IMPLICIT - Automatic privileges for the OWNER of the object.
• EXPLICIT - Specific privileges provided by GRANT SQL.
• INDIRECT - Through EXECUTE privilege or as a member of a group.
GRANTING Explicit Privileges
Syntax :
GRANT privilege ON object-type object-name TO ( user-id | PUBLIC ) [ WITH GRANT OPTION ]
PUBLIC            : to all the users in the system.
WITH GRANT OPTION : users can grant the privilege to other users.
REVOKING Privileges
REVOKE undoes a matching GRANT.
Syntax :
REVOKE privilege ON object-type object-name FROM ( user-id | PUBLIC )
Cascading REVOKE
[Figure: User1 grants a privilege to User2, and User2 grants the privilege to User3.]
User1 REVOKES the privilege from User2.
The revoke cascades: the privilege that User2 granted to User3 is revoked as well.
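The grant chain and the cascading revoke can be sketched as three embedded statements, each run under a different authorization ID. This is an illustration under assumptions, not a complete program: the IDs USER1, USER2 and USER3 and the table USER1.EMP are hypothetical, and in practice such statements are often issued interactively rather than from a program.

```cobol
      * Run under USER1, the assumed owner of the table.
           EXEC SQL
               GRANT SELECT ON TABLE USER1.EMP
                   TO USER2 WITH GRANT OPTION
           END-EXEC.

      * Run under USER2, who may pass the privilege on.
           EXEC SQL
               GRANT SELECT ON TABLE USER1.EMP TO USER3
           END-EXEC.

      * Run under USER1: USER2 loses SELECT, and so does USER3,
      * whose privilege depended on the grant made to USER2.
           EXEC SQL
               REVOKE SELECT ON TABLE USER1.EMP FROM USER2
           END-EXEC.
```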
DB2 - Utilities
• LOAD
• UNLOAD
• REORG
• RUNSTATS
LOAD Utility
Used to load data into one or more tables.
Input :
• File containing data.
Output :
• Loaded table.
• Summary report of errors encountered.
LOAD Utility
Features :
• Automatic data conversion between compatible data types.
• Data is loaded in the sequence presented; no sort is invoked.
• Indexes are built.
UNLOAD Utility
• An IBM-supplied program that unloads data from a table in the LOAD utility format.
• A WHERE clause can be supplied to selectively unload rows.
OUTPUT :
• Sequential dataset containing the unloaded data.
• Load control statements.
REORGANISING Utility
REORG :
• Eliminates fragmentation in tables and indexes.
• May order the pages of a table according to the order of the index.
• Will restore any free space for later insertion.
RUNSTATS
• Updates statistics of tables and indexes and stores this information in the DB2 catalog.
• The Optimizer ( part of Bind ) analyses this information in determining the best access strategy.
• Should be used after massive changes to the data and after running REORG.